This is a series of blog posts that dives deep into the world of SAML:
SAML 2.0 Explained in Simple Words - Part I: Intro and Overview
SAML 2.0 Explained in Simple Words - Part II: SAML Flows
SAML 2.0 Explained in Simple Words - Part III: SAML Model and Best Practices
Similar to OAuth and OIDC, SAML (Security Assertion Markup Language) is another essential protocol in the IAM world. Even though SAML 2.0 dates back to 2005 and people say that OAuth/OIDC is replacing it, SAML is still widely deployed and actively used in the industry today, partly because it has some features that OAuth/OIDC doesn't. Hence, a good understanding of SAML is crucial for any IAM professional.
The most widely used version is SAML 2.0, which was developed to fix issues in the earlier SAML 1.0 and 1.1 releases. When people talk about SAML, they are usually referring to SAML 2.0, and we will follow that convention in this post.
1. SAML Overview
SAML is published by the open-standards consortium OASIS (Organization for the Advancement of Structured Information Standards). SAML 1.0 came out in 2002 and SAML 2.0 in 2005, at a time when SSO (Single Sign-On) was being introduced to address the problem of repeated password entry.
At that time, JSON and YAML were not yet widely adopted and XML was popular, so SAML was built on XML.
The major use case scenarios for SAML are: SSO/Cross-Domain SSO and Identity Federation. Let's understand them first before diving into SAML.
1.1 SSO vs. Cross-Domain SSO vs. Identity Federation
When talking about SAML, you will sometimes encounter the so-called 'Cross-Domain SSO' and 'Identity Federation'. It helps to know how SSO, Cross-Domain SSO and Identity Federation relate to each other.
1.1.1 SSO
SSO can refer to the general concept of single sign-on: when you have multiple applications and you sign into one of them, you don't have to sign into the others. These applications could be within the same domain or from different domains. However, when SSO is contrasted with Cross-Domain SSO, the former usually means SSO within the same domain.
Same-domain SSO is typically implemented with a centralized IAM platform, and the other applications redirect users to this platform for authentication. This redirect mechanism could be based on a standard protocol like SAML or OIDC, or it could be custom developed. The goal is to have all apps integrated with the centralized IAM platform so that SSO works across those apps.
1.1.2 Cross-Domain SSO
Cross-Domain SSO, as its name suggests, is SSO across different domains. Applications in different domains could belong to different departments within one organization, but more often than not they belong to other organizations. So here we are talking about SSO across different companies or organizations, whose infrastructure environments could be very different from what you see in same-domain SSO. Custom development is usually limited, as one company won't easily allow another company to access its APIs or other resources. In this case, it's better to use standard protocols like SAML or OIDC. For enterprise-to-enterprise SSO integration, SAML is sometimes preferred over OIDC.
1.1.3 Identity Federation
Identity Federation (or Federated Identity) is a system which allows users from different enterprises (domains) to use the same digital identity to access applications and networks across those enterprises.
In a general sense, SSO is needed to fulfill Identity Federation, and since Identity Federation usually deals with different organizations/enterprises, it is closer to the concept of Cross-Domain SSO.
When identities are federated across multiple organizations, their profile information is shared with those organizations as well, so that the organizations can operate more efficiently using that set of identity profiles. For that to happen, besides the trusted relationship established for Cross-Domain SSO, the organizations also need to agree on what profile information is shared. After that, protocols like SAML or OIDC can be leveraged again to implement it.
1.2.1 IDP (Identity Provider) and SP (Service Provider)
Now, let's jump on the SAML boat. There are two major parties in SAML: IDP and SP. They are both called SAML entities.
IDP (Identity Provider) is the party responsible for authenticating the user, while SP (Service Provider) is the party responsible for providing services. The SP does NOT authenticate the user itself but delegates authentication to the IDP.
The reason for delegating authentication is that the IDP can then act as a centralized platform for authenticating users: when there are multiple SPs, they can all rely on the IDP for authentication. This helps to fulfill SSO, because once a user logs in to the IDP, the IDP maintains an active user session (e.g. a browser cookie), and the next time the same user is redirected to the IDP from another SP, they don't need to log in again.
The IDP is usually connected to a centralized User Directory which holds all the user profile information needed for authentication.
1.2.2 SP-Initiated Flow and IDP-Initiated Flow
Now that we understand what IDP and SP are, we can look at the workflows in SAML. There are two workflows in SAML: SP-Initiated flow and IDP-Initiated flow.
SP-Initiated flow is more frequently seen, and the high-level process is the same as shown in the above diagram. A user first tries to access some service provided by the SP. In order to provide the service, the SP needs to authenticate the user to know who they are and what profile information they possess. To follow the SSO pattern, the SP redirects the user to the IDP for authentication. If the user hasn't authenticated at the IDP before, or the authenticated session has expired, the user is presented with a login page. Once the user has finished authentication, the IDP pulls the user profile information from the centralized user directory and generates a response to send back to the SP. The SP then parses and verifies the response to extract the user information. Since the user is now verified, the SP will often create a user session of its own, so that the next time the user comes, they don't need to be redirected to the IDP again. Note that this SP user session is specific to this SP and is stored locally on the SP side; it is different from the IDP user session.
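The interplay between the IDP session and the SP sessions can be sketched as a toy simulation in Python. This is only an illustration of the idea, not a SAML implementation: the class names are made up for this sketch, no XML or signatures are involved, and the user is identified by a plain username where a real deployment would rely on browser cookies.

```python
class IdentityProvider:
    """Toy IDP: keeps its own sessions and reads profiles from a user directory."""

    def __init__(self, directory):
        self.directory = directory   # username -> profile attributes
        self.sessions = set()        # users with an active IDP session
        self.login_prompts = 0       # how many times a login page was shown

    def authenticate(self, username):
        if username not in self.sessions:
            self.login_prompts += 1          # no IDP session: show the login page
            self.sessions.add(username)
        # Pull the profile from the directory and return it as the "response".
        return dict(self.directory[username])


class ServiceProvider:
    """Toy SP: delegates authentication to the IDP and keeps a local session."""

    def __init__(self, idp):
        self.idp = idp
        self.sessions = {}           # username -> profile taken from the IDP response
        self.redirects = 0           # how many times the user was sent to the IDP

    def access(self, username):
        if username not in self.sessions:
            self.redirects += 1              # no SP session: redirect to the IDP
            self.sessions[username] = self.idp.authenticate(username)
        return self.sessions[username]


idp = IdentityProvider({"alice": {"email": "alice@example.org"}})
mail_sp, wiki_sp = ServiceProvider(idp), ServiceProvider(idp)

mail_sp.access("alice")   # first visit: redirect to the IDP and a login page
wiki_sp.access("alice")   # second SP: redirect, but the IDP session is reused
mail_sp.access("alice")   # SP session already exists: no redirect at all

print(idp.login_prompts, mail_sp.redirects, wiki_sp.redirects)  # prints: 1 1 1
```

The user sees a login page only once even though two SPs are involved, which is the SSO effect described above.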
IDP-Initiated flow is similar to SP-Initiated flow. The difference is that instead of the SP redirecting the user to the IDP for authentication, the user accesses the IDP directly and authenticates there. Then the IDP forwards the authentication result to the SP.
One thing to note is that as an IDP is usually integrated with multiple SPs, it needs to know which SP to forward the authentication result to. This is usually indicated when the user accesses the IDP in the first step.
2. SAML Setup
Now that we have some general understanding of SAML, we can check on how SAML is set up. This setup is needed before the SAML flow can work.
2.1 COT (Circle of Trust)
Before an SP can delegate authentication to an IDP or use the authentication result from it, a trusted relationship needs to be established between the two. This is fairly intuitive, as an SP can't just delegate authentication to some random party or use the authentication result from it. Vice versa, an IDP can't just forward an authentication result to some random SP and disclose the user information. Some form of trusted relationship needs to be established.
Before we look at how the trust is established, let's take a look at the 'end state'. Eventually, multiple applications (SPs) need to establish a trusted relationship with at least one IDP (there could be more, which is referred to as multi-IDP). As illustrated in the above diagram, all SPs and IDPs are in the same league (circle). This basically means that any SP can delegate authentication to an IDP in the circle, and any authentication result sent by that IDP is trusted by the SP (though more validation is needed to make sure the response has not been tampered with, which we will explore later).
The circle of trust (COT) is the prerequisite for either IDP-Initiated or SP-Initiated SAML flow, and a lot of times, a COT already exists and we just need to add the new SP to the COT.
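One way to picture a COT is as a registry of entities that have agreed to trust each other. The Python sketch below is a hypothetical in-memory model: the `CircleOfTrust` class and the `hr.example.org` and `rogue.example.net` identifiers are assumptions made up for illustration, and real products store much more than a set of identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class CircleOfTrust:
    """Hypothetical in-memory model of a COT: sets of trusted entity identifiers."""
    idps: set = field(default_factory=set)
    sps: set = field(default_factory=set)

    def add_idp(self, entity_id: str) -> None:
        self.idps.add(entity_id)

    def add_sp(self, entity_id: str) -> None:
        self.sps.add(entity_id)

    def sp_trusts_response_from(self, sp: str, idp: str) -> bool:
        # A response is acceptable only if both parties are members of the circle.
        return sp in self.sps and idp in self.idps


cot = CircleOfTrust()
cot.add_idp("https://idp.example.org/idp.xml")
cot.add_sp("https://sp.example.org/sp.xml")

# Adding a new SP to an already existing COT is a single registration step.
cot.add_sp("https://hr.example.org/sp.xml")

print(cot.sp_trusts_response_from("https://hr.example.org/sp.xml",
                                  "https://idp.example.org/idp.xml"))    # True
print(cot.sp_trusts_response_from("https://hr.example.org/sp.xml",
                                  "https://rogue.example.net/idp.xml"))  # False
```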
2.2 SAML Metadata
Now that we know a COT is needed before any SAML flow, the question becomes how to establish this trust. The answer is a method called Metadata Exchange. Let's first understand what SAML metadata is.
SAML metadata is an XML-based configuration file which includes various information about a SAML entity (IDP or SP), such as its identifier, binding endpoints, certificates and keys. Since IDP and SP serve different purposes in a SAML flow, their metadata naturally contains different information. On the other hand, they share some format and content as well.
Here is an example of IDP and SP metadata combined. We can see the combined version of IDP and SP metadata from time to time. This is because an IDP platform can also serve as an SP. However, an SP doesn't usually serve as an IDP. We show a combined version here for conciseness.
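As a minimal illustration, the Python sketch below embeds a small hand-written metadata document for one IDP and one SP and lists their entity IDs using only the standard library. The endpoint URLs (`/sso`, `/acs`) are assumed values for illustration, and the KeyDescriptor elements are left out to keep it short.

```python
import xml.etree.ElementTree as ET

MD = "urn:oasis:names:tc:SAML:2.0:metadata"

# Hand-written illustrative metadata; endpoint locations are assumptions.
METADATA = f"""\
<md:EntitiesDescriptor xmlns:md="{MD}">
  <md:EntityDescriptor entityID="https://idp.example.org/idp.xml">
    <md:IDPSSODescriptor
        protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
      <md:NameIDFormat>urn:oasis:names:tc:SAML:2.0:nameid-format:transient</md:NameIDFormat>
      <md:SingleSignOnService
          Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
          Location="https://idp.example.org/sso"/>
    </md:IDPSSODescriptor>
  </md:EntityDescriptor>
  <md:EntityDescriptor entityID="https://sp.example.org/sp.xml">
    <md:SPSSODescriptor
        protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
      <md:NameIDFormat>urn:oasis:names:tc:SAML:2.0:nameid-format:transient</md:NameIDFormat>
      <md:AssertionConsumerService index="0"
          Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
          Location="https://sp.example.org/acs"/>
    </md:SPSSODescriptor>
  </md:EntityDescriptor>
</md:EntitiesDescriptor>
"""

ns = {"md": MD}
root = ET.fromstring(METADATA)

entities = []
for entity in root.findall("md:EntityDescriptor", ns):
    # The role descriptor present tells us whether this entity is an IDP or an SP.
    role = "IDP" if entity.find("md:IDPSSODescriptor", ns) is not None else "SP"
    entities.append((role, entity.get("entityID")))

for role, entity_id in entities:
    print(role, entity_id)
```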
2.2.1 General Format
The general format of metadata for IDP looks like:
EntityDescriptor
    IDPSSODescriptor
        Extensions (optional, IdP discovery and algorithm support, not much used today)
        KeyDescriptor (optional, but usually included)
        ArtifactResolutionService (optional, legacy and not much used today)
        SingleLogoutService (optional)
        NameIDFormat (optional)
        SingleSignOnService (required at least one)
    Organization (Not Important)
    ContactPerson (Not Important)
The general format of metadata for SP looks like:
EntityDescriptor
    SPSSODescriptor
        Extensions (optional, service discovery, algorithm support and entity category, not much used today)
        KeyDescriptor (optional, but usually included)
        SingleLogoutService (optional)
        NameIDFormat (optional)
        AssertionConsumerService (required at least one)
    Organization (Not Important)
    ContactPerson (Not Important)
The most important parts of both IDP and SP metadata are the entityID and the SSO Descriptor. Organization and ContactPerson are reference information and are usually ignored during SAML integration.
2.2.2 Unique Identifier of SAML Entity
Each SAML entity (SP or IDP) that participates in the SAML protocol should have a unique identifier. This identifier is a URI of at most 1024 characters. In the above example, the IDP unique identifier (entityID) is "https://idp.example.org/idp.xml", while the SP unique identifier (entityID) is "https://sp.example.org/sp.xml". Each is an attribute of <EntityDescriptor>, which is the root element of either IDP metadata or SP metadata.
It might look weird to use a URL as an identifier at first. But since each application usually has a unique home URL and that URL represents the application well, it makes sense to use a URL as the entity identifier. In fact, that is what you see most of the time in practice.
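The two constraints on an entityID (it is a URI, and it has at most 1024 characters) can be checked with a few lines of Python. This is a deliberately loose sketch: the function name is made up for this post, and looking only for a URI scheme is an assumption of this sketch, not a full URI validation.

```python
from urllib.parse import urlparse

MAX_ENTITY_ID_LENGTH = 1024


def is_valid_entity_id(entity_id: str) -> bool:
    """Loose check: non-empty, at most 1024 characters, and has a URI scheme."""
    if not entity_id or len(entity_id) > MAX_ENTITY_ID_LENGTH:
        return False
    # urlparse returns an empty scheme for strings such as "sp.example.org/sp.xml".
    return bool(urlparse(entity_id).scheme)


print(is_valid_entity_id("https://sp.example.org/sp.xml"))           # True
print(is_valid_entity_id("urn:example:sp"))                          # True, URNs are URIs too
print(is_valid_entity_id("sp.example.org/sp.xml"))                   # False, no scheme
print(is_valid_entity_id("https://sp.example.org/" + "a" * 1100))    # False, too long
```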
2.2.3 SSO Descriptor
The SSO Descriptor contains Extensions, KeyDescriptor, ArtifactResolutionService (IDP), SingleLogoutService, NameIDFormat, SingleSignOnService (IDP) and AssertionConsumerService (SP).
Extensions
The extensions part can contain IDP discovery information, supported algorithms and so on. It is not much used today and is usually omitted.
Key Descriptor
This part of the metadata carries one or more public keys (usually wrapped in X.509 certificates) used in the SAML flow. SAML requests and responses can be signed as well as encrypted. The receiver needs the sender's public key to verify a signature, and the sender needs the receiver's public key to encrypt; decryption is then done with the receiver's private key, which never appears in metadata. The certificate for verifying a signature is often included inline with the request/response. However, checking it against the keys pre-registered through metadata prevents an attacker from presenting their own key in the SAML request/response.
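The idea of trusting only pre-registered keys can be sketched in Python as follows. The sketch only compares an inline certificate with the signing certificates found in metadata; it does not verify any signature, which would need a proper XML signature library. The certificate values are placeholder strings rather than real certificates, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# Illustrative IDP metadata; the certificate bodies are placeholders.
IDP_METADATA = """\
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    entityID="https://idp.example.org/idp.xml">
  <md:IDPSSODescriptor
      protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:KeyDescriptor use="signing">
      <ds:KeyInfo><ds:X509Data>
        <ds:X509Certificate>
          PLACEHOLDER
          SIGNINGCERT
        </ds:X509Certificate>
      </ds:X509Data></ds:KeyInfo>
    </md:KeyDescriptor>
    <md:KeyDescriptor use="encryption">
      <ds:KeyInfo><ds:X509Data>
        <ds:X509Certificate>PLACEHOLDERENCRYPTIONCERT</ds:X509Certificate>
      </ds:X509Data></ds:KeyInfo>
    </md:KeyDescriptor>
  </md:IDPSSODescriptor>
</md:EntityDescriptor>
"""


def normalize(cert_b64: str) -> str:
    """Strip whitespace and line breaks so certificates compare reliably."""
    return "".join(cert_b64.split())


def registered_signing_certs(metadata_xml: str) -> set:
    root = ET.fromstring(metadata_xml)
    certs = set()
    for key_descriptor in root.iterfind(".//md:KeyDescriptor", NS):
        # A KeyDescriptor without a "use" attribute applies to signing and encryption.
        if key_descriptor.get("use") in (None, "signing"):
            for cert in key_descriptor.iterfind(".//ds:X509Certificate", NS):
                certs.add(normalize(cert.text or ""))
    return certs


def is_registered(inline_cert_b64: str, metadata_xml: str) -> bool:
    """True only if the inline certificate matches one registered in metadata."""
    return normalize(inline_cert_b64) in registered_signing_certs(metadata_xml)


print(is_registered("PLACEHOLDERSIGNINGCERT", IDP_METADATA))     # True
print(is_registered("ATTACKERSUPPLIEDCERT", IDP_METADATA))       # False
print(is_registered("PLACEHOLDERENCRYPTIONCERT", IDP_METADATA))  # False, not a signing key
```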
ArtifactResolutionService (ARS)
This is the IDP endpoint for resolving the artifact when HTTP Artifact Binding is used in the SAML flow. The binding method basically determines how SAML messages are transported, for example how the IDP sends the authenticated user information back to the SP.
There are two ways of binding in SAML: HTTP Front-Channel (Redirect or POST) and HTTP Back-Channel (Artifact Binding). The most frequently used is HTTP Front-Channel with the POST method. We will look more into this later.
SingleLogoutService
This endpoint is for the Single Logout Service. When Single Logout is implemented, both IDP and SP should expose a SingleLogoutService endpoint. When a single logout is triggered at the IDP's SingleLogoutService endpoint, the IDP logs the user off and also sends requests to the SingleLogoutService endpoints of the SPs to log the user off on the SP side. Note that not all SAML implementations support Single Logout.
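The fan-out behavior can be sketched as a small Python simulation. Everything here is hypothetical: the `IdpLogoutCoordinator` class, the `/slo` endpoint URLs and the `fake_send` function stand in for real logout messages, and no network call is made.

```python
class IdpLogoutCoordinator:
    """Toy model of the IDP side of Single Logout."""

    def __init__(self):
        # username -> SLO endpoints of the SPs the user has signed into
        self.sp_endpoints = {}

    def record_sso(self, username, sp_slo_endpoint):
        self.sp_endpoints.setdefault(username, []).append(sp_slo_endpoint)

    def single_logout(self, username, send_logout_request):
        # Ending the IDP-side session: forget the user's SP list, then notify each SP.
        endpoints = self.sp_endpoints.pop(username, [])
        return [ep for ep in endpoints if send_logout_request(ep, username)]


notified = []


def fake_send(endpoint, username):
    """Stand-in for sending a logout request; always reports success."""
    notified.append((endpoint, username))
    return True


coordinator = IdpLogoutCoordinator()
coordinator.record_sso("alice", "https://sp.example.org/slo")
coordinator.record_sso("alice", "https://hr.example.org/slo")

logged_out = coordinator.single_logout("alice", fake_send)
print(logged_out)
```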
NameIDFormat
The NameIDFormat defines the format of the NameID value in the IDP response to the SP. The IDP presents a list of NameIDFormats in its metadata to show which formats it supports in its response. The SP presents a list of NameIDFormats to show which formats it can request and parse in the IDP response.
Some examples:
urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified: Indicates that the format of the NameID is not specified.
urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress: Indicates that the NameID is an email address.
urn:oasis:names:tc:SAML:1.1:nameid-format:X509SubjectName: Indicates that the NameID is an X.509 distinguished name.
urn:oasis:names:tc:SAML:2.0:nameid-format:kerberos: Indicates that the NameID is a Kerberos principal name.
urn:oasis:names:tc:SAML:2.0:nameid-format:persistent: Indicates that the NameID is an opaque, persistent identifier that stays the same across sessions, typically specific to one IDP and SP pair.
urn:oasis:names:tc:SAML:2.0:nameid-format:transient: Indicates that the NameID is an opaque, temporary identifier that is not intended to be persisted.
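Given the two lists, picking a format both sides understand is a simple matching exercise. The Python sketch below assumes one possible policy, namely taking the first format in the SP's list that the IDP also supports; real products may choose differently or make this configurable, and the function name is made up for illustration.

```python
EMAIL = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
PERSISTENT = "urn:oasis:names:tc:SAML:2.0:nameid-format:persistent"
TRANSIENT = "urn:oasis:names:tc:SAML:2.0:nameid-format:transient"


def choose_name_id_format(sp_formats, idp_formats):
    """Return the first SP-listed format the IDP supports, or None if none match."""
    supported = set(idp_formats)
    for fmt in sp_formats:
        if fmt in supported:
            return fmt
    return None


idp_formats = [TRANSIENT, PERSISTENT]        # as listed in the IDP metadata
sp_formats = [EMAIL, PERSISTENT, TRANSIENT]  # as listed in the SP metadata

chosen = choose_name_id_format(sp_formats, idp_formats)
print(chosen)  # urn:oasis:names:tc:SAML:2.0:nameid-format:persistent
```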
SingleSignOnService
This is the IDP's endpoint, which is used as the entry URL to start the SSO service. For example, in the SP-Initiated flow, the SP sends its request to this endpoint to start the flow. At least one value is required in the IDP metadata.
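To make this concrete, here is a rough Python sketch of how an SP could build the URL that sends the user to this endpoint when the HTTP-Redirect binding is used: the request XML is compressed with raw DEFLATE, Base64-encoded and passed as the `SAMLRequest` query parameter. The endpoint URLs and function names are assumptions for illustration, the request is unsigned and minimal, and the SSO URL is assumed to have no query string of its own.

```python
import base64
import uuid
import xml.etree.ElementTree as ET
import zlib
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlparse

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"


def build_redirect_url(sso_url, sp_entity_id, acs_url):
    """Build an unsigned, minimal AuthnRequest and wrap it in a redirect URL."""
    request = ET.Element(f"{{{SAMLP}}}AuthnRequest", {
        "ID": "_" + uuid.uuid4().hex,
        "Version": "2.0",
        "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Destination": sso_url,
        "AssertionConsumerServiceURL": acs_url,
    })
    issuer = ET.SubElement(request, f"{{{SAML}}}Issuer")
    issuer.text = sp_entity_id

    # HTTP-Redirect binding: raw DEFLATE (no zlib header), then Base64, then URL-encode.
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(ET.tostring(request)) + compressor.flush()
    saml_request = base64.b64encode(deflated).decode("ascii")
    return sso_url + "?" + urlencode({"SAMLRequest": saml_request})


def decode_redirect_url(url):
    """Reverse the encoding, roughly as the IDP would on receipt."""
    value = parse_qs(urlparse(url).query)["SAMLRequest"][0]
    return zlib.decompress(base64.b64decode(value), -15)


url = build_redirect_url(
    "https://idp.example.org/sso",       # assumed SingleSignOnService Location
    "https://sp.example.org/sp.xml",     # SP entityID
    "https://sp.example.org/acs",        # assumed AssertionConsumerService Location
)
decoded = ET.fromstring(decode_redirect_url(url))
print(decoded.get("Destination"))
print(decoded.find(f"{{{SAML}}}Issuer").text)
```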
AssertionConsumerService (ACS)
This is the SP's endpoint. It is used by the IDP when forwarding its SAML response to the SP, where the SP consumes the response. At least one value is required in the SP metadata.
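Because the ACS URLs are registered in metadata, an IDP can refuse to send a response anywhere else. The Python sketch below shows one possible way to pick the ACS URL: honor a requested URL only if it is registered, otherwise fall back to the endpoint flagged `isDefault="true"` or the first one listed. The metadata content, the URLs and the function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"md": "urn:oasis:names:tc:SAML:2.0:metadata"}

# Illustrative SP metadata with two ACS endpoints.
SP_METADATA = """\
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    entityID="https://sp.example.org/sp.xml">
  <md:SPSSODescriptor
      protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:AssertionConsumerService index="0" isDefault="true"
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://sp.example.org/acs"/>
    <md:AssertionConsumerService index="1"
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Artifact"
        Location="https://sp.example.org/acs-artifact"/>
  </md:SPSSODescriptor>
</md:EntityDescriptor>
"""


def acs_endpoints(sp_metadata_xml):
    root = ET.fromstring(sp_metadata_xml)
    return root.findall("md:SPSSODescriptor/md:AssertionConsumerService", NS)


def pick_acs_url(sp_metadata_xml, requested_url=None):
    endpoints = acs_endpoints(sp_metadata_xml)
    if requested_url is not None:
        # Only honor a requested URL that was pre-registered in metadata.
        registered = {e.get("Location") for e in endpoints}
        return requested_url if requested_url in registered else None
    # Otherwise prefer the endpoint flagged isDefault="true", else the first one.
    for endpoint in endpoints:
        if endpoint.get("isDefault") == "true":
            return endpoint.get("Location")
    return endpoints[0].get("Location") if endpoints else None


print(pick_acs_url(SP_METADATA))                                         # https://sp.example.org/acs
print(pick_acs_url(SP_METADATA, "https://sp.example.org/acs-artifact"))  # https://sp.example.org/acs-artifact
print(pick_acs_url(SP_METADATA, "https://evil.example.net/acs"))         # None
```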
3. Sum Up
In this post, we looked at another essential protocol in IAM: SAML 2.0. We took a quick peek at SSO and Cross-Domain SSO, and then jumped into the two parties in SAML: IDP and SP. In order for IDP and SP to work together, a trusted relationship needs to be established between them. The trusted group can involve multiple IDPs and SPs and is called a Circle of Trust. Then we looked at IDP and SP metadata in more detail.
We will continue our SAML exploration in SAML 2.0 Explained in Simple Words - Part II.