SSL certificate basics

An SSL certificate is a sort of ID card. Technically it consists of a set of identifying information and parameters, which is then signed by an issuer after they verify that the information matches the person or entity requesting the certificate. The signature is created by taking a cryptographic hash of the certificate's contents and then encrypting that hash with the issuer's private key. Anyone wanting to verify a certificate can use the issuer's public key to decrypt the signature, then take the same cryptographic hash of the information in the certificate being presented and compare it to the decrypted hash. The two will only match if the information being presented is the same as what the issuer saw. If anyone's altered the information in the certificate, the hashes won't match and you'll know the certificate isn't valid. If someone other than the issuer tries to replace the signature, they won't have the private key matching the issuer's public key, so when the signature is decrypted the result won't be the hash of the certificate and the comparison will fail.
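The hash-then-sign flow described above can be sketched in a few lines of Python. This is a toy illustration only: it uses textbook RSA built from well-known Mersenne primes with no padding, which is not secure, and the names make_keypair, digest, sign and verify are made up for this example. Real certificates rely on a vetted cryptographic library and a padded signature scheme.

```python
import hashlib

def make_keypair(p, q, e=65537):
    """Build a toy RSA key pair from two primes (illustrative only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent; needs Python 3.8+
    return (n, e), (n, d)              # (public key, private key)

def digest(data, n):
    """Hash the certificate contents and reduce the hash to a number below n."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data, private_key):
    """Issuer side: hash the data, then transform the hash with the private key."""
    n, d = private_key
    return pow(digest(data, n), d, n)

def verify(data, signature, public_key):
    """Verifier side: undo the signature with the public key and compare to a fresh hash."""
    n, e = public_key
    return pow(signature, e, n) == digest(data, n)

issuer_public, issuer_private = make_keypair(2**61 - 1, 2**89 - 1)
certificate = b"C=US, ST=CA, O=Silverglass Technical, CN=arachnae.silverglass.org"
signature = sign(certificate, issuer_private)

# Unchanged certificate, genuine signature.
print(verify(certificate, signature, issuer_public))

# Same signature, but the certificate contents were altered afterwards.
altered = certificate.replace(b"C=US", b"C=XX")
print(verify(altered, signature, issuer_public))

# Someone without the issuer's private key signs with a key of their own.
_, attacker_private = make_keypair(2**107 - 1, 2**127 - 1)
forged = sign(certificate, attacker_private)
print(verify(certificate, forged, issuer_public))
```

Only the first check should succeed. The other two fail for the two reasons given above: the altered contents no longer hash to the signed value, and a signature made with the wrong private key doesn't decrypt to the right hash.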

The identifying information is held in what's called Distinguished Name format, which comes from the X.500 directory standards (the same naming scheme LDAP uses). In practice it looks something like this:

C=US, ST=CA, O=Silverglass Technical, OU=silverglass.org, CN=arachnae.silverglass.org

This would be a valid distinguished name used for one of my own server certificates. C is country, US. ST is state, California. O is organization. OU is organizational unit, a division within the organization, in this case the domain I'm generating certificates for. CN is the common name, which might be the actual name of a person or, in this case, the server name for a Web server's certificate. If you use the OpenSSL x509 command to dump out a certificate, one piece of data you'll see is the Subject entry. This is the DN for the entity the certificate was issued to. Another is the Issuer entry, which is the DN for the certificate authority that issued the certificate and signed it. There's also a Validity entry, which is where you'll find the datestamps indicating when a certificate is valid. Outside the span indicated in the Validity entry the certificate is not valid. That means that certificates have to be reissued every so often as they reach their maximum valid lifetime and expire. That's to ensure that if someone does gain access to and copy your keys they can't use them forever.
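As a rough sketch of how software might pick a DN apart and apply the Validity window, here is a small Python example. The parser is deliberately naive (it assumes a comma never appears inside a value), the function names are made up for this example, and the dates are hypothetical rather than taken from a real certificate.

```python
from datetime import datetime, timezone

def parse_dn(dn):
    """Split 'KEY=value, KEY=value' into (key, value) pairs, in order.

    Naive on purpose: it does not handle escaped commas or quoted values.
    """
    pairs = []
    for part in dn.split(","):
        key, _, value = part.strip().partition("=")
        pairs.append((key, value))
    return pairs

def is_within_validity(not_before, not_after, now=None):
    """True only while 'now' falls inside the certificate's Validity span."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_before <= now <= not_after

subject = "C=US, ST=CA, O=Silverglass Technical, OU=silverglass.org, CN=arachnae.silverglass.org"
for key, value in parse_dn(subject):
    print(f"{key:>2} = {value}")

# Hypothetical Validity entry: one year starting 1 January 2024.
not_before = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(is_within_validity(not_before, not_after, datetime(2024, 6, 1, tzinfo=timezone.utc)))
print(is_within_validity(not_before, not_after, datetime(2025, 6, 1, tzinfo=timezone.utc)))
```

The first check falls inside the window and the second falls after it, which is the point at which the certificate would need to be reissued.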

DNs come into play when it comes to the basic question: how do you verify that an SSL certificate is valid? The answer comes from the signature chain. Most certificates are signed by someone, either an issuing certificate authority or themselves. It's possible to have an unsigned certificate, but it's insecure because without a signature the information can easily be altered. A self-signed certificate isn't much use in verifying identity, but it does let you confirm that the information in it hasn't been changed by anyone except the certificate's owner (who's in theory the only one with access to the private key you'd need to generate the signature). Self-signed certificates come into play in two cases: quick certificates you might generate internally where you just need a certificate and don't need to confirm any identity, and the master certificates used by the very top-level certificate authorities like Verisign. A certificate chain looks like this:

The server certificate of a Web server is at the bottom. It's signed by CA 2. CA 2's certificate is signed by CA 1. CA 1's certificate is signed by a root certificate authority. The root authority's certificate is signed by itself. The server certificate carries this chain with it: it comes with the certificates for all the CAs above it in the chain. That way client software like a Web browser has everything it needs to verify the entire chain.

The process involves using what's called the root certificate authority bundle. That's a collection of the known valid certificates for the well-known certificate authorities like Verisign, Thawte and their ilk. The client starts with the server certificate and, for each certificate in the chain, compares the DN in the Issuer field with the DNs in the Subject fields of the certificates in its root CA bundle. If it finds a match in the bundle, it compares the copy of that issuing certificate in the chain with the copy in its bundle. If they're identical, it knows that certificate in the chain is valid. Then it can work back down the chain verifying the signature at each level using the issuing certificate above it in the chain. If all the signatures match, the server certificate is valid.

Usually the match happens near the top of the chain, matching the root certificate from a major CA. Most of the time there'll be several intermediate levels: a large company like Citibank might get a certificate from Verisign and then use that certificate to issue a certificate to its retail banking division, which in turn would issue certificates to its various IT operations, who in turn would issue server certificates for the Web servers they're responsible for. Your browser's root CA bundle would only have Verisign's certificate in it, so the chain for your bank's Web server SSL certificate would have 3 certificates between it and the Verisign root certificate (the IT department certificate, the retail banking division certificate, and finally Citibank's overall certificate).
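The validation walk described above can be modelled in Python roughly as follows. This is a simplified sketch under several assumptions: certificates are plain records rather than real X.509 structures, signatures use the same kind of toy, insecure textbook RSA built from Mersenne primes, the chain is ordered from the server certificate upwards, and every name here (Cert, issue, validate and so on) is invented for the illustration.

```python
import hashlib
from dataclasses import dataclass, replace

def make_keypair(p, q, e=65537):
    """Toy RSA key pair (not secure): returns (public, private)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

@dataclass(frozen=True)
class Cert:
    subject: str       # DN of the entity the certificate was issued to
    issuer: str        # DN of the authority that signed it
    public_key: tuple  # the subject's public key (n, e)
    signature: int = 0

def body_hash(cert, n):
    """Hash everything except the signature, reduced below the issuer's modulus."""
    data = f"{cert.subject}|{cert.issuer}|{cert.public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def issue(subject, subject_public, issuer_dn, issuer_private):
    """Create a certificate for 'subject' signed with the issuer's private key."""
    n, d = issuer_private
    unsigned = Cert(subject, issuer_dn, subject_public)
    return replace(unsigned, signature=pow(body_hash(unsigned, n), d, n))

def signature_ok(cert, issuer_cert):
    n, e = issuer_cert.public_key
    return pow(cert.signature, e, n) == body_hash(cert, n)

def validate(chain, bundle):
    """chain runs from the server certificate upwards; bundle holds trusted roots."""
    trusted = {c.subject: c for c in bundle}
    for i, cert in enumerate(chain):
        anchor = trusted.get(cert.issuer)  # Issuer DN vs. Subject DNs in the bundle
        if anchor is None:
            continue
        # If the chain carries its own copy of the issuer, it must match the bundle's.
        if i + 1 < len(chain) and chain[i + 1] != anchor:
            return False
        # Work back down the chain, checking each signature against its issuer.
        issuer = anchor
        for c in reversed(chain[: i + 1]):
            if c.issuer != issuer.subject or not signature_ok(c, issuer):
                return False
            issuer = c
        return True
    return False  # nothing in the chain leads to a trusted certificate

root_pub, root_priv = make_keypair(2**61 - 1, 2**89 - 1)
ca1_pub, ca1_priv = make_keypair(2**107 - 1, 2**127 - 1)
ca2_pub, ca2_priv = make_keypair(2**521 - 1, 2**607 - 1)
server_pub, _ = make_keypair(2**89 - 1, 2**107 - 1)

root = issue("CN=Example Root CA", root_pub, "CN=Example Root CA", root_priv)  # self-signed
ca1 = issue("CN=CA 1", ca1_pub, root.subject, root_priv)
ca2 = issue("CN=CA 2", ca2_pub, ca1.subject, ca1_priv)
server = issue("CN=www.example.com", server_pub, ca2.subject, ca2_priv)
chain = [server, ca2, ca1, root]

print(validate(chain, [root]))                        # chain leads to a trusted root
tampered = replace(server, subject="CN=www.evil.example")
print(validate([tampered, ca2, ca1, root], [root]))   # altered server certificate
print(validate(chain, []))                            # no trusted roots at all
```

Only the first call should succeed. Note that the root's own self-signature is never checked here: the root is trusted purely because it is in the bundle.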

Generally when putting certificates into the root CA bundle you want to take the highest ones in the food chain. If you're creating an infrastructure for the company, create one certificate to use to set up all the issuing authorities and put that certificate in the bundle. That way when you create a new authority your software will recognize it immediately. If you add server certificates themselves to the bundle, you'll have to update the bundle every time you add a new server or reissue a certificate to a server. It very quickly becomes infeasible to maintain that.

The CN, or common name, has some canonical uses. For server certificates, convention is that the CN is the DNS name of the server. Most client software enforces this by checking the CN of a server certificate against the name of the host it's trying to contact, and if the two don't match (e.g. the client tried to talk to www.example.com and got a certificate with a CN of www.silverglass.org) it rejects the certificate. For non-server certificates, the CN would be a human-readable name for whoever or whatever owns the certificate. For instance, the CN of a personal certificate issued to me (for signing e-mail) might say "Todd Knarr". The root CA certificate for an organization might have a CN of "OrganizationName Certificate Authority master root certificate".
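A minimal sketch of that host name check in Python might look like this. It only does an exact, case-insensitive comparison of the CN against the requested host; real client software also deals with wildcards and other name fields, which this example ignores. The function names are made up for the illustration.

```python
def common_name(dn):
    """Return the CN value from a 'KEY=value, ...' DN string, or None if absent."""
    for part in dn.split(","):
        key, _, value = part.strip().partition("=")
        if key.upper() == "CN":
            return value
    return None

def hostname_matches(subject_dn, requested_host):
    """Accept the certificate only if its CN equals the host we tried to reach."""
    cn = common_name(subject_dn)
    return cn is not None and cn.lower() == requested_host.lower()

subject = "C=US, ST=CA, O=Silverglass Technical, OU=silverglass.org, CN=www.silverglass.org"
print(hostname_matches(subject, "www.silverglass.org"))  # names agree
print(hostname_matches(subject, "www.example.com"))      # mismatch: reject
```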

Certificates can be marked for particular purposes using X509v3 extensions. The Basic Constraints extension flags a certificate as being for a CA or not, while the Key Usage and Extended Key Usage extensions mark it for use in signing e-mail, for SSL server or client authentication and so on. By default OpenSSL generates certificates with only the CA flag set, to false, so generated certificates can't be used to issue other certificates. With no other purpose/type constraints, the certificates are usable for any purpose. You can set the constraints to restrict certificates to e.g. only server authentication, or only e-mail signing. That's probably a good idea if you're developing a comprehensive certificate infrastructure, but not needed for simple setups where all you need is a few Web server certificates.
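The effect of the CA flag and the purpose restrictions described above can be modelled with a small data structure. This is only a conceptual sketch: the Constraints class and the purpose labels are invented for the example and are not OpenSSL identifiers or real extension encodings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Constraints:
    is_ca: bool = False                # may this certificate sign other certificates?
    purposes: frozenset = frozenset()  # empty set: no purpose restrictions at all

def may_issue_certificates(constraints):
    """Only a certificate flagged as a CA can act as an issuer."""
    return constraints.is_ca

def usable_for(constraints, purpose):
    """Unrestricted certificates fit any purpose; restricted ones only those listed."""
    return not constraints.purposes or purpose in constraints.purposes

default_cert = Constraints()  # CA flag false, nothing else set
web_only = Constraints(purposes=frozenset({"server-auth"}))

print(may_issue_certificates(default_cert))      # CA flag is false
print(usable_for(default_cert, "email-signing")) # no restrictions, so any purpose
print(usable_for(web_only, "server-auth"))       # listed purpose
print(usable_for(web_only, "email-signing"))     # not listed, so refused
```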

SSL is completely symmetric when it comes to authentication. Most commonly you'll encounter certificates presented by the server and used by the client to verify the identity of the server, but the protocol allows for the client to have a certificate it presents to the server so the server can verify the client's identity. Mostly those see use in SSL VPNs, but they can be used with regular Web sites with appropriate configuration of the Web server software.
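For a concrete feel of the client-certificate side, here is a sketch using Python's standard ssl module. It is one possible setup, not a recommended configuration: the function names are made up, and the file path parameters are placeholders you would supply yourself (when left as None, nothing is loaded, so the contexts can be built without any certificate files on hand).

```python
import ssl

def server_context_requiring_client_cert(certfile=None, keyfile=None, client_ca_file=None):
    """Server side: present our own certificate and insist the client presents one too."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # handshake fails if the client has no valid cert
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)            # the server's own certificate
    if client_ca_file:
        ctx.load_verify_locations(cafile=client_ca_file)  # CAs trusted to issue client certs
    return ctx

def client_context_with_cert(certfile=None, keyfile=None):
    """Client side: verify the server as usual, and offer a certificate of our own."""
    ctx = ssl.create_default_context()  # verifies the server and checks its host name
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)  # the certificate the client will present
    return ctx

server_ctx = server_context_requiring_client_cert()
client_ctx = client_context_with_cert()
print(server_ctx.verify_mode == ssl.CERT_REQUIRED)
print(client_ctx.check_hostname)
```

The contexts would then be used to wrap ordinary sockets on each side; both ends run the same kind of certificate validation on what the other presents.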

The Flaw

If you thought about how certificate validation works, you may have thought to yourself that there's a flaw in there: if you completely replace the certificate and get it signed, you can put anything you like in it and it'll still pass validation. You'd be correct. The entire system rests on the idea that issuers will not sign certificates unless they've positively identified the party presenting the certificate as the same one named in the identifying information in the certificate. While it's a nice idea, it's been demonstrated repeatedly that it can't be relied on. Even the most obvious case, an issuer simply failing to check identity, has happened on more than one occasion (the most embarrassing one being where certificates in the name of Microsoft were issued to criminals by an issuer listed in the default root certificate bundle distributed with Windows and Internet Explorer). Certificates have been wrongly issued with permissible uses including signing other certificates, allowing the parties who received them to become issuers themselves (as far as any validation software was concerned, at least). And major certificate authorities have had their signing certificates compromised, with criminals gaining access to the private keys used by the CA itself to sign certificates. Once that happens, the criminals can issue certificates that come from the compromised CA, and all bets are off.

This flaw goes uncorrected because of two things. Firstly, most of the time it doesn't matter. SSL is used not to verify the identity of the other end of the connection but just to protect the connection from eavesdropping by third parties. This breaks down, though, when you think about criminals sending malicious e-mails designed to trick you into going to the criminal's Web sites thinking you're going to your bank's site or something. If you aren't thinking about verifying the identity of the server, you're leaving yourself vulnerable to this. Which leads us to the second reason: identity verification is hard. With multiple layers of certificate chain involved and many entities in there that you probably weren't aware were involved and don't recognize off-hand, it's non-trivial to figure out what the signature chain should look like and who should be on it. The whole situation is made workable only if you assume absolute trust in the root authorities, so that's what most people do and never think further on it.

If you truly want to be secure, to know that the other end of the link is who you think they are, you have to do exactly the opposite of what's usually done. There are several variations, but they all boil down to individual entities like your bank generating their own certificates and signing them themselves, and you obtaining copies of their certificates directly from them via another channel and validating directly against those copies without reliance on any other certificate authority. That, though, involves end users dealing directly with adding certificates to their software, and changes in the user interface to require users to mark which other parties each certificate is valid for (so that e.g. Google's certificate is only used to authenticate Web sites in Google's domain and some other domain presenting a Google certificate would trigger an error). All that requires too much work and too much thinking for most people, so we're left with the current system with all its flaws.
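One way to picture that alternative is a small store of certificates obtained directly from their owners, each tied to the domain it's allowed to vouch for. The Python sketch below assumes you already have the raw certificate bytes from a trusted side channel. The byte strings are stand-ins for real certificates, and the function names and store layout are invented for this example.

```python
import hashlib

def fingerprint(cert_bytes):
    """SHA-256 fingerprint of a certificate's raw bytes."""
    return hashlib.sha256(cert_bytes).hexdigest()

def host_in_domain(host, domain):
    """True if host is the domain itself or a name underneath it."""
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)

def check_pinned(host, presented_cert, pins):
    """pins maps a domain to the fingerprint of the copy obtained directly from its owner.

    The presented certificate is accepted only if the host belongs to a pinned
    domain AND the certificate is the one pinned for that domain.
    """
    presented = fingerprint(presented_cert)
    for domain, pinned in pins.items():
        if host_in_domain(host, domain):
            return presented == pinned
    return False  # no pinned certificate covers this host

google_cert = b"stand-in bytes for Google's certificate"
pins = {"google.com": fingerprint(google_cert)}

print(check_pinned("mail.google.com", google_cert, pins))             # right cert, right domain
print(check_pinned("www.example.com", google_cert, pins))             # Google's cert, wrong domain
print(check_pinned("mail.google.com", b"some other certificate", pins))  # wrong cert
```

No certificate authority appears anywhere in this check, which is the whole point, and also why the burden of collecting and scoping the certificates lands on the user.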