The NFC handshake, Part 1: what happens in the air
In the 300 milliseconds a TfL contactless transaction takes to complete, a reader and a card negotiate, authenticate, and exchange enough cryptographic material to commit to a financial transaction. Here is what actually happens in that window.
The TfL contactless system has a hard performance constraint: the card must complete its transaction before the gate can open, and the gate cannot hold each passenger long enough for a queue to form. Three hundred milliseconds is the design budget. In that window, the reader and the card have to find each other in the RF field, select the application, authenticate, authorise, and communicate the result to the gate controller. Everything in the design of EMV contactless, and everything in the design of the NFC hardware that implements it, is shaped by that constraint.
I spent time working on TfL's contactless payment architecture. What I want to write about here is not the system-level design — that is a separate and complex subject — but the lower-level exchange: what actually happens in the air between the reader and the card, and why it is designed the way it is.
The RF exchange
EMV contactless operates over ISO/IEC 14443, the proximity-card standard that NFC builds on. The reader generates an RF field at 13.56 MHz. The card is passive: it harvests energy from that field to power its chip. The communication is half-duplex, so reader and card take turns. The reader initiates; the card responds. The physical layer is designed to be fast, to be reliable at short range, and to fail gracefully when the card is removed from the field mid-transaction.
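To get a feel for how much of the budget the air interface itself consumes, here is a rough back-of-the-envelope sketch. It assumes the base ISO/IEC 14443 bit rate of the carrier frequency divided by 128 (about 106 kbit/s) and roughly nine bits on the air per byte (eight data bits plus a parity bit). The function name and the message sizes are illustrative, not measured TfL figures, and the estimate ignores framing overhead, card processing time, and the mandatory delays between frames.

```python
CARRIER_HZ = 13_560_000            # ISO/IEC 14443 carrier frequency
BASE_BIT_RATE = CARRIER_HZ / 128   # base bit rate, roughly 106 kbit/s
BITS_PER_BYTE_ON_AIR = 9           # assumption: 8 data bits + 1 parity bit


def airtime_ms(num_bytes: int, bit_rate: float = BASE_BIT_RATE) -> float:
    """Approximate time, in milliseconds, to move num_bytes across the air."""
    return num_bytes * BITS_PER_BYTE_ON_AIR / bit_rate * 1000


if __name__ == "__main__":
    # Hypothetical message sizes, chosen only to illustrate the arithmetic.
    for label, size in [("20-byte command", 20), ("256-byte response", 256)]:
        print(f"{label}: {airtime_ms(size):.2f} ms")
```

Even under these generous assumptions, a few hundred bytes in each direction cost tens of milliseconds, which is why the number of round trips matters as much as the cryptography.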
The first exchange is application selection. The reader selects the card's Proximity Payment System Environment (PPSE), the card responds with the list of application identifiers (AIDs) it carries, and the reader then selects the highest-priority application that it also supports. For a payment card this typically means one EMV application, though multi-application cards that carry both payment and transit credentials are common. The selection establishes which application handles the rest of the exchange.
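On the wire, selection is carried by ordinary ISO/IEC 7816-4 SELECT commands. The sketch below builds one for the contactless payment directory, whose name is `2PAY.SYS.DDF01`, and checks a response for the success status word `90 00`. The helper names are my own, and parsing the directory that the card returns is left out.

```python
PPSE_NAME = b"2PAY.SYS.DDF01"  # name of the contactless payment directory


def build_select_apdu(name: bytes) -> bytes:
    """Build an ISO/IEC 7816-4 SELECT-by-name command APDU."""
    if not 1 <= len(name) <= 16:
        raise ValueError("DF name must be 1 to 16 bytes long")
    header = bytes([0x00, 0xA4, 0x04, 0x00])  # CLA, INS=SELECT, P1=by name, P2
    return header + bytes([len(name)]) + name + b"\x00"  # Lc, data, Le


def status_ok(response: bytes) -> bool:
    """Return True if the response ends with the success status word 90 00."""
    return len(response) >= 2 and response[-2:] == b"\x90\x00"


if __name__ == "__main__":
    print(build_select_apdu(PPSE_NAME).hex())
    # The same helper selects a specific application by its AID.
    # A0000000041010 is used here purely as an example AID.
    print(build_select_apdu(bytes.fromhex("A0000000041010")).hex())
```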
The authentication exchange
Authentication in EMV contactless is challenge-response. The reader sends a nonce. The card computes a CMAC (a cipher-based message authentication code, here using AES-128) over the transaction data and the nonce, and returns the result. The reader has access to the card's key through a key hierarchy that keeps the actual key in the reader's secure element rather than in its general-purpose memory. It computes the expected CMAC and compares the two.
The key never leaves the secure hardware on either side. The card's key is provisioned at manufacture and never transmitted. The reader's key derivation material is held in hardware that enforces access controls. What passes over the air is the challenge, the response, and the transaction authorisation data — not the key, not anything from which the key can be derived by an eavesdropper.
This is the right design choice. The cryptographic material in the air is the minimum necessary to prove, to the reader, that the card holds the correct key. It does not prove it to anyone else. A passive attacker recording the exchange cannot replay it: the nonce changes every transaction, and the CMAC over a previous nonce is not valid for a new one.
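The shape of that exchange, and the reason a recorded response is useless later, can be sketched in a few lines. One loud caveat: the Python standard library has no AES, so this sketch swaps in HMAC-SHA256 as a stand-in for AES-128 CMAC. The structure is the same (a keyed MAC over the nonce and the transaction data), but the primitive is not the one a real card uses, and the function names and transaction fields are hypothetical.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # fixed-length nonce, so nonce + data concatenates unambiguously


def card_respond(card_key: bytes, nonce: bytes, txn_data: bytes) -> bytes:
    """Card side: MAC over the reader's nonce and the transaction data.

    HMAC-SHA256 stands in for AES-128 CMAC here (an assumption of this sketch).
    """
    return hmac.new(card_key, nonce + txn_data, hashlib.sha256).digest()


def reader_verify(card_key: bytes, nonce: bytes, txn_data: bytes,
                  response: bytes) -> bool:
    """Reader side: recompute the MAC and compare in constant time."""
    expected = hmac.new(card_key, nonce + txn_data, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    key = secrets.token_bytes(16)           # shared secret, never sent
    txn = b"amount=2.80;currency=GBP"       # hypothetical transaction data

    nonce_1 = secrets.token_bytes(NONCE_LEN)
    response_1 = card_respond(key, nonce_1, txn)
    print("fresh response accepted:", reader_verify(key, nonce_1, txn, response_1))

    # An eavesdropper replays response_1 against a new challenge.
    nonce_2 = secrets.token_bytes(NONCE_LEN)
    print("replayed response accepted:", reader_verify(key, nonce_2, txn, response_1))
```

The first check succeeds and the replay fails, because the recorded MAC was computed over a nonce the reader is no longer asking about. Only the nonce, the transaction data, and the MAC ever cross the air.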
NTAG 424 DNA
The NTAG 424 DNA, NXP's implementation of AES-128 authenticated NFC for non-payment use cases, follows the same basic architecture: challenge-response authentication using AES in CMAC mode, with a session key derived from the static card key and a random challenge. The session key is per-transaction. The counter on the card increments with each successful authentication. An attacker who records one transaction cannot replay it against the reader, because the replayed counter value will be no higher than the one the reader has already accepted.
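The reader-side half of that counter check is simple bookkeeping. The sketch below is a hypothetical helper, not NXP's API: it assumes the counter value has already been authenticated by the MAC check, and that the verifier keeps the last accepted counter for each tag.

```python
class CounterTracker:
    """Remembers the highest counter accepted for each tag (hypothetical helper)."""

    def __init__(self):
        self._last_seen = {}  # maps tag UID -> highest counter accepted so far

    def accept(self, tag_uid: str, counter: int) -> bool:
        """Accept a counter only if it is strictly higher than any seen before."""
        last = self._last_seen.get(tag_uid, -1)
        if counter <= last:
            return False  # replayed or stale message
        self._last_seen[tag_uid] = counter
        return True


if __name__ == "__main__":
    tracker = CounterTracker()
    print(tracker.accept("04A1B2", 7))  # first sighting of this counter
    print(tracker.accept("04A1B2", 7))  # same message replayed
    print(tracker.accept("04A1B2", 8))  # next genuine tap
```

The comparison is strict on purpose: a counter equal to the last accepted value is exactly what a replay looks like.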
The authentication exchange produces, as a side effect of the handshake, a session key that both card and reader now share without it ever having been transmitted. This session key can be used for the remainder of the transaction to encrypt and authenticate subsequent data. The key-agreement-without-transmission property is not magic — it follows directly from the structure of the CMAC-based exchange — but it is elegant, and it is what makes the design robust against passive interception.
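A minimal sketch of why both sides end up with the same session key: each one feeds the shared static key and the random values from the handshake into the same keyed derivation. As an explicit assumption, HMAC-SHA256 truncated to 16 bytes stands in for the AES-based derivation a real tag uses, and the label, function name, and variable names are invented for illustration.

```python
import hashlib
import hmac
import secrets

LABEL = b"session-key-v1"  # hypothetical domain-separation label


def derive_session_key(static_key: bytes, rnd_reader: bytes,
                       rnd_card: bytes) -> bytes:
    """Derive a 16-byte session key from the static key and both randoms.

    HMAC-SHA256 is a stand-in for the tag's AES-based derivation.
    """
    material = LABEL + rnd_reader + rnd_card
    return hmac.new(static_key, material, hashlib.sha256).digest()[:16]


if __name__ == "__main__":
    static_key = secrets.token_bytes(16)   # provisioned in advance, never sent
    rnd_reader = secrets.token_bytes(16)   # contributed by the reader
    rnd_card = secrets.token_bytes(16)     # contributed by the card

    # Each side runs the derivation independently.
    reader_key = derive_session_key(static_key, rnd_reader, rnd_card)
    card_key = derive_session_key(static_key, rnd_reader, rnd_card)
    print("keys match:", hmac.compare_digest(reader_key, card_key))

    # A new handshake with fresh randoms gives an unrelated session key.
    next_key = derive_session_key(static_key, secrets.token_bytes(16), rnd_card)
    print("next session differs:", next_key != reader_key)
```

An eavesdropper who sees the random values still lacks the static key, so the derived session key stays out of reach.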
Part 2 will look at what the proof actually proves, and what it does not: the limitations of CMAC correctness as a security argument, what counter replay protection does and does not prevent, and why cloning a tag that uses this scheme is harder than cloning one that does not.
Three hundred milliseconds. In that window, the PING goes out — the challenge — and the answer comes back — the CMAC. Whether that answer proves what you think it proves is the question Part 2 addresses.