# A hands-on intro to RSA

A simple framework to explain and demonstrate how the most widely used encryption protocol works ... and how to break it!

by Dr. Tomás Silveira Salles

9 minute read

RSA is one of the most common encryption protocols today, and everyone in the world of IT should have at least a basic understanding of it. However, explaining only the mathematics involved makes the experience quite demotivating for a lot of people, so we decided to go for a hands-on approach and wrote a small framework called TextbookRSA that implements all the crucial steps, and that you can read and play around with.

TextbookRSA is written in *swift* and uses only its base standard library: *Foundation*. This means you can try it out on any Mac or Linux machine (as long as swift is installed).

As a bonus, we also included a working implementation of how quantum computers can be used to break RSA encryption, to fulfil a promise made in a previous article.

TextbookRSA was never meant to be safe for real-world applications, but rather to be easy to read and understand. The most obvious weakness is the fact that the public keys in this framework are at most 32 bits long (on 64-bit machines, and at most 16 bits long on 32-bit machines). This is because we didn't want to implement "big-integer" types and arithmetic functions. We see big-integer arithmetics as a separate problem that is independent of the RSA protocol, and it would distract the reader/user from the key concepts that we wanted to convey. Other weaknesses are for example the lack of protection against timing attacks, the lack of a check against weak prime pairs, and using electronic codebook as the block cypher mode.

On the other hand, the framework fulfils its purpose of demonstrating the main concepts, it's very easy to use and the code is quite clean and simple.

**Note**: If you're already familiar with RSA encryption, you can probably understand the project more quickly by looking at the README file and browsing through the sources.

To begin a message exchange through RSA, the *receiver*, usually named *Bob* in textbooks, chooses two prime numbers $p$ and $q$. Their product $N=pq$ is often called the *RSA modulo*, and is part of the public key, but the two prime numbers are private and Bob never shares them with anyone. In TextbookRSA the interface for this triple of numbers is defined in the protocol `RSAKeysProtocol`

and we have only one implementation, named `UIntRSAKeys`

(which uses `UInt`

as the integer type). In the code, you can see that we use our method `UInt.randomPrimeInRange(...)`

to choose $p$ and $q$, which picks random numbers in a given interval and tests whether they are prime. In real-world applications, where $p$ and $q$ must be hundreds or even thousands of bits long, it is unrealistic to check this with absolute certainty, but there are tests that can quickly tell *with high probability* whether the chosen numbers are prime or not.

The next step is to choose an exponent $e$ to be used for encryption, and which is the second and last part of the public key. It is important that $e$ not have any common factors with $\left(p-1\right)\left(q-1\right)$, but that is really the only condition. As far as is known, it does not matter if the same exponent is used many times or by many different people. Many implementations of RSA simply hard-code $e=3$, for example, because it is the smallest (and therefore the most efficient) exponent you can choose. An even more common value is $e=65537$, which has some mathematical properties that can be used to make encryption more efficient and is a prime number, which makes it unlikely to share factors with $\left(p-1\right)\left(q-1\right)$. In our library we stick to the original idea and choose $e$ at random. This is done by calling `keys.generateEncryptionParameters()`

, which essentially returns the pair $\left(N,e\right)$.

Since the pair $\left(N,e\right)$ is public, Bob can transmit it to the *message sender*, usually named *Alice* in textbooks, through any public channel. To facilitate these transmissions, the relevant types in TextbookRSA implement the protocol `Codable`

, which is a new and extremely helpful Swift 4 feature. This means you can easily convert these types into well known formats such as *JSON* and back.

➕➖ Encoding of the RSA public key

There are several different official formats to encode RSA public keys, depending on the toolkit you use (for example OpenSSH or OpenSSL) but the essential content is always the same: the RSA modulo and the encryption exponent. Additionally, many formats include an identification string to document which algorithm is being used, so you don't get your keys mixed up. These formats are usually not in readable text form (like JSON) but in data form (binary), and usually the result gets converted into text through base-64 encoding. This encoding turns each group of 6 bits into a character, even though the data is organised into groups of 8 bits and most of them represent numbers and not letters. The string that comes out has nothing to do with a natural language description of the keys for humans. This is why the file that is automatically generated by the toolkit usually looks like this:

-----BEGIN PUBLIC KEY-----

MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAvUYHKDzF277x/IF81JBT

ApD4zgczNrUBvTWbLBNA8uEby7i/5RRf1PdwUojGjANK5gpV3+5M7dkRBerWHBNM

GWb2zho1tyO+ujcMlQFfuY2z7cju5IMmrbt7IlO84OCWSS5WfsNdGcWTjLHMm0Dn

NlViSBODxNFFuPUTnApUJtVeFXZlAiZwwICYs+QzsgaJFDhcQoKuK4vP5A101Kv0

muws4toR0qd119qMQuR+wp7EpAzP6FbmRdr2a7JBkj4bXcHWrEaq5K4lFkILokCy

TCXZv9hBIo6AMrvLzNtFsdIwT09bS0wqxdIIa/THX+Z7b4afeN1JM5qwa5N98VqK

SwIDAQAB

-----END PUBLIC KEY-----

It looks as if the keys were encrypted within the file (which doesn't make sense because they are public!) but in reality they are just encrypted in a very efficient but human-unfriendly format.

Now Alice can encrypt her message by calling `Encrypter.encrypt(message, parameters: parms)`

. You will find the implementation of two overloads of this method in *EncrypterProtocol+DecrypterProtocol.swift* and *Encrypter+Decrypter.swift*. One of them expects `message`

to be of type `Data`

(so you can encrypt arbitrary files) and the other takes `message`

of type `String`

(which is easier to use for quick tests, for example on a Swift Playground).

The reason why `Encrypter`

is defined in the directory *ECB* is because RSA really only encrypts individual integers, and only integers smaller than $N$. This would be very boring for a beginner trying to learn something by using our framework, so we added ECB (short for *electronic codebook*) as a way of chopping longer messages into small blocks so that these blocks can be individually encrypted via RSA. Again, this is not a good idea in real-world applications, because when you use ECB, equal blocks are transformed into other equal blocks, which reveals patterns in the data. In actual encryption systems RSA would be used either for signing (where it is applied only to a short hash of the message) or to encrypt the private keys for a different protocol (such as AES-GCM), which would then be used to encrypt the actual message.

The RSA part of the encryption process is the transformation of each individual block. A block (i.e. a number) $b$ is transformed into the number ${b}^{e}modN$. This happens in the method `UIntRSA.transform(...)`

and uses the class `Math.PowerOracle`

, which is probably the most interesting piece of mathematics in the project. The numbers involved in this exponentiation can be very large, so we have to worry about two things: Computing the result efficiently and avoiding an overflow of our integer type. To do this, we use the classic approach of breaking the exponent into a sum of powers of 2 and calculating the exponentiation in several steps.

➕➖ Easier done than said: Efficient modular exponentiation

Let's say we want to compute ${7}^{10}mod11$. First we write 10 as a sum of powers of 2: $10=8+2$. This means that ${7}^{10}$ is the same as ${7}^{8}{7}^{2}$. Then we compute ${7}^{2}mod11=5$. Now, we know that ${7}^{4}mod11=\left({7}^{2}\right){\mathrm{}}^{2}mod11$, so we substitute $5$ for ${7}^{2}$ and get ${7}^{4}mod11={5}^{2}mod11=3$. Similarly, we know that ${7}^{8}mod11=\left({7}^{4}\right){\mathrm{}}^{2}mod11$, so we substitute $3$ for ${7}^{4}$ and get ${7}^{8}mod11={3}^{2}mod11=9$. Finally, to compute ${7}^{10}mod11={7}^{8}{7}^{2}mod11$ we substitute $9$ for ${7}^{8}$ and $5$ for ${7}^{2}$ and get ${7}^{10}mod11=\left(9\right)\left(5\right)mod11=1$.

This may seem like a long detour to compute something simple, but in practice it is very efficient. The key point is that after each step we apply the modulo again, so we're always working with small numbers. This way we don't have to worry about overflows, and the process is also faster because small numbers are easier to multiply.

When Bob receives Alice's message, he can create a `Decrypter`

object using his keys and call `decrypter.decrypt(encryptedMessage)`

(if the original message was of type `Data`

) or `decrypter.decryptText(encryptedMessage)`

(if the original message was of type `String`

). Once again, you'll find the code in the ECB directory because the decryption is applied to each block individually and then the blocks are put together to get the original message. The RSA protocol only defines what happens to a single block. Recall that for a block $b$, Alice sends Bob the number $c={b}^{e}modN$. To decrypt, Bob first computes the inverse of $e$ in the $mod\left(p-1\right)\left(q-1\right)$ number system. That means, Bob finds a number $d$ such that $edmod\left(p-1\right)\left(q-1\right)=1$. Then he computes ${c}^{d}modN$, and a well-known mathematical theorem guarantees that this number is exactly the original block $b$.

Finding the number $d$ is easy if you know the primes $p$ and $q$. This is done by calling `keys.generateDecryptionParameters(...)`

, which internally uses Euclides's *Greatest Common Divisor* algorithm (yes, the one you learned in school). But as far as we know it is *very very* hard if you don't know $p$ and $q$, and this is the assumption RSA is built on.

Earlier this year we published an article about quantum computers and mentioned that they can be used to break RSA encryption using Shor's algorithm. We also mentioned that this algorithm *does not find the prime factors of the RSA modulo*, which left a lot of people confused and waiting for a well-deserved explanation.

To be clear, quantum computers *can* factor large integers quickly, and Shor's algorithm can be used as *part* of the process. What we want to show is that Shor's algorithm alone is not a factoring routine, and that you don't *have* to factor the RSA modulo to decrypt messages.

The famous quantum computing algorithm written by *Peter Shor* is neither a factorization nor a decryption algorithm. It computes the period of a specific function. The function is of the form $f\left(x\right)={c}^{x}modN$ where $c$ and $N$ are fixed and $x$ is the variable. As you increment $x$, this function takes on different values for a while, but eventually starts to repeat a pattern. The number of distinct values before the function starts to repeat itself is what is called its *period*, and is the output of Shor's algorithm. For example, if $c=2$ and $N=5$, the values will be (starting at $x=0$): 1, 2, 4, 3, 1, 2, 4, 3, ... so we can see that the period is 4. When $N$ is large, finding the period is extremely difficult, but a quantum computer can do it.

In our framework, due to the current lack of a quantum computer here at the office, we compute periods by trial and error, also called the *brute force* method, and therefore we can only do it for very small $N$. The class that takes care of this is the `PeriodOracle`

. We also added an extension to `UIntRSAKeys`

with the static method `small(maxPrime:)`

(obviously not a part of the RSA protocol), which creates RSA keys with small prime numbers so that you can actually try out the `PeriodOracle`

(check out *Period/SmallKeys.swift*).

Going back to our usual example, we have the original block $b$, the RSA modulo $N$ and the exponent $e$ and from these Alice computed the encrypted block $c={b}^{e}modN$. Notice that $N$, $e$ and $c$ are all public, so an *attacker*, usually named *Eve* in textbooks, would know them. If Eve used a quantum computer with Shor's algorithm, she could compute the period $r$ of the function $f\left(x\right)={c}^{x}modN$. Similarly to what Bob did for decryption, she would then compute the inverse of $e$ in the $modr$ number system. In other words, Eve would find a number $d$ such that $edmodr=1$ and then compute ${c}^{d}modN$, which is exactly the original block $b$.

➕➖ Gimme more math. I can take it!

First, notice that ${c}^{d}modN={b}^{ed}modN$. Since $edmodr=1$, there is some integer $k$ such that $ed=kr+1$, so ${b}^{ed}modN={b}^{kr+1}modN$.

It is easy to see that the period with base $b$ is the same as the period with base $c$, so ${b}^{r}modN=1$. In particular, ${b}^{kr+1}modN={\left({b}^{r}\right)}^{k}bmodN=b$. Eve would have thus decrypted Alice's message and never knew or needed Bob's secret prime factors $p$ and $q$.

This whole process is implemented by our class `PeriodDecrypter`

, which has the same interface as `Decrypter`

(namely the `DecrypterProtocol`

), except you don't need the private keys to initialize it, but only the RSA modulo instead.

The home page of the project on GitHub gives a more detailed description of the public API, without going into implementation details or the theory of RSA encryption. It also shows how to install the framework using CocoaPods (Mac) or the Swift Package Manager (Mac and Linux). Finally, if you want to start testing right away, the repository also contains a Swift Playground where the entire public API is used.

Want to try it out?

Go to GitHub