Understanding blockchain starting at zeros and ones
WHY IS THIS A BLOG POST FOR MY ETHICS CLASS?
When I was at UCLA I took a GIS class taught by Tom Gillespie that changed my life. I actually took many classes that he taught (maybe four). The first convinced me to add Geography as a double major. The GIS class was the first time I really understood an important flaw in the things we are asked to do in school, and it left me with a mindset about how to do school, and later how to do work, that has stuck with me ever since.
In this class we were learning how to use ArcGIS and publicly available data sets to solve problems. This quarter-long class culminated in a final project that was worth about 80% of our final grade. There were no smaller deliverables, no formal check-ins along the way, just one massive project. And there was very little guidance. We got a bare-bones project brief and no rubric, and whenever we asked him to clarify (‘Should it do this?’ ‘Should it be this?’ ‘Should it look like this?’) he would answer cheerfully in his booming voice, “Do Good Work!”
He told us to rely on our inner north star to tell us if we were doing good work. Is it something you’d like to see in the world? Does it make the world a better place? Does it advance knowledge? Learning to listen to this voice would be more important than learning to conform to a list of requirements on a rubric. People who change the world in big or small ways are not using a rubric. At the end of that class, we all ended up having lots of good work to show off, after working through the anxiety provoked by such an ambiguous assignment.
In the early weeks of the class we logged dozens of hours in the GIS lab trying to make something that would impress our professor, but by the final weeks of the class we were logging time there working on projects we were passionate about. So in the spirit of deviating from the rubric, and with Tom’s voice in my head, I’m departing from the assignment briefing in an expression of my own ethics around ‘doing good work.’
This blog post touches on ethics somewhat, but it is primarily an explainer about blockchain: the one I wish I had stumbled upon when I didn’t know anything about how it all worked. I wrote it because it felt like it could be an asset to the conversations we were having in class, and I hope that anyone who reads it might feel like it’s good work.
STARTING AT THE BEGINNING
A lot of understanding blockchain rests on understanding the fact that all of this is built on good, old-fashioned math done with good, old-fashioned numbers. Between discussion of bits (data encoded in binary as 0’s or 1’s) and long strings of numbers and letters, it’s hard to visualize how there could be actual math involved. The mechanics of blockchain are happening with real numbers that can be added or multiplied just like the math that we did in school.
Technical writing about blockchain will often drift back and forth between discussion of different types of numbers without being explicit about their relationships to each other.
Four ways of representing a number that come up a lot in blockchain talk:
Binary (zeroes and ones)
Regular numbers (all the numbers you learned in Kindergarten, 0-9)
Hexadecimal (numbers plus letters A, B, C, D, E and F)
Base58 (numbers, plus uppercase letters, plus lowercase letters, minus lookalikes)
If I want to express a number in binary it takes a lot of digits because each digit can only express two options, zero or one. If I limit myself to three digits of binary I can only express 8 numbers. (This is 2x2x2 or 2^3, because for each digit there are only two possibilities.) For a number as big as 100 I need 7 digits. For a number as big as a million, I would need 20 binary digits.
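If you want to check those digit counts yourself, here is a minimal Python sketch. The function name `binary_digits_needed` is made up for this post; `bit_length()` and `bin()` are built into Python.

```python
def binary_digits_needed(number: int) -> int:
    """Return how many binary digits (bits) it takes to write `number`."""
    # bit_length() is 0 for the number 0, but we still need one digit to write it.
    return max(1, number.bit_length())

# Three binary digits give 2 * 2 * 2 = 8 different values (0 through 7).
values_with_three_digits = 2 ** 3
print("three binary digits can express", values_with_three_digits, "numbers")

for number in (7, 100, 1_000_000):
    # bin() returns text like '0b1100100'; [2:] drops the '0b' prefix.
    digits = bin(number)[2:]
    print(number, "->", digits, f"({binary_digits_needed(number)} binary digits)")
```

Running it shows 100 needs 7 binary digits and a million needs 20, matching the counts above.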
If you hear someone talking about ‘bits,’ they are talking about binary numbers. Computers store data as 0’s and 1’s, which we all know (because we’ve seen “The Matrix”), but binary numbers are actually really hard for our human brains to understand. And they take up a lot of space when we are representing them visually, like on a screen. When we want to deal with computer numbers we can translate them to regular numbers to make them shorter or hexadecimal to make them even shorter.
HEXADECIMAL AND REGULAR NUMBERS
Each digit of a regular number can be one of ten different values (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) which means to express a number as big as 100 I need (wait for it…) 3 digits! There’s probably nothing about how regular numbers work that I could tell you that you don’t already know. With hexadecimal, each digit can be one of 16 different values, so I only need two digits to express a number as big as 100. It’s a marginal improvement.
Base58 is a way to pack binary data into a substantially shorter form. It takes all the lowercase letters (26), adds the uppercase letters (+26) and the numbers (+10), and then subtracts 0 (zero), O (capital O), l (lowercase L) and I (uppercase I) to give you 58 characters to work with for each digit, which shortens things much more effectively. Base58 is a variation of an earlier scheme, Base64, which keeps all the letters and numbers and adds + and /. The reason for removing the lookalikes and the symbols was to make it easier for humans to unambiguously read and write these long strings of numbers. (Human-centered design in blockchain?)
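Here is a rough sketch of what "translating" a number between these forms looks like in Python. The character order in `BASE58_ALPHABET` is the one Bitcoin uses. The function names and the example number are made up for illustration, and this is only the plain number conversion, not the full address format Bitcoin wraps around it.

```python
# 10 numbers + 26 uppercase + 26 lowercase, minus 0, O, I and l = 58 characters.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def to_base58(number: int) -> str:
    """Write a non-negative whole number using the 58-character alphabet."""
    if number == 0:
        return BASE58_ALPHABET[0]
    digits = []
    while number > 0:
        # Peel off one Base58 digit at a time, smallest digit first.
        number, remainder = divmod(number, 58)
        digits.append(BASE58_ALPHABET[remainder])
    return "".join(reversed(digits))

def from_base58(text: str) -> int:
    """Turn Base58 text back into a regular number."""
    number = 0
    for character in text:
        number = number * 58 + BASE58_ALPHABET.index(character)
    return number

example = 5_551_234_567  # a made-up ten-digit number, not a real phone number

forms = {
    "binary": bin(example)[2:],
    "regular": str(example),
    "hexadecimal": format(example, "X"),
    "Base58": to_base58(example),
}
for name, text in forms.items():
    print(f"{name:12} {text}  ({len(text)} digits)")
```

It is the same number every time. Only the way of writing it down changes, and `from_base58(to_base58(example))` gets you back where you started.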
Here’s my childhood phone number in binary, regular number, hexadecimal and Base58.
| Binary | Regular number | Hexadecimal | Base58 |
| 33 digits | 10 digits | 9 digits | 6 digits |
So the question, “What does the blockchain ledger actually look like?” has three answers. One is that it is just a series of 0s and 1s, which is how your computer stores it. Another is hex or Base58, which is what people usually convert those numbers into when they display them online, write them down or talk about them. But at the heart of blockchain is a lot of math, all of which is done with good, old-fashioned regular numbers. If people wanted to, they could talk about it in regular numbers, but the convention is to use hexadecimal or Base58. And if people really wanted to, they could do math in Base58, but that sounds terrible.
The files themselves then are long strings of numbers that encode information corresponding to transactions on the blockchain. When people look at them they can use a parser, which takes information from the chain and makes it more palatable to read or use. I really like this parser because it retains much of the original data, but parsers like this one that give the data more structure and formatting are much more commonplace.
THE MATH YOU NEED TO KNOW ABOUT, BUT PROBABLY DON’T NEED TO KNOW
THE PRIVATE KEY AND THE PUBLIC KEY
To send and receive transactions with Bitcoin, you need a private key, a number that identifies you and that only you know. Unlike your credit card number or your driver’s license number, you get to choose it! It can be any whole number from 1 up to, but not including, 115792089237316195423570985008687907852837564279074904382605163141518161494337. That’s not even a joke. That limit corresponds to 256 bits, so in binary a private key can be up to 256 digits long. Most people use some kind of random number generator to make theirs. Most people convert their private key to hex or Base58 for ease of writing/remembering.
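As a minimal sketch of "use a random number generator," here is how you might pick one in Python. The `secrets` module is Python's built-in source of unpredictable random numbers. The names `LIMIT` and `make_private_key` are made up for this post, and real wallet software does more than this (backups, encoding, and so on).

```python
import secrets

# The upper limit quoted above, written out as a regular number.
LIMIT = 115792089237316195423570985008687907852837564279074904382605163141518161494337

def make_private_key() -> int:
    """Pick a random whole number from 1 up to LIMIT - 1."""
    # randbelow(LIMIT - 1) gives 0 .. LIMIT - 2, so adding 1 skips zero.
    return secrets.randbelow(LIMIT - 1) + 1

private_key = make_private_key()

print("bits needed to write the limit:", LIMIT.bit_length())
print("private key as a regular number:", private_key)
# 256 bits is 64 hexadecimal digits; pad with zeros so it is always 64 long.
print("private key in hexadecimal:     ", format(private_key, "064x"))
```

Every run prints a different key, which is the point. Don't use a key you have printed to a screen for real money.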
Using public-key cryptography, a kind of math first developed by cryptographers in the 1970s, you can turn your private key into a public key. (Bitcoin uses a later flavor of it based on elliptic curves.) Any private key will only transform into one public key, and for all practical purposes the operation cannot be reversed. Most math operations that are familiar to us are reversible. I can divide 56 by 8 to get 7 and then I can multiply 7 by 8 to get 56. However, this math only goes one way. You cannot use the public key to derive the private key.
This one-way math appears again when creating a signature for a transaction in the ledger. You use your private key, a random number and the transaction data (your public key, the recipient’s public key, the transaction amount) to do math that results in a signature. By comparing this signature against your public key with more math, anyone can verify that the signature could only have been created by someone who knew the private key. This math does not reveal the private key.
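For the curious, here is a toy sketch of that math in plain Python. It uses the curve Bitcoin uses (called secp256k1, whose constants are public) and the standard signature recipe (ECDSA). Everything else is a simplification I'm assuming for illustration: the function names are made up, the "transaction" is just a line of text, and it is hashed once with SHA-256 rather than serialized the way Bitcoin really does it. It shows the idea and is not something to protect money with.

```python
import hashlib
import secrets

# Public constants of the secp256k1 curve (points satisfy y*y = x*x*x + 7 mod P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two points on the curve; None stands for the 'point at infinity'."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image cancels out
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)

def multiply(k, point):
    """Compute k * point by repeated doubling. Easy forwards, infeasible backwards."""
    result = None
    while k > 0:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def message_number(message: bytes) -> int:
    """Turn the transaction data into a number (simplified: one SHA-256)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")

def sign(private_key: int, message: bytes):
    z = message_number(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # the fresh random number
        r = multiply(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r != 0 and s != 0:
            return (r, s)  # the signature is just a pair of numbers

def verify(public_key, message: bytes, signature) -> bool:
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = message_number(message)
    w = pow(s, -1, N)
    point = add(multiply(z * w % N, G), multiply(r * w % N, public_key))
    return point is not None and point[0] % N == r

private_key = secrets.randbelow(N - 1) + 1
public_key = multiply(private_key, G)      # the one-way step
transaction = b"Alice pays Bob 1 coin"     # made-up transaction data
signature = sign(private_key, transaction)

print(verify(public_key, transaction, signature))                  # genuine
print(verify(public_key, b"Alice pays Bob 100 coins", signature))  # tampered
```

Notice that `verify` never sees the private key. It only needs the public key, the transaction and the signature, and changing even one character of the transaction makes the check fail.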
THE ACTUAL BLOCKCHAIN
WHAT IS A BLOCKCHAIN?
It is an ever-growing set of files that each record transactions or other data. Many people keep copies of the files and update them regularly. You can add to the blockchain, but for the most part, you cannot change or remove things in the past. Additions are made one block at a time. Currently, the most actively used and discussed blockchains are for cryptocurrencies. Cryptocurrencies are made-up currencies that aren’t associated with any government but can be traded by users of the currencies and even exchanged for government-backed currency.
The blockchain stores the history of the transactions as a means of verifying that the transaction took place. The oldest and most-discussed cryptocurrency is Bitcoin, but there are tons of other cryptocurrencies and blockchains that are used for other purposes such as identity verification. For this article if I use an example of a blockchain it will be Bitcoin, however, those terms aren’t interchangeable. Bitcoin is built on a blockchain, but not all blockchains function just like Bitcoin. Understanding Bitcoin is a good place to start because it was the first widely-used blockchain and most newer blockchains function in almost the same way as Bitcoin.
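One reason the past is so hard to change is a detail worth spelling out: each block carries a fingerprint (a "hash") of the block before it, so editing an old block breaks every link after it. Here is a toy sketch of that idea. The block layout and the function names are made up for illustration and are much simpler than Bitcoin's real block format.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Fingerprint a block: any change to its contents changes this text."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def add_block(chain: list, transactions: list) -> None:
    """Append a block that points back at the fingerprint of the previous one."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"previous_hash": previous, "transactions": transactions})

def is_valid(chain: list) -> bool:
    """Check that every block still matches the fingerprint stored after it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
add_block(chain, ["Alice pays Bob 5"])
add_block(chain, ["Bob pays Carol 2"])
add_block(chain, ["Carol pays Dave 1"])
print("untouched chain is valid:", is_valid(chain))

# Try to quietly rewrite history in the very first block.
chain[0]["transactions"] = ["Alice pays Bob 500"]
print("edited chain is valid:  ", is_valid(chain))
```

Adding to the end is easy. Changing something in the middle means redoing every block that came after it, which is why "add only" is the practical rule.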
WHERE IS A BLOCKCHAIN STORED?
Lots of places! Tens of thousands of people have a copy of the Bitcoin ledger on their computers. If you remember the days of Napster and Limewire, you are familiar with the idea of peer-to-peer sharing. Instead of having a file hosted on a single server, many people hold copies of it on their personal computers. This is one of the things that keeps it secure. Even if you change one copy of the ledger, the other ones have the original data, so other people know yours isn’t the true one.
Blockchain files are often a lot bigger than your Smash Mouth “All Star” mp3, so it’s a pretty big commitment to keep a copy and keep it updated, as new additions to the ledger are being made all the time. The current size of the Bitcoin blockchain is about 160 GB. Why do people want to do this? Hard to say. The excitement of being part of a movement? Civic responsibility? Keeping the blockchain files on your computer doesn’t involve any payout.
BUILDING THE BLOCKCHAIN
Anyone who has a Bitcoin transaction to record adds that transaction to the ‘memory pool’ (a term I find disturbingly Orwellian). This is just a list of transactions that people want to have added to the blockchain. Each transaction identifies the giver, the recipient, and the amount. It is signed with a signature (which is just another number) that, through complicated math, can be used to verify the authenticity of the person giving the money.
While storing the blockchain is unrewarded, building blocks is very lucrative. ‘Mining’ is the term used to describe the work done by computers mathematically vetting transactions in a queue waiting to be added to the blockchain. And it’s not just one miner working on one block. Miners compete to add a block to the chain. For any single block there are thousands in competition to add the next block.
Many of the transactions in the memory pool promise a small percentage as a payout to the miner who does the work of bundling a bunch of transactions into a block. In addition, there is a reward to the successful miner which can be worth tens or hundreds of thousands of dollars (depending on the price of Bitcoin). That’s a pretty amazing reward for ten minutes of work. However, most of that huge reward goes back into paying for the electricity that powers the process.
Vetting the transactions is actually the cheap part. The energy goes into one more step before the bundle is accepted as a block: the miner’s computer makes an enormous number of guesses, each time scrambling the block’s data together with a different throwaway number, trying to get a result that falls below a target number. The rule for setting the target was defined when Bitcoin was first built, and the target adapts to how many miners are competing, to keep the rate of blocks being added at approximately one every 10 minutes.
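Here is a toy version of the "try to hit a target" step. In real Bitcoin the guessing is done with a one-way scrambling function called SHA-256, applied twice to the block's header along with a throwaway number (a "nonce"), and the target is recalculated every 2016 blocks. The sketch below keeps those ideas but simplifies everything else: the block data is just a line of text, the target is far easier than the real one so it finishes in about a second, and the names `mine` and `retarget` are made up for this post.

```python
import hashlib

def mine(block_data: bytes, target: int):
    """Keep guessing nonces until the double SHA-256 result is below the target."""
    nonce = 0
    while True:
        candidate = block_data + nonce.to_bytes(8, "big")
        once = hashlib.sha256(candidate).digest()
        result = int.from_bytes(hashlib.sha256(once).digest(), "big")
        if result < target:
            return nonce, result
        nonce += 1  # wrong guess, try the next number

def retarget(old_target: int, actual_minutes: int,
             blocks: int = 2016, goal_minutes_per_block: int = 10) -> int:
    """Scale the target so the next batch of blocks takes about 10 minutes each.

    Simplified: the real rule also limits how far one adjustment can move.
    """
    return old_target * actual_minutes // (blocks * goal_minutes_per_block)

# A result is a 256-bit number, so a target of 2**240 means roughly
# 1 guess in 65,536 succeeds. The real target is vastly smaller.
target = 2 ** 240
nonce, result = mine(b"bundle of vetted transactions", target)
print("winning nonce:", nonce)
print("result below target:", result < target)

# If the last 2016 blocks took half as long as intended, the target halves,
# which makes the next blocks about twice as hard to find.
print(retarget(target, actual_minutes=2016 * 5) == target // 2)
```

There is no shortcut to finding a winning nonce other than trying numbers one after another, which is exactly where the electricity goes. Checking someone else's winning nonce, on the other hand, takes one quick calculation.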
The high-intensity energy consumption is intentionally built into the system (a feature, not a bug) to slow down the process and give all the computers storing copies of the blockchain time to sync up before the next block is released. It is also a way to ensure that the blocks are accurate. If you put a ton of energy into adding a block, only to find out that it contains errors or inconsistencies, you won’t receive the payout for the block. People are incentivized to add a ‘good’ block, not just any block.
Who’s auditing these blocks? Other miners, who want that payout to go to them, rather than the person who added the latest block. At any given time there are a few different versions of the blockchain propagating through the network of people who have the blockchain stored on their computers. The most recent blocks are considered ‘candidate blocks’ and there might be a few different contenders for that same block in the chain.
Miners choose which blocks to build their next block on top of based on which candidate they think is going to be the block that propagates to the majority of the block-keepers. If they choose wrong, their candidate block built on top of an unsuccessful candidate cannot be added to the blockchain. Miners will choose blocks that have been assembled by ‘trusted’ miners before building another block on top of theirs, or will check candidate blocks for errors.
While mining was once somewhat accessible to casual cryptocurrency enthusiasts, the arms race for more powerful machines to outcompete other miners has led to the community of miners becoming quite small and not really a community at all. At the outset Bitcoin ran on the enthusiasm of a fandom galvanized around the idea of fortifying currency against unstable and untrustworthy central banks. Today the Bitcoin machine is powered by financial interests, primarily in China, who are motivated to develop the most efficient ways of extracting value from mining.
Is blockchain technology really a sustainable strategy for commerce at a global scale? Bitcoin is still pretty fringe, and the energy involved in adding to it is astronomical. Could cryptocurrencies scale globally to replace government-backed currencies? Probably not. Blockchain in its current form is not really scalable, especially for one currency to take on a dominant role. One might argue that was always the point. Lots of blockchains are more resilient than one single blockchain. However, even an ecosystem of smaller blockchains would have problems related to the amount of work needed to be done per transaction. It’s hard to imagine even multiple cryptocurrencies having the capability to record as many transactions as credit card companies do every day.
Another big issue is the environmental sustainability of blockchain. As mentioned above, the energy consumption involved in Bitcoin is not negligible. Due to the sheer volume of calculations that are performed in the growth and maintenance of the blockchain, a single Bitcoin transaction can use as much energy as an entire household would in a month. Even as a technology that has not yet been widely adopted, the carbon footprint of blockchain is devastatingly huge.
The idea of using blockchain to decentralize economic power has unintentionally led to a concentration of power in an unexpected place: small, well-financed groups in a communist country who aren’t really accountable to anyone. Many people are ceasing to find that more reassuring than trusting central bankers who are at least somewhat democratically accountable in most countries.
Along with the energy costs of mining, this recentralizing of power is one of the biggest flaws in the way Bitcoin was designed to use blockchain technology for financial transactions. Other applications of blockchain technology have arisen that are attempting to build a better blockchain. In particular, the way of verifying a block and rewarding that verification, called “proof of work,” has been reimagined to be less energy-intensive and less likely to result in centralizing the work among people who have the means to engage in it on a large scale. “Proof of stake” is one alternative that is being explored to address these issues.
Congrats to all who made it this far! If you are still looking for more info, check out these links below. And please let me know if you have any suggestions on how to make this blog post more accessible or clearer. I hope to be reworking it in the future!
Resources I used to write this post:
https://learnmeabitcoin.com/ – a great resource for going a little bit deeper into the technical aspects beyond the scope of the post.
https://medium.com/coinmonks/blockchain-public-private-key-cryptography-in-a-nutshell-b7776e475e7c – more about private keys, public keys and signatures
https://twitter.com/SatoshiDoodles – the best bitcoin themed doodles on Twitter
My interview with my friend Sunny (there is one swear word, so sorry) – https://drive.google.com/file/d/1JIv2F5wN4gXo3aHUPxjEMTKlC9XaJhkT/view?usp=sharing
https://www.wnycstudios.org/podcasts/radiolab/articles/ceremony – an excellent RadioLab podcast about people starting a new blockchain
https://www.newyorker.com/magazine/2018/10/22/the-prophets-of-cryptocurrency-survey-the-boom-and-bust – a great New Yorker article also available as audio that talks about the crypto landscape
https://www.wired.com/story/guide-blockchain/ – a quick explainer from Wired