*Press Any Key*
You have to walk before you run. You have to crawl before you walk. And before you crawl, you have to press a single key. And literally, these days some children do indeed do that before they crawl.

I jumped the gun in the previous tutorials. I expected you to be able to press multiple keys on your keyboard in a coherent sequence. As if you were some kind of seasoned pro or computer whiz! Well, let's rewind and rectify that. Today, we are going to press a single key. We are not going to press just any key (some people have trouble finding the "any" key because there isn't one). We are going to press the J key.

*The J Key*

I don't just like the J key because my name starts with J. I like the J key because it is the home row key under the dominant finger of your dominant hand. It reminds you that to use a computer effectively, you should keep your hands positioned on the home row as much as possible instead of reaching for the arrow keys, the mouse, or a can of Dr. Pepper.

You should learn how to touch type as fast as you can. Try to minimize your mouse use. This will force you to learn shortcut keys and other techniques that will make you far more efficient than someone who is constantly switching between hunt-and-peck typing and mousing through a maze of menus, moving like molasses. It can be positively painful to perpetually perceive these plodding, ponderous peasants pecking. (Okay, again, there are no peasants.) Please persist in perfecting your performance or perish pathetically.

*Press the J Key*

Fire up your Cygwin terminal (or whatever terminal emulator you prefer), press the J key, and promptly release it. Some people get confused and hold it down forever. A lowercase j should appear in your terminal window just after the command prompt. Don't press Enter. Don't press any other keys. Stop and ponder what just happened.

What the heck just happened? On my screen, the letter h appeared. WTF, man! What kind of teacher are you, Jeff? You suck.
I want a refund. This is bogus.

Apparently, when you press the letter J on your keyboard, magical little elves do not deliver a letter J to your active window. Something far more treacherous and mind-boggling is happening.

*Scan Codes, UARTs, and Bits--Oh My!*

You are not in Kansas anymore, Dorothy. Your simplistic view of the world is a lie. This is your last chance. After this, there is no turning back. You take the blue pill - the story ends, you wake up in your bed and believe whatever you want to believe. You take the red pill - you stay in Wonderland and I show you how deep the rabbit hole goes. (Yes, being a programmer whose last name is Anderson, people say "Mr. Anderson ..." to me now and then. Sometimes they call me Neo. It's kind of like being that guy named Michael Bolton in Office Space. Except Neo doesn't suck like Michael Bolton.)

The rabbit hole is DEEP! I am just going to give you a high-level tour of it. Some of what you read may be lies. I don't really know what's happening. I don't think any single person does anymore.

Okay, when you press a key, two contact points close, completing an electrical circuit. This causes a sequence of values (called scan codes) to be transmitted to your computer. On a desktop, you probably have a USB keyboard. USB means "universal serial bus". A bus is a set of communication lines, and serial means that signals (information) travel over them 1 bit at a time. Universal means that the same communication mechanism can be used for a wide variety of devices: keyboards, mice, external hard disks, web cams, printers, game controllers--just about anything. However, clearly, this is the real reason USB was invented: http://www.amazon.com/Doctor-Who-Tardis-USB-Hub/dp/B000F46CQM/.

So, the scan codes are information that gets transmitted over USB one bit at a time. What's a bit? A bit is a single digit in base 2--that is, a binary digit. There are two possible values for each digit: 0 and 1.
On expensive Apple hardware, you also get the elite value 2 (just kidding). We are used to talking about values in base 10 (aka decimal), where there are 10 possible values per digit: 0 to 9. Let's count to decimal 10 in binary:

0: 0
1: 1
2: 10
3: 11
4: 100
5: 101
6: 110
7: 111
8: 1000
9: 1001
10: 1010

Suppose a scan code is 8 bits long. It can represent 2^8 = 256 distinct values. A sequence of 8 bits is often referred to as a byte (although bytes have come in other lengths).

How do these bits travel across the wire in the USB cable to the computer? The bits travel across the wire as changes in voltage over time. Let's say we hold a wire at +12V for a second. Maybe this is interpreted as a bit with value 1 when the receiving end sees it.

Before USB and PCs, people connected to mainframes via real terminals (a keyboard/display combination--though in early terminals, the "display" was essentially a remote-controlled typewriter and was thus called a teletype). These real terminals were connected to said mainframes at a distance by serial cables. At each end of the serial cable was a UART (universal asynchronous receiver/transmitter). Computers were so expensive back then that it made sense for multiple terminals to attach to a single central computing resource. These days, we are used to everyone having their own cheap computer where the keyboard, processor, and display are combined into one unit. This was not always so.

Over a UART (RS-232) serial connection, -12V => 1 and +12V => 0. The values are inverted from what you'd assume they'd be. Each byte is either 7 or 8 bits and is sent 1 bit at a time at some bit rate (bits/second). Each byte is framed by a start bit (a 0) and 1 or 2 stop bits (1s)--usually just one. When the wire is not in use, it is held at -12V (1). An error check, the parity bit, may also be transmitted before the stop bit(s). The UARTs can be set to even, odd, or no parity.
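Going back to the binary counting and the 2^8 arithmetic for a moment: both are easy to check with a couple of lines of Python (purely illustrative, nothing UART-specific here).

```python
# Reproduce the counting table: 0 through 10 in binary.
for n in range(11):
    print(f"{n}: {n:b}")

# An 8-bit scan code can represent 2^8 distinct values.
print(2 ** 8)  # 256
```

The `{n:b}` format spec prints an integer in base 2, which is handy whenever you want to see the bits behind a value.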
If set to odd parity, the number of 1s among the data bits plus the parity bit should be odd. For even parity, it should be even. If you've ever verified a checksum on a large file you downloaded from the internet, the parity bit is the byte-level equivalent.

If one side of the connection is transmitting bytes faster than the other side can deal with them, that side can tell the other to stop or continue. This is known as flow control. It can be implemented directly as hardware signals (on a separate wire) or in software (via special control bytes).

The UART on each side of a connection must be configured with the same settings--bit rate, parity, stop bits, and flow control--or else they cannot communicate with each other. Just like if someone showed up at your house speaking Chinese, you probably would not understand them. You both must agree on a language (a protocol). Consider that C-3PO in Star Wars is called a protocol droid. That means he speaks all the languages.

Note that even at this basic level, we have already made the distinction between bytes that represent data and bytes that represent commands (flow control). Also, we have seen that there is non-data overhead to communication (start/stop/parity bits). These same concepts exist in other communication protocols like Ethernet and TCP/IP, which are the foundation of the Internet that you waste most of your time on. The concepts in a UART are so fundamental to computing that they actually made us build one as part of my Computer Science degree. It's kind of like building your own lightsaber, but more frustrating. I never want to do it again.

Okay, so the Universal, Receiver, and Transmitter parts of UART make sense, but what about this Asynchronous thing? Well, even though the individual bits travel at a certain rate (according to a clock), and can thus be said to be synchronous, the UART is transmitting bytes, not bits. The timing between the bytes is totally arbitrary. Consider yourself typing at a keyboard.
You don't press each key one after the other at a constant, unwavering speed in an endless stream, like your boss wants you to. No, you get halfway through a line of code, get distracted by a cat hanging onto a ceiling fan on YouTube, go to lunch, come back an hour later, can't remember what you were typing because of the food coma you are experiencing, erase the line, and start over. That's what asynchronous means.

If the clock on the receiving UART is running at a constant speed, but you don't know when the sender will begin transmitting, how the heck does the receiver know when the first bit begins? The receiver oversamples the line voltage at a rate higher than the transmission rate. Since the non-transmitting voltage is always -12V, it can detect approximately when the line changed to +12V (the start bit) and begin processing bits with that in mind. This also explains why you need a start bit.

Also of note: the bits are transmitted from least significant to most significant. This is the opposite of how we write numbers. If we were sending base 10 digits on the serial line, to send the number 123, I'd have to send 3, 2, 1.

*Buffers, Interrupts, and Drivers*

All that to get one lousy byte! I am exhausted. This is harder than mining Dogecoin. Jeez, man, I am not immortal; tell me why the freakin' h appeared on your screen!!!

We're still a ways off. I told you the rabbit hole was deep. Not my fault your whole generation has ADD.

Okay, so now you have a byte on the receiving UART. Where is it? It's in a buffer on the UART. In the early days, UARTs really did have a single-byte buffer. Memory was so expensive back then that people really couldn't afford more. And they walked uphill both ways 2 miles through 10 feet of snow to get to school every day, where they had things called "books" instead of iPads. Lucky you, with 16GB of RAM in a computer you don't even really know how to press a key on.
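To make the framing and bit order concrete, here is a toy sketch in Python of how a UART lays out one byte on the wire: a start bit, the data bits least significant first, an optional parity bit, and a stop bit. The function name and the logic-level representation (plain 0s and 1s rather than voltages) are my own illustration, not anything from a real UART datasheet.

```python
def frame_byte(byte, parity=None):
    """Frame an 8-bit value the way a UART sends it: one start bit
    (0), the data bits least significant first, an optional parity
    bit, and one stop bit (1)."""
    data = [(byte >> i) & 1 for i in range(8)]  # extract bits, LSB first
    bits = [0] + data                           # start bit, then data
    if parity == "even":
        bits.append(sum(data) % 2)        # make the total count of 1s even
    elif parity == "odd":
        bits.append((sum(data) + 1) % 2)  # make the total count of 1s odd
    bits.append(1)                        # stop bit
    return bits

# ASCII 'j' is 106 = 0b01101010, sent LSB first with no parity (8-N-1):
print(frame_byte(ord("j")))  # [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]
```

Note how the frame is 10 bits for 8 bits of data: that start/stop overhead is why old modem speeds were often quoted as roughly "bit rate divided by 10" bytes per second.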
Speaking of that computer, it needs to see this super-expensive byte that just arrived. What to do? It turns out your CPU has a number of interrupt request lines attached to it. When the UART gets a byte, it uses one of the interrupt lines to signal the CPU that it needs to do something. When this signal arrives, the computer stops whatever it was doing (probably updating its Facebook profile) and jumps to a specific instruction in its main memory. The machine code at that instruction is the interrupt handler.

Now, whoever made your UART probably wrote some code to do whatever it is that should happen when a byte shows up. Either that or it was surely Linus Torvalds or Richard Stallman, because they alone wrote all the software. This bit of code is known as a device driver. Your operating system loads these drivers early on as it boots. In doing so, it arranges for the interrupt handler on that particular interrupt request (IRQ) line to call the appropriate device driver code. Wow, suddenly you have a modular way to extend the capabilities of your OS to deal with a variety of hardware (possibly unknown future hardware). Pretty spiffy.

In the old days, you used to have to configure your physical device and your driver/OS settings so that they agreed on which IRQ each device would use. I was alive in those old days and did this. Given that humans were involved, chaos ensued. Thankfully, now the OS just handles these settings for you. This is known as "plug and play". At the time it was known as "plug and pray" because it didn't always work as advertised.

Okay, sweet, the computer has a byte. How does that end up as a j appearing on my screen? Well, let's remember that in the past the keyboard and display (which combined are the terminal) were both connected to the remote computer via one serial cable. Also, the keyboard did not talk directly to the display. So, no letter j has shown up on your screen yet.
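Before we follow the byte any further: the IRQ-to-driver dispatch described above can be modeled as a simple lookup table. This is a toy Python sketch of the idea, not how any real kernel is structured; the function names are made up, and IRQ 4 is used only because that was the classic line for the first serial port on PCs.

```python
# The OS keeps a table mapping IRQ numbers to registered driver callbacks.
irq_table = {}

def register_driver(irq, handler):
    """Driver loading at boot: claim an IRQ line."""
    irq_table[irq] = handler

def interrupt(irq, byte):
    """Simulate the CPU taking an interrupt: the interrupt handler
    just looks up and calls whatever driver registered for the line."""
    irq_table[irq](byte)

# A pretend serial driver that stashes received bytes in a buffer.
received = []
register_driver(4, lambda byte: received.append(byte))

interrupt(4, ord("j"))  # the UART signals: a byte arrived
print(received)         # [106]
```

The point of the indirection is the modularity mentioned above: the interrupt handler never needs to know what hardware is attached, only which driver claimed the line.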
What actually happens is that the computer echoes the same value you just typed at the terminal (keyboard) back to the terminal (specifically, the display part of it). That is, the serial cable is bidirectional: the computer can talk back to the terminal. Unlike in this tutorial, where I just talk and you listen, because I don't want to hear any of your sass. Now you have a hint as to why the Bash echo command is called echo and what it does.

We kind of missed a step. The computer does not echo the raw scan codes back to your terminal. It first figures out what they mean. The keyboard driver maps the scan code to a key code. The key code is then passed through whatever key map is active. The key map turns the key code into an ASCII value (or maybe a Unicode value, or EBCDIC, or something else). An ASCII value is a small integer that represents an abstract symbol like the letter J.

The key map allows you to have different keyboard languages or layouts, such as Programmer Dvorak, which I use. So, when I press the J key on my keyboard, it translates to the letter H. When you press it, it probably translates to the letter J. A rose by any other name is still a rose, and the letter H is still H even if it is called J on my keyboard. Labels can be deceiving. Even the label on your keyboard falsely advertises a capital J when it produces a lowercase j. Key-presser beware.

It gets worse. If you are some kind of deranged psychopath, you might have other layers of key mapping in your setup, like AutoHotkey or a text expander. Maybe your keyboard itself has onboard macros. Pressing the J key could cause your garage door to open. All these transformation layers need to be correct for whatever you want to happen to happen when you press a single key. With great power comes great responsibility.

*Display the J*

Your keystroke's journey is not over. Aren't you glad you followed my advice and did not hit a second key? Consider the absolute mayhem that would have unfolded then!
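Before we draw anything, recall the translation pipeline above: scan code to key code to character, through whatever key map is active. Here is a toy Python sketch of it. The numeric scan code and the table contents are illustrative stand-ins, not real driver tables; the only factual bit is that the physical QWERTY J position produces h under a Dvorak-family layout.

```python
# Hypothetical scan code for the physical J key (illustrative value).
SCANCODE_TO_KEYCODE = {0x24: "KEY_J"}

# Two key maps for the same physical key.
QWERTY = {"KEY_J": "j"}
PROGRAMMER_DVORAK = {"KEY_J": "h"}  # same key, different letter

def translate(scan_code, key_map):
    """Keyboard driver: scan code -> key code -> character."""
    key_code = SCANCODE_TO_KEYCODE[scan_code]
    return key_map[key_code]

print(translate(0x24, QWERTY))             # j
print(translate(0x24, PROGRAMMER_DVORAK))  # h
```

Same wire, same scan code, different letter: the key map is the layer where my h and your j part ways.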
Your terminal has now received an ASCII value. To display it, the terminal needs a font, and a default font will already be loaded. A font is a set of particular visual representations of a particular set of abstract symbols. What does the letter J look like, anyway? It can look like a miniature dump truck or an upside-down smiley face if you have the right font installed. We just associate the concept of J with a certain visual representation. So, the ASCII value gets translated to a code point in the font, which yields a glyph.

That value is stored at some location in the display buffer (memory). Early display buffers had maybe only 80 columns and 24 rows, with perhaps 2 bytes at each position. We're talking at most 4K of memory. This explains why the earliest terminals (teletypes) did not even have video displays: 4K of RAM would have been insanely expensive back then. Whereas a graphics card today will have 2GB of onboard memory.

This glyph is then rendered to the display at its location. The current state of the terminal affects how the glyph is rendered on the screen. Certain terminal settings could cause it to blink, be underlined, be a different color, etc. The glyph is translated into individual pixels, and the display driver of the terminal makes them light up on the screen for 1/60th of a second, 60 times a second. Photons interact with rods/cones in your eye, transmitting electrochemical signals to your brain, etc. Yada, yada: http://dilbert.com/2010-11-17/. Without a display buffer, each character would disappear after 1/60th of a second. That would not be useful.

No one except Methuselah (and maybe my dad) uses real terminals these days. You probably have to go to a museum to actually see one. We all use terminal emulators like Cygwin Terminal, xterm, PuTTY, and the like. Still, a lot of the concepts from real terminals apply. A lot of the existing Linux terminology is based on this terminalology, so it is useful to know.
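The display-buffer arithmetic above can be sketched in a few lines. This is a toy model of a text-mode buffer, with each cell holding a character and an attribute byte (blink, underline, color); the structure is my own illustration of the idea, not any particular terminal's memory layout.

```python
# A toy text-mode display buffer: 80 columns x 24 rows, with a
# (character, attribute) pair at each cell, as on early terminals.
COLS, ROWS = 80, 24
buffer = [[(" ", 0)] * COLS for _ in range(ROWS)]

def put_char(row, col, char, attr=0):
    """Store a glyph (and its rendering attributes) at a cell."""
    buffer[row][col] = (char, attr)

put_char(0, 0, "j")  # the echoed j lands at the top-left cell

# Two bytes per cell keeps the whole screen under 4K of memory.
print(COLS * ROWS * 2)  # 3840
```

Refreshing the screen is then just re-rendering this buffer 60 times a second, which is why the character persists instead of vanishing after one frame.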
Even this explanation is a gross oversimplification of everything that is going on when you type a character into a computer, but hopefully it is enough to help make Bash easier to understand.

*Pro Tip*

Now that you've worked up a sweat by pressing the J key, it's all smudged up by your greasy, sweaty fingerprint. Don't leave your filth everywhere like you do at home. Because the electronics in most keyboards are fairly simple, you can actually run them through the dishwasher when they get dirty (desecrated by your slime). Don't use detergent, and maybe avoid the heated drying cycle. Give them at least a day to air dry afterwards. I've done this at least a dozen times, and it works great. Do not do this with your super-fancy Logitech gaming keyboard with a fancy built-in display. Actually, do, because I'm jealous. Also, probably don't do this with wireless keyboards. At the very least, take the batteries out first. This method of keyboard cleaning is actually a Dell-recommended procedure (my brother did Dell tech support). So, I'm not just making this up (unlike the rest of the information in this tutorial).

*Stay Tuned*

Even though we've talked about terminals and (key)strokes, this is not the end--unless the information overload has killed you. In the next episode, we will press a bunch of keys to compose an entire command and then hit Enter. We'll likely see an error message.

/* PLUG: http://plug.org, #utah on irc.freenode.net
   Unsubscribe: http://plug.org/mailman/options/plug
   Don't fear the penguin. */