Re: Printing Keys and using OCR (was: Proofreadable base64)
On Fri, 21 Sep 2007, Brian Smith wrote:

> Peter Palfrader wrote:
> > Nice idea. When trying to find decent backup methods for my new Tor
> > identity key I came across this thread. I played all day with OCR
> > and friends. In the course I wrote a small script that does what
> > you suggest. I tried to keep it small enough to print it along with
> > whatever data you have - I clearly failed there. But other than
> > that it works nicely.
> >
> > That didn't work out so well at first - gocr had real trouble
> > distinguishing zeroes and the letter D like Delta.
>
> Why not use a 2D barcode like a QR code? A QR code will hold most
> typical keys, is easy for machines to read, is small, and has
> redundancy features that allow it to work even if you hole-punch or
> black out part of the code.

Because I like to have a fallback to entering the data manually.

Who knows how easy it will be to get barcode software for a specific
version of barcodes 10 years in the future. And will it even compile?

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/

___
Gnupg-users mailing list
Gnupg-users@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-users
Re: Printing Keys and using OCR (was: Proofreadable base64)
On Fri, Sep 21, 2007 at 01:48:02PM +0700, Brian Smith wrote:

> Peter Palfrader wrote:
> > Nice idea. When trying to find decent backup methods for my new Tor
> > identity key I came across this thread. I played all day with OCR
> > and friends. In the course I wrote a small script that does what
> > you suggest. I tried to keep it small enough to print it along with
> > whatever data you have - I clearly failed there. But other than
> > that it works nicely.
> >
> > That didn't work out so well at first - gocr had real trouble
> > distinguishing zeroes and the letter D like Delta.
>
> Why not use a 2D barcode like a QR code? A QR code will hold most
> typical keys, is easy for machines to read, is small, and has
> redundancy features that allow it to work even if you hole-punch or
> black out part of the code.
>
> See http://www.denso-wave.com/qrcode/aboutqr-e.html

There is no Free Software to create or read QR code, and it is
patented: http://www.denso-wave.com/qrcode/qrstandard-e.html

Otherwise it is an excellent data format.

Alex

-- 
JID: [EMAIL PROTECTED]
PGP: 0x46399138
Paying attention to details is the job of doctors, lawyers,
programmers and watchmakers -- Czerski
Re: Printing Keys and using OCR (was: Proofreadable base64)
On Fri, Sep 21, 2007 at 12:59:00AM +0200, Peter Palfrader wrote:

> On Mon, 28 May 2007, Peter S. May wrote:
> > Not meaning to kick a dead thread
>
> This must be a zombie by now :)

Indeed. I'm very glad the thread woke up again, though, as it reminded
me that I had written some code for this back in May, but unfortunately
let it get buried under other work. I've tidied things a bit and
packaged it at http://www.jabberwocky.com/software/paperkey/

It implements a secrets-only backup via paper (or bar code, or whatever
you like), and then allows you to rebuild the original secret key when
you like. README file is attached.

David

Paperkey - an OpenPGP key archiver
----------------------------------
David Shaw [EMAIL PROTECTED]

A reasonable way to achieve a long term backup of OpenPGP (GnuPG, PGP,
etc) keys is to print them out on paper. The reasoning behind this is
that paper and ink have amazingly long retention qualities - far longer
than the magnetic or optical means that are generally used to back up
computer data.

Paper? Seriously?
-----------------

The goal with paper is not secure storage. There are countless ways to
store something securely. A paper backup also isn't a replacement for
the usual machine readable (tape, CD-R, DVD-R, etc) backups, but rather
an if-all-else-fails method of restoring a key. Most of the storage
media in use today do not have particularly good long-term (measured in
years to decades) retention of data. If and when the CD-R and/or tape
cassette and/or USB key and/or hard drive the secret key is stored on
becomes unusable, the paper copy can be used to restore the secret key.

What paperkey does
------------------

Due to metadata and redundancy, OpenPGP secret keys are significantly
larger than just the secret bits. In fact, the secret key contains a
complete copy of the public key. Since the public key generally doesn't
need to be escrowed (most people have many copies of it on various
keyservers, web pages, etc), extracting only the secret parts can be a
real advantage.

Paperkey extracts just those secret bytes and prints them. To
reconstruct, you re-enter those bytes (whether by hand or via OCR) and
paperkey can use them to transform your existing public key into a
secret key.

For example, the regular DSA+Elgamal secret key I just tested comes out
to 1281 bytes. The secret parts of that (plus some minor packet
structure) come to only 149 bytes. It's a lot easier to re-enter 149
bytes correctly.

Aren't CD-Rs supposed to last a long time?
------------------------------------------

They're certainly advertised to (I've seen some pretty incredible
claims of 100 years or more), but in practice it doesn't really work
out that way. The manufacturing of the media, the burn quality, the
burner quality, the storage, etc, all have a significant impact on how
long an optical disc will last. Some tests show that you're lucky to
get 10 years.

For paper, on the other hand, to claim it will last for 100 years is
not even vaguely impressive. High-quality paper with good ink regularly
lasts many hundreds of years even under less than optimal conditions.

Another bonus is that ink on paper is readable by humans. Not every
backup medium will still be readable 50 years later: even if you have
the backup, you may not be able to buy a drive to read it. I doubt this
will happen anytime soon with CD-R as there are just so many of them
out there, but the storage industry is littered with old now-dead ways
of storing data.

Examples
--------

Take the secret key in my-secret-key.gpg and generate a text file
to-be-printed.txt that contains the secret data:

  $ paperkey --secret-key my-secret-key.gpg --output to-be-printed.txt

Take the secret key data in my-key-text-file.txt and combine it with
my-public-key.gpg to reconstruct my-secret-key.gpg:

  $ paperkey --pubring my-public-key.gpg --secrets my-key-text-file.txt --output my-secret-key.gpg

If --output is not specified, the output goes to stdout. If
--secret-key is not specified, the data is read from stdin.

Some other useful options are:

  --output-type can be base16 or raw.
    base16 is human-readable, and raw is useful if you want to pass the
    output to another program like a bar code generator.

  --input-type same as --output-type, but for the restore side of
    things. By default the input type is inferred automatically from
    the input data.

  --output-width sets the width of base16 output

  --ignore-crc-error allows paperkey to continue when reconstructing
    even if it detects data corruption in the input.

  --verbose (or -v) be chatty about what is happening. Repeat this
    multiple times for more verbosity.

Security
--------

Note that paperkey does not change the security requirements of
storing a secret key. If your key has a
Printing Keys and using OCR (was: Proofreadable base64)
On Mon, 28 May 2007, Peter S. May wrote:

> Not meaning to kick a dead thread

This must be a zombie by now :)

> I've come up with something which I haven't yet tried to implement
> but which I think would be interesting to try. Let's call it
> proofreadable base64. It's not terribly efficient, but we're going
> for recoverability more than efficiency. It goes something like this:
>
> We can assume that each line of our medium is capable of relaying 76
> relatively legible characters. The first 32 are data in normal
> base64. Then, there is a space and a CRC-24 as specified in OpenPGP.
> Then, there are two spaces. After this, the first part of the line is
> repeated, except it is as if it were filtered through the command:
>
>   tr 'A-Za-z0-9+/=' '0-9A-Z+/=a-z'

Nice idea. When trying to find decent backup methods for my new Tor
identity key I came across this thread. I played all day with OCR and
friends. In the course I wrote a small script that does what you
suggest. I tried to keep it small enough to print it along with
whatever data you have - I clearly failed there. But other than that
it works nicely.

I used the OCR-A font available from a CTAN[0] mirror near you to print
the output of my script. Then I used gocr[1] (0.41-1 as shipped in
Debian etch) to turn a scan back into data.

That didn't work out so well at first - gocr had real trouble
distinguishing zeroes and the letter D like Delta. Fortunately gocr has
an option to disable its internal recognition engine and instead use a
mode whereby it asks you about characters it doesn't recognize -
initially that's all of them - and writes that to a database. In the
end it asked me for about 300 chars out of 8000 - most of them at the
beginning of the text - but produced the original text with only a few
mishaps, which were caught easily using the encoding described above.
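The line format quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script attached to this message: the post does not say how the CRC-24 is written or exactly which part of the line is repeated, so the sketch assumes the CRC is printed as 4 base64 characters (as in OpenPGP armor) and that the repeated part covers both the data and the CRC, which happens to give exactly 76 columns for 24 bytes of data. The function names are made up for the example.

```python
import base64
import string

# Character sets of the tr command from the post:
#   tr 'A-Za-z0-9+/=' '0-9A-Z+/=a-z'
SRC = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/="
DST = string.digits + string.ascii_uppercase + "+/=" + string.ascii_lowercase
SHIFT = str.maketrans(SRC, DST)    # what tr does when printing
UNSHIFT = str.maketrans(DST, SRC)  # the inverse, used when reading back


def crc24(data: bytes) -> int:
    """CRC-24 as used by OpenPGP ASCII armor (RFC 4880, section 6.1)."""
    crc = 0xB704CE
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def encode_line(chunk: bytes) -> str:
    """Encode up to 24 bytes as one printable line (hypothetical helper)."""
    data = base64.b64encode(chunk).decode("ascii")
    # Assumption: the CRC is written as 4 base64 characters.
    check = base64.b64encode(crc24(chunk).to_bytes(3, "big")).decode("ascii")
    first = f"{data} {check}"
    # Assumption: data and CRC are both repeated in shifted form.
    return f"{first}  {first.translate(SHIFT)}"


def decode_line(line: str) -> bytes:
    """Return the data from whichever copy passes its CRC check."""
    first, second = line.split("  ", 1)
    for candidate in (first, second.translate(UNSHIFT)):
        try:
            data, check = candidate.split(" ")
            chunk = base64.b64decode(data, validate=True)
            expected = int.from_bytes(
                base64.b64decode(check, validate=True), "big")
        except ValueError:
            continue  # malformed copy, try the other one
        if crc24(chunk) == expected:
            return chunk
    raise ValueError("both copies of the line failed the CRC check")
```

Because the second copy is printed with shifted glyphs, a confusion such as zero versus D in one copy lands on a different character in the other, so one misread per line usually still leaves a copy that verifies.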
[maybe I should also try a more recent version of gocr]

If anybody wants to play with this, I uploaded my two scans to
http://asteria.noreply.org/~weasel/ocr/

To use gocr with database learning enabled and its internal recognition
engine turned off, simply run:

  mkdir db; gocr -m 256 -m 130 -i 1.ppm -o 1.txt

I guess playing with encodings other than base64 might be the next
step. There was a strong point made for simply using base16, maybe with
different characters that play nicely with gocr using OCR-A.

Optar[2] is another nice tool which I tried today. While it does not
provide the fallback to typing it all in option, it shows promise.
Using the default values I still had several bitflips after scanning in
the printout though. Future tests will probably include changing
optar's parameters to larger dots (I don't need 200kb per page), and
maybe preprocessing the data with par2[3].

Cheers,
Peter

0. http://www.ctan.org/
   http://www.ctan.org/cgi-bin/search.py?metadataSearch=ocr-ametadataSearchSubmit=Search
1. http://packages.debian.org/gocr
   http://packages.debian.org/etch/gocr
   http://jocr.sourceforge.net/
2. http://ronja.twibright.com/optar/
3. http://www.par2.net/

#!/usr/bin/perl

use strict;
use warnings;
use Digest::SHA1 qw(sha1_hex);
use MIME::Base64;

if (@ARGV != 1 || $ARGV[0] !~ /^-[de]$/) {
    die "Usage: $0 -d|-e\n";
};

if ($ARGV[0] eq '-e') {
    # encoding. not needed for decoding
    undef $/;    # slurp all of stdin in one read
    my ($bytes, $totallength, $totalhash, $line);

    $bytes = <STDIN>;
    $totallength = length($bytes);
    $totalhash = sha1_hex($bytes);
    $line = 1;

    printf("line   data in base64           first 12 chars base64 with tr\n");
    printf("                                of sha1 in hex 'A-Za-z0-9+/=' '0-9A-Z+/=a-z'\n");
    printf("-A-B-C-\n");
    while (length($bytes) > 0) {
        my ($this, $encoded, $tred, $hash);
        # 18 bytes of input become 24 base64 characters per line
        $this = substr($bytes, 0, 18, '');
        $encoded = encode_base64($this, '');
        ($tred = $encoded) =~ tr#A-Za-z0-9+/=#0-9A-Z+/=a-z#;
        $hash = substr( sha1_hex($this), 0, 12);
        printf("%06d %-24s %s %-24s\n", $line++, $encoded, $hash, $tred);
    };
    printf("-A-B-C-\n");
    print("XX total length: $totallength\n");
    print("XX SHA1: $totalhash\n");
} else {
    # decoding
    my (@bytes, $line, $found_marker, $exit);
    $exit = 0;
    $line = 0;

    # skip everything up to the start marker
    $found_marker = 0;
    while (<STDIN>) {
        chomp;
        if ($_ eq '-A-B-C-') {
            $found_marker = 1;
            last;
        };
    };
    unless ($found_marker) {
        die("Did not find start marker '-A-B-C-' in input\n");
    };

    # read data lines until the end marker
    $found_marker = 0;
    while (<STDIN>) {
        $line++;
        chomp;
        if ($_ eq '-A-B-C-') {
            $found_marker = 1;
            last;
        };
        my ($l, $d, $h, $t, $t2, $decoded_d, $decoded_t, $hashd, $hasht, $bytes) = split;
        $bytes = '';
        # undo the tr shift on the second copy before decoding it
        ($t2 = $t) =~ tr#0-9A-Z+/=a-z#A-Za-z0-9+/=#;
        $decoded_d = decode_base64($d);
        $decoded_t = decode_base64($t2);
        $hashd = substr( sha1_hex($decoded_d), 0,