Re: Printing Keys and using OCR (was: Proofreadable base64)

2007-09-21 Thread David Shaw

On Fri, Sep 21, 2007 at 12:59:00AM +0200, Peter Palfrader wrote:
> On Mon, 28 May 2007, Peter S. May wrote:
> 
> > Not meaning to kick a dead thread
> 
> This must be a zombie by now :)

Indeed.  I'm very glad the thread woke up again, though, as it
reminded me that I had written some code for this back in May, but
unfortunately let it get buried under other work.  I've tidied things
a bit and packaged it at http://www.jabberwocky.com/software/paperkey/

It implements a secrets-only backup via paper (or bar code, or
whatever you like), and then allows you to rebuild the original secret
key when you like.

README file is attached.

David
  Paperkey - an OpenPGP key archiver
  --
  David Shaw <[EMAIL PROTECTED]>

A reasonable way to achieve a long term backup of OpenPGP (GnuPG, PGP,
etc) keys is to print them out on paper.  The reasoning behind this is
that paper and ink have amazingly long retention qualities - far longer
than the magnetic or optical means that are generally used to back up
computer data.

Paper?  Seriously?
--

The goal with paper is not secure storage.  There are countless ways
to store something securely.  A paper backup also isn't a replacement
for the usual machine readable (tape, CD-R, DVD-R, etc) backups, but
rather an if-all-else-fails method of restoring a key.  Most of the
storage media in use today do not have particularly good long-term
(measured in years to decades) retention of data.  If and when the
CD-R and/or tape cassette and/or USB key and/or hard drive the secret
key is stored on becomes unusable, the paper copy can be used to
restore the secret key.

What paperkey does
--

Due to metadata and redundancy, OpenPGP secret keys are significantly
larger than just the "secret bits".  In fact, the secret key contains
a complete copy of the public key.  Since the public key generally
doesn't need to be escrowed (most people have many copies of it on
various keyservers, web pages, etc), only extracting the secret parts
can be a real advantage.

Paperkey extracts just those secret bytes and prints them.  To
reconstruct, you re-enter those bytes (whether by hand or via OCR) and
paperkey can use them to transform your existing public key into a
secret key.

For example, the regular DSA+Elgamal secret key I just tested comes
out to 1281 bytes.  The secret parts of that (plus some minor packet
structure) come to only 149 bytes.  It's a lot easier to re-enter 149
bytes correctly.
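
To put those two figures side by side, here is a rough
back-of-the-envelope calculation in Python.  It is only a sketch: it
assumes plain base16 output at two characters per byte and ignores any
line numbers or checksums a printout may add.

```python
# Sizes quoted above for the DSA+Elgamal test key.
FULL_SECRET_KEY_BYTES = 1281
SECRET_PARTS_BYTES = 149


def hex_chars(num_bytes: int) -> int:
    """Characters needed to write num_bytes in base16 (two per byte)."""
    return 2 * num_bytes


full = hex_chars(FULL_SECRET_KEY_BYTES)
secrets_only = hex_chars(SECRET_PARTS_BYTES)
fraction = SECRET_PARTS_BYTES / FULL_SECRET_KEY_BYTES

print(f"whole key:    {full} hex characters")
print(f"secrets only: {secrets_only} hex characters")
print(f"secret parts are {fraction:.1%} of the full key")
```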

Aren't CD-Rs supposed to last a long time?
--

They're certainly advertised to (I've seen some pretty incredible
claims of 100 years or more), but in practice it doesn't really work
out that way.  The manufacturing of the media, the burn quality, the
burner quality, the storage, etc, all have a significant impact on how
long an optical disc will last.  Some tests show that you're lucky to
get 10 years.

For paper, on the other hand, to claim it will last for 100 years is
not even vaguely impressive.  High-quality paper with good ink
regularly lasts many hundreds of years even under less than optimal
conditions.

Another bonus is that ink on paper is readable by humans.  Not all
backup media will still be readable 50 years from now: even if you
still have the backup, you may not be able to buy a drive to read it.
I doubt this will happen anytime soon with CD-R as there are just so
many of them out there, but the storage industry is littered with old
now-dead ways of storing data.

Examples
--

Take the secret key in my-secret-key.gpg and generate a text file
to-be-printed.txt that contains the secret data:

$ paperkey --secret-key my-secret-key.gpg --output to-be-printed.txt

Take the secret key data in my-key-text-file.txt and combine it with
my-public-key.gpg to reconstruct my-secret-key.gpg:

$ paperkey --pubring my-public-key.gpg --secrets my-key-text-file.txt --output my-secret-key.gpg

If --output is not specified, the output goes to stdout.  If
--secret-key is not specified, the data is read from stdin.

Some other useful options are:

  --output-type       can be "base16" or "raw".  "base16" is human-readable,
                      and "raw" is useful if you want to pass the output to
                      another program like a bar code generator.

  --input-type        same as --output-type, but for the restore side of
                      things.  By default the input type is inferred
                      automatically from the input data.

  --output-width      sets the width of base16 output

  --ignore-crc-error  allows paperkey to continue when reconstructing
                      even if it detects data corruption in the input.

  --verbose (or -v)   be chatty about what is happening.  Repeat this
                      multiple times for more verbosity.

Security
--

Note that paperkey does not change the security requirements of
storing a secret key.  If your k

Re: Printing Keys and using OCR (was: Proofreadable base64)

2007-09-21 Thread Janusz A. Urbanowicz

On Fri, Sep 21, 2007 at 01:48:02PM +0700, Brian Smith wrote:
> Peter Palfrader wrote:
> > Nice idea.  When trying to find decent backup methods for my 
> > new Tor identity key I came across this thread.
> > 
> > I played all day with ocr and friends.  In the course I wrote 
> > a small script that does what you suggest.  I tried to keep 
> > it small enough to print it along with whatever data you have 
> > - I clearly failed there.
> > But other than that it works nicely.
> > 
> > That didn't work out so well at  first
> > - gocr had real trouble distinguishing zeroes and the 
> > letter D like Delta. 
> 
> Why not use a 2D barcode like a QR code? A QR code will hold most
> typical keys, is easy for machines to read, is small, and has redundancy
> features that allow it to work even if you hole-punch or black out part
> of the code.
> 
> See http://www.denso-wave.com/qrcode/aboutqr-e.html

There is no Free Software to create or read QR code, and it is
patented:



Otherwise it is an excellent data format.

Alex
-- 
JID: [EMAIL PROTECTED]
PGP: 0x46399138
paying attention to details is what doctors, lawyers, programmers and watchmakers are for
 -- Czerski

___
Gnupg-users mailing list
Gnupg-users@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-users

RE: Printing Keys and using OCR (was: Proofreadable base64)

2007-09-21 Thread Brian Smith

Peter Palfrader wrote:
> Nice idea.  When trying to find decent backup methods for my 
> new Tor identity key I came across this thread.
> 
> I played all day with ocr and friends.  In the course I wrote 
> a small script that does what you suggest.  I tried to keep 
> it small enough to print it along with whatever data you have 
> - I clearly failed there.
> But other than that it works nicely.
> 
> That didn't work out so well at  first
> - gocr had real trouble distinguishing zeroes and the 
> letter D like Delta. 

Why not use a 2D barcode like a QR code? A QR code will hold most
typical keys, is easy for machines to read, is small, and has redundancy
features that allow it to work even if you hole-punch or black out part
of the code.

See http://www.denso-wave.com/qrcode/aboutqr-e.html

- Brian



Re: Printing Keys and using OCR (was: Proofreadable base64)

2007-09-21 Thread Peter Palfrader

On Fri, 21 Sep 2007, Brian Smith wrote:

> Peter Palfrader wrote:
> > Nice idea.  When trying to find decent backup methods for my 
> > new Tor identity key I came across this thread.
> > 
> > I played all day with ocr and friends.  In the course I wrote 
> > a small script that does what you suggest.  I tried to keep 
> > it small enough to print it along with whatever data you have 
> > - I clearly failed there.
> > But other than that it works nicely.
> > 
> > That didn't work out so well at  first
> > - gocr had real trouble distinguishing zeroes and the 
> > letter D like Delta. 
> 
> Why not use a 2D barcode like a QR code? A QR code will hold most
> typical keys, is easy for machines to read, is small, and has redundancy
> features that allow it to work even if you hole-punch or black out part
> of the code.

Because I like to have a fallback to entering the data manually.  Who
knows how easy it will be to get barcode software for a specific version
of barcodes 10 years in the future.  And will it even compile?

-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/


Printing Keys and using OCR (was: Proofreadable base64)

2007-09-20 Thread Peter Palfrader

On Mon, 28 May 2007, Peter S. May wrote:

> Not meaning to kick a dead thread

This must be a zombie by now :)

> I've come up with something which I haven't yet tried to implement but
> which I think would be interesting to try.  Let's call it "proofreadable
> base64".  It's not terribly efficient, but we're going for
> recoverability more than efficiency.
> 
> It goes something like this:  We can assume that each line of our medium
> is capable of relaying 76 relatively legible characters.  The first 32
> are data in normal base64.  Then, there is a space and a CRC-24 as
> specified in OpenPGP.  Then, there are two spaces.  After this, the
> first part of the line is repeated, except it is as if it were filtered
> through the command:
> 
> tr 'A-Za-z0-9+/=' '0-9A-Z+/=a-z'
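
For illustration, the line layout proposed above could be sketched in
Python roughly like this.  The CRC-24 routine uses the OpenPGP armor
parameters (initial value 0xB704CE, polynomial 0x1864CFB).  Printing
the CRC as four base64 characters is an assumption borrowed from
OpenPGP armor; the proposal does not say how the checksum should be
written.  The name encode_line is made up for this sketch.

```python
import base64

# Character sets from the tr command above, written out in full.
SRC = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
       "abcdefghijklmnopqrstuvwxyz"
       "0123456789+/=")
DST = ("0123456789"
       "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
       "+/="
       "abcdefghijklmnopqrstuvwxyz")
TR = str.maketrans(SRC, DST)


def crc24(data: bytes) -> int:
    """CRC-24 with the parameters used by OpenPGP ASCII armor."""
    crc = 0xB704CE
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def encode_line(chunk: bytes) -> str:
    """Format up to 24 bytes as: base64 data, CRC, shifted copy."""
    data = base64.b64encode(chunk).decode("ascii")
    # Assumption: the CRC is printed as 4 base64 characters, as in armor.
    check = base64.b64encode(crc24(chunk).to_bytes(3, "big")).decode("ascii")
    return f"{data} {check}  {data.translate(TR)}"


line = encode_line(bytes(range(24)))
print(line)
print(len(line), "characters")
```

With 24 data bytes per line this stays within the 76-character budget
mentioned above.  Because the second copy uses a shifted alphabet, a
glyph that OCR tends to misread in one column shows up as a different
glyph in the other.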

Nice idea.  When trying to find decent backup methods for my new Tor
identity key I came across this thread.

I played all day with ocr and friends.  In the course I wrote a small
script that does what you suggest.  I tried to keep it small enough to
print it along with whatever data you have - I clearly failed there.
But other than that it works nicely.

I used the OCR-A font available from a CTAN[0] mirror near you to print the
output of my script.

Then I used gocr[1] (0.41-1 as shipped in Debian etch) to turn a scan back
into data.  That didn't work out so well at first - gocr had real
trouble distinguishing zeroes and the letter D like Delta.  Fortunately
gocr has an option to disable its internal recognition engine and
instead use a mode whereby it asks you about characters it doesn't
recognize - initially that's all of them - and writes that to a
database.  In the end it asked me for about 300 chars out of 8000 - most
of them at the beginning of the text - but produced the original text
with only a few mishaps, which were caught easily using the encoding
described above.
[maybe I should also try a more recent version of gocr]
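
As an illustration of how such mishaps can be caught, here is a small
Python sketch of a per-line check.  The field layout (line number,
base64 of 18 bytes, first 12 hex digits of the SHA-1, shifted copy)
follows the encoder half of the attached script.  The recovery policy
in check_line (accept whichever copy matches the hash) is an
assumption made for this example and is not necessarily what the
script does.

```python
import base64
import hashlib

SRC = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
       "abcdefghijklmnopqrstuvwxyz"
       "0123456789+/=")
DST = ("0123456789"
       "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
       "+/="
       "abcdefghijklmnopqrstuvwxyz")
TO_SHIFTED = str.maketrans(SRC, DST)
FROM_SHIFTED = str.maketrans(DST, SRC)


def short_hash(data: bytes) -> str:
    """First 12 hex digits of the SHA-1 of data."""
    return hashlib.sha1(data).hexdigest()[:12]


def encode_line(number: int, chunk: bytes) -> str:
    """Build one printed line in the layout used by the script."""
    encoded = base64.b64encode(chunk).decode("ascii")
    shifted = encoded.translate(TO_SHIFTED)
    return f"{number:06d}  {encoded:<24}  {short_hash(chunk)}  {shifted:<24}"


def check_line(line: str):
    """Return the chunk if either copy matches the hash, else None."""
    _number, data, digest, shifted = line.split()
    for candidate in (data, shifted.translate(FROM_SHIFTED)):
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except ValueError:
            continue  # not valid base64, try the other copy
        if short_hash(decoded) == digest:
            return decoded
    return None


chunk = b"eighteen bytes!!!!"
good = encode_line(1, chunk)
number, data, digest, shifted = good.split()
# Simulate OCR misreading the first character of the plain copy.
damaged = "  ".join([number, "0" + data[1:], digest, shifted])
print(check_line(good) == chunk)
print(check_line(damaged) == chunk)
```

A line where both copies fail the hash returns None and would have to
be fixed by hand.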

If anybody wants to play with this, I uploaded my two scans to
http://asteria.noreply.org/~weasel/ocr/

To use gocr in database-learning mode with its internal recognition
engine turned off, simply run:
  mkdir db; gocr -m 256 -m 130 -i 1.ppm -o 1.txt

I guess playing with encodings other than base64 might be the next step.
There was a strong point made for simply using base16, maybe with
different characters that play nicely with gocr using OCR-A.
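
A base16 variant with a different character set might look like the
following Python sketch.  The 16-letter alphabet here is purely
hypothetical: it was picked only to avoid the 0/D pair mentioned above
and has not been tested against gocr or OCR-A.

```python
HEX = "0123456789abcdef"
# Hypothetical 16-symbol alphabet; NOT tested against gocr or OCR-A.
OCR_ALPHABET = "ACEFHJKLMNPRTUXY"

TO_OCR = str.maketrans(HEX, OCR_ALPHABET)
FROM_OCR = str.maketrans(OCR_ALPHABET, HEX)


def encode(data: bytes) -> str:
    """Write each byte as two symbols from OCR_ALPHABET."""
    return data.hex().translate(TO_OCR)


def decode(text: str) -> bytes:
    """Inverse of encode(); whitespace between symbols is ignored."""
    return bytes.fromhex(text.translate(FROM_OCR))


sample = bytes(range(256))
print(encode(b"\x00\xff"))
print(decode(encode(sample)) == sample)
```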

Optar[2] is another nice tool which I tried today.  While it does not
provide the "fallback to typing it all in" option, it shows promise.
Using the default values I still had several bitflips after scanning in
the printout, though.  Future tests will probably include changing optar's
parameters to larger dots (I don't need 200kb per page), and maybe
preprocessing the data with par2[3].

Cheers,
Peter

0. http://www.ctan.org/
   http://www.ctan.org/cgi-bin/search.py?metadataSearch=ocr-a&metadataSearchSubmit=Search
1. http://packages.debian.org/gocr
   http://packages.debian.org/etch/gocr
   http://jocr.sourceforge.net/
2. http://ronja.twibright.com/optar/
3. http://www.par2.net/
-- 
                           |  .''`.  ** Debian GNU/Linux **
      Peter Palfrader      | : :' :      The  universal
 http://www.palfrader.org/ | `. `'      Operating System
                           |   `-    http://www.debian.org/
#!/usr/bin/perl

use strict;
use warnings;
use Digest::SHA1 qw(sha1_hex);
use MIME::Base64;

if (@ARGV != 1 ||
    $ARGV[0] !~ /^-[de]$/) {
  die "Usage: $0 -d|-e\n";
};

if ($ARGV[0] eq '-e') {
  # encoding.  not needed for decoding
  undef $/;    # slurp all of stdin in one read
  my ($bytes, $totallength, $totalhash, $line);
  $bytes = <STDIN>;
  $totallength = length($bytes);
  $totalhash = sha1_hex($bytes);
  $line = 1;
  printf("  'A-Za-z0-9+/=' '0-9A-Z+/=a-z'>\n");
  printf("-A-B-C-\n");
  while (length($bytes) > 0) {
    my ($this, $encoded, $tred, $hash);
    # 18 bytes encode to exactly 24 base64 characters, without padding
    $this = substr($bytes, 0, 18, '');
    $encoded = encode_base64($this, '');
    # second copy of the data in a shifted alphabet
    ($tred = $encoded) =~ tr#A-Za-z0-9+/=#0-9A-Z+/=a-z#;
    $hash = substr( sha1_hex($this), 0, 12);
    printf("%06d  %-24s  %s  %-24s\n", $line++, $encoded, $hash, $tred);
  };
  printf("-A-B-C-\n");
  print("XX total length: $totallength\n");
  print("XX SHA1: $totalhash\n");
} else {
  # decoding
  my (@bytes, $line, $found_marker, $exit);
  $exit = 0;
  $line = 0;
  $found_marker = 0;
  # skip everything up to the start marker
  while (<STDIN>) {
    chomp;
    if ($_ eq '-A-B-C-') {
      $found_marker = 1;
      last;
    };
  };
  unless ($found_marker) {
    die ("Did not find start marker '-A-B-C-' in input\n");
  };
  $found_marker = 0;

  # read data lines until the end marker
  while (<STDIN>) {
    $line++;
    chomp;
    if ($_ eq '-A-B-C-') {
      $found_marker = 1;
      last;
    };
    # fields: line number, base64 data, hash prefix, shifted copy
    my ($l, $d, $h, $t, $t2, $decoded_d, $decoded_t, $hashd, $hasht, $bytes) = split;
    $bytes = '';
    # undo the alphabet shift on the second copy
    ($t2 = $t) =~ tr#0-9A-Z+/=a-z#A-Za-z0-9+/=#;
    $decoded_d = decode_base64($d);
    $decoded_t = decode_base64($t2);
    $hashd = substr( sha1_hex($decoded_d), 0, 12);
    $hasht = substr( sha1_hex($decoded_t), 0, 12);

    if ($l != $line) { w
