[PATCH perlfunc.pod/crypt] crypt() digests, not encrypts

Michael G Schwern Sat, 23 Jul 2005 17:25:16 -0700

Attached is a patch which replaces the term "encrypt" in the perlfunc/crypt
documentation with "digest" or "hash", which more accurately describes what
crypt does.  I also added a better explanation of what crypt does and
what one-way hashing is useful for.


I avoided "hash" where possible so it isn't mistaken for the hash data
structure.  Even though the two are related, using the same word for both
would just confuse things.

Below is the new documentation after patching.


=item crypt PLAINTEXT,SALT 
 
Digests a string exactly like the crypt(3) function in the C library 
(assuming that you actually have a version there that has not been 
extirpated as a potential munition). 
 
crypt() is a one-way hash function.  The PLAINTEXT and SALT is turned 
into a short, fixed length string, called a digest, which is returned. 
The same PLAINTEXT and SALT will always return the same string, but 
there is no (known) way to get the original PLAINTEXT from the hash. 
The SALT is visible as part of the digest.  Small changes in the 
PLAINTEXT will result in large changes in the hash. 
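
As a rough illustration of the three properties just described (repeatable
output, visible salt, sensitivity to small changes), here is a minimal
sketch.  The helper name crypt_properties, the sample strings and the salt
'ab' are made up for this example, and it assumes your crypt library
accepts a traditional two-character salt.

```perl
use strict;
use warnings;

# Hypothetical helper: report three properties of crypt() for one salt.
# Returns a hash ref of booleans, or nothing if crypt() gave no digest.
sub crypt_properties {
    my ($plain, $changed, $salt) = @_;
    my $first  = crypt($plain,   $salt);
    my $second = crypt($plain,   $salt);
    my $other  = crypt($changed, $salt);
    return unless defined $first && defined $second && defined $other;
    return {
        repeatable   => ($first eq $second),        # same input, same digest
        salt_visible => (index($first, $salt) == 0), # digest starts with salt
        sensitive    => ($first ne $other),         # small change, new digest
    };
}

my $props = crypt_properties('secret', 'secreT', 'ab');
if ($props) {
    printf "%-12s %s\n", $_, ($props->{$_} ? 'yes' : 'no')
        for sort keys %$props;
} else {
    print "crypt() returned no digest for this salt on this system\n";
}
```

The check that the salt leads the digest holds for the traditional scheme;
other schemes may lay the digest out differently, so treat it as an
observation about this salt style, not a guarantee.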
 
There is no decrypt function.  This function isn't all that useful for 
cryptography (for that, see your nearby CPAN mirror) and the name 
"crypt" is a bit of a misnomer.  Instead it is primarily used to check 
if two pieces of text are the same without having to transmit or store 
the text itself.  An example is checking if a correct password is 
given.  The digest of the password is stored, not the password itself. 
The user types in a password which is crypt()'d with the same salt as 
the stored digest.  If the two digests match the password is correct. 
 
When verifying an existing digest string you should use the digest as 
the salt (like C<crypt($plain, $digest) eq $digest>).  This allows 
your code to work with the standard L<crypt|/crypt> and with more 
exotic implementations.  In other words, do not assume anything about 
the returned string itself, or how many bytes in the digest matter. 
 
Traditionally the result is a string of 13 bytes: two first bytes of 
the salt, followed by 11 bytes from the set C<[./0-9A-Za-z]>, and only 
the first eight bytes of the digest string mattered, but alternative 
hashing schemes (like MD5), higher level security schemes (like C2), 
and implementations on non-UNIX platforms may produce different 
strings. 
 
When choosing a new salt create a random two character string whose 
characters come from the set C<[./0-9A-Za-z]> (like C<join '', ('.', 
'/', 0..9, 'A'..'Z', 'a'..'z')[rand 64, rand 64]>).  This set of 
characters is just a recommendation; the characters allowed in 
the salt depend solely on your system's crypt library, and Perl can't 
restrict what salts C<crypt()> accepts. 
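
Putting the salt advice together with the "use the digest as the salt"
rule, a store-then-verify round trip might look like the sketch below.
The sub names (make_salt, digest_password, password_matches) and the
sample passwords are invented for illustration, and a traditional
two-character salt is assumed to be acceptable to your crypt library.

```perl
use strict;
use warnings;

# The 64 characters recommended above for a traditional salt.
my @SALT_CHARS = ('.', '/', 0 .. 9, 'A' .. 'Z', 'a' .. 'z');

# Build a random two-character salt.  rand() is not a cryptographically
# strong source; it is used only because the documentation's one-liner
# uses it too.
sub make_salt {
    return join '', map { $SALT_CHARS[ int rand @SALT_CHARS ] } 1 .. 2;
}

# Digest a new password with a fresh salt; this is what gets stored.
sub digest_password {
    my ($plain) = @_;
    return crypt($plain, make_salt());
}

# Check a candidate against a stored digest, using the digest as the salt
# so that no assumption is made about the digest's layout.
sub password_matches {
    my ($candidate, $stored) = @_;
    my $digest = crypt($candidate, $stored);
    return defined $digest && $digest eq $stored;
}

my $stored = digest_password('open sesame');
print "right password: ",
    (password_matches('open sesame', $stored) ? "ok" : "no"), "\n";
print "wrong password: ",
    (password_matches('open says me', $stored) ? "ok" : "no"), "\n";
```

Only digest_password needs to know how salts are made; password_matches
works unchanged if the stored digests come from a different scheme.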
 
Here's an example that makes sure that whoever runs this program knows 
their own password: 

    $pwd = (getpwuid($<))[1]; 
 
    system "stty -echo"; 
    print "Password: "; 
    chomp($word = <STDIN>); 
    print "\n"; 
    system "stty echo"; 
 
    if (crypt($word, $pwd) ne $pwd) { 
        die "Sorry...\n"; 
    } else { 
        print "ok\n"; 
    } 
 
Of course, typing in your own password to whoever asks you 
for it is unwise. 
 
The L<crypt|/crypt> function is unsuitable for hashing large quantities 
of data, not least of all because you can't get the information 
back.  Look at the F<by-module/Crypt> and F<by-module/PGP> directories 
on your favorite CPAN mirror for a slew of potentially useful 
modules. 
 
If using crypt() on a Unicode string (which I<potentially> has 
characters with codepoints above 255), Perl tries to make sense 
of the situation by trying to downgrade (a copy of the string) 
the string back to an eight-bit byte string before calling crypt() 
(on that copy).  If that works, good.  If not, crypt() dies with 
C<Wide character in crypt>. 
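
One way to sidestep the C<Wide character in crypt> failure is to encode
the text to bytes yourself before calling crypt().  The sketch below
assumes UTF-8 is an acceptable encoding for your data; crypt_text is a
made-up helper name, and whichever encoding you pick must also be used
when the digest is later checked.

```perl
use strict;
use warnings;

# Hypothetical helper: digest a character string that may hold code
# points above 255 by encoding it to UTF-8 octets first.
sub crypt_text {
    my ($text, $salt) = @_;
    my $bytes = $text;      # copy, so the caller's string is untouched
    utf8::encode($bytes);   # characters -> UTF-8 octets, in place
    return crypt($bytes, $salt);
}

my $wide = "pass\x{263A}word";   # U+263A cannot be downgraded to a byte

# Calling crypt() on the wide string directly is a fatal error.
my $ok = eval { crypt($wide, 'ab'); 1 };
print $ok ? "direct call succeeded\n" : "direct call died: $@";

# Encoding first gives crypt() a plain byte string to work on.
my $digest = crypt_text($wide, 'ab');
print "encoded call: ",
    (defined $digest ? "got a digest" : "no digest"), "\n";
```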


-- 
Michael G Schwern     [EMAIL PROTECTED]     http://www.pobox.com/~schwern
Just call me 'Moron Sugar'.
        http://www.somethingpositive.net/sp05182002.shtml

--- pod/perlfunc.pod    2005/07/24 00:04:05     1.1
+++ pod/perlfunc.pod    2005/07/24 00:24:01
@@ -885,31 +885,38 @@
 
 =item crypt PLAINTEXT,SALT
 
-Encrypts a string exactly like the crypt(3) function in the C library
+Digests a string exactly like the crypt(3) function in the C library
 (assuming that you actually have a version there that has not been
-extirpated as a potential munition).  This can prove useful for checking
-the password file for lousy passwords, amongst other things.  Only the
-guys wearing white hats should do this.
-
-Note that L<crypt|/crypt> is intended to be a one-way function, much like
-breaking eggs to make an omelette.  There is no (known) corresponding
-decrypt function (in other words, the crypt() is a one-way hash
-function).  As a result, this function isn't all that useful for
-cryptography.  (For that, see your nearby CPAN mirror.)
-
-When verifying an existing encrypted string you should use the
-encrypted text as the salt (like C<crypt($plain, $crypted) eq
-$crypted>).  This allows your code to work with the standard L<crypt|/crypt>
-and with more exotic implementations.  In other words, do not assume
-anything about the returned string itself, or how many bytes in
-the encrypted string matter.
+extirpated as a potential munition).
+
+crypt() is a one-way hash function.  The PLAINTEXT and SALT is turned
+into a short, fixed length string, called a digest, which is returned.
+The same PLAINTEXT and SALT will always return the same string, but
+there is no (known) way to get the original PLAINTEXT from the hash.
+The SALT is visible as part of the digest.  Small changes in the
+PLAINTEXT will result in large changes in the hash.
+
+There is no decrypt function.  This function isn't all that useful for
+cryptography (for that, see your nearby CPAN mirror) and the name
+"crypt" is a bit of a misnomer.  Instead it is primarily used to check
+if two pieces of text are the same without having to transmit or store
+the text itself.  An example is checking if a correct password is
+given.  The digest of the password is stored, not the password itself.
+The user types in a password which is crypt()'d with the same salt as
+the stored digest.  If the two digests match the password is correct.
+
+When verifying an existing digest string you should use the digest as
+the salt (like C<crypt($plain, $digest) eq $digest>).  This allows
+your code to work with the standard L<crypt|/crypt> and with more
+exotic implementations.  In other words, do not assume anything about
+the returned string itself, or how many bytes in the digest matter.
 
 Traditionally the result is a string of 13 bytes: two first bytes of
 the salt, followed by 11 bytes from the set C<[./0-9A-Za-z]>, and only
-the first eight bytes of the encrypted string mattered, but
-alternative hashing schemes (like MD5), higher level security schemes
-(like C2), and implementations on non-UNIX platforms may produce
-different strings.
+the first eight bytes of the digest string mattered, but alternative
+hashing schemes (like MD5), higher level security schemes (like C2),
+and implementations on non-UNIX platforms may produce different
+strings.
 
 When choosing a new salt create a random two character string whose
 characters come from the set C<[./0-9A-Za-z]> (like C<join '', ('.',
@@ -938,7 +945,7 @@
 Of course, typing in your own password to whoever asks you
 for it is unwise.
 
-The L<crypt|/crypt> function is unsuitable for encrypting large quantities
+The L<crypt|/crypt> function is unsuitable for hashing large quantities
 of data, not least of all because you can't get the information
 back.  Look at the F<by-module/Crypt> and F<by-module/PGP> directories
 on your favorite CPAN mirror for a slew of potentially useful
