Cryptography-Digest Digest #433

Digestifier Mon, 27 Mar 2000 20:12:53 -0800
Cryptography-Digest Digest #433, Volume #11      Mon, 27 Mar 00 23:13:01 EST

Contents:
  Re: Examining random() functions (Johnny Bravo)
  Re: Re-seeding PRNG's in central key distribution systems (Bryan Olson)
  Re: Sunday People 26/3/2000: "FORGET YOUR PASSWORD... END UP IN JAIL"  (Anonymous)

----------------------------------------------------------------------------

From: Johnny Bravo <[EMAIL PROTECTED]>
Subject: Re: Examining random() functions
Date: Mon, 27 Mar 2000 21:52:03 -0500

On Mon, 27 Mar 2000 14:12:40 GMT, [EMAIL PROTECTED] (_Andy_)
wrote:

>       I've been playing around with random integer generators and
>was wondering about different methods of examining the output.
>
>       Currently, I take my results and plot a 3-dimensional graph
>and examine it by eye. i.e. I take three consecutive results and treat
>them as the (x,y,z) coordinates of a point and plot it on the graph. I
>repeat this until a visual pattern emerges. (As described in "Jungles
>of Randomness")
>
>       If a pattern emerges, I tend to apply fiddle factors until the
>pattern becomes so complex that it seems to disappear.
>
>       So, can anyone recommend another way?

  See the tests in the DieHard test suite.  Rather than first just tell
you to go get it, I'm running output from your program though the suite as
I type this.  I put further comments below.  DieHard has quite a few
tests, descriptions of which will be on each section I include below.
I ran your program with a Random(255) and output each value as a byte into
a file for testing.  After I got the results I reran the program with a
different seed and tested it again, where the results were different
between runs (one failure and one success) I ran it a third time with yet
another seed and only performed those tests to get a majority. :)

>       The distribution seems fair (for small nMax - I haven't
>checked large yet) now that the various fiddle factors have been
>introduced. It would be nice to have a more formalised proof, and
>better development process... rather than the trial-and-error that
>I've been using.

  Unfortunately there is no proof of randomness.  We can apply tests and
give a probability with some confidence, but we can only prove order, not
a lack thereof.  Here are the DieHard results.  (For brevity I will
include the explanatory notes and descriptions of the tests but will
exclude most of the returned value tables to keep this short as possible.)

NOTE  Most of the tests in DIEHARD return a p-value, which should be
uniform on [0,1) if the input file contains truly independent random bits.
Those p-values are obtained by p=F(X), where F is the assumed distribution
of the sample random variable X---often normal. But that assumed F is just
an asymptotic approximation, for which the fit will be worst in the tails.
Thus you should not be surprised with occasional p-values near 0 or 1,
such as .0012 or .9983. When a bit stream really FAILS BIG, you will get
p's of 0 or 1 to six or more places.  By all means, do not, as a
Statistician might, think that a p < .025 or p> .975 means that the RNG
has "failed the test at the .05 level".  Such p's happen among the
hundreds that DIEHARD produces, even with good RNG's.  So keep in mind
that "p happens".

           This is the BIRTHDAY SPACINGS TEST
Choose m birthdays in a year of n days.  List the spacings between the
birthdays.  If j is the number of values that occur more than once in that
list, then j is asymptotically Poisson distributed with mean m^3/(4n).
Experience shows n must be quite large, say n>=2^18, for comparing the
results to the Poisson distribution with that mean.  This test uses
n=2^24 and m=2^9,  so that the underlying distribution for j is taken to
be Poisson with lambda=2^27/(2^26)=2.  A sample of 500 j's is taken, and a
chi-square goodness of fit test provides a p value.  The first test uses
bits 1-24 (counting from the left) from integers in the specified file.
  Then the file is closed and reopened. Next, bits 2-25 are used to
provide birthdays, then 3-26 and so on to bits 9-32. Each set of bits
provides a p-value, and the nine p-values provide a sample for a KSTEST.
  Your RNG passed all the tests.

           THE OVERLAPPING 5-PERMUTATION TEST
This is the OPERM5 test.  It looks at a sequence of one million 32-bit 
random integers.  Each set of five consecutive integers can be in one of 
120 states, for the 5! possible orderings of five numbers.  Thus the 
5th, 6th, 7th,...numbers each provide a state. As many thousands of state 
transitions are observed, cumulative counts are made of the number of 
occurences of each state.  Then the quadratic form in the weak inverse of 
the 120x120 covariance matrix yields a test equivalent to the likelihood 
ratio test that the 120 cell counts came from the specified 
(asymptotically) normal distribution with the specified 120x120 
covariance matrix (with rank 99).  This version uses 1,000,000 integers, 
twice.
  Your RNG passed both tests.

           BINARY RANK TEST
This is the BINARY RANK TEST for 31x31 matrices. The leftmost 31 bits of 
31 random integers from the test sequence are used to form a 31x31 binary 
matrix over the field {0,1}. The rank is determined. That rank can be from
0 to 31, but ranks< 28 are rare, and their counts are pooled with those 
for rank 28. Ranks are found for 40,000 such random matrices and a 
chisquare test is performed on counts for ranks 31,30,29 and <=28.
  Tests are also performed for 32x32 and multiple tests for 6x8 matrices.
All tests were passed by the RNG.

                          THE BITSTREAM TEST
The file under test is viewed as a stream of bits. Call them b1,b2,... .  
Consider an alphabet with two "letters", 0 and 1 and think of the stream 
of bits as a succession of 20-letter "words", overlapping.  Thus the first
word is b1b2...b20, the second is b2b3...b21, and so on.  The bitstream 
test counts the number of missing 20-letter (20-bit) words in a string of 
2^21 overlapping 20-letter words.  There are 2^20 possible 20 letter 
words.  For a truly random string of 2^21+19 bits, the number of missing 
words j should be (very close to) normally distributed with mean 141,909 
and sigma 428.  Thus (j-141909)/428 should be a standard normal variate (z
score) that leads to a uniform [0,1) p value.  The test is repeated twenty
times.  Your RNG has problems with this test, with 18 out of 20 values
being .95 or higher and 14 of them being .99 or higher.

                    The tests OPSO, OQSO and DNA
                OPSO means Overlapping-Pairs-Sparse-Occupancy
The OPSO test considers 2-letter words from an alphabet of 1024 letters.  
Each letter is determined by a specified ten bits from a 32-bit integer in
the sequence to be tested. OPSO generates  2^21 (overlapping) 2-letter 
words  (from 2^21+1 "keystrokes")  and counts the number of missing 
words---that is 2-letter words which do not appear in the entire sequence.
That count should be very close to normally distributed with mean 141,909,
sigma 290. Thus (missingwrds-141909)/290 should be a standard normal 
variable. The OPSO test takes 32 bits at a time from the test file and 
uses a designated set of ten consecutive bits. It then restarts the file 
for the next de- signated 10 bits, and so on.  Your RNG also had serious
problems with this test, getting 16 scores of 1.0000 out of 23 tests, and
7 of the remaining 7 being at least .95.  The second run had 15 scores of
1.0000 out of 23, and 6 of the remaining 7 being at least .95.

            OQSO means Overlapping-Quadruples-Sparse-Occupancy
  The test OQSO is similar, except that it considers 4-letter words from 
an alphabet of 32 letters, each letter determined by a designated string 
of 5 consecutive bits from the test file, elements of which are assumed 
32-bit random integers. The mean number of missing words in a sequence of 
2^21 four- letter words,  (2^21+3 "keystrokes"), is again 141909, with 
sigma = 295.  The mean is based on theory; sigma comes from extensive 
simulation.  Your RNG failed this test with roughly 1/2 of the values
being .99 or higher.

   The DNA test considers an alphabet of 4 letters    C,G,A,T, determined 
by two designated bits in the sequence of random integers being tested.  
It considers 10-letter words, so that as in OPSO and OQSO, there are 2^20 
possible words, and the mean number of missing words from a string of 2^21
(over- lapping)  10-letter  words (2^21+9 "keystrokes") is 141909. The 
standard deviation sigma=339 was determined as for OQSO by simulation.  
(Sigma for OPSO, 290, is the true value (to three places), not determined 
by simulation.  Your RNG had no problems with this test.

            This is the COUNT-THE-1's TEST on a stream of bytes.
Consider the file under test as a stream of bytes (four per 32 bit 
integer).  Each byte can contain from 0 to 8 1's, with probabilities 
1,8,28,56,70,56,28,8,1 over 256.  Now let the stream of bytes provide a 
string of overlapping  5-letter words, each "letter" taking values 
A,B,C,D,E. The letters are determined by the number of 1's in a byte    
0,1,or 2 yield A, 3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E.
Thus we have a monkey at a typewriter hitting five keys with various 
probabilities (37,56,70,56,37 over 256).  There are 5^5 possible 5-letter 
words, and from a string of 256,000 (over- lapping) 5-letter words, counts
are made on the frequencies for each word.   The quadratic form in the 
weak inverse of the covariance matrix of the cell counts provides a 
chisquare test    Q5-Q4, the difference of the naive Pearson sums of 
(OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts.

   Test results for test.dat
 Chi-square with 5^5-5^4=2500 d.of f. for sample size:2560000
                               chisquare  equiv normal  p-value
  Results of COUNT-THE-1's in successive bytes:
 byte stream for test.dat         2703.30      2.875      .997980
 byte stream for test.dat         2794.20      4.161      .999984
Similar failure occured upon the second run.

            This is the COUNT-THE-1's TEST for specific bytes.
Consider the file under test as a stream of 32-bit integers. From each 
integer, a specific byte is chosen , say the left- most  bits 1 to 8. Each
byte can contain from 0 to 8 1's, with probabilitie 1,8,28,56,70,56,28,8,1
over 256.  Now let the specified bytes from successive integers provide a 
string of (overlapping) 5-letter words, each "letter" taking values 
A,B,C,D,E. The letters are determined  by the number of 1's, in that byte
0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D, and  6,7 or 8 ---> E.  Thus
we have a monkey at a typewriter hitting five keys with with various 
probabilities    37,56,70, 56,37 over 256. There are 5^5 possible 5-letter
words, and from a string of 256,000 (overlapping) 5-letter words, counts 
are made on the frequencies for each word. The quadratic form in the weak 
inverse of the covariance matrix of the cell counts provides a chisquare 
test    Q5-Q4, the difference of the naive Pearson  sums of
(OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts.  The RNG passed
this test.

                      THIS IS A PARKING LOT TEST
In a square of side 100, randomly "park" a car---a circle of radius 1.   
Then try to park a 2nd, a 3rd, and so on, each time parking "by ear".  
That is, if an attempt to park a car causes a crash with one already 
parked, try again at a new random location. (To avoid path problems, 
consider parking helicopters rather than cars.)   Each attempt leads to 
either a crash or a success, the latter followed by an increment to the 
list of cars already parked. If we plot n   the number of attempts, versus
k    the number successfully parked, we get a curve that should be similar
to those provided by a perfect random number generator.  Theory for the 
behavior of such a random curve seems beyond reach, and as graphics 
displays are not available for this battery of tests, a simple characteriz
ation of the random experiment is used  k, the number of cars successfully
parked after n=12,000 attempts. Simulation shows that k should average 
3523 with sigma 21.9 and is very close to normally distributed.  Thus 
(k-3523)/21.9 should be a standard normal variable, which, converted to a 
uniform variable provides input to a KSTEST based on a sample of 10.
  The RNG passed once and failed once, it also passed the third run.

                      THE MINIMUM DISTANCE TEST
It does this 100 times     choose n=8000 random points in a square of side
10000.  Find d, the minimum distance between the (n^2-n)/2 pairs of 
points.  If the points are truly independent uniform, then d^2, the square
of the minimum distance should be (very close to) exponentially 
distributed with mean .995 .  Thus 1-exp(-d^2/.995) should be uniform on 
[0,1) and a KSTEST on the resulting 100 values serves as a test of 
uniformity for random points in the square. Test numbers=0 mod 5 are 
printed but the KSTEST is based on the full set of 100 random choices of 
8000 points in the 10000x10000 square.  RNG passed once and failed once,
it passed the third run.  For this test and the above test, they failed
when a small seed was chosen (95), but passed with a seed of around 2500
and 28000.  The small seed was the second one chosen and had little effect
on most of the tests.

                     THE 3DSPHERES TEST
Choose  4000 random points in a cube of edge 1000.  At each point, center 
a sphere large enough to reach the next closest point. Then the volume of 
the smallest such sphere is (very close to) exponentially distributed with
mean 120pi/3.  Thus the radius cubed is exponential with mean 30. (The 
mean is obtained by extensive simulation).  The 3DSPHERES test generates 
4000 such spheres 20 times.  Each min radius cubed leads to a uniform 
variable by means of 1-exp(-r^3/30.), then a KSTEST is done on the 20 
p-values.  Test passed without problem.

             This is the SQEEZE test
Random integers are floated to get uniforms on [0,1). Starting with 
k=2^31=2147483647, the test finds j, the number of iterations necessary to
reduce k to 1, using the reduction k=ceiling(k*U), with U provided by 
floating integers from the file being tested.  Such j's are found 100,000 
times, then counts for the number of times j was <=6,7,...,47,>=48 are 
used to provide a chi-square test for cell frequencies.  RNG Passed
without problem.

             This is the SQEEZE test
Random integers are floated to get uniforms on [0,1). Starting with 
k=2^31=2147483647, the test finds j, the number of iterations necessary to
reduce k to 1, using the reduction k=ceiling(k*U), with U provided by 
floating integers from the file being tested.  Such j's are found 100,000 
times, then counts for the number of times j was <=6,7,...,47,>=48 are 
used to provide a chi-square test for cell frequencies.  Passed without
problem.

    This is the RUNS test.  It counts runs up, and runs down, in a 
sequence of uniform [0,1) variables, obtained by floating the 32-bit 
integers in the specified file. This example shows how runs are counted   
.123,.357,.789,.425,.224,.416,.95 contains an up-run of length 3, a 
down-run of length 2 and an up-run of (at least) 2, depending on the next 
values.  The covariance matrices for the runs-up and runs-down are well 
known, leading to chisquare tests for quadratic forms in the weak inverses
of the covariance matrices.  Runs are counted for sequences of length 
10,000.  This is done ten times. Then repeated.  Passed without problem.

This is the CRAPS TEST. It plays 200,000 games of craps, finds the number 
of wins and the number of throws necessary to end each game.  The number 
of wins should be (very close to) a normal with mean 200000p and variance 
200000p(1-p), with p=244/495.  Throws necessary to complete the game can 
vary from 1 to infinity, but counts for all>21 are lumped with 21. A 
chi-square test is made on the no.-of-throws cell counts. Each 32-bit 
integer from the test file provides the value for the throw of a die, by 
floating to [0,1), multiplying by 6 and taking 1 plus the integer part of 
the result.  Passed without problem.

-- 
  Best Wishes,
    Johnny Bravo

"The most merciful thing in the world, I think, is the inability
of the human mind to correlate all it's contents." - HPL

------------------------------

From: Bryan Olson <[EMAIL PROTECTED]>
Subject: Re: Re-seeding PRNG's in central key distribution systems
Date: Tue, 28 Mar 2000 03:12:25 GMT

Tim Tyler wrote:
> Bryan Olson  wrote:

> : Correct, and the requirements on PRNGs for key generation
> : are even stronger. Specifically, even given the state
> : of the PRNG, one should not be able to determine previous
> : outputs. This is not usually a requirement on stream-cipher
> : PRNGs, and thus a strong stream cipher is not automatically
> : good for key generation. One should not, for example, use
> : RC4.
>
> I find your post difficult to interpret.
>
> You say "given the state of the PRNG".
>
> You mean the *entire* internal state?

One should not be able to determine previous outputs from
the state.  Not from the entire state and not from any part
of the state. I don't see any ambiguity in what I wrote.  I
even gave an example of a generator which fails this
criteria, RC4, and one which satisfies it (barring unknown
cryptanalytic results), from FIPS 186.

> If so, I don't see why this
> requirement is a necessary one for a PRNG used for
> key generation.

Are you familiar with "ephemeral keys" or "forward secrecy"?

> If you have the *entire* internal state you can predict
> future output.  This would be very bad. Attackers should
> not be able to obtain the entire internal state of the
> PRNG in the first place.

Be carefull not to confuse "in the first place" with "ever".
In a lecture, Whit Diffie noted a disadvantage of
cryptographic protection, compared with other measures, is
that compromised keys can reach backward through time.
Cryptographers have learned to destroy secrets as soon as
possible.

Note also that the key generator from FIPS 186 allows an
optional input that gets mixed into the state.  It is not
always forward-predictable.

> The ability to work "backwards" from such an internal
> state might make the disaster /slightly/ worse, I
> suppose - but most effort should probably be directed
> at avoiding the disaster in the first place.

Exposure may be no disaster at all. Specifically, if we have
forward secrecy we need only detect exposure when it happens
or is likely to have happened - a strictly weaker condition
than preventing exposer.

Effort is not an issue.  The FIPS 186 generator is efficient
enough for 8-bit smart-cards.


--Bryan
--
email: bolson at certicom dot com


Sent via Deja.com http://www.deja.com/
Before you buy.

------------------------------

Date: Tue, 28 Mar 2000 05:40:02 +0200
From: Anonymous <[EMAIL PROTECTED]>
Subject: Re: Sunday People 26/3/2000: "FORGET YOUR PASSWORD... END UP IN JAIL" 

>> FORGET YOUR PASSWORD... END UP IN JAIL
>>
>> INTERNET FURY AT STRAW
>>
>> BIG Brother wants to know your computer password - and he'll throw
>> you in jail if you don't tell him.
>>
>> Home Secretary Jack Straw aims to make it a criminal offence to
>> refuse to tell police or secret services the way into your personal
>> computer.
>>
>> And you could go down for two years, even if you've only forgotten
>> the vital word.
>>
>> Under the Regulation of Investigatory Powers Bill, any data you
>> have stored will be presumed to be incriminating unless you can
>> prove otherwise. Civil liberties groups are furious over the
>> controversial new legislation, which is part of the Government's
>> bid to crack down on computer fraud, internet terrorism and child
>> porn.
>>
>>
>>
>> America, France, Ireland and Germany have already rejected similar
>> laws.
>>
>> www.fipr.org/rip#media
>
> What this means in effect, no one will want to use encryption in
> case they forget their password and end up in jail.

Which is precisely their goal, of course.

>                                                     This means that
> any attempt at privacy by the British computer user when using
> electronic communications will carry with it real risks. The old
> Soviet Union would have been proud of this law as I'm sure they and
> other repressive totalitarian states still existing around the world
> would welcome Jack Straw on to their Central Committee as Security
> Minister.

Is it any wonder that all of George Orwell's books are set in Britian
(or a fictional place that bears an uncanny resemblance to Britian)?
He was just a bit too early on his timetable.

> The most frightening thing is he can get away with it. There seem to
> be no organised protest to this attack on fundamental right's to
> privacy. Well maybe this just goes to prove, as I have often
> suspected, that we are the most intolerant and authoritarian country
> in the Western world.

The best slave is one who puts on his own shackles. Your country has a 
bad case of the liberal-socialist disease going back decades. The country
formerly known as the USA also has this disease but for not as long so
the decay isn't as pronounced. Yet.


Steve



























------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
Cryptography-Digest Digest #433

Reply via email to