Cryptography-Digest Digest #425, Volume #10 Sun, 17 Oct 99 21:13:03 EDT
Contents:
Re: Testing of randomness (Johnny Bravo)
Re: Ciphertext Stealing in CBC Mode
War Surplus Bargain Unlikely
----------------------------------------------------------------------------
From: [EMAIL PROTECTED] (Johnny Bravo)
Subject: Re: Testing of randomness
Date: Sun, 17 Oct 1999 18:24:58 GMT
On 17 Oct 1999 19:29:51 GMT, [EMAIL PROTECTED] (Kim G. S. OEyhus) wrote:
>I have made a random generator, electronic. So far all my tests show
>complete randomness. Do you have some thorough tests to recommend?
>I work in the GB range.
>
>Kim0
See http://stat.fsu.edu/~geo/diehard.html for the Diehard test suite.
The actual tests are as follows
BIRTHDAY SPACINGS TEST
Choose m birthdays in a year of n days. List the spacings
between the birthdays. If j is the number of values that
occur more than once in that list, then j is asymptotically
Poisson distributed with mean m^3/(4n). Experience shows n
must be quite large, say n>=2^18, for comparing the results
to the Poisson distribution with that mean. This test uses
n=2^24 and m=2^9, so that the underlying distribution for j
is taken to be Poisson with lambda=2^27/(2^26)=2. A sample
of 500 j's is taken, and a chi-square goodness of fit test
provides a p value. The first test uses bits 1-24 (counting
from the left) from integers in the specified file.
Then the file is closed and reopened. Next, bits 2-25 are
used to provide birthdays, then 3-26 and so on to bits 9-32.
Each set of bits provides a p-value, and the nine p-values
provide a sample for a KSTEST.
THE OVERLAPPING 5-PERMUTATION TEST
This is the OPERM5 test. It looks at a sequence of one mill-
ion 32-bit random integers. Each set of five consecutive
integers can be in one of 120 states, for the 5! possible or-
derings of five numbers. Thus the 5th, 6th, 7th,...numbers
each provide a state. As many thousands of state transitions
are observed, cumulative counts are made of the number of
occurences of each state. Then the quadratic form in the
weak inverse of the 120x120 covariance matrix yields a test
equivalent to the likelihood ratio test that the 120 cell
counts came from the specified (asymptotically) normal dis-
tribution with the specified 120x120 covariance matrix (with
rank 99). This version uses 1,000,000 integers, twice.
This is the BINARY RANK TEST for 31x31 matrices. The leftmost
31 bits of 31 random integers from the test sequence are used
to form a 31x31 binary matrix over the field {0,1}. The rank
is determined. That rank can be from 0 to 31, but ranks< 28
are rare, and their counts are pooled with those for rank 28.
Ranks are found for 40,000 such random matrices and a chisqua-
re test is performed on counts for ranks 31,30,29 and <=28.
This is the BINARY RANK TEST for 32x32 matrices. A random 32x
32 binary matrix is formed, each row a 32-bit random integer.
The rank is determined. That rank can be from 0 to 32, ranks
less than 29 are rare, and their counts are pooled with those
for rank 29. Ranks are found for 40,000 such random matrices
and a chisquare test is performed on counts for ranks 32,31,
30 and <=29.
This is the BINARY RANK TEST for 6x8 matrices. From each of
six random 32-bit integers from the generator under test, a
specified byte is chosen, and the resulting six bytes form a
6x8 binary matrix whose rank is determined. That rank can be
from 0 to 6, but ranks 0,1,2,3 are rare; their counts are
pooled with those for rank 4. Ranks are found for 100,000
random matrices, and a chi-square test is performed on
counts for ranks 6,5 and <=4.
THE BITSTREAM TEST
The file under test is viewed as a stream of bits. Call them
b1,b2,... . Consider an alphabet with two "letters", 0 and 1
and think of the stream of bits as a succession of 20-letter
"words", overlapping. Thus the first word is b1b2...b20, the
second is b2b3...b21, and so on. The bitstream test counts
the number of missing 20-letter (20-bit) words in a string of
2^21 overlapping 20-letter words. There are 2^20 possible 20
letter words. For a truly random string of 2^21+19 bits, the
number of missing words j should be (very close to) normally
distributed with mean 141,909 and sigma 428. Thus
(j-141909)/428 should be a standard normal variate (z score)
that leads to a uniform [0,1) p value. The test is repeated
twenty times.
The tests OPSO, OQSO and DNA
OPSO means Overlapping-Pairs-Sparse-Occupancy
The OPSO test considers 2-letter words from an alphabet of
1024 letters. Each letter is determined by a specified ten
bits from a 32-bit integer in the sequence to be tested. OPSO
generates 2^21 (overlapping) 2-letter words (from 2^21+1
"keystrokes") and counts the number of missing words---that
is 2-letter words which do not appear in the entire sequence.
That count should be very close to normally distributed with
mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should
be a standard normal variable. The OPSO test takes 32 bits at
a time from the test file and uses a designated set of ten
consecutive bits. It then restarts the file for the next de-
signated 10 bits, and so on.
OQSO means Overlapping-Quadruples-Sparse-Occupancy
The test OQSO is similar, except that it considers 4-letter
words from an alphabet of 32 letters, each letter determined
by a designated string of 5 consecutive bits from the test
file, elements of which are assumed 32-bit random integers.
The mean number of missing words in a sequence of 2^21 four-
letter words, (2^21+3 "keystrokes"), is again 141909, with
sigma = 295. The mean is based on theory; sigma comes from
extensive simulation.
The DNA test considers an alphabet of 4 letters C,G,A,T,
determined by two designated bits in the sequence of random
integers being tested. It considers 10-letter words, so that
as in OPSO and OQSO, there are 2^20 possible words, and the
mean number of missing words from a string of 2^21 (over-
lapping) 10-letter words (2^21+9 "keystrokes") is 141909.
The standard deviation sigma=339 was determined as for OQSO
by simulation. (Sigma for OPSO, 290, is the true value (to
three places), not determined by simulation.
This is the COUNT-THE-1's TEST on a stream of bytes.
Consider the file under test as a stream of bytes (four per
32 bit integer). Each byte can contain from 0 to 8 1's,
with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let
the stream of bytes provide a string of overlapping 5-letter
words, each "letter" taking values A,B,C,D,E. The letters are
determined by the number of 1's in a byte 0,1,or 2 yield A,
3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus
we have a monkey at a typewriter hitting five keys with vari-
ous probabilities (37,56,70,56,37 over 256). There are 5^5
possible 5-letter words, and from a string of 256,000 (over-
lapping) 5-letter words, counts are made on the frequencies
for each word. The quadratic form in the weak inverse of
the covariance matrix of the cell counts provides a chisquare
test Q5-Q4, the difference of the naive Pearson sums of
(OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts.
This is the COUNT-THE-1's TEST for specific bytes.
Consider the file under test as a stream of 32-bit integers.
From each integer, a specific byte is chosen , say the left-
most bits 1 to 8. Each byte can contain from 0 to 8 1's,
with probabilitie 1,8,28,56,70,56,28,8,1 over 256. Now let
the specified bytes from successive integers provide a string
of (overlapping) 5-letter words, each "letter" taking values
A,B,C,D,E. The letters are determined by the number of 1's,
in that byte 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D,
and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter
hitting five keys with with various probabilities 37,56,70,
56,37 over 256. There are 5^5 possible 5-letter words, and
from a string of 256,000 (overlapping) 5-letter words, counts
are made on the frequencies for each word. The quadratic form
in the weak inverse of the covariance matrix of the cell
counts provides a chisquare test Q5-Q4, the difference of
the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5-
and 4-letter cell counts.
THIS IS A PARKING LOT TEST
In a square of side 100, randomly "park" a car---a circle of
radius 1. Then try to park a 2nd, a 3rd, and so on, each
time parking "by ear". That is, if an attempt to park a car
causes a crash with one already parked, try again at a new
random location. (To avoid path problems, consider parking
helicopters rather than cars.) Each attempt leads to either
a crash or a success, the latter followed by an increment to
the list of cars already parked. If we plot n the number of
attempts, versus k the number successfully parked, we get a
curve that should be similar to those provided by a perfect
random number generator. Theory for the behavior of such a
random curve seems beyond reach, and as graphics displays are
not available for this battery of tests, a simple characteriz
ation of the random experiment is used k, the number of cars
successfully parked after n=12,000 attempts. Simulation shows
that k should average 3523 with sigma 21.9 and is very close
to normally distributed. Thus (k-3523)/21.9 should be a st-
andard normal variable, which, converted to a uniform varia-
ble, provides input to a KSTEST based on a sample of 10.
THE MINIMUM DISTANCE TEST
It does this 100 times choose n=8000 random points in a
square of side 10000. Find d, the minimum distance between
the (n^2-n)/2 pairs of points. If the points are truly inde-
pendent uniform, then d^2, the square of the minimum distance
should be (very close to) exponentially distributed with mean
.995 . Thus 1-exp(-d^2/.995) should be uniform on [0,1) and
a KSTEST on the resulting 100 values serves as a test of uni-
formity for random points in the square. Test numbers=0 mod 5
are printed but the KSTEST is based on the full set of 100
random choices of 8000 points in the 10000x10000 square.
THE 3DSPHERES TEST
Choose 4000 random points in a cube of edge 1000. At each
point, center a sphere large enough to reach the next closest
point. Then the volume of the smallest such sphere is (very
close to) exponentially distributed with mean 120pi/3. Thus
the radius cubed is exponential with mean 30. (The mean is
obtained by extensive simulation). The 3DSPHERES test gener-
ates 4000 such spheres 20 times. Each min radius cubed leads
to a uniform variable by means of 1-exp(-r^3/30.), then a
KSTEST is done on the 20 p-values.
This is the SQEEZE test
Random integers are floated to get uniforms on [0,1). Start-
ing with k=2^31=2147483647, the test finds j, the number of
iterations necessary to reduce k to 1, using the reduction
k=ceiling(k*U), with U provided by floating integers from
the file being tested. Such j's are found 100,000 times,
then counts for the number of times j was <=6,7,...,47,>=48
are used to provide a chi-square test for cell frequencies.
The OVERLAPPING SUMS test
Integers are floated to get a sequence U(1),U(2),... of uni-
form [0,1) variables. Then overlapping sums,
S(1)=U(1)+...+U(100), S2=U(2)+...+U(101),... are formed.
The S's are virtually normal with a certain covariance mat-
rix. A linear transformation of the S's converts them to a
sequence of independent standard normals, which are converted
to uniform variables for a KSTEST. The p-values from ten
KSTESTs are given still another KSTEST.
This is the RUNS test. It counts runs up, and runs down,
in a sequence of uniform [0,1) variables, obtained by float-
ing the 32-bit integers in the specified file. This example
shows how runs are counted .123,.357,.789,.425,.224,.416,.95
contains an up-run of length 3, a down-run of length 2 and an
up-run of (at least) 2, depending on the next values. The
covariance matrices for the runs-up and runs-down are well
known, leading to chisquare tests for quadratic forms in the
weak inverses of the covariance matrices. Runs are counted
for sequences of length 10,000. This is done ten times. Then
repeated.
This is the CRAPS TEST. It plays 200,000 games of craps, finds
the number of wins and the number of throws necessary to end
each game. The number of wins should be (very close to) a
normal with mean 200000p and variance 200000p(1-p), with
p=244/495. Throws necessary to complete the game can vary
from 1 to infinity, but counts for all>21 are lumped with 21.
A chi-square test is made on the no.-of-throws cell counts.
Each 32-bit integer from the test file provides the value for
the throw of a die, by floating to [0,1), multiplying by 6
and taking 1 plus the integer part of the result.
NOTE Most of the tests in DIEHARD return a p-value, which
should be uniform on [0,1) if the input file contains truly
independent random bits. Those p-values are obtained by
p=F(X), where F is the assumed distribution of the sample
random variable X---often normal. But that assumed F is just
an asymptotic approximation, for which the fit will be worst
in the tails. Thus you should not be surprised with
occasional p-values near 0 or 1, such as .0012 or .9983.
When a bit stream really FAILS BIG, you will get p's of 0 or
1 to six or more places. By all means, do not, as a
Statistician might, think that a p < .025 or p> .975 means
that the RNG has "failed the test at the .05 level". Such
p's happen among the hundreds that DIEHARD produces, even
with good RNG's. So keep in mind that " p happens".
Johnny Bravo
------------------------------
From: [EMAIL PROTECTED] ()
Subject: Re: Ciphertext Stealing in CBC Mode
Date: 17 Oct 99 23:07:33 GMT
Eric W Braeden ([EMAIL PROTECTED]) wrote:
: I'm reviewing the idea of Ciphertext Stealing in CBC mode
: and having a little trouble with Figure 9.5 page 196 in AC.
: Because Bruce didn't explain much in the book, the Figure
: just isn't clear to me. I would seem that the last full block
: and the final partial block are flipped around. Also what
: is the symbol phi supposed to be? Any good discriptions
: of Ciphertext Stealing out there?
Well, "Tying Up Loose Ends" mentions Ciphertext Stealing among other
things. The encipherment which includes a partial block should just be
done in plain ECB mode, since part of the previous block is obscured, so
one can't decipher if CBC continues.
(somewhere on my page, http://www.ecn.ab.ca/~jsavard/jscrypt.htm is the
index)
The idea is simply this: after enciphering as many full blocks as you can,
encipher the last partial block, and a part of the last full block (after
it was already enciphered) of sufficient size to make what you are
enciphering a whole block.
Ciphertext Stealing involves taking those two pieces, and flipping them
around, because that way _alignment_ is maintained. This can be important
if the cipher has some types of weakness, or if your message is an odd
number of bits long, so that not maintaining alignment makes things more
complicated.
John Savard
------------------------------
From: [EMAIL PROTECTED] ()
Subject: War Surplus Bargain Unlikely
Date: 17 Oct 99 23:28:25 GMT
During a web search, I came across the following URL:
http://www.hypertools.com/wantpage.html
which lists some old Collins radio equipment that the page owner is
looking for...and also some old Hagelin machines.
However, also listed in the want list are the KL-7 and the KW-7...
which I don't think have been declassified just yet. Of course, there was
a news report a few years back of war surplus equipment intercepted on its
way to China which indicates that declassification isn't _always_ a
prerequisite to things turning up in war surplus outlets...
John Savard
------------------------------
** FOR YOUR REFERENCE **
The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:
Internet: [EMAIL PROTECTED]
You can send mail to the entire list (and sci.crypt) via:
Internet: [EMAIL PROTECTED]
End of Cryptography-Digest Digest
******************************