Cryptography-Digest Digest #267

Digestifier Sat, 18 Sep 1999 21:18:22 -0700
Cryptography-Digest Digest #267, Volume #10      Sun, 19 Sep 99 01:13:03 EDT

Contents:
  Re: some information theory (Anti-Spam)
  Re: (US) Administration Updates Encryption Export Policy (wtshaw)
  Re: Okay "experts," how do you do it? ("Trevor Jackson, III")
  Re: unix clippers that implement strong crypto. (SCOTT19U.ZIP_GUY)
  Re: Okay "experts," how do you do it?
  Re: Mystery inc. (Beale cyphers) ("Wayne G. Namerow")
  Re: Okay "experts," how do you do it?

----------------------------------------------------------------------------

From: Anti-Spam <[EMAIL PROTECTED]>
Subject: Re: some information theory
Date: Sat, 18 Sep 1999 18:13:07 -0700

Tim Tyler wrote:
> 
> Anti-Spam <[EMAIL PROTECTED]> wrote:
> : Tim Tyler wrote:
> 
        ***  (((((((( edited )))))))) ***
> 
> If you have a set of N target data strings in need of compressing, the
> optimum compression technique (in terms of size) is essentially to map
> these strings onto the integers from 1 to N.
> 
> Given a string taken at random from the starting set, the resulting
> "compressed file" will be indistinguishable from random.

Let me paraphrase, to make sure I understand this assertion:

Start with a set of N data strings.  Assume each data string is an
encoding for a symbol S.
I assume this set corresponds then to N symbols  ( S1, S2, S3 ... Sn )
that will appear in messages (compressed data/files). Call this set A.

Conjecture: the optimum compression (here defined in terms of size of
the resulting compression-encoded data/file) is achieved when (1) the N
symbols are encoded as integers 1 through N in the compressed data/file,
irrespective of the frequency of occurance of the Nth symbol in any
particular message. 
Furthermore, (2) the random assigment of the integer index value code
for the resulting compression encoded data/file will pass all confidence
tests for random bit strings of the length of the resulting compressed
data/file. 

Part (1):

here is the set A: { S1, S2, S3 ... Sn }
index into set A:    0,  1 , 2  ... N-1   ->  N integer values. 

Assume the symbols S1 ... Sn do not occur equiprobably. The maximal
entropy code (and thus minimal number of bits and thus minimal sized bit
strings for messages composed of the symbols) requires 

H = - (SUM over i = 1 to n )[ ( probability of Si ) * ( log(base2) of
probability of Si) ] 

where probability of Si = number of occurances of the ith symbol / total
number of symbols that occur. Occur where, how?  Look at a
pre-compressed message M composed of Q occurances of symbols found in
the set A.  For some messages maybe all of the symbols in A occur. For
others, maybe only a subset of the symbols in A occur.
The frequencies of occurance are a function of the particular message
M.  

So each symbol can be encoded as one out of 2 ^ H unique possible binary
strings, with H varying from message M1, M2, M3...  (Static and adaptive
huffman codes create strings shorter and longer than H bits such that
the *average* number of bits per symbol tends to H bits, to handle the
fractional bit values given by this formula for maximal entropy
encoding.) 

There are N symbols in the set A. An integer index into the set A
requires ( log(base 2)of N ) bits to be represented. The use of an
integer value (index) to code for the symbol in the compressed data/file
results in the minimal size compressed data/file iff:

        H =  ( log(base 2)of N ).  

i.e this is only true if the probability of occurance for each Si is the
same for all symbols. All symbols as they appear in a specific message M
must be equiprobable for the use of the index into the set A to give the
minimal size for the compressed data/file. 

If the symbols are not equiprobable, then H bits/symbol < ( log(base
2)of N ) bits/symbol. Therefore there is another code with shorter bit
lengths for the symbols.  The use of the index values for elements in
the set A as a code would not  produce the minimal bit length
compression of the data/file.

But this is not the original assertion in (1).  ***** So what am I
missing?  *****

Point (2):  

Given the  set A: { S1, S2, S3 ... Sn }   and 
index into set A:    0,  1 , 2  ... N-1   ->  N integer values. 

Randomly select an index value and its(  log(base 2)of N ) bit string to
represent (encode) the kth occurance of a symbol Si in the "compressed"
data/file as it appears in the original data/file.  

Assume there is a random variable IND, an index value over in the range
[0, N-1], the pdf of IND is uniformly distrbuted across the integer 0
through N-1.  Thus the probability of IND taking on any specific index
value is 1/N.  

Take the message M composed of occurances of the symbols Si and for each
symbol, take the ith value of IND as the index into set A and place that
bit string code into the ith position in the "compressed" data/file:

message M -  (no spaces here really, just spreading 'em out to make it
easier to read )
s1  s3  s1  s2  s5  s2  s4  s2  s1  s1  s4  s7  s9 s10 ......   

index value
1   3   1   2   5   2    4   2   1  1   4   7   9   10

value of IND (a random variable )
1   6   12  4   2   7   11  3   9  12  2   2   14  8 ..........

so how do we get to the "compressed" data/file representation?  Whatever
the function, it must be reversible, one-to-one and onto - to ensure we
successfully "decompress" the result.  A random substitution of a new
index value for the existing index value in the uncompressed data/file
is not reversible.  How would we know what it originally mapped to when
we decompressed? 

Assume the substitution is equal to a modulo function (this is just one
possible choice - there are others):

compressed index =  ( original index + IND ) mod N 

"decompression" given by original index = ( compressed index - IND ) mod
N.

so "compressed" values are:
(1 + 1 )mod N, (3 + 6 ) mod N, (1 + 12) mod N, (2 + 4 ) mod N ......

The compressed index is a random variable.  

This isn't acting as a compression, however, this is a cipher. ( If
compression is supposed to come about due to the use of the index value
coding scheme, we already showed it doesn't give the shortest bit length
total for the "compressed" data/file. )  In fact, this is a non-periodic
polyalphabetic cipher where the key is the total sequence of random
values of IND and that key is as long as the original message (in terms
of number of symbols in the original message M.) 

To decode the "compressed" data/file, which in this case is actually
enciphered, the same key or sequence of values for IND must be
subtracted.  

Conveying that sequence of random values, as long as the message M in
terms of symbols Q in it, to the decompressor is equivalent to conveying
a one-time pad sequence to the recipient to use to decipher it back to
the original message M. 

Given the key (sequence of IND values) is random, and the index code
strings are identical lengths, the "compressed" data/file, which is
actually an encipher data/file, is random.  The compressed index is a
random variable.  

So, this will pass the tests for randomness since it is fed by a random
key, the sequence of IND values. 
* AGREED.*

However, if the "compression" scheme uses some function of the symbols
frequencies that preceed the current symbol to be "compressed" :

IND(k) = function( frequencies of occurance of some or all previous
symbols as they appeared in M )

then the sequence of IND values will be a pseudorandom sequence ( call
it PIND).  There will be correlations in the values of PIND due to the
non-randomness in the occurance of symbols in M.  Therefore,

compressed index =  ( original index + PIND ) mod N 

"decompression" given by original index = ( compressed index - PIND )
mod N.

The compressed index is not a random variable. Each new value of PIND is
influenced by some or all of the previous frequencies of symbol
occurance as represented in the un-"compressed" data/file, and thus this
shared influence manifests as correlations across the bit sequence in
the "compressed" data/file.  The bit sequences in the "compressed"
data/file will fail some or all statistical tests for randomness. If
NONE of the frequencies of symbol occurance of any of the preceeding
un-"compressed" data/file influences the code chosen for the current
symbol to "compress" then the result would not be pseudo-random. 

A deterministic bit stream can be "combined" with a random bit stream in
a reversable (one-to-one and onto) manner such that the result is
random.  A deterministic bit stream "combined" with another
deterministic bit stream in a reversable (one-to-one and onto) process,
no matter how complex or convoluted, produces another deterministic bit
stream whose substrings will evidence some correlations.

And the result as shown in (1) is not compressed to the minimal bit
length.

What did I miss?  I don't understand the validity of your assertions.
Please help.


[EMAIL PROTECTED]

------------------------------

From: [EMAIL PROTECTED] (wtshaw)
Subject: Re: (US) Administration Updates Encryption Export Policy
Date: Sat, 18 Sep 1999 19:34:54 -0600

In article <7rutbv$n6l$[EMAIL PROTECTED]>, "rosi" <[EMAIL PROTECTED]> wrote:
> 
>    As the other asked: How is this 'technical review' defined and on what
> basis can one application be turned down?
> 
They can say that nothing was turned down, merely shelved for their convenience.

I sure would like to dump on them with a few dozen items to see if they
have effectively invented a new way to say no to something trivial or not.
-- 
Mark my return, in a somewhat limited way. 
Less us keep up the good fight for reasonable laws.

------------------------------

Date: Sat, 18 Sep 1999 22:30:03 -0400
From: "Trevor Jackson, III" <[EMAIL PROTECTED]>
Subject: Re: Okay "experts," how do you do it?



Sundial Services wrote:

> jerome wrote:
> > so you cryptanalyze all the cyphers described by beginners ?
> > i can easily write a feistel network which -seems- not too
> > bad in 2hours.
> > after that, how resistant it is... even against known attack, it is
> > another matter. to find out you have to crypanalyze it and try
> > at least the most common attacks (linear, differential, key realated).
>
> To try to steer this discussion gently back on course, let me say that
> for example it seems to me [an untutored individual] that one could
> create some kind of a test-bed into which the cipher algorithm could be
> installed and fed a bunch of known data.  One could then statistically
> analyze the outputs obtained to predict the quality of the cipher.

One could but the statistical analysis your describing is probably much
worse than a brute force attack.  If this kind of testing were feasible it
would be the basis for an objective evaluation of cipher strength.  It may
be possible to proive that this is impossible in principle.

> Cryptanalysis papers that I have read often focus upon the number of
> key-bits actually used by the cipher, the influence of the key vs. the
> influence of the data on the output of the cipher, or the difference in
> output given slight or substantial changes in the input data.
>
> These should be quantitatively analyzable characteristics.  Anyone ought
> to be able to do them.  And furthermore, it seems to me not wholly
> implausible that some kind of automatic, perhaps neural-network-like
> strategy could be employed to *design* a cipher or to establish the
> actual values used in critical components like s-boxes.
>
> I tend to view with skepticism any argument that says that only certain
> individuals know how to do things.  HOW do they know?  WHAT do they
> know?  I should not have to undergo the experience of "cracking zillions
> of ciphers" to slowly and painfully gain knowledge if anyone else in
> this world has gone through the experience before and can, and will,
> express in some objective way what it is they learned.
>
> We have notions to this day that cryptographers sit in dark rooms,
> poring over large sheets of paper with square grids of letters and
> numbers on them, waiting for some magic flash of insight, "ah hah!"  But
> I doubt that real cryptographers have done that in a hundred years or
> more.  We have computers now.  They can accomplish the work in
> milliseconds once we can articulate what the work should be.
>
> So, "what do you do," and "how do you do it?"
>
> Can anyone, more clever than I, speculate in any knowledgeable way what
> -might- an automatic test-bed look for, or analyze, to gauge the quality
> of a cipher, umm, automatically?

If you supply one critical definition I can handle the rest of the project.
The critical piece is a language/grammar/algebra for describing all possible
ciphers.  Once you have a universal cipher description mechanism you can
automate the task of reasoning about ciphers.  Even a partial definition
would be a useful starting point.


------------------------------

From: [EMAIL PROTECTED] (SCOTT19U.ZIP_GUY)
Crossposted-To: comp.security.unix
Subject: Re: unix clippers that implement strong crypto.
Date: Sun, 19 Sep 1999 04:23:19 GMT

In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] (Terry Ritter) wrote:
>
>On Fri, 17 Sep 1999 20:38:35 -0400, in
><[EMAIL PROTECTED]>, in sci.crypt Alwyn Allan
><[EMAIL PROTECTED]> wrote:
>
>>"Christopher J. Mattern" wrote:
>>
>>> "Illegal but perhaps difficult to prosecute" is *still* illegal...
>>
>>Illegal is not a precisely defined legal term. In general use it means
> "forbidden by law."
>>There is no law which forbids the use of patented technology, therefore it is
> not illegal.
>>
>>Patent law simply allows the patent holder, through civil court action, to
> stop the
>>commercial use of the patented technology for a period of time. It does not
> provide for
>>damages, even in cases of flagrant and willful infringement.
>
>Absolutely false.  Where do you get this stuff?  Damages are at the
>heart of patent infringement litigation.  
>
>Specific damages include lost royalty income and profits made from
>infringement.  In "cases of flagrant and willful infringement," one
>could recover attorney fees with *triple* damages.  Deliberately
>breaking a cipher well-known to be patented is clearly willful and
>might be flagrant.  
>

   Terry this seems to conflict with what you just anwsered in my
message you just wrote:

Maybe you have the wrong idea about patents:  The whole point of a
patent is to *reveal* information, not protect it.  It is trade
secrecy which hides information.  A patent protects the particular
*use* of particular information, not learning about it.  

The above is what you just wrote on another thread. If the whole
point of a patent is to "reveal" information to advance the art
of cryptograohy who could breaking a cipher that is well-known to
be patented be clearly some sort of flagrant violation when you just
stated you want to advance cryptography. I think I am missing something
in what you mean. Since the above seem like opposites to me.



David A. Scott
--
                    SCOTT19U.ZIP NOW AVAILABLE WORLD WIDE
                    http://www.jim.com/jamesd/Kong/scott19u.zip
                    http://members.xoom.com/ecil/index.htm
                    NOTE EMAIL address is for SPAMERS

------------------------------

From: [EMAIL PROTECTED] ()
Subject: Re: Okay "experts," how do you do it?
Date: 19 Sep 99 03:55:52 GMT

David A Molnar ([EMAIL PROTECTED]) wrote:
: Trevor Jackson, III <[EMAIL PROTECTED]> wrote:
: > Sure it is.  "Very tough" does not describe the Halting Problem, Godel's
: > Theorem, or incrementing from one to 2^googol.  Those problems cannot be
: > "solved", and we can prove it.

: The output of incrementing from one to 2^googol is finite and unique,
: therefore the problem is solvable. I know at least one person who really
: does think this way. 

You don't count many mathematicians among your friends, do you?

John Savard

------------------------------

From: "Wayne G. Namerow" <[EMAIL PROTECTED]>
Subject: Re: Mystery inc. (Beale cyphers)
Date: Sat, 18 Sep 1999 23:57:51 -0400



Curt Welch wrote:

> Niteowl <[EMAIL PROTECTED]> wrote:
> > Dr. Matyas wrote a paper after much apparent research and thinks the
> > version of the DOI used was from "An Historical, Geographical,
> > Commercial, and Philosophical View of the American United States"
> > published in 1795 by W. Winterbotham.  The version of the DOI there does
> > 'correct' many of the numbering errors seen in B2.
>

It should be noted that according to Dr. Matyas, the Winterbotham London
editions were also published in 1799 and 1819. We don't know which of
these Beale used.

>
> Interesting of course that the document was published 25 years before 1821
> (the time of the encoding according to the story) and some 90 years before
> the publication by Ward in 1885.  It seems that someone creating a hoax
> in 1885 wouldn't likely be using a 1795 publication as their source for the
> DOI.
>
> So I guess this means that Dr. Matyas examined many different versions
> of the DOI and _only_ that publication was found to contain a version
> of the DOI which so closely matched the errors?
>

Dr. Matyas compiled a checklist of books, pamphlets, periodicals and
broadsides (1776 - 1822) that print a complete copy of the U.S. DOI.
The final checklist contained over 370 entries(!) It was found that every
Declaration was different except in the instances where the Declaration
had been obviously printed from the same setting of type. The differences
were of wording, capitalization and punctuation.

>
> I actually find it surprising that anyone would publish a modified
> version of the DOI.  Do you know what the differences were (in the 1795
> publication) and have any understanding of why it was changed?  i.e., just
> sloppy work (typos etc) or some type of intentional editing?
>

I would guess mostly due to hand transcription and typesetting.


-Wayne


------------------------------

From: [EMAIL PROTECTED] ()
Subject: Re: Okay "experts," how do you do it?
Date: 19 Sep 99 03:54:37 GMT

Sundial Services ([EMAIL PROTECTED]) wrote:
: To try to steer this discussion gently back on course, let me say that
: for example it seems to me [an untutored individual] that one could
: create some kind of a test-bed into which the cipher algorithm could be
: installed and fed a bunch of known data.  One could then statistically
: analyze the outputs obtained to predict the quality of the cipher.

No, one can't do that. One can test the qualtiy of a random-number
generator that way, if it is going to be used for a Monte Carlo method
calculation, because the characteristics it must have to be satisfactory
for that are well-defined.

Strength in a cipher system is an open-ended quality; a cipher can be
strong against a long list of documented attacks, yet be very weak if one
of these attacks is applied to it with a small variation - one that only
an _intelligent human being_ can recognize. Not a simple computer program.

: I tend to view with skepticism any argument that says that only certain
: individuals know how to do things.  HOW do they know?  WHAT do they
: know?  I should not have to undergo the experience of "cracking zillions
: of ciphers" to slowly and painfully gain knowledge if anyone else in
: this world has gone through the experience before and can, and will,
: express in some objective way what it is they learned.

There is a difference between learning *facts* and learning *skills*. To
look at the description of a cipher, and see where it might be weak, is a
task that is assisted by being familiar with a lot of ciphers and the
kinds of attacks that were useful against them.

But one also has to be able to see how an attack can be transformed to
apply to a different kind of cipher than the one it was originally used
with. One has to be able to see how a general principle can be applied to
extend an attack. One has to be able to invent new attacks.

: We have notions to this day that cryptographers sit in dark rooms,
: poring over large sheets of paper with square grids of letters and
: numbers on them, waiting for some magic flash of insight, "ah hah!"  But
: I doubt that real cryptographers have done that in a hundred years or
: more.  We have computers now.  They can accomplish the work in
: milliseconds once we can articulate what the work should be.

They do accomplish their part of the work very quickly and conveniently.
And I suspect that even now it is possible to design symmetric-key ciphers
which offer no hope of solution. But as long as ciphers _can_ be broken,
human ingenuity, and human ingenuity of a high order, will be required for
the process.

It is not accidental that the British effort to crack the Enigma included
some of Britain's foremost mathematicians, such as Alan Turing and I. J.
Good, or the well-known chess player Conel Hugh O'Donel Alexander.

John Savard

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
Cryptography-Digest Digest #267

Reply via email to