Re: Comcast@Home bans VPNs

2000-08-24 Thread Phil Karn

>Is making an SSL connection creating a VPN? It's really not much 
>different in an abstract sense. Most applications are using browsers 

I've been saying for some time that we need an IP-over-SSL tunneling
protocol standard. ISPs would *never* dare block TCP port 443, since
as we all know the only important Internet application is to let
people buy stuff online...
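
To make the idea concrete, here's a minimal sketch of the client side
of such a tunnel, in Python. The framing (a 2-byte length prefix on
each raw IP datagram, inside an ordinary TLS connection to port 443)
and the server name are made up for illustration, and hooking the
packets up to a real TUN interface is left out entirely:

  # Hypothetical IP-over-TLS framing: each IP datagram is sent as a
  # 2-byte big-endian length followed by the raw packet bytes, inside
  # an ordinary TLS session to TCP port 443.
  import socket, ssl, struct

  def open_tunnel(server):
      # 'server' is an imaginary tunnel endpoint, not a real service
      ctx = ssl.create_default_context()
      raw = socket.create_connection((server, 443))
      return ctx.wrap_socket(raw, server_hostname=server)

  def send_packet(tls, packet):
      tls.sendall(struct.pack("!H", len(packet)) + packet)

  def recv_exact(tls, n):
      buf = b""
      while len(buf) < n:
          chunk = tls.recv(n - len(buf))
          if not chunk:
              raise ConnectionError("tunnel closed")
          buf += chunk
      return buf

  def recv_packet(tls):
      (length,) = struct.unpack("!H", recv_exact(tls, 2))
      return recv_exact(tls, length)

To a middlebox, this looks like any other HTTPS session -- which is
the whole point.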

*Great* essay in today's NYT, by the way.

Phil




Re: GPS integrity

2000-05-21 Thread Phil Karn

>Sounds like some interested parties should take some GPS gear and some
>radio receiving and test gear to one of the spots where the millatree
>is warning airmen that "for the next two weeks, GPS doesn't work
>here", and see just what sort of jamming they are using...

A good idea, but I note that most of these sites are on large western
military reservations out in the middle of nowhere. You
might be able to see something from a small plane (keeping out of
restricted airspace, of course) but then you need some other way of
knowing where you are so you can compare results.

Phil




Re: GPS integrity

2000-05-11 Thread Phil Karn

>To decrease the jamming power required (this -is- spread spectrum,
>after all), it's helpful to have your jammer hop the same way your
>receiver will be hopping.  This is pretty easy to do, since your
>jammer can trivially figure out the hops by observing the satellites
>you can see.  Note also that any outfit that makes GPS's typically

GPS uses direct sequence spread spectrum, not frequency
hopping. Accurate timing is an inherent feature of direct sequence,
but usually not of frequency hopping.

The GPS C/A chipping sequences (known as Gold Codes) are openly
published, so you can generate them yourself with just a few shift
registers and some combinatorial logic. There are 32 different
sequences, each 1023 chips long, one for each satellite in the
constellation. No need to observe the satellites you're jamming.
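
In fact the whole C/A generator fits in a few lines. Here's a sketch
in Python following the usual two-register (G1/G2) description; the
only PRN-specific detail is which pair of G2 stages you tap (PRN 1
uses stages 2 and 6):

  # Generate one 1023-chip period of a GPS C/A (Gold) code from the two
  # 10-stage shift registers G1 and G2, both seeded with all ones.
  # Tap positions below are 1-indexed register stages.
  def ca_code(t1, t2):
      g1 = [1] * 10
      g2 = [1] * 10
      chips = []
      for _ in range(1023):
          # chip = G1 stage 10 XOR the two PRN-specific G2 stages
          chips.append(g1[9] ^ g2[t1 - 1] ^ g2[t2 - 1])
          # G1 feedback: stages 3,10; G2 feedback: stages 2,3,6,8,9,10
          fb1 = g1[2] ^ g1[9]
          fb2 = g2[1] ^ g2[2] ^ g2[5] ^ g2[7] ^ g2[8] ^ g2[9]
          g1 = [fb1] + g1[:9]
          g2 = [fb2] + g2[:9]
      return chips

  prn1 = ca_code(2, 6)  # PRN 1; first 10 chips are 1100100000 (octal 1440)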

You'll also need to impress the 50bps navigation message on the
chipping sequence, but again this is all openly documented (except for
the reserved fields that are apparently carrying encrypted data).

In many ways, a GPS spoofer is a much simpler device than a GPS
receiver.

Phil







Re: GPS integrity

2000-05-11 Thread Phil Karn

>As for RAIM, my Garmin GNS430 (spiffy aviation GPS) has RAIM.  Luckily
>I've never actually seen the RAIM warning flag.  My understanding of
>RAIM matches what's been said before, position information is
>heuristically computed and when an anomalous position/speed occurs,
>the flag is raised.  Sudden changes in position, altitude, speed, etc.
>would set off the flag, taking under consideration that an airplane
>would generally not invoke an impulse acceleration :)

I don't think that's how RAIM works. My understanding is that it uses
extra satellites (anything over 4 in a 3D fix) to compute an
overdetermined solution by least squares. Then it looks to see whether
any of the raw satellite measurements are inconsistent with this fix
by more than a certain amount. If so, it marks the offending satellite
as unhealthy, notifies the user, and drops it from the fix. I believe
a certified aviation receiver needs 5 satellites in view to detect a
fault and 6 to identify and exclude the faulty one.

This is nothing more than the standard statistical technique of
detecting and eliminating outliers in your raw data.
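
As a toy illustration of that outlier step (not real RAIM, which does
the least-squares fit on pseudoranges with four unknowns), here's the
idea applied to redundant one-dimensional measurements; the numbers
are made up:

  # Toy version of the RAIM-style consistency check: fit the redundant
  # measurements, flag the worst outlier if its residual is too large,
  # and re-fit without it.
  def check_consistency(measurements, threshold):
      est = sum(measurements) / len(measurements)  # least-squares fit of a constant
      resid = [abs(m - est) for m in measurements]
      worst = max(range(len(measurements)), key=lambda i: resid[i])
      if resid[worst] <= threshold:
          return est, None                         # everything consistent
      kept = [m for i, m in enumerate(measurements) if i != worst]
      return sum(kept) / len(kept), worst          # drop the outlier, report which

  # five receivers' clock offsets in microseconds, one of them bad:
  print(check_consistency([1.02, 0.98, 1.01, 0.99, 7.50], threshold=0.5))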

There are bits in the broadcast ephemeris that allow a satellite to
mark itself as unhealthy so that receivers will ignore it. There is
also supposed to be a mechanism that will cause a satellite to switch
to a "nonstandard" code sequence when it detects an internal fault so
that ordinary receivers will stop receiving it.

The problem is when a satellite "silently" fails, without marking
itself as unhealthy and/or changing the PRN code.

This happened some years ago when one GPS satellite suddenly jumped
way off in its timing for just a few frames.  This was enough,
however, to screw up the many CDMA base stations in North America that
were all using that one satellite for timing.

In those days, it was common for GPS receivers used for stationary
timing to track just one satellite at a time, as this reduced the
small-scale timing jitter. But it left you completely vulnerable
to the failure of the satellite you were tracking.

There's a common theme here -- redundancy is good.  And I think that's
the only reasonable approach to solving the original problem of GPS
timing integrity: have lots of GPS receivers in lots of different
places all comparing results.

Phil






Re: GPS integrity

2000-05-11 Thread Phil Karn

>There is a CRC or something similar on the C/A code, and this is all
>publicly documented.  I'm quite sure there is nothing that would
>qualify as 'authentication' in any strong sense.  One of the

It's actually a Hamming code. But yes, it is used only for error
detection, and does not qualify as any kind of authentication.
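
To illustrate why a public linear parity check can't authenticate
anything, here's a generic Hamming(7,4) encoder (just an illustration,
not the actual GPS navigation-message parity algorithm, which is a
different but equally public Hamming code):

  # Generic Hamming(7,4) encoder.  Anyone can compute valid parity bits
  # for any forged data word, so a code like this detects transmission
  # errors but provides no authentication whatsoever.
  def hamming74(d1, d2, d3, d4):
      p1 = d1 ^ d2 ^ d4
      p2 = d1 ^ d3 ^ d4
      p3 = d2 ^ d3 ^ d4
      return [p1, p2, d1, p3, d2, d3, d4]

  print(hamming74(1, 0, 1, 1))  # a perfectly "valid" codeword for bogus data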

>DOD certainly has the capability to jam, probably with incorrect
>signals in addition to just noise.  There have been public notices for
>various areas over the last few years.  (If they couldn't do this,

Here's a typical notice (NOTAM) for the White Sands area:

For the Aviation Community

GPS UNRELIABLE WITHIN A 390NM RADIUS OF TCS VORTAC AT FL400 AND ABOVE, 
DECREASING IN AREA AS YOU DECREASE IN ALTITUDE TO 300 NMR AT FL250, 
200 NMR AT 10,000 FT MSL, AND 125 NMR AT THE SURFACE.  IFR OPERATIONS 
BASED UPON GPS NAVIGATION SHOULD NOT BE PLANNED IN THE AFFECTED 
AREA DURING THE PERIODS INDICATED (WHICH WILL BE PUBLISHED IN A GPS 
NOTAM).  THESE OPERATIONS INCLUDE DOMESTIC RNAV OR LONG-RANGE 
NAVIGATION REQUIRING GPS.  THESE OPERATIONS ALSO INCLUDE GPS 
STANDALONE AND OVERLAY INSTRUMENT APPROACH OPERATIONS.

ON THE FOLLOWING DATES AND TIMES:

Between the hours of 0200-0800Z (2000-0200 MDT)

MAY - 7, 10, 12, 14, 17, 19, 21, 24, 26, 28, & 31.

JUN - 2, 4, 7, 9, 11, 14, 16, 18, 21, 23, 25, 28, & 30.

JUL - 2, 5, 9, 14, 16, 26, 28, & 30.

Between the hours of 0001-0600Z (1800-2400 MDT)

JUL - 7, 12, 19, 21, & 23.



Phil





Re: GPS integrity

2000-05-11 Thread Phil Karn

>If I were worried about integrity of timing signals, I'd use a
>GPS-disciplined rubidium oscillator.  I think most of the available
>devices like this are not quite as concerned with integrity as phase
>noise reduction in the normal case, so some tweaking of the

These are actually quite common in stationary timing applications. In
the early days of CDMA digital cellular development, we used GPS
receivers with rubidium oscillators in each cell site because the GPS
constellation was too incomplete to guarantee continuous coverage. I
think the commercial base stations still have rubidium oscillators
along with a spec to stay within 1 microsecond (the tolerance required
to permit soft handoff) for at least 24 hrs without seeing a GPS
satellite. This is to cover local blockages and interference as well
as any outage of the GPS constellation.

Phil





Re: GPS integrity

2000-05-09 Thread Phil Karn

As you say, there are two coded GPS signal streams: C/A (Clear/Access
or Coarse/Acquisition, depending on the reference) and P
(Precision). These are in turn placed on two L-band RF frequencies, L1
and L2.

The C/A and P signal structures are fully documented in the open
literature. See:

http://www.navcen.uscg.mil/gps/geninfo/gpsdocuments/icd200/default.htm

However, the P-code is normally XORed with a classified cryptographic
sequence, the Y-code; this is "anti spoof". As far as I can tell from
the open literature, this is conventional symmetric cryptography with
keys shared by the satellite and all "authorized" users. Security
relies entirely on the controlled distribution and physical security
of the receivers.

In normal operation, L1 carries both C/A and P/Y, while L2 carries
only P/Y.  The configuration of these two carriers, along with the
state of anti-spoof (Y) on the P code, is indicated by bits in the
50bps navigation message from each satellite.

As an aside, there are "reserved" fields in the navigation messages
that started carrying apparently random data several years ago. A good
guess is that these control periodic key changes in the military
Y-code receivers.

Some high-end civilian receivers can still make use of the L2 signal
by "squaring" it to remove the (unknown) modulation so that the
carrier phase can be extracted.
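
A quick numerical sketch of why squaring works: the signal is
d(t)*cos(wt) with unknown d(t) = +/-1, and squaring it gives
(1 + cos(2wt))/2 no matter what d(t) is, so a clean line appears at
twice the carrier frequency. The sample rate, carrier frequency and
bit rate below are arbitrary toy values:

  # Squaring a BPSK-modulated carrier to strip off the unknown +/-1 data.
  import math, random

  fs, f, n = 8000.0, 100.0, 8000     # sample rate, carrier, number of samples
  bits = [random.choice((-1, 1)) for _ in range(n // 80)]  # 80 samples per bit
  s = [bits[i // 80] * math.cos(2 * math.pi * f * i / fs) for i in range(n)]
  sq = [x * x for x in s]

  def tone_power(x, freq):           # correlate against a test tone
      c = sum(v * math.cos(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
      q = sum(v * math.sin(2 * math.pi * freq * i / fs) for i, v in enumerate(x))
      return (c * c + q * q) / len(x)

  # the squared signal has almost nothing at f but a strong line at 2f:
  print(tone_power(sq, f), tone_power(sq, 2 * f))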

Since the C/A format is fully documented and unencrypted, it's
actually quite simple to spoof. There are commercial GPS "satellite
simulators" on the market that do precisely this. They can make a GPS
receiver display any time and location you want.  They have legitimate
uses in laboratory testing, e.g., to verify GPS receiver operation
through a week 1024 rollover (we bought one at Qualcomm some time ago,
since we rely heavily on GPS timing in our CDMA systems).

It doesn't take much imagination to realize what could happen if you
connected one of these simulators to a power amplifier and antenna.
The GPS satellite signals are pretty weak at the earth's surface, so
they're easy to jam or spoof, at least over a small area.

The original GPS design for a military receiver used the C/A code only
for initial acquisition (hence the Coarse/Acquisition term), followed by
a transition to the P/Y code. (Civilian receivers simply stay on the
C/A code).

Several years ago, the NRC report on GPS policy recommended that DoD
turn off Selective Availability and develop a capability to regionally
jam the L1 signal (the only one carrying C/A) during a war. This would
render mass-market GPSes unusable. So that the US military could still
use GPS in the area, the NRC recommended the development of a GPS
receiver that could acquire using only the P/Y code on the L2
carrier. This was a challenge since the P code not only runs at 10x
the chip rate of the C/A code but also repeats only once per week (as
opposed to once every millisecond for C/A). This requires a highly
stable oscillator that can run continuously and accurately in a
hand-held unit under battlefield conditions.
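
The back-of-the-envelope arithmetic behind that acquisition problem:

  # Code-phase search space for direct P(Y) acquisition vs. C/A.
  ca_chip_rate = 1.023e6                # chips/second
  p_chip_rate  = 10.23e6                # chips/second, 10x the C/A rate
  print(1023 / ca_chip_rate)            # C/A period: 1023 chips = 1 ms
  print(p_chip_rate * 7 * 24 * 3600)    # P period: one week ~= 6.2e12 chips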

It's been pretty obvious for some time that the DoD has been
conducting GPS jamming tests. There have been regular NOTAMs (Notices
to Airmen) that GPS signals would be "unreliable" in the vicinity of
such-and-such military base during such-and-such hours. This was most
often Fort Huachuca in southeastern Arizona, where there is an
"electronic proving ground".

They must have succeeded, which is why SA was finally turned off.

So while the military has themselves covered, there's not much we
civilian users can do beyond looking for strength in numbers by having
lots of GPS receivers in lots of places all exchanging observations
with NTP. Spoofing such a network would require spoofing nearly all
of its receivers at once to keep the rest from detecting that
something fishy is going on.

Phil






Re: DeCSS MPAA New York Opinion

2000-02-03 Thread Phil Karn

>Judge Kaplan aims at settling the code as expression
>dispute, citing Bernstein, Karn and Junger cases, and 
>the First Amendment loses to Copyright and DMCA Acts.

This is one of the sloppiest, most misinformed judicial opinions I've
read in a long time. E.g., he states that copyright infringement is
not protected by the first amendment, that DMCA is intended to bolster
copyright protection, ergo, the first amendment is irrelevant to the
DMCA.

With this logic, he could also conclude that the fourth amendment
wouldn't apply to any law intended to bolster copyright protection,
e.g., one authorizing regular, warrantless midnight raids looking for
anybody with infringing materials in their homes.

My attorney tells me that Commerce has finally issued a letter
formally ruling that the Applied Cryptography source code diskette at
issue in my case is now exportable. I should receive a hard copy soon.
When I do, I will post a copy to the net and frame the physical copy
for my wall.

The contents of the AC diskette have been widely available on the net
for a long time, starting (to my knowledge) with an Italian FTP site
in September 1994. This lent a certain air of surrealism to my
case. The CSS cases have even more of this Wonderland flavor to them.

Phil




Re: DVD CCA Emergency Hearing to seal DeCSS

2000-01-26 Thread Phil Karn

>There have been over 26,000 downloads and they are now going out at 
>600 per hour.

I hope you're keeping only the total counts, not the detailed access logs.

Phil



Re: NEC Claims World's Strongest Encryption System - still more snake oil?

2000-01-24 Thread Phil Karn

>NEC's system creates an intermediate key of several thousand bits in 
>length from the master key, and that serves as the base for the 
>encryption process.  [...]

Can anybody say "key schedule generation"?

Phil



Re: More BXA mail about regs

2000-01-21 Thread Phil Karn

So it appears that there is now a significant difference in the
treatment of source code and object code, even object code compiled
from open source already on the net. Am I correct?

If so, this could complicate the wholesale incorporation of crypto
libraries and applications as packages (e.g., .rpm and .deb files) in
binary Linux distributions. These binary distributions are very popular,
as they make it really easy for the neophyte to bring up Linux.

It would be nice to get a BXA ruling that treats object code compiled
from open source the same as the source itself. Barring that, I think
an expedient approach would be to distribute crypto applications and
libraries as packages containing, instead of the usual object code,
source code that is automatically compiled and installed on the target
machine by the package's installation script. This would slow down the
installation and introduce package dependencies on compilers and
header files, but it wouldn't have to change what the user sees and
does very much.

What about the "open cryptographic interface" provision? Would this
mean that any application that calls a crypto function would also have
to be distributed in source code that gets compiled during
installation?  We'd have to be careful, or else the entire "binary"
Linux distribution would consist of source code plus compilation
scripts, and installation could take quite a bit longer than it does
now.

Phil



Re: beyond what is necessary

2000-01-21 Thread Phil Karn

>>"a.4. Specially designed or modified to reduce the compromising
>>emanations of information-bearing signals beyond what is necessary
>>for the health, safety or electromagnetic interference standards;"

>So, who gets to say what's a standard?  

>Some people's standards are higher than the government
>(e.g., varieties of 'organic'; kosher; etc).

This is especially true for radio amateurs (hams) doing "weak-signal"
work on the VHF (30-300 MHz) and UHF (300-3000 MHz) bands. Some of the
propagation modes used include tropospheric scatter; meteor trail
reflection; satellite communications; and the ultimate, EME
(earth-moon-earth, i.e., using the moon as a passive reflector).

Natural background and modern receiver noise levels are all very low
on these bands, so unwanted computer emissions have long been a
serious problem. (Modern CPU clock speeds are now well into the UHF
region). Simply meeting the FCC Class B (residential) emission limits
is not nearly enough. Those regulations were intended to protect
broadcast receivers working with signals considerably stronger than
those involved in amateur weak-signal work.

So ever since the first personal computer appeared in a ham shack,
hams have been trying to shield, bypass and otherwise suppress their
interfering signals. Some approaches resemble those taken in Tempest
equipment: special filters on power and signal lines; metal equipment
cases with insulating paint removed and resealed with finger stock and
copper tape; plastic cases coated with conductive paint; and so forth.

I still have an early-80's clone monochrome PC monitor that I
extensively modified in this way. It's obsolete, but it's quiet.  And
it was all "necessary for (my) electromagnetic interference
standards".

Phil




Unrestricted crypto software web posting

2000-01-20 Thread Phil Karn

Pursuant to 15 CFR Part 734, as revised on January 14, 2000, notice is
hereby given that files containing freely available (open source)
source code for cryptographic functions are being published on the
World Wide Web at the URL

http://people.qualcomm.com/karn/code/des/index.html

Phil Karn




Re: BXA press release URL; and where to get the regs in HTML

2000-01-17 Thread Phil Karn

>Apache 2.0 has general programming hooks that are sufficient for adding
>crypto.

And so does the UNIX shell:

tar cf - . | ssh -C foo 'tar xvf -'

Dunno how far they tried to control this even under the old regs.

Phil



Re: BXA press release URL; and where to get the regs in HTML

2000-01-12 Thread Phil Karn

Okay, I've read the latest version of the regs. As usual, they're long and
confusing, with exceptions to the exceptions to the exceptions. But
several things seem to stand out.

1. You can export pretty much anything to anyone but a foreign
government or to the seven pariah countries (Libya, Iraq, etc).

2. You can export anything that's publicly available (retail products,
source code, toolkits, etc) to anybody, including a foreign
government, as long as they're not in one of the seven pariah
countries.

3. When posting free crypto (source or object) on the net, you don't
need to implement any form of access control, even though this would
make it technically possible for one of the seven pariah countries to
download it.

4. The bottom line is, the only stuff that's still controlled is
proprietary encryption provided directly to a foreign government, or
to the pariah countries.

Do I have all this right so far?

What still confuses me is which circumstances let you just send an
email pointer to BXA, and which ones require a review of some sort
before you can export.

Phil



Re: DeCSS Court Hearing Report

2000-01-04 Thread Phil Karn

>No, October 28, 2000 is when the act of circumventing an effective
>technological measure becomes a violation (with exceptions for fair

But if it was an "effective technological measure", it couldn't have
been circumvented. And by circumventing CSS, wasn't it shown to not be
an effective technological measure??

Phil



Re: Globalstar close to pact with FBI over wiretaps

1999-09-29 Thread Phil Karn

Yet another illustration of how true security can only be provided by
the users themselves on an end-to-end basis. Saltzer, Reed & Clark
(authors of "End-to-End Arguments in System Design") have been proven
right yet again. So has Machiavelli, author of "The Prince".

The necessary hook for CDMA PCS users to provide their own end-to-end
encryption -- a generic IP packet data service -- has finally been
rolled out by Sprint PCS, over six years after I first prototyped it
in the lab. You may have seen their ads last weekend for their
"Wireless Web" service. I haven't used it for VoIP yet, but SSH works
just fine. A Palm Pilot (or pdQ) also works just fine.

Plugging a secure VoIP phone into a PCS handset certainly won't be as
convenient as a cell phone with built-in encryption, but at least
it'll make true end-to-end security possible. And I'm pushing hard for
the same packet data service to be provided in Globalstar; we're
already testing it in-house on an ad-hoc basis.

Phil




Re: 3DEs export?

1999-09-01 Thread Phil Karn

>http://www.zixmail.com/ZixFAQ/index.html#4
>claims that a 3DES email security product has been approved for export.
>Is there something about the security of this system that is compromised?

That's because it implements key recovery. They don't stress that fact,
but it's there if you dig.

Phil



Re: US Urges Ban of Internet Crypto

1999-07-28 Thread Phil Karn

>I recognize that this issue is controversial, unless we address 
>this situation, use of the Internet to distribute encryption products 
>will render Wassenaar's controls immaterial."

Gee, I thought Reinsch said it didn't matter that encryption software
was distributed on the Internet because nobody will trust anything
they download off the Internet... :-)

Trying to debate these people rationally is like trying to nail Jello
to a wall.

Phil



Re: A5/1 cracking hardware estimate

1999-05-11 Thread Phil Karn

I worked on cryptanalyzing A5/1 several years ago. I built a
tree-based search routine that could retire many keys in each test
cycle. The exact number per cycle varied enormously depending on how
far into the tree I was when I found a conflict with the keystream
that would let me prune the branch. In the early phases of the search
this could be as much as 1/8 of the entire 64-bit shift register
space, but most of the time it was "only" a few million keys.

My approach assumed an arbitrary 64 bits of initial shift register
state, and I couldn't readily see how to exploit the fact that the
initial key had less entropy because of the way the crank is turned
100 times before generating a keystream.
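
For reference, here's a sketch of the keystream generator itself,
using the register lengths, feedback taps and majority-clocking rule
from the publicly reconstructed description of A5/1. Key and
frame-number loading, and those 100 warm-up clocks, are omitted; the
64 bits of register state are simply assumed to be loaded already.

  # A5/1 keystream generator (publicly reconstructed design): three LFSRs
  # of 19, 22 and 23 bits; a register is clocked only when its clocking
  # bit agrees with the majority of the three clocking bits.
  R1_LEN, R2_LEN, R3_LEN = 19, 22, 23
  R1_TAPS, R2_TAPS, R3_TAPS = (13, 16, 17, 18), (20, 21), (7, 20, 21, 22)
  R1_CLK, R2_CLK, R3_CLK = 8, 10, 10

  def step(reg, taps, length):
      fb = 0
      for t in taps:
          fb ^= (reg >> t) & 1
      return ((reg << 1) | fb) & ((1 << length) - 1)

  def keystream(r1, r2, r3, nbits):
      out = []
      for _ in range(nbits):
          c1 = (r1 >> R1_CLK) & 1
          c2 = (r2 >> R2_CLK) & 1
          c3 = (r3 >> R3_CLK) & 1
          maj = int(c1 + c2 + c3 >= 2)
          if c1 == maj:
              r1 = step(r1, R1_TAPS, R1_LEN)
          if c2 == maj:
              r2 = step(r2, R2_TAPS, R2_LEN)
          if c3 == maj:
              r3 = step(r3, R3_TAPS, R3_LEN)
          # output bit is the XOR of the three registers' high bits
          out.append(((r1 >> (R1_LEN - 1)) ^ (r2 >> (R2_LEN - 1)) ^
                      (r3 >> (R3_LEN - 1))) & 1)
      return out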

I haven't worked on this problem in a while, but it did seem to me
that it's even more amenable to custom hardware than DES.

I suppose I could dust off my code...

Phil



Re: Bernstein Opinion Up

1999-05-06 Thread Phil Karn

>I agree.  There -is- a little nit in that they seem to conflate
>"low-level", "assembly language", and "machine code" as all being
>exactly the same thing, with the implicit presumption that humans
>never read or write assembly language and that only a "high-level"
>language like C or Lisp might be appropriately protected as being
>speech and not purely functional.  This would seem to be a problem

I don't see the term "assembly" anywhere in the opinion, and in
context I don't think this court would have a problem classifying an
assembler implementation of DES as "source code" for the purposes of
First Amendment protection. After all, both C and assembler are
"source code" written so humans can understand it.

They do say

"We express no opinion regarding whether object code manifests a
"close enough nexus to expression" to warrant application of the prior
restraint doctrine."

because object code was not at issue in this case. Of course, even if
the First Amendment protections were ultimately limited to source
code, it would be a simple matter to distribute a makefile (and
possibly a compiler) with the source so a user could generate object
code as needed.

Perhaps it would actually be a blessing to have such a restriction to
source code. It would help boost the Open Source movement, and it
would also help make distributed encryption code easier to examine for
bugs and trojan horses...

Phil



Re: Bernstein Opinion Up

1999-05-06 Thread Phil Karn

I just read the opinion. These judges actually *got* it! Or at least
two of them did, judges Bright and Fletcher. There's some marvelous
stuff in their opinion, such as the observation that Bernstein's code
had more than a little political expression to it: by showing how
to turn a hash function (which isn't regulated) into a cipher (which
is), he meant to demonstrate the arbitrary and silly nature of the
regulations.

Judge Nelson unfortunately bought the government's bogus claim that
crypto source code was more like a machine than speech, claiming that
"Only a few people can actually understand what a line of source code
would direct a computer to do."  But even Nelson did not say he'd
definitely uphold the regulations as constitutional; he just thought
Bernstein should have used a different legal theory to argue his case.

Phil




Re: references to password sniffer incident

1999-03-24 Thread Phil Karn

>sniffible, none of my passwords were.  I happen to be one of the lucky
>few who has made it through the politics of large companies to "open
>up the firewall".  Yes, corporate IT people see something even as
>secure as SSH as 'opening the firewall'.

>Clearly we need to teach the MIS/IT personnel about existing
>techniques.

In general, the problem is not that the MIS/IT personnel don't know
about or understand existing techniques, or think them insecure.  That
would be easy to fix with a little education. The real problem with
SSH, IPSEC, encrypted Telnet and the like is that they're much too
*decentralized* for their taste. And that directly threatens their
power base.

The people who run today's MIS/IT departments are the direct
descendants of those who ran big computer centers in the old days.
They've watched as most of their reason for being has been eroded out
from under them by the personal computer. The network is the only
thing they have left. They justify their tight central control of it
with strident appeals to security fears, just as governments have for
centuries whipped up fears about crime to justify the creation of
police states.

Deploy good security mechanisms in host systems so they no longer
depend on (largely illusory) security mechanisms in the network,
and you've taken away the very last reason these people have to go on
living. Expect a big fight.

Phil






Re: references to password sniffer incident

1999-03-24 Thread Phil Karn

>...And of course nobody has compromised any of the ssh binaries on the
>workstations...

Workstations? What workstations? Anybody serious about security brings
their own laptops. And then they worry about them being tampered with
by the hotel custodial staff.

Laptops are also easier to lug into a working group meeting so you can
read your mail during the more boring presentations.

Phil




Re: references to password sniffer incident

1999-03-23 Thread Phil Karn

Actually, things are getting much better in the IETF terminal rooms.
SSH is now *very* widely used, with encrypted Telnet and IPSEC
trailing well behind.

Phil




Re: references to password sniffer incident

1999-03-08 Thread Phil Karn

I don't specifically know about MAE-West, but there have been any
number of attacks on ISPs that involved setting up password sniffers
on major transit Ethernets.

Phil



Good news in my crypto case

1999-02-24 Thread Phil Karn

Judge Oberdorfer has granted our request for discovery and a hearing
in my long-running court case challenging the crypto export
controls. Read the judge's ruling:

http://people.qualcomm.com/karn/export/lbo_ruling.html

Other material on my case is available under

http://people.qualcomm.com/karn/export/

--Phil



Re: How to put info in the public domain for patent puropses?

1999-01-14 Thread Phil Karn

>If I recall correctly, the US Patent and Trademark Office has said that it
>would not consider information placed on the Internet to be published for
>patent purposes. Preparing papers for journals or conferences is a pain,

Is this really true? I thought I had heard the opposite, but I'm not sure.

Not that it really matters, of course, because the PTO never bothers
to read much prior art (including even prior patents) before granting
a claim. Their apparent policy is to just sit on an application for
about 18 months during which time they may quibble about the precise
wording of a few claims to maintain the fiction that they really
"examine" them. Then they issue the patent -- no matter how obviously
bogus -- and let the courts sort it all out. It's all very depressing.

Phil




Re: Building crypto archives worldwide to foil US-built Berlin Walls

1998-12-09 Thread Phil Karn

>I've always wanted to set up some secret-sharing filesystem where
>you have to download multiple "shares" to reconstruct the data.
>But other combinations of those exact same shares give other data.

I've also been toying with this idea for a few years. Throw in a
Reed-Solomon code, and you can make a fault-tolerant network where any
K servers out of N are enough to reconstruct the data, but fewer than
K are insufficient. I have written fast RS code -- it's on my website,
http://people.qualcomm.com/karn/dsp.html.
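
Not my RS library, but here's a toy sketch of the K-of-N idea it rests
on: a Reed-Solomon codeword is just evaluations of a polynomial, so
any K intact evaluations determine the polynomial and hence the data.
This does it with Lagrange interpolation over a prime field (the
field, the 3-of-6 parameters and the sample data are arbitrary; real
RS code works in GF(2^m) with far more efficient decoding):

  # Toy K-of-N reconstruction by polynomial interpolation over GF(p).
  P = 2**31 - 1                                  # a convenient prime field

  def make_shares(coeffs, n):
      # the K data elements are coefficients of a degree-(K-1) polynomial;
      # each share is one evaluation of that polynomial
      def poly(x):
          return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
      return [(x, poly(x)) for x in range(1, n + 1)]

  def recover_first(shares):
      # Lagrange-interpolate the polynomial at x = 0, i.e. coeffs[0];
      # any K distinct shares give the same answer, fewer than K cannot
      total = 0
      for i, (xi, yi) in enumerate(shares):
          num, den = 1, 1
          for j, (xj, _) in enumerate(shares):
              if i != j:
                  num = num * (-xj) % P
                  den = den * (xi - xj) % P
          total = (total + yi * num * pow(den, P - 2, P)) % P
      return total

  shares = make_shares([12345, 222, 333], n=6)   # K = 3 of N = 6
  print(recover_first(shares[0:3]), recover_first(shares[3:6]))  # both 12345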

Phil



Re: Using MD5/SHA1-style hashes for document

1998-11-03 Thread Phil Karn

>Take disk files as an example.  Hashing files (ignoring the name)
>would be a saner way to discover whether you have duplicate files on your
>disk than to compare every file with every other.

I actually played with this many years ago when I wrote a utility
to traverse a UNIX file system looking for duplicate files and to link
the duplicates together to recover disk space.

My first program produced an MD5 hash of every file and sorted the hash
results, just as you suggest. That worked, but I later came up with a
better (faster) algorithm that doesn't involve hashing. I simply
quicksort the list of files using a comparison function that, as a
side effect, links duplicate files together. Then I rescan the sorted
list comparing adjacent entries to mop up any remaining duplicates.

The vast majority of duplicates are found and linked by the comparisons
in the quicksort phase; relatively few are found by the subsequent
re-scan. I also applied the obvious performance enhancements, such as
caching the results of fstat() operations in memory.

This algorithm was considerably faster than hashing because a typical
filesystem has many files with unique sizes, and some of these files
can be quite big. Once the comparison function discovers that two
files have different sizes, there's no need to go any further. You
don't need to hash them or even read their contents. This is a big win when
you have lots of large files.
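
Here's a minimal sketch of that sort-first approach (simplified: it
only reports the duplicates rather than hard-linking them, reads whole
files instead of comparing block by block, and skips the fstat()
caching):

  # Sort files by (size, then content) so duplicates land next to each
  # other, then compare neighbors.  Files with unique sizes are never
  # read at all, which is where the speed comes from.
  import os, sys
  from functools import cmp_to_key

  def compare(a, b):
      sa, sb = os.path.getsize(a), os.path.getsize(b)
      if sa != sb:              # different sizes: no need to read either file
          return sa - sb
      with open(a, "rb") as fa, open(b, "rb") as fb:
          da, db = fa.read(), fb.read()
      return (da > db) - (da < db)

  def find_duplicates(paths):
      ordered = sorted(paths, key=cmp_to_key(compare))
      return [(x, y) for x, y in zip(ordered, ordered[1:])
              if compare(x, y) == 0]

  if __name__ == "__main__":
      files = [os.path.join(d, f)
               for d, _, names in os.walk(sys.argv[1]) for f in names]
      for x, y in find_duplicates(files):
          print(x, "==", y)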

Phil