> Adam Olsen (AO) wrote:
>AO> The Wayback Machine has 150 billion pages, so 2**37. Google's index
>AO> is a bit larger at over a trillion pages, so 2**40. A little closer
>AO> than I'd like, but that's still 56294995000 to 1 odds of having
>AO> *any* collisions between *any* of the files.
On Fri, 17 Apr 2009 11:19:31 -0700, Adam Olsen wrote:
> Actually, *cryptographic* hashes handle that just fine. Even for files
> with just a 1 bit change the output is totally different. This is known
> as the Avalanche Effect. Otherwise they'd be vulnerable to attacks.
>
> Which isn't to say
On Apr 17, 9:59 am, SpreadTooThin wrote:
> You know this is just insane. I'd be satisfied with a CRC16 or
> something in the situation i'm in.
> I have two large files, one local and one remote. Transferring every
> byte across the internet to be sure that the two files are identical
> is just n
On Apr 17, 9:59 am, norseman wrote:
> The more complicated the math the harder it is to keep a higher form of
> math from checking (or improperly displacing) a lower one. Which, of
> course, breaks the rules. Commonly called improper thinking. A number
> of math teasers make use of that.
Of cou
On Apr 16, 11:15 am, SpreadTooThin wrote:
> And yes he is right CRCs hashing all have a probability of saying that
> the files are identical when in fact they are not.
Here's the bottom line. It is either:
A) Several hundred years of mathematics and cryptography are wrong.
The birthday problem
On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen wrote:
On Apr 16, 3:16 am, Nigel Rantor wrote:
Okay, before I tell you about the empirical, real-world evidence I have
could you please accept that hashes collide and that no matter how many
samples you use the probability of finding two files th
On Apr 16, 8:59 am, Grant Edwards wrote:
> On 2009-04-16, Adam Olsen wrote:
> > I'm afraid you will need to back up your claims with real files.
> > Although MD5 is a smaller, older hash (128 bits, so you only need
> > 2**64 files to find collisions),
>
> You don't need quite that many to have a
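The back-of-envelope arithmetic behind these odds is the birthday bound: for n random inputs into a b-bit hash, the chance of at least one collision is roughly n**2 / (2 * 2**b) while that value is small. A sketch (the function and figures here are illustrative, not from the thread):

```python
# Approximate birthday-bound collision probability: p ~ n**2 / (2 * 2**b)
# for n random items hashed into a b-bit space (valid while p is small).

def collision_probability(n, bits):
    """Probability of at least one collision among n items under a b-bit hash."""
    return n * n / (2.0 * 2 ** bits)

# ~1 trillion pages (about 2**40) under a 160-bit hash such as SHA-1:
print(collision_probability(2 ** 40, 160))  # exactly 2**-81: astronomically small

# MD5 is 128 bits; 2**64 inputs give roughly even odds of a collision:
print(collision_probability(2 ** 64, 128))  # 0.5
```

This is also why Grant's point stands: you need far fewer than 2**64 files before the collision probability stops being negligible, since it grows with the square of n.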
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
an incorrect result.
Not when you're using them to compare lots of files.
Tr
On Apr 15, 11:04 am, Nigel Rantor wrote:
> The fact that two md5 hashes are equal does not mean that the sources
> they were generated from are equal. To do that you must still perform a
> byte-by-byte comparison which is much less work for the processor than
> generating an md5 or sha hash.
>
> I
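The byte-by-byte comparison Nigel describes can run in constant memory by reading both files in fixed-size chunks and stopping at the first difference. A minimal sketch (the function name and chunk size are my own):

```python
import os

def files_identical(path_a, path_b, chunk_size=64 * 1024):
    """Compare two files chunk by chunk, stopping at the first difference."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # files of different sizes can never match
    with open(path_a, 'rb') as fa, open(path_b, 'rb') as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                return False
            if not a:  # both files exhausted with no mismatch
                return True
```

The early exits are the reason this is cheaper than hashing: a hash must always consume both files in full, while a direct compare stops at the first differing chunk (or at the size check).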
On Apr 15, 8:04 am, Grant Edwards wrote:
> On 2009-04-15, Martin wrote:
> > Hi,
> > On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards wrote:
> >> On 2009-04-13, SpreadTooThin wrote:
> >>> I want to compare two binary files and see if they are the same.
> >>> I see the filecmp.cmp functi
Grant Edwards wrote:
We all rail against premature optimization, but using a
checksum instead of a direct comparison is premature
unoptimization. ;)
And more than that, it will provide false positives for some inputs.
So, basically it's a worse-than-useless approach for determining if two
files are the same.
On 2009-04-15, Martin wrote:
> On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano
> I'd still say rather burn CPU cycles than development hours (if I got
> the question right),
_Hours_? Calling the file compare module takes _one_line_of_code_.
Implementing a file compare
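Grant's one line of code is presumably along these lines; shallow=False asks filecmp.cmp to compare file contents rather than just os.stat() signatures (the temp-file setup here exists only to make the demo self-contained):

```python
import filecmp
import os
import tempfile

# Two small files with identical contents, created just for the demo.
d = tempfile.mkdtemp()
a = os.path.join(d, 'a.bin')
b = os.path.join(d, 'b.bin')
for p in (a, b):
    with open(p, 'wb') as f:
        f.write(b'\x00\x01\x02')

# The one-liner: shallow=False forces a byte comparison instead of
# merely comparing os.stat() signatures.
print(filecmp.cmp(a, b, shallow=False))  # True
```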
On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano
wrote:
> The checksum does look at every byte in each file. Checksumming isn't a
> way to avoid looking at each byte of the two files, it is a way of
> mapping all the bytes to a single number.
My understanding of the original question was a way t
On Wed, 15 Apr 2009 07:54:20 +0200, Martin wrote:
>> Perhaps I'm being dim, but how else are you going to decide if two
>> files are the same unless you compare the bytes in the files?
>
> I'd say checksums, just about every download relies on checksums to
> verify you do have indeed the same file.
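Martin's download case is the textbook use of a checksum: hash the file incrementally and compare the digest with the published one. A sketch (the helper name is mine):

```python
import hashlib

def file_digest(path, algorithm='sha256', chunk_size=64 * 1024):
    """Hash a file incrementally so large files never sit in memory at once."""
    h = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# Verifying a download means comparing this digest against the published one:
#   assert file_digest('some_download.tar.gz') == expected_hexdigest
```

Note the asymmetry with the local-comparison case: for a download there is no second copy to compare byte-by-byte, so a digest is the only practical option.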
Hi,
On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards wrote:
> On 2009-04-13, SpreadTooThin wrote:
>
>> I want to compare two binary files and see if they are the same.
>> I see the filecmp.cmp function but I don't get a warm fuzzy feeling
>> that it is doing a byte by byte comparison of two files
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two files to see if they
are they same.
What should I be using if not filecmp.cmp?
--
http://mail.python.org/mailman/listinfo/python-list
On 2009-04-13, Peter Otten <__pete...@web.de> wrote:
> But there's a cache. A change of file contents may go
> undetected as long as the file stats don't change:
Good point. You can fool it if you force the stats to their
old values after you modify a file and you don't clear the
cache.
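To make the stat-forcing trick concrete: filecmp caches deep-comparison results keyed on each file's (type, size, mtime) signature, so rewriting a file with same-sized contents and restoring its timestamps leaves the stale cached answer reachable. A sketch against modern Python (filecmp.clear_cache() exists from 3.4; in 2009-era Python the cache was the module-private filecmp._cache):

```python
import filecmp
import os
import tempfile

d = tempfile.mkdtemp()
a = os.path.join(d, 'a')
b = os.path.join(d, 'b')
with open(a, 'wb') as f:
    f.write(b'AAAA')
with open(b, 'wb') as f:
    f.write(b'AAAA')

print(filecmp.cmp(a, b, shallow=False))  # True, and the result is cached

# Rewrite b with different bytes of the SAME size, then restore its
# old timestamps so the stat signature is unchanged.
st = os.stat(b)
with open(b, 'wb') as f:
    f.write(b'BBBB')
os.utime(b, ns=(st.st_atime_ns, st.st_mtime_ns))

# The cache still maps this (path, signature) key to the old answer;
# clearing it forces a fresh byte-by-byte comparison.
filecmp.clear_cache()
print(filecmp.cmp(a, b, shallow=False))  # False
```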
On 2009-04-13, SpreadTooThin wrote:
> I want to compare two binary files and see if they are the same.
> I see the filecmp.cmp function but I don't get a warm fuzzy feeling
> that it is doing a byte by byte comparison of two files to see if they
> are they same.
Perhaps I'm being dim, but how else are you going to decide if two
files are the same unless you compare the bytes in the files?
Here is the latest version of the code:

currentdata_file = r"C:\Users\Owner\Desktop\newdata.txt"  # the latest download from the clock
lastdata_file = r"C:\Users\Owner\Desktop\mydata.txt"      # the prior download from the clock
output_file = r"C:\Users\Owner\Desktop\out.txt"           # will hold delta clock data
The below code does the trick, with one small problem left to be solved:

import shutil
import string

currentdata_file = r"C:\Users\Owner\Desktop\newdata.txt"  # the current download from the clock
lastdata_file = r"C:\Users\Owner\Desktop\mydata.txt"      # the prior download from the clock
output_file = r"C:\Users\Owner\Desktop\out.txt"
JohnV wrote:
> What I want to do is compare the old data (lets day it is saved to a
file called 'lastdata.txt') with the new data (lets day it is saved to
a file called 'currentdata.txt') and save the new appended data to a
variable
You may get away with something like: (untested)
newdata=op
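Emile's truncated "newdata=op..." presumably continues along these lines: read the prior snapshot, then take everything past its length from the current file. A sketch ("untested", as he says; the paths in the comment are the ones used in the thread, and the helper name is mine):

```python
def appended_since(last_path, current_path):
    """Return the bytes appended to current_path beyond what last_path holds.

    Assumes the file is strictly append-only, i.e. the last snapshot is a
    prefix of the current one.
    """
    with open(last_path, 'rb') as f:
        old = f.read()
    with open(current_path, 'rb') as f:
        f.seek(len(old))   # skip the portion we already captured
        return f.read()    # everything appended since the last poll

# newdata = appended_since(r'C:\Users\Owner\Desktop\mydata.txt',
#                          r'C:\Users\Owner\Desktop\newdata.txt')
```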
Maybe something like this will work, though I am not sure of my quotes
and what to import:

import shutil

f = open(r'C:\Users\Owner\Desktop\mydata.txt', 'r')
read_data1 = f.read()
f.close()
# Backslash paths need raw strings; otherwise \U, \D etc. are escape sequences.
shutil.copy(r'C:\Users\Owner\Desktop\newdata.txt',
            r'C:\Users\Owner\Desktop\out.txt')
f = open(r'C:\Users\O
I have a txt file that gets appended with data over a time event. The
data comes from an RFID reader and is dumped to the file by the RFID
software. I want to poll that file several times over the time period
of the event to capture the current data in the RFID reader.
When I read the data I wan
But what if:
case 1: no. of keys in f1 > f2, and
case 2: no. of keys in f1 < f2?
Shouldn't we get 1.1 in case 1 and 0.9 in case 2? It errors out with a
KeyError.
Not for homework. But anyway thanks much...
Sounds a little like "homework", but I'll help you out.
There are lots of ways, but this works.

import sys

class fobject:
    def __init__(self, inputfilename):
        try:
            fp = open(inputfilename, 'r')
            self.lines = fp.readlines()
            fp.close()
        except IOError:
            print "Unable to open %s" % inputfilename
            sys.exit(1)
Note that the code I wrote won't do the compare based on id, which is what I am
looking for; it just does a direct file-to-file compare.
I have two files
file1 in format
'AA' 1 T T
'AB' 1 T F
file2 same as file1
'AA' 1 T T
'AB' 1 T T
Also the compare should be based on id. So it should look for line
starting with id 'AA' (for example) and then match the line so if in
second case.
so this is what I am looking for:
1. read
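The id-keyed compare being asked for can be sketched with one dict per file; using .get() for lookups sidesteps the KeyError mentioned upthread when one file has ids the other lacks (the helper names are mine):

```python
def load_by_id(path):
    """Map each line's first field (the id) to its remaining fields."""
    table = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields:
                table[fields[0]] = fields[1:]
    return table

def diff_by_id(path1, path2):
    """Yield (id, fields1, fields2) for every id whose fields differ.

    An id missing from one file shows up as None on that side instead of
    raising KeyError.
    """
    t1, t2 = load_by_id(path1), load_by_id(path2)
    for key in sorted(set(t1) | set(t2)):
        if t1.get(key) != t2.get(key):
            yield key, t1.get(key), t2.get(key)
```

On the sample data above, only the 'AB' line would be reported, since the 'AA' lines match in both files.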