On 13 Feb 2006 13:13:51 -0800
Paul Rubin http://phr.cx@NOSPAM.invalid wrote:
VSmirk [EMAIL PROTECTED] writes:
Awesome!!! I got as far as segmenting the large file on
my own, and I ran out of ideas. I kind of thought about
checksum, but I never put the two together.
Thanks. You've
Terry,
Yeah, I was sketching out a scenario much like that. It does break
things down pretty well, and that gets my file sync scenario up to much
larger files. Even if many changes are made to a file, if you keep
track of the number of bytes and checksum over from 1 to the number of
bytes
I have a task that involves knowing when a file has changed. But while
for small files this is an easy enough task, checking the modification
dates, or doing a compare on the contents, I need to be able to do this
for very large files.
Is there anything already available in Python that will
VSmirk:
I have a task that involves knowing when a file has changed. But while
for small files this is an easy enough task, checking the modification
dates,
Checking the modification time works the same way for large files. Why is
that not good enough?
What's your platform?
--
René Pijlman
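For reference, a minimal sketch of the modification-time check discussed above. It reads only the file's metadata, so the cost does not depend on file size. The helper names file_signature and has_changed are hypothetical, chosen for illustration, and comparing size together with mtime is an assumption rather than something stated in the thread.

```python
import os

def file_signature(path):
    """Return (size in bytes, mtime in nanoseconds) without reading the file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

def has_changed(path, previous_signature):
    """True if the size or modification time differs from the recorded signature."""
    return file_signature(path) != previous_signature
```

This only reports that a file changed, not which bytes changed.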
I'm working primarily on Windows XP, but my solution needs to be cross
platform.
The problem is that I need more than the fact that a file has been
modified. I need to know what has been modified in that file.
I am needing to synchronize the file on a remote folder, and my current
solution,
VSmirk [EMAIL PROTECTED] writes:
I am needing to synchronize the file on a remote folder, and my current
solution, which simply copies the file if a date comparison or a
content comparison, becomes a bit unmanageable for very large files.
Some of the files I'm working with are hundreds of MB
I agree with you wholeheartedly, but the large files are part of the
business requirements.
Thanks for the link. I'll look into it.
V
--
http://mail.python.org/mailman/listinfo/python-list
VSmirk wrote:
I'm working primarily on Windows XP, but my solution needs to be cross
platform.
The problem is that I need more than the fact that a file has been
modified. I need to know what has been modified in that file.
I am needing to synchronize the file on a remote folder, and my
Paul Rubin wrote:
VSmirk [EMAIL PROTECTED] writes:
I am needing to synchronize the file on a remote folder, and my current
solution, which simply copies the file if a date comparison or a
content comparison, becomes a bit unmanageable for very large files.
Some of the files I'm working with
Pretty much, yeah. Except I need to diff a pair of files that exist on
opposite ends of a network, without causing the entire contents of the
file to be transferred over that network.
Now, I have the option of doing this: If I am able to determine that
(for instance) bytes 10468 to 1473 in a
VSmirk [EMAIL PROTECTED] writes:
But the trick in my mind is figuring out which specific bytes have been
written to disk. That's why I was thinking device level. Am I going
to have to work in C++ or Assembler for something like this?
No, you can do it in Python. The basic idea is: locally
Awesome!!! I got as far as segmenting the large file on my own, and I
ran out of ideas. I kind of thought about checksum, but I never put
the two together.
Thanks. You've helped a lot
V
VSmirk [EMAIL PROTECTED] writes:
Awesome!!! I got as far as segmenting the large file on my own, and I
ran out of ideas. I kind of thought about checksum, but I never put
the two together.
Thanks. You've helped a lot
The checksum method I described works ok if bytes change in the
Thanks for the heads-up. I was so giddy with the simplicity of the
solution, I stopped trying to poke holes in it.
I agree with your philosophy of not reinventing the wheel, but I did
notice two things: First, the link you provided claims in the features
section that rsync is for *nix systems,
Maybe an example will help
file A
abef | 1938 | 4bac | 0def | 8675
file B
adef | 0083 | abfd | 3356 | 2465
File A is different from file B and you want to have File A look like
File B. So do the segmentation (I have chosen ' | ' as the divide
between segments).
After that do checksums on
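The message above is cut off in the archive, so the following is only a sketch of how the segment-and-checksum comparison might continue, not the poster's actual method. It assumes fixed-size segments (4 bytes, matching the example) and SHA-256 as the checksum; the names segment_checksums and changed_segments are hypothetical. Only the checksum lists would need to cross the network, and only the segments reported as different would then be copied.

```python
import hashlib

def segment_checksums(data, segment_size):
    """Split data into fixed-size segments and return one checksum per segment."""
    return [
        hashlib.sha256(data[i:i + segment_size]).hexdigest()
        for i in range(0, len(data), segment_size)
    ]

def changed_segments(old_sums, new_sums):
    """Return indices of segments whose checksums differ or exist on one side only."""
    longest = max(len(old_sums), len(new_sums))
    return [
        i for i in range(longest)
        if i >= len(old_sums) or i >= len(new_sums) or old_sums[i] != new_sums[i]
    ]

# The two example files from the message, with the ' | ' dividers removed.
file_a = b"abef19384bac0def8675"
file_b = b"adef0083abfd33562465"

diff = changed_segments(segment_checksums(file_a, 4), segment_checksums(file_b, 4))
print(diff)  # every segment differs in this particular example: [0, 1, 2, 3, 4]
```

For a real file of hundreds of MB, the same idea would read the file in blocks rather than holding it in memory. Fixed offsets only line up when bytes change in place and the segment boundaries do not shift.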
VSmirk [EMAIL PROTECTED] writes:
So I'm wondering if you know off-hand which windows port does this
checksum validation you outlined.
I think rsync has been ported to Windows but I don't know any details.
I don't use Windows.
So I'm wondering if you know off-hand which windows port does this
checksum validation you outlined.
http://www.gaztronics.net/rsync.php is one source.
Just do a Google search for windows rsync.
So I'm wondering if you know off-hand which windows port does this
checksum validation you outlined.
http://www.gaztronics.net/rsync.php is one source. Just do a Google search
for windows rsync.
Of course that was the first thing I tried.
But what I meant to say was that at least one port, the python one,
didn't have the checksum validation that Paul was talking about, so I
was wondering if he knew of one that was faithful to the unix port of
it.
Thanks much for the links, though, and