Re: [reiserfs-list] duplicate files and recent changes
On Thu, 6 Jun 2002, Oleg Drokin wrote:

> On Thu, Jun 06, 2002 at 01:30:46AM -0400, S. Alexander Jacobson wrote:
> > I just joined this list. Two questions:
> > 1. Can reiserfs detect that I have two copies of the same file on disk and store them as one file
>
> Hm, you mean, each time you create a file, reiserfs should scan all other files and see if there is exactly a file like you just wrote? Hm, even something more complicated as you are writing to a file in 4k chunks. Definitely no.

I could imagine a cheaper implementation in which the fs computes an MD5 hash of each file as it is being written. If the hash matches the pre-existing hash of some other file, then consolidate.

> > (doing a lazy copy) if someone writes to one of them?
> > 2. Is there a fast way to get access to the file change list? It would be nice to be able to do fast backup of changed files without having to traverse entire directory trees.
>
> No.

I would presume that journalling gives access to this sort of recent information. It is really a question of whether applications may access the journal and whether the journal format is opaque.

-Alex-

___
S. Alexander Jacobson
i2x Media
1-212-787-1914 voice
1-603-288-1280 fax
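The hash-while-writing idea in this message can be sketched in user space. This is only an illustration under stated assumptions, not anything reiserfs implements: the `DedupStore` class and its method names are hypothetical, the "file system" is a pair of in-memory dictionaries, and a byte-for-byte comparison is added so that a matching hash alone never causes two different files to be merged.

```python
import hashlib


class DedupStore:
    """Toy in-memory store (hypothetical): identical contents are kept once."""

    def __init__(self):
        self.blobs = {}  # MD5 hex digest -> file contents (bytes)
        self.names = {}  # file name -> MD5 hex digest

    def write_file(self, name, chunks):
        # Update the hash incrementally as each chunk arrives, so no
        # second pass over the data is needed once the write finishes.
        hasher = hashlib.md5()
        data = bytearray()
        for chunk in chunks:
            hasher.update(chunk)
            data.extend(chunk)
        digest = hasher.hexdigest()
        payload = bytes(data)

        existing = self.blobs.get(digest)
        if existing is None:
            self.blobs[digest] = payload      # first copy of this content
        elif existing != payload:
            # Same hash, different bytes: refuse to consolidate.
            raise ValueError("hash collision between different contents")
        self.names[name] = digest             # consolidate onto one blob
        return digest

    def read_file(self, name):
        return self.blobs[self.names[name]]
```

Writing the same bytes under two names, even in different chunk sizes, leaves a single entry in `blobs`. A real file system would also need copy-on-write when one of the names is later modified.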
Re: [reiserfs-list] duplicate files and recent changes
S. Alexander Jacobson wrote:

> On Thu, 6 Jun 2002, Robert Brockway wrote:
> > > On Thu, Jun 06, 2002 at 01:30:46AM -0400, S. Alexander Jacobson wrote:
> > > > I just joined this list. Two questions:
> > > > 1. Can reiserfs detect that I have two copies of the same file on disk and store them as one file
> > >
> > > Hm, you mean, each time you create a file, reiserfs should scan all other files and see if there is exactly a file like you just wrote? Hm, even something more complicated as you are writing to a file in 4k chunks. Definitely no.
> >
> > And besides, how could it even know whether you even want them to be the same or not.

Windows uses a signature.

> > I keep plenty of identical files around (online backups, etc) and I'd be mighty upset if the filesystem started hard linking them together :)
>
> That is why I said lazy copy on write.

This is not patented, it is an old thread, everyone agrees that it should be done, no sponsor at the moment though.

> One implementation would be for the file system to keep a ref count of how many different files the user intended. When an application writes to a file, the file system checks the ref count. If the ref count is greater than 1, the file system copies the file, decrements the ref-count, and gives the writing application a pointer to the new copy.
>
> I'm sure there are smarter implementations, but the point is that the file system can certainly differentiate in theory between user hard links and actual hard links.
>
> -Alex-
>
> ___
> S. Alexander Jacobson
> i2x Media
> 1-212-787-1914 voice
> 1-603-288-1280 fax
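The ref-count scheme described in this message (check the count on write; if it is greater than 1, copy, decrement, and redirect the writer) can be sketched as follows. This is a minimal in-memory model under assumptions: `CowStore`, `lazy_copy`, and the other names are hypothetical, whole files are copied rather than individual blocks, and overwriting an existing destination name is not handled.

```python
class CowStore:
    """Toy copy-on-write store (hypothetical): names share a blob until written."""

    def __init__(self):
        self.blobs = {}     # blob id -> bytearray holding the contents
        self.refcount = {}  # blob id -> number of file names sharing the blob
        self.names = {}     # file name -> blob id
        self._next_id = 0

    def _new_blob(self, data):
        blob_id = self._next_id
        self._next_id += 1
        self.blobs[blob_id] = bytearray(data)
        self.refcount[blob_id] = 1
        return blob_id

    def create(self, name, data):
        self.names[name] = self._new_blob(data)

    def lazy_copy(self, src, dst):
        # No data is copied here: dst simply shares src's blob.
        blob_id = self.names[src]
        self.refcount[blob_id] += 1
        self.names[dst] = blob_id

    def write(self, name, offset, data):
        blob_id = self.names[name]
        if self.refcount[blob_id] > 1:
            # Shared blob: drop one reference, copy the contents, and
            # point this name at the private copy before modifying it.
            self.refcount[blob_id] -= 1
            blob_id = self._new_blob(self.blobs[blob_id])
            self.names[name] = blob_id
        self.blobs[blob_id][offset:offset + len(data)] = data

    def read(self, name):
        return bytes(self.blobs[self.names[name]])
```

After `lazy_copy("a", "b")`, both names refer to one blob; the first `write` to either name gives that name its own copy and leaves the other untouched, which is the behaviour that distinguishes a lazy copy from a user-visible hard link.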
Re: [reiserfs-list] duplicate files and recent changes
Hello!

On Thu, Jun 06, 2002 at 03:33:17AM -0400, S. Alexander Jacobson wrote:

> > Hm, you mean, each time you create a file, reiserfs should scan all other files and see if there is exactly a file like you just wrote? Hm, even something more complicated as you are writing to a file in 4k chunks. Definitely no.
>
> I could imagine a cheaper implementation in which the fs computes an MD5 hash of each file as it is being written. If the hash matches the pre-existing hash of some other file, then consolidate.

But MD5 may be identical for different files. Also, this buys you nothing. You write the file in chunks, and once the file is identical to another file, one of the files is deleted. Looks like just more extra work (but some saved space, of course).

> > > 2. Is there a fast way to get access to the file change list? It would be nice to be able to do fast backup of changed files without having to traverse entire directory trees.
> >
> > No.
>
> I would presume that journalling gives access to this sort of recent information. It is really a

No. All kinds of metadata are journaled. Also, it is possible to get into a situation where a file was modified but not journaled, because no metadata changed (mmapped writes come to mind). Also, the journal is not infinite; it is only 32M long. And I presume you want some kind of info like what has changed since a week ago. The journal might well have been overwritten many times since then.

Bye,
Oleg
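The point about the journal being finite can be illustrated with a bounded log. This is a sketch under loud assumptions: `BoundedJournal` is hypothetical, it records path names purely for illustration (the real reiserfs journal holds metadata blocks, not a list of changed files), and its capacity is counted in entries rather than bytes.

```python
from collections import deque


class BoundedJournal:
    """Toy fixed-size log (hypothetical): old entries are silently overwritten."""

    def __init__(self, capacity):
        # deque(maxlen=...) discards the oldest entry when a new one
        # is appended to a full log, like a circular journal.
        self.entries = deque(maxlen=capacity)
        self.next_seq = 0

    def log(self, path):
        self.entries.append((self.next_seq, path))
        self.next_seq += 1

    def changes_since(self, seq):
        """Return paths logged at or after seq, or None if that history is gone."""
        oldest = self.entries[0][0] if self.entries else self.next_seq
        if seq < oldest:
            return None  # the requested range has already been overwritten
        return [path for s, path in self.entries if s >= seq]
```

With a capacity of three entries and four logged changes, a request for everything since sequence 0 returns `None`: a backup tool asking for "changes since last week" would get the same answer once the log had wrapped around.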
Re: [reiserfs-list] duplicate files and recent changes
On Thu, 06 Jun 2002 01:30:46 EDT, S. Alexander Jacobson [EMAIL PROTECTED] said:

> 1. Can reiserfs detect that I have two copies of the same file on disk and store them as one file (doing a lazy copy) if someone writes to one of them?

Something I'd like to see even more than that would be VMS-style version numbers on files.

--
Valdis Kletnieks
Computer Systems Senior Engineer
Virginia Tech
Re: [reiserfs-list] duplicate files and recent changes
On Thu, 06 Jun 2002 13:25:05 +0400, Oleg Drokin said:

> But MD5 may be identical for different files.

Only a 1 in 2**128 chance of that. If you know a way to force a hash collision more frequently than that, the crypto world wants to hear from you.. ;)

> Also, this buys you nothing. You write the file in chunks, and once the file is identical to another file, one of the files is deleted. Looks like just more extra work (but some saved space, of course).

A much more productive way to save space is file-system compression. AIX supports LZ-compressing each 4K block and then only saving as many 512 byte fragments as actually needed. It's a big win - /usr (even with all the binaries) needs about 30% less space, and I've seen over 50% for file systems with source trees in them...

--
Valdis Kletnieks
Computer Systems Senior Engineer
Virginia Tech
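The per-block compression scheme described here (compress each 4K block, then store only as many 512-byte fragments as the result needs) can be approximated in a few lines. This is a rough estimator under assumptions, not the AIX implementation: Python's `zlib` (DEFLATE) stands in for the LZ variant AIX uses, the function names are hypothetical, and a block that does not shrink is assumed to be stored uncompressed.

```python
import zlib

BLOCK_SIZE = 4096     # bytes per logical block, as in the message
FRAGMENT_SIZE = 512   # bytes per on-disk fragment, as in the message


def fragments_needed(block):
    """Number of 512-byte fragments needed to hold one compressed block."""
    compressed = zlib.compress(block)
    # Assumption: if compression does not help, the raw block is stored.
    size = min(len(compressed), len(block))
    return max(1, -(-size // FRAGMENT_SIZE))  # ceiling division


def estimate_fragments(data):
    """Return (fragments used with compression, fragments used without)."""
    used = 0
    total = 0
    for start in range(0, len(data), BLOCK_SIZE):
        block = data[start:start + BLOCK_SIZE]
        total += BLOCK_SIZE // FRAGMENT_SIZE  # a full block occupies 8 fragments
        used += fragments_needed(block)
    return used, total
```

Running `estimate_fragments` over the bytes of a real file gives a rough idea of the saving for that file. Highly repetitive data collapses to one fragment per block, while already-compressed data gains nothing.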
Re: [reiserfs-list] duplicate files and recent changes
Can someone turn this off? I am receiving large amounts of duplicate mail from Valdis Kletnieks through the namesys listserv.

-Alex-

___
S. Alexander Jacobson
i2x Media
1-212-787-1914 voice
1-603-288-1280 fax
Re: [reiserfs-list] duplicate files and recent changes
(Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic16144.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic09442.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic09063.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic04806.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic10465.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic17079.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic10007.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic16147.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic11308.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic15841.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic20581.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic19402.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic30124.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic23278.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic03814.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic14574.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic07359.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic07726.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic14099.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic20730.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic28349.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic26147.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:54 AM pic06027.pcx) On Thu, 06 Jun 2002 01:30:46 EDT, S. Alexander Jacobson [EMAIL PROTECTED] said: 1. 
Can reiserfs detect that I have two copies of the same file on disk and store tham as one file (doing a lazy copy) if someone writes to one of them? Something I'd like to see even more than that would be VMS-style version numbers on files. -- Valdis Kletnieks Computer Systems Senior Engineer Virginia Tech
Re: [reiserfs-list] duplicate files and recent changes
(Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic01575.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic26184.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic13257.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic00081.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic15154.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic07244.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic00072.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic31655.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic01253.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic19373.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic27832.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic30803.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic04969.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic08117.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic11628.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic06033.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic18619.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic14738.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic07549.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic26942.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic27268.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic24632.pcx) On Thu, 06 Jun 2002 13:25:05 +0400, Oleg Drokin said: But MD5 may be identical for different files. Only a 2**128 chance of that. 
If you know a way to force a hash collision more frequently than that, the crypto world wants to hear from you.. ;) Also this buys you nothing. You write file in chunks, once file is identical to other file, one of the files deleted. Looks like just more extra work (but some saved space of course). A much more productive way to save space is file-system compression. AIX supports LZ-compressing each 4K block and then only saving as many 512 byte fragments as actually needed. It's a big win - /usr (even with all the binaries) needs about 30% less space, and I've seen over 50% for file systems with source trees in them... -- Valdis Kletnieks Computer Systems Senior Engineer Virginia Tech
Re: [reiserfs-list] duplicate files and recent changes
(Embedded image moved S. Alexander Jacobson [EMAIL PROTECTED] to file: 06/06/2002 02:14 PM pic29705.pcx) (Embedded image moved S. Alexander Jacobson [EMAIL PROTECTED] to file: 06/06/2002 02:14 PM pic25352.pcx) Can someone turn this off? I am recieving large amounts of duplicate mail from Valdis Kletnieks through the namesys listserv. -Alex- On Thu, 6 Jun 2002 [EMAIL PROTECTED] wrote: (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic13257.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic00081.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic15154.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic07244.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic00072.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic31655.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic01253.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic19373.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic27832.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic30803.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic04969.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic08117.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic11628.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic06033.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic18619.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic14738.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic07549.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic26942.pcx) (Embedded image moved [EMAIL PROTECTED] to file: 06/06/2002 09:58 AM pic27268.pcx) (Embedded image moved [EMAIL 
PROTECTED] to file: 06/06/2002 09:58 AM pic24632.pcx) On Thu, 06 Jun 2002 13:25:05 +0400, Oleg Drokin said: But MD5 may be identical for different files. Only a 2**128 chance of that. If you know a way to force a hash collision more frequently than that, the crypto world wants to hear from you.. ;) Also this buys you nothing. You write file in chunks, once file is identical to other file, one of the files deleted. Looks like just more extra work (but some saved space of course). A much more productive way to save space is file-system compression. AIX supports LZ-compressing each 4K block and then only saving as many 512 byte fragments as actually needed. It's a big win - /usr (even with all the binaries) needs about 30% less space, and I've seen over 50% for file systems with source trees in them... -- Valdis Kletnieks Computer Systems Senior Engineer Virginia Tech ___ S. Alexander Jacobson i2x Media 1-212-787-1914 voice1-603-288-1280 fax
Re: [reiserfs-list] duplicate files and recent changes
On Thu, 06 Jun 2002 01:30:46 EDT, S. Alexander Jacobson [EMAIL PROTECTED] said: 1. Can reiserfs detect that I have two copies of the same file on disk and store them as one file (doing a lazy copy) if someone writes to one of them? Something I'd like to see even more than that would be VMS-style version numbers on files. -- Valdis Kletnieks Computer Systems Senior Engineer Virginia Tech
Re: [reiserfs-list] duplicate files and recent changes
Hello! Looks like this shit is an e-mail virus. And now you two were also infected. Sigh. Valdis Kletnieks: You should cure your box too, of course. Bye, Oleg On Thu, Jun 06, 2002 at 06:28:01PM +, Richard Thornton wrote: Will you quit sending this shit? RT On Thu, 6 Jun 2002 [EMAIL PROTECTED] wrote: On Thu, 06 Jun 2002 13:25:05 +0400, Oleg Drokin said: But MD5 may be identical for different files. Only a 1 in 2**128 chance of that. If you know a way to force a hash collision more frequently than that, the crypto world wants to hear from you.. ;) Also this buys you nothing. You write file in chunks, once file is identical to other file, one of the files deleted. Looks like just more extra work (but some saved space of course). A much more productive way to save space is file-system compression. AIX supports LZ-compressing each 4K block and then only saving as many 512 byte fragments as actually needed. It's a big win - /usr (even with all the binaries) needs about 30% less space, and I've seen over 50% for file systems with source trees in them... -- Valdis Kletnieks Computer Systems Senior Engineer Virginia Tech
[reiserfs-list] duplicate files and recent changes
I just joined this list. Two questions: 1. Can reiserfs detect that I have two copies of the same file on disk and store them as one file (doing a lazy copy) if someone writes to one of them? 2. Is there a fast way to get access to the file change list? It would be nice to be able to do fast backup of changed files without having to traverse entire directory trees. -Alex- ___ S. Alexander Jacobson i2x Media 1-212-787-1914 voice 1-603-288-1280 fax
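For question 1, duplicate detection can at least be approximated from user space. The following is a minimal sketch, not anything reiserfs provides: it walks a tree, hashes each file in 4K chunks with MD5, and groups files whose size and digest match. The helper names file_digest and find_duplicates are hypothetical, and a careful tool would compare candidate files byte for byte before consolidating them.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=4096):
    """Hash a file incrementally, one 4K chunk at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose size and MD5 digest match."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            key = (os.path.getsize(path), file_digest(path))
            groups[key].append(path)
    # Only groups with more than one member are duplicate candidates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Note that this still traverses the whole directory tree, which is exactly the cost question 2 asks to avoid; doing the same work inside the file system at write time is the part that would need reiserfs support.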