On Wednesday 02 Jan 2013 at 12:26:37 (-0600), Troy Benjegerdes wrote:
> The probability may be 'low' but it is not zero. Just because it's
> hard to calculate the hash doesn't mean you can't do it. If your
> input data is not random the probability of a hash collision is
> going to get skewed.
>
> Read about how Bitcoin uses hashes.
>
> I need a budget of around $10,000 or so for some FPGAs and/or GPU cards,
> and I can make a regression test that will create deduplication hash
> collisions on purpose.

It's not a problem: as Eric pointed out while reviewing the previous
patchset, there is a small area left filled with zeroes in the
deduplication block. A bit could be set there when a collision is
detected, and an offset could point to a cluster used to resolve
collisions.

>
>
> On Wed, Jan 02, 2013 at 06:33:24PM +0100, Benoît Canet wrote:
> > > How does this code handle hash collisions, and do you have some regression
> > > tests that purposefully create a dedup hash collision, and verify that the
> > > 'right thing' happens?
> >
> > The two hash functions that can be used are cryptographic and not broken
> > yet, so nobody knows how to generate a collision.
> >
> > You can do the math to calculate the probability of a collision using a
> > 256-bit hash while processing 1 EiB of data; the result is so low that
> > you can consider it won't happen.
> > The SHA-256 deduplication in ZFS works the same way regarding collisions.
> >
> > I currently use qemu-io-test for testing purposes and iozone with the -w
> > flag in the guest.
> > I would like to find a good deduplication stress test to run in a guest.
> >
> > Regards
> >
> > Benoît
> >
> > > It's great that this almost works, but it seems rather dangerous to put
> > > something like this into the mainline code without some regression tests.
> > >
> > > (I'm also suspecting the regression test will be a great way to find
> > > flakey hardware)
> > >
> > > --------------------------------------------------------------------------
> > > Troy Benjegerdes 'da hozer' ho...@hozed.org
> > >
> > > Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
> > > software & hardware (http://q3u.be) stuff and not get a real job.
> > > Charles Shultz had the best answer:
> > >
> > > "Why do musicians compose symphonies and poets write poems? They do it
> > > because life wouldn't have any meaning for them if they didn't. That's why
> > > I draw cartoons. It's my life." -- Charles Shultz
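
For reference, the "do the math" estimate quoted above can be sketched
with the birthday bound. This is only a rough illustration and not part
of the patchset: it assumes hash outputs behave like uniformly random
256-bit values, and both the 4 KiB block size and the function name
collision_probability_bound are assumptions chosen for the example.

```python
import math
from fractions import Fraction


def collision_probability_bound(data_bytes: int, block_bytes: int,
                                hash_bits: int) -> Fraction:
    """Birthday upper bound on the chance of any two blocks colliding.

    For n hashes drawn uniformly from 2**hash_bits values, the
    probability of at least one collision is at most
    n * (n - 1) / 2 / 2**hash_bits.
    """
    n = data_bytes // block_bytes  # number of hashed blocks
    return Fraction(n * (n - 1), 2 * 2**hash_bits)


EIB = 2**60          # 1 EiB of data, as in the quoted message
BLOCK_BYTES = 4096   # assumption: 4 KiB deduplication blocks
HASH_BITS = 256      # 256-bit hash, as in the quoted message

bound = collision_probability_bound(EIB, BLOCK_BYTES, HASH_BITS)

# The exact fraction is far too small for a float, so report its log2.
log2_p = math.log2(bound.numerator) - math.log2(bound.denominator)
print(f"blocks hashed: 2^{int(math.log2(EIB // BLOCK_BYTES))}")
print(f"collision probability <= 2^{log2_p:.1f}")
```

Under these assumptions the bound comes out at roughly 2^-161. It only
covers accidental collisions; it says nothing about a deliberately
constructed one, which is the case a regression test would have to
exercise.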