Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 4:39 AM, Austin S Hemmelgarn ahferro...@gmail.com wrote: Just because it's a filesystem doesn't always mean that speed is the most important thing. Personally, I can think of multiple cases where using a cryptographically strong hash would be preferable, for example:

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 9:42 AM, Austin S Hemmelgarn wrote: Except most of the CPU optimized hashes aren't crypto hashes (other than the various SHA implementations). Furthermore, I've actually tested the speed of a generic CRC32c implementation versus SHA-1 using the SHA instructions on an
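The CRC32-versus-SHA throughput dispute in this subthread is easy to probe locally. A rough sketch follows; note the caveats: Python's zlib.crc32 is plain CRC32, not the crc32c polynomial btrfs uses, and whether hashlib picks up SHA CPU instructions depends on the local OpenSSL build, so the numbers are illustrative only, not a reproduction of Austin's kernel Crypto API test.

```python
# Rough single-buffer throughput comparison of CRC32 vs SHA-1/SHA-256.
# Illustrative only: zlib.crc32 is not crc32c, and hashlib's use of SHA
# CPU instructions depends on how the local OpenSSL was built.
import hashlib
import time
import zlib

BUF = b"\x5a" * (1 << 20)  # 1 MiB test buffer
ROUNDS = 10

def throughput(fn):
    """Return MB/s for hashing BUF repeatedly with fn."""
    start = time.perf_counter()
    for _ in range(ROUNDS):
        fn(BUF)
    elapsed = time.perf_counter() - start
    return (len(BUF) * ROUNDS / 1e6) / elapsed

results = {
    "crc32": throughput(zlib.crc32),
    "sha1": throughput(lambda b: hashlib.sha1(b).digest()),
    "sha256": throughput(lambda b: hashlib.sha256(b).digest()),
}
for name, mbps in results.items():
    print(f"{name:>6}: {mbps:8.1f} MB/s")
```

On hardware with SHA extensions (recent x86, ARMv8, SPARC) the gap between the SHA digests and CRC32 narrows or inverts, which is the point under debate here.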

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 11:28 AM, Alex Elsayed eternal...@gmail.com wrote: I think there's a fundamental set of points being missed. That may be true, but it is not me who is missing them. * The Crypto API can be used to access non-cryptographic hashes. Full stop. Irrelevant to my point. I am

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 12:08 PM, Alex Elsayed eternal...@gmail.com wrote: Actually, I said Sure here, but this isn't strictly true. At some point, you're more memory-bound than CPU-bound, and with CPU intrinsic instructions (like SPARC and recent x86 have for SHA) you're often past that. Then,

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 12:35 PM, Austin S Hemmelgarn ahferro...@gmail.com wrote: My only reasoning is that with this set of hashes (crc32c, adler32, and md5), the statistical likely-hood of running into a hash collision with more than one of them at a time is infinitesimally small compared to
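Austin's "infinitesimally small" claim about simultaneous collisions can be made concrete with the standard birthday bound: concatenating independent checksums adds their bit widths, so the combined collision probability drops by a factor of 2^(extra bits). The sketch below uses the approximation p ≈ n(n-1)/2^(k+1) and assumes the hashes behave independently, which crc32c, adler32, and md5 only approximate.

```python
# Birthday-bound estimate of the probability that any two of n blocks
# collide under a k-bit checksum: p ≈ n*(n-1) / 2^(k+1), clamped to 1.
# Concatenating independent checksums adds their bit widths, which is
# the intuition behind "more than one collision at a time" being rare.
def collision_prob(n_blocks: int, bits: int) -> float:
    return min(1.0, n_blocks * (n_blocks - 1) / 2.0 ** (bits + 1))

n = 2 ** 28  # ~268 million 4 KiB blocks = 1 TiB of data
print(collision_prob(n, 32))            # crc32c alone: saturates at 1.0
print(collision_prob(n, 32 + 32 + 128)) # crc32c + adler32 + md5: ~1e-41
```

At terabyte scale a lone 32-bit checksum is essentially guaranteed to have colliding blocks somewhere, while the 192-bit concatenation pushes the probability below any practical concern.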

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 3:05 PM, Alex Elsayed eternal...@gmail.com wrote: Incidentally, you can be 'skeptical' all you like - per Austin's message upthread, he was testing the Crypto API. Thus, skeptical as you may be, hard evidence shows that SHA-1 was equal to or faster than CRC32, which is

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 3:05 PM, Alex Elsayed eternal...@gmail.com wrote: hard evidence shows that SHA-1 was equal to or faster than CRC32, which is unequivocally simpler and faster than CityHash (though CityHash comes close). And the CPUs in question are *not* particularly rare - Intel since

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 3:46 PM, Alex Elsayed eternal...@gmail.com wrote: And I'm not sure what is convoluted or incorrect about saying Look, empirical evidence! No empirical evidence of the speed of SpookyHash or CityHash versus SHA-1 was cited. The only empirical data mentioned was on an

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 3:46 PM, Alex Elsayed eternal...@gmail.com wrote: And that _is_ the case; they are faster... *when both are software implementations* They are also faster when both are optimized to use special instructions of the CPU. According to this Intel whitepaper, SHA-1 does not

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 4:06 PM, Alex Elsayed eternal...@gmail.com wrote: https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha1-armv8.pl
#               hardware-assisted   software(*)
# Apple A7      2.31                4.13 (+14%)
# Cortex-A53    2.19                8.73 (+108%)
#

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread John Williams
On Mon, Dec 1, 2014 at 4:15 PM, Alex Elsayed eternal...@gmail.com wrote: There's a thing called the transitive property. When CRC32 is faster than SpookyHash and CityHash (while admittedly weaker), and SHA-1 on SPARC is faster than CRC32, there are comparisons that can be made. And yet you

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-29 Thread John Williams
On Sat, Nov 29, 2014 at 12:38 PM, Alex Elsayed eternal...@gmail.com wrote: Why not just use the kernel crypto API? Then the user can just specify any hash the kernel supports. One reason is that cryptographic hashes are an order of magnitude slower than the fastest non-cryptographic hashes. And

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-29 Thread John Williams
On Sat, Nov 29, 2014 at 1:07 PM, Alex Elsayed eternal...@gmail.com wrote: I'd suggest looking more closely at the crypto api section of menuconfig - it already has crc32c, among others. Just because it's called the crypto api doesn't mean it only has cryptographically-strong algorithms. I have

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-26 Thread John Williams
On Wed, Nov 26, 2014 at 4:50 AM, Holger Hoffstätte holger.hoffstae...@googlemail.com wrote: On Tue, 25 Nov 2014 15:17:58 -0800, John Williams wrote: 2) CityHash : for 256-bit hashes on all systems https://code.google.com/p/cityhash/ Btw this is now superseded by Farmhash: https

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-25 Thread John Williams
On Tue, Nov 25, 2014 at 2:30 AM, Liu Bo bo.li@oracle.com wrote: On Mon, Nov 24, 2014 at 11:34:46AM -0800, John Williams wrote: For example, Spooky V2 hash is 128 bits and is very fast. It is noncryptographic, but it is more than adequate for data checksums. http://burtleburtle.net/bob

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-24 Thread John Williams
On Mon, Nov 24, 2014 at 12:23 AM, Holger Hoffstätte holger.hoffstae...@googlemail.com wrote: Would there be room for a compromise with e.g. 128 bits? For example, Spooky V2 hash is 128 bits and is very fast. It is noncryptographic, but it is more than adequate for data checksums.

Re: [systemd-devel] Slow startup of systemd-journal on BTRFS

2014-06-15 Thread John Williams
On Sun, Jun 15, 2014 at 5:17 PM, Russell Coker russ...@coker.com.au wrote: I just did some tests using fallocate(1). I did the tests both with and without the -n option which appeared to make no difference. I started by allocating a 24G file on a 106G filesystem that had 30G free according

Re: Triple parity and beyond

2013-11-23 Thread John Williams
On Sat, Nov 23, 2013 at 8:03 PM, Stan Hoeppner s...@hardwarefreak.com wrote: Parity array rebuilds are read-modify-write operations. The main difference from normal operation RMWs is that the write is always to the same disk. As long as the stripe reads and chunk reconstruction outrun the

Re: Triple parity and beyond

2013-11-22 Thread John Williams
On Fri, Nov 22, 2013 at 1:35 AM, Stan Hoeppner s...@hardwarefreak.com wrote: Only one graph goes to 2019, the rest are 2010 or less. That being the case, his 2019 graph deals with projected reliability of single, double, and triple parity. The whole article goes to 2019 (or longer). He shows

Re: Triple parity and beyond

2013-11-22 Thread John Williams
On Fri, Nov 22, 2013 at 9:04 PM, NeilBrown ne...@suse.de wrote: I guess with that many drives you could hit PCI bus throughput limits. A 16-lane PCIe 4.0 could just about give 100MB/s to each of 16 devices. So you would really need top-end hardware to keep all of 16 drives busy in a

Re: Triple parity and beyond

2013-11-20 Thread John Williams
On Wed, Nov 20, 2013 at 2:31 AM, David Brown david.br...@hesbynett.no wrote: That's certainly a reasonable way to look at it. We should not limit the possibilities for high-end systems because of the limitations of low-end systems that are unlikely to use 3+ parity anyway. I've also looked

Re: Triple parity and beyond

2013-11-20 Thread John Williams
For myself or any machines I managed for work that do not need high IOPS, I would definitely choose triple- or quad-parity over RAID 51 or similar schemes with arrays of 16 - 32 drives. No need to go into detail here on a subject Adam Leventhal has already covered in detail in an article
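The triple-parity syndromes under discussion can be sketched in a few lines of GF(2^8) arithmetic: P is plain XOR, and Q and R weight each data chunk by powers of the generators 2 and 4, the raidz3-style extension Leventhal describes. This is a toy illustration using the 0x11d polynomial from Linux md RAID-6, not the kernel's optimized implementation.

```python
# Toy triple-parity (P/Q/R) syndrome computation over GF(2^8) with the
# reduction polynomial 0x11d used by Linux md RAID-6. P uses generator
# 1 (plain XOR), Q generator 2, R generator 4. Illustration only.
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) mod x^8+x^4+x^3+x^2+1 (0x11d)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r

def gf_pow(a: int, e: int) -> int:
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def syndromes(chunks):
    """Return (P, Q, R) parity bytes for equally sized data chunks."""
    n = len(chunks[0])
    P, Q, R = bytearray(n), bytearray(n), bytearray(n)
    for i, chunk in enumerate(chunks):
        cq, cr = gf_pow(2, i), gf_pow(4, i)  # per-disk Q/R coefficients
        for j, byte in enumerate(chunk):
            P[j] ^= byte
            Q[j] ^= gf_mul(cq, byte)
            R[j] ^= gf_mul(cr, byte)
    return bytes(P), bytes(Q), bytes(R)

chunks = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]  # three data "disks"
P, Q, R = syndromes(chunks)
print(P.hex(), Q.hex(), R.hex())
```

With three independent syndromes, any three missing chunks can be recovered by solving the corresponding 3x3 Vandermonde-style system over GF(2^8); quad parity adds a fourth generator in the same pattern.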

Re: Triple parity and beyond

2013-11-20 Thread John Williams
On Wed, Nov 20, 2013 at 10:52 PM, Stan Hoeppner s...@hardwarefreak.com wrote: On 11/20/2013 8:46 PM, John Williams wrote: For myself or any machines I managed for work that do not need high IOPS, I would definitely choose triple- or quad-parity over RAID 51 or similar schemes with arrays of 16

Re: Triple parity and beyond

2013-11-19 Thread John Williams
On Tue, Nov 19, 2013 at 4:54 PM, Chris Murphy li...@colorremedies.com wrote: If anything, I'd like to see two implementations of RAID 6 dual parity. The existing implementation in the md driver and btrfs could remain the default, but users could opt into Cauchy matrix based dual parity which

Re: csum failure messages

2013-11-05 Thread John Williams
On Tue, Nov 5, 2013 at 6:34 AM, Hugo Mills h...@carfax.org.uk wrote: On Tue, Nov 05, 2013 at 07:26:54AM -0700, Chris Murphy wrote: On Nov 5, 2013, at 5:16 AM, Russell Coker russ...@coker.com.au wrote: I presume that my filesystem is still corrupt. I'm the original reporter of the bug. The

Why does btrfs benchmark so badly in this case?

2013-08-08 Thread John Williams
Phoronix periodically runs benchmarks on filesystems, and one thing I have noticed is that btrfs always does terribly on their fio Intel IOMeter fileserver access pattern benchmark: http://www.phoronix.com/scan.php?page=article&item=linux_310_10fs&num=2 Here, btrfs is more than 6 times slower than
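For readers who want to reproduce the workload, fio ships an IOMeter fileserver-style example job. The fragment below is modeled on that example; the exact parameters Phoronix used are not given in the thread, so the block-size mix, queue depth, and file size here should be treated as assumptions.

```ini
; Approximation of fio's bundled iometer-file-access-server job.
; Sizes and read/write mix are assumptions, not Phoronix's settings.
[global]
bssplit=512/10:1k/5:2k/5:4k/60:8k/2:16k/4:32k/4:64k/10
rw=randrw
rwmixread=80
direct=1
size=4g
ioengine=libaio
iodepth=64

[iometer]
stonewall
```

The small-block random read/write mix with O_DIRECT is exactly the pattern where btrfs's CoW metadata overhead tends to show up worst in these comparisons.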

Re: Why does btrfs benchmark so badly in this case?

2013-08-08 Thread John Williams
On Thu, Aug 8, 2013 at 12:40 PM, Josef Bacik jba...@fusionio.com wrote: On Thu, Aug 08, 2013 at 09:13:04AM -0700, John Williams wrote: Phoronix periodically runs benchmarks on filesystems, and one thing I have noticed is that btrfs always does terribly on their fio Intel IOMeter fileserver