Re: Reiser4 und LZO compression

2006-08-29 Thread Ray Lee

On 8/29/06, Nigel Cunningham [EMAIL PROTECTED] wrote:

Hi.
On Tue, 2006-08-29 at 03:23 -0500, David Masover wrote:
 Nigel Cunningham wrote:
  We used gzip when we first implemented compression support, and found it
  to be far too slow. Even with the fastest compression options, we were
  only getting a few megabytes per second. Perhaps I did something wrong
  in configuring it, but there's not that many things to get wrong!

 All that comes to mind is the speed/quality setting -- the number from 1
 to 9.  Recently, I backed up someone's hard drive using -1, and I
 believe I was still able to saturate... the _network_.  Definitely try
 again if you haven't changed this, but I can't imagine I'm the first
 person to think of it.

 From what I remember, gzip -1 wasn't faster than the disk.  But at
 least for (very) repetitive data, I was wrong:

 eve:~ sanity$ time bash -c 'dd if=/dev/zero of=test bs=10m count=10; sync'
 10+0 records in
 10+0 records out
 104857600 bytes transferred in 3.261990 secs (32145287 bytes/sec)

 real    0m3.746s
 user    0m0.005s
 sys     0m0.627s
 eve:~ sanity$ time bash -c 'dd if=/dev/zero bs=10m count=10 | gzip -v1 > test; sync'
 10+0 records in
 10+0 records out
 104857600 bytes transferred in 2.404093 secs (43616282 bytes/sec)
   99.5%

 real    0m2.558s
 user    0m1.554s
 sys     0m0.680s
 eve:~ sanity$



 This was on OS X, but I think it's still valid -- this is a slightly
 older PowerBook, with a 5400 RPM drive and a 1.6 GHz G4.

 -1 is still worlds better than nothing.  The backup was over 15 gigs,
 down to about 6 -- loads of repetitive data, I'm sure, but that's where
 you win with compression anyway.

Wow. That's a lot better; I guess I did get something wrong in trying to
tune deflate. That was pre-cryptoapi though; looking at
cryptoapi/deflate.c, I don't see any way of controlling the compression
level. Am I missing anything?


Compressing /dev/zero isn't a great test. The timings are really data-dependent:

[EMAIL PROTECTED]:~$ time bash -c 'sudo dd if=/dev/zero bs=8M count=64 | gzip -v1 > /dev/null'
64+0 records in
64+0 records out
536870912 bytes (537 MB) copied, 7.60817 seconds, 70.6 MB/s
99.6%

real    0m7.652s
user    0m6.581s
sys     0m0.701s
[EMAIL PROTECTED]:~$ time bash -c 'sudo dd if=/dev/mem bs=8M count=64 | gzip -v1 > /dev/null'
64+0 records in
64+0 records out
536870912 bytes (537 MB) copied, 21.5863 seconds, 24.9 MB/s
70.4%

real    0m21.626s
user    0m18.763s
sys     0m1.762s

This is on an AMD64 laptop.
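
The same effect can be reproduced without touching a disk. What follows is a minimal sketch, not part of the measurements above. It assumes Python's standard zlib module at level 1 as a stand-in for gzip -1 (both use deflate; only the container format differs). The measure() helper and the 8 MiB buffer size are illustrative choices.

```python
import os
import time
import zlib

SIZE = 8 * 1024 * 1024  # 8 MiB per sample; an arbitrary choice


def measure(label, data, level=1):
    """Compress data once and return (percent saved, throughput in MB/s)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    saved = 100.0 * (1.0 - len(compressed) / len(data))
    rate = len(data) / elapsed / 1e6
    print(f"{label:>6}: {saved:6.1f}% saved, {rate:8.1f} MB/s")
    return saved, rate


# All zeros, like reading /dev/zero: the best case for deflate.
zero_saved, zero_rate = measure("zeros", bytes(SIZE))

# Random bytes: essentially incompressible, the worst case.
noise_saved, noise_rate = measure("random", os.urandom(SIZE))
```

Real data such as memory images or a home directory falls somewhere between these two extremes, so a benchmark is only meaningful on input that resembles the actual workload.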

Ray


Re: Reiser4 und LZO compression

2006-08-27 Thread Ray Lee

On 8/27/06, Andrew Morton [EMAIL PROTECTED] wrote:

On Sun, 27 Aug 2006 04:34:26 +0400
Alexey Dobriyan [EMAIL PROTECTED] wrote:

 The patch below is so-called reiser4 LZO compression plugin as extracted
 from 2.6.18-rc4-mm3.

 I think it is an unauditable piece of shit and thus should not enter
 mainline.


Sheesh.


Like lib/inflate.c (and this new code should arguably be in lib/).

The problem is that if we clean this up, we've diverged very much from the
upstream implementation.  So taking in fixes and features from upstream
becomes harder and more error-prone.


Right. How about putting it in as is, so that everyone can track
divergences, but not using it for a real compile? Rather, consider it
meta-source, and do mechanical, repeatable transformations only,
starting with something like:

mv minilzo.c minilzo._c
cpp 2>/dev/null -w -P -C -nostdinc -dI minilzo._c minilzo.c
lindent minilzo.c

to generate a version that can be audited. Doing so on a version of
minilzo.c that Google found on the web generated something that looked much
like any other stream coder source I've read, so it approaches
readability. Of a sort. Further cleanups could be done with cpp -D to
rename some of the more bizarre symbols.

Downside is that bugs would have to be fixed in the 'meta-source'
(horrible name, but it's late here), but at least they could be found
(potentially) more easily than in the original.

Ray