Re: [GIT PULL v2] Update LZO compression

2012-08-22 Thread Johannes Stezenbach
On Tue, Aug 21, 2012 at 05:21:50PM +0200, Markus F.X.J. Oberhumer wrote:
 as suggested on the mailing list I have converted the updated LZO
 code into git, so please pull my lzo-update branch from
...
 [ Changes in v2: Optimize code for CPUs with inefficient unaligned
   access => significant speed increase on ARM ]

I can confirm that this new code runs at the same speed
as the current lzo code in the Linux kernel on
my ARM926EJ-S based platform.  I only tested decompression,
using the attached hacky userspace code.

   # time ./lzo-bench/old/unlzop lzoimage /dev/null
   real    0m 0.29s
   # time ./lzo-bench/new/unlzop lzoimage /dev/null
   real    0m 0.29s

   (where lzoimage is a Linux Image compressed with lzop)
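
A single `time` run of 0.29s is fairly coarse. As a minimal sketch (this is not the code in the attached lzo-bench tarball), repeated runs could be timed in-process with the POSIX `clock_gettime` call. The names `bench_fn`, `time_runs` and `count_up` are hypothetical; `count_up` is only a placeholder for a real decompression call.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one unit of work, e.g. one decompression. */
typedef void (*bench_fn)(void *ctx);

/* Run fn(ctx) `runs` times and return the elapsed wall time in seconds. */
static double time_runs(bench_fn fn, void *ctx, int runs)
{
    struct timespec t0, t1;

    assert(fn != NULL && runs > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < runs; i++)
        fn(ctx);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Placeholder workload: increments the int that ctx points to. */
static void count_up(void *ctx)
{
    int *counter = ctx;
    (*counter)++;
}
```

Dividing the returned time by the number of runs gives a per-call average, which should be less sensitive to process startup cost than timing the whole program.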

So, from my side there are no more objections.
Thanks for doing this work, Markus.


Johannes


lzo-bench.tar.gz
Description: Binary data


Re: [GIT PULL] Update LZO compression

2012-08-16 Thread Johannes Stezenbach
On Thu, Aug 16, 2012 at 08:27:15AM +0200, Markus F.X.J. Oberhumer wrote:
 On 2012-08-15 16:45, Johannes Stezenbach wrote:
  
  I made the attached quick hack userspace code
  using ARM kernel headers and barebox unlzop code.
  (new == your new code, old == linux-3.5 git, test == new + your
  suggested change)
  (sorry I had no time to clean it up)
 
 My suggested COPY4 replacement probably has a lot of load stalls - maybe some
 ARM expert could have a look and suggest a more efficient implementation.
 
 In any case, I still would like to see the new code in linux-next because
 of the huge improvements on other modern CPUs.

Well, ~2x speedup on x86 is certainly a good achievement, but there
are more ARM based devices than there are PCs, and I guess many
embedded devices use lzo compressed kernels and file systems
while I'm not convinced many PCs rely on lzo in the kernel.

I know everyone's either busy or on vacation, but it would
be so cool if someone could test on a more modern ARM core.
With the userspace test code I posted it should be easy to do.


Johannes
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Update LZO compression

2012-08-16 Thread Johannes Stezenbach
On Thu, Aug 16, 2012 at 12:48:49PM -0400, Jeff Garzik wrote:
 On 08/16/2012 12:20 PM, Andi Kleen wrote:
 If you think a little bit, I bet you could come up with a solution that
 operates at cacheline-aligned granularity, something that would be _even
 faster_ than simply fixing the code to do aligned accesses.
 
 Cache aligned compression is unlikely to compress anything at all.
 Compression algorithms are usually by definition unaligned.
 
 Sure it's a bitstream, but that does not imply the impossibility of
 reading data in a word-aligned manner.
 
 Maybe cache-aligned is ambitious, because of resultant code bloat,
 but machine-int-aligned is doable and reasonable.

Well, I for one would be content if the old and new lzo versions
could be merged based on CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

(Assuming that the slowdown on ARM is due to unaligned access:
the old version also uses get/put_unaligned, so is the new
version actually using more unaligned accesses?)
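
To illustrate the kind of compile-time selection meant here, a minimal userspace sketch follows. It is not the kernel's LZO code: the macro `SKETCH_EFFICIENT_UNALIGNED_ACCESS` is a hypothetical stand-in for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and `copy4` and `copy_literals` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#ifdef SKETCH_EFFICIENT_UNALIGNED_ACCESS
/* Fast path for CPUs with cheap unaligned access: move 4 bytes through
 * a 32-bit temporary. Compilers commonly lower these fixed-size memcpy
 * calls to a single word load and store on such targets. */
static inline void copy4(unsigned char *dst, const unsigned char *src)
{
    uint32_t v;

    memcpy(&v, src, sizeof(v));
    memcpy(dst, &v, sizeof(v));
}
#else
/* Conservative path: four byte-sized moves, which never require an
 * unaligned word access. */
static inline void copy4(unsigned char *dst, const unsigned char *src)
{
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
    dst[3] = src[3];
}
#endif

/* Copy a run of len bytes, four at a time while possible, then the
 * remaining 0..3 bytes one by one. Regions must not overlap. */
static void copy_literals(unsigned char *dst, const unsigned char *src,
                          size_t len)
{
    assert(dst != NULL && src != NULL);
    while (len >= 4) {
        copy4(dst, src);
        dst += 4;
        src += 4;
        len -= 4;
    }
    while (len--)
        *dst++ = *src++;
}
```

Building once with and once without `-DSKETCH_EFFICIENT_UNALIGNED_ACCESS` and timing both variants on the target would show whether the word-sized path actually helps on a given core.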


Johannes


Re: [GIT PULL] Update LZO compression

2012-08-14 Thread Johannes Stezenbach
On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
 On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
 
  As stated in the README this version is significantly faster (typically more
  than 2 times faster!) than the current version, has been thoroughly tested on
  x86_64/i386/powerpc platforms and is intended to get included into the
  official Linux 3.6 or 3.7 release.
 
  I encourage all compression users to test and benchmark this new version,
  and I also would ask some official LZO maintainer to convert the updated
  source files into a GIT commit and possibly push it to Linus or linux-next.

Sorry for not reporting earlier, but I didn't have time to do real
benchmarks, just a quick test on ARM926EJ-S using barebox,
and found that decompression is slower in the new version:
http://lists.infradead.org/pipermail/barebox/2012-July/008268.html

BTW, do you have userspace code matching the old and new
lzo versions?  It would be easier to benchmark.

Unfortunately I cannot claim high confidence in my benchmark results
because I lacked the time to do them properly. It would be useful if
someone else could do some benchmarks on ARM before merging this.


Johannes