Re: [GIT PULL v2] Update LZO compression
On Tue, Aug 21, 2012 at 05:21:50PM +0200, Markus F.X.J. Oberhumer wrote:
> as suggested on the mailing list I have converted the updated LZO code
> into git, so please pull my lzo-update branch from
> ...
>
> [ Changes in v2:
>   Optimize code for CPUs with inefficient unaligned access
>   = significant speed increase on ARM ]

I can confirm that this new code runs at the same speed as the
current lzo code in the Linux kernel on my ARM926EJ-S based platform.
I only tested decompression, using the attached hacky userspace code.

# time ./lzo-bench/old/unlzop lzoimage /dev/null
real    0m 0.29s
# time ./lzo-bench/new/unlzop lzoimage /dev/null
real    0m 0.29s

(where lzoimage is a Linux Image compressed with lzop)

So, from my side there are no more objections.

Thanks for doing this work, Markus.

Johannes

[Attachment: lzo-bench.tar.gz, binary data]

Re: [GIT PULL] Update LZO compression
On Thu, Aug 16, 2012 at 08:27:15AM +0200, Markus F.X.J. Oberhumer wrote:
> On 2012-08-15 16:45, Johannes Stezenbach wrote:
> > I made the attached quick hack userspace code using ARM kernel headers
> > and barebox unlzop code. (new == your new code, old == linux-3.5 git,
> > test == new + your suggested change)
> > (sorry I had no time to clean it up)
>
> My suggested COPY4 replacement probably has a lot of load stalls - maybe
> some ARM expert could have a look and suggest a more efficient
> implementation.
>
> In any case, I still would like to see the new code in linux-next because
> of the huge improvements on other modern CPUs.

Well, ~2x speedup on x86 is certainly a good achievement, but there
are more ARM based devices than there are PCs, and I guess many
embedded devices use lzo compressed kernels and file systems
while I'm not convinced many PCs rely on lzo in the kernel.

I know everyone's either busy or on vacation, but it would be
so cool if someone could test on a more modern ARM core,
with the userspace test code I posted it should be easy to do.

Johannes
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

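As an illustration of the COPY4 question raised above, here is a minimal
userspace sketch of two ways to copy four bytes between addresses that may
be unaligned. The names copy4_bytes and copy4_memcpy are hypothetical; they
are not taken from the LZO patch, from barebox, or from the kernel. Whether
either variant reduces load stalls on a particular ARM core is an open
question that only a benchmark on that core can answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the LZO patch): copy 4 bytes using only
 * byte accesses, which are valid at any alignment. The loads are written
 * before the stores; the compiler may still reorder them. */
static void copy4_bytes(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);

    unsigned char b0 = src[0];
    unsigned char b1 = src[1];
    unsigned char b2 = src[2];
    unsigned char b3 = src[3];

    dst[0] = b0;
    dst[1] = b1;
    dst[2] = b2;
    dst[3] = b3;
}

/* Hypothetical alternative: a fixed-size memcpy, which leaves the choice
 * of access width to the compiler and the target's alignment rules. */
static void copy4_memcpy(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, 4);
}
```

Both functions produce the same bytes in dst, so they can be swapped in a
test harness and timed against each other without changing the output.
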
Re: [GIT PULL] Update LZO compression
On Thu, Aug 16, 2012 at 12:48:49PM -0400, Jeff Garzik wrote:
> On 08/16/2012 12:20 PM, Andi Kleen wrote:
> > > If you think a little bit, I bet you could come up with a solution that
> > > operates at cacheline-aligned granularity, something that would be _even
> > > faster_ than simply fixing the code to do aligned accesses.
> >
> > Cache aligned compression is unlikely to compress anything at all.
> > Compression algorithms are usually by definition unaligned.
>
> Sure it's a bitstream, but that does not imply the impossibility of
> reading data in a word-aligned manner. Maybe cache-aligned is
> ambitious, because of resultant code bloat, but machine-int-aligned is
> doable and reasonable.

Well, I for one would be content if the old and new lzo versions
could be merged based on CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

(Assuming that the slowdown on ARM is due to unaligned access,
since the old version also uses get/put_unaligned, is the new
version actually using more unaligned accesses?)

Johannes

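The idea of choosing between two code paths based on
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS could look roughly like the sketch
below. It is a userspace illustration only: the macro
HAVE_EFFICIENT_UNALIGNED_ACCESS is an assumed stand-in for the kernel
config symbol (pass it with -D when compiling), and copy_run is a
hypothetical function name, not something from the old or new LZO code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy len bytes from src to dst.
 * With HAVE_EFFICIENT_UNALIGNED_ACCESS defined (an assumed userspace
 * stand-in for the kernel config symbol), the bulk of the run is moved
 * 4 bytes at a time; otherwise only byte accesses are used. */
static void copy_run(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(len == 0 || (dst != NULL && src != NULL));

#ifdef HAVE_EFFICIENT_UNALIGNED_ACCESS
    /* Word path: memcpy into a local keeps the access well defined in C
     * even when src or dst is not 4-byte aligned. */
    while (len >= 4) {
        uint32_t w;
        memcpy(&w, src, sizeof w);
        memcpy(dst, &w, sizeof w);
        src += 4;
        dst += 4;
        len -= 4;
    }
#endif

    /* Byte path: the whole run without the macro, only the tail with it. */
    while (len > 0) {
        *dst++ = *src++;
        len--;
    }
}
```

Building the same benchmark once with and once without
-DHAVE_EFFICIENT_UNALIGNED_ACCESS gives two binaries with identical output,
which makes it straightforward to compare their speed on a given CPU.
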
Re: [GIT PULL] Update LZO compression
On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
> > As stated in the README this version is significantly faster (typically
> > more than 2 times faster!) than the current version, has been thoroughly
> > tested on x86_64/i386/powerpc platforms and is intended to get included
> > into the official Linux 3.6 or 3.7 release.
> >
> > I encourage all compression users to test and benchmark this new version,
> > and I also would ask some official LZO maintainer to convert the updated
> > source files into a GIT commit and possibly push it to Linus or linux-next.

Sorry for not reporting earlier, but I didn't have time to do real
benchmarks, just a quick test on ARM926EJ-S using barebox,
and found in the new version decompression is slower:
http://lists.infradead.org/pipermail/barebox/2012-July/008268.html

BTW, do you have userspace code matching the old and new lzo versions?
It would be easier to benchmark.

Unfortunately I cannot claim high confidence in my benchmark results
due to missing time to do it properly, it would be useful if someone
else could do some benchmarks on ARM before merging this.

Johannes