Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Hi Daniel, On 6/1/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote: On Wednesday 30 May 2007 19:02:28 Mark Adler wrote: > On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: > > [1] For your reference, here is the user code in question: > > ... > > >if (srclen > 2 && !(data_in[1] & PRESET_DICT) && > > ((data_in[0] & 0x0f) == Z_DEFLATED) && > > !(((data_in[0]<<8) + data_in[1]) % 31)) { > > The funny thing here is that the author felt compelled to use a > #defined constant for the dictionary bit (PRESET_DICT), but had no > problem with a numeric constant to isolate the compression method > (0x0f), or for that matter extracting the window bits from the > header. The easy way to avoid the use of an internal zlib header > file here is to simply replace PRESET_DICT with 0x20. That constant > will never change -- it is part of the definition of the zlib header > in RFC 1950. If there is no objection, I'll put together a patch that changes the use in JFFS2 into a "magic number", complete with documentation on it, and also moves all of the zlib stuff into a single directory. Right, s/PRESET_DICT/0x20/ would have the least fall-out / unknown side-effects. > The slightly more involved patch to avoid the problem is to let > inflate() do all that work for you, including the integrity check on > the zlib header (% 31). Also this corrects an error in the original > code, which is that it continues to try to decompress after finding > that a dictionary is needed or that the zlib header is invalid. In > this version, a bad header simply returns an error: > Does anyone know if doing as Mark suggests would negatively impact the performance of JFFS2 to such a degree that it could be considered a regression? I, unfortunately, don't have the hardware to properly test the code. And, at this point in time, I also don't have enough familiarity with the JFFS2 code to make such a change myself. David would have to comment on that, but you could simultaneously make and submit the patch as suggested by Mark with both your signed-offs-by. That'll naturally need to go through David's tree, so we'll know if he likes/accepts the suggested modifications ... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Hi Daniel, On 6/1/07, Daniel Hazelton [EMAIL PROTECTED] wrote: On Wednesday 30 May 2007 19:02:28 Mark Adler wrote: On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: [1] For your reference, here is the user code in question: ... if (srclen 2 !(data_in[1] PRESET_DICT) ((data_in[0] 0x0f) == Z_DEFLATED) !(((data_in[0]8) + data_in[1]) % 31)) { The funny thing here is that the author felt compelled to use a #defined constant for the dictionary bit (PRESET_DICT), but had no problem with a numeric constant to isolate the compression method (0x0f), or for that matter extracting the window bits from the header. The easy way to avoid the use of an internal zlib header file here is to simply replace PRESET_DICT with 0x20. That constant will never change -- it is part of the definition of the zlib header in RFC 1950. If there is no objection, I'll put together a patch that changes the use in JFFS2 into a magic number, complete with documentation on it, and also moves all of the zlib stuff into a single directory. Right, s/PRESET_DICT/0x20/ would have the least fall-out / unknown side-effects. The slightly more involved patch to avoid the problem is to let inflate() do all that work for you, including the integrity check on the zlib header (% 31). Also this corrects an error in the original code, which is that it continues to try to decompress after finding that a dictionary is needed or that the zlib header is invalid. In this version, a bad header simply returns an error: Does anyone know if doing as Mark suggests would negatively impact the performance of JFFS2 to such a degree that it could be considered a regression? I, unfortunately, don't have the hardware to properly test the code. And, at this point in time, I also don't have enough familiarity with the JFFS2 code to make such a change myself. David would have to comment on that, but you could simultaneously make and submit the patch as suggested by Mark with both your signed-offs-by. That'll naturally need to go through David's tree, so we'll know if he likes/accepts the suggested modifications ... - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Wednesday 30 May 2007 19:02:28 Mark Adler wrote: > On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: > > [1] For your reference, here is the user code in question: > > ... > > >if (srclen > 2 && !(data_in[1] & PRESET_DICT) && > > ((data_in[0] & 0x0f) == Z_DEFLATED) && > > !(((data_in[0]<<8) + data_in[1]) % 31)) { > > The funny thing here is that the author felt compelled to use a > #defined constant for the dictionary bit (PRESET_DICT), but had no > problem with a numeric constant to isolate the compression method > (0x0f), or for that matter extracting the window bits from the > header. The easy way to avoid the use of an internal zlib header > file here is to simply replace PRESET_DICT with 0x20. That constant > will never change -- it is part of the definition of the zlib header > in RFC 1950. If there is no objection, I'll put together a patch that changes the use in JFFS2 into a "magic number", complete with documentation on it, and also moves all of the zlib stuff into a single directory. > The slightly more involved patch to avoid the problem is to let > inflate() do all that work for you, including the integrity check on > the zlib header (% 31). Also this corrects an error in the original > code, which is that it continues to try to decompress after finding > that a dictionary is needed or that the zlib header is invalid. In > this version, a bad header simply returns an error: > Does anyone know if doing as Mark suggests would negatively impact the performance of JFFS2 to such a degree that it could be considered a regression? I, unfortunately, don't have the hardware to properly test the code. And, at this point in time, I also don't have enough familiarity with the JFFS2 code to make such a change myself. DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/31/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: [...] The author (Markus Oberhumer) of LZO provided these comments for this patch: --- I've only briefly looked over it, but it's obvious that your version does not work on architechtures which do not allow unaligned access (arm, mips, ...). As for further quality assurance, your version should generate byte-identical object code to LZO 2.02 when using the same compiler & flags. So you could do some semi-automatic testing by compiling with -ffunction-sections and use "objcopy --only-section .text.XXX" to compare the md5sum of all generated functions. This also works fine with crosscompilers. Finally I'm not too happy that you renamed the functions and #defines like LZO1X_WORKMEM_SIZE - please stay compatible with the official library version. As suggested by Johannes earlier, it'd be great if you could submit the various changes (as per Changelog) as _individual patches_ on the original userspace code. That would be easier for others to review, and there's lesser chances of bugs / issues leaking in that way. As for "byte-identical object code", I definitely do *not* think it is necessarily a requirement / good idea. As long as all the changes you make are reviewed individually / closely by people here on this list, there's very low chances of any bugs creeping in. [ F.e. I see nothing wrong in removing the usage of "register" -- that could clearly lead to different object code, but with no bugs introduced. ] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
Hi, FYI... The author (Markus Oberhumer) of LZO provided these comments for this patch: --- I've only briefly looked over it, but it's obvious that your version does not work on architechtures which do not allow unaligned access (arm, mips, ...). As for further quality assurance, your version should generate byte-identical object code to LZO 2.02 when using the same compiler & flags. So you could do some semi-automatic testing by compiling with -ffunction-sections and use "objcopy --only-section .text.XXX" to compare the md5sum of all generated functions. This also works fine with crosscompilers. Finally I'm not too happy that you renamed the functions and #defines like LZO1X_WORKMEM_SIZE - please stay compatible with the official library version. Still a lot to do... - Nitin On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe version only). * Changes since 'take 5' (Full Changelog after this): - Added compressor and decomrpesssor as separate and hidden config options (default: n) - Cleanups: removed LZO_CHECK_MPOS_NON_DET macro - removed PTR* macros. * Benchmarks: (system: P4 3GHz, 1G RAM) (Using tester program from Daniel) Following compares this kernel port ('take 6') vs original miniLZO code: 'TinyLZO' refers to this kernel port. 1 run averages: 'Tiny LZO': Combined: 61.2223 usec Compression: 41.8412 usec Decompression: 19.3811 usec 'miniLZO': Combined: 66.0444 usec Compression: 46.6323 usec Decompression: 19.4121 usec Result: Overall: TinyLZO is 7.3% faster Compressor: TinyLZO is 10.2% faster Decompressor: TinyLZO is 0.15% faster TODO: test 'take 6' port on archs other than x86(_32) * Changelog vs. original LZO: 1) Used standard/kernel defined data types: (this eliminated _huge_ #ifdef chunks) lzo_bytep -> unsigned char * lzo_uint -> size_t lzo_xint -> size_t lzo_uint32p -> u32 * lzo_uintptr_t -> unsigned long 2) Removed everything #ifdef'ed under COPY_DICT (this is not set for LZO1X, so removed corres. parts). 3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants. 4) Reformatted the code to match general kernel style. 5) The only code change: (as suggested by Andrey) -#if defined(__LITTLE_ENDIAN) - m_pos = op - 1; - m_pos -= (*(const unsigned short *)ip) >> 2; -#else - m_pos = op - 1; - m_pos -= (ip[0] >> 2) + (ip[1] << 6); -#endif + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2); (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). *** Need testing on big endian machine *** Similarly: -#if defined(__LITTLE_ENDIAN) - m_pos -= (*(const unsigned short *)ip) >> 2; -#else - m_pos -= (ip[0] >> 2) + (ip[1] << 6); -#endif + m_pos -= cpu_to_le16(*(const u16 *)ip) >> 2; 6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros. I think it's now ready for mainline inclusion. include/linux/lzo1x.h| 66 +++ lib/Kconfig |6 + lib/Makefile |2 + lib/lzo1x/Makefile |2 + lib/lzo1x/lzo1x_compress.c | 259 ++ lib/lzo1x/lzo1x_decompress.c | 238 ++ lib/lzo1x/lzo1x_int.h| 85 ++ 7 files changed, 658 insertions(+), 0 deletions(-) Signed-off-by: Nitin Gupta --- diff --git a/include/linux/lzo1x.h b/include/linux/lzo1x.h new file mode 100755 index 000..11a6f23 --- /dev/null +++ b/include/linux/lzo1x.h @@ -0,0 +1,66 @@ +/* lzo1x.h -- public interface of the LZO1X compression algorithm + + This file is part of the LZO real-time data compression library. + + Copyright (C) 1996-2005 Markus Franz Xaver Johannes Oberhumer + All Rights Reserved. + + The LZO library is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License, + version 2, as published by the Free Software Foundation. + + The LZO library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with the LZO library; see the file COPYING. + If not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + + Markus F.X.J. Oberhumer + <[EMAIL PROTECTED]> + http://www.oberhumer.com/opensource/lzo/ + + + This file is modified version of lzo1x.h found in original LZO 2.02 + code. Some additional changes have also been made to make it work + in kernel space. + + Nitin Gupta + <[EMAIL PROTECTED]> + */ + +#ifndef __LZO1X_H +#define __LZO1X_H + +/*
Re: [RFC] LZO de/compression support - take 6
Hi, FYI... The author (Markus Oberhumer) of LZO provided these comments for this patch: --- I've only briefly looked over it, but it's obvious that your version does not work on architechtures which do not allow unaligned access (arm, mips, ...). As for further quality assurance, your version should generate byte-identical object code to LZO 2.02 when using the same compiler flags. So you could do some semi-automatic testing by compiling with -ffunction-sections and use objcopy --only-section .text.XXX to compare the md5sum of all generated functions. This also works fine with crosscompilers. Finally I'm not too happy that you renamed the functions and #defines like LZO1X_WORKMEM_SIZE - please stay compatible with the official library version. Still a lot to do... - Nitin On 5/28/07, Nitin Gupta [EMAIL PROTECTED] wrote: This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe version only). * Changes since 'take 5' (Full Changelog after this): - Added compressor and decomrpesssor as separate and hidden config options (default: n) - Cleanups: removed LZO_CHECK_MPOS_NON_DET macro - removed PTR* macros. * Benchmarks: (system: P4 3GHz, 1G RAM) (Using tester program from Daniel) Following compares this kernel port ('take 6') vs original miniLZO code: 'TinyLZO' refers to this kernel port. 1 run averages: 'Tiny LZO': Combined: 61.2223 usec Compression: 41.8412 usec Decompression: 19.3811 usec 'miniLZO': Combined: 66.0444 usec Compression: 46.6323 usec Decompression: 19.4121 usec Result: Overall: TinyLZO is 7.3% faster Compressor: TinyLZO is 10.2% faster Decompressor: TinyLZO is 0.15% faster TODO: test 'take 6' port on archs other than x86(_32) * Changelog vs. original LZO: 1) Used standard/kernel defined data types: (this eliminated _huge_ #ifdef chunks) lzo_bytep - unsigned char * lzo_uint - size_t lzo_xint - size_t lzo_uint32p - u32 * lzo_uintptr_t - unsigned long 2) Removed everything #ifdef'ed under COPY_DICT (this is not set for LZO1X, so removed corres. parts). 3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants. 4) Reformatted the code to match general kernel style. 5) The only code change: (as suggested by Andrey) -#if defined(__LITTLE_ENDIAN) - m_pos = op - 1; - m_pos -= (*(const unsigned short *)ip) 2; -#else - m_pos = op - 1; - m_pos -= (ip[0] 2) + (ip[1] 6); -#endif + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) 2); (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). *** Need testing on big endian machine *** Similarly: -#if defined(__LITTLE_ENDIAN) - m_pos -= (*(const unsigned short *)ip) 2; -#else - m_pos -= (ip[0] 2) + (ip[1] 6); -#endif + m_pos -= cpu_to_le16(*(const u16 *)ip) 2; 6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros. I think it's now ready for mainline inclusion. include/linux/lzo1x.h| 66 +++ lib/Kconfig |6 + lib/Makefile |2 + lib/lzo1x/Makefile |2 + lib/lzo1x/lzo1x_compress.c | 259 ++ lib/lzo1x/lzo1x_decompress.c | 238 ++ lib/lzo1x/lzo1x_int.h| 85 ++ 7 files changed, 658 insertions(+), 0 deletions(-) Signed-off-by: Nitin Gupta nitingupta910 [at] gmail [dot] com --- diff --git a/include/linux/lzo1x.h b/include/linux/lzo1x.h new file mode 100755 index 000..11a6f23 --- /dev/null +++ b/include/linux/lzo1x.h @@ -0,0 +1,66 @@ +/* lzo1x.h -- public interface of the LZO1X compression algorithm + + This file is part of the LZO real-time data compression library. + + Copyright (C) 1996-2005 Markus Franz Xaver Johannes Oberhumer + All Rights Reserved. + + The LZO library is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License, + version 2, as published by the Free Software Foundation. + + The LZO library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with the LZO library; see the file COPYING. + If not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. + + Markus F.X.J. Oberhumer + [EMAIL PROTECTED] + http://www.oberhumer.com/opensource/lzo/ + + + This file is modified version of lzo1x.h found in original LZO 2.02 + code. Some additional changes have also been made to make it work + in kernel space. + + Nitin Gupta + [EMAIL PROTECTED] + */ + +#ifndef __LZO1X_H +#define __LZO1X_H +
Re: [RFC] LZO de/compression support - take 6
On 5/31/07, Nitin Gupta [EMAIL PROTECTED] wrote: [...] The author (Markus Oberhumer) of LZO provided these comments for this patch: --- I've only briefly looked over it, but it's obvious that your version does not work on architechtures which do not allow unaligned access (arm, mips, ...). As for further quality assurance, your version should generate byte-identical object code to LZO 2.02 when using the same compiler flags. So you could do some semi-automatic testing by compiling with -ffunction-sections and use objcopy --only-section .text.XXX to compare the md5sum of all generated functions. This also works fine with crosscompilers. Finally I'm not too happy that you renamed the functions and #defines like LZO1X_WORKMEM_SIZE - please stay compatible with the official library version. As suggested by Johannes earlier, it'd be great if you could submit the various changes (as per Changelog) as _individual patches_ on the original userspace code. That would be easier for others to review, and there's lesser chances of bugs / issues leaking in that way. As for byte-identical object code, I definitely do *not* think it is necessarily a requirement / good idea. As long as all the changes you make are reviewed individually / closely by people here on this list, there's very low chances of any bugs creeping in. [ F.e. I see nothing wrong in removing the usage of register -- that could clearly lead to different object code, but with no bugs introduced. ] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Wednesday 30 May 2007 19:02:28 Mark Adler wrote: On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: [1] For your reference, here is the user code in question: ... if (srclen 2 !(data_in[1] PRESET_DICT) ((data_in[0] 0x0f) == Z_DEFLATED) !(((data_in[0]8) + data_in[1]) % 31)) { The funny thing here is that the author felt compelled to use a #defined constant for the dictionary bit (PRESET_DICT), but had no problem with a numeric constant to isolate the compression method (0x0f), or for that matter extracting the window bits from the header. The easy way to avoid the use of an internal zlib header file here is to simply replace PRESET_DICT with 0x20. That constant will never change -- it is part of the definition of the zlib header in RFC 1950. If there is no objection, I'll put together a patch that changes the use in JFFS2 into a magic number, complete with documentation on it, and also moves all of the zlib stuff into a single directory. The slightly more involved patch to avoid the problem is to let inflate() do all that work for you, including the integrity check on the zlib header (% 31). Also this corrects an error in the original code, which is that it continues to try to decompress after finding that a dictionary is needed or that the zlib header is invalid. In this version, a bad header simply returns an error: Does anyone know if doing as Mark suggests would negatively impact the performance of JFFS2 to such a degree that it could be considered a regression? I, unfortunately, don't have the hardware to properly test the code. And, at this point in time, I also don't have enough familiarity with the JFFS2 code to make such a change myself. DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Wednesday 30 May 2007 19:02:28 Mark Adler wrote: > On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: > > [1] For your reference, here is the user code in question: > > ... > > >if (srclen > 2 && !(data_in[1] & PRESET_DICT) && > > ((data_in[0] & 0x0f) == Z_DEFLATED) && > > !(((data_in[0]<<8) + data_in[1]) % 31)) { > > The funny thing here is that the author felt compelled to use a > #defined constant for the dictionary bit (PRESET_DICT), but had no > problem with a numeric constant to isolate the compression method > (0x0f), or for that matter extracting the window bits from the > header. The easy way to avoid the use of an internal zlib header > file here is to simply replace PRESET_DICT with 0x20. That constant > will never change -- it is part of the definition of the zlib header > in RFC 1950. That was one of my original suggestions, though I personally downchecked it (and believe I also did so in the mail with the suggestions) because of the dislike of "Magic Numbers" in the kernel. However, I had only looked at a small part of the JFFS2 code to see what it used zutil.h for - if it is using said 'Magic Numbers' in the rest of the code, the consistent way to fix this would be to replace the PRESET_DICT macro with the constant it stands for. I've gotten a bit caught-up in extending the test-bed code I used to benchmark Nitin's 'tinyLZO' implementation, but I will see about getting a patch together to change JFFS2 in the manner Mark suggests. (Unless, of course, someone gets to it before I do). DRH > > The slightly more involved patch to avoid the problem is to let > inflate() do all that work for you, including the integrity check on > the zlib header (% 31). Also this corrects an error in the original > code, which is that it continues to try to decompress after finding > that a dictionary is needed or that the zlib header is invalid. In > this version, a bad header simply returns an error: > > /* provide input data and output space */ > inf_strm.next_in = data_in; > inf_strm.avail_in = srclen; > inf_strm.next_out = cpage_out; > inf_strm.avail_out = destlen; > > /* verify and skip zlib header (this updates next_in and > avail_in) */ > inf_strm.zalloc = Z_NULL; > inf_strm.zfree = Z_NULL; > inf_strm.opaque = Z_NULL; > if (zlib_inflateInit(_strm) != Z_OK) { > printk(KERN_WARNING "inflateInit failed\n"); > return 1; > } > ret = zlib_inflate(_strm, Z_BLOCK); > if (ret != Z_OK || (inf_strm.data_type & 0x80) == 0) { > printk(KERN_WARNING "inflate failed on zlib header\n"); > return 1; > } > zlib_inflateEnd(_strm); > > /* do raw inflate (no adler32) on deflate data after zlib header */ > inf_strm.zalloc = Z_NULL; > inf_strm.zfree = Z_NULL; > inf_strm.opaque = Z_NULL; > if (zlib_inflateInit2(_strm, -MAX_WBITS) != Z_OK) { > printk(KERN_WARNING "inflateInit failed\n"); > return 1; > } > while ((ret = zlib_inflate ... > > Mark - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: [1] For your reference, here is the user code in question: ... if (srclen > 2 && !(data_in[1] & PRESET_DICT) && ((data_in[0] & 0x0f) == Z_DEFLATED) && !(((data_in[0]<<8) + data_in[1]) % 31)) { The funny thing here is that the author felt compelled to use a #defined constant for the dictionary bit (PRESET_DICT), but had no problem with a numeric constant to isolate the compression method (0x0f), or for that matter extracting the window bits from the header. The easy way to avoid the use of an internal zlib header file here is to simply replace PRESET_DICT with 0x20. That constant will never change -- it is part of the definition of the zlib header in RFC 1950. The slightly more involved patch to avoid the problem is to let inflate() do all that work for you, including the integrity check on the zlib header (% 31). Also this corrects an error in the original code, which is that it continues to try to decompress after finding that a dictionary is needed or that the zlib header is invalid. In this version, a bad header simply returns an error: /* provide input data and output space */ inf_strm.next_in = data_in; inf_strm.avail_in = srclen; inf_strm.next_out = cpage_out; inf_strm.avail_out = destlen; /* verify and skip zlib header (this updates next_in and avail_in) */ inf_strm.zalloc = Z_NULL; inf_strm.zfree = Z_NULL; inf_strm.opaque = Z_NULL; if (zlib_inflateInit(_strm) != Z_OK) { printk(KERN_WARNING "inflateInit failed\n"); return 1; } ret = zlib_inflate(_strm, Z_BLOCK); if (ret != Z_OK || (inf_strm.data_type & 0x80) == 0) { printk(KERN_WARNING "inflate failed on zlib header\n"); return 1; } zlib_inflateEnd(_strm); /* do raw inflate (no adler32) on deflate data after zlib header */ inf_strm.zalloc = Z_NULL; inf_strm.zfree = Z_NULL; inf_strm.opaque = Z_NULL; if (zlib_inflateInit2(_strm, -MAX_WBITS) != Z_OK) { printk(KERN_WARNING "inflateInit failed\n"); return 1; } while ((ret = zlib_inflate ... Mark - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Satyam Sharma wrote: Hi Artem, On 5/30/07, Artem Bityutskiy <[EMAIL PROTECTED]> wrote: Err, and important note is that it also wants this compressed data to be independently uncompressable. Hmm, okay. But what I meant was that if jffs2's needs are "standard" enough in the sense that they could conceivably be required by other users too (and this one mentioned by you does appear to be one of those), then why not make such hacks (if they are necessary and suitable indeed, lets wait for Mark's response) part of zlib itself? I really not sure - I vaguely remember that zlib just cannot provide this capability. And what JFFS2 does is not always correct. But again, it was so long time ago... -- Best Regards, Artem Bityutskiy (Артём Битюцкий) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Hi Artem, On 5/30/07, Artem Bityutskiy <[EMAIL PROTECTED]> wrote: Err, and important note is that it also wants this compressed data to be independently uncompressable. Hmm, okay. But what I meant was that if jffs2's needs are "standard" enough in the sense that they could conceivably be required by other users too (and this one mentioned by you does appear to be one of those), then why not make such hacks (if they are necessary and suitable indeed, lets wait for Mark's response) part of zlib itself? Thanks, Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Artem Bityutskiy wrote: JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and it wants zlib to compress as much as possible from the input buffer, and make the output buffer full or nearly full of compressed data. Err, and important note is that it also wants this compressed data to be independently uncompressable. -- Best Regards, Artem Bityutskiy (Артём Битюцкий) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Johannes Stezenbach <[EMAIL PROTECTED]> wrote: On Wed, May 30, 2007, Nitin Gupta wrote: > > Again, all the original code has been retained _as-is_. Whatever was > changed, has been mentioned in that detailed changelog that I post > along with patch. Just a general remark (I haven't been following this thread closely): IMHO it would be _much_ better to merge the original code and your changes as seperate patches. Then someone who wants to review it later doesn't have to jump through all the hoops of finding the original code himself to diff it and see your changes. Additionally, you should also split stylistic/cleanup changes like "Reformatted the code to match general kernel style" from functional changes like "use cpu_to_le16()". Ideally each of the changes you mention in your "Changelog vs. original LZO" should be a seperate patch, this would make review much easier. I violently agree with this method of going forward. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Wed, May 30, 2007, Nitin Gupta wrote: > > Again, all the original code has been retained _as-is_. Whatever was > changed, has been mentioned in that detailed changelog that I post > along with patch. Just a general remark (I haven't been following this thread closely): IMHO it would be _much_ better to merge the original code and your changes as seperate patches. Then someone who wants to review it later doesn't have to jump through all the hoops of finding the original code himself to diff it and see your changes. Additionally, you should also split stylistic/cleanup changes like "Reformatted the code to match general kernel style" from functional changes like "use cpu_to_le16()". Ideally each of the changes you mention in your "Changelog vs. original LZO" should be a seperate patch, this would make review much easier. Regards, Johannes - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Satyam Sharma wrote: Hmmm, either jffs2 thinks the zlib interfaces exposed through include/linux/zlib.h are insufficient, or it isn't using zlib properly (or at least the way it was supposed to be used :-) I do not remember many details, but yes, JFFS2 uses zlib trickily. Traditionally, zlib is used like this: you have an input buffer, and you have a large enough out put buffer, you compress whole input buffer to the output buffer. JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and it wants zlib to compress as much as possible from the input buffer, and make the output buffer full or nearly full of compressed data. This is why JFFS2 uses different hacks, but I do not remember details anymore. Otherwise it could just use cryptoapi instead. In fact, I tried this 2 years ago, but gave up: http://lkml.org/lkml/2005/3/25/104 -- Best Regards, Artem Bityutskiy (Артём Битюцкий) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Hi Mark, On 5/30/07, Mark Adler <[EMAIL PROTECTED]> wrote: On May 29, 2007, at 8:15 AM, Satyam Sharma wrote: > skipping some checksum calculation if some > flag (PRESET_DICT) is absent from the input stream about to > be decompressed ... You don't need to dissect the header manually to look for that bit. If you feed inflate() at least the first two bytes, it will return immediately with the Z_NEED_DICT return code if a preset dictionary is requested. You can force inflate() to return immediately after decoding the two byte header even if a preset dictionary is not requested by using the Z_BLOCK flush code. Thanks for replying, unfortunately I don't know either zlib or jffs2 code deeply enough to actually understand what you meant here. Are you saying that the if-else block in question [1] in fs/jffs2/compr_zlib.c:jffs2_zlib_decompress() is unnecessary and can be done away with? Thanks, Satyam [1] For your reference, here is the user code in question: inf_strm.next_in = data_in; inf_strm.avail_in = srclen; inf_strm.total_in = 0; inf_strm.next_out = cpage_out; inf_strm.avail_out = destlen; inf_strm.total_out = 0; int wbits = MAX_WBITS; /* If it's deflate, and it's got no preset dictionary, then we can tell zlib to skip the adler32 check. */ if (srclen > 2 && !(data_in[1] & PRESET_DICT) && ((data_in[0] & 0x0f) == Z_DEFLATED) && !(((data_in[0]<<8) + data_in[1]) % 31)) { D2(printk(KERN_DEBUG "inflate skipping adler32\n")); wbits = -((data_in[0] >> 4) + 8); inf_strm.next_in += 2; inf_strm.avail_in -= 2; } else { /* Let this remain D1 for now -- it should never happen */ D1(printk(KERN_DEBUG "inflate not skipping adler32\n")); } if (zlib_inflateInit2(_strm, wbits) != Z_OK) { printk(KERN_WARNING "inflateInit failed\n"); return 1; } while((ret = zlib_inflate(_strm, Z_FINISH)) == Z_OK) ; if (ret != Z_STREAM_END) printk(KERN_NOTICE "inflate returned %d\n", ret); zlib_inflateEnd(_strm); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Wednesday 30 May 2007 01:31:19 Mark Adler wrote: > On May 29, 2007, at 8:15 AM, Satyam Sharma wrote: > > skipping some checksum calculation if some > > flag (PRESET_DICT) is absent from the input stream about to > > be decompressed ... > > You don't need to dissect the header manually to look for that bit. > If you feed inflate() at least the first two bytes, it will return > immediately with the Z_NEED_DICT return code if a preset dictionary > is requested. You can force inflate() to return immediately after > decoding the two byte header even if a preset dictionary is not > requested by using the Z_BLOCK flush code. > > Mark I think that JFFS2 is doing the work of looking for PRESET_DICT itself not only to skip the checksum calc, but also to lose the function-call overhead. Yes, the performance difference shouldn't be all that great, but might have made a difference in some test. When I'm done working on the benchmark/testbed for the LZO code proposed for inclusion in the kernel I'll do some testing and see how large the difference is in userspace. (for most code the differences hold true, though usually with less drastic numbers, in kernel) DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote: On May 30 2007 11:24, Nitin Gupta wrote: >> >> It is the bytestream (ip) that is reinterpreted as uint16_t. >> And I really doubt that the LZO author has a big-endian machine, >> given the days of ubiquitous x86. > >> le16_to_cpu it is. > > But then why you think it should be > le_16_cpu() -- how will this make any difference? Like I said, we are reinterpreting the byte stream as a uint16_t (= reading from bytestream to CPU). While writing out an uint16_t as a bytestream is cpu_to_le16. I didn't properly understand this. But you seem to be correct - now I ran sparse check: * using cpu_to_le16() lib/lzo1x/lzo1x_decompress.c:141:35: error: incompatible types for operation (>>) lib/lzo1x/lzo1x_decompress.c:141:35:left side has type restricted unsigned short [usertype] [force] lib/lzo1x/lzo1x_decompress.c:141:35:right side has type int lib/lzo1x/lzo1x_decompress.c:140:20: error: incompatible types for operation (-) lib/lzo1x/lzo1x_decompress.c:140:20:left side has type unsigned char *register [assigned] op lib/lzo1x/lzo1x_decompress.c:140:20:right side has type bad type lib/lzo1x/lzo1x_decompress.c:157:35: error: incompatible types for operation (>>) lib/lzo1x/lzo1x_decompress.c:157:35:left side has type restricted unsigned short [usertype] [force] lib/lzo1x/lzo1x_decompress.c:157:35:right side has type int lib/lzo1x/lzo1x_decompress.c:156:11: error: incompatible types for operation (-=) * using le16_to_cpu() lib/lzo1x/lzo1x_decompress.c:140:23: warning: cast to restricted type lib/lzo1x/lzo1x_decompress.c:156:14: warning: cast to restricted type But still, how do I get rid of these two warnings? or, do these warning suggest some real problem that still exist in code? - Nitin > For your ref (from big_endian.h): > # define __cpu_to_le16(x) ((__force __le16)__swab16((x))) > # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) Jan -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On May 30 2007 11:24, Nitin Gupta wrote: >> >> It is the bytestream (ip) that is reinterpreted as uint16_t. >> And I really doubt that the LZO author has a big-endian machine, >> given the days of ubiquitous x86. > >> le16_to_cpu it is. > > But then why you think it should be > le_16_cpu() -- how will this make any difference? Like I said, we are reinterpreting the byte stream as a uint16_t (= reading from bytestream to CPU). While writing out an uint16_t as a bytestream is cpu_to_le16. > For your ref (from big_endian.h): > # define __cpu_to_le16(x) ((__force __le16)__swab16((x))) > # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) Jan -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: [1] For your reference, here is the user code in question: ... if (srclen 2 !(data_in[1] PRESET_DICT) ((data_in[0] 0x0f) == Z_DEFLATED) !(((data_in[0]8) + data_in[1]) % 31)) { The funny thing here is that the author felt compelled to use a #defined constant for the dictionary bit (PRESET_DICT), but had no problem with a numeric constant to isolate the compression method (0x0f), or for that matter extracting the window bits from the header. The easy way to avoid the use of an internal zlib header file here is to simply replace PRESET_DICT with 0x20. That constant will never change -- it is part of the definition of the zlib header in RFC 1950. The slightly more involved patch to avoid the problem is to let inflate() do all that work for you, including the integrity check on the zlib header (% 31). Also this corrects an error in the original code, which is that it continues to try to decompress after finding that a dictionary is needed or that the zlib header is invalid. In this version, a bad header simply returns an error: /* provide input data and output space */ inf_strm.next_in = data_in; inf_strm.avail_in = srclen; inf_strm.next_out = cpage_out; inf_strm.avail_out = destlen; /* verify and skip zlib header (this updates next_in and avail_in) */ inf_strm.zalloc = Z_NULL; inf_strm.zfree = Z_NULL; inf_strm.opaque = Z_NULL; if (zlib_inflateInit(inf_strm) != Z_OK) { printk(KERN_WARNING inflateInit failed\n); return 1; } ret = zlib_inflate(inf_strm, Z_BLOCK); if (ret != Z_OK || (inf_strm.data_type 0x80) == 0) { printk(KERN_WARNING inflate failed on zlib header\n); return 1; } zlib_inflateEnd(inf_strm); /* do raw inflate (no adler32) on deflate data after zlib header */ inf_strm.zalloc = Z_NULL; inf_strm.zfree = Z_NULL; inf_strm.opaque = Z_NULL; if (zlib_inflateInit2(inf_strm, -MAX_WBITS) != Z_OK) { printk(KERN_WARNING inflateInit failed\n); return 1; } while ((ret = zlib_inflate ... Mark - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Wednesday 30 May 2007 19:02:28 Mark Adler wrote: On May 30, 2007, at 6:30 AM, Satyam Sharma wrote: [1] For your reference, here is the user code in question: ... if (srclen 2 !(data_in[1] PRESET_DICT) ((data_in[0] 0x0f) == Z_DEFLATED) !(((data_in[0]8) + data_in[1]) % 31)) { The funny thing here is that the author felt compelled to use a #defined constant for the dictionary bit (PRESET_DICT), but had no problem with a numeric constant to isolate the compression method (0x0f), or for that matter extracting the window bits from the header. The easy way to avoid the use of an internal zlib header file here is to simply replace PRESET_DICT with 0x20. That constant will never change -- it is part of the definition of the zlib header in RFC 1950. That was one of my original suggestions, though I personally downchecked it (and believe I also did so in the mail with the suggestions) because of the dislike of Magic Numbers in the kernel. However, I had only looked at a small part of the JFFS2 code to see what it used zutil.h for - if it is using said 'Magic Numbers' in the rest of the code, the consistent way to fix this would be to replace the PRESET_DICT macro with the constant it stands for. I've gotten a bit caught-up in extending the test-bed code I used to benchmark Nitin's 'tinyLZO' implementation, but I will see about getting a patch together to change JFFS2 in the manner Mark suggests. (Unless, of course, someone gets to it before I do). DRH The slightly more involved patch to avoid the problem is to let inflate() do all that work for you, including the integrity check on the zlib header (% 31). Also this corrects an error in the original code, which is that it continues to try to decompress after finding that a dictionary is needed or that the zlib header is invalid. In this version, a bad header simply returns an error: /* provide input data and output space */ inf_strm.next_in = data_in; inf_strm.avail_in = srclen; inf_strm.next_out = cpage_out; inf_strm.avail_out = destlen; /* verify and skip zlib header (this updates next_in and avail_in) */ inf_strm.zalloc = Z_NULL; inf_strm.zfree = Z_NULL; inf_strm.opaque = Z_NULL; if (zlib_inflateInit(inf_strm) != Z_OK) { printk(KERN_WARNING inflateInit failed\n); return 1; } ret = zlib_inflate(inf_strm, Z_BLOCK); if (ret != Z_OK || (inf_strm.data_type 0x80) == 0) { printk(KERN_WARNING inflate failed on zlib header\n); return 1; } zlib_inflateEnd(inf_strm); /* do raw inflate (no adler32) on deflate data after zlib header */ inf_strm.zalloc = Z_NULL; inf_strm.zfree = Z_NULL; inf_strm.opaque = Z_NULL; if (zlib_inflateInit2(inf_strm, -MAX_WBITS) != Z_OK) { printk(KERN_WARNING inflateInit failed\n); return 1; } while ((ret = zlib_inflate ... Mark - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On May 30 2007 11:24, Nitin Gupta wrote: It is the bytestream (ip) that is reinterpreted as uint16_t. And I really doubt that the LZO author has a big-endian machine, given the days of ubiquitous x86. le16_to_cpu it is. But then why you think it should be le_16_cpu() -- how will this make any difference? Like I said, we are reinterpreting the byte stream as a uint16_t (= reading from bytestream to CPU). While writing out an uint16_t as a bytestream is cpu_to_le16. For your ref (from big_endian.h): # define __cpu_to_le16(x) ((__force __le16)__swab16((x))) # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) Jan -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Jan Engelhardt [EMAIL PROTECTED] wrote: On May 30 2007 11:24, Nitin Gupta wrote: It is the bytestream (ip) that is reinterpreted as uint16_t. And I really doubt that the LZO author has a big-endian machine, given the days of ubiquitous x86. le16_to_cpu it is. But then why you think it should be le_16_cpu() -- how will this make any difference? Like I said, we are reinterpreting the byte stream as a uint16_t (= reading from bytestream to CPU). While writing out an uint16_t as a bytestream is cpu_to_le16. I didn't properly understand this. But you seem to be correct - now I ran sparse check: * using cpu_to_le16() lib/lzo1x/lzo1x_decompress.c:141:35: error: incompatible types for operation () lib/lzo1x/lzo1x_decompress.c:141:35:left side has type restricted unsigned short [usertype] [force] noident lib/lzo1x/lzo1x_decompress.c:141:35:right side has type int lib/lzo1x/lzo1x_decompress.c:140:20: error: incompatible types for operation (-) lib/lzo1x/lzo1x_decompress.c:140:20:left side has type unsigned char *register [assigned] op lib/lzo1x/lzo1x_decompress.c:140:20:right side has type bad type lib/lzo1x/lzo1x_decompress.c:157:35: error: incompatible types for operation () lib/lzo1x/lzo1x_decompress.c:157:35:left side has type restricted unsigned short [usertype] [force] noident lib/lzo1x/lzo1x_decompress.c:157:35:right side has type int lib/lzo1x/lzo1x_decompress.c:156:11: error: incompatible types for operation (-=) * using le16_to_cpu() lib/lzo1x/lzo1x_decompress.c:140:23: warning: cast to restricted type lib/lzo1x/lzo1x_decompress.c:156:14: warning: cast to restricted type But still, how do I get rid of these two warnings? or, do these warning suggest some real problem that still exist in code? - Nitin For your ref (from big_endian.h): # define __cpu_to_le16(x) ((__force __le16)__swab16((x))) # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) Jan -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Wednesday 30 May 2007 01:31:19 Mark Adler wrote: On May 29, 2007, at 8:15 AM, Satyam Sharma wrote: skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... You don't need to dissect the header manually to look for that bit. If you feed inflate() at least the first two bytes, it will return immediately with the Z_NEED_DICT return code if a preset dictionary is requested. You can force inflate() to return immediately after decoding the two byte header even if a preset dictionary is not requested by using the Z_BLOCK flush code. Mark I think that JFFS2 is doing the work of looking for PRESET_DICT itself not only to skip the checksum calc, but also to lose the function-call overhead. Yes, the performance difference shouldn't be all that great, but might have made a difference in some test. When I'm done working on the benchmark/testbed for the LZO code proposed for inclusion in the kernel I'll do some testing and see how large the difference is in userspace. (for most code the differences hold true, though usually with less drastic numbers, in kernel) DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Hi Mark, On 5/30/07, Mark Adler [EMAIL PROTECTED] wrote: On May 29, 2007, at 8:15 AM, Satyam Sharma wrote: skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... You don't need to dissect the header manually to look for that bit. If you feed inflate() at least the first two bytes, it will return immediately with the Z_NEED_DICT return code if a preset dictionary is requested. You can force inflate() to return immediately after decoding the two byte header even if a preset dictionary is not requested by using the Z_BLOCK flush code. Thanks for replying, unfortunately I don't know either zlib or jffs2 code deeply enough to actually understand what you meant here. Are you saying that the if-else block in question [1] in fs/jffs2/compr_zlib.c:jffs2_zlib_decompress() is unnecessary and can be done away with? Thanks, Satyam [1] For your reference, here is the user code in question: inf_strm.next_in = data_in; inf_strm.avail_in = srclen; inf_strm.total_in = 0; inf_strm.next_out = cpage_out; inf_strm.avail_out = destlen; inf_strm.total_out = 0; int wbits = MAX_WBITS; /* If it's deflate, and it's got no preset dictionary, then we can tell zlib to skip the adler32 check. */ if (srclen 2 !(data_in[1] PRESET_DICT) ((data_in[0] 0x0f) == Z_DEFLATED) !(((data_in[0]8) + data_in[1]) % 31)) { D2(printk(KERN_DEBUG inflate skipping adler32\n)); wbits = -((data_in[0] 4) + 8); inf_strm.next_in += 2; inf_strm.avail_in -= 2; } else { /* Let this remain D1 for now -- it should never happen */ D1(printk(KERN_DEBUG inflate not skipping adler32\n)); } if (zlib_inflateInit2(inf_strm, wbits) != Z_OK) { printk(KERN_WARNING inflateInit failed\n); return 1; } while((ret = zlib_inflate(inf_strm, Z_FINISH)) == Z_OK) ; if (ret != Z_STREAM_END) printk(KERN_NOTICE inflate returned %d\n, ret); zlib_inflateEnd(inf_strm); - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Satyam Sharma wrote: Hmmm, either jffs2 thinks the zlib interfaces exposed through include/linux/zlib.h are insufficient, or it isn't using zlib properly (or at least the way it was supposed to be used :-) I do not remember many details, but yes, JFFS2 uses zlib trickily. Traditionally, zlib is used like this: you have an input buffer, and you have a large enough out put buffer, you compress whole input buffer to the output buffer. JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and it wants zlib to compress as much as possible from the input buffer, and make the output buffer full or nearly full of compressed data. This is why JFFS2 uses different hacks, but I do not remember details anymore. Otherwise it could just use cryptoapi instead. In fact, I tried this 2 years ago, but gave up: http://lkml.org/lkml/2005/3/25/104 -- Best Regards, Artem Bityutskiy (Артём Битюцкий) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Wed, May 30, 2007, Nitin Gupta wrote: Again, all the original code has been retained _as-is_. Whatever was changed, has been mentioned in that detailed changelog that I post along with patch. Just a general remark (I haven't been following this thread closely): IMHO it would be _much_ better to merge the original code and your changes as seperate patches. Then someone who wants to review it later doesn't have to jump through all the hoops of finding the original code himself to diff it and see your changes. Additionally, you should also split stylistic/cleanup changes like Reformatted the code to match general kernel style from functional changes like use cpu_to_le16(). Ideally each of the changes you mention in your Changelog vs. original LZO should be a seperate patch, this would make review much easier. Regards, Johannes - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Johannes Stezenbach [EMAIL PROTECTED] wrote: On Wed, May 30, 2007, Nitin Gupta wrote: Again, all the original code has been retained _as-is_. Whatever was changed, has been mentioned in that detailed changelog that I post along with patch. Just a general remark (I haven't been following this thread closely): IMHO it would be _much_ better to merge the original code and your changes as seperate patches. Then someone who wants to review it later doesn't have to jump through all the hoops of finding the original code himself to diff it and see your changes. Additionally, you should also split stylistic/cleanup changes like Reformatted the code to match general kernel style from functional changes like use cpu_to_le16(). Ideally each of the changes you mention in your Changelog vs. original LZO should be a seperate patch, this would make review much easier. I violently agree with this method of going forward. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Artem Bityutskiy wrote: JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and it wants zlib to compress as much as possible from the input buffer, and make the output buffer full or nearly full of compressed data. Err, and important note is that it also wants this compressed data to be independently uncompressable. -- Best Regards, Artem Bityutskiy (Артём Битюцкий) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Hi Artem, On 5/30/07, Artem Bityutskiy [EMAIL PROTECTED] wrote: Err, and important note is that it also wants this compressed data to be independently uncompressable. Hmm, okay. But what I meant was that if jffs2's needs are standard enough in the sense that they could conceivably be required by other users too (and this one mentioned by you does appear to be one of those), then why not make such hacks (if they are necessary and suitable indeed, lets wait for Mark's response) part of zlib itself? Thanks, Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
Satyam Sharma wrote: Hi Artem, On 5/30/07, Artem Bityutskiy [EMAIL PROTECTED] wrote: Err, and important note is that it also wants this compressed data to be independently uncompressable. Hmm, okay. But what I meant was that if jffs2's needs are standard enough in the sense that they could conceivably be required by other users too (and this one mentioned by you does appear to be one of those), then why not make such hacks (if they are necessary and suitable indeed, lets wait for Mark's response) part of zlib itself? I really not sure - I vaguely remember that zlib just cannot provide this capability. And what JFFS2 does is not always correct. But again, it was so long time ago... -- Best Regards, Artem Bityutskiy (Артём Битюцкий) - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On May 29, 2007, at 8:15 AM, Satyam Sharma wrote: skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... You don't need to dissect the header manually to look for that bit. If you feed inflate() at least the first two bytes, it will return immediately with the Z_NEED_DICT return code if a preset dictionary is requested. You can force inflate() to return immediately after decoding the two byte header even if a preset dictionary is not requested by using the Z_BLOCK flush code. Mark - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote: On May 28 2007 20:04, Nitin Gupta wrote: > > * Changelog vs. original LZO: > 1) Used standard/kernel defined data types: (this eliminated _huge_ > #ifdef chunks) > lzo_bytep -> unsigned char * > lzo_uint -> size_t > lzo_xint -> size_t Is this safe (as far as compressed LZO stream is concerned) -- or is it even needed (could it be unsigned int)? > - m_pos -= (*(const unsigned short *)ip) >> 2; > -#else > - m_pos = op - 1; > - m_pos -= (ip[0] >> 2) + (ip[1] << 6); > -#endif > > + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2); > > (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). > *** Need testing on big endian machine *** On i386, both cpu_to_le16 and le16_to_cpu do nothing. On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-) they swap 1234<->4321. It is the bytestream (ip) that is reinterpreted as uint16_t. And I really doubt that the LZO author has a big-endian machine, given the days of ubiquitous x86. le16_to_cpu it is. I just missed the point that leXX_to_cpu() and cpu_to_leXX() are identical on BE machine anyway. But then why you think it should be le_16_cpu() -- how will this make any difference? For your ref (from big_endian.h): #define __cpu_to_le16(x) ((__force __le16)__swab16((x))) #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote: > > On May 28 2007 19:11, Adrian Bunk wrote: You completely miss the point of my question. It's about the performance improvements of the modified code that were mentioned. What you are talking about shouldn't have any effect on the generated code. After Daniel refined this testing program, we can see that perf gain is < 2% which can surely be accounted to cleanups like unnecessary castings in tight loops. Again, all the original code has been retained _as-is_. Whatever was changed, has been mentioned in that detailed changelog that I post along with patch. If someone want to review these 500 lines with this changelog in hand, it should not take more than couple of hours. If no-one can see any problem in code by now and considering that it's tested on x86(_32), amd64, ppc and giving somewhat better perf. than original then I believe it is unnecessarily hanging outside of -mm tree. I also contacted author (Markus Oberhumer) regarding above changes but he seems not be responding right now. But still, if it gets into -mm and gets used by various related projects then it should be good enough for mainline also. Cheers, Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote: I just noticed a bug in my testbed/benchmarking code. It's fixed, but I decided to compare version 6 of the code against the *unsafe* decompressor again. The results of the three runs I've put it through after changing it to compare against the unsafe decompressor were startling. `Tiny's` compressor is still faster - I've seen it be rated up to 3% faster. The decompressor, OTOH, when compared to the unsafe version (which is the comparison that started me on this binge of hacking), is more than 7% worse. About 11% slower on the original test against a C source file, and about 6% slower for random data. Unsafe vs safe is within 10%. Its okay. However, looking at the numbers involved, I can't see a reason to keep the unsafe version around - the percentages look worse than they are - from 1 to 3 microseconds. Not just numbers. Most of applications cannot afford to use unsafe versions anyway (like fs people). (well, the compressed-cache people might want those extra usecs - but the difference will never be noticeable anywhere outside the kernel) DRH compressed cache people require every single percent of that performance. For now, ccaching is not ready for mainline (many things need to be done). So, till then I will keep off the unsafe version. If ever compressed caching is on its way to mainline _then_ I will try and add back the unsafe version. But I see no other project that really cares about unsafe version so it's okay to keep it off. - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
I just noticed a bug in my testbed/benchmarking code. It's fixed, but I decided to compare version 6 of the code against the *unsafe* decompressor again. The results of the three runs I've put it through after changing it to compare against the unsafe decompressor were startling. `Tiny's` compressor is still faster - I've seen it be rated up to 3% faster. The decompressor, OTOH, when compared to the unsafe version (which is the comparison that started me on this binge of hacking), is more than 7% worse. About 11% slower on the original test against a C source file, and about 6% slower for random data. However, looking at the numbers involved, I can't see a reason to keep the unsafe version around - the percentages look worse than they are - from 1 to 3 microseconds. (well, the compressed-cache people might want those extra usecs - but the difference will never be noticeable anywhere outside the kernel) DRH lzo1x-test-6a.tar.bz2 Description: application/tbz
Re: [RFC] LZO de/compression support - take 6
All problems I was having with the test-bed code have been solved, and the error I was running into was, as I suspected, in the code I used to fill the buffer for the random-data test. Results of running the new benchmark (version 6 of the benchmark, version 6 of 'tinyLZO'): 1 run averages: 'Tiny LZO': Combined: 104.1529 usec Compression: 43.065 usec Decompression: 18.4648 usec Random Data Compression: 31.132 usec Random Data Decompression: 11.4911 usec 'miniLZO': Combined: 106.1363 usec Compression: 44.6988 usec Decompression: 18.3576 usec Random Data Compression: 31.5772 usec Random Data Decompression: 11.5027 usec Percentages (calculated as: ((full - tiny)/full)*100): Overall (combined totals): Tiny is 1.87 % faster Compression: Tiny is 3.66 % faster Decompression: Tiny is 0.58 % slower Random Compression: Tiny is 1.41 % faster Random Decompression: Tiny is 0.10 % faster The results, on my system, are not consistent, except in that 'TinyLZO' is generally faster in the non-random data tests than miniLZO. It did, three of the five times I ran the test, perform (on average) about 1% worse than miniLZO. In order to sidestep any issues that a difference in the input data might have caused, I'm going to rewrite the code so that all tests are run against the same data. However, in the meantime, I've attached the latest version of my test-code. DRH PS: the code is going to massively change as I work to include more data sources for the benchmarking, as well as tests that will try to really stress the stripped-down code. lzo1x-test-6.tar.bz2 Description: application/tbz
Re: [RFC] LZO de/compression support - take 6
On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote: > > On May 28 2007 19:11, Adrian Bunk wrote: > > > >I have not seen any explanations: > >- Why did the upstream author write the code that way? > > I guess it's along the lines of > - portability > > (note how this contradicts itself). Really. I have yet to > figure out why everyone invents their own xxx32_t types, > like e.g. glib. well, integer types, one might understand, but when it > comes to gchar or gpointer, that's just plain microsoft-style > (think LPCSTR and LPCVOID...) You completely miss the point of my question. It's about the performance improvements of the modified code that were mentioned. What you are talking about shouldn't have any effect on the generated code. > Jan cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On May 28 2007 19:11, Adrian Bunk wrote: > >I have not seen any explanations: >- Why did the upstream author write the code that way? I guess it's along the lines of - portability (note how this contradicts itself). Really. I have yet to figure out why everyone invents their own xxx32_t types, like e.g. glib. well, integer types, one might understand, but when it comes to gchar or gpointer, that's just plain microsoft-style (think LPCSTR and LPCVOID...) Jan -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On May 28 2007 20:04, Nitin Gupta wrote: > > * Changelog vs. original LZO: > 1) Used standard/kernel defined data types: (this eliminated _huge_ > #ifdef chunks) > lzo_bytep -> unsigned char * > lzo_uint -> size_t > lzo_xint -> size_t Is this safe (as far as compressed LZO stream is concerned) -- or is it even needed (could it be unsigned int)? > - m_pos -= (*(const unsigned short *)ip) >> 2; > -#else > - m_pos = op - 1; > - m_pos -= (ip[0] >> 2) + (ip[1] << 6); > -#endif > > + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2); > > (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). > *** Need testing on big endian machine *** On i386, both cpu_to_le16 and le16_to_cpu do nothing. On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-) they swap 1234<->4321. It is the bytestream (ip) that is reinterpreted as uint16_t. And I really doubt that the LZO author has a big-endian machine, given the days of ubiquitous x86. le16_to_cpu it is. Jan -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 16:14:34 Daniel Hazelton wrote: > On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote: > > On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote: > > > tested this on ppc and its still good > > > > > > is there any reason to bother with a test on amd64? > > > if there is I might be able to get to it tonight > > > > Yes, this test is desired on 'take 6' > > (In future I will append version to patch bz2) > > > > Thanks, > > Nitin > > I added a few functions to my test-bed to see how well the LZO code coped > with random input data. I'll extend my error checking so that it dumps all > local variables at the first error, but for the time being I'm staring at a > problem with what appears to be a double-free (in addition to > lzo1x_decompress(_safe) screaming about overruns when I pass it the buffer > of compressed, totally random data). > > (I fill the buffer by doing a read from /dev/urandom, though I'll certainly > change it to use random() if anyone thinks that'll make a difference) > > I'll post the version with the random-data tests when I get the double-free > solved. (I've looked and don't see how its happening) > > DRH After some additional debugging the problem appears to be that both lzo1x_decompress_safe (from miniLZO) and lzo1x_decompress (from the 'Tiny' version) both are somehow corrupting memory. *** glibc detected *** ./tinytest: double free or corruption (out): 0x08070e60 *** The buffer at 0x08070e60 is the one I allocated specifically for the new random data tests. It has the same location when it is allocated as it is when free()'d, so the corruption has to be coming from somewhere. The problem *might* be in my code - that is, I might have made a mistake in the read() from /dev/urandom. I'm going to go test that now. DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote: > On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote: > > tested this on ppc and its still good > > > > is there any reason to bother with a test on amd64? > > if there is I might be able to get to it tonight > > Yes, this test is desired on 'take 6' > (In future I will append version to patch bz2) > > Thanks, > Nitin I added a few functions to my test-bed to see how well the LZO code coped with random input data. I'll extend my error checking so that it dumps all local variables at the first error, but for the time being I'm staring at a problem with what appears to be a double-free (in addition to lzo1x_decompress(_safe) screaming about overruns when I pass it the buffer of compressed, totally random data). (I fill the buffer by doing a read from /dev/urandom, though I'll certainly change it to use random() if anyone thinks that'll make a difference) I'll post the version with the random-data tests when I get the double-free solved. (I've looked and don't see how its happening) DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Tuesday 29 May 2007 11:15:41 Satyam Sharma wrote: > [ Trimmed Cc list; added original zlib authors. ] > > > On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote: > > > On 29 May 2007, at 12:27, Satyam Sharma wrote: > > > > Right, actually, zlib could be switched over to [using a common > > > > directory]. > > > > Because zlib_deflate/ and zlib_inflate/ too share a private header > > > > zutil.h which has unfortunately been stuck into include/linux/ with > > > > a big > > > > /* WARNING: this file should *not* be used by applications. */ > > > > comment ... > > > > > > Well, unfortunately zutil.h is currently being used in-kernel by fs/ > > > jffs2/compr_zlib.c despite the "WARNING: this file should *not* be > > > used by applications" notice. > > Hmmm, either jffs2 thinks the zlib interfaces exposed through > include/linux/zlib.h are insufficient, or it isn't using zlib properly > (or at least the way it was supposed to be used :-) > > Looking at fs/jffs2/compr_zlib.c, however, it just seems to be > an optimization -- skipping some checksum calculation if some > flag (PRESET_DICT) is absent from the input stream about to > be decompressed ... > > This /looks/ like an optimization that might make sense for other > users of zlib too, in which case that entire check and > skip-the-checksum thing could actually be put into zlib proper. (?) This *is* a good idea. If someone doesn't beat me to it I'll make sure I've got the latest git and do up a patch that does this. DRH > > > So moving this header to a truly private location isn't possible > > > right now, unfortunately, > > > > > > Michael-Luke Jones > > > [added David Whitehouse to cc] > > On 5/29/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote: > > I've looked at that code and it seems to need that file for one constant. > > Perhaps it'd be better for jffs2/compr_zlib.c to define that constant > > itself (or use it as a "Magic Number") rather than include the zlib > > private header. Another possibility would be to move that constant out of > > zutil.h and into zconf.h or zlih.b - doing any of those would allow the > > zlib private header to be moved such that zlib could be changed to use a > > common directory *and* have said private header in that directory. > > Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from > being able to resolve that macro. FWIW, a dirty hack could be to > simply duplicate it in zlib.h. But then, implementation and interface > are decoupled for good reasons ... > > Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
[ Trimmed Cc list; added original zlib authors. ] On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote: > On 29 May 2007, at 12:27, Satyam Sharma wrote: > > Right, actually, zlib could be switched over to [using a common > > directory]. > > Because zlib_deflate/ and zlib_inflate/ too share a private header > > zutil.h which has unfortunately been stuck into include/linux/ with > > a big > > /* WARNING: this file should *not* be used by applications. */ > > comment ... > > Well, unfortunately zutil.h is currently being used in-kernel by fs/ > jffs2/compr_zlib.c despite the "WARNING: this file should *not* be > used by applications" notice. Hmmm, either jffs2 thinks the zlib interfaces exposed through include/linux/zlib.h are insufficient, or it isn't using zlib properly (or at least the way it was supposed to be used :-) Looking at fs/jffs2/compr_zlib.c, however, it just seems to be an optimization -- skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... This /looks/ like an optimization that might make sense for other users of zlib too, in which case that entire check and skip-the-checksum thing could actually be put into zlib proper. (?) > So moving this header to a truly private location isn't possible > right now, unfortunately, > > Michael-Luke Jones > [added David Whitehouse to cc] On 5/29/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote: I've looked at that code and it seems to need that file for one constant. Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself (or use it as a "Magic Number") rather than include the zlib private header. Another possibility would be to move that constant out of zutil.h and into zconf.h or zlih.b - doing any of those would allow the zlib private header to be moved such that zlib could be changed to use a common directory *and* have said private header in that directory. Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from being able to resolve that macro. FWIW, a dirty hack could be to simply duplicate it in zlib.h. But then, implementation and interface are decoupled for good reasons ... Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote: > On 29 May 2007, at 12:27, Satyam Sharma wrote: > > Right, actually, zlib could be switched over to [using a common > > directory]. > > Because zlib_deflate/ and zlib_inflate/ too share a private header > > zutil.h which has unfortunately been stuck into include/linux/ with > > a big > > /* WARNING: this file should *not* be used by applications. */ > > comment ... > > Well, unfortunately zutil.h is currently being used in-kernel by fs/ > jffs2/compr_zlib.c despite the "WARNING: this file should *not* be > used by applications" notice. > > So moving this header to a truly private location isn't possible > right now, unfortunately, > > Michael-Luke Jones > [added David Whitehouse to cc] I've looked at that code and it seems to need that file for one constant. Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself (or use it as a "Magic Number") rather than include the zlib private header. Another possibility would be to move that constant out of zutil.h and into zconf.h or zlih.b - doing any of those would allow the zlib private header to be moved such that zlib could be changed to use a common directory *and* have said private header in that directory. DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On 29 May 2007, at 12:27, Satyam Sharma wrote: Right, actually, zlib could be switched over to [using a common directory]. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Well, unfortunately zutil.h is currently being used in-kernel by fs/ jffs2/compr_zlib.c despite the "WARNING: this file should *not* be used by applications" notice. So moving this header to a truly private location isn't possible right now, unfortunately, Michael-Luke Jones [added David Whitehouse to cc] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 08:03:55 Nitin Gupta wrote: > On 5/29/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > > On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote: > > > On 28 May 2007, at 18:11, Adrian Bunk wrote: > > >> I have not seen any explanations: > > >> - Why did the upstream author write the code that way? > > > > > > Apparently due to his requirement for extreme portability. The original > > > code was designed to work on everything from 16-bit DOS through CRAY > > > supercomputers through Windows, Unices and Linux. > > > > Sure, this could be the reason in some or all cases. > > > > The upstream author knows the code best, and discussing such issues with > > him will in many cases be a win: > > > > It could be that there was in some cases no good reason, and the > > upstream code that gets used by many other projects could become faster. > > > > Or there was a good reason that applies also to the in-kernel version > > and a change breaks some corner case. > > I have mailed the author with detailed changelog - waiting for reply. > > > > The author has stated on the thread that it's a good idea to remove > > > unnecessary ifdefs when porting the code into the kernel, given that > > > the portability requirements are obviously no longer needed. > > > > "remove unnecessary ifdefs" implies "generated code is identical". > > > > That's quite different from "code is 10% faster". > > Daniel made some changes to his testing code and now the perf gain is just > 1.6%. > > - Nitin The changes for version 5 are all in one file - helpers.h. All I did was define the macros that had previously been NOP's. Most of the performance loss, IMHO, is due to the compression and decompression functions being marked noinline. The currently reported 1.6% overall performance boost is likely related to the use of likely()/unlikely(). DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 01:48:29 Nitin Gupta wrote: > On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote: > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > > > Hi, > > > > > > Attached is tester code used for testing. > > > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > > > version for 'TinyLZO') > > > > > > Cheers, > > > Nitin > > > > > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > > > > (Using tester program from Daniel) > > > > > > > > Following compares this kernel port ('take 6') vs original miniLZO > > > > code: > > > > > > > > 'TinyLZO' refers to this kernel port. > > > > > > > > 1 run averages: > > > > 'Tiny LZO': > > > >Combined: 61.2223 usec > > > >Compression: 41.8412 usec > > > >Decompression: 19.3811 usec > > > > 'miniLZO': > > > >Combined: 66.0444 usec > > > >Compression: 46.6323 usec > > > >Decompression: 19.4121 usec > > > > > > > > Result: > > > > Overall: TinyLZO is 7.3% faster > > > > Compressor: TinyLZO is 10.2% faster > > > > Decompressor: TinyLZO is 0.15% faster > > > > 1 run averages: > > 'Tiny LZO': > > Combined: 112.6642 usec > > Compression: 56.3321 usec > > Decompression: 56.3321 usec > > 'miniLZO': > > Combined: 77.6642 usec > > Compression: 56.3416 usec > > Decompression: 21.3226 usec > > > > now the interesting bit I thought was the following > > [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest > > [test_lzo.c::compress (93)] run took 42 microseconds > > [test_lzo.c::decompress (127)] run took 20 microseconds > > [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest > > [test.c::compress (91)] run took 44 microseconds > > [test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 ) > > [test.c::main (149)] BUG: Decompression routine failure > > Did you use x86 for above test? Maybe some problem with testing > script? What data did you use for this test? > > > - Nitin Using any version of the test code below 5 on a non-LE system would result in this error. Those earlier versions couldn't handle BE systems because I was too lazy to define cpu_to_le16. (And yes, Nitin, that does appear to be the correct conversion to use) DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/29/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote: > On 28 May 2007, at 18:11, Adrian Bunk wrote: > >> I have not seen any explanations: >> - Why did the upstream author write the code that way? > > Apparently due to his requirement for extreme portability. The original > code was designed to work on everything from 16-bit DOS through CRAY > supercomputers through Windows, Unices and Linux. Sure, this could be the reason in some or all cases. The upstream author knows the code best, and discussing such issues with him will in many cases be a win: It could be that there was in some cases no good reason, and the upstream code that gets used by many other projects could become faster. Or there was a good reason that applies also to the in-kernel version and a change breaks some corner case. I have mailed the author with detailed changelog - waiting for reply. > The author has stated on the thread that it's a good idea to remove > unnecessary ifdefs when porting the code into the kernel, given that the > portability requirements are obviously no longer needed. "remove unnecessary ifdefs" implies "generated code is identical". That's quite different from "code is 10% faster". Daniel made some changes to his testing code and now the perf gain is just 1.6%. - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote: > On 28 May 2007, at 18:11, Adrian Bunk wrote: > >> I have not seen any explanations: >> - Why did the upstream author write the code that way? > > Apparently due to his requirement for extreme portability. The original > code was designed to work on everything from 16-bit DOS through CRAY > supercomputers through Windows, Unices and Linux. Sure, this could be the reason in some or all cases. The upstream author knows the code best, and discussing such issues with him will in many cases be a win: It could be that there was in some cases no good reason, and the upstream code that gets used by many other projects could become faster. Or there was a good reason that applies also to the in-kernel version and a change breaks some corner case. > The author has stated on the thread that it's a good idea to remove > unnecessary ifdefs when porting the code into the kernel, given that the > portability requirements are obviously no longer needed. "remove unnecessary ifdefs" implies "generated code is identical". That's quite different from "code is 10% faster". > Michael-Luke cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Makefile question (was [RFC] LZO de/compression support - take 6)
On 5/29/07, Michael-Luke Jones <[EMAIL PROTECTED]> wrote: On 29 May 2007, at 11:41, Satyam Sharma wrote: > This is syntactically correct (and wouldn't produce any build errors), > but it's quite ... strange, still. Why would I even want to /build/ > the > compress code if all I selected was the decompress option? Apologies, you gave me the answer I was looking for (make is 'intelligent' enough not to build the same file twice in this situation...) but I think you may have missed the makefile in the lzo1x directory: > diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile > new file mode 100644 > index 000..7b56a4d > --- /dev/null > +++ b/lib/lzo1x/Makefile > @@ -0,0 +1,2 @@ > +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o > +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o Ah, yes, I only saw the mail and didn't check lib/lzo1x/Makefile, thanks. So effectively, we've done precisely what zlib does too ... only, in a smarter way! Thus, the Kconfig options for compress/decompress won't be simply 'dummy' options... Given the shared private header between the compress and decompress code, I don't think there is any need to separate the code into two directories a la the zlib code. Right, actually, zlib could be switched over to this style, in fact. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Makefile question (was [RFC] LZO de/compression support - take 6)
On 29 May 2007, at 11:41, Satyam Sharma wrote: This is syntactically correct (and wouldn't produce any build errors), but it's quite ... strange, still. Why would I even want to /build/ the compress code if all I selected was the decompress option? Apologies, you gave me the answer I was looking for (make is 'intelligent' enough not to build the same file twice in this situation...) but I think you may have missed the makefile in the lzo1x directory: diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile new file mode 100644 index 000..7b56a4d --- /dev/null +++ b/lib/lzo1x/Makefile @@ -0,0 +1,2 @@ +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o Thus, the Kconfig options for compress/decompress won't be simply 'dummy' options... Given the shared private header between the compress and decompress code, I don't think there is any need to separate the code into two directories a la the zlib code. Sorry for noise, Michael-Luke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Makefile question (was [RFC] LZO de/compression support - take 6)
On 5/29/07, Michael-Luke Jones <[EMAIL PROTECTED]> wrote: On 28 May 2007, at 15:34, Nitin Gupta wrote: > diff --git a/lib/Kconfig b/lib/Kconfig > index 2e7ae6b..eb95eaa 100644 > --- a/lib/Kconfig > +++ b/lib/Kconfig > @@ -64,6 +64,12 @@ config ZLIB_INFLATE > config ZLIB_DEFLATE > tristate > > +config LZO1X_COMPRESS > + tristate > + > +config LZO1X_DECOMPRESS > + tristate > + > # > # Generic allocator support is selected if needed > # > diff --git a/lib/Makefile b/lib/Makefile > index c8c8e20..448ae37 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o > obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/ > obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/ > obj-$(CONFIG_REED_SOLOMON) += reed_solomon/ > +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/ > +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/ > > obj-$(CONFIG_TEXTSEARCH) += textsearch.o > obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o Hey there, Is this syntax OK for Makefile use? I'm only asking out of personal ignorance and because I'd use: > +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/ > +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/ This is syntactically correct (and wouldn't produce any build errors), but it's quite ... strange, still. Why would I even want to /build/ the compress code if all I selected was the decompress option? Also, when both the options are y, make would just make the first target, then recurse back into that same directory again and skip it because it just made it ... which, although harmless, is ... strange. So I'm not sure if we really want both the options to resolve to the same make target -- the kernel typically has more users of _decompress_ than those of _compress_, so it does make some sense to completely decouple the two into separate directories in lib/ and have separate Kconfig options (as well as Makefile target subdirs) for them. Kconfig: config LZO1X_COMPRESS select LZO1X tristate config LZO1X_DECOMPRESS select LZO1X tristate config LZO1X tristate (or boolean??) Hmmm ... if this is /really/ what we want to do, what's wrong with simply: config LZO1x tristate ? Makefile: obj-$(CONFIG_LZO1X) += lzo1x/ I find nothing wrong with the style being used for ZLIB presently. Agreed, users of ZLIB do typically want to select only decompress *without* compress whereas there's no such user for LZO currently, but I see nothing wrong in just copying over ZLIB's style for LZO too (i.e. decoupling the compress and decompress code into separate subdirs in lib/). And if not, then why have two separate (dummy) Kconfig options either? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Makefile question (was [RFC] LZO de/compression support - take 6)
On 28 May 2007, at 15:34, Nitin Gupta wrote: diff --git a/lib/Kconfig b/lib/Kconfig index 2e7ae6b..eb95eaa 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -64,6 +64,12 @@ config ZLIB_INFLATE config ZLIB_DEFLATE tristate +config LZO1X_COMPRESS + tristate + +config LZO1X_DECOMPRESS + tristate + # # Generic allocator support is selected if needed # diff --git a/lib/Makefile b/lib/Makefile index c8c8e20..448ae37 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/ obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/ obj-$(CONFIG_REED_SOLOMON) += reed_solomon/ +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/ +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/ obj-$(CONFIG_TEXTSEARCH) += textsearch.o obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o Hey there, Is this syntax OK for Makefile use? I'm only asking out of personal ignorance and because I'd use: Kconfig: config LZO1X_COMPRESS select LZO1X tristate config LZO1X_DECOMPRESS select LZO1X tristate config LZO1X tristate (or boolean??) Makefile: obj-$(CONFIG_LZO1X) += lzo1x/ Thanks, Michael-Luke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 28 May 2007, at 18:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? Apparently due to his requirement for extreme portability. The original code was designed to work on everything from 16-bit DOS through CRAY supercomputers through Windows, Unices and Linux. The author has stated on the thread that it's a good idea to remove unnecessary ifdefs when porting the code into the kernel, given that the portability requirements are obviously no longer needed. Michael-Luke - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 28 May 2007, at 18:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? Apparently due to his requirement for extreme portability. The original code was designed to work on everything from 16-bit DOS through CRAY supercomputers through Windows, Unices and Linux. The author has stated on the thread that it's a good idea to remove unnecessary ifdefs when porting the code into the kernel, given that the portability requirements are obviously no longer needed. Michael-Luke - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Makefile question (was [RFC] LZO de/compression support - take 6)
On 28 May 2007, at 15:34, Nitin Gupta wrote: diff --git a/lib/Kconfig b/lib/Kconfig index 2e7ae6b..eb95eaa 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -64,6 +64,12 @@ config ZLIB_INFLATE config ZLIB_DEFLATE tristate +config LZO1X_COMPRESS + tristate + +config LZO1X_DECOMPRESS + tristate + # # Generic allocator support is selected if needed # diff --git a/lib/Makefile b/lib/Makefile index c8c8e20..448ae37 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/ obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/ obj-$(CONFIG_REED_SOLOMON) += reed_solomon/ +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/ +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/ obj-$(CONFIG_TEXTSEARCH) += textsearch.o obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o Hey there, Is this syntax OK for Makefile use? I'm only asking out of personal ignorance and because I'd use: Kconfig: config LZO1X_COMPRESS select LZO1X tristate config LZO1X_DECOMPRESS select LZO1X tristate config LZO1X tristate (or boolean??) Makefile: obj-$(CONFIG_LZO1X) += lzo1x/ Thanks, Michael-Luke - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Makefile question (was [RFC] LZO de/compression support - take 6)
On 5/29/07, Michael-Luke Jones [EMAIL PROTECTED] wrote: On 28 May 2007, at 15:34, Nitin Gupta wrote: diff --git a/lib/Kconfig b/lib/Kconfig index 2e7ae6b..eb95eaa 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -64,6 +64,12 @@ config ZLIB_INFLATE config ZLIB_DEFLATE tristate +config LZO1X_COMPRESS + tristate + +config LZO1X_DECOMPRESS + tristate + # # Generic allocator support is selected if needed # diff --git a/lib/Makefile b/lib/Makefile index c8c8e20..448ae37 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/ obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/ obj-$(CONFIG_REED_SOLOMON) += reed_solomon/ +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/ +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/ obj-$(CONFIG_TEXTSEARCH) += textsearch.o obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o Hey there, Is this syntax OK for Makefile use? I'm only asking out of personal ignorance and because I'd use: +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/ +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/ This is syntactically correct (and wouldn't produce any build errors), but it's quite ... strange, still. Why would I even want to /build/ the compress code if all I selected was the decompress option? Also, when both the options are y, make would just make the first target, then recurse back into that same directory again and skip it because it just made it ... which, although harmless, is ... strange. So I'm not sure if we really want both the options to resolve to the same make target -- the kernel typically has more users of _decompress_ than those of _compress_, so it does make some sense to completely decouple the two into separate directories in lib/ and have separate Kconfig options (as well as Makefile target subdirs) for them. Kconfig: config LZO1X_COMPRESS select LZO1X tristate config LZO1X_DECOMPRESS select LZO1X tristate config LZO1X tristate (or boolean??) Hmmm ... if this is /really/ what we want to do, what's wrong with simply: config LZO1x tristate ? Makefile: obj-$(CONFIG_LZO1X) += lzo1x/ I find nothing wrong with the style being used for ZLIB presently. Agreed, users of ZLIB do typically want to select only decompress *without* compress whereas there's no such user for LZO currently, but I see nothing wrong in just copying over ZLIB's style for LZO too (i.e. decoupling the compress and decompress code into separate subdirs in lib/). And if not, then why have two separate (dummy) Kconfig options either? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Makefile question (was [RFC] LZO de/compression support - take 6)
On 29 May 2007, at 11:41, Satyam Sharma wrote: This is syntactically correct (and wouldn't produce any build errors), but it's quite ... strange, still. Why would I even want to /build/ the compress code if all I selected was the decompress option? Apologies, you gave me the answer I was looking for (make is 'intelligent' enough not to build the same file twice in this situation...) but I think you may have missed the makefile in the lzo1x directory: diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile new file mode 100644 index 000..7b56a4d --- /dev/null +++ b/lib/lzo1x/Makefile @@ -0,0 +1,2 @@ +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o Thus, the Kconfig options for compress/decompress won't be simply 'dummy' options... Given the shared private header between the compress and decompress code, I don't think there is any need to separate the code into two directories a la the zlib code. Sorry for noise, Michael-Luke - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Makefile question (was [RFC] LZO de/compression support - take 6)
On 5/29/07, Michael-Luke Jones [EMAIL PROTECTED] wrote: On 29 May 2007, at 11:41, Satyam Sharma wrote: This is syntactically correct (and wouldn't produce any build errors), but it's quite ... strange, still. Why would I even want to /build/ the compress code if all I selected was the decompress option? Apologies, you gave me the answer I was looking for (make is 'intelligent' enough not to build the same file twice in this situation...) but I think you may have missed the makefile in the lzo1x directory: diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile new file mode 100644 index 000..7b56a4d --- /dev/null +++ b/lib/lzo1x/Makefile @@ -0,0 +1,2 @@ +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o Ah, yes, I only saw the mail and didn't check lib/lzo1x/Makefile, thanks. So effectively, we've done precisely what zlib does too ... only, in a smarter way! Thus, the Kconfig options for compress/decompress won't be simply 'dummy' options... Given the shared private header between the compress and decompress code, I don't think there is any need to separate the code into two directories a la the zlib code. Right, actually, zlib could be switched over to this style, in fact. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote: On 28 May 2007, at 18:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? Apparently due to his requirement for extreme portability. The original code was designed to work on everything from 16-bit DOS through CRAY supercomputers through Windows, Unices and Linux. Sure, this could be the reason in some or all cases. The upstream author knows the code best, and discussing such issues with him will in many cases be a win: It could be that there was in some cases no good reason, and the upstream code that gets used by many other projects could become faster. Or there was a good reason that applies also to the in-kernel version and a change breaks some corner case. The author has stated on the thread that it's a good idea to remove unnecessary ifdefs when porting the code into the kernel, given that the portability requirements are obviously no longer needed. remove unnecessary ifdefs implies generated code is identical. That's quite different from code is 10% faster. Michael-Luke cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/29/07, Adrian Bunk [EMAIL PROTECTED] wrote: On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote: On 28 May 2007, at 18:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? Apparently due to his requirement for extreme portability. The original code was designed to work on everything from 16-bit DOS through CRAY supercomputers through Windows, Unices and Linux. Sure, this could be the reason in some or all cases. The upstream author knows the code best, and discussing such issues with him will in many cases be a win: It could be that there was in some cases no good reason, and the upstream code that gets used by many other projects could become faster. Or there was a good reason that applies also to the in-kernel version and a change breaks some corner case. I have mailed the author with detailed changelog - waiting for reply. The author has stated on the thread that it's a good idea to remove unnecessary ifdefs when porting the code into the kernel, given that the portability requirements are obviously no longer needed. remove unnecessary ifdefs implies generated code is identical. That's quite different from code is 10% faster. Daniel made some changes to his testing code and now the perf gain is just 1.6%. - Nitin - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 01:48:29 Nitin Gupta wrote: On 5/29/07, Bret Towe [EMAIL PROTECTED] wrote: On 5/28/07, Nitin Gupta [EMAIL PROTECTED] wrote: Hi, Attached is tester code used for testing. (developed by Daniel Hazelton -- modified slightly to now use 'take 6' version for 'TinyLZO') Cheers, Nitin On 5/28/07, Nitin Gupta [EMAIL PROTECTED] wrote: (Using tester program from Daniel) Following compares this kernel port ('take 6') vs original miniLZO code: 'TinyLZO' refers to this kernel port. 1 run averages: 'Tiny LZO': Combined: 61.2223 usec Compression: 41.8412 usec Decompression: 19.3811 usec 'miniLZO': Combined: 66.0444 usec Compression: 46.6323 usec Decompression: 19.4121 usec Result: Overall: TinyLZO is 7.3% faster Compressor: TinyLZO is 10.2% faster Decompressor: TinyLZO is 0.15% faster 1 run averages: 'Tiny LZO': Combined: 112.6642 usec Compression: 56.3321 usec Decompression: 56.3321 usec 'miniLZO': Combined: 77.6642 usec Compression: 56.3416 usec Decompression: 21.3226 usec now the interesting bit I thought was the following [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest [test_lzo.c::compress (93)] run took 42 microseconds [test_lzo.c::decompress (127)] run took 20 microseconds [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest [test.c::compress (91)] run took 44 microseconds [test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 ) [test.c::main (149)] BUG: Decompression routine failure Did you use x86 for above test? Maybe some problem with testing script? What data did you use for this test? - Nitin Using any version of the test code below 5 on a non-LE system would result in this error. Those earlier versions couldn't handle BE systems because I was too lazy to define cpu_to_le16. (And yes, Nitin, that does appear to be the correct conversion to use) DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 08:03:55 Nitin Gupta wrote: On 5/29/07, Adrian Bunk [EMAIL PROTECTED] wrote: On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote: On 28 May 2007, at 18:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? Apparently due to his requirement for extreme portability. The original code was designed to work on everything from 16-bit DOS through CRAY supercomputers through Windows, Unices and Linux. Sure, this could be the reason in some or all cases. The upstream author knows the code best, and discussing such issues with him will in many cases be a win: It could be that there was in some cases no good reason, and the upstream code that gets used by many other projects could become faster. Or there was a good reason that applies also to the in-kernel version and a change breaks some corner case. I have mailed the author with detailed changelog - waiting for reply. The author has stated on the thread that it's a good idea to remove unnecessary ifdefs when porting the code into the kernel, given that the portability requirements are obviously no longer needed. remove unnecessary ifdefs implies generated code is identical. That's quite different from code is 10% faster. Daniel made some changes to his testing code and now the perf gain is just 1.6%. - Nitin The changes for version 5 are all in one file - helpers.h. All I did was define the macros that had previously been NOP's. Most of the performance loss, IMHO, is due to the compression and decompression functions being marked noinline. The currently reported 1.6% overall performance boost is likely related to the use of likely()/unlikely(). DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On 29 May 2007, at 12:27, Satyam Sharma wrote: Right, actually, zlib could be switched over to [using a common directory]. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Well, unfortunately zutil.h is currently being used in-kernel by fs/ jffs2/compr_zlib.c despite the WARNING: this file should *not* be used by applications notice. So moving this header to a truly private location isn't possible right now, unfortunately, Michael-Luke Jones [added David Whitehouse to cc] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote: On 29 May 2007, at 12:27, Satyam Sharma wrote: Right, actually, zlib could be switched over to [using a common directory]. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Well, unfortunately zutil.h is currently being used in-kernel by fs/ jffs2/compr_zlib.c despite the WARNING: this file should *not* be used by applications notice. So moving this header to a truly private location isn't possible right now, unfortunately, Michael-Luke Jones [added David Whitehouse to cc] I've looked at that code and it seems to need that file for one constant. Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself (or use it as a Magic Number) rather than include the zlib private header. Another possibility would be to move that constant out of zutil.h and into zconf.h or zlih.b - doing any of those would allow the zlib private header to be moved such that zlib could be changed to use a common directory *and* have said private header in that directory. DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
[ Trimmed Cc list; added original zlib authors. ] On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote: On 29 May 2007, at 12:27, Satyam Sharma wrote: Right, actually, zlib could be switched over to [using a common directory]. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Well, unfortunately zutil.h is currently being used in-kernel by fs/ jffs2/compr_zlib.c despite the WARNING: this file should *not* be used by applications notice. Hmmm, either jffs2 thinks the zlib interfaces exposed through include/linux/zlib.h are insufficient, or it isn't using zlib properly (or at least the way it was supposed to be used :-) Looking at fs/jffs2/compr_zlib.c, however, it just seems to be an optimization -- skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... This /looks/ like an optimization that might make sense for other users of zlib too, in which case that entire check and skip-the-checksum thing could actually be put into zlib proper. (?) So moving this header to a truly private location isn't possible right now, unfortunately, Michael-Luke Jones [added David Whitehouse to cc] On 5/29/07, Daniel Hazelton [EMAIL PROTECTED] wrote: I've looked at that code and it seems to need that file for one constant. Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself (or use it as a Magic Number) rather than include the zlib private header. Another possibility would be to move that constant out of zutil.h and into zconf.h or zlih.b - doing any of those would allow the zlib private header to be moved such that zlib could be changed to use a common directory *and* have said private header in that directory. Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from being able to resolve that macro. FWIW, a dirty hack could be to simply duplicate it in zlib.h. But then, implementation and interface are decoupled for good reasons ... Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On Tuesday 29 May 2007 11:15:41 Satyam Sharma wrote: [ Trimmed Cc list; added original zlib authors. ] On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote: On 29 May 2007, at 12:27, Satyam Sharma wrote: Right, actually, zlib could be switched over to [using a common directory]. Because zlib_deflate/ and zlib_inflate/ too share a private header zutil.h which has unfortunately been stuck into include/linux/ with a big /* WARNING: this file should *not* be used by applications. */ comment ... Well, unfortunately zutil.h is currently being used in-kernel by fs/ jffs2/compr_zlib.c despite the WARNING: this file should *not* be used by applications notice. Hmmm, either jffs2 thinks the zlib interfaces exposed through include/linux/zlib.h are insufficient, or it isn't using zlib properly (or at least the way it was supposed to be used :-) Looking at fs/jffs2/compr_zlib.c, however, it just seems to be an optimization -- skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... This /looks/ like an optimization that might make sense for other users of zlib too, in which case that entire check and skip-the-checksum thing could actually be put into zlib proper. (?) This *is* a good idea. If someone doesn't beat me to it I'll make sure I've got the latest git and do up a patch that does this. DRH So moving this header to a truly private location isn't possible right now, unfortunately, Michael-Luke Jones [added David Whitehouse to cc] On 5/29/07, Daniel Hazelton [EMAIL PROTECTED] wrote: I've looked at that code and it seems to need that file for one constant. Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself (or use it as a Magic Number) rather than include the zlib private header. Another possibility would be to move that constant out of zutil.h and into zconf.h or zlih.b - doing any of those would allow the zlib private header to be moved such that zlib could be changed to use a common directory *and* have said private header in that directory. Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from being able to resolve that macro. FWIW, a dirty hack could be to simply duplicate it in zlib.h. But then, implementation and interface are decoupled for good reasons ... Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote: On 5/29/07, Bret Towe [EMAIL PROTECTED] wrote: tested this on ppc and its still good is there any reason to bother with a test on amd64? if there is I might be able to get to it tonight Yes, this test is desired on 'take 6' (In future I will append version to patch bz2) Thanks, Nitin I added a few functions to my test-bed to see how well the LZO code coped with random input data. I'll extend my error checking so that it dumps all local variables at the first error, but for the time being I'm staring at a problem with what appears to be a double-free (in addition to lzo1x_decompress(_safe) screaming about overruns when I pass it the buffer of compressed, totally random data). (I fill the buffer by doing a read from /dev/urandom, though I'll certainly change it to use random() if anyone thinks that'll make a difference) I'll post the version with the random-data tests when I get the double-free solved. (I've looked and don't see how its happening) DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tuesday 29 May 2007 16:14:34 Daniel Hazelton wrote: On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote: On 5/29/07, Bret Towe [EMAIL PROTECTED] wrote: tested this on ppc and its still good is there any reason to bother with a test on amd64? if there is I might be able to get to it tonight Yes, this test is desired on 'take 6' (In future I will append version to patch bz2) Thanks, Nitin I added a few functions to my test-bed to see how well the LZO code coped with random input data. I'll extend my error checking so that it dumps all local variables at the first error, but for the time being I'm staring at a problem with what appears to be a double-free (in addition to lzo1x_decompress(_safe) screaming about overruns when I pass it the buffer of compressed, totally random data). (I fill the buffer by doing a read from /dev/urandom, though I'll certainly change it to use random() if anyone thinks that'll make a difference) I'll post the version with the random-data tests when I get the double-free solved. (I've looked and don't see how its happening) DRH After some additional debugging the problem appears to be that both lzo1x_decompress_safe (from miniLZO) and lzo1x_decompress (from the 'Tiny' version) both are somehow corrupting memory. *** glibc detected *** ./tinytest: double free or corruption (out): 0x08070e60 *** The buffer at 0x08070e60 is the one I allocated specifically for the new random data tests. It has the same location when it is allocated as it is when free()'d, so the corruption has to be coming from somewhere. The problem *might* be in my code - that is, I might have made a mistake in the read() from /dev/urandom. I'm going to go test that now. DRH - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On May 28 2007 20:04, Nitin Gupta wrote: * Changelog vs. original LZO: 1) Used standard/kernel defined data types: (this eliminated _huge_ #ifdef chunks) lzo_bytep - unsigned char * lzo_uint - size_t lzo_xint - size_t Is this safe (as far as compressed LZO stream is concerned) -- or is it even needed (could it be unsigned int)? - m_pos -= (*(const unsigned short *)ip) 2; -#else - m_pos = op - 1; - m_pos -= (ip[0] 2) + (ip[1] 6); -#endif + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) 2); (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). *** Need testing on big endian machine *** On i386, both cpu_to_le16 and le16_to_cpu do nothing. On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-) they swap 1234-4321. It is the bytestream (ip) that is reinterpreted as uint16_t. And I really doubt that the LZO author has a big-endian machine, given the days of ubiquitous x86. le16_to_cpu it is. Jan -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On May 28 2007 19:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? I guess it's along the lines of - portability (note how this contradicts itself). Really. I have yet to figure out why everyone invents their own xxx32_t types, like e.g. glib. well, integer types, one might understand, but when it comes to gchar or gpointer, that's just plain microsoft-style (think LPCSTR and LPCVOID...) Jan -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote: On May 28 2007 19:11, Adrian Bunk wrote: I have not seen any explanations: - Why did the upstream author write the code that way? I guess it's along the lines of - portability (note how this contradicts itself). Really. I have yet to figure out why everyone invents their own xxx32_t types, like e.g. glib. well, integer types, one might understand, but when it comes to gchar or gpointer, that's just plain microsoft-style (think LPCSTR and LPCVOID...) You completely miss the point of my question. It's about the performance improvements of the modified code that were mentioned. What you are talking about shouldn't have any effect on the generated code. Jan cu Adrian -- Is there not promise of rain? Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. Only a promise, Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
All problems I was having with the test-bed code have been solved, and the error I was running into was, as I suspected, in the code I used to fill the buffer for the random-data test. Results of running the new benchmark (version 6 of the benchmark, version 6 of 'tinyLZO'): 1 run averages: 'Tiny LZO': Combined: 104.1529 usec Compression: 43.065 usec Decompression: 18.4648 usec Random Data Compression: 31.132 usec Random Data Decompression: 11.4911 usec 'miniLZO': Combined: 106.1363 usec Compression: 44.6988 usec Decompression: 18.3576 usec Random Data Compression: 31.5772 usec Random Data Decompression: 11.5027 usec Percentages (calculated as: ((full - tiny)/full)*100): Overall (combined totals): Tiny is 1.87 % faster Compression: Tiny is 3.66 % faster Decompression: Tiny is 0.58 % slower Random Compression: Tiny is 1.41 % faster Random Decompression: Tiny is 0.10 % faster The results, on my system, are not consistent, except in that 'TinyLZO' is generally faster in the non-random data tests than miniLZO. It did, three of the five times I ran the test, perform (on average) about 1% worse than miniLZO. In order to sidestep any issues that a difference in the input data might have caused, I'm going to rewrite the code so that all tests are run against the same data. However, in the meantime, I've attached the latest version of my test-code. DRH PS: the code is going to massively change as I work to include more data sources for the benchmarking, as well as tests that will try to really stress the stripped-down code. lzo1x-test-6.tar.bz2 Description: application/tbz
Re: [RFC] LZO de/compression support - take 6
I just noticed a bug in my testbed/benchmarking code. It's fixed, but I decided to compare version 6 of the code against the *unsafe* decompressor again. The results of the three runs I've put it through after changing it to compare against the unsafe decompressor were startling. `Tiny's` compressor is still faster - I've seen it be rated up to 3% faster. The decompressor, OTOH, when compared to the unsafe version (which is the comparison that started me on this binge of hacking), is more than 7% worse. About 11% slower on the original test against a C source file, and about 6% slower for random data. However, looking at the numbers involved, I can't see a reason to keep the unsafe version around - the percentages look worse than they are - from 1 to 3 microseconds. (well, the compressed-cache people might want those extra usecs - but the difference will never be noticeable anywhere outside the kernel) DRH lzo1x-test-6a.tar.bz2 Description: application/tbz
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Daniel Hazelton [EMAIL PROTECTED] wrote: I just noticed a bug in my testbed/benchmarking code. It's fixed, but I decided to compare version 6 of the code against the *unsafe* decompressor again. The results of the three runs I've put it through after changing it to compare against the unsafe decompressor were startling. `Tiny's` compressor is still faster - I've seen it be rated up to 3% faster. The decompressor, OTOH, when compared to the unsafe version (which is the comparison that started me on this binge of hacking), is more than 7% worse. About 11% slower on the original test against a C source file, and about 6% slower for random data. Unsafe vs safe is within 10%. Its okay. However, looking at the numbers involved, I can't see a reason to keep the unsafe version around - the percentages look worse than they are - from 1 to 3 microseconds. Not just numbers. Most of applications cannot afford to use unsafe versions anyway (like fs people). (well, the compressed-cache people might want those extra usecs - but the difference will never be noticeable anywhere outside the kernel) DRH compressed cache people require every single percent of that performance. For now, ccaching is not ready for mainline (many things need to be done). So, till then I will keep off the unsafe version. If ever compressed caching is on its way to mainline _then_ I will try and add back the unsafe version. But I see no other project that really cares about unsafe version so it's okay to keep it off. - Nitin - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Adrian Bunk [EMAIL PROTECTED] wrote: On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote: On May 28 2007 19:11, Adrian Bunk wrote: You completely miss the point of my question. It's about the performance improvements of the modified code that were mentioned. What you are talking about shouldn't have any effect on the generated code. After Daniel refined this testing program, we can see that perf gain is 2% which can surely be accounted to cleanups like unnecessary castings in tight loops. Again, all the original code has been retained _as-is_. Whatever was changed, has been mentioned in that detailed changelog that I post along with patch. If someone want to review these 500 lines with this changelog in hand, it should not take more than couple of hours. If no-one can see any problem in code by now and considering that it's tested on x86(_32), amd64, ppc and giving somewhat better perf. than original then I believe it is unnecessarily hanging outside of -mm tree. I also contacted author (Markus Oberhumer) regarding above changes but he seems not be responding right now. But still, if it gets into -mm and gets used by various related projects then it should be good enough for mainline also. Cheers, Nitin - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/30/07, Jan Engelhardt [EMAIL PROTECTED] wrote: On May 28 2007 20:04, Nitin Gupta wrote: * Changelog vs. original LZO: 1) Used standard/kernel defined data types: (this eliminated _huge_ #ifdef chunks) lzo_bytep - unsigned char * lzo_uint - size_t lzo_xint - size_t Is this safe (as far as compressed LZO stream is concerned) -- or is it even needed (could it be unsigned int)? - m_pos -= (*(const unsigned short *)ip) 2; -#else - m_pos = op - 1; - m_pos -= (ip[0] 2) + (ip[1] 6); -#endif + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) 2); (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). *** Need testing on big endian machine *** On i386, both cpu_to_le16 and le16_to_cpu do nothing. On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-) they swap 1234-4321. It is the bytestream (ip) that is reinterpreted as uint16_t. And I really doubt that the LZO author has a big-endian machine, given the days of ubiquitous x86. le16_to_cpu it is. I just missed the point that leXX_to_cpu() and cpu_to_leXX() are identical on BE machine anyway. But then why you think it should be le_16_cpu() -- how will this make any difference? For your ref (from big_endian.h): #define __cpu_to_le16(x) ((__force __le16)__swab16((x))) #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) - Nitin - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)
On May 29, 2007, at 8:15 AM, Satyam Sharma wrote: skipping some checksum calculation if some flag (PRESET_DICT) is absent from the input stream about to be decompressed ... You don't need to dissect the header manually to look for that bit. If you feed inflate() at least the first two bytes, it will return immediately with the Z_NEED_DICT return code if a preset dictionary is requested. You can force inflate() to return immediately after decoding the two byte header even if a preset dictionary is not requested by using the Z_BLOCK flush code. Mark - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote: tested this on ppc and its still good is there any reason to bother with a test on amd64? if there is I might be able to get to it tonight Yes, this test is desired on 'take 6' (In future I will append version to patch bz2) Thanks, Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote: > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: I have not seen any explanations: - Why did the upstream author write the code that way? - Why are your changes correct? - Why do your changes allow the compiler to produce faster code? The upstream author of the code is available - and he might be able to help you getting answers. No matter whether your changes are incorrect or correct optimizations that should go upstream, in both cases discussing these issues with upstream is the best solution. The changelog I posted along with patch mentions all the changes I made. I thought we will find all problems with this changelog in hand and considering that its just ~500 LOC. But still, ok, asking author himself will be good if he replies. I will mail him detailed changelog and seek his feedback on this. This should answer all of your questions. And testing is nice, but if you broke some case that's outside of your tests you'll never notice. Yes. We cannot come up with exhaustive set of test cases to cover all cases. But assuming that _original_ version is right and taking the chagelog we should be able to verify if the porting is correct. - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote: On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > Hi, > > Attached is tester code used for testing. > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > version for 'TinyLZO') > > Cheers, > Nitin > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > > (Using tester program from Daniel) > > > > Following compares this kernel port ('take 6') vs original miniLZO code: > > > > 'TinyLZO' refers to this kernel port. > > > > 1 run averages: > > 'Tiny LZO': > >Combined: 61.2223 usec > >Compression: 41.8412 usec > >Decompression: 19.3811 usec > > 'miniLZO': > >Combined: 66.0444 usec > >Compression: 46.6323 usec > >Decompression: 19.4121 usec > > > > Result: > > Overall: TinyLZO is 7.3% faster > > Compressor: TinyLZO is 10.2% faster > > Decompressor: TinyLZO is 0.15% faster > > > > 1 run averages: 'Tiny LZO': Combined: 112.6642 usec Compression: 56.3321 usec Decompression: 56.3321 usec 'miniLZO': Combined: 77.6642 usec Compression: 56.3416 usec Decompression: 21.3226 usec now the interesting bit I thought was the following [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest [test_lzo.c::compress (93)] run took 42 microseconds [test_lzo.c::decompress (127)] run took 20 microseconds [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest [test.c::compress (91)] run took 44 microseconds [test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 ) [test.c::main (149)] BUG: Decompression routine failure Did you use x86 for above test? Maybe some problem with testing script? What data did you use for this test? - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Bret Towe <[EMAIL PROTECTED]> wrote: On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > Hi, > > This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe > version only). > > * Changes since 'take 5' (Full Changelog after this): > - Added compressor and decomrpesssor as separate and hidden config > options (default: n) > - Cleanups: removed LZO_CHECK_MPOS_NON_DET macro > - removed PTR* macros. > > * Benchmarks: (system: P4 3GHz, 1G RAM) > (Using tester program from Daniel) > > Following compares this kernel port ('take 6') vs original miniLZO code: > > 'TinyLZO' refers to this kernel port. > > 1 run averages: > 'Tiny LZO': >Combined: 61.2223 usec >Compression: 41.8412 usec >Decompression: 19.3811 usec > 'miniLZO': >Combined: 66.0444 usec >Compression: 46.6323 usec >Decompression: 19.4121 usec > > Result: > Overall: TinyLZO is 7.3% faster > Compressor: TinyLZO is 10.2% faster > Decompressor: TinyLZO is 0.15% faster > > TODO: test 'take 6' port on archs other than x86(_32) > > * Changelog vs. original LZO: > 1) Used standard/kernel defined data types: (this eliminated _huge_ > #ifdef chunks) > lzo_bytep -> unsigned char * > lzo_uint -> size_t > lzo_xint -> size_t > lzo_uint32p -> u32 * > lzo_uintptr_t -> unsigned long > 2) Removed everything #ifdef'ed under COPY_DICT (this is not set for > LZO1X, so removed corres. parts). > 3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants. > 4) Reformatted the code to match general kernel style. > 5) The only code change: (as suggested by Andrey) > -#if defined(__LITTLE_ENDIAN) > - m_pos = op - 1; > - m_pos -= (*(const unsigned short *)ip) >> 2; > -#else > - m_pos = op - 1; > - m_pos -= (ip[0] >> 2) + (ip[1] << 6); > -#endif > > + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2); > > (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). > *** Need testing on big endian machine *** > > Similarly: > -#if defined(__LITTLE_ENDIAN) > - m_pos -= (*(const unsigned short *)ip) >> 2; > -#else > - m_pos -= (ip[0] >> 2) + (ip[1] << 6); > -#endif > > + m_pos -= cpu_to_le16(*(const u16 *)ip) >> 2; > > 6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros. > > > I think it's now ready for mainline inclusion. > > > include/linux/lzo1x.h| 66 +++ > lib/Kconfig |6 + > lib/Makefile |2 + > lib/lzo1x/Makefile |2 + > lib/lzo1x/lzo1x_compress.c | 259 tested this on ppc and its still good is there any reason to bother with a test on amd64? if there is I might be able to get to it tonight forgot to mention that it would make life easier if you would put the version of the patch in the .bz2 files you attach since you poping out new takes so quick - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: Hi, Attached is tester code used for testing. (developed by Daniel Hazelton -- modified slightly to now use 'take 6' version for 'TinyLZO') Cheers, Nitin On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > (Using tester program from Daniel) > > Following compares this kernel port ('take 6') vs original miniLZO code: > > 'TinyLZO' refers to this kernel port. > > 1 run averages: > 'Tiny LZO': >Combined: 61.2223 usec >Compression: 41.8412 usec >Decompression: 19.3811 usec > 'miniLZO': >Combined: 66.0444 usec >Compression: 46.6323 usec >Decompression: 19.4121 usec > > Result: > Overall: TinyLZO is 7.3% faster > Compressor: TinyLZO is 10.2% faster > Decompressor: TinyLZO is 0.15% faster > 1 run averages: 'Tiny LZO': Combined: 112.6642 usec Compression: 56.3321 usec Decompression: 56.3321 usec 'miniLZO': Combined: 77.6642 usec Compression: 56.3416 usec Decompression: 21.3226 usec now the interesting bit I thought was the following [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest [test_lzo.c::compress (93)] run took 42 microseconds [test_lzo.c::decompress (127)] run took 20 microseconds [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest [test.c::compress (91)] run took 44 microseconds [test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 ) [test.c::main (149)] BUG: Decompression routine failure - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: Hi, This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe version only). * Changes since 'take 5' (Full Changelog after this): - Added compressor and decomrpesssor as separate and hidden config options (default: n) - Cleanups: removed LZO_CHECK_MPOS_NON_DET macro - removed PTR* macros. * Benchmarks: (system: P4 3GHz, 1G RAM) (Using tester program from Daniel) Following compares this kernel port ('take 6') vs original miniLZO code: 'TinyLZO' refers to this kernel port. 1 run averages: 'Tiny LZO': Combined: 61.2223 usec Compression: 41.8412 usec Decompression: 19.3811 usec 'miniLZO': Combined: 66.0444 usec Compression: 46.6323 usec Decompression: 19.4121 usec Result: Overall: TinyLZO is 7.3% faster Compressor: TinyLZO is 10.2% faster Decompressor: TinyLZO is 0.15% faster TODO: test 'take 6' port on archs other than x86(_32) * Changelog vs. original LZO: 1) Used standard/kernel defined data types: (this eliminated _huge_ #ifdef chunks) lzo_bytep -> unsigned char * lzo_uint -> size_t lzo_xint -> size_t lzo_uint32p -> u32 * lzo_uintptr_t -> unsigned long 2) Removed everything #ifdef'ed under COPY_DICT (this is not set for LZO1X, so removed corres. parts). 3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants. 4) Reformatted the code to match general kernel style. 5) The only code change: (as suggested by Andrey) -#if defined(__LITTLE_ENDIAN) - m_pos = op - 1; - m_pos -= (*(const unsigned short *)ip) >> 2; -#else - m_pos = op - 1; - m_pos -= (ip[0] >> 2) + (ip[1] << 6); -#endif + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2); (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16). *** Need testing on big endian machine *** Similarly: -#if defined(__LITTLE_ENDIAN) - m_pos -= (*(const unsigned short *)ip) >> 2; -#else - m_pos -= (ip[0] >> 2) + (ip[1] << 6); -#endif + m_pos -= cpu_to_le16(*(const u16 *)ip) >> 2; 6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros. I think it's now ready for mainline inclusion. include/linux/lzo1x.h| 66 +++ lib/Kconfig |6 + lib/Makefile |2 + lib/lzo1x/Makefile |2 + lib/lzo1x/lzo1x_compress.c | 259 tested this on ppc and its still good is there any reason to bother with a test on amd64? if there is I might be able to get to it tonight - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 16:18:40 Daniel Hazelton wrote: > On Monday 28 May 2007 13:11:15 Adrian Bunk wrote: > > On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote: > > > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > > >... > > > > > >> - then ensure that it works correctly on all architectures and > > > > > > Already tested on x86, amd64, ppc (by Bret). I do not have machines > > > from other archs available. Bret tested 'take 3' version but no > > > changes were introduced in further revisions that could affect > > > correctness - but still it will be good to have this version tested > > > too. Only with inclusion in -mm and testing by much wider user base > > > can make it to mainline (I suppose nobody uses -mm for production use > > > anyway). > > > > > >> document why your version is that much faster than the original > > >> version and why you know your optimizations have no side effects > > With likely(), unlikely() and noinline *not* defined as NOP's performance > drops: > > 1 run averages: > 'Tiny LZO': > Combined: 84.9292 usec > Compression: 42.4646 usec > Decompression: 42.4646 usec > 'miniLZO': > Combined: 61.3548 usec > Compression: 43.5648 usec > Decompression: 17.79 usec > > However, I'm worried that my testbed code - likely the Perl script that > actually loops the test code and collects its output - is somehow faulting, > as the way that the Compression and Decompression code have the exact same > value. > > I'm going to toss some debugging output in the script and see if I can spot > the problem. > Okay, checked my code and it was all a problem with cpu_to_le16 getting defined fully instead of just being a NOP. With that problem corrected the performance returns: 1 run averages: 'Tiny LZO': Combined: 60.1028 usec Compression: 42.0652 usec Decompression: 18.0376 usec 'miniLZO': Combined: 61.0932 usec Compression: 43.4382 usec Decompression: 17.655 usec Combined average shows 'Tiny' to be 1.6% faster Compression in 'Tiny' is 3.2% faster Decompression in 'Tiny' is 2.2% slower All in all, the trade-off in this code, with the overall performance of the 'Tiny' code being faster than the stock miniLZO code I'd like to say that I'm certain that the decompressor could probably be sped up more, although I don't know of a place in the kernel where less than half a microsecond would make a massive impact. (Only place I can think of where this might have a negative is in SLAB/SLOB/SLUB, and I don't think that a low-level memory manager like those is a place for a compression algorithm anyway) Later today or tommorrow I'll start putting together another part to this "test-bed" for stress-testing the algorithm. Currently I only have a few ideas for tests: (for benchmarking) 1) Random input to compressor 2) large input data 3) real-world data (mmap()'d chunk of /dev/mem as input) (for stability checking) 4) deliberate corruption of compressed data 5) early finish (realloc() compressed data buffer to less than the full size of the data) 6) late start (change pointer to start of compressed data so the decompressor doesn't start at the beginning) When I have a better idea of how the LZO algorithm works I might try creating an input data-set for the decompressor that would deliberately overflow the output buffer. This is supposed to be caught by bounds-checking in the '_safe' version of the decompressor (the version used in 'Tiny' and used in the benchmarking of miniLZO), however it might be possible to trick the decompressor. (If I do manage this and it crashes both 'Tiny' and 'miniLZO' I *will* report it to Herr Oberhumer as well as see if I can come up with a quick patch for the problem). As I said before, any suggestions for how to improve the benchmark code are welcome, as are suggestions for tests I can add to it for stressing the new 'TinyLZO' implementation. Latest version of the test-bed attached. Significant changes include: likely(), unlikely() and noinline are no longer NOP's when the marked line in helpers.h is commented out, cpu_to_le16 functions as expected and is no longer a NOP. (Yes, that means this code is now fully capable of working correctly on a BE system). DRH lzo1x-test-bed-5.tar.bz2 Description: application/tbz
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 13:11:15 Adrian Bunk wrote: > On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote: > > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > >... > > > >> - then ensure that it works correctly on all architectures and > > > > Already tested on x86, amd64, ppc (by Bret). I do not have machines > > from other archs available. Bret tested 'take 3' version but no > > changes were introduced in further revisions that could affect > > correctness - but still it will be good to have this version tested > > too. Only with inclusion in -mm and testing by much wider user base > > can make it to mainline (I suppose nobody uses -mm for production use > > anyway). > > > >> document why your version is that much faster than the original > >> version and why you know your optimizations have no side effects With likely(), unlikely() and noinline *not* defined as NOP's performance drops: 1 run averages: 'Tiny LZO': Combined: 84.9292 usec Compression: 42.4646 usec Decompression: 42.4646 usec 'miniLZO': Combined: 61.3548 usec Compression: 43.5648 usec Decompression: 17.79 usec However, I'm worried that my testbed code - likely the Perl script that actually loops the test code and collects its output - is somehow faulting, as the way that the Compression and Decompression code have the exact same value. I'm going to toss some debugging output in the script and see if I can spot the problem. DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 13:11:15 Adrian Bunk wrote: > On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote: > > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > >... > > > >> - then ensure that it works correctly on all architectures and > > > > Already tested on x86, amd64, ppc (by Bret). I do not have machines > > from other archs available. Bret tested 'take 3' version but no > > changes were introduced in further revisions that could affect > > correctness - but still it will be good to have this version tested > > too. Only with inclusion in -mm and testing by much wider user base > > can make it to mainline (I suppose nobody uses -mm for production use > > anyway). > > > >> document why your version is that much faster than the original > >> version and why you know your optimizations have no side effects > > > > Replied in earlier mail regarding this. > >... > > I have not seen any explanations: > - Why did the upstream author write the code that way? > - Why are your changes correct? > - Why do your changes allow the compiler to produce faster code? Would it help if I added instrumentation code to both the upstream code and the stripped down code in question? > The upstream author of the code is available - and he might be able to > help you getting answers. No matter whether your changes are incorrect > or correct optimizations that should go upstream, in both cases > discussing these issues with upstream is the best solution. > > And testing is nice, but if you broke some case that's outside of your > tests you'll never notice. Any suggestions for tests I could add to the boilerplate I've been using to benchmark the code? I'll add a few tests that use random input rather than one of the source files - I'd add a test to see how it deals with NULL input, but I ran into that in an early version of the testbed (where I had a typo that stopped the code from working) and the compressor didn't like it. And I suppose I could put in a test that reads in a chunk of /dev/mem or have it get a chunk of data from HAL/DBus as well. But if anyone has a suggestion of what tests I could toss at it to really stress the code, let me know and I'll get to work extending this code so it can stress test both the standard implementation of miniLZO *and* this stripped down 'Tiny' version. DRH > > - Nitin > > cu > Adrian - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 13:01:09 Adrian Bunk wrote: > On Mon, May 28, 2007 at 11:55:14AM -0400, Daniel Hazelton wrote: > >... > > This is my guess as well. Though performance will likely drop when I make > > the noinline macro mean something. (This may be offset by figuring out a > > way to make likely() and unlikely() also have a meaningful effect in > > userspace). ... > > What is your problem? > > The likely/unlikely macros aren't in any way depending on any kernel > infrastructure. That I'm just plain lazy and haven't felt like pulling them out of the kernel sources ? Actually, that is the case - and I wasn't exactly sure that likely() and unlikely() were completely decoupled from the kernel's infrastructure. DRH > > DRH > > cu > Adrian - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote: > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: >... >> - then ensure that it works correctly on all architectures and > > Already tested on x86, amd64, ppc (by Bret). I do not have machines > from other archs available. Bret tested 'take 3' version but no > changes were introduced in further revisions that could affect > correctness - but still it will be good to have this version tested > too. Only with inclusion in -mm and testing by much wider user base > can make it to mainline (I suppose nobody uses -mm for production use > anyway). > >> document why your version is that much faster than the original >> version and why you know your optimizations have no side effects > > Replied in earlier mail regarding this. >... I have not seen any explanations: - Why did the upstream author write the code that way? - Why are your changes correct? - Why do your changes allow the compiler to produce faster code? The upstream author of the code is available - and he might be able to help you getting answers. No matter whether your changes are incorrect or correct optimizations that should go upstream, in both cases discussing these issues with upstream is the best solution. And testing is nice, but if you broke some case that's outside of your tests you'll never notice. > - Nitin cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Mon, May 28, 2007 at 11:55:14AM -0400, Daniel Hazelton wrote: >... > This is my guess as well. Though performance will likely drop when I make the > noinline macro mean something. (This may be offset by figuring out a way to > make likely() and unlikely() also have a meaningful effect in userspace). >... What is your problem? The likely/unlikely macros aren't in any way depending on any kernel infrastructure. > DRH cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: On Mon, May 28, 2007 at 08:36:44PM +0530, Nitin Gupta wrote: >... > So, before this change, it will be good if it gets merged in mainline > and tested, at least for correctness, on all supported achs. All the > while, we will have a good feeling that there is still a good scope > for perf improvement :) The correct order is: - create one version with all the optimizations you have in mind Already done. One more optimization is regarding use of memcpy() in place of COPY4() macros and open byte-by-byte copying. There are some places where it's very hard to get it correct without adding additional checks on various values which casues futher overhead by iteslf - even then I could not get them correct so I decided not to go with this particular optimization by myself. - then ensure that it works correctly on all architectures and Already tested on x86, amd64, ppc (by Bret). I do not have machines from other archs available. Bret tested 'take 3' version but no changes were introduced in further revisions that could affect correctness - but still it will be good to have this version tested too. Only with inclusion in -mm and testing by much wider user base can make it to mainline (I suppose nobody uses -mm for production use anyway). document why your version is that much faster than the original version and why you know your optimizations have no side effects Replied in earlier mail regarding this. - then get it tested in -mm This is what I am looking for :) After these steps, you can start considering getting it into mainline. - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 11:47:55 Nitin Gupta wrote: > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > > On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote: > > > Hi, > > > > > > Attached is tester code used for testing. > > > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > > > version for 'TinyLZO') > > > > > > Cheers, > > > Nitin > > > > > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > > >> (Using tester program from Daniel) > > >> > > >> Following compares this kernel port ('take 6') vs original miniLZO > > >> code: > > >> > > >> 'TinyLZO' refers to this kernel port. > > >> > > >> 1 run averages: > > >> 'Tiny LZO': > > >>Combined: 61.2223 usec > > >>Compression: 41.8412 usec > > >>Decompression: 19.3811 usec > > >> 'miniLZO': > > >>Combined: 66.0444 usec > > >>Compression: 46.6323 usec > > >>Decompression: 19.4121 usec > > >> > > >> Result: > > >> Overall: TinyLZO is 7.3% faster > > >> Compressor: TinyLZO is 10.2% faster > > >> Decompressor: TinyLZO is 0.15% faster > > > > So your the compressor of your version runs 10.2% faster than the > > original version. > > > > That's a huge difference. > > > > Why exactly is it that much faster? > > > > cu > > Adrian > > I am not sure on how to account for this _big_ perf. gain but from > benchmarks I see that whenever I remove unncessary casting from tight > loops I get perf. gain of 1-2%. For e.g. open coding > LZO_CHECK_MPOS_NON_DET macro with all unnecessary casting removed, > gave perf. gain of ~2%. Similarly, I found many other places where > such casting was unnecessary. This is my guess as well. Though performance will likely drop when I make the noinline macro mean something. (This may be offset by figuring out a way to make likely() and unlikely() also have a meaningful effect in userspace). This benchmark should be run on BE machines, but I'm still trying to figure out a way to open-code a *fast* and *reliable* version of cpu_to_le16. Suggestions for the above bits will always be appreciated. > These changes have been tested on x86, amd64, ppc. Testing of 'take 6' > version is yet to be done - this will confirm that I didn't > reintroduce some error. I don't see that 'take 6' has had too much changed from 'take 4', although I haven't do a full comparison. DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 11:30:55 Adrian Bunk wrote: > On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote: > > Hi, > > > > Attached is tester code used for testing. > > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > > version for 'TinyLZO') > > > > Cheers, > > Nitin > > > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: > >> (Using tester program from Daniel) > >> > >> Following compares this kernel port ('take 6') vs original miniLZO code: > >> > >> 'TinyLZO' refers to this kernel port. > >> > >> 1 run averages: > >> 'Tiny LZO': > >>Combined: 61.2223 usec > >>Compression: 41.8412 usec > >>Decompression: 19.3811 usec > >> 'miniLZO': > >>Combined: 66.0444 usec > >>Compression: 46.6323 usec > >>Decompression: 19.4121 usec > >> > >> Result: > >> Overall: TinyLZO is 7.3% faster > >> Compressor: TinyLZO is 10.2% faster > >> Decompressor: TinyLZO is 0.15% faster > > So your the compressor of your version runs 10.2% faster than the > original version. > > That's a huge difference. > > Why exactly is it that much faster? All I've done is write the code used for testing the compressor in userspace, but I think it might have something to do with the NOP that cpu_to_le16() is on LE systems (the original has an open-coded byte-swap and I don't think it ever really gets passed) and the removal of a lot of un-needed casts. (version 5 of the code re-introduced one of the macro's that had a lot of casts and it showed a drastic slow-down). Anyway, I'm going to get back into my testbed and see about actually making a lot of the stuff that is currently just a NOP (I made exceedingly minor changes to the files to get the working in userspace - one of the changes was to define the kernel-specific macros's likely(), unlikely(), noinline and cpu_to_le16() to be NOP's.) into functional code. That set of changes may affect the performance and bring it closer to the original or it may not. And the testbed code is open - I guess I should stick the GPL headers in it, even though it isn't likely to see much use outside a very small circle. If *anyone* is skeptical about the numbers, feel free to use the testbed yourself. (It currently only works as intended on LE systems, however) Just untar the source into a directory, run make and the run 'tester.pl' to actually perform the test runs and gather the numbers. There is no overhead included in the numbers - they come directly from running gettimeofday() before and after the calls to the respective functions (well, a bit of magic is done to make the numbers have meaning). DRH PS: the variable '$RUNS' in tester.pl determines how many times the script loops each program to generate the numbers. I'm currently using 1 runs to generate the average timings, but if you feel that is too large or too small a sample set, go ahead and change it. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote: > Hi, > > Attached is tester code used for testing. > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > version for 'TinyLZO') > > Cheers, > Nitin > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: >> (Using tester program from Daniel) >> >> Following compares this kernel port ('take 6') vs original miniLZO code: >> >> 'TinyLZO' refers to this kernel port. >> >> 1 run averages: >> 'Tiny LZO': >>Combined: 61.2223 usec >>Compression: 41.8412 usec >>Decompression: 19.3811 usec >> 'miniLZO': >>Combined: 66.0444 usec >>Compression: 46.6323 usec >>Decompression: 19.4121 usec >> >> Result: >> Overall: TinyLZO is 7.3% faster >> Compressor: TinyLZO is 10.2% faster >> Decompressor: TinyLZO is 0.15% faster So your the compressor of your version runs 10.2% faster than the original version. That's a huge difference. Why exactly is it that much faster? cu Adrian I am not sure on how to account for this _big_ perf. gain but from benchmarks I see that whenever I remove unncessary casting from tight loops I get perf. gain of 1-2%. For e.g. open coding LZO_CHECK_MPOS_NON_DET macro with all unnecessary casting removed, gave perf. gain of ~2%. Similarly, I found many other places where such casting was unnecessary. These changes have been tested on x86, amd64, ppc. Testing of 'take 6' version is yet to be done - this will confirm that I didn't reintroduce some error. - Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Mon, May 28, 2007 at 08:36:44PM +0530, Nitin Gupta wrote: >... > So, before this change, it will be good if it gets merged in mainline > and tested, at least for correctness, on all supported achs. All the > while, we will have a good feeling that there is still a good scope > for perf improvement :) The correct order is: - create one version with all the optimizations you have in mind - then ensure that it works correctly on all architectures and document why your version is that much faster than the original version and why you know your optimizations have no side effects - then get it tested in -mm After these steps, you can start considering getting it into mainline. > Cheers, > Nitin cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote: > Hi, > > Attached is tester code used for testing. > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > version for 'TinyLZO') > > Cheers, > Nitin > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote: >> (Using tester program from Daniel) >> >> Following compares this kernel port ('take 6') vs original miniLZO code: >> >> 'TinyLZO' refers to this kernel port. >> >> 1 run averages: >> 'Tiny LZO': >>Combined: 61.2223 usec >>Compression: 41.8412 usec >>Decompression: 19.3811 usec >> 'miniLZO': >>Combined: 66.0444 usec >>Compression: 46.6323 usec >>Decompression: 19.4121 usec >> >> Result: >> Overall: TinyLZO is 7.3% faster >> Compressor: TinyLZO is 10.2% faster >> Decompressor: TinyLZO is 0.15% faster So your the compressor of your version runs 10.2% faster than the original version. That's a huge difference. Why exactly is it that much faster? cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On 5/28/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote: On Monday 28 May 2007 10:40:31 Nitin Gupta wrote: > Hi, > > Attached is tester code used for testing. > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > version for 'TinyLZO') > > Cheers, > Nitin > I haven't tested with version 6, but after removing the LZO_CHECK_MPOS_NON_DET macro from the 'take 5' code and replacing the open-coded byte-for-byte copies with calls to memcpy: I did memcpy() changes in some initial post (take '2', I think). That caused some _correctness_ issue in de/compressor code -- Bret's test could not succeed on ppc machine. After going back to byte-by-byte copying, his tests were successful. So, it's better not to include such changes now or test on all supported archs if you really want to do so :) 1 run averages: 'Tiny LZO': Combined: 57.4691 usec Compression: 39.8837 usec Decompression: 17.5854 usec 'miniLZO': Combined: 64.0484 usec Compression: 46.0604 usec Decompression: 17.988 usec which means: Overall TinyLZO is 10.2% faster TinyLZO compresses 13.4% faster TinyLZO decompresses 2.23% faster -Benchmark run a a Pentium-M 1.73GHz, 1GB Ram With the speed-up seen with just the removal of the LZO_CHECK_MPOS_NON_DET I wasn't sure that changing the open-coded copy to a call to memcpy() was going to have a big impact on the code, but it does appear to have has several percentage points of difference. Yes, memcpy() changes have potential of giving significant perf gain but I am not too sure if memcpy() will be good if we want to copy just few bytes (which is the case at many times in de/compressor). Also, at some places, memcpy() changes are not trival and have actually caused correctness issues as mentioned above. So, before this change, it will be good if it gets merged in mainline and tested, at least for correctness, on all supported achs. All the while, we will have a good feeling that there is still a good scope for perf improvement :) Cheers, Nitin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] LZO de/compression support - take 6
On Monday 28 May 2007 10:40:31 Nitin Gupta wrote: > Hi, > > Attached is tester code used for testing. > (developed by Daniel Hazelton -- modified slightly to now use 'take 6' > version for 'TinyLZO') > > Cheers, > Nitin > I haven't tested with version 6, but after removing the LZO_CHECK_MPOS_NON_DET macro from the 'take 5' code and replacing the open-coded byte-for-byte copies with calls to memcpy: 1 run averages: 'Tiny LZO': Combined: 57.4691 usec Compression: 39.8837 usec Decompression: 17.5854 usec 'miniLZO': Combined: 64.0484 usec Compression: 46.0604 usec Decompression: 17.988 usec which means: Overall TinyLZO is 10.2% faster TinyLZO compresses 13.4% faster TinyLZO decompresses 2.23% faster -Benchmark run a a Pentium-M 1.73GHz, 1GB Ram With the speed-up seen with just the removal of the LZO_CHECK_MPOS_NON_DET I wasn't sure that changing the open-coded copy to a call to memcpy() was going to have a big impact on the code, but it does appear to have has several percentage points of difference. (Though I am unsure about the speed increase with the decompression - I didn't touch that file when making this set of changes. I'm going to make the memcpy() changes to 'take 6' and see if the speed-up holds true) DRH - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/