subject:"\[RFC\] LZO de\/compression support \- take 6"

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-06-01 Thread Satyam Sharma

Hi Daniel,

On 6/1/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote:

On Wednesday 30 May 2007 19:02:28 Mark Adler wrote:
> On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:
> > [1] For your reference, here is the user code in question:
>
> ...
>
> >if (srclen > 2 && !(data_in[1] & PRESET_DICT) &&
> > ((data_in[0] & 0x0f) == Z_DEFLATED) &&
> > !(((data_in[0]<<8) + data_in[1]) % 31)) {
>
> The funny thing here is that the author felt compelled to use a
> #defined constant for the dictionary bit (PRESET_DICT), but had no
> problem with a numeric constant to isolate the compression method
> (0x0f), or for that matter extracting the window bits from the
> header.  The easy way to avoid the use of an internal zlib header
> file here is to simply replace PRESET_DICT with 0x20.  That constant
> will never change -- it is part of the definition of the zlib header
> in RFC 1950.

If there is no objection, I'll put together a patch that changes the use in
JFFS2 into a "magic number", complete with documentation on it, and also
moves all of the zlib stuff into a single directory.

Right, s/PRESET_DICT/0x20/ would have the least fall-out / unknown
side-effects.

> The slightly more involved patch to avoid the problem is to let
> inflate() do all that work for you, including the integrity check on
> the zlib header (% 31).  Also this corrects an error in the original
> code, which is that it continues to try to decompress after finding
> that a dictionary is needed or that the zlib header is invalid.  In
> this version, a bad header simply returns an error:
>

Does anyone know if doing as Mark suggests would negatively impact the
performance of JFFS2 to such a degree that it could be considered a
regression? I, unfortunately, don't have the hardware to properly test the
code. And, at this point in time, I also don't have enough familiarity with
the JFFS2 code to make such a change myself.

David would have to comment on that, but you could simultaneously
make and submit the patch as suggested by Mark with both your
signed-offs-by. That'll naturally need to go through David's tree, so we'll
know if he likes/accepts the suggested modifications ...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-06-01 Thread Satyam Sharma


Hi Daniel,

On 6/1/07, Daniel Hazelton [EMAIL PROTECTED] wrote:

On Wednesday 30 May 2007 19:02:28 Mark Adler wrote:
 On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:
  [1] For your reference, here is the user code in question:

 ...

 if (srclen  2  !(data_in[1]  PRESET_DICT) 
  ((data_in[0]  0x0f) == Z_DEFLATED) 
  !(((data_in[0]8) + data_in[1]) % 31)) {

 The funny thing here is that the author felt compelled to use a
 #defined constant for the dictionary bit (PRESET_DICT), but had no
 problem with a numeric constant to isolate the compression method
 (0x0f), or for that matter extracting the window bits from the
 header.  The easy way to avoid the use of an internal zlib header
 file here is to simply replace PRESET_DICT with 0x20.  That constant
 will never change -- it is part of the definition of the zlib header
 in RFC 1950.

If there is no objection, I'll put together a patch that changes the use in
JFFS2 into a magic number, complete with documentation on it, and also
moves all of the zlib stuff into a single directory.


Right, s/PRESET_DICT/0x20/ would have the least fall-out / unknown
side-effects.


 The slightly more involved patch to avoid the problem is to let
 inflate() do all that work for you, including the integrity check on
 the zlib header (% 31).  Also this corrects an error in the original
 code, which is that it continues to try to decompress after finding
 that a dictionary is needed or that the zlib header is invalid.  In
 this version, a bad header simply returns an error:


Does anyone know if doing as Mark suggests would negatively impact the
performance of JFFS2 to such a degree that it could be considered a
regression? I, unfortunately, don't have the hardware to properly test the
code. And, at this point in time, I also don't have enough familiarity with
the JFFS2 code to make such a change myself.


David would have to comment on that, but you could simultaneously
make and submit the patch as suggested by Mark with both your
signed-offs-by. That'll naturally need to go through David's tree, so we'll
know if he likes/accepts the suggested modifications ...
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-31 Thread Daniel Hazelton

On Wednesday 30 May 2007 19:02:28 Mark Adler wrote:
> On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:
> > [1] For your reference, here is the user code in question:
>
> ...
>
> >if (srclen > 2 && !(data_in[1] & PRESET_DICT) &&
> > ((data_in[0] & 0x0f) == Z_DEFLATED) &&
> > !(((data_in[0]<<8) + data_in[1]) % 31)) {
>
> The funny thing here is that the author felt compelled to use a
> #defined constant for the dictionary bit (PRESET_DICT), but had no
> problem with a numeric constant to isolate the compression method
> (0x0f), or for that matter extracting the window bits from the
> header.  The easy way to avoid the use of an internal zlib header
> file here is to simply replace PRESET_DICT with 0x20.  That constant
> will never change -- it is part of the definition of the zlib header
> in RFC 1950.

If there is no objection, I'll put together a patch that changes the use in 
JFFS2 into a "magic number", complete with documentation on it, and also 
moves all of the zlib stuff into a single directory.

> The slightly more involved patch to avoid the problem is to let
> inflate() do all that work for you, including the integrity check on
> the zlib header (% 31).  Also this corrects an error in the original
> code, which is that it continues to try to decompress after finding
> that a dictionary is needed or that the zlib header is invalid.  In
> this version, a bad header simply returns an error:
>

Does anyone know if doing as Mark suggests would negatively impact the 
performance of JFFS2 to such a degree that it could be considered a 
regression? I, unfortunately, don't have the hardware to properly test the 
code. And, at this point in time, I also don't have enough familiarity with 
the JFFS2 code to make such a change myself. 

DRH

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-31 Thread Satyam Sharma


On 5/31/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:

[...]
The author (Markus Oberhumer) of LZO  provided these comments for this patch:

---
I've only briefly looked over it, but it's obvious that your version does not
work on architechtures which do not allow unaligned access (arm, mips, ...).

As for further quality assurance, your version should generate byte-identical
object code to LZO 2.02 when using the same compiler & flags. So you could do
some semi-automatic testing by compiling with -ffunction-sections and use
"objcopy --only-section .text.XXX" to compare the md5sum of all generated
functions. This also works fine with crosscompilers.

Finally I'm not too happy that you renamed the functions and #defines like
LZO1X_WORKMEM_SIZE - please stay compatible with the official library version.


As suggested by Johannes earlier, it'd be great if you could submit
the various changes (as per Changelog) as _individual patches_ on
the original userspace code. That would be easier for others to review,
and there's lesser chances of bugs / issues leaking in that way.

As for "byte-identical object code", I definitely do *not* think it is
necessarily a requirement / good idea. As long as all the changes
you make are reviewed individually / closely by people here on this
list, there's very low chances of any bugs creeping in. [ F.e. I see
nothing wrong in removing the usage of "register" -- that could clearly
lead to different object code, but with no bugs introduced. ]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-31 Thread Nitin Gupta


Hi,

FYI...

The author (Markus Oberhumer) of LZO  provided these comments for this patch:

---
I've only briefly looked over it, but it's obvious that your version does not
work on architechtures which do not allow unaligned access (arm, mips, ...).

As for further quality assurance, your version should generate byte-identical
object code to LZO 2.02 when using the same compiler & flags. So you could do
some semi-automatic testing by compiling with -ffunction-sections and use
"objcopy --only-section .text.XXX" to compare the md5sum of all generated
functions. This also works fine with crosscompilers.

Finally I'm not too happy that you renamed the functions and #defines like
LZO1X_WORKMEM_SIZE - please stay compatible with the official library version.


Still a lot to do...

- Nitin


On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:

This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe
version only).

* Changes since 'take 5' (Full Changelog after this):
- Added compressor and decomrpesssor as separate and hidden config
options (default: n)
- Cleanups: removed LZO_CHECK_MPOS_NON_DET macro
- removed PTR* macros.

* Benchmarks: (system: P4 3GHz, 1G RAM)
(Using tester program from Daniel)

Following compares this kernel port ('take 6') vs original miniLZO code:

'TinyLZO' refers to this kernel port.

1 run averages:
'Tiny LZO':
   Combined: 61.2223 usec
   Compression: 41.8412 usec
   Decompression: 19.3811 usec
'miniLZO':
   Combined: 66.0444 usec
   Compression: 46.6323 usec
   Decompression: 19.4121 usec

Result:
Overall: TinyLZO is 7.3% faster
Compressor: TinyLZO is 10.2% faster
Decompressor: TinyLZO is 0.15% faster

TODO: test 'take 6' port on archs other than x86(_32)

* Changelog vs. original LZO:
1) Used standard/kernel defined data types: (this eliminated _huge_
#ifdef chunks)
  lzo_bytep -> unsigned char *
  lzo_uint -> size_t
  lzo_xint -> size_t
  lzo_uint32p -> u32 *
  lzo_uintptr_t  -> unsigned long
2) Removed everything #ifdef'ed under COPY_DICT (this is not set for
LZO1X, so removed corres. parts).
3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants.
4) Reformatted the code to match general kernel style.
5) The only code change: (as suggested by Andrey)
-#if defined(__LITTLE_ENDIAN)
- m_pos = op - 1;
- m_pos -= (*(const unsigned short *)ip) >> 2;
-#else
- m_pos = op - 1;
- m_pos -= (ip[0] >> 2) + (ip[1] << 6);
-#endif

+ m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2);

(Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
*** Need testing on big endian machine ***

Similarly:
-#if defined(__LITTLE_ENDIAN)
- m_pos -= (*(const unsigned short *)ip) >> 2;
-#else
- m_pos -= (ip[0] >> 2) + (ip[1] << 6);
-#endif

+ m_pos -= cpu_to_le16(*(const u16 *)ip) >> 2;

6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros.


I think it's now ready for mainline inclusion.


 include/linux/lzo1x.h|   66 +++
 lib/Kconfig  |6 +
 lib/Makefile |2 +
 lib/lzo1x/Makefile   |2 +
 lib/lzo1x/lzo1x_compress.c   |  259 ++
 lib/lzo1x/lzo1x_decompress.c |  238 ++
 lib/lzo1x/lzo1x_int.h|   85 ++
 7 files changed, 658 insertions(+), 0 deletions(-)

Signed-off-by: Nitin Gupta 
---
diff --git a/include/linux/lzo1x.h b/include/linux/lzo1x.h
new file mode 100755
index 000..11a6f23
--- /dev/null
+++ b/include/linux/lzo1x.h
@@ -0,0 +1,66 @@
+/* lzo1x.h -- public interface of the LZO1X compression algorithm
+
+   This file is part of the LZO real-time data compression library.
+
+   Copyright (C) 1996-2005 Markus Franz Xaver Johannes Oberhumer
+   All Rights Reserved.
+
+   The LZO library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License,
+   version 2, as published by the Free Software Foundation.
+
+   The LZO library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with the LZO library; see the file COPYING.
+   If not, write to the Free Software Foundation, Inc.,
+   51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+   Markus F.X.J. Oberhumer
+   <[EMAIL PROTECTED]>
+   http://www.oberhumer.com/opensource/lzo/
+
+
+   This file is modified version of lzo1x.h found in original LZO 2.02
+   code. Some additional changes have also been made to make it work
+   in kernel space.
+
+   Nitin Gupta
+   <[EMAIL PROTECTED]>
+ */
+
+#ifndef __LZO1X_H
+#define __LZO1X_H
+
+/*

Re: [RFC] LZO de/compression support - take 6

2007-05-31 Thread Nitin Gupta


Hi,

FYI...

The author (Markus Oberhumer) of LZO  provided these comments for this patch:

---
I've only briefly looked over it, but it's obvious that your version does not
work on architechtures which do not allow unaligned access (arm, mips, ...).

As for further quality assurance, your version should generate byte-identical
object code to LZO 2.02 when using the same compiler  flags. So you could do
some semi-automatic testing by compiling with -ffunction-sections and use
objcopy --only-section .text.XXX to compare the md5sum of all generated
functions. This also works fine with crosscompilers.

Finally I'm not too happy that you renamed the functions and #defines like
LZO1X_WORKMEM_SIZE - please stay compatible with the official library version.


Still a lot to do...

- Nitin


On 5/28/07, Nitin Gupta [EMAIL PROTECTED] wrote:

This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe
version only).

* Changes since 'take 5' (Full Changelog after this):
- Added compressor and decomrpesssor as separate and hidden config
options (default: n)
- Cleanups: removed LZO_CHECK_MPOS_NON_DET macro
- removed PTR* macros.

* Benchmarks: (system: P4 3GHz, 1G RAM)
(Using tester program from Daniel)

Following compares this kernel port ('take 6') vs original miniLZO code:

'TinyLZO' refers to this kernel port.

1 run averages:
'Tiny LZO':
   Combined: 61.2223 usec
   Compression: 41.8412 usec
   Decompression: 19.3811 usec
'miniLZO':
   Combined: 66.0444 usec
   Compression: 46.6323 usec
   Decompression: 19.4121 usec

Result:
Overall: TinyLZO is 7.3% faster
Compressor: TinyLZO is 10.2% faster
Decompressor: TinyLZO is 0.15% faster

TODO: test 'take 6' port on archs other than x86(_32)

* Changelog vs. original LZO:
1) Used standard/kernel defined data types: (this eliminated _huge_
#ifdef chunks)
  lzo_bytep - unsigned char *
  lzo_uint - size_t
  lzo_xint - size_t
  lzo_uint32p - u32 *
  lzo_uintptr_t  - unsigned long
2) Removed everything #ifdef'ed under COPY_DICT (this is not set for
LZO1X, so removed corres. parts).
3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants.
4) Reformatted the code to match general kernel style.
5) The only code change: (as suggested by Andrey)
-#if defined(__LITTLE_ENDIAN)
- m_pos = op - 1;
- m_pos -= (*(const unsigned short *)ip)  2;
-#else
- m_pos = op - 1;
- m_pos -= (ip[0]  2) + (ip[1]  6);
-#endif

+ m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip)  2);

(Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
*** Need testing on big endian machine ***

Similarly:
-#if defined(__LITTLE_ENDIAN)
- m_pos -= (*(const unsigned short *)ip)  2;
-#else
- m_pos -= (ip[0]  2) + (ip[1]  6);
-#endif

+ m_pos -= cpu_to_le16(*(const u16 *)ip)  2;

6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros.


I think it's now ready for mainline inclusion.


 include/linux/lzo1x.h|   66 +++
 lib/Kconfig  |6 +
 lib/Makefile |2 +
 lib/lzo1x/Makefile   |2 +
 lib/lzo1x/lzo1x_compress.c   |  259 ++
 lib/lzo1x/lzo1x_decompress.c |  238 ++
 lib/lzo1x/lzo1x_int.h|   85 ++
 7 files changed, 658 insertions(+), 0 deletions(-)

Signed-off-by: Nitin Gupta nitingupta910 [at] gmail [dot] com
---
diff --git a/include/linux/lzo1x.h b/include/linux/lzo1x.h
new file mode 100755
index 000..11a6f23
--- /dev/null
+++ b/include/linux/lzo1x.h
@@ -0,0 +1,66 @@
+/* lzo1x.h -- public interface of the LZO1X compression algorithm
+
+   This file is part of the LZO real-time data compression library.
+
+   Copyright (C) 1996-2005 Markus Franz Xaver Johannes Oberhumer
+   All Rights Reserved.
+
+   The LZO library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License,
+   version 2, as published by the Free Software Foundation.
+
+   The LZO library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with the LZO library; see the file COPYING.
+   If not, write to the Free Software Foundation, Inc.,
+   51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+   Markus F.X.J. Oberhumer
+   [EMAIL PROTECTED]
+   http://www.oberhumer.com/opensource/lzo/
+
+
+   This file is modified version of lzo1x.h found in original LZO 2.02
+   code. Some additional changes have also been made to make it work
+   in kernel space.
+
+   Nitin Gupta
+   [EMAIL PROTECTED]
+ */
+
+#ifndef __LZO1X_H
+#define __LZO1X_H
+

Re: [RFC] LZO de/compression support - take 6

2007-05-31 Thread Satyam Sharma


On 5/31/07, Nitin Gupta [EMAIL PROTECTED] wrote:

[...]
The author (Markus Oberhumer) of LZO  provided these comments for this patch:

---
I've only briefly looked over it, but it's obvious that your version does not
work on architechtures which do not allow unaligned access (arm, mips, ...).

As for further quality assurance, your version should generate byte-identical
object code to LZO 2.02 when using the same compiler  flags. So you could do
some semi-automatic testing by compiling with -ffunction-sections and use
objcopy --only-section .text.XXX to compare the md5sum of all generated
functions. This also works fine with crosscompilers.

Finally I'm not too happy that you renamed the functions and #defines like
LZO1X_WORKMEM_SIZE - please stay compatible with the official library version.


As suggested by Johannes earlier, it'd be great if you could submit
the various changes (as per Changelog) as _individual patches_ on
the original userspace code. That would be easier for others to review,
and there's lesser chances of bugs / issues leaking in that way.

As for byte-identical object code, I definitely do *not* think it is
necessarily a requirement / good idea. As long as all the changes
you make are reviewed individually / closely by people here on this
list, there's very low chances of any bugs creeping in. [ F.e. I see
nothing wrong in removing the usage of register -- that could clearly
lead to different object code, but with no bugs introduced. ]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-31 Thread Daniel Hazelton

On Wednesday 30 May 2007 19:02:28 Mark Adler wrote:
 On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:
  [1] For your reference, here is the user code in question:

 ...

 if (srclen  2  !(data_in[1]  PRESET_DICT) 
  ((data_in[0]  0x0f) == Z_DEFLATED) 
  !(((data_in[0]8) + data_in[1]) % 31)) {

 The funny thing here is that the author felt compelled to use a
 #defined constant for the dictionary bit (PRESET_DICT), but had no
 problem with a numeric constant to isolate the compression method
 (0x0f), or for that matter extracting the window bits from the
 header.  The easy way to avoid the use of an internal zlib header
 file here is to simply replace PRESET_DICT with 0x20.  That constant
 will never change -- it is part of the definition of the zlib header
 in RFC 1950.

If there is no objection, I'll put together a patch that changes the use in 
JFFS2 into a magic number, complete with documentation on it, and also 
moves all of the zlib stuff into a single directory.

 The slightly more involved patch to avoid the problem is to let
 inflate() do all that work for you, including the integrity check on
 the zlib header (% 31).  Also this corrects an error in the original
 code, which is that it continues to try to decompress after finding
 that a dictionary is needed or that the zlib header is invalid.  In
 this version, a bad header simply returns an error:


Does anyone know if doing as Mark suggests would negatively impact the 
performance of JFFS2 to such a degree that it could be considered a 
regression? I, unfortunately, don't have the hardware to properly test the 
code. And, at this point in time, I also don't have enough familiarity with 
the JFFS2 code to make such a change myself. 

DRH

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Daniel Hazelton

On Wednesday 30 May 2007 19:02:28 Mark Adler wrote:
> On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:
> > [1] For your reference, here is the user code in question:
>
> ...
>
> >if (srclen > 2 && !(data_in[1] & PRESET_DICT) &&
> > ((data_in[0] & 0x0f) == Z_DEFLATED) &&
> > !(((data_in[0]<<8) + data_in[1]) % 31)) {
>
> The funny thing here is that the author felt compelled to use a
> #defined constant for the dictionary bit (PRESET_DICT), but had no
> problem with a numeric constant to isolate the compression method
> (0x0f), or for that matter extracting the window bits from the
> header.  The easy way to avoid the use of an internal zlib header
> file here is to simply replace PRESET_DICT with 0x20.  That constant
> will never change -- it is part of the definition of the zlib header
> in RFC 1950.

That was one of my original suggestions, though I personally downchecked it 
(and believe I also did so in the mail with the suggestions) because of the 
dislike of "Magic Numbers" in the kernel. However, I had only looked at a 
small part of the JFFS2 code to see what it used zutil.h for - if it is using 
said 'Magic Numbers' in the rest of the code, the consistent way to fix this 
would be to replace the PRESET_DICT macro with the constant it stands for.

I've gotten a bit caught-up in extending the test-bed code I used to benchmark 
Nitin's 'tinyLZO' implementation, but I will see about getting a patch 
together to change JFFS2 in the manner Mark suggests. (Unless, of course, 
someone gets to it before I do).

DRH

>
> The slightly more involved patch to avoid the problem is to let
> inflate() do all that work for you, including the integrity check on
> the zlib header (% 31).  Also this corrects an error in the original
> code, which is that it continues to try to decompress after finding
> that a dictionary is needed or that the zlib header is invalid.  In
> this version, a bad header simply returns an error:
>
>  /* provide input data and output space */
>  inf_strm.next_in = data_in;
>  inf_strm.avail_in = srclen;
>  inf_strm.next_out = cpage_out;
>  inf_strm.avail_out = destlen;
>
>  /* verify and skip zlib header (this updates next_in and
> avail_in) */
>  inf_strm.zalloc = Z_NULL;
>  inf_strm.zfree = Z_NULL;
>  inf_strm.opaque = Z_NULL;
>  if (zlib_inflateInit(_strm) != Z_OK) {
>  printk(KERN_WARNING "inflateInit failed\n");
>  return 1;
>  }
>  ret = zlib_inflate(_strm, Z_BLOCK);
>  if (ret != Z_OK || (inf_strm.data_type & 0x80) == 0) {
>  printk(KERN_WARNING "inflate failed on zlib header\n");
>  return 1;
>  }
>  zlib_inflateEnd(_strm);
>
>  /* do raw inflate (no adler32) on deflate data after zlib header */
>  inf_strm.zalloc = Z_NULL;
>  inf_strm.zfree = Z_NULL;
>  inf_strm.opaque = Z_NULL;
>  if (zlib_inflateInit2(_strm, -MAX_WBITS) != Z_OK) {
>  printk(KERN_WARNING "inflateInit failed\n");
>  return 1;
>  }
>  while ((ret = zlib_inflate ...
>
> Mark


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Mark Adler


On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:

[1] For your reference, here is the user code in question:

...

   if (srclen > 2 && !(data_in[1] & PRESET_DICT) &&
((data_in[0] & 0x0f) == Z_DEFLATED) &&
!(((data_in[0]<<8) + data_in[1]) % 31)) {


The funny thing here is that the author felt compelled to use a  
#defined constant for the dictionary bit (PRESET_DICT), but had no  
problem with a numeric constant to isolate the compression method  
(0x0f), or for that matter extracting the window bits from the  
header.  The easy way to avoid the use of an internal zlib header  
file here is to simply replace PRESET_DICT with 0x20.  That constant  
will never change -- it is part of the definition of the zlib header  
in RFC 1950.


The slightly more involved patch to avoid the problem is to let  
inflate() do all that work for you, including the integrity check on  
the zlib header (% 31).  Also this corrects an error in the original  
code, which is that it continues to try to decompress after finding  
that a dictionary is needed or that the zlib header is invalid.  In  
this version, a bad header simply returns an error:


/* provide input data and output space */
inf_strm.next_in = data_in;
inf_strm.avail_in = srclen;
inf_strm.next_out = cpage_out;
inf_strm.avail_out = destlen;

/* verify and skip zlib header (this updates next_in and  
avail_in) */

inf_strm.zalloc = Z_NULL;
inf_strm.zfree = Z_NULL;
inf_strm.opaque = Z_NULL;
if (zlib_inflateInit(_strm) != Z_OK) {
printk(KERN_WARNING "inflateInit failed\n");
return 1;
}
ret = zlib_inflate(_strm, Z_BLOCK);
if (ret != Z_OK || (inf_strm.data_type & 0x80) == 0) {
printk(KERN_WARNING "inflate failed on zlib header\n");
return 1;
}
zlib_inflateEnd(_strm);

/* do raw inflate (no adler32) on deflate data after zlib header */
inf_strm.zalloc = Z_NULL;
inf_strm.zfree = Z_NULL;
inf_strm.opaque = Z_NULL;
if (zlib_inflateInit2(_strm, -MAX_WBITS) != Z_OK) {
printk(KERN_WARNING "inflateInit failed\n");
return 1;
}
while ((ret = zlib_inflate ...

Mark

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Artem Bityutskiy


Satyam Sharma wrote:

Hi Artem,

On 5/30/07, Artem Bityutskiy <[EMAIL PROTECTED]> wrote:

Err, and important note is that it also wants this compressed data to be
independently uncompressable.


Hmm, okay.

But what I meant was that if jffs2's needs are "standard" enough in
the sense that they could conceivably be required by other users too
(and this one mentioned by you does appear to be one of those),
then why not make such hacks (if they are necessary and suitable
indeed, lets wait for Mark's response) part of zlib itself?


I really not sure - I vaguely remember that zlib just cannot provide this
capability. And what JFFS2 does is not always correct. But again, it was
so long time ago...

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Satyam Sharma


Hi Artem,

On 5/30/07, Artem Bityutskiy <[EMAIL PROTECTED]> wrote:

Err, and important note is that it also wants this compressed data to be
independently uncompressable.


Hmm, okay.

But what I meant was that if jffs2's needs are "standard" enough in
the sense that they could conceivably be required by other users too
(and this one mentioned by you does appear to be one of those),
then why not make such hacks (if they are necessary and suitable
indeed, lets wait for Mark's response) part of zlib itself?

Thanks,
Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Artem Bityutskiy


Artem Bityutskiy wrote:
JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and 
it wants
zlib to compress as much as possible from the input buffer, and make the 
output

buffer full or nearly full of compressed data.


Err, and important note is that it also wants this compressed data to be
independently uncompressable.


--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Satyam Sharma

On 5/30/07, Johannes Stezenbach <[EMAIL PROTECTED]> wrote:

On Wed, May 30, 2007, Nitin Gupta wrote:
>
> Again, all the original code has been retained _as-is_. Whatever was
> changed, has been mentioned in that detailed changelog that I post
> along with patch.

Just a general remark (I haven't been following this thread closely):

IMHO it would be _much_ better to merge the original code and
your changes as seperate patches. Then someone who
wants to review it later doesn't have to jump through all
the hoops of finding the original code himself to diff it
and see your changes.

Additionally, you should also split stylistic/cleanup
changes like "Reformatted the code to match general kernel style"
from functional changes like "use cpu_to_le16()".

Ideally each of the changes you mention in your
"Changelog vs. original LZO" should be a seperate
patch, this would make review much easier.

I violently agree with this method of going forward.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Johannes Stezenbach

On Wed, May 30, 2007, Nitin Gupta wrote:
> 
> Again, all the original code has been retained _as-is_. Whatever was
> changed, has been mentioned in that detailed changelog that I post
> along with patch.

Just a general remark (I haven't been following this thread closely):

IMHO it would be _much_ better to merge the original code and
your changes as seperate patches. Then someone who
wants to review it later doesn't have to jump through all
the hoops of finding the original code himself to diff it
and see your changes.

Additionally, you should also split stylistic/cleanup
changes like "Reformatted the code to match general kernel style"
from functional changes like "use cpu_to_le16()".

Ideally each of the changes you mention in your
"Changelog vs. original LZO" should be a seperate
patch, this would make review much easier.

Regards,
Johannes
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Artem Bityutskiy


Satyam Sharma wrote:

Hmmm, either jffs2 thinks the zlib interfaces exposed through
include/linux/zlib.h are insufficient, or it isn't using zlib properly
(or at least the way it was supposed to be used :-)



I do not remember many details, but yes, JFFS2 uses zlib trickily.

Traditionally, zlib is used like this: you have an input buffer, and you have
a large enough out put buffer, you compress whole input buffer to the output
buffer.

JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and it wants
zlib to compress as much as possible from the input buffer, and make the output
buffer full or nearly full of compressed data.

This is why JFFS2 uses different hacks, but I do not remember details anymore.
Otherwise it could just use cryptoapi instead. In fact, I tried this 2
years ago, but gave up: http://lkml.org/lkml/2005/3/25/104

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Satyam Sharma

Hi Mark,

On 5/30/07, Mark Adler <[EMAIL PROTECTED]> wrote:

On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:
> skipping some checksum calculation if some
> flag (PRESET_DICT) is absent from the input stream about to
> be decompressed ...

You don't need to dissect the header manually to look for that bit.
If you feed inflate() at least the first two bytes, it will return
immediately with the Z_NEED_DICT return code if a preset dictionary
is requested.  You can force inflate() to return immediately after
decoding the two byte header even if a preset dictionary is not
requested by using the Z_BLOCK flush code.

Thanks for replying, unfortunately I don't know either zlib or jffs2 code
deeply enough to actually understand what you meant here. Are you
saying that the if-else block in question [1] in
fs/jffs2/compr_zlib.c:jffs2_zlib_decompress() is unnecessary and can
be done away with?

Thanks,
Satyam

[1] For your reference, here is the user code in question:

inf_strm.next_in = data_in;
inf_strm.avail_in = srclen;
inf_strm.total_in = 0;
inf_strm.next_out = cpage_out;
inf_strm.avail_out = destlen;
inf_strm.total_out = 0;

int wbits = MAX_WBITS;

/* If it's deflate, and it's got no preset dictionary, then
   we can tell zlib to skip the adler32 check. */

   if (srclen > 2 && !(data_in[1] & PRESET_DICT) &&
((data_in[0] & 0x0f) == Z_DEFLATED) &&
!(((data_in[0]<<8) + data_in[1]) % 31)) {
D2(printk(KERN_DEBUG "inflate skipping adler32\n"));
wbits = -((data_in[0] >> 4) + 8);
inf_strm.next_in += 2;
inf_strm.avail_in -= 2;
} else {
/* Let this remain D1 for now -- it should never happen */
D1(printk(KERN_DEBUG "inflate not skipping adler32\n"));
}

if (zlib_inflateInit2(_strm, wbits) != Z_OK) {
printk(KERN_WARNING "inflateInit failed\n");
return 1;
}
while((ret = zlib_inflate(_strm, Z_FINISH)) == Z_OK)
;
if (ret != Z_STREAM_END)
printk(KERN_NOTICE "inflate returned %d\n", ret);
zlib_inflateEnd(_strm);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Daniel Hazelton

On Wednesday 30 May 2007 01:31:19 Mark Adler wrote:
> On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:
> > skipping some checksum calculation if some
> > flag (PRESET_DICT) is absent from the input stream about to
> > be decompressed ...
>
> You don't need to dissect the header manually to look for that bit.
> If you feed inflate() at least the first two bytes, it will return
> immediately with the Z_NEED_DICT return code if a preset dictionary
> is requested.  You can force inflate() to return immediately after
> decoding the two byte header even if a preset dictionary is not
> requested by using the Z_BLOCK flush code.
>
> Mark

I think that JFFS2 is doing the work of looking for PRESET_DICT itself not 
only to skip the checksum calc, but also to lose the function-call overhead. 
Yes, the performance difference shouldn't be all that great, but might have 
made a difference in some test. When I'm done working on the 
benchmark/testbed for the LZO code proposed for inclusion in the kernel I'll 
do some testing and see how large the difference is in userspace. (for most 
code the differences hold true, though usually with less drastic numbers, in 
kernel)

DRH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Nitin Gupta


On 5/30/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:


On May 30 2007 11:24, Nitin Gupta wrote:
>>
>> It is the bytestream (ip) that is reinterpreted as uint16_t.
>> And I really doubt that the LZO author has a big-endian machine,
>> given the days of ubiquitous x86.
>
>> le16_to_cpu it is.
>
> But then why you think it should be
> le_16_cpu() -- how will this make any difference?

Like I said, we are reinterpreting the byte stream as a uint16_t
(= reading from bytestream to CPU). While writing out an uint16_t
as a bytestream is cpu_to_le16.



I didn't properly understand this. But you seem to be correct - now I
ran sparse check:

* using cpu_to_le16()
lib/lzo1x/lzo1x_decompress.c:141:35: error: incompatible types for
operation (>>)
lib/lzo1x/lzo1x_decompress.c:141:35:left side has type restricted
unsigned short [usertype] [force] 
lib/lzo1x/lzo1x_decompress.c:141:35:right side has type int
lib/lzo1x/lzo1x_decompress.c:140:20: error: incompatible types for operation (-)
lib/lzo1x/lzo1x_decompress.c:140:20:left side has type unsigned
char *register [assigned] op
lib/lzo1x/lzo1x_decompress.c:140:20:right side has type bad type
lib/lzo1x/lzo1x_decompress.c:157:35: error: incompatible types for
operation (>>)
lib/lzo1x/lzo1x_decompress.c:157:35:left side has type restricted
unsigned short [usertype] [force] 
lib/lzo1x/lzo1x_decompress.c:157:35:right side has type int
lib/lzo1x/lzo1x_decompress.c:156:11: error: incompatible types for
operation (-=)


* using le16_to_cpu()
lib/lzo1x/lzo1x_decompress.c:140:23: warning: cast to restricted type
lib/lzo1x/lzo1x_decompress.c:156:14: warning: cast to restricted type

But still, how do I get rid of these two warnings? or, do these
warning suggest some real problem that still exist in code?

- Nitin



> For your ref (from big_endian.h):
> # define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
> # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))


Jan
--


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Jan Engelhardt


On May 30 2007 11:24, Nitin Gupta wrote:
>> 
>> It is the bytestream (ip) that is reinterpreted as uint16_t.
>> And I really doubt that the LZO author has a big-endian machine,
>> given the days of ubiquitous x86.
>
>> le16_to_cpu it is.
>
> But then why you think it should be
> le_16_cpu() -- how will this make any difference?

Like I said, we are reinterpreting the byte stream as a uint16_t
(= reading from bytestream to CPU). While writing out an uint16_t
as a bytestream is cpu_to_le16.

> For your ref (from big_endian.h):
> # define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
> # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))


Jan
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Mark Adler


On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:

[1] For your reference, here is the user code in question:

...

   if (srclen  2  !(data_in[1]  PRESET_DICT) 
((data_in[0]  0x0f) == Z_DEFLATED) 
!(((data_in[0]8) + data_in[1]) % 31)) {


The funny thing here is that the author felt compelled to use a  
#defined constant for the dictionary bit (PRESET_DICT), but had no  
problem with a numeric constant to isolate the compression method  
(0x0f), or for that matter extracting the window bits from the  
header.  The easy way to avoid the use of an internal zlib header  
file here is to simply replace PRESET_DICT with 0x20.  That constant  
will never change -- it is part of the definition of the zlib header  
in RFC 1950.


The slightly more involved patch to avoid the problem is to let  
inflate() do all that work for you, including the integrity check on  
the zlib header (% 31).  Also this corrects an error in the original  
code, which is that it continues to try to decompress after finding  
that a dictionary is needed or that the zlib header is invalid.  In  
this version, a bad header simply returns an error:


/* provide input data and output space */
inf_strm.next_in = data_in;
inf_strm.avail_in = srclen;
inf_strm.next_out = cpage_out;
inf_strm.avail_out = destlen;

/* verify and skip zlib header (this updates next_in and  
avail_in) */

inf_strm.zalloc = Z_NULL;
inf_strm.zfree = Z_NULL;
inf_strm.opaque = Z_NULL;
if (zlib_inflateInit(inf_strm) != Z_OK) {
printk(KERN_WARNING inflateInit failed\n);
return 1;
}
ret = zlib_inflate(inf_strm, Z_BLOCK);
if (ret != Z_OK || (inf_strm.data_type  0x80) == 0) {
printk(KERN_WARNING inflate failed on zlib header\n);
return 1;
}
zlib_inflateEnd(inf_strm);

/* do raw inflate (no adler32) on deflate data after zlib header */
inf_strm.zalloc = Z_NULL;
inf_strm.zfree = Z_NULL;
inf_strm.opaque = Z_NULL;
if (zlib_inflateInit2(inf_strm, -MAX_WBITS) != Z_OK) {
printk(KERN_WARNING inflateInit failed\n);
return 1;
}
while ((ret = zlib_inflate ...

Mark

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Daniel Hazelton

On Wednesday 30 May 2007 19:02:28 Mark Adler wrote:
 On May 30, 2007, at 6:30 AM, Satyam Sharma wrote:
  [1] For your reference, here is the user code in question:

 ...

 if (srclen  2  !(data_in[1]  PRESET_DICT) 
  ((data_in[0]  0x0f) == Z_DEFLATED) 
  !(((data_in[0]8) + data_in[1]) % 31)) {

 The funny thing here is that the author felt compelled to use a
 #defined constant for the dictionary bit (PRESET_DICT), but had no
 problem with a numeric constant to isolate the compression method
 (0x0f), or for that matter extracting the window bits from the
 header.  The easy way to avoid the use of an internal zlib header
 file here is to simply replace PRESET_DICT with 0x20.  That constant
 will never change -- it is part of the definition of the zlib header
 in RFC 1950.

That was one of my original suggestions, though I personally downchecked it 
(and believe I also did so in the mail with the suggestions) because of the 
dislike of Magic Numbers in the kernel. However, I had only looked at a 
small part of the JFFS2 code to see what it used zutil.h for - if it is using 
said 'Magic Numbers' in the rest of the code, the consistent way to fix this 
would be to replace the PRESET_DICT macro with the constant it stands for.

I've gotten a bit caught-up in extending the test-bed code I used to benchmark 
Nitin's 'tinyLZO' implementation, but I will see about getting a patch 
together to change JFFS2 in the manner Mark suggests. (Unless, of course, 
someone gets to it before I do).

DRH


 The slightly more involved patch to avoid the problem is to let
 inflate() do all that work for you, including the integrity check on
 the zlib header (% 31).  Also this corrects an error in the original
 code, which is that it continues to try to decompress after finding
 that a dictionary is needed or that the zlib header is invalid.  In
 this version, a bad header simply returns an error:

  /* provide input data and output space */
  inf_strm.next_in = data_in;
  inf_strm.avail_in = srclen;
  inf_strm.next_out = cpage_out;
  inf_strm.avail_out = destlen;

  /* verify and skip zlib header (this updates next_in and
 avail_in) */
  inf_strm.zalloc = Z_NULL;
  inf_strm.zfree = Z_NULL;
  inf_strm.opaque = Z_NULL;
  if (zlib_inflateInit(inf_strm) != Z_OK) {
  printk(KERN_WARNING inflateInit failed\n);
  return 1;
  }
  ret = zlib_inflate(inf_strm, Z_BLOCK);
  if (ret != Z_OK || (inf_strm.data_type  0x80) == 0) {
  printk(KERN_WARNING inflate failed on zlib header\n);
  return 1;
  }
  zlib_inflateEnd(inf_strm);

  /* do raw inflate (no adler32) on deflate data after zlib header */
  inf_strm.zalloc = Z_NULL;
  inf_strm.zfree = Z_NULL;
  inf_strm.opaque = Z_NULL;
  if (zlib_inflateInit2(inf_strm, -MAX_WBITS) != Z_OK) {
  printk(KERN_WARNING inflateInit failed\n);
  return 1;
  }
  while ((ret = zlib_inflate ...

 Mark


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Jan Engelhardt


On May 30 2007 11:24, Nitin Gupta wrote:
 
 It is the bytestream (ip) that is reinterpreted as uint16_t.
 And I really doubt that the LZO author has a big-endian machine,
 given the days of ubiquitous x86.

 le16_to_cpu it is.

 But then why you think it should be
 le_16_cpu() -- how will this make any difference?

Like I said, we are reinterpreting the byte stream as a uint16_t
(= reading from bytestream to CPU). While writing out an uint16_t
as a bytestream is cpu_to_le16.

 For your ref (from big_endian.h):
 # define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
 # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))


Jan
-- 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Nitin Gupta


On 5/30/07, Jan Engelhardt [EMAIL PROTECTED] wrote:


On May 30 2007 11:24, Nitin Gupta wrote:

 It is the bytestream (ip) that is reinterpreted as uint16_t.
 And I really doubt that the LZO author has a big-endian machine,
 given the days of ubiquitous x86.

 le16_to_cpu it is.

 But then why you think it should be
 le_16_cpu() -- how will this make any difference?

Like I said, we are reinterpreting the byte stream as a uint16_t
(= reading from bytestream to CPU). While writing out an uint16_t
as a bytestream is cpu_to_le16.



I didn't properly understand this. But you seem to be correct - now I
ran sparse check:

* using cpu_to_le16()
lib/lzo1x/lzo1x_decompress.c:141:35: error: incompatible types for
operation ()
lib/lzo1x/lzo1x_decompress.c:141:35:left side has type restricted
unsigned short [usertype] [force] noident
lib/lzo1x/lzo1x_decompress.c:141:35:right side has type int
lib/lzo1x/lzo1x_decompress.c:140:20: error: incompatible types for operation (-)
lib/lzo1x/lzo1x_decompress.c:140:20:left side has type unsigned
char *register [assigned] op
lib/lzo1x/lzo1x_decompress.c:140:20:right side has type bad type
lib/lzo1x/lzo1x_decompress.c:157:35: error: incompatible types for
operation ()
lib/lzo1x/lzo1x_decompress.c:157:35:left side has type restricted
unsigned short [usertype] [force] noident
lib/lzo1x/lzo1x_decompress.c:157:35:right side has type int
lib/lzo1x/lzo1x_decompress.c:156:11: error: incompatible types for
operation (-=)


* using le16_to_cpu()
lib/lzo1x/lzo1x_decompress.c:140:23: warning: cast to restricted type
lib/lzo1x/lzo1x_decompress.c:156:14: warning: cast to restricted type

But still, how do I get rid of these two warnings? or, do these
warning suggest some real problem that still exist in code?

- Nitin



 For your ref (from big_endian.h):
 # define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
 # define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))


Jan
--


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Daniel Hazelton

On Wednesday 30 May 2007 01:31:19 Mark Adler wrote:
 On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:
  skipping some checksum calculation if some
  flag (PRESET_DICT) is absent from the input stream about to
  be decompressed ...

 You don't need to dissect the header manually to look for that bit.
 If you feed inflate() at least the first two bytes, it will return
 immediately with the Z_NEED_DICT return code if a preset dictionary
 is requested.  You can force inflate() to return immediately after
 decoding the two byte header even if a preset dictionary is not
 requested by using the Z_BLOCK flush code.

 Mark

I think that JFFS2 is doing the work of looking for PRESET_DICT itself not 
only to skip the checksum calc, but also to lose the function-call overhead. 
Yes, the performance difference shouldn't be all that great, but might have 
made a difference in some test. When I'm done working on the 
benchmark/testbed for the LZO code proposed for inclusion in the kernel I'll 
do some testing and see how large the difference is in userspace. (for most 
code the differences hold true, though usually with less drastic numbers, in 
kernel)

DRH
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Satyam Sharma


Hi Mark,

On 5/30/07, Mark Adler [EMAIL PROTECTED] wrote:

On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:
 skipping some checksum calculation if some
 flag (PRESET_DICT) is absent from the input stream about to
 be decompressed ...

You don't need to dissect the header manually to look for that bit.
If you feed inflate() at least the first two bytes, it will return
immediately with the Z_NEED_DICT return code if a preset dictionary
is requested.  You can force inflate() to return immediately after
decoding the two byte header even if a preset dictionary is not
requested by using the Z_BLOCK flush code.


Thanks for replying, unfortunately I don't know either zlib or jffs2 code
deeply enough to actually understand what you meant here. Are you
saying that the if-else block in question [1] in
fs/jffs2/compr_zlib.c:jffs2_zlib_decompress() is unnecessary and can
be done away with?

Thanks,
Satyam

[1] For your reference, here is the user code in question:

inf_strm.next_in = data_in;
inf_strm.avail_in = srclen;
inf_strm.total_in = 0;
inf_strm.next_out = cpage_out;
inf_strm.avail_out = destlen;
inf_strm.total_out = 0;

int wbits = MAX_WBITS;

/* If it's deflate, and it's got no preset dictionary, then
   we can tell zlib to skip the adler32 check. */

   if (srclen  2  !(data_in[1]  PRESET_DICT) 
((data_in[0]  0x0f) == Z_DEFLATED) 
!(((data_in[0]8) + data_in[1]) % 31)) {
D2(printk(KERN_DEBUG inflate skipping adler32\n));
wbits = -((data_in[0]  4) + 8);
inf_strm.next_in += 2;
inf_strm.avail_in -= 2;
} else {
/* Let this remain D1 for now -- it should never happen */
D1(printk(KERN_DEBUG inflate not skipping adler32\n));
}

if (zlib_inflateInit2(inf_strm, wbits) != Z_OK) {
printk(KERN_WARNING inflateInit failed\n);
return 1;
}
while((ret = zlib_inflate(inf_strm, Z_FINISH)) == Z_OK)
;
if (ret != Z_STREAM_END)
printk(KERN_NOTICE inflate returned %d\n, ret);
zlib_inflateEnd(inf_strm);
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Artem Bityutskiy


Satyam Sharma wrote:

Hmmm, either jffs2 thinks the zlib interfaces exposed through
include/linux/zlib.h are insufficient, or it isn't using zlib properly
(or at least the way it was supposed to be used :-)



I do not remember many details, but yes, JFFS2 uses zlib trickily.

Traditionally, zlib is used like this: you have an input buffer, and you have
a large enough out put buffer, you compress whole input buffer to the output
buffer.

JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and it wants
zlib to compress as much as possible from the input buffer, and make the output
buffer full or nearly full of compressed data.

This is why JFFS2 uses different hacks, but I do not remember details anymore.
Otherwise it could just use cryptoapi instead. In fact, I tried this 2
years ago, but gave up: http://lkml.org/lkml/2005/3/25/104

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Johannes Stezenbach

On Wed, May 30, 2007, Nitin Gupta wrote:
 
 Again, all the original code has been retained _as-is_. Whatever was
 changed, has been mentioned in that detailed changelog that I post
 along with patch.

Just a general remark (I haven't been following this thread closely):

IMHO it would be _much_ better to merge the original code and
your changes as seperate patches. Then someone who
wants to review it later doesn't have to jump through all
the hoops of finding the original code himself to diff it
and see your changes.

Additionally, you should also split stylistic/cleanup
changes like Reformatted the code to match general kernel style
from functional changes like use cpu_to_le16().

Ideally each of the changes you mention in your
Changelog vs. original LZO should be a seperate
patch, this would make review much easier.


Regards,
Johannes
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-30 Thread Satyam Sharma


On 5/30/07, Johannes Stezenbach [EMAIL PROTECTED] wrote:

On Wed, May 30, 2007, Nitin Gupta wrote:

 Again, all the original code has been retained _as-is_. Whatever was
 changed, has been mentioned in that detailed changelog that I post
 along with patch.

Just a general remark (I haven't been following this thread closely):

IMHO it would be _much_ better to merge the original code and
your changes as seperate patches. Then someone who
wants to review it later doesn't have to jump through all
the hoops of finding the original code himself to diff it
and see your changes.

Additionally, you should also split stylistic/cleanup
changes like Reformatted the code to match general kernel style
from functional changes like use cpu_to_le16().

Ideally each of the changes you mention in your
Changelog vs. original LZO should be a seperate
patch, this would make review much easier.


I violently agree with this method of going forward.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Artem Bityutskiy


Artem Bityutskiy wrote:
JFFS2 needs: it has _big_ input buffer, and _small_ output buffer, and 
it wants
zlib to compress as much as possible from the input buffer, and make the 
output

buffer full or nearly full of compressed data.


Err, and important note is that it also wants this compressed data to be
independently uncompressable.


--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Satyam Sharma


Hi Artem,

On 5/30/07, Artem Bityutskiy [EMAIL PROTECTED] wrote:

Err, and important note is that it also wants this compressed data to be
independently uncompressable.


Hmm, okay.

But what I meant was that if jffs2's needs are standard enough in
the sense that they could conceivably be required by other users too
(and this one mentioned by you does appear to be one of those),
then why not make such hacks (if they are necessary and suitable
indeed, lets wait for Mark's response) part of zlib itself?

Thanks,
Satyam
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-30 Thread Artem Bityutskiy


Satyam Sharma wrote:

Hi Artem,

On 5/30/07, Artem Bityutskiy [EMAIL PROTECTED] wrote:

Err, and important note is that it also wants this compressed data to be
independently uncompressable.


Hmm, okay.

But what I meant was that if jffs2's needs are standard enough in
the sense that they could conceivably be required by other users too
(and this one mentioned by you does appear to be one of those),
then why not make such hacks (if they are necessary and suitable
indeed, lets wait for Mark's response) part of zlib itself?


I really not sure - I vaguely remember that zlib just cannot provide this
capability. And what JFFS2 does is not always correct. But again, it was
so long time ago...

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Mark Adler


On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:

skipping some checksum calculation if some
flag (PRESET_DICT) is absent from the input stream about to
be decompressed ...


You don't need to dissect the header manually to look for that bit.   
If you feed inflate() at least the first two bytes, it will return  
immediately with the Z_NEED_DICT return code if a preset dictionary  
is requested.  You can force inflate() to return immediately after  
decoding the two byte header even if a preset dictionary is not  
requested by using the Z_BLOCK flush code.


Mark

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta

On 5/30/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:

On May 28 2007 20:04, Nitin Gupta wrote:
>
> * Changelog vs. original LZO:
> 1) Used standard/kernel defined data types: (this eliminated _huge_
> #ifdef chunks)
> lzo_bytep -> unsigned char *
> lzo_uint -> size_t
> lzo_xint -> size_t

Is this safe (as far as compressed LZO stream is concerned) --
or is it even needed (could it be unsigned int)?

> - m_pos -= (*(const unsigned short *)ip) >> 2;
> -#else
> - m_pos = op - 1;
> - m_pos -= (ip[0] >> 2) + (ip[1] << 6);
> -#endif
>
> + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2);
>
> (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
> *** Need testing on big endian machine ***

On i386, both cpu_to_le16 and le16_to_cpu do nothing.
On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-)
they swap 1234<->4321.

It is the bytestream (ip) that is reinterpreted as uint16_t.
And I really doubt that the LZO author has a big-endian machine,
given the days of ubiquitous x86.

le16_to_cpu it is.

I just missed the point that leXX_to_cpu() and cpu_to_leXX() are
identical on BE machine anyway. But then why you think it should be
le_16_cpu() -- how will this make any difference?

For your ref (from big_endian.h):
#define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))

- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta

On 5/30/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote:
>
> On May 28 2007 19:11, Adrian Bunk wrote:
You completely miss the point of my question.

It's about the performance improvements of the modified code that were
mentioned.

What you are talking about shouldn't have any effect on the generated
code.

After Daniel refined this testing program, we can see that perf gain
is < 2% which can surely be accounted to cleanups like unnecessary
castings in tight loops.

Again, all the original code has been retained _as-is_. Whatever was
changed, has been mentioned in that detailed changelog that I post
along with patch.

If someone want to review these 500 lines with this changelog in hand,
it should not take more than couple of hours. If no-one can see any
problem in code by now and considering that it's tested on x86(_32),
amd64, ppc and giving somewhat better perf. than original then I
believe it is unnecessarily hanging outside of -mm tree.

I also contacted author (Markus Oberhumer) regarding above changes but
he seems not be responding right now. But still, if it gets into -mm
and gets used by various related projects then it should be good
enough for mainline also.

Cheers,
Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta


On 5/30/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote:

I just noticed a bug in my testbed/benchmarking code. It's fixed, but I
decided to compare version 6 of the code against the *unsafe* decompressor
again. The results of the three runs I've put it through after changing it to
compare against the unsafe decompressor were startling. `Tiny's` compressor
is still faster - I've seen it be rated up to 3% faster. The decompressor,
OTOH, when compared to the unsafe version (which is the comparison that
started me on this binge of hacking), is more than 7% worse. About 11% slower
on the original test against a C source file, and about 6% slower for random
data.


Unsafe vs safe is within 10%. Its okay.


However, looking at the numbers involved, I can't see a reason to keep
the unsafe version around - the percentages look worse than they are - from 1
to 3 microseconds.


Not just numbers. Most of applications cannot afford to use unsafe
versions anyway (like fs people).

(well, the compressed-cache people might want those extra

usecs - but the difference will never be noticeable anywhere outside the
kernel)

DRH



compressed cache people require every single percent of that
performance. For now, ccaching is not ready for mainline (many things
need to be done). So, till then I will keep off the unsafe version. If
ever compressed caching is on its way to mainline _then_ I will try
and add back the unsafe version. But I see no other project that
really cares about unsafe version so it's okay to keep it off.


- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

I just noticed a bug in my testbed/benchmarking code. It's fixed, but I 
decided to compare version 6 of the code against the *unsafe* decompressor 
again. The results of the three runs I've put it through after changing it to 
compare against the unsafe decompressor were startling. `Tiny's` compressor 
is still faster - I've seen it be rated up to 3% faster. The decompressor, 
OTOH, when compared to the unsafe version (which is the comparison that 
started me on this binge of hacking), is more than 7% worse. About 11% slower 
on the original test against a C source file, and about 6% slower for random 
data. However, looking at the numbers involved, I can't see a reason to keep 
the unsafe version around - the percentages look worse than they are - from 1 
to 3 microseconds. (well, the compressed-cache people might want those extra 
usecs - but the difference will never be noticeable anywhere outside the 
kernel)

DRH


lzo1x-test-6a.tar.bz2
Description: application/tbz

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

All problems I was having with the test-bed code have been solved, and the 
error I was running into was, as I suspected, in the code I used to fill the 
buffer for the random-data test.

Results of running the new benchmark (version 6 of the benchmark, version 6 
of 'tinyLZO'):
1 run averages:
'Tiny LZO':
Combined: 104.1529 usec
Compression: 43.065 usec
Decompression: 18.4648 usec
Random Data Compression: 31.132 usec
Random Data Decompression: 11.4911 usec
'miniLZO':
Combined: 106.1363 usec
Compression: 44.6988 usec
Decompression: 18.3576 usec
Random Data Compression: 31.5772 usec
Random Data Decompression: 11.5027 usec
Percentages (calculated as: ((full - tiny)/full)*100):
Overall (combined totals): Tiny is 1.87 % faster
Compression: Tiny is 3.66 % faster
Decompression: Tiny is 0.58 % slower
Random Compression: Tiny is 1.41 % faster
Random Decompression: Tiny is 0.10 % faster

The results, on my system, are not consistent, except in that 'TinyLZO' is 
generally faster in the non-random data tests than miniLZO. It did, three of 
the five times I ran the test, perform (on average) about 1% worse than 
miniLZO.

In order to sidestep any issues that a difference in the input data might have 
caused, I'm going to rewrite the code so that all tests are run against the 
same data. However, in the meantime, I've attached the latest version of my 
test-code.

DRH
PS: the code is going to massively change as I work to include more data 
sources for the benchmarking, as well as tests that will try to really stress 
the stripped-down code.


lzo1x-test-6.tar.bz2
Description: application/tbz

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Adrian Bunk

On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote:
> 
> On May 28 2007 19:11, Adrian Bunk wrote:
> >
> >I have not seen any explanations:
> >- Why did the upstream author write the code that way?
> 
> I guess it's along the lines of
> - portability
> 
> (note how this contradicts itself). Really. I have yet to
> figure out why everyone invents their own xxx32_t types,
> like e.g. glib. well, integer types, one might understand, but when it
> comes to gchar or gpointer, that's just plain microsoft-style
> (think LPCSTR and LPCVOID...)

You completely miss the point of my question.

It's about the performance improvements of the modified code that were 
mentioned.

What you are talking about shouldn't have any effect on the generated 
code.

>   Jan

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Jan Engelhardt

On May 28 2007 19:11, Adrian Bunk wrote:
>
>I have not seen any explanations:
>- Why did the upstream author write the code that way?

I guess it's along the lines of
- portability

(note how this contradicts itself). Really. I have yet to
figure out why everyone invents their own xxx32_t types,
like e.g. glib. well, integer types, one might understand, but when it
comes to gchar or gpointer, that's just plain microsoft-style
(think LPCSTR and LPCVOID...)

Jan
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Jan Engelhardt

On May 28 2007 20:04, Nitin Gupta wrote:
>
> * Changelog vs. original LZO:
> 1) Used standard/kernel defined data types: (this eliminated _huge_
> #ifdef chunks)
> lzo_bytep -> unsigned char *
> lzo_uint -> size_t
> lzo_xint -> size_t

Is this safe (as far as compressed LZO stream is concerned) --
or is it even needed (could it be unsigned int)? 

> - m_pos -= (*(const unsigned short *)ip) >> 2;
> -#else
> - m_pos = op - 1;
> - m_pos -= (ip[0] >> 2) + (ip[1] << 6);
> -#endif
>
> + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2);
>
> (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
> *** Need testing on big endian machine ***

On i386, both cpu_to_le16 and le16_to_cpu do nothing.
On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-)
they swap 1234<->4321.

It is the bytestream (ip) that is reinterpreted as uint16_t.
And I really doubt that the LZO author has a big-endian machine,
given the days of ubiquitous x86. le16_to_cpu it is.

Jan
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 16:14:34 Daniel Hazelton wrote:
> On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote:
> > On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote:
> > > tested this on ppc and its still good
> > >
> > > is there any reason to bother with a test on amd64?
> > > if there is I might be able to get to it tonight
> >
> > Yes, this test is desired on 'take 6'
> > (In future I will append version to patch bz2)
> >
> > Thanks,
> > Nitin
>
> I added a few functions to my test-bed to see how well the LZO code coped
> with random input data. I'll extend my error checking so that it dumps all
> local variables at the first error, but for the time being I'm staring at a
> problem with what appears to be a double-free (in addition to
> lzo1x_decompress(_safe) screaming about overruns when I pass it the buffer
> of compressed, totally random data).
>
> (I fill the buffer by doing a read from /dev/urandom, though I'll certainly
> change it to use random() if anyone thinks that'll make a difference)
>
> I'll post the version with the random-data tests when I get the double-free
> solved. (I've looked and don't see how its happening)
>
> DRH

After some additional debugging the problem appears to be that both 
lzo1x_decompress_safe (from miniLZO) and lzo1x_decompress (from the 'Tiny' 
version) both are somehow corrupting memory.

*** glibc detected *** ./tinytest: double free or corruption (out): 0x08070e60 
***

The buffer at 0x08070e60 is the one I allocated specifically for the new 
random data tests. It has the same location when it is allocated as it is 
when free()'d, so the corruption has to be coming from somewhere. The problem 
*might* be in my code - that is, I might have made a mistake in the read() 
from /dev/urandom. I'm going to go test that now.

DRH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote:
> On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote:
> > tested this on ppc and its still good
> >
> > is there any reason to bother with a test on amd64?
> > if there is I might be able to get to it tonight
>
> Yes, this test is desired on 'take 6'
> (In future I will append version to patch bz2)
>
> Thanks,
> Nitin

I added a few functions to my test-bed to see how well the LZO code coped with 
random input data. I'll extend my error checking so that it dumps all local 
variables at the first error, but for the time being I'm staring at a problem 
with what appears to be a double-free (in addition to lzo1x_decompress(_safe) 
screaming about overruns when I pass it the buffer of compressed, totally 
random data).

(I fill the buffer by doing a read from /dev/urandom, though I'll certainly 
change it to use random() if anyone thinks that'll make a difference)

I'll post the version with the random-data tests when I get the double-free 
solved. (I've looked and don't see how its happening)

DRH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 11:15:41 Satyam Sharma wrote:
> [ Trimmed Cc list; added original zlib authors. ]
>
> > On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote:
> > > On 29 May 2007, at 12:27, Satyam Sharma wrote:
> > > > Right, actually, zlib could be switched over to [using a common
> > > > directory].
> > > > Because zlib_deflate/ and zlib_inflate/ too share a private header
> > > > zutil.h which has unfortunately been stuck into include/linux/ with
> > > > a big
> > > > /* WARNING: this file should *not* be used by applications. */
> > > > comment ...
> > >
> > > Well, unfortunately zutil.h is currently being used in-kernel by fs/
> > > jffs2/compr_zlib.c despite the "WARNING: this file should *not* be
> > > used by applications" notice.
>
> Hmmm, either jffs2 thinks the zlib interfaces exposed through
> include/linux/zlib.h are insufficient, or it isn't using zlib properly
> (or at least the way it was supposed to be used :-)
>
> Looking at fs/jffs2/compr_zlib.c, however, it just seems to be
> an optimization -- skipping some checksum calculation if some
> flag (PRESET_DICT) is absent from the input stream about to
> be decompressed ...
>
> This /looks/ like an optimization that might make sense for other
> users of zlib too, in which case that entire check and
> skip-the-checksum thing could actually be put into zlib proper. (?)

This *is* a good idea. If someone doesn't beat me to it I'll make sure I've 
got the latest git and do up a patch that does this.

DRH

> > > So moving this header to a truly private location isn't possible
> > > right now, unfortunately,
> > >
> > > Michael-Luke Jones
> > > [added David Whitehouse to cc]
>
> On 5/29/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote:
> > I've looked at that code and it seems to need that file for one constant.
> > Perhaps it'd be better for jffs2/compr_zlib.c to define that constant
> > itself (or use it as a "Magic Number") rather than include the zlib
> > private header. Another possibility would be to move that constant out of
> > zutil.h and into zconf.h or zlih.b - doing any of those would allow the
> > zlib private header to be moved such that zlib could be changed to use a
> > common directory *and* have said private header in that directory.
>
> Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from
> being able to resolve that macro. FWIW, a dirty hack could be to
> simply duplicate it in zlib.h. But then, implementation and interface
> are decoupled for good reasons ...
>
> Satyam


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Satyam Sharma

[ Trimmed Cc list; added original zlib authors. ]

On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote:
> On 29 May 2007, at 12:27, Satyam Sharma wrote:
> > Right, actually, zlib could be switched over to [using a common
> > directory].
> > Because zlib_deflate/ and zlib_inflate/ too share a private header
> > zutil.h which has unfortunately been stuck into include/linux/ with
> > a big
> > /* WARNING: this file should *not* be used by applications. */
> > comment ...
>
> Well, unfortunately zutil.h is currently being used in-kernel by fs/
> jffs2/compr_zlib.c despite the "WARNING: this file should *not* be
> used by applications" notice.

Hmmm, either jffs2 thinks the zlib interfaces exposed through
include/linux/zlib.h are insufficient, or it isn't using zlib properly
(or at least the way it was supposed to be used :-)

Looking at fs/jffs2/compr_zlib.c, however, it just seems to be
an optimization -- skipping some checksum calculation if some
flag (PRESET_DICT) is absent from the input stream about to
be decompressed ...

This /looks/ like an optimization that might make sense for other
users of zlib too, in which case that entire check and
skip-the-checksum thing could actually be put into zlib proper. (?)

> So moving this header to a truly private location isn't possible
> right now, unfortunately,
>
> Michael-Luke Jones
> [added David Whitehouse to cc]

On 5/29/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote:

I've looked at that code and it seems to need that file for one constant.
Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself
(or use it as a "Magic Number") rather than include the zlib private header.
Another possibility would be to move that constant out of zutil.h and into
zconf.h or zlih.b - doing any of those would allow the zlib private header to
be moved such that zlib could be changed to use a common directory *and* have
said private header in that directory.

Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from
being able to resolve that macro. FWIW, a dirty hack could be to
simply duplicate it in zlib.h. But then, implementation and interface
are decoupled for good reasons ...

Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote:
> On 29 May 2007, at 12:27, Satyam Sharma wrote:
> > Right, actually, zlib could be switched over to [using a common
> > directory].
> > Because zlib_deflate/ and zlib_inflate/ too share a private header
> > zutil.h which has unfortunately been stuck into include/linux/ with
> > a big
> > /* WARNING: this file should *not* be used by applications. */
> > comment ...
>
> Well, unfortunately zutil.h is currently being used in-kernel by fs/
> jffs2/compr_zlib.c despite the "WARNING: this file should *not* be
> used by applications" notice.
>
> So moving this header to a truly private location isn't possible
> right now, unfortunately,
>
> Michael-Luke Jones
> [added David Whitehouse to cc]

I've looked at that code and it seems to need that file for one constant. 
Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself 
(or use it as a "Magic Number") rather than include the zlib private header. 
Another possibility would be to move that constant out of zutil.h and into 
zconf.h or zlih.b - doing any of those would allow the zlib private header to 
be moved such that zlib could be changed to use a common directory *and* have 
said private header in that directory.

DRH

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Michael-Luke Jones


On 29 May 2007, at 12:27, Satyam Sharma wrote:

Right, actually, zlib could be switched over to [using a common  
directory].

Because zlib_deflate/ and zlib_inflate/ too share a private header
zutil.h which has unfortunately been stuck into include/linux/ with  
a big
/* WARNING: this file should *not* be used by applications. */  
comment ...


Well, unfortunately zutil.h is currently being used in-kernel by fs/ 
jffs2/compr_zlib.c despite the "WARNING: this file should *not* be  
used by applications" notice.


So moving this header to a truly private location isn't possible  
right now, unfortunately,


Michael-Luke Jones
[added David Whitehouse to cc]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 08:03:55 Nitin Gupta wrote:
> On 5/29/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote:
> > > On 28 May 2007, at 18:11, Adrian Bunk wrote:
> > >> I have not seen any explanations:
> > >> - Why did the upstream author write the code that way?
> > >
> > > Apparently due to his requirement for extreme portability. The original
> > > code was designed to work on everything from 16-bit DOS through CRAY
> > > supercomputers through Windows, Unices and Linux.
> >
> > Sure, this could be the reason in some or all cases.
> >
> > The upstream author knows the code best, and discussing such issues with
> > him will in many cases be a win:
> >
> > It could be that there was in some cases no good reason, and the
> > upstream code that gets used by many other projects could become faster.
> >
> > Or there was a good reason that applies also to the in-kernel version
> > and a change breaks some corner case.
>
> I have mailed the author with detailed changelog - waiting for reply.
>
> > > The author has stated on the thread that it's a good idea to remove
> > > unnecessary ifdefs when porting the code into the kernel, given that
> > > the portability requirements are obviously no longer needed.
> >
> > "remove unnecessary ifdefs" implies "generated code is identical".
> >
> > That's quite different from "code is 10% faster".
>
> Daniel made some changes to his testing code and now the perf gain is just
> 1.6%.
>
> - Nitin

The changes for version 5 are all in one file - helpers.h. All I did was 
define the macros that had previously been NOP's. Most of the performance 
loss, IMHO, is due to the compression and decompression functions being 
marked noinline. The currently reported 1.6% overall performance boost is 
likely related to the use of likely()/unlikely().

DRH

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 01:48:29 Nitin Gupta wrote:
> On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote:
> > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > Attached is tester code used for testing.
> > > (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> > > version for 'TinyLZO')
> > >
> > > Cheers,
> > > Nitin
> > >
> > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> > > > (Using tester program from Daniel)
> > > >
> > > > Following compares this kernel port ('take 6') vs original miniLZO
> > > > code:
> > > >
> > > > 'TinyLZO' refers to this kernel port.
> > > >
> > > > 1 run averages:
> > > > 'Tiny LZO':
> > > >Combined: 61.2223 usec
> > > >Compression: 41.8412 usec
> > > >Decompression: 19.3811 usec
> > > > 'miniLZO':
> > > >Combined: 66.0444 usec
> > > >Compression: 46.6323 usec
> > > >Decompression: 19.4121 usec
> > > >
> > > > Result:
> > > > Overall: TinyLZO is 7.3% faster
> > > > Compressor: TinyLZO is 10.2% faster
> > > > Decompressor: TinyLZO is 0.15% faster
> >
> > 1 run averages:
> > 'Tiny LZO':
> > Combined: 112.6642 usec
> > Compression: 56.3321 usec
> > Decompression: 56.3321 usec
> > 'miniLZO':
> > Combined: 77.6642 usec
> > Compression: 56.3416 usec
> > Decompression: 21.3226 usec
> >
> > now the interesting bit I thought was the following
> > [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest
> > [test_lzo.c::compress (93)] run took 42 microseconds
> > [test_lzo.c::decompress (127)] run took 20 microseconds
> > [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest
> > [test.c::compress (91)] run took 44 microseconds
> > [test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 )
> > [test.c::main (149)] BUG: Decompression routine failure
>
> Did you use x86 for above test? Maybe some problem with testing
> script? What data did you use for this test?
>
>
> - Nitin

Using any version of the test code below 5 on a non-LE system would result in 
this error. Those earlier versions couldn't handle BE systems because I was 
too lazy to define cpu_to_le16. (And yes, Nitin, that does appear to be the 
correct conversion to use)

DRH

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta

On 5/29/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote:
> On 28 May 2007, at 18:11, Adrian Bunk wrote:
>
>> I have not seen any explanations:
>> - Why did the upstream author write the code that way?
>
> Apparently due to his requirement for extreme portability. The original
> code was designed to work on everything from 16-bit DOS through CRAY
> supercomputers through Windows, Unices and Linux.

Sure, this could be the reason in some or all cases.

The upstream author knows the code best, and discussing such issues with
him will in many cases be a win:

It could be that there was in some cases no good reason, and the
upstream code that gets used by many other projects could become faster.

Or there was a good reason that applies also to the in-kernel version
and a change breaks some corner case.

I have mailed the author with detailed changelog - waiting for reply.

> The author has stated on the thread that it's a good idea to remove
> unnecessary ifdefs when porting the code into the kernel, given that the
> portability requirements are obviously no longer needed.

"remove unnecessary ifdefs" implies "generated code is identical".

That's quite different from "code is 10% faster".

Daniel made some changes to his testing code and now the perf gain is just 1.6%.

- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Adrian Bunk

On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote:
> On 28 May 2007, at 18:11, Adrian Bunk wrote:
>
>> I have not seen any explanations:
>> - Why did the upstream author write the code that way?
>
> Apparently due to his requirement for extreme portability. The original 
> code was designed to work on everything from 16-bit DOS through CRAY 
> supercomputers through Windows, Unices and Linux.

Sure, this could be the reason in some or all cases.

The upstream author knows the code best, and discussing such issues with 
him will in many cases be a win:

It could be that there was in some cases no good reason, and the 
upstream code that gets used by many other projects could become faster.

Or there was a good reason that applies also to the in-kernel version 
and a change breaks some corner case.

> The author has stated on the thread that it's a good idea to remove 
> unnecessary ifdefs when porting the code into the kernel, given that the 
> portability requirements are obviously no longer needed.

"remove unnecessary ifdefs" implies "generated code is identical".

That's quite different from "code is 10% faster".

> Michael-Luke

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Satyam Sharma

On 5/29/07, Michael-Luke Jones <[EMAIL PROTECTED]> wrote:

On 29 May 2007, at 11:41, Satyam Sharma wrote:

> This is syntactically correct (and wouldn't produce any build errors),
> but it's quite ... strange, still. Why would I even want to /build/
> the
> compress code if all I selected was the decompress option?

Apologies, you gave me the answer I was looking for (make is
'intelligent' enough not to build the same file twice in this
situation...) but I think you may have missed the makefile in the
lzo1x directory:

> diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile
> new file mode 100644
> index 000..7b56a4d
> --- /dev/null
> +++ b/lib/lzo1x/Makefile
> @@ -0,0 +1,2 @@
> +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o
> +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o

Ah, yes, I only saw the mail and didn't check lib/lzo1x/Makefile, thanks.
So effectively, we've done precisely what zlib does too ... only, in a
smarter way!

Thus, the Kconfig options for compress/decompress won't be simply
'dummy' options...

Given the shared private header between the compress and decompress
code, I don't think there is any need to separate the code into two
directories a la the zlib code.

Right, actually, zlib could be switched over to this style, in fact.
Because zlib_deflate/ and zlib_inflate/ too share a private header
zutil.h which has unfortunately been stuck into include/linux/ with a big
/* WARNING: this file should *not* be used by applications. */ comment ...

Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Michael-Luke Jones


On 29 May 2007, at 11:41, Satyam Sharma wrote:


This is syntactically correct (and wouldn't produce any build errors),
but it's quite ... strange, still. Why would I even want to /build/  
the

compress code if all I selected was the decompress option?


Apologies, you gave me the answer I was looking for (make is  
'intelligent' enough not to build the same file twice in this  
situation...) but I think you may have missed the makefile in the  
lzo1x directory:



diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile
new file mode 100644
index 000..7b56a4d
--- /dev/null
+++ b/lib/lzo1x/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o
+obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o


Thus, the Kconfig options for compress/decompress won't be simply  
'dummy' options...


Given the shared private header between the compress and decompress  
code, I don't think there is any need to separate the code into two  
directories a la the zlib code.


Sorry for noise,

Michael-Luke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Satyam Sharma

On 5/29/07, Michael-Luke Jones <[EMAIL PROTECTED]> wrote:

On 28 May 2007, at 15:34, Nitin Gupta wrote:

> diff --git a/lib/Kconfig b/lib/Kconfig
> index 2e7ae6b..eb95eaa 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -64,6 +64,12 @@ config ZLIB_INFLATE
> config ZLIB_DEFLATE
>   tristate
>
> +config LZO1X_COMPRESS
> + tristate
> +
> +config LZO1X_DECOMPRESS
> + tristate
> +
> #
> # Generic allocator support is selected if needed
> #
> diff --git a/lib/Makefile b/lib/Makefile
> index c8c8e20..448ae37 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o
> obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/
> obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/
> obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
> +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/
> +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/
>
> obj-$(CONFIG_TEXTSEARCH) += textsearch.o
> obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o

Hey there,

Is this syntax OK for Makefile use? I'm only asking out of personal
ignorance and because I'd use:

> +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/
> +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/

This is syntactically correct (and wouldn't produce any build errors),
but it's quite ... strange, still. Why would I even want to /build/ the
compress code if all I selected was the decompress option?

Also, when both the options are y, make would just make the first
target, then recurse back into that same directory again and skip
it because it just made it ... which, although harmless, is ... strange.

So I'm not sure if we really want both the options to resolve to the
same make target -- the kernel typically has more users of
_decompress_ than those of _compress_, so it does make some sense
to completely decouple the two into separate directories in lib/ and have
separate Kconfig options (as well as Makefile target subdirs) for them.

Kconfig:

config LZO1X_COMPRESS
select LZO1X
tristate

config LZO1X_DECOMPRESS
select LZO1X
tristate

config LZO1X
tristate (or boolean??)

Hmmm ... if this is /really/ what we want to do, what's wrong with simply:

config LZO1x
tristate

?

Makefile:

obj-$(CONFIG_LZO1X) += lzo1x/

I find nothing wrong with the style being used for ZLIB presently. Agreed,
users of ZLIB do typically want to select only decompress *without*
compress whereas there's no such user for LZO currently, but I see
nothing wrong in just copying over ZLIB's style for LZO too (i.e. decoupling
the compress and decompress code into separate subdirs in lib/).
And if not, then why have two separate (dummy) Kconfig options either?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Michael-Luke Jones


On 28 May 2007, at 15:34, Nitin Gupta wrote:


diff --git a/lib/Kconfig b/lib/Kconfig
index 2e7ae6b..eb95eaa 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -64,6 +64,12 @@ config ZLIB_INFLATE
config ZLIB_DEFLATE
tristate

+config LZO1X_COMPRESS
+   tristate
+
+config LZO1X_DECOMPRESS
+   tristate
+
#
# Generic allocator support is selected if needed
#
diff --git a/lib/Makefile b/lib/Makefile
index c8c8e20..448ae37 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o
obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/
obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/
obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
+obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/
+obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/

obj-$(CONFIG_TEXTSEARCH) += textsearch.o
obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o


Hey there,

Is this syntax OK for Makefile use? I'm only asking out of personal  
ignorance and because I'd use:


Kconfig:

config LZO1X_COMPRESS
select LZO1X
tristate

config LZO1X_DECOMPRESS
select LZO1X
tristate

config LZO1X
tristate (or boolean??)

Makefile:

obj-$(CONFIG_LZO1X) += lzo1x/

Thanks,

Michael-Luke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Michael-Luke Jones


On 28 May 2007, at 18:11, Adrian Bunk wrote:


I have not seen any explanations:
- Why did the upstream author write the code that way?


Apparently due to his requirement for extreme portability. The  
original code was designed to work on everything from 16-bit DOS  
through CRAY supercomputers through Windows, Unices and Linux.


The author has stated on the thread that it's a good idea to remove  
unnecessary ifdefs when porting the code into the kernel, given that  
the portability requirements are obviously no longer needed.


Michael-Luke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Michael-Luke Jones


On 28 May 2007, at 18:11, Adrian Bunk wrote:


I have not seen any explanations:
- Why did the upstream author write the code that way?


Apparently due to his requirement for extreme portability. The  
original code was designed to work on everything from 16-bit DOS  
through CRAY supercomputers through Windows, Unices and Linux.


The author has stated on the thread that it's a good idea to remove  
unnecessary ifdefs when porting the code into the kernel, given that  
the portability requirements are obviously no longer needed.


Michael-Luke
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Michael-Luke Jones


On 28 May 2007, at 15:34, Nitin Gupta wrote:


diff --git a/lib/Kconfig b/lib/Kconfig
index 2e7ae6b..eb95eaa 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -64,6 +64,12 @@ config ZLIB_INFLATE
config ZLIB_DEFLATE
tristate

+config LZO1X_COMPRESS
+   tristate
+
+config LZO1X_DECOMPRESS
+   tristate
+
#
# Generic allocator support is selected if needed
#
diff --git a/lib/Makefile b/lib/Makefile
index c8c8e20..448ae37 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o
obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/
obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/
obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
+obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/
+obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/

obj-$(CONFIG_TEXTSEARCH) += textsearch.o
obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o


Hey there,

Is this syntax OK for Makefile use? I'm only asking out of personal  
ignorance and because I'd use:


Kconfig:

config LZO1X_COMPRESS
select LZO1X
tristate

config LZO1X_DECOMPRESS
select LZO1X
tristate

config LZO1X
tristate (or boolean??)

Makefile:

obj-$(CONFIG_LZO1X) += lzo1x/

Thanks,

Michael-Luke
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Satyam Sharma


On 5/29/07, Michael-Luke Jones [EMAIL PROTECTED] wrote:

On 28 May 2007, at 15:34, Nitin Gupta wrote:

 diff --git a/lib/Kconfig b/lib/Kconfig
 index 2e7ae6b..eb95eaa 100644
 --- a/lib/Kconfig
 +++ b/lib/Kconfig
 @@ -64,6 +64,12 @@ config ZLIB_INFLATE
 config ZLIB_DEFLATE
   tristate

 +config LZO1X_COMPRESS
 + tristate
 +
 +config LZO1X_DECOMPRESS
 + tristate
 +
 #
 # Generic allocator support is selected if needed
 #
 diff --git a/lib/Makefile b/lib/Makefile
 index c8c8e20..448ae37 100644
 --- a/lib/Makefile
 +++ b/lib/Makefile
 @@ -49,6 +49,8 @@ obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o
 obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/
 obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/
 obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
 +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/
 +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/

 obj-$(CONFIG_TEXTSEARCH) += textsearch.o
 obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o

Hey there,

Is this syntax OK for Makefile use? I'm only asking out of personal
ignorance and because I'd use:



 +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x/
 +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x/


This is syntactically correct (and wouldn't produce any build errors),
but it's quite ... strange, still. Why would I even want to /build/ the
compress code if all I selected was the decompress option?

Also, when both the options are y, make would just make the first
target, then recurse back into that same directory again and skip
it because it just made it ... which, although harmless, is ... strange.

So I'm not sure if we really want both the options to resolve to the
same make target -- the kernel typically has more users of
_decompress_ than those of _compress_, so it does make some sense
to completely decouple the two into separate directories in lib/ and have
separate Kconfig options (as well as Makefile target subdirs) for them.


Kconfig:

config LZO1X_COMPRESS
select LZO1X
tristate

config LZO1X_DECOMPRESS
select LZO1X
tristate

config LZO1X
tristate (or boolean??)


Hmmm ... if this is /really/ what we want to do, what's wrong with simply:

config LZO1x
tristate

?


Makefile:

obj-$(CONFIG_LZO1X) += lzo1x/


I find nothing wrong with the style being used for ZLIB presently. Agreed,
users of ZLIB do typically want to select only decompress *without*
compress whereas there's no such user for LZO currently, but I see
nothing wrong in just copying over ZLIB's style for LZO too (i.e. decoupling
the compress and decompress code into separate subdirs in lib/).
And if not, then why have two separate (dummy) Kconfig options either?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Michael-Luke Jones


On 29 May 2007, at 11:41, Satyam Sharma wrote:


This is syntactically correct (and wouldn't produce any build errors),
but it's quite ... strange, still. Why would I even want to /build/  
the

compress code if all I selected was the decompress option?


Apologies, you gave me the answer I was looking for (make is  
'intelligent' enough not to build the same file twice in this  
situation...) but I think you may have missed the makefile in the  
lzo1x directory:



diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile
new file mode 100644
index 000..7b56a4d
--- /dev/null
+++ b/lib/lzo1x/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o
+obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o


Thus, the Kconfig options for compress/decompress won't be simply  
'dummy' options...


Given the shared private header between the compress and decompress  
code, I don't think there is any need to separate the code into two  
directories a la the zlib code.


Sorry for noise,

Michael-Luke
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Makefile question (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Satyam Sharma


On 5/29/07, Michael-Luke Jones [EMAIL PROTECTED] wrote:

On 29 May 2007, at 11:41, Satyam Sharma wrote:

 This is syntactically correct (and wouldn't produce any build errors),
 but it's quite ... strange, still. Why would I even want to /build/
 the
 compress code if all I selected was the decompress option?

Apologies, you gave me the answer I was looking for (make is
'intelligent' enough not to build the same file twice in this
situation...) but I think you may have missed the makefile in the
lzo1x directory:

 diff --git a/lib/lzo1x/Makefile b/lib/lzo1x/Makefile
 new file mode 100644
 index 000..7b56a4d
 --- /dev/null
 +++ b/lib/lzo1x/Makefile
 @@ -0,0 +1,2 @@
 +obj-$(CONFIG_LZO1X_COMPRESS) += lzo1x_compress.o
 +obj-$(CONFIG_LZO1X_DECOMPRESS) += lzo1x_decompress.o


Ah, yes, I only saw the mail and didn't check lib/lzo1x/Makefile, thanks.
So effectively, we've done precisely what zlib does too ... only, in a
smarter way!


Thus, the Kconfig options for compress/decompress won't be simply
'dummy' options...

Given the shared private header between the compress and decompress
code, I don't think there is any need to separate the code into two
directories a la the zlib code.


Right, actually, zlib could be switched over to this style, in fact.
Because zlib_deflate/ and zlib_inflate/ too share a private header
zutil.h which has unfortunately been stuck into include/linux/ with a big
/* WARNING: this file should *not* be used by applications. */ comment ...

Satyam
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Adrian Bunk

On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote:
 On 28 May 2007, at 18:11, Adrian Bunk wrote:

 I have not seen any explanations:
 - Why did the upstream author write the code that way?

 Apparently due to his requirement for extreme portability. The original 
 code was designed to work on everything from 16-bit DOS through CRAY 
 supercomputers through Windows, Unices and Linux.

Sure, this could be the reason in some or all cases.

The upstream author knows the code best, and discussing such issues with 
him will in many cases be a win:

It could be that there was in some cases no good reason, and the 
upstream code that gets used by many other projects could become faster.

Or there was a good reason that applies also to the in-kernel version 
and a change breaks some corner case.

 The author has stated on the thread that it's a good idea to remove 
 unnecessary ifdefs when porting the code into the kernel, given that the 
 portability requirements are obviously no longer needed.

remove unnecessary ifdefs implies generated code is identical.

That's quite different from code is 10% faster.

 Michael-Luke

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta


On 5/29/07, Adrian Bunk [EMAIL PROTECTED] wrote:

On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote:
 On 28 May 2007, at 18:11, Adrian Bunk wrote:

 I have not seen any explanations:
 - Why did the upstream author write the code that way?

 Apparently due to his requirement for extreme portability. The original
 code was designed to work on everything from 16-bit DOS through CRAY
 supercomputers through Windows, Unices and Linux.

Sure, this could be the reason in some or all cases.

The upstream author knows the code best, and discussing such issues with
him will in many cases be a win:

It could be that there was in some cases no good reason, and the
upstream code that gets used by many other projects could become faster.

Or there was a good reason that applies also to the in-kernel version
and a change breaks some corner case.



I have mailed the author with detailed changelog - waiting for reply.


 The author has stated on the thread that it's a good idea to remove
 unnecessary ifdefs when porting the code into the kernel, given that the
 portability requirements are obviously no longer needed.

remove unnecessary ifdefs implies generated code is identical.

That's quite different from code is 10% faster.



Daniel made some changes to his testing code and now the perf gain is just 1.6%.

- Nitin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 01:48:29 Nitin Gupta wrote:
 On 5/29/07, Bret Towe [EMAIL PROTECTED] wrote:
  On 5/28/07, Nitin Gupta [EMAIL PROTECTED] wrote:
   Hi,
  
   Attached is tester code used for testing.
   (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
   version for 'TinyLZO')
  
   Cheers,
   Nitin
  
   On 5/28/07, Nitin Gupta [EMAIL PROTECTED] wrote:
(Using tester program from Daniel)
   
Following compares this kernel port ('take 6') vs original miniLZO
code:
   
'TinyLZO' refers to this kernel port.
   
1 run averages:
'Tiny LZO':
   Combined: 61.2223 usec
   Compression: 41.8412 usec
   Decompression: 19.3811 usec
'miniLZO':
   Combined: 66.0444 usec
   Compression: 46.6323 usec
   Decompression: 19.4121 usec
   
Result:
Overall: TinyLZO is 7.3% faster
Compressor: TinyLZO is 10.2% faster
Decompressor: TinyLZO is 0.15% faster
 
  1 run averages:
  'Tiny LZO':
  Combined: 112.6642 usec
  Compression: 56.3321 usec
  Decompression: 56.3321 usec
  'miniLZO':
  Combined: 77.6642 usec
  Compression: 56.3416 usec
  Decompression: 21.3226 usec
 
  now the interesting bit I thought was the following
  [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest
  [test_lzo.c::compress (93)] run took 42 microseconds
  [test_lzo.c::decompress (127)] run took 20 microseconds
  [EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest
  [test.c::compress (91)] run took 44 microseconds
  [test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 )
  [test.c::main (149)] BUG: Decompression routine failure

 Did you use x86 for above test? Maybe some problem with testing
 script? What data did you use for this test?


 - Nitin

Using any version of the test code below 5 on a non-LE system would result in 
this error. Those earlier versions couldn't handle BE systems because I was 
too lazy to define cpu_to_le16. (And yes, Nitin, that does appear to be the 
correct conversion to use)

DRH

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 08:03:55 Nitin Gupta wrote:
 On 5/29/07, Adrian Bunk [EMAIL PROTECTED] wrote:
  On Tue, May 29, 2007 at 09:08:27AM +0100, Michael-Luke Jones wrote:
   On 28 May 2007, at 18:11, Adrian Bunk wrote:
   I have not seen any explanations:
   - Why did the upstream author write the code that way?
  
   Apparently due to his requirement for extreme portability. The original
   code was designed to work on everything from 16-bit DOS through CRAY
   supercomputers through Windows, Unices and Linux.
 
  Sure, this could be the reason in some or all cases.
 
  The upstream author knows the code best, and discussing such issues with
  him will in many cases be a win:
 
  It could be that there was in some cases no good reason, and the
  upstream code that gets used by many other projects could become faster.
 
  Or there was a good reason that applies also to the in-kernel version
  and a change breaks some corner case.

 I have mailed the author with detailed changelog - waiting for reply.

   The author has stated on the thread that it's a good idea to remove
   unnecessary ifdefs when porting the code into the kernel, given that
   the portability requirements are obviously no longer needed.
 
  remove unnecessary ifdefs implies generated code is identical.
 
  That's quite different from code is 10% faster.

 Daniel made some changes to his testing code and now the perf gain is just
 1.6%.

 - Nitin

The changes for version 5 are all in one file - helpers.h. All I did was 
define the macros that had previously been NOP's. Most of the performance 
loss, IMHO, is due to the compression and decompression functions being 
marked noinline. The currently reported 1.6% overall performance boost is 
likely related to the use of likely()/unlikely().

DRH

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Michael-Luke Jones


On 29 May 2007, at 12:27, Satyam Sharma wrote:

Right, actually, zlib could be switched over to [using a common  
directory].

Because zlib_deflate/ and zlib_inflate/ too share a private header
zutil.h which has unfortunately been stuck into include/linux/ with  
a big
/* WARNING: this file should *not* be used by applications. */  
comment ...


Well, unfortunately zutil.h is currently being used in-kernel by fs/ 
jffs2/compr_zlib.c despite the WARNING: this file should *not* be  
used by applications notice.


So moving this header to a truly private location isn't possible  
right now, unfortunately,


Michael-Luke Jones
[added David Whitehouse to cc]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote:
 On 29 May 2007, at 12:27, Satyam Sharma wrote:
  Right, actually, zlib could be switched over to [using a common
  directory].
  Because zlib_deflate/ and zlib_inflate/ too share a private header
  zutil.h which has unfortunately been stuck into include/linux/ with
  a big
  /* WARNING: this file should *not* be used by applications. */
  comment ...

 Well, unfortunately zutil.h is currently being used in-kernel by fs/
 jffs2/compr_zlib.c despite the WARNING: this file should *not* be
 used by applications notice.

 So moving this header to a truly private location isn't possible
 right now, unfortunately,

 Michael-Luke Jones
 [added David Whitehouse to cc]

I've looked at that code and it seems to need that file for one constant. 
Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself 
(or use it as a Magic Number) rather than include the zlib private header. 
Another possibility would be to move that constant out of zutil.h and into 
zconf.h or zlih.b - doing any of those would allow the zlib private header to 
be moved such that zlib could be changed to use a common directory *and* have 
said private header in that directory.

DRH


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Satyam Sharma


[ Trimmed Cc list; added original zlib authors. ]


On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote:
 On 29 May 2007, at 12:27, Satyam Sharma wrote:
  Right, actually, zlib could be switched over to [using a common
  directory].
  Because zlib_deflate/ and zlib_inflate/ too share a private header
  zutil.h which has unfortunately been stuck into include/linux/ with
  a big
  /* WARNING: this file should *not* be used by applications. */
  comment ...

 Well, unfortunately zutil.h is currently being used in-kernel by fs/
 jffs2/compr_zlib.c despite the WARNING: this file should *not* be
 used by applications notice.


Hmmm, either jffs2 thinks the zlib interfaces exposed through
include/linux/zlib.h are insufficient, or it isn't using zlib properly
(or at least the way it was supposed to be used :-)

Looking at fs/jffs2/compr_zlib.c, however, it just seems to be
an optimization -- skipping some checksum calculation if some
flag (PRESET_DICT) is absent from the input stream about to
be decompressed ...

This /looks/ like an optimization that might make sense for other
users of zlib too, in which case that entire check and
skip-the-checksum thing could actually be put into zlib proper. (?)


 So moving this header to a truly private location isn't possible
 right now, unfortunately,

 Michael-Luke Jones
 [added David Whitehouse to cc]


On 5/29/07, Daniel Hazelton [EMAIL PROTECTED] wrote:

I've looked at that code and it seems to need that file for one constant.
Perhaps it'd be better for jffs2/compr_zlib.c to define that constant itself
(or use it as a Magic Number) rather than include the zlib private header.
Another possibility would be to move that constant out of zutil.h and into
zconf.h or zlih.b - doing any of those would allow the zlib private header to
be moved such that zlib could be changed to use a common directory *and* have
said private header in that directory.


Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from
being able to resolve that macro. FWIW, a dirty hack could be to
simply duplicate it in zlib.h. But then, implementation and interface
are decoupled for good reasons ...

Satyam
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 11:15:41 Satyam Sharma wrote:
 [ Trimmed Cc list; added original zlib authors. ]

  On Tuesday 29 May 2007 09:33:51 Michael-Luke Jones wrote:
   On 29 May 2007, at 12:27, Satyam Sharma wrote:
Right, actually, zlib could be switched over to [using a common
directory].
Because zlib_deflate/ and zlib_inflate/ too share a private header
zutil.h which has unfortunately been stuck into include/linux/ with
a big
/* WARNING: this file should *not* be used by applications. */
comment ...
  
   Well, unfortunately zutil.h is currently being used in-kernel by fs/
   jffs2/compr_zlib.c despite the WARNING: this file should *not* be
   used by applications notice.

 Hmmm, either jffs2 thinks the zlib interfaces exposed through
 include/linux/zlib.h are insufficient, or it isn't using zlib properly
 (or at least the way it was supposed to be used :-)

 Looking at fs/jffs2/compr_zlib.c, however, it just seems to be
 an optimization -- skipping some checksum calculation if some
 flag (PRESET_DICT) is absent from the input stream about to
 be decompressed ...

 This /looks/ like an optimization that might make sense for other
 users of zlib too, in which case that entire check and
 skip-the-checksum thing could actually be put into zlib proper. (?)

This *is* a good idea. If someone doesn't beat me to it I'll make sure I've 
got the latest git and do up a patch that does this.

DRH

   So moving this header to a truly private location isn't possible
   right now, unfortunately,
  
   Michael-Luke Jones
   [added David Whitehouse to cc]

 On 5/29/07, Daniel Hazelton [EMAIL PROTECTED] wrote:
  I've looked at that code and it seems to need that file for one constant.
  Perhaps it'd be better for jffs2/compr_zlib.c to define that constant
  itself (or use it as a Magic Number) rather than include the zlib
  private header. Another possibility would be to move that constant out of
  zutil.h and into zconf.h or zlih.b - doing any of those would allow the
  zlib private header to be moved such that zlib could be changed to use a
  common directory *and* have said private header in that directory.

 Moving PRESET_DICT to zlib.h robs lib/zlib_deflate/deflate.c from
 being able to resolve that macro. FWIW, a dirty hack could be to
 simply duplicate it in zlib.h. But then, implementation and interface
 are decoupled for good reasons ...

 Satyam


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote:
 On 5/29/07, Bret Towe [EMAIL PROTECTED] wrote:
  tested this on ppc and its still good
 
  is there any reason to bother with a test on amd64?
  if there is I might be able to get to it tonight

 Yes, this test is desired on 'take 6'
 (In future I will append version to patch bz2)

 Thanks,
 Nitin

I added a few functions to my test-bed to see how well the LZO code coped with 
random input data. I'll extend my error checking so that it dumps all local 
variables at the first error, but for the time being I'm staring at a problem 
with what appears to be a double-free (in addition to lzo1x_decompress(_safe) 
screaming about overruns when I pass it the buffer of compressed, totally 
random data).

(I fill the buffer by doing a read from /dev/urandom, though I'll certainly 
change it to use random() if anyone thinks that'll make a difference)

I'll post the version with the random-data tests when I get the double-free 
solved. (I've looked and don't see how its happening)

DRH
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

On Tuesday 29 May 2007 16:14:34 Daniel Hazelton wrote:
 On Tuesday 29 May 2007 01:58:43 Nitin Gupta wrote:
  On 5/29/07, Bret Towe [EMAIL PROTECTED] wrote:
   tested this on ppc and its still good
  
   is there any reason to bother with a test on amd64?
   if there is I might be able to get to it tonight
 
  Yes, this test is desired on 'take 6'
  (In future I will append version to patch bz2)
 
  Thanks,
  Nitin

 I added a few functions to my test-bed to see how well the LZO code coped
 with random input data. I'll extend my error checking so that it dumps all
 local variables at the first error, but for the time being I'm staring at a
 problem with what appears to be a double-free (in addition to
 lzo1x_decompress(_safe) screaming about overruns when I pass it the buffer
 of compressed, totally random data).

 (I fill the buffer by doing a read from /dev/urandom, though I'll certainly
 change it to use random() if anyone thinks that'll make a difference)

 I'll post the version with the random-data tests when I get the double-free
 solved. (I've looked and don't see how its happening)

 DRH

After some additional debugging the problem appears to be that both 
lzo1x_decompress_safe (from miniLZO) and lzo1x_decompress (from the 'Tiny' 
version) both are somehow corrupting memory.

*** glibc detected *** ./tinytest: double free or corruption (out): 0x08070e60 
***

The buffer at 0x08070e60 is the one I allocated specifically for the new 
random data tests. It has the same location when it is allocated as it is 
when free()'d, so the corruption has to be coming from somewhere. The problem 
*might* be in my code - that is, I might have made a mistake in the read() 
from /dev/urandom. I'm going to go test that now.

DRH
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Jan Engelhardt


On May 28 2007 20:04, Nitin Gupta wrote:

 * Changelog vs. original LZO:
 1) Used standard/kernel defined data types: (this eliminated _huge_
 #ifdef chunks)
 lzo_bytep - unsigned char *
 lzo_uint - size_t
 lzo_xint - size_t

Is this safe (as far as compressed LZO stream is concerned) --
or is it even needed (could it be unsigned int)? 

 - m_pos -= (*(const unsigned short *)ip)  2;
 -#else
 - m_pos = op - 1;
 - m_pos -= (ip[0]  2) + (ip[1]  6);
 -#endif

 + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip)  2);

 (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
 *** Need testing on big endian machine ***

On i386, both cpu_to_le16 and le16_to_cpu do nothing.
On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-)
they swap 1234-4321.

It is the bytestream (ip) that is reinterpreted as uint16_t.
And I really doubt that the LZO author has a big-endian machine,
given the days of ubiquitous x86. le16_to_cpu it is.



Jan
-- 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Jan Engelhardt


On May 28 2007 19:11, Adrian Bunk wrote:

I have not seen any explanations:
- Why did the upstream author write the code that way?

I guess it's along the lines of
- portability

(note how this contradicts itself). Really. I have yet to
figure out why everyone invents their own xxx32_t types,
like e.g. glib. well, integer types, one might understand, but when it
comes to gchar or gpointer, that's just plain microsoft-style
(think LPCSTR and LPCVOID...)


Jan
-- 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Adrian Bunk

On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote:
 
 On May 28 2007 19:11, Adrian Bunk wrote:
 
 I have not seen any explanations:
 - Why did the upstream author write the code that way?
 
 I guess it's along the lines of
 - portability
 
 (note how this contradicts itself). Really. I have yet to
 figure out why everyone invents their own xxx32_t types,
 like e.g. glib. well, integer types, one might understand, but when it
 comes to gchar or gpointer, that's just plain microsoft-style
 (think LPCSTR and LPCVOID...)

You completely miss the point of my question.

It's about the performance improvements of the modified code that were 
mentioned.

What you are talking about shouldn't have any effect on the generated 
code.

   Jan

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

All problems I was having with the test-bed code have been solved, and the 
error I was running into was, as I suspected, in the code I used to fill the 
buffer for the random-data test.

Results of running the new benchmark (version 6 of the benchmark, version 6 
of 'tinyLZO'):
1 run averages:
'Tiny LZO':
Combined: 104.1529 usec
Compression: 43.065 usec
Decompression: 18.4648 usec
Random Data Compression: 31.132 usec
Random Data Decompression: 11.4911 usec
'miniLZO':
Combined: 106.1363 usec
Compression: 44.6988 usec
Decompression: 18.3576 usec
Random Data Compression: 31.5772 usec
Random Data Decompression: 11.5027 usec
Percentages (calculated as: ((full - tiny)/full)*100):
Overall (combined totals): Tiny is 1.87 % faster
Compression: Tiny is 3.66 % faster
Decompression: Tiny is 0.58 % slower
Random Compression: Tiny is 1.41 % faster
Random Decompression: Tiny is 0.10 % faster

The results, on my system, are not consistent, except in that 'TinyLZO' is 
generally faster in the non-random data tests than miniLZO. It did, three of 
the five times I ran the test, perform (on average) about 1% worse than 
miniLZO.

In order to sidestep any issues that a difference in the input data might have 
caused, I'm going to rewrite the code so that all tests are run against the 
same data. However, in the meantime, I've attached the latest version of my 
test-code.

DRH
PS: the code is going to massively change as I work to include more data 
sources for the benchmarking, as well as tests that will try to really stress 
the stripped-down code.


lzo1x-test-6.tar.bz2
Description: application/tbz

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Daniel Hazelton

I just noticed a bug in my testbed/benchmarking code. It's fixed, but I 
decided to compare version 6 of the code against the *unsafe* decompressor 
again. The results of the three runs I've put it through after changing it to 
compare against the unsafe decompressor were startling. `Tiny's` compressor 
is still faster - I've seen it be rated up to 3% faster. The decompressor, 
OTOH, when compared to the unsafe version (which is the comparison that 
started me on this binge of hacking), is more than 7% worse. About 11% slower 
on the original test against a C source file, and about 6% slower for random 
data. However, looking at the numbers involved, I can't see a reason to keep 
the unsafe version around - the percentages look worse than they are - from 1 
to 3 microseconds. (well, the compressed-cache people might want those extra 
usecs - but the difference will never be noticeable anywhere outside the 
kernel)

DRH


lzo1x-test-6a.tar.bz2
Description: application/tbz

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta


On 5/30/07, Daniel Hazelton [EMAIL PROTECTED] wrote:

I just noticed a bug in my testbed/benchmarking code. It's fixed, but I
decided to compare version 6 of the code against the *unsafe* decompressor
again. The results of the three runs I've put it through after changing it to
compare against the unsafe decompressor were startling. `Tiny's` compressor
is still faster - I've seen it be rated up to 3% faster. The decompressor,
OTOH, when compared to the unsafe version (which is the comparison that
started me on this binge of hacking), is more than 7% worse. About 11% slower
on the original test against a C source file, and about 6% slower for random
data.


Unsafe vs safe is within 10%. Its okay.


However, looking at the numbers involved, I can't see a reason to keep
the unsafe version around - the percentages look worse than they are - from 1
to 3 microseconds.


Not just numbers. Most of applications cannot afford to use unsafe
versions anyway (like fs people).

(well, the compressed-cache people might want those extra

usecs - but the difference will never be noticeable anywhere outside the
kernel)

DRH



compressed cache people require every single percent of that
performance. For now, ccaching is not ready for mainline (many things
need to be done). So, till then I will keep off the unsafe version. If
ever compressed caching is on its way to mainline _then_ I will try
and add back the unsafe version. But I see no other project that
really cares about unsafe version so it's okay to keep it off.


- Nitin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta


On 5/30/07, Adrian Bunk [EMAIL PROTECTED] wrote:

On Tue, May 29, 2007 at 11:10:05PM +0200, Jan Engelhardt wrote:

 On May 28 2007 19:11, Adrian Bunk wrote:
You completely miss the point of my question.

It's about the performance improvements of the modified code that were
mentioned.

What you are talking about shouldn't have any effect on the generated
code.



After Daniel refined this testing program, we can see that perf gain
is  2% which can surely be accounted to cleanups like unnecessary
castings in tight loops.

Again, all the original code has been retained _as-is_. Whatever was
changed, has been mentioned in that detailed changelog that I post
along with patch.

If someone want to review these 500 lines with this changelog in hand,
it should not take more than couple of hours. If no-one can see any
problem in code by now and considering that it's tested on x86(_32),
amd64, ppc and giving somewhat better perf. than original then I
believe it is unnecessarily hanging outside of -mm tree.

I also contacted author (Markus Oberhumer) regarding above changes but
he seems not be responding right now. But still, if it gets into -mm
and gets used by various related projects then it should be good
enough for mainline also.


Cheers,
Nitin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-29 Thread Nitin Gupta


On 5/30/07, Jan Engelhardt [EMAIL PROTECTED] wrote:


On May 28 2007 20:04, Nitin Gupta wrote:

 * Changelog vs. original LZO:
 1) Used standard/kernel defined data types: (this eliminated _huge_
 #ifdef chunks)
 lzo_bytep - unsigned char *
 lzo_uint - size_t
 lzo_xint - size_t

Is this safe (as far as compressed LZO stream is concerned) --
or is it even needed (could it be unsigned int)?

 - m_pos -= (*(const unsigned short *)ip)  2;
 -#else
 - m_pos = op - 1;
 - m_pos -= (ip[0]  2) + (ip[1]  6);
 -#endif

 + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip)  2);

 (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
 *** Need testing on big endian machine ***

On i386, both cpu_to_le16 and le16_to_cpu do nothing.
On sparc for example, cpu_to_leXX and leXX_to_cpu do 'the same' ;-)
they swap 1234-4321.

It is the bytestream (ip) that is reinterpreted as uint16_t.
And I really doubt that the LZO author has a big-endian machine,
given the days of ubiquitous x86.



le16_to_cpu it is.



I just missed the point that leXX_to_cpu() and cpu_to_leXX() are
identical on BE machine anyway. But then why you think it should be
le_16_cpu() -- how will this make any difference?

For your ref (from big_endian.h):
#define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))

- Nitin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: JFFS2 using 'private' zlib header (was [RFC] LZO de/compression support - take 6)

2007-05-29 Thread Mark Adler


On May 29, 2007, at 8:15 AM, Satyam Sharma wrote:

skipping some checksum calculation if some
flag (PRESET_DICT) is absent from the input stream about to
be decompressed ...


You don't need to dissect the header manually to look for that bit.   
If you feed inflate() at least the first two bytes, it will return  
immediately with the Z_NEED_DICT return code if a preset dictionary  
is requested.  You can force inflate() to return immediately after  
decoding the two byte header even if a preset dictionary is not  
requested by using the Z_BLOCK flush code.


Mark

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Nitin Gupta


On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote:


tested this on ppc and its still good



is there any reason to bother with a test on amd64?
if there is I might be able to get to it tonight


Yes, this test is desired on 'take 6'
(In future I will append version to patch bz2)

Thanks,
Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Nitin Gupta


On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote:
> On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:






I have not seen any explanations:
- Why did the upstream author write the code that way?
- Why are your changes correct?
- Why do your changes allow the compiler to produce faster code?

The upstream author of the code is available - and he might be able to
help you getting answers. No matter whether your changes are incorrect
or correct optimizations that should go upstream, in both cases
discussing these issues with upstream is the best solution.


The changelog I posted along with patch mentions all the changes I
made. I thought we will find all problems with this changelog in hand
and considering that its  just ~500 LOC.   But still, ok, asking
author himself will be good if he replies. I will mail him detailed
changelog and seek his feedback on this. This should answer all of
your questions.



And testing is nice, but if you broke some case that's outside of your
tests you'll never notice.



Yes. We cannot come up with exhaustive set of test cases to cover all
cases. But assuming that _original_ version is right and taking the
chagelog we should be able to verify if the porting is correct.


- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Nitin Gupta


On 5/29/07, Bret Towe <[EMAIL PROTECTED]> wrote:

On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Attached is tester code used for testing.
> (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> version for 'TinyLZO')
>
> Cheers,
> Nitin
>
> On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> > (Using tester program from Daniel)
> >
> > Following compares this kernel port ('take 6') vs original miniLZO code:
> >
> > 'TinyLZO' refers to this kernel port.
> >
> > 1 run averages:
> > 'Tiny LZO':
> >Combined: 61.2223 usec
> >Compression: 41.8412 usec
> >Decompression: 19.3811 usec
> > 'miniLZO':
> >Combined: 66.0444 usec
> >Compression: 46.6323 usec
> >Decompression: 19.4121 usec
> >
> > Result:
> > Overall: TinyLZO is 7.3% faster
> > Compressor: TinyLZO is 10.2% faster
> > Decompressor: TinyLZO is 0.15% faster
> >
>
>

1 run averages:
'Tiny LZO':
Combined: 112.6642 usec
Compression: 56.3321 usec
Decompression: 56.3321 usec
'miniLZO':
Combined: 77.6642 usec
Compression: 56.3416 usec
Decompression: 21.3226 usec

now the interesting bit I thought was the following
[EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest
[test_lzo.c::compress (93)] run took 42 microseconds
[test_lzo.c::decompress (127)] run took 20 microseconds
[EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest
[test.c::compress (91)] run took 44 microseconds
[test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 )
[test.c::main (149)] BUG: Decompression routine failure




Did you use x86 for above test? Maybe some problem with testing
script? What data did you use for this test?


- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Bret Towe


On 5/28/07, Bret Towe <[EMAIL PROTECTED]> wrote:

On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> Hi,
>
> This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe
> version only).
>
> * Changes since 'take 5' (Full Changelog after this):
> - Added compressor and decomrpesssor as separate and hidden config
> options (default: n)
> - Cleanups: removed LZO_CHECK_MPOS_NON_DET macro
> - removed PTR* macros.
>
> * Benchmarks: (system: P4 3GHz, 1G RAM)
> (Using tester program from Daniel)
>
> Following compares this kernel port ('take 6') vs original miniLZO code:
>
> 'TinyLZO' refers to this kernel port.
>
> 1 run averages:
> 'Tiny LZO':
>Combined: 61.2223 usec
>Compression: 41.8412 usec
>Decompression: 19.3811 usec
> 'miniLZO':
>Combined: 66.0444 usec
>Compression: 46.6323 usec
>Decompression: 19.4121 usec
>
> Result:
> Overall: TinyLZO is 7.3% faster
> Compressor: TinyLZO is 10.2% faster
> Decompressor: TinyLZO is 0.15% faster
>
> TODO: test 'take 6' port on archs other than x86(_32)
>
> * Changelog vs. original LZO:
> 1) Used standard/kernel defined data types: (this eliminated _huge_
> #ifdef chunks)
>   lzo_bytep -> unsigned char *
>   lzo_uint -> size_t
>   lzo_xint -> size_t
>   lzo_uint32p -> u32 *
>   lzo_uintptr_t  -> unsigned long
> 2) Removed everything #ifdef'ed under COPY_DICT (this is not set for
> LZO1X, so removed corres. parts).
> 3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants.
> 4) Reformatted the code to match general kernel style.
> 5) The only code change: (as suggested by Andrey)
> -#if defined(__LITTLE_ENDIAN)
> - m_pos = op - 1;
> - m_pos -= (*(const unsigned short *)ip) >> 2;
> -#else
> - m_pos = op - 1;
> - m_pos -= (ip[0] >> 2) + (ip[1] << 6);
> -#endif
>
> + m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2);
>
> (Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
> *** Need testing on big endian machine ***
>
> Similarly:
> -#if defined(__LITTLE_ENDIAN)
> - m_pos -= (*(const unsigned short *)ip) >> 2;
> -#else
> - m_pos -= (ip[0] >> 2) + (ip[1] << 6);
> -#endif
>
> + m_pos -= cpu_to_le16(*(const u16 *)ip) >> 2;
>
> 6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros.
>
>
> I think it's now ready for mainline inclusion.
>
>
>  include/linux/lzo1x.h|   66 +++
>  lib/Kconfig  |6 +
>  lib/Makefile |2 +
>  lib/lzo1x/Makefile   |2 +
>  lib/lzo1x/lzo1x_compress.c   |  259

tested this on ppc and its still good
is there any reason to bother with a test on amd64?
if there is I might be able to get to it tonight



forgot to mention that it would make life easier if you would put the
version of the patch
in the .bz2 files you attach since you poping out new takes so quick
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Bret Towe


On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:

Hi,

Attached is tester code used for testing.
(developed by Daniel Hazelton -- modified slightly to now use 'take 6'
version for 'TinyLZO')

Cheers,
Nitin

On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> (Using tester program from Daniel)
>
> Following compares this kernel port ('take 6') vs original miniLZO code:
>
> 'TinyLZO' refers to this kernel port.
>
> 1 run averages:
> 'Tiny LZO':
>Combined: 61.2223 usec
>Compression: 41.8412 usec
>Decompression: 19.3811 usec
> 'miniLZO':
>Combined: 66.0444 usec
>Compression: 46.6323 usec
>Decompression: 19.4121 usec
>
> Result:
> Overall: TinyLZO is 7.3% faster
> Compressor: TinyLZO is 10.2% faster
> Decompressor: TinyLZO is 0.15% faster
>




1 run averages:
'Tiny LZO':
   Combined: 112.6642 usec
   Compression: 56.3321 usec
   Decompression: 56.3321 usec
'miniLZO':
   Combined: 77.6642 usec
   Compression: 56.3416 usec
   Decompression: 21.3226 usec

now the interesting bit I thought was the following
[EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./fulltest
[test_lzo.c::compress (93)] run took 42 microseconds
[test_lzo.c::decompress (127)] run took 20 microseconds
[EMAIL PROTECTED]:/usr/src/lzo1x-test-4# ./tinytest
[test.c::compress (91)] run took 44 microseconds
[test.c::decompress (117)] BUG: lzo1x_decompress has failed ( t == -6 )
[test.c::main (149)] BUG: Decompression routine failure
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Bret Towe


On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:

Hi,

This is kernel port of LZO1X-1 compressor and LZO1X decompressor (safe
version only).

* Changes since 'take 5' (Full Changelog after this):
- Added compressor and decomrpesssor as separate and hidden config
options (default: n)
- Cleanups: removed LZO_CHECK_MPOS_NON_DET macro
- removed PTR* macros.

* Benchmarks: (system: P4 3GHz, 1G RAM)
(Using tester program from Daniel)

Following compares this kernel port ('take 6') vs original miniLZO code:

'TinyLZO' refers to this kernel port.

1 run averages:
'Tiny LZO':
   Combined: 61.2223 usec
   Compression: 41.8412 usec
   Decompression: 19.3811 usec
'miniLZO':
   Combined: 66.0444 usec
   Compression: 46.6323 usec
   Decompression: 19.4121 usec

Result:
Overall: TinyLZO is 7.3% faster
Compressor: TinyLZO is 10.2% faster
Decompressor: TinyLZO is 0.15% faster

TODO: test 'take 6' port on archs other than x86(_32)

* Changelog vs. original LZO:
1) Used standard/kernel defined data types: (this eliminated _huge_
#ifdef chunks)
  lzo_bytep -> unsigned char *
  lzo_uint -> size_t
  lzo_xint -> size_t
  lzo_uint32p -> u32 *
  lzo_uintptr_t  -> unsigned long
2) Removed everything #ifdef'ed under COPY_DICT (this is not set for
LZO1X, so removed corres. parts).
3) Removed code #ifdef'ed for LZO1Y, LZO1Z, other variants.
4) Reformatted the code to match general kernel style.
5) The only code change: (as suggested by Andrey)
-#if defined(__LITTLE_ENDIAN)
- m_pos = op - 1;
- m_pos -= (*(const unsigned short *)ip) >> 2;
-#else
- m_pos = op - 1;
- m_pos -= (ip[0] >> 2) + (ip[1] << 6);
-#endif

+ m_pos = op - 1 - (cpu_to_le16(*(const u16 *)ip) >> 2);

(Andrey suggested le16_to_cpu for above but I think it should be cpu_to_le16).
*** Need testing on big endian machine ***

Similarly:
-#if defined(__LITTLE_ENDIAN)
- m_pos -= (*(const unsigned short *)ip) >> 2;
-#else
- m_pos -= (ip[0] >> 2) + (ip[1] << 6);
-#endif

+ m_pos -= cpu_to_le16(*(const u16 *)ip) >> 2;

6) Get rid of LZO_CHECK_MPOS_NON_DET macro and PTR* macros.


I think it's now ready for mainline inclusion.


 include/linux/lzo1x.h|   66 +++
 lib/Kconfig  |6 +
 lib/Makefile |2 +
 lib/lzo1x/Makefile   |2 +
 lib/lzo1x/lzo1x_compress.c   |  259


tested this on ppc and its still good
is there any reason to bother with a test on amd64?
if there is I might be able to get to it tonight
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 16:18:40 Daniel Hazelton wrote:
> On Monday 28 May 2007 13:11:15 Adrian Bunk wrote:
> > On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote:
> > > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > >...
> > >
> > >> - then ensure that it works correctly on all architectures and
> > >
> > > Already tested on x86, amd64, ppc (by Bret). I do not have machines
> > > from other archs available. Bret tested 'take 3' version but no
> > > changes were introduced in further revisions that could affect
> > > correctness - but still it will be good to have this version tested
> > > too. Only with inclusion in -mm and testing by much wider user base
> > > can make it to mainline (I suppose nobody uses -mm for production use
> > > anyway).
> > >
> > >>   document why your version is that much faster than the original
> > >>   version and why you know your optimizations have no side effects
>
> With likely(), unlikely() and noinline *not* defined as NOP's performance
> drops:
>
> 1 run averages:
> 'Tiny LZO':
> Combined: 84.9292 usec
> Compression: 42.4646 usec
> Decompression: 42.4646 usec
> 'miniLZO':
> Combined: 61.3548 usec
> Compression: 43.5648 usec
> Decompression: 17.79 usec
>
> However, I'm worried that my testbed code - likely the Perl script that
> actually loops the test code and collects its output - is somehow faulting,
> as the way that the Compression and Decompression code have the exact same
> value.
>
> I'm going to toss some debugging output in the script and see if I can spot
> the problem.
>

Okay, checked my code and it was all a problem with cpu_to_le16 getting 
defined fully instead of just being a NOP. With that problem corrected the 
performance returns:

1 run averages:
'Tiny LZO':
Combined: 60.1028 usec
Compression: 42.0652 usec
Decompression: 18.0376 usec
'miniLZO':
Combined: 61.0932 usec
Compression: 43.4382 usec
Decompression: 17.655 usec

Combined average shows 'Tiny' to be 1.6% faster
Compression in 'Tiny' is 3.2% faster
Decompression in 'Tiny' is 2.2% slower

All in all, the trade-off in this code, with the overall performance of 
the 'Tiny' code being faster than the stock miniLZO code I'd like to say that 
I'm certain that the decompressor could probably be sped up more, although I 
don't know of a place in the kernel where less than half a microsecond would 
make a massive impact. (Only place I can think of where this might have a 
negative is in SLAB/SLOB/SLUB,  and I don't think that a low-level memory 
manager like those is a place for a compression algorithm anyway)

Later today or tommorrow  I'll start putting together another part to 
this "test-bed" for stress-testing the algorithm. Currently I only have a few 
ideas for tests:
(for benchmarking)
1) Random input to compressor
2) large input data
3) real-world data (mmap()'d chunk of /dev/mem as input)

(for stability checking)
4) deliberate corruption of compressed data
5) early finish (realloc() compressed data buffer to less than the full size 
   of the data)
6) late start (change pointer to start of compressed data so the decompressor
 doesn't start at the beginning)

When I have a better idea of how the LZO algorithm works I might try creating 
an input data-set for the decompressor that would deliberately overflow the 
output buffer. This is supposed to be caught by bounds-checking in 
the '_safe' version of the decompressor (the version  used in 'Tiny' and  
used in the benchmarking of miniLZO), however it might be possible to trick 
the decompressor. (If I do manage this and it crashes both 'Tiny' 
and 'miniLZO' I *will* report it to Herr Oberhumer as well as see if I can 
come up with a quick patch for the problem).

As I said before, any suggestions for how to improve the benchmark code are 
welcome, as are suggestions for tests I can add to it for stressing the 
new 'TinyLZO' implementation. Latest version of the test-bed attached. 
Significant changes include:
likely(), unlikely() and noinline are no longer NOP's
when the marked line in helpers.h is commented out, cpu_to_le16 functions as 
expected and is no longer a NOP. (Yes, that means this code is now fully 
capable of working correctly on a BE system).

DRH

lzo1x-test-bed-5.tar.bz2
Description: application/tbz

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 13:11:15 Adrian Bunk wrote:
> On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote:
> > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> >...
> >
> >> - then ensure that it works correctly on all architectures and
> >
> > Already tested on x86, amd64, ppc (by Bret). I do not have machines
> > from other archs available. Bret tested 'take 3' version but no
> > changes were introduced in further revisions that could affect
> > correctness - but still it will be good to have this version tested
> > too. Only with inclusion in -mm and testing by much wider user base
> > can make it to mainline (I suppose nobody uses -mm for production use
> > anyway).
> >
> >>   document why your version is that much faster than the original
> >>   version and why you know your optimizations have no side effects

With likely(), unlikely() and noinline *not* defined as NOP's performance 
drops:

1 run averages:
'Tiny LZO':
Combined: 84.9292 usec
Compression: 42.4646 usec
Decompression: 42.4646 usec
'miniLZO':
Combined: 61.3548 usec
Compression: 43.5648 usec
Decompression: 17.79 usec

However, I'm worried that my testbed code - likely the Perl script that 
actually loops the test code and collects its output - is somehow faulting, 
as the way that the Compression and Decompression code have the exact same 
value.

I'm going to toss some debugging output in the script and see if I can spot 
the problem.

DRH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 13:11:15 Adrian Bunk wrote:
> On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote:
> > On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> >...
> >
> >> - then ensure that it works correctly on all architectures and
> >
> > Already tested on x86, amd64, ppc (by Bret). I do not have machines
> > from other archs available. Bret tested 'take 3' version but no
> > changes were introduced in further revisions that could affect
> > correctness - but still it will be good to have this version tested
> > too. Only with inclusion in -mm and testing by much wider user base
> > can make it to mainline (I suppose nobody uses -mm for production use
> > anyway).
> >
> >>   document why your version is that much faster than the original
> >>   version and why you know your optimizations have no side effects
> >
> > Replied in earlier mail regarding this.
> >...
>
> I have not seen any explanations:
> - Why did the upstream author write the code that way?
> - Why are your changes correct?
> - Why do your changes allow the compiler to produce faster code?

Would it help if I added instrumentation code to both the upstream code and 
the stripped down code in question?

> The upstream author of the code is available - and he might be able to
> help you getting answers. No matter whether your changes are incorrect
> or correct optimizations that should go upstream, in both cases
> discussing these issues with upstream is the best solution.
>
> And testing is nice, but if you broke some case that's outside of your
> tests you'll never notice.

Any suggestions for tests I could add to the boilerplate I've been using to 
benchmark the code? I'll add a few tests that use random input rather than 
one of the source files - I'd add a test to see how it deals with NULL input, 
but I ran into that in an early version of the testbed (where I had a typo 
that stopped the code from working) and the compressor didn't like it. And I 
suppose I could put in a test that reads in a chunk of /dev/mem or have it 
get a chunk of data from HAL/DBus as well.

But if anyone has a suggestion of what tests I could toss at it to really 
stress the code, let me know and I'll get to work extending this code so it 
can stress test both the standard implementation of miniLZO *and* this 
stripped down 'Tiny' version.

DRH

> > - Nitin
>
> cu
> Adrian

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 13:01:09 Adrian Bunk wrote:
> On Mon, May 28, 2007 at 11:55:14AM -0400, Daniel Hazelton wrote:
> >...
> > This is my guess as well. Though performance will likely drop when I make
> > the noinline macro mean something. (This may be offset by figuring out a
> > way to make likely() and unlikely() also have a meaningful effect in
> > userspace). ...
>
> What is your problem?
>
> The likely/unlikely macros aren't in any way depending on any kernel
> infrastructure.

That I'm just plain lazy and haven't felt like pulling them out of the kernel 
sources ?

Actually, that is the case - and I wasn't exactly sure that likely() and 
unlikely() were completely decoupled from the kernel's infrastructure.

DRH

> > DRH
>
> cu
> Adrian


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Adrian Bunk

On Mon, May 28, 2007 at 09:33:32PM +0530, Nitin Gupta wrote:
> On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
>...
>> - then ensure that it works correctly on all architectures and
>
> Already tested on x86, amd64, ppc (by Bret). I do not have machines
> from other archs available. Bret tested 'take 3' version but no
> changes were introduced in further revisions that could affect
> correctness - but still it will be good to have this version tested
> too. Only with inclusion in -mm and testing by much wider user base
> can make it to mainline (I suppose nobody uses -mm for production use
> anyway).
>
>>   document why your version is that much faster than the original
>>   version and why you know your optimizations have no side effects
>
> Replied in earlier mail regarding this.
>...

I have not seen any explanations:
- Why did the upstream author write the code that way?
- Why are your changes correct?
- Why do your changes allow the compiler to produce faster code?

The upstream author of the code is available - and he might be able to 
help you getting answers. No matter whether your changes are incorrect 
or correct optimizations that should go upstream, in both cases 
discussing these issues with upstream is the best solution.

And testing is nice, but if you broke some case that's outside of your 
tests you'll never notice.

> - Nitin

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Adrian Bunk

On Mon, May 28, 2007 at 11:55:14AM -0400, Daniel Hazelton wrote:
>...
> This is my guess as well. Though performance will likely drop when I make the 
> noinline macro mean something. (This may be offset by figuring out a way to 
> make likely() and unlikely() also have a meaningful effect in userspace).
>...

What is your problem?

The likely/unlikely macros aren't in any way depending on any kernel 
infrastructure.

> DRH

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Nitin Gupta

On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Mon, May 28, 2007 at 08:36:44PM +0530, Nitin Gupta wrote:
>...
> So, before this change, it will be good if it gets merged in mainline
> and tested, at least for correctness, on all supported achs. All the
> while, we will have a good feeling that there is still a good scope
> for perf improvement :)

The correct order is:
- create one version with all the optimizations you have in mind

Already done. One more optimization is regarding use of memcpy() in
place of COPY4() macros and open byte-by-byte copying. There are some
places where it's very hard to get it correct without adding
additional checks on various values which casues futher overhead by
iteslf - even then I could not get them correct so I decided not to go
with this particular optimization by myself.

- then ensure that it works correctly on all architectures and

Already tested on x86, amd64, ppc (by Bret). I do not have machines
from other archs available. Bret tested 'take 3' version but no
changes were introduced in further revisions that could affect
correctness - but still it will be good to have this version tested
too. Only with inclusion in -mm and testing by much wider user base
can make it to mainline (I suppose nobody uses -mm for production use
anyway).

  document why your version is that much faster than the original
  version and why you know your optimizations have no side effects

Replied in earlier mail regarding this.

- then get it tested in -mm

This is what I am looking for :)

After these steps, you can start considering getting it into mainline.

- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 11:47:55 Nitin Gupta wrote:
> On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> > On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote:
> > > Hi,
> > >
> > > Attached is tester code used for testing.
> > > (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> > > version for 'TinyLZO')
> > >
> > > Cheers,
> > > Nitin
> > >
> > > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> > >> (Using tester program from Daniel)
> > >>
> > >> Following compares this kernel port ('take 6') vs original miniLZO
> > >> code:
> > >>
> > >> 'TinyLZO' refers to this kernel port.
> > >>
> > >> 1 run averages:
> > >> 'Tiny LZO':
> > >>Combined: 61.2223 usec
> > >>Compression: 41.8412 usec
> > >>Decompression: 19.3811 usec
> > >> 'miniLZO':
> > >>Combined: 66.0444 usec
> > >>Compression: 46.6323 usec
> > >>Decompression: 19.4121 usec
> > >>
> > >> Result:
> > >> Overall: TinyLZO is 7.3% faster
> > >> Compressor: TinyLZO is 10.2% faster
> > >> Decompressor: TinyLZO is 0.15% faster
> >
> > So your the compressor of your version runs 10.2% faster than the
> > original version.
> >
> > That's a huge difference.
> >
> > Why exactly is it that much faster?
> >
> > cu
> > Adrian
>
> I am not sure on how to account for this _big_ perf. gain but from
> benchmarks I see that whenever I remove unncessary casting from tight
> loops I get perf. gain of 1-2%. For e.g. open coding
> LZO_CHECK_MPOS_NON_DET macro with all unnecessary casting removed,
> gave perf. gain of ~2%. Similarly, I found many other places where
> such casting was unnecessary.

This is my guess as well. Though performance will likely drop when I make the 
noinline macro mean something. (This may be offset by figuring out a way to 
make likely() and unlikely() also have a meaningful effect in userspace).

This benchmark should be run on BE machines, but I'm still trying to figure 
out a way to open-code a *fast* and *reliable* version of cpu_to_le16.

Suggestions for the above bits will always be appreciated.
 
> These changes have been tested on x86, amd64, ppc. Testing of 'take 6'
> version is yet to be done - this will confirm that I didn't
> reintroduce some error.

I don't see that 'take 6' has had too much changed from 'take 4', although I 
haven't do a full comparison.

DRH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 11:30:55 Adrian Bunk wrote:
> On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote:
> > Hi,
> >
> > Attached is tester code used for testing.
> > (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> > version for 'TinyLZO')
> >
> > Cheers,
> > Nitin
> >
> > On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
> >> (Using tester program from Daniel)
> >>
> >> Following compares this kernel port ('take 6') vs original miniLZO code:
> >>
> >> 'TinyLZO' refers to this kernel port.
> >>
> >> 1 run averages:
> >> 'Tiny LZO':
> >>Combined: 61.2223 usec
> >>Compression: 41.8412 usec
> >>Decompression: 19.3811 usec
> >> 'miniLZO':
> >>Combined: 66.0444 usec
> >>Compression: 46.6323 usec
> >>Decompression: 19.4121 usec
> >>
> >> Result:
> >> Overall: TinyLZO is 7.3% faster
> >> Compressor: TinyLZO is 10.2% faster
> >> Decompressor: TinyLZO is 0.15% faster
>
> So your the compressor of your version runs 10.2% faster than the
> original version.
>
> That's a huge difference.
>
> Why exactly is it that much faster?

All I've done is write the code used for testing the compressor in userspace, 
but I think it might have something to do with the NOP that cpu_to_le16() is 
on LE systems (the original has an open-coded byte-swap and I don't think it 
ever really gets passed) and the removal of a lot of un-needed casts. 
(version 5 of the code re-introduced one of the macro's that had a lot of 
casts and it showed a drastic slow-down). Anyway, I'm going to get back into 
my testbed and see about actually making a lot of the stuff that is currently 
just a NOP (I made exceedingly minor changes to the files to get the working 
in userspace - one of the changes was to define the kernel-specific macros's 
likely(), unlikely(), noinline and cpu_to_le16() to be NOP's.) into 
functional code. That set of changes may affect the performance and bring it 
closer to the original or it may not.

And the testbed code is open - I guess I should stick the GPL headers in it, 
even though it isn't likely to see much use outside a very small circle. If 
*anyone* is skeptical about the numbers, feel free to use the testbed 
yourself. (It currently only works as intended on LE systems, however) Just 
untar the source into a directory, run make and the run 'tester.pl' to 
actually perform the test runs and gather the numbers. There is no overhead 
included in the numbers - they come directly from running gettimeofday() 
before and after the calls to the respective functions (well, a bit of magic 
is done to make the numbers have meaning).

DRH

PS: the variable '$RUNS' in tester.pl determines how many times the script 
loops each program to generate the numbers. I'm currently using 1 runs to 
generate the average timings, but if you feel that is too large or too small 
a sample set, go ahead and change it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Nitin Gupta

On 5/28/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:

On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote:
> Hi,
>
> Attached is tester code used for testing.
> (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> version for 'TinyLZO')
>
> Cheers,
> Nitin
>
> On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
>> (Using tester program from Daniel)
>>
>> Following compares this kernel port ('take 6') vs original miniLZO code:
>>
>> 'TinyLZO' refers to this kernel port.
>>
>> 1 run averages:
>> 'Tiny LZO':
>>Combined: 61.2223 usec
>>Compression: 41.8412 usec
>>Decompression: 19.3811 usec
>> 'miniLZO':
>>Combined: 66.0444 usec
>>Compression: 46.6323 usec
>>Decompression: 19.4121 usec
>>
>> Result:
>> Overall: TinyLZO is 7.3% faster
>> Compressor: TinyLZO is 10.2% faster
>> Decompressor: TinyLZO is 0.15% faster

So your the compressor of your version runs 10.2% faster than the
original version.

That's a huge difference.

Why exactly is it that much faster?

cu
Adrian

I am not sure on how to account for this _big_ perf. gain but from
benchmarks I see that whenever I remove unncessary casting from tight
loops I get perf. gain of 1-2%. For e.g. open coding
LZO_CHECK_MPOS_NON_DET macro with all unnecessary casting removed,
gave perf. gain of ~2%. Similarly, I found many other places where
such casting was unnecessary.

These changes have been tested on x86, amd64, ppc. Testing of 'take 6'
version is yet to be done - this will confirm that I didn't
reintroduce some error.

- Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Adrian Bunk

On Mon, May 28, 2007 at 08:36:44PM +0530, Nitin Gupta wrote:
>...
> So, before this change, it will be good if it gets merged in mainline
> and tested, at least for correctness, on all supported achs. All the
> while, we will have a good feeling that there is still a good scope
> for perf improvement :)

The correct order is:
- create one version with all the optimizations you have in mind
- then ensure that it works correctly on all architectures and
  document why your version is that much faster than the original 
  version and why you know your optimizations have no side effects
- then get it tested in -mm

After these steps, you can start considering getting it into mainline.

> Cheers,
> Nitin

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Adrian Bunk

On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote:
> Hi,
>
> Attached is tester code used for testing.
> (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> version for 'TinyLZO')
>
> Cheers,
> Nitin
>
> On 5/28/07, Nitin Gupta <[EMAIL PROTECTED]> wrote:
>> (Using tester program from Daniel)
>>
>> Following compares this kernel port ('take 6') vs original miniLZO code:
>>
>> 'TinyLZO' refers to this kernel port.
>>
>> 1 run averages:
>> 'Tiny LZO':
>>Combined: 61.2223 usec
>>Compression: 41.8412 usec
>>Decompression: 19.3811 usec
>> 'miniLZO':
>>Combined: 66.0444 usec
>>Compression: 46.6323 usec
>>Decompression: 19.4121 usec
>>
>> Result:
>> Overall: TinyLZO is 7.3% faster
>> Compressor: TinyLZO is 10.2% faster
>> Decompressor: TinyLZO is 0.15% faster

So your the compressor of your version runs 10.2% faster than the 
original version.

That's a huge difference.

Why exactly is it that much faster?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Nitin Gupta

On 5/28/07, Daniel Hazelton <[EMAIL PROTECTED]> wrote:

On Monday 28 May 2007 10:40:31 Nitin Gupta wrote:
> Hi,
>
> Attached is tester code used for testing.
> (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> version for 'TinyLZO')
>
> Cheers,
> Nitin
>

I haven't tested with version 6, but after removing the LZO_CHECK_MPOS_NON_DET
macro from the 'take 5' code and replacing the open-coded byte-for-byte
copies with calls to memcpy:

I did memcpy() changes in some initial post (take '2', I think). That
caused some _correctness_ issue in de/compressor code -- Bret's test
could not succeed on ppc machine. After going back to byte-by-byte
copying, his tests were successful.

So, it's better not to include such changes now or test on all
supported archs if you really want to do so :)

1 run averages:
'Tiny LZO':
Combined: 57.4691 usec
Compression: 39.8837 usec
Decompression: 17.5854 usec
'miniLZO':
Combined: 64.0484 usec
Compression: 46.0604 usec
Decompression: 17.988 usec

which means:
Overall TinyLZO is 10.2% faster
TinyLZO compresses 13.4% faster
TinyLZO decompresses 2.23% faster

-Benchmark run a a Pentium-M 1.73GHz, 1GB Ram
With the speed-up seen with just the removal of the LZO_CHECK_MPOS_NON_DET I
wasn't sure that changing the open-coded copy to a call to memcpy() was going
to have a big impact on the code, but it does appear to have has several
percentage points of difference.

Yes, memcpy() changes have potential of giving significant perf gain
but I am not too sure if memcpy() will be good if we want to copy just
few bytes (which is the case at many times in de/compressor). Also, at
some places, memcpy() changes are not trival and have actually caused
correctness issues as mentioned above.

So, before this change, it will be good if it gets merged in mainline
and tested, at least for correctness, on all supported achs. All the
while, we will have a good feeling that there is still a good scope
for perf improvement :)

Cheers,
Nitin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] LZO de/compression support - take 6

2007-05-28 Thread Daniel Hazelton

On Monday 28 May 2007 10:40:31 Nitin Gupta wrote:
> Hi,
>
> Attached is tester code used for testing.
> (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> version for 'TinyLZO')
>
> Cheers,
> Nitin
>


I haven't tested with version 6, but after removing the LZO_CHECK_MPOS_NON_DET 
macro from the 'take 5' code and replacing the open-coded byte-for-byte 
copies with calls to memcpy:

1 run averages:
'Tiny LZO':
Combined: 57.4691 usec
Compression: 39.8837 usec
Decompression: 17.5854 usec
'miniLZO':
Combined: 64.0484 usec
Compression: 46.0604 usec
Decompression: 17.988 usec

which means:
Overall TinyLZO is 10.2% faster
TinyLZO compresses 13.4% faster
TinyLZO decompresses 2.23% faster

-Benchmark run a a Pentium-M 1.73GHz, 1GB Ram
With the speed-up seen with just the removal of the LZO_CHECK_MPOS_NON_DET I 
wasn't sure that changing the open-coded copy to a call to memcpy() was going 
to have a big impact on the code, but it does appear to have has several 
percentage points of difference. (Though I am unsure about the speed increase 
with the decompression - I didn't touch that file when making this set of 
changes. I'm going to make the memcpy() changes to 'take 6' and see if the 
speed-up holds true)

DRH
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 >

1 - 100 of 124 matches

Mail list logo