Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
On (04/03/18 19:13), Vaneet Narang wrote: > Hi Sergey, > > >You shrink a 2 bytes offset down to a 1 byte offset, thus you enforce that > 2 Byte offset is not shrinked to 1 byte, Its only 1 bit is reserved out of > 16 bits of offset. So only 15 Bits can be used to store offset value. Yes, you are right. My bad, was thinking about something else. > >'page should be less than 32KB', which I'm sure will be confusing. > lz4_dyn will work on bigger data length(> 32k) but in that case compression > ratio may not be better than LZ4. This is same as LZ4 compressing data more > than 64K (16Bits). LZ4 can't store offset more than 64K similarly > LZ4 dyn can't store offset more than 32K. Then drop that `if PAGE_SIZE' thing. I'd rather do that stuff internally in lz4... if it needed at all. > >And you > >rely on lz4_dyn users to do the right thing - namely, to use that 'nice' > >`#if (PAGE_SIZE < (32 * KB))'. > They don't need to add this code Then drop it. > >Apart from that, lz4_dyn supports only data > >in up to page_size chunks. Suppose my system has page_size of less than 32K, > >so I legitimately can enable lz4_dyn, but suppose that I will use it > >somewhere where I don't work with page_size-d chunks. Will I able to just > >do tfm->compress(src, sz) on random buffers? The whole thing looks to be > >quite fragile. > No thats not true, lz4_dyn can work for random buffers and it need not be > of page size chunks. There is no difference in Lz4 and Lz4 dyn working. You are right. -ss
RE: Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
Hi Sergey, >You shrink a 2 bytes offset down to a 1 byte offset, thus you enforce that 2 Byte offset is not shrinked to 1 byte, Its only 1 bit is reserved out of 16 bits of offset. So only 15 Bits can be used to store offset value. >'page should be less than 32KB', which I'm sure will be confusing. lz4_dyn will work on bigger data length(> 32k) but in that case compression ratio may not be better than LZ4. This is same as LZ4 compressing data more than 64K (16Bits). LZ4 can't store offset more than 64K similarly LZ4 dyn can't store offset more than 32K. There is a handling in LZ4 code for this and similar handling added for LZ4 Dyn. Handling in LZ4 Dyn: max_distance is 32K for lz4_dyn and will be 64K for LZ4 int max_distance = dynOffset ? MAX_DISTANCE_DYN : MAX_DISTANCE; >And you >rely on lz4_dyn users to do the right thing - namely, to use that 'nice' >`#if (PAGE_SIZE < (32 * KB))'. They don't need to add this code, they just need to choose right compression algorithm that fits their requirement. If source length is less than 32K then lz4_dyn would give better compression ratio then LZ4. Considering ZRAM as a user for LZ4 dyn, we have added this check for PAGE_SIZE which is source length. This code adds lz4 dyn to preferred list of compression algorithm when PAGE size is less than 32K. >Apart from that, lz4_dyn supports only data >in up to page_size chunks. Suppose my system has page_size of less than 32K, >so I legitimately can enable lz4_dyn, but suppose that I will use it >somewhere where I don't work with page_size-d chunks. Will I able to just >do tfm->compress(src, sz) on random buffers? The whole thing looks to be >quite fragile. No thats not true, lz4_dyn can work for random buffers and it need not be of page size chunks. There is no difference in Lz4 and Lz4 dyn working. Only difference is LZ4 dyn doesn't use fixed offset size, this concept already getting used in LZO which uses dynamic size of Metadata based on Match Length and Match offset. It uses different markers for this which defines length of meta data. lzodefs.h: #define M1_MAX_OFFSET 0x0400 #define M2_MAX_OFFSET 0x0800 #define M3_MAX_OFFSET 0x4000 #define M4_MAX_OFFSET 0xbfff #define M1_MIN_LEN 2 #define M1_MAX_LEN 2 #define M2_MIN_LEN 3 #define M2_MAX_LEN 8 #define M3_MIN_LEN 3 #define M3_MAX_LEN 33 #define M4_MIN_LEN 3 #define M4_MAX_LEN 9 #define M1_MARKER 0 #define M2_MARKER 64 #define M3_MARKER 32 #define M4_MARKER 16 Similarly for LZ4 Dyn, we have used 1 bit as a marker to determine offset length. Thanks & Regards, Vaneet Narang rcptInfo.txt Description: Binary data
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
On (04/02/18 11:21), Maninder Singh wrote: [..] > >> static const char * const backends[] = { > >> "lzo", > >> #if IS_ENABLED(CONFIG_CRYPTO_LZ4) > >> "lz4", > >> +#if (PAGE_SIZE < (32 * KB)) > >> +"lz4_dyn", > >> +#endif > > > >This is not the list of supported algorithms. It's the list of > >recommended algorithms. You can configure zram to use any of > >available and known to Crypto API algorithms. Including lz4_dyn > >on PAGE_SIZE > 32K systems. > > > Yes, we want to integrate new compression(lz4_dyn) for ZRAM > only if PAGE_SIZE is less than 32KB to get maximum benefit. > so we added lz4_dyn to available list of ZRAM compression alhorithms. Which is not what I was talking about. You shrink a 2 bytes offset down to a 1 byte offset, thus you enforce that 'page should be less than 32KB', which I'm sure will be confusing. And you rely on lz4_dyn users to do the right thing - namely, to use that 'nice' `#if (PAGE_SIZE < (32 * KB))'. Apart from that, lz4_dyn supports only data in up to page_size chunks. Suppose my system has page_size of less than 32K, so I legitimately can enable lz4_dyn, but suppose that I will use it somewhere where I don't work with page_size-d chunks. Will I able to just do tfm->compress(src, sz) on random buffers? The whole thing looks to be quite fragile. -ss
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
Hi, >> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c >> index 4ed0a78..5bc5aab 100644 >> --- a/drivers/block/zram/zcomp.c >> +++ b/drivers/block/zram/zcomp.c >> @@ -17,11 +17,15 @@ >> #include >> >> #include "zcomp.h" >> +#define KB(1 << 10) >> >> static const char * const backends[] = { >> "lzo", >> #if IS_ENABLED(CONFIG_CRYPTO_LZ4) >> "lz4", >> +#if (PAGE_SIZE < (32 * KB)) >> +"lz4_dyn", >> +#endif > >This is not the list of supported algorithms. It's the list of >recommended algorithms. You can configure zram to use any of >available and known to Crypto API algorithms. Including lz4_dyn >on PAGE_SIZE > 32K systems. > >-ss Yes, we want to integrate new compression(lz4_dyn) for ZRAM only if PAGE_SIZE is less than 32KB to get maximum benefit. so we added lz4_dyn to available list of ZRAM compression alhorithms. Thanks, Manider Singh
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
Hi Nick, Thanks for your comments, Please check my reply to few of your comments. I will be sharing benchmarking figures separately. > >> +if (curr_offset > 127) { >> +curr_offset = (curr_offset << 1) | DYN_BIT; >> +LZ4_writeLE16(op, (U16)curr_offset); >> +op += 2; >> +} else { >> +curr_offset = curr_offset << 1; >> +*op = (BYTE)curr_offset; >> +op++; >> +} > >The standard way to do variable sized integers is to use the high-bit as >the control bit, not the low-bit. Do you have benchmarks to show that using >the low-bit is faster than using the high-bit? If so, lets comment in the >code, if not lets use the high-bit. > We are not sure about performance difference of using low or high bit but Since as per LZ4 specification, offset needs to be stored in little endian format so keeping Low bit as control bit makes it easier to retreive offset when offset is spread across two bytes. offset = LZ4_readLE16(ip); if (offset & 0x1) { offset = offset >> 1; // Just one Right shift ip += 2; } else { offset = (offset & 0xff) >> 1; // Only Two Operation for one byte offset ip += 1; } So not sure if keeping High bit will make much difference as we will be needing same or more operations to get offset in case of High bit. >> /* copy literals */ >> cpy = op + length; >> if (((endOnInput) && ((cpy > (partialDecoding ? oexit : >> oend - MFLIMIT)) >> -|| (ip + length > iend - (2 + 1 + LASTLITERALS >> -|| ((!endOnInput) && (cpy > oend - >> WILDCOPYLENGTH))) { >> +|| (ip + length > iend - (2 + LASTLITERALS >> +|| ((!endOnInput) && (cpy > oend - WILDCOPYLENGTH - >> 1))) { >> if (partialDecoding) { >> if (cpy > oend) { >> /* >> @@ -188,13 +190,31 @@ static FORCE_INLINE int LZ4_decompress_generic( >> break; >> } >> >> -LZ4_wildCopy(op, ip, cpy); >> +if (dynOffset && length < 4) >> +LZ4_copy4(op, ip); >> +else >> +LZ4_wildCopy(op, ip, cpy); >> + > >The LZ4 format enforces that the last 5 bytes are literals so that >LZ4_wildCopy() can be used here. I suspect that having this extra branch >here for `dynOffset` mode hurts decompression speed. > This check is made to handle one byte read overflow when decompressing last frame. wildCopy does 8 byte copy blindly. Issue Case: 0 length literal followed by 1 byte offset and then it ends with one byte token and 5 Byte Last Literals. With this combination only 7 bytes (1 Byte Offset + 1 Byte Token + 5 Byte Literal) remains at the end of input buffer so reading 8 bytes from input buffer results in 1 byte overflow. Since 1 byte offset is not there in original LZ4, so this issue is not possible. To avoid overhead of this check, we planned to use 6 Byte as Last literal Length. -#define LASTLITERALS5 +#define LASTLITERALS6 >> >> int LZ4_decompress_safe(const char *source, char *dest, >> -int compressedSize, int maxDecompressedSize) >> +int compressedSize, int maxDecompressedSize, bool dynOffset) >> { >> return LZ4_decompress_generic(source, dest, compressedSize, >> maxDecompressedSize, endOnInputSize, full, 0, >> -noDict, (BYTE *)dest, NULL, 0); >> +noDict, (BYTE *)dest, NULL, 0, dynOffset); > >You'll need to use the same trick that LZ4_compress_fast() uses, by hard >coding `dynOffset`. We want the compiler to generate too version of >LZ4_decompress_generic(), one with `dynOffset` and one with `!dynOffset`. >That way the tight loop won't the branches that check `dynOffset`. > >if (dynOffset) >return LZ4_decompress_generic(..., true); >else >return LZ4_decompress_generic(..., false); > >Without this trick, I expect that this patch causes a regression to both >LZ4 and LZ4_DYN decompression speed. Since there is no backward compatibility of our solution with original LZ4 so we are bit confused if we should make it as separate API to avoid overhead of dynOffset checks every where in the code and also to avoid changing prototype of exported functions LZ4_decompress/LZ4_compress OR we should keep these checks to avoid redundant code. Kindly suggest Thanks & Regards, Vaneet Narang
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
Hi Maninder, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v4.16-rc6] [cannot apply to next-20180322] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Maninder-Singh/cover-letter-lz4-Implement-lz4-with-dynamic-offset-length/20180323-064137 config: i386-randconfig-s1-03221113 (attached as .config) compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026 reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): fs/squashfs/lz4_wrapper.c: In function 'lz4_uncompress': >> fs/squashfs/lz4_wrapper.c:110:8: error: too few arguments to function >> 'LZ4_decompress_safe' res = LZ4_decompress_safe(stream->input, stream->output, ^~~ In file included from fs/squashfs/lz4_wrapper.c:13:0: include/linux/lz4.h:301:5: note: declared here int LZ4_decompress_safe(const char *source, char *dest, int compressedSize, ^~~ vim +/LZ4_decompress_safe +110 fs/squashfs/lz4_wrapper.c 9c06a46f Phillip Lougher2014-11-27 91 9c06a46f Phillip Lougher2014-11-27 92 9c06a46f Phillip Lougher2014-11-27 93 static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm, 9c06a46f Phillip Lougher2014-11-27 94 struct buffer_head **bh, int b, int offset, int length, 9c06a46f Phillip Lougher2014-11-27 95 struct squashfs_page_actor *output) 9c06a46f Phillip Lougher2014-11-27 96 { 9c06a46f Phillip Lougher2014-11-27 97 struct squashfs_lz4 *stream = strm; 9c06a46f Phillip Lougher2014-11-27 98 void *buff = stream->input, *data; 9c06a46f Phillip Lougher2014-11-27 99 int avail, i, bytes = length, res; 9c06a46f Phillip Lougher2014-11-27 100 9c06a46f Phillip Lougher2014-11-27 101 for (i = 0; i < b; i++) { 9c06a46f Phillip Lougher2014-11-27 102 avail = min(bytes, msblk->devblksize - offset); 9c06a46f Phillip Lougher2014-11-27 103 memcpy(buff, bh[i]->b_data + offset, avail); 9c06a46f Phillip Lougher2014-11-27 104 buff += avail; 9c06a46f Phillip Lougher2014-11-27 105 bytes -= avail; 9c06a46f Phillip Lougher2014-11-27 106 offset = 0; 9c06a46f Phillip Lougher2014-11-27 107 put_bh(bh[i]); 9c06a46f Phillip Lougher2014-11-27 108 } 9c06a46f Phillip Lougher2014-11-27 109 d21b5ff1 Sven Schmidt 2017-02-24 @110 res = LZ4_decompress_safe(stream->input, stream->output, d21b5ff1 Sven Schmidt 2017-02-24 111 length, output->length); d21b5ff1 Sven Schmidt 2017-02-24 112 d21b5ff1 Sven Schmidt 2017-02-24 113 if (res < 0) 9c06a46f Phillip Lougher2014-11-27 114 return -EIO; 9c06a46f Phillip Lougher2014-11-27 115 d21b5ff1 Sven Schmidt 2017-02-24 116 bytes = res; 9c06a46f Phillip Lougher2014-11-27 117 data = squashfs_first_page(output); 9c06a46f Phillip Lougher2014-11-27 118 buff = stream->output; 9c06a46f Phillip Lougher2014-11-27 119 while (data) { 09cbfeaf Kirill A. Shutemov 2016-04-01 120 if (bytes <= PAGE_SIZE) { 9c06a46f Phillip Lougher2014-11-27 121 memcpy(data, buff, bytes); 9c06a46f Phillip Lougher2014-11-27 122 break; 9c06a46f Phillip Lougher2014-11-27 123 } 09cbfeaf Kirill A. Shutemov 2016-04-01 124 memcpy(data, buff, PAGE_SIZE); 09cbfeaf Kirill A. Shutemov 2016-04-01 125 buff += PAGE_SIZE; 09cbfeaf Kirill A. Shutemov 2016-04-01 126 bytes -= PAGE_SIZE; 9c06a46f Phillip Lougher2014-11-27 127 data = squashfs_next_page(output); 9c06a46f Phillip Lougher2014-11-27 128 } 9c06a46f Phillip Lougher2014-11-27 129 squashfs_finish_page(output); 9c06a46f Phillip Lougher2014-11-27 130 d21b5ff1 Sven Schmidt 2017-02-24 131 return res; 9c06a46f Phillip Lougher2014-11-27 132 } 9c06a46f Phillip Lougher2014-11-27 133 :: The code at line 110 was first introduced by commit :: d21b5ff12df45a65bb220c7e8103a5f0f5609377 fs/pstore: fs/squashfs: change usage of LZ4 to work with new LZ4 version :: TO: Sven Schmidt <4ssch...@informatik.uni-hamburg.de> :: CC: Linus Torvalds --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
Hi Maninder, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v4.16-rc6] [cannot apply to next-20180322] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Maninder-Singh/cover-letter-lz4-Implement-lz4-with-dynamic-offset-length/20180323-064137 config: i386-randconfig-x073-201811 (attached as .config) compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): fs/pstore/platform.c: In function 'decompress_lz4': >> fs/pstore/platform.c:357:8: error: too few arguments to function >> 'LZ4_decompress_safe' ret = LZ4_decompress_safe(in, out, inlen, outlen); ^~~ In file included from fs/pstore/platform.c:38:0: include/linux/lz4.h:301:5: note: declared here int LZ4_decompress_safe(const char *source, char *dest, int compressedSize, ^~~ vim +/LZ4_decompress_safe +357 fs/pstore/platform.c 8cfc8ddc Geliang Tang 2016-02-18 352 8cfc8ddc Geliang Tang 2016-02-18 353 static int decompress_lz4(void *in, void *out, size_t inlen, size_t outlen) 8cfc8ddc Geliang Tang 2016-02-18 354 { 8cfc8ddc Geliang Tang 2016-02-18 355 int ret; 8cfc8ddc Geliang Tang 2016-02-18 356 d21b5ff1 Sven Schmidt 2017-02-24 @357 ret = LZ4_decompress_safe(in, out, inlen, outlen); d21b5ff1 Sven Schmidt 2017-02-24 358 if (ret < 0) { d21b5ff1 Sven Schmidt 2017-02-24 359 /* d21b5ff1 Sven Schmidt 2017-02-24 360* LZ4_decompress_safe will return an error code d21b5ff1 Sven Schmidt 2017-02-24 361* (< 0) if decompression failed d21b5ff1 Sven Schmidt 2017-02-24 362*/ d21b5ff1 Sven Schmidt 2017-02-24 363 pr_err("LZ4_decompress_safe error, ret = %d!\n", ret); 8cfc8ddc Geliang Tang 2016-02-18 364 return -EIO; 8cfc8ddc Geliang Tang 2016-02-18 365 } 8cfc8ddc Geliang Tang 2016-02-18 366 d21b5ff1 Sven Schmidt 2017-02-24 367 return ret; 8cfc8ddc Geliang Tang 2016-02-18 368 } 8cfc8ddc Geliang Tang 2016-02-18 369 :: The code at line 357 was first introduced by commit :: d21b5ff12df45a65bb220c7e8103a5f0f5609377 fs/pstore: fs/squashfs: change usage of LZ4 to work with new LZ4 version :: TO: Sven Schmidt <4ssch...@informatik.uni-hamburg.de> :: CC: Linus Torvalds --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
CC: Vaneet Narang. On (03/21/18 10:10), Maninder Singh wrote: > diff --git a/lib/lz4/lz4_compress.c b/lib/lz4/lz4_compress.c > index cc7b6d4..185c358 100644 > --- a/lib/lz4/lz4_compress.c > +++ b/lib/lz4/lz4_compress.c > @@ -183,7 +183,8 @@ static FORCE_INLINE int LZ4_compress_generic( > const tableType_t tableType, > const dict_directive dict, > const dictIssue_directive dictIssue, > -const U32 acceleration) > +const U32 acceleration, > +const Dynamic_Offset dynOffset) > { > const BYTE *ip = (const BYTE *) source; > const BYTE *base; > @@ -199,6 +200,7 @@ static FORCE_INLINE int LZ4_compress_generic( > > BYTE *op = (BYTE *) dest; > BYTE * const olimit = op + maxOutputSize; > +int max_distance = dynOffset ? MAX_DISTANCE_DYN : MAX_DISTANCE; Lets mark this variable `const`. If the compiler doesn't realize that this variable and `dynOffset` are compile time constants, I expect the speed to be impacted. > > U32 forwardH; > size_t refDelta = 0; > @@ -245,6 +247,7 @@ static FORCE_INLINE int LZ4_compress_generic( > for ( ; ; ) { > const BYTE *match; > BYTE *token; > +int curr_offset; > > /* Find a match */ > { > @@ -285,7 +288,7 @@ static FORCE_INLINE int LZ4_compress_generic( > : 0) > || ((tableType == byU16) > ? 0 > -: (match + MAX_DISTANCE < ip)) > +: (match + max_distance < ip)) > || (LZ4_read32(match + refDelta) > != LZ4_read32(ip))); > } > @@ -328,8 +331,26 @@ static FORCE_INLINE int LZ4_compress_generic( > > _next_match: > /* Encode Offset */ > -LZ4_writeLE16(op, (U16)(ip - match)); > -op += 2; > +if (dynOffset) { > +curr_offset = (U16)(ip - match); > + > +/* > + * If Ofsset is greater than 127, we need 2 bytes > + * to store it. Otherwise 1 byte is enough. > + */ > +if (curr_offset > 127) { > +curr_offset = (curr_offset << 1) | DYN_BIT; > +LZ4_writeLE16(op, (U16)curr_offset); > +op += 2; > +} else { > +curr_offset = curr_offset << 1; > +*op = (BYTE)curr_offset; > +op++; > +} The standard way to do variable sized integers is to use the high-bit as the control bit, not the low-bit. Do you have benchmarks to show that using the low-bit is faster than using the high-bit? If so, lets comment in the code, if not lets use the high-bit. > +} else { > +LZ4_writeLE16(op, (U16)(ip - match)); > +op += 2; > +} > > /* Encode MatchLength */ > { > @@ -480,39 +501,70 @@ static int LZ4_compress_fast_extState( > return LZ4_compress_generic(ctx, source, > dest, inputSize, 0, > noLimit, byU16, noDict, > -noDictIssue, acceleration); > +noDictIssue, acceleration, NoDynOffset); > else > return LZ4_compress_generic(ctx, source, > dest, inputSize, 0, > noLimit, tableType, noDict, > -noDictIssue, acceleration); > +noDictIssue, acceleration, NoDynOffset); > } else { > if (inputSize < LZ4_64Klimit) > return LZ4_compress_generic(ctx, source, > dest, inputSize, > maxOutputSize, limitedOutput, byU16, noDict, > -noDictIssue, acceleration); > +noDictIssue, acceleration, NoDynOffset); > else > return LZ4_compress_generic(ctx, source, > dest, inputSize, > maxOutputSize, limitedOutput, tableType, >noDict, > -noDictIssue, acceleration); > +noDictIssue, acceleration, NoDynOffset); > } > } > > +static int LZ4_compress_fast_extState_dynamic( > +void *state, > +const char *source, > +char *dest, >
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
On (03/21/18 10:10), Maninder Singh wrote: > diff --git a/lib/lz4/lz4_compress.c b/lib/lz4/lz4_compress.c > index cc7b6d4..185c358 100644 > --- a/lib/lz4/lz4_compress.c > +++ b/lib/lz4/lz4_compress.c > @@ -183,7 +183,8 @@ static FORCE_INLINE int LZ4_compress_generic( > const tableType_t tableType, > const dict_directive dict, > const dictIssue_directive dictIssue, > - const U32 acceleration) > + const U32 acceleration, > + const Dynamic_Offset dynOffset) > { > const BYTE *ip = (const BYTE *) source; > const BYTE *base; > @@ -199,6 +200,7 @@ static FORCE_INLINE int LZ4_compress_generic( > > BYTE *op = (BYTE *) dest; > BYTE * const olimit = op + maxOutputSize; > + int max_distance = dynOffset ? MAX_DISTANCE_DYN : MAX_DISTANCE; Lets mark this variable `const`. If the compiler doesn't realize that this variable and `dynOffset` are compile time constants, I expect the speed to be impacted. > > U32 forwardH; > size_t refDelta = 0; > @@ -245,6 +247,7 @@ static FORCE_INLINE int LZ4_compress_generic( > for ( ; ; ) { > const BYTE *match; > BYTE *token; > + int curr_offset; > > /* Find a match */ > { > @@ -285,7 +288,7 @@ static FORCE_INLINE int LZ4_compress_generic( > : 0) > || ((tableType == byU16) > ? 0 > - : (match + MAX_DISTANCE < ip)) > + : (match + max_distance < ip)) > || (LZ4_read32(match + refDelta) > != LZ4_read32(ip))); > } > @@ -328,8 +331,26 @@ static FORCE_INLINE int LZ4_compress_generic( > > _next_match: > /* Encode Offset */ > - LZ4_writeLE16(op, (U16)(ip - match)); > - op += 2; > + if (dynOffset) { > + curr_offset = (U16)(ip - match); > + > + /* > + * If Ofsset is greater than 127, we need 2 bytes > + * to store it. Otherwise 1 byte is enough. > + */ > + if (curr_offset > 127) { > + curr_offset = (curr_offset << 1) | DYN_BIT; > + LZ4_writeLE16(op, (U16)curr_offset); > + op += 2; > + } else { > + curr_offset = curr_offset << 1; > + *op = (BYTE)curr_offset; > + op++; > + } The standard way to do variable sized integers is to use the high-bit as the control bit, not the low-bit. Do you have benchmarks to show that using the low-bit is faster than using the high-bit? If so, lets comment in the code, if not lets use the high-bit. > + } else { > + LZ4_writeLE16(op, (U16)(ip - match)); > + op += 2; > + } > > /* Encode MatchLength */ > { > @@ -480,39 +501,70 @@ static int LZ4_compress_fast_extState( > return LZ4_compress_generic(ctx, source, > dest, inputSize, 0, > noLimit, byU16, noDict, > - noDictIssue, acceleration); > + noDictIssue, acceleration, NoDynOffset); > else > return LZ4_compress_generic(ctx, source, > dest, inputSize, 0, > noLimit, tableType, noDict, > - noDictIssue, acceleration); > + noDictIssue, acceleration, NoDynOffset); > } else { > if (inputSize < LZ4_64Klimit) > return LZ4_compress_generic(ctx, source, > dest, inputSize, > maxOutputSize, limitedOutput, byU16, noDict, > - noDictIssue, acceleration); > + noDictIssue, acceleration, NoDynOffset); > else > return LZ4_compress_generic(ctx, source, > dest, inputSize, > maxOutputSize, limitedOutput, tableType, noDict, > - noDictIssue, acceleration); > + noDictIssue, acceleration, NoDynOffset); > } > } > > +static int LZ4_compress_fast_extState_dynamic( > + void *state, > + const char *source, > + char *dest, > + int inputSize, > + int maxOutputSize, > + int acceleration) > +{ > + LZ4_stream_t_internal *ctx = &((LZ4_stream_t > *)state)->internal_donotuse; > + > + LZ4_resetStream((LZ4_stream_t *)state); > + > + if (acceleration < 1) > +
Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.
On (03/21/18 10:10), Maninder Singh wrote: [..] > +static struct crypto_alg alg_lz4_dyn = { > + .cra_name = "lz4_dyn", > + .cra_flags = CRYPTO_ALG_TYPE_COMPRESS, > + .cra_ctxsize= sizeof(struct lz4_ctx), > + .cra_module = THIS_MODULE, > + .cra_list = LIST_HEAD_INIT(alg_lz4_dyn.cra_list), > + .cra_init = lz4_init, > + .cra_exit = lz4_exit, > + .cra_u = { .compress = { > + .coa_compress = lz4_compress_crypto_dynamic, > + .coa_decompress = lz4_decompress_crypto_dynamic } } > +}; [..] > diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c > index 4ed0a78..5bc5aab 100644 > --- a/drivers/block/zram/zcomp.c > +++ b/drivers/block/zram/zcomp.c > @@ -17,11 +17,15 @@ > #include > > #include "zcomp.h" > +#define KB (1 << 10) > > static const char * const backends[] = { > "lzo", > #if IS_ENABLED(CONFIG_CRYPTO_LZ4) > "lz4", > +#if (PAGE_SIZE < (32 * KB)) > + "lz4_dyn", > +#endif This is not the list of supported algorithms. It's the list of recommended algorithms. You can configure zram to use any of available and known to Crypto API algorithms. Including lz4_dyn on PAGE_SIZE > 32K systems. -ss