Re: [PATCH 2/2] zram: use crypto API to compress the page

2015-08-13 Thread Joonsoo Kim
On Thu, Aug 13, 2015 at 12:51:13PM +0900, Sergey Senozhatsky wrote:
 On (08/13/15 11:24), Joonsoo Kim wrote:
  Until now, zram uses compression algorithm through direct call
  to core algorithm function, but, it has drawback that we need to add
  compression algorithm manually to zram if needed. Without this work,
  we cannot utilize various compression algorithms in the system.
  Moreover, to add new compression algorithm, we need to know how to use it
  and this is somewhat time-consuming.
  
 
 I don't like this change, sorry.
 
 Here is why. One of the main reasons why I haven't implemented it using
 crypto API (and it was an option and I thought about it) was and still
 is the fact that crypto API requires tfm for both compress and decompress
 operations. And this is a show stopper. No matter how you implement it,
 it is slower than what we have by definition.

I don't think it is show stopper. Current implementation that uses zstrm
only for compression seems over-optimized thing to me. Some of other
compression algorithms need scratch-buffer to decompress the contents.
As you know, zlib is one of them. I tested some of other decompression
algorithm such as quicklz, wkdm and they also need scratch-buffer.

And, we cannot support crypto algorithm *module* in current
implementation.

If that optimization is really needed for the case that doesn't need
tfm except fetching function pointer, we can implement sharable tfm
in crypto subsystem.

 
 Now zram_bvec_read() operations depends on:
 
 a) other read operations
because read() path now use limited in size idle stream list
 
 b) write operations
because write() path uses same idle streams list
 
 
 
 Literally, you change zram_bvec_read() from a fast
 
   zram_decompress_page(zram, uncmem, index);
 
 to a slow
 
   zstrm = zcomp_strm_find(zram-comp);
   zram_decompress_page(zram, zstrm, uncmem, index);
   zcomp_strm_release(zram-comp, zstrm);
 
 
 you either slow down both write() and read() if you use a single idle
 stream list for both read and write, or double the amount of memory
 used by idle streams if you decide to use separate read and write
 idle stream lists. I see no real reason to do either of those. I like
 that the existing read() is as fast as probably possible.

Yes, there can be some performace regression. But, by allocating some
more streams, we can easily eliminate regression. Memory overhead
would be really small. I think that gaining flexibility and reducing
management overhead can compensate this small memory overhead.

Thanks.

 
  When I tested new algorithms such as zlib, these problems make me hard
  to test them. To prevent these problem in the future, I think that
  using crypto API to compress the page is better approach and this patch
  implements it.
  
  The reason we need to support vairous compression algorithms is that
  zram's performance is highly depend on workload and compression algorithm
  and architecture. Every compression algorithm has it's own strong point.
  For example, zlib is the best for compression ratio, but, it's
  (de)compression speed is rather slow. Against my expectation, in my kernel
  build test with zram swap, in low-memory condition on x86, zlib shows best
  performance than others. In this case, I guess that compression ratio is
  the most important factor. Unlike this situation, on ARM, maybe fast
  (de)compression speed is the most important because it's computation speed
  is slower than x86.
  
  We can't expect what algorithm is the best fit for one's system, because
  it needs too complex calculation. We need to test it in case by case and
  easy to use new compression algorithm by this patch will help
  that situation.
  
  Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com
  ---
   drivers/block/zram/Kconfig | 13 +--
   drivers/block/zram/Makefile|  4 +-
   drivers/block/zram/zcomp.c | 87 
  ++
   drivers/block/zram/zcomp.h | 34 +
   drivers/block/zram/zcomp_lz4.c | 47 ---
   drivers/block/zram/zcomp_lz4.h | 17 -
   drivers/block/zram/zcomp_lzo.c | 47 ---
   drivers/block/zram/zcomp_lzo.h | 17 -
   drivers/block/zram/zram_drv.c  | 28 +-
   9 files changed, 66 insertions(+), 228 deletions(-)
   delete mode 100644 drivers/block/zram/zcomp_lz4.c
   delete mode 100644 drivers/block/zram/zcomp_lz4.h
   delete mode 100644 drivers/block/zram/zcomp_lzo.c
   delete mode 100644 drivers/block/zram/zcomp_lzo.h
  
  diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
  index 386ba3d..36ec96f 100644
  --- a/drivers/block/zram/Kconfig
  +++ b/drivers/block/zram/Kconfig
  @@ -1,8 +1,7 @@
   config ZRAM
  tristate Compressed RAM block device support
  depends on BLOCK  SYSFS  ZSMALLOC
  -   select LZO_COMPRESS
  -   select LZO_DECOMPRESS
  +   select CRYPTO_LZO
  default n
  help
Creates virtual block devices 

Re: [PATCH 2/2] zram: use crypto API to compress the page

2015-08-13 Thread Herbert Xu
On Thu, Aug 13, 2015 at 04:19:41PM +0900, Joonsoo Kim wrote:
 
 If that optimization is really needed for the case that doesn't need
 tfm except fetching function pointer, we can implement sharable tfm
 in crypto subsystem.

I'm happy to consider changes to the crypto compression interface
as long as it continues to support the existing algorithms and users.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] zram: use crypto API to compress the page

2015-08-13 Thread Joonsoo Kim
On Thu, Aug 13, 2015 at 03:32:33PM +0800, Herbert Xu wrote:
 On Thu, Aug 13, 2015 at 04:19:41PM +0900, Joonsoo Kim wrote:
  
  If that optimization is really needed for the case that doesn't need
  tfm except fetching function pointer, we can implement sharable tfm
  in crypto subsystem.
 
 I'm happy to consider changes to the crypto compression interface
 as long as it continues to support the existing algorithms and users.

Happy to hear that.
I will think how it can be improved.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] zram: use crypto API to compress the page

2015-08-12 Thread Sergey Senozhatsky
Removed Herbert, David, Stephan not spam their mail boxes.


I'd rather investigate something like this. 

WARNING!!! NOT TESTED AT ALL! like for real... only compile tested.

RFC


define ZLIB backend the same way as we already do. and introduce
compression frontend -flags field. backend that requires zstream
for read op will set it to ZCOMP_NEED_READ_STRM. zram checks this
flag in read() path and does slow zcomp_strm_find()/zcomp_strm_release()
if backend requires it; otherwise we have our fast just decompress it
path.


this is very ugly and quick. just as an idea at the moment.
I will return to this a bit later.


not-yet-signed-off-by: Sergey Senozhatsky sergey.senozhatsky.w...@gmail.com

---
 drivers/block/zram/Kconfig  |  12 -
 drivers/block/zram/Makefile |   1 +
 drivers/block/zram/zcomp.c  |  21 +++-
 drivers/block/zram/zcomp.h  |   9 +++-
 drivers/block/zram/zcomp_lz4.c  |   2 +-
 drivers/block/zram/zcomp_lzo.c  |   2 +-
 drivers/block/zram/zcomp_zlib.c | 115 
 drivers/block/zram/zcomp_zlib.h |  17 ++
 drivers/block/zram/zram_drv.c   |  15 --
 9 files changed, 184 insertions(+), 10 deletions(-)
 create mode 100644 drivers/block/zram/zcomp_zlib.c
 create mode 100644 drivers/block/zram/zcomp_zlib.h

diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 386ba3d..7afd2ac 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -23,4 +23,14 @@ config ZRAM_LZ4_COMPRESS
default n
help
  This option enables LZ4 compression algorithm support. Compression
- algorithm can be changed using `comp_algorithm' device attribute.
\ No newline at end of file
+ algorithm can be changed using `comp_algorithm' device attribute.
+
+config ZRAM_ZLIB_COMPRESS
+   bool Enable ZLIB algorithm support
+   depends on ZRAM
+   select ZLIB_INFLATE
+   select ZLIB_DEFLATE
+   default n
+   help
+ This option enables ZLIB compression algorithm support. Compression
+ algorithm can be changed using `comp_algorithm' device attribute.
diff --git a/drivers/block/zram/Makefile b/drivers/block/zram/Makefile
index be0763f..0922f54 100644
--- a/drivers/block/zram/Makefile
+++ b/drivers/block/zram/Makefile
@@ -1,5 +1,6 @@
 zram-y :=  zcomp_lzo.o zcomp.o zram_drv.o
 
 zram-$(CONFIG_ZRAM_LZ4_COMPRESS) += zcomp_lz4.o
+zram-$(CONFIG_ZRAM_ZLIB_COMPRESS) += zcomp_zlib.o
 
 obj-$(CONFIG_ZRAM) +=  zram.o
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 965d1af..4f04186 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -19,6 +19,9 @@
 #ifdef CONFIG_ZRAM_LZ4_COMPRESS
 #include zcomp_lz4.h
 #endif
+#ifdef CONFIG_ZRAM_ZLIB_COMPRESS
+#include zcomp_zlib.h
+#endif
 
 /*
  * single zcomp_strm backend
@@ -48,6 +51,9 @@ static struct zcomp_backend *backends[] = {
 #ifdef CONFIG_ZRAM_LZ4_COMPRESS
zcomp_lz4,
 #endif
+#ifdef CONFIG_ZRAM_ZLIB_COMPRESS
+   zcomp_zlib,
+#endif
NULL
 };
 
@@ -313,10 +319,16 @@ int zcomp_compress(struct zcomp *comp, struct zcomp_strm 
*zstrm,
zstrm-private);
 }
 
-int zcomp_decompress(struct zcomp *comp, const unsigned char *src,
+int zcomp_decompress(struct zcomp *comp, struct zcomp_strm *zstrm,
+   const unsigned char *src,
size_t src_len, unsigned char *dst)
 {
-   return comp-backend-decompress(src, src_len, dst);
+   void *private = NULL;
+
+   if (unlikely(zstrm))
+   private = zstrm-private;
+
+   return comp-backend-decompress(src, src_len, dst, private);
 }
 
 void zcomp_destroy(struct zcomp *comp)
@@ -354,5 +366,10 @@ struct zcomp *zcomp_create(const char *compress, int 
max_strm)
kfree(comp);
return ERR_PTR(-ENOMEM);
}
+
+   /* FIXME quick dirty and ugly. ONLY for testing purposes */
+   if (sysfs_streq(compress, zlib))
+   comp-flags |= ZCOMP_NEED_READ_STRM;
+
return comp;
 }
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index 46e2b9f..e3ac8d3 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -12,6 +12,8 @@
 
 #include linux/mutex.h
 
+#define ZCOMP_NEED_READ_STRM   (1  0)
+
 struct zcomp_strm {
/* compression/decompression buffer */
void *buffer;
@@ -31,7 +33,7 @@ struct zcomp_backend {
size_t *dst_len, void *private);
 
int (*decompress)(const unsigned char *src, size_t src_len,
-   unsigned char *dst);
+   unsigned char *dst, void *private);
 
void *(*create)(void);
void (*destroy)(void *private);
@@ -44,6 +46,8 @@ struct zcomp {
void *stream;
struct zcomp_backend *backend;
 
+   int flags;
+
struct zcomp_strm *(*strm_find)(struct zcomp *comp);
void (*strm_release)(struct zcomp *comp, struct zcomp_strm *zstrm);
   

[PATCH 2/2] zram: use crypto API to compress the page

2015-08-12 Thread Joonsoo Kim
Until now, zram uses compression algorithm through direct call
to core algorithm function, but, it has drawback that we need to add
compression algorithm manually to zram if needed. Without this work,
we cannot utilize various compression algorithms in the system.
Moreover, to add new compression algorithm, we need to know how to use it
and this is somewhat time-consuming.

When I tested new algorithms such as zlib, these problems make me hard
to test them. To prevent these problem in the future, I think that
using crypto API to compress the page is better approach and this patch
implements it.

The reason we need to support vairous compression algorithms is that
zram's performance is highly depend on workload and compression algorithm
and architecture. Every compression algorithm has it's own strong point.
For example, zlib is the best for compression ratio, but, it's
(de)compression speed is rather slow. Against my expectation, in my kernel
build test with zram swap, in low-memory condition on x86, zlib shows best
performance than others. In this case, I guess that compression ratio is
the most important factor. Unlike this situation, on ARM, maybe fast
(de)compression speed is the most important because it's computation speed
is slower than x86.

We can't expect what algorithm is the best fit for one's system, because
it needs too complex calculation. We need to test it in case by case and
easy to use new compression algorithm by this patch will help
that situation.

Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com
---
 drivers/block/zram/Kconfig | 13 +--
 drivers/block/zram/Makefile|  4 +-
 drivers/block/zram/zcomp.c | 87 ++
 drivers/block/zram/zcomp.h | 34 +
 drivers/block/zram/zcomp_lz4.c | 47 ---
 drivers/block/zram/zcomp_lz4.h | 17 -
 drivers/block/zram/zcomp_lzo.c | 47 ---
 drivers/block/zram/zcomp_lzo.h | 17 -
 drivers/block/zram/zram_drv.c  | 28 +-
 9 files changed, 66 insertions(+), 228 deletions(-)
 delete mode 100644 drivers/block/zram/zcomp_lz4.c
 delete mode 100644 drivers/block/zram/zcomp_lz4.h
 delete mode 100644 drivers/block/zram/zcomp_lzo.c
 delete mode 100644 drivers/block/zram/zcomp_lzo.h

diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
index 386ba3d..36ec96f 100644
--- a/drivers/block/zram/Kconfig
+++ b/drivers/block/zram/Kconfig
@@ -1,8 +1,7 @@
 config ZRAM
tristate Compressed RAM block device support
depends on BLOCK  SYSFS  ZSMALLOC
-   select LZO_COMPRESS
-   select LZO_DECOMPRESS
+   select CRYPTO_LZO
default n
help
  Creates virtual block devices called /dev/zramX (X = 0, 1, ...).
@@ -14,13 +13,3 @@ config ZRAM
  disks and maybe many more.
 
  See zram.txt for more information.
-
-config ZRAM_LZ4_COMPRESS
-   bool Enable LZ4 algorithm support
-   depends on ZRAM
-   select LZ4_COMPRESS
-   select LZ4_DECOMPRESS
-   default n
-   help
- This option enables LZ4 compression algorithm support. Compression
- algorithm can be changed using `comp_algorithm' device attribute.
\ No newline at end of file
diff --git a/drivers/block/zram/Makefile b/drivers/block/zram/Makefile
index be0763f..9e2b79e 100644
--- a/drivers/block/zram/Makefile
+++ b/drivers/block/zram/Makefile
@@ -1,5 +1,3 @@
-zram-y :=  zcomp_lzo.o zcomp.o zram_drv.o
-
-zram-$(CONFIG_ZRAM_LZ4_COMPRESS) += zcomp_lz4.o
+zram-y :=  zcomp.o zram_drv.o
 
 obj-$(CONFIG_ZRAM) +=  zram.o
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 965d1af..3dc9bfa 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -13,12 +13,9 @@
 #include linux/slab.h
 #include linux/wait.h
 #include linux/sched.h
+#include linux/crypto.h
 
 #include zcomp.h
-#include zcomp_lzo.h
-#ifdef CONFIG_ZRAM_LZ4_COMPRESS
-#include zcomp_lz4.h
-#endif
 
 /*
  * single zcomp_strm backend
@@ -43,36 +40,17 @@ struct zcomp_strm_multi {
wait_queue_head_t strm_wait;
 };
 
-static struct zcomp_backend *backends[] = {
-   zcomp_lzo,
-#ifdef CONFIG_ZRAM_LZ4_COMPRESS
-   zcomp_lz4,
-#endif
-   NULL
-};
-
-static struct zcomp_backend *find_backend(const char *compress)
-{
-   int i = 0;
-   while (backends[i]) {
-   if (sysfs_streq(compress, backends[i]-name))
-   break;
-   i++;
-   }
-   return backends[i];
-}
-
 static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
 {
-   if (zstrm-private)
-   comp-backend-destroy(zstrm-private);
+   if (zstrm-tfm)
+   crypto_free_comp(zstrm-tfm);
free_pages((unsigned long)zstrm-buffer, 1);
kfree(zstrm);
 }
 
 /*
- * allocate new zcomp_strm structure with -private initialized by
- * backend, return NULL on error
+ * allocate new zcomp_strm structure with 

Re: [PATCH 2/2] zram: use crypto API to compress the page

2015-08-12 Thread Sergey Senozhatsky
On (08/13/15 11:24), Joonsoo Kim wrote:
 Until now, zram uses compression algorithm through direct call
 to core algorithm function, but, it has drawback that we need to add
 compression algorithm manually to zram if needed. Without this work,
 we cannot utilize various compression algorithms in the system.
 Moreover, to add new compression algorithm, we need to know how to use it
 and this is somewhat time-consuming.
 

I don't like this change, sorry.

Here is why. One of the main reasons why I haven't implemented it using
crypto API (and it was an option and I thought about it) was and still
is the fact that crypto API requires tfm for both compress and decompress
operations. And this is a show stopper. No matter how you implement it,
it is slower than what we have by definition.

Now zram_bvec_read() operations depends on:

a) other read operations
   because read() path now use limited in size idle stream list

b) write operations
   because write() path uses same idle streams list



Literally, you change zram_bvec_read() from a fast

  zram_decompress_page(zram, uncmem, index);

to a slow

  zstrm = zcomp_strm_find(zram-comp);
  zram_decompress_page(zram, zstrm, uncmem, index);
  zcomp_strm_release(zram-comp, zstrm);


you either slow down both write() and read() if you use a single idle
stream list for both read and write, or double the amount of memory
used by idle streams if you decide to use separate read and write
idle stream lists. I see no real reason to do either of those. I like
that the existing read() is as fast as probably possible.

-ss

 When I tested new algorithms such as zlib, these problems make me hard
 to test them. To prevent these problem in the future, I think that
 using crypto API to compress the page is better approach and this patch
 implements it.
 
 The reason we need to support vairous compression algorithms is that
 zram's performance is highly depend on workload and compression algorithm
 and architecture. Every compression algorithm has it's own strong point.
 For example, zlib is the best for compression ratio, but, it's
 (de)compression speed is rather slow. Against my expectation, in my kernel
 build test with zram swap, in low-memory condition on x86, zlib shows best
 performance than others. In this case, I guess that compression ratio is
 the most important factor. Unlike this situation, on ARM, maybe fast
 (de)compression speed is the most important because it's computation speed
 is slower than x86.
 
 We can't expect what algorithm is the best fit for one's system, because
 it needs too complex calculation. We need to test it in case by case and
 easy to use new compression algorithm by this patch will help
 that situation.
 
 Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com
 ---
  drivers/block/zram/Kconfig | 13 +--
  drivers/block/zram/Makefile|  4 +-
  drivers/block/zram/zcomp.c | 87 
 ++
  drivers/block/zram/zcomp.h | 34 +
  drivers/block/zram/zcomp_lz4.c | 47 ---
  drivers/block/zram/zcomp_lz4.h | 17 -
  drivers/block/zram/zcomp_lzo.c | 47 ---
  drivers/block/zram/zcomp_lzo.h | 17 -
  drivers/block/zram/zram_drv.c  | 28 +-
  9 files changed, 66 insertions(+), 228 deletions(-)
  delete mode 100644 drivers/block/zram/zcomp_lz4.c
  delete mode 100644 drivers/block/zram/zcomp_lz4.h
  delete mode 100644 drivers/block/zram/zcomp_lzo.c
  delete mode 100644 drivers/block/zram/zcomp_lzo.h
 
 diff --git a/drivers/block/zram/Kconfig b/drivers/block/zram/Kconfig
 index 386ba3d..36ec96f 100644
 --- a/drivers/block/zram/Kconfig
 +++ b/drivers/block/zram/Kconfig
 @@ -1,8 +1,7 @@
  config ZRAM
   tristate Compressed RAM block device support
   depends on BLOCK  SYSFS  ZSMALLOC
 - select LZO_COMPRESS
 - select LZO_DECOMPRESS
 + select CRYPTO_LZO
   default n
   help
 Creates virtual block devices called /dev/zramX (X = 0, 1, ...).
 @@ -14,13 +13,3 @@ config ZRAM
 disks and maybe many more.
  
 See zram.txt for more information.
 -
 -config ZRAM_LZ4_COMPRESS
 - bool Enable LZ4 algorithm support
 - depends on ZRAM
 - select LZ4_COMPRESS
 - select LZ4_DECOMPRESS
 - default n
 - help
 -   This option enables LZ4 compression algorithm support. Compression
 -   algorithm can be changed using `comp_algorithm' device attribute.
 \ No newline at end of file
 diff --git a/drivers/block/zram/Makefile b/drivers/block/zram/Makefile
 index be0763f..9e2b79e 100644
 --- a/drivers/block/zram/Makefile
 +++ b/drivers/block/zram/Makefile
 @@ -1,5 +1,3 @@
 -zram-y   :=  zcomp_lzo.o zcomp.o zram_drv.o
 -
 -zram-$(CONFIG_ZRAM_LZ4_COMPRESS) += zcomp_lz4.o
 +zram-y   :=  zcomp.o zram_drv.o
  
  obj-$(CONFIG_ZRAM)   +=  zram.o
 diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
 index 965d1af..3dc9bfa 100644
 ---