Adding some notes and ideas after looking more closely at the squashfs
implementation.

One idea I had was to reduce the data cache (or maybe even eliminate it)
by making squashfs decompress directly into the page cache. It turns out
that squashfs already does this if CONFIG_SQUASHFS_FILE_DIRECT=y, which
is the case in our kernel. More precisely, it first tries to decompress
into the page cache directly, and only if that fails does it fall back
to the data cache, so we can't eliminate the cache entirely. But it does
mean that, with that option set, squashfs can probably get by with much
less data cache than CONFIG_SQUASHFS_DECOMP_MULTI and
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU currently allocate. We can't reduce
the data cache for CONFIG_SQUASHFS_DECOMP_SINGLE, since it always needs
enough cache for at least one uncompressed data block.
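
To make the fallback concrete, here is a rough sketch of the control
flow in fs/squashfs/file_direct.c. The helper names are invented for
illustration; the real code is more involved:

/*
 * Rough sketch of the CONFIG_SQUASHFS_FILE_DIRECT read path. The
 * helpers grab_block_pages(), decompress_into_pages() and
 * read_via_data_cache() are invented names, not the real functions.
 */
static int readpage_block_sketch(struct page *target_page, u64 block, int bsize)
{
	/* Try to get page-cache pages covering the whole data block. */
	struct page **pages = grab_block_pages(target_page);

	if (pages)
		/* Fast path: decompress straight into the page cache. */
		return decompress_into_pages(block, bsize, pages);

	/*
	 * Fallback: decompress into the squashfs data cache and copy out
	 * the one page we do hold. This path is why the data cache can't
	 * be removed entirely, only shrunk.
	 */
	return read_via_data_cache(target_page, block, bsize);
}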

Also important to note: although the squashfs data cache size is
determined by which decompressor implementation is selected, the cache
and the decompressor state are managed independently, so there's no
requirement that they be so tightly coupled. The coupling makes sense
when CONFIG_SQUASHFS_FILE_DIRECT is disabled but seems less sensible
when it's enabled.
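
The coupling comes from the data cache being sized by the same
squashfs_max_decompressors() hook that the decompressor implementations
provide. Roughly (paraphrasing fs/squashfs/super.c; exact code varies by
kernel version):

/*
 * Paraphrased from squashfs_fill_super(): the "data" cache gets one
 * block_size-sized entry per decompressor reported by the selected
 * implementation, which is where the coupling comes from.
 */
msblk->read_page = squashfs_cache_init("data",
	squashfs_max_decompressors(), msblk->block_size);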

The decompressor implementations come with their own tradeoffs:

- CONFIG_SQUASHFS_DECOMP_SINGLE: Uses the least memory. However,
decompression is serialized, i.e. for a given superblock only one block
can be decompressed at a time.

- CONFIG_SQUASHFS_DECOMP_MULTI: Allows up to (num_online_cpus() * 2)
parallel decompressions, which means the degree of parallelism can vary
depending on how many CPUs were online when the filesystem was mounted.
It also allocates the same number of blocks for the data cache. This
implementation has more overhead for managing decompressor state memory
than the others.

- CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU: Maintains per-CPU decompressor
state, so it is lockless and there is no overhead for managing the
decompressor state memory. The trade-off is that it uses more RAM, both
for decompressor state and for the data cache. It also allows as many
parallel decompressions as there are CPU cores.
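
Put differently, the three implementations mostly differ in what they
return from squashfs_max_decompressors(), which is also what sizes the
data cache. Paraphrased from decompressor_single.c, decompressor_multi.c
and decompressor_multi_percpu.c (check the actual sources; only one of
these is built, depending on the Kconfig choice):

/* CONFIG_SQUASHFS_DECOMP_SINGLE: one decompressor, one cached block. */
int squashfs_max_decompressors(void)
{
	return 1;
}

/* CONFIG_SQUASHFS_DECOMP_MULTI: up to two per CPU online at mount time. */
int squashfs_max_decompressors(void)
{
	return num_online_cpus() * 2;
}

/* CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU: one per possible CPU. */
int squashfs_max_decompressors(void)
{
	return num_possible_cpus();
}

For example, with the default 128 KiB block size, DECOMP_MULTI on a
4-CPU machine would reserve 8 data-cache blocks, i.e. 1 MiB per mounted
squashfs, on top of the decompressor state itself.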

Based on this, here are a couple of ideas we could try to strike a good
balance between performance and RAM usage:

1. Add a config option for the maximum number of data cache blocks (see
the first sketch after this list). This would allow us to use one of the
implementations that allow parallel decompression without the data cache
size ballooning. As long as most blocks get decompressed directly into
the page cache (which we'll need to verify), having only one or two
blocks in the data cache should not hurt performance.

2. Add a new decompressor implementation that allows full customization
of both the number of parallel decompressions and the number of blocks
in the data cache (see the second sketch below). In my opinion,
CONFIG_SQUASHFS_DECOMP_SINGLE provides too little parallelism, while the
others provide too much on systems with large numbers of CPUs. Instead,
we could set the maximum number of parallel decompressions in the kernel
config (possibly capped at the number of possible CPU cores) and the
number of data blocks to cache to suit our needs.
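
For idea 1, a minimal sketch of what the cap could look like.
CONFIG_SQUASHFS_MAX_CACHE_BLOCKS is an invented option name and the
placement is only illustrative:

/*
 * Sketch of idea 1: cap the number of data cache blocks with a new
 * Kconfig option (CONFIG_SQUASHFS_MAX_CACHE_BLOCKS is an invented name).
 */
#ifdef CONFIG_SQUASHFS_MAX_CACHE_BLOCKS
#define SQUASHFS_CACHE_BLOCKS \
	min_t(int, squashfs_max_decompressors(), CONFIG_SQUASHFS_MAX_CACHE_BLOCKS)
#else
#define SQUASHFS_CACHE_BLOCKS squashfs_max_decompressors()
#endif

/* Then, when filling the superblock, size the data cache with the cap: */
msblk->read_page = squashfs_cache_init("data",
	SQUASHFS_CACHE_BLOCKS, msblk->block_size);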
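
For idea 2, the decompressor side could be as simple as a new
implementation whose parallelism is a compile-time knob, capped at the
number of possible CPUs. CONFIG_SQUASHFS_DECOMP_MAX is an invented name:

/*
 * Sketch of idea 2: a decompressor implementation with configurable
 * parallelism. The data cache block count would be configured
 * separately (see idea 1).
 */
int squashfs_max_decompressors(void)
{
	return min_t(int, CONFIG_SQUASHFS_DECOMP_MAX, num_possible_cpus());
}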


