Hi All,
This problem is close to my heart too.

On the Blackfin systems we have been working on slobs, slabs and even (piece of)cake allocators inside the ever dynamic 2.6 kernel.

We were, I think, close to a solution, but I for one am having trouble keeping any solution in line with kernel movement.

If you want to stream data buffers, I would take a close look at relayfs.

I also have my own simpler close relation of relayfs that I use in these circumstances. It streams the data into already-allocated I/O channels (really big FIFOs).
This totally avoids any dynamic memory allocation problems.
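
A minimal sketch of the preallocated-FIFO idea, for illustration only. This is not the actual code referred to above, and it is user-space C rather than kernel code; the names (struct fifo, fifo_put, fifo_get) and the 4096-byte size are assumptions. The storage is fixed up front, so the streaming path never calls an allocator:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical preallocated byte FIFO. The size is illustrative and
 * must be a power of two so offsets can be masked instead of divided. */
#define FIFO_SIZE 4096u

struct fifo {
    unsigned char buf[FIFO_SIZE]; /* storage reserved once, never resized */
    size_t head;                  /* total bytes ever written */
    size_t tail;                  /* total bytes ever read */
};

static void fifo_init(struct fifo *f)
{
    assert((FIFO_SIZE & (FIFO_SIZE - 1)) == 0);
    f->head = f->tail = 0;
}

static size_t fifo_used(const struct fifo *f) { return f->head - f->tail; }
static size_t fifo_free(const struct fifo *f) { return FIFO_SIZE - fifo_used(f); }

/* Store up to len bytes; returns how many were actually accepted. */
static size_t fifo_put(struct fifo *f, const void *data, size_t len)
{
    size_t off, first;

    if (len > fifo_free(f))
        len = fifo_free(f);
    off = f->head & (FIFO_SIZE - 1);
    first = FIFO_SIZE - off;          /* room before the wrap point */
    if (first > len)
        first = len;
    memcpy(f->buf + off, data, first);
    memcpy(f->buf, (const unsigned char *)data + first, len - first);
    f->head += len;
    return len;
}

/* Fetch up to len bytes; returns how many were actually copied out. */
static size_t fifo_get(struct fifo *f, void *out, size_t len)
{
    size_t off, first;

    if (len > fifo_used(f))
        len = fifo_used(f);
    off = f->tail & (FIFO_SIZE - 1);
    first = FIFO_SIZE - off;
    if (first > len)
        first = len;
    memcpy(out, f->buf + off, first);
    memcpy((unsigned char *)out + first, f->buf, len - first);
    f->tail += len;
    return len;
}
```

When the FIFO is full, fifo_put accepts fewer bytes than offered instead of allocating more, so the producer has to throttle or drop data. That is the trade made for never touching the page allocator while streaming.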

Just another 10c worth.

Phil Wilshire





David McCullough wrote:
Jivin Jamie Lokier lays it down ...
David McCullough wrote:
Feel free to send in some patches :-)
When they let me past the dark age of 2.4.26-uc0, maybe I will :-)

I have a few ideas to combine the better fragmentation performance of
page_alloc2.c with the speed of page_alloc.c (a hybrid of buddy and
bitmap search), plus some fragmentation-reducing strategies using
zones (nothing to do with uclinux) that were proposed for 2.6 kernels
and did well in measurements.

You know, when that copious free time rolls around :-)

I think everyone is waiting for that one :-)

Are you low on memory ? page_alloc2 gets pretty nasty about trying to
clear the caches etc as often as possible to keep as much contiguous
memory available at all times.
Rapidly allocating and freeing memory: it's streaming video from disk
at rates of 1-2MB/s, on a device with 32MB total for Linux.  Free
memory oscillates, decreasing and then jumping up every 5 seconds (on
the vendor-patched kernel).  "Straight" uclinux keeps the free memory
up more consistently, but at the cost of very high kswapd CPU while
streaming.

That said, I have seen systems where kswapd CPU usage is not a problem,
and obviously there are those where it is.  I don't know the cause.  2
possibilities:

    1) I haven't actively used a 2.4 kernel on a non-MMU system for some
       time and the page_alloc2 code may just be wrong due to a kernel
       update and bit rot.

    2) The usage on these systems is triggering the behaviour.

If you boot the system in a configuration that doesn't use much RAM and
don't start any nasty big apps, is the system idle (i.e. kswapd is
behaving)?  If so, what triggers its rampage?
I think it's the high rate of page allocation which triggers it.

There shouldn't be a need to run kswapd constantly, for file cache
pages: it should be possible to reclaim cache pages rapidly during
allocation, recycling them.  I think that's where page_alloc2.c goes
wrong.  The heuristic interaction between page_alloc.c and kswapd is
rather subtle and tricky, but the basic difference is that
page_alloc.c doesn't maximise free memory all the time; instead, it
keeps track of rapidly reclaimable memory.
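
To illustrate the distinction, here is a toy user-space model of "reclaim during allocation". It is not taken from page_alloc.c or page_alloc2.c; the pool, its states and all names are assumptions made for the sketch. Clean cache pages are counted as reclaimable rather than freed eagerly, and the allocator recycles the oldest one synchronously when no free page is left:

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 8  /* illustrative pool size */

enum page_state { PAGE_FREE, PAGE_CACHE, PAGE_USED };

struct pool {
    enum page_state state[NPAGES];
    unsigned long stamp[NPAGES];  /* tick at which the page was handed out */
    unsigned long tick;
    size_t nr_free;               /* pages in PAGE_FREE */
    size_t nr_reclaimable;        /* clean cache pages, cheap to recycle */
};

static void pool_init(struct pool *p)
{
    for (int i = 0; i < NPAGES; i++) {
        p->state[i] = PAGE_FREE;
        p->stamp[i] = 0;
    }
    p->tick = 0;
    p->nr_free = NPAGES;
    p->nr_reclaimable = 0;
}

/* Oldest page in state s, or -1 if there is none. */
static int pick(const struct pool *p, enum page_state s)
{
    int best = -1;

    for (int i = 0; i < NPAGES; i++)
        if (p->state[i] == s && (best < 0 || p->stamp[i] < p->stamp[best]))
            best = i;
    return best;
}

/* Allocate one page as PAGE_CACHE or PAGE_USED. Falls back to recycling
 * the oldest cache page; fails only when nothing is free or reclaimable. */
static int pool_alloc(struct pool *p, enum page_state as)
{
    int i;

    assert(as != PAGE_FREE);
    i = pick(p, PAGE_FREE);
    if (i >= 0) {
        p->nr_free--;
    } else {
        i = pick(p, PAGE_CACHE);
        if (i < 0)
            return -1;
        p->nr_reclaimable--;
    }
    p->state[i] = as;
    p->stamp[i] = ++p->tick;
    if (as == PAGE_CACHE)
        p->nr_reclaimable++;
    return i;
}

/* What an allocation can count on: free plus rapidly reclaimable. */
static size_t pool_available(const struct pool *p)
{
    return p->nr_free + p->nr_reclaimable;
}
```

In this model a pool filled entirely with cached file data still satisfies allocations, and no background thread has to run ahead of demand to keep the free count high.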

Apart from the CPU difference, that means page_alloc2.c tends to fail
allocations if it really does run out of memory while kswapd is
catching up asynchronously.  (And failed allocations result in execs
crashing, ahem).  It's crashes due to memory shortage which prompted
me to investigate; the CPU differences were a surprise.

A side effect of the high CPU of kswapd with page_alloc2.c in these
situations is that allocation is noticeably slower.  I noticed, to my
great surprise, that rsync was able to fetch files over the network
and write them to disk twice as fast with page_alloc.c.  (4MB/s
instead of 2MB/s).  For ages, I'd assumed it was the driver or hardware.

To summarise, I found these differences:

page_alloc.c:

     Pro: Lower CPU usage of kswapd, especially when streaming files.
     Pro: Doesn't fail allocations when lots of data in filecache;
          reclaims cache pages when needed.
     Pro: Keeps file data cached, if the pages are not required
          for something else.
     Pro: Faster allocation, surprisingly faster sometimes.
     Con: After long uptimes, with fork/execs causing large
          contiguous allocations, eventually memory will be too
          fragmented for fork/execs and the allocator is unable
          to recover.  So after long uptimes, the system will
          fail to allow telnet logins, for example, but will still
          be functioning in other ways.

page_alloc2.c:

     Con: Higher CPU usage of kswapd, especially when streaming files.
     Con: Fails allocations when lots of data in filecache which could
          be reclaimed, sometimes.
     Con: Evicts cached file data regularly.  Even tiny files which are
          read very often from disk will do I/O periodically, instead
          of always reading from cache.
     Con: Slower allocation, surprisingly so sometimes.
     Pro: After long uptimes, with fork/execs causing large contiguous
          allocations, and simultaneous streaming file data, it
          manages to keep different types of allocation separate
          enough that fragmentation is not inevitable.  Indefinitely
          long uptimes are realistically possible.
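
One way to picture "keeping different types of allocation separate" is a page bitmap searched from opposite ends. This is only a sketch under that assumption, not a description of how page_alloc2.c is actually implemented; the names and the 16-page map are hypothetical. Large contiguous requests are placed from the bottom, single short-lived cache pages from the top, so cache churn does not punch holes in the region used for big allocations:

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 16  /* illustrative map size */

struct pagemap {
    unsigned char used[NPAGES];  /* 1 = allocated, 0 = free */
};

/* First-fit search upward from page 0 for n contiguous free pages.
 * Intended for large requests such as fork/exec images. */
static int alloc_low(struct pagemap *m, size_t n)
{
    size_t run = 0;

    assert(n > 0 && n <= NPAGES);
    for (size_t i = 0; i < NPAGES; i++) {
        run = m->used[i] ? 0 : run + 1;
        if (run == n) {
            size_t start = i + 1 - n;
            for (size_t j = start; j <= i; j++)
                m->used[j] = 1;
            return (int)start;
        }
    }
    return -1;
}

/* Single page taken from the top of the map downward.
 * Intended for short-lived pages such as streamed file data. */
static int alloc_high(struct pagemap *m)
{
    for (size_t i = NPAGES; i-- > 0; ) {
        if (!m->used[i]) {
            m->used[i] = 1;
            return (int)i;
        }
    }
    return -1;
}

static void free_pages(struct pagemap *m, int start, size_t n)
{
    for (size_t j = 0; j < n; j++)
        m->used[(size_t)start + j] = 0;
}
```

The holes left by freed cache pages stay near the top of the map, so the bottom remains usable for contiguous runs until the two regions actually meet.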

In the end, we stuck with page_alloc2.c because of that last point.
Our systems either crash and burn (with watchdog recovery), or telnet
still works :) But we like every performance characteristic of
page_alloc.c more.

The CPU usage of kswapd was a problem, and the crashing when too much
file data was cached (due to fast streaming) was a big problem, so we
tuned kswapd to a sweet spot for this application, and did everything
possible with XIP-in-RAM to free up memory.  Currently we have 11MB
free (out of 32MB) which seems to be enough.  It seems extravagant,
but we found that with much less, the system crashes from time to time.

Great summary.  Basically it backs up every reason why page_alloc2 was
created.  We had routers running ipsec/pptp/whatever and 4MB of RAM.
Without page_alloc2 they pretty much failed to boot,  let alone stay up
for months on end,  thus page_alloc2.

I am with you though,  it should be able to do what it does without
the CPU overhead.

Cheers,
Davidm


_______________________________________________
uClinux-dev mailing list
uClinux-dev@uclinux.org
http://mailman.uclinux.org/mailman/listinfo/uclinux-dev