Trond, any thoughts? I'd like to double-check that there isn't a reason we can't support preallocation without getpagesizes() before attempting to manually patch memcache and play with our production system here.
Thanks,
Mike

On Jul 13, 8:38 pm, Mike Lambert <mlamb...@gmail.com> wrote:
> On Jul 10, 1:37 pm, Matt Ingenthron <ingen...@cep.net> wrote:
> > Mike Lambert wrote:
> > > Currently the -L flag is only enabled if
> > > HAVE_GETPAGESIZES && HAVE_MEMCNTL. I'm curious what the motivation
> > > is for something like that? In our experience, some of our memcache
> > > pools end up fragmenting memory due to the repeated allocation of
> > > 1MB slabs around all the other hashtables and free lists. We know
> > > we want to allocate all memory upfront, but can't seem to do that
> > > on a Linux system.
> >
> > The primary motivation was more about not beating up the CPU's TLB
> > cache when running with large heaps. There are users with large
> > heaps already, so this should help where the underlying OS supports
> > large pages. TLB caches are getting bigger in CPUs, but
> > virtualization is more common and memory heaps are growing faster.
> >
> > I'd like to have some empirical data on how big a difference the -L
> > flag makes, but that assumes a workload profile. I should be able to
> > hack one up and do this with memcachetest, but I've just not done it
> > yet. :)
> >
> > > To put it more concretely, here is a proposed change to make -L do
> > > a contiguous preallocation even on machines without getpagesizes
> > > tuning. My memcached server doesn't seem to crash, but I'm not
> > > sure whether that's a proper litmus test. What are the pros and
> > > cons of doing something like this?
> >
> > This feels more closely related to the -k flag, and it should
> > probably be using madvise() in there somewhere too. It wouldn't
> > necessarily be a bad idea to separate these. I don't know that the
> > day after 1.4.0 is the day to redefine -L, though, but it's not
> > necessarily bad. We should wait for Trond's response to see what he
> > thinks about this, since he implemented it. :)
>
> Haha, yeah, the release of 1.4.0 reminded me I wanted to send this
> email. Sorry for the bad timing.
> -k keeps the memory from getting paged out to disk (which is a very
> good thing in our case).
> -L appears to me (who isn't aware of what getpagesizes does) to be
> related to preallocation with big allocations, which I thought was
> what I wanted.
>
> If you want, I'd be just as happy with a -A flag that turns on
> preallocation without any of the getpagesizes() tuning. It would
> force one big slab allocation and that's it.
>
> > Also, I did some testing with this (-L) some time back (admittedly
> > on OpenSolaris), and the actual behavior will vary based on the
> > memory allocation library you're using and what it does with the OS
> > underneath. I didn't try Linux variations, but that may be
> > worthwhile for you. IIRC, default malloc would wait for a page
> > fault to do the actual memory allocation, so there would still be a
> > risk of fragmentation.
>
> We do use Linux, but haven't tested in production with my modified -L
> patch. What I *have* noticed is that when we allocate a 512MB
> hashtable, it shows up in Linux as an mmap-ed contiguous block of
> memory. From http://m.linuxjournal.com/article/6390: "For very large
> requests, malloc() uses the mmap() system call to find addressable
> memory space. This process helps reduce the negative effects of
> memory fragmentation when large blocks of memory are freed but locked
> by smaller, more recently allocated blocks lying between them and the
> end of the allocated space."
>
> I was hoping to get the same large mmap for all our slabs, out of the
> way in a different part of the address space, in a way that didn't
> interfere with the allocator itself, so that the Linux allocator
> could then focus on balancing just the small allocations without any
> page waste.
>
> Thanks,
> Mike