Hi guys, thanks for all the answers.

>Would it help to just use a shorter expire time on everything since it is being
>evicted anyway, then write back anything you are actively reusing to
>bump up the time to live?  That way less active data gets out of the
>way sooner with no extra work.

That's the problem: it won't. The cache would grow indefinitely with
just 3 slab classes (if I set the limit to 2GB it uses up all the
memory in a day; if I set it to 14GB it still uses all available
memory within a week). And I can't predict what gets evicted. So for a
dataset that would fit perfectly in 100MB of RAM I need 1GB and still
get evictions... I mean I don't strictly need 1GB, I could use 200MB
and just accept more evictions...

For the "running multiple copies"... im using persistent connection
but are you sure the amount of TCP communication will be good for
performance. I mean even locking the whole slab that has 1mb and
scanning it? Will it take more than 1 ms on modern machine? Beside
it's complicated to rewrite application like this.
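
(To sanity-check that 1 ms figure, here is a rough standalone timing
sketch, plain C and nothing to do with memcached internals, assuming a
1MB slab page; compile with -O2 and see what it prints on your own
hardware:)

/* Rough micro-benchmark: copy a 1 MB "slab page" and scan it byte by
 * byte.  Standalone sketch, not memcached code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define PAGE_SIZE (1024 * 1024)

int main(void) {
    char *page = malloc(PAGE_SIZE);   /* stands in for one slab page */
    char *copy = malloc(PAGE_SIZE);   /* preallocated scratch buffer */
    memset(page, 0xAB, PAGE_SIZE);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    memcpy(copy, page, PAGE_SIZE);            /* the "lock + copy" cost */
    unsigned long sum = 0;
    for (size_t i = 0; i < PAGE_SIZE; i++)    /* the "scan" cost */
        sum += (unsigned char)copy[i];

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double us = (t1.tv_sec - t0.tv_sec) * 1e6
              + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("copy + scan of 1 MB: %.1f us (checksum %lu)\n", us, sum);
    free(page);
    free(copy);
    return 0;
}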

@Dormando... why do you call it "bizarre"? :) It shouldn't be much
different from rebalancing slabs.

What do you think about forking the app (I mean forking the in-memory
process)? It should work well on a modern kernel without locking,
because you have copy-on-write. Or maybe locking and then copying a
whole single slab page? I could allocate a buffer the size of a single
slab page, take the LOCK, copy ONE slab page into the buffer, and use
another thread to build a list of items we can remove. Copying e.g.
1 MB of memory should happen in no time.
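
Something like this is what I have in mind. It's only a sketch with
made-up names (slab_lock, fake_slab_page, SLAB_PAGE_SIZE are all
invented, not memcached's real internals); the point is just that the
critical section is nothing more than one memcpy:

/* Sketch only: every name here is a stand-in for whatever the real
 * slab lock and page pointer would be.  Hold the lock only for the
 * duration of one memcpy, then let another thread analyze the copy. */
#include <pthread.h>
#include <string.h>

#define SLAB_PAGE_SIZE (1024 * 1024)        /* assuming 1 MB slab pages    */

static pthread_mutex_t slab_lock = PTHREAD_MUTEX_INITIALIZER;
static char fake_slab_page[SLAB_PAGE_SIZE]; /* stands in for one real page */
static char snapshot[SLAB_PAGE_SIZE];       /* preallocated scratch buffer */

void snapshot_slab_page(void) {
    pthread_mutex_lock(&slab_lock);
    memcpy(snapshot, fake_slab_page, SLAB_PAGE_SIZE);  /* only work under lock */
    pthread_mutex_unlock(&slab_lock);
    /* another thread can now scan 'snapshot' without holding any lock */
}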

Generally, do you think I should move the cleanup into the storage
engine? How far along is that (production-ready?)

> The worst we do is in slab rebalance, which holds a slab logically and 
> glances at it
> with tiny locks.
The good thing about this kind of cleanup is that you wouldn't have to
use tiny locks (I think). Just lock the slab, copy its memory, then
wake up some thread to take a look, add the keys to a list, and
process the list from time to time (or am I wrong?).
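
Again only a sketch with invented names, assuming the snapshot buffer
from above: the thread that made the copy signals a condition
variable, and a low-priority background thread wakes up, walks the
copy, and queues up the candidate keys:

/* Sketch of the handoff.  All names are invented for illustration;
 * scan_snapshot_for_expired() is a hypothetical helper that would walk
 * the copied page and append expired keys to a list drained later. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  work_cond = PTHREAD_COND_INITIALIZER;
static bool snapshot_ready = false;

/* producer side: called right after the memcpy above */
void wake_cleanup_thread(void) {
    pthread_mutex_lock(&work_lock);
    snapshot_ready = true;
    pthread_cond_signal(&work_cond);
    pthread_mutex_unlock(&work_lock);
}

/* consumer side: the low-priority background thread */
void *cleanup_thread(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&work_lock);
        while (!snapshot_ready)
            pthread_cond_wait(&work_cond, &work_lock);
        snapshot_ready = false;
        pthread_mutex_unlock(&work_lock);

        /* scan_snapshot_for_expired();  <- hypothetical, see next sketch */
    }
    return NULL;
}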

Can you give me some pointers please?

For now I see you're using:
it = heads[slabs_clsid];
and then iterating with it = it->next;

That's probably why you say it's too slow... but what if we just
lock => copy one slab page's memory => unlock => analyze the copy =>
[make 100 GET requests => sleep] and repeat? Items within a slab class
have a fixed size, so we know exactly where each key and expiration
time is, right?
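
To show what I mean by the fixed-size stride, here's a purely
hypothetical scan over the copied page. I'm NOT claiming this matches
memcached's real item header; 'struct fake_item' and CHUNK_SIZE are
assumptions, it only illustrates the walking pattern:

/* Sketch: walk a copied slab page at a fixed stride and collect keys
 * that look expired.  The header layout below is made up. */
#include <stdio.h>
#include <time.h>

#define SLAB_PAGE_SIZE (1024 * 1024)
#define CHUNK_SIZE     256              /* assumed chunk size of this class */

struct fake_item {                      /* hypothetical item header      */
    time_t        exptime;              /* absolute expiration time      */
    unsigned char nkey;                 /* key length                    */
    char          key[];                /* key bytes follow the header   */
};

void scan_snapshot(const char *snapshot, time_t now) {
    for (size_t off = 0; off + CHUNK_SIZE <= SLAB_PAGE_SIZE; off += CHUNK_SIZE) {
        const struct fake_item *it = (const struct fake_item *)(snapshot + off);
        if (it->exptime != 0 && it->exptime < now && it->nkey > 0)
            printf("expired candidate: %.*s\n", (int)it->nkey, it->key);
    }
}

The main loop would then batch, say, 100 of those keys into GET
requests and sleep, like in the [make 100 GETs => sleep] loop above.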

Thanks,
Slawomir.

On 26 Feb, 20:09, Rohit Karlupia <iamro...@gmail.com> wrote:
> Consider cacheismo.
>
> thanks!
> rohitk
>
> On Sun, Feb 26, 2012 at 2:08 PM, <memcac...@googlecode.com> wrote:
> > Status: New
> > Owner: ----
> > Labels: Type-Defect Priority-Medium
>
> > New issue 256 by psla...@wp.pl: PLEASE do not remove cachedump, better
> > dumping feature needed
> > http://code.google.com/p/memcached/issues/detail?id=256
>
> > Hi Guys.
>
> > Sorry for the long post; I wrote it to describe a very big problem with
> > memcached (and a solution).
>
> > My company implemented an adserver that handles tens of millions of
> > impressions daily by extensively using memcached. We use memcached both to
> > cache data and to stage SQL writes. And to my knowledge it is (as of today)
> > the only available tool that can scale writes to SQL (Redis is totally
> > unusable because of its reclaim policy, and other K-V storage tools are out
> > of the equation because they write data to disk).
>
> > So we run tens of thousands of writes per minute through memcached, then
> > we analyze the data every minute and write/update 100-200 SQL rows with
> > aggregated data. We scaled the server from about 40-50 requests/second to
> > more than 800, so it works great.
>
> > But we've got a problem related to the LRU/"lazy" reclaim. The cache fills
> > all available memory and then there are evictions, because the keys have
> > different expire times (some of them just 5 seconds, others 24 hours).
>
> > As a workaround we used cachedump to get a list of keys, then issued a GET
> > command for each so the expired key is immediately reclaimed. And it works;
> > the only problem is that we can't, e.g., dump the whole 10 million keys,
> > because the dump is limited.
>
> > To see how bad it is without this kind of "fast" reclaim: after 20-30
> > hours we have about 2GB of outdated keys that occupy just one slab class.
> > So we can't accommodate different traffic patterns because all slabs are
> > taken, while the non-expired set is around 30MB, so 1970MB out of 2000 is
> > a waste. So with RAM 66 TIMES bigger than actually "needed", without
> > cachedump we'd still get evictions.
>
> > So can you please make an "improved dump" a much-needed feature request? I
> > saw posts by many other people asking about this. Maybe include a
> > command-line option to turn it ON if you're concerned about security?
>
> > If this is not the appropriate place to make feature requests, can you
> > please direct me there?
>
> > Maybe it would be possible to make a separate low-priority thread that
> > scans the key list and issues a GET from time to time. I'm a C++ coder,
> > how hard would it be to make? Would it require a partial or full lock of
> > some important shared resource (like the whole item list), which would
> > make it problematic? Maybe it'd be possible to fork the process
> > (copy-on-write) so it'd have access to the whole list and then just issue
> > GETs to the parent using the text protocol?
>
> > Thanks,
> > Slawomir.
