Re: get operation with expiration time

2016-06-21 Thread Colin Pitrat
We also had the issue at one point of accessing "metadata" associated
with keys, and wondered whether it would make sense to create a new verb
for this (like the gete that you propose).
We concluded it would not.

I would highly recommend considering Dormando's suggestions before doing
this, because a new verb would bring many drawbacks (in addition to having
to maintain the feature).

For one thing, the memcached protocol is bigger than memcached itself. With
the vanilla protocol, you can substitute another datastore supporting the
same protocol for memcached, or easily put a proxy between your servers and
memcached.
If you start touching the protocol, you're locking yourself into the
current architecture by increasing the cost of such changes.

For another, two things that currently seem the same to you may well not
be.
You currently want the same value for memcached's TTL and your application
TTL, but having two separate fields, as in Dormando's option 3, may prove
useful in the future: for example, if you want to keep the data stored even
once it's outdated (e.g. to speed up an archiving process that uses it).
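
Concretely, something like this is what the two fields buy you (untested
Python sketch using pymemcache; the grace period, key and field names are
all made up):

import json
import time

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))

APP_TTL = 300    # seconds the data is considered fresh by the application
GRACE = 3600     # extra time memcached keeps the stale copy around

def set_with_app_ttl(key, payload):
    # The application TTL travels inside the value; memcached's own TTL
    # is longer, so outdated data stays readable (e.g. for archiving).
    wrapped = json.dumps({"expires_at": time.time() + APP_TTL,
                          "payload": payload})
    client.set(key, wrapped, expire=APP_TTL + GRACE)

def get_with_app_ttl(key):
    raw = client.get(key)
    if raw is None:
        return None, False          # true miss: memcached dropped it
    item = json.loads(raw)
    fresh = time.time() < item["expires_at"]
    return item["payload"], fresh   # the caller decides how to treat stale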

Regards,
Colin


Re: get operation with expiration time

2016-06-20 Thread dormando
Hi,

Since you're talking about contributing or forking, would you mind talking
through your use case a bit more? There may be some solutions that fit
better, and if not, a decent path inward.

First, you seem to be using the binary protocol? Can you share what client
and what features you make use of? It's somewhat rare and would be nice to
know.

Next: you seem to rely on the TTL for when your data actually needs to be
updated? Do you ever evict, are you doing a periodic rebuild of all items,
etc.? (sorry, I can't tell what company you're from, in case I'm supposed
to know
this already :P). It'd be useful to know if you can tolerate cache misses
or if this is some kind of in-memory database situation.

What happens on a cache miss, exactly?

If you have the patience, it'd be nice to walk through a few scenarios
and why they do or don't work for you:

1) The fastest way to fill a new cluster of cache nodes is typically to
first-fetch against the new group, and on miss fetch-then-fill from the
old group. Most places I've seen get up to a normal hit ratio within 10
minutes. An hour at most, doing just that. You end up losing the long tail
in a cutover, and it doesn't give you the remaining TTL; but I'd like to
know if this pattern works or if you still have to iterate everything.
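
In client terms it's just a two-level lookaside, roughly like this
(untested Python sketch with pymemcache; the host names and DEFAULT_TTL
are made up):

from pymemcache.client.base import Client

new_group = Client(("cache-new.example.com", 11211))
old_group = Client(("cache-old.example.com", 11211))

DEFAULT_TTL = 300  # assumed application TTL

def warm_get(key):
    value = new_group.get(key)
    if value is not None:
        return value
    value = old_group.get(key)
    if value is not None:
        # Backfill the new group. We can't recover the remaining TTL,
        # which is exactly the limitation discussed in this thread.
        new_group.set(key, value, expire=DEFAULT_TTL)
    return value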

2) I wrote the LRU crawler a few years back. It's wired close to being
able to execute nearly arbitrary code while it rolls through each slab
class. I don't want to let too many cats out: but with the new logger code
being able to migrate sockets between threads (one way, have some
experimental code for two way), it may be possible to replace cachedump
with an LRU crawler extension. This would allow you to more easily dump
valid items and their full headers without impacting the server.
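
For contrast, the cachedump path it would replace looks roughly like this
today (raw-socket Python sketch; the slab class and limit are arbitrary,
and cachedump's output is capped, which is why a crawler-based dump is
attractive):

import socket

def cachedump(host, port, slab_class=1, limit=1000):
    # "stats cachedump <class> <limit>" returns lines like:
    #   ITEM <key> [<size> b; <expiry unix ts> s]
    # followed by END. Output is size-capped, so it can miss items.
    sock = socket.create_connection((host, port))
    sock.sendall(f"stats cachedump {slab_class} {limit}\r\n".encode())
    buf = b""
    while not buf.endswith(b"END\r\n"):
        chunk = sock.recv(4096)
        if not chunk:
            break
        buf += chunk
    sock.close()
    items = []
    for line in buf.decode().splitlines():
        if line.startswith("ITEM "):
            parts = line.split()
            items.append((parts[1], int(parts[4])))  # (key, expiry ts)
    return items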

Would that help, or alleviate the need for the extra command? It sounds like
you need to iterate all items, fetch the header info, then fetch the item
again and dump it into the new server... If using a crawler you could
stream the headers for valid items in one go, then iterate them to fetch
the data back.
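
The second half of that loop might look like this (untested sketch reusing
cachedump() and the pymemcache clients from the earlier snippets):

import time

def warm_from_dump(items, old_group, new_group):
    # items is the (key, expiry-timestamp) list from cachedump()/a crawler.
    now = int(time.time())
    for key, expires_at in items:
        if 0 < expires_at <= now:
            continue              # already expired, skip
        value = old_group.get(key)
        if value is None:
            continue              # evicted or expired in the meantime
        # memcached treats exptime values over 30 days as an absolute unix
        # timestamp, so the dumped expiry can be handed straight back.
        new_group.set(key, value, expire=expires_at)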

3) A really old common safety net is to embed the TTL again inside the
item you're storing. This allows people to "serve while stale" if a
preferred TTL is surpassed but the underlying item lives on. It also
allows you to schedule a background job or otherwise elect a client to
refresh an item once it nears the TTL, while fast serving the current item
to other clients. You can't move the embedded TTL with GAT anymore; you'd
have to swap the item via CAS or similar. This is generally a
best-of-all-worlds approach and has the added safety-net effect. If you
combine this with 1) and a tolerance for long-tail cache misses, you can
bring a new cluster up to
speed in a few minutes without modifying the daemon at all.
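
A minimal sketch of 3), using an "add" lock to elect the refresher rather
than a CAS swap (untested Python/pymemcache; the TTL split, lock-key
suffix, and rebuild() callback are all made up):

import json
import time

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))

SOFT_TTL = 300    # when we'd like a refresh to happen
HARD_TTL = 3600   # when memcached actually drops the item
LOCK_TTL = 30     # how long one client owns the refresh

def store(key, value):
    wrapped = json.dumps({"soft_expires_at": time.time() + SOFT_TTL,
                          "payload": value})
    client.set(key, wrapped, expire=HARD_TTL)

def get_or_refresh(key, rebuild):
    raw = client.get(key)
    if raw is None:                   # hard miss: rebuild inline
        value = rebuild()
        store(key, value)
        return value
    item = json.loads(raw)
    if time.time() >= item["soft_expires_at"]:
        # Elect a single refresher; everyone who loses serves stale.
        if client.add(key + ":lock", "1", expire=LOCK_TTL, noreply=False):
            value = rebuild()
            store(key, value)
            return value
    return item["payload"]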

4) I have been thinking about new commands which return extended details
about the item... unfortunately the protocol code is all dumped into the
same file and is in need of a cleanup. I'm also trying to rethink the main
protocols a little more to make something better in the future. This means
if you fork now it could be pretty hard to maintain until after the
refactoring at least. Also have to keep the binary protocol and text
protocol in parity where possible.

Sorry for the wall of text. You have a few options before having to modify
the thing. Let's be absolutely sure it's what you need, and that you can
get by with the most minimal change.

thanks,
-Dormando

On Mon, 20 Jun 2016, 'Vu Tuan Nguyen' via memcached wrote:

> We'd like to get the expiration time with the value on a get operation.   
> We'd use this new operation mainly for an administrative task--cache warming 
> a new group of servers.
> At times, we want to deploy a new server group to replace the previous one 
> seamlessly--doing so in a way that the client apps don't suffer a significant 
> drop in hit rate.  We
> currently do that by deploying the new server group where the remote client 
> processes are dynamically notified that the new group is in write-only mode.  
> We wait for the
> duration of the normal app TTL when the new server group is sufficiently 
> full, then make the new server group readable--removing the previous group 
> shortly afterward.
> Since we have a lot of caches that have different TTLs, and we manage the 
> caches separately from the client apps that read/write to them, we'd like to 
> make this cache warm-up
> process quicker (and easier operationally).  We want to dynamically warm up 
> the new servers so that we don't need to wait for the full TTL before 
> enabling the new servers.  We
> already can get the keys from elsewhere.  We do have the TTL at the time of 
> the write operation too.  However, using the TTL from this is a bit more 
> complex than we'd like,
> and we also don't get the latest expiration if a get-and-touch operation is 
> used.
>
> Can a new operation (like gete) be added to support this in the binary 
> protocol---maybe returning it i