Not totally stopped, but able to lose one or two instances and
continue to function. That's a defined requirement for all subsystems of
an operation: redundant DBs so you can lose one, several gearmands so
you can lose one, enough memcacheds that losing one or two is fine, etc.
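
The resilience described here falls out of how clients distribute keys
across nodes. A minimal sketch, not any real client's API; rendezvous
(highest-random-weight) hashing stands in for whatever distribution
scheme a client actually uses, and the server names are invented:

```python
import hashlib

def pick_server(key, servers):
    # Rendezvous (highest-random-weight) hashing: every key scores each
    # live server and picks the winner, so removing one server only
    # remaps the keys that server was winning.
    def score(server):
        return hashlib.md5((server + "/" + key).encode()).hexdigest()
    return max(servers, key=score)

servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
keys = [f"user:{i}" for i in range(1000)]
before = {k: pick_server(k, servers) for k in keys}

# cache2 dies: only the keys that lived on cache2 get remapped
survivors = [s for s in servers if s != "cache2:11211"]
after = {k: pick_server(k, survivors) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys remapped")  # roughly a third
```

Losing one node of N invalidates only the roughly 1/N of keys that
lived on it; every other key keeps its mapping, which is what lets the
site keep functioning.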

Anything else is a time bomb, and the bigger you get, the harder you'll
fall.

I definitely don't advocate building a site so that disabling memcached
100% is possible. None of the sites I run will function at all
without it.

-Dormando

On Mon, 23 Feb 2009, Colin Pitrat wrote:

> You mean something like "In n seconds, remove all items that have not been
> updated by then"? That's the way I first thought flush with TTL would
> work, and I was quite disappointed when I understood that it wasn't the
> case. When you give it a thought, that's the same thing as waiting n seconds
> and then executing flush(-n) with Jean-Charles' patch. And unlike the flush
> with a positive TTL, it adds real value.
>
> Dormando, I have the impression that you advise having a configuration
> where memcached can be totally stopped with no impact on the system. If this
> is the case, what's the point of memcached? Isn't it supposed to be designed
> to take some work off the DB tier? You can imagine systems where
> memcached handles non-critical traffic that cannot be handled by the
> DB alone, and where the application runs in degraded mode (e.g. discards
> this traffic) when memcached is stopped.
>
> In any case, you'd want to reduce to a minimum the time when memcached is
> down, and being able to do a progressive flush could be very helpful in a
> lot of cases. It also allows for simpler administration, even if
> node-by-node operations are still needed when upgrading memcached.
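
The degraded mode Colin describes can be sketched as follows. This is a
hypothetical illustration, not code from any real client: `cache`, `db`,
and the `critical` flag are all invented names.

```python
# Hypothetical sketch of degraded mode: while the cache tier is
# unreachable, non-critical traffic is simply discarded so that the DB
# only ever sees the critical load it can handle alone.
def handle_request(key, cache, db, critical):
    try:
        value = cache.get(key)
    except ConnectionError:
        if not critical:
            return None  # degraded mode: discard this traffic
        value = None
    if value is None:
        value = db.load(key)
        try:
            cache.set(key, value)
        except ConnectionError:
            pass  # cache is down; serve from the DB anyway
    return value
```

The design choice is that a cache outage changes *which* requests get
served, not how many hit the DB.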
>
> 2009/2/23 Clint Webb <webb.cl...@gmail.com>
>
> > I like the idea, but I would probably never use it on my production
> > systems, because if I'm making extensive architecture changes (which is
> > the only time this would really be useful), I like to do a complete
> > restart of the memcached process just to ensure that I don't have slabs
> > of memory allocated for large object sizes that I no longer need.
> >
> > That being said, what I would love is a command that says "set all items
> > to expire in X seconds if they don't have an expiry or it is greater than
> > X seconds"... I'm a little bit nervous about entries getting cached
> > without an expiry when they should have one, never going away, and the
> > apps just keeping them forever when they expect them to expire. Almost
> > the same as the negative flush, but not quite.
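
No such server command exists, but the preventive half of Clint's wish
can be done client-side by capping TTLs at write time. A sketch under
invented names; `CappedClient` and the wrapped `set()` signature are not
any real library's API:

```python
# Hypothetical wrapper that caps every item's expiry at max_ttl seconds,
# so nothing can be cached "forever" by accident.
class CappedClient:
    def __init__(self, client, max_ttl):
        self.client = client
        self.max_ttl = max_ttl

    def set(self, key, value, ttl=0):
        # In the memcached protocol an exptime of 0 means "never
        # expire"; replace it (and any TTL above the cap) with max_ttl.
        if ttl == 0 or ttl > self.max_ttl:
            ttl = self.max_ttl
        return self.client.set(key, value, ttl)
```

Unlike the command Clint describes, this only affects new writes; items
already sitting in the cache without an expiry are untouched, which is
exactly why he asks for a server-side operation.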
> >
> > On Mon, Feb 23, 2009 at 10:02 AM, dormando <dorma...@rydia.net> wrote:
> >
> >>
> >> Yo,
> >>
> >> I'm a little confused by this thread... It appears that the point is to
> >> reduce pain or reduce the time required in a full restart of a memcached
> >> cluster.
> >>
> >> This request looks like it would encourage folks to get themselves into
> >> positions where a full restart of a memcached instance is too much pain to
> >> bear. Now you can't upgrade the server, upgrade memcached, or tolerate a
> >> hardware failure. I've seen too many shops get themselves into this
> >> position, and it's frustrating since it stunts our ability to get bugfixes
> >> and features deployed.
> >>
> >> It feels excessive if the only real benefit is being able to do a full
> >> data flush in less time. Is there anything I'm missing?
> >>
> >> -Dormando
> >>
> >> On Sat, 21 Feb 2009, Jean-Charles Redoutey wrote:
> >>
> >> > ok, if you put the future flush in the same basket, I am not
> >> > *offended* ;-)
> >> >
> >> > imho, the main bone of contention we have is that we don't consider
> >> > the age of an item the same way.
> >> >
> >> > As I understand it, for you, this is somehow the "content age", i.e.
> >> > the time the oldest part used to construct the item was put in the
> >> > cache.
> >> >
> >> > For me, this is more a "technical age", i.e. the time the item was
> >> > physically put in the cache, whatever the status of what was used to
> >> > build it.
> >> >
> >> > imho also, both are valid approaches:
> >> >
> >> > - the first one ensures the recentness of the actual content, which
> >> > is really useful if data put in the cache is built from other data
> >> > already in the cache
> >> >
> >> > - the second one ensures the technical recentness, which is important
> >> > for at least 2 points:
> >> >
> >> >   - if the way you put data in the cache evolves, at some point the
> >> > current version may not be compatible with very old ones, and you
> >> > want to ensure you don't have this kind of very old item still in the
> >> > cache
> >> >
> >> >   - if the distribution of the items on the nodes changes (e.g. a
> >> > change in the number of servers), you originally have the item on one
> >> > node, then it goes to another one, and then, after another
> >> > distribution change, it can go back to where it was before (typical
> >> > case: a configuration fallback); if neither the LRU nor the TTL has
> >> > deleted this old item, you will end up not with a cache miss but with
> >> > a cache hit on deprecated data, something we definitely want to
> >> > avoid. Basically, to prevent that, we need to ensure no data in the
> >> > cache is older than the last configuration change.
> >> >
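
The fallback hazard Jean-Charles describes can be simulated with a toy
distribution scheme. Modulo-CRC32 placement is a deliberate
simplification here (real clients typically use consistent hashing),
but the hazard is the same:

```python
# Toy simulation of a stale hit after a configuration fallback: a key
# written under a 3-node config, updated under a 4-node config, then
# read back after falling back to 3 nodes.
import zlib

def node_for(key, n_nodes):
    return zlib.crc32(key.encode()) % n_nodes

# pick a key whose placement actually changes between 3 and 4 nodes
key = next(k for k in (f"user:{i}" for i in range(100))
           if node_for(k, 3) != node_for(k, 4))

nodes = [dict() for _ in range(4)]
nodes[node_for(key, 3)][key] = "v1"   # written under the 3-node config
nodes[node_for(key, 4)][key] = "v2"   # updated under the 4-node config

# fallback to 3 nodes: the read lands on the old node, which still
# holds the deprecated value -- a cache hit, not a miss
print(nodes[node_for(key, 3)][key])  # -> v1
```

Neither the LRU nor a TTL is guaranteed to have evicted "v1" by then,
which is why Jean-Charles wants a cheap way to bound the technical age
of everything in the cache.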
> >> > In the end, the first one can only be ensured by the instantaneous
> >> > flush.
> >> >
> >> > The second one admittedly can as well, but it can also be done with
> >> > nearly zero impact on the DB via the negative delay feature, and imho
> >> > this is worth a dedicated feature. If this has to go through a new
> >> > command, I don't mind at all; I proposed the negative delay since it
> >> > means less than 10 lines of code and no impact on existing features,
> >> > and, even if it may mislead people who read the feature description
> >> > inattentively, the exact behaviour can be precisely described in less
> >> > than 20 words...
> >> >
> >> > ---
> >> > Jean-Charles
> >> >
> >> >
> >> > On Fri, Feb 20, 2009 at 23:08, Dustin <dsalli...@gmail.com> wrote:
> >> >
> >> > >
> >> > >
> >> > > On Feb 20, 10:29 am, Jean-Charles Redoutey <jc.redou...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > If we go for 2, the *right* way to use the delayed flush would be
> >> > > something
> >> > > > like flush +10 on server a and flush +20 on server b.
> >> > >
> >> > >   I've also argued for the removal of flush with delay.  It was
> >> > > semantically confusing with delete with reserve (which was removed),
> >> > > and is really easy to do as a client feature.  I don't think it makes
> >> > > sense to exist as a server function at all.
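
"Easy to do as a client feature" might look something like this sketch;
`flush_all` is a placeholder for a real per-server flush call, not an
actual client API:

```python
# Minimal sketch of delayed flush as a client feature: instead of asking
# each server to flush at a future time, the client schedules the flush
# itself and fires it against every server when the delay elapses.
import threading

def flush_later(servers, flush_all, delay_seconds):
    def run():
        for server in servers:
            flush_all(server)
    timer = threading.Timer(delay_seconds, run)
    timer.start()
    return timer  # callers can .cancel() before it fires
```

This keeps the server protocol simple and puts the scheduling policy
where it can be changed without a server upgrade.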
> >> > >
> >> > > > The only way to have the global consistency you are describing is
> >> > > > to flush all the nodes with the exact same delay, which is simply
> >> > > > inapplicable to a production cache.
> >> > >
> >> > >   Well, flush them at the same time.  If you issue a flush on my
> >> > > client, it does all of them as concurrently as possible.
> >> > >
> >> > >   It's about reducing the window of error.  As you've pointed out,
> >> > > the larger the window is, the more chance there is.
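
Flushing "as concurrently as possible" from the client side can be
sketched with a thread pool; again, `flush_all` stands in for a real
per-server flush call:

```python
# Sketch of a client-side cluster-wide flush: one request per server,
# fired in parallel so the window during which some nodes are flushed
# and others are not stays as small as possible.
from concurrent.futures import ThreadPoolExecutor

def flush_cluster(servers, flush_all):
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        return list(pool.map(flush_all, servers))
```

The window of inconsistency shrinks to roughly one round-trip time
instead of growing with the number of servers.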
> >> > >
> >> > > > Since you can't ensure the *real* age of a piece of data,
> >> > > > basically the first time any part used to build it was put on any
> >> > > > of the nodes within the cluster, why not focus on something you
> >> > > > can ensure, the time this particular data was put in the cache? In
> >> > > > which case, whatever the sign of the flush delay, we have the same
> >> > > > semantics.
> >> > >
> >> > >   I suppose the difference of opinion regarding the semantics is that
> >> > > in the non-negative case (existing code), *all* records are
> >> > > invalidated within memcached.
> >> > >
> >> > >  Does anyone actually use a "future" flush that can't be done client-
> >> > > side?
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> >
> > --
> > "Be excellent to each other"
> >
>
