Re: What's new in memcached (part 2)

Trond Norbye Tue, 12 Apr 2011 02:52:22 -0700

On 12. apr. 2011, at 05.55, dormando wrote:

> ps. folks please look this over and evaluate. Do you understand
> everything? Does anything suck? Need more clarification? Whatever?
> 
> http://code.google.com/p/memcached/downloads/detail?name=memcached-1.6.0_beta1.tar.gz
> ^ easy-bake oven form beta release. Passes tests on a bunch of platforms,
> but possibly not OpenBSD.
>


I've fixed that bug. See: 
https://github.com/memcached/memcached/commit/cc3941084188195fc8b43fcdc05cec3dab5a4bd4

Cheers,

Trond


> Make evaluating! Give major feedback.
> 
> -Dormando
> 
> On Mon, 11 Apr 2011, Trond Norbye wrote:
> 
>>                       What's new in memcached
>>                       =======================
>> 
>> (part two - new feature proposals)
>> 
>> Table of Contents
>> =================
>> 1 Protocol
>>    1.1 Virtual buckets!
>>    1.2 TAP
>>    1.3 New commands
>>        1.3.1 VERBOSITY
>>        1.3.2 TOUCH, GAT and GATQ
>>        1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
>>        1.3.4 TAP_CONNECT
>>        1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
>>        1.3.6 TAP_OPAQUE
>>        1.3.7 TAP_VBUCKET_SET
>>        1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
>> 2 Modularity
>>    2.1 Engines
>>    2.2 Extensions
>>        2.2.1 Logger
>>        2.2.2 Daemon
>>        2.2.3 ASCII commands
>> 3 New stats
>>    3.1 Stats returned by the default stats command
>>        3.1.1 libevent
>>        3.1.2 rejected_conns
>>        3.1.3 stats related to TAP
>>    3.2 topkeys
>>    3.3 aggregate
>>    3.4 settings
>>        3.4.1 extension
>>        3.4.2 topkeys
>> 
>> 
>> 1 Protocol
>> ~~~~~~~~~~~
>> 
>> Intentionally, there is no significant difference in protocol over
>> 1.4.x.  There is one minor change, but it should be transparent to
>> most users.
>> 
>> 1.1 Virtual buckets!
>> =====================
>> 
>> We don't know who originally came up with the idea, but we've heard
>> rumors that it might be Anatoly Vorobey or Brad Fitzpatrick.  In lieu
>> of a full explanation on this, the concept is that instead of mapping
>> each key to a server we map it to a virtual bucket.  These virtual
>> buckets are then distributed across all of the servers.  To ease the
>> introduction of this we've assigned the two reserved bytes in the
>> binary protocol for specifying the vbucket id, which allowed us to
>> avoid protocol extensions.
>> 
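
A minimal sketch of the two-step lookup described above, assuming a CRC32 hash and a fixed count of 1024 vbuckets. Neither is specified in this mail, and the server names and helper functions are hypothetical.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed count; the mail does not fix a number


def vbucket_for_key(key: bytes, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Step 1: map the key to a vbucket id (the hash choice is an assumption)."""
    return zlib.crc32(key) % num_vbuckets


def build_vbucket_map(servers, num_vbuckets: int = NUM_VBUCKETS):
    """Distribute the vbuckets round-robin across the servers."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def server_for_key(key: bytes, vbucket_map) -> str:
    """Step 2: look up which server currently owns the key's vbucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


servers = ["cache-a:11211", "cache-b:11211"]  # hypothetical hosts
vbucket_map = build_vbucket_map(servers)

key = b"user:42"
vb = vbucket_for_key(key)
owner_before = server_for_key(key, vbucket_map)

# "Growing" the cluster: hand one vbucket to a new server.  The key still
# hashes to the same vbucket; only the vbucket's owner changes.
vbucket_map[vb] = "cache-c:11211"
owner_after = server_for_key(key, vbucket_map)
```

Because the key-to-vbucket step never changes, moving a vbucket only requires updating the map (and transferring that vbucket's data), not rehashing every key.
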
>> Note that this change should allow for complete compatibility if the
>> clients and the server are not aware of vbuckets.  The reserved bytes
>> should have been set to 0 according to the original binary protocol
>> specification, which means that such clients will always use
>> vbucket 0.
>> 
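
To illustrate where the vbucket id travels, here is a sketch that packs the 24-byte binary protocol request header with Python's struct module. The helper name and the sample vbucket id are hypothetical; please check protocol_binary.h for the authoritative layout.

```python
import struct

REQUEST_MAGIC = 0x80
GET_OPCODE = 0x00

# magic, opcode, key length, extras length, data type,
# vbucket id (formerly "reserved"), total body length, opaque, cas
HEADER_FORMAT = ">BBHBBHIIQ"


def pack_request_header(opcode, key_len, extras_len, body_len,
                        vbucket_id=0, opaque=0, cas=0):
    """Pack a request header; vbucket_id fills the two formerly reserved bytes."""
    return struct.pack(HEADER_FORMAT, REQUEST_MAGIC, opcode, key_len,
                       extras_len, 0, vbucket_id, body_len, opaque, cas)


key = b"user:42"

# A client that knows nothing about vbuckets leaves the bytes at zero,
# so its requests land in vbucket 0.
legacy_header = pack_request_header(GET_OPCODE, len(key), 0, len(key))

# A vbucket-aware client writes the id into the same two bytes.
aware_header = pack_request_header(GET_OPCODE, len(key), 0, len(key),
                                   vbucket_id=513)
```

The header length is identical in both cases, which is why no protocol extension was needed.
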
>> The idea is that we can move these vbuckets between servers such that
>> you can "grow" or "shrink" your cluster without losing data in your
>> cache. The classic memcached caching engine does _not_ implement
>> support for multiple vbuckets right now, but it is on the roadmap to
>> create a version of the engine in memcached to support this (it is a
>> question of memory efficiency, and there are currently not many
>> clients that support them).
>> 
>> Defining this now will allow us to start moving down the path to
>> vbuckets in the default_engine and allow other engine implementors to
>> consider vbuckets in their design.
>> 
>> You can read more about the mechanics of it here:
>> [http://dustin.github.com/2010/06/29/memcached-vbuckets.html]
>> 
>> However, you _cannot_ use a mix of clients that are vbucket aware and
>> clients that don't use vbuckets, but then again it doesn't make sense
>> to use a vbucket aware backend if your clients don't know how to
>> access them.  This is why we believe a protocol change isn't
>> warranted.
>> 
>> 1.2 TAP
>> ========
>> 
>> In order to facilitate vbucket transfers, among other use cases where
>> people want to see what's inside the server, we added to the binary
>> protocol a set of commands collectively called TAP.  The intention is
>> to allow "clients" to receive a stream of notifications whenever data
>> change in the server.  It is solely up to the backing store to
>> implement this, so it can make decisions about what resources are used
>> to implement TAP.  This functionality is needed commonly enough,
>> though, that the core is aware of it, leaving the specific
>> implementation to engines.
>> 
>> 1.3 New commands
>> =================
>> 
>> There are a few new commands available.  The following sections
>> provide a brief description of them.  Please check protocol_binary.h
>> for the implementation details.
>> 
>> 1.3.1 VERBOSITY
>> ----------------
>> 
>> The binary protocol did not have an equivalent of the verbosity
>> command in the textual protocol.  This command allows you to change
>> the verbosity level on a running server by using the binary protocol.
>> Why do we need this? There is a command line option you may use to
>> disable the ascii protocol, so we need this command in order to change
>> the logging level in those configurations.
>> 
>> 1.3.2 TOUCH, GAT and GATQ
>> --------------------------
>> 
>> One of the problems with the existing commands in memcached is that
>> you can't tell the memcached server that an object is still valid
>> and that you just want a longer expiration.  Normally you want to put
>> an expiry time on the objects, so that you can get an indication of
>> whether your cache is big enough by watching the eviction stats (if
>> your memcached server has a high eviction rate, your cache isn't big
>> enough for what you want to have in there).  The idea is that the
>> items you're normally using would be bumped to the front of your LRU
>> (and hence not be kicked out immediately).
>> 
>> The touch command lets you set the expiry time for an object without
>> retrieving the object.  In most cases, you will not want to do this
>> unless you provide a CAS value to ensure that you're touching the
>> correct version of the object.
>> 
>> GAT means "get and touch" and returns the object in addition to
>> setting a new expiration time.  This allows you to have a rolling
>> window of expiry that has a TTL in addition to the access time.  For
>> example, you can instruct memcached to allow an object to live no
>> longer than five minutes after the last time it was accessed (but as
>> always, it may expire sooner).
>> 
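
The rolling-window behaviour can be sketched with a toy in-memory cache. This is only a model of the semantics described above, not memcached code: the class, its method names and the explicit "now" argument are hypothetical, and CAS checking is left out.

```python
class ToyCache:
    """In-memory stand-in for a server; the clock is passed in explicitly."""

    def __init__(self):
        self._items = {}  # key -> (value, absolute expiry time)

    def set(self, key, value, ttl, now):
        self._items[key] = (value, now + ttl)

    def _live_entry(self, key, now):
        entry = self._items.get(key)
        if entry is not None and entry[1] <= now:
            del self._items[key]  # lazily drop the expired item
            return None
        return entry

    def get(self, key, now):
        entry = self._live_entry(key, now)
        return None if entry is None else entry[0]

    def touch(self, key, ttl, now):
        """Reset the expiry without returning the value."""
        entry = self._live_entry(key, now)
        if entry is None:
            return False
        self._items[key] = (entry[0], now + ttl)
        return True

    def gat(self, key, ttl, now):
        """Get and touch: return the value and reset the expiry."""
        entry = self._live_entry(key, now)
        if entry is None:
            return None
        self._items[key] = (entry[0], now + ttl)
        return entry[0]


cache = ToyCache()
cache.set("session", "alice", ttl=300, now=0)

first = cache.gat("session", ttl=300, now=200)      # expiry moves from 300 to 500
second = cache.get("session", now=450)              # still alive thanks to the GAT
third = cache.get("session", now=500)               # window has passed, item is gone
touched_late = cache.touch("session", ttl=300, now=600)  # nothing left to touch
```

Each GAT pushes the expiry forward from the time of access, so an object that keeps being read stays cached while an idle one ages out.
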
>> 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
>> --------------------------------------------
>> 
>> These commands are used to set, get or delete a vbucket on the server.
>> 
>> 1.3.4 TAP_CONNECT
>> ------------------
>> 
>> Connect and request that the server initialize a TAP stream.
>> 
>> The point of this command is to allow clients to connect and specify a
>> few things about the data they wish to receive.  Specifically, the
>> client will typically specify a date either in the past or in the
>> future along with specifying a vbucket.  The server will then stream
>> data mutated since that given date or, if a future date is specified,
>> only stream new mutations as they arrive.  The specific details about
>> which mutations to send may vary by implementation.
>> 
>> 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
>> ------------------------------------------
>> 
>> TAP_MUTATION is a notification that an item changed value in the
>> server.
>> 
>> The mutation typically comes with the new value.
>> 
>> TAP_DELETE is a notification that a key was deleted on the server.
>> 
>> Finally, to avoid having to send a complete list of all the keys in
>> the server when the user issues a flush, we can send a single message
>> (TAP_FLUSH) representing the flush.  Please note that the FLUSH
>> message means _ALL_ vbuckets, and not just a single vbucket.
>> 
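
As a rough illustration of how a TAP consumer might apply these three messages, here is a sketch that keeps a replica as one dict per vbucket. The event dictionaries are a hypothetical in-memory representation, not the wire format.

```python
def apply_tap_event(replica, event):
    """Apply one TAP notification to replica (vbucket id -> {key: value})."""
    kind = event["type"]
    if kind == "TAP_MUTATION":
        # The mutation typically carries the new value.
        replica.setdefault(event["vbucket"], {})[event["key"]] = event["value"]
    elif kind == "TAP_DELETE":
        replica.get(event["vbucket"], {}).pop(event["key"], None)
    elif kind == "TAP_FLUSH":
        # A flush covers ALL vbuckets, not just one.
        replica.clear()
    else:
        raise ValueError("unhandled TAP message: %r" % kind)


replica = {}
stream = [
    {"type": "TAP_MUTATION", "vbucket": 0, "key": "a", "value": "1"},
    {"type": "TAP_MUTATION", "vbucket": 1, "key": "b", "value": "2"},
    {"type": "TAP_DELETE", "vbucket": 0, "key": "a"},
]
for event in stream:
    apply_tap_event(replica, event)
before_flush = {vb: dict(items) for vb, items in replica.items()}

apply_tap_event(replica, {"type": "TAP_FLUSH"})
after_flush = dict(replica)

apply_tap_event(replica, {"type": "TAP_MUTATION", "vbucket": 1,
                          "key": "c", "value": "3"})
```

A single TAP_FLUSH empties every vbucket on the consumer, which is what saves the server from sending one delete per key.
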
>> 1.3.6 TAP_OPAQUE
>> -----------------
>> 
>> To allow storage engines to send their own messages over the tap
>> stream between each other, a tap opaque message is defined.  It is
>> completely up to the storage engine to specify the internal layout of
>> the packet.
>> 
>> 1.3.7 TAP_VBUCKET_SET
>> ----------------------
>> 
>> This is a message requesting a vbucket set.  It is similar to the
>> set_vbucket command, with the difference that this message comes over
>> a tap connection (with the extra info a tap message contains).
>> 
>> 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
>> --------------------------------------------------
>> 
>> The checkpoint start and end messages may be used by engines that want
>> to use checkpoints.  Checkpoints are an optional feature that may be
>> used by some engines to allow clients to start at a checkpoint
>> position.  By doing so, the client need not do a full "backfill" even
>> if it is revisiting a server after having been gone for a while.  The
>> TAP_CHECKPOINT_START tells a client that it's the start of a new
>> checkpoint, and the TAP_CHECKPOINT_END tells the client when it's
>> received everything for that given checkpoint.
>> 
>> 2 Modularity
>> ~~~~~~~~~~~~~
>> 
>> As we mentioned in the first email on changes, one big difference with
>> this new work is that we've tried to refactor memcached into being a
>> modular application instead of being monolithic.  In the future, we'd
>> like to make the command parser a separate module, so that we may
>> load the parsers separately.
>> 
>> 2.1 Engines
>> ============
>> 
>> We've done a lot of work trying to refactor the code in memcached to
>> avoid the tight coupling between the command protocol parser and the
>> actual item storage.
>> 
>> The idea with the engine interface is that the memcached process loads
>> a dynamically loadable object and calls a well known function to get a
>> set of function pointers.  All communication between the memcached
>> process and the engine is performed through these function
>> pointers.  The memcached process provides a set of services to the
>> engine as well through another set of function pointers.
>> 
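
The shape of that handshake can be sketched as follows. This is only an analogy in Python, where callables stand in for C function pointers; create_engine, EngineInterface, ServerServices and their fields are hypothetical names, not the real engine API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ServerServices:
    """Services the memcached process offers to the engine."""
    log: Callable[[str], None]


@dataclass(frozen=True)
class EngineInterface:
    """The set of "function pointers" the engine hands back."""
    get: Callable[[bytes], Optional[bytes]]
    store: Callable[[bytes, bytes], None]
    remove: Callable[[bytes], bool]


def create_engine(services: ServerServices) -> EngineInterface:
    """The well-known entry point a loadable engine would export."""
    items: Dict[bytes, bytes] = {}

    def get(key):
        return items.get(key)

    def store(key, value):
        items[key] = value
        services.log("stored %d bytes" % len(value))

    def remove(key):
        return items.pop(key, None) is not None

    return EngineInterface(get=get, store=store, remove=remove)


# The "server" side: it only ever talks to the engine through the table.
log_lines = []
engine = create_engine(ServerServices(log=log_lines.append))
engine.store(b"greeting", b"hello")
fetched = engine.get(b"greeting")
removed = engine.remove(b"greeting")
missing = engine.get(b"greeting")
```

Swapping the backend then means loading a different object that exports the same entry point; the calling side does not change.
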
>> The beauty of this is that the user may choose between a set of
>> different storage engines to suit their runtime environment.  People
>> have different requirements for their server.  Some people need ACID,
>> others may prefer ecstasy ;-) The storage interface lets them design
>> their app by using the memcached protocol, and they can just swap in
>> the backend that suits their needs (be it persistence, replication
>> (synchronous or lazy), etc.).
>> 
>> 2.2 Extensions
>> ===============
>> 
>> The item storage isn't the only place we've tried to create a level of
>> modularity.  People run memcached in different environments with
>> different requirements. You specify the extensions you want to use by
>> adding the -X command line argument.
>> 
>> 2.2.1 Logger
>> -------------
>> 
>> We've seen a lot of different requests when it comes to logging. Some
>> want it to a file, some to syslog (or Windows event log) and some want
>> it to standard out.  By default memcached will print to stderr, but
>> you may specify a different logger by loading the appropriate module
>> with the -X command line argument.
>> 
>> 2.2.2 Daemon
>> -------------
>> 
>> You might want to have some daemons providing extra services inside
>> your memcached server.  Examples would be things like a doors server
>> to provide additional access to your server (Trond's favorite), or
>> perhaps a "dispatcher" offering a thread pool for your engines to
>> use.
>> 
>> 2.2.3 ASCII commands
>> ---------------------
>> 
>> If you really need to extend the ASCII protocol, you may now load
>> additional ASCII commands as loadable modules.  We don't need a
>> separate module for binary commands, because those are already handled
>> inside memcached due to the fixed semantics on the protocol.  This
>> isn't necessarily encouraged, but sometimes it is required to get
>> something done quickly.
>> 
>> 3 New stats
>> ~~~~~~~~~~~~
>> 
>> There are a number of new stats introduced.  The key supplied in the
>> stats command is passed to the storage engine, which allows the
>> storage engine to add extra information to the existing stats
>> commands and to create its own stat commands.
>> 
>> 3.1 Stats returned by the default stats command
>> ================================================
>> 
>> 3.1.1 libevent
>> ---------------
>> 
>> Over time we've seen a lot of bugs around people using an old
>> version of libevent.  That's part of the reason why we bundle a well
>> known version of libevent in the release distribution.  Memcached
>> checks the libevent version during startup, and will refuse to start
>> if the one used is too old.  Since most operating systems use shared
>> libraries these days, you might be using another version than the one
>> you originally used when you first built memcached.  In order for us to
>> see which library people are using we decided to put it into the stats
>> as well.
>> 
>> 3.1.2 rejected_conns
>> ---------------------
>> 
>> The number of times a connection attempt was refused because we hit
>> the maximum number of connections.
>> 
>> 3.1.3 stats related to TAP
>> ---------------------------
>> 
>> There are a number of stats related to the packets used in the TAP
>> protocol.  These stats will only appear if they are non-zero:
>> 
>> tap_checkpoint_start_received tap_checkpoint_start_sent
>> tap_checkpoint_end_received tap_checkpoint_end_sent
>> tap_connect_received tap_delete_received tap_delete_sent
>> tap_flush_received tap_flush_sent tap_mutation_received
>> tap_mutation_sent tap_opaque_received tap_opaque_sent
>> tap_vbucket_set_received tap_vbucket_set_sent
>> 
>> 3.2 topkeys
>> ============
>> 
>> You may get information about the most popular keys in memcached by
>> exporting the environment variable MEMCACHED_TOP_KEYS to the number of
>> keys you would want memcached to keep track of.  There is no such thing
>> as a free lunch, so enabling this can have a small memory and speed
>> impact.  We've decided to _disable_ this by default, so you need to
>> export this variable to enable the feature. Ex:
>> 
>> me@localhost:> MEMCACHED_TOP_KEYS=10 ./memcached
>> 
>> Running "stats topkeys" would return something like the following
>> (line breaks and indentation added to make it more readable in this
>> document):
>> 
>> STAT my-key2
>>     get_hits=0,get_misses=1,cmd_set=0,incr_hits=0,
>>     incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>>     delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>>     cas_misses=0,ctime=2,atime=2
>> STAT my-key1
>>     get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,
>>     incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
>>     delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
>>     cas_misses=0,ctime=12,atime=12
>> 
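
For anyone wanting to consume this output from a script, here is a small sketch that parses one unwrapped "stats topkeys" line into a key and a dict of counters. The function name is hypothetical, and the sample line is the my-key1 entry above with the added line breaks removed.

```python
def parse_topkeys_line(line):
    """Split 'STAT <key> name=value,...' into (key, {name: int})."""
    tag, key, counters = line.strip().split(" ", 2)
    if tag != "STAT":
        raise ValueError("not a STAT line: %r" % line)
    stats = {}
    for pair in counters.split(","):
        name, value = pair.split("=", 1)
        stats[name] = int(value)
    return key, stats


sample = ("STAT my-key1 "
          "get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,"
          "incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,"
          "delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,"
          "cas_misses=0,ctime=12,atime=12")

key, stats = parse_topkeys_line(sample)
hit_ratio = stats["get_hits"] / max(1, stats["get_hits"] + stats["get_misses"])
```
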
>> 3.3 aggregate
>> ==============
>> 
>> Together, the storage engine interface and SASL authentication allow
>> for connection-based stats.  The aggregate subcommand is used to
>> aggregate the stats from all of the connections on the server.  The
>> stats returned from the aggregate subcommand are the same as those
>> returned by the normal stats command.
>> 
>> 3.4 settings
>> =============
>> 
>> There are times an engine may want to share details about its
>> configuration through stats.  This argument to stats will get you
>> there.
>> 
>> Just to show a couple of examples...
>> 
>> 3.4.1 extension
>> ----------------
>> 
>> Displays one of the extensions loaded (may appear multiple times).
>> 
>> ex:
>> 
>> STAT logger syslog
>> STAT ascii_extension scrub
>> STAT ascii_extension noop
>> STAT ascii_extension echo
>> 
>> 3.4.2 topkeys
>> --------------
>> 
>> The number of keys we are monitoring.
>> 
>> There may be many other settings exposed, depending on the engine's
>> configuration.
>> 
>>
