I was actually going to ask if this was a draft of a changelog going on github/memcache.org/etc, because it already seems too well-formatted to be a one-off email, and it would be useful to point others to without forwarding.
- Marc

On Mon, Apr 11, 2011 at 3:30 PM, Adam Lee <a...@fotolog.biz> wrote:
> is there somewhere i can copy edit this document?
>
> a bit nitpicky, i know, but i found a few mistakes just while browsing
> it... section 2.1 both "suites" should be "suits," section 3.4 "it's" should
> be "its," etc.
>
> awl
> On Apr 11, 2011 3:05 PM, "Trond Norbye" <trond.nor...@gmail.com> wrote:
> > What's new in memcached
> > =======================
> >
> > (part two - new feature proposals)
> >
> > Table of Contents
> > =================
> > 1 Protocol
> >   1.1 Virtual buckets!
> >   1.2 TAP
> >   1.3 New commands
> >     1.3.1 VERBOSITY
> >     1.3.2 TOUCH, GAT and GATQ
> >     1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> >     1.3.4 TAP_CONNECT
> >     1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
> >     1.3.6 TAP_OPAQUE
> >     1.3.7 TAP_VBUCKET_SET
> >     1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> > 2 Modularity
> >   2.1 Engines
> >   2.2 Extensions
> >     2.2.1 Logger
> >     2.2.2 Daemon
> >     2.2.3 ASCII commands
> > 3 New stats
> >   3.1 Stats returned by the default stats command
> >     3.1.1 libevent
> >     3.1.2 rejected_conns
> >     3.1.3 stats related to TAP
> >   3.2 topkeys
> >   3.3 aggregate
> >   3.4 settings
> >     3.4.1 extension
> >     3.4.2 topkeys
> >
> >
> > 1 Protocol
> > ~~~~~~~~~~~
> >
> > Intentionally, there is no significant difference in protocol over
> > 1.4.x. There is one minor change, but it should be transparent to
> > most users.
> >
> > 1.1 Virtual buckets!
> > =====================
> >
> > We don't know who originally came up with the idea, but we've heard
> > rumors that it might be Anatoly Vorobey or Brad Fitzpatrick. In lieu
> > of a full explanation on this, the concept is that instead of mapping
> > each key to a server we map it to a virtual bucket. These virtual
> > buckets are then distributed across all of the servers.
> > To ease the introduction of this we've assigned the two reserved
> > bytes in the binary protocol to the vbucket id, which allowed us to
> > avoid protocol extensions.
> >
> > Note that this change should allow for complete compatibility if the
> > clients and the server are not aware of vbuckets. The reserved bytes
> > should have been set to 0 according to the original binary protocol
> > specification, which means that such clients will always use
> > vbucket 0.
> >
> > The idea is that we can move these vbuckets between servers such that
> > you can "grow" or "shrink" your cluster without losing data in your
> > cache. The classic memcached caching engine does _not_ implement
> > support for multiple vbuckets right now, but it is on the roadmap to
> > create a version of the engine in memcached to support this (it is a
> > question of memory efficiency, and there are currently not many
> > clients that support them).
> >
> > Defining this now will allow us to start moving down the path to
> > vbuckets in the default_engine and allow other engine implementors to
> > consider vbuckets in their design.
> >
> > You can read more about the mechanics of it here:
> > [http://dustin.github.com/2010/06/29/memcached-vbuckets.html]
> >
> > However, you _cannot_ use a mix of clients that are vbucket aware and
> > clients that don't use vbuckets, but then again it doesn't make sense
> > to use a vbucket aware backend if your clients don't know how to
> > access them. This is why we believe a protocol change isn't
> > warranted.
> >
> > 1.2 TAP
> > ========
> >
> > In order to facilitate vbucket transfers, among other use cases where
> > people want to see what's inside the server, we added to the binary
> > protocol a set of commands collectively called TAP.
> > The intention is to allow "clients" to receive a stream of
> > notifications whenever data changes in the server. It is solely up to
> > the backing store to implement this, so it can make decisions about
> > what resources are used to implement TAP. This functionality is
> > needed commonly enough, though, that the core is aware of it, leaving
> > the specific implementation to engines.
> >
> > 1.3 New commands
> > =================
> >
> > There are a few new commands available. The following sections
> > provide brief descriptions of them. Please check protocol_binary.h
> > for the implementation details.
> >
> > 1.3.1 VERBOSITY
> > ----------------
> >
> > The binary protocol did not have an equivalent of the textual
> > protocol's verbosity command. This command allows you to change the
> > verbosity level on a running server by using the binary protocol.
> > Why do we need this? There is a command line option you may use to
> > disable the ascii protocol, so we need this command in order to
> > change the logging level in those configurations.
> >
> > 1.3.2 TOUCH, GAT and GATQ
> > --------------------------
> >
> > One of the problems with the existing commands in memcached is that
> > you can't tell the memcached server that an object is still valid
> > and you just want a longer expiration. Normally you want to put an
> > expiry time on the objects, so that you can get an indication of
> > whether your cache is big enough (by watching the eviction stats: if
> > your memcached server has a high eviction rate, your cache isn't big
> > enough for what you want to have in there). The idea is that the
> > items you use regularly are bumped to the front of your LRU (and
> > hence are not kicked out immediately).
> >
> > The touch command lets you set the expiry time for an object without
> > retrieving the object.
> > In most cases, you will not want to do this unless you provide a CAS
> > value to ensure that you're touching the correct version of the
> > object.
> >
> > GAT means "get and touch" and returns the object in addition to
> > setting a new expiration time. This allows you to have a rolling
> > window of expiry that has a TTL in addition to the access time. For
> > example, you can instruct memcached to allow an object to live no
> > longer than five minutes after the last time it was accessed (but as
> > always, it may expire sooner).
> >
> > 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> > --------------------------------------------
> >
> > These commands are used to set, get or delete a vbucket on the server.
> >
> > 1.3.4 TAP_CONNECT
> > ------------------
> >
> > Connect and request that the server initialize a TAP stream.
> >
> > The point of this command is to allow clients to connect and specify a
> > few things about the data they wish to receive. Specifically, the
> > client will typically specify a date either in the past or in the
> > future along with specifying a vbucket. The server will then stream
> > data mutated since that given date or, if a future date is specified,
> > only stream new mutations as they arrive. The specific details about
> > which mutations to send may vary by implementation.
> >
> > 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
> > ------------------------------------------
> >
> > TAP_MUTATION is a notification that an item changed value in the
> > server.
> >
> > The mutation typically comes with the new value.
> >
> > TAP_DELETE is a notification that a key was deleted on the server.
> >
> > Finally, to avoid having to send a complete list of all the keys in
> > the server when the user issues a flush, we can send a single message
> > (TAP_FLUSH) representing the flush. Please note that the FLUSH
> > message means _ALL_ vbuckets, and not just a single vbucket.
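
To make the semantics of these three messages concrete, here is a minimal sketch of how a TAP consumer might apply them to a local copy of the data. It is only an illustration: the `Replica` class, its `apply` method, and the tuple-shaped messages are hypothetical and do not correspond to the wire format defined in protocol_binary.h.

```python
class Replica:
    """Hypothetical in-memory copy of a server's data, grouped by vbucket."""

    def __init__(self):
        # vbucket id -> {key: value}
        self.vbuckets = {}

    def apply(self, message):
        """Apply one TAP-style notification, given as a tuple (illustrative shape)."""
        kind = message[0]
        if kind == "TAP_MUTATION":
            # A mutation typically carries the new value.
            _, vbucket, key, value = message
            self.vbuckets.setdefault(vbucket, {})[key] = value
        elif kind == "TAP_DELETE":
            _, vbucket, key = message
            self.vbuckets.get(vbucket, {}).pop(key, None)
        elif kind == "TAP_FLUSH":
            # A flush covers ALL vbuckets, not just a single one.
            self.vbuckets.clear()
        else:
            raise ValueError(f"unhandled message type: {kind}")


replica = Replica()
for msg in [
    ("TAP_MUTATION", 0, "a", b"1"),
    ("TAP_MUTATION", 7, "b", b"2"),
    ("TAP_DELETE", 0, "a"),
]:
    replica.apply(msg)
print(replica.vbuckets)
```

A single `("TAP_FLUSH",)` message afterwards would empty every vbucket in the replica, which is why the server does not need to send one delete per key.
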
> >
> > 1.3.6 TAP_OPAQUE
> > -----------------
> >
> > To allow storage engines to send their own messages over the tap
> > stream between each other, a tap opaque message is defined. It is
> > completely up to the storage engine to specify the internal layout of
> > the packet.
> >
> > 1.3.7 TAP_VBUCKET_SET
> > ----------------------
> >
> > This is a message requesting a vbucket set. It is similar to the
> > set_vbucket command, with the difference that this message comes over
> > a tap connection (with the extra info a tap message contains).
> >
> > 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> > --------------------------------------------------
> >
> > The checkpoint start and end messages may be used by engines that
> > want to use checkpoints. Checkpoints are an optional feature that may
> > be used by some engines to allow clients to start at a checkpoint
> > position. By doing so, the client need not do a full "backfill" even
> > if it is revisiting a server after having been gone for a while. The
> > TAP_CHECKPOINT_START tells a client that it's the start of a new
> > checkpoint, and the TAP_CHECKPOINT_END tells the client when it's
> > received everything for that given checkpoint.
> >
> > 2 Modularity
> > ~~~~~~~~~~~~~
> >
> > As we mentioned in the first email on changes, one big difference with
> > this new work is that we've tried to refactor memcached into being a
> > modular application instead of being monolithic. In the future, we'd
> > like to make the command parser a separate module, so that we may
> > load the parsers separately.
> >
> > 2.1 Engines
> > ============
> >
> > We've done a lot of work trying to refactor the code in memcached to
> > avoid the tight coupling between the command protocol parser and the
> > actual item storage.
> >
> > The idea with the engine interface is that the memcached process loads
> > a dynamically loadable object and calls a well known function to get a
> > set of function pointers.
> > All communication between the memcached process and the engine is
> > performed through these function pointers. The memcached process
> > provides a set of services to the engine as well through another set
> > of function pointers.
> >
> > The beauty of this is that the user may choose between a set of
> > different storage engines that suits their runtime
> > environment. People have different requirements for their server. Some
> > people need ACID, others may prefer ecstasy ;-) The storage interface
> > may let them design their app by using the memcached protocol, and
> > they can just swap in the backend that suits their needs (be it
> > persistence, replication (sync or lazy), etc.).
> >
> > 2.2 Extensions
> > ===============
> >
> > The item storage isn't the only place we've tried to create a level of
> > modularity. People run memcached in different environments with
> > different requirements. You specify the extensions you want to use by
> > adding the -X command line argument.
> >
> > 2.2.1 Logger
> > -------------
> >
> > We've seen a lot of different requests when it comes to logging. Some
> > want it to a file, some to syslog (or Windows event log) and some want
> > it to standard out. By default memcached will print to stderr, but
> > you may specify a different logger by loading the appropriate module
> > with the -X command line argument.
> >
> > 2.2.2 Daemon
> > -------------
> >
> > You might want to have some daemons providing extra services inside
> > your memcached server. Examples would be things like a doors server
> > to provide additional access to your server (Trond's favorite), or
> > perhaps a "dispatcher" offering a threadpool for your engines to
> > use.
> >
> > 2.2.3 ASCII commands
> > ---------------------
> >
> > If you really need to extend the ASCII protocol, you may now load
> > additional ASCII commands as loadable modules.
> > We don't need a separate module for binary commands, because those
> > are already handled inside memcached due to the fixed semantics of
> > the protocol. Extending the ASCII protocol isn't necessarily
> > encouraged, but sometimes it is required to get something done
> > quickly.
> >
> > 3 New stats
> > ~~~~~~~~~~~~
> >
> > There are a number of new stats introduced. The key supplied in the
> > stats command is passed to the storage engine to allow the storage
> > engine to add extra information to the existing stats commands, and to
> > create their own stat commands.
> >
> > 3.1 Stats returned by the default stats command
> > ================================================
> >
> > 3.1.1 libevent
> > ---------------
> >
> > Over time we've seen a lot of bugs around people using an old
> > version of libevent. That's part of the reason why we bundle a well
> > known version of libevent in the release distribution. Memcached
> > checks the libevent version during startup, and will refuse to start
> > if the one used is too old. Since most operating systems use shared
> > libraries these days, you might be using another version than the one
> > you originally used when you first built memcached. In order for us to
> > see which library people are using we decided to put it into the stats
> > as well.
> >
> > 3.1.2 rejected_conns
> > ---------------------
> >
> > The number of times a connection attempt was refused (because the
> > server hit the maximum number of connections).
> >
> > 3.1.3 stats related to TAP
> > ---------------------------
> >
> > There are a number of stats related to the packets used in the TAP
> > protocol.
> > These stats will only appear if they are non-zero:
> >
> > tap_checkpoint_start_received tap_checkpoint_start_sent
> > tap_checkpoint_end_received tap_checkpoint_end_sent
> > tap_connect_received tap_delete_received tap_delete_sent
> > tap_flush_received tap_flush_sent tap_mutation_received
> > tap_mutation_sent tap_opaque_received tap_opaque_sent
> > tap_vbucket_set_received tap_vbucket_set_sent
> >
> > 3.2 topkeys
> > ============
> >
> > You may get information about the most popular keys in memcached by
> > setting the environment variable MEMCACHED_TOP_KEYS to the number of
> > keys you would want memcached to keep track of. There is no such thing
> > as a free lunch, so enabling this can have a small memory and speed
> > impact. We've decided to _disable_ this by default, so you need to
> > export this variable to enable the feature. Ex:
> >
> > me@localhost:> MEMCACHED_TOP_KEYS=10 ./memcached
> >
> > Running "stats topkeys" would return something like
> >
> > STAT my-key2 get_hits=0,get_misses=1,cmd_set=0,incr_hits=0,
> >     incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
> >     delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
> >     cas_misses=0,ctime=2,atime=2
> > STAT my-key1 get_hits=1,get_misses=0,cmd_set=1,incr_hits=0,
> >     incr_misses=0,decr_hits=0,decr_misses=0,delete_hits=0,
> >     delete_misses=0,evictions=0,cas_hits=0,cas_badval=0,
> >     cas_misses=0,ctime=12,atime=12
> >
> > (Line breaks and indentations added to make it more readable in this
> > document.)
> >
> > 3.3 aggregate
> > ==============
> >
> > The combination of the storage engine interface and the SASL auth
> > allows for connection-based stats. The aggregate subcommand is used
> > to aggregate the stats from all of the connections on the server. The
> > stats returned from the aggregate subcommand are the same as those
> > from the normal stats command.
> >
> > 3.4 settings
> > =============
> >
> > There are times an engine may want to share details about its
> > configuration through stats. This argument to stats will get you
> > there.
> >
> > Just to show a couple of examples...
> >
> > 3.4.1 extension
> > ----------------
> >
> > Displays one of the extensions loaded (may appear multiple times).
> >
> > ex:
> >
> > STAT logger syslog
> > STAT ascii_extension scrub
> > STAT ascii_extension noop
> > STAT ascii_extension echo
> >
> > 3.4.2 topkeys
> > --------------
> >
> > The number of keys we are monitoring.
> >
> > There may be many other settings exposed, depending on the engine's
> > configuration.
> >
>
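
As a rough illustration of consuming the "stats settings" output, the sketch below parses "STAT name value" lines and keeps every value for a name that appears more than once, such as ascii_extension. The function name `parse_stat_lines` is hypothetical, and the input is simply the example text from section 3.4.1 rather than output captured from a live server.

```python
from collections import defaultdict


def parse_stat_lines(text):
    """Group 'STAT <name> <value>' lines by name, keeping repeated names."""
    stats = defaultdict(list)
    for line in text.splitlines():
        # Split into at most three fields so values may contain spaces.
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]].append(parts[2])
    return dict(stats)


# Sample taken from the example in section 3.4.1 (not live server output).
sample = """\
STAT logger syslog
STAT ascii_extension scrub
STAT ascii_extension noop
STAT ascii_extension echo
"""
settings = parse_stat_lines(sample)
print(settings["logger"])
print(settings["ascii_extension"])
```

Storing a list per name, rather than a single value, is what keeps the repeated extension entries from overwriting one another.
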