On Apr 18, 10:35 am, Brian Aker <br...@tangent.org> wrote:

> > Intentionally, there is no significant difference in protocol over
> > 1.4.x.  There is one minor change, but it should be transparent to
> > most users.
>
> What is the change?

16 bits of reserved field were turned into a vbucket identifier.  That
means that retroactively, all operations have been specifying vbucket
0.
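
To make that concrete, here is a minimal sketch of where the field sits in the 24-byte request header. The field names follow protocol_binary.h; the `sketch_` struct and helper names are made up for illustration and are not part of the tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout sketch of the binary protocol request header (24 bytes).
 * Field names follow protocol_binary.h; the type name is hypothetical. */
typedef struct {
    uint8_t  magic;
    uint8_t  opcode;
    uint16_t keylen;
    uint8_t  extlen;
    uint8_t  datatype;
    uint16_t vbucket;   /* was "reserved" in 1.4.x */
    uint32_t bodylen;
    uint32_t opaque;
    uint64_t cas;
} sketch_request_header;

/* Read the vbucket id from a raw header. Multi-byte header fields are
 * in network byte order, and this one occupies bytes 6 and 7. */
static uint16_t sketch_header_vbucket(const uint8_t *hdr)
{
    return (uint16_t)((hdr[6] << 8) | hdr[7]);
}

/* Write a vbucket id into a raw header (hypothetical helper). */
static void sketch_header_set_vbucket(uint8_t *hdr, uint16_t vbucket)
{
    hdr[6] = (uint8_t)(vbucket >> 8);
    hdr[7] = (uint8_t)(vbucket & 0xff);
}
```

A 1.4.x client that zeroes the reserved bytes produces a header for which `sketch_header_vbucket()` returns 0, which is the "retroactively vbucket 0" part.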

> When doing a diff I could see one change in an ENUM, and also the addition of
> "touch" which has no description.

It's unclear what you're diffing.  Try "git log -p
include/memcached/protocol_binary.h".  It will show you things like
this:

    commit eeaeeede5ddaaed8179389876866756ce1c05158
    Author: Trond Norbye <trond.nor...@gmail.com>
    Date:   Tue May 18 21:55:02 2010 +0200

        Use the reserved bytes in the protocol header for vbucket

    diff --git a/include/memcached/protocol_binary.h b/include/memcached/protocol_bi
    index 1c9a2a4..1e10ac0 100644
    --- a/include/memcached/protocol_binary.h
    +++ b/include/memcached/protocol_binary.h
    @@ -73,6 +73,7 @@ extern "C"
             PROTOCOL_BINARY_RESPONSE_EINVAL = 0x04,
             PROTOCOL_BINARY_RESPONSE_NOT_STORED = 0x05,
             PROTOCOL_BINARY_RESPONSE_DELTA_BADVAL = 0x06,
    +        PROTOCOL_BINARY_RESPONSE_NOT_MY_VBUCKET = 0x07,
             PROTOCOL_BINARY_RESPONSE_AUTH_ERROR = 0x20,
             PROTOCOL_BINARY_RESPONSE_AUTH_CONTINUE = 0x21,
             PROTOCOL_BINARY_RESPONSE_UNKNOWN_COMMAND = 0x81,
    @@ -168,7 +169,7 @@ extern "C"
                 uint16_t keylen;
                 uint8_t extlen;
                 uint8_t datatype;
    -            uint16_t reserved;
    +            uint16_t vbucket;
                 uint32_t bodylen;
                 uint32_t opaque;
                 uint64_t cas;

> Looking through the notes I don't see any discussion of what the format for 
> configuration is. When looking through what Dustin has it seems to be a 
> reference to some JSON which isn't documented. Also, is everyone expected now 
> to install CouchDB in order to use Memcached? That seems like a hefty 
> requirement.

I have no idea what you're talking about here.  There's nothing
related to CouchDB or even JSON in the memcached tree anywhere.

vbuckets are numbers with states corresponding to numbers in the
binary protocol.  You turn them on or you turn them off using vbucket
commands.

It's only there for plumbing.  It's best thought of as a hashing mode
with a tiny bit of help from the server.  The client hashes the key to
a vbucket, and then has a mapping of the vbucket to the server.  The
client tells the server that it wants something stored under vbucket x
and the server can refuse it.

The memcached server cares neither why the client sent vbucket x nor
why it was told it should or shouldn't service requests to vbucket x.
The only difference between this and any other hashing strategy is
that the server can participate just ever so slightly in the hashing
to verify you're not sending things to the wrong place when two
clients have conflicting server lists.
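
Roughly, the flow looks like the sketch below.  Everything named `sketch_` or `SKETCH_` is made up for illustration: the hash function, the number of vbuckets and the map contents are all assumptions, and a real client picks its own.  The only values taken from the protocol are the success status (0x00) and NOT_MY_VBUCKET (0x07).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_VBUCKETS 8          /* hypothetical vbucket count */
#define SKETCH_STATUS_SUCCESS        0x00
#define SKETCH_STATUS_NOT_MY_VBUCKET 0x07

/* Hypothetical key hash (FNV-1a); any stable hash would do. */
static uint32_t sketch_hash(const char *key, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (uint8_t)key[i];
        h *= 16777619u;
    }
    return h;
}

/* Client, step 1: hash the key to a vbucket id. */
static uint16_t sketch_key_to_vbucket(const char *key, size_t len)
{
    return (uint16_t)(sketch_hash(key, len) % SKETCH_NUM_VBUCKETS);
}

/* Client, step 2: look the vbucket up in a vbucket-to-server map.
 * Here vbuckets 0-3 live on server 0 and 4-7 on server 1. */
static const int sketch_vbucket_map[SKETCH_NUM_VBUCKETS] = {
    0, 0, 0, 0, 1, 1, 1, 1
};

static int sketch_vbucket_to_server(uint16_t vbucket)
{
    return sketch_vbucket_map[vbucket % SKETCH_NUM_VBUCKETS];
}

/* Server: one flag per vbucket saying whether it services it. */
typedef struct {
    uint8_t serving[SKETCH_NUM_VBUCKETS];
} sketch_server;

/* Server: refuse requests for vbuckets it was not told to service. */
static uint16_t sketch_server_check(const sketch_server *s, uint16_t vbucket)
{
    if (vbucket >= SKETCH_NUM_VBUCKETS || !s->serving[vbucket]) {
        return SKETCH_STATUS_NOT_MY_VBUCKET;
    }
    return SKETCH_STATUS_SUCCESS;
}
```

If two clients disagree about the map, one of them ends up sending a vbucket to a server that isn't servicing it, and the check on the server side is what catches that.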

This commit added the actual commands to the tree:

  
https://github.com/memcached/memcached/commit/8f449322cd31f9af28cd7849b1eb0a09030cf2bc

It has stuff like this in it:

    +     * Definition of the packet used by set vbucket
    +     */
    +    typedef union {
    +        struct {
    +            protocol_binary_request_header header;
    +            struct {
    +                vbucket_state_t state;
    +            } body;
    +        } message;
    +        uint8_t bytes[sizeof(protocol_binary_request_header) + sizeof(vbucket_s
    +    } protocol_binary_request_set_vbucket;
    +    /**
    +     * Definition of the packet returned from set vbucket
    +     */
    +    typedef protocol_binary_response_no_extras protocol_binary_response_set_vbu
    +    /**
    +     * Definition of the packet used by del vbucket
    +     */
    +    typedef protocol_binary_request_no_extras protocol_binary_request_del_vbuck
    +    /**
    +     * Definition of the packet returned from del vbucket
    +     */
    +    typedef protocol_binary_response_no_extras protocol_binary_response_del_vbu
    +
    +    /**
    +     * Definition of the packet used by get vbucket
    +     */
    +    typedef protocol_binary_request_no_extras protocol_binary_request_get_vbuck
    +
    +    /**
    +     * Definition of the packet returned from get vbucket
    +     */
    +    typedef union {
    +        struct {
    +            protocol_binary_response_header header;
    +            struct {
    +                vbucket_state_t state;
    +            } body;
    +        } message;
    +        uint8_t bytes[sizeof(protocol_binary_response_header) +  sizeof(vbucket_
    +    } protocol_binary_response_get_vbucket;
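
For anyone unfamiliar with the union idiom used in these definitions, here is a small stand-alone sketch of it.  None of these names exist in the tree: the header is an opaque 24-byte stand-in, the state values are placeholders (the real vbucket_state_t values aren't shown in this mail), and the size of the `bytes` member is my assumption of header plus body.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opaque stand-in for the 24-byte request header. */
typedef struct {
    uint8_t raw[24];
} sketch_opaque_header;

/* Placeholder states; the real values are not shown in this thread. */
typedef enum {
    SKETCH_VB_STATE_ONE = 1,
    SKETCH_VB_STATE_TWO = 2
} sketch_vbucket_state_t;

/* Same shape as the set vbucket request: a structured view (message)
 * overlaid on a flat byte view (bytes) of the same packet. */
typedef union {
    struct {
        sketch_opaque_header header;
        struct {
            sketch_vbucket_state_t state;
        } body;
    } message;
    uint8_t bytes[sizeof(sketch_opaque_header) +
                  sizeof(sketch_vbucket_state_t)];
} sketch_request_set_vbucket;

/* Fill the packet through the structured view and return how many
 * bytes the flat view holds.  Magic, opcode and bodylen stay zero
 * because their values for this command are not given here, and the
 * state is stored in host byte order for the same reason. */
static size_t sketch_build_set_vbucket(sketch_request_set_vbucket *req,
                                       uint16_t vbucket,
                                       sketch_vbucket_state_t state)
{
    memset(req->bytes, 0, sizeof(req->bytes));
    req->message.header.raw[6] = (uint8_t)(vbucket >> 8);
    req->message.header.raw[7] = (uint8_t)(vbucket & 0xff);
    req->message.body.state = state;
    return sizeof(req->bytes);
}
```

The point of the union is that code can fill fields by name and then hand `bytes` straight to the socket layer without a separate serialization step.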


> Can this stream be documented? How about picking a name other than TAP so 
> that there is no confusion between this and "System Tap" which the Linux 
> kernel folks have.

It could certainly use more documentation, though there's some
floating around.  Enough people have now implemented it in enough
languages for enough use cases that we've hit most of the major
snags, so there's little justification for not widely distributing
the documentation that was collected along the way.

It's basically no different from the binary protocol in general, it
just runs the other way.  I've got a doc I wrote a while back I'll put
somewhere if we can't find anything better.  I'll send that out today
or so.

I'm not convinced the name is that bad.
It seems kind of obvious for a way to tap into the operations going on
in something and use that info to make decisions or build stuff.  Just
because the Linux guys also had the same idea doesn't make it a bad
one.  :)

> > There are a few new commands available.  The following sections
> > provide a brief description of them.  Please check protocol_binary.h
> > for the implementation details.
>
> There is just an ENUM in the file, that isn't enough for anyone who is 
> perusing the document. The original document Brad did is a good example of 
> what should be done.

There are 18 structs declaring top-level packet formats along with four
enums.  Most of those have at least some documentation.  Where it's
insufficient we can make it better, of course.

Extension documents and/or protocol_binary core doc can be updated
when things are completely finalized.

> > The touch command lets you set the expiry time for an object without
> > retrieving the object.  In most cases, you will not want to do this
> > unless you provide a CAS value to ensure that you're touching the
> > correct version of the object.
>
> What happens if the object is not available? Is it possible that the server 
> will send an error back? What errors?

It sends the normal error codes just like any other command.
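
As a sketch of what handling that might look like on the client side: the status lives in bytes 6 and 7 of the response header, in network byte order.  The numeric values below are the standard binary protocol ones (0x07 is the new NOT_MY_VBUCKET); which of them touch actually returns in each case is my assumption, and the `sketch_` names are made up.

```c
#include <stdint.h>

/* Binary protocol response status values used in this sketch. */
enum {
    SKETCH_SUCCESS        = 0x00,
    SKETCH_KEY_ENOENT     = 0x01,  /* key not found */
    SKETCH_KEY_EEXISTS    = 0x02,  /* CAS did not match */
    SKETCH_NOT_MY_VBUCKET = 0x07
};

/* Hypothetical client-side classification of a touch response. */
typedef enum {
    SKETCH_TOUCH_OK,
    SKETCH_TOUCH_MISSING,
    SKETCH_TOUCH_CAS_MISMATCH,
    SKETCH_TOUCH_WRONG_SERVER,
    SKETCH_TOUCH_OTHER
} sketch_touch_result;

/* Decode the status field of a raw 24-byte response header and map
 * it onto the cases a caller of touch is likely to care about. */
static sketch_touch_result sketch_classify_touch(const uint8_t *hdr)
{
    uint16_t status = (uint16_t)((hdr[6] << 8) | hdr[7]);

    switch (status) {
    case SKETCH_SUCCESS:        return SKETCH_TOUCH_OK;
    case SKETCH_KEY_ENOENT:     return SKETCH_TOUCH_MISSING;
    case SKETCH_KEY_EEXISTS:    return SKETCH_TOUCH_CAS_MISMATCH;
    case SKETCH_NOT_MY_VBUCKET: return SKETCH_TOUCH_WRONG_SERVER;
    default:                    return SKETCH_TOUCH_OTHER;
    }
}
```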

> > 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> > --------------------------------------------
>
> > These commands are used to set, get or delete a vbucket on the server.
>
> How should these commands be structured? What possible errors exist? Is there 
> a reference implementation?

They're documented in protocol_binary.h and default_engine implements
them.

> > 2.1 Engines
> > ============
> > environment.  People have different requirements for their server. Some
> > people need ACID, others may prefer ecstasy ;-) The storage interface
>
> How can you perform XA with this? If not, does this mean that each operation 
> is transactional? How is a conflict reported?  If this is an ACID interface, 
> what is the isolation for events?

That's up to your engine.

> Are all of the stats being updated so that users can see it? How is this 
> being performed in a backwards compatible way?

The stats API has been extensible for a while now.  Engines regularly
add new stats without affecting most clients (I can't speak for all).
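
The clients that cope are the ones that treat stats as open-ended name/value pairs.  A minimal sketch of that approach, with made-up `sketch_` names; curr_items and get_hits are ordinary memcached stats, and anything else is just counted and skipped:

```c
#include <stdlib.h>
#include <string.h>

/* The handful of stats this hypothetical client cares about. */
typedef struct {
    unsigned long curr_items;
    unsigned long get_hits;
    unsigned int  unknown_seen;  /* stats we did not recognize */
} sketch_stats;

/* Record one name/value pair from a stats response.  Unrecognized
 * names are counted and otherwise ignored, so a server or engine
 * that adds new stats does not break this client. */
static void sketch_stats_record(sketch_stats *st,
                                const char *name, const char *value)
{
    if (strcmp(name, "curr_items") == 0) {
        st->curr_items = strtoul(value, NULL, 10);
    } else if (strcmp(name, "get_hits") == 0) {
        st->get_hits = strtoul(value, NULL, 10);
    } else {
        st->unknown_seen++;
    }
}
```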

> It's great to see work being done, but we need a lot more information to 
> support this.
>
> Alan has commented, and from a quick run through it looks like top keys cuts 
> performance in half. Is this going to be solved before a GA release?

Yes, that's a good point.  We did open a bug for it.  We haven't quite
decided how to resolve it.  This is exactly the kind of discussion
we'd like to have at this point.

On one hand, if someone can't deal with only 200k ops/s on a server,
they may not want to enable that feature in testing at this phase.  On
the other, we may not want to be adding features that we can't support
at full speed.
