On Oct 2, 2007, at 9:36, Brian Aker wrote:

So I am looking at increasing the performance of libmemcached. Looking at how some of the other clients are implemented, I am finding a catch-22 that I am hoping someone can explain.

Most clients seem to be setting their IO to non-blocking, which is excellent, but I don't understand what this is really buying since:
1) Clients are not threaded

I don't quite understand why you're implying non-blocking IO and threading must go together. Many people implement threads just because non-blocking IO appears to require more thought (in reality, it seems to be the other way around, but that's a different issue).

My client is used in threaded environments, but only has one thread dedicated to IO multiplexing. It performs non-blocking IO over as many connections as it needs... sending and receiving whenever possible and completing requests when enough data has arrived.
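Roughly, that one IO thread looks like this (a minimal sketch with made-up names, not any particular client's internals; connection setup and protocol parsing are left out):

    /* One thread doing non-blocking IO over several memcached
     * connections with poll(2). */
    #include <fcntl.h>
    #include <poll.h>
    #include <unistd.h>

    #define NSERVERS 4

    struct conn {
        int fd;
        char outbuf[8192]; size_t out_len, out_off;  /* pending writes */
        char inbuf[8192];  size_t in_len;            /* partial reads  */
    };

    static void set_nonblocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    }

    /* One pass of the loop: write whenever the kernel will take bytes,
     * read whenever bytes are available, and otherwise sleep in poll()
     * -- i.e. only wait when there is genuinely nothing to do. */
    static void io_once(struct conn *conns, int n)   /* n <= NSERVERS */
    {
        struct pollfd pfds[NSERVERS];
        for (int i = 0; i < n; i++) {
            pfds[i].fd = conns[i].fd;
            pfds[i].events = POLLIN;
            if (conns[i].out_off < conns[i].out_len)
                pfds[i].events |= POLLOUT;   /* only if data is queued */
        }
        if (poll(pfds, n, -1) <= 0)
            return;
        for (int i = 0; i < n; i++) {
            if (pfds[i].revents & POLLOUT) {
                ssize_t w = write(conns[i].fd,
                                  conns[i].outbuf + conns[i].out_off,
                                  conns[i].out_len - conns[i].out_off);
                if (w > 0)
                    conns[i].out_off += (size_t)w;
            }
            if (pfds[i].revents & POLLIN) {
                ssize_t r = read(conns[i].fd, conns[i].inbuf,
                                 sizeof(conns[i].inbuf));
                if (r > 0)
                    conns[i].in_len = (size_t)r;
                /* ...hand inbuf to a parser; complete whichever
                 * requests now have enough data... */
            }
        }
    }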

2) The protocol always sends an ACK of some sort.

The interface my client provides doesn't require the caller to wait for ACKs. You tend to want to do that for get requests, but you may not care in the case of deletes or sets.

That is to say, you generally don't want to be left not knowing when something has finished (in the case of quiet gets in the binary protocol, you'll want a noop or a regular get at the end), but you can't really send a quiet get and then sit waiting just in case something starts arriving. Instead, just stream requests out and stream responses in. Line them up, and you're good to go.
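Here's a sketch of the lining-up, with made-up types: each in-flight op remembers the opaque value the binary protocol echoes back, and a response arriving for a later op tells you that every quiet op still ahead of it in the queue was a miss:

    /* Sketch only: an in-flight queue (front at index 0) of ops whose
     * responses must line up with the order they were sent. */
    #include <stdbool.h>
    #include <stdio.h>

    enum op_type { OP_GETQ, OP_GET, OP_NOOP };

    struct op {
        enum op_type type;
        unsigned opaque;   /* echoed back by the binary protocol */
        bool done;
    };

    /* Called with the opaque of each response read off the wire. */
    static void on_response(struct op *inflight, int n, unsigned opaque)
    {
        for (int i = 0; i < n; i++) {
            if (inflight[i].done)
                continue;
            if (inflight[i].opaque == opaque) {
                inflight[i].done = true;   /* a hit, or the noop/get */
                return;
            }
            /* A response for a later op means the server silently
             * skipped this quiet op: it was a miss. */
            if (inflight[i].type == OP_GETQ) {
                inflight[i].done = true;
                printf("getq opaque=%u: miss\n", inflight[i].opaque);
            }
        }
    }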

        Non-blocking IO means you're only waiting when there's nothing to do.

Take "set" for example. I can do a "set" which is non-blocking, but then I have to sit and spin either in the kernel or in user space waiting for the "STORED" to be returned. This seems to defeat the point of non-blocking IO.

You don't have to at all. A set is issued, its state is changed to waiting_for_response (or something like it), and it's added to an input queue. Then you start sending the next operation from your output queue. If a server starts sending stuff back to you, it's for whatever's at the front of your input queue (in the binary protocol, you can double-check this).
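In sketch form (made-up names; buffering, short writes, and errors are ignored):

    #include <string.h>
    #include <unistd.h>

    enum op_state { OP_QUEUED, OP_WAITING_FOR_RESPONSE, OP_COMPLETE };

    struct op {
        enum op_state state;
        const char *cmd;        /* e.g. "set foo 0 0 3\r\nbar\r\n" */
        struct op *next;
    };

    struct queue { struct op *head, *tail; };

    static void q_push(struct queue *q, struct op *o)
    {
        o->next = NULL;
        if (q->tail) q->tail->next = o; else q->head = o;
        q->tail = o;
    }

    static struct op *q_pop(struct queue *q)
    {
        struct op *o = q->head;
        if (o && !(q->head = o->next))
            q->tail = NULL;
        return o;
    }

    /* Run by the event loop whenever the socket is writable: send the
     * next op and park it on the input queue -- no waiting. */
    static void send_next(struct queue *output, struct queue *input, int fd)
    {
        struct op *o = q_pop(output);
        if (!o)
            return;
        (void)write(fd, o->cmd, strlen(o->cmd));  /* assume it all fits */
        o->state = OP_WAITING_FOR_RESPONSE;
        q_push(input, o);
    }

    /* Run whenever a complete response line arrives: it belongs to
     * whatever is at the front of the input queue. */
    static void on_response_line(struct queue *input)
    {
        struct op *o = q_pop(input);
        if (o)
            o->state = OP_COMPLETE;
    }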

I must be missing something about the above, since I can't see why there is a benefit to dealing with non-blocking IO on a set, if you will just end up waiting on the read() (ok, recv()).

        Not with my client (unless you want to).  :)

On a different, related note, I've noticed another issue with "set". When I send a "set foo 0 0 20\r\n", I have to just send that message. I can't just drop the "set" and the data to be stored in the same socket. If I do that, then the server removes whatever portion of the key was contained in the "set". Maybe this is my bug (though I can demonstrate it), but that seems like a waste. AKA, if on the server it's doing a read() for the set and tossing out the rest of the packet, then it's purposely causing two round trips for the same data.

By ``socket,'' do you mean ``packet?'' My client pipelines requests in such a way that multiple gets, sets, deletes, etc... can easily get stuffed into the same packet.
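Nothing stops a client from packing the command line and the data block into one buffer and issuing a single write(). A sketch (it punts on short writes and oversized values):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* One write() carries "set <key> 0 0 <len>\r\n" plus the data
     * block plus the trailing "\r\n", so both halves of the set can
     * share a packet. */
    static ssize_t send_set(int fd, const char *key,
                            const void *data, size_t len)
    {
        char buf[8192];
        int hdr = snprintf(buf, sizeof(buf), "set %s 0 0 %zu\r\n", key, len);
        if (hdr < 0 || (size_t)hdr + len + 2 > sizeof(buf))
            return -1;                /* fall back to two writes */
        memcpy(buf + hdr, data, len);
        memcpy(buf + hdr + len, "\r\n", 2);
        return write(fd, buf, (size_t)hdr + len + 2);
    }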

Looking through all of this, I am hoping that the binary protocol, which I eagerly await reading, has a "set" which doesn't bother to tell me what the result of the "set" was. You could pump a lot more data into memcached if this was the case.

We can create a qset, but the semantics would need to be carefully considered. qget just keeps its errors silent and only returns positive results. Should a qset do the opposite, or should it never return anything at all?

        Here's a fun exercise to do with memcached:

Write out a bunch of set commands to a text file, followed by a quit. Pipe that into nc with output to /dev/null. This will do various fun pipelining and basically show you how fast it's possible to write. The speed isn't all that much of a protocol issue.
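Something along these lines will generate the input (a throwaway sketch; the key count and value size are arbitrary, and the file name gensets.c is just for the example):

    #include <stdio.h>

    int main(void)
    {
        const char *value = "xxxxxxxxxxxxxxxxxxxx";  /* 20 bytes */
        for (int i = 0; i < 100000; i++)
            printf("set key%d 0 0 20\r\n%s\r\n", i, value);
        printf("quit\r\n");
        return 0;
    }

Then:

    cc -o gensets gensets.c
    ./gensets | nc localhost 11211 > /dev/null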

--
Dustin Sallings

