Re: Multi-get implementation in binary protocol

2014-05-09 Thread Byung-chul Hong
Hello, Ryan, dormando,

Thanks a lot for the clear explanation and the comments.
I'm trying to find out how many requests I can batch into a multi-get within
the allowed latency.
I think multi-get has many advantages; the only penalty is the longer
latency, as pointed out in the answer above.
But the longer latency may not be a real issue unless it exceeds a
threshold that end users can notice.
So now I'm trying to use multi-get as much as possible.
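As a rough illustration of how I'm measuring this (a sketch only: the host,
port, and key names are made up, and it times a plain ascii multi-get over a
raw socket rather than going through a real client library):

    import socket
    import time

    HOST, PORT = "127.0.0.1", 11211   # assumed local test server

    def multiget_latency(keys):
        # Time one ascii multi-get from send until the terminating END arrives.
        with socket.create_connection((HOST, PORT)) as s:
            request = ("get " + " ".join(keys) + "\r\n").encode()
            start = time.monotonic()
            s.sendall(request)
            buf = b""
            # Simplification: assumes no stored value itself ends in "END\r\n".
            while not buf.endswith(b"END\r\n"):
                buf += s.recv(65536)
            return time.monotonic() - start

    for n in (10, 100, 1000, 10000):
        keys = ["key%d" % i for i in range(n)]
        print(n, "keys: %.3f ms" % (multiget_latency(keys) * 1000))

The batch size where the reported time crosses my latency budget is roughly
the upper bound I'm looking for.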

Actually, I had thought that the binary protocol would always be better than
the ascii protocol, since it reduces the parsing burden on the server side,
but it seems I need to test both cases.

Thanks again for the comments, and I will share the result if I get some
interesting or useful data.

Byungchul.



2014-05-08 9:30 GMT+09:00 dormando :

> > Hello,
> >
> > For now, I'm trying to evaluate the performance of the memcached server
> > by using several client workloads.
> > I have a question about the multi-get implementation in the binary protocol.
> > As far as I know, in the ascii protocol we can send multiple keys in a
> > single request packet to implement multi-get.
> >
> > But in the binary protocol, it seems that we have to send multiple request
> > packets (one request packet per key) to implement multi-get.
> > Even if we send multiple getQ requests and then a get for the last key, we
> > only save response packets for cache misses.
> > If I understand correctly, multi-get in the binary protocol cannot reduce
> > the number of request packets, and it also cannot reduce the number of
> > response packets if the hit ratio is very high (like a 99% get hit rate).
> >
> > If the performance bottleneck is on the network side rather than on the CPU,
> > I think reducing the number of packets is still very important, but I
> > don't understand why the binary protocol doesn't address this.
> > Did I miss something?
>
> you're right, it sucks. I was never happy with it, but haven't had time to
> add adjustments to the protocol for this. To note, with .19 some
> inefficiencies in the protocol were lifted, and most network cards are
> fast enough for most situations, even if it's one packet per response (and
> for large enough responses they split into multiple packets anyway).
>
> The reason why this was done is for latency and streaming of responses:
>
> - In ascii multiget, I can send 10,000 keys, but then I'm forced to wait
> for the server to look up all of the keys before it sends any responses.
> The delay isn't typically very long, but there's some latency to it.
>
> - In binary multiget, the responses are sent back more or less as the
> server receives the requests from the network. This reduces the latency
> until you start seeing responses, regardless of how large your multiget
> is. This is useful if you have a client which can start processing
> responses in a streaming fashion, and it potentially reduces the total
> time to render your response, since you can keep the CPU busy
> unmarshalling responses instead of sleeping.
>
> However, it should have some tunables: one where it at least does one
> write per complete packet (TCP_CORK'ed, or similar), and one where it
> buffers up to some size. In my tests I can get ascii multiget up to 16.2
> million keys/sec, but (with the fixes in .19) binprot caps out at 4.6
> million and spends all of its time calling sendmsg(). Most people need
> far, far less than that, so binprot as-is should be okay, though.
>
> The code isn't too friendly to this, and there are other higher-priority
> things I'd like to get done sooner. The relatively small number of people
> who do 500,000+ requests per second in binprot (they're almost always on
> ascii at that scale) is the other reason.
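For reference, a rough sketch of what each protocol puts on the wire for a
small multi-get (made-up keys; the 24-byte header layout follows the binary
protocol spec, and this is not taken from any particular client):

    import struct

    keys = [b"k1", b"k2", b"k3"]

    # ascii: all keys ride in a single request line (usually one TCP packet).
    ascii_request = b"get " + b" ".join(keys) + b"\r\n"        # 14 bytes here

    # binary: one 24-byte header per key; quiet gets (getQ, 0x09) for all but
    # the last key, and a normal get (0x00) for the last key.
    def bin_get(opcode, key, opaque):
        header = struct.pack(">BBHBBHIIQ",
                             0x80,       # request magic
                             opcode,     # 0x09 getQ / 0x00 get
                             len(key),   # key length
                             0, 0, 0,    # extras length, data type, reserved
                             len(key),   # total body length (key only)
                             opaque,     # echoed back; used to match responses
                             0)          # CAS
        return header + key

    binary_request = b"".join(
        bin_get(0x09 if i < len(keys) - 1 else 0x00, k, i)
        for i, k in enumerate(keys))                           # 78 bytes here

The request side can be a single write either way if the client batches, but
the server currently does one write per binary response, which is the
asymmetry discussed above.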


Re: Multi-get implementation in binary protocol

2014-05-09 Thread dormando
Unfortunately binprot isn't that much faster processing-wise... what it
does give you is a bunch of safe features (batching sets, mixing
sets/gets, and the like).

You *can* reduce the packet load on the server a bit by ensuring your
client is actually batching the binary multiget packets together; then
it's only the server increasing the packet load...
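Something along these lines on the client side is what I mean (a hedged
sketch, not how any particular client library actually implements it; the
host, port, and keys are made up):

    import socket
    import struct

    def bin_get(opcode, key, opaque):
        # 24-byte binary protocol request header followed by the key.
        return struct.pack(">BBHBBHIIQ", 0x80, opcode, len(key),
                           0, 0, 0, len(key), opaque, 0) + key

    def batched_multiget(sock, keys):
        # getQ (0x09) for every key except the last, plain get (0x00) for the
        # last one, all handed to the kernel in a single write.
        batch = b"".join(
            bin_get(0x09 if i < len(keys) - 1 else 0x00, k, i)
            for i, k in enumerate(keys))
        sock.sendall(batch)
        # Responses still need to be read: quiet gets only answer on hits, and
        # the final plain get always answers, marking the end of the batch.

    with socket.create_connection(("127.0.0.1", 11211)) as s:
        batched_multiget(s, [b"user:1", b"user:2", b"user:3"])

That keeps the request side down to as few packets as the kernel will
coalesce; the per-response writes on the server side are what the tunables
mentioned earlier would address.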


Re: Issue 363 in memcached: MemcachePool::get(): Server 127.0.0.1 (tcp 11211, udp 0) failed with: Network timeout

2014-05-09 Thread dormando
Can you give me a list (privately, if need be) of a few things:

- The exact OS your server is running (centos/redhat release/etc)
- The exact kernel version (and where it came from? centos/rh proper or a
3rd party repo?)
- Full list of your 3rd party repos, since I know you had some random
French thing in there.
- Full list of packages installed from 3rd party repos.

It is extremely important that all of the software matches.

- Hardware details:
  - Network card(s), speeds
  - CPU type, number of cores (hyperthreading?)
  - Amount of RAM

- Is this a hardware machine, or a VM somewhere? If a VM, what provider?

- memcached stats snapshots again, from your machine after it's been
running a while:
  - "stats", "stats slabs", "stats items", "stats settings", "stats
conns".
^ That's five commands, don't forget any.
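If it helps, something like this grabs all five in one go (a rough sketch;
host and port are assumed to be the defaults):

    import socket

    COMMANDS = ["stats", "stats slabs", "stats items", "stats settings",
                "stats conns"]

    with socket.create_connection(("127.0.0.1", 11211)) as s:
        f = s.makefile("rwb")
        for cmd in COMMANDS:
            f.write((cmd + "\r\n").encode())
            f.flush()
            print("===", cmd)
            for line in f:            # each stats block ends with "END"
                line = line.decode().rstrip()
                if line == "END":
                    break
                print(line)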

It's too difficult to try to debug the issue only when you hit it. Usually
when I'm at a gdb console I'm issuing a command every second or two, but
it takes us 10 minutes to get through 3-4 commands. It'd be nice if I
could attempt to reproduce it here.

I went digging more, and there are some dup() bugs with epoll, except your
libevent is new enough to have those patched... plus we're not using dup()
in a way that would trigger the bug.

There was also an EPOLL_CTL_MOD race condition in the kernel, but as far
as I can tell, even with libevent 2.x, libevent isn't using that feature
for us.

The issue does smell like the bug that happens with dup()s: the events
keep happening and the fd sits half-closed, but again, we're never closing
those sockets.

I can also make a branch with the new dup() calls explicitly removed, but
this continues to be an obnoxious, multi-week-long debugging effort.

I'm convinced that the code in memcached is correct and that the bug exists
outside of it (in libevent or the kernel). There's simply no way for it to
hit that code path without closing the socket, and doubly so: epoll
automatically deletes an event when the socket is closed. We delete it and
then close it, and it still comes back.
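To illustrate the sequence I mean (using Python's epoll wrapper purely to
show the semantics; it is not what memcached/libevent literally does):

    import select
    import socket

    sock = socket.socket()          # stand-in for a client connection fd
    ep = select.epoll()
    ep.register(sock.fileno(), select.EPOLLIN)   # connection setup: watch it

    # On close, the event is deleted *before* the fd is closed, and closing
    # the fd would drop it from the epoll set anyway:
    ep.unregister(sock.fileno())
    sock.close()

    # After this, no events for that fd should ever be reported again; the
    # bug here is that they apparently keep coming back.
    print(ep.poll(0.1))             # wait 100 ms; expected: []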

It's not possible for a connection to end up in the wrong thread, since
both connection initialization and close happen locally to a thread. We
would need to have a new connection come in with a duplicated fd, and if
that happened, nothing on your machine would work.

thanks.

On Thu, 8 May 2014, notificati...@commando.io wrote:

> I am just speculating, and by no means have any idea what I am really talking
> about here. :)
> With 2 threads, still solid: no timeouts, no runaway 100% CPU. It's been days.
> Increasing from 2 threads to 4 does not generate any more traffic or
> requests to memcached, so I am speculating that perhaps it is a race
> condition of some sort that only hits with > 2 threads.
>
> Why do you say it will be less likely to happen with 2 threads than 4?
>
> On Wednesday, May 7, 2014 5:38:47 PM UTC-7, Dormando wrote:
>   That doesn't really tell us anything about the nature of the problem,
>   though. With 2 threads it might still happen, but it is a lot less likely.
>
>   On Wed, 7 May 2014, notifi...@commando.io wrote:
>
>   > Bumped up to 2 threads and so far no timeout errors. I'm going to let
>   > it run for a few more days, then revert back to 4 threads and see if
>   > timeout errors come up again. That will tell us whether the problem
>   > lies in spawning more than 2 threads.
>   >
>   > On Wednesday, May 7, 2014 5:19:13 PM UTC-7, Dormando wrote:
>   >       Hey,
>   >
>   >       try this branch:
>   >       https://github.com/dormando/memcached/tree/double_close
>   >
>   >       so far as I can tell that emulates the behavior in .17...
>   >
>   >       to build:
>   >       ./autogen.sh && ./configure && make
>   >
>   >       run it in screen like you were doing with the other tests, and
>   >       see if it prints "ERROR: Double Close [somefd]". If it prints that
>   >       once and then stops, I guess that's what .17 was doing... if it
>   >       spams that print, then something else may have changed.
>   >
>   >       I'm mostly convinced something about your OS or build is corrupt,
>   >       but I have no idea what it is. The only other thing I can think of
>   >       is to instrument .17 a bit more and have you try that (with the
>   >       connection code laid out the old way, but with a conn_closed flag
>   >       to detect a double-close attempt), and see if the old .17 still
>   >       did it.
>   >
>   >       On Tue, 6 May 2014, notifi...@commando.io wrote:
>   >
>   >       > Changing from 4 threads to 1 seems to have resolved the
>   >       > problem. No timeouts since. Should I set it to 2 threads and
>   >       > wait and see how things go?
>   >       >
>   >       > On Tuesday, May 6, 2014 12:07:08 AM UTC-7, Dormando wrote:
>   >       >       and how'd that work out?
>   >       >
>   >       >       Still no other reports :/ a few thousand m