Just a quick follow-up on this timeout issue. No solution yet, but it sure
seems like a client-side network issue.

I have three servers on the same subnet. One, called "mem", runs a single
instance of Memcached. The other two, dev-1 and dev-2, each run
mc_conn_tester.pl <http://consoleninja.net/code/memcached/mc_conn_tester.pl>.
At this point it reports no timeouts on either machine.

I then start another script on dev-1 that forks 30 processes. Each one
connects and then sends large set requests (almost 1MB in size) in a loop.
This is supposed to emulate a busy forking web server, for example.
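
A minimal Python sketch of that kind of load generator follows. It is not
the script I actually ran. The host name "mem" is taken from the setup
above, while the iteration count, the 5 second socket timeout, the
1,000,000 byte value size, and the names build_set, worker, and run_load
are assumptions made for illustration.

```python
import os
import socket
import uuid

VALUE_SIZE = 1000 * 1000  # "almost 1MB"; the exact size is an assumption


def build_set(key, value, flags=0, exptime=0):
    # Text protocol: "set <key> <flags> <exptime> <bytes>" CRLF <data> CRLF
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def worker(host, port, iterations):
    # One connection per process, reused for every set in the loop.
    value = b"x" * VALUE_SIZE
    with socket.create_connection((host, port), timeout=5) as sock:
        reader = sock.makefile("rb")
        for _ in range(iterations):
            sock.sendall(build_set(uuid.uuid4().hex, value))
            reply = reader.readline()
            if reply != b"STORED\r\n":
                print(f"pid {os.getpid()}: unexpected reply {reply!r}")


def run_load(host, port=11211, workers=30, iterations=100):
    # Fork the workers, then wait for all of them to finish.
    children = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            try:
                worker(host, port, iterations)
            except Exception as exc:
                print(f"pid {os.getpid()}: {exc!r}")
            finally:
                os._exit(0)
        children.append(pid)
    for pid in children:
        os.waitpid(pid, 0)
```

Calling run_load("mem") on dev-1 or dev-2 would start the 30 writers.
Lowering VALUE_SIZE is the equivalent of the smaller-data test described
further down.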

Then I start seeing timeouts from mc_conn_tester.pl on dev-1 but not on
dev-2. Likewise, if I move the load generator to dev-2, I see the timeouts
on dev-2 and not on dev-1. There are not many timeouts in either case, but
they clearly happen on the machine where the load script is running.
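
For anyone who wants to reproduce this without the Perl tester, here is a
rough Python sketch of a probe that times the connect, set, and get phases
separately. It is not the logic of mc_conn_tester.pl, only an approximation
that prints a line in the same shape as the "Fail:" output quoted below.
The "Ok" label, the key name, and the function names are assumptions.

```python
import socket
import time


def format_result(ok, timeout, elapsed, conn, set_t, get_t):
    # Same field layout as the "Fail:" line in the quoted message.
    status = "Ok" if ok else "Fail"
    return (f"{status}: (timeout: {timeout:g}) (elapsed: {elapsed:.8f}) "
            f"(conn: {conn:.8f}) (set: {set_t:.8f}) (get: {get_t:.8f})")


def probe(host, port=11211, timeout=1.0, key="probe_key"):
    conn = set_t = get_t = 0.0
    ok = False
    start = time.monotonic()
    try:
        t0 = time.monotonic()
        sock = socket.create_connection((host, port), timeout=timeout)
        conn = time.monotonic() - t0
        with sock:
            reader = sock.makefile("rb")
            t0 = time.monotonic()
            sock.sendall(f"set {key} 0 0 1\r\n1\r\n".encode("ascii"))
            stored = reader.readline() == b"STORED\r\n"
            set_t = time.monotonic() - t0
            t0 = time.monotonic()
            sock.sendall(f"get {key}\r\n".encode("ascii"))
            # A hit returns three lines: the VALUE header, the data, END.
            lines = [reader.readline() for _ in range(3)]
            get_t = time.monotonic() - t0
            ok = stored and lines[2] == b"END\r\n"
    except OSError:
        pass  # timeouts and refused connections leave later phases at 0
    elapsed = time.monotonic() - start
    return format_result(ok, timeout, elapsed, conn, set_t, get_t)
```

Running print(probe("mem")) in a loop on both dev machines while the load
script runs on one of them should show whether failures stay on the loaded
host. A failure with conn still at 0 means the connect itself never
completed within the timeout.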

If I change the load-generating script to send much smaller values, the
timeouts stop.

That has me thinking this isn't a problem with Memcached itself, but rather
some network problem on the client. The network is nowhere near saturation,
so maybe it is a temporary buffer overrun. I've asked our network people to
look into it.
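
One thing that can be checked from the client side in the meantime is
whether the kernel's TCP counters move while the load script runs. The
sketch below assumes a Linux client and reads /proc/net/snmp, where each
protocol has a header line of counter names followed by a line of values.
The function names are made up for illustration, and a rising RetransSegs
count would only be a hint, not proof of a buffer overrun.

```python
def parse_snmp(text, proto="Tcp"):
    # Each protocol appears as two lines: counter names, then values.
    rows = [line.split() for line in text.splitlines()
            if line.startswith(proto + ":")]
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def counter_delta(before, after):
    # Keep only the counters that changed between the two snapshots.
    return {name: after[name] - before.get(name, 0)
            for name in after if after[name] != before.get(name, 0)}


def read_tcp_counters(path="/proc/net/snmp"):
    with open(path) as handle:
        return parse_snmp(handle.read())
```

The idea is to call read_tcp_counters() before and after a run of the load
script on the same machine and pass both snapshots to counter_delta(), then
compare the result with a run on the machine that is not generating load.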

Agreed?

On Tue, Sep 24, 2013 at 7:09 AM, Bill Moseley <mose...@hank.org> wrote:

> I'm using the notes at https://code.google.com/p/memcached/wiki/Timeouts
>  to debug timeout errors against a single 1.4.4 Memcached server with 8GB
> of RAM on CentOS 6.2 started with
>
> memcached -d -p 11211 -u memcached -m 4096 -c 8192
>
>
> I could not get http://consoleninja.net/code/memcached/mc_conn_tester.pl to
> report a timeout when running by itself.
>
> So I wrote another script using Perl's Memcached::libmemcached that forked
> 20 or so processes and set values of ~1/2MB using keys generated by
> Data::UUID.  I didn't specify an expiration time for these sets.
>
> I then started to see a few timeouts without connecting, like in the
> examples:
>
> Fail: (timeout: 1) (elapsed: 1.00427794) (conn: 0.00000000) (set: 0.00000000) (get: 0.00000000)
>
> I'm just starting to look at this now, but the network cards are not
> showing errors or dropped packets.  I couldn't get enough timeouts to see
> whether changing the timeout value made much difference.
>
> Anyone have any additional suggestions for debugging these?
>
>
> I assume this is unrelated to the timeout errors, but while testing I
> started to get server errors in my script writing the large data to
> Memcached:
>
> SERVER_ERROR out of memory storing object
>
>
> Are those failed malloc calls?  I suspect this is related to my
> old version of Memcached (per this thread):
>
> https://groups.google.com/forum/#!topic/memcached/QD7a-6JdqgA
>
> But, I just started up another instance of Memcached using the defaults
> (-m 64) and cannot get it to fail with that error.
>
> The machine where I was getting the out of memory errors has plenty of
> room:
>
>  $ free
>              total       used       free     shared    buffers     cached
> Mem:       8059188    5006444    3052744          0     284740     215796
> -/+ buffers/cache:    4505908    3553280
> Swap:     10289144          0   10289144
>
> Any chance the timeouts are somehow related?
>
> --
> Bill Moseley
> mose...@hank.org
>



-- 
Bill Moseley
mose...@hank.org
