Memcached and EC2

2010-02-10 Thread tachu
Lately I've been running into some limitations using memcache on EC2.
Given that this is an EC2 issue I might get flamed, but I was hoping to
get opinions from people here who have been using memcache. We
currently run a cluster of approximately 40 memcache servers with about
6.5 GB of RAM per machine, using m1.medium EC2 instances. I was in the
process of reducing the number of servers while increasing the memory
size of each from 6 GB to about 30 GB. Now I've started noticing that
some servers seem to hit a bandwidth limit, though not consistently:
some servers push 6mb/sec while others at 4mb/sec have packet loss
and TCP timeouts. Most of my issues are the PHP client complaining about
"Connection timed out (110)" or "failed with: Failed reading line from
stream (0)". This is sporadic at best, but usually concentrated on the
same servers. I've replaced the instances, hoping this would give me an
instance in a better area or on a less congested switch, but I still
have the issue on the same server. I've tried multiple kernel settings
hoping to alleviate this, but I'm not sure whether they even apply to
the machine without restarting the memcache daemon. I guess the bottom
line is: if I add, say, another 30 servers, efficiently splitting
network traffic across more machines, will this slow down a lot of the
key hashing part of memcache, as well as the initialization of the
client when we loop to add servers?
I'd like to note that I use the pecl memcache client.

T


Re: Memcached and EC2

2010-02-10 Thread Henrik Schröder
On Wed, Feb 10, 2010 at 23:04, tachu wrote:

> will this
> slow down a lot of the key hashing part of memcache


No. The naive server selection is O(1), and the consistent hash server
selection is O(log(n)).

> as well as the
> initialization of the client when we loop to add servers..
>

That depends. The naive server selection just makes an array of all the
servers, which is O(n). The consistent hash server selection, though,
computes 100 hashes for each server, so that's O(100n): still linear, but
with a much larger constant. Either way, client initialization is a much
more expensive operation than the actual server selection.
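
For illustration, the two selection schemes can be sketched in Python
rather than PHP. This is a minimal sketch, not the code of either PHP
client: the MD5-based `_hash` helper, the `NaiveSelector` and
`RingSelector` names, and the "server-i" point labels are all assumptions.
Only the figure of 100 points per server comes from the description above.

```python
import bisect
import hashlib

POINTS_PER_SERVER = 100  # figure quoted above; real clients may differ


def _hash(value: str) -> int:
    """Map a string to a 32-bit integer (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


class NaiveSelector:
    """Modulo selection: O(n) to copy the server list, O(1) per lookup."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, key: str) -> str:
        return self.servers[_hash(key) % len(self.servers)]


class RingSelector:
    """Consistent hashing: 100 hashes per server to build (plus a sort),
    then a binary search, O(log n), per lookup."""

    def __init__(self, servers):
        ring = []
        for server in servers:
            for i in range(POINTS_PER_SERVER):
                ring.append((_hash(f"{server}-{i}"), server))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._owners = [owner for _, owner in ring]

    def pick(self, key: str) -> str:
        # First ring point at or after the key's hash, wrapping to the start.
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._owners[idx]
```

With a hypothetical pool of 70 servers the ring holds 7,000 points, so a
lookup needs roughly 13 comparisons. That is negligible next to a network
round trip, whereas building the ring costs 7,000 hash computations.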

Slightly related: Is there any way to get persistent connections for
memcached in PHP? There's no real static variables in it, but there seems to
be something for mysql that attaches connections to the parent apache
process, is there something similar for memcached? Or are you just
permanently screwed and have to re-initialize everything each pageview? I
tried googling for it, but I only get... unhelpful... results.


/Henrik


Re: Memcached and EC2

2010-02-11 Thread Brian Moon

> Slightly related: Is there any way to get persistent connections for
> memcached in PHP? There's no real static variables in it, but there
> seems to be something for mysql that attaches connections to the parent
> apache process, is there something similar for memcached? Or are you
> just permanently screwed and have to re-initialize everything each
> pageview? I tried googling for it, but I only get... unhelpful... results.


http://us2.php.net/manual/en/memcached.construct.php

or

http://us3.php.net/manual/en/function.memcache-addserver.php

depending on your library.

Brian.


Re: Memcached and EC2

2010-02-12 Thread Henrik Schröder
On Fri, Feb 12, 2010 at 01:26, Brian Moon wrote:

> Slightly related: Is there any way to get persistent connections for
>> memcached in PHP? There's no real static variables in it, but there
>> seems to be something for mysql that attaches connections to the parent
>> apache process, is there something similar for memcached? Or are you
>> just permanently screwed and have to re-initialize everything each
>> pageview? I tried googling for it, but I only get... unhelpful... results.
>>
>
> http://us2.php.net/manual/en/memcached.construct.php
>
> or
>
> http://us3.php.net/manual/en/function.memcache-addserver.php
>
> depending on your library.
>

Thanks, but how does that work exactly?

I've been trying to find information about it, and what I've pieced together
so far is that if you are running PHP as an Apache module, and if you are
making your own extension in C, then you get access to methods that allow
you to store stuff in some sort of shared memory in the Apache process, that
you can retrieve between pageviews, such as socket handles or initialized
arrays or similar. But there's nothing in the PHP language itself to do
something like that, you have to make your own C module. Correct?

And that then, in turn, means that the answer to the original question is
that adding more servers doesn't slow down the app, if they're using
persistent connections, because client initialization will only happen once?
(Given the prerequisites above, and given that they actually make sure they
only init once like a comment on that memcached manual page describes)

And is this true for both PHP clients?
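
The "only init once" idea can be sketched as follows. This is a Python
analogy under stated assumptions, not the PHP extension's mechanism: the
`_POOLS` registry stands in for whatever storage survives between
pageviews inside one server process, and `Client`, `get_client` and
`handle_request` are hypothetical names.

```python
# Hypothetical process-wide registry. It stands in for the storage that
# survives between pageviews inside a single server process.
_POOLS = {}


class Client:
    """Toy client that records how often the expensive setup ran."""

    def __init__(self):
        self.servers = []
        self.init_count = 0

    def add_servers(self, servers):
        # In a real client this is where the server list is built.
        self.servers.extend(servers)
        self.init_count += 1


def get_client(persistent_id, servers):
    """Return the client stored under persistent_id, creating it if needed."""
    client = _POOLS.get(persistent_id)
    if client is None:
        client = Client()
        _POOLS[persistent_id] = client
    # Only add servers when the stored client has none yet; otherwise
    # every pageview would append the same servers again.
    if not client.servers:
        client.add_servers(servers)
    return client


def handle_request(servers):
    """Simulate one pageview served by this process."""
    return get_client("main_pool", servers)
```

Calling `handle_request` twice in the same process returns the same
object, and the setup runs once. A separate process starts with an empty
registry and pays the initialization cost again.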


/Henrik


Re: Memcached and EC2

2010-02-12 Thread dormando
> Thanks, but how does that work exactly?
>
> I've been trying to find information about it, and what I've pieced together 
> so far is that if you are running PHP as an Apache module, and if you are 
> making your own extension in C, then you get access to
> methods that allow you to store stuff in some sort of shared memory in the 
> Apache process, that you can retrieve between pageviews, such as socket 
> handles or initialized arrays or similar. But there's
> nothing in the PHP language itself to do something like that, you have to 
> make your own C module. Correct?

Yeah. Special PHP mumbo jumbo.

> And that then, in turn, means that the answer to the original question is 
> that adding more servers doesn't slow down the app, if they're using 
> persistent connections, because client initialization will
> only happen once? (Given the prerequisites above, and given that they 
> actually make sure they only init once like a comment on that memcached 
> manual page describes)
>
> And is this true for both PHP clients?

I recall a thread a while back that the ketama/consistent hashing stuff
with the old library redid all of the calculations on every request
regardless of persistent settings.

Don't recall if that was ever fixed or properly tested, but the
complaining user had his performance problems go away when he stopped
using that.

I assume pecl/memcached isn't quite that stupid, but someone might want to
verify for PHP's sake.
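
To get a feel for what redoing the calculations on every request would
cost, here is a rough Python sketch that counts hash computations. It is
not a measurement of either PHP library: the MD5-based `point` helper,
the 100 points per server, and the function names are assumptions made
for illustration.

```python
import hashlib

POINTS_PER_SERVER = 100  # assumed, matching the figure quoted earlier
hash_calls = 0           # global counter of hash computations


def point(label: str) -> int:
    """Hash one ring point label and count the call."""
    global hash_calls
    hash_calls += 1
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:4], "big")


def build_ring(servers):
    """Build a sorted list of (point, server) pairs."""
    return sorted(
        (point(f"{server}-{i}"), server)
        for server in servers
        for i in range(POINTS_PER_SERVER)
    )


def hashes_for(requests, servers, rebuild_each_request):
    """Return how many hashes are computed while serving `requests` pageviews."""
    global hash_calls
    hash_calls = 0
    ring = None
    for _ in range(requests):
        if ring is None or rebuild_each_request:
            ring = build_ring(servers)
    return hash_calls
```

For a hypothetical pool of 70 servers and 10 requests, rebuilding every
time computes 70,000 hashes while building once computes 7,000, and the
gap grows linearly with the number of requests.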

-Dormando


Re: Memcached and EC2

2010-02-12 Thread Brian Moon

> I've been trying to find information about it, and what I've pieced
> together so far is that if you are running PHP as an Apache module, and
> if you are making your own extension in C, then you get access to
> methods that allow you to store stuff in some sort of shared memory in
> the Apache process, that you can retrieve between pageviews, such as
> socket handles or initialized arrays or similar. But there's nothing in
> the PHP language itself to do something like that, you have to make your
> own C module. Correct?


Right, PHP is meant to do all complicated things or heavy lifting in C, 
not PHP. It is the PHP way despite people not doing it that way every day.



> And that then, in turn, means that the answer to the original question
> is that adding more servers doesn't slow down the app, if they're using
> persistent connections, because client initialization will only happen
> once? (Given the prerequisites above, and given that they actually make
> sure they only init once like a comment on that memcached manual page
> describes)
>
> And is this true for both PHP clients?


I don't really know what this has to do with slowing down the app.  Are 
you seeing some problem or issue you are trying to resolve or being 
academic? Persistent connections happen once per PHP process provided 
the SAPI in use allows for storage across requests. But, if you are 
using Apache and you have MaxClients at 250 and it gets there, you will 
have 250 connections to memcached. If you want to reduce that, you could 
use something like Moxi to pipeline those connections. But, again, 
knowing what your problem is and why you are so worried about it would 
help us help you.
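
The connection arithmetic above can be written out as a small sketch. It
assumes the worst case, where every Apache child holds one persistent
connection to every memcached server. The function names and the
`web_hosts` parameter are hypothetical, added only for illustration.

```python
def connections_per_memcached_server(max_clients, web_hosts=1):
    """Worst case: each Apache child keeps one connection to this server."""
    return max_clients * web_hosts


def total_connections(max_clients, memcached_servers, web_hosts=1):
    """Worst-case number of persistent connections across the whole pool."""
    per_server = connections_per_memcached_server(max_clients, web_hosts)
    return per_server * memcached_servers
```

With MaxClients at 250 on a single web host, each memcached server sees
up to 250 connections. Across 40 memcached servers that is up to 10,000
open sockets on the web host, and 17,500 with 70 servers.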


As for pecl/memcache, just don't use it. It needs a lot of work and the 
authors don't seem interested.


Brian.