> @Les, you make a clear and concise point. Thanks.
>
> In this thread, I'm really keen on exploring a theoretical possibility (that
> could become very practical for very large installations):
>
>     -- at what node count (for a given pool) might we start to experience
> performance problems (server, network, or even client side), assuming a
> near-perfect hardware/network setup?

I think the really basic theoretical response is:

- If your request easily fits in the TCP send buffer and immediately
transfers out the network card, it's best if it hits a single server.
- If your requests are large, you can get lower-latency responses by
splitting them across servers rather than waiting on a single TCP socket
to drain (see the sketch after this list).
- Then there's some fiddling in the middle.
- Each time a client calls "send" that's a syscall, so more of them does
hurt, but keep in mind the above tradeoff: a little system CPU time vs.
waiting on TCP ACKs.
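
To make the large-request case concrete, here's a minimal Python sketch
of issuing per-server batches in parallel instead of serializing one big
request behind a single socket. fetch_from_server is a hypothetical
stand-in for a real client's multiget path, not any actual library API:

    import concurrent.futures

    def fetch_from_server(server, keys):
        # Hypothetical stand-in for a real memcached multiget round
        # trip; a real client would write "get k1 k2 ..." to `server`
        # and parse the response here.
        return {k: None for k in keys}

    def parallel_multiget(keys_by_server):
        # keys_by_server: {server: [keys]}. One in-flight request per
        # server, so no single TCP send buffer becomes the bottleneck.
        results = {}
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futs = [pool.submit(fetch_from_server, s, ks)
                    for s, ks in keys_by_server.items()]
            for f in concurrent.futures.as_completed(futs):
                results.update(f.result())
        return results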

In reality it doesn't tend to matter that much. The point of my response
to the facebook "multiget hole" is that you can tell clients to group keys
onto specific servers or subsets of servers (like all keys related to a
particular user), so you can have a massive pool and still generally avoid
contacting all of them on every request.
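
As a rough illustration of that grouping (the server names and the simple
modulo scheme here are made up; a real client would use its own
distribution logic):

    import hashlib

    SERVERS = ["mc1:11211", "mc2:11211", "mc3:11211"]  # hypothetical pool

    def server_for_user(user_id):
        # Hash the *user id*, not each individual key, so every key
        # belonging to that user lands on the same server and one
        # request can fetch them all.
        h = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16)
        return SERVERS[h % len(SERVERS)]

    # "user:42:name", "user:42:friends", etc. all go to the same place:
    target = server_for_user(42)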

>     -- if a memcached client were to pool say, 2,000 or 20,000 connections
> (again, theoretical but not entirely impractical given the rate of
> internet growth), would that not inject enough overhead -- connection or
> otherwise -- on the client side to, say, warrant a direct fetch from the
> database? In such a case, we would have established a *theoretical* maximum
> number of nodes in a pool for that given client in near-perfect conditions.

The theory depends on your setup, of course:

- Looking a server up in the hash takes no time (it's a hash); building
it is the time-consuming part. We've seen clients misbehave and seriously
slow things down by recalculating a consistent hash on every request. So
long as you're internally caching the continuum, the lookups are free.
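
For example, a client can build the continuum once and reuse it. A
minimal sketch (MD5 and 160 points per server are illustrative choices,
not what any particular client library does):

    import bisect
    import hashlib

    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def build_continuum(servers, points_per_server=160):
        # The expensive part: done once at startup (or when the pool
        # changes), never per request.
        return sorted((_hash("%s-%d" % (s, i)), s)
                      for s in servers
                      for i in range(points_per_server))

    def lookup(ring, key):
        # The cheap part: one hash plus a binary search on the cached
        # ring, wrapping around at the end.
        idx = bisect.bisect(ring, (_hash(key),)) % len(ring)
        return ring[idx][1]

    ring = build_continuum(["mc1:11211", "mc2:11211", "mc3:11211"])
    server = lookup(ring, "user:42:name")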

- Established TCP sockets mostly just waste RAM, but don't generally slow
things down. So for a client box, you can calculate the number of
memcached instances * the number of apache procs (or whatever) * the
memory overhead per TCP socket, compare that to the amount of RAM in the
box, and there's your limit, if you're using persistent connections.
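
A worked example with made-up numbers (per-socket overhead varies by
kernel and settings; 4 KB is just a rough figure):

    instances = 100          # memcached instances in the pool
    procs = 256              # apache procs on the client box
    bytes_per_socket = 4096  # rough idle TCP socket overhead, ~4 KB

    sockets = instances * procs        # 25,600 sockets on this box
    ram = sockets * bytes_per_socket   # ~100 MB of RAM
    print("%.0f MB tied up in idle sockets" % (ram / 1048576.0))

Against a box with tens of gigabytes of RAM that's nothing; grow the pool
or the proc count by 10-100x and it starts to be your limit.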

- If you decide not to use persistent connections, and design your
application so that satisfying a page read hits at *most* something like 3
memcached instances, you can go much higher. Tune the servers for
TIME_WAIT reuse, a wider local port range, etc., which deals with the TCP
churn. Connections are established on first use, then reused until the end
of the request, so the TCP SYN/ACK cycle for 1-3 (or even more) instances
won't add up to much. Pretending you can have an infinite number of
servers on the same L2 segment, you would likely be limited purely by
bandwidth, or by the amount of memory required to load the consistent hash
on the clients. Probably tens of thousands.
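
That tuning is ordinary Linux sysctl work; for instance (illustrative
values, not recommendations):

    # /etc/sysctl.conf -- example knobs for heavy TCP connection churn

    # Allow reuse of TIME_WAIT sockets for new outbound connections:
    net.ipv4.tcp_tw_reuse = 1

    # Widen the ephemeral port range so churn doesn't exhaust ports:
    net.ipv4.ip_local_port_range = 1024 65535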

- Or use UDP, if your data is tiny and you tune the fuck out of it.
Typically it doesn't seem to be much faster, but I may get a boost out of
it with some newer Linux syscalls.

- Or (Matt/Dustin correct me if I'm wrong) you use a client design like
spymemcached's. The memcached binary protocol can allow many client
instances to share the same server connections. Each client stacks
commands into the TCP socket like a queue (you could even theoretically
add a few more connections if you block too long waiting for space), then
responses get routed back to them off the same socket. This means you can
use persistent connections and generally have one socket per server
instance for an entire app server. Many thousands should scale okay.
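
A rough sketch of that multiplexing idea, assuming responses can be
matched to requests by the binary protocol's "opaque" field (this is the
shape of the design, not spymemcached's actual code; parse_response and
build_request are hypothetical stand-ins):

    import itertools
    import threading

    class SharedConnection:
        # Many logical clients share one TCP socket; a background
        # reader routes each response to its waiter via the opaque id.
        def __init__(self, sock):
            self.sock = sock
            self.opaques = itertools.count()
            self.pending = {}  # opaque -> [Event, response]
            self.lock = threading.Lock()

        def request(self, build_request):
            # build_request(opaque) -> bytes; the server echoes the
            # opaque back in the matching response header.
            op = next(self.opaques)
            slot = [threading.Event(), None]
            with self.lock:
                self.pending[op] = slot
                self.sock.sendall(build_request(op))
            slot[0].wait()
            return slot[1]

        def reader_loop(self):
            # Single reader thread: wake whichever waiter's opaque
            # matches the response just read off the shared socket.
            while True:
                op, value = parse_response(self.sock)
                slot = self.pending.pop(op)
                slot[1] = value
                slot[0].set()

    def parse_response(sock):
        # Stand-in: a real implementation reads the 24-byte binary
        # protocol response header, extracts the opaque, reads the body.
        raise NotImplementedError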

- Remember that Moore's law grows computers very quickly; maybe not as
fast as the internet, but ten years ago you would have had 100-megabit,
2G RAM memcached instances and needed an awful lot of them. Right now 10GbE
is dropping in price, 100G+ RAM servers are more affordable, and the
industry is already looking toward higher network rates. So as your
company grows, you get opportunities to cut the instance count every few
years.

>     -- also, I would think the hashing algorithm would deteriorate after a
> given number of nodes. Admittedly, this number could be very large indeed,
> and I know this is unlikely in probably 99.999% of cases, but it would be
> great to factor in the maths behind the science.

I sorta answered this above. Should put this into a wiki page I guess...

-Dormando
