Re: Memcache as session server with high cache miss?

2010-03-10 Thread Adam Lee
chances are, the same DOS attack would kill your datastore as well...

if you want speed and persistence, i'd recommend checking out products
like tokyotyrant; it even speaks memcached. we use it to cache user
data and it works incredibly well...

On Wednesday, March 10, 2010, Carlos Alvarez  wrote:
> On Wed, Mar 10, 2010 at 6:48 AM, Martin Grotzke
>  wrote:
>> Is there anything wrong with this? What would have to be considered
>> additionally?
>
> If you have an intranet app, I don't see any problem with this.
> If you have a public website, you are exposing yourself to DOS attacks
> with really low load.
>
>
>
> Carlos.
>

-- 
awl


Re: Memcache as session server with high cache miss?

2010-03-10 Thread Adam Lee
Agreed.

As I said before, we use a multi-layer approach to solve a similar problem.
 We use a database as our primary store for user data and it's cached in
TokyoTyrant and memcached.  The majority of our page views are read-only and
therefore don't need to hit the database.  We pull that data from memcached
if it exists and, if not, we pull it from TokyoTyrant and stick it in
memcached again.  The DB is never accessed from these pages (and, in fact,
can't be).

In order to update user data, it is written to all three layers.  This can
only be done by logged-in users modifying their own account or by backend
jobs that are updating user data-- this means that it happens at a much
smaller scale and DOS attacks are much less of a concern (though we do have
rate controls in place, obviously).

This allows us to scale to a huge level (several thousand page views a
second) that wasn't really previously possible... before we started using
TokyoTyrant as the persistent middle-tier, the system used a file-based
cache on our SAN that was served over NFS and it didn't scale nearly as
well, plus it was a lot harder to administer, back up, etc...  The same
workload that once took millions of files served by 6 very high-end
servers (32 cores per machine) is now handled by two boxes running
TokyoTyrant in a master-master replication setup.

On Wed, Mar 10, 2010 at 1:05 PM, Carlos Alvarez  wrote:

> On Wed, Mar 10, 2010 at 2:41 PM, Les Mikesell 
> wrote:
> > But you'll find it very expensive to scale up the number of servers
> > accessing that persistent store and the speed it can operate if you don't
> > use something like memcache in front of it.
>
> Of course. If it was understood that I was saying that memcached is
> useless, I apologize for my poor English.
>
> I was trying to point out that if your app behaves differently
> depending on whether the data is present in memcached or not (apart
> from response time and scalability), you may be using a cache tool for
> other purposes. And that is risky.
>
>
>
> Carlos.
>



-- 
awl


Re: Memcache as session server with high cache miss?

2010-03-11 Thread Adam Lee
On Thu, Mar 11, 2010 at 4:03 AM, TheOnly92 <05049...@gmail.com> wrote:

> We used memcache as a session storage server because we needed to
> balance load across servers and share session data. Persistent file
> storage is not usable, and since memcache session storage is easy to
> configure in PHP, we have decided to use it.
>

Hard to argue with that.  As I said before, though, it might be worth
looking into TokyoTyrant.  It speaks memcached and it's persistent, so you
can basically drop it in and start using it with your application code as it
exists.  It's not as fast or lightweight as memcached, though, so I would
suggest using it only as an add-on in situations where persistence is
important.


> Before this, memcache maximum storage was set to 64 MB, the data went
> up to 54 MB and so we thought it was the cause of cache misses. But
> after we increased to 256 MB, the problem is still occurring. Users
> report that after they logged in, they click on another page and they
> get logged out. But after a refresh they appear logged in again.
>
> session.gc_maxlifetime  1440  1440
>

This is definitely unusual behavior.  Have you tried debugging it at all?
I'd say that you should try going through step-by-step and checking the
value that's stored in memcached at each point.  Given what you described,
it sounds to me like the bug is on the application side, not in memcached--
it sounds like some sort of stale state is perhaps being cached, and that's
why users see themselves as logged out until they refresh.

-- 
awl


Re: Memcache as session server with high cache miss?

2010-03-13 Thread Adam Lee
Given the behavior and the high miss rate, I really have to think that
something doesn't match between your two servers.  Either the server configs
are different or they're trying to get different keys-- there's no reason,
otherwise, that data would appear to be there for one of them but not for
the other.

Can you paste your memcached configs from both servers?  It's important to
note that even server order matters in a memcached config: if the two
machines list the same servers in different orders, they will hash values
to different memcached servers.
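To make it concrete, here's a toy sketch of why order matters under classic
modulo hashing (real clients use better hash functions, but the principle is
the same; the addresses and key below are invented):

import java.util.Arrays;
import java.util.List;

public class OrderMatters {
    // Toy key-to-server mapping: hash the key, take it modulo the pool size.
    static String pick(String key, List<String> pool) {
        return pool.get((key.hashCode() & 0x7fffffff) % pool.size());
    }

    public static void main(String[] args) {
        List<String> webserver1 = Arrays.asList("10.0.0.1:11211", "10.0.0.2:11211");
        List<String> webserver2 = Arrays.asList("10.0.0.2:11211", "10.0.0.1:11211");
        // Same key, same hash, but the reversed pool order picks the other box:
        System.out.println(pick("sess:abc123", webserver1));
        System.out.println(pick("sess:abc123", webserver2));
    }
}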

On Sat, Mar 13, 2010 at 11:56 AM, TheOnly92 <05049...@gmail.com> wrote:

> Webserver1: http://paste2.org/p/715491
> Webserver2: http://paste2.org/p/715492
>
> Situation:
> 1. User logs in.
> 2. User clicks somewhere (still logged in)
> 3. User clicks on another place and gets redirected to the home page
> (appears logged out for this page)
> 4. Refreshes and able to access the page again (logged in).
>
> On Mar 13, 1:35 pm, dormando  wrote:
> > Can you telnet to the instances, type "stats", "stats items", and "stats
> > slabs", then copy/paste all that into pastebin?
> >
> > echo "stats" | nc host 11211 >> stats.txt works too
> >
> > Your version is very old... It's missing many statistical counters that
> > could help us diagnose a problem. The extendedstats isn't printing an
> > evictions counter, but I can't remember if that version even had one.
> >
> > Can you describe your problem in more detail? If I recall:
> >
> > - User logs in.
> > - Clicks somewhere. now they're logged out?
> > - They click somewhere else, and they're logged in again? Does this mean
> > they found their original session again, or did your app log them in
> > again?
> >
> > -Dormando
> >
> >
> >
> > On Fri, 12 Mar 2010, TheOnly92 wrote:
> > > I'm retrieving statistics via Memcache::extendedStats function, here
> > > are the basics:
> >
> > > Session Server 1
> > > Version             1.2.2
> > > Uptime              398,954 sec
> > > Cache Hits          2,065,061
> > > Cache Misses        987,726 (47.83%)
> > > Current Items       381,928
> > > Data Read           4,318,055.02 KB
> > > Data Written        2,011,004.09 KB
> > > Current Storage     100,688.96 KB
> > > Maximum Storage     256.00 MB
> > > Current Connections 9
> > > Total Connections   5,278,414
> > >
> > > Session Server 2
> > > Version             1.2.2
> > > Uptime              398,943 sec
> > > Cache Hits          2,225,697
> > > Cache Misses        987,733 (44.38%)
> > > Current Items       381,919
> > > Data Read           4,323,893.05 KB
> > > Data Written        2,159,309.95 KB
> > > Current Storage     100,685.52 KB
> > > Maximum Storage     256.00 MB
> > > Current Connections 11
> > > Total Connections   5,278,282
> >
> > > We are absolutely sure that both webservers are able to access the
> > > memcache server instances. We selected memcache because it was easy to
> > > configure and set up without any changes to source code, not because we
> > > think it is "absolutely" reliable. We just need it to work most of the
> > > time, but the current situation is just unacceptable.
>



-- 
awl


Re: How to get more predictable caching behavior - how to store sessions in memcached

2010-03-13 Thread Adam Lee
If your goal is only to make memcached into a reliable datastore, then I
think you are perhaps going about it in the wrong way.  The memcached server
is extremely well written and tuned and does its job incredibly well and
very efficiently.  If you want to ensure that it is deterministic, I think
you should put that logic on the client side rather than on the server
side.

We, for example, did some work in the past to store our user data (we don't
really use sessions in the traditional sense of the word, but this is
probably the closest thing we have outside of our cookie) in memcached
because the load on our primary database was just too high.  In order to
make it deterministic, we wrote our own client and did a special setup.

We had several servers (started with 3, ended up growing it to 5 before we
replaced it with TokyoTyrant) that had identical configurations, such that
each server had more than enough memory to fit the entire dataset.  We then
wrote a client that had the following behavior:

- Writes were sent to every server
- All updates to the database had to also be written to memcached in order
to be considered a success
- Reads were performed on a randomly selected server
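A minimal sketch of that client behavior, using spymemcached for brevity
(our real client was home-grown; the class and names here are illustrative):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import net.spy.memcached.MemcachedClient;

public class ReplicatedUserCache {
    private final List<MemcachedClient> servers = new ArrayList<MemcachedClient>();
    private final Random random = new Random();

    public ReplicatedUserCache(List<InetSocketAddress> addrs) throws IOException {
        for (InetSocketAddress a : addrs) {
            servers.add(new MemcachedClient(a));  // one connection per server
        }
    }

    // Writes go to every server; the write only "succeeds" if all of them took it.
    public boolean set(String key, int exp, Object value) {
        boolean ok = true;
        for (MemcachedClient c : servers) {
            try {
                ok &= c.set(key, exp, value).get();  // block for each result
            } catch (Exception e) {
                ok = false;
            }
        }
        return ok;
    }

    // Reads hit one randomly chosen server; every server holds the full dataset.
    public Object get(String key) {
        return servers.get(random.nextInt(servers.size())).get(key);
    }
}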

We also wrote a populate-user-cache script that could fill a new server with
the required data. Since we have about 30 million users, this job took quite
a while, so we also built in the idea of an "is populated" flag.  This flag
would not be set by the populate script until it was totally finished
replicating the data.  The client code was written such that it could write
to a server that didn't have the "is populated" flag, but would never read
from it.  This meant that we could bring up new servers and they would be
populated with new data, but would only be used once they were accurate (the
populate-user-cache script only issued add commands, making sure that it
didn't clobber any data being written by actual traffic).
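The populate pass might look something like this (a sketch; the flag key and
the dump source are invented names):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.Map;
import net.spy.memcached.MemcachedClient;

public class PopulateUserCache {
    public static void populate(Map<String, Object> dump) throws IOException {
        MemcachedClient fresh = new MemcachedClient(new InetSocketAddress("new-server", 11211));
        for (Map.Entry<String, Object> e : dump.entrySet()) {
            // add is a no-op if live traffic got there first (a real script
            // would also check the returned futures)
            fresh.add(e.getKey(), 0, e.getValue());
        }
        fresh.set("cache:is_populated", 0, Boolean.TRUE);  // readers stay away until this exists
        fresh.shutdown();
    }
}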

One of the key features of this setup was that every server had the full
dataset-- this meant that we could build a page that needed data for, say,
500 users and load it with almost no more latency than needed to get the
data for one user because of how well memcached handles multi-gets.
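With spymemcached, for example, that kind of bulk load is a single getBulk
call (the keys here are invented):

import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.Map;
import net.spy.memcached.MemcachedClient;

public class MultiGetExample {
    public static void main(String[] args) throws Exception {
        MemcachedClient client = new MemcachedClient(new InetSocketAddress("localhost", 11211));
        // One round trip for many keys-- close to the latency of a single get.
        Map<String, Object> users = client.getBulk(
                Arrays.asList("user:1", "user:2", "user:500"));
        System.out.println(users.size() + " users found");
        client.shutdown();
    }
}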

We don't use this setup anymore because we moved to using TokyoTyrant as our
persistent cache layer, but I will say that it worked pretty much flawlessly
for about two years.  There was no way that our database would have been
able to handle the necessary read load, but these servers performed
exceedingly well-- easily handling over 30,000 gets per second.

Anyway, I think that building something similar might do a much better job
of performing the task you're attempting.  The key thing to recognize is
that memcached is built to do a specific task and it's _GREAT_ at it, so you
should use it for what it does best. Let me know if any of this doesn't make
sense to you or if you have any further questions.

-- 
awl


Re: How to get more predictable caching behavior - how to store sessions in memcached

2010-03-14 Thread Adam Lee
i find that people tend to do a lot of mental gymnastics to come up
with "what if?" scenarios for memcached (particularly in regard to
node failures and data integrity) and, while they're technically
possible, they very, very rarely happen in the wild. for this one,
i'll just say that memcached does a great job with its slab allocation
and, unless you're running with way too little memory, you'll not very
often see items evicted before their expiration time.

that said, memcached is a cache and should be treated as such unless
you jump through hoops to make it more deterministic (e.g. what i
described in my most recent mail to the list)

On Sunday, March 14, 2010, Peter J. Holzer  wrote:
> On 2010-03-12 17:07:25 -0800, dormando wrote:
>> Now, it should be obvious that if a user session has reached a point where
>> it would be evicted early, it is because you did not have enough memory to
>> store *all active sessions anyway*. The odds of it evicting someone who
>> has visited your site *after* me are very low. The longer I stay
>> off the site, the higher the odds of it being evicted early due to lack of
>> memory.
>>
>> This does mean, by way of painfully describing how an LRU works, that the
>> odds of you finding sessions in memcached which have not expired, but
>> are being evicted from the LRU earlier than expired sessions, are very
>> low.
> [...]
>> The caveat is that memcached has one LRU per slab class.
>>
>> So, lets say your traffic ends up looking like:
>>
>> - For the first 10,000 sessions, they are all 200 kilobytes. This ends up
>> having memcached allocate all of its slab memory toward something that
>> will fit 200k items.
>> - You get linked from the frontpage of digg.com and suddenly you have a
>> bunch of n00bass users hitting your site. They have smaller sessions since
>> they are newbies. 10k items.
>> - Memcached has only reserved 1 megabyte toward 10k items. So now all of
>> your newbies share a 1 megabyte store for sessions, instead of 200
>> megabytes.
>
> There's another caveat (I think Martin may have been referring to this
> scenario, but he wasn't very clear):
>
>
> Suppose you have two kinds of entries in your memcached, with different
> expire times. For example, in addition to your sessions with 3600s, you
> have some alert box with an expiry time of 60s. By chance,
> both items are approximately the same size and occupy the same slab
> class(es).
>
> You have enough memory to keep all sessions for 3600 seconds and enough
> memory to keep all alert boxes for 60 seconds. But you don't have enough
> memory to keep all alert boxes for 3600 seconds (why should you, they expire
> after 60 seconds).
>
> Now, when you walk the LRU chain, the search for expired items will only
> return expired alert boxes which are about as old as your oldest session.
> As soon as there are 50 (not yet expired) sessions older than the oldest
> (expired) alert box, you will evict a session although you still have a
> lot of expired alert boxes which you could reuse.
>
> The only workaround for this problem I can see is to use different
> memcached servers for items of (wildly) different expiration times.
>
>> However the slab out of balance thing is a real fault of ours. It's a
>> project on my plate to have automated slab rebalancing done in some usable
>> fashion within the next several weeks. This means that if a slab is out of
>> memory and under pressure, memcached will decide if it can pull memory
>> from another slab class to satisfy that need. As the size of your items
>> change over time, it will thus try to compensate.
>
> That's good to hear.
>
>         hp
>
> --
>    _  | Peter J. Holzer    | Openmoko has already embedded
> |_|_) | Sysadmin WSR       | voting system.
> | |   | h...@hjp.at         | Named "If you want it -- write it"
> __/   | http://www.hjp.at/ |  -- Ilja O. on commun...@lists.openmoko.org
>

-- 
awl


Re: How to get more predictable caching behavior - how to store sessions in memcached

2010-03-14 Thread Adam Lee
well, it depends on what you mean by scalability... i'm personally of
the opinion that traditional sessions should be avoided if you want to
truly scale.


On Sunday, March 14, 2010, Martin Grotzke  wrote:
> On Sun, Mar 14, 2010 at 5:37 PM, Les Mikesell  wrote:
>> What about tomcat's ClusterManager?  Doesn't that provide replication
>> across server instances?
>
> Yes, but the DeltaManager does an all-to-all replication which limits
> scalability. The BackupManager does a replication to another tomcat
> (according to the docs this is not that much tested), and this requires
> special configuration in the load balancer to support it.
>
> And both are using java serialization - for the memcached-session-manager I
> implemented xml based serialization that allows easier code upgrades. Of
> course you still need to think about code changes that affect classes stored
> in the session, but removing fields is easier to support than with java
> serialization.
>
> Cheers,
> Martin
>
> --
> Martin Grotzke
> http://www.javakaffee.de/blog/
>

-- 
awl


Re: How to get more predictable caching behavior - how to store sessions in memcached

2010-03-15 Thread Adam Lee
On Sun, Mar 14, 2010 at 2:59 PM, Les Mikesell  wrote:

> Adam Lee wrote:
>
>> well, it depends on what you mean by scalability... i'm personally of
>> the opinion that traditional sessions should be avoided if you want to
>> truly scale.
>>
>
> And yet, everyone wants dynamic pages custom-generated to the user's
> preferences.  So how do you reconcile that?  You can help things a bit by
> splitting pages into iframe/image components that do/don't need sessions,
> and you can make the client do more of the work by sending back values in
> cookies instead of just the session key, but I'm not sure how far you can
> go.
>

Well, I guess it depends on your definition of "session."  Obviously, you
need to account for user preferences and such, but I don't consider those
"session" data since they are consistent across any session that the user
instantiates.

Probably the easiest way to build a "stateless"/shared-nothing web
application, and what we've done to scale, is to store user authentication
data and the like in an encrypted cookie.  Any other session-like data (geo
location from IP lookup, language preference, etc) can be set in separate
cookies.  Since cookies are sent with every request, it is possible to
easily authenticate that the user is who they say they are and to discern
the data needed to build their page using only these cookies, so you never
have to look anything up in any sort of centralized session cache.

Data that is needed to authenticate a request or to display a message on a
subsequent page view (things that would be stored in the Flash in Rails,
from how I understand that to work) can be encoded into a cryptographically
secure "token" that is passed to the following request.

User preferences and settings, on the other hand, are not really session
data, as I said above.  I've already described somewhat how we have this
data stored in a few previous posts on this thread, but I guess I'll do a
basic overview for the sake of completeness...

Our central datastore for users is still (unfortunately) a database (mysql),
but this is essentially only used for writes.  All user data is also written
to TokyoTyrant, which is our primary persistent datastore for reads, and is
replicated exactly in memcached.

Since not all user data is needed for every page view, we've broken the user
data into what we call "user chunks," which roughly correspond to what would
be DB tables or separate objects in a traditional ORM.  We built a service
that will get you the data you want for a specific user or set of users by
taking name(s) and a bitmask for what chunks you want.  So, for example, if
I wanted to load the basic user data, active photo and profile data for the
user "admin," I'd just have to do something like this:

RoUserCache.get("admin", USER | ACTIVE_PHOTO | PROFILE);

The beauty of this is that the cache is smart-- it batches all of the
requests from a thread into bulk gets, it does as much as possible
asynchronously and it tries to get data from memcached first and, if it's
not there, then gets it from TokyoTyrant. TokyoTyrant and memcached are both
great at doing bulk gets, so this is pretty fast and, since they both speak
the same protocol (memcached), it wasn't terribly difficult to build.  Doing
it asynchronously means that most of the latency is absorbed, too, since we
try to do these loads as early on in the page building process as possible,
so it tends to be there by the time the page tries to use it.

Anyway, I've strayed a bit from the topic at hand, but I guess I felt I
should elaborate on what I meant...

-- 
awl


Re: How to get more predictable caching behavior - how to store sessions in memcached

2010-03-16 Thread Adam Lee
Yes.  As described in my previous post, all necessary state
information is contained in the request-- if you want to pass state
information to the next request, it's easiest to encode it in a sort
of cryptographically secure "token."  I find it easiest to think of it
as almost like an FSM where the cookies, token and query parameters
are inputs.  Obviously it's not purely deterministic or referentially
transparent, since things do end up getting written to the datastores
and such, but it is a somewhat useful abstraction.

On Tue, Mar 16, 2010 at 4:02 AM, Martin Grotzke
 wrote:
> On Mon, Mar 15, 2010 at 6:57 PM, Adam Lee  wrote:
>>
>> On Sun, Mar 14, 2010 at 2:59 PM, Les Mikesell 
>> wrote:
>>>
>>> Adam Lee wrote:
>>>>
>>>> well, it depends on what you mean by scalability... i'm personally of
>>>> the opinion that traditional sessions should be avoided if you want to
>>>> truly scale.
>>>
>>> And yet, everyone wants dynamic pages custom-generated to the user's
>>> preferences.  So how do you reconcile that?  You can help things a bit by
>>> splitting pages into iframe/image components that do/don't need sessions,
>>> and you can make the client do more of the work by sending back values in
>>> cookies instead of just the session key, but I'm not sure how far you can
>>> go.
>>
>> Well, I guess it depends on your definition of "session."  Obviously, you
>> need to account for user preferences and such, but I don't consider those
>> "session" data since they are consistent across any session that the user
>> instantiates.
>> Probably the easiest way to build a "stateless"/shared-nothing web
>> application, and what we've done to scale, is to store user authentication
>> data and the like in an encrypted cookie.  Any other session-like data (geo
>> location from IP lookup, language preference, etc) can be set in separate
>> cookies.  Since cookies are sent with every request, it is possible to
>> easily authenticate that the user is who they say they are and discern the
>> necessary data to build their page using only these cookies and you don't
>> need to look anything up in any sort of centralized session cache.
>> Data that is needed to authenticate a request or to display a message on a
>> subsequent page view (things that would be stored in the Flash in Rails,
>> from how I understand that to work) can be encoded into a cryptographically
>> secure "token" that is passed to the following request.
>> User preferences and settings, on the other hand, are not really session
>> data, as I said above.  I've already described somewhat how we have this
>> data stored in a few previous posts on this thread, but I guess I'll do a
>> basic overview for the sake of completeness...
>> Our central datastore for users is still (unfortunately) a database
>> (mysql), but this is essentially only used for writes.  All user data is
>> also written to TokyoTyrant, which is our primary persistent datastore for
>> reads, and is replicated exactly in memcached.
>> Since not all user data is needed for every page view, we've broken the
>> user data into what we call "user chunks," which roughly correspond to what
>> would be DB tables or separate objects in a traditional ORM.  We built a
>> service that will get you the data you want for a specific user or set of
>> users by taking name(s) and a bitmask for what chunks you want.  So, for
>> example, if I wanted to load the basic user data, active photo and profile
>> data for the user "admin," I'd just have to do something like this:
>> RoUserCache.get("admin", USER | ACTIVE_PHOTO | PROFILE);
>> The beauty of this is that the cache is smart-- it batches all of the
>> requests from a thread into bulk gets, it does as much as possible
>> asynchronously and it tries to get data from memcached first and, if it's
>> not there, then gets it from TokyoTyrant. TokyoTyrant and memcached are both
>> great at doing bulk gets, so this is pretty fast and, since they both speak
>> the same protocol (memcached), it wasn't terribly difficult to build.  Doing
>> it asynchronously means that most of the latency is absorbed, too, since we
>> try to do these loads as early on in the page building process as possible,
>> so it tends to be there by the time the page tries to use it.
>> Anyway, I've strayed a bit from the topic at hand, but I guess I felt I
>> should elaborate on what I meant...
>
> So you're one of the lucky guys that don't have to support users with
> cookies disabled?
> According to what you describe, it seems you're not using sticky sessions.
> Do you handle concurrency issues in any way, to make sure that concurrent
> requests (e.g. tabbed browsing, AJAX) hitting different servers see the
> same data?
> Cheers,
> Martin
>
>>
>> --
>> awl
>
>
>
> --
> Martin Grotzke
> http://www.javakaffee.de/blog/
>



-- 
awl


Re: How to get more predictable caching behavior - how to store sessions in memcached

2010-03-22 Thread Adam Lee
On Sat, Mar 20, 2010 at 7:31 PM, Martin Grotzke
 wrote:
> Ok, thanx for sharing your experience. Do you have some app online
> implemented like this I can have a look at?

http://www.fotolog.com/

-- 
awl



Re: Am I writing to Memcached too fast?

2010-04-06 Thread Adam Lee
Yeah, I'm having a bit of trouble wrapping my head around what exactly
it is you're trying to accomplish-- it really sounds like your
solution is significantly more complex than your problem...  perhaps
if you described the actual functionality you're trying to implement,
we'd have a better chance of suggesting something.

-- 
awl


spymemcached + kestrel

2010-04-26 Thread Adam Lee
I know this isn't exactly the list for this, but does anybody out there have
experience using Kestrel with the Spy client?  If so, can you please help me
with some discussions off-list?

It seems the two are nearly irreconcilable without some major hacking...

-- 
awl




Re: Using Java to Telnet into memcached

2010-05-25 Thread Adam Lee
I've never done it programmatically from Java, but I've done it with telnet,
netcat, etc. and had no problems.  The telnet client isn't trying to do
extra work that would confuse memcached or something, is it?

It should be trivial to use plain sockets or the nio classes to whip up
something that sends a stats command periodically...
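Something like this, say-- a minimal blocking-socket sketch (host and port
are assumptions; swapping in SocketChannel from java.nio is straightforward):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

public class StatsPoller {
    public static void main(String[] args) throws Exception {
        Socket sock = new Socket("localhost", 11211);
        try {
            sock.getOutputStream().write("stats\r\n".getBytes("US-ASCII"));
            sock.getOutputStream().flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(sock.getInputStream(), "US-ASCII"));
            String line;
            // memcached answers with "STAT <name> <value>" lines, terminated by "END"
            while ((line = in.readLine()) != null && !line.equals("END")) {
                System.out.println(line);
            }
        } finally {
            sock.close();
        }
    }
}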

On Tue, May 25, 2010 at 1:34 PM, Tim Sneed  wrote:

>  Hey all,
>
>
>
> I am attempting to use a standard Java telnet client
> (commons.net.TelnetClient) but am having some trouble completing the
> connection. Once I run my Java test I see on the memcached console “<30 new
> auto-negotiating client connection” but then it just hangs there, eventually
> timing out with no exception being thrown.
>
>
>
> When I use the spymemcached I can connect no problem but I want to reduce
> the overhead since I am only interested in sending the STATS command at a
> set interval. Has anyone done this where they use a simple Telnet socket
> connection from Java to issue commands rather than using a Java memcached
> client such as spymemcached? Any info would be greatly appreciated, thanks!
>
>
>
> -ts
>



-- 
awl


Re: Using Java to Telnet into memcached

2010-05-25 Thread Adam Lee
That's fine, but you should at least use an actual socket to do it.  As I
said, take a look at the nio packages-- you should be able to whip something
up very quickly that performs extremely well and doesn't have all the
overhead of a telnet client.

On Tue, May 25, 2010 at 3:13 PM, Tim Sneed  wrote:

>  Yes, I have also used a socket, simmer down.
>
>
>
> As for the java client, I’d like to keep as little third-party code in our
> product as possible. If I only need to get stats and can use a socket to do
> so, I’m going with that.
>
>
>
> Thanks for your two cents.
>
>
>
> -ts
>
>
>
> *From:* memcached@googlegroups.com [mailto:memcac...@googlegroups.com] *On
> Behalf Of *Henrik Schröder
> *Sent:* Tuesday, May 25, 2010 3:11 PM
> *To:* memcached@googlegroups.com
>
> *Subject:* Re: Using Java to Telnet into memcached
>
>
>
> No, no, no. Don't use a telnet client to connect to it programmatically,
> just open a socket for crying out loud! You only need to write "stats\r\n"
> to it and then read the response. Why are you needlessly complicating
> things? If you use a programmatic telnet client you're gonna get something
> that tries to talk the telnet protocol. It works to connect to a memcached
> server with an actual telnet client, because they can usually handle the
> other part not being an actual telnet server, and downgrade to a dumb socket
> connection.
>
>
> Also, using an actual memcached client will probably not add a noticeable
> overhead, and you get the connecting to a server cluster + parsing of the
> results for free. Try it first and profile it instead of assuming it's a bad
> solution.
>
>
> /Henrik
>
>  On Tue, May 25, 2010 at 19:34, Tim Sneed  wrote:
>
> Hey all,
>
>
>
> I am attempting to use a standard Java telnet client
> (commons.net.TelnetClient) but am having some trouble completing the
> connection. Once I run my Java test I see on the memcached console “<30 new
> auto-negotiating client connection” but then it just hangs there, eventually
> timing out with no exception being thrown.
>
>
>
> When I use the spymemcached I can connect no problem but I want to reduce
> the overhead since I am only interested in sending the STATS command at a
> set interval. Has anyone done this where they use a simple Telnet socket
> connection from Java to issue commands rather than using a Java memcached
> client such as spymemcached? Any info would be greatly appreciated, thanks!
>
>
>
> -ts
>
>
>



-- 
awl


Re: Suggestions for deferring DB write using memcached

2010-06-03 Thread Adam Lee
On Thu, Jun 3, 2010 at 3:06 PM, ehalpern  wrote:

> We're building a system with heavy real-time write volume and looking
> for a way to decouple db writes from the user request path.
>
> We're exploring the approach of buffering updated entities in
> memcached and writing them back to the database asynchronously.  The
> primary problem that we're concerned about is how to ensure that the
> entity remains in the cache until the background process has a chance
> to write it.
>
> Any advice and/or references would be greatly appreciated.
>

You want a job/work queue for this or just a simple queue service.

I've built something that does this for us using Kestrel, a very simple,
fast, persistent queue from Twitter that speaks memcached.
http://github.com/robey/kestrel

You can obviously make it as simple or
complex as you need, but I'd say it's best to start out as simple as
possible.  Since kestrel is fast and persistent, you should just be able to
write your data to it and move on, with the assumption that a background
service will pop items off of kestrel's queue and write them to the database
as quickly as possible.  If you need the data to be immediately available,
you can do something more complex like updating your cache at the same time,
so that subsequent reads will get the new data as long as it's still
available in the cache.
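A raw text-protocol sketch of the two halves (22133 is kestrel's default
memcached port; the queue name and payload are invented, and note that, as I
mentioned before on this list, spymemcached and kestrel don't get along
without some hacking, hence the bare socket here):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;

public class WriteBehind {
    public static void main(String[] args) throws IOException {
        Socket s = new Socket("localhost", 22133);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(s.getInputStream(), "US-ASCII"));
        OutputStream out = s.getOutputStream();

        // Enqueue: on kestrel, "set <queue> ..." appends an item to the queue.
        byte[] job = "user:42 email-changed".getBytes("US-ASCII");
        out.write(("set db-writes 0 0 " + job.length + "\r\n").getBytes("US-ASCII"));
        out.write(job);
        out.write("\r\n".getBytes("US-ASCII"));
        out.flush();
        System.out.println(in.readLine());  // expect "STORED"

        // Dequeue (run this in a background writer): "get <queue>" pops the oldest item.
        out.write("get db-writes\r\n".getBytes("US-ASCII"));
        out.flush();
        String header = in.readLine();      // "VALUE db-writes 0 <bytes>" or "END"
        if (header != null && header.startsWith("VALUE")) {
            System.out.println(in.readLine());  // the job-- hand it to the DB writer
            in.readLine();                      // trailing "END"
        }
        s.close();
    }
}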

-- 
awl


Re: Is this really Distributed?

2010-06-10 Thread Adam Lee
Uh... huh? There is no single point of failure in memcached.

It is distributed in that the data is distributed across the servers.  Every
client knows the algorithm to find data for any key.  If any server dies,
you still have a cache, though your miss rate might increase slightly.


On Fri, Jun 11, 2010 at 1:10 AM, Dilip  wrote:

> Going by that definition:
>
> All client server architectures are distributed.
>
> like ftp, ldap, sql servers are all distributed.
>
> As I understood it, in distributed systems there is no single point of
> failure.
>
> But in all these cases there are single points of failure.
>
> internet is distributed because there is no single point of failure.
>
> But after looking at the link
> http://en.wikipedia.org/wiki/Distributed_computing,
> here client server architecture is termed as distributed.
> I think the responsibility for avoiding a single point of failure lies with
> the intermediary clients, which can do that.
>
> Now I think we can call it that way.
>
> On Jun 10, 9:22 pm, Les Mikesell  wrote:
> > Dilip wrote:
> > > We read that memcached is "Free & open source, high-performance,
> > > distributed memory object caching system" Where as "distirbuted" is
> > > not part of memcached servers. We have to have some client which knows
> > > about all memcached servers and uses some hash based on key to
> > > determine a server.
> >
> > > Is My understanding correct? If it is correct, we should remove
> > > "Distributed" from the above definition.
> >
> > The data is distributed - but the servers don't need to know anything
> about
> > that.  Doesn't that still make it a distributed system?
> >
> > --
> >Les Mikesell
> >  lesmikes...@gmail.com
>



-- 
awl


Re: Distributed != Replicated. Another memcached vocabulary lesson

2010-06-11 Thread Adam Lee
I think it's kind of ridiculous to even bother arguing about this.
 Memcached does what it does and it does it exceptionally well, to the point
that it's used very heavily at most of the largest websites in the world.

Having said that, I just googled "distributed system" and took some
definitions from the first few results:

A distributed system consists of a collection of autonomous
computers, connected through a network and distribution middleware, which
enables computers to coordinate their activities and to share the resources
of the system, so that users perceive the system as a single, integrated
computing facility.


or, this one from Google itself:

A distributed system is an application that executes a collection of
protocols to coordinate the actions of multiple processes on a network, such
that all components cooperate together to perform a single or small set of
related tasks.


Memcached clearly fits both of these definitions.  The servers are
autonomous but through the middleware/protocol design, they work together to
perform one single task (acting as a cache/distributed hash table) and
appear as a single facility to any client.

I think the only problem that the OP has is that the "middleware" is
embedded in the client, but this just means that the distribution is, in a
way, baked into the protocol.  As long as every client agrees on the
protocol/hash, everything works perfectly.  And there is no single point of
failure, because if you lose a server, the cache as a whole continues to
function.

On Fri, Jun 11, 2010 at 12:54 PM, Brian Moon  wrote:

> After the recent thread and reading some comments on the memcached wiki I
> think I know what is wrong. People see the word distributed and think it
> means replicated.
>
> http://www.merriam-webster.com/dictionary/distribute says:
>
> distribute:
> 1 : to divide among several or many
> 2 : to spread out so as to cover something : scatter
> 3 : to divide or separate especially into kinds
> 4 : to use in or as an operation so as to be mathematically distributive
>
> POW! memcached (the solution, not the daemon) satisfies all of these
> definitions to a tee.
>
> http://www.merriam-webster.com/dictionary/replicate says:
>
> replicate:
> produce a replica of itself
>
> This is not what memcached is or does. Never claimed to.
>
> --
>
> Brian.
> 
> http://brian.moonspot.net/
>



-- 
awl


Re: Distributed != Replicated. Another memcached vocabulary lesson

2010-06-18 Thread Adam Lee
There is definitely at least a bit of misunderstanding related to this
aspect of memcached, though.  Even if people don't have this exact issue, it
seems that people come on here every month or so and post something related
to this functionality (or lack thereof) of memcached.  Usually, it's along
the lines of "I figured out a scenario where memcached could get
inconsisent," wherein they proceed to jump through a lot of hoops explaining
the situation where a server gets a value, goes offline, the value gets
updated elsewhere and then the server comes back online and they could
possibly have a bad value.

Obviously, these things are not really a problem in the real world, since
all of us run large instances of memcached doing hundreds of thousands of
requests per second or more with little, if any, problem, but it
may be something that should be explicitly spelled out.  It's covered a
bit in the wiki and the documentation (cf. the "Persistent Storage" section
of the Overview in the wiki), but I almost feel like there needs to be a
very explicit:


   - memcached is not replicated. servers are unaware of each other-- they
   don't speak to each other in any way.
   - memcached is not persistent. there are many excellent, persistent K/V
   stores available.  memcached is not one of them.
   - memcached is fast. it is fast precisely because it does one thing and
   does it very well.
   - memcached is a cache.  don't put data in it that you can't recreate and
   that you can't afford to lose.


On Fri, Jun 18, 2010 at 1:03 PM, Marc Bollinger wrote:

> It didn't seem to me that Brian was talking about a quorum of people
> thinking that, but rather that some small portion of people conflate the
> two, one of whom started the long thread about memcached not being
> distributed. I agree that most people probably wouldn't trip over that
> wording.
>
>
> - Marc
>
> On Fri, Jun 18, 2010 at 6:04 AM, Simon Riggs wrote:
>
>> On Fri, 2010-06-11 at 11:54 -0500, Brian Moon wrote:
>> > After the recent thread and reading some comments on the memcached wiki
>> > I think I know what is wrong. People see the word distributed and think
>> > it means replicated.
>>
>> I'm not really sure there's a quorum of people that think that. I didn't
>> worry too much when I read that previously.
>>
>> --
>>  Simon Riggs   www.2ndQuadrant.com
>>  PostgreSQL Development, 24x7 Support, Training and Services
>>
>>
>


-- 
awl


Re: LRU mechanism question

2010-07-07 Thread Adam Lee
That's not really true in practice.  Yes, memcached does reuse slots, but
your items don't need to actually be the exact same size, they just need to
be in the same slab class.  In production, you'll probably never run into a
situation like your test where 100% of the slab space is allocated to the
same item size.

Memcached is very good at what it does.
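If you're curious where the class boundaries fall, here's a rough sketch of
how the default slab classes grow (a 96-byte starting chunk and a 1.25
growth factor roughly match the defaults, but exact sizes vary by version
and by the -n/-f flags-- run "stats slabs" to see the real ones). Whether
9K and 10K items share a class depends on where these boundaries land:

public class SlabClasses {
    public static void main(String[] args) {
        double size = 96;  // assumed first chunk size
        int cls = 1;
        while (size < 16 * 1024) {
            long chunk = (long) Math.ceil(size / 8) * 8;  // chunks are 8-byte aligned
            System.out.printf("class %2d: %6d bytes%n", cls++, chunk);
            size *= 1.25;  // the default growth factor
        }
    }
}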

On Tue, Jul 6, 2010 at 10:03 PM, Sergei Bobovich wrote:

> Thanks, Brian,
> I understand that. My goal here is to better understand possible
> limitations
> and set expectations properly. Actually per what I saw in my tests (if the
> second series of inserts will still be of 512K then all of them will be
> stored successfully) I would conclude that if my data is about the same
> size
> (let's say from 9 to 10K) then I will do much more better by having all
> data
> pieces of the same size (align to 10K). Again this is a speculation without
> knowing internals but my impression is that memcached successfully reuses
> slots of the same size.
>
> Regards,
> Sergei
>
> -Original Message-
> From: Brian Moon [mailto:br...@moonspot.net]
> Sent: Tuesday, July 06, 2010 8:36 PM
> To: memcached@googlegroups.com
> Cc: siroga
> Subject: Re: LRU mechanism question
>
> Just to pile on, test data that is all the same size like that is
> probably a very bad test of memcached. Most likely, all your data is not
> the exact same size.
>
> Brian.
> 
> http://brian.moonspot.net/
>
> On 7/6/10 5:36 PM, siroga wrote:
> > Hi,
> > I just started playing with memcached. While doing very basic stuff I
> > found one thing that confused me a lot.
> > I have memcached running with default settings - 64M of memory for
> > caching.
> > 1. Called flushALL to clean the cache.
> > 2. insert 100 byte arrays of 512K each - this should consume about 51M
> > of memory, so I should have enough space to keep all of them - and to
> > verify that, I call get() for each of them - as expected, all arrays are
> > present
> > 3. I call flushAll again - so cache should be clear
> > 4. insert 100 arrays of smaller size (256K). I also expected that I
> > would have enough memory to store them (overall I need about 26M), but
> > surprisingly to me, when calling get() only the last 15 were found in the
> > cache!!!
> >
> > It looks like memcached still holds the memory occupied by the first 100
> > arrays.
> > Memcache-top says that only 3.8M out of 64 used.
> >
> > Any info/explanation on memcached memory management details is very
> > welcomed. Sorry if it is a well known feature, but I did not find much
> > on a wiki that would suggest explanation.
> >
> > Regards,
> > Sergei
> >
> > Here is my test program (I got the same result using both danga and
> > spy.memcached. clients):
> >
> >  MemCachedClient cl;
> >
> > @Test
> >  public void strange() throws Throwable
> >  {
> >  byte[] testLarge = new byte[1024*512];
> >  byte[] testSmall = new byte[1024*256];
> >  int COUNT = 100;
> >  cl.flushAll();
> >  Thread.sleep(1000);
> >  for (int i = 0; i<  COUNT; i++)
> >  {
> >  cl.set("largekey" + i, testLarge, 600);
> >  }
> >  for (int i = 0; i<  COUNT; i++)
> >  {
> >  if (null != cl.get("largekey" + i))
> >  {
> >  System.out.println("First not null " + i);
> >  break;
> >  }
> >  }
> >  Thread.sleep(1000);
> >  cl.flushAll();
> >  Thread.sleep(1000);
> >  for (int i = 0; i<  COUNT; i++)
> >  {
> >  cl.set("smallkey" + i, testSmall, 600);
> >  }
> >  for (int i = 0; i<  COUNT; i++)
> >  {
> >  if (null != cl.get("smallkey" + i))
> >  {
> >  System.out.println("First not null " + i);
> >  break;
> >  }
> >  }
> >
> >  }
>
>


-- 
awl


Re: Disappearing Keys

2010-07-07 Thread Adam Lee
+1

On Tue, Jul 6, 2010 at 2:47 PM, dormando  wrote:

> Or you could disable the "failover" feature...
>
> On Tue, 6 Jul 2010, Darryl Kuhn wrote:
>
> > FYI - we made the change on one server and it does appear to have
> resolved premature key expiration.
> >
> > Effectively what appears to have been happening was that every so often a
> client was unable to connect to one or more of the memcached servers. When
> this happened it changed the key distribution. Because
> > the connection was persistent it meant that subsequent requests would use
> the same connection handle with the reduced server pool. Turning off
> persistent connections ensures that if we are unable to
> connect to a server in one instance, the failure does not persist for
> subsequent connections.
> >
> > We'll be rolling this change out to the entire server pool and I'll give
> the list another update with our findings.
> >
> > Thanks,
> > Darryl
> >
> > On Fri, Jul 2, 2010 at 8:34 AM, Darryl Kuhn 
> wrote:
> >   Found the reset call - that was me being an idiot (I actually
> introduced it when I added logging to debug this issue)... That's been
> removed however there was no flush command. Somebody else
> >   suggested it may have to do with the fact that we're running
> persistent connections; and that if a failure occurred that failure would
> persist and alter hashing rules for subsequent requests on
> >   that connection. I do see a limited number of connection failures
> (~5-15) throughout the day. I'm going to alter the config to make
> connections non-persistent and see if it makes a difference
> >   (however I'm doubtful this is the issue as we've run with memcache
> server pools with a single instance - which would make it impossible to
> alter the hashing distribution).
> >
> >   I'll report back what I find - thanks for your continued input!
> >
> >   -Darryl
> >
> >
> > On Thu, Jul 1, 2010 at 12:28 PM, dormando  wrote:
> >   > Dormando... Thanks for the response. I've moved one of our
> servers to use an upgraded version running 1.4.5. Couple of things:
> >   >  *  I turned on logging last night
> >   >  *  I'm only running -vv at the moment; -vvv generated way more
> logging than we could handle. As it stands we've generated ~6GB of logs
> since last night (using -vv). I'm looking at ways
> >   of reducing log
> >   > volume by logging only specific data or perhaps standing up
> 10 or 20 instances on one machine (using multiple ports) and turning on -vvv
> on only one instance. Any suggestions there?
> >
> > Oh. I thought given your stats output that you had reproduced it on a
> > server that was on a dev instance or local machine... but I guess that's
> > related to below. Running logs on a production instance with a lot of
> > traffic isn't that great of an idea, sorry about that :/
> >
> > > Looking at the logs two things jump out at me.
> > >  *  While I had -vvv turned on I saw "stats reset" command being issued
> constantly (at least once a second). Nothing in the code that we have does
> this - do you know if the PHP client does
> > this perhaps? Is
> > > this something you've seen in the past?
> >
> > No, you probably have some code that's doing something intensely wrong.
> > Now we should probably add a counter for the number of times a "stats
> > reset" has been called...
> >
> > >  *  Second with -vv on I get something like this:
> > >  +  <71 get resourceCategoryPath21:984097:
> > > >71 sending key resourceCategoryPath21:984097:
> > > >71 END
> > > <71 set 
> > > popularProducts:2010-06-28:skinit.com:styleskins:en::2000:image_wall:0__type
> 0 86400 5
> > > >71 STORED
> > > <71 set 
> > > popularProducts:2010-06-28:skinit.com:styleskins:en::2000:image_wall:0
> 1 86400 130230
> > > <59 get domain_host:www.bestbuyskins.com
> > > >59 sending key domain_host:www.bestbuyskins.com
> > > >59 END
> > >  *  Two questions on the output - what's the "71" and "59"? Second - I
> would have thought I'd see an "END" after each "get" and "set" however you
> can see that's not the case.
> > >
> > > Last question... other than trolling through code is there a good place
> to go to understand how to parse out these log files (I'd prefer to
> self-help rather than bugging you)?
> >
> > Looks like you figured that out. The numbers are the file descriptors
> > (connections). END/STORED/etc are the responses.
> >
> > Honestly I'm going to take a wild guess that something on your end is
> > constantly trying to reset the memcached instance.. it's probably doing a
> > "flush_all" then a "stats reset" which would hide the flush counter. Do
> > you see "flush_all" being called in the logs anywhere?
> >
> > Go find where you're calling stats reset and make it stop... that'll
> > probably help bubble up what the real problem is.
> >
> >
> >
> >
> >
>



-- 
awl


Re: [PATCH] to make memcached drop privileges completely when running as root

2010-07-21 Thread Adam Lee
Yeah, memcached already knows how to drop privileges, and it's entirely
possible to jail it without tying it to a super-specific config.  Not sure
this patch is needed (no offense).

On Wed, Jul 21, 2010 at 1:30 AM, dormando  wrote:

> I'm not really sure how to delicately explain this so please have some
> forgiveness :)
>
> You can't really hardcode either of those things... That's replacing a
> flexible feature with the inflexible sort you'd expect out of a
> proprietary appliance.
>
> I'm pretty sure the -u feature works the way you need it to, and
> chroot'ing an application is perfectly doable with an init script. I think
> we could take a patch with an example init script for chroot'ing it
> (perhaps along with some directions).
>
> A bit on the fence about adding an outright chroot command, since
> different OS's have different ways of doing that, and hardcoding it
> doesn't seem to be the best use here (tho someone correct me if I'm
> wrong).
>
> On Wed, 21 Jul 2010, Loganaden Velvindron wrote:
>
> > Hi,
> >
> > _memcached is a dedicated user with home directory /var/empty,
> > and the login shell is /sbin/nologin.
> >
> > /var/empty could be created by the package manager or install script.
> >
> > //Logan
> > C-x-C-c
> >
> >
> > On Wed, Jul 21, 2010 at 1:25 AM, Trond Norbye 
> wrote:
> >
> > Why do you remove the ability for the user to specify the username it
> should run as, and instead hardcode it to run as _memcached?  In addition,
> this patch requires /var/empty to exist, and I know of
> > a number of platforms that don't have a /var/empty directory...
> >
> > Just my 0.5NOK
> >
> > Trond
> >
> >
> > On 20. juli 2010, at 20.54, Loganaden Velvindron wrote:
> >
> >   Greetings,
> >
> >   I've investigated further, and this diff seems to be ok.
> >
> >   What do you think ?
> >
> >   //Logan
> >   C-x-C-c
> >
> >   diff --git a/memcached.c b/memcached.c
> >   index 750c8b3..1d56a8f 100644
> >   --- a/memcached.c
> >   +++ b/memcached.c
> >   @@ -22,6 +22,8 @@
> >#include 
> >#include 
> >#include 
> >   +#include 
> >   +#include 
> >
> >/* some POSIX systems need the following definition
> >* to get mlockall flags out of sys/mman.h.  */
> >   @@ -4539,22 +4541,6 @@ int main (int argc, char **argv) {
> >   }
> >   }
> >
> >   -/* lose root privileges if we have them */
> >   -if (getuid() == 0 || geteuid() == 0) {
> >   -if (username == 0 || *username == '\0') {
> >   -fprintf(stderr, "can't run as root without the -u
> switch\n");
> >   -exit(EX_USAGE);
> >   -}
> >   -if ((pw = getpwnam(username)) == 0) {
> >   -fprintf(stderr, "can't find the user %s to switch
> to\n", username);
> >   -exit(EX_NOUSER);
> >   -}
> >   -if (setgid(pw->pw_gid) < 0 || setuid(pw->pw_uid) < 0) {
> >   -fprintf(stderr, "failed to assume identity of user
> %s\n", username);
> >   -exit(EX_OSERR);
> >   -}
> >   -}
> >   -
> >   /* Initialize Sasl if -S was specified */
> >   if (settings.sasl) {
> >   init_sasl();
> >   @@ -4675,6 +4661,30 @@ int main (int argc, char **argv) {
> >   }
> >
> >   /* Drop privileges no longer needed */
> >   +if (getuid()==0 || geteuid()==0) {
> >   +   if ((pw=getpwnam("_memcached")) == NULL) {
> >   +   fprintf(stderr,"user _memcached not found");
> >   +   exit(EX_NOUSER);
> >   +   }
> >   +
> >   +   if((chroot("/var/empty") == -1)) {
> >   +   fprintf(stderr,"check permissions on /var/empty");
> >   +   exit(EX_OSERR);
> >   +   }
> >   +
> >   +   if(chdir("/") == -1) {
> >   +   fprintf(stderr," Cannot set new root");
> >   +   exit(EX_OSERR);
> >   +   }
> >   +
> >   +   if(setgroups(1, &pw->pw_gid) ||
> >   +   setresgid(pw->pw_gid, pw->pw_gid, pw->pw_gid) ||
> >   +   setresuid(pw->pw_uid, pw->pw_uid, pw->pw_uid)) {
> >   +   fprintf(stderr," failed to switch to correct
> user");
> >   +   exit(EX_NOUSER);
> >   +   }
> >   +
> >   +   }
> >   drop_privileges();
> >
> >   /* enter the event loop */
> >
> >   On Tue, Jul 20, 2010 at 10:53 AM, Loganaden Velvindron <
> logana...@gmail.com> wrote:
> > yep it makes sense.
> >
> > In this case, could we not remove this part and drop root at
> the other location
> > to gain the jail benefit ?
> >
> >
> > //Logan
> > C-x-C-c
> >
> > On Tue, Jul 20, 2010 at 10:24 AM, dormando  wrote:
> >   You don't need to run memcached as root to do that, you need to
> *s

Re: stats help

2010-07-26 Thread Adam Lee
I'm not in love with a lot of its inner workings, but Cacti is built on
rrdtool and very capable in terms of memcached graphing.  It has plugins
that let you generate graphs for things like Hits/Misses, Bytes Used, Sets,
Gets, Network Traffic, Items Cached, etc...

http://dealnews.com/developers/cacti/memcached.html
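And if all you want is a quick requests-per-second number without a full
graphing stack, the "collect counters, wait a bit, do your own math" advice
quoted below is only a few lines of code; a sketch (host, port and the
10-second window are assumptions):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

public class GetRate {
    // Fetch the cmd_get counter from one "stats" call.
    static long cmdGet(String host, int port) throws Exception {
        Socket s = new Socket(host, port);
        try {
            s.getOutputStream().write("stats\r\n".getBytes("US-ASCII"));
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream(), "US-ASCII"));
            long v = -1;
            String line;
            while ((line = in.readLine()) != null && !line.equals("END")) {
                if (line.startsWith("STAT cmd_get ")) {
                    v = Long.parseLong(line.substring("STAT cmd_get ".length()));
                }
            }
            return v;
        } finally {
            s.close();
        }
    }

    public static void main(String[] args) throws Exception {
        long first = cmdGet("localhost", 11211);
        Thread.sleep(10000);  // sample window
        long second = cmdGet("localhost", 11211);
        System.out.println("gets/sec: " + (second - first) / 10.0);
    }
}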

On Sun, Jul 25, 2010 at 12:53 PM, Les Mikesell wrote:

> I know rrdtool (and the jrobin equivalent in java) can do it, but that's a
> fairly low level tool.  I was hoping to find some generic framework that
> could accept either counter or gauge type values and do the rest for you
> including a web graph display.  I'd think this would be a common problem but
> I haven't found any high-level tools that aren't married to snmp for the
> input.
>
>  -Les
>
>
>
> Gavin M. Roy wrote:
>
>> I use RRDTool for this with derive counter types.  Collect the data you
>> want from the stats command and use rrdtool to store the data every minute,
>> graph it out with rrdtool graph and you'll get your trended stats.
>>
>>
>> hi, i am newbie to memcached. I need help in finding how to get
>> throughput stat.
>>
>> I want to see how much throughput memcache is getting. "stats" command
>> does not list any stat for throughput (requests per sec). Any idea on
>> how to go about getting that info? Does memcache keep track of this
>> information?
>>
>>
>> It's rare to keep derived stats like that in general.  It's usually
>> not interesting.  Do you want average requests per second over the
>> lifetime of the process?  Over the last second?  60 seconds?  300, 900,
>> 3600, etc...?
>>
>> Most of the time, this is easily observable from the outside.
>> Collect counters -- wait a bit, collect them again, then do your own
>> math.  That'll give you exactly what you want.
>>
>>
>>It's a bit off topic for this list, but does anyone know if there
>>are good generic tools for that?  There are quite a few designed to
>>convert SNMP 'COUNTER' types to rates, check thresholds and keep
>>history to graph the trends, but usually the SNMP sampling is
>>closely coupled to the rest of the logic.  I think OpenNMS might do
>>it with values it can pick up with http requests but I'm not sure
>>how well it handles the spikes that would appear from restarts and
>>value rollovers.
>>
>>--  Les Mikesell
>>  lesmikes...@gmail.com 
>>
>>
>>
>


-- 
awl


Re: REST API

2010-07-28 Thread Adam Lee
memcached's protocol is, as has been pointed out, already language agnostic
and much more efficient than trying to do HTTP.  If you're saying RESTful in
the "not necessarily HTTP" sense, though, then I'd say that memcached's text
protocol is basically already as RESTful as you're going to get- think of
commands as verbs ('get,' 'set,' 'add,' 'delete,' etc...) and the key as a
URI, and you're in an analogous situation that I think meets the criteria
about as well as a cache can (it's hard to have a stateless cache)...
http://en.wikipedia.org/wiki/Representational_State_Transfer#Constraints

If you want a key-value datastore with an HTTP interface, though, might I
recommend Tokyo Tyrant?  It speaks memcached and its own binary protocol as
well: http://1978th.net/tokyotyrant/spex.html#protocol

On Wed, Jul 28, 2010 at 12:03 PM, Les Mikesell wrote:

> On 7/28/2010 10:16 AM, jsm wrote:
>
>> Gavin,
>> You are right about the overhead and also saw that API's exist for
>> most of the languages as well.
>> I thought REST API would make memcached language agnostic.
>> I would like to hear from the community if the REST API should be
>> pursued or not?
>>
>
> I'm not quite sure how a rest api could deal with the distributed servers
> without a special client anyway.  But, it might be handy to have a web
> service that mapped a rest api as directly as possible to memcache
> operations where the http side would use traditional load balance/fail over
> methods and handle the http 1.1 connection caching.  I'm sure there would be
> places where this could be used by components that have/want data in a cache
> shared by more efficient clients.
>
> --
>  Les Mikesell
>   lesmikes...@gmail.com
>



-- 
awl


Re: REST API

2010-07-28 Thread Adam Lee
That seems an odd case to me, to be honest.  One of the key benefits of
memcached is its ultra-low latency, which is negated somewhat by using a
much heavier protocol.  Also, writing a simple client library for the text
protocol is seriously achievable in an afternoon.

Anyway, there's nothing to say that you can't change your "hashing" strategy
to work with HTTP/URIs.  Just hash the URI or use characters from it to
select a server, for example...  As long as all clients agree, it doesn't
matter how you shard the data.

Again, I'd say you should take a look at TokyoTyrant for a fast, simple
key-value store that speaks both memcached and HTTP.

On Wed, Jul 28, 2010 at 2:11 PM, Les Mikesell  wrote:

> There's no argument that embedding a locally-configured memcache client
> library for the appropriate language into your program would be more
> efficient, but consider the case where you have many programs in many
> languages sharing the cache data and some of them have inherent http
> capability and aren't used enough to care about that last 10% efficiency
> when it means rewriting a bunch of code with new libraries to get it.
> However, I still think the http interface would have to be a separate
> standalone piece, sitting over a stock client that knows about the local
> distributed servers or you'd need a special client library anyway.
>
>  -Les
>
>
>
> On 7/28/2010 12:53 PM, Adam Lee wrote:
>
>> memcached's protocol is, as has been pointed out, already language
>> agnostic and much more efficient than trying to do HTTP.  If you're
>> saying RESTful in the "not necessarily HTTP" sense, though, then I'd say
>> that memcached's text protocol is basically already as RESTful as you're
>> going to get- think of commands as verbs ('get,' 'set,' 'add,' 'delete,'
>> etc...) and the key as a URI and you're basically in an analogous
>> situation that I think basically meets the criteria as much as you can
>> (hard to have a stateless cache)...
>> http://en.wikipedia.org/wiki/Representational_State_Transfer#Constraints
>>
>> If you want a key-value datastore with an HTTP interface, though, might
>> I recommend Tokyo Tyrant?  It speaks memcached and its own binary
>> protocol as well: http://1978th.net/tokyotyrant/spex.html#protocol
>>
>> On Wed, Jul 28, 2010 at 12:03 PM, Les Mikesell <lesmikes...@gmail.com> wrote:
>>
>>On 7/28/2010 10:16 AM, jsm wrote:
>>
>>Gavin,
>>You are right about the overhead and also saw that API's exist for
>>most of the languages as well.
>>I thought REST API would make memcached language agnostic.
>>I would like to hear from the community if the REST API should be
>>pursued or not?
>>
>>
>>I'm not quite sure how a rest api could deal with the distributed
>>servers without a special client anyway.  But, it might be handy to
>>have a web service that mapped a rest api as directly as possible to
>>memcache operations where the http side would use traditional load
>>balance/fail over methods and handle the http 1.1 connection
>>caching.  I'm sure there would be places where this could be used by
>>components that have/want data in a cache shared by more efficient
>>clients.
>>
>>--
>>  Les Mikesell
>>lesmikes...@gmail.com
>>
>>
>>
>>
>> --
>> awl
>>
>
>


-- 
awl


Re: Determining what node a key is located on for testing

2010-08-02 Thread Adam Lee
Take a look at SpyNodeLocator, it has the info that you're looking for.

On Mon, Aug 2, 2010 at 6:21 PM, ehalpern  wrote:

> I'm trying to test some failure scenarios with our application which
> uses a memcached cluster for caching dirty data.  To do this, I need a
> programmatic way to determine which memcached node is responsible for
> storing a particular key.  Does anyone know of a way to do this using
> the spymemcached client?
>
> Thanks in advance
>



-- 
awl


Re: Determining what node a key is located on for testing

2010-08-02 Thread Adam Lee
D'oh, just remembered that SpyNodeLocator is something we developed
in-house.   Please disregard...

Depending on which hashing strategy you're using, one of the NodeLocators
will be used.  If you're using Ketama hashing, then take a look at
net.spy.memcached.KetamaNodeLocator.
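
Something along these lines should do it (an untested sketch against the
spymemcached API; the hostnames and key are made up):

import java.io.IOException;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;
import net.spy.memcached.MemcachedNode;
import net.spy.memcached.NodeLocator;

public class WhereIsMyKey
{
    public static void main(String[] args) throws IOException
    {
        MemcachedClient mc = new MemcachedClient(
                AddrUtil.getAddresses("host1:11211 host2:11211"));

        // the client's locator knows which node each key hashes to
        NodeLocator locator = mc.getNodeLocator();
        MemcachedNode primary = locator.getPrimary("some:key");
        System.out.println("some:key -> " + primary.getSocketAddress());

        mc.shutdown();
    }
}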

On Mon, Aug 2, 2010 at 7:20 PM, Adam Lee  wrote:

> Take a look at SpyNodeLocator, it has the info that you're looking for.
>
>
> On Mon, Aug 2, 2010 at 6:21 PM, ehalpern  wrote:
>
>> I'm trying to test some failure scenarios with our application which
>> uses a memcached cluster for caching dirty data.  To do this, I need a
>> programmatic way to determine which memcached node is responsible for
>> storing a particular key.  Does anyone know of a way to do this using
>> the spymemcached client?
>>
>> Thanks in advance
>>
>
>
>
> --
> awl
>



-- 
awl


Re: Don't quite understand why expiration is necessary

2010-08-03 Thread Adam Lee
The main case I find expiration useful:

You have a database that contains very important data for a popular website.
 Since this data is so important and this website is so popular, the
database obviously can't service every request, so you compute this data and
give it an expiration that's a balance between "how much can my database
handle" and "just how stale can data on the website be before users start to
notice."

Another good case:  You have a counter on your website that tracks the
number of times a user performs a certain action in a day.  You set this
item's expiration for end of day and it automatically goes away when it's
time to start a new counter the next day...
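
For instance, with spymemcached it might look something like this (a sketch;
the key naming and client handle are placeholders):

import java.util.Calendar;

import net.spy.memcached.MemcachedClient;

public class DailyCounter
{
    // bump a per-user daily counter that expires itself at midnight
    public static long bump(MemcachedClient mc, String userId)
    {
        Calendar midnight = Calendar.getInstance();
        midnight.add(Calendar.DAY_OF_YEAR, 1);
        midnight.set(Calendar.HOUR_OF_DAY, 0);
        midnight.set(Calendar.MINUTE, 0);
        midnight.set(Calendar.SECOND, 0);
        int ttl = (int) ((midnight.getTimeInMillis()
                - System.currentTimeMillis()) / 1000);

        String key = "actions:" + userId;

        // add() is a no-op if the counter already exists; store "0" as a
        // string so the server can incr it
        mc.add(key, ttl, "0");
        return mc.incr(key, 1);
    }
}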

On Tue, Aug 3, 2010 at 3:40 AM, Dustin  wrote:

>
> On Aug 3, 12:30 am, Peter  wrote:
> > Hi, I am new to memcached. I am wondering why expiration is necessary
> > for the protocol. Why do we want an item to be expired if there is
> > still space to keep it? Otherwise, we can still use some replacement
> > algorithm to replace it or garbage collection algorithm to collect it?
>
>   Sometimes, you don't control your source data production and can't
> perform cache transformations when things change.
>
>  Sometimes, you really want something to go away after a certain
> amount of time because they're no longer relevant and you're best
> suited by recomputing.
>
>  Sometimes, you just don't trust that every event will be processed
> correctly and having some part of your application be using incorrect
> data for 15 minutes or an hour or whatever is an acceptable worst
> case.
>
>  Ideally, you're right -- all cache would be invalidated exactly when
> things change and nobody should ever use an expiration date on their
> caches.  I have an application that does this, for everything else,
> there's TTL.
>



-- 
awl


Re: Which key is chosen when key duplicates exist in a cluster?

2010-09-14 Thread Adam Lee
I think you need to read up a little bit on memcached's entire strategy on
this:

http://code.google.com/p/memcached/wiki/FAQ#Cluster_Architecture_Questions

On Tue, Sep 14, 2010 at 7:03 PM, Granit  wrote:

> Suppose one of the memcached machines in a cluster looses connection,
> when asking for a key it is non existent on any other machine in the
> cluster so
> the client decides to create a new one, now the disconnected machine
> is back.
> Thus we have two machines with the same key, which key->value pair
> would we get
> when requesting this duplicated key?
>
> - the one created the last?
> - which ever server answers first?
>
> thanks
> Granit
>



-- 
awl


Re: Which key is chosen when key duplicates exist in a cluster?

2010-09-14 Thread Adam Lee
At any given moment, your client thinks key k exists on the server at index
n in the server list, where n = hash(k) mod (number of servers).

99.9% of the time this list doesn't change unless you're doing a really
bad job of running your memcached servers or your network, so you don't need
to worry too much about it.  Plus you can only really get inconsistent
results if the server your key hashes to goes down, the value changes and
gets written to the new server, and then the original server comes back
online.  If you really do care about not getting inconsistent results,
though, you should just turn off failover.
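
How you turn failover off depends on your client, of course. With
spymemcached, for instance, it's roughly along these lines (a sketch from
memory-- check the docs for the version you're running):

import net.spy.memcached.AddrUtil;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.FailureMode;
import net.spy.memcached.MemcachedClient;

public class NoFailoverClient
{
    public static void main(String[] args) throws Exception
    {
        // FailureMode.Retry keeps a key pinned to its original node instead
        // of redistributing its operations to the surviving servers
        MemcachedClient mc = new MemcachedClient(
                new ConnectionFactoryBuilder()
                        .setFailureMode(FailureMode.Retry)
                        .build(),
                AddrUtil.getAddresses("host1:11211 host2:11211"));

        mc.shutdown();
    }
}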

On Tue, Sep 14, 2010 at 7:25 PM, Granit  wrote:

> or is it, if a server disconnects which a client was working with,
> this particular client simply couldn't work with any server at all and
> would have to wait for reappearance? at which state it would simply
> receive possibly stale data.
>
> On Sep 15, 12:22 am, Granit  wrote:
> > Oh yeah,
> >
> > I think my question would be answered in the not yet written:
> >
> > > getting stale entries when a memcached server flaps in and out of the
> cluster
> >
> > which is why I posted it here...
> >
> > many thanks in advance
> >
> > On Sep 15, 12:17 am, Granit  wrote:
> >
> > > Thank you for the quick reply,
> > > I did read that> It is able to use the same hashing process to figure
> out key "foo" is on server B. It then directly requests key "foo" and gets
> back "barbaz".
> >
> > > So what happens when key "foo" exists on server A and B? I mean by
> > > accident not deliberately. I am aware that duplications are not
> > > supposed to happen.
> >
> > > On Sep 15, 12:09 am, Adam Lee  wrote:
> >
> > > > I think you need to read up a little bit on memcached's entire
> strategy on
> > > > this:
> >
> > > >
> http://code.google.com/p/memcached/wiki/FAQ#Cluster_Architecture_Ques...
> >
> > > > On Tue, Sep 14, 2010 at 7:03 PM, Granit  wrote:
> > > > > Suppose one of the memcached machines in a cluster looses
> connection,
> > > > > when asking for a key it is non existent on any other machine in
> the
> > > > > cluster so
> > > > > the client decides to create a new one, now the disconnected
> machine
> > > > > is back.
> > > > > Thus we have two machines with the same key, which key->value pair
> > > > > would we get
> > > > > when requesting this duplicated key?
> >
> > > > > - the one created the last?
> > > > > - which ever server answers first?
> >
> > > > > thanks
> > > > > Granit
> >
> > > > --
> > > > awl
> >
> >
>



-- 
awl


Re: Repopulating cache after cache miss.

2010-09-27 Thread Adam Lee
Yeah, I say either go with Gearman or else have backend processes that
generate the data and write it to the cache instead of generating it within
the context of a client request.

On Sun, Sep 26, 2010 at 12:44 PM, Brian Moon  wrote:

> My concern is that the client may make multiple requests for additional
>>> parts at the same time triggering multiple (duplicate) re-fetches and saves.
>>>  Anyone have a similar situation?  Would you recommend
>>> use an atomic "add" with a short timeout as a lock?
>>>
>>
>> It's discussed on the wiki a bit. There's the "Ghetto lock" with add, but
>> also things like gearman which can do request coalescing.
>>
>
> Gearman++
>
> --
>
> Brian.
> 
> http://brian.moonspot.net/
>



-- 
awl


Re: Large dataset in memory

2010-09-29 Thread Adam Lee
This is, essentially, precisely what memcached is.  You can view memcached
as one large, shared map that should appear identical to all clients as long
as they are configured the same.  It isn't "one object," but rather a
distributed cache shared equally amongst all the servers running the daemon,
but from the point of view of the clients/code, it basically looks like one
large map.

On Wed, Sep 29, 2010 at 12:18 PM, parsa  wrote:

> Hey fellas,
>
> I have a large key-value map that I want to serve in a web service
> application. I want to keep a single instant of this map inside the
> memory (around 600mb footprint) and let every request that is made to
> the service use the very same object. I'm new to memcached and to be
> honest, caching in general. So is it better to keep the object in the
> memory as a whole or to add key-values to the cache separately? (btw
> I'm using Scala on Lift)
>



-- 
awl


Re: Large dataset in memory

2010-09-29 Thread Adam Lee
Now that I think about it, though, it sounds like you don't actually want a
cache.  Memcached is truly a cache, and is not guaranteed to keep your
values around.

Perhaps you want something more like TokyoTyrant or Redis.  We (fotolog.com)
recently open-sourced our Scala client for Redis.  You can take a look at
Redis at http://code.google.com/p/redis/ and our Scala client at
http://github.com/andreyk0/redis-client-scala-netty

Redis is a key-value
store, rather than a cache, and it tries to be more ACID-like...

On Wed, Sep 29, 2010 at 12:18 PM, parsa  wrote:

> Hey fellas,
>
> I have a large key-value map that I want to serve in a web service
> application. I want to keep a single instant of this map inside the
> memory (around 600mb footprint) and let every request that is made to
> the service use the very same object. I'm new to memcached and to be
> honest, caching in general. So is it better to keep the object in the
> memory as a whole or to add key-values to the cache separately? (btw
> I'm using Scala on Lift)
>



-- 
awl


Re: Large dataset in memory

2010-10-01 Thread Adam Lee
On Thursday, September 30, 2010, parsa  wrote:
> Each request only needs parts of the map, not all of it. But as the
> number of simultaneous requests grows to somewhere near 500, there's a
> chance of using 90% of the map.
> It doesn't change in run-time. It changes on a schedule once in a
> month.
>
> I think caching is not the way to go for me. I've looked into key-
> value databases but the problem is the algorithm that's triggered with
> each request (think of some searching) requires a specific type of
> data which is a Trie or prefix tree. Currently, I generate the map
> once in a singleton object inside the servlet container and give
> references to it for each request and it works. But what I'm saying
> is, maybe it's better to hold the data as a normal key-value map, then
> when each request arrives, generate a Trie out of it and run the
> algorithm with that Trie. (some sort of lazy loading)

Generate each Trie, serialize it as a byte array and store it in the
key-value store. This eliminates any need to do computation at runtime
and any data duplication while still meeting all of your concurrency
and performance needs. Redis can do this quite easily and it should be
very fast and simple to work with- if you choose to go down this route
and need any assistance, let me know!
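
A minimal sketch of the generate-once step, assuming your Trie class
implements java.io.Serializable (the resulting bytes are what you'd hand to
your kv client's set call, and what you'd feed back through deserialize on
read):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TrieCodec
{
    // run once a month at generation time
    public static byte[] serialize(Serializable trie) throws IOException
    {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(trie);
        oos.close();
        return bos.toByteArray();
    }

    // run on read (and ideally cached in-process after the first hit)
    public static Object deserialize(byte[] bytes)
            throws IOException, ClassNotFoundException
    {
        ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes));
        try {
            return ois.readObject();
        } finally {
            ois.close();
        }
    }
}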

Good luck

-- 
awl


Re: Is memcache add() atomic on a multithreaded memcached?

2010-10-13 Thread Adam Lee
Yeah, we also have used this as a sort of crude locking mechanism on a site
under fairly heavy load and have never seen any sort of inconsistency-- as
dormando said, I'd make sure your configuration is correct.  Debug and make
sure that they're both indeed setting it on the same server.  Or, if that's
not possible, whip up a small script that iterates through all of your
servers and see if the key exists on multiple servers.

On Wed, Oct 13, 2010 at 1:47 PM, dormando  wrote:

> > Hi everyone,
> >
> > we have the following situation: due to massive simultaneous inserts
> > in mysql on possibly identical primary keys, we use the atomic
> > memcache add() as a semaphore. In a few cases we observed the
> > behaviour, that two simultaneous add() using the same key from
> > different clients both returned true (due to consistent hashing the
> > key has to be on the same machine).
> >
> > Is it now possible, that the multithreaded memcached does return true
> > on two concurrent add() on the same key, if the requests are handled
> > by two different threads on the same machine?
>
> It should not be possible, no. Be sure you've disabled the client
> "failover" code.
>



-- 
awl


Re: Is memcache add() atomic on a multithreaded memcached?

2010-10-15 Thread Adam Lee
Is it ever possible that your compute takes longer than your timeout?

On Fri, Oct 15, 2010 at 5:45 AM, Tobias  wrote:

> > Can you give more info about exactly what the app is doing?
>
> Something like this:
>
> value = memcache.get("record" + x)
>
> if (false == value && cache.add("lock" + x, "1", 60)) {
>
>   compute (expensive) record
>   insert record with Primary key x Into DB
>   memcache.set("record" + x, record);
>   memcache.delete("lock" + x);
>
> } else {
>  // someone else is doing the expensive stuff
> }
>
> In a very few cases (<20 of 3 Million) we observed a "Duplicate entry"
> Mysql-Error.
>
>
>
>


-- 
awl


Re: Matrix in memcache

2010-10-29 Thread Adam Lee
Depending on the access patterns and such, it might also be worth
looking into a persistent kv store like tokyotyrant, redis, etc...

On Friday, October 29, 2010, Sreejith S  wrote:
> Thank u..
> I am planning to use MongoDb to store the matrix entries and for processing 
> take the entries from mongo DB to memcache...
> Yea...i should increase my cache..cos its heavy data.
>
> Sreejith
>
>
> On Fri, Oct 29, 2010 at 6:30 AM, PlumbersStock.com 
>  wrote:
>
> You can store them but it shouldn't be your only copy. I frequently
> cache documents in memcache so I don't have to pull them off the
> filesystem as often but I do store them in the filesystem. If using a
> lot of memory you may have to increase the size of your cache.
>

-- 
awl


Re: Evictions with Free Space

2010-12-03 Thread Adam Lee
It's not a bug-- in order to speed up memory management, memcached uses
what is known as a slab allocator instead of using standard malloc()/free().

Basically, it only allocates memory in blocks of 1MB (by default), referred
to as a slab.  Slabs belong to a slab class, which is defined by the size of
item that is stored in that slab. For ease of explanation, let's say you
have only two slab classes, a slab class at 1k and the next slab class at 2k
(though in reality the default growth factor in slab classes is 1.25 and
there are a bunch of them).  An item of 800b would get put into a chunk in
the 1k slab class and an item of 1024b, 1.5k, 1.99k or 2k would get put into
a chunk in the 2k slab class.

When an item is stored, first there is a check to see which class it should
get shoved in, based on its size, and then mc will check that class's stack
to see if there are any empty spots available.  If not, mc will attempt to
allocate a new slab for that class and store the item there.

All of this means, obviously, that there is some tradeoff in unused
memory-- not every item is going to be exactly the size defined by the slab
class, or you could have just one item stored in a class for items of size
1k, meaning that 1MB is allocated just to store that 1k item-- but it is
more than made up for by the fact that all of these operations are very fast
and very simple.  It just examines a stack and does some very simple
multiplication and pointer arithmetic to find/store any item, and slabs never
have to be free()ed-- mc just puts that spot back in the empty stack and
it'll eventually get reused by another item looking to go into that slab
class.
There are lots of different configuration options that can change this
behavior, as I've hinted at-- you can change the growth factor, the slab
size, you can ask memcached to allocate all of the memory at startup instead
of lazily allocating slabs only as they're needed, and you can actually even
compile memcached to use malloc() if you want.
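
To make the class-size math concrete, here's a toy version (it assumes a
96-byte minimum chunk and ignores the byte alignment and per-item overhead
the real allocator applies):

import java.util.ArrayList;
import java.util.List;

public class SlabClasses
{
    public static void main(String[] args)
    {
        // build the chunk-size table the way the slab allocator does:
        // start at the minimum and multiply by the growth factor up to 1MB
        double factor = 1.25;
        int size = 96;
        List<Integer> classes = new ArrayList<Integer>();
        while (size <= 1024 * 1024)
        {
            classes.add(size);
            size = (int) (size * factor);
        }

        // an item lands in the smallest class whose chunks can hold it
        int itemSize = 800;
        for (int chunk : classes)
        {
            if (itemSize <= chunk)
            {
                System.out.println(itemSize + "b -> " + chunk + "b chunks");
                break;
            }
        }
    }
}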

Anyway, the slab allocator is worth learning a little bit about if you're
going to be doing anything more than very tangential work to memcached.

Just googled around a bit and found this, which might be a decent place to
start:
http://code.google.com/p/memcached/wiki/MemcachedSlabAllocator#What's_a_clsid?

Hope that helps!

--
awl


Re: key taking only 8 characters

2010-12-14 Thread Adam Lee
Did you write code to speak directly with memcached, or are you using a
client? What does the server return when it fails- a success code,
CLIENT_ERROR, ...?

Perhaps it would help if you posted your test code...

On Tuesday, December 14, 2010, Prashu  wrote:
> Hi,
>   i am new to memcached.
>   i write a sample program to set/get from memcached.
>
>   it is accepting only 8 charcters as key if give more or less than
> the 8 characters it is not storing any values.
>
>  could you help me for this.
>  whether we can configure the  key max or min length?
>
> thanks
> Prashanth Kumar chanda
>

-- 
awl


Re: What will return when multiple incr commands are issued to the same key?

2010-12-21 Thread Adam Lee
incr is atomic and get returns the current value stored. i don't see any
unexpected behavior in your test- if you incr 200 once, it's 201. twice,
it's 202...

awl
On Dec 21, 2010 9:59 PM, "speedfirst"  wrote:
> For example, a key value pair "testkey"="200" has existed in a
> memcached server. When 2 or more write threads issue "incr testkey 1"
> simultaneously with nonblocking manner. And there is a read thread
> only for reading response from memcached. What the response is?
>
> My test show:
> 201CRLF
> 202CRLF
> END
>
> Is this "END" part of correct response? Not found this in the protocol
> text file.
>
> Thans.


Re: Online chat with memcached

2011-01-05 Thread Adam Lee
it could be made to work, but this is nowhere near an ideal design-
memcached wasn't really made to be used like this and you're going to have
to jump through some hoops if you do want to use it like this.

using this design, you're going to have to store the entire list of messages
for a room in one key-value. this means that to add a message, you're going
to have to read the (possibly large) value over the wire, deserialize it,
add the new message, serialize it and send it back over the wire. this is an
order of magnitude more traffic than necessary, in addition to not being
threadsafe.  if you're going to do it like this, at least take advantage of
CAS operations in memcached to make it correct, though this won't do
anything to reduce the workload-- in fact, it will actually make it worse in
high-traffic situations since you'll probably have a fairly large number of
failed-sets/retransmissions when multiple clients are trying to concurrently
modify a room.

presumably, you're going to limit the number of messages for any given room
to some max value, N. given that, you could instead implement a design
wherein you create N slots for each room (room:0, room:1, ... room:N-1) and
maintain a counter, I, that tracks your current index and lets you treat them
like a circular buffer.  to add a message, you simply attempt to update
room:(I mod N) with the message and, if successful, incr I.  this way, every
client can keep track of the last index I' it has seen for each room that it
cares about. if I' == I, there are no new messages; otherwise it only needs
to do a multiget on the keys between I' mod N and I mod N to get all the new
messages.
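
a rough sketch of that scheme with spymemcached (key names are made up,
error handling is omitted, and note that it bumps the counter before the
slot write lands, so a reader can briefly see the index ahead of the data--
one of the hoops i mentioned):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import net.spy.memcached.MemcachedClient;

public class ChatRoom
{
    private static final int N = 100; // max messages retained per room

    private final MemcachedClient mc;

    public ChatRoom(MemcachedClient mc) { this.mc = mc; }

    // reserve the next index atomically, then write the slot it maps to
    public long post(String room, String msg)
    {
        long i = mc.incr("idx:" + room, 1, 1); // starts at 1 if missing
        mc.set(room + ":" + (i % N), 0, msg);
        return i;
    }

    // multiget every slot between the last index we saw and the current one
    public Map<String, Object> newMessages(String room, long lastSeen)
    {
        long i = mc.incr("idx:" + room, 0, 0); // incr by 0 just reads it
        List<String> keys = new ArrayList<String>();
        for (long j = Math.max(lastSeen + 1, i - N + 1); j <= i; j++)
            keys.add(room + ":" + (j % N));
        return mc.getBulk(keys);
    }
}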

that said, this is still not really ideal. i would check out some other
projects like redis (each room as a list. to add a message just do a PUSH &
a TRIM. basically just a formalized version of what i designed above, but
persistent) or kestrel (each room as a queue and to listen to a room you
just create a child queue for each client. kestrel takes care of
persistence, concurrency, etc)

any of these designs should work for you, but i really think the
non-memcached ones are your best bet... why reinvent the wheel when it comes
to persistence, polling, in-memory data structures, concurrency, etc? let
the backend do the heavy lifting and spend your time actually focusing on
the unique logic of your app.

--
awl
On Jan 4, 2011 2:08 PM, "- -"  wrote:


Re: Unique Set as value

2011-01-14 Thread Adam Lee
Honestly, this probably isn't the best response for this list, but use
redis,
that's exactly what it was designed for.  It has native support for basic
data structures, like hashes (associative arrays, dictionaries, whatever you
wanna call em), lists, sets and sorted sets.  Your code would actually end
up being even simpler, since you could store the dictionary directly in
redis.  Not sure about the client situation for python, but it looks like
there are a few of them: http://redis.io/clients
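
For example (Java/Jedis here just because it's what I have handy-- the
python clients expose the same commands):

import redis.clients.jedis.Jedis;

public class LoginTracker
{
    public static void main(String[] args)
    {
        Jedis jedis = new Jedis("localhost");

        // SADD is a no-op if the member is already in the set, so there's
        // no pickle/unpickle read-modify-write cycle at all
        jedis.sadd("logins:10.0.0.1", "alice");
        jedis.sadd("logins:10.0.0.1", "bob");
        jedis.sadd("logins:10.0.0.1", "alice"); // still just {alice, bob}

        System.out.println(jedis.smembers("logins:10.0.0.1"));
    }
}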

On Fri, Jan 14, 2011 at 3:24 PM, Gustav  wrote:

> I'm working on a specific problem where I'd like to keep a unique set
> as the value and was wonder if there is a better way of doing this.
>
> Here is the scenario:
>
> I want to store all accounts that try to login from one IP. So I want
> to keep a unique set of accounts in a key for that IP. I currently use
> Python to store a dictionary in the value for that IP and I have to
> pickle/unpickle everytime to see if I need to insert the account into
> to dictionary. This is terribly slow when there are a ton of them.
>
> Are there any sneaky ways of keeping a unique set of values in a key
> that scales better than my approach?
>
> Thanks.
>



-- 
awl


Re: memcached php doubt

2011-01-20 Thread Adam Lee
have you confirmed that the server is running correctly? try telnetting to
the port and issuing a "stats" command, for example.
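
Something like this (output abridged, and the stat values will obviously
differ on your box):

$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
stats
STAT pid 4123
STAT uptime 86400
STAT curr_connections 10
STAT curr_items 5021
STAT get_hits 987654
STAT get_misses 12345
...
END
quit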

awl
On Jan 20, 2011 11:55 AM, "Shihab KB"  wrote:
> Hi,
>
> I am trying to incorporate caching for my restful web services written
> in php. I am going to use memcache as cache server. I have installed
> the memcache. I followed the page http://shikii.net/blog/installing-memca
> ... windows-7/ for the installation.
>
> After installation I am trying to test my memcache installation is
> successful or not. I write the following code for testing
> 
> $memcache = new Memcache; // instantiating memcache extension
> class
> $memcache->connect("127.0.0.1",11211) or die ("Could not
> connect");
>
> print_r($memcache);
> echo "";
> try {
> $version = $memcache->getVersion();
> echo "Server version: ".$version."\n";
> } catch (Exception $ex) {
> echo $ex->getMessage();
> }
> 
>
> But the getVersion is not returning any value. I think the connection
> is successful but I cannot set/get value to memcache. The result of my
> code is show below
>
> 
> Memcache Object ( [connection] => Resource id #3 )
> Server version:
> 
>
> I am using memcached.exe version = 1.2.6.0 and extension
> (php_memcache.dll) version = 2.2.5.0.
>
> And when I tried with error reporting enabled, I got the following
> error.
>
> 
> error_reporting(-1);
> ini_set('display_errors', true);
> 
>
> 
> Memcache::getversion() [memcache.getversion]: Server 127.0.0.1 (tcp
> 11211) failed with: Failed reading line from stream (0)
> 
>
> regards
> Shihab


Re: Memcached won't increment a numeric value

2011-01-21 Thread Adam Lee
Due to a quirk in memcached, I believe you actually want to store a string
representation in order to use incr/decr.

Try changing the second line to CACHE.set('abc', '123')  and see if that
works.

On Thu, Jan 20, 2011 at 11:47 AM, Josiah Ivey  wrote:

> Using both the Dalli and Memcached-client gems, I am unable to
> increment a numeric value:
>
> ruby-1.9.2-p0 > CACHE = MemCache.new 'localhost:11211'
>  => 
> ruby-1.9.2-p0 > CACHE.set('abc', 123)
>  => "STORED\r\n"
> ruby-1.9.2-p0 > CACHE.get('abc')
>  => 123
> ruby-1.9.2-p0 > CACHE.incr('abc')
> MemCache::MemCacheError: cannot increment or decrement non-numeric
> value
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:926:in `raise_on_error_response!'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:831:in `block in cache_incr'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:865:in `call'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:865:in `with_socket_management'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:827:in `cache_incr'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:342:in `block in incr'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:886:in `with_server'
>from /Users/josiahivey/.rvm/gems/ruby-1.9.2-p0/gems/memcache-
> client-1.8.5/lib/memcache.rb:341:in `incr'
>from (irb):6
>from /Users/josiahivey/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in
> `'
>
> Any ideas?
>



-- 
awl


Re: features - interesting!!

2011-02-01 Thread Adam Lee
there are some excellent solutions out there already. check out, for
example, zookeeper.

awl
On Jan 29, 2011 3:32 PM, "rspadim"  wrote:
> hi guys, there's a async replication project (repcached) that is very
> interesting, could we implement it in main source code? at compile
> time we could select from repcached or memcached
> could we make it sync and/or async?
> http://repcached.sourceforge.net/
>
=
> there's some non volatile solutions too that's very interesting
> (memcachedb), for low memory computers we can use disk
> could we implement it in main source code too?
> http://memcachedb.org/
>
>
=
> another, now !NEW! feature...
>
> i was looking for a *DISTRIBUTED LOCK MANAGER*, but i only found
> kernel linux lock manager, that's based on file system (flock)
> could we implement a lock manager at memcached?
>
> what lock manager do?
> client send: KEY NAME, lock type+client name (KEY VALUE), key timeout,
> wait lock timeout (infinity/seconds)
> (this can be implement in memcached protocol without many
> modifications!!!)
> server side function:
> 1)seek if client can have this lock
> 2)wait lock timeout... (this is a problem since we can have a very big
> wait time...)
> 3) if client disconect exit do while
> 4) yes we have the lock => change key value (give this lock to
> client), exit do
> 5) no we don't have the lock, exit do
> 6) end of do while... return key value: lock type + client name (like
> a get command)
>
> ideas:
> 1)maybe a separated memory size? we can run two separated servers, one
> for keys another for lock function (make command line options: just
> lock system, objects only system or both)
>
> 2)this type of key is diferent from memcached key cache objects,
> that's obvious
>
> but is managed with same functions... (get, list, etc)
> but
> all write/delete functions can't be done, they MUST be done by LOCK
> (the new) function,
> DELETE/UNLOCK function is a LOCK function with lock type=0 (unlock)
> read can be done by get and will return current client lock name and
> lock type (get command)
>
>
>
> *WHY THIS FEATURE?*
> i didn't found a distributed lock manager for user space (not kernel
> space) with easy to implement protocol, and many program languages,
> and a very mature server and protocol.
> =(
>
> but with this feature...
> I DON'T NEED A SAMBA/NFS SERVER FOR NON FILESYSTEM LOCKING! \o/
> I WILL NEVER USE FLOCK() AGAIN!!! \o/ !!!
>
> I JUST NEED:
> MYSQL+MEMCACHED+ (APACHE+CGI/PHP/JAVA/PERL/PYTHON)
> for any cluster solution, no more filesystem!!!
>
> NO MORE FILESYSTEM REPLICATIONS (DRBD, NBD+RAID) FOR MY HIGH
> AVAIBILITY / CLUSTER SOLUTION!
> WE CAN USE REPCACHED (WE NEED A SYNC MODE)
>
> THINK ABOUT IT!!!
> REPLICATION + FLOCK! IT'S A VERY VERY VERY NICE FEATURE!
>
> 
> type of object (1bit) default / lock manager can be putted on key
> options/flags!!!
> inside key value, we can put:
> lock type(3 bits)
> client name (a variable length, many bytes)
>
> http://en.wikipedia.org/wiki/Distributed_lock_manager
> from wikipedia, TYPE OF LOCKS:
> * Null Lock (NL). Indicates interest in the resource, but does not
> prevent other processes from locking it. It has the advantage that the
> resource and its lock value block are preserved, even when no
> processes are locking it.
> * Concurrent Read (CR). Indicates a desire to read (but not
> update) the resource. It allows other processes to read or update the
> resource, but prevents others from having exclusive access to it. This
> is usually employed on high-level resources, in order that more
> restrictive locks can be obtained on subordinate resources.
> * Concurrent Write (CW). Indicates a desire to read and update the
> resource. It also allows other processes to read or update the
> resource, but prevents others from having exclusive access to it. This
> is also usually employed on high-level resources, in order that more
> restrictive locks can be obtained on subordinate resources.
> * Protected Read (PR). This is the traditional share lock, which
> indicates a desire to read the resource but prevents other from
> updating it. Others can however also read the resource.
> * Protected Write (PW). This is the traditional update lock, which
> indicates a desire to read and update the resource and prevents others
> from updating it. Others with Concurrent Read access can however read
> the resource.
> * Exclusive (EX). This is the traditional exclusive lock which
> allows read and update access to the resource, and prevents others
> from having any access to it.
>
> NEW LOCK FUNCTION:
>
> LOCK <key> <lock_type> <client_name> <timeout> <wait_lock_timeout>
>
> key: key name
> =key value
>
> lock_type:
> NL = 0
> CR = 1
> CW = 2
> PR = 3
> PW = 4
> EX = 5
>
> client name: any value
> timeout: any number, 0=infinity
> wait lock timeout: wait lock time, 0=infinity
>
> how loc

Re: features - interesting!!

2011-02-03 Thread Adam Lee
Here's one I hacked together a while back, though, as I said before, I
recommend using something better suited to the job...  BTW, this thing uses
a few of our utility classes, but it should be very simple to drop in
replacements.

public class GlobalLock
{
    public GlobalLock(String lockType, String resourceId)
    {
        if (StringUtils.isBlank(lockType))
            throw new NullPointerException("Empty lock type");

        if (StringUtils.isBlank(resourceId))
            throw new NullPointerException("Empty resource id");

        _globalId = StringUtil.toHexString((lockType + resourceId).getBytes()) + ":GlobalLock";

        while (_value == 0)
            _value = RandomUtils.nextLong();

        _acquired = false;
    }

    public GlobalLock lock()
    {
        if (_acquired)
            return this;

        // Lock duration 20sec
        // tries to acquire lock for 21sec
        // obviously this is a far from perfect hack

        final int LOCK_DURATION_SECS = 20;
        final int SLEEP_TIME_MILLIS = 100;
        final int MAX_TRIES = 210; // max number of attempts to acquire lock

        MemcachedClient mc = FotologMemCache.getFotolog();
        for (int numTries = 0; numTries < MAX_TRIES; numTries++)
        {
            if (_log.isInfoEnabled()) _log.info("locking " + _globalId);
            try
            {
                CASValue mcVal = mc.gets(_globalId);

                if (mcVal == null)
                {
                    _acquired = mc.add(_globalId, LOCK_DURATION_SECS, _value).get();
                }
                else if ( ((Long)mcVal.getValue()).longValue() == 0 )
                {
                    CASResponse casResp = mc.cas(_globalId, mcVal.getCas(), _value);
                    _acquired = (casResp == CASResponse.OK);
                }
                else
                {
                    if (_log.isInfoEnabled()) _log.info("waiting for another process to finish: " + _globalId + ":" + mcVal.getValue());
                }
            }
            catch (Exception e)
            {
                _log.error(e.getMessage());
            }

            if (_acquired)
                return this;

            try { Thread.sleep(SLEEP_TIME_MILLIS); } catch (InterruptedException ie) {/**/} // don't 'busywait'
        }

        throw new GlobalLockException("Unable to lock [" + _globalId + "] after " + MAX_TRIES + " attempts");
    }


    /** Unlocks ALL of the resources locked with lock().
     *  Never fails, so doesn't require additional try/catch if you are
     *  calling it from some other 'finally'
     */
    public void unlock()
    {
        if (_acquired)
        {
            _acquired = false;

            MemcachedClient mc = FotologMemCache.getFotolog();
            CASValue mcVal = mc.gets(_globalId);

            if (mcVal == null)
            {
                // nothing to do, val already expired
                if (_log.isInfoEnabled()) _log.info("already expired: " + _globalId + ":" + _value);
                return;
            }
            else if ( ((Long)mcVal.getValue()).longValue() == _value )
            {
                mc.cas(_globalId, mcVal.getCas(), 0L); // reset but only if it matches our val
            }
            else
            {
                _log.error("failed to unlock: " + _globalId + ":" + _value);
            }
        }
    }


    private static Logger _log = Logger.getLogger(GlobalLock.class);

    private String _globalId;
    private long _value;
    private boolean _acquired;
}

On Tue, Feb 1, 2011 at 1:40 PM, Roberto Spadim wrote:

> LOCK should be something like this:
>
>  // type=0 -> unlock
> // type=1 -> lock
> // client_name must change (use sessionID + username)
> function
> memcache_flock($memcache_obj,$key,$type=0,$client_name='1',$timeout=0){
>$ret=memcache_add($memcache_obj,$key,$client_name,false,$timeout);
>if($ret==true){
>if($type==0)// delete
>memcache_del($memcache_obj,$key);
>return(true);
>}
>$cur_cli=memcache_get($memcache_obj,$key);
>if(is_string($cur_cli) && $cur_cli!=''){ // if ='' no user!
>if($cur_cli !== $client_name){
>// it's not our lock
>if(check_user_online_function()) // http session
> function (if want
> http session integration), for memcached it´s like (true)
>return($cur_cli);   // return current
> lock client_name
>}
>// our lock!
>}else{
>        // replace, autocorrect a wrong usage
>memcache_replace($memcache_ob

Re: features - interesting!!

2011-02-03 Thread Adam Lee
I'm not sure what you mean that it's "client-based."  Sure, the logic is on
the client, as a lot of things are with memcached, but the CAS is enforced
by the server.  Doesn't seem like it's functionally any different than
adding the same function on the server side, since the client would just be
interpreting the new feature in the protocol instead, but functionally it
would work out to be identical, I believe...


On Thu, Feb 3, 2011 at 2:49 PM, Roberto Spadim wrote:

> ok, but it´s client based...
> i want a server based (memcache daemon)
>
> at client side:
> memcached_lock("lock_name",1);
> memcached_lock("lock_name",0);
>
> at server side:
> lock/unlock some variable (maybe a server based flock())
> since we use ram memory we could use ram locks (not filesystem lock)
> with repcache we can replicate this lock on replicas
>
>
> 2011/2/3 Adam Lee :
> > Here's one I hacked together a while back, though, as I said before, I
> > recommend using something better suited to the job...  BTW, this thing
> uses
> > a few of our utility classes, but it should be very simple to drop in
> > replacements.
> > public class GlobalLock
> > {
> > public GlobalLock(String lockType, String resourceId)
> > {
> > if (StringUtils.isBlank(lockType))
> > throw new NullPointerException("Empty lock type");
> > if (StringUtils.isBlank(resourceId))
> > throw new NullPointerException("Empty resource id");
> > _globalId = StringUtil.toHexString((lockType +
> > resourceId).getBytes()) +":GlobalLock";
> > while(_value == 0)
> > _value = RandomUtils.nextLong();
> > _acquired = false;
> > }
> > public GlobalLock lock()
> > {
> > if (_acquired)
> > return this;
> > // Lock duration 20sec
> > // tries to acquire lock for 21sec
> > // obviously this is a far from perfect hack
> > final int LOCK_DURATION_SECS = 20;
> > final int SLEEP_TIME_MILLIS = 100;
> > final int MAX_TRIES = 210; // max number of attempts to acquire
> lock
> > MemcachedClient mc = FotologMemCache.getFotolog();
> > for(int numTries = 0; numTries < MAX_TRIES; numTries++)
> > {
> > if (_log.isInfoEnabled()) _log.info("locking "+_globalId);
> > try
> > {
> > CASValue mcVal = mc.gets(_globalId);
> > if (mcVal == null)
> > {
> > _acquired = mc.add(_globalId, LOCK_DURATION_SECS,
> > _value).get();
> > }
> > else if ( ((Long)mcVal.getValue()).longValue() == 0 )
> > {
> > CASResponse casResp = mc.cas(_globalId,
> mcVal.getCas(),
> > _value);
> > _acquired = (casResp == CASResponse.OK);
> > }
> > else
> > {
> > if (_log.isInfoEnabled()) _log.info("waiting for
> another
> > process to finish: "+_globalId + ":" + mcVal.getValue());
> > }
> > }
> > catch (Exception e)
> > {
> > _log.error(e.getMessage());
> > }
> > if (_acquired)
> > return this;
> > try { Thread.sleep(SLEEP_TIME_MILLIS); } catch
> > (InterruptedException ie) {/**/} // don't 'busywait'
> > }
> > throw new GlobalLockException("Unable to lock [" + _globalId + "]
> > after " + MAX_TRIES + " attempts");
> > }
> >
> > /** Unlocks ALL of the resources locked with lock().
> >  * Never fails, so doesn't require additional try/catch if you are
> > calling it from some other 'finally'
> >  */
> > public void unlock()
> > {
> > if (_acquired)
> > {
> > _acquired = false;
> > MemcachedClient mc = FotologMemCache.getFotolog();
> > CASValue mcVal = mc.gets(_globalId);
> > if (mcVal == null)
> > {
> > // nothing to do, val already expired
> > if (_log.isInfoEnabled()) _log.info("already expired: "
> +
> > _globalId + ":" + _value);
> > return;
> > }
> 

Re: features - interesting!!

2011-02-03 Thread Adam Lee
awl
On Feb 3, 2011 11:11 PM, "Roberto Spadim"  wrote:
> not identical... if you need less network latency, server side is
> better... no doubt...
> like CAS we could make a LOCK system (just LOCK/UNLOCK, with timeout,
> and just allow change lock/unlock if content of lock = content value
> sent on memcache client)
>
> 2011/2/3 Adam Lee :
>> I'm not sure what you mean that it's "client-based."  Sure, the logic is
on
>> the client, as a lot of things are with memcached, but the CAS is
enforced
>> by the server.  Doesn't seem like it's functionally any different than
>> adding the same function on the server side, since the client would just
be
>> interpreting the new feature in the protocol instead, but functionally it
>> would work out to be identical, I believe...
>>
>> On Thu, Feb 3, 2011 at 2:49 PM, Roberto Spadim 
>> wrote:
>>>
>>> ok, but it´s client based...
>>> i want a server based (memcache daemon)
>>>
>>> at client side:
>>> memcached_lock("lock_name",1);
>>> memcached_lock("lock_name",0);
>>>
>>> at server side:
>>> lock/unlock some variable (maybe a server based flock())
>>> since we use ram memory we could use ram locks (not filesystem lock)
>>> with repcache we can replicate this lock on replicas
>>>
>>>
>>> 2011/2/3 Adam Lee :
>>> > Here's one I hacked together a while back, though, as I said before, I
>>> > recommend using something better suited to the job...  BTW, this thing
>>> > uses
>>> > a few of our utility classes, but it should be very simple to drop in
>>> > replacements.
>>> > public class GlobalLock
>>> > {
>>> > public GlobalLock(String lockType, String resourceId)
>>> > {
>>> > if (StringUtils.isBlank(lockType))
>>> > throw new NullPointerException("Empty lock type");
>>> > if (StringUtils.isBlank(resourceId))
>>> > throw new NullPointerException("Empty resource id");
>>> > _globalId = StringUtil.toHexString((lockType +
>>> > resourceId).getBytes()) +":GlobalLock";
>>> > while(_value == 0)
>>> > _value = RandomUtils.nextLong();
>>> > _acquired = false;
>>> > }
>>> > public GlobalLock lock()
>>> > {
>>> > if (_acquired)
>>> > return this;
>>> > // Lock duration 20sec
>>> > // tries to acquire lock for 21sec
>>> > // obviously this is a far from perfect hack
>>> > final int LOCK_DURATION_SECS = 20;
>>> > final int SLEEP_TIME_MILLIS = 100;
>>> > final int MAX_TRIES = 210; // max number of attempts to
acquire
>>> > lock
>>> > MemcachedClient mc = FotologMemCache.getFotolog();
>>> > for(int numTries = 0; numTries < MAX_TRIES; numTries++)
>>> > {
>>> > if (_log.isInfoEnabled()) _log.info("locking "+_globalId);
>>> > try
>>> > {
>>> > CASValue mcVal = mc.gets(_globalId);
>>> > if (mcVal == null)
>>> > {
>>> > _acquired = mc.add(_globalId, LOCK_DURATION_SECS,
>>> > _value).get();
>>> > }
>>> > else if ( ((Long)mcVal.getValue()).longValue() == 0 )
>>> > {
>>> > CASResponse casResp = mc.cas(_globalId,
>>> > mcVal.getCas(),
>>> > _value);
>>> > _acquired = (casResp == CASResponse.OK);
>>> > }
>>> > else
>>> > {
>>> > if (_log.isInfoEnabled()) _log.info("waiting for
>>> > another
>>> > process to finish: "+_globalId + ":" + mcVal.getValue());
>>> > }
>>> > }
>>> > catch (Exception e)
>>> > {
>>> > _log.error(e.getMessage());
>>> > }
>>> > if (_acquired)
>>> > return this;
>>> > try { Thread.sleep(SLEEP_TIME_MILLIS); } catch
>>> > (Interrupte

Re: features - interesting!!

2011-02-03 Thread Adam Lee
sure, latency would be lower, but i still believe that they would be
functionally identical.

regardless, i believe that this isn't really something that memcached should
do. it gives you the tools necessary to implement it without adding
functionality tangential to its core purpose.  if you want a more robust
distributed locking mechanism, then you can use a tool that was
purpose-built for this.

i'm a firm believer in keeping systems to their core design, otherwise
memcached will eventually have an email client built into it.

--
awl
On Feb 3, 2011 11:11 PM, "Roberto Spadim"  wrote:


Re: features - interesting!!

2011-02-04 Thread Adam Lee
Yes.  I'm starting to feel like a broken record, but I'd like to reiterate
that you are trying to solve problems using memcached that are better solved
by other, existing products.  If you want a distributed lock, there are many
options-- I use zookeeper for this.  If you want transactional nosql, take a
look at redis.  Etc...  These products are just as easy to use (a lot of
them even speak memcached protocol) and would fit your needs much better.

-- 
awl


Re: features - interesting!!

2011-02-04 Thread Adam Lee
You don't need two libraries.  As I said, a lot of these products already
speak the memcached protocol or, alternatively, something like redis would
provide you all of the functionality that you need from memcached plus the
features that you're requesting (key-value store with replication,
persistence, atomicity/transactions, etc).  There are many things that
memcached is better at than redis, but given the use case that you're
describing, the advantages of memcached versus redis don't really apply to
you, so switching to it wouldn't really hurt you at all and at the same time
would, I believe, solve your problems.

http://redis.io/
On Fri, Feb 4, 2011 at 1:40 PM, Roberto Spadim wrote:

> ehehe i know that there´s many others lock server,
> but i don´t have ROM on my PIC18f4550 to put two libraries =( it would
> be nice if i had a ARM or a x86 :/, but i don´t have :(
> if i use memcache i could do it with my PIC
>
> 2011/2/4 Adam Lee :
> > Yes.  I'm starting to feel like a broken record, but I'd like to
> reiterate
> > that you are trying to solve problems using memcached that are better
> solved
> > by other, existing products.  If you want a distributed lock, there are
> many
> > options-- I use zookeeper for this.  If you want transactional nosql,
> take a
> > look at redis.  Etc...  These products are just as easy to use (a lot of
> > them even speak memcached protocol) and would fit your needs much better.
> > --
> > awl
> >
>
>
>
> --
> Roberto Spadim
> Spadim Technology / SPAEmpresarial
>



-- 
awl


Re: features - interesting!!

2011-02-04 Thread Adam Lee
Do you only need the lock in order to do transaction groups?  If so, redis
supports those out of the box:

http://rediscookbook.org/pipeline_multiple_commands.html

If, though, you're
actually trying to build locks, check out WATCH:

http://redis.io/commands/watch

Anyway, this is all off-topic for the
memcached mailing list, so I'll stop talking about redis now.  Just know
that I truly believe redis will go much further toward solving your problem
set than memcached, without any changes to the server/protocol/etc...

On Fri, Feb 4, 2011 at 2:16 PM, Roberto Spadim wrote:

> i will learn more about redis, i didn´t see a lock server with cache yet
> my today app work today, but it´s not very good/fast (i have a network
> bootneck today)
> i rewrite to redis protocol is very poor :(
> if i can´t do anythink i will try a proxy between memcache and client
> and implement lock and unlock there, but it´s not so good, but solve
> my problem
> maybe in future put in memcache could be nicer =)
> i will study more and tell what i used
> thanks =]
>
> 2011/2/4 Adam Lee :
> > You don't need two libraries.  As I said, a lot of these products already
> > speak the memcached protocol or, alternatively, something like redis
> would
> > provide you all of the functionality that you need from memcached plus
> the
> > features that you're requesting (key-value store with replication,
> > persistence, atomicity/transactions, etc).  There are many things that
> > memcached is better at than redis, but given the use case that you're
> > describing, the advantages of memcached versus redis don't really apply
> to
> > you, so switching to it wouldn't really hurt you at all and at the same
> time
> > would, I believe, solve your problems.
> >
> > http://redis.io/
> > On Fri, Feb 4, 2011 at 1:40 PM, Roberto Spadim 
> > wrote:
> >>
> >> ehehe i know that there´s many others lock server,
> >> but i don´t have ROM on my PIC18f4550 to put two libraries =( it would
> >> be nice if i had a ARM or a x86 :/, but i don´t have :(
> >> if i use memcache i could do it with my PIC
> >>
> >> 2011/2/4 Adam Lee :
> >> > Yes.  I'm starting to feel like a broken record, but I'd like to
> >> > reiterate
> >> > that you are trying to solve problems using memcached that are better
> >> > solved
> >> > by other, existing products.  If you want a distributed lock, there
> are
> >> > many
> >> > options-- I use zookeeper for this.  If you want transactional nosql,
> >> > take a
> >> > look at redis.  Etc...  These products are just as easy to use (a lot
> of
> >> > them even speak memcached protocol) and would fit your needs much
> >> > better.
> >> > --
> >> > awl
> >> >
> >>
> >>
> >>
> >> --
> >> Roberto Spadim
> >> Spadim Technology / SPAEmpresarial
> >
> >
> >
> > --
> > awl
> >
>
>
>
> --
> Roberto Spadim
> Spadim Technology / SPAEmpresarial
>



-- 
awl


Re: evictions and total_items

2011-02-14 Thread Adam Lee
It's very easy for this to happen-- evictions are when you try to add an
item but there is no more room and no expired items can be found to remove
quickly either, so an item that is still valid must be removed.

If, for example, you had a cache that only had room for 5 items and you kept
trying to set 10 items, total_items would only ever be 5, but evictions
would just keep growing.

On Sun, Feb 13, 2011 at 2:36 AM, Arkadiy  wrote:

> Hi all,
>
> What is the exact meaning of these stats?  My server has more
> evictions than total_items.  How can this happen?
>
> Thanks in advance for your help.
>
> Arkadiy
>



-- 
awl


Re: Memcached performance issues

2011-02-22 Thread Adam Lee
I'm a little late to the party, but I've been reading the emails and
following along...

Out of curiosity, what do you mean by this:

 I have multiple servers on the front end that each have 100 connections
> round robining to memcached.


I mean, I think I understand what you mean by this, but it doesn't really
make sense to me-- why does each server need 100 connections to memcached?
 Beyond that, how does each server have 100 connections to memcached?  You
said that you're using the spymemcached client, right?

If you could explain exactly how your setup works and what your actual
intention was with this design, I think it'd help me a lot.  I have quite a
bit of experience tuning spymemcached to do hundreds of thousands of
requests a second, so I'm hoping I can help you out quite a bit once I can
wrap my head around it.

On Sun, Feb 20, 2011 at 12:15 PM, Patrick Santora wrote:

> I am having issues with Memcached at the moment. I have multiple
> servers on the front end that each have 100 connections round robining
> to memcached. I have 2 memcached servers, each with 512MB of ram and
> 20 threads (might be a little high) available to each.
>
> What I am seeing is that when my memcached container hits around 10MB
> of written traffic is starts to bottleneck causing my front end
> systems to slow WAY down. I've turned on verbose debugging and see no
> issues and there are no complaints on the front end stating that the
> connection clients are not able to hit memcached.
>
> Has anyone seen anything like this before?
>
> I would appreciate any feedback that could help out with this.
>
> Thanks
> -Pat
>



-- 
awl


Re: Multiport support

2011-02-28 Thread Adam Lee
On Feb 28, 2011 5:04 PM, "Trond Norbye"  wrote:
> I don't have any huge objections to this patch, but I'd rather change the
logic so instead of specifying the number of ports and have them in a range
I'd prefer if we used multiple -h hostname:port and -U hostname:port. That
would allow you to bind to multiple specific interfaces and multiple port
numbers.. ex: -h *:11211 -h *:11212 -h *:11213 etc

this sounds excellent!

--
awl


Re: Replication ?

2011-03-04 Thread Adam Lee
On Mar 4, 2011 10:38 PM, "dormando"  wrote:
> soo. it's more about matching the tool
> vs your actual needs. most of the problem here has always been separating
> perceieved requirements from actual requirements.

yeah, that's an incredibly important distinction.  i talk to a lot of people
who seem to think that their data is so important, they can't possibly
tolerate even a brief inconsistency. or that just because memcached *could*
lose data means that it will. the truth is, we've been running a large (over
500GB on, at one point, up to 50 servers) installation and we've had very
little data loss. generally, the only times a server went down were when we
intentionally brought it down or the very rare hardware failure.

obviously, it's not a persistent datastore and you need to keep your
permanent data somewhere, but for anything ephemeral or that can be easily
queried or recomputed, memcached is an excellent and fairly reliable choice.

in fact, i would bet there are a lot of situations where a fairly
high-traffic site chooses to store something like session in a slower but
more "reliable" datastore because they "can't afford to lose the data," but
end up with a lower QOS because the datastore can't keep up with the load
and ends up with failed reads and/or writes.

awl


Re: Problems when one of the memcached server down

2011-03-08 Thread Adam Lee
it is, however, possible to support "replication" at the client level, if a
bit out of band for memcached. we (Fotolog), for example, at one point wrote
our own client that would set data in multiple servers and then get from
only one of them. it was a read-heavy environment with a small dataset that
easily fit completely into RAM on each server, so it actually worked. not
saying that it is a good idea, just that it is possible.
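
the shape of that client was roughly this (a from-memory simplification, not
the actual code-- it assumes one spymemcached client per server and skips
all the error handling):

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

import net.spy.memcached.MemcachedClient;

public class ReplicatingCache
{
    private final List<MemcachedClient> clients; // one client per server
    private final AtomicInteger next = new AtomicInteger();

    public ReplicatingCache(List<MemcachedClient> clients)
    {
        this.clients = clients;
    }

    // writes fan out to every server, so each one holds the full dataset
    public void set(String key, int ttl, Object value)
    {
        for (MemcachedClient c : clients)
            c.set(key, ttl, value);
    }

    // reads only need to hit one server; round-robin spreads the load
    public Object get(String key)
    {
        int i = (next.getAndIncrement() & 0x7fffffff) % clients.size();
        return clients.get(i).get(key);
    }
}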

anyway, it has since been replaced in our system with TokyoTyrant, which
solved the problem much better for us... and it (or something similar) very
well might in this case as well.

awl
On Mar 8, 2011 3:01 PM, "Les Mikesell"  wrote:
> On 3/8/2011 1:40 PM, Evil Boy 4 Life wrote:
>> Thanks guys, I turned off the failover in the client...
>> In fact, I didn't know that there are several .net clients and now I
>> get confused, because I don't know which one is better (now I use
>> memcacheddotnet)
>> Do you know if one of the .net clients support replication in any way?
>
> Memcache isn't supposed to do replication - it's a cache that you should
> be able to replenish from the real data source. What happens when one
> server fails depends on the client hash mechanism. Some people want the
> load to be rebalanced evenly over the remaining servers, some don't.
> Either way, the next attempt to use a failed server should detect it is
> down and retrieve the data from the backing DB.
>
> --
> Les Mikesell
> lesmikes...@gmail.com
>
>


Re: It's a long story... (TL;DR?)

2011-03-09 Thread Adam Lee
i started to build something similar, but rather than an in-app queue, it
was external. basically, write your new entries into a fifo or a circular
buffer on some external box(es) and have as many boxes as you need/can
afford watching this and writing all entries to cache.

this frees the app from having to do multiple writes and gives you some
scalability in number of listener/writer instances.

awl
On Mar 9, 2011 3:30 PM, "Nelz"  wrote:
> Hey all,
>
> Here's a long-ish story for you, and at the end I'll ask for some
suggestions.
>
> We have been able to 'swimlane' our traffic into two realms,
> transactional (TX) and read-only (RO). This enables us to have lower
> load on our TX/master db, and have all the RO servers talk to
> replicated slave dbs.
>
> We, of course, have been able to greatly increase our efficiency by
> putting a memcached layer between the dbs and the webheads, which we
> also split into two logical (TX and RO) groups.
>
> All well and good so far...
>
> The interactions in our app mean that a user can make a modification
> in the TX realm, and expect to see the results from the RO realm in
> real-time.
> Because we know there's a likelihood that there's a (now-stale) entry
> in the RO cache, we had the TX realm send invalidate (DELETE) to the
> RO cache. This would have worked, except that we now have a race
> condition between the master->slave replication ('eventual
> consistency', anyone?) and the request coming into the RO realm which
> basically puts back a stale version in the cache.
> To combat this, we decided that the TX realm should, upon update,
> eagerly do puts (SET) on the RO logical cache. This solved our issues
> (for the time being)...
>
> As we scale up, we now want to put additional copies of our RO realm
> in N other (cloud) datacenters. Of course, this only makes the db
> replication lag worse, so now more than ever we need to keep doing the
> eager puts (SET).
>
> But this raises several other issues for us:
> 1) the TX app has to manage not just connections to the RO and TX
> logical caches, it also needs to manage N number of remote caches as
> well.
> 2) cache connectivity between datacenters (via stunnels) is fitful at
best.
>
> #2 lead to a lot of talk of how to handle 'network fault tolerance'.
> I've realized that *most other* people's fault-tolerance is achieved
> by being on an intranet and the clients' ability to handle going from
> N -> (N-1) physical nodes fairly gracefully. This doesn't really apply
> for us, because due to going across the wire via stunnels, our caches
> go from N -> 0 nodes when there are network hiccups, and in these
> cases the clients (spymemcached for us) doesn't handle that nearly as
> gracefully. [NTS: Hrm... Maybe that is a concrete suggestion I can
> make on the spymemcached project page...?]
>
> As a short-term fix, I've made our (now broadcast) SET code be
> asynchronous, using an in-app queue, but this doesn't mitigate problem
> #1, and it increases resource use on our TX webservers.
>
> These problems have led us to talking with the CouchBase folks (Hi
> Chris! Matt, Dustin, I know you watch here too!). The discussion seems
> to have been about using the TAP interface
> (
http://blog.membase.com/northscale-blog/2010/03/want-to-know-what-your-memcached-servers-are-doing-tap-them.html
),
> but that feels a bit after-the-fact to me, as we'd have to listen to
> each of physical memcached nodes in the 'main' DC's RO cache for
> broadcast to multiple logical and remote RO caches.
>
> (There was also talk that using CouchBase in the future. I would love
> it to solve our problems, but I'm not convinced it could guarantee
> beating an eventual-consistency race condition.)
>
> So, I feel like I want something between the web clients and the
> memcached servers. (A proxy server you say? Why, that sounds loverly!)
>
> I looked at Moxi, and while I didn't get too hands-on, it seems like
> isn't a fire-and-forget interaction pattern, meaning we'd still have
> to do async processing while it goes through the cloud if we want to
> keep the TX realm running fast.
>
> Today I found http://code.google.com/p/memagent ... It seems to have
> (at least a 2-logical) broadcast for non-GET operations, which is
> cool. But the docs are shy on details, and it looks like kinda barren.
>
> TL:DR
>
> I guess I'm asking:
> Has anyone used Memagent, and what do they think of it?
> If you took the time to read my long-winded explanation, do you have
> other suggestions for addressing the issues?
>
> Thanks for indulging me.
>
> - Nelz


Re: Implementation of CHECK command for memcached

2011-03-18 Thread Adam Lee
Is it also your intention to have CHECK with an expiration act as a sort of
touch command?

On Fri, Mar 18, 2011 at 1:12 PM, Oleg Romanenko wrote:

> Hi.
>
> >How could this be put to use?  i.e. when is knowing that
> > something exists at some point in an ephemeral store useful to an
> > application?
>
> This command provides support for lazy read operation. You can read
> the value of the key only when is really necessary.
> For example:
> 1. I am storing some information about the client in memcached. The
> key - is the name of the session, which is is given to the client.
> 2. I have a lot of service scripts that should check the rights of
> clients. They need only to know whether there is a key (then
> everything is OK) or not (then the client is redirected to the login
> script).
> 3. And I have only two scripts that use the value of a key in they
> work.
>
> Of course, this problem can be solved by creating an additional (flag)
> key(s) without content. But such approach is less secure and in
> generally reduces the consistency of system's data (for example, key
> with data can be deleted when the flag key still remain available).
>



-- 
awl


Re: list command

2011-03-20 Thread Adam Lee
if you need this functionality, i recommend taking a look at something like
TokyoTyrant/KyotoTycoon or redis. they are built to do pretty much exactly
what you are looking for.

awl
On Mar 20, 2011 11:23 PM, "Roberto Spadim"  wrote:
> if yes... could it work like a noSQL database?
> LIST '%abc' (only keys ending with abc)
> list 'abc%' (only keys beggininig with abc)
> list '%abc%' (only keys with abc)
>
>
> if no... could we implement? no problem if it do 'dirty reads', i
> don't want to lock cache just to list... (we could do 2 commands, one
> with lock, and other without lock)
>
> 2011/3/21 Roberto Spadim :
>> hi guys i was reading
>> http://code.google.com/p/memcached/wiki/NewCommands
>>
>> there's a list command to return all 'keys names' in cache?
>>
>> --
>> Roberto Spadim
>> Spadim Technology / SPAEmpresarial
>>
>
>
>
> --
> Roberto Spadim
> Spadim Technology / SPAEmpresarial


Re: Memcachd with httpd to cache user stickyness to a datacenter

2011-04-11 Thread Adam Lee
also is the 70 percent thing really honestly that huge of a deal?  send all
the traffic from the data center to one instance and the rest to the other-
is not an even split but it's not that far off.

really it seems to me like people are coming up with perfectly valid
solutions to your problem and you're throwing them out without really
considering them. outside of writing it for you, I don't know what more
people can do.

use any sort of nosql solution that has master-master replication (membase,
kyoto, redis, etc), have that sync across data centers, pin users to one
data center per session... it's not rocket science and it's a solved
problem.

awl
On Apr 6, 2011 2:49 PM, "dormando"  wrote:
>
>
> On Wed, 6 Apr 2011, Mohit Anchlia wrote:
>
>> Thanks! These points are on my list but none of them are useful. The
>> reason is I think I mentioned before that most of these servers that
>> are sending requests to us are hosted inside the co. but by different
>> group. So geoReplication will not work in this case since 70% of
>> request comes from one region, infact same data center.
>>
>> Point# 1 mentioned by you is the best option but I am having some
>> challanges there. Problem like I mentioned is that User A -> connects
>> to one of the servers in the pool and that server sends -> http to our
>> server. Now user A can sign out and connect to other server in the
>> pool and then we get the request. Only way we can solve this is by
>> changing the server code, this would be best. However, we are having
>> hard time and I am trying to see if there are other solutions like say
>> a nosql distributed db that keeps track of user session.
>
> I could write that redirector as an apache plugin, or perlbal plugin, or
> varnish plugin. Which seems like the only place you have access to.
>
> You reaally sure geodns won't work? Even though your
> servers are 70% from one datacenter and 30% from another, are they all
> coming from the same exact IP address? You *could* use by-ip granularity
> for the load balancing, which I was sort of hinting at there.
>
> NoSQL isn't magic problem solving, you still have that race condition
> unless your app only makes one request every hour, or you replicate
> synchronously.
>
> Anyway that's the last I'll say on this, I just wanted to be clear :P It
> sorta seems like you just want something prebuilt.


Re: Memcachd with httpd to cache user stickyness to a datacenter

2011-04-11 Thread Adam Lee
I wasn't seeking to belittle your project... you have dismissed several load
balancing solutions simply because 70 percent of your traffic originates in
one location and I was asking why you couldn't, worst case scenario, just
have a 70/30 split in your load balancing if it's really not possible to
further subdivide it.

as long as you can guarantee your users "stick" to one data center longer
than the latency of your replication, any sort of master-master replicating
nosql cache should be able to fill your use case, unless I'm missing
something.

awl
On Apr 11, 2011 1:13 PM, "Mohit Anchlia"  wrote:
> Yes it is a big deal for the business otherwise I wouldn't be posting
> it here asking for suggestions. I respect everyones input and
> thanksful for that, but I need to see if it will work for us too :)
>
> Agreed, it's not a rocket science.
>
> On Mon, Apr 11, 2011 at 9:56 AM, Adam Lee  wrote:
>> also is the 70 percent thing really honestly that huge of a deal?  send
all
>> the traffic from the data center to one instance and the rest to the
other-
>> is not an even split but it's not that far off.
>>
>> really it seems to me like people are coming up with perfectly valid
>> solutions to your problem and you're throwing them out without really
>> considering them. outside of writing it for you, I don't know what more
>> people can do.
>>
>> use any sort of nosql solution that has master-master replication
(membase,
>> kyoto, redis, etc), have that sync across data centers, pin users to one
>> data center per session... it's not rocket science and it's a solved
>> problem.
>>
>> awl
>>
>> On Apr 6, 2011 2:49 PM, "dormando"  wrote:
>>>
>>>
>>> On Wed, 6 Apr 2011, Mohit Anchlia wrote:
>>>
>>>> Thanks! These points are on my list but none of them are useful. The
>>>> reason is I think I mentioned before that most of these servers that
>>>> are sending requests to us are hosted inside the co. but by different
>>>> group. So geoReplication will not work in this case since 70% of
>>>> request comes from one region, infact same data center.
>>>>
>>>> Point# 1 mentioned by you is the best option but I am having some
>>>> challanges there. Problem like I mentioned is that User A -> connects
>>>> to one of the servers in the pool and that server sends -> http to our
>>>> server. Now user A can sign out and connect to other server in the
>>>> pool and then we get the request. Only way we can solve this is by
>>>> changing the server code, this would be best. However, we are having
>>>> hard time and I am trying to see if there are other solutions like say
>>>> a nosql distributed db that keeps track of user session.
>>>
>>> I could write that redirector as an apache plugin, or perlbal plugin, or
>>> varnish plugin. Which seems like the only place you have access to.
>>>
>>> You reaally sure geodns won't work? Even though your
>>> servers are 70% from one datacenter and 30% from another, are they all
>>> coming from the same exact IP address? You *could* use by-ip granularity
>>> for the load balancing, which I was sort of hinting at there.
>>>
>>> NoSQL isn't magic problem solving, you still have that race condition
>>> unless your app only makes one request every hour, or you replicate
>>> synchronously.
>>>
>>> Anyway that's the last I'll say on this, I just wanted to be clear :P It
>>> sorta seems like you just want something prebuilt.
>>


Re: What's new in memcached (part 2)

2011-04-11 Thread Adam Lee
is there somewhere i can copy edit this document?

a bit nitpicky, i know, but i found a few mistakes just while browsing it...
section 2.1 both "suites" should be "suits," section 3.4 "it's" should be
"its," etc.

awl
On Apr 11, 2011 3:05 PM, "Trond Norbye"  wrote:
> What's new in memcached
> ===
>
> (part two - new feature proposals)
>
> Table of Contents
> =
> 1 Protocol
> 1.1 Virtual buckets!
> 1.2 TAP
> 1.3 New commands
> 1.3.1 VERBOSITY
> 1.3.2 TOUCH, GAT and GATQ
> 1.3.3 SET_VBUCKET, GET_VBUCKET, DEL_VBUCKET
> 1.3.4 TAP_CONNECT
> 1.3.5 TAP_MUTATION, TAP_DELETE, TAP_FLUSH
> 1.3.6 TAP_OPAQUE
> 1.3.7 TAP_VBUCKET_SET
> 1.3.8 TAP_CHECKPOINT_START and TAP_CHECKPOINT_END
> 2 Modularity
> 2.1 Engines
> 2.2 Extensions
> 2.2.1 Logger
> 2.2.2 Daemon
> 2.2.3 ASCII commands
> 3 New stats
> 3.1 Stats returned by the default stats command
> 3.1.1 libevent
> 3.1.2 rejected_conns
> 3.1.3 stats related to TAP
> 3.2 topkeys
> 3.3 aggregate
> 3.4 settings
> 3.4.1 extension
> 3.4.2 topkeys
>
>
> 1 Protocol
> ~~~
>
> Intentionally, there is no significant difference in protocol over
> 1.4.x. There is one minor change, but it should be transparent to
> most users.
>
> 1.1 Virtual buckets!
> =
>
> We don't know who originally came up with the idea, but we've heard
> rumors that it might be Anatoly Vorobey or Brad Fitzpatrick. In lieu
> of a full explanation on this, the concept is that instead of mapping
> each key to a server we map it to a virtual bucket. These virtual
> buckets are then distributed across all of the servers. To ease the
> introduction of this we've assigned the two reserved bytes in the
> binary protocol for specifying the vbucket id, which allowed us to
> avoid protocol extensions.
>
> Note that this change should allow for complete compatibility if the
> clients and the server are not aware of vbuckets. These should have
> been set to 0 according to the original binary protocol specification,
> which means that they will always use vbucket 0.
>
> The idea is that we can move these vbuckets between servers such that
> you can "grow" or "shrink" your cluster without losing data in your
> cache. The classic memcached caching engine does _not_ implement
> support for multiple vbuckets right now, but it is on the roadmap to
> create a version of the engine in memcached to support this (it is a
> question of memory efficiency, and there are currently not many
> clients that support them).
>
> Defining this now will allow us to start moving down the path to
> vbuckets in the default_engine and allow other engine implementors to
> consider vbuckets in their design.
>
> You can read more about the mechanics of it here:
> [http://dustin.github.com/2010/06/29/memcached-vbuckets.html]
>
> However, you _cannot_ use a mix of clients that are vbucket aware and
> clients who don't use vbuckets, but then again it doesn't make sense
> to use a vbucket aware backend if your clients don't know how to
> access them. This is why we believe a protocol change isn't
> warranted.
>
> Defining this now will allow us to start moving down the path to
> vbuckets in the default_engine and allow other engine implementors to
> consider vbuckets in their design.
>
> 1.2 TAP
> 
>
> In order to facilitate vbucket transfers, among other use cases where
> people want to see what's inside the server, we added to the binary
> protocol a set of commands collectively called TAP. The intention is
> to allow "clients" to receive a stream of notifications whenever data
> change in the server. It is solely up to the backing store to
> implement this, so it can make decisions about what resources are used
> to implement TAP. This functionality is commonly needed enough though
> that the core is aware of it, leaving specific implementation to
> engines.
>
> 1.3 New commands
> =
>
> There are a few new commands available. The following sections
> provides a brief description of them. Please check protocol_binary.h
> for the implementation details.
>
> 1.3.1 VERBOSITY
> 
>
> We did not have an equivalent of the verbosity command in the textual
> protocol. This command allows the user to change the verbosity level
> on your running server by using the binary protocol. Why do we need
> this? There is a command line option you may use to disable the ascii
> protocol, so we need this command in order to change the logging level
> in those configurations.
>
> 1.3.2 TOUCH, GAT and GATQ
> --
>
> One of the problems with the existing commands in memcached is that
> you couldn't tell the memcached server that the object is still valid
> and we just want a longer expiration. Normally you want to put an
> expiry time on the objects, so that you can get an indication if your
> cache is big enough (by watching the eviction stats.. if your
> memcached server has a high eviction rate your cache isn't big enough
> for what you want to have 

Re: maximum size of memcached instance

2011-05-07 Thread Adam Lee
memcached can handle that no problem- there are much larger and busier
instances in production out there.

my one question, though, would be why are you caching the xml instead of
parsing it and caching the resulting data?  it seems silly to parse the same
thing multiple times.

awl
On May 7, 2011 3:57 PM, "md"  wrote:
> There are going to be multiple processes that would access this cache.
> It's event driven architecture and cache is referred at various stages
> of processing.
>
> so as i understand from your reply, having a 72 GB cache size is not
> an issue. Hope there are such installations out there. would like to
> operational/performance issues if any.
>
> However keeping objects of more than 1 MB can cause problems.
>
> Any suggestions about client protocol?
> Regards
> Manish
> On May 6, 3:03 am, dormando  wrote:
>> > HI
>> > Can i define memcached instance of 32 GB /64 GB or 96 GB. Typical rac
>> > server has 16 core 96 GB. can i utilize this 96 GB with memcached
>> > cache. I have large objects to cache. value of a key is 1 MB -3MB.
>> > This object is xml data having binary data in it.
>>
>> > Reason for  doing this is that this data is accessed multiple times
>> > during processing. This data is discarded from cache once processing
>> > is over for this data.
>>
>> > Is this right usage of memcached and will memcached scale to meet this
>> > requirement?
>>
>> > Application that access cache is java based. So which is the right
>> > protocol for java client to communicated with memcached server.
>>
>> You can use the -I option to increase the max object size (it defaults to
>> 1mb), but that will reduce the overall memory efficiency. So you still
>> shouldn't set it too high.
>>
>> If your server has 96G of ram, you still need to leave some left over for
>> the OS, memcached's hash table, connection management, buffers, and TCP
>> sockets. So I'd put that closer to 92G or 90G for memcached.
>>
>> It shouldn't be hard to prototype and see if it'd help? It's not clear if
>> you have multiple servers accessed this shared data, or if it's just one
>> process accessing the same information multiple times, etc.
>>
>> -Dormando


Re: Regarding MemCache Speed

2011-05-25 Thread Adam Lee
you said client and server are on the same box, right? you're not hitting
any sort of a wall on the system, are you? (paging, cpu, io, etc...)

awl
On May 25, 2011 6:53 AM, "Ashu gupta"  wrote:
> Hi Dustin,
>
> I have tried the same with the things you have mentioned but still
> getting the same result . Below is the program.
>
> MemcachedClient client = new MemcachedClient(new
> InetSocketAddress(MEMCACHE_SERVER_DOMAIN, 11211));
>
> public Object getKeyValue(String key, int time,MemcachedClient client)
> {
> try {
> Object myObj = null;
>
> long startTime = 0;
> long deliverTime = 0;
> if (client != null) {
> startTime = System.currentTimeMillis();
> myObj = client.get(key);
> deliverTime = System.currentTimeMillis();
> } else {
> s_logger.error("Not able to get client. Check on high
> priority.");
> }
> long diff = deliverTime - startTime;
> s_logger.error("Time to deliver key " + key + " is " +
> diff);
>
> return myObj;
> } catch (Exception e) {
> s_logger.error("Not able to get KeyValue pair for key " +
> key, e);
> }
> return null;
> }
>
> On May 23, 10:47 pm, Dustin  wrote:
>> On May 23, 2:04 am, Ashu gupta  wrote:
>>
>> > client = getFreeClient();
>>
>>   What is getFreeClient()?  That feels like it'd be complicated.  Have
>> you tried this test with just a plain client being reused for every
>> iteration?
>>
>>   (yes, I realize it's not actually being timed itself, but depending
>> on what it does, it could easily have a large effect on the timing)
>>
>> > long startTime = 0;
>> > long deliverTime = 0;
>> > if (client != null) {
>> > startTime = Calendar.getInstance().getTimeInMillis();
>>
>>   Isn't this an exceedingly slow way to call
>> System.currentTimeMillis() ?


Re: a c memcached client

2011-05-29 Thread Adam Lee
i don't think you did understand. not to put words into his mouth, but i
think he was trying to say that when you run into a bug/problem with open
source projects, it's generally better to try to fix the software and
"contribute to the commons," as he said.  that way, users have one really
good project with all of the features they need and very few bugs, coders
are cooperating, there's less fragmentation, etc...

awl
On May 29, 2011 12:30 PM, "tony Cui"  wrote:
> Hi Matt,
> I think it is my fault did not explain my motivation clearly.
> I am not the one who has the power to tear Spymemcached up. Spymemcached
> helps me a lot , I love Spymemcahced. I just want to share some thing
which
> is valuable.
> Thank you for your reply. You must be a loyal user of
> Spymemcached. I understood you completely. Since it was a open-source
> project, I have my right to suggest and improve it.
> One thing is true, I use my client to store 10 keys in
> memcached , and it runs well . For spymecahced it failed.
> "I can say with a bit of experience, dealing with all of the
> possible connection issues takes some effort." For god's sake, as a
will-be
> member of IT , I have to say we were born to solve the problems. I solved
> a problem, I wanted to share with people. I thought it could help someone
> out of trouble.
> Now you are saying a great number of people use Spymemcached
> quite successfully. People have the right to choose what they love, you
can
> not stop them. You never can.
> Thanks Matt, you gave me a idea about my client's future. I
> am looking forward to your reply.
>
> On Sun, May 29, 2011 at 3:12 AM, Matt Ingenthron  wrote:
>
>> Hi Tony,
>>
>>
>> On May 28, 2011, at 4:36 AM, tony Cui wrote:
>>
>> > I wrote a c memcached client. The reason I wrote it is
>> because spymemcached has some problems, say "connection reset by peer".
And
>> the problems has driven crazy, so a idea came up , what about write a
>> client.
>>
>>
>> I'm never one to fault someone for writing more stuff they release for
>> others to use, but I do personally believe it's better to be part of
helping
>> fix software commons. I have to say, "connection reset by peer" sure
sounds
>> more like a network issue or the server shutting the connection down
rather
>> than a broken client.
>>
>> Have you filed any issues against spymemcached? Have you posted to the
>> mailing list?
>>
>> There are a great number of people who use spymemcached quite
successfully,
>> it's probably not necessary to tear it down it just because you decided
to
>> write your own. I can say with a bit of experience, dealing with all of
the
>> possible connection issues takes some effort.
>>
>> Good luck with it,
>>
>> Matt
>>
>
>
>
> --
> Best Regards
> Tony Cui


Re: How to determine if memcache is full

2011-06-09 Thread Adam Lee
your email and your pasted stats seem to greatly disagree...  you're
allocating 96MB for memcached and that's how much it thinks it's stored.  it
also says it has stored 630k items- where are you getting 2,600?

also, as an aside, does your test program only store one size of item?

awl
On Jun 8, 2011 10:51 PM, "PK Hunter"  wrote:
> Actually I joined this google group to ask the same thing, and found this
> thread.
>
> I start memcached on a CentOS 64 bit server with 8 GB of RAM, with the
> following settings:
>
> memcached -d -m 96 -n 10 -c 4096 -f 1.05 -l 127.0.0.1 -p 11211
>
> Yet, the memcached on my server has just about 2,600 keys, and seeing the
> actual number of bytes stores including the characters needed to store
keys
> + their values, it is 632,817. Which I understand is about 0.60 MB, right?

>
> What am I missing?
>
> If I manually try to add keys, and I wrote a PHP program using the
memcache
> library to add 6,000 keys, the keys do NOT get added, and also the
"$status
> ["evictions"]" remains at 0. So I'm not sure why the server is stopping at

> circa 2,600 mark.
>
> Any ideas would be much appreciated!
>
> These are the stats of my server:
>
>
> Memcache Server version: 1.4.5
>
> Process id of this server process 21808
>
> Number of seconds this server has been running 200229
>
> Accumulated user time for this process 48.654603 seconds
>
> Accumulated system time for this process 198.973751 seconds
>
> Total number of items stored by this server ever since it started 637981
>
> Number of open connections 6
>
> Total number of connections opened since the server started running
1505905
>
> Number of connection structures allocated by the server 70
>
> Cumulative number of retrieval requests 2657551
>
> Cumulative number of storage requests 637981
>
> Number of keys that have been requested and found present 2507883 (94.4%)
>
> Number of items that have been requested and not found 149668(5.6%)
>
> Total number of bytes read by this server from network 97.1105 MB
>
> Total number of bytes sent by this server to network 3839.72 MB
>
> Number of bytes this server is allowed to use for storage. 96 MB
>
> Number of valid items removed from cache to free memory for new items. 0
>
>
>


Re: How to determine if memcache is full

2011-06-09 Thread Adam Lee
why do you need to restart it?  you're telling it that it's allowed to use
up to 2G and it never breaks that...  flush_all only flushes the cache, it
doesn't deallocate memory.

awl
On Jun 9, 2011 6:30 AM, "Eduardo Silvestre"  wrote:
> Hello,
>
> every weeks we need restart memcached daemon... I try do flush_all with no
> lucky. Do you know other command to flush ?
>
> nobody 1 0.1 6.8 586716 564768 ? S Jun06 5:45
> /usr/bin/memcached -m 2048 -p 11211 -u nobody -l 192.168.52.52 -c 64000 -M
>
> stats
> STAT pid 1
> STAT uptime 219153
> STAT time 1307613237
> STAT version 1.2.2
> STAT pointer_size 64
> STAT rusage_user 151.757484
> STAT rusage_system 338.165134
> STAT curr_items 447844
> STAT total_items 9176979
> STAT bytes 498694125
> STAT curr_connections 6
> STAT total_connections 9658642
> STAT connection_structures 190
> STAT cmd_get 10255192
> STAT cmd_set 9176979
> STAT get_hits 8671302
> STAT get_misses 1583890
> STAT evictions 0
> STAT bytes_read 11289578042
> STAT bytes_written 10440713334
> STAT limit_maxbytes 2147483648
> STAT threads 1
> END
>
> (graph of my memcached attached)
>
> In last version this problem is fixed?
>
> Best Regards,
>
>
>
> On Thu, Jun 9, 2011 at 3:50 AM, PK Hunter  wrote:
>
>> Actually I joined this google group to ask the same thing, and found this
>> thread.
>>
>> I start memcached on a CentOS 64 bit server with 8 GB of RAM, with the
>> following settings:
>>
>> memcached -d -m 96 -n 10 -c 4096 -f 1.05 -l 127.0.0.1 -p 11211
>>
>> Yet, the memcached on my server has just about 2,600 keys, and seeing the
>> actual number of bytes stores including the characters needed to store
keys
>> + their values, it is 632,817. Which I understand is about 0.60 MB,
right?
>>
>>
>> What am I missing?
>>
>> If I manually try to add keys, and I wrote a PHP program using the
memcache
>> library to add 6,000 keys, the keys do NOT get added, and also the
"$status
>> ["evictions"]" remains at 0. So I'm not sure why the server is stopping
at
>> circa 2,600 mark.
>>
>> Any ideas would be much appreciated!
>>
>> These are the stats of my server:
>>
>>
>> Memcache Server version: 1.4.5
>>
>> Process id of this server process 21808
>>
>> Number of seconds this server has been running 200229
>>
>> Accumulated user time for this process 48.654603 seconds
>>
>> Accumulated system time for this process 198.973751 seconds
>>
>> Total number of items stored by this server ever since it started 637981
>>
>> Number of open connections 6
>>
>> Total number of connections opened since the server started running
>> 1505905
>>
>> Number of connection structures allocated by the server 70
>>
>> Cumulative number of retrieval requests 2657551
>>
>> Cumulative number of storage requests 637981
>>
>> Number of keys that have been requested and found present 2507883 (94.4%)
>>
>> Number of items that have been requested and not found 149668(5.6%)
>>
>> Total number of bytes read by this server from network 97.1105 MB
>>
>> Total number of bytes sent by this server to network 3839.72 MB
>>
>> Number of bytes this server is allowed to use for storage. 96 MB
>>
>> Number of valid items removed from cache to free memory for new items. 0
>>
>>
>>
>>


Re: Log when cache entry is removed to make space

2011-12-30 Thread Adam Lee
No, but evictions are tracked in stats, so if those are going up, then
items are being evicted to make space.

If you need to keep items around, then you need to make special
considerations, like making sure you have plenty of spare memory or having
separate memcache servers for all the expire=0 items, but honestly you
should probably take a look at a persistent key-value store instead, since
that is exactly what they're designed to do.  TokyoTyrant, for example,
does this and even speaks the memcached protocol (not sure if the same is
true for KyotoDystopia), so it can just be dropped in.
On Dec 16, 2011 12:44 PM, "Michael Bennett"  wrote:

> Hi-
>
> Is there a setting where memcached will log when it deletes a cache
> entry to make space for an incoming entry?  I am running with -vv and
> so far have not seen any such logs entries, but maybe I am missing
> it.  Reason I'm asking is I am storing some data with expire = 0, and
> some time later when I ask for that data, its not there.  Only reason
> I can think of is that its being kicked out to make room for other
> things.
>
> Thanks!
>


Re: Determining server load

2009-02-26 Thread Adam Lee
Perhaps you're running out of connections?

On Thu, Feb 26, 2009 at 3:01 PM, Chris Cameron  wrote:

>
> During peak load, PHP sporadically complains that it is unable to
> connect to my memcache server. I've taken a close look, and everything
> appears to be functioning fine, but the connection errors persist.
>
> The memcache server shows a load of 0.5 or so, the traffic leaving the
> server is about 20MB/s, curr_connections is 1357.
>
> My question is, what would an overloaded memcache server look like?
> CPU, RAM and network connection are all underused (from what I can
> see), so why under high load is PHP (occasionally) unable to connect?
>
> People are also reporting "strange errors" during this time as well
> (blank pages, 503's), which I assume is from this memcache issue.
>
>
> Any help would be appreciated.
>
> Chris
>
>
> The exact PHP error is:
>
> [Thu Feb 26 12:57:31 2009] [error] [client xx.xx.xx.xx] PHP Warning:
> Memcache::pconnect() [function.Memcache-pconnect]: Can't connect to
> xx.xx.xx.xx:11211, Unknown error (0) in /path/to/lib/
> lib_memcache.inc.php on line 121, referer: http://xxx/etc.
>



-- 
awl


Re: Question about writing to multiple memcached server nodes...

2009-02-26 Thread Adam Lee

Then I think you're not looking for memcached.

Look into some other distributed key-value datastores.  There is a lot
of really interesting work being done in this area right now.

This might be a good starting point:

http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/

On Fri, Feb 27, 2009 at 12:08 AM, theRat  wrote:
>
> I'm still curious about this though.  I'm considering an application
> that doesn't even USE a database - we'd keep data in it that we
> wouldn't WANT to lose - but it wouldn't be a disaster if we did.
> Without a database to backup if memcached failed, each memcached node
> becomes a single point of failure.  I'm looking to see if I can do
> something a little more robust...  I don't know if this is possible to
> do at the config level - or if it is something that different
> implementations of the client can handle - or if it has to be done by
> my application...
>
> Your thoughts?
>
> -john
>
> On Feb 26, 3:38 pm, Joseph Engo  wrote:
>> You wouldn't want this type of setup.  Memcached is a simple cache, if
>> your request results in a miss you should hit your datastore.  Don't
>> treat it like a database. Always have a fall back plan for something
>> not being present in cache.
>>
>> Add all the memcached nodes to your connection pool and allow the
>> library to distribute your keys.
>>
>> On Feb 26, 2009, at 3:35 PM, theRat wrote:
>>
>>
>>
>> > All,
>>
>> > sorry for the newbie question - I haven't found my answer anywhere
>> > else...
>>
>> > When I write a value to a particular memcached server node A, can I
>> > also have it write to a second (or third) node B - as a backup in case
>> > node A goes down?  It's unclear to me whether this is possible.  If it
>> > IS possible, is this set up through a configuration of the server?
>> > the client?  Is it done "invisibly" or does my application have to
>> > write to multiple nodes?
>>
>> > Many thanks!
>>
>> > -john
>



-- 
awl


spy client under heavy load

2009-03-06 Thread Adam Lee

we recently made the switch from the whalin client to spy and seem to
be running into problems under heavy concurrency/load in our front-end
servers and i was wondering if anybody (dustin, perhaps?) had any
ideas for strategies to deal with it.

the majority of our front-end servers are sun fire t1000s (8 cores, 4
threads per core) running solaris 10, so obviously the spy client
works a lot better for us in the vast majority of cases-- the
synchronized blocks in the whalin connection pool gave us a lot of
contention problems in particular. when the systems get busy, though,
it seems that i/o can't keep up and we start seeing a lot of timeouts,
which in turn has a domino effect and effectively brings down the
entire cluster.  the problem is that the machines aren't even reaching
60% cpu when this happens.

does my diagnosis of the problem seem right and, if so, any ideas for
the best way to deal with this?  obviously adjusting timeouts would
probably only exacerbate the problem, so i toyed with the idea of
having a pool of clients (though i haven't really delved into the code
to see if that's feasible or would help at all) or possibly hacking it
to change how its i/o threads work.  for now, we've just added a few
more machines to this cluster, but it seems like a waste of hardware
when i know that these things can operate above 90% cpu for a
sustained period with no problem.

thanks... any help would be great and let me know if you have any more
questions about specifics.

--
awl


Re: Which java memcached client should I use?

2009-03-06 Thread Adam Lee

I'm a big fan of the spy client and it should work with whatever
server version you want.

On Fri, Mar 6, 2009 at 12:03 PM, theRat  wrote:
>
> All,
>
> anyone out there have an opinion on which java memcached client is
> best?  I'm looking for something that can handle multi-threading on
> the client side, is reliable, and performs well.  Any opinions?  Also,
> is memcached in a position yet where you can mix-and-match memcached
> clients with memcached servers?
>
> Thanks!
>



-- 
awl


Re: Distributed resource locking using memcached

2009-04-08 Thread Adam Lee

We actually use memcached as a lock server for certain very rare,
per-user transactions.  It's less than ideal, but we scrambled to
write it in a pinch when our previous solution had problems and it's
held up thus far.  It's something we've had on the "fix when we have
time" list since, but it does indeed work.

I still wouldn't recommend it, though.  What kind of scale are you
looking for?  Have you considered something like zookeeper, for
example?

On Wed, Apr 8, 2009 at 10:07 AM, Clint Webb  wrote:
> Its possible... but really... its a cache.  Thats what it is designed to do,
> and that is what it is good for.  memcached is not optimized at all to be a
> lock server, why exactly are you wanting to use memcached for this instead
> of an actual lock server?
>
> That being said, there are a number of issues when using memcached as a lock
> server that would make it far less than ideal.
>
> Also a default unlock after 5 minutes rather defeats the purpose of a lock
> server...   I wouldn't recommend this at all.  Far better off with an actual
> lock server that can defect if a client has disconnected.
>
> Short answer:  No, do not use memcached as a lock server.
>
> On Wed, Apr 8, 2009 at 9:44 PM, yatagarasu  wrote:
>>
>> Is it possible to implement distributed resource blocking using
>> memcached ?
>>
>> Is it possible to implement them using following scheme (pseudocode)
>>
>> lock(a)
>> 1. if (add( key=>a, expiration=>5min )) {
>> 2.   if (cas( key=>a, expiration=>5min )) {
>>            // resource locked by us
>>      }
>>      // Someone tried to lock resource the same time. Race condition
>> in add.
>>   }
>>   // resource is already locked
>>
>> unlock(a)
>> 1. delete(a)
>>
>> What do you think?
>> Is race condition in add possible?
>> Will such double check work?
>> Or is there more simple way?
>>
>>
>> PS. Expiration in 5min is used to auto unlock resource in case of
>> client disconnection or other errors.
>
>
> --
> "Be excellent to each other"
>



-- 
awl


Re: how to clear the cache

2009-06-22 Thread Adam Lee
Also, restarting memcached definitely clears the cache.

On Mon, Jun 22, 2009 at 1:56 PM, Chris Goffinet  wrote:

>
> *Sigh*
>
> Guys. Do some research first.
>
> flush_all is the command
>
> http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt
>
>
> ---
> Chris Goffinet
> goffi...@digg.com
>
>
>
>
>
>
> On Jun 22, 2009, at 10:17 AM, Аркадий Левин wrote:
>
>
>> Only restart
>>
>> On Mon, Jun 22, 2009 at 10:46 PM, Peter Heiner
>> wrote:
>>
>>>
>>> hello,
>>>
>>> i want to clear the cache? how can i do this with a command?
>>>
>>> "/etc/init.d/memcached restart" just restart memcached, but it don't
>>> clear the cache!
>>>
>>>
>


-- 
awl


Re: how big is the memcached Market

2009-07-09 Thread Adam Lee
On Thu, Jul 9, 2009 at 11:48 AM, Dustin  wrote:

> > In discussion with a server provider it was
> > mentioned that you would need 2 memcached servers for each 10database
> > servers. I am not sure if this is a good estimate that I should use.
>
>   Such estimates are nonsense.  You need as many as your application
> requires.  There are applications that use memcached and don't have
> databases, for example.
>

Yeah, plus I'd say that one memcached server for every five database servers
would be a more realistic estimate.
Joking aside, Dustin is right-- it's impossible to say how many memcached
servers you need without knowing the size of the objects and exactly how
many you want to cache.  Realistically, too, the size of the memcached pool
is of more importance than the number of servers in it, unless you're doing
a particularly large number of operations...

-- 
awl


Re: clustering memcached

2009-07-09 Thread Adam Lee
I'd be surprised if this is how it's actually set up.  I'm guessing the
object-cache plugin hashes the key for the object to choose which server to
set/get data.  This means that (approximately) 50% of your data is stored on
each memcached instance and load is shared between them, though no data
should exist on both servers.  Given that they're on the same network, the
overhead is negligible and well worth it in exchange for the tradeoffs
(scalability, durability, etc)
On Thu, Jul 9, 2009 at 5:25 PM, DLS  wrote:

>
> Hello all,
>
> A question about clustering memcached. I currently have a website with
> two front ends. Each front end runs a single instance of memcached.
> And both servers connect to the same database server. They run
> wordpress and via an object-cache plugin, I believe it checks both
> instances of memcached on both front ends for cached objects. It's
> given an array of memcached server:ports.
>
> My questions is if there is any way to have both instances of
> memcached know of each other so that the code can only have to connect
> to its local memcached instance? As opposed to having to make a
> connection to the other server's memcached instance? I'm think about
> whether it effects performance when having to connect to another
> server.
>
> Otherwsie, what's the best way to have this set up? Assuming 2 (or
> more) frontends each running an instance of memcached? And wanting to
> keep the two front ends in sync with each other.
>
> Thanks!
> Dan
>



-- 
awl


Re: Memcache Question

2009-07-28 Thread Adam Lee
No, this is not very expensive.  We use memcached for a lot of our
rate-control here.

On Tue, Jul 28, 2009 at 12:04 PM, Beyza  wrote:

>
> Hi,
>
> This is not an important question, but I wonder the answer.
>
> When I develop websites by using memcache, I also use it for spam
> checking.
>
> For example; When someone post a comment, or report something, or
> search something, I create a memcache object and set an expiration
> time, i.e. 5 seconds. It goes something like that
>
> memcache_set($memcache, "comment-check-userid", '1', 0, 5);
>
> I check every time whether this value is set or not. If it's set, I
> consider it as a spam.
>
> My question is that, is it expensive to create this if you compare
> with other options for this purpose? I could not find any information
> about this on the internet. I am just curious :)
>
> Thanks from now on,
>



-- 
awl


Re: Memcache Question

2009-07-28 Thread Adam Lee
This sort of information is inherently fairly transient, though, I'd say.
 Unless you're getting a huge amount of spam from a lot of different user
IDs or your cache is already running very near to capacity (in which cases
you already have bigger problems), I'd say it should work in the great
majority of cases.
If you want to get more involved, you can make a separate memcached pool
just for this task and can also start doing things like multiple counters
per-user using incr and decr so that you can get a little more sophisticated
with your algorithm.  You can, for example, make an hourly and daily counter
for a user that you increment when they post and when they reach some
threshold, you stop them from posting, make them solve a CAPTCHA, etc...

On Tue, Jul 28, 2009 at 12:45 PM, Josef Finsel  wrote:

> Granted, Colin, but I wouldn't want anyone coming across a simple
> implementation like this and then expand upon it without the realization
> that memcached is not a persistent data store.
>
>
> On Tue, Jul 28, 2009 at 12:41 PM, Colin Pitrat wrote:
>
>>
>> Well, I'd say that for double-post and spam control, it's not a big
>> problem if sometime it doesn't work. I mean, with this kind of
>> algorithm he already accept one spamming message, so two is not a
>> problem.
>>
>> 2009/7/28 Josef Finsel :
>> > Another question you might want to ask is how are you going to handle it
>> if
>> > the item has been evicted from the cache? Someone could be spamming you
>> and
>> > you wouldn't catch it.
>> > Granted, that would have to be a cache with a high-eviction rate but
>> it's
>> > still a possibility you might want to consider.
>> >
>> > On Tue, Jul 28, 2009 at 12:04 PM, Beyza  wrote:
>> >>
>> >> Hi,
>> >>
>> >> This is not an important question, but I wonder the answer.
>> >>
>> >> When I develop websites by using memcache, I also use it for spam
>> >> checking.
>> >>
>> >> For example; When someone post a comment, or report something, or
>> >> search something, I create a memcache object and set an expiration
>> >> time, i.e. 5 seconds. It goes something like that
>> >>
>> >> memcache_set($memcache, "comment-check-userid", '1', 0, 5);
>> >>
>> >> I check every time whether this value is set or not. If it's set, I
>> >> consider it as a spam.
>> >>
>> >> My question is that, is it expensive to create this if you compare
>> >> with other options for this purpose? I could not find any information
>> >> about this on the internet. I am just curious :)
>> >>
>> >> Thanks from now on,
>> >
>> >
>> >
>> > --
>> > "If you see a whole thing - it seems that it's always beautiful.
>> Planets,
>> > lives... But up close a world's all dirt and rocks. And day to day,
>> life's a
>> > hard job, you get tired, you lose the pattern."
>> > Ursula K. Le Guin
>> >
>> > What's different about data in the cloud? http://www.azuredba.com
>> >
>> > http://www.finsel.com/words,-words,-words.aspx (My blog) -
>> > http://www.finsel.com/photo-gallery.aspx (My Photogallery)
>> >  -http://www.reluctantdba.com/dbas-and-programmers/blog.aspx (My
>> > Professional Blog)
>> >
>>
>
>
>
> --
> "If you see a whole thing - it seems that it's always beautiful. Planets,
> lives... But up close a world's all dirt and rocks. And day to day, life's a
> hard job, you get tired, you lose the pattern."
> Ursula K. Le Guin
>
> What's different about data in the cloud? http://www.azuredba.com
>
> http://www.finsel.com/words,-words,-words.aspx (My blog) -
> http://www.finsel.com/photo-gallery.aspx (My Photogallery)  -
> http://www.reluctantdba.com/dbas-and-programmers/blog.aspx (My
> Professional Blog)
>



-- 
awl


Re: Is clearing out the cache necessary?

2009-07-30 Thread Adam Lee

We used to flush cache on rollout of new code that we knew would be
binary incompatible, but that was very painful. So we do "versioned"
objects and write conversion methods so that when new code rolls out,
it can read the previous object, see that it's an old version and
apply the necessary transformation.  This works particularly well for
us because we don't have long running sessions or, really, any sort of
state between requests.

On Thu, Jul 30, 2009 at 1:54 AM, Matt Ingenthron wrote:
>
> blazah wrote:
>>
>> What do folks do with the objects stored in memcached when a new
>> version of the software is deployed?  There is the potential that the
>> data could be stale depending on the code changes so do people
>> typically just flush the cache?
>>
>
> The most common approach here is to add something to the key prefix which
> matches the version of the application.  If you work this out, you can
> actually even do 'rolling upgrades' of the application (assuming users have
> long lived sessions).  As things roll over from the old prefix to the new
> one, the old objects will just naturally expire or LRU out.
>
> If you have a heavy workload, a flush_all could be pretty bad for your
> users.
>
>> Is there information on best practices and/or how memcached is used in
>> production?
>
> There's quite a bit on the FAQ:
> http://code.google.com/p/memcached/wiki/FAQ
>
> Hope that helps,
>
> - Matt
>
>



-- 
awl


Re: Patch to allow -F option to disable flush_all

2009-08-03 Thread Adam Lee
Yeah, I'm inclined to agree with you...  Disabling flush seems like a bit of
a red herring.  Sure, it prevents one very particular case, but at best it
provides a false sense of safety.

If we were going to do anything like this, perhaps we could consider doing
something like how Tokyo Cabinet/Tokyo Tyrant handles this sort of thing.
 From their documentation:

-mask expr : specify the names of forbidden commands.
-unmask expr : specify the names of allowed commands.

The command mask expression is a list of command names separated by ",". For
example, "out,vanish,copy" means a set of "out", "vanish", and "copy".
Commands of the memcached compatible protocol and the HTTP compatible
protocol are also forbidden or allowed, related by the mask of each original
command. Moreover, there are meta expressions. "all" means all commands.
"allorg" means all commands of the original binary protocol. "allmc" means
all commands of the memcached compatible protocol. "allhttp" means all
commands of the HTTP compatible protocol. "allread" is the abbreviation of
`get', `mget', `vsiz', `iterinit', `iternext', `fwmkeys', `rnum', `size',
and `stat'. "allwrite" is the abbreviation of `put', `putkeep', `putcat',
`putshl', `putnr', `out', `addint', `adddouble', `vanish', and `misc'.
"allmanage" is the abbreviation of `sync', `optimize', `copy', `restore',
and `setmst'. "repl" means replication as master. "slave" means replication
as slave.

On Mon, Aug 3, 2009 at 5:23 AM, dormando wrote:
>
> Hey,
>
> I read the whole thread and thought about it for a bit... I'm not sure we
> should do this. Especially not as an explicit solution to a security
> problem with a shared hosting cluster. We can roadmap out a method of
> multitenancy (dustin's apparently done proof of concept before, and I can
> imagine it being pretty easy)... but more long term.
>
> If you disable flush, there're still a hundred ways where one customer can
> screw another customer, or if there's a site injection, they could gather
> information and destroy things you'd really rather they not otherwise.
>
> They can either inject large faulty SET's to push things out of cache.. or
> run 'stats cachedump' and fetch/delete random values from other
> customers... or run 'stats sizes' in a loop and hang all your servers.
>
> The -o discussion is good, but a separate discussion, and not related
> towards security fixes in a multitenancy situation. We should revisit that
> as the engine work comes in, since we'd need a set of extended options we
> could pass into the engines?
>
> -Dormando
>
> On Fri, 24 Jul 2009, Adrian Otto wrote:
>
>>
>> Dormando,
>>
>> Thanks for your reply. The use case is for using memcached from a hosting
>> environment where multiple subscribers share the same source IP address
>> because they run application code together on the same cluster of web
servers.
>> The clusters are large, typically in the hundreds of nodes range. In this
>> arrangement it's possible for one subscriber to dump the cache belonging
to
>> another, even when they have their own memcached instance running.
>>
>> We are also aware of horror stories where app developers don't properly
>> sanitize user input that gets sent to memcached, potentially resulting in
the
>> equivalent of an SQL injection. It's possible to dump the cache using an
>> exploit of such code to send a "flush_all" command and lead to rather
serious
>> database performance problems for busy sites when the cache is cold.
Because
>> we can't control the code that is run on our platform to protect us from
this,
>> we'd like a simple way to nip it in the bud right in memcached.
>>
>> We recognize that we could implement a more elaborate method of
partitioning
>> access to memcached on a per-subscriber basis, but we just wanted
something
>> simple to let them use an individual memcached instance if they want to,
>> accepting the security implications of the shared environment.
>>
>> The feature is optional, defaults to off, and it only adds a simple check
of a
>> boolean to bypass the code in normal configuration. Furthermore,
purge_all
>> should be infrequently accessed anyway, so the performance implication of
the
>> additional data comparison should be mute. I appreciate the
consideration.
>>
>> Thanks,
>>
>> Adrian
>>
>> On Jul 24, 2009, at 12:54 PM, dormando wrote:
>>
>> >
>> > Hey,
>> >
>> > We've rejected a few similar patches in the past. Usually if folks need
>> > this they have bigger problems... What is your particular use case?
>> >
>> > I could see this going in though. It squicks me out but I'm open to
>> > opinions from the others :)
>> >
>> > -Dormando
>> >
>> > On Fri, 24 Jul 2009, Adrian Otto wrote:
>> >
>> > > Hi,
>> > >
>> > > I've attached a patch for a tan option flag -F to disable the
>> > > purge_all command in memcached. It also includes:
>> > >
>> > > 1) A tiny tweak to testapp.c that allowed "make test" to pass
>> > > 2) Fixed a minor bug in t/binary.t with a variable scope.
>> > > 3) F

Re: Update / Populate cache in the background

2009-08-05 Thread Adam Lee
Run a cron job that executes the query and updates the cache at an interval
shorter than the expiration time for the cached item.

On Wed, Aug 5, 2009 at 11:38 AM, Haes  wrote:

>
> Hi,
>
> I'm using memcached together with Django to speed up the database
> queries. One of my Django views (page) uses a query that takes over 20
> sec. Normally this query hits the memcached cache and the data is
> served almost instantly. The problem now is that if the cache expires,
> the next person accessing this page will have to wait about 20 seconds
> for it to load which is not really acceptable for me.
>
> Is there a way to update the memcached data before it times out, in a
> way that this query always hits the cache?
>
> Thanks for any hints.
>
> Cheers.
>



-- 
awl


Re: Update / Populate cache in the background

2009-08-05 Thread Adam Lee
Yeah, we're a (very) big site and we'd definitely not be in good shape if
memcached died at any point during peak traffic hours.  We can (and do)
bounce/flush during off-peak times, but memcached is really a cornerstone of
our scalability.  That said, we almost never write code that dies completely
if memcached isn't around (though we have a few special hacks that I know
I've at least described to Trond and others in conference calls).
In any spot where I had a query like the one above, I'd write the code such
that the application only checks memcached for the value.  If it's not
there, it fails gracefully.  Only the background job ever performs the query
and updates the cache.  This means that no user ever has to wait 20 seconds
for a page to load and you avoid having any sort of a herd effect-- I can
only imagine what would happen to the DB if this "20 seconds to execute"
query got performed by multiple request threads at once.

On Wed, Aug 5, 2009 at 1:51 PM, dormando  wrote:

>
> Yes, well... For most big sites I'd actually insist that it's okay if you
> disappear from the internet if too many of your memcached instances go
> away. Losing 10-50% might be enough to kill you and that's okay. Memcached
> has been around long enough and integrated tight enough that it *is* a
> critical service. Losing the typical number of common failures, or issuing
> a slow rolling restart, shouldn't cause you to go offline. Losing a huge
> sack all at once might make you limp severely or fail temporarily.
>
> For very small sites with just 1-2 instances, I'm not too sure what to
> recommend. Consider making sure you at least have 2-4 instances and that
> you can survive the loss of one, even if you end up running multiple on
> one box.
>
> -Dormando
>
> On Wed, 5 Aug 2009, Josef Finsel wrote:
>
> > Or, to put it another way, memcached is *not a persistent data store nor
> > should it be treated as one. If your application will fail if memcached
> is
> > not running, odds are you are using memcached incorrectly.*
> >
> > On Wed, Aug 5, 2009 at 1:36 PM, dormando  wrote:
> >
> > >
> > > What happens when we release 1.4.1?
> > >
> > > Your site really should be able to tolerate at least the brief loss of
> a
> > > single memcached instance.
> > >
> > > It's definitely good to update the cache asyncronously where possible,
> and
> > > all of this stuff stuff (a lot is detailed in the FAQ too). I'd just
> > > really like to push the point that these techniques help reduce common
> > > load. They can even help a lot in emergencies when you find a page
> taking
> > > 50 seconds to load for no good reason.
> > >
> > > However, relying on it will lead to disaster. It allows you to create
> > > sites which cannot recover on their own after blips in cache, upgrades,
> > > hardware failure, etc.
> > >
> > > If you have objects that really do take umpteen billions of
> milliseconds
> > > to load, you should really consider rendering data (or just the really
> > > slow components of it) into summary tables. I like using memcached to
> take
> > > queries that run in a few tens of milliseconds or less, and make them
> > > typically run in 0.1ms. If they take 20+ seconds something is horribly
> > > wrong. Even 1 full second is something you should be taking very
> > > seriously.
> > >
> > > -Dormando
> > >
> > > On Wed, 5 Aug 2009, Edward Goldberg wrote:
> > >
> > > >
> > > > I use the term "Cache Warmer" for task like this,  where I keep a key
> > > > (or keys) warm all of the time.
> > > >
> > > > The best way to keep the load on the DB constant is to Warm the Cache
> > > > at a well defined rate.
> > > >
> > > > If you wait for a "request" to update the cache,  then you may loose
> > > > control over the exact moment of the load.
> > > >
> > > > The idea is to make a list of the items that need to be warmed,  and
> > > > then do that list at a safe rate and time.
> > > >
> > > > Very long TTL values can be un-safe.  Old values can lead to code
> > > > errors that are not expected days later.
> > > >
> > > > Edward M. Goldberg
> > > > http://myCloudWatcher.com/
> > > >
> > > >
> > > > On Wed, Aug 5, 2009 at 9:44 AM, dormando wrote:
> > > > >
> > > > > Also consider optimizing the page so it doesn't tak

Re: 1.2.6 to 1.2.8 or 1.4.0?

2009-08-06 Thread Adam Lee
We upgraded to 1.4 in production recently and it's been great.  No problems
whatsoever and it's a bit faster (thanks to threading and concurrency
fixes).

On Thu, Aug 6, 2009 at 1:39 AM, Matt Ingenthron  wrote:

>
> Jay Paroline wrote:
>
>> Hello,
>> We've been having some intermittent memcached connection issues and I
>> noticed that we are a couple of releases behind. Our current version
>> is 1.2.6.
>>
>> Before I nag our admin about upgrading, is there any reason why it
>> might be more wise to go to 1.2.8 rather than make the leap to 1.4.0?
>>
>>
>
> 1.4.0 is in production in some large sites (as was 1.3) and it does have
> some miles on it.  Whether you make the move to 1.4 or not is very
> subjective to environment, etc. but I'd say if you're going to make the move
> going to 1.4.0 gives you a few more client options, and that side of things
> likely iterates faster.  I'm personally a fan of more options.
>
> I have to say, I don't think going to 1.2.8 or 1.4.0 has anything which
> will help with connection issues... but then I don't really know what your
> issues are.
>
>> FWIW the clients we use are PHP primarily and a couple of lower
>> traffic Java apps. Should 1.4.0 work with any clients that were
>> working with 1.2.6?
>>
>>
>
> Yes, 1.4.0 should support all of your clients which are working with 1.2.6.
>  As aforementioned, there are clients which give you features only available
> with 1.4.0, but the 1.4.0 server is backward compatible with existing
> clients.
>
> Hope that helps,
>
> - Matt
>



-- 
awl


Re: Memcached as distributed RAM disk

2009-08-11 Thread Adam Lee
We do a hack that enables something similar to this, but I wouldn't
recommend it.  If you want something memcached-like but persistent, you
should look into, for example, tokyocabinet.  It even speaks memcached
protocol, so you can use it as a drop-in replacement and achieve the desired
effect.  It's not _as_ fast as memcached, but it's still very fast.
On Tue, Aug 11, 2009 at 1:59 PM, smolix  wrote:

>
> Hi,
>
> Is there a way to use memcached as a _guaranteed_ distributed
> (key,value) storage? That is, I want to have a distributed storage of
> (key, value) pairs which can be accessed from many clients
> efficiently. The RAM is sufficient that all should easily fit into
> memory but I probably can't have an overhead of more than 2x the
> amount of data it takes to store the pairs. Is there a way to turn off
> the discard option in memcached? I can tune the keys such that they
> are sequential or do similar preprocessing if needed.
>
> This is about 100-500GB of data that I need to store with values less
> than 4k per item (in some cases much smaller).
>
> Any help and suggestions would be greatly appreciated.
>
> Thanks,
>
> Alex
>
>


-- 
awl


Re: Memcached as distributed RAM disk

2009-08-11 Thread Adam Lee
We have a medium-sized dataset (~50M entries) with small values (a few
hundred bytes) where we need "persistence" with a very high read throughput
and occasional updates.
To solve this, we built a cluster of memcached servers with enough RAM on
each machine to store the entire dataset and wrote our own memcached client
with the following characteristics:

- each write operation writes to every machine in the cluster
- each read operation reads from any one machine in the cluster
- if a machine becomes non-responsive, it is marked as dirty and removed
from the cluster list

Every night a "full populate" script runs, and any new machines, or machines
that were removed during the day, are re-added to the cluster.

With this setup, we sustain hundreds of thousands of reads per second and
get virtual "persistence."
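
The client logic is roughly this (a heavily boiled-down Java sketch; the
names are invented, and the dirty-marking, timeouts and nightly re-add are
left out):

import java.util.List;
import java.util.Random;
import net.spy.memcached.MemcachedClient;

public class ReplicatedCache {
    private final List<MemcachedClient> nodes; // one client per machine
    private final Random random = new Random();

    public ReplicatedCache(List<MemcachedClient> nodes) {
        this.nodes = nodes;
    }

    // Each write operation goes to every machine in the cluster.
    public void set(String key, int exp, Object value) {
        for (MemcachedClient node : nodes) {
            node.set(key, exp, value);
        }
    }

    // Each read operation goes to any one machine; a non-responsive
    // node would be marked dirty and dropped from the list here.
    public Object get(String key) {
        return nodes.get(random.nextInt(nodes.size())).get(key);
    }
}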

On Tue, Aug 11, 2009 at 4:56 PM, smolix  wrote:

> Hi Adam,
>
> Thanks for the tokyocabinet pointer. Unfortunately that would be too
> slow (we need as high iops as we can get and no, ssd would not be an
> answer unless it gets into FusionIO performance range). What was the
> hack you did? We don't need persistent storage for many days. The
> total computation will run in 1 maybe 2 days total.
>
> Take care,
>
> Alex
>
> On Aug 11, 12:37 pm, Adam Lee  wrote:
> > We do a hack that enables something similar to this, but I wouldn't
> > recommend it.  If you want something memcached-like but persistent, you
> > should look into, for example, tokyocabinet.  It even speaks memcached
> > protocol, so you can use it as a drop-in replacement and achieve the
> desired
> > effect.  It's not _as_ fast as memcached, but it's still very fast.
> >
> >
> >
> > On Tue, Aug 11, 2009 at 1:59 PM, smolix  wrote:
> >
> > > Hi,
> >
> > > Is there a way to use memcached as a _guaranteed_ distributed
> > > (key,value) storage? That is, I want to have a distributed storage of
> > > (key, value) pairs which can be accessed from many clients
> > > efficiently. The RAM is sufficient that all should easily fit into
> > > memory but I probably can't have an overhead of more than 2x the
> > > amount of data it takes to store the pairs. Is there a way to turn off
> > > the discard option in memcached? I can tune the keys such that they
> > > are sequential or do similar preprocessing if needed.
> >
> > > This is about 100-500GB of data that I need to store with values less
> > > than 4k per item (in some cases much smaller).
> >
> > > Any help and suggestions would be greatly appreciated.
> >
> > > Thanks,
> >
> > > Alex
> >
> > --
> > awl
>



-- 
awl


Re: multi-get usage

2009-08-21 Thread Adam Lee
The place where we tend to use a lot of multi-gets is when we're rendering
something like a page that has comments from a lot of users and we want to
fetch the cached data for every one of those users in a single call.
Also, some clients (e.g. the Spy client for Java) actually optimize by
collapsing multiple gets to the same server into a single multi-get.
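
With spymemcached, the comments-page case above looks roughly like this
(the "user:" key scheme is invented for the example):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class CommentPage {
    public static void main(String[] args) throws Exception {
        MemcachedClient mc =
            new MemcachedClient(AddrUtil.getAddresses("127.0.0.1:11211"));

        // One key per user who commented on the page.
        List<String> keys = new ArrayList<String>();
        for (long userId : new long[] {42, 99, 123}) {
            keys.add("user:" + userId);
        }

        // getBulk() groups keys by server and issues one multi-get per
        // node, so it's one round trip per server instead of one per key.
        Map<String, Object> users = mc.getBulk(keys);

        // Anything absent from the map is a miss -- fall back to the DB.
        mc.shutdown();
    }
}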

On Fri, Aug 21, 2009 at 1:10 PM, Bill Moseley  wrote:

> I've often seen comments about using multi-gets for better performance.
> First, I'm a bit curious if/why it's a big performance gain.
> Obviously, not all keys will be on the same memcached server in a multi-get
> request.  Can someone give me the short explanation or point me to
> where to learn more?
>
> I'm perhaps more curious how people use multiple gets -- or really the
> design of an web application that supports this.
>
> I've tended to cache at a pretty low level -- that is, if I have a method in
> the data model to fetch object "foo", I check the cache first and, if the
> object isn't found, fetch it from the database and save it to memcached.  As
> a result we can end up with a lot of separate memcached get requests for a
> single web request.
> This is less true as the app becomes more web2.0-ish, but still happens.
>
> So, I'm curious about the design of an application that supports gathering
> up a number of (perhaps unrelated) cache requests and fetch all at once.
> Then fill in the cache where there are cache-misses.  Am I misunderstanding
> how people use this feature?
>
>
> --
> Bill Moseley
> mose...@hank.org
>



-- 
awl


Re: can memcached trigger event when object expired?

2009-08-24 Thread Adam Lee
Or don't store your sessions at all...
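If you do keep them, though, the cache-in-front-of-db pattern Henrik
describes below works well.  A minimal Java sketch (assuming spymemcached;
loadFromDb/writeToDb, the key scheme and the flush interval are all
placeholders):

import java.io.Serializable;
import net.spy.memcached.MemcachedClient;

public class SessionStore {
    public static class Session implements Serializable {
        final String id;
        long lastDbWrite; // when we last flushed this session to the DB
        Session(String id) { this.id = id; }
    }

    private static final int TTL = 30 * 60;                // cache expiry, seconds
    private static final long DB_WRITE_MS = 5 * 60 * 1000; // flush every N minutes
    private final MemcachedClient mc;

    public SessionStore(MemcachedClient mc) { this.mc = mc; }

    public Session load(String sessionId) {
        Session s = (Session) mc.get("sess:" + sessionId);
        if (s == null) {
            s = loadFromDb(sessionId);                 // miss: check the DB
            if (s == null) s = new Session(sessionId); // not there either: new session
            mc.set("sess:" + sessionId, TTL, s);
        }
        return s;
    }

    public void save(Session s) {
        mc.set("sess:" + s.id, TTL, s);                // cache write on every pageview
        if (System.currentTimeMillis() - s.lastDbWrite > DB_WRITE_MS) {
            writeToDb(s);                              // durable write only every N minutes
            s.lastDbWrite = System.currentTimeMillis();
        }
    }

    private Session loadFromDb(String id) { return null; } // placeholder
    private void writeToDb(Session s) { }                  // placeholder
}

Henrik's description below is the same flow in prose: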
On Mon, Aug 24, 2009 at 4:28 AM, Henrik Schröder  wrote:

> The correct way to do this is to store your sessions in your database and
> use memcached as a cache to speed it up. If you want to do something on
> session end, you make a script that goes through the stored sessions in your
> db, finds the expired ones, and does whatever it is you want to do.
>
> So, on every pageview you do a memcached get for the current user's session
> data.
> If it does not exist, you check the db.
> If it exists, you load it from db and store in memcached.
> If it does not exist in the db, it is a new session, so you create it.
> If it does exist in memcached, well, there you have it.
>
> Then you process your page, and perform any changes to the session object.
>
> And when your page is done, you store the session object back into
> memcached and your db.
> It is a good idea to attach a timestamp to your session object so that you
> only do db writes every N minutes instead of every pageview. This way, users
> may lose a minute or two of session data if a memcached server crashes or
> the data expires, but you will get a significant speedup by not having to
> write to the db on every pageview.
>
>
> /Henrik Schröder
>
> On Mon, Aug 24, 2009 at 00:22, ron  wrote:
>
>> I simply want to use memcached to store user sessions for multiple web
>> app servers, instead of using Tomcat's local sessions.
>> Tomcat supports something called HttpSessionListener, which has a
>> listener method "sessionDestroyed" that allows a last-chance cleanup
>> before the session object is destroyed.
>> I know Tomcat has ways to replicate sessions over multiple servers, but
>> I am thinking memcached is an easier solution for that.
>>
>> If you know of a better option, please let me know.. thx.
>>
>> On Aug 23, 10:08 am, Henrik Schröder  wrote:
>> > Basically, no, but there are ways you can do that yourself in your
>> > application. However, it still sounds like you're using memcached for
>> > something it was not designed to do. It is a cache, not a datastore. You
>> are
>> > never guaranteed to get back an item that you have stored, and you have
>> no
>> > way of knowing if a cache miss is because the item expired, or because
>> it
>> > was never stored in the first place.
>> >
>> > But what you can do is to store a timestamp yourself together with the
>> item.
>> > Then, whenever you retrieve the item, you look at your timestamp and
>> treat
>> > it as the actual expiry, so if it's passed, you perform your cleanup and
>> > return null to the caller. And whenever you store an item you tag on
>> this
>> > timestamp and set the actual expiry to sometime far in the future.
>> However,
>> > note that you may not always get back items and your cleanup may not run
>> for
>> > all items.
>> >
>> > Perhaps if you told us what you wanted to achieve with your application
>> we
>> > could help you do it properly with memcached or point you in the
>> direction
>> > of other technologies if memcached is the wrong one for the job?
>> >
>> > /Henrik Schröder
>> >
>> > On Sun, Aug 23, 2009 at 17:57, ron  wrote:
>> >
>> > > thanks.
>> >
>> > > So can memcached at least return an expiring object for the last time
>> > > to my application, with a flag set saying that it has expired in
>> > > memcached and the next time you request the object it will return
>> > > null?
>> > > Doing this will give me a chance to do my other application clean up
>> > > which needs the data in that expiring cache object.
>> >
>> > > On Aug 23, 8:34 am, Brian Moon  wrote:
>> > > > 1. No, memcached does not do that.
>> >
>> > > > 2. You don't want it to.
>> > >http://code.google.com/p/memcached/wiki/FAQ#When_do_expired_cached_it.
>> ..
>> > > > Having to have a garbage collector that watched items for expiration
>> > > > would increase the load that memcached put on a system a great deal.
>> >
>> > > > Brian.
>> > > > http://brian.moonspot.net/
>> >
>> > > > On 8/23/09 12:03 AM, ron wrote:
>> >
>> > > > > Hi,
>> >
>> > > > > Can memcached trigger some kind of event to notify the client that a
>> > > > > particular cached object has expired?
>> >
>> > > > > If it doesn't support this, does anyone know of another technology
>> > > > > that is also a distributed cache but can also send out events when
>> > > > > an object is about to expire?
>> >
>> > > > > thx.
>> > > > > Ron
>>
>
>


-- 
awl


Re: memcached for social networking sites?

2009-08-28 Thread Adam Lee
How much traffic are you expecting?  If you're planning to grow very large,
I would suggest looking into a CDN like Akamai, CloudFront, etc...

On Fri, Aug 28, 2009 at 2:24 AM, mel_06  wrote:

>
> hi again guys,
>
> i'm doing a social networking site relates to music, i need some
> advice on caching on images, videos and audio files. thanks!
>



-- 
awl


Re: incr/decr and re-setting expiration?

2009-09-10 Thread Adam Lee

On Thu, Sep 10, 2009 at 9:31 AM, Dean Michael Berris
 wrote:
>
> Hi Guys,
>
> I recently checked the documentation about the memcached protocol and
> while looking at supporting it in my C++ client, I'm looking at a
> situation where I'm not sure whether it's possible to set the
> expiration of a key through the incr/decr commands. From the
> documentation, it seems that there's only a way to increment and
> decrement, but not set the expiration of the key being
> incremented/decremented.
>
> Is this by design? Should I file for a feature request for an
> incr/decr that also updates the expiration of a cache key?
>
> This feature would be really useful for tracking keys -- maybe
> tracking activity within a certain period of time. So something like
> this algorithm would be easily supported:
>
>  1. The first time something happens, set a key to 0 in memcached,
> expires in X seconds
>  2. The next time something happens, increment the key (update the
> expiration to X seconds) -- if the increment fails, that means the key
> has expired, set the key to 0 again
>  3. If the incremented value is higher than a certain number, then
> you can do something about it (maybe there's abusive behaviour going
> on, maybe a clicker bot or something)

We use this exact functionality, with one key difference-- we set this
expiration when we initialize the counter, but it doesn't get updated
when we incr/decr.   This is useful to implement any sort of counter--
rate limiting certain user actions, tracking occurrences of specific
activities, etc...

We ended up just building a generic McCounter class that you
initialize with an expiration.  It has get/set/incr/decr and takes
care of all the behind-the-scenes work to make sure that the value
exists in MC before trying to incr, etc.  It's a super simple class
and has proved infinitely useful-- perhaps you also want to build
something along these lines.
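
A rough reconstruction (ours differs in detail, and it assumes a client
with incr-with-default-and-expiration, which recent spymemcached versions
provide):

import net.spy.memcached.MemcachedClient;

public class McCounter {
    private final MemcachedClient mc;
    private final String key;
    private final int expiration; // seconds; set once, at initialization

    public McCounter(MemcachedClient mc, String key, int expiration) {
        this.mc = mc;
        this.key = key;
        this.expiration = expiration;
    }

    // The default value seeds the counter (with our expiration) if the
    // key doesn't exist yet, so callers never have to pre-set it.
    public long incr() {
        return mc.incr(key, 1, 1, expiration);
    }

    public long get() {
        Object v = mc.get(key); // counters are stored as strings
        return v == null ? 0 : Long.parseLong(String.valueOf(v));
    }
}

Rate limiting then reads naturally: if (counter.incr() > LIMIT) { throttle }.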

-- 
awl


Re: incr/decr and re-setting expiration?

2009-09-11 Thread Adam Lee

We expire counters when we want to track something by time period...

On Fri, Sep 11, 2009 at 1:13 AM, Clint Webb  wrote:
> Why are you putting an expiry on a key you are incrementing and
> decrementing?   I can't think of a situation where I would want to expire a
> counter (other than letting it fall out of the LRU).
>
> In fact, the only situation I can think of that could use an expiry would
> follow the way it currently works.   I.e., I'd set a key with a 5-minute
> expiry, then keep incrementing it for whatever reason, and then after 5
> minutes the key goes away and I create another one.   I do this for
> tracking surges of hits from particular IPs (although I don't use 5
> minutes).
>
> I can't think of any reason where I would want to give the key a new expiry
> that couldn't be taken care of by the LRU evictions.
>
> On Fri, Sep 11, 2009 at 1:20 AM, Adam Lee  wrote:
>>
>> On Thu, Sep 10, 2009 at 9:31 AM, Dean Michael Berris
>>  wrote:
>> >
>> > Hi Guys,
>> >
>> > I recently checked the documentation about the memcached protocol and
>> > while looking at supporting it in my C++ client, I'm looking at a
>> > situation where I'm not sure whether it's possible to set the
>> > expiration of a key through the incr/decr commands. From the
>> > documentation, it seems that there's only a way to increment and
>> > decrement, but not set the expiration of the key being
>> > incremented/decremented.
>> >
>> > Is this by design? Should I file for a feature request for an
>> > incr/decr that also updates the expiration of a cache key?
>> >
>> > This feature would be really useful for tracking keys -- maybe
>> > tracking activity within a certain period of time. So something like
>> > this algorithm would be easily supported:
>> >
>> >  1. The first time something happens, set a key to 0 in memcached,
>> > expires in X seconds
>> >  2. The next time something happens, increment the key (update the
>> > expiration to X seconds) -- if the increment fails, that means the key
>> > has expired, set the key to 0 again
>> >  3. If the incremented value is higher than a certain number, then
>> > you can do something about it (maybe there's abusive behaviour going
>> > on, maybe a clicker bot or something)
>>
>> We use this exact functionality, with one key difference-- we set this
>> expiration when we initialize the counter, but it doesn't get updated
>> when we incr/decr.   This is useful to implement any sort of counter--
>> rate limiting certain user actions, tracking occurrences of specific
>> activities, etc...
>>
>> We ended up just building a generic McCounter class that you
>> initialize with an expiration.  It has get/set/incr/decr and takes
>> care of all the behind-the-scenes work to make sure that the value
>> exists in MC before trying to incr, etc.  It's a super simple class
>> and has proved infinitely useful-- perhaps you also want to build
>> something along these lines.
>>
>> --
>> awl
>
>
>
> --
> "Be excellent to each other"
>



-- 
awl


Re: Multi-threading and cpu affinity

2009-09-16 Thread Adam Lee

Yeah, I love that about Solaris.  I'm also a big fan of processor
sets: http://developers.sun.com/solaris/articles/solaris_processor.html
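
For anyone who hasn't played with them (CPU ids and pids below are just
examples): psrset carves CPUs into a dedicated set and binds processes to
it, while pbind pins a single process to one CPU:

  psrset -c 2 3       # create a processor set from CPUs 2 and 3
  psrset -b 1 3582    # bind pid 3582 to processor set 1
  pbind -b 2 3584     # or pin pid 3584 to CPU 2 directly

(On Linux, taskset(1) gives a similar per-process binding.)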

On Wed, Sep 16, 2009 at 11:47 AM, Trond Norbye  wrote:
>
> On 16. sep.. 2009, at 15.57, Jason Priebe wrote:
>
>>
>> We want to update an older memcached server that uses four instances
>> per server, each listening to a different port, and each using a
>> different processor.
>>
>> We would like to get rid of this hacky implementation and just use 4
>> threads.  But when I enable the multithreading, it seems that the
>> threads all go to a single CPU.
>>
>> See the top output below (the "P" column right before the "COMMAND"
>> column is the processor number).  It's pretty clear that Cpu3 is doing
>> most of the work, as it has about 80% idle time, with the others
>> nearly 100% idle.
>>
>> I'm guessing that the threads are spawned only when the memcached
>> process is started, so they're going to stay where they are.
>>
>> Is there any way to force them to each use a different processor?
>> Thanks for any advice.
>>
>
> On Solaris you can bind a process or a thread to a certain CPU by using the
> pbind command.
>
> Cheers,
>
> Trond
>
>
>
>> top - 09:23:22 up  4:09,  1 user,  load average: 0.09, 0.16, 0.18
>> Tasks: 130 total,   1 running, 129 sleeping,   0 stopped,   0 zombie
>> Cpu0  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,
>> 0.0%si,  0.0%st
>> Cpu1  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,
>> 0.0%si,  0.0%st
>> Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,
>> 0.0%si,  0.0%st
>> Cpu3  :  1.3%us,  7.6%sy,  0.0%ni, 83.7%id,  0.0%wa,  1.3%hi,
>> 6.0%si,  0.0%st
>> Mem:   2059560k total,  1860344k used,   199216k free,    77008k
>> buffers
>> Swap:  2031608k total,        0k used,  2031608k free,  1341024k
>> cached
>>
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P
>> COMMAND
>> 3582 nobody    15   0  310m 260m  604 S    4 12.9   4:26.84 3
>> memcached
>> 3585 nobody    15   0  310m 260m  604 S    3 12.9   4:24.56 3
>> memcached
>> 3584 nobody    15   0  310m 260m  604 S    2 12.9   4:42.19 3
>> memcached
>> 3583 nobody    15   0  310m 260m  604 S    1 12.9   4:29.27 3
>> memcached
>>
>> Jason Priebe
>> CBC New Media Group
>
> --
> Trond Norbye
>
> Web Scale Infrastructure                 E-mail: trond.nor...@sun.com
> SUN Microsystems                         Phone:  +47 73842100
> Haakon VII's gt. 7B                      Fax:    +47 73842101
> 7485 Trondheim, Norway
>
>



-- 
awl


Re: Range Spec Posted to Wiki

2009-09-16 Thread Adam Lee

Am I confused or is it actually being proposed that documentation
exist on Wikipedia?

I see no problem with the current wiki and there's no way that
Wikipedia would allow that...

http://en.wikipedia.org/wiki/Wikipedia:NOT#Wikipedia_is_not_a_manual.2C_guidebook.2C_textbook.2C_or_scientific_journal

-- 
awl


Re: How to manipulate data on memcached level and rewrite to db

2009-09-21 Thread Adam Lee

I'm going to +1 on this recommendation.  ORMs greatly simplify your
life, especially when it comes to caching.  I'm also going to
recommend that you don't try to write anything that prevents updates
to the business objects from going directly to the DB unless you are
really OK with the idea of losing data.  Sure, you can do multiple
changes to an object in memory before writing it to disk, but you
definitely don't want to do all your updates to objects on copies that
are in the cache and then rely on a background thread to sync those
objects to the DB-- that sounds like a recipe for losing data, letting
inconsistency creep in, etc...

-- 
awl


Re: Schooner appliance?

2009-09-22 Thread Adam Lee

A schooner is a sailboat!

Having said that, no, I don't.  I met with them a while back and it
definitely seemed like some really cool technology, but not really
what we were looking for at the time.

For anybody who is interested, they build some flash memory-based
products like this:

http://www1.schoonerinfotech.com/products/memcached-appliance.html

On Tue, Sep 22, 2009 at 11:19 AM, Michael Shadle  wrote:
>
> It's not a Schooner, it's a sailboat!
>
> (Sorry, I just had to)
>
> On Tue, Sep 22, 2009 at 8:15 AM, Steve Webb  wrote:
>>
>> Q: Does anyone have any experience with the Schooner memcached appliance?
>>
>> - Steve Webb
>>
>> --
>> Steve Webb - Lead System Administrator for Pronto.com
>> Email: sw...@pronto.com (Please send any work requests to: r...@pronto.com)
>> Cell: 303-564-4269, Office: 303-351-1312, YIM: scumola
>>
>



-- 
awl


Re: best memcached java client

2009-10-07 Thread Adam Lee

Fact.

On Wed, Oct 7, 2009 at 4:31 PM, John Reilly  wrote:
> +1
>
> On Wed, Oct 7, 2009 at 1:23 PM, Nelz  wrote:
>>
>> As a user of Dustin's client, I corroborate what he said. It really is
>> the best Java client around.
>>
>> - Nelz
>>
>> On Wed, Oct 7, 2009 at 11:46, Dustin  wrote:
>> >
>> >
>> > On Oct 7, 11:36 am, sudo  wrote:
>> >> Which java client is most stable for working with memcached?
>> >
>> >  Well, I'd have to suggest mine.  I maintain really high test
>> > coverage, good performance, and keep up with server features (or stay
>> > a bit ahead of them).
>> >
>> >  http://code.google.com/p/spymemcached/
>> >
>> >  Please ask more specific questions here or in the spymemcached
>> > group:
>> >
>> >  http://groups.google.com/group/spymemcached
>
>



-- 
awl


Re: Issue 95 in memcached: Memory allocation default change (-m < 40 doesn't work)

2009-10-07 Thread Adam Lee

On Wed, Oct 7, 2009 at 6:55 PM,   wrote:
> I've thought about this before, and Dustin just reiterated, that the
> default should probably be to allocate all of the memory requested at start
> time. It should attempt to do a mass allocation, and if that fails it
> should spin on 1 meg allocations until it's full.

This sounds reasonable to me.

-- 
awl


Re: memcached failover solution?

2009-10-19 Thread Adam Lee
On Mon, Oct 19, 2009 at 6:00 AM, Henrik Schröder  wrote:

> How it works depends on which client you use. If you use the BeITMemcached
> client, when one instance goes down it will internally mark it as dead,
> start writing about it in the error log, and all requests that would end up
> at that server will be cache misses, which means that if you only have two
> servers, half of your data will not be cached. Your application should be
> able to handle this, and you should have something monitoring the servers
> and the error log so that you can see that one instance is down, and then
> you have the choice of either bringing the server back up again, or removing
> it from the configuration of your application and restarting it.
>
> If you have a client that supports automatic failover (BeITMemcached
> doesn't) then in the scenario above, all your data would be cached on the
> other instance instead, so you would still be fully cached while bringing
> the failing instance back up, or removing it from the configuration.
> However, you would still have to restart your application to reset the
> failover. This would be the best option; I'll try to add failover to the
> BeITMemcached client as soon as I have time for it. :-)
>
> The third case is a client that supports automatic failover, and automatic
> recovery from failover. It's similar to the scenario above, except it won't
> need an application restart when the failing memcached server comes back up,
> HOWEVER this means that your cache will not be synchronized while your
> application gradually discovers that the memcached server is back up.
> Depending on your application, this can be catastrophic, or it can be
> inconsequential. I don't really want to add this to my client, and if I do,
> there will be lots of warnings about it.
>

An even better solution is to have the client automatically remove the
server on failure but to make server rediscovery a manual and/or
synchronized operation.  You don't want a server flapping between up and down
states because of something like a faulty NIC, with your cache becoming
horribly inconsistent because none of the clients agree on its state.  A
server should only go down if it's really down, and it should only come back
up if it's really up and every client can agree (or all be told at the same
time) that it's back up.

We use memcached in a few different ways in our system and, therefore, have
a few different ways of implementing this on our side.  In one instance, we
use memcached as a deterministic cache where we want to guarantee that all
data is always available-- we do this by using our own custom client that
does client-side replication to every server in the pool.  It's a fairly
small dataset (a few gigs) and fits in memory on every instance, so we can
easily do this.  We ensure consistency, though, by writing a marker entry
that indicates when the cache was last populated.  Our client code never
writes this entry, so a new server will only ever be _actually_ added to the
pool when a full-populate is run manually or the nightly crontab job
executes.  This way, we know that we can add new servers to the pool and not
have to worry about them missing or having inconsistent data-- they won't be
read from until they're shown to have the proper dataset.

In the other instance, we use memcached in a more standard configuration.
 Here, we don't do any sort of client side replication, though we do use
ketama hashing (and a few other things that I won't get into like a hacked
up NodeLocator and cache miss fall-through to a middle tier persistent
cache)...  When a machine dies, we automatically take it out of the config.
 To add a new machine, though, we have to push out the config we want and
fire off an admin message (just a special class of message on our standard
message queue) indicating that a new, valid memcached config is present and
should be loaded.  This way, we can somewhat guarantee that configs stay
consistent across all instances.
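
For the curious, the marker check in the first setup boils down to something
like this (the key name and freshness window are made up for illustration):

import net.spy.memcached.MemcachedClient;

public class PoolMembership {
    private static final String MARKER_KEY = "cache:last-populated";
    private static final long MAX_AGE_MS = 24L * 60 * 60 * 1000;

    // A node is only added to the read pool once the populate job has
    // stamped it; regular client code never writes this key.
    public static boolean readable(MemcachedClient node) {
        Object stamp = node.get(MARKER_KEY);
        if (stamp == null) return false; // never populated: keep it out
        long populatedAt = Long.parseLong(String.valueOf(stamp));
        return System.currentTimeMillis() - populatedAt < MAX_AGE_MS;
    }
}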

-- 
awl


Re: Using memcached as a distributed file cache

2009-11-02 Thread Adam Lee
I'm guessing you might get better mileage out of using something written
more for this purpose, e.g. squid set up as a reverse proxy.
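
For reference, the squid side of that is only a few lines of squid.conf
(hostnames, the backend address and the sizes are placeholders):

  http_port 80 accel defaultsite=stream.example.com
  cache_peer 10.0.0.5 parent 8080 0 no-query originserver name=app
  acl our_site dstdomain stream.example.com
  http_access allow our_site
  cache_peer_access app allow our_site
  cache_mem 2048 MB
  maximum_object_size_in_memory 8 MB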

On Mon, Nov 2, 2009 at 4:35 PM, Jay Paroline  wrote:

>
> I'm running this by you guys to make sure we're not trying something
> completely insane. ;)
>
> We already rely on memcached quite heavily to minimize load on our DB
> with stunning success, but as a music streaming service, we also serve
> up lots and lots of 5-6MB files, and right now we don't have a
> distributed cache of any kind, just lots and lots of really fast
> disks. Due to the nature of our content, we have some files that are
> insanely popular, and a lot of long tail content that gets played
> infrequently. I don't remember the exact numbers, but I'd guesstimate
> that the top 50GB of our many TB of files accounts for 40-60% of our
> streams on any given day.
>
> What I'd love to do is get those popular files served from memory,
> which should alleviate load on the disks considerably. Obviously the
> file system cache does some of this already, but since it's not
> distributed it uses the space a lot less efficiently than a
> distributed cache would (say one popular file lives on 3 stream nodes,
> it's going to be cached in memory 3 separate times instead of just
> once).  We have multiple stream servers, obviously, and between them
> we could probably scrounge up 50GB or more for memcached,
> theoretically removing the disk load for all of the most popular
> content.
>
> My favorite memory cache is of course memcache, so I'm wondering if
> this would be an appropriate use (with the slab size turned way up,
> obviously). We're going to start doing some experiments with it, but
> I'm wondering what the community thinks.
>
> Thanks,
>
> Jay
>



-- 
awl


Re: Using memcached as a distributed file cache

2009-11-02 Thread Adam Lee
So you actually give back the file contents in the response, not the URL to
the media? If so, then that does complicate things a little bit.  I still
think that memcached might not be the best solution for this, though it
could obviously be configured to do it.

On Mon, Nov 2, 2009 at 5:44 PM, Jay Paroline  wrote:

>
> I'm not sure how well a reverse proxy would fit our needs, having
> never used one before. The way we do streaming is a client sends a one-
> time-use key to the stream server. The key is used to determine which
> file should be streamed, and then the file is returned. The effect is
> that no two requests are identical, and that code must be run for
> every single request to verify the request and look up the appropriate
> file. Is it possible or practical to use a reverse proxy in that way?
>
> Jay
>
> Adam Lee wrote:
> > I'm guessing you might get better mileage out of using something written
> > more for this purpose, e.g. squid set up as a reverse proxy.
> >
> > On Mon, Nov 2, 2009 at 4:35 PM, Jay Paroline 
> wrote:
> >
> > >
> > > I'm running this by you guys to make sure we're not trying something
> > > completely insane. ;)
> > >
> > > We already rely on memcached quite heavily to minimize load on our DB
> > > with stunning success, but as a music streaming service, we also serve
> > > up lots and lots of 5-6MB files, and right now we don't have a
> > > distributed cache of any kind, just lots and lots of really fast
> > > disks. Due to the nature of our content, we have some files that are
> > > insanely popular, and a lot of long tail content that gets played
> > > infrequently. I don't remember the exact numbers, but I'd guesstimate
> > > that the top 50GB of our many TB of files accounts for 40-60% of our
> > > streams on any given day.
> > >
> > > What I'd love to do is get those popular files served from memory,
> > > which should alleviate load on the disks considerably. Obviously the
> > > file system cache does some of this already, but since it's not
> > > distributed it uses the space a lot less efficiently than a
> > > distributed cache would (say one popular file lives on 3 stream nodes,
> > > it's going to be cached in memory 3 separate times instead of just
> > > once).  We have multiple stream servers, obviously, and between them
> > > we could probably scrounge up 50GB or more for memcached,
> > > theoretically removing the disk load for all of the most popular
> > > content.
> > >
> > > My favorite memory cache is of course memcache, so I'm wondering if
> > > this would be an appropriate use (with the slab size turned way up,
> > > obviously). We're going to start doing some experiments with it, but
> > > I'm wondering what the community thinks.
> > >
> > > Thanks,
> > >
> > > Jay
> > >
> >
> >
> >
> > --
> > awl
>



-- 
awl


Re: Using memcached as a distributed file cache

2009-11-02 Thread Adam Lee
You could also go with a relatively simple solution: tack a two-digit shard
ID onto the front of your key, then use it to direct the request to a
specific cluster internally.  Give the clusters a lot of RAM and rely on the
OS filesystem cache to keep frequently requested files in memory.  It would
be very easy and cheap to build.
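
The routing piece is tiny -- a hypothetical Java sketch (the key format and
the cluster map are invented for the example):

import java.util.Map;

public class ShardRouter {
    // Maps a two-digit shard prefix to a cluster address,
    // e.g. "07" -> "10.0.1.7:11211".
    private final Map<String, String> clusters;

    public ShardRouter(Map<String, String> clusters) {
        this.clusters = clusters;
    }

    // Assumes keys look like "07somefilekey": the first two
    // characters pick the cluster that owns the file.
    public String clusterFor(String key) {
        String cluster = clusters.get(key.substring(0, 2));
        if (cluster == null) {
            throw new IllegalArgumentException("unknown shard for key: " + key);
        }
        return cluster;
    }
}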

-- 
awl

