Re: memory allocation, does memcached play fair ?

2009-06-16 Thread Brian Moon



What I have devised, and which after some load testing appears to be working,
is a mixed-bag method of solving this.
In effect, sessions are stored in the DB. When a returning user comes
to the site, the session data is read from the DB and pushed into the
memcache. The memcache key is also stored in an 'active-sessions' DB
table. All subsequent session interaction with the user is then done
via memcache. When the user stops interacting with us, their session
data is returned to the database and removed from the memcache. This
return is done by polling the active-sessions table and comparing
against data in the memcache, and can be run at a time when usage is
low.


You may want to look at gearman.  It could handle the session writing in 
the background for you so that your page times would not be delayed.
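Something along these lines with the pecl/gearman extension, just as a sketch (the 'save_session' job name and payload shape are placeholders, not anything from your setup):

    <?php
    // Request side: hand the session write-back off to Gearman so the
    // page response is not delayed by the DB write.
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    $client->doBackground('save_session', json_encode(array(
        'session_id' => session_id(),
        'data'       => session_encode(),
    )));

    // Worker side (separate long-running process): pick up the job and
    // do the actual write to the sessions table.
    function save_session_job(GearmanJob $job)
    {
        $payload = json_decode($job->workload(), true);
        // ... UPDATE the sessions table with $payload['data'] here ...
        return true;
    }
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction('save_session', 'save_session_job');
    while ($worker->work());
    ?>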


Brian.


Re: memory allocation, does memcached play fair ?

2009-06-15 Thread tbs

David,

Thank you for the well thought out reply to my post; I agree with a
lot of what you said. Our current session handler actually uses
NFS over RAID on a NAS. However, because of our particular
requirements (retain session data indefinitely, unless the session
file goes untouched for more than 6 months), we are finding that
session retrieval starts off very quick over NFS when there are a
small number of files in the session directory. But as the number of
session files passes a couple of million, NFS bogs down. The end
result is that a page that once took 0.4 seconds to generate now
takes up to 1.5 seconds. This degradation can occur within 3 to 4
weeks of a complete session 'wipe and start again'. As one of our
business requirements is that we have to deliver all pages in less
than a second, you can see where a pure NFS/RAID solution is failing
us. I am totally willing to state for the record that it could be our
hardware/server/OS setup, but those decisions are way behind us now
(CentOS, XFS filesystem, SnapServer with GuardianOS as the NFS NAS)
and I must still solve this issue.
What I have devised, and which after some load testing appears to be working,
is a mixed-bag method of solving this.
In effect, sessions are stored in the DB. When a returning user comes
to the site, the session data is read from the DB and pushed into the
memcache. The memcache key is also stored in an 'active-sessions' DB
table. All subsequent session interaction with the user is then done
via memcache. When the user stops interacting with us, their session
data is returned to the database and removed from the memcache. This
return is done by polling the active-sessions table and comparing
against data in the memcache, and can be run at a time when usage is
low.
This achieves a couple of things for us:
- Session data retrieval during a user session is very quick.
- For most sessions there are only two reads from, and one write back to,
the database. In a case where a user session invokes 10 page views,
this reduces the DB interaction by 80%, and the reduction increases
the more page views we get.
- If the memcache dies, we lose only the immediate data and not the
complete session use history.
- The software handler automatically switches to a DB solution if the
memcache server fails to respond, and can adjust when the server comes
back online.
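
To make the flow concrete, the read side of the handler boils down to roughly this (table names, key prefix and TTL here are simplified placeholders, not our actual schema):

    <?php
    // Sketch only: try memcache first, fall back to the DB on a miss,
    // prime the cache and note the session as active for later write-back.
    function read_session($id, Memcache $mc, PDO $db)
    {
        $data = $mc->get('sess:' . $id);
        if ($data !== false) {
            return $data;                      // hot session, no DB hit
        }
        $stmt = $db->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute(array($id));
        $data = (string) $stmt->fetchColumn();
        $mc->set('sess:' . $id, $data, 0, 3600);   // push into memcache
        // record the key so the poller can write it back and evict later
        $db->prepare('REPLACE INTO active_sessions (id, started) VALUES (?, NOW())')
           ->execute(array($id));
        return $data;
    }
    ?>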

This system will undergo heavy testing this week, so I will
report back on how it performs.

Richard


Re: memory allocation, does memcached play fair ?

2009-06-15 Thread Joseph Engo


Have you tried a bucketing approach? (Using multiple NFS partitions
and/or directories instead of a single giant directory.)
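i.e. something along these lines, so no single directory ever holds more than a fraction of the files (path layout is just an example):

    <?php
    // Example: spread session files across 256 x 256 subdirectories by
    // hashing the session id, instead of one giant directory.
    function session_file_path($session_id, $base = '/mnt/sessions')
    {
        $hash = md5($session_id);
        $dir  = $base . '/' . substr($hash, 0, 2) . '/' . substr($hash, 2, 2);
        if (!is_dir($dir)) {
            mkdir($dir, 0770, true);   // create the bucket on first use
        }
        return $dir . '/sess_' . $session_id;
    }
    ?>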






Re: memory allocation, does memcached play fair ?

2009-06-15 Thread tbs



On Jun 15, 1:22 pm, Joseph Engo dev.toas...@gmail.com wrote:
 Have you tried a bucketing approach ?  (Using multiple NFS partitions  
 and/or directories instead of a single giant directory).

Yes, we tried that, using an additional 2 levels of directories, and it
helped a little, as we can now actually perform 'ls' on the directories!
However, I think there is more to it than appears on the surface. The
SnapServer we use as an NFS server comes with GuardianOS, which in
turn only offers XFS as a Linux filesystem. While XFS is a great
filesystem for large files, I think it is a little unhappy with
storing millions of tiny 3-5k files... ReiserFS would have been
great, but I don't see GuardianOS implementing it, at least until he
(Reiser) is out of jail... My gut is telling me that our issues are a
limitation of the combination of SnapServer, XFS, GuardianOS and NFSv3
(which is the only NFS supported in CentOS by default). Fortunately I
was not responsible for that mess, as it happened before I joined!
I'll let you know how it goes when JMeter slams my new memcache/write-
back/MySQL solution.

Richard





Re: memory allocation, does memcached play fair ?

2009-06-11 Thread David Rolston
I'm not following your argument.  First off, sessions in general are not
good candidates for caching.  Caching is, in my opinion, best reserved for
data that is primarily static or has a high read-to-write ratio.  Memcache,
when used to front MySQL for example, prevents the overhead and contention
that can occur in the database when you have a lot of selects repeatedly
returning the same result set.  When you're reading much more frequently
than you're writing or changing data, memcache works great.  If you have
volatile data, as sessions tend to be, then caching becomes an impediment
and added complexity. In the not too distant past, I did implement a
custom session handler in PHP that stores the session data in memcache.
While this worked well and was reliable, the problem is that if memcached
is restarted, all the sessions are lost.
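
For anyone curious, the skeleton of that kind of handler is roughly the following; a minimal sketch with pecl/memcache, with names, TTL and error handling simplified, so don't treat it as production code:

    <?php
    // Minimal memcache-only session handler: everything lives in memcached,
    // so restarting memcached loses every session (the failure mode above).
    $GLOBALS['mc'] = new Memcache();
    $GLOBALS['mc']->connect('127.0.0.1', 11211);

    function sess_open($path, $name) { return true; }
    function sess_close()            { return true; }
    function sess_read($id)
    {
        $data = $GLOBALS['mc']->get('sess:' . $id);
        return ($data === false) ? '' : $data;   // read must return a string
    }
    function sess_write($id, $data)
    {
        return $GLOBALS['mc']->set('sess:' . $id, $data, 0, 1440);
    }
    function sess_destroy($id)     { return $GLOBALS['mc']->delete('sess:' . $id); }
    function sess_gc($maxlifetime) { return true; }  // memcached expiry handles GC

    session_set_save_handler('sess_open', 'sess_close', 'sess_read',
                             'sess_write', 'sess_destroy', 'sess_gc');
    session_start();
    ?>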

If the sessions are inherently file based, a NetApp appliance and a good
network infrastructure is a great way to go: the simplicity of NFS with the
reliability of RAID 6, and performance limited only by the money you want
to spend on the networking, anywhere from commodity gear to Fibre Channel
or an iSCSI SAN.



Re: memory allocation, does memcached play fair ?

2009-06-10 Thread Brian Moon



The memcached server will sit on the same box as the MySQL server.
Given that on any given machine MySQL will try and take as much memory
as it can, does the invocation of memcached, with a memory allocation
that exceeds the available memory, cause the OS (Linux) to pull the
difference from the MySQL server instance, or does it use swap to make
up the difference?
In other words, what would happen if:


MySQL uses as much memory as you tell it to.  Most likely you are 
telling MySQL it can use more memory than the system has, and therefore 
your observation is that it uses all the memory on the system.  This is 
a common problem.
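
Roughly speaking, MySQL's footprint is its global buffers plus the per-connection buffers multiplied by max_connections, so whatever you put in my.cnf has to leave headroom for memcached and the OS.  Illustrative numbers only, not a recommendation:

    # my.cnf (example values only)
    # global buffers
    innodb_buffer_pool_size = 2048M
    key_buffer_size         = 256M
    # per-connection buffers: budget roughly these x max_connections
    max_connections         = 200
    sort_buffer_size        = 2M
    read_buffer_size        = 1M
    tmp_table_size          = 32M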


To answer your question, it will start to swap stuff from wherever the 
kernel thinks it can.  Swap is bad.



I have 4 GIGs of memory
2 GIG are being used by MySQL
I start up memcached with a memory allocation of 3 GIG.


That is asinine.  Why would you ever do that?  You are configuring for 
failure.



Does memcached ignore the missing GIG, or does it use swap, or does
it steal it from MySQL?


Memcached does not do anything.  The kernel manages memory.  The kernel 
would start to swap things.  You will notice this when your server 
starts crying and dialing 911.



CentOS 2.6.18-53.el5PAE  -  athlon i386 GNU/Linux
8 Gig memory
MySql 5.0.51
Memcached allocation 4 GIG


Why not put memcached on the web servers?  That was what it was 
originally made to do.  You can use a little RAM on those boxes to 
create a large pool and you don't lose all your sessions from one 
restart of memcached.
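
For example, you could start a modest memcached on each web box (something like 'memcached -d -m 256 -p 11211') and let the PHP client hash keys across all of them; hostnames here are made up:

    <?php
    // One pool spanning all the web servers: a restart of any single
    // memcached only loses the slice of sessions hashed to that box.
    $mc = new Memcache();
    foreach (array('web1', 'web2', 'web3') as $host) {
        $mc->addServer($host, 11211);
    }
    ?>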


Brian.


Re: memory allocation, does memcached play fair ?

2009-06-10 Thread tbs



On Jun 10, 2:55 pm, Brian Moon br...@moonspot.net wrote:
  The memcached server will sit on the same box as the MySQL server.
  Given that on any given machine MySQL will try and take as much memory
  as it can, does the invocation of memcached, with a memory allocation
  that exceeds the available memory, cause the OS (linux) to pull the
  difference from the mySql server instance, or does it use swap to make
  up the difference ?
  In other words what would happen if :

 MySQL uses as much memory as you tell it to.  Most likely you are
 telling MySQL it can use more memory than the system has and therefore
 your observation is that it uses all the memory on the system.  This is
 a common problem.

 To answer your question, it will start to swap stuff from wherever the
 kernel thinks it can.  Swap is bad.


okay, that makes sense. I can limit the MySQL memory allocation.

  I have 4 GIGs of memory
  2 GIG are being used by MySQL
  I startup memcached with a memory allocation of 3 GIG.

 That is asinine.  Why would you ever do that?  You are configuring for
 failure.

lol - I understand that; it was just a way of presenting a scenario
where something wasn't right, so you could explain how the kernel would
handle it. I would never do that, and you are quite correct, it would
be asinine in the extreme!


  Does memcached ignore the missing GIG, or does it use swap?, or does
  it steal it from MySQL?

 Memcached does not do anything.  The kernel manages memory.  The kernel
 would start to swap things.  You will notice this when your server
 starts crying and dialing 911.

that is what I figured, but thanks for the clarification.

  CentOS 2.6.18-53.el5PAE  -  athlon i386 GNU/Linux
  8 Gig memory
  MySql 5.0.51
  Memcached allocation 4 GIG

 Why not put memcached on the web servers?  That was what it was
 originally made to do.  You can use a little ram on those boxes to
 create a large pool and you don't lose all your sessions from one
 restart of memcached.

I hear you, I hear you, and that is what I wanted to do too. However,
it was impossible to get that past the pointy-headed bosses. In fact,
getting to use memcached in the first place was a difficult enough
battle. Our sessions are currently flat files over NFS (with 10 million
page views a month) and NFS has a hotline to tears and 911! A pure DB
solution, with our growth path, would have only been a temporary fix
and ended up in the same place as NFS. Using memcached+db is perfect,
and if I could have pinched off a small chunk of memory from all the
front-end servers and created a nice little pool, things would have
been sweet. However, 'da management' has [unqualified] reservations,
so I am forced to do this 'all on one box' method just to get us going.
When it works out perfectly and I am proven right, then I get some
leverage to spread the memory load across multiple servers.
Oh, the joy of engineering versus office politics :(

Richard

