RE: mlockall?

Uwe Schindler Fri, 13 Dec 2013 16:58:38 -0800

Hi Hoss,

full ack! "sysctl -w vm.swappiness=0" is your friend (if you really want to do 
it; I don't recommend it for several reasons, nor do I recommend mlockall).


Mlockall is too risky if we don't have a single main() method that 
calls mlockall directly after starting.
It would also make the build system more complicated, because we would need to 
ship (like ES) with precompiled dll/so files for various OSes, which is not easy 
to handle. Anyone who wants to mlockall can always use an agent jar that does 
this. Or much easier: write your own main() method in a single class that locks 
everything and then delegates to the main() method of your servlet container. 
But this is all out of scope for Solr; it is about how to set up your runtime 
env. I don't want to have that in Lucene or Solr.
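
A minimal sketch of the "own main() that locks, then delegates" idea, for 
illustration only. The class name LockingLauncher and the MemoryLocker hook are 
hypothetical; a real locker would call mlockall(2) through JNI or JNA, which is 
left out here so the example runs with the JDK alone. The first argument is 
assumed to be the servlet container's main class.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public final class LockingLauncher {

    /** Hypothetical hook; a real implementation would call mlockall(2) natively. */
    public interface MemoryLocker {
        boolean lockAll();
    }

    /** Locks memory first, then hands control to the main() of the class named in args[0]. */
    public static void launch(MemoryLocker locker, String[] args) throws Exception {
        if (args.length == 0) {
            throw new IllegalArgumentException("usage: LockingLauncher <main-class> [args...]");
        }
        // Lock before the target class is even loaded, so nothing has been mmapped yet.
        if (!locker.lockAll()) {
            System.err.println("WARN: memory was not locked, continuing anyway");
        }
        Method main = Class.forName(args[0]).getMethod("main", String[].class);
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        main.invoke(null, (Object) rest);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder locker that does nothing and reports failure.
        launch(() -> false, args);
    }
}
```

Whether a failed lock should only warn or abort startup is a policy choice; the 
sketch warns and continues.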

Finally, I had several customers with ES that did exactly the wrong thing:
- They allocated too much heap space (initially about 70% of available RAM), 
just because they did not know better.
- They also used mmap. Because 70% of the RAM was locked, the OS had no chance 
to swap in mmapped pages, and FileChannel.map() threw an OOM exception (the 
same happened for them with NIOFSDir, because NIOFS also needs direct buffers 
outside the heap!).
- Because of the OOM (which was a "special" OOM, not the default "heap space" 
or "permgen" one), they raised -Xmx further.
- You can repeat this several times until you cannot reach your machine anymore 
because all memory is locked and also fragmented... AMEN :-) Hopefully the 
Linux OOM killer kills your processes!
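
The arithmetic behind that failure mode can be sketched as follows. The 
MemoryBudget class and the 16 GiB machine size are assumptions made for 
illustration; only the 70% heap figure comes from the anecdote above.

```java
public final class MemoryBudget {

    /** Bytes left outside the JVM heap for page cache, direct buffers and other processes. */
    public static long bytesOutsideHeap(long totalRamBytes, long maxHeapBytes) {
        if (totalRamBytes <= 0 || maxHeapBytes < 0 || maxHeapBytes > totalRamBytes) {
            throw new IllegalArgumentException("heap must be between 0 and total RAM");
        }
        return totalRamBytes - maxHeapBytes;
    }

    public static void main(String[] args) {
        final long gib = 1024L * 1024 * 1024;
        long ram = 16 * gib;        // hypothetical machine size
        long heap = ram * 7 / 10;   // the "70% of RAM" heap from the anecdote
        long rest = bytesOutsideHeap(ram, heap);
        System.out.printf("heap: %.1f GiB, everything else: %.1f GiB%n",
                (double) heap / gib, (double) rest / gib);
    }
}
```

On such a machine roughly 4.8 GiB would remain for the kernel, direct buffers, 
mmapped index files and every other process, and each further -Xmx increase 
shrinks that remainder.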

With lowered swappiness, this cannot happen (because even with swappiness=0, 
the system will still swap if all goes bad), so you can still reach your system 
and don't make it die. Also: as swapping and mmapping are essentially the same 
thing, you should leave it to the OS kernel to decide when to swap something 
out in favor of swapping some mmapped buffers in!
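
As a small illustration, a startup check could report the current swappiness 
value. This sketch assumes a Linux host, where the setting is exposed at 
/proc/sys/vm/swappiness; the SwappinessCheck class name is hypothetical, and on 
other systems the check simply reports that no value is available.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalInt;

public final class SwappinessCheck {

    /** Linux-only location of the vm.swappiness setting. */
    static final Path SWAPPINESS = Paths.get("/proc/sys/vm/swappiness");

    /** Parses the file content, which is a single integer followed by a newline. */
    public static int parse(String raw) {
        return Integer.parseInt(raw.trim());
    }

    /** Returns the value stored in the given file, or empty if it cannot be read or parsed. */
    public static OptionalInt read(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return OptionalInt.of(parse(new String(bytes, StandardCharsets.US_ASCII)));
        } catch (IOException | NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt value = read(SWAPPINESS);
        if (value.isPresent()) {
            System.out.println("vm.swappiness = " + value.getAsInt());
        } else {
            System.out.println("vm.swappiness not available on this system");
        }
    }
}
```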

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Chris Hostetter [mailto:[email protected]]
> Sent: Friday, December 13, 2013 6:19 PM
> To: [email protected]
> Subject: Re: mlockall?
> 
> 
> : Right, right, I meant to say that I know about that blog post.... but my Q
> : is:
> : If mlockall is such a good thing, why not have it in Lucene or Solr?  Or
> : maybe mlockall is not such a good or simple thing?
> 
> Beyond writing that agent jar, I never put much effort into thinking about
> integrating it directly into Solr for a variety of reasons alluded to in the 
> blog...
> http://searchhub.org/2013/05/21/mlockall-for-all/
> 
> "...it seemed like an unnecessary complication and poor substitute for
> disabling swap on your production servers."
> 
> "There are a few important caveats to using mlockall-agent.jar, mostly
> because there are some important caveats to using mlockall in general (pay
> attention to your ulimits) and some specific caveats to using it in java (make
> sure your min heap = your max heap)"
> 
> ...and in the FAQ at the bottom of the README...
> https://github.com/LucidWorks/mlockall-agent/blob/master/README.txt#L111
> 
> In particular, note the FAQ about MCL_CURRENT and the associated
> comments (straight from CASSANDRA-1214)...
> https://github.com/LucidWorks/mlockall-agent/blob/master/src/MLockAgent.java#L29
> ...my understanding is that mlockall is really only a good idea *before* any
> data is mmapped so you don't try to lock the stuff the OS is already
> mmapping -- doing that from within Solr's source (after the servlet container
> has already started) would be risky.
> 
> Assuming it was worth the technical effort, it would convolute the build
> system a bit, and we'd have to make choices (similar to what Cassandra
> did) of how to deal with it on systems where it's not supported, or in
> instances where the call fails (because the ulimit isn't set high enough) ...
> treating it as an "optional" performance optimization isn't necessarily the
> best approach if people with platforms that do work are counting on it
> working.
> 
> Barring any evidence from someone who's looked into it more than me, my
> current suggestion would be...
> 
>  * don't overload your machines and just disable swap when
>    using solr -- don't worry about mlockall
>  * if you can't disable swap, and you want to run solr with
>    mlockall because your machines are overloaded, use mlockall-agent
> 
> Once Solr evolves to the point where we don't run in a servlet container, and
> have our own public static void main, and ship platform native startup scripts
> where we can handle things like forcing heap min=max, and have overridable
> startup config options for things like "force_mlockall=true"
> then it might be worth revisiting.
> 
> 
> 
> -Hoss
> http://www.lucidworks.com/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected] For additional
> commands, e-mail: [email protected]

