Re: Occasional Solr performance issues

2012-10-29 Thread Dotan Cohen
On Mon, Oct 29, 2012 at 7:04 AM, Shawn Heisey  wrote:
> They are indeed Java options.  The first two control the maximum and
> starting heap sizes.  NewRatio controls the relative size of the young and
> old generations, making the young generation considerably larger than it is
> by default.  The others are garbage collector options.  This seems to be a
> good summary:
>
> http://www.petefreitag.com/articles/gctuning/
>
> Here's the official Sun (Oracle) documentation on GC tuning:
>
> http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
>

Thank you Shawn! Those are exactly the documents that I need. Google
should hire you to fill in the pages when someone searches for "java
garbage collection". Interestingly, I just checked and bing.com does
list the Oracle page on the first page of results. I shudder to think
that I might have to switch search engines!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-28 Thread Shawn Heisey

On 10/28/2012 2:28 PM, Dotan Cohen wrote:

On Fri, Oct 26, 2012 at 11:04 PM, Shawn Heisey  wrote:

Warming doesn't seem to be a problem here -- all your warm times are zero,
so I am going to take a guess that it may be a heap/GC issue.  I would
recommend starting with the following additional arguments to your JVM.
Since I have no idea how solr gets started on your server, I don't know
where you would add these:

-Xmx4096M -Xms4096M -XX:NewRatio=1 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled

Thanks. I've added those flags to the Solr line that I use to start
Solr. Those are Java flags, not Solr, correct? I'm googling the flags
now, but I find it interesting that I cannot find a canonical
reference for them.


They are indeed Java options.  The first two control the maximum and 
starting heap sizes.  NewRatio controls the relative size of the young 
and old generations, making the young generation considerably larger 
than it is by default.  The others are garbage collector options.  This 
seems to be a good summary:


http://www.petefreitag.com/articles/gctuning/

Here's the official Sun (Oracle) documentation on GC tuning:

http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html

Thanks,
Shawn



Re: Occasional Solr performance issues

2012-10-28 Thread Dotan Cohen
On Fri, Oct 26, 2012 at 11:04 PM, Shawn Heisey  wrote:
> Warming doesn't seem to be a problem here -- all your warm times are zero,
> so I am going to take a guess that it may be a heap/GC issue.  I would
> recommend starting with the following additional arguments to your JVM.
> Since I have no idea how solr gets started on your server, I don't know
> where you would add these:
>
> -Xmx4096M -Xms4096M -XX:NewRatio=1 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
>

Thanks. I've added those flags to the Solr line that I use to start
Solr. Those are Java flags, not Solr, correct? I'm googling the flags
now, but I find it interesting that I cannot find a canonical
reference for them.


> This allocates 4GB of RAM to java, sets up a larger than normal Eden space
> in the heap, and uses garbage collection options that usually fare better in
> a server environment than the default. Java memory management options are
> like religion to some people ... I may start a flamewar with these
> recommendations. ;)  The best I can tell you about these choices: They made
> a big difference for me.
>

Thanks. I will experiment with them empirically. The first step is to
learn to read the debug info, though. I've been googling for days, but
I must be missing something. Where is the information that I pasted in
pastebin documented?


> I would also recommend switching to a Sun/Oracle jvm.  I have heard that
> previous versions of Solr were not happy on variants like OpenJDK, I have no
> idea whether that might still be the case with 4.0.  If you choose to do
> this, you probably have package choices in Ubuntu.  I know that in Debian,
> the package is called sun-java6-jre ... Ubuntu is probably something
> similar. Debian has a CLI command 'update-java-alternatives' that will
> quickly switch between different java implementations that are installed.
> Hopefully Ubuntu also has this.  If not, you might need the following
> command instead to switch the main java executable:
>
> update-alternatives --config java
>

Thanks, I will take a look at the current Oracle JVM.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-26 Thread Shawn Heisey

On 10/26/2012 9:41 AM, Dotan Cohen wrote:

On the dashboard of the GUI, it lists all the jvm arguments. Include those.

Click Java Properties and gather the "java.runtime.version" and
"java.specification.vendor" information.

After one of the long update times, pause/stop your indexing application.
Click on your core in the GUI, open Plugins/Stats, and paste the following
bits with a header to indicate what each section is:
CACHE->filterCache
CACHE->queryResultCache
CORE->searcher

Thanks,
Shawn

Thank you Shawn. The information is here:
http://pastebin.com/aqEfeYVA



Warming doesn't seem to be a problem here -- all your warm times are 
zero, so I am going to take a guess that it may be a heap/GC issue.  I 
would recommend starting with the following additional arguments to your 
JVM.  Since I have no idea how solr gets started on your server, I don't 
know where you would add these:


-Xmx4096M -Xms4096M -XX:NewRatio=1 -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled


This allocates 4GB of RAM to java, sets up a larger than normal Eden 
space in the heap, and uses garbage collection options that usually fare 
better in a server environment than the default. Java memory management
options are like religion to some people ... I may start a flamewar with 
these recommendations. ;)  The best I can tell you about these choices: 
They made a big difference for me.


I would also recommend switching to a Sun/Oracle jvm.  I have heard that 
previous versions of Solr were not happy on variants like OpenJDK, I 
have no idea whether that might still be the case with 4.0.  If you 
choose to do this, you probably have package choices in Ubuntu.  I know 
that in Debian, the package is called sun-java6-jre ... Ubuntu is 
probably something similar. Debian has a CLI command 
'update-java-alternatives' that will quickly switch between different 
java implementations that are installed.  Hopefully Ubuntu also has 
this.  If not, you might need the following command instead to switch 
the main java executable:


update-alternatives --config java

Thanks,
Shawn



Re: Occasional Solr performance issues

2012-10-26 Thread Dotan Cohen
On Fri, Oct 26, 2012 at 4:02 PM, Shawn Heisey  wrote:
>
> Taking all the information I've seen so far, my bet is on either cache
> warming or heap/GC trouble as the source of your problem.  It's now specific
> information gathering time.  Can you gather all the following information
> and put it into a web paste page, such as pastie.org, and reply with the
> link?  I have gathered the same information from my test server and created
> a pastie example. http://pastie.org/5118979
>
> On the dashboard of the GUI, it lists all the jvm arguments. Include those.
>
> Click Java Properties and gather the "java.runtime.version" and
> "java.specification.vendor" information.
>
> After one of the long update times, pause/stop your indexing application.
> Click on your core in the GUI, open Plugins/Stats, and paste the following
> bits with a header to indicate what each section is:
> CACHE->filterCache
> CACHE->queryResultCache
> CORE->searcher
>
> Thanks,
> Shawn
>

Thank you Shawn. The information is here:
http://pastebin.com/aqEfeYVA

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-26 Thread Shawn Heisey

On 10/26/2012 7:16 AM, Dotan Cohen wrote:

I spoke too soon! Whereas three days ago, when the index was new, 500
records could be written to it in <3 seconds, now that operation is
taking a minute and a half, sometimes longer. I ran optimize() but
that did not help the writes. What can I do to improve the write
performance?

Even opening the Logging tab of the Solr instance is taking quite a
long time. In fact, I just left it for 20 minutes and it still hasn't
come back with anything. I do have an SSH window open on the server
hosting Solr and it doesn't look overloaded at all:

$ date && du -sh data/ && uptime && free -m
Fri Oct 26 13:15:59 UTC 2012
578M    data/
  13:15:59 up 4 days, 17:59,  1 user,  load average: 0.06, 0.12, 0.22
             total       used       free     shared    buffers     cached
Mem:         14980       3237      11743          0        284
-/+ buffers/cache:        729      14250
Swap:            0          0          0


Taking all the information I've seen so far, my bet is on either cache 
warming or heap/GC trouble as the source of your problem.  It's now 
specific information gathering time.  Can you gather all the following 
information and put it into a web paste page, such as pastie.org, and 
reply with the link?  I have gathered the same information from my test 
server and created a pastie example. http://pastie.org/5118979


On the dashboard of the GUI, it lists all the jvm arguments. Include those.

Click Java Properties and gather the "java.runtime.version" and 
"java.specification.vendor" information.


After one of the long update times, pause/stop your indexing 
application.  Click on your core in the GUI, open Plugins/Stats, and 
paste the following bits with a header to indicate what each section is:

CACHE->filterCache
CACHE->queryResultCache
CORE->searcher

Thanks,
Shawn



Re: Occasional Solr performance issues

2012-10-26 Thread Dotan Cohen
I spoke too soon! Whereas three days ago, when the index was new, 500
records could be written to it in <3 seconds, now that operation is
taking a minute and a half, sometimes longer. I ran optimize() but
that did not help the writes. What can I do to improve the write
performance?

Even opening the Logging tab of the Solr instance is taking quite a
long time. In fact, I just left it for 20 minutes and it still hasn't
come back with anything. I do have an SSH window open on the server
hosting Solr and it doesn't look overloaded at all:

$ date && du -sh data/ && uptime && free -m
Fri Oct 26 13:15:59 UTC 2012
578M    data/
 13:15:59 up 4 days, 17:59,  1 user,  load average: 0.06, 0.12, 0.22
             total       used       free     shared    buffers     cached
Mem:         14980       3237      11743          0        284
-/+ buffers/cache:        729      14250
Swap:            0          0          0


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-26 Thread Dotan Cohen
On Wed, Oct 24, 2012 at 4:33 PM, Walter Underwood  wrote:
> Please consider never running "optimize". That should be called "force merge".
>

Thanks. I have been letting the system run for about two days already
without an optimize. I will let it run a week, then merge to see the
effect.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-24 Thread Walter Underwood
Please consider never running "optimize". That should be called "force merge". 

wunder

On Oct 24, 2012, at 3:28 AM, Dotan Cohen wrote:

> On Tue, Oct 23, 2012 at 3:07 PM, Erick Erickson  
> wrote:
>> Maybe you've been looking at it but one thing that I didn't see on a fast
>> scan was that maybe the commit bit is the problem. When you commit,
>> eventually the segments will be merged and a new searcher will be opened
>> (this is true even if you're NOT optimizing). So you're effectively 
>> committing
>> every 1-2 seconds, creating many segments which get merged, but more
>> importantly opening new searchers (which you are getting since you pasted
>> the message: Overlapping onDeckSearchers=2).
>> 
>> You could pinpoint this by NOT committing explicitly, just set your 
>> autocommit
>> parameters (or specify commitWithin in your indexing program, which is
>> preferred). Try setting it at a minute or so and see if your problem goes 
>> away
>> perhaps?
>> 
>> The NRT stuff happens on soft commits, so you have that option to have the
>> documents immediately available for search.
>> 
> 
> 
> Thanks, Erick. I'll play around with different configurations. So far
> just removing the periodic optimize command worked wonders. I'll see
> how much it helps or hurts to run that daily or more or less frequent.
> 
> 
> -- 
> Dotan Cohen
> 
> http://gibberish.co.il
> http://what-is-what.com






Re: Occasional Solr performance issues

2012-10-24 Thread Dotan Cohen
On Tue, Oct 23, 2012 at 3:07 PM, Erick Erickson  wrote:
> Maybe you've been looking at it but one thing that I didn't see on a fast
> scan was that maybe the commit bit is the problem. When you commit,
> eventually the segments will be merged and a new searcher will be opened
> (this is true even if you're NOT optimizing). So you're effectively committing
> every 1-2 seconds, creating many segments which get merged, but more
> importantly opening new searchers (which you are getting since you pasted
> the message: Overlapping onDeckSearchers=2).
>
> You could pinpoint this by NOT committing explicitly, just set your autocommit
> parameters (or specify commitWithin in your indexing program, which is
> preferred). Try setting it at a minute or so and see if your problem goes away
> perhaps?
>
> The NRT stuff happens on soft commits, so you have that option to have the
> documents immediately available for search.
>


Thanks, Erick. I'll play around with different configurations. So far
just removing the periodic optimize command worked wonders. I'll see
how much it helps or hurts to run that daily or more or less frequent.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-23 Thread Erick Erickson
Maybe you've been looking at it but one thing that I didn't see on a fast
scan was that maybe the commit bit is the problem. When you commit,
eventually the segments will be merged and a new searcher will be opened
(this is true even if you're NOT optimizing). So you're effectively committing
every 1-2 seconds, creating many segments which get merged, but more
importantly opening new searchers (which you are getting since you pasted
the message: Overlapping onDeckSearchers=2).

You could pinpoint this by NOT committing explicitly, just set your autocommit
parameters (or specify commitWithin in your indexing program, which is
preferred). Try setting it at a minute or so and see if your problem goes away
perhaps?

The NRT stuff happens on soft commits, so you have that option to have the
documents immediately available for search.
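As a concrete example of the commitWithin approach, the parameter can be sent with each update request (a sketch; the localhost URL, the collection1 core name, and the sample document are assumptions):

```shell
# Post documents with commitWithin=60000 (milliseconds) instead of an
# explicit commit; Solr makes them searchable within that window itself.
# The host, port, and core name here are assumptions.
curl 'http://localhost:8983/solr/collection1/update?commitWithin=60000' \
     -H 'Content-Type: application/json' \
     -d '[{"id":"doc1","title":"example document"}]'
```

This decouples indexing batches from searcher reopening, so a burst of 50-document posts no longer opens a new searcher every 1-2 seconds.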

Best
Erick

On Mon, Oct 22, 2012 at 10:44 AM, Dotan Cohen  wrote:
> I've got a script writing ~50 documents to Solr at a time, then
> commiting. Each of these documents is no longer than 1 KiB of text,
> some much less. Usually the write-and-commit will take 1-2 seconds or
> less, but sometimes it can go over 60 seconds.
>
> During a recent time of over-60-second write-and-commits, I saw that
> the server did not look overloaded:
>
> $ uptime
>  14:36:46 up 19:20,  1 user,  load average: 1.08, 1.16, 1.16
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         14980       2091      12889          0        233       1243
> -/+ buffers/cache:        613      14366
> Swap:            0          0          0
>
> Other than Solr, nothing is running on this machine other than stock
> Ubuntu Server services (no Apache, no MySQL). The machine is running
> on an Extra Large Amazon EC2 instance, with a virtual 4-core 2.4 GHz
> Xeon processor and ~16 GiB of RAM. The solr home is on a mounted EBS
> volume.
>
> What might make some queries take so long, while others perform fine?
>
> Thanks.
>
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Tue, Oct 23, 2012 at 3:52 AM, Shawn Heisey  wrote:
> As soon as you make any change at all to an index, it's no longer
> "optimized."  Delete one document, add one document, anything.  Most of the
> time you will not see a performance increase from optimizing an index that
> consists of one large segment and a bunch of very tiny segments or deleted
> documents.
>

I've since realized that by experimentation. I've probably saved quite
a few minutes of reading time by investing hours of experiment time!


> How big is your index, and did you run this right after a reboot?  If you
> did, then the cache will be fairly empty, and Solr has only read enough from
> the index files to open the searcher. The number is probably too small to
> show up on a gigabyte scale.  As you issue queries, the cached amount will
> get bigger.  If your index is small enough to fit in the 14GB of free RAM
> that you have, you can manually populate the disk cache by going to your
> index directory and doing 'cat * > /dev/null' from the commandline or a
> script.  The first time you do it, it may go slowly, but if you immediately
> do it again, it will complete VERY fast -- the data will all be in RAM.
>

The cat trick to get the files in RAM is great. I would not have
thought that would work for binary files.

The index is small, much less than the available RAM, for the time
being. Therefore, I now understand that there was not much to fill the
cache with.
Both 'free' outputs were after the system had been running for some
time.


> The 'free -m' command in your first email shows cache usage of 1243MB, which
> suggests that maybe your index is considerably smaller than your available
> RAM.  Having loads of free RAM is a good thing for just about any workload,
> but especially for Solr. Try running the free command without the -g so you
> can see those numbers in kilobytes.
>
> I have seen a tendency towards creating huge caches in Solr because people
> have lots of memory.  It's important to realize that the OS is far better at
> the overall job of caching the index files than Solr itself is.  Solr caches
> are meant to cache result sets from queries and filters, not large sections
> of the actual index contents.  Make the caches big enough that you see some
> benefit, but not big enough to suck up all your RAM.
>

I see, thanks.


> If you are having warm time problems, make the autowarm counts low.  I have
> run into problems with warming on my filter cache, because we have filters
> that are extremely hairy and slow to run. I had to reduce my autowarm count
> on the filter cache to FOUR, with a cache size of 512.  When it is 8 or
> higher, it can take over a minute to autowarm.
>

I will have to experiment with the warming. Thank you for the tips.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Shawn Heisey

On 10/22/2012 3:11 PM, Dotan Cohen wrote:

On Mon, Oct 22, 2012 at 10:01 PM, Walter Underwood
 wrote:

First, stop optimizing. You do not need to manually force merges. The system 
does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and 
might be the cause of your problem.


Thanks. Looking at the index statistics, I see that within minutes
of running optimize, the stats say the index needs to be
reoptimized. Though the index still reads and writes fine even in
that state.


As soon as you make any change at all to an index, it's no longer 
"optimized."  Delete one document, add one document, anything.  Most of 
the time you will not see a performance increase from optimizing an 
index that consists of one large segment and a bunch of very tiny 
segments or deleted documents.



Second, the OS will use the "extra" memory for file buffers, which really helps 
performance, so you might not need to do anything. This will work better after you stop 
forcing merges. A forced merge replaces every file, so the OS needs to reload everything 
into file buffers.


I don't see that the memory is being used:

$ free -g
             total       used       free     shared    buffers     cached
Mem:            14          2         12          0          0          1
-/+ buffers/cache:          0         14
Swap:            0          0          0


How big is your index, and did you run this right after a reboot?  If 
you did, then the cache will be fairly empty, and Solr has only read 
enough from the index files to open the searcher. The number is probably
too small to show up on a gigabyte scale.  As you issue queries, the 
cached amount will get bigger.  If your index is small enough to fit in 
the 14GB of free RAM that you have, you can manually populate the disk 
cache by going to your index directory and doing 'cat * > /dev/null' 
from the commandline or a script.  The first time you do it, it may go 
slowly, but if you immediately do it again, it will complete VERY fast 
-- the data will all be in RAM.
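The cache-priming trick can be wrapped in a small script (the default index path below is an assumption; point it at the core's actual data/index directory):

```shell
# Read every file in the index directory once; the bytes go to /dev/null,
# but the OS page cache keeps them, so Solr's later reads hit RAM.
INDEX_DIR="${1:-/opt/solr/example/solr/collection1/data/index}"  # assumed path
for f in "$INDEX_DIR"/*; do
    [ -f "$f" ] && cat "$f" > /dev/null
done
```

Running it a second time immediately afterwards should finish almost instantly, which is an easy way to confirm the data really is sitting in the page cache.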


The 'free -m' command in your first email shows cache usage of 1243MB, 
which suggests that maybe your index is considerably smaller than your 
available RAM.  Having loads of free RAM is a good thing for just about 
any workload, but especially for Solr. Try running the free command
without the -g so you can see those numbers in kilobytes.


I have seen a tendency towards creating huge caches in Solr because 
people have lots of memory.  It's important to realize that the OS is 
far better at the overall job of caching the index files than Solr 
itself is.  Solr caches are meant to cache result sets from queries and 
filters, not large sections of the actual index contents.  Make the 
caches big enough that you see some benefit, but not big enough to suck 
up all your RAM.


If you are having warm time problems, make the autowarm counts low.  I 
have run into problems with warming on my filter cache, because we have 
filters that are extremely hairy and slow to run. I had to reduce my 
autowarm count on the filter cache to FOUR, with a cache size of 512.  
When it is 8 or higher, it can take over a minute to autowarm.


Thanks,
Shawn



Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Mon, Oct 22, 2012 at 10:44 PM, Walter Underwood
 wrote:
> Lucene already did that:
>
> https://issues.apache.org/jira/browse/LUCENE-3454
>
> Here is the Solr issue:
>
> https://issues.apache.org/jira/browse/SOLR-3141
>
> People over-use this regardless of the name. In Ultraseek Server, it was 
> called "force merge" and we had to tell people to stop doing that nearly 
> every month.
>

Thank you for those links. I commented on the Solr bug. There are some
very insightful comments in there.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Mon, Oct 22, 2012 at 10:01 PM, Walter Underwood
 wrote:
> First, stop optimizing. You do not need to manually force merges. The system 
> does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and 
> might be the cause of your problem.
>

Thanks. Looking at the index statistics, I see that within minutes
of running optimize, the stats say the index needs to be
reoptimized. Though the index still reads and writes fine even in
that state.


> Second, the OS will use the "extra" memory for file buffers, which really 
> helps performance, so you might not need to do anything. This will work 
> better after you stop forcing merges. A forced merge replaces every file, so 
> the OS needs to reload everything into file buffers.
>

I don't see that the memory is being used:

$ free -g
             total       used       free     shared    buffers     cached
Mem:            14          2         12          0          0          1
-/+ buffers/cache:          0         14
Swap:            0          0          0

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Yonik Seeley
On Mon, Oct 22, 2012 at 4:39 PM, Michael Della Bitta
 wrote:
> Has the Solr team considered renaming the optimize function to avoid
> leading people down the path of this antipattern?

If it were never the right thing to do, it could simply be removed.
The problem is that it's sometimes the right thing to do - but it
depends heavily on the use cases and trade-offs.  The best thing is to
simply document what it does and the cost of doing it.

-Yonik
http://lucidworks.com


Re: Occasional Solr performance issues

2012-10-22 Thread Walter Underwood
Lucene already did that:

https://issues.apache.org/jira/browse/LUCENE-3454

Here is the Solr issue:

https://issues.apache.org/jira/browse/SOLR-3141

People over-use this regardless of the name. In Ultraseek Server, it was called 
"force merge" and we had to tell people to stop doing that nearly every month.

wunder

On Oct 22, 2012, at 1:39 PM, Michael Della Bitta wrote:

> Has the Solr team considered renaming the optimize function to avoid
> leading people down the path of this antipattern?
> 
> Michael Della Bitta
> 
> 
> Appinions
> 18 East 41st Street, 2nd Floor
> New York, NY 10017-6271
> 
> www.appinions.com
> 
> Where Influence Isn’t a Game
> 
> 
> On Mon, Oct 22, 2012 at 4:01 PM, Walter Underwood  
> wrote:
>> First, stop optimizing. You do not need to manually force merges. The system 
>> does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO 
>> and might be the cause of your problem.
>> 
>> Second, the OS will use the "extra" memory for file buffers, which really 
>> helps performance, so you might not need to do anything. This will work 
>> better after you stop forcing merges. A forced merge replaces every file, so 
>> the OS needs to reload everything into file buffers.
>> 
>> wunder
>> 
>> On Oct 22, 2012, at 12:55 PM, Dotan Cohen wrote:
>> 
>>> On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller  wrote:
>>>> Perhaps you can grab a snapshot of the stack traces when the 60 second
>>>> delay is occurring?
>>>>
>>>> You can get the stack traces right in the admin ui, or you can use
>>>> another tool (jconsole, visualvm, jstack cmd line, etc)
>>>>
>>> Thanks. I've refactored so that the index is optimized once per hour,
>>> instead of after each dump of commits. But when I need to increase
>>> the optimize frequency in the future I will go through the stack
>>> traces. Thanks!
>>> 
>>> In any case, the server has an extra 14 GiB of memory available, how
>>> might I make the best use of that for Solr assuming both heavy reads
>>> and writes?
>>> 
>>> Thanks.
>>> 
>>> --
>>> Dotan Cohen
>>> 
>>> http://gibberish.co.il
>>> http://what-is-what.com
>> 
>> 
>> 
>> 

--
Walter Underwood
wun...@wunderwood.org





Re: Occasional Solr performance issues

2012-10-22 Thread Michael Della Bitta
Has the Solr team considered renaming the optimize function to avoid
leading people down the path of this antipattern?

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Oct 22, 2012 at 4:01 PM, Walter Underwood  wrote:
> First, stop optimizing. You do not need to manually force merges. The system 
> does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and 
> might be the cause of your problem.
>
> Second, the OS will use the "extra" memory for file buffers, which really 
> helps performance, so you might not need to do anything. This will work 
> better after you stop forcing merges. A forced merge replaces every file, so 
> the OS needs to reload everything into file buffers.
>
> wunder
>
> On Oct 22, 2012, at 12:55 PM, Dotan Cohen wrote:
>
>> On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller  wrote:
>>> Perhaps you can grab a snapshot of the stack traces when the 60 second
>>> delay is occurring?
>>>
>>> You can get the stack traces right in the admin ui, or you can use
>>> another tool (jconsole, visualvm, jstack cmd line, etc)
>>>
>> Thanks. I've refactored so that the index is optimized once per hour,
>> instead of after each dump of commits. But when I need to increase
>> the optimize frequency in the future I will go through the stack
>> traces. Thanks!
>>
>> In any case, the server has an extra 14 GiB of memory available, how
>> might I make the best use of that for Solr assuming both heavy reads
>> and writes?
>>
>> Thanks.
>>
>> --
>> Dotan Cohen
>>
>> http://gibberish.co.il
>> http://what-is-what.com
>
>
>
>


Re: Occasional Solr performance issues

2012-10-22 Thread Walter Underwood
First, stop optimizing. You do not need to manually force merges. The system 
does a great job. Forcing merges (optimize) uses a lot of CPU and disk IO and 
might be the cause of your problem.

Second, the OS will use the "extra" memory for file buffers, which really helps 
performance, so you might not need to do anything. This will work better after 
you stop forcing merges. A forced merge replaces every file, so the OS needs to 
reload everything into file buffers.

wunder

On Oct 22, 2012, at 12:55 PM, Dotan Cohen wrote:

> On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller  wrote:
>> Perhaps you can grab a snapshot of the stack traces when the 60 second
>> delay is occurring?
>> 
>> You can get the stack traces right in the admin ui, or you can use
>> another tool (jconsole, visualvm, jstack cmd line, etc)
>> 
> Thanks. I've refactored so that the index is optimized once per hour,
> instead of after each dump of commits. But when I need to increase
> the optimize frequency in the future I will go through the stack
> traces. Thanks!
> 
> In any case, the server has an extra 14 GiB of memory available, how
> might I make the best use of that for Solr assuming both heavy reads
> and writes?
> 
> Thanks.
> 
> -- 
> Dotan Cohen
> 
> http://gibberish.co.il
> http://what-is-what.com






Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Mon, Oct 22, 2012 at 9:22 PM, Mark Miller  wrote:
> Perhaps you can grab a snapshot of the stack traces when the 60 second
> delay is occurring?
>
> You can get the stack traces right in the admin ui, or you can use
> another tool (jconsole, visualvm, jstack cmd line, etc)
>
Thanks. I've refactored so that the index is optimized once per hour,
instead of after each dump of commits. But when I need to increase
the optimize frequency in the future I will go through the stack
traces. Thanks!

In any case, the server has an extra 14 GiB of memory available, how
might I make the best use of that for Solr assuming both heavy reads
and writes?

Thanks.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Mark Miller
Perhaps you can grab a snapshot of the stack traces when the 60 second
delay is occurring?

You can get the stack traces right in the admin ui, or you can use
another tool (jconsole, visualvm, jstack cmd line, etc)

- Mark

On Mon, Oct 22, 2012 at 1:47 PM, Dotan Cohen  wrote:
> On Mon, Oct 22, 2012 at 7:29 PM, Shawn Heisey  wrote:
>> On 10/22/2012 9:58 AM, Dotan Cohen wrote:
>>>
>>> Thank you, I have gone over the Solr admin panel twice and I cannot find
>>> the cache statistics. Where are they?
>>
>>
>> If you are running Solr4, you can see individual cache autowarming times
>> here, assuming your core is named collection1:
>>
>> http://server:port/solr/#/collection1/plugins/cache?entry=queryResultCache
>> http://server:port/solr/#/collection1/plugins/cache?entry=filterCache
>>
>> The warmup time for the entire searcher can be found here:
>>
>> http://server:port/solr/#/collection1/plugins/core?entry=searcher
>>
>>
>
> Thank you Shawn! I can see how I missed that data. I'm reviewing it
> now. Solr has a low barrier to entry, but quite a learning curve. I'm
> loving it!
>
> I see that the server is using less than 2 GiB of memory, whereas it
> is a dedicated Solr server with 16 GiB of memory. I understand that I
> can increase the query and document caches to increase performance,
> but I worry that this will increase the warm-up time to unacceptable
> levels. What is a good strategy for increasing the caches yet
> preserving performance after an optimize operation?
>
> Thanks.
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com



-- 
- Mark


Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Mon, Oct 22, 2012 at 7:29 PM, Shawn Heisey  wrote:
> On 10/22/2012 9:58 AM, Dotan Cohen wrote:
>>
>> Thank you, I have gone over the Solr admin panel twice and I cannot find
>> the cache statistics. Where are they?
>
>
> If you are running Solr4, you can see individual cache autowarming times
> here, assuming your core is named collection1:
>
> http://server:port/solr/#/collection1/plugins/cache?entry=queryResultCache
> http://server:port/solr/#/collection1/plugins/cache?entry=filterCache
>
> The warmup time for the entire searcher can be found here:
>
> http://server:port/solr/#/collection1/plugins/core?entry=searcher
>
>

Thank you Shawn! I can see how I missed that data. I'm reviewing it
now. Solr has a low barrier to entry, but quite a learning curve. I'm
loving it!

I see that the server is using less than 2 GiB of memory, whereas it
is a dedicated Solr server with 16 GiB of memory. I understand that I
can increase the query and document caches to increase performance,
but I worry that this will increase the warm-up time to unacceptable
levels. What is a good strategy for increasing the caches yet
preserving performance after an optimize operation?
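
For what it's worth, the caches in question are configured in
solrconfig.xml; a sketch with purely illustrative numbers (the size
and autowarmCount values below are made up, not recommendations):

```xml
<!-- solrconfig.xml: illustrative values only.  A larger size serves
     more hits from memory; a larger autowarmCount lengthens warm-up
     after each commit/optimize, which is exactly the trade-off being
     asked about. -->
<filterCache class="solr.FastLRUCache"
             size="4096" initialSize="1024" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache"
                  size="4096" initialSize="1024" autowarmCount="64"/>
<!-- The documentCache cannot be autowarmed, so growing it costs no
     warm-up time. -->
<documentCache class="solr.LRUCache"
               size="16384" initialSize="4096" autowarmCount="0"/>
```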

Thanks.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Shawn Heisey

On 10/22/2012 9:58 AM, Dotan Cohen wrote:
Thank you, I have gone over the Solr admin panel twice and I cannot 
find the cache statistics. Where are they?


If you are running Solr4, you can see individual cache autowarming times 
here, assuming your core is named collection1:


http://server:port/solr/#/collection1/plugins/cache?entry=queryResultCache
http://server:port/solr/#/collection1/plugins/cache?entry=filterCache

The warmup time for the entire searcher can be found here:

http://server:port/solr/#/collection1/plugins/core?entry=searcher


If you are on an older Solr release, everything is in various sections 
of the stats page.  Do a page search for "warmup" multiple times to see 
them all:


http://server:port/solr/corename/admin/stats.jsp

Thanks,
Shawn



Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Mon, Oct 22, 2012 at 5:27 PM, Mark Miller  wrote:
> Are you using Solr 3X? The occasional long commit should no longer
> show up in Solr 4.
>

Thank you Mark. In fact, this is the production release of Solr 4.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
On Mon, Oct 22, 2012 at 5:02 PM, Rafał Kuć  wrote:
> Hello!
>
> You can check if the long warming is causing the overlapping
> searchers. Check Solr admin panel and look at cache statistics, there
> should be warmupTime property.
>

Thank you, I have gone over the Solr admin panel twice and I cannot
find the cache statistics. Where are they?


> Lowering the autowarmCount should lower the time needed to warm up,
> however you can also look at your warming queries (if you have any)
> and see how long they take.
>

Thank you, I will look at that!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Occasional Solr performance issues

2012-10-22 Thread Mark Miller
Are you using Solr 3X? The occasional long commit should no longer
show up in Solr 4.

- Mark

On Mon, Oct 22, 2012 at 10:44 AM, Dotan Cohen  wrote:
> I've got a script writing ~50 documents to Solr at a time, then
> committing. Each of these documents is no longer than 1 KiB of text,
> some much less. Usually the write-and-commit will take 1-2 seconds or
> less, but sometimes it can go over 60 seconds.
>
> During a recent time of over-60-second write-and-commits, I saw that
> the server did not look overloaded:
>
> $ uptime
>  14:36:46 up 19:20,  1 user,  load average: 1.08, 1.16, 1.16
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         14980       2091      12889          0        233       1243
> -/+ buffers/cache:        613      14366
> Swap:            0          0          0
>
> Other than Solr, nothing is running on this machine other than stock
> Ubuntu Server services (no Apache, no MySQL). The machine is running
> on an Extra Large Amazon EC2 instance, with a virtual 4-core 2.4 GHz
> Xeon processor and ~16 GiB of RAM. The solr home is on a mounted EBS
> volume.
>
> What might make some queries take so long, while others perform fine?
>
> Thanks.
>
>
> --
> Dotan Cohen
>
> http://gibberish.co.il
> http://what-is-what.com



-- 
- Mark


Re: Occasional Solr performance issues

2012-10-22 Thread Rafał Kuć
Hello!

You can check if the long warming is causing the overlapping
searchers. Check Solr admin panel and look at cache statistics, there
should be warmupTime property.

Lowering the autowarmCount should lower the time needed to warm up,
however you can also look at your warming queries (if you have any)
and see how long they take.

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

> When Solr is slow, I'm seeing these in the logs:
> [collection1] Error opening new searcher. exceeded limit of
maxWarmingSearchers=2, try again later.
> [collection1] PERFORMANCE WARNING: Overlapping onDeckSearchers=2

> Googling, I found this in the FAQ:
> "Typically the way to avoid this error is to either reduce the
> frequency of commits, or reduce the amount of warming a searcher does
> while it's on deck (by reducing the work in newSearcher listeners,
> and/or reducing the autowarmCount on your caches)"
> http://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F

> I happen to know that the script will try to commit once every 60
> seconds. How does one "reduce the work in newSearcher listeners"? What
> effect will this have? What effect will reducing the autowarmCount on
> caches have?

> Thanks.



Re: Occasional Solr performance issues

2012-10-22 Thread Dotan Cohen
When Solr is slow, I'm seeing these in the logs:
[collection1] Error opening new searcher. exceeded limit of
maxWarmingSearchers=2, try again later.
[collection1] PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Googling, I found this in the FAQ:
"Typically the way to avoid this error is to either reduce the
frequency of commits, or reduce the amount of warming a searcher does
while it's on deck (by reducing the work in newSearcher listeners,
and/or reducing the autowarmCount on your caches)"
http://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F

I happen to know that the script will try to commit once every 60
seconds. How does one "reduce the work in newSearcher listeners"? What
effect will this have? What effect will reducing the autowarmCount on
caches have?
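
For context on my own question: "the work in newSearcher listeners"
refers to the static warming queries configured in solrconfig.xml,
which run against every new searcher before it serves traffic. Fewer
or cheaper queries there mean a faster warm-up at the cost of a colder
first search. A sketch (the query and sort field are illustrative):

```xml
<!-- solrconfig.xml: trimming this list, or the sort/facet work each
     query does, is what "reducing the work in newSearcher listeners"
     means. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">*:*</str><str name="sort">id asc</str></lst>
  </arr>
</listener>
```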

Thanks.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com