Solr slow start up (tlog is small)

2014-11-03 Thread Po-Yu Chuang
Hi,

I am using Solr 4.9 with Tomcat and it works fine except that deploying
solr.war takes too long. While Solr is deploying, all webapps on Tomcat
stop responding, which is unacceptable. Most articles I found say that
this might result from a big transaction log caused by uncommitted
documents, but that is not my case.

At first, the Solr data was 280G and the start-up time was 30 minutes. Then I
set a field to stored=false and re-indexed all the data. The data size became
185G and the start-up time dropped to 17 minutes, but that is still too long.

Here are some numbers I measured:

1)
Solr home: 280G
tlog: 500K
30 min to start up
While starting up, disk read is constantly about 50MB/s (according to
dstat). So it seems that Solr reads 30m * 60s * 50MB/s = 90GB of data while
starting up, which is about 30% of the index data size.

2)
Solr home: 185G
tlog: 5M
17 minutes to start up
While starting up, disk read is constantly about 5MB/s (according to
dstat). So it seems that Solr reads 17m * 60s * 5MB/s = 5GB of data while
starting up, which is about 3% of the index data size.
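
The two estimates above can be re-checked with a short script. This is only a sketch: it assumes the read rate reported by dstat stayed constant for the whole start-up and uses decimal units (1 GB = 1000 MB). The function name is made up for illustration.

```python
def estimate_startup_read(minutes, read_mb_per_s, index_gb):
    """Return (GB read, fraction of index), assuming a constant read rate."""
    # Decimal units are assumed here: 1 GB = 1000 MB.
    read_gb = minutes * 60 * read_mb_per_s / 1000
    return read_gb, read_gb / index_gb


# Figures taken from the two measurements in this message.
for label, minutes, rate, size_gb in [
    ("280G index", 30, 50, 280),
    ("185G index", 17, 5, 185),
]:
    read_gb, fraction = estimate_startup_read(minutes, rate, size_gb)
    print(f"{label}: ~{read_gb:.1f} GB read, ~{fraction:.0%} of index")
```

The point of the comparison is that the amount read does not scale with the index size (roughly 90GB versus 5GB), which suggests start-up is doing something other than simply opening the index.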

P.S. I committed after every 1000 documents added and ran an optimize
after all documents were added.

Any ideas or suggestions would be appreciated.

Thanks,
Po-Yu


Re: Solr slow start up (tlog is small)

2014-11-03 Thread Yonik Seeley
Can you tell from the logs what Solr is doing during that time?
Do you have any warming queries configured?
Also see this: https://issues.apache.org/jira/browse/SOLR-6679
  (comment out suggester related stuff if you aren't using it)

-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets, off-heap data




Re: Solr slow start up (tlog is small)

2014-11-03 Thread Dmitry Kan
One other reason for a slow start-up can be a large number of segments in the
index, which I'm guessing is not the case since you optimized? But anyway,
what is the number of segments in the 280G and 185G indices?
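
One rough way to answer the segment question from the file system is sketched below. It assumes a Lucene 4.x index layout, in which each segment writes exactly one `.si` (segment info) file; the function name and the example path are hypothetical, and other Lucene versions may lay files out differently.

```python
from pathlib import Path


def count_segments(index_dir):
    """Count Lucene segments by counting per-segment .si files.

    Assumption: a Lucene 4.x index directory, one .si file per segment.
    """
    return sum(1 for p in Path(index_dir).iterdir() if p.suffix == ".si")
```

It would be called with the core's index directory, for example `count_segments("/path/to/solr/home/collection1/data/index")` (a placeholder path). A fully optimized index should report a single segment.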

Dmitry


-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: Solr slow start up (tlog is small)

2014-11-03 Thread Po-Yu Chuang
Hi Yonik,

After removing the suggest component, it takes only 7 seconds to start up
now!!! Thank you so much.

Po-Yu
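
For anyone checking whether their own config has a suggester enabled, here is a minimal sketch that lists `searchComponent` entries whose class mentions `SuggestComponent`. The sample XML is a made-up fragment, not my actual solrconfig.xml, and the function name is hypothetical. A real config may pull in parts via includes that this simple parse would not follow, and any request handler that references the component would need to be commented out as well.

```python
import xml.etree.ElementTree as ET


def find_suggest_components(solrconfig_xml):
    """Return names of searchComponent elements whose class mentions SuggestComponent."""
    root = ET.fromstring(solrconfig_xml)
    return [
        el.get("name")
        for el in root.iter("searchComponent")
        if "SuggestComponent" in (el.get("class") or "")
    ]


# Hypothetical fragment for illustration only.
SAMPLE = """<config>
  <searchComponent name="suggest" class="solr.SuggestComponent"/>
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent"/>
</config>"""

print(find_suggest_components(SAMPLE))
```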
