Solr slow start up (tlog is small)
Hi, I am using Solr 4.9 with Tomcat and it works fine except that deploying solr.war takes too long. While Solr is deploying, all webapps on Tomcat stop responding, which is unacceptable. Most articles I found say this can be caused by a large transaction log full of uncommitted documents, but that is not my case.

At first the Solr data was 280G and startup took 30 minutes. Then I set a field to stored=false and re-indexed the whole data set. The data size dropped to 185G and the startup time dropped to 17 minutes, but it is still too long. Here are some numbers I measured:

1) Solr home: 280G, tlog: 500K, 30 minutes to start up. While starting up, disk reads hold steady at about 50MB/s (according to dstat). So it seems that Solr reads 30m * 60s * 50MB/s = 90GB of data while starting up, which is about 30% of the index data size.

2) Solr home: 185G, tlog: 5M, 17 minutes to start up. While starting up, disk reads hold steady at about 5MB/s (according to dstat). So it seems that Solr reads 17m * 60s * 5MB/s = 5GB of data while starting up, which is about 3% of the index data size.

P.S. I committed after every 1,000 documents were added and ran an optimize after all documents were added.

Any ideas or suggestions would be appreciated.

Thanks,
Po-Yu
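The read-volume arithmetic above can be sketched as a quick check. The times and rates come from the message's dstat measurements, which are approximate, so the percentages are rough:

```python
# Back-of-the-envelope check of total data read during startup,
# using the approximate sustained read rates reported by dstat.
def total_read_gb(minutes, mb_per_s):
    """Total data read at a sustained rate, in GB (using 1 GB = 1000 MB)."""
    return minutes * 60 * mb_per_s / 1000

case1 = total_read_gb(30, 50)   # 280G index, 500K tlog
case2 = total_read_gb(17, 5)    # 185G index, 5M tlog

print(f"case 1: {case1:.1f} GB read, ~{case1 / 280:.0%} of the 280 GB index")
print(f"case 2: {case2:.1f} GB read, ~{case2 / 185:.0%} of the 185 GB index")
```

This reproduces the figures in the message: roughly 90 GB read in the first case and about 5 GB in the second.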
Re: Solr slow start up (tlog is small)
Can you tell from the logs what Solr is doing during that time? Do you have any warming queries configured?

Also see this: https://issues.apache.org/jira/browse/SOLR-6679 (comment out suggester related stuff if you aren't using it)

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data

On Mon, Nov 3, 2014 at 11:03 AM, Po-Yu Chuang ratbert.chu...@gmail.com wrote:
> [quoted text snipped]
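For reference, the suggester setup that SOLR-6679 is about looks roughly like the sketch below in solrconfig.xml. The component, field, and factory names here are illustrative of a typical Solr 4.x configuration, not taken from the poster's actual config; the point is that building the suggester dictionary scans the index and can dominate core-load time, so commenting out the component (or at least keeping buildOnStartup off) avoids that cost:

```xml
<!-- Illustrative suggester block; names and fields are examples only.
     Building the suggester dictionary scans the index, which can make
     core load very slow on large indices. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">cat</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <!-- building on startup is what makes core load slow -->
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```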
Re: Solr slow start up (tlog is small)
One other reason for a slow start-up can be a large number of segments in the index, which I'm guessing is not the case since you optimized? But anyway, what's the number of segments in both the 280G and 185G indices?

Dmitry

On Mon, Nov 3, 2014 at 6:17 PM, Yonik Seeley yo...@heliosearch.com wrote:
> [quoted text snipped]

--
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info
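One way to eyeball the segment count without opening the index in Luke is to count the per-segment metadata files. A minimal sketch, assuming a Lucene 4.x codec where each live segment writes one .si (segment info) file; the index path is hypothetical:

```python
import glob
import os

def segment_count(index_dir):
    """Rough segment count: in Lucene 4.x each segment has one .si file."""
    return len(glob.glob(os.path.join(index_dir, "*.si")))

# Path is illustrative; point it at the core's data/index directory.
print(segment_count("/var/solr/collection1/data/index"))
```

An optimized (force-merged) index should report a single segment, so a large count here would point back at merging rather than the suggester.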
Re: Solr slow start up (tlog is small)
Hi Yonik,

After removing the suggest component, it takes only 7 seconds to start up now!!! Thank you so much.

Po-Yu

On Mon, Nov 3, 2014 at 11:17 AM, Yonik Seeley yo...@heliosearch.com wrote:
> [quoted text snipped]