Re: millions of records problem
Getting a solid-state drive might help.
millions of records problem
Hi,

I've got 500 million documents in Solr, each with the same number of fields and similar field widths. The version of Solr I'm using is 1.4.1 with Lucene 2.9.3. I don't have the option to use shards, so the whole index has to live on one machine. The size of the index is about 50 GB and the machine has 8 GB of RAM.

Everything works, but searches are very slow, even though I've tried different configurations in solrconfig.xml, such as:

- configuring a firstSearcher with the most common searches
- configuring the caches (query, filter and document) with large sizes

Everything is still slow, so do you have any ideas to speed up searches without the penalty of using much more RAM?

Thanks in advance,
Jesús

--
Jesús Martín García
CESCA - Tècnic de Projectes
Centre de Serveis Científics i Acadèmics de Catalunya
Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
T. 93 551 6213 · F. 93 205 6979 · jmar...@cesca.cat
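The firstSearcher warming and cache tuning described above would typically be expressed in solrconfig.xml along these lines. This is only a sketch under the Solr 1.4.x configuration style; the cache sizes and the warming query are illustrative, not taken from the poster's actual configuration:

```xml
<!-- Illustrative solrconfig.xml fragment (Solr 1.4.x style);
     cache sizes and the warming query below are made-up examples -->
<query>
  <!-- The three caches mentioned in the post: filter, query result and document -->
  <filterCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="512"/>
  <queryResultCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="512"/>
  <documentCache class="solr.LRUCache" size="16384" initialSize="4096"/>

  <!-- Warm the first searcher with the most-used searches -->
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">a common query</str><str name="rows">10</str></lst>
    </arr>
  </listener>
</query>
```

Note that large caches only help if they fit in the heap; with 8 GB of RAM total, very large cache sizes can make things worse by increasing GC pressure.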
Re: millions of records problem
Hi,

What exactly do you mean by slow search? 1s? 10s? Which operating system, how many CPUs, which servlet container, and how much RAM have you allocated to your JVM (-Xmx)? What kind and size of docs? Your numbers indicate about 100 bytes per doc. What kind of searches? Facets? Sorting? Wildcards?

Have you tried to slim down your schema by setting indexed="false" and stored="false" wherever possible?

My first thought is that it's really impressive if you've managed to get 500 million docs into one index with only 8 GB of RAM! I would expect that to fail or, at best, be very slow. If you have a beefy server, I'd first try putting in 64 GB of RAM, slimming down your schema, and perhaps even switching to Solr 4.0 (trunk), which is more RAM-efficient.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 17. okt. 2011, at 12:19, Jesús Martín García wrote:
[quoted text trimmed]
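Jan's "about 100 bytes per doc" estimate follows directly from the figures in the original post. A quick back-of-the-envelope check, assuming "50Gb" means 50 GiB:

```python
# Back-of-the-envelope: average index bytes per document,
# using the figures from the thread (50 GiB index, 500 million docs).
index_bytes = 50 * 2**30        # 50 GiB
num_docs = 500_000_000

bytes_per_doc = index_bytes / num_docs
print(f"{bytes_per_doc:.0f} bytes per doc")  # roughly 107 bytes per doc
```

With so few bytes per document, the index is small per doc but the aggregate far exceeds the 8 GB of RAM, so most searches will hit disk rather than the OS page cache.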
Re: millions of records problem
You could use this technique? I'm currently reading up on it: http://khaidoan.wikidot.com/solr-common-gram-filter

On 17 October 2011 12:57, Jan Høydahl <jan@cominvent.com> wrote:
[quoted text trimmed]
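The common-grams technique from the linked page is applied at analysis time in schema.xml. A minimal sketch of such a field type, using Solr's CommonGramsFilterFactory; the field type name and the commonwords.txt file are assumed examples, not from the thread:

```xml
<!-- Illustrative schema.xml fieldType; "text_commongrams" and
     "commonwords.txt" are made-up names for this sketch -->
<fieldType name="text_commongrams" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Combine frequent terms with their neighbors into bigram tokens -->
    <filter class="solr.CommonGramsFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsQueryFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

The trade-off: phrase queries containing very common terms get much faster, at the cost of a somewhat larger index, which matters on a machine this RAM-constrained.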
Re: millions of records problem
Hi Jesús,

Others have already asked a number of relevant questions. If I had to guess, I'd guess this is simply a disk IO issue, but of course there may be room for improvement without getting more RAM or SSDs, so tell us more about your queries, the disk IO you are seeing, etc.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Jesús Martín García <jmar...@cesca.cat>
To: solr-user@lucene.apache.org
Sent: Monday, October 17, 2011 6:19 AM
Subject: millions of records problem
[quoted text trimmed]
Re: millions of records problem
Hi,

A number of relevant questions have already been asked. I have another one: which type of docs do you have? Do you add new docs every day, or is it a stable number of docs (500 million)? What about replication?

Regards,
Vadim

2011/10/17 Otis Gospodnetic <otis_gospodne...@yahoo.com>:
[quoted text trimmed]