Re: SOLR - Documents with large number of fields ~ 450
Hi John,

Mark is right. DocValues can be enabled in two ways: RAM-resident (the default) or on-disk. You can read more here: http://www.slideshare.net/LucidImagination/column-stride-fields-aka-docvalues

Regards.

On 22 March 2013 16:55, John Nielsen j...@mcb.dk wrote:
> "with the on disk option." Could you elaborate on that?

On 22/03/2013 05.25, Mark Miller markrmil...@gmail.com wrote:
> You might try using docvalues with the on disk option and try and let the OS manage all the memory needed for all the faceting/sorting. This would require Solr 4.2. - Mark
>
> On Mar 21, 2013, at 2:56 AM, kobe.free.wo...@gmail.com wrote:
> [original question snipped]
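For anyone following along, a minimal schema.xml sketch of the on-disk variant might look like the following. The field and type names are made up for illustration; the syntax is as of Solr 4.2, where docValuesFormat="Disk" on the fieldType selects the disk-resident codec:

```xml
<!-- schema.xml sketch: on-disk DocValues (Solr 4.2+). Names are illustrative. -->
<fieldType name="string_dv_disk" class="solr.StrField"
           sortMissingLast="true" docValuesFormat="Disk"/>

<!-- In Solr 4.x a DocValues field must be required or carry a default value. -->
<field name="manufacturer_facet" type="string_dv_disk"
       indexed="true" stored="false" docValues="true" default=""/>
```

With the "Disk" format the per-field structures live in the index directory and are paged in by the OS, so the JVM heap no longer has to hold the full sort/facet data.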
-- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Documents-with-large-number-of-fields-450-tp4049633.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR - Documents with large number of fields ~ 450
Hi,

I have a collection with more than 4K fields, mostly Trie*Field types. It is used for faceting, sorting, searching, and the StatsComponent. It works pretty well on Amazon EC2, 4 x m1.large boxes (7.5 GB RAM each). I'm using SolrCloud in a multi-AZ setup with ephemeral storage. The index is managed via mmap, with 4 GB for the Java heap and CMS for GC.

There are currently 800K records, but there will be about 2M. Query responses take much longer (a couple to a dozen seconds) during bulk loading, but I think that is fairly typical. Indexing takes much, much longer than it does for records with fewer fields. I'm sending updates in 5 MB batches. No OOM issues.

Regarding DocValues: I believe they are a great improvement for faceting, but their limitations are annoying. As far as I can tell, a DocValues field has to be required or have a default value, which is not possible in my case (I can't default some figures to 0, as that may distort other results displayed to the end user). I wish that would change.

Regards.

On 21 March 2013 07:56, kobe.free.wo...@gmail.com wrote:
> [original question snipped]
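To make the limitation above concrete, here is a sketch (field name made up; behavior as of Solr 4.x) of what such a declaration ends up looking like. Solr rejects a docValues="true" field at startup unless it is required or has a default, so a numeric field is forced to carry the unwanted default:

```xml
<!-- Solr 4.x: docValues="true" requires required="true" or a default value. -->
<!-- default="0" is exactly the workaround described above: missing values -->
<!-- become real zeros and can skew stats/facet results shown to users. -->
<field name="unit_price" type="tfloat" indexed="true" stored="true"
       docValues="true" default="0"/>
```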
Re: SOLR - Documents with large number of fields ~ 450
"with the on disk option." Could you elaborate on that?

On 22/03/2013 05.25, Mark Miller markrmil...@gmail.com wrote:
> You might try using docvalues with the on disk option and try and let the OS manage all the memory needed for all the faceting/sorting. This would require Solr 4.2. - Mark
>
> On Mar 21, 2013, at 2:56 AM, kobe.free.wo...@gmail.com wrote:
> [original question snipped]
SOLR - Documents with large number of fields ~ 450
Hello All,

Scenario: My data model consists of approx. 450 fields with different types of data. We want to index every field, so each record will become a single Solr document with *450 fields*. The total number of records in the data set is *755K*. We will be using faceting and sorting on approx. 50 fields. We are planning to use Solr 4.1.

Following is the hardware configuration of the web server we plan to install Solr on:

CPU: 2 x Dual Core (4 cores) | RAM: 12 GB | Storage: 212 GB

Questions:

1) What's the best approach when dealing with documents with a large number of fields? What's the drawback of having a single document with a very large number of fields? Does Solr support documents with as many fields as in my case?

2) Will there be any performance issue if I define all 450 fields for indexing? And if faceting is done on 50 fields, with documents having a large number of fields and a huge number of records?

3) The field names in the data set are quite lengthy, around 60 characters. Will it be a problem defining fields with such long names in the schema file? Is there any best practice for naming conventions? Will long field names cause problems during querying?

Thanks!
Re: SOLR - Documents with large number of fields ~ 450
You will definitely be pushing the limits for reasonable performance. Maybe 4-5 years from now you will be able to get decent performance with hundreds of fields and dozens of faceted fields, but I'd be surprised if you could get decent performance today with more than about 100 fields and a dozen facets.

The length of a field name should not be a problem for queries, other than readability. Just be sure to stick with Java-style names (alpha, digit, underscore).

The bottom line: do a proof of concept (POC) first - and tell us how it performs.

-- Jack Krupansky

-----Original Message----- From: kobe.free.wo...@gmail.com Sent: Thursday, March 21, 2013 2:56 AM To: solr-user@lucene.apache.org Subject: SOLR - Documents with large number of fields ~ 450

> [original question snipped]
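A POC along those lines can start from a single facet field and grow the list toward the 50 you need, watching heap usage and response times at each step. A sketch of the query shape (collection name and the long field names are placeholders):

```
http://localhost:8983/solr/collection1/select?q=*:*&rows=0
    &facet=true
    &facet.field=customer_region_classification_code_facet
    &facet.field=product_category_level_one_description_facet
    &facet.limit=10
```

rows=0 keeps the response to just the facet counts, which isolates faceting cost from document retrieval cost during the measurement.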
Re: SOLR - Documents with large number of fields ~ 450
Hi,

In short, I suspect you'll OOM if you sort and facet on all these fields.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Thu, Mar 21, 2013 at 2:56 AM, kobe.free.wo...@gmail.com wrote:
> [original question snipped]
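A rough back-of-envelope for why, assuming the pre-DocValues FieldCache path and single-valued string fields (the real footprint varies a lot with term cardinality and multi-valued fields):

```
per field:  ~4 bytes/doc for ordinals  ->  755K docs x 4 B  ~  3 MB
            + that field's unique terms (often tens of MB for string data)
50 fields:  50 x (3 MB + terms)        ->  plausibly several GB of heap,
            rebuilt per field on first sort/facet after each commit
```

On a 12 GB box that also has to hold the JVM, the index page cache, and query working memory, that leaves very little headroom.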
Re: SOLR - Documents with large number of fields ~ 450
You might try using DocValues with the on-disk option, and let the OS manage all the memory needed for the faceting/sorting. This would require Solr 4.2.

- Mark

On Mar 21, 2013, at 2:56 AM, kobe.free.wo...@gmail.com wrote:
> [original question snipped]