RE: Disable hyper-threading for better Solr performance?
Currently I'm using Solr 4.8.1, but I can move to another version if it performs significantly faster. My goal is to reach the maximum indexing throughput possible on the machine. Since the indexing process seems to be CPU bound, I was wondering whether 32 logical cores with twice as many indexing threads would perform better.

Thanks,
Avner

-----Original Message-----
From: Ilan Schwarts [mailto:ila...@gmail.com]
Sent: Wednesday, March 09, 2016 9:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Disable hyper-threading for better Solr performance?

What is the solr version and shard config? Standalone? Multiple cores? Spread over RAID?

On Mar 9, 2016 9:00 AM, "Avner Levy" <av...@checkpoint.com> wrote:
> I have a machine with 16 real cores (32 with HT enabled).
> I'm running on it a Solr server and trying to reach maximum
> performance for indexing and queries (indexing 20k documents/sec by a
> number of threads).
> I've read on multiple places that in some scenarios / products
> disabling the hyper-threading may result in better performance results.
> I'm looking for inputs / insights about HT on Solr setups.
> Thanks in advance,
> Avner
Disable hyper-threading for better Solr performance?
I have a machine with 16 real cores (32 with HT enabled). I'm running a Solr server on it and trying to reach maximum performance for indexing and queries (indexing 20k documents/sec using a number of threads). I've read in multiple places that in some scenarios / products disabling hyper-threading may give better performance. I'm looking for input / insights about HT on Solr setups.

Thanks in advance,
Avner
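One way to answer this empirically is to run the same workload at several thread counts, once with HT enabled and once with it disabled, and compare throughput. The sketch below is only a harness outline: `burn` is a made-up CPU-only stand-in for indexing work, and the class and method names are illustrative assumptions. A real comparison would replace `burn` with actual document submissions to the Solr server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ThreadScalingProbe {

    // Hypothetical stand-in for indexing work: pure CPU, no I/O.
    static long burn(int iterations) {
        long h = 1125899906842597L;
        for (int i = 0; i < iterations; i++) {
            h = 31 * h + i;
        }
        return h;
    }

    // Runs `tasks` units of work on a pool of `threads` threads and returns tasks per second.
    static double throughput(int threads, int tasks, final int iterationsPerTask) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                Callable<Long> job = () -> burn(iterationsPerTask);
                futures.add(pool.submit(job));
            }
            for (Future<Long> f : futures) {
                f.get(); // wait for completion and consume the result
            }
            long elapsed = Math.max(1, System.nanoTime() - start);
            return tasks / (elapsed / 1e9);
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }

    // Thread counts to try: powers of two up to twice the logical core count.
    static List<Integer> candidateThreadCounts(int logicalCores) {
        List<Integer> counts = new ArrayList<>();
        for (int n = 1; n <= 2 * logicalCores; n *= 2) {
            counts.add(n);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        int logical = Runtime.getRuntime().availableProcessors();
        for (int n : candidateThreadCounts(logical)) {
            System.out.printf("threads=%d tasks/sec=%.0f%n", n, throughput(n, 200, 2_000_000));
        }
    }
}
```

With HT enabled, `availableProcessors()` reports the logical core count (32 on the machine described), so the same program covers both configurations. The numbers are only meaningful when the stand-in is replaced by the real indexing path, since disk and network effects are absent here.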
Distributed Search in Solr with different queries per shard
I have 2 cores: one with active data and one with historical data (for documents which were removed from the active one). I want to run Distributed Search on both and get the unified result (as supported by Solr Distributed Search; I'm not using Solr Cloud). My problem is that the query for each core is different. Is there a way to specify a different query per core and still let Solr unify the query results?

For example:
Active data core query: select all green docs
History core query: select all green docs with year=2012

Is there a way to extend the distributed search handler to support such a scenario?

Thanks in advance,
Avner

One option is to send a unified query to both, but then each core will work harder for no reason.
RE: Distributed Search in Solr with different queries per shard
Yes, there is. But since the real query for each core is very long and complex, I don't want each core to work hard on query parts that are only relevant to other cores. Perhaps I can write a query plugin which will strip the unnecessary parts on each core?

Thanks,
Avner

-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Wednesday, May 21, 2014 6:52 PM
To: solr-user@lucene.apache.org
Subject: Re: Distributed Search in Solr with different queries per shard

Unfortunately the same query will be sent to all cores if you use the shards parameter to query multiple cores. Is there some characteristic of the first core that is distinct from the second core so that you could OR the differences between the two?

-- Jack Krupansky

-----Original Message-----
From: Avner Levy
Sent: Wednesday, May 21, 2014 9:56 AM
To: solr-user@lucene.apache.org
Subject: Distributed Search in Solr with different queries per shard

I have 2 cores. One with active data and one with historical data (for documents which were removed from the active one). I want to run Distributed Search on both and get the unified result (as supported by Solr Distributed Search, I'm not using Solr Cloud). My problem is that the query for each core is different. Is there a way to specify different query per core and still let Solr to unify the query results?

For example:
Active data core query: select all green docs
History core query: select all green docs with year=2012

Is there a way to extend the distributed search handler to support such a scenario?

Thanks in advance,
Avner

One option is to send a unified query to both but then each core will work harder for no reason.
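Jack's suggestion of OR-ing the differences can be illustrated with a small query builder. This is a sketch under a stated assumption: every document carries a marker field saying which core it lives in. The field name `source` and its values `active` and `history` are hypothetical, not something from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class PerCoreQuery {

    // Builds one query for all cores: the shared clause AND-ed with an OR of per-core branches.
    // `extraPerCore` maps a marker-field value to that core's extra clause (empty or null = none).
    static String combine(String shared, String markerField, Map<String, String> extraPerCore) {
        StringJoiner branches = new StringJoiner(" OR ");
        for (Map.Entry<String, String> e : extraPerCore.entrySet()) {
            String marker = markerField + ":" + e.getKey();
            String extra = e.getValue();
            if (extra == null || extra.isEmpty()) {
                branches.add(marker);
            } else {
                branches.add("(" + marker + " AND " + extra + ")");
            }
        }
        return shared + " AND (" + branches + ")";
    }

    public static void main(String[] args) {
        Map<String, String> extras = new LinkedHashMap<>(); // keeps insertion order
        extras.put("active", "");            // hypothetical marker value, no extra clause
        extras.put("history", "year:2012");  // hypothetical marker value plus its extra clause
        System.out.println(combine("color:green", "source", extras));
        // prints: color:green AND (source:active OR (source:history AND year:2012))
    }
}
```

Each core still receives and parses the whole query, so this does not remove the overhead Avner is worried about. It only makes a single `shards` request return the right documents from each core.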
RE: Distributed Search in Solr with different queries per shard
I believe that unifying multiple query results in the application on my own, including facets, paging, sorting and other extra features, is complex as well. Is there some Solr code I can use at the application level to unify multiple results? (This could actually be an interesting direction.) The queries were of course just an example. In real life I have 4 cores with a very complex query for each, so unifying all 4 may cause significant overhead on the system, especially if there are tens of such queries per second.

Thanks,
Avner

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, May 21, 2014 6:13 PM
To: solr-user@lucene.apache.org
Subject: Re: Distributed Search in Solr with different queries per shard

I suppose you could, but I _really_ question whether it's a wise investment in time. Personally I'd treat them as two different collections and have the app layer fire off two queries and do the aggregation (this is a variant of federated search I think). This removes your issue with having the cores do extra work.

Additionally, I'd really prove out that the extra work is actually a measurable performance issue before worrying about this, it smells like premature optimization.

FWIW,
Erick

On Wed, May 21, 2014 at 6:56 AM, Avner Levy av...@checkpoint.com wrote:

I have 2 cores. One with active data and one with historical data (for documents which were removed from the active one). I want to run Distributed Search on both and get the unified result (as supported by Solr Distributed Search, I'm not using Solr Cloud). My problem is that the query for each core is different. Is there a way to specify different query per core and still let Solr to unify the query results?

For example:
Active data core query: select all green docs
History core query: select all green docs with year=2012

Is there a way to extend the distributed search handler to support such a scenario?

Thanks in advance,
Avner

One option is to send a unified query to both but then each core will work harder for no reason.
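The application-layer aggregation Erick describes can be sketched for the simplest case, paging over score-sorted results. The `Hit` type and method names are illustrative assumptions, and the sketch takes for granted that scores from different cores are comparable, which is not guaranteed when each core computes its own term statistics. Facets and other features would need their own merging logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ResultMerger {

    // Minimal stand-in for one search hit returned by a core (hypothetical type).
    static final class Hit {
        final String id;
        final double score;

        Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    // Merges per-core result lists and returns the page [offset, offset + rows).
    // For a correct page, each core must have been asked for its first offset + rows hits.
    static List<Hit> mergePage(List<List<Hit>> perCore, int offset, int rows) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perCore) {
            all.addAll(hits);
        }
        // Highest score first; ties broken by id so the order is stable across requests.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed()
                .thenComparing(h -> h.id));
        int from = Math.min(offset, all.size());
        int to = Math.min(from + rows, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<Hit> active = Arrays.asList(new Hit("a1", 3.0), new Hit("a2", 1.0));
        List<Hit> history = Arrays.asList(new Hit("h1", 2.0));
        for (Hit h : mergePage(Arrays.asList(active, history), 0, 2)) {
            System.out.println(h.id + " " + h.score);
        }
    }
}
```

The cost of this approach grows with paging depth, because every core has to return `offset + rows` hits for each page requested.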
Storing ranges on documents and searching all document with specific value included
I have millions of documents with the following fields: name (string), start version (int), end version (int). I need to efficiently query all records which match the following condition:

Select all documents where version >= start version and version <= end version

Running the above query took 50-100 ms, while a similar query that tags each document with every version it covers took only 15 ms. My question is how efficiently Solr can handle such queries (since this isn't a classic FTS query). Do I need to define something special in order to optimize performance? Any alternative solutions are welcome. The field values / types can be changed if needed.
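The condition described above, that the queried version falls between a document's start and end version, can be written as two open-ended range clauses in standard Solr query syntax. In the sketch below the field names `start_version` and `end_version` are assumptions, since the post does not give the schema names.

```java
class VersionRangeQuery {

    // Matches documents whose [start, end] interval contains `version`:
    // start <= version (upper-bounded range) and end >= version (lower-bounded range).
    static String containing(String startField, String endField, int version) {
        return startField + ":[* TO " + version + "] AND "
                + endField + ":[" + version + " TO *]";
    }

    public static void main(String[] args) {
        // Field names are hypothetical placeholders for the real schema fields.
        System.out.println(containing("start_version", "end_version", 7));
        // prints: start_version:[* TO 7] AND end_version:[7 TO *]
    }
}
```

Square brackets make both bounds inclusive. How fast such clauses run depends on how the numeric fields are indexed, so the field type definitions are worth checking before comparing against the per-version tagging approach.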
Re: Adding documents in Solr plugin
I've tried to write the plugin code. Currently I do:

    AddUpdateCommand addUpdateCommand = new AddUpdateCommand(solrQueryRequest);
    DocIterator iterator = docList.iterator();
    SolrIndexSearcher indexReader = solrQueryRequest.getSearcher();
    while (iterator.hasNext()) {
        Document document = indexReader.doc(iterator.nextDoc());
        SolrInputDocument solrInputDocument = new SolrInputDocument();
        addUpdateCommand.clear();
        addUpdateCommand.solrDoc = solrInputDocument;
        addUpdateCommand.solrDoc.setField("id", document.get("id"));
        addUpdateCommand.solrDoc.setField("my_updated_field", new_value);
        updateRequestProcessor.processAdd(addUpdateCommand);
    }

But this is very expensive, since the update handler will fetch again the document which I already have at hand. Is there a safe way to update the Lucene document and write it back while taking into account all the Solr-related code such as caches, extra Solr logic, etc.? I was thinking of converting it to a SolrInputDocument and then just adding the document through Solr, but I first need to convert all the fields.

Thanks in advance,
Avner

--
View this message in context: http://lucene.472066.n3.nabble.com/Adding-documents-in-Solr-plugin-tp4071574p4097168.html
Sent from the Solr - User mailing list archive at Nabble.com.
Adding documents in Solr plugin
I have a core with millions of records. I want to add a custom handler which scans the existing documents and updates one of the fields (delete and add the document) based on a condition (age12 for example). All fields are stored, so there is no problem recreating the document from the search result. I prefer doing it on the Solr server side to avoid sending millions of documents to the client and back. I'm thinking of writing a Solr plugin which will receive a query and update some fields on the matching documents (like the delete by query handler). Are there existing solutions or better alternatives? I couldn't find any examples of Solr plugins which update / add / delete documents (I don't need to extend the update handler). If someone has an example, it would be a great help.

Thanks in advance
Enabling realtime search in Solr 4.0
Hi,
I'm trying to enable realtime search in Solr 4.0 (so I can see new documents without committing). I've added:

    <realtime visible="0" facet="true">true</realtime>
    <updateLog class="solr.FSUpdateLog">
      <str name="dir">${solr.data.dir:}</str>
    </updateLog>

But documents aren't seen before commit (or softCommit). Any help will be appreciated.

Thanks,
Avner
RE: Enabling realtime search in Solr 4.0
Thanks Mark, I appreciate your help. I need the Solr index to be in sync with my database. This means that even if only one record was added, I need it to appear in the next search (including faceting). I've read in the Solr-RA documentation that if you add <realtime>true</realtime> you can add documents and search for them without any commit at all (and I assumed it was functionality of Solr). So I guess there isn't a way to get such functionality in Solr 4.0, right? I think this relates to the ability to open readers from the writer, if I understood it correctly? Does anyone know how different Solr-RA is from regular Solr?

Thanks in advance,
Avner

-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Thursday, December 29, 2011 5:16 PM
To: solr-user@lucene.apache.org
Subject: Re: Enabling realtime search in Solr 4.0

On Dec 29, 2011, at 3:39 AM, Avner Levy wrote:

Hi, I'm trying to enable realtime search in Solr 4.0 (So I can see new documents without committing). I've added:

    <realtime visible="0" facet="true">true</realtime>
    <updateLog class="solr.FSUpdateLog">
      <str name="dir">${solr.data.dir:}</str>
    </updateLog>

But documents aren't seen before commit (or softCommit). Any help will be appreciated.

Thanks,
Avner

This is how you enable soft auto commit in trunk: http://wiki.apache.org/solr/SolrConfigXml?#Update_Handler_Section

You do not need the update log for it - that is for realtime GET (where you would also need to set that up in a Request Handler). Sounds like you are conflating the two.

- Mark Miller
lucidimagination.com
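Besides the soft auto commit configured in solrconfig.xml that Mark points to, a client can ask for a soft commit on an individual update request through the `softCommit=true` request parameter. The helper below only builds the request URL. The base URL and core name are placeholders, and actually sending the documents is left out.

```java
class SoftCommitUrl {

    // Returns the update endpoint for a core, optionally asking Solr to soft-commit
    // after the request so the new documents become visible to searches.
    static String updateUrl(String coreBaseUrl, boolean softCommit) {
        String base = coreBaseUrl.endsWith("/")
                ? coreBaseUrl.substring(0, coreBaseUrl.length() - 1)
                : coreBaseUrl;
        return base + "/update" + (softCommit ? "?softCommit=true" : "");
    }

    public static void main(String[] args) {
        // Placeholder host and core name, not taken from the thread.
        System.out.println(updateUrl("http://localhost:8983/solr/collection1", true));
        // prints: http://localhost:8983/solr/collection1/update?softCommit=true
    }
}
```

Soft-committing on every single-document update opens a new searcher each time, which can be costly at high update rates. A short soft auto commit interval is the usual compromise when near-real-time visibility is enough.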