Re: Index optimization takes too long

2018-11-04 Thread Toke Eskildsen
On Sat, 2018-11-03 at 21:41 -0700, Wei wrote:
> Thanks everyone! I checked the system metrics during the optimization
> process. CPU usage is quite low, there is no I/O wait, and memory
> usage is not much different from before the docValues change. So I
> wonder what could be the bottleneck.

Re: Questions about stored fields and updates.

2018-11-04 Thread Erick Erickson
Ash: Atomic updates are really a reindex of all the original fields. What happens is:
1> Solr gets all the stored fields from the disk
2> Solr overlays the new data
3> Solr re-indexes the entire document just as though it came from outside.
For step <3>, there's no difference at all between an
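The three steps above can be sketched as a small in-memory model. This is only an illustration of the read/overlay/reindex idea, not Solr's implementation: the function name `apply_atomic_update` and the sample field names are hypothetical, and only the `set`, `add`, and `inc` modifiers are modeled.

```python
from copy import deepcopy


def apply_atomic_update(stored_doc, update):
    """Toy model of an atomic update (hypothetical helper, not a Solr API).

    Step 1: start from the stored fields of the existing document.
    Step 2: overlay the new data described by the update.
    Step 3: return the complete document, which is what gets re-indexed.
    """
    doc = deepcopy(stored_doc)  # step 1: copy of the stored fields
    for field, value in update.items():
        if not isinstance(value, dict):
            doc[field] = value  # plain value such as the id
            continue
        for op, arg in value.items():  # step 2: overlay each modifier
            if op == "set":
                doc[field] = arg
            elif op == "add":
                existing = doc.get(field, [])
                if not isinstance(existing, list):
                    existing = [existing]
                extra = arg if isinstance(arg, list) else [arg]
                doc[field] = existing + extra
            elif op == "inc":
                doc[field] = doc.get(field, 0) + arg
            else:
                raise ValueError(f"modifier not modeled in this sketch: {op}")
    return doc  # step 3: the whole document is indexed again


stored = {"id": "1", "title": "Old title", "tags": ["a"], "views": 3}
update = {"id": "1", "title": {"set": "New title"},
          "tags": {"add": "b"}, "views": {"inc": 2}}
reindexed = apply_atomic_update(stored, update)
```

In this model the result is always a full document, which matches the point that step 3 treats it like any document sent from outside.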

Re: SolrCloud performance

2018-11-04 Thread Chuming Chen
Hi Shawn, Thank you very much for your analysis. I currently don’t have multiple machines to play with. I will try the “one Solr instance and one ZK instance would be more efficient on a single server” setup you suggested. Thanks again, Chuming
On Nov 4, 2018, at 7:56 PM, Shawn Heisey wrote:
> On

Re: Questions about stored fields and updates.

2018-11-04 Thread Ash Ramesh
Also thanks for the information Shawn! :)
On Mon, Nov 5, 2018 at 12:09 PM Ash Ramesh wrote:
> Sorry Shawn,
>
> I seem to have gotten my wording wrong. I meant that we wanted to move
> away from atomic-updates to replacing/reindexing the document entirely
> again when changes are made.
>

Re: Questions about stored fields and updates.

2018-11-04 Thread Ash Ramesh
Sorry Shawn, I seem to have gotten my wording wrong. I meant that we wanted to move away from atomic-updates to replacing/reindexing the document entirely again when changes are made. https://lucene.apache.org/solr/guide/7_5/uploading-data-with-index-handlers.html#adding-documents Regards, Ash
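As a rough illustration of the two approaches being compared, the sketch below builds the JSON bodies a client might send to Solr's update handler: a full document for replacement versus an atomic update that uses the `set` modifier. The helper names and the sample fields (`title`, `price`) are hypothetical, and nothing is actually sent to a server.

```python
import json


def full_replace_payload(doc):
    """Body for replacing a document: every field is sent again."""
    return json.dumps([doc])


def atomic_update_payload(doc_id, changes):
    """Body for an atomic update: only the id plus 'set' modifiers."""
    body = {"id": doc_id}
    for field, value in changes.items():
        body[field] = {"set": value}
    return json.dumps([body])


# Hypothetical document: changing only the price.
current = {"id": "1", "title": "Widget", "price": 10}
replaced = full_replace_payload({**current, "price": 12})
atomic = atomic_update_payload("1", {"price": 12})
```

With full replacement the client has to hold the complete document itself, whereas the atomic form relies on Solr reading the other fields back from what it has stored.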

Re: SolrCloud performance

2018-11-04 Thread Shawn Heisey
On 11/4/2018 8:38 AM, Chuming Chen wrote:
I have shared a tarball with you (apa...@elyograg.org) from Google Drive. The tarball includes the log directories of 4 nodes, solrconfig.xml, solr.in.sh, and a screenshot of the top command. The log files cover about one day. However, I restarted the solr

Re: Questions about stored fields and updates.

2018-11-04 Thread Shawn Heisey
On 11/3/2018 9:45 PM, Ash Ramesh wrote:
My company currently uses Solr to completely hydrate client objects by storing all fields (stored=true). Therefore we have 2 types of fields:
1. indexed=true | stored=true : For fields that will be used for searching, sorting, etc.
2.

RE: Solr OCR Support

2018-11-04 Thread Terry Steichen
+1 My experience is that you can't easily tell ahead of time whether your PDF is searchable or not. If it isn't, you may not even retrieve it because there's no text to index. Also, if you blindly OCR a file that has already been OCR'd, it can create a mess. Most higher-end PDF editors have a

RE: Solr OCR Support

2018-11-04 Thread Phil Scadden
I would strongly consider doing OCR offline, BEFORE loading the documents into Solr. The advantage of this is that you convert your OCRed PDF into a searchable PDF. Consider someone using Solr who has found a document that matches their search criteria. Once they retrieve the document, they will

Re: migrating cores with Solr upgrade

2018-11-04 Thread Erick Erickson
Oops, fumble fingers. Anyway, I'd recommend completely reindexing into a new collection.
On Sun, Nov 4, 2018, 12:53 Erick Erickson wrote:
> Lucene does not guarantee back compatibility over two major versions, so
> I'd recommend completely reinde
>
> On Sun, Nov 4, 2018, 02:02 Piyush Kumar Nayak wrote:
>

Re: migrating cores with Solr upgrade

2018-11-04 Thread Erick Erickson
Lucene does not guarantee back compatibility over two major versions, so I'd recommend completely reinde
On Sun, Nov 4, 2018, 02:02 Piyush Kumar Nayak wrote:
> Hi,
>
> What is the best way to migrate cores from an old version of Solr (say
> 5.x) to a newer version (say 7.x)? I did not find anything

Re: SolrCloud performance

2018-11-04 Thread Chuming Chen
Hi Shawn, I have shared a tarball with you (apa...@elyograg.org) from Google Drive. The tarball includes the log directories of 4 nodes, solrconfig.xml, solr.in.sh, and a screenshot of the top command. The log files cover about one day. However, I restarted the solr cloud several times during that

Phrase query as feature in LTR not working

2018-11-04 Thread AshB
Phrase query is not working when applied in LTR. Feature supplied is
{
  "name" : "isPook",
  "class" : "org.apache.solr.ltr.feature.SolrFeature",
  "params" : {
    "fq": ["{!type=edismax qf=text v=$qq}=\"${query}\""]
  }
}
Tested this feature outside and it returns only one result

migrating cores with Solr upgrade

2018-11-04 Thread Piyush Kumar Nayak
Hi, What is the best way to migrate cores from an old version of Solr (say 5.x) to a newer version (say 7.x)? I did not find anything pertinent to the matter in the Solr reference guide. Is there a tool that can do that seamlessly? Regards, Piyush.