Re: Getting IO Exception while Indexing

2017-07-31 Thread mesenthil1
We added print statements in most places but could not find any significant difference between the successful and the failing documents. We modified our logic to use a direct HTTP client and post the JSON messages directly to SolrCloud. Most of the ids are fine now, but we still see the same issue with minimal doc

Re: Getting IO Exception while Indexing

2017-07-20 Thread mesenthil1
While debugging, we found the following. When we send the document as JSON, it is indexed without an issue. When the same document is converted to a SolrInputDocument and sent to Solr using SolrServer, it fails. -- View this message in context: http://lucene.472066.n3.nabble.com
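The JSON path described in this message can be sketched roughly as follows. This is a minimal Python sketch, not the poster's actual client: the host, port, collection name `collection1`, the `/update?commit=true` path, and the sample document are all assumptions for illustration (older Solr versions may expect `/update/json` instead). The function only builds the request; sending it is left commented out.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs):
    """Build (but do not send) a POST that submits docs to Solr as a JSON array."""
    # Assumed endpoint: <base_url>/<collection>/update, committing right away.
    url = f"{base_url}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Hypothetical host and document, for illustration only.
req = build_update_request(
    "http://localhost:8983/solr", "collection1", [{"id": "doc-1", "text": "hello"}]
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req).read()
```

Posting one of the failing documents this way, and comparing the raw request with what the SolrServer client sends, may help narrow down where the 400 originates.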

Re: Getting IO Exception while Indexing

2017-07-20 Thread mesenthil1
Hi, This is happening repeatedly for a few documents. When we compared them with other similar documents, we could not find any difference. As we are seeing a 400 on Apache, the request is not submitted to Solr, so we are unable to find the cause. Senthil

Re: Suggestions from different dictionaries dynamically

2017-03-15 Thread mesenthil1
Yes, we are using a spellcheck dictionary. Our default search field is "text". The following is the solrconfig snippet. Please let us know if more information is required. edismax true typeaheadspellcheck spellcheckresearcher type

Re: Solr Cloud: Duplicate documents in multiple shards

2015-07-28 Thread mesenthil1
Thanks Erick. We could not recollect what could have happened in between. Yes, we are seeing the same document in 2 shards. The unique field is set to uuid in the schema and declared as String. We will go with reindexing. schema.xml : Query: http://localhost:1004/solr/collection1/select?q=id:%22

Re: Solr Cloud: Duplicate documents in multiple shards

2015-07-27 Thread mesenthil1
Thanks Erick. Now that I understand that the entire cluster goes down if any one shard is down, my first confusion is cleared up. Following are the other details. We really need to see details since I'm guessing we're talking past each other. So: *1> exactly how are you indexing documents?* /u

Re: Solr Cloud: Duplicate documents in multiple shards

2015-07-22 Thread mesenthil1
Alessandro, Thanks. I see some confusion here. *First of all you need a smart client that will load balance the docs to index. Let's say the CloudSolrClient.* All 5 shards are configured behind a load balancer; requests are sent to the load balancer, and whichever server is up will accept t

Re: Solr Cloud: Duplicate documents in multiple shards

2015-07-21 Thread mesenthil1
We are unable to delete even when passing distrib=false. Also, it is difficult to identify the duplicate documents among the 130 million. Is there a way to see the generated hash key and map it to the specific shard?
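One way to approach the hash-to-shard question is sketched below in Python. It assumes the collection uses the compositeId router, that ids contain no "!" separator, that the router hashes the UTF-8 id with MurmurHash3 (x86, 32-bit, seed 0), and that each shard's hash range is available as a hex string such as "80000000-ffffffff" in the cluster state. The two ranges and shard names in the example are hypothetical; the real ones should be read from the cluster state of the affected collection.

```python
def murmurhash3_x86_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit, returned as an unsigned 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF

    def mix(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        return (k * c2) & mask

    h = seed & mask
    n = len(data)
    full = n & ~0x3  # length rounded down to a multiple of 4
    for i in range(0, full, 4):
        h ^= mix(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask
    if full < n:  # 1 to 3 trailing bytes
        h ^= mix(int.from_bytes(data[full:], "little"))
    # Finalization mix
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def to_signed(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def parse_range(text: str) -> tuple:
    """Parse a hex range like '80000000-ffffffff' into signed (min, max)."""
    low, high = text.split("-")
    return to_signed(int(low, 16)), to_signed(int(high, 16))


def shard_for_id(doc_id: str, shard_ranges: dict):
    """Return the shard whose range contains the id's hash, or None."""
    h = to_signed(murmurhash3_x86_32(doc_id.encode("utf-8")))
    for shard, text in shard_ranges.items():
        low, high = parse_range(text)
        if low <= h <= high:
            return shard
    return None


# Hypothetical two-shard layout, for illustration only.
ranges = {"shard1": "80000000-ffffffff", "shard2": "0-7fffffff"}
print(shard_for_id("some-document-id", ranges))
```

A copy of a document found on any shard other than the one this computes would be the misplaced copy, under the assumptions above.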

Re: Solr Cloud: Duplicate documents in multiple shards

2015-07-20 Thread mesenthil1
Thanks Erick for clarifying. We are not explicitly setting the compositeId. We are using numShards=5 alone as part of the server start-up. We are using uuid as the unique field. One sample id is: possting.mongo-v2.services.com-intl-staging-c2d2a376-5e4a-11e2-8963-0026b9414f30 Not sure how it wou

Solr Cloud: Duplicate documents in multiple shards

2015-07-20 Thread mesenthil1
Hi All, We are using Solr 4.2.1 cloud with a 5-shard setup (1 leader and 1 replica for each shard). We are seeing the following issue in our setup. A few of the documents are returned from more than one shard for queries. When we try to update the document, it is not updating the document

Re: CDATA response is coming with "<:" instead of "<"

2015-04-21 Thread mesenthil1
Thanks. With wt=json, the results come back properly. I understand the reason for getting this as &lt;. As our Solr client expects this content within CDATA, I am looking for a way to achieve this.
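For context, an escaped value and a CDATA-wrapped value are equivalent to any conforming XML parser, so one option is to parse the response and re-wrap the text on the client side. The Python sketch below shows both points; the element name, the field name `content`, the sample markup, and the `wrap_cdata` helper are hypothetical, not taken from the original feed or from Solr.

```python
import xml.etree.ElementTree as ET

# The same value, once with escaped entities and once inside CDATA.
escaped = '<str name="content">&lt;b&gt;abc&lt;/b&gt;</str>'
cdata = '<str name="content"><![CDATA[<b>abc</b>]]></str>'


def element_text(xml_fragment: str) -> str:
    """Parse a small XML fragment and return the root element's text."""
    return ET.fromstring(xml_fragment).text


def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' so the section stays valid."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Both forms decode to the same string once parsed.
print(element_text(escaped) == element_text(cdata))

# Re-wrap the parsed value for a consumer that insists on CDATA.
print('<str name="content">' + wrap_cdata(element_text(escaped)) + "</str>")
```

If the client reads the response with a real XML parser rather than string matching, no re-wrapping should be needed at all.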

CDATA response is coming with "<:" instead of "<"

2015-04-21 Thread mesenthil1
We are using DIH for indexing XML files. As part of the XML, we have XML enclosed in CDATA. It is getting indexed, but in the response the CDATA content comes back as encoded entities instead of symbols. Example: /Feed file: / 123 abc pqr xyz *