Re: How to identify documents failed in a batch request?

2016-12-17 Thread David Smiley
If you enable the "TolerantUpdateProcessor" Solr-side, you can add documents in bulk, allowing some to fail while learning which ones did: http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/TolerantUpdateProcessorFactory.html

On Sat, Dec 17, 2016 at 5:05 PM S G
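A rough SolrJ sketch of what that buys you. The chain name "tolerant-chain", the collection URL, and the exact shape of the "errors" entry in the response header are my assumptions from the factory's javadoc, not verified against a live install:

    import java.util.Arrays;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.client.solrj.response.UpdateResponse;
    import org.apache.solr.common.SolrInputDocument;

    public class TolerantAddExample {
      public static void main(String[] args) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
            "http://localhost:8983/solr/mycollection").build()) {
          SolrInputDocument ok = new SolrInputDocument();
          ok.addField("id", "1");
          SolrInputDocument bad = new SolrInputDocument();
          bad.addField("id", "2");
          bad.addField("price_i", "not-a-number"); // fails if price_i is an int field

          UpdateRequest request = new UpdateRequest();
          // Route through a chain containing solr.TolerantUpdateProcessorFactory
          // (the chain name is hypothetical -- match your solrconfig.xml).
          request.setParam("update.chain", "tolerant-chain");
          request.add(Arrays.asList(ok, bad));
          UpdateResponse response = request.process(client);

          // With the tolerant processor, individual failures are reported in
          // the response header rather than failing the entire batch.
          Object errors = response.getResponseHeader().get("errors");
          System.out.println("failed documents: " + errors);
        }
      }
    }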

Re: Separating Search and Indexing in SolrCloud

2016-12-17 Thread Erick Erickson
Yes, indexing adds stress. No, you can't separate the two in SolrCloud. End of story; why beat it to death? You'll have to figure out the sharding strategy that meets your indexing and querying needs and live within that framework. I'd advise setting up a small cluster and driving it to its

How to identify documents failed in a batch request?

2016-12-17 Thread S G
Hi, I am using the following code to send documents to Solr:

    final UpdateRequest request = new UpdateRequest();
    request.setAction(UpdateRequest.ACTION.COMMIT, false, false);
    request.add(docsList);
    UpdateResponse response = request.process(solrClient);

The
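Absent the tolerant processor mentioned in the reply above, a common workaround is to retry a failed batch one document at a time to isolate the rejects. A minimal sketch under that assumption (solrClient and docsList as in the code above; the class and method names are made up):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchRetry {
      // Try the whole batch first; on failure fall back to per-document
      // adds so each rejected document can be identified individually.
      public static List<SolrInputDocument> addIdentifyingFailures(
          SolrClient solrClient, List<SolrInputDocument> docsList) {
        List<SolrInputDocument> failed = new ArrayList<>();
        try {
          solrClient.add(docsList);
          return failed; // whole batch succeeded
        } catch (Exception batchError) {
          for (SolrInputDocument doc : docsList) {
            try {
              solrClient.add(doc);
            } catch (Exception docError) {
              failed.add(doc); // this specific document was rejected
            }
          }
        }
        return failed;
      }
    }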

Re: Separating Search and Indexing in SolrCloud

2016-12-17 Thread Jaroslaw Rozanski
Hi Erick, So what does this buffer represent? What does it actually store? Raw update requests or analyzed documents? The documentation suggests that it stores the actual update requests. Obviously an analyzed document can and will occupy much more space than a raw one. Also, analysis will create a lot of

Re: Separating Search and Indexing in SolrCloud

2016-12-17 Thread Erick Erickson
bq: I am more concerned with indexing memory requirements at volume

By and large this isn't much of a problem. RAMBufferSizeMB in solrconfig.xml governs how much memory is consumed in Solr for indexing. When that limit is exceeded, the buffer is flushed to disk. I've rarely heard of indexing
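For illustration, Solr's ramBufferSizeMB is a thin wrapper over the equivalent Lucene IndexWriterConfig knob; a minimal Lucene sketch of the same flush-on-threshold behavior (the 100 MB value and the index path are arbitrary):

    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;

    public class RamBufferDemo {
      public static void main(String[] args) throws Exception {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        // Same knob Solr exposes as ramBufferSizeMB in solrconfig.xml:
        // buffered documents are flushed to a new segment once the
        // in-memory buffer exceeds this size.
        cfg.setRAMBufferSizeMB(100.0);
        try (IndexWriter writer = new IndexWriter(
            FSDirectory.open(Paths.get("/tmp/demo-index")), cfg)) {
          Document doc = new Document();
          doc.add(new StringField("id", "1", Field.Store.YES));
          writer.addDocument(doc);
          writer.commit();
        }
      }
    }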

Confusing debug=timing parameter

2016-12-17 Thread S G
Hi, I am using Solr 4.10 and its response time for the clients is not very good. Even though Solr's plugin/stats page shows less than 200 milliseconds, clients report several seconds of response time. So I tried the debug=timing parameter from the Solr UI and this is what I got. Note how the
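To take the UI out of the picture, the same parameter can be sent from SolrJ and the reported QTime compared against wall-clock time around the call. A sketch assuming a recent SolrJ client (on 4.10 the client class was HttpSolrServer) and a made-up collection URL:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class DebugTimingDemo {
      public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
            "http://localhost:8983/solr/mycollection").build()) {
          SolrQuery q = new SolrQuery("*:*");
          q.set("debug", "timing"); // same parameter the admin UI sends
          long start = System.nanoTime();
          QueryResponse rsp = client.query(q);
          long wallMs = (System.nanoTime() - start) / 1_000_000;
          // QTime is measured inside Solr; a large gap between it and the
          // wall-clock time points at transport, serialization or the client.
          System.out.println("QTime=" + rsp.getQTime() + "ms, wall=" + wallMs + "ms");
          System.out.println(rsp.getDebugMap().get("timing"));
        }
      }
    }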

Re: Solr on HDFS: Streaming API performance tuning

2016-12-17 Thread Chetas Joshi
Here is the stack trace.

    java.lang.NullPointerException
        at org.apache.solr.client.solrj.io.comp.FieldComparator$2.compare(FieldComparator.java:85)
        at org.apache.solr.client.solrj.io.comp.FieldComparator.compare(FieldComparator.java:92)
        at
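That compare() frame typically blows up when a tuple is missing the sort field, so one thing worth ruling out is documents with no value there. A rough sketch of that guard with the 6.x streaming client (zkHost, collection, and the ts field are invented for the example):

    import org.apache.solr.client.solrj.io.Tuple;
    import org.apache.solr.client.solrj.io.stream.CloudSolrStream;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class StreamDemo {
      public static void main(String[] args) throws Exception {
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", "*:*");
        params.set("fl", "id,ts");
        params.set("sort", "ts asc");
        // Guard against documents with no value for the sort field, one
        // common way the comparator ends up comparing nulls.
        params.set("fq", "ts:[* TO *]");
        CloudSolrStream stream =
            new CloudSolrStream("zkhost:2181", "mycollection", params);
        try {
          stream.open();
          Tuple tuple = stream.read();
          while (!tuple.EOF) {
            System.out.println(tuple.getString("id"));
            tuple = stream.read();
          }
        } finally {
          stream.close();
        }
      }
    }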

Re: Caching multiple entities

2016-12-17 Thread William Bell
I am not sure, but it looks like your XML is invalid: last_modified > XYZ. You need to switch to escaped entities (&gt;) or use something like a database view so that the > and other < characters will not cause problems.

On Sat, Dec 17, 2016 at 7:01 AM, Per Newgro wrote:
> Hello,
>
> we are implementing a

Caching multiple entities

2016-12-17 Thread Per Newgro
Hello, we are implementing a questionnaire tool for companies. I would like to import the data using a DIH. To increase performance I would like to use some caching, but my solution is not working: the score of my questionnaire is empty, even though there is a value in the database. I've checked

Re: ttl on merge-time possible somehow ?

2016-12-17 Thread Dorian Hoxha
On Sat, Dec 17, 2016 at 12:04 AM, Chris Hostetter wrote:
>
> : > lucene, something has to "mark" the segments as deleted in order for
> them
> ...
> : Note, it doesn't mark the "segment", it marks the "document".
>
> correct, typo on my part -- sorry.
>
> : >