Re: Solr Pagination

2017-08-03 Thread Vincenzo D'Amore
tion, you will be guaranteed > that each document is only returned once, no matter how it may be > modified during the use of the cursor.* https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results On Thu, Aug 3, 2017 at 12:47 PM, Vincenzo D'Amore <v.dam...@gmail.com> wrote:

Solr Pagination

2017-08-03 Thread Vincenzo D'Amore
Hi all, I have a collection that is frequently updated. Is it possible that a Solr Cloud query returns duplicate documents while paginating? Just to be clear, there is a collection with about 3M documents and a Solr query selects just 500K documents sorted by Id, which are returned simply
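
To make the question above concrete, here is a minimal pure-Python simulation of why offset-style paging over a changing, id-sorted result set can repeat a document, while resuming strictly after the last id seen does not. It does not talk to Solr at all; the ids, the page size, and the helper names `page_by_offset` and `page_by_cursor` are made up for illustration.

```python
from bisect import insort

def page_by_offset(sorted_ids, start, rows):
    """Emulate start/rows paging over the current state of the index."""
    return sorted_ids[start:start + rows]

def page_by_cursor(sorted_ids, last_seen, rows):
    """Emulate cursor-style paging: resume strictly after the last id seen."""
    remaining = [doc_id for doc_id in sorted_ids
                 if last_seen is None or doc_id > last_seen]
    return remaining[:rows]

# Hypothetical index contents, sorted by id.
index = [10, 20, 30, 40, 50, 60]

first_offset = page_by_offset(index, start=0, rows=3)
first_cursor = page_by_cursor(index, last_seen=None, rows=3)

# An update lands between the first and second page request.
insort(index, 5)

second_offset = page_by_offset(index, start=3, rows=3)
second_cursor = page_by_cursor(index, last_seen=first_cursor[-1], rows=3)

offset_duplicates = set(first_offset) & set(second_offset)
cursor_duplicates = set(first_cursor) & set(second_cursor)

print("offset paging duplicates:", offset_duplicates)
print("cursor paging duplicates:", cursor_duplicates)
```

In this toy run the new document shifts every later document one position to the right, so the offset-based second page starts with a document the first page already returned.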

Re: Duplicate docs in Solr pagination

2016-12-11 Thread Alexandre Rafalovitch
> Hi all, > > I am experiencing a weird behavior with Solr. Pagination gives duplicate > results. > > Requesting > *http://localhost:8983/solr/tweets/select?q=text:test=0=csv=id,timestamp=doc_type:tweet* > gives me: > > id,timestamp > 801943081268428800,2016-11-25T00:

Duplicate docs in Solr pagination

2016-12-11 Thread atawfik
Hi all, I am experiencing a weird behavior with Solr. Pagination gives duplicate results. Requesting *http://localhost:8983/solr/tweets/select?q=text:test=0=csv=id,timestamp=doc_type:tweet* gives me: id,timestamp 801943081268428800,2016-11-25T00:18:24.613Z 802159834942541824,2016-11-25T14:39

Re: Solr Pagination

2015-10-28 Thread Salman Ansari
I have already indexed all the documents in Solr and am not indexing anymore. So the problem I am running into occurs after all the documents are indexed. I am using Solr cloud with two shards and two replicas for each shard but on the same machine. Is there anywhere I can look at the relation between

Re: Solr Pagination

2015-10-28 Thread Erick Erickson
In a word, "no". I once doubled the JVM requirements by changing just the query. You have to prototype. Here's a blog on the subject: https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ On Wed, Oct 28, 2015 at 11:06 AM, Salman Ansari

Re: Solr Pagination

2015-10-26 Thread Upayavira
On Sun, Oct 25, 2015, at 05:43 PM, Salman Ansari wrote: > Thanks guys for your responses. > > That's a very very large cache size. It is likely to use a VERY large > amount of heap, and autowarming up to 4096 entries at commit time might > take many *minutes*. Each filterCache entry is

Re: Solr Pagination

2015-10-25 Thread Salman Ansari
Thanks guys for your responses. That's a very very large cache size. It is likely to use a VERY large amount of heap, and autowarming up to 4096 entries at commit time might take many *minutes*. Each filterCache entry is maxDoc/8 bytes. On an index core with 70 million documents, each
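
As a rough worked example of the figure quoted above (maxDoc/8 bytes per filterCache entry), the arithmetic can be sketched as follows. The 70 million documents and the 4096 entries come from the message; treating every one of the 4096 entries as a full bitset is an assumption that gives an upper bound rather than a measured heap size.

```python
def filter_cache_entry_bytes(max_doc):
    """One bit per document in the core, i.e. maxDoc / 8 bytes per entry."""
    return max_doc // 8

max_doc = 70_000_000   # documents in the index core, as quoted in the message
entries = 4096         # entry count quoted in the message (assumed all full-size)

per_entry = filter_cache_entry_bytes(max_doc)
total = per_entry * entries

print(f"{per_entry / 1e6:.2f} MB per entry")
print(f"{total / 1e9:.2f} GB for {entries} entries")
```

Under these assumptions a single entry is 8.75 MB and 4096 of them come to roughly 35.84 GB, which is why the quoted reply calls the cache size very large.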

Re: Solr Pagination

2015-10-22 Thread Toke Eskildsen
On Wed, 2015-10-14 at 10:17 +0200, Jan Høydahl wrote: > I have not benchmarked various numbers of segments at different sizes > on different HW etc, so my hunch could very well be wrong for Salman’s case. > I don’t know how frequent the updates to his data are either. > > Have you done #segments

Re: Solr Pagination

2015-10-14 Thread Jan Høydahl
I have not benchmarked various numbers of segments at different sizes on different HW etc, so my hunch could very well be wrong for Salman’s case. I don’t know how frequent the updates to his data are either. Have you done #segments benchmarking for your huge datasets? -- Jan Høydahl, search

Re: Solr Pagination

2015-10-12 Thread Jan Høydahl
Salman, You say that you optimized your index from Admin. You should not do that, however strange it sounds. 70M docs on 2 shards means 35M docs per shard. What you do when you call optimize is to force Lucene to merge all those 35M docs into ONE SINGLE index segment. You get better HW

Re: Solr Pagination

2015-10-12 Thread Toke Eskildsen
On Mon, 2015-10-12 at 10:05 +0200, Jan Høydahl wrote: > What you do when you call optimize is to force Lucene to merge all > those 35M docs into ONE SINGLE index segment. You get better HW > utilization if you let Lucene/Solr automatically handle merging, > meaning you’ll have around 10 smaller

Re: Solr Pagination

2015-10-10 Thread Salman Ansari
Regarding the Solr performance issue I was facing, I upgraded my Solr machine to have 8 cores, 56 GB RAM, and an 8 GB JVM. However, unfortunately, I am still getting delays. I have run * the query "Football" with start=0 and rows=10 and it took around 7.329 seconds * the query "Football" with start=1000 and

Re: Solr Pagination

2015-10-10 Thread Shawn Heisey
On 10/10/2015 2:55 AM, Salman Ansari wrote: > Thanks Shawn for your response. Based on that > 1) Can you please direct me where I can get more information about cold > shard vs hot shard? I don't know of any information out there about hot/cold shards. I can describe it, though: A split point

Re: Solr Pagination

2015-10-10 Thread Salman Ansari
Thanks Shawn for your response. Based on that 1) Can you please direct me where I can get more information about cold shard vs hot shard? 2) That 10GB number assumes there's no other software on the machine, like a database server or a webserver. Yes the machine is dedicated for Solr 3) How

Re: Solr Pagination

2015-10-09 Thread Shawn Heisey
On 10/9/2015 1:39 PM, Salman Ansari wrote: > INFO - 2015-10-09 18:46:17.953; [c:sabr102 s:shard1 r:core_node2 > x:sabr102_shard1_replica1] org.apache.solr.core.SolrCore; > [sabr102_shard1_replica1] webapp=/solr path=/select > params={start=0=(content_text:Football)=10} hits=24408 status=0 >

Re: Solr Pagination

2015-10-09 Thread Erick Erickson
rChache >> > 5) Increased the docCache >> > 6) Run Optimize on the Solr Admin >> > >> > but still I get delays of around 16 seconds and sometimes even more. >> > What other mechanisms do you suggest I should use to handle this issue? >> >

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
chanisms do you suggest I should use to handle this issue? > > > > While pagination is faster than increasing the start parameter, the > > difference is small as long as you stay below a start of 1000. 10K might > > also work for you. Do your users page beyond that? > >

Solr Pagination

2015-10-09 Thread Salman Ansari
Hi guys, I have been working with Solr and Solr.NET for some time for a big project that requires around 300M documents. Consequently, I faced an issue and I am highlighting it here in case you have any comments: As mentioned here ( https://cwiki.apache.org/confluence/display/solr/Pagination

Re: Solr Pagination

2015-10-09 Thread Toke Eskildsen
Salman Ansari wrote: [Pagination with cursors] > For example, what happens if the user navigates from page 1 to page 2, > does the front end need to store the next cursor at each query? Yes. > What about going to a previous page, do we need to store all cursors >
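
The reply above confirms that the front end has to keep the next cursor from each query. The answer about going back to a previous page is cut off in this listing, so the following is only one possible approach, not necessarily what the thread recommended: remember the `cursorMark` that produced each page. `cursorMark`, `nextCursorMark`, and the `*` start value are Solr's documented names; the class `CursorHistory`, the `id asc` sort (which assumes the uniqueKey field is called `id`), and the mark strings in the usage lines are hypothetical.

```python
class CursorHistory:
    """Remembers which cursorMark produced each page, so any visited page can be re-requested."""

    def __init__(self):
        # Page 0 always starts from "*", the value that opens a new cursor.
        self.marks = ["*"]

    def params_for(self, page, query, rows=10):
        """Build request parameters for a page whose cursorMark is already known."""
        if page < 0 or page >= len(self.marks):
            raise IndexError("page has not been reached yet")
        return {
            "q": query,
            "rows": rows,
            # Cursor paging needs a sort that includes the uniqueKey field;
            # the field name "id" is an assumption here.
            "sort": "id asc",
            "cursorMark": self.marks[page],
        }

    def record(self, page, next_cursor_mark):
        """Store the nextCursorMark returned with `page` as the mark for page + 1."""
        if page < 0 or page >= len(self.marks):
            raise IndexError("cannot record a mark for an unvisited page")
        if page + 1 < len(self.marks):
            self.marks[page + 1] = next_cursor_mark
        else:
            self.marks.append(next_cursor_mark)


# Usage with made-up mark values; a real client would read nextCursorMark
# from each Solr response instead.
history = CursorHistory()
page0 = history.params_for(0, "content_text:Football")
history.record(0, "hypothetical-mark-1")
page1 = history.params_for(1, "content_text:Football")
history.record(1, "hypothetical-mark-2")
back_to_page0 = history.params_for(0, "content_text:Football")
```

Keeping the marks in a list means going back is a lookup rather than a new scan, at the cost of the client holding one short string per visited page. Pages fetched again after the index has changed may not show exactly the same documents as before.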

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
>> > tried the following to get better performance: > >> > > >> > 1) Used cursors instead of start and row > >> > 2) Increased the RAM on my Solr machine to 14GB > >> > 3) Increase the JVM on that machine to 4GB > >> > 4)

Re: Solr Pagination

2015-10-09 Thread Toke Eskildsen
Salman Ansari wrote: > As for the logs, I searched for "Salman" with rows=10 and start=1000 and it > took about 29 seconds to complete. However, it took less at each shard as > shown in the log file > [...] QTime=91 > [...] QTime=4 > the search in the second shard

Re: Solr Pagination

2015-10-09 Thread Erick Erickson
OK, this makes very little sense. The individual queries are taking < 100ms yet the total response is 29 seconds. I do note that one of your queries has rows=1010, a typo? Anyway, not at all sure what's going on here. If these are gigantic files you're returning, then it could be decompressing

Re: Solr Pagination

2015-10-09 Thread Toke Eskildsen
Salman Ansari wrote: > Thanks Erick for your response. If you find pagination is not the main > culprit, what other factors do you guys suggest I need to tweak to test > that? Well, is basic search slow? What are your response times for plain un-warmed top-20 searches?

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
> Thanks Erick for your response. If you find pagination is not the main > culprit, what other factors do you guys suggest I need to tweak to test > that? Well, is basic search slow? What are your response times for plain un-warmed top-20 searches? I have restarted Solr and I have tried running a

Re: Solr Pagination

2015-10-09 Thread Salman Ansari
that? I can limit users not to go beyond 10K but still think at that level cursors will be much faster than increasing the start variable as explained here (https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results ), have you tried both ways on your collection and it was giving you similar

Re: Solr Pagination

2015-10-09 Thread Erick Erickson
for you. Do your users page beyond that? > I can limit users not to go beyond 10K but still think at that level > cursors will be much faster than increasing the start variable as explained > here (https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results > ), have you tried both