> …you will be guaranteed that each document is only returned once, no
> matter how it may be modified during the use of the cursor.
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
On Thu, Aug 3, 2017 at 12:47 PM, Vincenzo D'Amore wrote:
> Hi all
Hi all,
I have a collection that is frequently updated. Is it possible that a Solr
Cloud query returns duplicate documents while paginating?
Just to be clear: the collection holds about 3M documents, and a Solr query
selects about 500K of them sorted by id, which are returned by simply
paginating
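Cursor-based paging is the usual answer to duplicates (or skips) under start/rows paging while the index changes. A minimal sketch, with the HTTP fetch factored out from the paging logic; the collection URL, query, and field names here are illustrative assumptions, not taken from the thread:

```python
import json
from urllib.request import urlopen
from urllib.parse import urlencode

# Hypothetical endpoint for illustration only.
SOLR = "http://localhost:8983/solr/mycollection/select"

def fetch_page(cursor, rows=500):
    """Fetch one page using cursorMark. Requires a sort that ends in a
    unique-field tiebreaker, e.g. 'id asc'."""
    params = urlencode({
        "q": "*:*", "sort": "id asc", "rows": rows,
        "cursorMark": cursor, "wt": "json",
    })
    with urlopen(f"{SOLR}?{params}") as resp:
        return json.load(resp)

def iterate_all(fetch):
    """Yield each page of docs; stop when Solr returns the same cursor back,
    which signals the end of the result set."""
    cursor = "*"                 # '*' is the initial cursorMark
    while True:
        body = fetch(cursor)
        yield body["response"]["docs"]
        nxt = body["nextCursorMark"]
        if nxt == cursor:        # unchanged cursor => no more results
            return
        cursor = nxt
```

Because cursorMark encodes a sort position rather than a numeric offset, a document is returned at most once even if the collection is modified mid-walk, which is exactly the guarantee quoted from the cwiki page above.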
Hi all,
I am experiencing a weird behavior with Solr. Pagination gives duplicate
results.
Requesting
*http://localhost:8983/solr/tweets/select?q=text:test&start=0&wt=csv&fl=id,timestamp&fq=doc_type:tweet*
gives me:
id,timestamp
801943081268428800,2016-11-25T00
In a word, "no". I once doubled the JVM requirements
by changing just the query. You have to prototype. Here's
a blog on the subject:
https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
On Wed, Oct 28, 2015 at 11:06 AM, Salman Ansari wrote:
I have already indexed all the documents in Solr and am not indexing anymore.
So the problem I am running into occurs after all the documents are indexed.
I am using SolrCloud with two shards and two replicas for each shard, but on
the same machine. Is there anywhere I can look at the relation between index
On Sun, Oct 25, 2015, at 05:43 PM, Salman Ansari wrote:
> Thanks guys for your responses.

That's a very, very large cache size. It is likely to use a VERY large
amount of heap, and autowarming up to 4096 entries at commit time might
take many *minutes*. Each filterCache entry is maxDoc/8 bytes. On an
index core with 70 million documents, each filterCache
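The maxDoc/8 rule of thumb above is easy to turn into back-of-the-envelope numbers; the figures below simply restate the thread's values (70M docs per core, autowarmCount of 4096), the arithmetic is mine:

```python
max_doc = 70_000_000        # documents in one index core (from the thread)
autowarm = 4096             # filterCache autowarm count (from the thread)

# A filterCache entry is a bitset: one bit per document => maxDoc/8 bytes.
entry_bytes = max_doc // 8
total_bytes = entry_bytes * autowarm

print(f"{entry_bytes / 1e6:.2f} MB per entry")            # 8.75 MB per entry
print(f"{total_bytes / 1e9:.1f} GB if all entries fill")  # 35.8 GB if all entries fill
```

At roughly 8.75 MB per entry, a full 4096-entry cache would dwarf an 8 GB heap, which is why the autowarm setting alone can make commits take minutes.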
On Wed, 2015-10-14 at 10:17 +0200, Jan Høydahl wrote:
> I have not benchmarked various numbers of segments at different sizes
> on different HW etc., so my hunch could very well be wrong for Salman's
> case. I don't know how frequently his data is updated, either.
>
> Have you done #segments benchmarking for your huge datasets?
> --
> Jan Høydahl
On Mon, 2015-10-12 at 10:05 +0200, Jan Høydahl wrote:
> Salman,
> You say that you optimized your index from the Admin UI. You should not
> do that, however strange it sounds.
> 70M docs on 2 shards means 35M docs per shard. What you do when you call
> optimize is to force Lucene to merge all those 35M docs into ONE SINGLE
> index segment. You get better HW utilization if you let Lucene/Solr
> automatically handle merging, meaning you'll have around 10 smaller
> segments
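Whether natural merging has left the index in the expected state can be checked from the segment list; Solr exposes one per core at the `/admin/segments` handler. The response keys used below (`segments`, `size`, `delCount`) are an assumption for illustration rather than something stated in the thread, so treat this as a sketch:

```python
import json
from urllib.request import urlopen

def segment_summary(info):
    """Summarize a segments response: segment count, live docs, deletions.
    Key names are assumed; adjust to the actual response of your Solr version."""
    segs = info["segments"]
    docs = sum(s.get("size", 0) for s in segs.values())
    deleted = sum(s.get("delCount", 0) for s in segs.values())
    return {"segments": len(segs), "docs": docs, "deleted": deleted}

def check_core(url="http://localhost:8983/solr/mycore/admin/segments?wt=json"):
    # Hypothetical core name; point this at each shard's core in turn.
    with urlopen(url) as resp:
        return segment_summary(json.load(resp))
```

A naturally merged 35M-doc shard would typically show on the order of ten segments of mixed sizes; a single giant segment right after an optimize is the state being warned about here.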
On 10/10/2015 2:55 AM, Salman Ansari wrote:
> Thanks Shawn for your response. Based on that
> 1) Can you please direct me where I can get more information about cold
> shard vs hot shard?
I don't know of any information out there about hot/cold shards. I can
describe it, though:
A split point is
Regarding the Solr performance issue I was facing, I upgraded my Solr
machine to have:
* 8 cores
* 56 GB RAM
* 8 GB JVM
However, unfortunately, I am still getting delays. I have run:
* the query "Football" with start=0 and rows=10, and it took around 7.329
seconds
* the query "Football" with start=1000 and ro
Thanks Shawn for your response. Based on that:
1) Can you please direct me where I can get more information about cold
shard vs hot shard?
2) That 10GB number assumes there's no other software on the machine, like
a database server or a webserver.
Yes, the machine is dedicated to Solr.
3) How much
On 10/9/2015 1:39 PM, Salman Ansari wrote:
> INFO - 2015-10-09 18:46:17.953; [c:sabr102 s:shard1 r:core_node2
> x:sabr102_shard1_replica1] org.apache.solr.core.SolrCore;
> [sabr102_shard1_replica1] webapp=/solr path=/select
> params={start=0&q=(content_text:Football)&rows=10} hits=24408 status=0
I have restarted Solr and I have tried running a q
Salman Ansari wrote:
> As for the logs, I searched for "Salman" with rows=10 and start=1000 and it
> took about 29 seconds to complete. However, it took less at each shard as
> shown in the log file
> [...] QTime=91
> [...] QTime=4
> the search in the second shard started AFTER 29 seconds. Any
OK, this makes very little sense. The individual queries are taking < 100 ms
yet the total response is 29 seconds. I do note that one of your
queries has rows=1010, a typo?
Anyway, I'm not at all sure what's going on here. If these are gigantic
files you're returning, then it could be decompression time
Salman Ansari wrote:
> Thanks Eric for your response. If you find pagination is not the main
> culprit, what other factors do you guys suggest I need to tweak to test
> that?
Well, is basic search slow? What are your response times for plain un-warmed
top-20 searches?
> As I mentioned, by navig
> …operation time out exception. The first page is relatively faster to
> load, but it does take a few seconds as well. After reading some
> documentation I realized that cursors could help, and they do. I have
> tried to fo
> …Increased the filterCache
> 5) Increased the docCache
> 6) Ran Optimize in the Solr Admin
>
> but still I get delays of around 16 seconds and sometimes even more.
> What other mechanisms do you suggest I should use to handle this issue?
>> …other mechanisms do you suggest I should use to handle this issue?
>
> While pagination is faster than increasing the start parameter, the
> difference is small as long as you stay below a start of 1000. 10K might
> also work for you. Do your users page beyond that?
I can limit users not to go beyond 10K, but I still think at that level
cursors will be much faster than increasing the start variable, as explained
here (https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results).
Have you tried both ways on your collection, and was it giving you
similar
Salman Ansari wrote:
[Pagination with cursors]
> For example, what happens if the user navigates from page 1 to page 2,
> does the front end need to store the next cursor at each query?
Yes.
> What about going to a previous page, do we need to store all cursors
> that have been navigated up to
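The bookkeeping implied by these two answers (store the next cursor at each step, and keep the earlier cursors around if "previous page" must work) can be sketched as a small map from page number to cursorMark. The class and method names are illustrative, not from any real front end:

```python
class CursorStore:
    """Remember which cursorMark starts each page, so both 'next' and
    'previous' navigation work. Page 1 always starts at the initial '*'."""

    def __init__(self):
        self._marks = {1: "*"}

    def mark_for(self, page):
        """cursorMark to send when requesting this page, or None if the
        user tries to jump past any page actually fetched so far."""
        return self._marks.get(page)

    def record(self, page, next_cursor_mark):
        # After fetching `page`, Solr's nextCursorMark is where page+1 starts.
        self._marks[page + 1] = next_cursor_mark
```

Going back to page 1 just reuses "*", and nothing needs to be kept server-side; the cost is one stored string per page the user has visited.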
Hi guys,
I have been working with Solr and Solr.NET for some time on a big project
that requires around 300M documents. In the process, I ran into an issue,
and I am highlighting it here in case you have any comments:
As mentioned here (
https://cwiki.apache.org/confluence/display/solr/Pagination+of