> *…you will be guaranteed
> that each document is only returned once, no matter how it may be
> modified during the use of the cursor.*
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
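To make the cursorMark contract concrete, here is a minimal pagination-loop sketch (not from the thread). The `fetch` callable stands in for an HTTP GET against `/solr/<collection>/select` returning parsed JSON; the field name `id` and the parameter shapes are illustrative.

```python
# Minimal cursorMark pagination loop (sketch). fetch(params) is assumed to
# perform the Solr request and return the parsed JSON response dict.
def paginate_all(fetch, q="*:*", rows=100):
    # cursorMark requires a sort that includes the uniqueKey field.
    params = {"q": q, "rows": rows, "sort": "id asc", "cursorMark": "*"}
    docs = []
    while True:
        resp = fetch(dict(params))
        docs.extend(resp["response"]["docs"])
        next_cursor = resp["nextCursorMark"]
        # Solr signals "no more results" by returning the same cursor back.
        if next_cursor == params["cursorMark"]:
            break
        params["cursorMark"] = next_cursor
    return docs
```

The key point is that the loop terminates when `nextCursorMark` stops advancing, rather than by counting pages.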
On Thu, Aug 3, 2017 at 12:47 PM, Vincenzo D'Amore <v.dam...@gmail.com>
wrote:
Hi all,
I have a collection that is frequently updated. Is it possible that a SolrCloud
query returns duplicate documents while paginating?
Just to be clear: the collection holds about 3M documents, and a Solr query
selects just 500K of them sorted by id, which are returned simply
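The duplicate-on-paging effect with plain start/rows is easy to reproduce with a toy model (this is an illustration, not Solr code): an insert between two page fetches shifts positions, so a document already seen on page 1 slides into page 2.

```python
# Toy illustration of why start/rows paging over a changing collection
# can return the same document twice.
index = ["a", "c", "e", "g"]       # documents, sorted by id

page1 = sorted(index)[0:2]         # start=0, rows=2 -> ["a", "c"]
index.append("b")                  # a new doc arrives that sorts before "c"
page2 = sorted(index)[2:4]         # start=2, rows=2 -> ["c", "e"]

# "c" appears on both pages: the insert shifted its absolute position.
assert "c" in page1 and "c" in page2
```

cursorMark avoids this because the cursor encodes sort values, not absolute offsets.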
Hi all,
I am experiencing weird behavior with Solr: pagination gives duplicate results.
Requesting
*http://localhost:8983/solr/tweets/select?q=text:test&start=0&wt=csv&fl=id,timestamp&fq=doc_type:tweet*
gives me:
id,timestamp
801943081268428800,2016-11-25T00:18:24.613Z
802159834942541824,2016-11-25T14:39
I have already indexed all the documents in Solr and am not indexing anymore,
so the problem I am running into occurs after all the documents are indexed. I am
using SolrCloud with two shards and two replicas per shard, but on the
same machine. Is there anywhere I can look at the relation between
In a word, "no". I once doubled the JVM requirements
by changing just the query. You have to prototype. Here's
a blog on the subject:
https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
On Wed, Oct 28, 2015 at 11:06 AM, Salman Ansari
On Sun, Oct 25, 2015, at 05:43 PM, Salman Ansari wrote:
> Thanks guys for your responses.
>
> That's a very very large cache size. It is likely to use a VERY large
> amount of heap, and autowarming up to 4096 entries at commit time might
> take many *minutes*. Each filterCache entry is
Thanks guys for your responses.
That's a very very large cache size. It is likely to use a VERY large
amount of heap, and autowarming up to 4096 entries at commit time might
take many *minutes*. Each filterCache entry is maxDoc/8 bytes. On an
index core with 70 million documents, each
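Shawn's maxDoc/8 rule of thumb (one bit per document per cached filter) can be worked through for the numbers in this thread; the arithmetic below uses decimal MB/GB:

```python
# filterCache sizing estimate: each entry is a bitset of maxDoc bits,
# i.e. maxDoc/8 bytes, and the cache can hold up to `size` entries.
max_doc = 70_000_000
entry_bytes = max_doc // 8                 # 8,750,000 bytes ~= 8.75 MB
cache_entries = 4096
total_bytes = entry_bytes * cache_entries  # worst case if the cache fills

print(entry_bytes / 1e6, "MB per entry")       # 8.75 MB
print(total_bytes / 1e9, "GB for a full cache")  # 35.84 GB
```

So on a 70M-doc core, a fully warmed 4096-entry filterCache alone could consume tens of GB of heap, which is why the cache size above was flagged as far too large.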
On Wed, 2015-10-14 at 10:17 +0200, Jan Høydahl wrote:
> I have not benchmarked various number of segments at different sizes
> on different HW etc, so my hunch could very well be wrong for Salman’s case.
> I don’t know how frequent the updates to his data are, either.
>
> Have you done #segments
I have not benchmarked various number of segments at different sizes
on different HW etc, so my hunch could very well be wrong for Salman’s case.
I don’t know how frequent the updates to his data are, either.
Have you done #segments benchmarking for your huge datasets?
--
Jan Høydahl, search
Salman,
You say that you optimized your index from Admin. You should not do that,
however strange it sounds.
70M docs on 2 shards means 35M docs per shard. What you do when you call
optimize is to force Lucene
to merge all those 35M docs into ONE SINGLE index segment. You get better HW
On Mon, 2015-10-12 at 10:05 +0200, Jan Høydahl wrote:
> What you do when you call optimize is to force Lucene to merge all
> those 35M docs into ONE SINGLE index segment. You get better HW
> utilization if you let Lucene/Solr automatically handle merging,
> meaning you’ll have around 10 smaller
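If the goal is fewer, larger segments without forcing a single-segment optimize, the merge policy can be tuned in solrconfig.xml instead. A sketch (the values are illustrative, not recommendations, and the exact element name varies by Solr version — `<mergePolicyFactory>` is the Solr 5.5+ form):

```xml
<indexConfig>
  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <!-- how many segments may be merged at once -->
    <int name="maxMergeAtOnce">10</int>
    <!-- target number of segments per size tier -->
    <int name="segmentsPerTier">10</int>
  </mergePolicyFactory>
</indexConfig>
```

Letting TieredMergePolicy run in the background keeps segment counts bounded while avoiding the huge one-shot merge that optimize triggers.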
Regarding the Solr performance issue I was facing, I upgraded my Solr machine
to have:
* 8 cores
* 56 GB RAM
* 8 GB JVM heap
However, unfortunately, I am still getting delays. I have run
* the query "Football" with start=0 and rows=10 and it took around 7.329
seconds
* the query "Football" with start=1000 and
On 10/10/2015 2:55 AM, Salman Ansari wrote:
> Thanks Shawn for your response. Based on that
> 1) Can you please direct me where I can get more information about cold
> shard vs hot shard?
I don't know of any information out there about hot/cold shards. I can
describe it, though:
A split point
Thanks Shawn for your response. Based on that
1) Can you please direct me where I can get more information about cold
shard vs hot shard?
2) That 10GB number assumes there's no other software on the machine, like
a database server or a webserver.
Yes the machine is dedicated for Solr
3) How
On 10/9/2015 1:39 PM, Salman Ansari wrote:
> INFO - 2015-10-09 18:46:17.953; [c:sabr102 s:shard1 r:core_node2
> x:sabr102_shard1_replica1] org.apache.solr.core.SolrCore;
> [sabr102_shard1_replica1] webapp=/solr path=/select
> params={start=0&q=(content_text:Football)&rows=10} hits=24408 status=0
>
…rCache
>> > 5) Increased the docCache
>> > 6) Ran Optimize from the Solr Admin
>> >
>> > but still I get delays of around 16 seconds and sometimes even more.
>> > What other mechanisms do you suggest I should use to handle this issue?
> > What other mechanisms do you suggest I should use to handle this issue?
> >
> > While pagination is faster than increasing the start parameter, the
> > difference is small as long as you stay below a start of 1000. 10K might
> > also work for you. Do your users page beyond that?
> >
Hi guys,
I have been working with Solr and Solr.NET for some time for a big project
that requires around 300M documents. Along the way I ran into an issue, and I
am highlighting it here in case you have any comments:
As mentioned here (
https://cwiki.apache.org/confluence/display/solr/Pagination
Salman Ansari wrote:
[Pagination with cursors]
> For example, what happens if the user navigates from page 1 to page 2,
> does the front end need to store the next cursor at each query?
Yes.
> What about going to a previous page, do we need to store all cursors
>
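Since the answer above is that the front end must keep the cursor for each page, a minimal sketch of that bookkeeping follows (the class, method names, and cursor string values are all hypothetical, chosen for illustration):

```python
# Sketch: remember the cursorMark needed to request each page, so
# "previous page" can re-issue a stored cursor instead of walking
# forward from the start again.
class CursorPager:
    def __init__(self):
        self.cursors = {1: "*"}   # page number -> cursorMark that fetches it
        self.page = 0

    def record_next(self, next_cursor):
        """Call after fetching a page, passing Solr's nextCursorMark."""
        self.page += 1
        self.cursors[self.page + 1] = next_cursor

    def cursor_for(self, page):
        # Raises KeyError for pages the user has never visited.
        return self.cursors[page]

pager = CursorPager()
pager.record_next("AoEjODAx")   # after fetching page 1 (example value)
pager.record_next("AoEjOTk0")   # after fetching page 2 (example value)
assert pager.cursor_for(1) == "*"
assert pager.cursor_for(3) == "AoEjOTk0"
```

The trade-off is that cursors only move forward, so arbitrary jumps to unvisited pages still require walking the cursor from the last known position.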
> >> > I tried the following to get better performance:
> >> >
> >> > 1) Used cursors instead of start and row
> >> > 2) Increased the RAM on my Solr machine to 14GB
> >> > 3) Increase the JVM on that machine to 4GB
> >> > 4)
Salman Ansari wrote:
> As for the logs, I searched for "Salman" with rows=10 and start=1000 and it
> took about 29 seconds to complete. However, it took less at each shard as
> shown in the log file
> [...] QTime=91
> [...] QTime=4
> the search in the second shard
OK, this makes very little sense. The individual queries are taking < 100ms
yet the total response is 29 seconds. I do note that one of your
queries has rows=1010, a typo?
Anyway, not at all sure what's going on here. If these are gigantic files you're
returning, then it could be decompressing
Salman Ansari wrote:
> Thanks Eric for your response. If you find pagination is not the main
> culprit, what other factors do you guys suggest I need to tweak to test
> that?
Well, is basic search slow? What are your response times for plain un-warmed
top-20 searches?
I have restarted Solr and I have tried running a
I can limit users to not go beyond 10K, but I still think that at that level
cursors will be much faster than increasing the start variable, as explained
here (https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results).
Have you tried both ways on your collection, and did they give you
similar