dedupe doesn't work on Solr Cloud with router field

2021-01-13 Thread Luke
I have one collection with 3 shards and 2 replicas. I defined the route field as title, and ID is the unique key. I indexed two documents with the same ID and different titles. I configured a dedupe chain and I can see that the signature is generated, but the old document was removed by Solr. Please help, thanks.

Re: Apache Solr in High Availability Primary and Secondary node.

2021-01-13 Thread Kaushal Shriyan
Hi, I am checking in again to see if someone can pitch in on my earlier post to this mailing list. Thanks in advance. Best Regards, On Tue, Jan 12, 2021 at 8:30 AM Kaushal Shriyan wrote: > > > On Tue, Jan 12, 2021 at 12:10 AM Dmitri Maziuk > wrote: > >> On 1/11/2021 12:30 PM, Walter Underwood wrote: >>

Re: Re:Interpreting Solr indexing times

2021-01-13 Thread Alessandro Benedetti
I agree, documents may be gigantic or very small, with heavy text analysis or simple strings, so it's not possible to give an evaluation here. But you could use the nightly benchmark to get an idea of Lucene indexing speed (Lucene is the engine inside Apache Solr):

Re: leader election stuck after hosts restarts

2021-01-13 Thread Alessandro Benedetti
I faced these problems a while ago, and at the time I wrote a blog post which I hope can help: https://sease.io/2018/05/solrcloud-leader-election-failing.html --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- Sent from:

Re: QueryResponse ordering

2021-01-13 Thread Alessandro Benedetti
Hi Srinivas, Filter queries don't impact scoring, only matching. So, what ordering are you expecting? A bq (boost query) parameter will add a clause to the query, impacting the score in an additive way. The query you posted is a bit confusing, what was your intent there? To boost search

RE: Query over migrating a solr database from 7.7.1 to 8.7.0

2021-01-13 Thread Dyer, Jim
I think if you already have _root_ in schema.xml you should look elsewhere. As I recall, merely adding this one line to schema.xml took care of our problem. From: Flowerday, Matthew J Sent: Tuesday, January 12, 2021 3:23 AM To: solr-user@lucene.apache.org Subject: RE: Query over migrating a solr

Re: different score from different replica of same shard

2021-01-13 Thread Walter Underwood
Yes, check performance before turning on the stats cache in prod. When we tested the LRUStatsCache in 6.6.2, searches were 11X slower. It should be possible to do distributed IDF with little extra overhead. Infoseek was doing that in 1995 and the patent on the technique has expired. wunder

Re: Cursor Performance Issue

2021-01-13 Thread Mike Drob
You should be using docValues on your id field, but note that switching this would require a reindex. On Wed, Jan 13, 2021 at 6:04 AM Ajay Sharma wrote: > Hi All, > > I have used cursors to search and export documents in solr according to > >

Re: different score from different replica of same shard

2021-01-13 Thread Vincent Brehin
Hello Bernd and Markus, A very instructive article, by the creator of TLOG mode (introduced in 7.0, btw): https://medium.com/@caomanhdat317/indexing-flow-of-solrcloud-sharding-distributed-systems-1-bba411bf8994 It helped me when architecting our replication policy. Not an easy matter, it's a

Re: different score from different replica of same shard

2021-01-13 Thread Markus Jelsma
Hello Bernd, I see the different replica types in the 7.1 [1] manual but not in the 6.6 one. ExactStatsCache should work in 6.6, just add it to solrconfig.xml, not the request handler [1]. It will slow down searches due to the added overhead. Regards, Markus [1]

SocketTimeoutException in long running streaming queries

2021-01-13 Thread ufuk yılmaz
When I perform a long running streaming expression, sometimes I get: { "error": { "metadata": [ "error-class", "org.apache.solr.common.SolrException", "root-error-class", "java.net.SocketTimeoutException" ], "msg":

Re: different score from different replica of same shard

2021-01-13 Thread Bernd Fehling
Hello Markus, thanks a lot. Is TLOG also available in Solr 6.6.6, or only in 8.x and up? I will first try ExactStatsCache. It should be added as an invariant to the request handler, right? Comparing the replica index directories, they differ in size, and the index version and generation are different. Also Max

RE: [Solr8.7] UI request reply empty after 8s

2021-01-13 Thread ufuk yılmaz
Hi, A while ago I asked the same thing here. Looking at the JavaScript source code of the frontend app, I saw a 10k millisecond timeout config in httpInterceptor inside app.js. I changed it to something much larger, and the results of long queries began to show. Hope it helps. Sent from Mail for

Re: different score from different replica of same shard

2021-01-13 Thread Markus Jelsma
Hello Bernd, This is normal for NRT replicas, because the way segments are merged and deletes are removed is not synchronized between replicas. In that case the counts for TF and IDF and the norms become slightly different. You can either use ExactStatsCache, which fetches counts for terms before scoring,
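To illustrate the point about unsynchronized deletes, here is a minimal sketch. It assumes the collection scores with BM25, whose IDF in Lucene's BM25Similarity is ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). The replica statistics are made-up numbers, chosen only to show how deleted documents that one replica has not yet merged away shift the IDF for the same term.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """BM25 inverse document frequency, as in Lucene's BM25Similarity."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for one term on two NRT replicas of the same shard.
# Both replicas hold the same 1000 live documents, 100 of which contain the term.
# Replica B still carries 100 deleted documents in unmerged segments,
# 40 of which contain the term, so its raw counts are higher.
replica_a = {"doc_freq": 100, "doc_count": 1000}
replica_b = {"doc_freq": 140, "doc_count": 1100}

idf_a = bm25_idf(**replica_a)
idf_b = bm25_idf(**replica_b)

print(f"replica A idf = {idf_a:.4f}")
print(f"replica B idf = {idf_b:.4f}")
```

The live documents are identical on both replicas, yet the term statistics (and therefore the scores) differ until the deletes are merged away on replica B.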

QueryResponse ordering

2021-01-13 Thread Srinivas Kashyap
Hello, I have a scenario where I'm using a filter query to fetch the results. Example: Filter query (fq) - PARTY_ID:(abc OR def OR ghi) Now I'm getting the query response through SolrJ in a different order. Is there a way I can get the results in the same order as specified in the filter query? Tried dismax
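Since a filter query only restricts which documents match and carries no ordering, one possible workaround is to reorder the returned documents on the client. The sketch below is only an illustration under stated assumptions: the helper name reorder_by_ids is hypothetical, the documents are plain dictionaries as they would appear in a JSON response, and PARTY_ID is assumed to be a single-valued string field.

```python
from typing import Any


def reorder_by_ids(
    docs: list[dict[str, Any]],
    wanted_order: list[str],
    key: str = "PARTY_ID",
) -> list[dict[str, Any]]:
    """Return docs sorted by the position of doc[key] in wanted_order.

    Documents whose value is missing from wanted_order are placed last.
    sorted() is stable, so documents sharing a value keep their original
    relative order.
    """
    rank = {value: position for position, value in enumerate(wanted_order)}
    return sorted(docs, key=lambda doc: rank.get(doc.get(key), len(rank)))


# Hypothetical documents, in the order the server happened to return them.
returned = [
    {"id": "3", "PARTY_ID": "ghi"},
    {"id": "1", "PARTY_ID": "abc"},
    {"id": "2", "PARTY_ID": "def"},
    {"id": "4", "PARTY_ID": "abc"},
]

ordered = reorder_by_ids(returned, ["abc", "def", "ghi"])
print([doc["id"] for doc in ordered])
```

This only works when the whole result set fits in one response; for paged results the ordering would have to come from the query itself, for example through boosting.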

different score from different replica of same shard

2021-01-13 Thread Bernd Fehling
Hello list, a question for better understanding scoring of a shard in a cloud. I see different scores from different replicas of the same shard. Is this normal and if yes, why? My understanding until now was that replicas are always the same within a shard and the same query to each replica

[Solr8.7] UI request reply empty after 8s

2021-01-13 Thread Bruno Mannina
Hi All, I'm facing a problem with my Solr 8.7. When I run a query on my collection from the Solr UI and the request takes more than 8s, nothing happens. I mean, the Solr UI returns a blank page. No error in the log, no response, nothing at all. And if I run the same request right afterwards, the answer

What should I do when I see a collection "recovering" in SolrCloud?

2021-01-13 Thread ufuk yılmaz
Should I stop indexing new documents, or stop indexing and wait for collections to recover? Recently our disk got 100% full and Solr started to throw various errors. So I deleted some unnecessary documents and committed with expungeDeletes=true. It freed some space but many collections went

Cursor Performance Issue

2021-01-13 Thread Ajay Sharma
Hi All, I have used cursors to search and export documents in Solr according to https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors Solr version: 6.5.0 No of Documents: 10 crore (100 million) Before implementing cursor, I was using the start
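For readers unfamiliar with the cursor protocol described in the linked guide, a minimal sketch of the client loop follows. The first request sends cursorMark=*, every later request sends the nextCursorMark from the previous response, and the loop stops once the mark no longer changes. The sort must include the uniqueKey field. The names iterate_with_cursor and fake_fetch are hypothetical; fake_fetch is an in-memory stand-in for the HTTP call, and its cursor marks are simply the last id seen, whereas real Solr cursor marks are opaque strings.

```python
from typing import Callable, Iterator


def iterate_with_cursor(
    fetch_page: Callable[[dict], dict], base_params: dict
) -> Iterator[dict]:
    """Yield every document of a result set by following cursor marks."""
    cursor_mark = "*"  # the starting mark defined by the cursor protocol
    while True:
        params = dict(base_params, cursorMark=cursor_mark)
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_mark = response["nextCursorMark"]
        if next_mark == cursor_mark:
            break  # an unchanged mark means there is nothing left to fetch
        cursor_mark = next_mark


def fake_fetch(params: dict) -> dict:
    """In-memory stand-in for a select request (assumption, not real Solr)."""
    ids = [f"doc{i:02d}" for i in range(7)]
    mark = params["cursorMark"]
    remaining = ids if mark == "*" else [i for i in ids if i > mark]
    page = remaining[: params["rows"]]
    next_mark = page[-1] if page else mark
    return {
        "response": {"docs": [{"id": i} for i in page]},
        "nextCursorMark": next_mark,
    }


exported = list(
    iterate_with_cursor(fake_fetch, {"q": "*:*", "sort": "id asc", "rows": 3})
)
print(len(exported))
```

In a real client, fetch_page would issue the request to the collection's select handler and parse the JSON response; the loop itself stays the same.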

Atomic Update Failures with Nested Schema and Lazy Field Loading

2021-01-13 Thread Ronen Nussbaum
Hi, I’ve encountered another issue that might be related to nested schema. Not always, but many times, atomic updates fail for some shards with the message “TransactionLog doesn't know how to serialize class org.apache.lucene.document.LazyDocument$LazyField”. I checked both options: 1.