Adding solr-core via maven fails

2020-07-01 Thread Ali Akhtar
If I try adding solr-core to an existing project, e.g. (SBT): libraryDependencies += "org.apache.solr" % "solr-core" % "8.5.2" it fails due to a 404 on the dependencies: Extracting structure failed stack trace is suppressed; run last update for the full output stack trace is suppressed; run last ssE

How to use two search string in a single solr query

2020-07-01 Thread Tushar Arora
Hi, I have a scenario with the following entry in the request handler (handler1) of solrconfig.xml (defType=edismax is used): description category title^4 demand^0.3 2<-1 4<-30% When I searched for 'bags' as a search string, Solr returned 15000 results. Query used: http://localhost:8984/solr/core_name/sele

Re: Suggestion or recommendation for NRT

2020-07-01 Thread Erick Erickson
That seems high. It can be tricky to get tests. Are you running with some kind of test runner? Do you have, say, 3-4 thousand queries you run? Are you running the tests after warming the searchers? Also, if you have indexed down to one segment, _then_ tried adding docs and measuring you are not ge

Re: Suggestion or recommendation for NRT

2020-07-01 Thread ramyogi
Thanks, Erick, for the details and the reference; they helped me understand segment merging better. When I compare the performance of search requests on an uninterrupted/optimized collection (segment count 1) against a collection with indexing and search running in parallel, performance is 3 times higher, for example

Re: FunctionScoreQuery how to use it

2020-07-01 Thread Mikhail Khludnev
Hi, Vincenzo. Discussed earlier https://www.mail-archive.com/java-user@lucene.apache.org/msg50255.html On Wed, Jul 1, 2020 at 8:36 PM Vincenzo D'Amore wrote: > Hi all, > > I'm struggling with an old class that extends CustomScoreQuery. > I was trying to port to solr 8.5.2 and I'm looking for an

Re: Suggestion or recommendation for NRT

2020-07-01 Thread Erick Erickson
Updated documents are marked as deleted in the old segment and added to a new segment. When commits happen, merges occur and only then is the space occupied by the deleted document reclaimed. Which segments are merged on commit depends on a number of factors. Unless you can prove the extra space
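
The delete-then-merge behaviour described in this message can be illustrated with a toy in-memory model. This is only a sketch: the `Segment` class and the `update`, `merge`, and `stored_count` helpers are hypothetical names made up for illustration, not Lucene or Solr APIs, and the real merge policy is far more selective about which segments it rewrites.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an index segment: doc id -> content, plus deletion marks."""
    docs: dict = field(default_factory=dict)
    deleted: set = field(default_factory=set)

    def live_docs(self):
        # Documents that have not been marked as deleted.
        return {k: v for k, v in self.docs.items() if k not in self.deleted}


def update(segments, doc_id, content):
    """Mark any older copy as deleted, then write the new copy to a fresh segment."""
    for seg in segments:
        if doc_id in seg.docs:
            seg.deleted.add(doc_id)
    segments.append(Segment(docs={doc_id: content}))


def merge(segments):
    """Rewrite only the live documents into one segment, dropping deleted copies."""
    merged = Segment()
    for seg in segments:
        merged.docs.update(seg.live_docs())
    return [merged]


def stored_count(segments):
    """Number of document copies occupying space, live or deleted."""
    return sum(len(seg.docs) for seg in segments)


segments = [Segment(docs={"a": "v1", "b": "v1"})]
update(segments, "a", "v2")
update(segments, "a", "v3")
before = stored_count(segments)  # deleted copies of "a" still take up space
segments = merge(segments)
after = stored_count(segments)   # only the live copies remain
```

In this toy model, re-indexing the same document grows the stored copy count until a merge rewrites the live documents, which matches the index growth reported elsewhere in this thread.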

Re: Suggestion or recommendation for NRT

2020-07-01 Thread ramyogi
Even though the same document is indexed over and over again due to incremental updates, the index size keeps increasing. Am I missing any configuration to make optimization occur internally?

RE: CDCR stress-test issues

2020-07-01 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
For the record, it is not just Solr7.4 which has the problem. When I start afresh with Solr8.5.2, both symptoms persist. With Solr8.5.2, tlogs accumulate endlessly at the non-Leader nodes of the Source SolrCloud and are never released, regardless of the maxNumLogsToKeep setting. And with Solr8.5.2, i

Re: Downsides to applying to WordDelimiterFilter twice in analyzer chain

2020-07-01 Thread Erick Erickson
Consider something other than WhitespaceTokenizer. In this case the tokenizer would split on the period and it’d work. I don’t know whether that would fit the rest of your problem space or not though. But to answer your original question, no there’s no a-priori reason you can’t have WordDelimiter(

Re: Downsides to applying to WordDelimiterFilter twice in analyzer chain

2020-07-01 Thread gnandre
Here are links to images for the Analysis tab. https://pasteboard.co/JfFTYu6.png https://pasteboard.co/JfFUYXf.png On Wed, Jul 1, 2020 at 3:03 PM gnandre wrote: > I am doing that already but it does not help. > > Here is the complete analyzer chain. > > "100"> "solr.WhitespaceTokenizerFacto

Re: Downsides to applying to WordDelimiterFilter twice in analyzer chain

2020-07-01 Thread gnandre
I am doing that already but it does not help. Here is the complete analyzer chain. [image: image.png] [image: image.png] On Wed, Jul 1, 2020 at 12:29 PM Erick Erickson wrote: > Why not just specify preserveOriginal and follow by a lowerCaseFilter and > use one wordDelimit

FunctionScoreQuery how to use it

2020-07-01 Thread Vincenzo D'Amore
Hi all, I'm struggling with an old class that extends CustomScoreQuery. I was trying to port to solr 8.5.2 and I'm looking for an example on how to implement it using FunctionScoreQuery. Do you know if there are examples that explain how to port the code to the new implementation? -- Vincenzo D

Re: Downsides to applying to WordDelimiterFilter twice in analyzer chain

2020-07-01 Thread Erick Erickson
Why not just specify preserveOriginal and follow by a lowerCaseFilter and use one wordDelimiterFilterFactory? Best, Erick > On Jul 1, 2020, at 11:05 AM, gnandre wrote: > > Hi, > > To satisfy one use-case, I need to apply WordDelimiterFilter with > splitOnCaseChange > with 0 once and then with

Searching document content and multi-valued fields

2020-07-01 Thread Shaun Campbell
Hi, I've been using Solr on a project for a couple of years now and it is working well. It's just a simple index of about 20-25 fields and 7,000 project records. Now there's a requirement to be able to search on the content of documents (web pages, Word, PDF, etc.) related to those projects. My initial t

Solr Grouping and Unique values

2020-07-01 Thread Reinhardt, Nate
I am trying to find a way to grab unique values based on a group. The idea would be to group by an id and then return that group's value. Query params: fl=valueIwant+myID&group=true&group.field=myId&q=*:* "grouped": { "myID": { "matches": 7520236, "groups": [{
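
One way to pull a single value per group out of a grouped response can be sketched as follows. The sample response and the helper name `values_per_group` are hypothetical; only the `grouped` / `groups` / `groupValue` / `doclist` nesting follows Solr's standard grouped result layout, and the field names `myID` and `valueIwant` are taken from the message above.

```python
def values_per_group(response, group_field, value_field):
    """Map each group's value to value_field of the first document in that group."""
    groups = response["grouped"][group_field]["groups"]
    result = {}
    for group in groups:
        docs = group["doclist"]["docs"]
        if docs:  # skip groups that returned no documents
            result[group["groupValue"]] = docs[0].get(value_field)
    return result


# Hypothetical sample data shaped like a grouped Solr response.
sample = {
    "grouped": {
        "myID": {
            "matches": 3,
            "groups": [
                {"groupValue": "A",
                 "doclist": {"numFound": 2, "start": 0,
                             "docs": [{"myID": "A", "valueIwant": "x"}]}},
                {"groupValue": "B",
                 "doclist": {"numFound": 1, "start": 0,
                             "docs": [{"myID": "B", "valueIwant": "y"}]}},
            ],
        }
    }
}

unique_values = values_per_group(sample, "myID", "valueIwant")
```

The sketch assumes one document per group is enough, which is what the default group limit returns.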

Re: Parallel SQL join on multivalue fields

2020-07-01 Thread Piero Scrima
The reason JOIN works is the Calcite framework. The Parallel SQL feature leverages Calcite, which implements all the SQL features; all you need to do is provide a way for Calcite to get the collection/table. In Solr this is done by SolrTable.java (package org.apache.solr.handler.

Downsides to applying to WordDelimiterFilter twice in analyzer chain

2020-07-01 Thread gnandre
Hi, To satisfy one use case, I need to apply WordDelimiterFilter with splitOnCaseChange set to 0 once and then again with it set to 1. Are there any downsides to this approach? The use case is to be able to match results when the indexed content is my.camelCase and the search query is camelcase.

Re: Parallel SQL join on multivalue fields

2020-07-01 Thread Joel Bernstein
There isn't any real support for joins in Parallel SQL currently. I'm surprised that you're having some success doing them. Can you provide a sample SQL join that is working for you? Joel Bernstein http://joelsolr.blogspot.com/ On Fri, Jun 26, 2020 at 3:32 AM Piero Scrima wrote: > Hi, > > Al

Re: Supporting multiple indexes in one collection

2020-07-01 Thread Erick Erickson
Sharding always adds overhead, which balances against splitting the work up amongst several machines. Sharding works like this for queries: 1> node receives query 2> a sub-query is sent to one replica of each shard 3> each replica sends back its top N (rows parameter) with ID and sort data 4
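
The first three steps listed in this message can be sketched as a small scatter-gather routine. The message is cut off after step 3, so the final combining step below is an assumption (the coordinator keeps the overall top N of the candidates it received), and the function names and sample scores are hypothetical.

```python
import heapq


def shard_top_n(shard_docs, n):
    """Each shard returns only (sort value, id) pairs for its own top n documents."""
    return heapq.nlargest(n, ((score, doc_id) for doc_id, score in shard_docs.items()))


def distributed_query(shards, n):
    """Coordinator: send the sub-query to every shard, then combine the partial lists."""
    partial = [shard_top_n(docs, n) for docs in shards]
    # Assumption: keep the overall top n of the combined per-shard candidates.
    combined = heapq.nlargest(n, (hit for hits in partial for hit in hits))
    return [doc_id for _score, doc_id in combined]


# Hypothetical data: two shards, each mapping doc id -> score.
shards = [
    {"d1": 0.9, "d2": 0.4, "d3": 0.1},
    {"d4": 0.8, "d5": 0.7},
]
top = distributed_query(shards, 2)
```

Every shard has to compute and ship its own top N even though most of those candidates are discarded by the coordinator, which is one source of the overhead the message mentions.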