Query generation is different for search terms with and without "-"

2020-11-23 Thread Samuel Gutierrez
I am troubleshooting a ranking issue for search terms that contain a "-"
versus the same query without the dash, e.g. "high-tech" vs "high tech".
The field I am querying uses the standard tokenizer, so I would expect the
underlying Lucene query to be the same for both versions. However, the
debug output shows that they are generated differently. I know "-" must be
escaped because it has special meaning in Lucene, but escaping does not fix
the problem. With the "-" present, the pf2 edismax parameter appears not to
be respected and is omitted from the final query. We use sow=false because
we have multi-term synonyms and need to ensure they are included in the
final Lucene query. My expectation is that the final Lucene query should be
based on the output of the field analyzer. However, after briefly looking
at the code for ExtendedDismaxQParser, it appears that some string
processing happens outside the analysis step, which causes the unexpected
Lucene query.


Solr Debug for "high tech":

parsedquery: "+(DisjunctionMaxQuery((Name_enUS:high)~0.4)
DisjunctionMaxQuery((Name_enUS:tech)~0.4))~2
DisjunctionMaxQuery((Name_enUS:"high tech"~5)~0.4)
DisjunctionMaxQuery((Name_enUS:"high tech"~4)~0.4)",
parsedquery_toString: "+(((Name_enUS:high)~0.4
(Name_enUS:tech)~0.4)~2) (Name_enUS:"high tech"~5)~0.4
(Name_enUS:"high tech"~4)~0.4",


Solr Debug for "high-tech"

parsedquery: "+DisjunctionMaxQuery((((Name_enUS:high
Name_enUS:tech)~2))~0.4) DisjunctionMaxQuery((Name_enUS:"high
tech"~5)~0.4)",
parsedquery_toString: "+(((Name_enUS:high Name_enUS:tech)~2))~0.4
(Name_enUS:"high tech"~5)~0.4"
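
One way to compare the two cases side by side is to request only the query
debug section for each input and check whether the pf2 phrase clause
survives. The sketch below is a minimal illustration in Python and is not
taken from the thread: the parameter values are assumptions inferred from
the debug output above (tie 0.4, slop 5 for pf, slop 4 for pf2), and the
helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed request parameters, inferred from the debug output in this message.
# The real values live in the request handler defaults and may differ.
EDISMAX_PARAMS = {
    "defType": "edismax",
    "qf": "Name_enUS",
    "pf": "Name_enUS",
    "pf2": "Name_enUS",
    "ps": "5",
    "ps2": "4",
    "tie": "0.4",
    "sow": "false",
    "debug": "query",  # return only the query section of the debug output
    "rows": "0",
    "wt": "json",
}


def debug_query_string(user_query):
    """Build the query string for a /select request with query debugging on."""
    return urlencode({**EDISMAX_PARAMS, "q": user_query})


def has_phrase_clause(parsed_query, field, phrase, slop):
    """Return True if a sloppy phrase clause such as field:"a b"~4 is present."""
    # Collapse whitespace so line-wrapped debug output still matches.
    normalized = " ".join(parsed_query.split())
    return f'{field}:"{phrase}"~{slop}' in normalized


# parsedquery_toString values copied from the debug output above.
WITH_SPACE = ('+(((Name_enUS:high)~0.4 (Name_enUS:tech)~0.4)~2) '
              '(Name_enUS:"high tech"~5)~0.4 (Name_enUS:"high tech"~4)~0.4')
WITH_DASH = ('+(((Name_enUS:high Name_enUS:tech)~2))~0.4 '
             '(Name_enUS:"high tech"~5)~0.4')

print(has_phrase_clause(WITH_SPACE, "Name_enUS", "high tech", 4))  # True
print(has_phrase_clause(WITH_DASH, "Name_enUS", "high tech", 4))   # False
```

The check only looks at the string form of the parsed query, so it shows
that the slop-4 clause is missing for "high-tech" but does not explain why
the parser drops it.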

SolrConfig (XML element names were not preserved in the archive; only the
values remain):

  true
  true
  json
  375%
  Name_enUS
  Name_enUS
  5
  Name_enUS
  4
  3
  0.4
  explicit
  100
  false

  edismax

Schema: (definitions not preserved in the archive)

Using Solr 8.6.3



Re: Use stream result like a query (alternative to innerJoin)

2020-11-23 Thread Joel Bernstein
Here is the documentation for fetch:

https://lucene.apache.org/solr/guide/8_4/stream-decorator-reference.html#fetch


Joel Bernstein
http://joelsolr.blogspot.com/




Re: Use stream result like a query (alternative to innerJoin)

2020-11-23 Thread Joel Bernstein
There are two streams that behave like that.

One is the "nodes" expression, which is not going to work for this use case
because it does everything in memory.

The second one is the "fetch" expression which behaves like a nested loop
join with some limitations. Unfortunately the main limitation is likely to
be a blocker for you which is that it doesn't support one-to-many joins yet.
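
As a rough illustration of the fetch approach described above, the Python
sketch below only assembles the streaming expression text for the
collections named in the original question. The field list, sort, and the
helper names are assumptions for illustration; the expression would still
have to be sent to a /stream handler, and whether it fits depends on the
one-to-many limitation mentioned above.

```python
def search_expr(collection, q, fl, sort):
    """Assemble a search() streaming expression as plain text."""
    return f'search({collection}, q="{q}", fl="{fl}", sort="{sort}")'


def fetch_expr(collection, stream, fl, on, batch_size=None):
    """Assemble a fetch() expression that decorates `stream` with fields
    looked up in `collection`. `on` is written as streamField=collectionField."""
    parts = [collection, stream, f'fl="{fl}"', f'on="{on}"']
    if batch_size is not None:
        parts.append(f'batchSize="{batch_size}"')
    return "fetch(" + ", ".join(parts) + ")"


# Stream the small side (deletedItems) and look up the matching items.
# The q, fl, and sort values here are placeholders, not from the thread.
small_side = search_expr("deletedItems", "*:*", "deletedItemId",
                         "deletedItemId asc")
expr = fetch_expr("items", small_side, fl="name", on="deletedItemId=id")
print(expr)
```

The point of this shape is that only the deletedItems results are streamed
to the worker, while items is queried in batches by key. Tuples without a
match in items would presumably come back without the fetched fields, so a
filtering step may still be needed; that behaviour is worth checking against
the fetch documentation.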

Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Nov 22, 2020 at 10:37 AM ufuk yılmaz wrote:

> Hi all,
>
> I’m looking for a way to query two collections and find documents that
> exist in both. I know this can be done with the innerJoin streaming
> expression, but I want to avoid it, since one of the collection streams can
> have billions of results.
>
> Let’s say two collections are:
>
> deletedItems = [{deletedItemId: 1}, {deletedItemId: 2}...]
> items = [
>   {
>     id: 1,
>     name: "a"
>   },
>   {
>     id: 2,
>     name: "b"
>   },
>   {
>     id: 3,
>     name: "c"
>   }.
> ]
>
> “deletedItems” contains few documents compared to the “items” collection
> (1 million vs 2-3 billion). If I query them both with a typical query in
> our system, deletedItems gives a few thousand results but items gives tens
> or hundreds of millions. To use innerJoin, I have to stream the whole items
> result to the worker node over the network.
>
> Is there a way to avoid this, something like using the “deletedItems”
> result as a query to the “items” stream?
>
> Thanks in advance for the help


RE: Solr8.7 Munin ?

2020-11-23 Thread Bruno Mannina
OK, thanks for the help!




Re: Solr8.7 Munin ?

2020-11-23 Thread Bernd Fehling

Hi Bruno,

yes, I use munin-solr plugin.
https://github.com/averni/munin-solr

I renamed it to solr_*.py on my servers.

Regards
Bernd


--
*
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de
          https://www.ub.uni-bielefeld.de/~befehl/

BASE - Bielefeld Academic Search Engine - www.base-search.net
*


RE: Solr8.7 Munin ?

2020-11-23 Thread Bruno Mannina
Hello Bernd,

Do you use a specific plugin for Solr?

Thanks,
Bruno




Re: Solr8.7 Munin ?

2020-11-23 Thread Bernd Fehling

We have been using Munin for years now for Solr monitoring.
Currently Munin 2.0.40 and SolrCloud 6.6.

Regards
Bernd


On 20.11.20 at 21:02, Matheo Software wrote:

Hello,

I would like to use Munin to check my Solr 8.7, but it doesn't work. I tried
to configure the Munin plugins without success.

Is anybody using Munin with a recent version of Solr (version > 5.4)?

Thanks a lot,

Best regards,

Bruno Mannina

www.matheo-software.com
www.patent-pulse.com

Tel. +33 0 970 738 743
Mob. +33 0 634 421 817