Re: Constant score and stopwords strange behaviour

2020-06-25 Thread Paras Lehana
Hi,

You can also change the multiplication factor in the TF-IDF snippet in
the source code to 1. There is probably a better way to handle stopwords
now that you are using constant scoring, but I wanted to mention the
method by which we got rid of TF.
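
To illustrate the effect of pinning that factor to 1, here is a toy sketch
in Python. It is not Lucene's or Solr's actual code: the function name
toy_tfidf, its parameters, and the simplified tf * idf formula are all
assumptions made for illustration only.

```python
import math

def toy_tfidf(freq, doc_freq, doc_count, tf_factor=None):
    """Toy tf * idf score (illustrative only, not Lucene's implementation).

    If tf_factor is given, it replaces the frequency-based tf term.
    """
    tf = math.sqrt(freq) if tf_factor is None else tf_factor
    idf = 1.0 + math.log((doc_count + 1) / (doc_freq + 1))
    return tf * idf

# With the factor pinned to 1, repeating a term no longer changes the score.
once = toy_tfidf(freq=1, doc_freq=10, doc_count=1000, tf_factor=1.0)
many = toy_tfidf(freq=9, doc_freq=10, doc_count=1000, tf_factor=1.0)
print(once == many)  # True
```

Note that in this sketch the IDF part still varies per term, so this only
removes the term-frequency contribution.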

On Thu, 25 Jun 2020 at 03:02, dbourassa wrote:

> Hi,
>
> I'm working on a Solr core where we don't want to use TF-IDF (BM25).
> We rank documents with boost based on popularity, exact match, phrase
> match,
> etc.
>
> To bypass TF-IDF, we use constant score like this "q=harry^=0.5
> potter^=0.5"
> (score is always 1 before boost)
> We have just noticed a strange behaviour with this method.
> With "q=a cat", the stopword 'a' is automatically removed by the query
> analyzer.
> But with "q=a^0.5 cat^0.5", the stopword 'a' is not removed.
>
> We also tried something like "q=(a AND cat)^=1" but the problem persists.
>
> Does anyone have an idea or a better solution to bypass TF-IDF?
>
> relevant info in solrconfig :
> ...
> edismax
> 590%
> true
> ...
>
> relevant info in schema :
> 
> ...
>  words="stopwords_querytime_custom.txt"/>
> ...
>
>
> Thanks
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Regards,
Paras Lehana

Constant score and stopwords strange behaviour

2020-06-24 Thread dbourassa
Hi,

I'm working on a Solr core where we don't want to use TF-IDF (BM25).
We rank documents with boost based on popularity, exact match, phrase match,
etc.

To bypass TF-IDF, we use constant score like this "q=harry^=0.5 potter^=0.5"
(score is always 1 before boost)
We have just noticed a strange behaviour with this method.
With "q=a cat", the stopword 'a' is automatically removed by the query
analyzer.
But with "q=a^0.5 cat^0.5", the stopword 'a' is not removed. 

We also tried something like "q=(a AND cat)^=1" but the problem persists.

Does anyone have an idea or a better solution to bypass TF-IDF?
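
One possible workaround, sketched here in Python under stated assumptions,
is to drop stopwords on the client before appending the ^= constant-score
operator, so the query never carries a scored stopword clause. The
function name constant_score_query and the STOPWORDS set are hypothetical;
in practice the set would need to mirror stopwords_querytime_custom.txt.
The sketch does not escape query-syntax special characters.

```python
# Hypothetical stopword set; it would need to match the entries in
# stopwords_querytime_custom.txt to behave like the query analyzer.
STOPWORDS = {"a", "an", "the"}

def constant_score_query(user_input, score=0.5, stopwords=STOPWORDS):
    """Drop stopwords client-side, then append ^=score to each remaining term."""
    terms = [t for t in user_input.split() if t.lower() not in stopwords]
    return " ".join(f"{t}^={score}" for t in terms)

print(constant_score_query("a cat"))         # cat^=0.5
print(constant_score_query("harry potter"))  # harry^=0.5 potter^=0.5
```

The resulting string would be sent as the q parameter. This keeps the
constant-score approach unchanged and only moves stopword removal ahead of
the query parser.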

relevant info in solrconfig :
...
edismax
590%
true
...

relevant info in schema :

...

...


Thanks


