Re: Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread Erick Erickson
Wei:

That is odd. These should be the same so I'm puzzled too.

I'm assuming that you're using the exact same schema on both, with each
field having the exact same definition. And since you say it's the
same release of Solr, it's not like some default changed.

Here's an idea (and I'm shooting in the dark here).

Copy the index from one place to another and see if what you're seeing
is still true. Assuming the schema is the same, you should be able to:
1> shut down all your, say, SolrCloud instances.
2> copy the stand-alone index to each of those instances. Verify that
there is exactly one segment since you said it's optimized.
3> start the SolrCloud instances back up.

Are the scores still different?

Let's claim they're the same. In that case, use the schema from your
stand-alone Solr for SolrCloud, then delete the index and re-index
from scratch.

Best,
Erick

On Thu, Jun 7, 2018 at 2:28 PM, Wei  wrote:
> Thanks Erick. However our indexes on stand alone and cloud are both static
> -- we indexed them from the same source xmls, optimize and have no updates
> after it is done. Also in cloud there is only one single shard( with
> multiple replicas ). I assume distributed stats doesn't have effect in this
> case?
>
> Thanks,
> Wei
>
> On Thu, Jun 7, 2018 at 12:18 PM, Erick Erickson 
> wrote:
>
>> Short form:
>>
>> As docs are updated, they're marked as deleted until the segment is
>> merged. This affects things like term frequency and doc frequency
>> which in turn influences the score.
>>
>> Due to how commits happen, i.e. autocommit will hit at slightly skewed
>> wall-clock time, different segments are merged on different replicas
>> of the same shard. Thus the scores can be slightly different
>>
>> You can turn on distributed stats which will help with this:
>> https://issues.apache.org/jira/browse/SOLR-1632
>>
>> Best,
>> Erick
>>
>> On Thu, Jun 7, 2018 at 12:07 PM, Wei  wrote:
>> > Hi,
>> >
>> > Recently we have an observation that really puzzled us.  We have two
>> > instances of Solr,  one in stand alone mode and one is a single-shard
>> solr
>> > cloud with a couple of replicas.  Both are indexed with the same
>> documents
>> > and have same solr version 6.6.2.  When issue the same query, the solr
>> > score from stand alone and cloud are different.  How could this happen?
>> > With the same data, software version and query,  should solr score be
>> > exactly same regardless of cloud mode or not?
>> >
>> > Thanks,
>> > Wei
>>


Re: Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread Wei
Thanks Erick. However, our indexes on stand-alone and cloud are both static
-- we indexed them from the same source XML files, optimized them, and have
made no updates since. Also, in cloud there is only a single shard (with
multiple replicas). I assume distributed stats have no effect in this
case?

Thanks,
Wei

On Thu, Jun 7, 2018 at 12:18 PM, Erick Erickson 
wrote:

> Short form:
>
> As docs are updated, they're marked as deleted until the segment is
> merged. This affects things like term frequency and doc frequency
> which in turn influences the score.
>
> Due to how commits happen, i.e. autocommit will hit at slightly skewed
> wall-clock time, different segments are merged on different replicas
> of the same shard. Thus the scores can be slightly different
>
> You can turn on distributed stats which will help with this:
> https://issues.apache.org/jira/browse/SOLR-1632
>
> Best,
> Erick
>
> On Thu, Jun 7, 2018 at 12:07 PM, Wei  wrote:
> > Hi,
> >
> > Recently we have an observation that really puzzled us.  We have two
> > instances of Solr,  one in stand alone mode and one is a single-shard
> solr
> > cloud with a couple of replicas.  Both are indexed with the same
> documents
> > and have same solr version 6.6.2.  When issue the same query, the solr
> > score from stand alone and cloud are different.  How could this happen?
> > With the same data, software version and query,  should solr score be
> > exactly same regardless of cloud mode or not?
> >
> > Thanks,
> > Wei
>


RE: Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread Markus Jelsma
To add to that, keep in mind that you need to disable the queryResultCache,
or distributed stats won't work.

And to add to that, I do not think distributed stats will work for a
single-shard index anyway.

Regards,
Markus

 
 
-Original message-
> From:Erick Erickson 
> Sent: Thursday 7th June 2018 21:19
> To: solr-user 
> Subject: Re: Different solr score between stand alone vs cloud mode solr
> 
> Short form:
> 
> As docs are updated, they're marked as deleted until the segment is
> merged. This affects things like term frequency and doc frequency
> which in turn influences the score.
> 
> Due to how commits happen, i.e. autocommit will hit at slightly skewed
> wall-clock time, different segments are merged on different replicas
> of the same shard. Thus the scores can be slightly different
> 
> You can turn on distributed stats which will help with this:
> https://issues.apache.org/jira/browse/SOLR-1632
> 
> Best,
> Erick
> 
> On Thu, Jun 7, 2018 at 12:07 PM, Wei  wrote:
> > Hi,
> >
> > Recently we have an observation that really puzzled us.  We have two
> > instances of Solr,  one in stand alone mode and one is a single-shard solr
> > cloud with a couple of replicas.  Both are indexed with the same documents
> > and have same solr version 6.6.2.  When issue the same query, the solr
> > score from stand alone and cloud are different.  How could this happen?
> > With the same data, software version and query,  should solr score be
> > exactly same regardless of cloud mode or not?
> >
> > Thanks,
> > Wei
> 


Re: Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread David Hastings
Also, the score is a fluid number. You shouldn't use the score for any real
purpose aside from checking that the documents are in the right order
relative to the scores of the other documents in the result set, or for
explaining the occasional case where two results swap places from one
request to the next because they have the same score.

On Thu, Jun 7, 2018 at 3:18 PM, Erick Erickson 
wrote:

> Short form:
>
> As docs are updated, they're marked as deleted until the segment is
> merged. This affects things like term frequency and doc frequency
> which in turn influences the score.
>
> Due to how commits happen, i.e. autocommit will hit at slightly skewed
> wall-clock time, different segments are merged on different replicas
> of the same shard. Thus the scores can be slightly different
>
> You can turn on distributed stats which will help with this:
> https://issues.apache.org/jira/browse/SOLR-1632
>
> Best,
> Erick
>
> On Thu, Jun 7, 2018 at 12:07 PM, Wei  wrote:
> > Hi,
> >
> > Recently we have an observation that really puzzled us.  We have two
> > instances of Solr,  one in stand alone mode and one is a single-shard
> solr
> > cloud with a couple of replicas.  Both are indexed with the same
> documents
> > and have same solr version 6.6.2.  When issue the same query, the solr
> > score from stand alone and cloud are different.  How could this happen?
> > With the same data, software version and query,  should solr score be
> > exactly same regardless of cloud mode or not?
> >
> > Thanks,
> > Wei
>


Re: Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread Erick Erickson
Short form:

As docs are updated, they're marked as deleted until the segment is
merged. This affects things like term frequency and doc frequency
which in turn influences the score.

Because of how commits happen (autocommit fires at slightly skewed
wall-clock times on each replica), different segments are merged on
different replicas of the same shard. Thus the scores can be slightly
different.

You can turn on distributed stats which will help with this:
https://issues.apache.org/jira/browse/SOLR-1632
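
To make the effect of not-yet-merged deletes concrete, here is a minimal
sketch. It assumes the BM25 idf formula that Lucene uses by default in
Solr 6 and later; the document counts are made up for illustration and the
function name is hypothetical.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """BM25-style idf: ln(1 + (N - df + 0.5) / (df + 0.5))."""
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Replica A has already merged away its deleted documents.
idf_merged = bm25_idf(doc_count=1000, doc_freq=100)

# Replica B holds the same 1000 live documents, but 50 superseded
# versions that contain the term are still sitting in old segments.
# Deleted documents keep counting toward both statistics until a merge.
idf_unmerged = bm25_idf(doc_count=1050, doc_freq=150)

print(f"merged replica idf:   {idf_merged:.4f}")
print(f"unmerged replica idf: {idf_unmerged:.4f}")
```

The live documents are identical on both replicas, yet the idf (and so the
score) differs until the deleted documents are merged away.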

Best,
Erick

On Thu, Jun 7, 2018 at 12:07 PM, Wei  wrote:
> Hi,
>
> Recently we have an observation that really puzzled us.  We have two
> instances of Solr,  one in stand alone mode and one is a single-shard solr
> cloud with a couple of replicas.  Both are indexed with the same documents
> and have same solr version 6.6.2.  When issue the same query, the solr
> score from stand alone and cloud are different.  How could this happen?
> With the same data, software version and query,  should solr score be
> exactly same regardless of cloud mode or not?
>
> Thanks,
> Wei


Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread Wei
Hi,

Recently we made an observation that really puzzled us.  We have two
instances of Solr, one in stand-alone mode and one a single-shard SolrCloud
with a couple of replicas.  Both are indexed with the same documents
and run the same Solr version, 6.6.2.  When we issue the same query, the
Solr scores from stand-alone and cloud are different.  How could this happen?
With the same data, software version and query, shouldn't the Solr score be
exactly the same regardless of cloud mode?

Thanks,
Wei


Re: SOLR Score Range Changed

2018-02-26 Thread Shawn Heisey
On 2/23/2018 2:28 PM, Hodder, Rick wrote:
> Combining everything into one query is what I'd prefer because as you said, 
> one would think that with everything in the same query, the score would 
> organize everything nicely.

I don't recall writing anything like that.  How did you infer that from
what I wrote?  One thing that you can infer from what I said is that
comparing scores from multiple queries is not going to do what you think
it will do.  Which leads into the next thing I'll quote from your message:

> So the way we had addressed it was running 3 separate SOLR queries and 
> combining them and sorting them by descending score - wasn’t perfect, but it 
> worked, and helped me to reduce the number of results we hand off to a 
> scoring engine that applies 3 algorithms (Monge-Elkan, Jaro-Winkler, and 
> SmithWindowed Affline) to further hone the results - which can take LOTS of 
> time if there are a lot of results, so 

It seems that you didn't finish your sentence, and may not have even
finished the message, as this was the last thing you wrote.

Running three separate queries and then trying to combine them based on
score is not something you should ever attempt, because as I mentioned
before, the absolute score of a document in a result is only meaningful
for that specific query done at that moment.  Even the same query done
later after something has changed might have a very different score range.
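
One way to combine several result lists without comparing scores across
queries is to interleave them by rank. The following is only a sketch of
that idea, not something proposed in this thread; interleave_by_rank and
the document ids are hypothetical.

```python
from itertools import zip_longest


def interleave_by_rank(result_lists, limit):
    """Round-robin merge of ranked lists, skipping duplicate ids."""
    merged, seen = [], set()
    # Each tier holds the rank-n document of every list (None when a
    # list is already exhausted).
    for tier in zip_longest(*result_lists):
        for doc_id in tier:
            if doc_id is None or doc_id in seen:
                continue
            seen.add(doc_id)
            merged.append(doc_id)
            if len(merged) == limit:
                return merged
    return merged


bobs = ["b1", "b2", "b3"]
consolidated = ["c1", "c2", "c3", "c4"]
land = ["l1", "c2"]

print(interleave_by_rank([bobs, consolidated, land], limit=6))
```

Each query keeps its own internal order, and no query can crowd the others
out of the merged list just because its scores fall in a higher range.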

Thanks,
Shawn



RE: SOLR Score Range Changed

2018-02-23 Thread Hodder, Rick
ClassicSimilarity helped, but the ranges of values still don’t have a min
near 0 like they did back in version 4.



Are there other attributes/elements to this factory that could get me back the 
old functionality?

-Original Message-
From: Joël Trigalo [mailto:jtrig...@gmail.com] 
Sent: Friday, February 23, 2018 10:41 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR Score Range Changed

The difference seems due to the fact that default similarity in solr 7 is
BM25 while it used to be TF-IDF in solr 4. As you realised, BM25 function is 
smoother.
You can configure schema.xml to use ClassicSimilarity, for instance 
https://lucene.apache.org/solr/guide/6_6/major-changes-from-solr-5-to-solr-6.html#default-similarity-changes
https://lucene.apache.org/solr/guide/6_6/field-type-definitions-and-properties.html#FieldTypeDefinitionsandProperties-FieldTypeSimilarity

But as said before, maybe you are using properties that are not guaranteed so 
it would be better to change score function or sorting (rather than coming back 
to ClassicSimilarity)



RE: SOLR Score Range Changed

2018-02-23 Thread Hodder, Rick
Hi Shawn,

Thanks for your help - I'm still finding my way in the weeds of SOLR.

Combining everything into one query is what I'd prefer because as you said, one 
would think that with everything in the same query, the score would organize 
everything nicely.

>>Assuming you're using the default relevancy sort
Yes

>> does the order of your search results change dramatically from one version 
>> to the other?  If it does, is the order generally better from a relevance 
>> standpoint, or generally worse?  If you are specifying an explicit sort, 
>> then the scores will likely be ignored.

Here's what we do - we have a list of policies with names (among other things, 
but I'll just use names as an example).

We search for several business names to see if we have policies in common with 
the names so that we don’t have too much risk with them.

So let's say I'm doing a search against three business names

Bob's carpentry
Consolidated carpentry of the Greater North West
Carpentry Land

q=(IDX_CompanyName:bob's AND carpentry) OR (IDX_CompanyName: consolidated AND 
carpentry AND of AND the AND Greater AND North AND West) OR (IDX_CompanyName: 
Carpentry AND Land)

Searching for 750 rows returns hits that are all focused on Consolidated
(seemingly because the number of words causes the SOLR score to go up into a
higher range for all Consolidated results, as mentioned in my previous email).
Searching for all 3 things at the same time doesn’t ensure that all 3 companies
will be in the results, even though there are results for all 3 when the
queries are run separately. If I boost maxrows to 4000, I see a few bob's
carpentry but most are still Consolidated.

So the way we had addressed it was running 3 separate SOLR queries and 
combining them and sorting them by descending score - wasn’t perfect, but it 
worked, and helped me to reduce the number of results we hand off to a scoring 
engine that applies 3 algorithms (Monge-Elkan, Jaro-Winkler, and SmithWindowed 
Affline) to further hone the results - which can take LOTS of time if there are 
a lot of results, so 


>> What I am describing is also why it's strongly recommended that you never try 
>> to convert scores to percentages:
>>
>> https://wiki.apache.org/lucene-java/ScoresAsPercentages
>>
>> Thanks,
>> Shawn



Re: SOLR Score Range Changed

2018-02-23 Thread Joël Trigalo
The difference seems to be due to the fact that the default similarity in
Solr 7 is BM25, while it used to be TF-IDF in Solr 4. As you noticed, the
BM25 function is smoother.
You can configure schema.xml to use ClassicSimilarity; see for instance:
https://lucene.apache.org/solr/guide/6_6/major-changes-from-solr-5-to-solr-6.html#default-similarity-changes
https://lucene.apache.org/solr/guide/6_6/field-type-definitions-and-properties.html#FieldTypeDefinitionsandProperties-FieldTypeSimilarity

But as said before, you may be relying on properties that are not guaranteed,
so it would be better to change the score function or the sorting (rather
than going back to ClassicSimilarity).
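
To illustrate what "smoother" means here, the sketch below compares how the
term-frequency part of each formula grows. It assumes the textbook forms,
sqrt(tf) for classic TF-IDF and the saturating BM25 term with the usual
defaults k1=1.2 and b=0.75, and a document of average length; idf, norms
and boosts are left out.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor of classic TF-IDF: sqrt(freq)."""
    return math.sqrt(freq)


def bm25_tf(freq: float, k1: float = 1.2, b: float = 0.75,
            doc_len: float = 100.0, avg_doc_len: float = 100.0) -> float:
    """Saturating BM25 term-frequency factor; bounded above by k1 + 1."""
    length_norm = 1.0 - b + b * doc_len / avg_doc_len
    return freq * (k1 + 1.0) / (freq + k1 * length_norm)


for freq in (1, 2, 4, 16, 64):
    print(f"freq={freq:>2}  classic={classic_tf(freq):5.2f}  bm25={bm25_tf(freq):5.2f}")
```

The classic factor keeps growing with the term frequency, while the BM25
factor levels off below k1 + 1, which is one reason the score ranges of the
two similarities look so different.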

2018-02-22 18:39 GMT+01:00 Shawn Heisey :

> On 2/22/2018 9:50 AM, Hodder, Rick wrote:
>
>> I am migrating from SOLR 4.10.2 to SOLR 7.1.
>>
>> All seems to be going well, except for one thing: the score that is
>> coming back for the resulting documents is giving different scores.
>>
>
> The absolute score has no meaning when you change something -- the index,
> the query, the software version, etc.  You can't compare absolute scores.
>
> What matters is the relative score of one document to another *in the same
> query*.  The amount of difference is almost irrelevant -- the goal of
> Lucene's score calculation gymnastics is to have one document score higher
> than another, so the *order* is reasonably correct.
>
> Assuming you're using the default relevancy sort, does the order of your
> search results change dramatically from one version to the other?  If it
> does, is the order generally better from a relevance standpoint, or
> generally worse?  If you are specifying an explicit sort, then the scores
> will likely be ignored.
>
> What I am describing is also why it's strongly recommended that you never
> try to convert scores to percentages:
>
> https://wiki.apache.org/lucene-java/ScoresAsPercentages
>
> Thanks,
> Shawn
>
>


Re: SOLR Score Range Changed

2018-02-22 Thread Shawn Heisey

On 2/22/2018 9:50 AM, Hodder, Rick wrote:

I am migrating from SOLR 4.10.2 to SOLR 7.1.

All seems to be going well, except for one thing: the score that is coming back 
for the resulting documents is giving different scores.


The absolute score has no meaning when you change something -- the 
index, the query, the software version, etc.  You can't compare absolute 
scores.


What matters is the relative score of one document to another *in the 
same query*.  The amount of difference is almost irrelevant -- the goal 
of Lucene's score calculation gymnastics is to have one document score 
higher than another, so the *order* is reasonably correct.


Assuming you're using the default relevancy sort, does the order of your 
search results change dramatically from one version to the other?  If it 
does, is the order generally better from a relevance standpoint, or 
generally worse?  If you are specifying an explicit sort, then the 
scores will likely be ignored.


What I am describing is also why it's strongly recommended that you 
never try to convert scores to percentages:


https://wiki.apache.org/lucene-java/ScoresAsPercentages

Thanks,
Shawn



SOLR Score Range Changed

2018-02-22 Thread Hodder, Rick
I am migrating from SOLR 4.10.2 to SOLR 7.1.

All seems to be going well, except for one thing: the score that is coming back 
for the resulting documents is giving different scores.

The core uses a schema. Here's the schema info for the field that i am 
searching on:




When searching maxrows=750, fields: *,score

IDX_Company:(cat and scratch)

SOLR 7.1: max score 6.95 and a min of 6.28

SOLR 4.10.2: max score 8.63 and a min of 0.91

IDX_InsuredName:(cat and scratch and fever)

SOLR 7.1: max score 12.99 and a min of 11.25

SOLR 4.10.2: max score 3.97 and a min of 0.77

See how the range of values is different (ranges in 7.1 don't go down to 0.x).
Also notice that the max score doubles when I add one word to the search terms
in 7.1. Most importantly, the ranges in 4.10.2 overlap, but the ones in 7.1
don't.

A little more information to show you how I use this information, and why this 
is causing a problem.

I get a company name like "bobs cabinetry" and another "all american tech 
enterprise"

I run two SOLR queries per company name, I'll call them 1-AND, 1-OR, 2-AND, 
2-OR.

IDX_Company:(bobs AND cabinetry)&fl=*,score,requestid:"1-AND"
IDX_Company:(bobs OR cabinetry)&fl=*,score,requestid:"1-OR"
IDX_Company:(all AND american AND tech AND enterprise)&fl=*,score,requestid:"2-AND"
IDX_Company:(all OR american OR tech OR enterprise)&fl=*,score,requestid:"2-OR"

I combine the results, sort them by descending score, and then take the top
750 rows. (The requestid lets me know which query each result came from.)

Because of the changes in the range of scores, the sort pushes all of the all 
american tech enterprise rows to the top of the results (because there is no
overlap), and when the top 750 are taken, everything for bobs cabinetry is
removed from the results.
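
The effect can be reproduced with a small sketch. The score ranges are the
7.1 figures quoted earlier in this message (6.28 to 6.95 and 11.25 to
12.99), used here as stand-ins for a short and a long query; the evenly
spaced synthetic scores and the helper name are assumptions made for
illustration.

```python
def synthetic_hits(request_id, count, low, high):
    """Evenly spaced fake scores between low and high for one query."""
    step = (high - low) / (count - 1)
    return [(request_id, high - i * step) for i in range(count)]


# 750 hits per query, as in maxrows=750.
short_query = synthetic_hits("1-AND", 750, 6.28, 6.95)
long_query = synthetic_hits("2-AND", 750, 11.25, 12.99)

# Merge, sort by descending score, keep the top 750.
merged = sorted(short_query + long_query, key=lambda hit: hit[1], reverse=True)
top = merged[:750]

# Which queries still have at least one row in the final list?
survivors = {request_id for request_id, _ in top}
print(survivors)
```

Because the two ranges do not overlap, every row kept comes from the
higher-scoring query.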

Is there some config setting I can change to make score calculation act like it 
did in 4.10.2?

Or something else?


Re: Solr score use cases

2017-12-04 Thread alessandro.benedetti
I would like to stress how important what Erick explained is.
A lot of the time people want to use the score to show it to users, to
calculate a probability, or to do other odd calculations.

Score is used to rank results, given a query.
To give a local ordering.
This is the only useful information for the end user.

From an administrator/developer perspective it is different: debugging the
score can be vital, mainly for relevancy tuning and for understanding ranking
bugs.



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr score use cases

2017-12-01 Thread Faraz Fallahi
Thx for the clarification
Best regards

On 01.12.2017 at 18:25, "Erick Erickson" <erickerick...@gmail.com> wrote:

> Sorting certainly ignores scoring, I'm pretty sure it's just not
> calculated in that case.
>
> If your sorting results in multiple documents in the same bin, people
> will combine the primary sort with a secondary sort on score, so in
> that case the score is definitely calculated, i.e. "sort=day asc, score
> desc"
>
> Returning the score with documents is usually for development
> purposes. Scores are _not_ comparable except within a single query, so
> IMO telling users that a doc from one search has a score of X and a
> doc from another search has a score of Y is useless-to-misleading
> information. A score of 2X is _not_ necessarily "twice as good" (or
> even as good) as a score of X in another search.
>
> FWIW,
> Erick
>
> On Fri, Dec 1, 2017 at 6:31 AM, Faraz Fallahi
> <faraz.fall...@googlemail.com> wrote:
> > Or does the Score even get calculated when i sort or Not?
> >
> > On 01.12.2017 at 4:38 PM, "Faraz Fallahi" <
> > faraz.fall...@googlemail.com> wrote:
> >
> >> Oki but If ID Just make an simple query with a "where Claude" and sort
> by
> >> a field i See no sense in calculating a score right?
> >>
> >> On 01.12.2017 at 16:33, "Aman Tandon" <amantandon...@gmail.com> wrote:
> >>
> >>> Hi Faraz,
> >>>
> >>> Solr score which you could retrieved by adding in fl parameter could be
> >>> helpful to understand the following:
> >>>
> >>> 1) search relevance ranking: how much score solr has given to the top &
> >>> second top document, and with debug=true you could better understand
> what
> >>> is causing that score.
> >>>
> >>> 2) You could use the function query to multiply score with some feature
> >>> e.g. paid customers score, popularity score, etc to improve the
> relevance
> >>> as per the business.
> >>>
> >>> I am able to think these few points only, someone can also put more
> light
> >>> if I am missing anything. I hope this is what you want to know. 
> >>>
> >>> Regards,
> >>> Aman
> >>>
> >>> On Dec 1, 2017 13:38, "Faraz Fallahi" <faraz.fall...@googlemail.com>
> >>> wrote:
> >>>
> >>> Hi
> >>>
> >>> A simple question: what are the most common use cases for the solr
> score
> >>> of
> >>> documents retrieved after firing queries?
> >>> I dont have a real understanding of its purpose at the moment.
> >>>
> >>> Thx for helping
> >>>
> >>
>


Re: Solr score use cases

2017-12-01 Thread Erick Erickson
Sorting certainly ignores scoring, I'm pretty sure it's just not
calculated in that case.

If your sorting results in multiple documents in the same bin, people
will combine the primary sort with a secondary sort on score, so in
that case the score is definitely calculated, i.e. "sort=day asc, score
desc".
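
As a rough illustration of what a primary sort with a secondary sort on
score does to the ordering, here is a sketch in plain Python. The documents
and the day values are made up; Solr does this server-side, and the code
only mimics the resulting order.

```python
docs = [
    {"id": "a", "day": 2, "score": 1.7},
    {"id": "b", "day": 1, "score": 0.4},
    {"id": "c", "day": 1, "score": 3.1},
    {"id": "d", "day": 2, "score": 2.2},
]

# Same ordering as "day asc, score desc": ascending on day, and within
# the same day the higher score comes first.
ordered = sorted(docs, key=lambda d: (d["day"], -d["score"]))

print([d["id"] for d in ordered])
```

The score only decides the order among documents that tie on the primary
sort field.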

Returning the score with documents is usually for development
purposes. Scores are _not_ comparable except within a single query, so
IMO telling users that a doc from one search has a score of X and a
doc from another search has a score of Y is useless-to-misleading
information. A score of 2X is _not_ necessarily "twice as good" (or
even as good) as a score of X in another search.

FWIW,
Erick

On Fri, Dec 1, 2017 at 6:31 AM, Faraz Fallahi
<faraz.fall...@googlemail.com> wrote:
> Or does the Score even get calculated when i sort or Not?
>
> On 01.12.2017 at 4:38 PM, "Faraz Fallahi" <
> faraz.fall...@googlemail.com> wrote:
>
>> Oki but If ID Just make an simple query with a "where Claude" and sort by
>> a field i See no sense in calculating a score right?
>>
>> On 01.12.2017 at 16:33, "Aman Tandon" <amantandon...@gmail.com> wrote:
>>
>>> Hi Faraz,
>>>
>>> Solr score which you could retrieved by adding in fl parameter could be
>>> helpful to understand the following:
>>>
>>> 1) search relevance ranking: how much score solr has given to the top &
>>> second top document, and with debug=true you could better understand what
>>> is causing that score.
>>>
>>> 2) You could use the function query to multiply score with some feature
>>> e.g. paid customers score, popularity score, etc to improve the relevance
>>> as per the business.
>>>
>>> I am able to think these few points only, someone can also put more light
>>> if I am missing anything. I hope this is what you want to know. 
>>>
>>> Regards,
>>> Aman
>>>
>>> On Dec 1, 2017 13:38, "Faraz Fallahi" <faraz.fall...@googlemail.com>
>>> wrote:
>>>
>>> Hi
>>>
>>> A simple question: what are the most common use cases for the solr score
>>> of
>>> documents retrieved after firing queries?
>>> I dont have a real understanding of its purpose at the moment.
>>>
>>> Thx for helping
>>>
>>


Re: Solr score use cases

2017-12-01 Thread Faraz Fallahi
Or does the score even get calculated when I sort, or not?

On 01.12.2017 at 4:38 PM, "Faraz Fallahi" <
faraz.fall...@googlemail.com> wrote:

> Oki but If ID Just make an simple query with a "where Claude" and sort by
> a field i See no sense in calculating a score right?
>
> On 01.12.2017 at 16:33, "Aman Tandon" <amantandon...@gmail.com> wrote:
>
>> Hi Faraz,
>>
>> Solr score which you could retrieved by adding in fl parameter could be
>> helpful to understand the following:
>>
>> 1) search relevance ranking: how much score solr has given to the top &
>> second top document, and with debug=true you could better understand what
>> is causing that score.
>>
>> 2) You could use the function query to multiply score with some feature
>> e.g. paid customers score, popularity score, etc to improve the relevance
>> as per the business.
>>
>> I am able to think these few points only, someone can also put more light
>> if I am missing anything. I hope this is what you want to know. 
>>
>> Regards,
>> Aman
>>
>> On Dec 1, 2017 13:38, "Faraz Fallahi" <faraz.fall...@googlemail.com>
>> wrote:
>>
>> Hi
>>
>> A simple question: what are the most common use cases for the solr score
>> of
>> documents retrieved after firing queries?
>> I dont have a real understanding of its purpose at the moment.
>>
>> Thx for helping
>>
>


Re: Solr score use cases

2017-12-01 Thread Faraz Fallahi
Okay, but if I just make a simple query with a "where clause" and sort by a
field, I see no sense in calculating a score, right?

On 01.12.2017 at 16:33, "Aman Tandon" <amantandon...@gmail.com> wrote:

> Hi Faraz,
>
> Solr score which you could retrieved by adding in fl parameter could be
> helpful to understand the following:
>
> 1) search relevance ranking: how much score solr has given to the top &
> second top document, and with debug=true you could better understand what
> is causing that score.
>
> 2) You could use the function query to multiply score with some feature
> e.g. paid customers score, popularity score, etc to improve the relevance
> as per the business.
>
> I am able to think these few points only, someone can also put more light
> if I am missing anything. I hope this is what you want to know. 
>
> Regards,
> Aman
>
> On Dec 1, 2017 13:38, "Faraz Fallahi" <faraz.fall...@googlemail.com>
> wrote:
>
> Hi
>
> A simple question: what are the most common use cases for the solr score of
> documents retrieved after firing queries?
> I dont have a real understanding of its purpose at the moment.
>
> Thx for helping
>


Re: Solr score use cases

2017-12-01 Thread Aman Tandon
Hi Faraz,

The Solr score, which you can retrieve by adding score to the fl parameter,
can be helpful for understanding the following:

1) search relevance ranking: how much score solr has given to the top &
second top document, and with debug=true you could better understand what
is causing that score.

2) You could use the function query to multiply score with some feature
e.g. paid customers score, popularity score, etc to improve the relevance
as per the business.

These are the only points I can think of; someone else may shed more light
if I am missing anything. I hope this is what you want to know.
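
A sketch of how both points might look as request parameters, built here
with Python's standard library. fl=*,score and debug=true are the
parameters mentioned above; the multiplicative boost parameter of the
edismax parser is one way to fold a feature into the score. The host, the
collection name products, the query, and the field popularity are all
hypothetical.

```python
from urllib.parse import urlencode

params = {
    "q": "title:laptop",     # hypothetical query
    "defType": "edismax",    # parser that supports the boost parameter
    "boost": "popularity",   # multiply the score by a (hypothetical) field
    "fl": "*,score",         # return the score with each document
    "debug": "true",         # include the score explanation
}

url = "http://localhost:8983/solr/products/select?" + urlencode(params)
print(url)
```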

Regards,
Aman

On Dec 1, 2017 13:38, "Faraz Fallahi" <faraz.fall...@googlemail.com> wrote:

Hi

A simple question: what are the most common use cases for the solr score of
documents retrieved after firing queries?
I dont have a real understanding of its purpose at the moment.

Thx for helping


Solr score use cases

2017-12-01 Thread Faraz Fallahi
Hi

A simple question: what are the most common use cases for the solr score of
documents retrieved after firing queries?
I don't have a real understanding of its purpose at the moment.

Thx for helping


Re: Modify solr score

2017-04-24 Thread tstusr
We came up with a simple solution.

We use termfreq <https://wiki.apache.org/solr/FunctionQuery#termfreq> and
wrote a simple processor that counts words, so that a boost function can
calculate just the ratio between the words that hit terms and the whole
field length.

Some tests are being run; maybe this will solve the problem.
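
The ratio described above can be sketched as follows. This is not the
processor itself (which is not shown in the thread); the whitespace
tokenization, the sample vocabulary and the function name are assumptions.

```python
def hit_ratio(field_text: str, vocabulary: set[str]) -> float:
    """Fraction of the field's words that match a vocabulary term."""
    words = field_text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in vocabulary)
    return hits / len(words)


topic_terms = {"solr", "lucene", "index"}
related = "Solr builds a Lucene index"
unrelated = "the quick brown fox jumps over the lazy dog"

print(hit_ratio(related, topic_terms))
print(hit_ratio(unrelated, topic_terms))
```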

Thanks for your help!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331614.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Modify solr score

2017-04-22 Thread Erik Hatcher
This may be suggesting a solution that is too experimental, or using the wrong
hammer for the job, but to me it sounds like you could use “payloads” for this
type of ranking of a term's relationship to a document.

See SOLR-1485 for the recent work I’ve been doing (and aim to get committed 
soon).   You could index documents in this way:

   id, weighted_terms_dpf
   1, A|5.0 B|95.0
   2, A|88.7 B|0.1

And then search for “A” and use the 88.7 value to factor into the score or 
sorting.  
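
To show the shape of the data, here is a sketch that parses the delimited
values from the example above and looks up the weight for a term. It only
mimics the idea in plain Python; it is not the SOLR-1485 implementation,
and the function name is hypothetical.

```python
def parse_weighted_terms(value: str) -> dict[str, float]:
    """Turn 'A|5.0 B|95.0' into {'A': 5.0, 'B': 95.0}."""
    weights = {}
    for token in value.split():
        term, _, weight = token.partition("|")
        weights[term] = float(weight)
    return weights


docs = {
    1: parse_weighted_terms("A|5.0 B|95.0"),
    2: parse_weighted_terms("A|88.7 B|0.1"),
}

# Rank documents for the term "A" by its stored weight, highest first.
ranked = sorted(docs, key=lambda doc_id: docs[doc_id].get("A", 0.0), reverse=True)
print(ranked)
```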

Erik



> On Apr 21, 2017, at 12:35 PM, tstusr <ulfrhe...@gmail.com> wrote:
> 
> Since we report the score, we think there will be some relation between them.
> As far as we know scoring (and then ranking) are calculated based on tf-idf.
> 
> What we want to do is to make a qualitative ranking, it means, according to
> one topic we will tag documents as "very related", "fairly related" or "poor
> related". So, we select some documents completely unrelated to a topic.
> 
> On a very related document we found a ratio of ~2% of words that reports
> ~0.85 of score (what we think is related to ranking). On a test document we
> found a ratio of less than 0.01% and the score is heigher than the first
> one. What we expect is that documents not related (those ones with less
> ratio) report lower scores so we can then use them as minimum and create the
> scale.
> 
> We came with multiply (of affect in some way) the default rank solr provide
> us with the ratio of documents so unrelated documents will be penalized
> while those with higher ratio values will be overrated.
> 
> Greetings, and thanks for your help.
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331315.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Modify solr score

2017-04-21 Thread Rick Leir
Ulf: Maybe there is a way you could filter out the unrelated documents. Qf?
Rick

On April 21, 2017 2:18:59 PM EDT, tstusr <ulfrhe...@gmail.com> wrote:
>Well, I know they can change.
>
>I think, the main problem here it that (in this point) documents
>completely
>unrelated to a topic are being ranked as high as documents related. So,
>in
>order to penalize them we are trying to use the ratio or term
>frequency/word
>length.
>
>Nevertheless we aren't able to find a practical way to make it.
>
>Greetings.
>
>
>
>--
>View this message in context:
>http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331342.html
>Sent from the Solr - User mailing list archive at Nabble.com.

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Modify solr score

2017-04-21 Thread tstusr
Well, I know they can change.

I think the main problem here is that (at this point) documents completely
unrelated to a topic are being ranked as high as related documents. So, in
order to penalize them, we are trying to use the ratio of term frequency/word
length.

Nevertheless, we haven't been able to find a practical way to do it.

Greetings.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331342.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Modify solr score

2017-04-21 Thread Walter Underwood
Using a minimum score cut off does not work. The score is not an absolute 
estimate of relevance.

The idf component of the score is a whole-corpus metric. When you add or delete 
documents, the scores for the exact same query can change.
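
To make the whole-corpus point concrete, here is a minimal sketch. It assumes a textbook smoothed idf, not necessarily the exact formula of any particular Lucene similarity, and the corpus sizes are made up. The document frequency of the term stays the same, yet the idf (and with it the score) moves when the corpus grows.

```python
import math


def idf(num_docs: int, doc_freq: int) -> float:
    """Textbook smoothed idf (an assumption, not Lucene's exact formula)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Same term, same number of documents containing it (hypothetical values)...
before = idf(num_docs=1_000, doc_freq=9)
# ...but the corpus has doubled in size in the meantime.
after = idf(num_docs=2_000, doc_freq=9)

print(f"idf before: {before:.3f}, idf after: {after:.3f}")
```

Any fixed cutoff chosen against the first value would sit in a different place relative to the second, even though neither the query nor the matching documents changed.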

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Apr 21, 2017, at 10:18 AM, tstusr <ulfrhe...@gmail.com> wrote:
> 
> Well, maybe I explained it wrong.
> 
> We have entry points, each of which is related to a topic. This means that when
> we select the first topic, all information has to be related in some way to
> that topic's vocabulary. So it can work, since we select documents unrelated to
> the vocabulary of each entry point in order to establish a minimum threshold;
> that is why we are trying to use the hit ratio to modify the score.
> 
> After we rank on those topics, all the remaining work is about faceting, word
> selection and so on.
> 
> Greeting
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331331.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Modify solr score

2017-04-21 Thread tstusr
Well, maybe I explained it wrong.

We have entry points, each of which is related to a topic. This means that when
we select the first topic, all information has to be related in some way to
that topic's vocabulary. So it can work, since we select documents unrelated to
the vocabulary of each entry point in order to establish a minimum threshold;
that is why we are trying to use the hit ratio to modify the score.

After we rank on those topics, all the remaining work is about faceting, word
selection and so on.

Greeting



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331331.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Modify solr score

2017-04-21 Thread Walter Underwood
It isn’t going to work. The score is not an absolute relevance measurement. It 
only says that the first document is more relevant than the second, and so on.

Scores are not comparable between different queries. The score cannot be used 
to say that the first hit for query A is a better match than the first hit for 
query B.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Apr 21, 2017, at 9:35 AM, tstusr <ulfrhe...@gmail.com> wrote:
> 
> Since we report the score, we think there will be some relation between them.
> As far as we know, scoring (and then ranking) is calculated based on tf-idf.
> 
> What we want to do is a qualitative ranking; that is, for a given
> topic we will tag documents as "very related", "fairly related" or "poorly
> related". So, we selected some documents completely unrelated to a topic.
> 
> In a closely related document we found a ratio of ~2% of words, which gives
> a score of ~0.85 (which we think is related to ranking). In a test document we
> found a ratio of less than 0.01%, and the score is higher than the first
> one. What we expect is that unrelated documents (those with a lower
> ratio) report lower scores, so we can then use them as a minimum and create the
> scale.
> 
> We came up with multiplying (or adjusting in some way) the default rank Solr
> gives us by this ratio, so that unrelated documents will be penalized
> while those with higher ratio values will be boosted.
> 
> Greetings, and thanks for your help.
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331315.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Modify solr score

2017-04-21 Thread tstusr
Since we report the score, we think there will be some relation between them.
As far as we know, scoring (and then ranking) is calculated based on tf-idf.

What we want to do is a qualitative ranking; that is, for a given
topic we will tag documents as "very related", "fairly related" or "poorly
related". So, we selected some documents completely unrelated to a topic.

In a closely related document we found a ratio of ~2% of words, which gives
a score of ~0.85 (which we think is related to ranking). In a test document we
found a ratio of less than 0.01%, and the score is higher than the first
one. What we expect is that unrelated documents (those with a lower
ratio) report lower scores, so we can then use them as a minimum and create the
scale.

We came up with multiplying (or adjusting in some way) the default rank Solr
gives us by this ratio, so that unrelated documents will be penalized
while those with higher ratio values will be boosted.

Greetings, and thanks for your help.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331315.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Modify solr score

2017-04-21 Thread alessandro.benedetti
It has been discussed countless times: never rely on score values.
Rely on the ranking of your results.
It seems you model a topic as a list of keywords and then you just run a
query for each topic.
Essentially, for you, a topic is a query.

The ranking of your results will already be affected by how many times
(term frequency) such keywords appear in the results.
You can even try different query parsers (such as dismax/edismax) and
tune the mm percentage to establish how strict you want your results
to be in relation to the input query [1].
Can you elaborate on the way you would like to customize the score?
Which factor would you like to modify?
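
As a rough illustration of the mm suggestion, the sketch below builds request parameters for an edismax query in which at least 75% of the topic keywords must match. The field name `text`, the keyword list and the 75% value are assumptions made up for this example; only `defType`, `q`, `qf`, `mm` and `fl` are standard request parameters.

```python
from urllib.parse import urlencode


def edismax_params(keywords, qf="text", mm="75%"):
    """Build edismax parameters; qf and mm defaults are illustrative only."""
    return {
        "defType": "edismax",
        "q": " ".join(keywords),  # the topic, expressed as a keyword query
        "qf": qf,                 # hypothetical field holding the document text
        "mm": mm,                 # share of optional clauses that must match
        "fl": "id,score",
    }


params = edismax_params(["solar", "panel", "inverter", "grid"])
query_string = urlencode(params)
print(query_string)
```

Raising mm makes the result set stricter without touching the score itself, which is the point of the advice above.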

Cheers

[1]
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331310.html
Sent from the Solr - User mailing list archive at Nabble.com.


Modify solr score

2017-04-21 Thread tstusr
Hi.

We are building an application that searches for certain specific topics: the
more captured words in a document, the higher the score.

We have 2 test scenarios. The first one has documents that users tagged
as relevant, and the other one contains documents outside our domain.

In the first scenario, we see ratios of 1-2% of captured terms
against all document words. For the second scenario, we see ratios of
less than 0.005%.

Nevertheless, scores remain almost equal: ~0.85 for the first scenario and ~0.8
for the latter one.


So what we want is to decrease the score we report for the latter scenario
in some way according to the percentage of words captured.


Is there any way to store those values in a field in order to use them as a
query boost? Or any way to override the default score calculation to change
relevancy?
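
One way to picture the idea is the sketch below, which multiplies each score by a per-document ratio. It assumes the ratio has been precomputed and stored in a numeric field; the name `hit_ratio` is hypothetical, as are the sample keywords. The edismax `boost` parameter is a multiplicative function boost, so the request parameters at the end are one plausible way to get the same effect inside Solr, not a confirmed answer from this thread.

```python
def boosted(score: float, hit_ratio: float) -> float:
    """Multiply the relevance score by the stored per-document ratio."""
    return score * hit_ratio


# Figures from the message above: ~0.85 at a 2% ratio, ~0.8 at 0.005%.
related = boosted(0.85, 0.02)
unrelated = boosted(0.80, 0.00005)
print(related, unrelated)

# Hypothetical request parameters; "hit_ratio" is an assumed field name.
solr_params = {
    "defType": "edismax",
    "q": "solar panel",
    "boost": "hit_ratio",  # multiplies the score by the field value
    "fl": "id,score",
}
```

This changes the ordering within one result list. It does not turn the score into an absolute relevance measure, so a fixed cutoff on the boosted value has the same limitations as one on the raw score.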


Thanks in advance...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Solr debug 'explain' values differ from the Solr score

2016-03-20 Thread Rick Sullivan
I still have the problem even without using the phonetic field. 

For example, the following query will result in some exact name matches having 
scores of 4.64, while others get 2.32. All debug info has final values of 4.64.

    =( ( (firstName:john~)^0.5 (firstName:john) )^4)

I expect all exact matches to score the same, as the debug response seems to 
indicate they should be. 

I'm not having success reproducing the issue on a small amount of exported data 
indexed using post.jar. The issue still appears when I reduced the data pulled 
by the DIH to only the first 1,000,000 first names, however. Could this be due 
to some indexing issue with the DIH?

Thanks,
-Rick


> Date: Tue, 15 Mar 2016 15:40:18 -0700
> From: hossman_luc...@fucit.org
> To: solr-user@lucene.apache.org
> Subject: RE: Solr debug 'explain' values differ from the Solr score
>
>
> Sounds like a mismatch in the way the BooleanQuery explanation generation
> code is handling situations where there is/isn't a coord factor involved
> in computing the score itself. (the bug is almost certainly in the
> "explain" code, since that is less rigorously tested in most cases, and
> the score itself is probably correct)
>
> I tried to trivially reproduce the symptoms you described using the
> techproducts example and was unable to generate a discrepancy using a
> simple boolean query w/a fuzzy clause...
>
> http://localhost:8983/solr/techproducts/query?q=ipod~%20belkin=id,name,score=query=results=true
>
> ...can you distill one of your problematic queries down to a
> shorter/simpler reproducible example, and/or provide us with the field &
> fieldType details for all of the fields used in your example?
>
> (i'm guessing it probably relates to your firstName_phonetic field?)
>
>
>
> : Date: Tue, 15 Mar 2016 13:17:04 -0700
> : From: Rick Sullivan <r...@ricksullivan.net>
> : Reply-To: solr-user@lucene.apache.org
> : To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> : Subject: RE: Solr debug 'explain' values differ from the Solr score
> :
> : After some digging and experimentation, here are some more details on the 
> issue I'm seeing.
> :
> :
> : 1. The adjusted documents' scores are always exactly (debug_score/N), where 
> N is the number of OR items in the query.
> :
> : For example, `=firstName:gabby~ firstName_phonetic:gabby 
> firstName_tokens:(gabby)` will result in some of the documents with 
> firstName==GABBY receiving a score 1/3 of the score of other GABBY documents, 
> even though the debug explanation shows that they generated the same score.
> :
> :
> : 2. This doesn't appear to be a brand new issue, or an issue with SolrCloud.
> :
> : I've tested the problem using SolrCloud 5.5.0, Solr 5.5.0 (not cloud), and 
> Solr 5.4.1.
> :
> :
> : Anyone have any ideas?
> :
> : Thanks,
> : -Rick
> :
> : From: r...@ricksullivan.net
> : To: solr-user@lucene.apache.org
> : Subject: Solr debug 'explain' values differ from the Solr score
> : Date: Thu, 10 Mar 2016 08:34:30 -0800
> :
> : Hi,
> :
> : I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the 
> debug response don't always correspond with the scores Solr assigns to the 
> matched documents.
> :
> : For example, here is the top-level debug information for two documents 
> matched by a query:
> :
> : 114628: Object
> : description: "sum of:"
> : details: Array[2]
> : match: true
> : value: 20.542768
> :
> : 357547: Object
> : description: "sum of:"
> : details: Array[2]
> : match: true
> : value: 26.517654
> :
> : But they have scores
> :
> : 114628: 20.542767
> : 357547: 13.258826
> :
> : I expect the second document to be the most relevant for my query, and the 
> debug values seem to agree. However, in the final score I receive, that 
> document's score has been adjusted down.
> :
> : The relevant debug response information can be found here: 
> http://apaste.info/mju
> :
> : Does anyone have an idea why the Solr score may differ from the debug value?
> :
> : Thanks,
> : -Rick
>
> -Hoss
> http://www.lucidworks.com/
  

RE: Solr debug 'explain' values differ from the Solr score

2016-03-15 Thread Chris Hostetter

Sounds like a mismatch in the way the BooleanQuery explanation generation 
code is handling situations where there is/isn't a coord factor involved 
in computing the score itself.  (the bug is almost certainly in the 
"explain" code, since that is less rigorously tested in most cases, and 
the score itself is probably correct)

I tried to trivially reproduce the symptoms you described using the 
techproducts example and was unable to generate a discrepancy using a 
simple boolean query w/a fuzzy clause...

http://localhost:8983/solr/techproducts/query?q=ipod~%20belkin=id,name,score=query=results=true

...can you distill one of your problematic queries down to a 
shorter/simpler reproducible example, and/or provide us with the field & 
fieldType details for all of the fields used in your example?

(i'm guessing it probably relates to your firstName_phonetic field?)



: Date: Tue, 15 Mar 2016 13:17:04 -0700
: From: Rick Sullivan <r...@ricksullivan.net>
: Reply-To: solr-user@lucene.apache.org
: To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
: Subject: RE: Solr debug 'explain' values differ from the Solr score
: 
: After some digging and experimentation, here are some more details on the 
issue I'm seeing.
: 
: 
: 1. The adjusted documents' scores are always exactly (debug_score/N), where N 
is the number of OR items in the query. 
: 
: For example, `=firstName:gabby~ firstName_phonetic:gabby 
firstName_tokens:(gabby)` will result in some of the documents with 
firstName==GABBY receiving a score 1/3 of the score of other GABBY documents, 
even though the debug explanation shows that they generated the same score.
: 
: 
: 2. This doesn't appear to be a brand new issue, or an issue with SolrCloud.
: 
: I've tested the problem using SolrCloud 5.5.0, Solr 5.5.0 (not cloud), and 
Solr 5.4.1.
: 
: 
: Anyone have any ideas?
: 
: Thanks,
: -Rick
: 
: From: r...@ricksullivan.net
: To: solr-user@lucene.apache.org
: Subject: Solr debug 'explain' values differ from the Solr score
: Date: Thu, 10 Mar 2016 08:34:30 -0800
: 
: Hi,
: 
: I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the 
debug response don't always correspond with the scores Solr assigns to the 
matched documents.
: 
: For example, here is the top-level debug information for two documents 
matched by a query:
: 
: 114628: Object
:   description: "sum of:"
:   details: Array[2]
:   match: true
:   value: 20.542768
: 
: 357547: Object
:   description: "sum of:"
:   details: Array[2]
:   match: true
:   value: 26.517654
: 
: But they have scores
: 
: 114628: 20.542767
: 357547: 13.258826
: 
: I expect the second document to be the most relevant for my query, and the 
debug values seem to agree. However, in the final score I receive, that 
document's score has been adjusted down.
: 
: The relevant debug response information can be found here: 
http://apaste.info/mju
: 
: Does anyone have an idea why the Solr score may differ from the debug value?
: 
: Thanks,
: -Rick   

-Hoss
http://www.lucidworks.com/


RE: Solr debug 'explain' values differ from the Solr score

2016-03-15 Thread Rick Sullivan
After some digging and experimentation, here are some more details on the issue 
I'm seeing.


1. The adjusted documents' scores are always exactly (debug_score/N), where N 
is the number of OR items in the query. 

For example, `=firstName:gabby~ firstName_phonetic:gabby 
firstName_tokens:(gabby)` will result in some of the documents with 
firstName==GABBY receiving a score 1/3 of the score of other GABBY documents, 
even though the debug explanation shows that they generated the same score.
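
The debug_score/N observation can be checked against the two documents from the original report quoted in this message. This is only a sketch: the tolerance is an assumption chosen to absorb the last-digit differences in the reported values.

```python
import math

# Top-level explain values and returned scores from the original report.
explain_values = {"114628": 20.542768, "357547": 26.517654}
returned_scores = {"114628": 20.542767, "357547": 13.258826}


def is_divided_by(explain_value, score, n, rel_tol=1e-6):
    """True if explain_value / score is (almost exactly) n."""
    return math.isclose(explain_value / score, n, rel_tol=rel_tol)


for doc_id, explain_value in explain_values.items():
    ratio = explain_value / returned_scores[doc_id]
    print(doc_id, round(ratio, 4))
```

Document 357547 comes out at a ratio of 2, matching the two top-level clauses (`details: Array[2]`), while 114628 is returned unchanged.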


2. This doesn't appear to be a brand new issue, or an issue with SolrCloud.

I've tested the problem using SolrCloud 5.5.0, Solr 5.5.0 (not cloud), and Solr 
5.4.1.


Anyone have any ideas?

Thanks,
-Rick

From: r...@ricksullivan.net
To: solr-user@lucene.apache.org
Subject: Solr debug 'explain' values differ from the Solr score
Date: Thu, 10 Mar 2016 08:34:30 -0800

Hi,

I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the debug 
response don't always correspond with the scores Solr assigns to the matched 
documents.

For example, here is the top-level debug information for two documents matched 
by a query:

114628: Object
  description: "sum of:"
  details: Array[2]
  match: true
  value: 20.542768

357547: Object
  description: "sum of:"
  details: Array[2]
  match: true
  value: 26.517654

But they have scores

114628: 20.542767
357547: 13.258826

I expect the second document to be the most relevant for my query, and the 
debug values seem to agree. However, in the final score I receive, that 
document's score has been adjusted down.

The relevant debug response information can be found here: 
http://apaste.info/mju

Does anyone have an idea why the Solr score may differ from the debug value?

Thanks,
-Rick 

Solr debug 'explain' values differ from the Solr score

2016-03-10 Thread Rick Sullivan
Hi,
I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the debug 
response don't always correspond with the scores Solr assigns to the matched 
documents.

For example, here is the top-level debug information for two documents matched 
by a query:

114628: Object
  description: "sum of:"
  details: Array[2]
  match: true
  value: 20.542768

357547: Object
  description: "sum of:"
  details: Array[2]
  match: true
  value: 26.517654

But they have scores

114628: 20.542767
357547: 13.258826

I expect the second document to be the most relevant for my query, and the 
debug values seem to agree. However, in the final score I receive, that 
document's score has been adjusted down.

The relevant debug response information can be found here: 
http://apaste.info/mju

Does anyone have an idea why the Solr score may differ from the debug value?

Thanks,
-Rick

Re: solr score threashold

2016-01-20 Thread Walter Underwood
The ScoresAsPercentages page is not really instructions for how to normalize 
scores. It is an explanation of why a score threshold does not do what you want.

Don’t use thresholds. If you want thresholds, you will need a search engine 
with a probabilistic model, like Verity K2. Those generally give worse results 
than a vector space model, but you can have thresholds.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Jan 20, 2016, at 5:11 AM, Emir Arnautovic <emir.arnauto...@sematext.com> 
> wrote:
> 
> Hi Sara,
> You can use func and frange to achieve what you need, but note that scores are
> not normalized, meaning a score of 8 does not mean it is a good match - it is
> just the best match. There are examples online of how to normalize scores (e.g.
> http://wiki.apache.org/lucene-java/ScoresAsPercentages).
> Another approach is to write a custom component that will filter out docs
> below some threshold.
> 
> Thanks,
> Emir
> 
> On 20.01.2016 13:58, sara hajili wrote:
>> hi all,
>> I want to know about the Solr search relevancy scoring threshold.
>> Can I change it?
>> I mean, imagine that when searching I get this result:
>> doc1 score=8
>> doc2 score=6.4
>> doc3 score=6
>> doc8 score=5.5
>> doc5 score=2
>> I want to change the Solr score threshold. That is, I set the threshold to,
>> for example, >4
>> and then I don't get doc5 as a result. Can I do this? If yes, how?
>> And if not, how can I modify the search so that I don't get docs
>> that are a long way from the doc with the max score?
>> In other words, I want to remove this gap between Solr results.
>> 
> 
> -- 
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
> 



Re: solr score threashold

2016-01-20 Thread Doug Turnbull
What problem are you trying to solve?

If you're trying to cut out "bad" results, I might suggest explicitly using
filters that eliminate undesirable search items in terms that are
meaningful to how your users evaluate relevance.

For example, let's say your users only want items that have at least one
match in the title. One natural way to do this is to create a filter query
like fq={!edismax qf=title mm=1 v=$q} (where q is the user's plaintext
query). That's just an example, maybe you'd like to have some other
criteria for cutting out poor results? Use a filter query and express what
you need to trim out to Solr :)
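
The filter above can be assembled into a full request roughly as follows. This is a sketch: the `body` field in `qf` and the sample query are assumptions for illustration, while the `fq` value is the one given in the message.

```python
from urllib.parse import urlencode


def title_must_match_params(user_query):
    """Rank on several fields, but keep only docs with a title match."""
    return [
        ("q", user_query),
        ("defType", "edismax"),
        ("qf", "title body"),  # "body" is a hypothetical second field
        ("fq", "{!edismax qf=title mm=1 v=$q}"),  # filter from the message
        ("fl", "id,score"),
    ]


params = title_must_match_params("ipod belkin")
query_string = urlencode(params)
print(query_string)
```

Because the condition lives in fq, it trims the result set without changing how the remaining documents are scored.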

-Doug




On Wed, Jan 20, 2016 at 7:58 AM, sara hajili <hajili.s...@gmail.com> wrote:

> hi all,
> I want to know about the Solr search relevancy scoring threshold.
> Can I change it?
> I mean, imagine that when searching I get this result:
> doc1 score=8
> doc2 score=6.4
> doc3 score=6
> doc8 score=5.5
> doc5 score=2
> I want to change the Solr score threshold. That is, I set the threshold to,
> for example, >4
> and then I don't get doc5 as a result. Can I do this? If yes, how?
> And if not, how can I modify the search so that I don't get docs
> that are a long way from the doc with the max score?
> In other words, I want to remove this gap between Solr results.
>



-- 
*Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections
<http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>


solr score threashold

2016-01-20 Thread sara hajili
hi all,
I want to know about the Solr search relevancy scoring threshold.
Can I change it?
I mean, imagine that when searching I get this result:
doc1 score=8
doc2 score=6.4
doc3 score=6
doc8 score=5.5
doc5 score=2
I want to change the Solr score threshold. That is, I set the threshold to,
for example, >4
and then I don't get doc5 as a result. Can I do this? If yes, how?
And if not, how can I modify the search so that I don't get docs
that are a long way from the doc with the max score?
In other words, I want to remove this gap between Solr results.


Re: solr score threashold

2016-01-20 Thread Emir Arnautovic

Hi Sara,
You can use func and frange to achieve what you need, but note that scores are 
not normalized, meaning a score of 8 does not mean it is a good match - it is 
just the best match. There are examples online of how to normalize scores (e.g. 
http://wiki.apache.org/lucene-java/ScoresAsPercentages).
Another approach is to write a custom component that will filter out docs 
below some threshold.
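
A minimal sketch of the frange approach: the filter query keeps only documents whose score for the main query is at least some lower bound. The cutoff of 4 comes from the question; the sample query is made up. Since scores are not normalized, a cutoff tuned for one query will generally not carry over to another.

```python
from urllib.parse import urlencode


def score_floor_params(user_query, min_score):
    """Build params that drop docs scoring below min_score for the main query."""
    return [
        ("q", user_query),
        # frange filters on a function value; query($q) re-evaluates the
        # main query as a function so its score can be range-checked.
        ("fq", f"{{!frange l={min_score}}}query($q)"),
        ("fl", "id,score"),
    ]


params = score_floor_params("ipod belkin", 4)
query_string = urlencode(params)
print(dict(params)["fq"])
```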


Thanks,
Emir

On 20.01.2016 13:58, sara hajili wrote:

hi all,
I want to know about the Solr search relevancy scoring threshold.
Can I change it?
I mean, imagine that when searching I get this result:
doc1 score=8
doc2 score=6.4
doc3 score=6
doc8 score=5.5
doc5 score=2
I want to change the Solr score threshold. That is, I set the threshold to,
for example, >4
and then I don't get doc5 as a result. Can I do this? If yes, how?
And if not, how can I modify the search so that I don't get docs
that are a long way from the doc with the max score?
In other words, I want to remove this gap between Solr results.



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Solr score distribution usage

2015-09-08 Thread Ashish Mukherjee
Hello,

I would like to use the Solr score distribution to pick up most relevant
documents from the search result. Rather than top n results, I am
interested only in picking up the most relevant based on statistical
distribution of the scores.

A brief study of some sample searches (the most frequently searched terms)
on my data-set shows that the mode and median scores seem to coincide or be
very close together. Is this the kind of trend which is generally observed
in Solr (though I understand variations on specific searches)? Hence, I was
considering using statistical mode as the threshold above which I use the
documents from the result.
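
For discussion, the proposed cutoff could be prototyped client-side along these lines. This is a sketch under assumptions: scores are continuous, so the mode is taken over fixed-width bins (the 0.5 bin width is arbitrary), ties go to the lowest bin, and the sample scores are invented.

```python
from collections import Counter
from statistics import median


def binned_mode(scores, bin_width=0.5):
    """Lower edge of the most populated score bin (lowest bin wins ties)."""
    bins = Counter(int(s // bin_width) for s in scores)
    top_count = max(bins.values())
    lowest_top_bin = min(b for b, c in bins.items() if c == top_count)
    return lowest_top_bin * bin_width


def above_threshold(scores, threshold):
    """Keep only the scores at or above the threshold."""
    return [s for s in scores if s >= threshold]


scores = [8.0, 6.4, 6.0, 5.5, 2.0]  # invented sample result scores
threshold = binned_mode(scores)
kept = above_threshold(scores, threshold)
print(threshold, median(scores), kept)
```

The threshold has to be recomputed for every query, because the score distribution of one query says nothing about that of another.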

Has anyone done something like this before or would like to critique my
approach?

Regards,
Ashish


Re: Include Solr score into a ranking algorithm

2014-11-20 Thread Mikhail Khludnev
Hello Nicholas!
You can specify a function query as the main query, where you can operate on
DocValues (DVs); you can then use the regular tf-idf score from an arbitrary
query as one of the arguments of the function query. See the example at
http://wiki.apache.org/solr/FunctionQuery#query
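
A sketch of what such a request might look like for the first two terms of the formula. The field name `v1`, the parameter names `qq`, `x1` and `x2`, and the sample values are all assumptions for illustration; `query()`, `mul()`, `sum()` and `field()` are standard function queries.

```python
from urllib.parse import urlencode


def ranking_query_params(user_query, x1, x2):
    """Function query as main query: score(qq) * x1 + v1 * x2."""
    return {
        # query($qq) yields the relevance score of the sub-query qq;
        # field(v1) reads the hypothetical numeric field v1.
        "q": "{!func}sum(mul(query($qq),$x1),mul(field(v1),$x2))",
        "qq": f"{{!edismax}}{user_query}",
        "x1": str(x1),
        "x2": str(x2),
        "fl": "id,score",
    }


params = ranking_query_params("ipod belkin", 0.7, 0.3)
print(urlencode(params))
```

Note that a function query used as the main query matches every document, so restricting the result set is left to filter queries.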

have a good research!

On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com
wrote:

 Hi,

 Currently, I'm trying to implement a ranking algorithm on Solr to include
 TFIDFSimilarity score into a formula.

 Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 *
 Xn

 Basically, the values of Vn are stored in DocValues, I can access them in
 customized Function Query. The Xn are parameters I will pass to the
 Function Query.

 I searched on internet and dig a little bit in the Solr/Lucene source code.
 I found there is no way to access TFIDFSimilarity Score in Function Query.
 (Please correct me if I'm wrong.)

 So, I'm wondering is it possible to do my ranking algorithm in Solr/Lucene?

 --
 Nicholas Ding




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Re: Include Solr score into a ranking algorithm

2014-11-20 Thread Nicholas Ding
Hi Mikhail,

Thank you very much! I'm using eDisMax by default, I think I will need to
change it to defType=func and pass all the query parameters (fq mainly) to
the sub query right?

Nicholas Ding


On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 Hello Nicholas!
 you can specify a function query as a main query where you can operate with
 DVs, then you can use regular tfidf score from arbitrary query as one of
 the arguments in the functional query see an example in
 http://wiki.apache.org/solr/FunctionQuery#query

 have a good research!

 On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com
 wrote:

  Hi,
 
  Currently, I'm trying to implement a ranking algorithm on Solr to include
  TFIDFSimilarity score into a formula.
 
  Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 *
  Xn
 
  Basically, the values of Vn are stored in DocValues, I can access them in
  customized Function Query. The Xn are parameters I will pass to the
  Function Query.
 
  I searched on internet and dig a little bit in the Solr/Lucene source
 code.
  I found there is no way to access TFIDFSimilarity Score in Function
 Query.
  (Please correct me if I'm wrong.)
 
  So, I'm wondering is it possible to do my ranking algorithm in
 Solr/Lucene?
 
  --
  Nicholas Ding
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
 mkhlud...@griddynamics.com



Re: Include Solr score into a ranking algorithm

2014-11-20 Thread Ahmet Arslan
Hi Nicholas,

you can use sort by function feature of solr.

sort=sum(
mul(query(field:TfIdfQuery),x1),
mul(v1,x2))



On Thursday, November 20, 2014 4:23 PM, Nicholas Ding nicholas...@gmail.com 
wrote:
Hi Mikhail,

Thank you very much! I'm using eDisMax by default, I think I will need to
change it to defType=func and pass all the query parameters (fq mainly) to
the sub query right?

Nicholas Ding



On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 Hello Nicholas!
 you can specify a function query as a main query where you can operate with
 DVs, then you can use regular tfidf score from arbitrary query as one of
 the arguments in the functional query see an example in
 http://wiki.apache.org/solr/FunctionQuery#query

 have a good research!

 On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com
 wrote:

  Hi,
 
  Currently, I'm trying to implement a ranking algorithm on Solr to include
  TFIDFSimilarity score into a formula.
 
  Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 *
  Xn
 
  Basically, the values of Vn are stored in DocValues, I can access them in
  customized Function Query. The Xn are parameters I will pass to the
  Function Query.
 
  I searched on internet and dig a little bit in the Solr/Lucene source
 code.
  I found there is no way to access TFIDFSimilarity Score in Function
 Query.
  (Please correct me if I'm wrong.)
 
  So, I'm wondering is it possible to do my ranking algorithm in
 Solr/Lucene?
 
  --
  Nicholas Ding
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
 mkhlud...@griddynamics.com



Re: Include Solr score into a ranking algorithm

2014-11-20 Thread Mikhail Khludnev
On Thu, Nov 20, 2014 at 5:23 PM, Nicholas Ding nicholas...@gmail.com
wrote:

 Hi Mikhail,

 Thank you very much! I'm using eDisMax by default, I think I will need to
 change it to defType=func and


I wonder why do you ask, because the given link has three examples of
including edismax into the simple calculation.

pass all the query parameters (fq mainly) to
 the sub query right?


this one particularly doesn't seem right to me. fq is fq, keep them as is.


 Nicholas Ding


 On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev 
 mkhlud...@griddynamics.com wrote:

  Hello Nicholas!
  you can specify a function query as a main query where you can operate
 with
  DVs, then you can use regular tfidf score from arbitrary query as one of
  the arguments in the functional query see an example in
  http://wiki.apache.org/solr/FunctionQuery#query
 
  have a good research!
 
  On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com
  wrote:
 
   Hi,
  
   Currently, I'm trying to implement a ranking algorithm on Solr to
 include
   TFIDFSimilarity score into a formula.
  
   Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . +
 Vn-1 *
   Xn
  
   Basically, the values of Vn are stored in DocValues, I can access them
 in
   customized Function Query. The Xn are parameters I will pass to the
   Function Query.
  
   I searched on internet and dig a little bit in the Solr/Lucene source
  code.
   I found there is no way to access TFIDFSimilarity Score in Function
  Query.
   (Please correct me if I'm wrong.)
  
   So, I'm wondering is it possible to do my ranking algorithm in
  Solr/Lucene?
  
   --
   Nicholas Ding
  
 
 
 
  --
  Sincerely yours
  Mikhail Khludnev
  Principal Engineer,
  Grid Dynamics
 
  http://www.griddynamics.com
  mkhlud...@griddynamics.com
 




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Re: Include Solr score into a ranking algorithm

2014-11-20 Thread Nicholas Ding
Thank you so much, Mikhail! It works perfectly.

On Thu, Nov 20, 2014 at 12:54 PM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 On Thu, Nov 20, 2014 at 5:23 PM, Nicholas Ding nicholas...@gmail.com
 wrote:

  Hi Mikhail,
 
  Thank you very much! I'm using eDisMax by default, I think I will need to
  change it to defType=func and


 I wonder why do you ask, because the given link has three examples of
 including edismax into the simple calculation.

 pass all the query parameters (fq mainly) to
  the sub query right?
 

 this one particularly doesn't seem right to me. fq is fq, keep them as is.

 
  Nicholas Ding
 
 
  On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev 
  mkhlud...@griddynamics.com wrote:
 
   Hello Nicholas!
   you can specify a function query as a main query where you can operate
  with
   DVs, then you can use regular tfidf score from arbitrary query as one
 of
   the arguments in the functional query see an example in
   http://wiki.apache.org/solr/FunctionQuery#query
  
   have a good research!
  
   On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com
   wrote:
  
Hi,
   
Currently, I'm trying to implement a ranking algorithm on Solr to
  include
TFIDFSimilarity score into a formula.
   
Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . +
  Vn-1 *
Xn
   
Basically, the values of Vn are stored in DocValues, I can access
 them
  in
customized Function Query. The Xn are parameters I will pass to the
Function Query.
   
I searched on internet and dig a little bit in the Solr/Lucene source
   code.
I found there is no way to access TFIDFSimilarity Score in Function
   Query.
(Please correct me if I'm wrong.)
   
So, I'm wondering is it possible to do my ranking algorithm in
   Solr/Lucene?
   
--
Nicholas Ding
   
  
  
  
   --
   Sincerely yours
   Mikhail Khludnev
   Principal Engineer,
   Grid Dynamics
  
   http://www.griddynamics.com
   mkhlud...@griddynamics.com
  
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
 mkhlud...@griddynamics.com



Include Solr score into a ranking algorithm

2014-11-19 Thread Nicholas Ding
Hi,

Currently, I'm trying to implement a ranking algorithm on Solr that includes
the TFIDFSimilarity score in a formula.

Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + ... + Vn-1 * Xn

Basically, the values of Vn are stored in DocValues, and I can access them in
a customized Function Query. The Xn are parameters I will pass to the
Function Query.

I searched the internet and dug a little into the Solr/Lucene source code.
I found there is no way to access the TFIDFSimilarity score in a Function Query.
(Please correct me if I'm wrong.)

So I'm wondering: is it possible to implement my ranking algorithm in Solr/Lucene?

--
Nicholas Ding
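
Editor's sketch: following Mikhail's suggestion in the replies (a function
query as the main query, with the relevancy score pulled in through query()),
the formula could be assembled roughly as below. This is only an illustration
under assumptions: the field names v1 and v2, the weights, and the helper
ranking_params are hypothetical, and the eDisMax sub-query is assumed to pick
up qf from the handler defaults.

```python
from urllib.parse import urlencode, parse_qs


def ranking_params(user_query, weights, docvalue_fields):
    """Build Solr params for: score*X1 + V1*X2 + ... + Vn-1*Xn.

    weights[0] is X1 (applied to the relevancy score); the remaining
    weights pair up with the DocValues-backed fields in order.
    """
    if len(weights) != len(docvalue_fields) + 1:
        raise ValueError("need one weight for the score plus one per field")
    # First term: relevancy score of the user's query, referenced via $qq.
    terms = ["product(query($qq),%s)" % weights[0]]
    # Remaining terms: one numeric field per weight (hypothetical names).
    for field, weight in zip(docvalue_fields, weights[1:]):
        terms.append("product(field(%s),%s)" % (field, weight))
    return {
        "defType": "func",
        "q": "sum(%s)" % ",".join(terms),
        "qq": "{!edismax}" + user_query,  # the actual text query
        "fl": "*,score",
    }


params = ranking_params("solr ranking", [1.0, 0.5, 0.25], ["v1", "v2"])
print(urlencode(params))
```

Filter queries (fq) stay on the request as they are; only the main query
changes.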


Solr score manager

2014-07-16 Thread Shay Sofer
Hi All,

I need a specific scoring mechanism.

I would like to sort my results based on a custom scoring field.
Scoring, for example:

1.   If this is a new object - 100

2.   Edited - 80

3.   Recent search - 50

4.   Opened - 40
and some more actions...

Then, when a new search is executed, the results are sorted by the score field.

Example:
Object 1: opened = 40.
Object 2: new = 100.
Object 3: edited x 2 + recent search x 1 = 210.

Result:

Object 3
Object 2
Object 1

Are there any good articles on this? Examples?
I'm using Solr with Java.

Thanks in advance,
Shay.
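
Editor's sketch: the arithmetic in the example above can be checked with a few
lines of code. This is only an illustration of the scoring rule as described;
the point values come from the message, while the dictionary layout and the
helper name are assumptions. The field name my_score is the one proposed later
in this thread.

```python
# Points per action, as listed in the message.
ACTION_POINTS = {"new": 100, "edited": 80, "recent_search": 50, "opened": 40}


def my_score(action_counts):
    """Sum points over all recorded actions for one object."""
    return sum(ACTION_POINTS[action] * n for action, n in action_counts.items())


objects = {
    "Object 1": {"opened": 1},
    "Object 2": {"new": 1},
    "Object 3": {"edited": 2, "recent_search": 1},
}

scores = {name: my_score(counts) for name, counts in objects.items()}
# Highest score first, which is what sorting on the field would give.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

If the total is written to a numeric field on each document, a plain
sort=my_score desc on the request should produce the same order.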







Re: Solr score manager

2014-07-16 Thread Alexandre Rafalovitch
How are you storing this information in your documents?

Regards,
Alex
On 16/07/2014 5:03 pm, Shay Sofer sha...@checkpoint.com wrote:

 Hi All,

 I need a specific score mechanism.

 I would like to sort my results based on customize scoring field.
 scoring for example -



 1.   If this is a new object - 100

 2.   Edited - 80

 3.   Recent search - 50

 4.   Opened - 40
 and some more actions...

 And then when execute a new search they sorted based on score field.

 Example:
 Object 1 : opened  = 40.
 Object 2: New = 100
 Object 3: edited X 2 + recent search X 1 = 210.

 Result:

 Object 3
 Object 2
 Object 1

 Any good article for this? Examples?
 I'm using Solr with Java.

 Thanks in advance,
 Shay.








Fwd: Solr score manager

2014-07-16 Thread Alexandre Rafalovitch
-- Forwarded message --
From: Shay Sofer sha...@checkpoint.com
Date: Wed, Jul 16, 2014 at 6:55 PM

That’s my question :-)

How should I manage this scoring system?

I guess that I need to add a new field (my_score) and update it as I want.



-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Wednesday, July 16, 2014 1:53 PM
To: solr-user
Subject: Re: Solr score manager

How are you storing this information in your documents?

Regards,
Alex
On 16/07/2014 5:03 pm, Shay Sofer sha...@checkpoint.com wrote:

 Hi All,

 I need a specific score mechanism.

 I would like to sort my results based on customize scoring field.
 scoring for example -



 1.   If this is a new object - 100

 2.   Edited - 80

 3.   Recent search - 50

 4.   Opened - 40
 and some more actions...

 And then when execute a new search they sorted based on score field.

 Example:
 Object 1 : opened  = 40.
 Object 2: New = 100
 Object 3: edited X 2 + recent search X 1 = 210.

 Result:

 Object 3
 Object 2
 Object 1

 Any good article for this? Examples?
 I'm using Solr with Java.

 Thanks in advance,
 Shay.










RE: Solr score manager

2014-07-16 Thread Doug Turnbull
Shay, this presentation I gave at ApacheCon and DC Solr Exchange might
be useful to you:

http://www.slideshare.net/mobile/o19s/hacking-lucene-for-custom-search-results

Sent from my Windows Phone
From: Shay Sofer
Sent: 7/16/2014 6:03 AM
To: solr-user@lucene.apache.org
Subject: Solr score manager
Hi All,

I need a specific score mechanism.

I would like to sort my results based on customize scoring field.
scoring for example -



1.   If this is a new object - 100

2.   Edited - 80

3.   Recent search - 50

4.   Opened - 40
and some more actions...

And then when execute a new search they sorted based on score field.

Example:
Object 1 : opened  = 40.
Object 2: New = 100
Object 3: edited X 2 + recent search X 1 = 210.

Result:

Object 3
Object 2
Object 1

Any good article for this? Examples?
I'm using Solr with Java.

Thanks in advance,
Shay.


Re: Combining Solr score with customized user ratings for a document

2014-05-26 Thread rulinma
Good. 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4138135.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Rounding errors with SOLR score

2014-03-22 Thread William Bell
I will send the debugQuery. They are exactly the same.



On Fri, Mar 21, 2014 at 2:59 AM, Raymond Wiker rwi...@gmail.com wrote:

 Are you sure that SOLR is rounding incorrectly, and not simply differently
 from what you expect? I was surprised myself at some of the rounding
 behaviour I saw with SOLR, but according to
 http://en.wikipedia.org/wiki/Rounding , the results were valid (just not
 the round-up-from-half that I naively expected).


 On Fri, Mar 21, 2014 at 3:27 AM, William Bell billnb...@gmail.com wrote:

  When doing complex boosting/bq we are getting rounding errors on the
 score.
 
  To get the score to be consistent I needed to use rint on sort:
 
  sort=rint(product(sum($p_score,$s_score,$q_score),100)) desc,s_query asc
 
  <str name="p_score">recip(priority,1,.5,.01)</str>
  <str name="s_score">product(recip(synonym_rank,1,1,.01),17)</str>
  <str name="q_score">
  query({!dismax qf="user_query_edge^1 user_query^0.5 user_query_fuzzy"
  v=$q1})
  </str>
 
  The issue is in the qf area.
 
  {"s_query": "Ear Irrigation", "score": 10.331313}, {"s_query": "Ear Piercing",
  "score": 10.331314}, {"s_query": "Ear Pinning", "score": 10.331313},
 
  --
  Bill Bell
  billnb...@gmail.com
  cell 720-256-8076
 




-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Rounding errors with SOLR score

2014-03-21 Thread Raymond Wiker
Are you sure that SOLR is rounding incorrectly, and not simply differently
from what you expect? I was surprised myself at some of the rounding
behaviour I saw with SOLR, but according to
http://en.wikipedia.org/wiki/Rounding , the results were valid (just not
the round-up-from-half that I naively expected).


On Fri, Mar 21, 2014 at 3:27 AM, William Bell billnb...@gmail.com wrote:

 When doing complex boosting/bq we are getting rounding errors on the score.

 To get the score to be consistent I needed to use rint on sort:

 sort=rint(product(sum($p_score,$s_score,$q_score),100)) desc,s_query asc

 <str name="p_score">recip(priority,1,.5,.01)</str>
 <str name="s_score">product(recip(synonym_rank,1,1,.01),17)</str>
 <str name="q_score">
 query({!dismax qf="user_query_edge^1 user_query^0.5 user_query_fuzzy"
 v=$q1})
 </str>

 The issue is in the qf area.

 {"s_query": "Ear Irrigation", "score": 10.331313}, {"s_query": "Ear Piercing",
 "score": 10.331314}, {"s_query": "Ear Pinning", "score": 10.331313},

 --
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076



Rounding errors with SOLR score

2014-03-20 Thread William Bell
When doing complex boosting/bq we are getting rounding errors on the score.

To get the score to be consistent I needed to use rint on sort:

sort=rint(product(sum($p_score,$s_score,$q_score),100)) desc,s_query asc

<str name="p_score">recip(priority,1,.5,.01)</str>
<str name="s_score">product(recip(synonym_rank,1,1,.01),17)</str>
<str name="q_score">
query({!dismax qf="user_query_edge^1 user_query^0.5 user_query_fuzzy"
v=$q1})
</str>

The issue is in the qf area.

{"s_query": "Ear Irrigation", "score": 10.331313}, {"s_query": "Ear Piercing",
"score": 10.331314}, {"s_query": "Ear Pinning", "score": 10.331313},

-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076
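
Editor's sketch: the three scores quoted above differ only in the last digit,
which is about one single-precision step at this magnitude (Lucene scores are
32-bit floats). The snippet below is a rough illustration, not a reproduction
of Solr internals: it shows that gap and why rounding the scaled score, as the
rint(product(...,100)) sort does, lets the s_query tie-breaker take over. The
helper names are assumptions.

```python
import struct


def as_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sort_key(score):
    # Mirrors rint(product(score,100)): scale, then round to an integer.
    return round(score * 100)


hits = [
    ("Ear Pinning", 10.331313),
    ("Ear Piercing", 10.331314),
    ("Ear Irrigation", 10.331313),
]

# Gap between the two distinct scores once stored as 32-bit floats.
gap = as_float32(10.331314) - as_float32(10.331313)

# Rounded score descending, then s_query ascending.
ranked = sorted(hits, key=lambda h: (-sort_key(h[1]), h[0]))
print(gap, [name for name, _ in ranked])
```

With the rounded key all three documents tie, so the order is decided purely
by s_query.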


Re: How to round solr score ?

2013-10-08 Thread Mamta Thakur
Thanks for your replies.
I am actually doing the frange approach for now. The only downside I see there
is that it makes the function call twice, calling createWeight() twice. So my
social connections are evaluated twice, which is quite a heavy operation. So I was
wondering whether I could get away with one additional call.




This email is intended for the person(s) to whom it is addressed and may 
contain information that is PRIVILEGED or CONFIDENTIAL. Any unauthorized use, 
distribution, copying, or disclosure by any person other than the addressee(s) 
is strictly prohibited. If you have received this email in error, please notify 
the sender immediately by return email and delete the message and any 
attachments from your system.

Re: How to round solr score ?

2013-09-17 Thread Mamta Thakur
Hi ,

As per this post here 
http://grokbase.com/t/lucene/solr-user/131jzcg3q2/how-to-round-solr-score.
I was able to use my custom fn in sort
(defType=func&q=socialDegree(id,1)&fl=score,*&sort=score%20asc) - works,
but can't facet on the same
(defType=func&q=socialDegree(id,1)&fl=score,*&facet=true&facet.field=score)
- doesn't work.

Exception:
org.apache.solr.common.SolrException: undefined field: score
at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:965)
at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:294)
at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:423)
at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:205)
at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)

Is there any way by which we can achieve this?

Thanks,
Mamta.





Re: How to round solr score ?

2013-09-17 Thread Chris Hostetter

: 'score' is a pseudo-field, i.e., it does not actually exist in
: the index, which is probably why it cannot be faceted on.
: Faceting on a rounded score seems like an unusual use
: case. What requirement are you trying to address?

agreed, more details would be helpful.

FWIW: the only way available to facet on functions is to use facet.query 
along with the {!frange} parser to create facet constraints based on ranges 
of function values that you specify.

there is no other way i can think of to facet over function values -- there 
is an open issue where people were discussing it, but i don't think there 
was ever a functional patch...

https://issues.apache.org/jira/browse/SOLR-1581






-Hoss
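
Editor's sketch: the facet.query plus {!frange} approach described above could
look roughly like this when the request is built from client code. The
function socialDegree(id,1) is the custom function from the question; the
bucket boundaries and the helper name are assumptions chosen only for
illustration.

```python
from urllib.parse import urlencode, parse_qsl


def frange_facet_queries(func, bounds):
    """One {!frange} facet.query per consecutive pair of boundaries."""
    queries = []
    for low, high in zip(bounds, bounds[1:]):
        # incu=false excludes the upper bound so adjacent buckets don't overlap.
        queries.append("{!frange l=%s u=%s incu=false}%s" % (low, high, func))
    return queries


params = [("q", "*:*"), ("facet", "true")]
params += [
    ("facet.query", fq)
    for fq in frange_facet_queries("socialDegree(id,1)", [0, 1, 2, 3])
]
query_string = urlencode(params)
print(query_string)
```

Each facet.query comes back with its own count, so the buckets behave like
facet values over the function's output. Note that the function is evaluated
once per bucket, which matters if it is expensive.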


Re: How to round solr score ?

2013-09-17 Thread Gora Mohanty
On 17 September 2013 18:31, Mamta Thakur mtha...@care.com wrote:

 Hi ,

 As per this post here
 http://grokbase.com/t/lucene/solr-user/131jzcg3q2/how-to-round-solr-score.
 I was able to use my custom fn in sort
 (defType=func&q=socialDegree(id,1)&fl=score,*&sort=score%20asc) - works,
 but can't facet on the same
 (defType=func&q=socialDegree(id,1)&fl=score,*&facet=true&facet.field=score)
 - doesn't work.


'score' is a pseudo-field, i.e., it does not actually exist in
the index, which is probably why it cannot be faceted on.
Faceting on a rounded score seems like an unusual use
case. What requirement are you trying to address?

Regards,
Gora


Re: Combining Solr score with customized user ratings for a document

2013-09-10 Thread Amit Jha
You can use a DB for storing user preferences and later, if you want, flush
them to Solr as an update along with the userid.

Or you may add a result pipeline filter.



Rgds
AJ

On 13-Feb-2013, at 17:50, Á_o chachime...@yahoo.es wrote:

 Hi:
 
 I am working on a proyect where we want to recommend our users products
 based on their previous 'likes', purchases and so on (typical stuff of a
 recommender system), while we want to let them browse freely the catalogue
 by search queries, making use of facets, more-like-this and so on (typical
 stuff of a Solr index).
 
 After reading here and there, I have reached the conclusion that's it's
 better to keep Solr Index apart from the database. Solr is for products
 (which can be reindexed from the DB as a nightly batch) while the DB is for
 everything else, including -the products and- user profiles. 
 
 So, given an user and a particular search (which can be as simple as q=*),
 on one hand we have Solr results (i.e. docs + scores) for the query, while
 on the other we have user predicted ratings (i.e. recommender scores) coming
 from the DB (though they could be cached elsewhere) for each of the products
 returned by Solr.
 
 And what I want is clear -to state-: combine both scores (e.g. by a simple
 product) so the user receives a sorted list of relevant products biased by
 his/her preferences.
 
 I have been googleing for the last days without finding which is the best
 way to achieve this.
 
 I think it's not a matter of boosting, or at least I can't see which
 boosting method could be useful as the boost should be user-based. I think
 that I need to extend -somewhere- Solr so I can alter the result scores by
 providing the user ID and connecting to the DB at query time, doing the
 necessary maths and returning the final score in a -quite- transparent way
 for the Web app.
 
 A less elegant solution could be letting Solr do its work as usual, and then
 navigate through the XML modifying the scores and reordering the whole list
 of products (or maybe just the first N results) by the new combined score.
 
 What do you think?
 A big THANKS in advance
 
 Álvaro
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-21 Thread Chris Hostetter

: With this approach now I can boost (i.e. multiply Solr's score by a factor)
: the results of any query by doing something like this:
: http://localhost:8080/solr/Prueba/select_test?q={!boost
: b=rating(usuario1)}text:grapa&fl=score
: 
: Where 'rating' is the name of my function.
: 
: Unfortunately, I still can't see which differences are between doing this or
: making the product of both scores as the value for the query's sort
: parameter... :(

I'm not sure i understand your question.  With the example query above, 
your score -- both returned, and used for sorting by score -- is the 
mathematical result of multiplying your function by the relevancy score of 
text:grapa

Perhaps what you are referring to is the idea that if you want the score 
to remain purely about relevancy, you can still optionally sort on the 
results of this function, by using the function solely in your sort -- the 
only thing that tends to confuse people here is how you refer back to the 
original query in that sort by function command...

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3Calpine.DEB.2.00.1206111242260.17925@bester%3E

or in your case, something like this would return both the raw 
score, and your custom rating, but it would sort on the product of those 
two values...

?q=text:grapa&fl=id,score,rating(usuario1)&sort=product(rating(usuario1),query($q)) desc

: Which is the best place to do it? I think I would query the DB/cache just
: when the custom ValueSource is created in the ValueSourceParser's parse

That might make sense, but be careful where you put this cache data -- 
if it's part of the ValueSource then whenever that ValueSource is used in 
a FunctionQuery (ie: {!boost b=rating(usuario1)}text:grapa) it will be 
part of the cache key for the queryResultCache or filterCache -- so having 
large data structures in your ValueSource could eat up a lot of RAM.  Take 
a look at src/docs/differences between the ValueSource class and the 
FunctionValues class

-Hoss
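
Editor's sketch: the two options discussed above (boosting the score versus
sorting by a function while leaving the score untouched) differ only in where
the custom function appears in the request. The snippet below builds both
parameter sets side by side. rating(usuario1) and text:grapa come from the
thread; the helper names are assumptions, and the sort direction is assumed to
be descending.

```python
from urllib.parse import urlencode, parse_qs


def boosted_request(user_id, query):
    """Score returned by Solr = relevancy score * rating(user)."""
    return {
        "q": "{!boost b=rating(%s)}%s" % (user_id, query),
        "fl": "id,score",
    }


def sorted_request(user_id, query):
    """Score stays pure relevancy; only the ordering uses the product."""
    rating = "rating(%s)" % user_id
    return {
        "q": query,
        "fl": "id,score,%s" % rating,
        "sort": "product(%s,query($q)) desc" % rating,
    }


boosted = boosted_request("usuario1", "text:grapa")
by_sort = sorted_request("usuario1", "text:grapa")
print(urlencode(boosted))
print(urlencode(by_sort))
```

Both requests should order documents the same way; the second one simply
keeps the raw relevancy score and the rating available as separate values in
the response.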


Re: Combining Solr score with customized user ratings for a document

2013-02-19 Thread Á_____o
Well, as Hoss suggested, I have implemented my own function
(ValueSourceParser+ValueSource) :)

It's a very simple function which receives a parameter, the userId, and
returns a float value depending (with a switch-case structure just for this
demo) on it.

With this approach now I can boost (i.e. multiply Solr's score by a factor)
the results of any query by doing something like this:
http://localhost:8080/solr/Prueba/select_test?q={!boost
b=rating(usuario1)}text:grapa&fl=score

Where 'rating' is the name of my function.

Unfortunately, I still can't see what the difference is between doing this and
using the product of both scores as the value of the query's sort
parameter... :(

The next step is, of course, to replace that demo switch-case structure with a SQL
query to my DB or a retrieval from a Solr cache.

My idea is to retrieve a <docId, recScore> map from the DB for each user who
queries our system for the first time. The next time he/she queries it, I'd
like to get his/her map from a Solr cache (until the info becomes obsolete).

Which is the best place to do it? I think I would query the DB/cache just
when the custom ValueSource is created in the ValueSourceParser's parse
call, storing the map in the ValueSource. Then my floatVal method would just
be a 'get' from my map.

I'm so close!

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4041272.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-15 Thread Á_____o
Hi Tim!

Thank you for bringing in some light ;)

I have read your slides (in fact, I had already read them in the last days)
but I'm still missing something.

So, let's see...

As I see it (and I may be wrong), Solr's external file fields are some kind of
<docID, score> maps, aren't they? I understand the power of this approach
for popularity scores, like in your example, which don't depend on anything
else but the docID, but which you don't want to store in the index so that
they can be refreshed more often than the documents. The problem here (as
with other regular rec systems, to my limited knowledge) is that we need a
<usrID, <docID, score>> map instead.

The other thing you address is a custom Component. Hmm... I hadn't thought of
that before. Maybe I read your slides when I had less understanding of
Solr's internal mechanisms (though I'm still quite confused). So, alright,
something that receives a parameter with the user ID, plus a cache so
we don't eventually end up with a bottleneck, is definitely the direction I have
to follow. But now the problem I find is that I don't want to query my
database just to get categories for a filter. What I want to query is the
user rating for each document so I can combine it with Solr's relevancy
score.
All complaints, I know... :p Could you go -just a bit- further into that
Mahout integration with Solr?

I think I'm going to dive into custom components to get a better
understanding of them and to see if I can find there my holy grail :P

Thanks A LOT!

Regards,
Álvaro



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4040597.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-15 Thread Á_____o
Á_o wrote
 As I see (and I may be wrong) Solr's external file fields are some kind of
 <docID, score> maps, aren't them?

Actually I was wrong ;)
The key does not have to be necessarily the docID. It can be some other
field. Anyway, even in that case, it's still a 'docKey' which I can't see
how could it be user-customized... :(



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4040616.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-15 Thread Timothy Potter
Alvaro - still thinking ... will reply when I have more ;-)

On Fri, Feb 15, 2013 at 6:31 AM, Á_o chachime...@yahoo.es wrote:
 Á_o wrote
 As I see (and I may be wrong) Solr's external file fields are some kind of
 <docID, score> maps, aren't them?

 Actually I was wrong ;)
 The key does not have to be necessarily the docID. It can be some other
 field. Anyway, even in that case, it's still a 'docKey' which I can't see
 how could it be user-customized... :(



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4040616.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-15 Thread Chris Hostetter

: 
http://www.slideshare.net/thelabdude/boosting-documents-in-solr-lucene-revolution-2011
...
:  Start by looking at Solr's external file field and

Rather than using ExternalFileField as inspiration, i would suggest you 
look at implementing a custom ValueSourceParser...

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3Calpine.DEB.2.02.1301071825090.14692@frisbee%3E

Then you can either use your custom function as a boost function included 
in the main query, or in isolation as part of a sort by function which 
could also include the score from the main query.  (Which one you choose 
should depend on your final goal, and how expensive it is to query your 
external datasource to find out the per-user rankings.)



-Hoss


Re: Combining Solr score with customized user ratings for a document

2013-02-14 Thread Á_____o
Well, thinking a bit more, the second solution is not practical.

If Solr retrieves, say, 1,000 documents, I would have to navigate through
ALL of them (maybe fewer with some reasonable upper limit) to recalculate the
scores and reorder them according to the new score, although the Web App is
going to show just the first 20.

In other words, I would lose the benefits of Solr's (well, and most DBs')
row/offset feature to retrieve information in batches rather than the whole
set of results, which may not be seen by the user at all.

I'm now wondering if a custom implementation of a ValueSource + a
FunctionQuery is a solution to my problem...

Any hint?
Thanks!

Álvaro



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4040444.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-14 Thread Timothy Potter
Start by looking at Solr's external file field and
http://www.linkedin.com/profile/view?id=18807864trk=tab_pro

On Thu, Feb 14, 2013 at 6:24 AM, Á_o chachime...@yahoo.es wrote:
 Well, thinking a bit more, the second solution is not practical.

 If Solr retrieves, say, 1.000 documents, I would have to navigate through
 ALL (maybe less with some reasonable upper limit) of them to recalculate the
 scores and reorder them according to the new score although the Web App is
 going to show just the first 20.

 In other words, I would lose the benefits of Solr's (well, and most DB's)
 row/offset feature to retrieve information in batchs rather than the whole
 amount of results which may not be seen by the user at all.

 I'm now wondering if a custom implementation of a ValueSource + a
 FunctionQuery is a solution to my problem...

 Any hint?
 Thanks!

 Álvaro



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4040444.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining Solr score with customized user ratings for a document

2013-02-14 Thread Timothy Potter
Oops - that's definitely not the link I meant to give ;-) Here's the
link from slideshare:

http://www.slideshare.net/thelabdude/boosting-documents-in-solr-lucene-revolution-2011

In there we used Mahout to calculate recommendation scores and then
loaded them using external file field.

Cheers,
Tim

On Thu, Feb 14, 2013 at 11:25 AM, Timothy Potter thelabd...@gmail.com wrote:
 Start by looking at Solr's external file field and
 http://www.linkedin.com/profile/view?id=18807864trk=tab_pro

 On Thu, Feb 14, 2013 at 6:24 AM, Á_o chachime...@yahoo.es wrote:
 Well, thinking a bit more, the second solution is not practical.

 If Solr retrieves, say, 1.000 documents, I would have to navigate through
 ALL (maybe less with some reasonable upper limit) of them to recalculate the
 scores and reorder them according to the new score although the Web App is
 going to show just the first 20.

 In other words, I would lose the benefits of Solr's (well, and most DB's)
 row/offset feature to retrieve information in batchs rather than the whole
 amount of results which may not be seen by the user at all.

 I'm now wondering if a custom implementation of a ValueSource + a
 FunctionQuery is a solution to my problem...

 Any hint?
 Thanks!

 Álvaro



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4040444.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Combining Solr score with customized user ratings for a document

2013-02-13 Thread Á_____o
Hi:

I am working on a project where we want to recommend our users products
based on their previous 'likes', purchases and so on (typical stuff of a
recommender system), while we want to let them browse freely the catalogue
by search queries, making use of facets, more-like-this and so on (typical
stuff of a Solr index).

After reading here and there, I have reached the conclusion that it's
better to keep Solr Index apart from the database. Solr is for products
(which can be reindexed from the DB as a nightly batch) while the DB is for
everything else, including -the products and- user profiles. 

So, given a user and a particular search (which can be as simple as q=*),
on one hand we have Solr results (i.e. docs + scores) for the query, while
on the other we have user predicted ratings (i.e. recommender scores) coming
from the DB (though they could be cached elsewhere) for each of the products
returned by Solr.

And what I want is clear -to state-: combine both scores (e.g. by a simple
product) so the user receives a sorted list of relevant products biased by
his/her preferences.

I have been googling for the last few days without finding the best
way to achieve this.

I think it's not a matter of boosting, or at least I can't see which
boosting method could be useful as the boost should be user-based. I think
that I need to extend -somewhere- Solr so I can alter the result scores by
providing the user ID and connecting to the DB at query time, doing the
necessary maths and returning the final score in a -quite- transparent way
for the Web app.

A less elegant solution could be letting Solr do its work as usual, and then
navigate through the XML modifying the scores and reordering the whole list
of products (or maybe just the first N results) by the new combined score.

What do you think?
A big THANKS in advance

Álvaro



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to round solr score ?

2013-01-18 Thread Gora Mohanty
On 18 January 2013 18:26, Gustav xbihy...@sharklasers.com wrote:
 I have to bump this... is it possible to do it (round solr's score) with any
 integrated query function??

Do not have a Solr index handy at the moment to check,
but it should be possible to do this with function queries.
Please see the rint() and query() function at
http://wiki.apache.org/solr/FunctionQuery

Regards,
Gora


Re: How to round solr score ?

2013-01-18 Thread Gustav
Hey Gora, thanks for the fast answer!

I had tried the rint(score) function before (it would be perfect in my case),
but it didn't work out; I guess it only works with indexed fields, so I got
the "sort param could not be parsed as a query, and is not a field that
exists in the index: rint(score)" error.

And with the query() function I didn't get any successful result...

I'm stuck in the same scenario as squaro.

If two docs have scores of 1.67989 and 1.6767, I would like to sort them by
price.

My sort rules are something like:
sort=score desc, price asc



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-round-solr-score-tp495198p4034551.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to round solr score ?

2013-01-18 Thread Gora Mohanty
On 18 January 2013 19:18, Gustav xbihy...@sharklasers.com wrote:
 Hey Gora, thanks for the fast answer!

 I Had tried the rint(score) function before(it would be perfect in my case)
 but it didnt work out, i guess it only works with indexed fields, so i got
 the  sort param could not be parsed as a query, and is not a field that
 exists in the index: rint(score) error,

 And with the query() function i didnt got any successful result...

 Im stuck in the same cenario as squaro.

 if two docs have score of 1.67989 and 1.6767, I would like to sort them by
 price.

 My sort rules ae something like:
 sort=score desc, price asc

You have to use rint() in combination with query().

If I understand your requirements correctly, something
along the lines below should work:
http://localhost:8983/solr/select/?defType=func&q=rint(query({!v=text:term}))&fl=score,*&sort=score desc,price asc
where one is searching for term in the field text.
The score is displayed in the returned fields to demonstrate that
it has been rounded off.

Regards,
Gora
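
Editor's sketch: the same request can be assembled from client code so that
the parameters are separated and URL-encoded correctly. This is only an
illustration of the rint(query(...)) suggestion above; the host, the field
name text and the search term are the placeholders used in the message.

```python
from urllib.parse import urlencode, parse_qs

params = {
    "defType": "func",
    # The main query is a function: the rounded score of the wrapped query.
    "q": "rint(query({!v=text:term}))",
    "fl": "score,*",
    # Equal rounded scores fall through to the price tie-breaker.
    "sort": "score desc,price asc",
}
url = "http://localhost:8983/solr/select/?" + urlencode(params)
print(url)

# The two scores from the question collapse to the same rounded value,
# so price decides their relative order.
same_bucket = round(1.67989) == round(1.6767)
print(same_bucket)
```

Rounding to whole numbers is coarse; scaling before rounding, for example
rint(product(query(...),10)), would keep one decimal place of distinction.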


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-28 Thread Chris Hostetter

: Not really. The percentage given in other search packages is fairly
: bogus. You have to do a global batch analysis of all of the index to
: get a true scale for relevance.

Exactly...

https://wiki.apache.org/solr/FAQ#Why_Aren.27t_Scores_returned_as_a_percentage.3F_How_Do_I_normalize_Scores.3F
https://wiki.apache.org/lucene-java/ScoresAsPercentages

*you* -- as the person in control of your solr instance, who knows 
everything about every document in the index, and has total control over 
the set of valid queries being executed against the index -- you *MAY* be 
able to compute a meaningful threshold of scores, based on the 
constraints you know/enforce.  But Solr can't do this, because in 
general Solr doesn't know those constraints (or if those constraints even 
exist) for an arbitrary index.


-Hoss


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-26 Thread Lance Norskog
Not really. The percentage given in other search packages is fairly
bogus. You have to do a global batch analysis of all of the index to
get a true scale for relevance.

On Sat, Aug 25, 2012 at 1:38 PM, Ramzi Alqrainy
ramzi.alqra...@gmail.com wrote:
 You are right Mr. Ravish, because this depends on the (ranking and search
 fields) formula, but please allow me to tell you that the Solr score can help
 us to determine whether a document is relevant or not in some cases.






-- 
Lance Norskog
goks...@gmail.com


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-25 Thread Ramzi Alqrainy
It will never return an empty result, because the threshold is relative to the
score of the previous result:

If score < 0.25*last_score then stop

Since score > 0 and last_score is 0 for the initial hit, it will not stop there.
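
A rough sketch of that stopping rule in Python, reading it as "stop once a score falls below a quarter of the previous score". The function name and the 0.25 default are illustrative assumptions, not part of Solr or of SOLR-3747.

```python
def cut_at_score_drop(scores, ratio=0.25):
    """Keep scores (already sorted descending) until one drops sharply.

    Hypothetical helper: stops at the first score that is below
    ratio * previous score.
    """
    kept = []
    last_score = 0.0  # 0 for the initial hit, so the first hit is always kept
    for score in scores:
        if score < ratio * last_score:
            break
        kept.append(score)
        last_score = score
    return kept


print(cut_at_score_drop([2.0, 1.5, 0.3, 0.2]))
```

Because the cutoff is relative to the previous hit, any query with at least one match returns at least one result.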



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4003247.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-25 Thread Ramzi Alqrainy
You are right Mr. Ravish, because this depends on the (ranking and search
fields) formula, but please allow me to tell you that the Solr score can help
us to determine whether a document is relevant or not in some cases.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4003248.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-22 Thread Mou
Hi,
I think this depends entirely on your requirements, so it is only applicable
to a particular user scenario. A score does not have any absolute meaning; it
is always relative to the query. If you want to watch some particular queries
and show results with a score above a previously set threshold, you can use this.

If I always have that x% threshold in place, there may be many queries
which would not return anything, and I certainly do not want that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4002673.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Score threshold 'reasonably', independent of results returned

2012-08-22 Thread Ravish Bhagdev
Commercial solutions often have a percentage that is meant to signify the
quality of a match. Solr has a relative score, and you cannot tell just by
looking at this value whether a result is relevant enough to be on the first
page or not. The score depends on what else is in the index, so it is not easy
to normalize in the way you suggest.

Ravish

On Wed, Aug 22, 2012 at 4:03 PM, Mou mouna...@gmail.com wrote:

 Hi,
 I think this depends entirely on your requirements, so it is only applicable
 to a particular user scenario. A score does not have any absolute meaning; it
 is always relative to the query. If you want to watch some particular queries
 and show results with a score above a previously set threshold, you can use
 this.

 If I always have that x% threshold in place, there may be many queries
 which would not return anything, and I certainly do not want that.






Solr Score threshold 'reasonably', independent of results returned

2012-08-20 Thread Ramzi Alqrainy
Usually, search results are sorted by their score (how well the document
matched the query), but it is common to need to support the sorting of
supplied data too.
Boosting affects the scores of matching documents in order to affect ranking
in score-sorted search results. Providing a boost value, whether at the
document or field level, is optional.
When the results are returned with scores, we want to be able to keep only
results that are above some score (i.e. results of a certain quality only).
Is it possible to do this when the returned subset could be anything?
I ask because it seems like on some queries a score of, say, 0.008
results in a decent match, whereas on other queries a higher score results in
a poor match.
I have written pseudo code to achieve what I said.
Note: I have attached my code as screenshot

http://lucene.472066.n3.nabble.com/file/n4002312/Screen_Shot_2012-08-21_at_5.30.38_AM.png
 

https://issues.apache.org/jira/browse/SOLR-3747



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Customizing Solr score with DixMax query

2012-02-27 Thread Ahmet Arslan


--- On Mon, 2/27/12, Xiao shinelee.thew...@gmail.com wrote:

 From: Xiao shinelee.thew...@gmail.com
 Subject: Customizing Solr score with DixMax query
 To: solr-user@lucene.apache.org
 Date: Monday, February 27, 2012, 5:59 AM
 In my application logic, I want to
 implement the ranking (scoring) logic as
 follows: 
 
 score = Solr relevancy score * a_special_field_value.
 
 I tried to use DisMax to do this. My query statement is
 q={!type=dismax
 qf='title content' bf=field1}data. However, when I open the
 debugQuery
 option, I find that what Solr does is just a sum of the
 two scores,
 i.e., the TF-IDF score and FunctionQuery score. But what I
 want is to
 multiply the two together. How can I implement the
 multiplication operation?

edismax has a boost parameter for this.
q={!type=edismax qf='title content' boost=field1}data
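
As a rough illustration of the difference between the two parameters, here is a toy calculation in Python. The numbers are made up, and the sketch ignores query normalization and everything else Solr does internally; it only contrasts adding a function value (bf) with multiplying by it (boost).

```python
# Hypothetical values for illustration only; real Solr scores also involve
# query normalization, which this sketch ignores.
text_score = 0.8     # relevancy score of the text match
field1_value = 5.0   # value of the special field for this document

additive = text_score + field1_value        # bf: function value is added
multiplicative = text_score * field1_value  # boost: score is multiplied

print(additive, multiplicative)
```

With the additive form a large field value can swamp the text score; with the multiplicative form it scales it.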


Re: Customizing Solr score with DixMax query

2012-02-27 Thread Xiao
Yes! Thank you! I also found this on the Sematext blog this morning.

Edismax
 Supports the “boost” parameter.. like the dismax bf param, but multiplies
the function query instead of adding it in 

http://blog.sematext.com/2010/01/20/solr-digest-january-2010/

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Customizing-Solr-score-with-DixMax-query-tp3779591p3781200.html
Sent from the Solr - User mailing list archive at Nabble.com.


Customizing Solr score with DixMax query

2012-02-26 Thread Xiao
In my application logic, I want to implement the ranking (scoring) logic as
follows: 

score = Solr relevancy score * a_special_field_value.

I tried to use DisMax to do this. My query statement is q={!type=dismax
qf='title content' bf=field1}data. However, when I turn on the debugQuery
option, I find that what Solr does is just a sum of the two scores,
i.e., the TF-IDF score and FunctionQuery score. But what I want is to
multiply the two together. How can I implement the multiplication operation?

I also tried using a boosted query. I issue a query like q={!boost
b=field1}data. In this case, Solr does return a score which is a product
of the two scores. However, by using the boosted query, I lose the power of
the dismax query, which can search across multiple fields.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Customizing-Solr-score-with-DixMax-query-tp3779591p3779591.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Score Normalization

2011-11-16 Thread Jan Høydahl
Perhaps you can solve your usecase by playing with the new eDismax boost 
parameter, which multiplies the functions with the other score instead of 
adding.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 5. nov. 2011, at 01:26, sangrish wrote:

 
 Hi,
 
 
 I have a (dismax) request handler which has the following 3 scoring
 components (1 qf & 2 bf):
 
 qf = field1^2 field2^3
 bf = func1(field3)^2 func2(field4)^3
 
 Both func1 & func2 return scores between 0 & 1. The score returned by the
 textual match (qf) ranges from 0 to NOT_A_FIXED_NUMBER.
 
 To allow better combination of text match & my functions, I want the text
 score to be normalized between 0 & 1. Is there any way I can achieve that
 here?
 
 Thanks
 Sid
 
 
 
 



Re: Solr Score Normalization

2011-11-16 Thread Chris Hostetter

: Perhaps you can solve your usecase by playing with the new eDismax 
: boost parameter, which multiplies the functions with the other score 
: instead of adding.

and FWIW: the boost param of the edismax parser is really just syntactic 
sugar for using the BoostQParser wrapped around an edismax query -- you 
can wrap it around any query produced by any QParser...

  q={!edismax qf=foo}bar&boost=func(asdf)

...is the same as...

  q={!boost b=func(asdf) v=$qq}&qq={!edismax qf=foo}bar
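
A small Python sketch that spells out the two request forms as separate parameters, which makes the role of the qq parameter easier to see. The names foo, bar and func(asdf) are the placeholders from the example above, not real fields or functions.

```python
from urllib.parse import urlencode

# Form 1: edismax with its boost parameter.
sugar = {"q": "{!edismax qf=foo}bar", "boost": "func(asdf)"}

# Form 2: an explicit boost query that wraps the same edismax query,
# which is passed separately and dereferenced through v=$qq.
explicit = {"q": "{!boost b=func(asdf) v=$qq}", "qq": "{!edismax qf=foo}bar"}

for params in (sugar, explicit):
    print(urlencode(params))
```

In the second form the wrapped query could come from any other parser by changing only qq.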



-Hoss


Solr Score Normalization

2011-11-04 Thread sangrish

Hi,


I have a (dismax) request handler which has the following 3 scoring
components (1 qf & 2 bf):

qf = field1^2 field2^3
bf = func1(field3)^2 func2(field4)^3

Both func1 & func2 return scores between 0 & 1. The score returned by the
textual match (qf) ranges from 0 to NOT_A_FIXED_NUMBER.

To allow better combination of text match & my functions, I want the text
score to be normalized between 0 & 1. Is there any way I can achieve that
here?

Thanks
Sid




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Score-Normalization-tp3481627p3481627.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Score Normalization

2011-11-04 Thread Chris Hostetter


: To allow better combination of text match & my functions, I want the text
: score to be normalized between 0 & 1. Is there any way I can achieve that
: here?

It is achievable, but it is not usually meaningful...

https://wiki.apache.org/lucene-java/ScoresAsPercentages


-Hoss


solr score issue

2011-02-25 Thread Bagesh Sharma

Hi sir , 

Can anyone explain to me how this score is being calculated? I am searching
here for "software engineer" using the dismax handler. Total documents indexed
are 477 and query results are 28.

The query is:
   q=software+engineer&fq=location%3Adelhi

The dismax setting is:

<str name="qf">
  alltext
  title^2
  functional_role^1
</str>

<str name="pf">
  body^100
</str>


Here alltext field is made by copying all fields.
body field contains detail of job.

I am unable to understand how these scores have been calculated. Where does
the score calculation start, and what is the default score for a term match?

<str name="20080604/3eb9a7b30131a782a0c0a0e2cdb2b6b8.html">

0.5901718 = (MATCH) sum of:
  0.0032821721 = (MATCH) sum of:
    0.0026574256 = (MATCH) max plus 0.1 times others of:
      0.0026574256 = (MATCH) weight(alltext:softwar in 339), product of:
        0.0067262817 = queryWeight(alltext:softwar), product of:
          3.6121683 = idf(docFreq=34, maxDocs=477)
          0.0018621174 = queryNorm
        0.39508092 = (MATCH) fieldWeight(alltext:softwar in 339), product of:
          1.0 = tf(termFreq(alltext:softwar)=1)
          3.6121683 = idf(docFreq=34, maxDocs=477)
          0.109375 = fieldNorm(field=alltext, doc=339)
    6.2474643E-4 = (MATCH) max plus 0.1 times others of:
      6.2474643E-4 = (MATCH) weight(alltext:engin in 339), product of:
        0.0032613424 = queryWeight(alltext:engin), product of:
          1.7514161 = idf(docFreq=224, maxDocs=477)
          0.0018621174 = queryNorm
        0.19156113 = (MATCH) fieldWeight(alltext:engin in 339), product of:
          1.0 = tf(termFreq(alltext:engin)=1)
          1.7514161 = idf(docFreq=224, maxDocs=477)
          0.109375 = fieldNorm(field=alltext, doc=339)
  0.5868896 = weight(body:softwar engin^100.0 in 339), product of:
    0.9995919 = queryWeight(body:softwar engin^100.0), product of:
      100.0 = boost
      5.3680387 = idf(body: softwar=34 engin=223)
      0.0018621174 = queryNorm
    0.58712924 = fieldWeight(body:softwar engin in 339), product of:
      1.0 = tf(phraseFreq=1.0)
      5.3680387 = idf(body: softwar=34 engin=223)
      0.109375 = fieldNorm(field=body, doc=339)
</str>
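
The explain tree above can be re-derived bottom-up. The Python sketch below recomputes it from the leaf values; it assumes the classic Lucene TF-IDF idf formula, 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the idf values shown. The helper names are made up for illustration. Each "max plus 0.1 times others" clause has only one matching field here, so it reduces to that field's weight.

```python
import math

# Leaf values copied from the explain output above.
max_docs = 477
query_norm = 0.0018621174
field_norm = 0.109375


def idf(doc_freq, max_docs):
    # Assumed classic Lucene TF-IDF idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def clause_weight(idf_value, tf=1.0, boost=1.0):
    # weight = queryWeight * fieldWeight
    query_weight = boost * idf_value * query_norm
    field_weight = tf * idf_value * field_norm
    return query_weight * field_weight


softwar = clause_weight(idf(34, max_docs))    # alltext:softwar
engin = clause_weight(idf(224, max_docs))     # alltext:engin
phrase = clause_weight(5.3680387, boost=100)  # body phrase, idf taken as printed

total = softwar + engin + phrase
print(softwar, engin, phrase, total)
```

Nearly all of the final score comes from the phrase match on body, because of the ^100 boost in pf.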


Please advise.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-score-issue-tp2574680p2574680.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr score issue

2011-02-25 Thread Jayendra Patil
Check the "Need help in understanding output of searcher.explain()
function" thread.

http://mail-archives.apache.org/mod_mbox/lucene-java-user/201008.mbox/%3CAANLkTi=m9a1guhrahpeyqaxhu9gta9fjbnr7-8-zi...@mail.gmail.com%3E

Regards,
Jayendra

On Fri, Feb 25, 2011 at 6:57 AM, Bagesh Sharma mail.bag...@gmail.com wrote:

 Hi sir ,

 Can anyone explain to me how this score is being calculated? I am searching
 here for "software engineer" using the dismax handler. Total documents indexed
 are 477 and query results are 28.

 The query is:
        q=software+engineer&fq=location%3Adelhi

 The dismax setting is:

        <str name="qf">
              alltext
              title^2
              functional_role^1
        </str>

        <str name="pf">
              body^100
        </str>


 Here alltext field is made by copying all fields.
 body field contains detail of job.

 I am unable to understand how these scores have been calculated. Where does
 the score calculation start, and what is the default score for a term match?

 <str name="20080604/3eb9a7b30131a782a0c0a0e2cdb2b6b8.html">

 0.5901718 = (MATCH) sum of:
  0.0032821721 = (MATCH) sum of:
    0.0026574256 = (MATCH) max plus 0.1 times others of:
      0.0026574256 = (MATCH) weight(alltext:softwar in 339), product of:
        0.0067262817 = queryWeight(alltext:softwar), product of:
          3.6121683 = idf(docFreq=34, maxDocs=477)
          0.0018621174 = queryNorm
        0.39508092 = (MATCH) fieldWeight(alltext:softwar in 339), product of:
          1.0 = tf(termFreq(alltext:softwar)=1)
          3.6121683 = idf(docFreq=34, maxDocs=477)
          0.109375 = fieldNorm(field=alltext, doc=339)
    6.2474643E-4 = (MATCH) max plus 0.1 times others of:
      6.2474643E-4 = (MATCH) weight(alltext:engin in 339), product of:
        0.0032613424 = queryWeight(alltext:engin), product of:
          1.7514161 = idf(docFreq=224, maxDocs=477)
          0.0018621174 = queryNorm
        0.19156113 = (MATCH) fieldWeight(alltext:engin in 339), product of:
          1.0 = tf(termFreq(alltext:engin)=1)
          1.7514161 = idf(docFreq=224, maxDocs=477)
          0.109375 = fieldNorm(field=alltext, doc=339)
  0.5868896 = weight(body:softwar engin^100.0 in 339), product of:
    0.9995919 = queryWeight(body:softwar engin^100.0), product of:
      100.0 = boost
      5.3680387 = idf(body: softwar=34 engin=223)
      0.0018621174 = queryNorm
    0.58712924 = fieldWeight(body:softwar engin in 339), product of:
      1.0 = tf(phraseFreq=1.0)
      5.3680387 = idf(body: softwar=34 engin=223)
      0.109375 = fieldNorm(field=body, doc=339)
 </str>


 Please advise.



How to round solr score ?

2009-03-30 Thread squaro

Hello,

I would like to cut the solr score to 3 or 4 digits.
Indeed I would like to be able to sort by score, then by another criterion
(price for example).
So if two docs have scores of 1.67989 and 1.6767, I would like to sort them
by price.

Do you have any idea how I could do that?
-- 
View this message in context: 
http://www.nabble.com/How-to-round-solr-score---tp22787254p22787254.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to round solr score ?

2009-03-30 Thread Shalin Shekhar Mangar
On Mon, Mar 30, 2009 at 10:04 PM, squaro marclebe...@gmail.com wrote:


 Hello,

 I would like to cut the solr score to 3 or 4 digits.
 Indeed I would like to be able to sort by score, then by another criterion
 (price for example).
 So if two docs have scores of 1.67989 and 1.6767, I would like to sort them
 by price.

 Do you have any idea how I could do that?


I don't think there is an existing way to round them. But it will be a
useful contribution if you can write a function query for rounding.

Look at http://wiki.apache.org/solr/FunctionQuery

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to round solr score ?

2009-03-30 Thread Grant Ingersoll


On Mar 30, 2009, at 1:07 PM, Shalin Shekhar Mangar wrote:

On Mon, Mar 30, 2009 at 10:04 PM, squaro marclebe...@gmail.com  
wrote:




Hello,

I would like to cut the solr score to 3 or 4 digits.
Indeed I would like to be able to sort by score, then by another criterion
(price for example).
So if two docs have scores of 1.67989 and 1.6767, I would like to sort them
by price.

Do you have any idea how I could do that?



I don't think there is an existing way to round them. But it will be a
useful contribution if you can write a function query for rounding.

Look at http://wiki.apache.org/solr/FunctionQuery


What did you have in mind, Shalin? It seems to me you would have to
hook into the HitCollector and/or implement your own sorting
capability, as the Func Query is just going to allow you to take price
in as a scoring factor, no?


-Grant


Re: How to round solr score ?

2009-03-30 Thread Walter Underwood
I think what you want to do is add in a function query that gives
values in that range.

There is no need to round the scores. That doesn't do anything
but throw away information.

wunder

On 3/30/09 10:07 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 On Mon, Mar 30, 2009 at 10:04 PM, squaro marclebe...@gmail.com wrote:
 
 
 Hello,
 
 I would like to cut the solr score to 3 or 4 digits.
 Indeed I would like to be able to sort by score, then by another criterion
 (price for example).
 So if two docs have scores of 1.67989 and 1.6767, I would like to sort them
 by price.
 
 Do you have any idea how I could do that?
 
 
 I don't think there is an existing way to round them. But it will be a
 useful contribution if you can write a function query for rounding.
 
 Look at http://wiki.apache.org/solr/FunctionQuery



Re: How to round solr score ?

2009-03-30 Thread Shalin Shekhar Mangar
On Mon, Mar 30, 2009 at 10:54 PM, Grant Ingersoll gsing...@apache.org wrote:


 I don't think there is an existing way to round them. But it will be a
 useful contribution if you can write a function query for rounding.

 Look at http://wiki.apache.org/solr/FunctionQuery


 What did you have in mind, Shalin? It seems to me you would have to hook
 into the HitCollector and/or implement your own sorting capability, as the
 Func Query is just going to allow you to take price in as a scoring factor,
 no?


Yonik added a way to use the score of a query in function queries with
SOLR-939. Look at the query function on the wiki. Some very cool things
are possible now :)

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to round solr score ?

2009-03-30 Thread Shalin Shekhar Mangar
On Mon, Mar 30, 2009 at 11:06 PM, Walter Underwood
wunderw...@netflix.com wrote:

 I think what you want to do is add in a function query that gives
 values in that range.


The scale function won't work in this use-case because it will give you a
double in the given range. So you cannot do sort by score and price. For
this use-case you need to scale to an integer value in a discrete range.

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to round solr score ?

2009-03-30 Thread Shalin Shekhar Mangar
On Mon, Mar 30, 2009 at 11:07 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:


 Yonik added a way to use the score of a query in function queries with
 SOLR-939. Look at the query function on the wiki. Some very cool things
 are possible now :)


Sorry, that should have been SOLR-1046

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to round solr score ?

2009-03-30 Thread Shalin Shekhar Mangar
On Mon, Mar 30, 2009 at 11:10 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Mon, Mar 30, 2009 at 11:06 PM, Walter Underwood wunderw...@netflix.com
  wrote:

 I think what you want to do is add in a function query that gives
 values in that range.


 The scale function won't work in this use-case because it will give you a
 double in the given range. So you cannot do sort by score and price. For
 this use-case you need to scale to an integer value in a discrete range.


Walter -- I think I misinterpreted your response. Sorry about that. You are
indeed right. However, we can do scale(round(score, 2), 1, 10) or we can
create a new scale function as you said.
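
For comparison, the same idea can be tried on the client side. The Python sketch below re-sorts a page of results by score rounded to two digits, then by price. The documents and the sort_key helper are made up for illustration; this only reorders what was already fetched and does not replace a server-side function.

```python
# Hypothetical result page; ids and prices are invented for illustration.
docs = [
    {"id": "a", "score": 1.67989, "price": 30.0},
    {"id": "b", "score": 1.6767, "price": 20.0},
    {"id": "c", "score": 1.2, "price": 5.0},
]


def sort_key(doc, digits=2):
    # Negate the rounded score so higher scores come first;
    # price ascending breaks ties between equal rounded scores.
    return (-round(doc["score"], digits), doc["price"])


ranked = sorted(docs, key=sort_key)
print([d["id"] for d in ranked])
```

Both 1.67989 and 1.6767 round to 1.68, so the cheaper of the two comes first while the lower-scoring document stays last.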

-- 
Regards,
Shalin Shekhar Mangar.


Re: solr score

2008-09-24 Thread Neeti Raj
Hi Santhanaraj

Just search for "boost" on the Solr wiki and see if the boost feature meets
your requirement.

As for highlighting, this explains how to implement Solr highlighting:
http://wiki.apache.org/solr/HighlightingParameters
- neeti

On Wed, Sep 24, 2008 at 10:31 AM, sanraj25 [EMAIL PROTECTED] wrote:


 hi,
  How can I give more weight to frequently searched words in Solr?

 What functionality does the Apache Solr module provide for this?
 I have a list of the more frequently searched words on my site, and I need to
 highlight those words. From the net I found out that 'score' is used for
 this purpose. Is that true?
 Does anybody know about it?
 Please help me.

 with Regards,
 Santhanaraj R





solr score

2008-09-23 Thread sanraj25

hi,
  How can I give more weight to frequently searched words in Solr?

What functionality does the Apache Solr module provide for this?
I have a list of the more frequently searched words on my site, and I need to
highlight those words. From the net I found out that 'score' is used for this
purpose. Is that true?
Does anybody know about it?
Please help me.

with Regards,
Santhanaraj R

-- 
View this message in context: 
http://www.nabble.com/solr-score-tp19642046p19642046.html
Sent from the Solr - User mailing list archive at Nabble.com.



A question about solr score

2007-10-26 Thread zx zhang
Hi, everyone!
As we know, Solr uses Lucene scoring.
This score is the raw score. Scores returned from Hits aren't
necessarily the raw score, however. If the top-scoring document scores
greater than 1.0, all scores are normalized from that score, such that
all scores from Hits are guaranteed to be 1.0 or less.
Now here is my question: I always get scores for some documents which are
above 1.0, and some even get up to 10.0!
Why?
I will really appreciate your reply.


Re: A question about solr score

2007-10-26 Thread Erik Hatcher

Solr returns the raw score, not the Lucene Hits normalized one.

It's trivial for the client to normalize if desired - take the top  
scoring document, if it's greater than 1.0 then scale all scores  
based on that.
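
A minimal Python sketch of that client-side normalization, mirroring the Hits behaviour described in this thread: scale only when the top score exceeds 1.0. The function name is an illustrative assumption.

```python
def normalize_scores(scores):
    """Scale scores so the top one is 1.0, but only if it exceeds 1.0."""
    if not scores:
        return []
    top = max(scores)
    if top <= 1.0:
        return list(scores)  # already within range, leave untouched
    return [score / top for score in scores]


print(normalize_scores([10.0, 5.0, 2.5]))
```

The normalized values are only comparable within a single result set, not across queries.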


Erik


On Oct 26, 2007, at 2:53 AM, zx zhang wrote:


Hi, everyone!
As we know, Solr uses Lucene scoring.
This score is the raw score. Scores returned from Hits aren't
necessarily the raw score, however. If the top-scoring document scores
greater than 1.0, all scores are normalized from that score, such that
all scores from Hits are guaranteed to be 1.0 or less.
Now here is my question: I always get scores for some documents which are
above 1.0, and some even get up to 10.0!
Why?
I will really appreciate your reply.




Re: A question about solr score

2007-10-26 Thread Chris Hostetter

: It's trivial for the client to normalize if desired - take the top scoring
: document, if it's greater than 1.0 then scale all scores based on that.

this is why doclists include the maxScore in their output as well, to 
make it easy to normalize scores even if you are using pagination (or 
sorting on a field other than score)

http://localhost:8983/solr/select/?q=video&fl=id,score&start=1

<result name="response" numFound="3" start="1" maxScore="0.5145902">
 <doc>
  <float name="score">0.39613172</float>
  <str name="id">EN7800GTX/2DHTV/256M</str>
 </doc>
 <doc>
  <float name="score">0.39613172</float>
  <str name="id">100-435805</str>
 </doc>
</result>
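
A short Python sketch of using maxScore for this. A well-formed version of the response fragment above is embedded as a string, so nothing is fetched from a server; a real response has more wrapping elements, so the element paths here are assumptions tied to this fragment.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the fragment shown above (page 2 of the results,
# so the top-scoring document itself is not on this page).
response = """<result name="response" numFound="3" start="1" maxScore="0.5145902">
 <doc>
  <float name="score">0.39613172</float>
  <str name="id">EN7800GTX/2DHTV/256M</str>
 </doc>
 <doc>
  <float name="score">0.39613172</float>
  <str name="id">100-435805</str>
 </doc>
</result>"""

root = ET.fromstring(response)
max_score = float(root.get("maxScore"))

normalized = {}
for doc in root.findall("doc"):
    doc_id = doc.find("str[@name='id']").text
    score = float(doc.find("float[@name='score']").text)
    normalized[doc_id] = score / max_score  # relative to the best hit overall

print(normalized)
```

Because maxScore refers to the whole result set, the ratios stay consistent from page to page.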


-Hoss