Re: Different solr score between stand alone vs cloud mode solr
Wei:

That is odd. These should be the same, so I'm puzzled too. I'm assuming that you're using the exact same schema on both, with each field having the exact same definitions. And since you say it's the same release of Solr, it's not like some default changed.

Here's an idea (and I'm shooting in the dark here). Copy the index from one place to another and see if what you're seeing is still true. Assuming the schema is the same, you should be able to

1> shut down all your, say, SolrCloud instances.
2> copy the stand-alone index to each of those instances. Verify that there is exactly one segment, since you said it's optimized.
3> start the SolrCloud instances back up.

Are the scores still different? Let's claim they're the same. In that case, use the schema from your stand-alone Solr for SolrCloud, then delete the index and re-index from scratch.

Best,
Erick

On Thu, Jun 7, 2018 at 2:28 PM, Wei wrote:
> Thanks Erick. However our indexes on stand alone and cloud are both static
> -- we indexed them from the same source xmls, optimize and have no updates
> after it is done. Also in cloud there is only one single shard (with
> multiple replicas). I assume distributed stats doesn't have effect in this
> case?
>
> Thanks,
> Wei
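A quick way to answer "are the scores still different?" after the index copy is to compare the id/score pairs returned by both installations for the same query. The sketch below is only an illustration: the function name `diff_scores` and the sample ids and scores are made up, and it assumes you have already collected the results into plain dictionaries.

```python
import math


def diff_scores(standalone, cloud, rel_tol=1e-6):
    """Return {doc_id: (standalone_score, cloud_score)} for every mismatch.

    Both arguments are {doc_id: score} maps for the SAME query.
    A document missing on one side is reported with None for that side.
    """
    differing = {}
    for doc_id in sorted(set(standalone) | set(cloud)):
        a = standalone.get(doc_id)
        b = cloud.get(doc_id)
        if a is None or b is None or not math.isclose(a, b, rel_tol=rel_tol):
            differing[doc_id] = (a, b)
    return differing


# Hypothetical sample data, not taken from the thread.
standalone = {"doc1": 1.50, "doc2": 1.25}
cloud = {"doc1": 1.50, "doc2": 1.31}
print(diff_scores(standalone, cloud))
```

An empty result after copying the index would point at the index contents (segments, deleted documents) rather than at the schema.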
Re: Different solr score between stand alone vs cloud mode solr
Thanks Erick. However our indexes on stand alone and cloud are both static -- we indexed them from the same source xmls, optimize and have no updates after it is done. Also in cloud there is only one single shard (with multiple replicas). I assume distributed stats doesn't have effect in this case?

Thanks,
Wei

On Thu, Jun 7, 2018 at 12:18 PM, Erick Erickson wrote:
> Short form:
>
> As docs are updated, they're marked as deleted until the segment is
> merged. This affects things like term frequency and doc frequency,
> which in turn influence the score.
>
> Due to how commits happen, i.e. autocommit will hit at slightly skewed
> wall-clock time, different segments are merged on different replicas
> of the same shard. Thus the scores can be slightly different.
>
> You can turn on distributed stats which will help with this:
> https://issues.apache.org/jira/browse/SOLR-1632
>
> Best,
> Erick
RE: Different solr score between stand alone vs cloud mode solr
To add on that, keep in mind to disable queryResultCache or distributed stats won't work. And to add on that, I do not think distributed stats will work for a single shard index anyway.

Regards,
Markus

-----Original message-----
> From: Erick Erickson
> Sent: Thursday 7th June 2018 21:19
> To: solr-user
> Subject: Re: Different solr score between stand alone vs cloud mode solr
>
> Short form:
>
> As docs are updated, they're marked as deleted until the segment is
> merged. This affects things like term frequency and doc frequency,
> which in turn influence the score.
>
> Due to how commits happen, i.e. autocommit will hit at slightly skewed
> wall-clock time, different segments are merged on different replicas
> of the same shard. Thus the scores can be slightly different.
>
> You can turn on distributed stats which will help with this:
> https://issues.apache.org/jira/browse/SOLR-1632
>
> Best,
> Erick
Re: Different solr score between stand alone vs cloud mode solr
Also, the score is a fluid number. You shouldn't use the score for any real reason aside from seeing that the documents are in the right order in relation to the scores of the other documents in the result set, or for the occasional condition where two results switch places because they have the same score.

On Thu, Jun 7, 2018 at 3:18 PM, Erick Erickson wrote:
> Short form:
>
> As docs are updated, they're marked as deleted until the segment is
> merged. This affects things like term frequency and doc frequency,
> which in turn influence the score.
>
> Due to how commits happen, i.e. autocommit will hit at slightly skewed
> wall-clock time, different segments are merged on different replicas
> of the same shard. Thus the scores can be slightly different.
>
> You can turn on distributed stats which will help with this:
> https://issues.apache.org/jira/browse/SOLR-1632
>
> Best,
> Erick
Re: Different solr score between stand alone vs cloud mode solr
Short form:

As docs are updated, they're marked as deleted until the segment is merged. This affects things like term frequency and doc frequency, which in turn influence the score.

Due to how commits happen, i.e. autocommit will hit at slightly skewed wall-clock time, different segments are merged on different replicas of the same shard. Thus the scores can be slightly different.

You can turn on distributed stats which will help with this:
https://issues.apache.org/jira/browse/SOLR-1632

Best,
Erick

On Thu, Jun 7, 2018 at 12:07 PM, Wei wrote:
> Hi,
>
> Recently we made an observation that really puzzled us. We have two
> instances of Solr, one in stand-alone mode and one a single-shard
> SolrCloud with a couple of replicas. Both are indexed with the same
> documents and run the same Solr version, 6.6.2. When we issue the same
> query, the Solr scores from stand-alone and cloud are different. How
> could this happen? With the same data, software version and query,
> shouldn't the Solr score be exactly the same regardless of cloud mode
> or not?
>
> Thanks,
> Wei
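To make the effect of not-yet-merged deletes concrete, here is a small sketch of how the idf part of a score moves when two replicas hold the same live documents but a different number of deleted ones. It assumes the BM25 idf formula used by Lucene's BM25Similarity; the document counts are invented for illustration.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25 idf: log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Replica A (hypothetical): fully merged, 1000 live docs, term occurs in 100.
idf_a = bm25_idf(doc_freq=100, doc_count=1000)

# Replica B (hypothetical): same live docs, plus 50 deleted docs that have
# not been merged away yet, 20 of which contain the term. Deleted docs
# still count towards the statistics until their segment is merged.
idf_b = bm25_idf(doc_freq=120, doc_count=1050)

print(idf_a, idf_b)
```

The two values differ even though both replicas return exactly the same live documents, which is the kind of small score drift described above.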
Different solr score between stand alone vs cloud mode solr
Hi,

Recently we made an observation that really puzzled us. We have two instances of Solr, one in stand-alone mode and one a single-shard SolrCloud with a couple of replicas. Both are indexed with the same documents and run the same Solr version, 6.6.2. When we issue the same query, the Solr scores from stand-alone and cloud are different. How could this happen? With the same data, software version and query, shouldn't the Solr score be exactly the same regardless of cloud mode or not?

Thanks,
Wei
Re: SOLR Score Range Changed
On 2/23/2018 2:28 PM, Hodder, Rick wrote:
> Combining everything into one query is what I'd prefer because as you said,
> one would think that with everything in the same query, the score would
> organize everything nicely.

I don't recall writing anything like that. How did you infer that from what I wrote?

One thing that you can infer from what I said is that comparing scores from multiple queries is not going to do what you think it will do. Which leads into the next thing I'll quote from your message:

> So the way we had addressed it was running 3 separate SOLR queries and
> combining them and sorting them by descending score - wasn’t perfect, but it
> worked, and helped me to reduce the number of results we hand off to a
> scoring engine that applies 3 algorithms (Monge-Elkan, Jaro-Winkler, and
> SmithWindowed Affline) to further hone the results - which can take LOTS of
> time if there are a lot of results, so

It seems that you didn't finish your sentence, and may not have even finished the message, as this was the last thing you wrote.

Running three separate queries and then trying to combine them based on score is not something you should ever attempt, because as I mentioned before, the absolute score of a document in a result is only meaningful for that specific query done at that moment. Even the same query done later after something has changed might have a very different score range.

Thanks,
Shawn
RE: SOLR Score Range Changed
Classic Similarity helped, but the ranges of values don’t have a min near 0 like back in 4's version.

Are there other attributes/elements to this factory that could get me back the old functionality?

-----Original Message-----
From: Joël Trigalo [mailto:jtrig...@gmail.com]
Sent: Friday, February 23, 2018 10:41 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR Score Range Changed

The difference seems to be due to the fact that the default similarity in Solr 7 is BM25, while it used to be TF-IDF in Solr 4. As you realised, the BM25 function is smoother. You can configure schema.xml to use ClassicSimilarity; see for instance
https://lucene.apache.org/solr/guide/6_6/major-changes-from-solr-5-to-solr-6.html#default-similarity-changes
https://lucene.apache.org/solr/guide/6_6/field-type-definitions-and-properties.html#FieldTypeDefinitionsandProperties-FieldTypeSimilarity

But as said before, you may be relying on properties that are not guaranteed, so it would be better to change the score function or the sorting (rather than going back to ClassicSimilarity).
RE: SOLR Score Range Changed
Hi Shawn,

Thanks for your help - I'm still finding my way in the weeds of SOLR.

Combining everything into one query is what I'd prefer because as you said, one would think that with everything in the same query, the score would organize everything nicely.

>> Assuming you're using the default relevancy sort

Yes

>> does the order of your search results change dramatically from one version
>> to the other? If it does, is the order generally better from a relevance
>> standpoint, or generally worse? If you are specifying an explicit sort,
>> then the scores will likely be ignored.

Here's what we do - we have a list of policies with names (among other things, but I'll just use names for an example). We search for several business names to see if we have policies in common with the names so that we don’t have too much risk with them.

So let's say I'm doing a search against three business names:

Bob's carpentry
Consolidated carpentry of the Greater North West
Carpentry Land

q=(IDX_CompanyName:bob's AND carpentry) OR (IDX_CompanyName: consolidated AND carpentry AND of AND the AND Greater AND North AND West) OR (IDX_CompanyName: Carpentry AND Land)

Searching for 750 rows has hits that are all focused on Consolidated (seemingly because the number of words causes the SOLR score to go up into a higher range for all Consolidated results, as mentioned in my previous email).

Searching for all 3 things at the same time doesn’t ensure that all 3 companies will be in the results, even though when run separately there are results for all 3. If I boost maxrows to 4000, I see a few bob's carpentry but most are still Consolidated.

So the way we had addressed it was running 3 separate SOLR queries and combining them and sorting them by descending score - wasn’t perfect, but it worked, and helped me to reduce the number of results we hand off to a scoring engine that applies 3 algorithms (Monge-Elkan, Jaro-Winkler, and SmithWindowed Affline) to further hone the results - which can take LOTS of time if there are a lot of results, so

>> What I am describing is also why it's strongly recommended that you never
>> try to convert scores to percentages:
>>
>> https://wiki.apache.org/lucene-java/ScoresAsPercentages
>>
>> Thanks,
>> Shawn
Re: SOLR Score Range Changed
The difference seems to be due to the fact that the default similarity in Solr 7 is BM25, while it used to be TF-IDF in Solr 4. As you realised, the BM25 function is smoother. You can configure schema.xml to use ClassicSimilarity; see for instance
https://lucene.apache.org/solr/guide/6_6/major-changes-from-solr-5-to-solr-6.html#default-similarity-changes
https://lucene.apache.org/solr/guide/6_6/field-type-definitions-and-properties.html#FieldTypeDefinitionsandProperties-FieldTypeSimilarity

But as said before, you may be relying on properties that are not guaranteed, so it would be better to change the score function or the sorting (rather than going back to ClassicSimilarity).

2018-02-22 18:39 GMT+01:00 Shawn Heisey:
> On 2/22/2018 9:50 AM, Hodder, Rick wrote:
>
>> I am migrating from SOLR 4.10.2 to SOLR 7.1.
>>
>> All seems to be going well, except for one thing: the score that is
>> coming back for the resulting documents is giving different scores.
>
> The absolute score has no meaning when you change something -- the index,
> the query, the software version, etc. You can't compare absolute scores.
>
> What matters is the relative score of one document to another *in the same
> query*. The amount of difference is almost irrelevant -- the goal of
> Lucene's score calculation gymnastics is to have one document score higher
> than another, so the *order* is reasonably correct.
>
> Assuming you're using the default relevancy sort, does the order of your
> search results change dramatically from one version to the other? If it
> does, is the order generally better from a relevance standpoint, or
> generally worse? If you are specifying an explicit sort, then the scores
> will likely be ignored.
>
> What I am describing is also why it's strongly recommended that you never
> try to convert scores to percentages:
>
> https://wiki.apache.org/lucene-java/ScoresAsPercentages
>
> Thanks,
> Shawn
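One ingredient of the "smoother" behaviour mentioned above can be shown with a few lines of arithmetic. The sketch below compares only the term-frequency factor: the square root used by ClassicSimilarity against the textbook BM25 factor, which saturates. It deliberately ignores idf, length norms and the other factors of the real scoring formulas, and the parameter defaults (k1=1.2, b=0.75) and document lengths are assumptions chosen for illustration.

```python
import math


def classic_tf(freq):
    """Term-frequency factor of classic TF-IDF: sqrt(freq), unbounded."""
    return math.sqrt(freq)


def bm25_tf(freq, k1=1.2, b=0.75, doc_len=10, avg_doc_len=10):
    """Textbook BM25 term-frequency factor, bounded above by k1 + 1."""
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return freq * (k1 + 1) / (freq + norm)


for freq in (1, 2, 4, 16, 64):
    print(freq, round(classic_tf(freq), 3), round(bm25_tf(freq), 3))
```

The classic factor keeps growing with the term frequency while the BM25 factor flattens out, so per-term contributions in BM25 sit in a narrower band. This is only part of the picture and does not by itself reproduce the score ranges reported in this thread.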
Re: SOLR Score Range Changed
On 2/22/2018 9:50 AM, Hodder, Rick wrote:
> I am migrating from SOLR 4.10.2 to SOLR 7.1.
>
> All seems to be going well, except for one thing: the score that is
> coming back for the resulting documents is giving different scores.

The absolute score has no meaning when you change something -- the index, the query, the software version, etc. You can't compare absolute scores.

What matters is the relative score of one document to another *in the same query*. The amount of difference is almost irrelevant -- the goal of Lucene's score calculation gymnastics is to have one document score higher than another, so the *order* is reasonably correct.

Assuming you're using the default relevancy sort, does the order of your search results change dramatically from one version to the other? If it does, is the order generally better from a relevance standpoint, or generally worse? If you are specifying an explicit sort, then the scores will likely be ignored.

What I am describing is also why it's strongly recommended that you never try to convert scores to percentages:

https://wiki.apache.org/lucene-java/ScoresAsPercentages

Thanks,
Shawn
SOLR Score Range Changed
I am migrating from SOLR 4.10.2 to SOLR 7.1.

All seems to be going well, except for one thing: the scores coming back for the resulting documents are different.

The core uses a schema. Here's the schema info for the field that I am searching on:

When searching maxrows=750, fields: *,score

IDX_Company:(cat and scratch)
SOLR 7.1: max score 6.95 and a min of 6.28
SOLR 4.10.2: max score 8.63 and a min of 0.91

IDX_InsuredName:(cat and scratch and fever)
SOLR 7.1: max score of 12.99 and a min of 11.25
SOLR 4.10.2: max 3.97 and min of 0.77

See how the range of values is different (ranges in 7.1 don't go down to 0.x). Also notice that the max score doubles when I add one word to the search terms in 7.1. Most important, the ranges in 4.10.2 overlap - but the 7.1 ranges don't.

A little more information to show you how I use this information, and why this is causing a problem.

I get a company name like "bobs cabinetry" and another "all american tech enterprise". I run two SOLR queries per company name; I'll call them 1-AND, 1-OR, 2-AND, 2-OR.

IDX_Company:(bobs AND cabinetry) =*,score,requestid:"1-AND"
IDX_Company:(bobs OR cabinetry) =*,score,requestid:"1-OR"
IDX_Company:(all AND american AND tech AND enterprise) =*,score,requestid:"2-AND"
IDX_Company:(all OR american OR tech OR enterprise) =*,score,requestid:"2-OR"

I combine the results together, sort by descending score, and then take the top 750 rows. (The requestid lets me know which query the results came from.)

Because of the changes in the range of scores, the sort pushes all of the all american tech enterprise rows to the top of the results (because of no overlap), and when the top 750 are taken everything for bobs cabinetry is removed from the results.

Is there some config setting I can change to make score calculation act like it did in 4.10.2? Or something else?
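Since the replies in this thread point out that scores from different queries cannot be compared, one possible way to build the combined list without a global score sort is to interleave the per-query result lists, so that every query keeps a share of the top rows. This is not something proposed in the thread; the function `round_robin_merge`, the input shape and the sample ids below are all hypothetical, and each list is assumed to be already ordered by Solr for its own query.

```python
def round_robin_merge(results_by_query, total_rows):
    """Interleave per-query result lists instead of sorting by raw score.

    results_by_query: {request_id: [doc, ...]}, each list already in the
    order Solr returned it. Each doc is a dict with at least an "id" key.
    Returns a list of (request_id, doc_id) with duplicate ids skipped.
    """
    merged, seen = [], set()
    iterators = {rid: iter(docs) for rid, docs in results_by_query.items()}
    while iterators and len(merged) < total_rows:
        for rid in list(iterators):
            doc = next(iterators[rid], None)
            if doc is None:
                # This query has no more results; stop visiting it.
                del iterators[rid]
                continue
            if doc["id"] not in seen:
                seen.add(doc["id"])
                merged.append((rid, doc["id"]))
                if len(merged) == total_rows:
                    break
    return merged


# Hypothetical results for two of the queries described above.
results = {
    "1-AND": [{"id": "a"}, {"id": "b"}],
    "2-AND": [{"id": "x"}, {"id": "y"}, {"id": "z"}],
}
print(round_robin_merge(results, 4))
```

With this kind of merge the relative order inside each query is preserved, and a query with a higher score range cannot crowd the others out of the top rows.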
Re: Solr score use cases
I would like to stress how important what Erick explained is.
A lot of times people want to use the score to show it to the users, calculate a probability, or do weird calculations.

The score is used to rank results, given a query.
To give a local ordering.
This is the only useful information for the end user.

From an administrator/developer perspective it is different: debugging the score could be vital, mainly for relevancy tuning and understanding ranking bugs.

---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Solr score use cases
Thx for the clarification

Best regards

On 01.12.2017 at 18:25, "Erick Erickson" <erickerick...@gmail.com> wrote:
> Sorting certainly ignores scoring, I'm pretty sure it's just not
> calculated in that case.
>
> If your sorting results in multiple documents in the same bin, people
> will combine the primary sort with a secondary sort on score, so in
> that case the score is definitely calculated, ie "=day asc, score
> desc"
>
> Returning the score with documents is usually for development
> purposes. Scores are _not_ comparable except within a single query, so
> IMO telling users that a doc from one search has a score of X and a
> doc from another search has a score of Y is useless-to-misleading
> information. A score of 2X is _not_ necessarily "twice as good" (or
> even as good) as a score of X in another search.
>
> FWIW,
> Erick
Re: Solr score use cases
Sorting certainly ignores scoring, I'm pretty sure it's just not calculated in that case.

If your sorting results in multiple documents in the same bin, people will combine the primary sort with a secondary sort on score, so in that case the score is definitely calculated, ie "=day asc, score desc"

Returning the score with documents is usually for development purposes. Scores are _not_ comparable except within a single query, so IMO telling users that a doc from one search has a score of X and a doc from another search has a score of Y is useless-to-misleading information. A score of 2X is _not_ necessarily "twice as good" (or even as good) as a score of X in another search.

FWIW,
Erick

On Fri, Dec 1, 2017 at 6:31 AM, Faraz Fallahi <faraz.fall...@googlemail.com> wrote:
> Or does the score even get calculated when I sort, or not?
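The primary-sort-plus-score tiebreak that Erick describes can be written as ordinary request parameters. The sketch below only builds the query string; the field name `day` comes from his example, while the query `title:solr` and the returned fields are assumptions made up for illustration.

```python
from urllib.parse import urlencode

params = {
    "q": "title:solr",              # hypothetical query
    "sort": "day asc, score desc",  # primary sort on a field, ties broken by score
    "fl": "id,day,score",           # return the score alongside each document
}
query_string = urlencode(params)
print(query_string)
```

The resulting string would be appended to a core's select handler URL; documents that share the same `day` value are then ordered by relevance among themselves.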
Re: Solr score use cases
Or does the score even get calculated when I sort, or not?

On 01.12.2017 at 4:38 PM, "Faraz Fallahi" <faraz.fall...@googlemail.com> wrote:
> OK, but if I'd just make a simple query with a "where clause" and sort by
> a field, I see no sense in calculating a score, right?
Re: Solr score use cases
OK, but if I'd just make a simple query with a "where clause" and sort by a field, I see no sense in calculating a score, right?

On 01.12.2017 at 16:33, "Aman Tandon" <amantandon...@gmail.com> wrote:
> Hi Faraz,
>
> The Solr score, which you can retrieve by adding it to the fl parameter,
> can be helpful for understanding the following:
>
> 1) Search relevance ranking: how much score Solr has given to the top and
> second-top documents; with debug=true you can better understand what
> is causing that score.
>
> 2) You can use a function query to multiply the score by some feature,
> e.g. paid-customer score, popularity score, etc., to improve relevance
> as per the business.
>
> These are the only points I can think of; someone else can shed more
> light if I am missing anything. I hope this is what you wanted to know.
>
> Regards,
> Aman
Re: Solr score use cases
Hi Faraz,

The Solr score, which you can retrieve by adding it to the fl parameter, can be helpful for understanding the following:

1) Search relevance ranking: how much score Solr has given to the top and second-top documents; with debug=true you can better understand what is causing that score.

2) You can use a function query to multiply the score by some feature, e.g. paid-customer score, popularity score, etc., to improve relevance as per the business.

These are the only points I can think of; someone else can shed more light if I am missing anything. I hope this is what you wanted to know.

Regards,
Aman

On Dec 1, 2017 13:38, "Faraz Fallahi" <faraz.fall...@googlemail.com> wrote:
> Hi
>
> A simple question: what are the most common use cases for the Solr score of
> documents retrieved after firing queries?
> I don't have a real understanding of its purpose at the moment.
>
> Thx for helping
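Both of Aman's points can be sketched as request parameters: returning the score with debug output, and multiplying the relevance score by a function of a field. The example assumes the edismax query parser and its multiplicative `boost` parameter; the helper name `build_boosted_query`, the query text and the `popularity` field are hypothetical.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, boost_function):
    """Build a query string that multiplies the score by a function query."""
    return urlencode({
        "q": user_query,
        "defType": "edismax",     # parser that supports the boost parameter
        "boost": boost_function,  # multiplied into the relevance score
        "fl": "*,score",          # return the score with each document
        "debug": "true",          # include the score explanation
    })


# "popularity" is an assumed numeric field, not one from the thread.
query_string = build_boosted_query("solar panels", "log(sum(popularity,1))")
print(query_string)
```

The debug section of the response then shows how the base score and the boost function combine for each document, which is the development use of the score discussed in this thread.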
Solr score use cases
Hi

A simple question: what are the most common use cases for the Solr score of documents retrieved after firing queries?
I don't have a real understanding of its purpose at the moment.

Thx for helping
Re: Modify solr score
We came up with a simple solution. We use termfreq <https://wiki.apache.org/solr/FunctionQuery#termfreq> and wrote a simple processor that counts words, so that a boost function can calculate the ratio between the words that hit terms and the whole field length.

Some tests are being run; maybe it could solve the problem.

Thanks for your help!

--
View this message in context: http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331614.html
Sent from the Solr - User mailing list archive at Nabble.com.
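The ratio described above (words that hit vocabulary terms divided by the whole field length) can be illustrated outside Solr with a few lines. This is a plain sketch of the arithmetic only, not the poster's processor: the function name `hit_ratio`, the whitespace tokenization and the sample text are assumptions.

```python
def hit_ratio(field_tokens, vocabulary):
    """Fraction of tokens in a field that belong to the topic vocabulary."""
    if not field_tokens:
        return 0.0
    hits = sum(1 for token in field_tokens if token.lower() in vocabulary)
    return hits / len(field_tokens)


# Hypothetical topic vocabulary and document text.
vocabulary = {"solr", "score"}
tokens = "Solr ranks documents by score".split()
print(hit_ratio(tokens, vocabulary))
```

A ratio like this is a property of a single document, so unlike the raw score it does not change when other documents are added to or removed from the index.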
Re: Modify solr score
This may be suggesting a solution that is too experimental or using the wrong hammer for the job, but to me it sounds like you could use “payloads” for this type of ranking of terms relationship to a document. See SOLR-1485 for the recent work I’ve been doing (and aim to get committed soon).

You could index documents in this way:

id, weighted_terms_dpf
1, A|5.0 B|95.0
2, A|88.7 B|0.1

And then search for “A” and use the 88.7 value to factor into the score or sorting.

Erik

> On Apr 21, 2017, at 12:35 PM, tstusr <ulfrhe...@gmail.com> wrote:
>
> Since we report the score, we think there will be some relation between them.
> As far as we know scoring (and then ranking) are calculated based on tf-idf.
>
> What we want to do is to make a qualitative ranking, it means, according to
> one topic we will tag documents as "very related", "fairly related" or "poor
> related". So, we select some documents completely unrelated to a topic.
>
> On a very related document we found a ratio of ~2% of words that reports
> ~0.85 of score (what we think is related to ranking). On a test document we
> found a ratio of less than 0.01% and the score is higher than the first
> one. What we expect is that documents not related (those ones with less
> ratio) report lower scores so we can then use them as minimum and create the
> scale.
>
> We came with multiply (or affect in some way) the default rank solr provide
> us with the ratio of documents so unrelated documents will be penalized
> while those with higher ratio values will be overrated.
>
> Greetings, and thanks for your help.
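To make the `term|weight` layout in Erik's example easier to picture, here is a small sketch that parses such a field value and looks up the weight of a term. It only illustrates the data format from the message; it is not how Solr stores or reads payloads, and the function name `parse_payloads` is made up.

```python
def parse_payloads(field_value):
    """Parse 'A|5.0 B|95.0' into {'A': 5.0, 'B': 95.0}."""
    weights = {}
    for token in field_value.split():
        term, _, weight = token.partition("|")
        weights[term] = float(weight)
    return weights


# The two documents from the example above.
docs = {
    "1": parse_payloads("A|5.0 B|95.0"),
    "2": parse_payloads("A|88.7 B|0.1"),
}

# Rank documents for the term "A" by its per-document weight.
ranked = sorted(docs, key=lambda doc_id: docs[doc_id].get("A", 0.0), reverse=True)
print(ranked)
```

Document 2 comes first for "A" because its weight for that term is larger, which is the per-document relationship the payload approach is meant to capture.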
Re: Modify solr score
Ulf:
Maybe there is a way you could filter out the unrelated documents. Qf?
Rick

On April 21, 2017 2:18:59 PM EDT, tstusr <ulfrhe...@gmail.com> wrote:
>Well, I know they can change.
>
>I think the main problem here is that (at this point) documents completely
>unrelated to a topic are being ranked as high as related documents. So, in
>order to penalize them, we are trying to use the ratio of term
>frequency/word length.
>
>Nevertheless we aren't able to find a practical way to do it.
>
>Greetings.

--
Sorry for being brief. Alternate email is rickleir at yahoo dot com
Re: Modify solr score
Well, I know they can change.

I think the main problem here is that (at this point) documents completely unrelated to a topic are being ranked as high as related documents. So, in order to penalize them, we are trying to use the ratio of term frequency/word length.

Nevertheless we aren't able to find a practical way to do it.

Greetings.

--
View this message in context: http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331342.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Modify solr score
Using a minimum score cut off does not work. The score is not an absolute estimate of relevance.

The idf component of the score is a whole-corpus metric. When you add or delete documents, the scores for the exact same query can change.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Apr 21, 2017, at 10:18 AM, tstusr <ulfrhe...@gmail.com> wrote:
>
> Well, maybe I explained it wrong.
>
> We have entry points, each of them related to a topic. It means that when
> we select the first topic, all information has to be related in some way to
> this vocabulary. So it can work, since we select documents not related to
> the vocabulary of each entry point in order to establish a minimum
> threshold; to that end, we are trying to use the hit ratio to modify the
> score.
>
> After we rank on those topics, all the work after that is about faceting,
> word selection and so on.
>
> Greetings
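Walter's point that idf is a whole-corpus metric can be checked with a little arithmetic. The sketch assumes the classic TF-IDF idf form, 1 + ln((docCount + 1) / (docFreq + 1)), as used by Lucene's ClassicSimilarity; the corpus sizes and the cutoff value are invented.

```python
import math


def classic_idf(doc_freq, doc_count):
    """Classic TF-IDF idf: 1 + ln((docCount + 1) / (docFreq + 1))."""
    return 1.0 + math.log((doc_count + 1) / (doc_freq + 1))


# Hypothetical corpus: the term occurs in 10 documents both times,
# but 1000 unrelated documents are added in between.
idf_before = classic_idf(doc_freq=10, doc_count=1000)
idf_after = classic_idf(doc_freq=10, doc_count=2000)

cutoff = 6.0  # an arbitrary "minimum score" style threshold
print(idf_before, idf_after, idf_before >= cutoff, idf_after >= cutoff)
```

The matching documents did not change at all, yet the value moves from below the cutoff to above it, which is why a fixed minimum score does not behave like a stable relevance threshold.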
Re: Modify solr score
Well, maybe I explained it wrong. We have entry points, each of which is related to a topic. It means that when we select the first topic, all information has to be related in some way to this vocabulary. So it can work, since we select documents not related to the vocabulary of each entry point in order to establish a minimum threshold; to that end, we are trying to use the hit ratio to modify the score. After we rank on those topics, all the remaining work is about faceting, word selection and so on. Greetings -- View this message in context: http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331331.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Modify solr score
It isn’t going to work. The score is not an absolute relevance measurement. It only says that the first document is more relevant than the second, and so on. Scores are not comparable between different queries. The score cannot be used to say that the first hit for query A is a better match than the first hit for query B. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Apr 21, 2017, at 9:35 AM, tstusr <ulfrhe...@gmail.com> wrote: > > Since we report the score, we think there will be some relation between them. > As far as we know scoring (and then ranking) are calculated based on tf-idf. > > What we want to do is to make a qualitative ranking, it means, according to > one topic we will tag documents as "very related", "fairly related" or "poor > related". So, we select some documents completely unrelated to a topic. > > On a very related document we found a ratio of ~2% of words that reports > ~0.85 of score (what we think is related to ranking). On a test document we > found a ratio of less than 0.01% and the score is heigher than the first > one. What we expect is that documents not related (those ones with less > ratio) report lower scores so we can then use them as minimum and create the > scale. > > We came with multiply (of affect in some way) the default rank solr provide > us with the ratio of documents so unrelated documents will be penalized > while those with higher ratio values will be overrated. > > Greetings, and thanks for your help. > > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331315.html > Sent from the Solr - User mailing list archive at Nabble.com.
Re: Modify solr score
Since we report the score, we think there should be some relation between them. As far as we know, scoring (and then ranking) is calculated based on tf-idf. What we want to do is produce a qualitative ranking; that is, for a given topic we will tag documents as "very related", "fairly related" or "poorly related". So, we selected some documents completely unrelated to a topic. On a very related document we found a ratio of ~2% of words, which yields a score of ~0.85 (which we think is related to ranking). On a test document we found a ratio of less than 0.01%, and the score is higher than the first one. What we expect is that unrelated documents (those with a lower ratio) report lower scores, so we can then use them as a minimum and build the scale. We came up with the idea of multiplying (or adjusting in some way) the default rank Solr provides by each document's ratio, so that unrelated documents will be penalized while those with higher ratio values will be boosted. Greetings, and thanks for your help. -- View this message in context: http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331315.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Modify solr score
It has been discussed countless times: never rely on score values. Rely on the ranking of your results. It seems you model a topic as a list of keywords and then you just run a query for each topic. Essentially, for you, a topic is a query. The ranking of your results will already be affected by how many times (term frequency) such keywords appear in the results. You can even try different query parsers (such as dismax/edismax) and tune the mm percentage to establish how strict you want your results to be in relation to the input query [1]. Can you elaborate on the way you would like to customize the score? Which factor would you like to modify? Cheers [1] https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- View this message in context: http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331310.html Sent from the Solr - User mailing list archive at Nabble.com.
Modify solr score
Hi. We are building an application that searches for certain specific topics: the more captured words in a document, the higher the score. We have 2 testing scenarios. The first one uses documents that users tag as relevant, and the other one contains documents outside our domain. In the first scenario, we report ratios of 1-2% of captured terms against all document words. For the second scenario, we report ratios of less than 0.005%. Nevertheless, scores remain almost equal: ~0.85 for the first scenario and ~0.8 for the latter. So what we want is to decrease, in some way, the score we report for the latter scenario according to the percentage of words captured. Is there any way to store those values in a field in order to use them as a query boost? Or any way to override the default score calculation to change relevancy? Thanks in advance... -- View this message in context: http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300.html Sent from the Solr - User mailing list archive at Nabble.com.
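One way the "store the ratio in a field and use it as a boost" idea could look is sketched below. This is only an illustration under assumptions: the field name captured_ratio is hypothetical, the ratio is assumed to be computed at index time, and the request relies on the edismax boost parameter, which multiplies a function's value into the score. It changes the ordering of results; it does not make scores comparable across queries.

```python
from urllib.parse import urlencode


def captured_ratio(captured_terms: int, total_words: int) -> float:
    """Fraction of a document's words that matched the topic vocabulary."""
    if total_words <= 0:
        return 0.0
    return captured_terms / total_words


def build_boosted_params(user_query: str, ratio_field: str = "captured_ratio") -> dict:
    """Request parameters for an edismax query boosted by a stored ratio.

    `ratio_field` is a hypothetical numeric field filled in at index time.
    """
    return {
        "q": user_query,
        "defType": "edismax",
        # edismax multiplies the value of this function into each score
        "boost": ratio_field,
        "fl": "id,score",
    }


# A document with 20 captured terms out of 1000 words has a 2% ratio.
example_ratio = captured_ratio(20, 1000)
query_string = urlencode(build_boosted_params("topic keywords"))
```

A document whose ratio is 0 would end up with a score of 0 under this scheme, so a small floor on the stored value may be worth considering.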
RE: Solr debug 'explain' values differ from the Solr score
I still have the problem even without using the phonetic field. For example, the following query will result in some exact name matches having scores of 4.64, while others get 2.32. All debug info has final values of 4.64. =( ( (firstName:john~)^0.5 (firstName:john) )^4) I expect all exact matches to score the same, as the debug response seems to indicate they should be. I'm not having success reproducing the issue on a small amount of exported data indexed using post.jar. The issue still appears when I reduced the data pulled by the DIH to only the first 1,000,000 first names, however. Could this be due to some indexing issue with the DIH? Thanks, -Rick > Date: Tue, 15 Mar 2016 15:40:18 -0700 > From: hossman_luc...@fucit.org > To: solr-user@lucene.apache.org > Subject: RE: Solr debug 'explain' values differ from the Solr score > > > Sounds like a mismatch in the way the BooleanQuery explanation generation > code is handling situations where there is/isn't a coord factor involved > in computing the score itself. (the bug is almost certainly in the > "explain" code, since that is less rigorously tested in most cases, and > the score itself is probably correct) > > I tried to trivially reproduce the symptoms you described using the > techproducts example and was unable to generate a discrepency using a > simple boolean query w/a fuzzy clause... > > http://localhost:8983/solr/techproducts/query?q=ipod~%20belkin=id,name,score=query=results=true > > ...can you distill one of your problematic queries down to a > shorter/simpler reproducible example, and/or provide us with the field & > fieldType details for all of the fields used in your example? > > (i'm guessing it probably relates to your firstName_phonetic field?) 
> > > > : Date: Tue, 15 Mar 2016 13:17:04 -0700 > : From: Rick Sullivan <r...@ricksullivan.net> > : Reply-To: solr-user@lucene.apache.org > : To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org> > : Subject: RE: Solr debug 'explain' values differ from the Solr score > : > : After some digging and experimentation, here are some more details on the > issue I'm seeing. > : > : > : 1. The adjusted documents' scores are always exactly (debug_score/N), where > N is the number of OR items in the query. > : > : For example, `=firstName:gabby~ firstName_phonetic:gabby > firstName_tokens:(gabby)` will result in some of the documents with > firstName==GABBY receiving a score 1/3 of the score of other GABBY documents, > even though the debug explanation shows that they generated the same score. > : > : > : 2. This doesn't appear to be a brand new issue, or an issue with SolrCloud. > : > : I've tested the problem using SolrCloud 5.5.0, Solr 5.5.0 (not cloud), and > Solr 5.4.1. > : > : > : Anyone have any ideas? > : > : Thanks, > : -Rick > : > : From: r...@ricksullivan.net > : To: solr-user@lucene.apache.org > : Subject: Solr debug 'explain' values differ from the Solr score > : Date: Thu, 10 Mar 2016 08:34:30 -0800 > : > : Hi, > : > : I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the > debug response don't always correspond with the scores Solr assigns to the > matched documents. > : > : For example, here is the top-level debug information for two documents > matched by a query: > : > : 114628: Object > : description: "sum of:" > : details: Array[2] > : match: true > : value: 20.542768 > : > : 357547: Object > : description: "sum of:" > : details: Array[2] > : match: true > : value: 26.517654 > : > : But they have scores > : > : 114628: 20.542767 > : 357547: 13.258826 > : > : I expect the second document to be the most relevant for my query, and the > debug values seem to agree. 
However, in the final score I receive, that > document's score has been adjusted down. > : > : The relevant debug response information can be found here: > http://apaste.info/mju > : > : Does anyone have an idea why the Solr score may differ from the debug value? > : > : Thanks, > : -Rick > > -Hoss > http://www.lucidworks.com/
RE: Solr debug 'explain' values differ from the Solr score
Sounds like a mismatch in the way the BooleanQuery explanation generation code is handling situations where there is/isn't a coord factor involved in computing the score itself. (the bug is almost certainly in the "explain" code, since that is less rigorously tested in most cases, and the score itself is probably correct) I tried to trivially reproduce the symptoms you described using the techproducts example and was unable to generate a discrepency using a simple boolean query w/a fuzzy clause... http://localhost:8983/solr/techproducts/query?q=ipod~%20belkin=id,name,score=query=results=true ...can you distill one of your problematic queries down to a shorter/simpler reproducible example, and/or provide us with the field & fieldType details for all of the fields used in your example? (i'm guessing it probably relates to your firstName_phonetic field?) : Date: Tue, 15 Mar 2016 13:17:04 -0700 : From: Rick Sullivan <r...@ricksullivan.net> : Reply-To: solr-user@lucene.apache.org : To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org> : Subject: RE: Solr debug 'explain' values differ from the Solr score : : After some digging and experimentation, here are some more details on the issue I'm seeing. : : : 1. The adjusted documents' scores are always exactly (debug_score/N), where N is the number of OR items in the query. : : For example, `=firstName:gabby~ firstName_phonetic:gabby firstName_tokens:(gabby)` will result in some of the documents with firstName==GABBY receiving a score 1/3 of the score of other GABBY documents, even though the debug explanation shows that they generated the same score. : : : 2. This doesn't appear to be a brand new issue, or an issue with SolrCloud. : : I've tested the problem using SolrCloud 5.5.0, Solr 5.5.0 (not cloud), and Solr 5.4.1. : : : Anyone have any ideas? 
: : Thanks, : -Rick : : From: r...@ricksullivan.net : To: solr-user@lucene.apache.org : Subject: Solr debug 'explain' values differ from the Solr score : Date: Thu, 10 Mar 2016 08:34:30 -0800 : : Hi, : : I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the debug response don't always correspond with the scores Solr assigns to the matched documents. : : For example, here is the top-level debug information for two documents matched by a query: : : 114628: Object : description: "sum of:" : details: Array[2] : match: true : value: 20.542768 : : 357547: Object : description: "sum of:" : details: Array[2] : match: true : value: 26.517654 : : But they have scores : : 114628: 20.542767 : 357547: 13.258826 : : I expect the second document to be the most relevant for my query, and the debug values seem to agree. However, in the final score I receive, that document's score has been adjusted down. : : The relevant debug response information can be found here: http://apaste.info/mju : : Does anyone have an idea why the Solr score may differ from the debug value? : : Thanks, : -Rick -Hoss http://www.lucidworks.com/
RE: Solr debug 'explain' values differ from the Solr score
After some digging and experimentation, here are some more details on the issue I'm seeing. 1. The adjusted documents' scores are always exactly (debug_score/N), where N is the number of OR items in the query. For example, `=firstName:gabby~ firstName_phonetic:gabby firstName_tokens:(gabby)` will result in some of the documents with firstName==GABBY receiving a score 1/3 of the score of other GABBY documents, even though the debug explanation shows that they generated the same score. 2. This doesn't appear to be a brand new issue, or an issue with SolrCloud. I've tested the problem using SolrCloud 5.5.0, Solr 5.5.0 (not cloud), and Solr 5.4.1. Anyone have any ideas? Thanks, -Rick From: r...@ricksullivan.net To: solr-user@lucene.apache.org Subject: Solr debug 'explain' values differ from the Solr score Date: Thu, 10 Mar 2016 08:34:30 -0800 Hi, I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the debug response don't always correspond with the scores Solr assigns to the matched documents. For example, here is the top-level debug information for two documents matched by a query: 114628: Object description: "sum of:" details: Array[2] match: true value: 20.542768 357547: Object description: "sum of:" details: Array[2] match: true value: 26.517654 But they have scores 114628: 20.542767 357547: 13.258826 I expect the second document to be the most relevant for my query, and the debug values seem to agree. However, in the final score I receive, that document's score has been adjusted down. The relevant debug response information can be found here: http://apaste.info/mju Does anyone have an idea why the Solr score may differ from the debug value? Thanks, -Rick
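The debug_score/N observation can be checked against the numbers quoted from the first message. The sketch below only does the arithmetic on those four values; the helper names are made up for illustration and nothing here calls Solr.

```python
# Values copied from the thread: top-level debug "value" vs. returned score.
debug_values = {"114628": 20.542768, "357547": 26.517654}
returned_scores = {"114628": 20.542767, "357547": 13.258826}


def implied_divisor(debug_value: float, score: float) -> int:
    """Nearest integer N such that score is roughly debug_value / N."""
    return round(debug_value / score)


def matches_divisor(debug_value: float, score: float, n: int,
                    tolerance: float = 1e-5) -> bool:
    """True if score * n reproduces the debug value within a small tolerance."""
    return abs(debug_value - score * n) <= tolerance


for doc_id, debug_value in debug_values.items():
    score = returned_scores[doc_id]
    n = implied_divisor(debug_value, score)
    print(doc_id, n, matches_divisor(debug_value, score, n))
```

For these two documents the implied divisors are 1 and 2, which is consistent with the adjusted score being the debug value divided by a small integer.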
Solr debug 'explain' values differ from the Solr score
Hi, I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the debug response don't always correspond with the scores Solr assigns to the matched documents. For example, here is the top-level debug information for two documents matched by a query:

114628: Object
  description: "sum of:"
  details: Array[2]
  match: true
  value: 20.542768

357547: Object
  description: "sum of:"
  details: Array[2]
  match: true
  value: 26.517654

But they have scores

114628: 20.542767
357547: 13.258826

I expect the second document to be the most relevant for my query, and the debug values seem to agree. However, in the final score I receive, that document's score has been adjusted down. The relevant debug response information can be found here: http://apaste.info/mju Does anyone have an idea why the Solr score may differ from the debug value? Thanks, -Rick
Re: solr score threashold
The ScoresAsPercentages page is not really instructions for how to normalize scores. It is an explanation of why a score threshold does not do what you want. Don’t use thresholds. If you want thresholds, you will need a search engine with a probabilistic model, like Verity K2. Those generally give worse results than a vector space model, but you can have thresholds. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jan 20, 2016, at 5:11 AM, Emir Arnautovic <emir.arnauto...@sematext.com> > wrote: > > Hi Sara, > You can use funct and frange to achive needed, but note that scores are not > normalized meaning score 8 does not mean it is good match - it is just best > match. There are examples online how to normalize score (e.g. > http://wiki.apache.org/lucene-java/ScoresAsPercentages). > Other approach is to write custom component that will filter out docs below > some threshold. > > Thanks, > Emir > > On 20.01.2016 13:58, sara hajili wrote: >> hi all, >> i wanna to know about solr search relevency scoreing threashold. >> can i change it? >> i mean immagine when i searching i get this result >> doc1 score =8 >> doc2 score =6.4 >> doc3 score=6 >> doc8score=5.5 >> doc5 score=2 >> i wana to change solr score threashold .in this way i set threashold for >> example >4 >> and then i didn't get doc5 as result.can i do this?if yes how? >> and if not how i can modified search to don't get docs as a result that >> these docs have a lot distance from doc with max score? >> in other word i wanna to delete this gap between solr results >> > > -- > Monitoring * Alerting * Anomaly Detection * Centralized Log Management > Solr & Elasticsearch Support * http://sematext.com/ >
Re: solr score threashold
What problem are you trying to solve? If you're trying to cut out "bad" results, I might suggest explicitly using filters that eliminate undesirable search items in terms that are meaningful to how your users evaluate relevance. For example, let's say your users only want items that have at least one match in the title. One natural way to do this is to create a filter query like fq={!edismax qf=title mm=1 v=$q} (where q is the user's plaintext query). That's just an example; maybe you'd like to have some other criteria for cutting out poor results? Use a filter query and express to Solr what you need to trim out :) -Doug On Wed, Jan 20, 2016 at 7:58 AM, sara hajili <hajili.s...@gmail.com> wrote: > hi all, > i wanna to know about solr search relevency scoreing threashold. > can i change it? > i mean immagine when i searching i get this result > doc1 score =8 > doc2 score =6.4 > doc3 score=6 > doc8score=5.5 > doc5 score=2 > i wana to change solr score threashold .in this way i set threashold for > example >4 > and then i didn't get doc5 as result.can i do this?if yes how? > and if not how i can modified search to don't get docs as a result that > these docs have a lot distance from doc with max score? > in other word i wanna to delete this gap between solr results > -- Doug Turnbull | Search Relevance Consultant | OpenSource Connections <http://opensourceconnections.com>, LLC | 240.476.9983 Author: Relevant Search <http://manning.com/turnbull>
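The filter-query suggestion above could be assembled into a request roughly as follows. This is a sketch, not a tested configuration: the fq string is the one from the message, while the qf value "title body" is a hypothetical field list chosen only for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Filter from the message above: keep only docs with at least one
# query term matching in the title field.
TITLE_FILTER = "{!edismax qf=title mm=1 v=$q}"


def title_filtered_params(user_query: str) -> dict:
    """Build request parameters that rank with edismax and filter on title."""
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": "title body",  # hypothetical field list for ranking
        "fq": TITLE_FILTER,  # $q dereferences the main query text
    }


params = title_filtered_params("ipod belkin")
query_string = urlencode(params)
```

The filter only decides which documents are allowed in; ranking among the survivors still comes from the main query.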
solr score threashold
Hi all, I want to know about the Solr search relevancy scoring threshold. Can I change it? I mean, imagine that when searching I get this result: doc1 score=8, doc2 score=6.4, doc3 score=6, doc8 score=5.5, doc5 score=2. I want to change the Solr score threshold, so that if I set the threshold to, for example, >4, I don't get doc5 in the results. Can I do this? If yes, how? And if not, how can I modify the search so that I don't get docs whose scores are far from the doc with the max score? In other words, I want to remove this gap between Solr results.
Re: solr score threashold
Hi Sara, You can use func and frange to achieve what you need, but note that scores are not normalized, meaning a score of 8 does not mean it is a good match - it is just the best match. There are examples online of how to normalize scores (e.g. http://wiki.apache.org/lucene-java/ScoresAsPercentages). Another approach is to write a custom component that filters out docs below some threshold. Thanks, Emir On 20.01.2016 13:58, sara hajili wrote: hi all, i wanna to know about solr search relevency scoreing threashold. can i change it? i mean immagine when i searching i get this result doc1 score =8 doc2 score =6.4 doc3 score=6 doc8score=5.5 doc5 score=2 i wana to change solr score threashold .in this way i set threashold for example >4 and then i didn't get doc5 as result.can i do this?if yes how? and if not how i can modified search to don't get docs as a result that these docs have a lot distance from doc with max score? in other word i wanna to delete this gap between solr results -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/
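For reference, the frange approach mentioned above is commonly written as a filter query over the score of the main query. The sketch below builds such a filter; the helper name is made up, and the cutoff of 4 is taken from the question. As the replies in this thread stress, a fixed cutoff is fragile because scores are not normalized and shift as the index changes.

```python
def score_floor_filter(min_score: float) -> str:
    """Filter query keeping only docs whose main-query score is >= min_score.

    Uses the frange parser with a lower bound (l) over query($q).
    """
    return f"{{!frange l={min_score}}}query($q)"


params = {
    "q": "some user query",
    "fq": score_floor_filter(4),
    "fl": "id,score",
}
```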
Solr score distribution usage
Hello, I would like to use the Solr score distribution to pick up most relevant documents from the search result. Rather than top n results, I am interested only in picking up the most relevant based on statistical distribution of the scores. A brief study of some sample searches (the most frequently searched terms) on my data-set shows that the mode and median scores seem to coincide or be very close together. Is this the kind of trend which is generally observed in Solr (though I understand variations on specific searches)? Hence, I was considering using statistical mode as the threshold above which I use the documents from the result. Has anyone done something like this before or would like to critique my approach? Regards, Ashish
Re: Include Solr score into a ranking algorithm
Hello Nicholas! you can specify a function query as a main query where you can operate with DVs, then you can use regular tfidf score from arbitrary query as one of the arguments in the functional query see an example in http://wiki.apache.org/solr/FunctionQuery#query have a good research! On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com wrote: Hi, Currently, I'm trying to implement a ranking algorithm on Solr to include TFIDFSimilarity score into a formula. Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 * Xn Basically, the values of Vn are stored in DocValues, I can access them in customized Function Query. The Xn are parameters I will pass to the Function Query. I searched on internet and dig a little bit in the Solr/Lucene source code. I found there is no way to access TFIDFSimilarity Score in Function Query. (Please correct me if I'm wrong.) So, I'm wondering is it possible to do my ranking algorithm in Solr/Lucene? -- Nicholas Ding -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
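A rough sketch of how the formula from the question might be written as a function query follows. The field names v1 and v2, the weights, and the parameter name qq are all hypothetical; only the Solr functions sum, product and query are taken as given. With defType=func the function is the main query, and query($qq) supplies the relevance score of the text query passed in qq.

```python
def ranking_function(weights, docvalue_fields, text_query_param="qq"):
    """Build sum(product(query($qq),X1),product(V1,X2),...) as a string.

    weights[0] multiplies the text score; weights[1:] pair up with
    docvalue_fields in order.
    """
    if len(weights) != len(docvalue_fields) + 1:
        raise ValueError("need one weight for the score plus one per field")
    terms = [f"product(query(${text_query_param}),{weights[0]})"]
    for field, weight in zip(docvalue_fields, weights[1:]):
        terms.append(f"product({field},{weight})")
    return "sum(" + ",".join(terms) + ")"


params = {
    "defType": "func",
    "q": ranking_function([2.0, 0.5, 0.25], ["v1", "v2"]),
    "qq": "user text query",  # parsed by the default query parser
    "fl": "id,score",
}
```

Since a function query used as q produces a value for every document, any restriction of the result set would still go into fq, which matches the advice later in this thread to keep fq as it is.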
Re: Include Solr score into a ranking algorithm
Hi Mikhail, Thank you very much! I'm using eDisMax by default, I think I will need to change it to defType=func and pass all the query parameters (fq mainly) to the sub query right? Nicholas Ding On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Hello Nicholas! you can specify a function query as a main query where you can operate with DVs, then you can use regular tfidf score from arbitrary query as one of the arguments in the functional query see an example in http://wiki.apache.org/solr/FunctionQuery#query have a good research! On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com wrote: Hi, Currently, I'm trying to implement a ranking algorithm on Solr to include TFIDFSimilarity score into a formula. Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 * Xn Basically, the values of Vn are stored in DocValues, I can access them in customized Function Query. The Xn are parameters I will pass to the Function Query. I searched on internet and dig a little bit in the Solr/Lucene source code. I found there is no way to access TFIDFSimilarity Score in Function Query. (Please correct me if I'm wrong.) So, I'm wondering is it possible to do my ranking algorithm in Solr/Lucene? -- Nicholas Ding -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Include Solr score into a ranking algorithm
Hi Nicholas, you can use sort by function feature of solr. sort=sum( mul(query(field:TfIdfQuery),x1), mul(x1,v2)) On Thursday, November 20, 2014 4:23 PM, Nicholas Ding nicholas...@gmail.com wrote: Hi Mikhail, Thank you very much! I'm using eDisMax by default, I think I will need to change it to defType=func and pass all the query parameters (fq mainly) to the sub query right? Nicholas Ding On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Hello Nicholas! you can specify a function query as a main query where you can operate with DVs, then you can use regular tfidf score from arbitrary query as one of the arguments in the functional query see an example in http://wiki.apache.org/solr/FunctionQuery#query have a good research! On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com wrote: Hi, Currently, I'm trying to implement a ranking algorithm on Solr to include TFIDFSimilarity score into a formula. Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 * Xn Basically, the values of Vn are stored in DocValues, I can access them in customized Function Query. The Xn are parameters I will pass to the Function Query. I searched on internet and dig a little bit in the Solr/Lucene source code. I found there is no way to access TFIDFSimilarity Score in Function Query. (Please correct me if I'm wrong.) So, I'm wondering is it possible to do my ranking algorithm in Solr/Lucene? -- Nicholas Ding -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Include Solr score into a ranking algorithm
On Thu, Nov 20, 2014 at 5:23 PM, Nicholas Ding nicholas...@gmail.com wrote: Hi Mikhail, Thank you very much! I'm using eDisMax by default, I think I will need to change it to defType=func and I wonder why do you ask, because the given link has three examples of including edismax into the simple calculation. pass all the query parameters (fq mainly) to the sub query right? this one particularly doesn't seem right to me. fq is fq, keep them as is. Nicholas Ding On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Hello Nicholas! you can specify a function query as a main query where you can operate with DVs, then you can use regular tfidf score from arbitrary query as one of the arguments in the functional query see an example in http://wiki.apache.org/solr/FunctionQuery#query have a good research! On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com wrote: Hi, Currently, I'm trying to implement a ranking algorithm on Solr to include TFIDFSimilarity score into a formula. Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 * Xn Basically, the values of Vn are stored in DocValues, I can access them in customized Function Query. The Xn are parameters I will pass to the Function Query. I searched on internet and dig a little bit in the Solr/Lucene source code. I found there is no way to access TFIDFSimilarity Score in Function Query. (Please correct me if I'm wrong.) So, I'm wondering is it possible to do my ranking algorithm in Solr/Lucene? -- Nicholas Ding -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Include Solr score into a ranking algorithm
Thank you so much, Mikhail! It works perfectly. On Thu, Nov 20, 2014 at 12:54 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: On Thu, Nov 20, 2014 at 5:23 PM, Nicholas Ding nicholas...@gmail.com wrote: Hi Mikhail, Thank you very much! I'm using eDisMax by default, I think I will need to change it to defType=func and I wonder why do you ask, because the given link has three examples of including edismax into the simple calculation. pass all the query parameters (fq mainly) to the sub query right? this one particularly doesn't seem right to me. fq is fq, keep them as is. Nicholas Ding On Thu, Nov 20, 2014 at 5:22 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Hello Nicholas! you can specify a function query as a main query where you can operate with DVs, then you can use regular tfidf score from arbitrary query as one of the arguments in the functional query see an example in http://wiki.apache.org/solr/FunctionQuery#query have a good research! On Thu, Nov 20, 2014 at 6:45 AM, Nicholas Ding nicholas...@gmail.com wrote: Hi, Currently, I'm trying to implement a ranking algorithm on Solr to include TFIDFSimilarity score into a formula. Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + . + Vn-1 * Xn Basically, the values of Vn are stored in DocValues, I can access them in customized Function Query. The Xn are parameters I will pass to the Function Query. I searched on internet and dig a little bit in the Solr/Lucene source code. I found there is no way to access TFIDFSimilarity Score in Function Query. (Please correct me if I'm wrong.) So, I'm wondering is it possible to do my ranking algorithm in Solr/Lucene? -- Nicholas Ding -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Include Solr score into a ranking algorithm
Hi, Currently, I'm trying to implement a ranking algorithm on Solr that includes the TFIDFSimilarity score in a formula. Ranking = TFIDFSimilarity Score * X1 + V1 * X2 + V2 * X3 + ... + Vn-1 * Xn Basically, the values of Vn are stored in DocValues, and I can access them in a customized Function Query. The Xn are parameters I will pass to the Function Query. I searched the internet and dug a little into the Solr/Lucene source code. I found no way to access the TFIDFSimilarity score in a Function Query. (Please correct me if I'm wrong.) So, I'm wondering: is it possible to implement my ranking algorithm in Solr/Lucene? -- Nicholas Ding
Solr score manager
Hi All, I need a specific score mechanism. I would like to sort my results based on a customized scoring field. Scoring, for example:
1. If this is a new object - 100
2. Edited - 80
3. Recent search - 50
4. Opened - 40
and some more actions...
Then, when a new search is executed, results are sorted based on the score field. Example:
Object 1: opened = 40.
Object 2: new = 100.
Object 3: edited X 2 + recent search X 1 = 210.
Result: Object 3, Object 2, Object 1.
Any good article for this? Examples? I'm using Solr with Java. Thanks in advance, Shay.
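The arithmetic in the example above can be sketched as follows. The weights and the three objects come from the message; the dictionary layout and function name are illustrative assumptions. In practice the computed value would presumably be written to a numeric field on each document and used in the sort parameter.

```python
# Weights per action, as listed in the message above.
ACTION_WEIGHTS = {"new": 100, "edited": 80, "recent_search": 50, "opened": 40}


def custom_score(action_counts: dict) -> int:
    """Sum of weight * count over every action recorded for an object."""
    return sum(ACTION_WEIGHTS[action] * count
               for action, count in action_counts.items())


objects = {
    "Object 1": {"opened": 1},
    "Object 2": {"new": 1},
    "Object 3": {"edited": 2, "recent_search": 1},
}

# Highest custom score first, mirroring a descending sort on the field.
ranked = sorted(objects, key=lambda name: custom_score(objects[name]),
                reverse=True)
```

Here Object 3 scores 80 * 2 + 50 = 210, so it sorts ahead of Object 2 (100) and Object 1 (40), as in the expected result.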
Re: Solr score manager
How are you storing this information in your documents? Regards, Alex On 16/07/2014 5:03 pm, Shay Sofer sha...@checkpoint.com wrote: Hi All, I need a specific score mechanism. I would like to sort my results based on customize scoring field. scoring for example - 1. If this is a new object - 100 2. Edited - 80 3. Recent search - 50 4. Opened - 40 and some more actions... And then when execute a new search they sorted based on score field. Example: Object 1 : opened = 40. Object 2: New = 100 Object 3: edited X 2 + recent search X 1 = 210. Result: Object 3 Object 2 Object 1 Any good article for this? Examples? I'm using Solr with Java. Thanks in advance, Shay.
Fwd: Solr score manager
-- Forwarded message -- From: Shay Sofer sha...@checkpoint.com Date: Wed, Jul 16, 2014 at 6:55 PM That’s my question :-) How should I manage this scoring system? I guess that I need to add a new field (my_score) and update it as I want. -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Wednesday, July 16, 2014 1:53 PM To: solr-user Subject: Re: Solr score manager How are you storing this information in your documents? Regards, Alex On 16/07/2014 5:03 pm, Shay Sofer sha...@checkpoint.com wrote: Hi All, I need a specific score mechanism. I would like to sort my results based on customize scoring field. scoring for example - 1. If this is a new object - 100 2. Edited - 80 3. Recent search - 50 4. Opened - 40 and some more actions... And then when execute a new search they sorted based on score field. Example: Object 1 : opened = 40. Object 2: New = 100 Object 3: edited X 2 + recent search X 1 = 210. Result: Object 3 Object 2 Object 1 Any good article for this? Examples? I'm using Solr with Java. Thanks in advance, Shay.
RE: Solr score manager
Shay, this presentation I gave at ApacheCon and DC Solr Exchange might be useful to you: http://www.slideshare.net/mobile/o19s/hacking-lucene-for-custom-search-results
Re: Combining Solr score with customized user ratings for a document
Good. -- View this message in context: http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-tp4040200p4138135.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Rounding errors with SOLR score
I will send the debugQuery. They are exactly the same. -- Bill Bell billnb...@gmail.com cell 720-256-8076
Re: Rounding errors with SOLR score
Are you sure that SOLR is rounding incorrectly, and not simply differently from what you expect? I was surprised myself at some of the rounding behaviour I saw with SOLR, but according to http://en.wikipedia.org/wiki/Rounding , the results were valid (just not the round-up-from-half that I naively expected).
Rounding errors with SOLR score
When doing complex boosting/bq we are getting rounding errors on the score. To get the score to be consistent I needed to use rint on sort:
sort=rint(product(sum($p_score,$s_score,$q_score),100)) desc,s_query asc
<str name="p_score">recip(priority,1,.5,.01)</str>
<str name="s_score">product(recip(synonym_rank,1,1,.01),17)</str>
<str name="q_score">query({!dismax qf='user_query_edge^1 user_query^0.5 user_query_fuzzy' v=$q1})</str>
The issue is in the qf area.
{"s_query": "Ear Irrigation", "score": 10.331313}, {"s_query": "Ear Piercing", "score": 10.331314}, {"s_query": "Ear Pinning", "score": 10.331313},
-- Bill Bell billnb...@gmail.com cell 720-256-8076
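To illustrate why scaling by 100 and rounding makes the order stable, here is a small sketch that applies the same idea to the three scores quoted in this post. It assumes the final scores stand in for the sum of the three sub-scores, and it uses Python's round() as a stand-in for rint().

```python
# Scores as reported in the post; they differ only in the last digit.
scores = {
    "Ear Irrigation": 10.331313,
    "Ear Piercing": 10.331314,
    "Ear Pinning": 10.331313,
}


def sort_key(score):
    # Mirrors rint(product(score, 100)): scale, then round to nearest integer.
    return round(score * 100)


keys = {name: sort_key(score) for name, score in scores.items()}

# Rounded score descending, then name ascending (the s_query tiebreak).
ordered = sorted(scores, key=lambda name: (-keys[name], name))
```

All three entries collapse to the same rounded key, so the secondary sort alone decides the order.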
Re: How to round solr score ?
Thanks for your replies. I am actually doing the frange approach for now. The only downside I see there is it makes the function call twice, calling createWeight() twice. And so my social connections are evaluated twice which is quite heavy operation. So I was thinking if I could get away with one additional call. This email is intended for the person(s) to whom it is addressed and may contain information that is PRIVILEGED or CONFIDENTIAL. Any unauthorized use, distribution, copying, or disclosure by any person other than the addressee(s) is strictly prohibited. If you have received this email in error, please notify the sender immediately by return email and delete the message and any attachments from your system.
Re: How to round solr score ?
Hi, As per this post here http://grokbase.com/t/lucene/solr-user/131jzcg3q2/how-to-round-solr-score, I was able to use my custom function in a sort (defType=func&q=socialDegree(id,1)&fl=score,*&sort=score%20asc), which works, but I can't facet on the same (defType=func&q=socialDegree(id,1)&fl=score,*&facet=true&facet.field=score), which doesn't work. Exception:
org.apache.solr.common.SolrException: undefined field: score
at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:965)
at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:294)
at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:423)
at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:205)
at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
Is there any way by which we can achieve this? Thanks, Mamta.
Re: How to round solr score ?
: 'score' is a pseudo-field, i.e., it does not actually exist in
: the index, which is probably why it cannot be faceted on.
: Faceting on a rounded score seems like an unusual use
: case. What requirement are you trying to address?
agreed, more details would be helpful. FWIW: the only way available to facet on functions is to use facet.query along with the {!frange} parser to create facet constraints based on ranges of function values that you specify. there is no other way i can think of to facet over function values -- there is an open issue where people were discussing it, but i don't think there was ever a functional patch... https://issues.apache.org/jira/browse/SOLR-1581 -Hoss
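A rough sketch of what the facet.query plus {!frange} approach could look like when the request is assembled on the client. The bucket boundaries and the helper name frange_facet_queries are assumptions made for illustration; socialDegree(id,1) is the custom function from the question.

```python
from urllib.parse import urlencode


def frange_facet_queries(function, bounds):
    """Build one {!frange} facet.query per [lower, upper) bucket."""
    return [
        "{!frange l=%s u=%s incu=false}%s" % (lower, upper, function)
        for lower, upper in bounds
    ]


# Hypothetical buckets over the function's value: [0,1), [1,2), [2,3).
facet_queries = frange_facet_queries("socialDegree(id,1)",
                                     [(0, 1), (1, 2), (2, 3)])

params = [("q", "*:*"), ("facet", "true")]
params += [("facet.query", fq) for fq in facet_queries]

# urlencode takes care of the ampersands and of escaping braces and spaces.
query_string = urlencode(params)
```

Each facet.query comes back with its own count, so the buckets have to be chosen up front; this does not discover value ranges the way field faceting does.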
Re: How to round solr score ?
'score' is a pseudo-field, i.e., it does not actually exist in the index, which is probably why it cannot be faceted on. Faceting on a rounded score seems like an unusual use case. What requirement are you trying to address? Regards, Gora
Re: Combining Solr score with customized user ratings for a document
You can use a DB for storing user preferences and later, if you want, flush them to Solr as an update along with the user ID. Or you may add a result pipeline filter. Rgds AJ
Re: Combining Solr score with customized user ratings for a document
: With this approach now I can boost (i.e. multiply Solr's score by a factor)
: the results of any query by doing something like this:
: http://localhost:8080/solr/Prueba/select_test?q={!boost
: b=rating(usuario1)}text:grapa&fl=score
:
: Where 'rating' is the name of my function.
:
: Unfortunately, I still can't see which differences are between doing this or
: making the product of both scores as the value for the query's sort
: parameter... :(
I'm not sure i understand your question. With the example query above, your score -- both returned, and used for sorting by score -- is the mathematical result of multiplying your function by the relevancy score of text:grapa. Perhaps what you are referring to is the idea that if you want the score to remain purely about relevancy, you can still optionally sort on the results of this function, by using the function solely in your sort -- the only thing that tends to confuse people here is how you refer back to the original query in that sort by function command... http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3Calpine.DEB.2.00.1206111242260.17925@bester%3E or in your case, something like this would return both the raw score and your custom rating, but it would sort on the product of those two values... ?q=text:grapa&fl=id,score,rating(usuario1)&sort=product(rating(usuario1),query($q))
: Which is the best place to do it? I think I would query the DB/cache just
: when the custom ValueSource is created in the ValueSourceParser's parse
That might make sense, but be careful where you put this cache data -- if it's part of the ValueSource then whenever that ValueSource is used in a FunctionQuery (ie: {!boost b=rating(usuario1)}text:grapa) it will be part of the cache key for the queryResultCache or filterCache -- so having large data structures in your ValueSource could eat up a lot of RAM. Take a look at the src/docs for the differences between the ValueSource class and the FunctionValues class -Hoss
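The distinction between the two approaches can be sketched with plain arithmetic. This is a client-side simulation with made-up relevancy and rating values, not a description of how Solr computes anything internally: both variants produce the same order, and only the reported score differs.

```python
# Made-up documents: relevancy is the query score, rating is per-user.
docs = [
    {"id": "a", "relevancy": 2.0, "rating": 0.5},
    {"id": "b", "relevancy": 1.0, "rating": 3.0},
    {"id": "c", "relevancy": 1.5, "rating": 1.0},
]


def boosted(docs):
    """Like {!boost b=rating(...)}: the returned score is relevancy * rating."""
    scored = [{"id": d["id"], "score": d["relevancy"] * d["rating"]}
              for d in docs]
    return sorted(scored, key=lambda d: d["score"], reverse=True)


def sorted_by_product(docs):
    """Like sorting on product(rating(...),query($q)): score stays relevancy."""
    ordered = sorted(docs, key=lambda d: d["relevancy"] * d["rating"],
                     reverse=True)
    return [{"id": d["id"], "score": d["relevancy"]} for d in ordered]


boost_result = boosted(docs)
sort_result = sorted_by_product(docs)
```

Note that a real sort parameter also needs a direction (for example desc) after the function.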
Re: Combining Solr score with customized user ratings for a document
Well, as Hoss suggested, I have implemented my own function (ValueSourceParser + ValueSource) :) It's a very simple function which receives a parameter, the userId, and returns a float value depending on it (with a switch-case structure just for this demo). With this approach now I can boost (i.e. multiply Solr's score by a factor) the results of any query by doing something like this: http://localhost:8080/solr/Prueba/select_test?q={!boost b=rating(usuario1)}text:grapa&fl=score Where 'rating' is the name of my function. Unfortunately, I still can't see what the difference is between doing this and using the product of both scores as the value of the query's sort parameter... :( Next step is, of course, to replace that demo switch-case structure with a SQL query to my DB or a retrieval from a Solr cache. My idea is to retrieve a <docId, recScore> map from the DB for each user that queries our system for the first time. Next time he/she queries it I'd like to get his/her map from a Solr cache (until the info becomes obsolete). Which is the best place to do it? I think I would query the DB/cache just when the custom ValueSource is created in the ValueSourceParser's parse call, storing the map in the ValueSource. Then my floatVal method would just be a 'get' from my map. I'm so close! Thanks!
Re: Combining Solr score with customized user ratings for a document
Hi Tim! Thank you for shedding some light ;) I have read your slides (in fact, I had already read them in the last few days) but I'm still missing something. So, let's see... As I see it (and I may be wrong) Solr's external file fields are some kind of <docID, score> maps, aren't they? I understand the power of this approach for popularity scores, like in your example, which depend on nothing but the docID, and which you don't want stored in the index so they can be refreshed more often than the documents. The problem here (as with other regular rec systems, to my limited knowledge) is that we need a <usrID, <docID, score>> map instead. The other thing you address is a custom Component. Hmm... I hadn't thought of that before. Maybe I read your slides when I had less understanding of Solr's internal mechanisms (though I'm still quite confused). So, alright, something that receives a parameter with the user ID, plus a cache so we don't eventually end up with a bottleneck, is definitely the direction I have to follow. But now the problem I find is that I don't want to query my database just to get categories for a filter. What I want to query is the user rating for each document so I can combine it with Solr's relevancy score. All complaints, I know... :p Could you go -just a bit- further into that Mahout integration with Solr? I think I'm going to dive into custom components to get a better understanding of them and to see if I can find my holy grail there :P Thanks A LOT! Regards, Álvaro
Re: Combining Solr score with customized user ratings for a document
Á_o wrote: As I see (and I may be wrong) Solr's external file fields are some kind of <docID, score> maps, aren't they? Actually I was wrong ;) The key does not necessarily have to be the docID. It can be some other field. Anyway, even in that case, it's still a 'docKey', and I can't see how it could be user-customized... :(
Re: Combining Solr score with customized user ratings for a document
Alvaro - still thinking ... will reply when I have more ;-)
Re: Combining Solr score with customized user ratings for a document
: http://www.slideshare.net/thelabdude/boosting-documents-in-solr-lucene-revolution-2011 ... : Start by looking at Solr's external file field and Rather than using ExternalFileField as inspiration, i would suggest you look at implementing a custom ValueSourceParser... http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3Calpine.DEB.2.02.1301071825090.14692@frisbee%3E Then you can either use your custom function as a boost function included in the main query, or in isolation as part of a sort by function which could also include the score from the main query. (Which one you choose should depend on your final goal, and how expensive it is to query your external datasource to find out the per-user rankings.) -Hoss
Re: Combining Solr score with customized user ratings for a document
Well, thinking a bit more, the second solution is not practical. If Solr retrieves, say, 1,000 documents, I would have to navigate through ALL of them (maybe fewer with some reasonable upper limit) to recalculate the scores and reorder them according to the new score, although the Web app is going to show just the first 20. In other words, I would lose the benefits of Solr's (well, and most DBs') row/offset feature for retrieving information in batches rather than the whole set of results, which may not be seen by the user at all. I'm now wondering if a custom implementation of a ValueSource + a FunctionQuery is a solution to my problem... Any hint? Thanks! Álvaro
Re: Combining Solr score with customized user ratings for a document
Start by looking at Solr's external file field and http://www.linkedin.com/profile/view?id=18807864trk=tab_pro
Re: Combining Solr score with customized user ratings for a document
Oops - that's definitely not the link I meant to give ;-) Here's the link from slideshare: http://www.slideshare.net/thelabdude/boosting-documents-in-solr-lucene-revolution-2011 In there we used Mahout to calculate recommendation scores and then loaded them using external file field. Cheers, Tim
Combining Solr score with customized user ratings for a document
Hi: I am working on a project where we want to recommend products to our users based on their previous 'likes', purchases and so on (typical stuff for a recommender system), while we also want to let them browse the catalogue freely through search queries, making use of facets, more-like-this and so on (typical stuff for a Solr index). After reading here and there, I have reached the conclusion that it's better to keep the Solr index apart from the database. Solr is for products (which can be reindexed from the DB as a nightly batch) while the DB is for everything else, including -the products and- user profiles. So, given a user and a particular search (which can be as simple as q=*), on one hand we have Solr results (i.e. docs + scores) for the query, while on the other we have the user's predicted ratings (i.e. recommender scores) coming from the DB (though they could be cached elsewhere) for each of the products returned by Solr. And what I want is clear -to state-: combine both scores (e.g. by a simple product) so the user receives a sorted list of relevant products biased by his/her preferences. I have been googling for the last few days without finding the best way to achieve this. I think it's not a matter of boosting, or at least I can't see which boosting method could be useful, as the boost should be user-based. I think that I need to extend Solr -somewhere- so I can alter the result scores by providing the user ID and connecting to the DB at query time, doing the necessary maths and returning the final score in a -quite- transparent way for the Web app. A less elegant solution could be letting Solr do its work as usual, and then navigating through the XML, modifying the scores and reordering the whole list of products (or maybe just the first N results) by the new combined score. What do you think? A big THANKS in advance, Álvaro
Re: How to round solr score ?
On 18 January 2013 18:26, Gustav xbihy...@sharklasers.com wrote: I have to bump this... is it possible to do it (round solr's score) with any integrated query function?? Do not have a Solr index handy at the moment to check, but it should be possible to do this with function queries. Please see the rint() and query() function at http://wiki.apache.org/solr/FunctionQuery Regards, Gora
Re: How to round solr score ?
Hey Gora, thanks for the fast answer! I had tried the rint(score) function before (it would be perfect in my case) but it didn't work out. I guess it only works with indexed fields, so I got the error "sort param could not be parsed as a query, and is not a field that exists in the index: rint(score)". And with the query() function I didn't get any successful result... I'm stuck in the same scenario as squaro: if two docs have scores of 1.67989 and 1.6767, I would like to sort them by price. My sort rules are something like: sort=score desc, price asc
Re: How to round solr score ?
You have to use rint() in combination with query(). If I understand your requirements correctly, something along the lines below should work, where one is searching for term in the field text: http://localhost:8983/solr/select/?defType=func&q=rint(query({!v=text:term}))&fl=score,*&sort=score desc,price asc The score is displayed in the returned fields to demonstrate that it has been rounded off. Regards, Gora
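Since the ampersands, braces and spaces in such a request are easy to lose when it is pasted around, here is a small sketch of assembling it programmatically. The parameter values are the ones from this reply; the host, path, and the field names text and price are whatever the local setup uses.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Parameter values taken from the reply above.
params = {
    "defType": "func",
    "q": "rint(query({!v=text:term}))",   # rounded score of the inner query
    "fl": "score,*",
    "sort": "score desc,price asc",        # rounded score first, then price
}

# urlencode escapes the braces, '=' and spaces and joins pairs with '&'.
url = "http://localhost:8983/solr/select/?" + urlencode(params)

# Round-trip check: decoding the URL yields the original values again.
decoded = parse_qs(urlsplit(url).query)
```

With the score rounded to an integer, documents such as the two mentioned earlier (1.67989 and 1.6767) tie on score and fall through to the price sort.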
Re: Solr Score threshold 'reasonably', independent of results returned
: Not really. The percentage given in other search packages is fairly : bogus. You have to do a global batch analysis of all of the index to : get a true scale for relevance. Exactly... https://wiki.apache.org/solr/FAQ#Why_Aren.27t_Scores_returned_as_a_percentage.3F_How_Do_I_normalize_Scores.3F https://wiki.apache.org/lucene-java/ScoresAsPercentages *you* -- as the person in control of your solr instance, who knows everything about every document in the index, and has total control over the set of valid queries being executed against the index -- you *MAY* be able to compute a meaningful threshold of scores, based on the constraints you know/enforce. But Solr can't do this, because in general Solr doesn't know those constraints (or if those constraints even exist) for an arbitrary index. -Hoss
Re: Solr Score threshold 'reasonably', independent of results returned
Not really. The percentage given in other search packages is fairly bogus. You have to do a global batch analysis of all of the index to get a true scale for relevance. -- Lance Norskog goks...@gmail.com
Re: Solr Score threshold 'reasonably', independent of results returned
It will never return no results, because the cutoff is relative to the score of the previous result: if score < 0.25 * last_score then stop. Since score > 0 and last_score is 0 for the initial hit, it will not stop there.
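The rule described here can be sketched as a small client-side filter. This assumes the scores arrive already sorted in descending order; the function name and the sample values are made up for illustration, and 0.25 is the ratio quoted in the message.

```python
def cut_at_relative_drop(scores, ratio=0.25):
    """Keep hits until a score falls below ratio * the previous hit's score."""
    kept = []
    last_score = 0.0  # no previous hit yet, so the first hit always passes
    for score in scores:
        if score < ratio * last_score:
            break  # sharp drop relative to the previous hit: stop here
        kept.append(score)
        last_score = score
    return kept


# Hypothetical descending scores with a sharp drop after the third hit.
kept = cut_at_relative_drop([8.0, 6.5, 6.0, 1.2, 1.1])
# A single low-scoring hit is still returned, whatever its absolute value.
single = cut_at_relative_drop([0.008])
```

Because the comparison is always against the previous hit, the cutoff adapts to each query's score range instead of relying on one absolute threshold.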
Re: Solr Score threshold 'reasonably', independent of results returned
You are right, Mr. Ravish, because this depends on the formula (ranking and search fields), but please allow me to point out that the Solr score can help us decide whether a document is relevant or not in some cases.
Re: Solr Score threshold 'reasonably', independent of results returned
Hi, I think that this totally depends on your requirements and thus applicable for a user scenario. Score does not have any absolute meaning, it is always relative to the query. If you want to watch some particular queries and want to show results with score above previously set threshold, you can use this. If I always have that x% threshold in place , there may be many queries which would not return anything and I certainly do not want that.
Re: Solr Score threshold 'reasonably', independent of results returned
Commercial solutions often have a percentage that is meant to signify the quality of the match. Solr has a relative score, and you cannot tell just by looking at this value whether a result is relevant enough to be on the first page or not. The score depends on what else is in the index, so it is not easy to normalize in the way you suggest. Ravish
Solr Score threshold 'reasonably', independent of results returned
Usually, search results are sorted by their score (how well the document matched the query), but it is common to need to support the sorting of supplied data too. Boosting affects the scores of matching documents in order to affect ranking in score-sorted search results. Providing a boost value, whether at the document or field level, is optional. When the results are returned with scores, we want to be able to only keep results that are above some score (i.e. results of a certain quality only). Is it possible to do this when the returned subset could be anything? I ask because it seems like on some queries a score of say 0.008 is resulting in a decent match, whereas other queries a higher score results in a poor match. I have written pseudo code to achieve what I said. Note: I have attached my code as screenshot http://lucene.472066.n3.nabble.com/file/n4002312/Screen_Shot_2012-08-21_at_5.30.38_AM.png https://issues.apache.org/jira/browse/SOLR-3747
Re: Customizing Solr score with DixMax query
--- On Mon, 2/27/12, Xiao shinelee.thew...@gmail.com wrote: From: Xiao shinelee.thew...@gmail.com Subject: Customizing Solr score with DixMax query To: solr-user@lucene.apache.org Date: Monday, February 27, 2012, 5:59 AM In my application logic, I want to implement the ranking (scoring) logic as follows: score = Solr relevancy score * a_special_field_value. I tried to use DisMax to do this. My query statement is q={!type=dismax qf='title content' bf=field1}data. However, when I turn on the debugQuery option, I find that what Solr does is just sum the two scores, i.e., the TF-IDF score and the FunctionQuery score. But what I want is to multiply the two together. How can I implement the multiplication operation?
edismax has a boost parameter for this. q={!type=edismax qf='title content' boost=field1}data
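To make the difference between bf and boost concrete, here is a minimal Python sketch. It assumes the simplified picture described in this thread (bf adds the function value to the text score, boost multiplies it) and ignores query normalization. The helper names combine_bf and combine_boost and the sample numbers are made up for illustration; qf, boost and defType are the request parameters discussed above.

```python
from urllib.parse import urlencode


def combine_bf(text_score, func_value):
    # dismax "bf": the function value is added to the text score
    # (simplified; real scores also go through query normalization).
    return text_score + func_value


def combine_boost(text_score, func_value):
    # edismax "boost": the function value multiplies the text score.
    return text_score * func_value


# Hypothetical numbers, only to show how the two combinations differ.
print(combine_bf(0.5, 4.0))     # additive
print(combine_boost(0.5, 4.0))  # multiplicative

# Request parameters for the multiplicative variant suggested in the reply.
params = {
    "defType": "edismax",
    "q": "data",
    "qf": "title content",
    "boost": "field1",
}
query_string = urlencode(params)
print(query_string)
```

The resulting query string can be appended to the select handler URL of whatever core is being queried.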
Re: Customizing Solr score with DixMax query
Yes! Thank you! I also found this this morning on the Sematext Blog: Edismax supports the “boost” parameter, like the dismax bf param, but multiplies the function query instead of adding it in. http://blog.sematext.com/2010/01/20/solr-digest-january-2010/ -- View this message in context: http://lucene.472066.n3.nabble.com/Customizing-Solr-score-with-DixMax-query-tp3779591p3781200.html Sent from the Solr - User mailing list archive at Nabble.com.
Customizing Solr score with DixMax query
In my application logic, I want to implement the ranking (scoring) logic as follows: score = Solr relevancy score * a_special_field_value. I tried to use DisMax to do this. My query statement is q={!type=dismax qf='title content' bf=field1}data. However, when I turn on the debugQuery option, I find that what Solr does is just sum the two scores, i.e., the TF-IDF score and the FunctionQuery score. But what I want is to multiply the two together. How can I implement the multiplication operation? I also tried using a boosted query. I issue a query like q={!boost b=field1}data. In this case, Solr does return a score which is the product of the two scores. However, by using the boosted query, I lose the power of the dismax query, which can search across multiple fields. -- View this message in context: http://lucene.472066.n3.nabble.com/Customizing-Solr-score-with-DixMax-query-tp3779591p3779591.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Score Normalization
Perhaps you can solve your usecase by playing with the new eDismax boost parameter, which multiplies the functions with the other score instead of adding. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 5. nov. 2011, at 01:26, sangrish wrote: Hi, I have a (dismax) request handler which has the following 3 scoring components (1 qf & 2 bf): qf = field1^2 field2^3 bf = func1(field3)^2 func2(field4)^3 Both func1 & func2 return scores between 0 & 1. The score returned by textual match (qf) ranges from 0 to NOT_A_FIXED_NUMBER. To allow better combination of text match & my functions, I want the text score to be normalized between 0 & 1. Is there any way I can achieve that here? Thanks Sid -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Score-Normalization-tp3481627p3481627.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Score Normalization
: Perhaps you can solve your usecase by playing with the new eDismax : boost parameter, which multiplies the functions with the other score : instead of adding.
and FWIW: the boost param of the edismax parser is really just syntactic sugar for using the BoostQParser wrapped around an edismax query -- you can wrap it around any query produced by any QParser...
q={!edismax qf=foo}bar&boost=func(asdf)
...is the same as...
q={!boost b=func(asdf) v=$qq}&qq={!edismax qf=foo}bar
-Hoss
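The two request forms Hoss describes can be built programmatically. The following Python sketch only assembles and URL-encodes the parameters. The helper name wrap_with_boost is hypothetical, and func(asdf) and qf=foo are the placeholders from the message above, not real functions or fields.

```python
from urllib.parse import urlencode, parse_qs


def wrap_with_boost(inner_query, boost_func):
    # Hypothetical helper: wraps any query string in the {!boost} parser
    # by dereferencing it through a separate "qq" request parameter.
    return {
        "q": "{!boost b=%s v=$qq}" % boost_func,
        "qq": inner_query,
    }


# Shorthand form: edismax with its own boost parameter.
shorthand = {"q": "{!edismax qf=foo}bar", "boost": "func(asdf)"}

# Explicit form: the same edismax query wrapped by the boost parser.
explicit = wrap_with_boost("{!edismax qf=foo}bar", "func(asdf)")

print(urlencode(shorthand))
print(urlencode(explicit))

# Decoding the encoded string gives back the original parameter values.
decoded = parse_qs(urlencode(explicit))
print(decoded["q"][0])
print(decoded["qq"][0])
```

The explicit form is the one to reach for when the inner query comes from a parser that has no boost parameter of its own.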
Solr Score Normalization
Hi, I have a (dismax) request handler which has the following 3 scoring components (1 qf & 2 bf): qf = field1^2 field2^3 bf = func1(field3)^2 func2(field4)^3 Both func1 & func2 return scores between 0 & 1. The score returned by textual match (qf) ranges from 0 to NOT_A_FIXED_NUMBER. To allow better combination of text match & my functions, I want the text score to be normalized between 0 & 1. Is there any way I can achieve that here? Thanks Sid -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Score-Normalization-tp3481627p3481627.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Score Normalization
: To allow better combination of text match & my functions, I want the text : score to be normalized between 0 & 1. Is there any way I can achieve that : here? It is achievable, but it is not usually meaningful... https://wiki.apache.org/lucene-java/ScoresAsPercentages -Hoss
solr score issue
Hi sir, Can anyone explain to me how this score is being calculated? I am searching here for software engineer using the dismax handler. Total documents indexed are 477 and query results are 28. The query is:
q=software+engineer&fq=location%3Adelhi
The dismax setting is:
<str name="qf">alltext title^2 functional_role^1</str>
<str name="pf">body^100</str>
Here the alltext field is made by copying all fields. The body field contains the detail of the job. I am unable to understand how these scores have been calculated: where to start the score calculation, and what the default score is for any term match.
<str name="20080604/3eb9a7b30131a782a0c0a0e2cdb2b6b8.html">
0.5901718 = (MATCH) sum of:
  0.0032821721 = (MATCH) sum of:
    0.0026574256 = (MATCH) max plus 0.1 times others of:
      0.0026574256 = (MATCH) weight(alltext:softwar in 339), product of:
        0.0067262817 = queryWeight(alltext:softwar), product of:
          3.6121683 = idf(docFreq=34, maxDocs=477)
          0.0018621174 = queryNorm
        0.39508092 = (MATCH) fieldWeight(alltext:softwar in 339), product of:
          1.0 = tf(termFreq(alltext:softwar)=1)
          3.6121683 = idf(docFreq=34, maxDocs=477)
          0.109375 = fieldNorm(field=alltext, doc=339)
    6.2474643E-4 = (MATCH) max plus 0.1 times others of:
      6.2474643E-4 = (MATCH) weight(alltext:engin in 339), product of:
        0.0032613424 = queryWeight(alltext:engin), product of:
          1.7514161 = idf(docFreq=224, maxDocs=477)
          0.0018621174 = queryNorm
        0.19156113 = (MATCH) fieldWeight(alltext:engin in 339), product of:
          1.0 = tf(termFreq(alltext:engin)=1)
          1.7514161 = idf(docFreq=224, maxDocs=477)
          0.109375 = fieldNorm(field=alltext, doc=339)
  0.5868896 = weight(body:softwar engin^100.0 in 339), product of:
    0.9995919 = queryWeight(body:softwar engin^100.0), product of:
      100.0 = boost
      5.3680387 = idf(body: softwar=34 engin=223)
      0.0018621174 = queryNorm
    0.58712924 = fieldWeight(body:softwar engin in 339), product of:
      1.0 = tf(phraseFreq=1.0)
      5.3680387 = idf(body: softwar=34 engin=223)
      0.109375 = fieldNorm(field=body, doc=339)
</str>
Please advise.
-- View this message in context: http://lucene.472066.n3.nabble.com/solr-score-issue-tp2574680p2574680.html Sent from the Solr - User mailing list archive at Nabble.com.
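The numbers in the explain output above can be reproduced by hand. The Python sketch below assumes the classic Lucene TF-IDF similarity (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)), which agrees with the idf values printed in the output. queryNorm and fieldNorm are copied from the output rather than derived, and the function names are only illustrative.

```python
import math

MAX_DOCS = 477
QUERY_NORM = 0.0018621174  # copied from the explain output
FIELD_NORM = 0.109375      # copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    # Classic Lucene TF-IDF: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Classic Lucene TF-IDF: tf = sqrt(term frequency)
    return math.sqrt(freq)


def clause_weight(idf_value, freq=1.0, boost=1.0):
    # weight = queryWeight * fieldWeight, as in each "product of" block
    query_weight = boost * idf_value * QUERY_NORM
    field_weight = tf(freq) * idf_value * FIELD_NORM
    return query_weight * field_weight


# qf clauses: only alltext matched for each term, so the
# "max plus 0.1 times others" step leaves the single value unchanged.
softwar = clause_weight(idf(34))
engin = clause_weight(idf(224))

# pf clause: the phrase idf is the sum of the per-term idfs, boosted by 100.
phrase = clause_weight(idf(34) + idf(223), boost=100.0)

total = softwar + engin + phrase
print(softwar, engin, phrase, total)
```

Nearly all of the final score comes from the pf phrase clause on body, because of its ^100 boost; the two qf term matches contribute only about 0.003.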
Re: solr score issue
Check the Need help in understanding output of searcher.explain() function thread. http://mail-archives.apache.org/mod_mbox/lucene-java-user/201008.mbox/%3CAANLkTi=m9a1guhrahpeyqaxhu9gta9fjbnr7-8-zi...@mail.gmail.com%3E Regards, Jayendra On Fri, Feb 25, 2011 at 6:57 AM, Bagesh Sharma mail.bag...@gmail.com wrote: Hi sir , Can anyone explain me how this score is being calculated. i am searching here software engineer using dismax handler. Total documents indexed are 477 and query results are 28. Query is like that - q=software+engineerfq=location%3Adelhi dismax setting is - str name=qf alltext title^2 functional_role^1 /str str name=pf body^100 /str Here alltext field is made by copying all fields. body field contains detail of job. I am unable to understand how these scores have been calculated. From where to start score calculating and what are default score for any term matching. str name=20080604/3eb9a7b30131a782a0c0a0e2cdb2b6b8.html 0.5901718 = (MATCH) sum of: 0.0032821721 = (MATCH) sum of: 0.0026574256 = (MATCH) max plus 0.1 times others of: 0.0026574256 = (MATCH) weight(alltext:softwar in 339), product of: 0.0067262817 = queryWeight(alltext:softwar), product of: 3.6121683 = idf(docFreq=34, maxDocs=477) 0.0018621174 = queryNorm 0.39508092 = (MATCH) fieldWeight(alltext:softwar in 339), product of: 1.0 = tf(termFreq(alltext:softwar)=1) 3.6121683 = idf(docFreq=34, maxDocs=477) 0.109375 = fieldNorm(field=alltext, doc=339) 6.2474643E-4 = (MATCH) max plus 0.1 times others of: 6.2474643E-4 = (MATCH) weight(alltext:engin in 339), product of: 0.0032613424 = queryWeight(alltext:engin), product of: 1.7514161 = idf(docFreq=224, maxDocs=477) 0.0018621174 = queryNorm 0.19156113 = (MATCH) fieldWeight(alltext:engin in 339), product of: 1.0 = tf(termFreq(alltext:engin)=1) 1.7514161 = idf(docFreq=224, maxDocs=477) 0.109375 = fieldNorm(field=alltext, doc=339) 0.5868896 = weight(body:softwar engin^100.0 in 339), product of: 0.9995919 = queryWeight(body:softwar engin^100.0), 
product of: 100.0 = boost 5.3680387 = idf(body: softwar=34 engin=223) 0.0018621174 = queryNorm 0.58712924 = fieldWeight(body:softwar engin in 339), product of: 1.0 = tf(phraseFreq=1.0) 5.3680387 = idf(body: softwar=34 engin=223) 0.109375 = fieldNorm(field=body, doc=339) /str please suggest me. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-score-issue-tp2574680p2574680.html Sent from the Solr - User mailing list archive at Nabble.com.
How to round solr score ?
Hello, I would like to cut the Solr score to 3 or 4 digits. Indeed, I would like to be able to sort by score, then by another criterion (price, for example). So if two docs have scores of 1.67989 and 1.6767, I would like to sort them by price. Do you have any idea how I could do that? -- View this message in context: http://www.nabble.com/How-to-round-solr-score---tp22787254p22787254.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to round solr score ?
On Mon, Mar 30, 2009 at 10:04 PM, squaro marclebe...@gmail.com wrote: Hello, I would like to cut the Solr score to 3 or 4 digits. Indeed, I would like to be able to sort by score, then by another criterion (price, for example). So if two docs have scores of 1.67989 and 1.6767, I would like to sort them by price. Do you have any idea how I could do that? I don't think there is an existing way to round them. But it will be a useful contribution if you can write a function query for rounding. Look at http://wiki.apache.org/solr/FunctionQuery -- Regards, Shalin Shekhar Mangar.
Re: How to round solr score ?
On Mar 30, 2009, at 1:07 PM, Shalin Shekhar Mangar wrote: On Mon, Mar 30, 2009 at 10:04 PM, squaro marclebe...@gmail.com wrote: Hello, I would like to cut the Solr score to 3 or 4 digits. Indeed, I would like to be able to sort by score, then by another criterion (price, for example). So if two docs have scores of 1.67989 and 1.6767, I would like to sort them by price. Do you have any idea how I could do that? I don't think there is an existing way to round them. But it will be a useful contribution if you can write a function query for rounding. Look at http://wiki.apache.org/solr/FunctionQuery What did you have in mind, Shalin? It seems to me you would have to hook into the HitCollector and/or implement your own sorting capability, as the Func Query is just going to allow you to take price in as a scoring factor, no? -Grant
Re: How to round solr score ?
I think what you want to do is add in a function query that gives values in that range. There is no need to round the scores. That doesn't do anything but throw away information. wunder On 3/30/09 10:07 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Mon, Mar 30, 2009 at 10:04 PM, squaro marclebe...@gmail.com wrote: Hello, I would like to cut the Solr score to 3 or 4 digits. Indeed, I would like to be able to sort by score, then by another criterion (price, for example). So if two docs have scores of 1.67989 and 1.6767, I would like to sort them by price. Do you have any idea how I could do that? I don't think there is an existing way to round them. But it will be a useful contribution if you can write a function query for rounding. Look at http://wiki.apache.org/solr/FunctionQuery
Re: How to round solr score ?
On Mon, Mar 30, 2009 at 10:54 PM, Grant Ingersoll gsing...@apache.org wrote: I don't think there is an existing way to round them. But it will be a useful contribution if you can write a function query for rounding. Look at http://wiki.apache.org/solr/FunctionQuery What did you have in mind, Shalin? It seems to me you would have to hook into the HitCollector and/or implement your own sorting capability, as the Func Query is just going to allow you to take price in as a scoring factor, no? Yonik added a way to use the score of a query in function queries with SOLR-939. Look at the query function on the wiki. Some very cool things are possible now :) -- Regards, Shalin Shekhar Mangar.
Re: How to round solr score ?
On Mon, Mar 30, 2009 at 11:06 PM, Walter Underwood wunderw...@netflix.com wrote: I think what you want to do is add in a function query that gives values in that range. The scale function won't work in this use-case because it will give you a double in the given range. So you cannot do sort by score and price. For this use-case you need to scale to an integer value in a discrete range. -- Regards, Shalin Shekhar Mangar.
Re: How to round solr score ?
On Mon, Mar 30, 2009 at 11:07 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Yonik added a way to use the score of a query in function queries with SOLR-939. Look at the query function on the wiki. Some very cool things are possible now :) Sorry, that should have been SOLR-1046 -- Regards, Shalin Shekhar Mangar.
Re: How to round solr score ?
On Mon, Mar 30, 2009 at 11:10 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Mon, Mar 30, 2009 at 11:06 PM, Walter Underwood wunderw...@netflix.com wrote: I think what you want to do is add in a function query that gives values in that range. The scale function won't work in this use-case because it will give you a double in the given range. So you cannot do sort by score and price. For this use-case you need to scale to an integer value in a discrete range. Walter -- I think I misinterpreted your response. Sorry about that. You are indeed right. However, we can do scale(round(score, 2), 1, 10) or we can create a new scale function as you said. -- Regards, Shalin Shekhar Mangar.
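For comparison, the ordering the original poster asked for (rounded score first, then price) can be sketched on the client side in Python. This is not the Solr function query approach discussed in this thread, and it only reorders documents that have already been fetched. The field names score, price and id and the sample prices are assumptions for illustration; the two scores are the ones from the question.

```python
def sort_by_rounded_score_then_price(docs, digits=2):
    # Highest rounded score first; ties on the rounded score
    # are broken by ascending price.
    return sorted(docs, key=lambda d: (-round(d["score"], digits), d["price"]))


# Scores from the original question; prices and ids are made up.
docs = [
    {"id": "a", "score": 1.67989, "price": 30},
    {"id": "b", "score": 1.6767, "price": 20},
    {"id": "c", "score": 1.2, "price": 5},
]

ordered = sort_by_rounded_score_then_price(docs)
print([d["id"] for d in ordered])
```

Both 1.67989 and 1.6767 round to 1.68 at two digits, so the cheaper document is listed first among them, while the lower-scoring document stays last regardless of its price.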
Re: solr score
Hi Santhanaraj, Just search for boost on the Solr wiki and see if the boost feature meets your requirement. As for highlighting, this page explains how to implement Solr highlighting: http://wiki.apache.org/solr/HighlightingParameters - neeti On Wed, Sep 24, 2008 at 10:31 AM, sanraj25 [EMAIL PROTECTED] wrote: hi, How do I give more weight to the more frequently searched words in Solr? What is the functionality for this in the Apache Solr module? I have a list of the more frequently searched words on my site, and I need to highlight those words. From the net I found out that 'score' is used for this purpose. Is that true? Does anybody know about it? Please help me. with Regards, Santhanaraj R -- View this message in context: http://www.nabble.com/solr-score-tp19642046p19642046.html Sent from the Solr - User mailing list archive at Nabble.com.
solr score
hi, How do I give more weight to the more frequently searched words in Solr? What is the functionality for this in the Apache Solr module? I have a list of the more frequently searched words on my site, and I need to highlight those words. From the net I found out that 'score' is used for this purpose. Is that true? Does anybody know about it? Please help me. with Regards, Santhanaraj R -- View this message in context: http://www.nabble.com/solr-score-tp19642046p19642046.html Sent from the Solr - User mailing list archive at Nabble.com.
A question about solr score
Hi, everyone! As we know, Solr uses Lucene scoring. This score is the raw score. Scores returned from Hits aren't necessarily the raw score, however. If the top-scoring document scores greater than 1.0, all scores are normalized from that score, such that all scores from Hits are guaranteed to be 1.0 or less. Now here is my question: I always get scores for some documents which are above 1.0, and some even get up to 10.0! Why? I will really appreciate your reply.
Re: A question about solr score
Solr returns the raw score, not the Lucene Hits normalized one. It's trivial for the client to normalize if desired - take the top scoring document, if it's greater than 1.0 then scale all scores based on that. Erik On Oct 26, 2007, at 2:53 AM, zx zhang wrote: Hi, everyone! As we know, Solr uses Lucene scoring. This score is the raw score. Scores returned from Hits aren't necessarily the raw score, however. If the top-scoring document scores greater than 1.0, all scores are normalized from that score, such that all scores from Hits are guaranteed to be 1.0 or less. Now here is my question: I always get scores for some documents which are above 1.0, and some even get up to 10.0! Why? I will really appreciate your reply.
Re: A question about solr score
: It's trivial for the client to normalize if desired - take the top scoring : document, if it's greater than 1.0 then scale all scores based on that.
this is why doclists include the maxScore in their output as well, to make it easy to normalize scores even if you are using pagination (or sorting on a field other than score)
http://localhost:8983/solr/select/?q=video&fl=id,score&start=1
<result name="response" numFound="3" start="1" maxScore="0.5145902">
  <doc>
    <float name="score">0.39613172</float>
    <str name="id">EN7800GTX/2DHTV/256M</str>
  </doc>
  <doc>
    <float name="score">0.39613172</float>
    <str name="id">100-435805</str>
  </doc>
</result>
-Hoss
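The client-side normalization Erik and Hoss describe might look like the Python sketch below. It assumes the client has already pulled the per-document scores and the maxScore attribute out of the response; the function name normalize_scores and the second set of sample scores are only illustrative.

```python
def normalize_scores(scores, max_score):
    # Mirror the old Lucene Hits behaviour: only rescale when the top
    # score is above 1.0, so that every score ends up at 1.0 or less.
    if max_score <= 1.0:
        return list(scores)
    return [s / max_score for s in scores]


# Values from the response above: maxScore is below 1.0, nothing changes.
page_scores = [0.39613172, 0.39613172]
print(normalize_scores(page_scores, 0.5145902))

# Hypothetical scores above 1.0, scaled by the top score.
print(normalize_scores([10.0, 5.0, 2.5], 10.0))
```

Using maxScore rather than the first score on the page is what keeps this correct for start > 0 or when sorting on another field, since the top-scoring document may not be on the current page.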