Hi - You mention having a list of important terms, so payloads would probably be the most straightforward approach. You would still need a custom similarity and a custom query parser. Payloads work very well for us.
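To make the payload idea a bit more concrete, here is a rough sketch of how the important-term weights could be attached to the text before it is sent to Solr. It assumes the target field is analyzed with a whitespace tokenizer followed by Solr's DelimitedPayloadTokenFilterFactory (encoder="float", delimiter="|"); the helper name and the weight table are made up for illustration and are not part of our setup. The custom similarity and query parser are still needed to actually use the payloads when scoring.

```python
def annotate_with_payloads(text, weights, delimiter="|"):
    """Append a weight payload to every important term.

    `weights` maps a lowercased term to its importance weight
    (hypothetical data, e.g. {"hello": 2.0}). Terms that are not in
    the table are passed through unchanged.
    """
    annotated = []
    for token in text.split():
        weight = weights.get(token.lower())
        if weight is None:
            annotated.append(token)
        else:
            # e.g. "hello" -> "hello|2.0"; the delimiter must match the
            # one configured on the payload token filter (assumption).
            annotated.append(f"{token}{delimiter}{weight}")
    return " ".join(annotated)


if __name__ == "__main__":
    print(annotate_with_payloads("hello world hello", {"hello": 2.0}))
```

The annotated string would then be indexed as the field value, so the weight travels with each occurrence of the term.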
M

-----Original message-----
> From:Ahmet Arslan <iori...@yahoo.com.INVALID>
> Sent: Monday 12th January 2015 19:50
> To: solr-user@lucene.apache.org
> Subject: Re: Extending solr analysis in index time
>
> Hi Ali,
>
> Reading your example, if you could somehow replace idf component with your
> "importance weight",
> I think your use case looks like TFIDFSimilarity. Tf component remains same.
>
> https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
> I also suggest you ask this in lucene mailing list. Someone familiar with
> similarity package can give insight on this.
>
> Ahmet
>
>
> On Monday, January 12, 2015 6:54 PM, Jack Krupansky
> <jack.krupan...@gmail.com> wrote:
> Could you clarify what you mean by "Lucene reverse index"? That's not a
> term I am familiar with.
>
> -- Jack Krupansky
>
>
> On Mon, Jan 12, 2015 at 1:01 AM, Ali Nazemian <alinazem...@gmail.com> wrote:
>
> > Dear Jack,
> > Thank you very much.
> > Yeah I was thinking of function query for sorting, but I have to problems
> > in this case, 1) function query do the process at query time which I dont
> > want to. 2) I also want to have the score field for retrieving and showing
> > to users.
> >
> > Dear Alexandre,
> > Here is some more explanation about the business behind the question:
> > I am going to provide a field for each document, lets refer it as
> > "document_score". I am going to fill this field based on the information
> > that could be extracted from Lucene reverse index. Assume I have a list of
> > terms, called important terms and I am going to extract the term frequency
> > for each of the terms inside this list per each document. To be honest I
> > want to use the term frequency for calculating "document_score".
> > "document_score" should be storable since I am going to retrieve this field
> > for each document. I also want to do sorting on "document_store" in case of
> > preferred by user.
> > I hope I did convey my point.
> > Best regards.
> >
> >
> > On Mon, Jan 12, 2015 at 12:53 AM, Jack Krupansky <jack.krupan...@gmail.com>
> > wrote:
> >
> > > Won't function queries do the job at query time? You can add or multiply
> > > the tf*idf score by a function of the term frequency of arbitrary terms,
> > > using the tf, mul, and add functions.
> > >
> > > See:
> > > https://cwiki.apache.org/confluence/display/solr/Function+Queries
> > >
> > > -- Jack Krupansky
> > >
> > > On Sun, Jan 11, 2015 at 10:55 AM, Ali Nazemian <alinazem...@gmail.com>
> > > wrote:
> > >
> > > > Dear Jack,
> > > > Hi,
> > > > I think you misunderstood my need. I dont want to change the default
> > > > scoring behavior of Lucene (tf-idf) I just want to have another field to
> > > > do sorting for some specific queries (not all the search business),
> > > > however I am aware of Lucene payload.
> > > > Thank you very much.
> > > >
> > > > On Sun, Jan 11, 2015 at 7:15 PM, Jack Krupansky
> > > > <jack.krupan...@gmail.com> wrote:
> > > >
> > > > > You would do that with a custom similarity (scoring) class. That's an
> > > > > expert feature. In fact a SUPER-expert feature.
> > > > >
> > > > > Start by completely familiarizing yourself with how TF*IDF similarity
> > > > > already works:
> > > > >
> > > > > http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> > > > >
> > > > > And to use your custom similarity class in Solr:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements#OtherSchemaElements-Similarity
> > > > >
> > > > > -- Jack Krupansky
> > > > >
> > > > > On Sun, Jan 11, 2015 at 9:04 AM, Ali Nazemian <alinazem...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi everybody,
> > > > > >
> > > > > > I am going to add some analysis to Solr at the index time. Here is
> > > > > > what I am considering in my mind:
> > > > > > Suppose I have two different fields for Solr schema, field "a" and
> > > > > > field "b". I am going to use the created reverse index in a way that
> > > > > > some terms are considered as important ones and tell lucene to
> > > > > > calculate a value based on these terms frequency per each document.
> > > > > > For example let the word "hello" considered as important word with
> > > > > > the weight of "2.0". Suppose the term frequency for this word at
> > > > > > field "a" is 3 and at field "b" is 6 for document 1. Therefor the
> > > > > > score value would be 2*3+(2*6)^2. I want to calculate this score
> > > > > > based on these fields and put it in the index for retrieving. My
> > > > > > question would be how can I do such thing? First I did consider
> > > > > > using term component for calculating this value from outside and put
> > > > > > it back to Solr index, but it seems it is not efficient enough.
> > > > > >
> > > > > > Thank you very much.
> > > > > > Best regards.
> > > > > >
> > > > > > --
> > > > > > A.Nazemian
> > > >
> > > > --
> > > > A.Nazemian
> >
> > --
> > A.Nazemian
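
For reference, the example score from the original question (weight 2.0 for "hello", term frequency 3 in field "a" and 6 in field "b") can be worked through as below. This is only a sketch of the arithmetic: the function name is hypothetical, and summing over several important terms is an assumption, since the question only gives the single-term case.

```python
def document_score(weights, tf_a, tf_b):
    """Combine per-field term frequencies into one score.

    For each important term with weight w:
      field "a" contributes w * tf
      field "b" contributes (w * tf) ** 2
    Summing across terms is an assumption; the thread only shows one term.
    """
    score = 0.0
    for term, weight in weights.items():
        score += weight * tf_a.get(term, 0)
        score += (weight * tf_b.get(term, 0)) ** 2
    return score


if __name__ == "__main__":
    # 2*3 + (2*6)^2 = 6 + 144
    print(document_score({"hello": 2.0}, {"hello": 3}, {"hello": 6}))  # 150.0
```

A value computed this way would have to be written to a stored, sortable field at index time, which is the part the thread is actually asking about.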