Can this type of sorting/boosting be done by solr
Hi,

I have a journal article citation schema like this:

{
AT - article_title
AID - article_id (unique id)
AREFS - article_references_list (list of article ids referred to/cited in this article; multi-valued)
AA - article abstract
--- other_article_stuff ...
}

So, for example, to find all articles that refer to (cite) article id 51643, I simply search for AREFS:51643, which gives me the list of articles that have 51643 listed in AREFS.

Now I want to search in the text of articles and sort the results by most-cited articles. How can I do this?

Say my search query is q=AT:metal and it gives me 1700 results. How can I sort those 1700 results so that the articles that have received the most citations from others come first?

I have been researching function queries to solve this but have not found a way.

Thanks in advance.
Ritesh

-- View this message in context: http://lucene.472066.n3.nabble.com/Can-this-type-of-sorting-boosting-be-done-by-solr-tp3769315p3769315.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Can this type of sorting/boosting be done by solr
Hi Ritesh,

you could add another field that contains the size of the list in the AREFS field. This way you'd simply sort by that field in descending order. Should you update AREFS dynamically, you'd have to update the field with the size, as well, of course.

Chantal

On Thu, 2012-02-23 at 11:27 +0100, rks_lucene wrote:
> Now, I want to be able to search in the text of articles and sort the results by most referred articles. How can I do this ?
Re: Can this type of sorting/boosting be done by solr
Dear Chantal,

Thanks for your reply, but that's not what I was asking. Let me explain.

The size of the list in AREFS would tell me how many records are *referred by* an article, NOT how many records *refer to* an article.

Say article id 51463 was published in 2002 and refers to 10 articles dating from 1990-2002. Then the count of AREFS would be 10, which is static once the journal has been published. However, if the same article is *referred to* by 20 articles published from 2003-2012, then it is this count of 20 that I am talking about. This count is dynamic: as we keep adding records to the index, more articles will list article 51463 in their AREFS field in the future.

(Obviously, when we add article 51463 to the index we have no clue who will refer to it in the future, so we cannot have another field in it for this, nor can we update 51463 every time someone refers to it.)

So today, if I want to know which articles refer to 51463, I search for this id in the AREFS field. The query is as simple as q=AREFS:51463; it gives the list of articles from 2003 to 2012, and the result count is 20.

So back to the question: say my search query is q=AT:metal and it gives me 1700 results. How can I sort those 1700 results by the number of citations they have received from others to date (i.e., by the number of results I would get if I searched for each of their ids individually in the AREFS field)?

Hope this makes it clear. I feel this is a candidate for sorting/boosting by function query, but I am not able to figure it out.

Thanks
Ritesh

-- View this message in context: http://lucene.472066.n3.nabble.com/Can-this-type-of-sorting-boosting-be-done-by-solr-tp3769315p3769475.html
Sent from the Solr - User mailing list archive at Nabble.com.
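To make the distinction concrete, here is a minimal Python sketch over a made-up three-document corpus. Only the field names AID and AREFS come from the thread; the documents and all other ids are invented for illustration. It computes the outgoing count (the size of an article's own AREFS list) next to the incoming count (how many other documents list the id in their AREFS, which is what q=AREFS:<id> counts).

```python
from collections import Counter

# Toy corpus: field names follow the schema in this thread, values are invented.
docs = [
    {"AID": "51463", "AREFS": ["100", "101", "102"]},
    {"AID": "60001", "AREFS": ["51463", "100"]},
    {"AID": "60002", "AREFS": ["51463"]},
]

def outgoing_counts(docs):
    """Number of articles each article refers to (size of its own AREFS)."""
    return {d["AID"]: len(d["AREFS"]) for d in docs}

def incoming_counts(docs):
    """Number of articles that refer to each id (what q=AREFS:<id> counts)."""
    counts = Counter()
    for d in docs:
        # set() guards against an id being listed twice in one document
        counts.update(set(d["AREFS"]))
    return counts

out_c = outgoing_counts(docs)
in_c = incoming_counts(docs)
print("refers to:", out_c["51463"], "referred to by:", in_c["51463"])
```

The outgoing count is fixed when a document is indexed, while the incoming count changes whenever any other document is added, which is why it cannot simply be stored on the cited document at index time.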
Re: Can this type of sorting/boosting be done by solr
Have you looked at external fields?

http://lucidworks.lucidimagination.com/display/solr/Solr+Field+Types#SolrFieldTypes-WorkingwithExternalFiles

You will need a process to do the counts, and note the limitation that updates are picked up only after a commit, but I think it would fit your use case.

On 23 February 2012 12:04, rks_lucene ppro.i...@gmail.com wrote:
> The size of the list in AREFS would give me how many records are *referred by* an article and NOT how many records *refer to* an article.
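The "process to do the counts" could look roughly like the sketch below. It assumes the documents are available to the script as dictionaries with AID and AREFS keys (how they are obtained is left open), and it produces lines in the key=value form that, to my understanding, an external file field reads from a file named external_<fieldname> in the index data directory. The field name and the sample data are hypothetical, so check the file name, location and key field against the documentation for your Solr version.

```python
from collections import Counter

def external_file_lines(docs):
    """Build 'AID=count' lines giving the number of documents citing each AID."""
    counts = Counter()
    for d in docs:
        # Count each citing document once per cited id
        counts.update(set(d.get("AREFS", [])))
    # Sorted keys keep the output deterministic between runs
    return [f"{aid}={counts[aid]}" for aid in sorted(counts)]

# Invented sample data; only the field names come from the thread.
sample_docs = [
    {"AID": "51463", "AREFS": ["100"]},
    {"AID": "60001", "AREFS": ["51463", "100"]},
    {"AID": "60002", "AREFS": ["51463"]},
]

lines = external_file_lines(sample_docs)
print("\n".join(lines))
```

Ids that nobody cites are left out of the file on purpose, on the assumption that the field type's default value covers them. The file would have to be regenerated and a commit issued whenever new citing articles are indexed.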
Re: Can this type of sorting/boosting be done by solr
Sorry to have misunderstood. It seems the new Relevance Functions in Solr 4.0 might help - unless you need to use an official release.

http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions

On Thu, 2012-02-23 at 13:04 +0100, rks_lucene wrote:
> I feel this is a sort/boost by function query candidate. But I am not able to figure it out.
Re: Can this type of sorting/boosting be done by solr
Hi Chantal,

Yes, I have thought about the docfreq(field_name,'search_text') function, but somehow I will have to dereference the article id (AID) of each document in the query result and pass it to the sort.

The query below does not work:

q=AT:metal&sort=docfreq(AREFS,$q.AID)

Is there a mistake in the query that I am missing, or is dereferencing not supported in Relevance Functions?

Thanks,
Ritesh

-- View this message in context: http://lucene.472066.n3.nabble.com/Can-this-type-of-sorting-boosting-be-done-by-solr-tp3769315p3769779.html
Sent from the Solr - User mailing list archive at Nabble.com.
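As far as I know, docfreq expects a constant term (a literal, or a request parameter such as $myterm that is the same for every document), so it cannot look up a different AID per result. One possible workaround, sketched below under stated assumptions, is to obtain the citation counts separately with a facet on AREFS (for example q=*:*&rows=0&facet=true&facet.field=AREFS&facet.limit=-1&wt=json) and re-sort the result documents on the client. The sketch assumes the facet counts arrive as a flat list of alternating term and count; the helper names and sample values are hypothetical.

```python
def parse_flat_facet(flat):
    """Turn a flat facet list [term, count, term, count, ...] into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))

def sort_by_citations(result_docs, cited_by):
    """Order result documents by incoming citation count, highest first.

    Documents whose AID never appears in AREFS are treated as having 0 citations.
    """
    return sorted(result_docs, key=lambda d: cited_by.get(d["AID"], 0), reverse=True)

# Invented sample values standing in for a facet response and a result page.
facet_counts = ["51463", 20, "777", 3]
results = [{"AID": "777"}, {"AID": "999"}, {"AID": "51463"}]

cited_by = parse_flat_facet(facet_counts)
ranked = sort_by_citations(results, cited_by)
print([d["AID"] for d in ranked])
```

This only sorts what has been fetched, so all 1700 matches would have to be retrieved before sorting, and an unlimited facet over AREFS can be expensive on a large index. Precomputing the counts into a sortable field avoids both costs.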