Can this type of sorting/boosting be done by solr

2012-02-23 Thread rks_lucene
Hi,

I have a journal article citation schema like this:
{  AT - article_title
   AID - article_id (Unique id)
   AREFS - article_references_list (List of article id's referred/cited in
this article. Multi-valued)
   AA - Article Abstract
   ---
   other_article_stuff
   ...
}

So, for example, to find all the articles that refer to (cite) article id
51643, I simply search for AREFS:51643, which returns the list of articles
that have 51643 listed in AREFS.

Now, I want to be able to search in the text of articles and sort the
results by most-cited articles. How can I do this?

Say my search query is q=AT:metal and it gives me 1700 results. How can I
sort those 1700 results by the number of citations each has received from
other articles?

I have been researching function queries to solve this but have been unable
to do so.

Thanks in advance.
Ritesh


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-this-type-of-sorting-boosting-be-done-by-solr-tp3769315p3769315.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread Chantal Ackermann
Hi Ritesh,

you could add another field that contains the size of the list in the
AREFS field. This way you'd simply sort by that field in descending
order.

Should you update AREFS dynamically, you'd have to update the field with
the size, as well, of course.

Chantal


Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread rks_lucene
Dear Chantal,

Thanks for your reply, but that's not what I was asking.

Let me explain. The size of the list in AREFS would give me how many records
are *referred by* an article and NOT how many records *refer to* an article.

Say article id 51463 was published in 2002 and refers to 10 articles dating
from 1990-2002. Then the count of AREFS would be 10, which is static once
the journal has been published.

However, if the same article is *referred to* by 20 articles published
from 2003-2012, then it is this count of 20 that I am talking about. This
count is dynamic: as we keep adding records to the index, more articles will
refer to article 51463 in their AREFS field in the future.
/(Obviously, when we add article 51463 to the index we have no clue who will
refer to it in the future, so we cannot have another field in it for this,
nor can we update 51463 every time someone refers to it.)/

So today, if I want to know which articles refer to 51463, I search for
this id in the AREFS field. The query is as simple as q=AREFS:51463; it
returns the list of articles from 2003 to 2012, and the result count is 20.

So back to the question: say my search query is q=AT:metal and it gives me
1700 results. How can I sort those 1700 results by the number of citations
each has received (to date) from others (i.e., by the number of results I
would get if I individually searched for each id in the AREFS field)?

Hope this makes it clear. I feel this is a candidate for sort/boost by
function query, but I am not able to figure it out.

Thanks
Ritesh  
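
A minimal sketch of the distinction above, using a toy in-memory list in
place of the Solr index. The sample articles and the helper names
(refers_to_counts, cited_by_counts, search_sorted_by_citations) are
hypothetical; only the field names AID, AT and AREFS come from the schema in
this thread. It illustrates the desired ordering, not how Solr would compute
it.

```python
from collections import Counter

# Toy stand-in for the index; field names follow the schema in the thread.
ARTICLES = [
    {"AID": "1", "AT": "metal fatigue", "AREFS": []},
    {"AID": "2", "AT": "metal oxides", "AREFS": ["1"]},
    {"AID": "3", "AT": "polymer films", "AREFS": ["1", "2"]},
    {"AID": "4", "AT": "metal alloys", "AREFS": ["1", "2", "3"]},
]


def refers_to_counts(articles):
    """Outbound count: the size of each article's own AREFS list."""
    return {a["AID"]: len(a["AREFS"]) for a in articles}


def cited_by_counts(articles):
    """Inbound count: how many articles list a given id in their AREFS."""
    counts = Counter()
    for a in articles:
        # set() so that an id repeated within one AREFS list counts once
        counts.update(set(a["AREFS"]))
    return counts


def search_sorted_by_citations(articles, title_word):
    """Match on a title word, then order hits by inbound count, descending."""
    counts = cited_by_counts(articles)
    hits = [a for a in articles if title_word in a["AT"].split()]
    return sorted(hits, key=lambda a: counts[a["AID"]], reverse=True)
```

The outbound count is fixed when an article is indexed, while the inbound
count changes whenever a later article is added, which is why it cannot
simply be stored on the cited article at index time.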



Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread Lee Carroll
Have you looked at external file fields?

http://lucidworks.lucidimagination.com/display/solr/Solr+Field+Types#SolrFieldTypes-WorkingwithExternalFiles

You will need a process to compute the counts, and note the limitation that
updates become visible only after a commit, but I think it would fit your
use case.
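
A rough sketch of the counting process mentioned above, assuming the
external file format of one key=value line per document in a file named
external_<fieldname> in the index data directory. The field name
cited_by_count and the helper names are hypothetical, and the AREFS data is
passed in as a plain dict rather than read from Solr.

```python
import os


def cited_by_counts(arefs_by_id):
    """Map each indexed article id to the number of articles citing it."""
    counts = {aid: 0 for aid in arefs_by_id}
    for refs in arefs_by_id.values():
        for cited in set(refs):  # a repeated id in one list counts once
            if cited in counts:  # skip ids that are not in the index
                counts[cited] += 1
    return counts


def format_external_file(counts):
    """Render the counts as key=value lines, sorted by key."""
    return "".join(f"{aid}={n}\n" for aid, n in sorted(counts.items()))


def write_external_file(counts, data_dir, field_name="cited_by_count"):
    """Write the file under a scratch name, then move it into place."""
    path = os.path.join(data_dir, f"external_{field_name}")
    # Scratch name chosen so it does not itself look like external_<field>.*
    tmp = os.path.join(data_dir, f".tmp_external_{field_name}")
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(format_external_file(counts))
    os.replace(tmp, path)
    return path
```

With a matching external file field declared in the schema and keyed on AID,
the sort could then reference it through a function, for example something
like sort=field(cited_by_count) desc. That syntax should be checked against
the Solr version in use.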




Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread Chantal Ackermann
Sorry for the misunderstanding.
The new relevance functions in Solr 4.0 might help, unless you need to use
an official release.

http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions
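
For reference, docfreq(field,term) on that wiki page returns the number of
documents that contain the term in the field. A rough Python model of that
lookup, using a hand-built postings map instead of a real Lucene index; the
helper names and sample ids are hypothetical.

```python
from collections import defaultdict


def build_postings(articles, field):
    """Inverted index: term -> set of ids of the articles containing it."""
    postings = defaultdict(set)
    for article in articles:
        for term in article.get(field, []):
            postings[term].add(article["AID"])
    return postings


def docfreq(postings, term):
    """Number of articles whose field contains the given term."""
    return len(postings.get(term, ()))


SAMPLE = [
    {"AID": "60001", "AREFS": ["51463"]},
    {"AID": "60002", "AREFS": ["51463", "40000"]},
    {"AID": "51463", "AREFS": ["40000"]},
]
```

In this model, docfreq(build_postings(SAMPLE, "AREFS"), "51463") is the
number of articles citing 51463. The value depends only on the term passed
in, not on the document being scored.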



Re: Can this type of sorting/boosting be done by solr

2012-02-23 Thread rks_lucene
Hi Chantal,

Yes, I have thought about the docfreq(field_name,'search_text') function,
but somehow I will have to dereference the article ids (AID) from the result
of the query into the sort. The query below does not work:

q=AT:metal&sort=docfreq(AREFS,$q.AID)

Is there a mistake in the query that I am missing, or is dereferencing not
supported in relevance functions?

Thanks,
Ritesh
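
As far as the FunctionQuery wiki describes it, $name inside a function
refers to another request parameter called name, so it can substitute a
fixed term but not a field value taken from each matching document. A small
sketch of building such a request in Python; the parameter name cited and
the helper build_query are hypothetical.

```python
from urllib.parse import urlencode


def build_query(title_word, cited_id):
    """Build a query string where $cited is resolved from the request."""
    params = [
        ("q", f"AT:{title_word}"),
        # $cited is looked up among the request parameters, not per document
        ("sort", "docfreq(AREFS,$cited) desc"),
        ("cited", cited_id),
    ]
    # urlencode joins the parameters with '&' and escapes special characters
    return urlencode(params)
```

Because the term is the same for every hit, this sort key has one value for
the whole result set and does not rank the 1700 results. It only shows what
parameter dereferencing can express.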



