Hi all,

 

I am working on an application that must deal with ranking on highly dynamic
metadata. For example, suppose I want to provide ranking based on the number
of downloads of hit documents. A user may log-in to the system and send a
query, which will be answered by Lucene in a traditional way. The relevant
documents will be scored and ranked, based on default Lucene scoring
functions. In addition, the system wants to support users with the
"popularity" ranking facility. The number of downloads for a document may
continue to increase, even during a query. It will incur much overhead if we
put the "popularity" as a field in the Lucene index (delete and insert the
document when an update happens). Instead, we choose to store such
information in a database, with document identifiers linking database
records back to the index.

 

This setting, however, creates a ranking problem. It is not efficient to
send each hit document identifier to the database as a SQL query to obtain
the download popularity information. It will be good to have the mechanism
to link database records directly with Lucene indices, for which a query can
retrieve corresponding records from the database at the querying time. We
are very interested to know if there is some open source toolkits or
libraries to do the dirty-works. Of course, we also want to know other
alternative solutions to meet our ends.

 

 

Best Regards,

 

Joaquin

Reply via email to