[If this conversation is off-topic for this list, we can move it offline 
but I figure since at least one other is interested I'll keep it here 
for now.]

Thanks for the reply, Joshua.  I'm really just beginning my studies in 
IR so forgive any niavete.  Please see my comments below.

>I've also been doing research in IR using the Lucene API.  Regarding query
>expansion, I wrote my own code to do that, based on an algorithm I
>developed (which is a type of thesaurus-based expansion).  Basically I ran
>the original query terms through the appropriate analyzers, decided what
>terms I wanted to add, and constructed my own Query from these terms
>(using TermQuery and BooleanQuery).
>

This makes sense to me.  I'm looking to do something similar but add in 
a final step of allowing the user to select which terms (of those that 
are found to be relevant through a document relevance feedback 
mechanism) to actually include in the expansion.  (I'm thinking 
OKAPI/Giraffe style here... I suppose it's been done elsewhere too but 
Efthimis Efthimiadis is my reference point for this project.)  Either 
way, I can see how this is done without interacting with Lucene's 
scoring/ranking code.    

>
>While I did weight documents based on the terms they used (take a look at
>TermQuery.setBoost()), I didn't do relevance feedback per se.  Of course,
>how you want to implement relevance feedback will depend on what mechanism
>you want to use.
>

I'm planning on implementing an array of term weighting/document ranking 
algorithms such that the user can choose which algorithm to use.  (The 
system is for educational use so I'm trying to develop some transparency 
here.)  In the near future, I'd like to implement term weighting using: 
(1) porter, (2) W_P-Q, (3) F4, and (4) EMIM.  Each of these relies on 
(user-based) relevance feedback.  

Now that I look at this, I wonder if these algorithms could also be 
implemented outside of the scoring.  Perhaps I can request the relevance 
judgments from the user, recalculate the term weights, use the 
TermQuery.setBoost() method, and reiterate the search.  Am I missing 
something?  Will the term boost function properly given the term weights 
calculated by these algorithms?

>
>PS: Say hi to Wanda Pratt for me.  
>
I will... I haven't had the opportunity to meet her yet.  Have you 
worked with her? [Off topic.. sorry.]

Nathan


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Reply via email to