[If this conversation is off-topic for this list, we can move it offline but I figure since at least one other is interested I'll keep it here for now.]
Thanks for the reply, Joshua. I'm really just beginning my studies in IR so forgive any niavete. Please see my comments below. >I've also been doing research in IR using the Lucene API. Regarding query >expansion, I wrote my own code to do that, based on an algorithm I >developed (which is a type of thesaurus-based expansion). Basically I ran >the original query terms through the appropriate analyzers, decided what >terms I wanted to add, and constructed my own Query from these terms >(using TermQuery and BooleanQuery). > This makes sense to me. I'm looking to do something similar but add in a final step of allowing the user to select which terms (of those that are found to be relevant through a document relevance feedback mechanism) to actually include in the expansion. (I'm thinking OKAPI/Giraffe style here... I suppose it's been done elsewhere too but Efthimis Efthimiadis is my reference point for this project.) Either way, I can see how this is done without interacting with Lucene's scoring/ranking code. > >While I did weight documents based on the terms they used (take a look at >TermQuery.setBoost()), I didn't do relevance feedback per se. Of course, >how you want to implement relevance feedback will depend on what mechanism >you want to use. > I'm planning on implementing an array of term weighting/document ranking algorithms such that the user can choose which algorithm to use. (The system is for educational use so I'm trying to develop some transparency here.) In the near future, I'd like to implement term weighting using: (1) porter, (2) W_P-Q, (3) F4, and (4) EMIM. Each of these relies on (user-based) relevance feedback. Now that I look at this, I wonder if these algorithms could also be implemented outside of the scoring. Perhaps I can request the relevance judgments from the user, recalculate the term weights, use the TermQuery.setBoost() method, and reiterate the search. Am I missing something? Will the term boost function properly given the term weights calculated by these algorithms? > >PS: Say hi to Wanda Pratt for me. > I will... I haven't had the opportunity to meet her yet. Have you worked with her? [Off topic.. sorry.] Nathan -- To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>