> Actually, my problem is that, for instance, for a document d, its feature
> vector may be keywords and concepts. I don't know how to weight the two
> items. Right now, I use a stupid method: given a document d, I can obtain a
> rank D based on the keyword method. Also, it is annotated with a concept c
> (the simplest example). People can have a rank C of these concepts in the
> domain ontology, where the most relevant concepts should be at the top of
> this concept list. Finally, the document's rank is decided by the sum
> (C + D).
Hmm, if you index the concepts, e.g. by their URI, in a Lucene Field, you can set a boost value at indexing time like this:

    Field conceptField = Field.Text("classification",
        "http://concepts.server.com/classification/car/mercedes");
    conceptField.setBoost(1.3f);

The concept is then boosted in the relevance computation for the document the field is added to. If you know the concept boost value at search time, you can add it to the query instead, e.g. by querying for

    classification:"http://concepts.server.com/classification/car/mercedes"^4

Of course you have to think the whole thing through, but I think it would work with good boost values.

- mathias

ps. instead of C+D I would use (1-l)*C + l*D, so that l from [0,1] specifies whether concept or content has more influence.
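
A minimal sketch of the weighting from the ps., reading it as (1-l)*C + l*D so that the two weights sum to 1. The class name RankCombiner and the method combine are hypothetical and not part of Lucene, and the sketch assumes C and D are scores already scaled to the same range (here [0, 1]), which the thread does not specify:

```java
// Hypothetical helper, not part of Lucene: blends a concept score C and a
// keyword (content) score D with a single weight l in [0, 1].
final class RankCombiner {

    /** Returns (1 - l) * C + l * D; l = 0 uses only C, l = 1 uses only D. */
    static double combine(double conceptScore, double contentScore, double l) {
        // Written as a negated range check so that NaN is rejected as well.
        if (!(l >= 0.0 && l <= 1.0)) {
            throw new IllegalArgumentException("l must be in [0, 1], got " + l);
        }
        return (1.0 - l) * conceptScore + l * contentScore;
    }

    public static void main(String[] args) {
        double c = 0.8; // assumed concept score, already scaled to [0, 1]
        double d = 0.4; // assumed keyword score on the same scale
        System.out.println(combine(c, d, 0.0)); // concept only
        System.out.println(combine(c, d, 0.5)); // equal influence
        System.out.println(combine(c, d, 1.0)); // content only
    }
}
```

If C and D are rank positions rather than scores, they would probably need to be converted to a common scale first, otherwise the longer list dominates the sum whatever l is.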