On 1/19/06, Mathias Lux <[EMAIL PROTECTED]> wrote:
>
> > Actually, my problem is that, for instance, for a document d, its
> > feature vector may be keywords and concepts. I don't know how to
> > weight the two items. Right now, I use a stupid method: given a
> > document d, I can obtain a rank D based on the keyword method. Also,
> > it is annotated with a concept c (the simplest example). People can
> > have a rank C of these concepts in the domain ontology, where the
> > most relevant concepts should be at the top of this concept list.
> > Finally, the document's rank is decided by the sum (C + D).
>
> Hmm, if you index the concepts, e.g. based on their URI, in a Lucene
> Field, you can set a boost value at indexing time like this:
>
>     Field conceptField = Field.Text("classification",
>         "http://concepts.server.com/classification/car/mercedes");
>     conceptField.setBoost(1.3f);
>
> So the concept for this document, to which the field is added, is
> boosted in the relevance computation.
>
> If you know the concept boost value at search time, you can add the
> boost value to the query, e.g. querying for
>
>     classification:"http://concepts.server.com/classification/car/mercedes"^4
>
> Of course you have to think about the whole thing, but I think with
> good boost values it would work.
>
> - mathias
>
> PS: Instead of C + D I would use (1-l)*C + l*D, so l from [0,1] can be
> used to specify whether concept or content has more influence.
I compute each concept's relevance separately for each query. Thus, I
cannot set the boost value. Actually, I already use the (1-l)*C + l*D
method in my prototype, but my supervisor said this method is funny
because it is too simple.

--
Regards
Jiang Xing
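For reference, the weighted combination discussed in this thread can be sketched in plain Java. This is only an illustration under assumptions, not code from either poster: the class `RankCombiner` and its methods are hypothetical names and not part of Lucene, the weight is written as (1 - l) so that both weights stay non-negative for l in [0,1], and C and D are treated as scores on a comparable scale where higher means more relevant.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Lucene: blends a per-query concept score C
// with a keyword score D using a single weight l in [0, 1].
class RankCombiner {

    // Returns (1 - l) * conceptScore + l * keywordScore.
    // l = 0 uses only the concept score, l = 1 only the keyword score.
    // Assumption: both scores are on a comparable scale, higher is better.
    static double combine(double conceptScore, double keywordScore, double l) {
        if (l < 0.0 || l > 1.0) {
            throw new IllegalArgumentException("l must be in [0, 1]");
        }
        return (1.0 - l) * conceptScore + l * keywordScore;
    }

    // Returns the document ids ordered by descending combined score.
    // conceptScores[i] and keywordScores[i] belong to docIds[i].
    static List<String> rank(String[] docIds, double[] conceptScores,
                             double[] keywordScores, double l) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < docIds.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble(
                (Integer i) -> combine(conceptScores[i], keywordScores[i], l))
                .reversed());

        List<String> result = new ArrayList<>();
        for (int i : order) {
            result.add(docIds[i]);
        }
        return result;
    }
}
```

Because the concept scores are passed in per call, they can be recomputed for every query, which an index-time boost cannot do. Moving l toward 0 lets the concept score dominate, and moving it toward 1 lets the keyword score dominate.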