Eric Osgood wrote:
Andrzej,
Based on what you suggested below, I have begun to write my own scoring
plugin:
Great!
In distributeScoreToOutlinks(), if the link contains the string I'm
looking for, I set its score to kept_score and add a flag to the
metaData in parseData ("KEEP", "true"). How do I check for this flag in
generatorSortValue()? I only see a way to check the score, not a flag.
The flag should have been automagically added to the target CrawlDatum
metadata after you have updated your crawldb (see the details in
CrawlDbReducer). Then in generatorSortValue() you can check for the
presence of this flag by calling datum.getMetaData().
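
Here is a minimal sketch of that check. It is not real Nutch code: a plain
java.util.Map stands in for the metadata returned by datum.getMetaData(), and
the class name, the method signatures and the choice to return
Float.MIN_VALUE for unflagged entries are all assumptions made for
illustration. In a real plugin the metadata is a Hadoop MapWritable, so the
lookup would presumably use Text keys and values instead of Strings.

```java
import java.util.Map;

// Hypothetical stand-in for a scoring plugin. A Map<String, String> plays
// the role of the metadata returned by datum.getMetaData().
final class KeepFlagSortSketch {
    // Key and value as set in distributeScoreToOutlinks() ("KEEP", "true").
    static final String KEEP_KEY = "KEEP";
    static final String KEEP_VALUE = "true";

    // True only if the metadata carries the KEEP flag with value "true".
    static boolean hasKeepFlag(Map<String, String> metaData) {
        return metaData != null && KEEP_VALUE.equals(metaData.get(KEEP_KEY));
    }

    // Assumed policy: flagged entries are sorted by score * initSort,
    // everything else gets Float.MIN_VALUE as a "skip me" sentinel.
    static float generatorSortValue(Map<String, String> metaData,
                                    float datumScore, float initSort) {
        if (hasKeepFlag(metaData)) {
            return datumScore * initSort;
        }
        return Float.MIN_VALUE;
    }
}
```

The sentinel only has an effect if the Generator is taught to skip it.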
BTW - you are right, the Generator doesn't treat Float.MIN_VALUE in any
special way ... I thought it did. It's easy to add this, though - in
Generator.java:161 just add this:
if (sort == Float.MIN_VALUE) {
  return;
}
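
To show the effect of that check in isolation, here is a small standalone
sketch. It is not the Generator code: the class, the method and the
url-to-sort-value map are hypothetical, and it only illustrates dropping
entries whose sort value equals the sentinel. Note that in Java
Float.MIN_VALUE is the smallest positive float, not the most negative one, so
the exact == comparison is what makes it work as a marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of skipping sentinel-valued entries.
final class SentinelSkipSketch {
    // Returns the URLs whose sort value is not the Float.MIN_VALUE sentinel,
    // in the iteration order of the input map.
    static List<String> selectForFetch(Map<String, Float> sortValues) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Float> entry : sortValues.entrySet()) {
            float sort = entry.getValue();
            if (sort == Float.MIN_VALUE) {
                continue; // same test as the Generator patch: skip this entry
            }
            selected.add(entry.getKey());
        }
        return selected;
    }
}
```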
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com