[ 
https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513819
 ] 

Enis Soztutar commented on NUTCH-518:
-------------------------------------

Since there is no ordering among scoring filters, doing something specific 
to zero in OpicScoring might lead to nondeterministic behaviour. Suppose, 
for example, that the code in OpicScoring is changed to:

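public float indexerScore(Text url, Document doc, CrawlDatum dbDatum, 
    CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore) {
  if (initScore != 0) {
    // chain normally: multiply the incoming score by the OPIC score
    return (float) Math.pow(dbDatum.getScore(), scorePower) * initScore;
  } else {
    // do something special for a zero score (deliberately left unspecified)
  }
}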

Then there will be a big difference depending on whether scoring-opic runs 
before or after scoring-foo. 
As far as I can tell from the messages on the mailing lists, scoring filters 
are used for restricting the crawl to topics, so zero-handling might break 
topic-specific crawls. So my vote is to keep the current implementation. 
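
To make the ordering concern concrete, here is a minimal, self-contained 
sketch. It does not use the Nutch API: the Filter interface, the fixed 
DB_SCORE and SCORE_POWER values, the "foo" topic filter that zeroes the 
score, and the fallback chosen for the zero case are all assumptions made 
up for illustration. It only shows that a purely multiplicative chain 
gives the same result in either order, while special-casing zero does not.

```java
import java.util.List;

final class ChainOrderSketch {
    // Hypothetical stand-in for indexerScore: maps the incoming score to a new one.
    interface Filter {
        float apply(float initScore);
    }

    static final float DB_SCORE = 4.0f;    // assumed value of dbDatum.getScore()
    static final float SCORE_POWER = 0.5f; // assumed value of scorePower

    static float opicScore() {
        return (float) Math.pow(DB_SCORE, SCORE_POWER);
    }

    // Chaining behaviour: always multiply the incoming score.
    static final Filter OPIC_MULTIPLY = init -> opicScore() * init;

    // Hypothetical zero-handling: fall back to the OPIC score alone when init is 0.
    static final Filter OPIC_ZERO_SPECIAL =
        init -> init != 0 ? opicScore() * init : opicScore();

    // Hypothetical topic filter ("scoring-foo") that rejects this page.
    static final Filter FOO_OFF_TOPIC = init -> 0.0f;

    // Runs the filters in list order, feeding each result into the next filter.
    static float run(List<Filter> chain, float initScore) {
        float score = initScore;
        for (Filter f : chain) {
            score = f.apply(score);
        }
        return score;
    }

    public static void main(String[] args) {
        // Multiplicative chaining: both orders agree.
        System.out.println(run(List.of(OPIC_MULTIPLY, FOO_OFF_TOPIC), 1.0f));
        System.out.println(run(List.of(FOO_OFF_TOPIC, OPIC_MULTIPLY), 1.0f));
        // Zero special-casing: the result depends on the order.
        System.out.println(run(List.of(OPIC_ZERO_SPECIAL, FOO_OFF_TOPIC), 1.0f));
        System.out.println(run(List.of(FOO_OFF_TOPIC, OPIC_ZERO_SPECIAL), 1.0f));
    }
}
```

With the multiplicative filter the off-topic page ends at 0 in both orders. 
With the zero-special variant it ends at 0 when scoring-opic runs first but 
gets a non-zero score when scoring-opic runs after scoring-foo, which would 
undo the topic restriction.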

> Fix OpicScoringFilter to respect scoring filter chaining
> --------------------------------------------------------
>
>                 Key: NUTCH-518
>                 URL: https://issues.apache.org/jira/browse/NUTCH-518
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.0.0
>            Reporter: Enis Soztutar
>            Assignee: Doğacan Güney
>             Fix For: 1.0.0
>
>         Attachments: opicScoring.chain.patch
>
>
> Opic Scoring returns the score that it calculates, rather than returning 
> previous_score * calculated_score. This prevents using another scoring filter 
> along with Opic scoring. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
