Boost is calculated during the indexing phase by the scoring plugins: scoring.link, scoring.opic, scoring.tld
Boost is also calculated at query time by the corresponding query plugins, and may apply to certain fields. As I said, boost is an essential part of the scoring algorithm.

Digest is calculated during the crawl phase by one of:

  MD5Signature
  FeedSignature
  TextProfileSignature

Digest is usually, but not always, the MD5 hash of the page content.

Look at the implementation of each plugin to see exactly what is used to calculate the digest and the boosts. Anything might be in play.

Best Regards
Alexander Aristov


2009/6/16 Fabrice Estiévenart <[email protected]>

> Thank you,
>
> From which information are they computed? My suppositions:
>
> Boost: inlinks, ...?
> Digest: content, url, title, ...?
>
> Fabrice
>
> Alexander Aristov wrote:
>
>> Hi
>>
>> Boost is used to calculate the document (field) score, which is used by
>> Lucene in queries to find the best results. It's part of the scoring
>> algorithms.
>>
>> Digest is used to identify pages (like a unique ID) and is used to remove
>> duplicates during the dedup procedure.
>>
>> Best Regards
>> Alexander Aristov
>>
>>
>> 2009/6/16 Fabrice Estiévenart <[email protected]>
>>
>>> Hello,
>>>
>>> How are the "boost" and "digest" fields computed in a Nutch index?
>>> What precisely are they used for?
>>>
>>> I can't find this information, thanks.
>>>
>>> --
>>> Fabrice Estiévenart, Ingénieur R&D, CETIC
>>> Tél : +32 (0)71/49.07.28
>>> Web : http://www.cetic.be
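P.S. To make the digest idea concrete, here is a minimal sketch in plain Java. It is not Nutch code: the class DigestSketch and its methods md5Hex and dedup are hypothetical names made up for illustration. It assumes the digest is simply the MD5 hash of the raw page content (the usual case mentioned above) and that dedup keeps the first URL seen for each digest. The real dedup job may choose which duplicate to keep differently, so check the source.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Nutch.
final class DigestSketch {

    // Returns the MD5 hash of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hash = md.digest(content);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so negative bytes print as two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    // Takes url -> page content and returns the URLs that survive dedup.
    // Assumption: the first URL seen for a given digest is the one kept.
    static List<String> dedup(Map<String, String> pages) {
        Set<String> seenDigests = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, String> page : pages.entrySet()) {
            String digest = md5Hex(page.getValue().getBytes(StandardCharsets.UTF_8));
            if (seenDigests.add(digest)) { // add() is false for a repeated digest
                kept.add(page.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("http://example.org/a", "same content");
        pages.put("http://example.org/b", "same content");
        pages.put("http://example.org/c", "other content");
        // The second URL has the same digest as the first, so it is dropped.
        System.out.println(dedup(pages));
    }
}
```

Because only the content bytes go into the hash here, two different URLs serving identical content get the same digest, which is what lets dedup treat them as one page. A signature that hashes something else (a text profile, for example) would group pages differently.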
