I would assume that it needs these to handle the indexing of the
link scores. Lucene by itself puts no scoring weight on things such as
URLs or page rank. Since Lucene only indexes documents, and calculates
its keyword/query relevancy based only on term vectors (or whatever),
Nutch needs to add the URL scoring and similar link data to the index.
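
To make that guess a bit more concrete, here is a toy sketch in plain Java.
It is not Nutch or Lucene code: the class, the method names and the two maps
standing in for the crawl db and link db are all made up for illustration,
and the scoring is deliberately crude. It only shows the general idea of
joining per-URL link data onto a document at index time so that it can
influence ranking later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: none of these names are real Nutch or Lucene classes.
class IndexMergeSketch {

    // Stand-in for an index entry: page text plus the extras joined in at index time.
    static final class IndexedDoc {
        final String url;
        final String text;
        final List<String> anchors; // anchor text of incoming links (link db stand-in)
        final float boost;          // per-URL link score (crawl db stand-in)

        IndexedDoc(String url, String text, List<String> anchors, float boost) {
            this.url = url;
            this.text = text;
            this.anchors = anchors;
            this.boost = boost;
        }
    }

    // Join one fetched page with whatever the two hypothetical databases know about its URL.
    static IndexedDoc buildDoc(String url, String text,
                               Map<String, Float> crawlDbScores,
                               Map<String, List<String>> linkDbAnchors) {
        float boost = crawlDbScores.getOrDefault(url, 1.0f);
        List<String> anchors = linkDbAnchors.getOrDefault(url, new ArrayList<>());
        return new IndexedDoc(url, text, anchors, boost);
    }

    // Crude term count over page text and anchors, standing in for text-only relevance.
    static float textScore(IndexedDoc doc, String term) {
        String haystack = (doc.text + " " + String.join(" ", doc.anchors)).toLowerCase();
        String needle = term.toLowerCase();
        int hits = 0;
        for (String word : haystack.split("\\s+")) {
            if (word.equals(needle)) {
                hits++;
            }
        }
        return hits;
    }

    // Text relevance scaled by the link-derived boost stored with the document.
    static float finalScore(IndexedDoc doc, String term) {
        return textScore(doc, term) * doc.boost;
    }

    public static void main(String[] args) {
        Map<String, Float> crawlDb = new HashMap<>();
        crawlDb.put("http://a.example/", 2.0f);

        Map<String, List<String>> linkDb = new HashMap<>();
        linkDb.put("http://a.example/", Arrays.asList("nutch tutorial"));

        IndexedDoc a = buildDoc("http://a.example/", "nutch crawler", crawlDb, linkDb);
        IndexedDoc b = buildDoc("http://b.example/", "nutch nutch nutch", crawlDb, linkDb);

        System.out.println(a.url + " -> " + finalScore(a, "nutch"));
        System.out.println(b.url + " -> " + finalScore(b, "nutch"));
    }
}
```

In this sketch the well-linked page can outrank a page that merely repeats
the query term more often, which is the kind of effect text-only scoring
would not give you.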



On 5/1/07, hzhong <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> In Indexer.java, index(Path indexDir, Path crawlDb, Path linkDb, Path[]
> segments), can someone explain to me why crawlDb and linkDb are needed for
> indexing?
>
> In Lucene, there's no crawlDB and linkDB for indexing.
>
> Thank you very much
>
> Hanna


-- 
"Conscious decisions by conscious minds are what make reality real"

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
