Thank you very much for your response. Without crawlDb and linkDb, wouldn't the Nutch index be just a plain Lucene index? That is, if I do not care about URL scoring, can I index a page without taking crawlDb and linkDb into consideration?
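
To make the question concrete, here is a toy sketch in plain Java. It is not Nutch or Lucene code: the class `IndexSketch`, both methods, and the field names are hypothetical. It also assumes that the crawl db contributes a per-URL score and the link db contributes inlink anchor text, which is an assumption about those inputs rather than something confirmed in this thread.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a "document" is a map of field names to values.
class IndexSketch {

    // Content-only indexing: everything comes from the fetched page itself,
    // so every document gets the same neutral boost.
    static Map<String, Object> contentOnly(String url, String text) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("url", url);
        doc.put("content", text);
        doc.put("boost", 1.0f);
        return doc;
    }

    // Link-aware indexing: the same document, plus data that cannot be
    // derived from the page alone (assumed here to be a per-URL score and
    // the anchor text of inbound links).
    static Map<String, Object> withLinkData(String url, String text,
            Map<String, Float> crawlScores,
            Map<String, List<String>> inlinkAnchors) {
        Map<String, Object> doc = contentOnly(url, text);
        doc.put("boost", crawlScores.getOrDefault(url, 1.0f));
        doc.put("anchors", inlinkAnchors.getOrDefault(url, List.of()));
        return doc;
    }

    public static void main(String[] args) {
        String url = "http://example.com/";
        Map<String, Float> scores = Map.of(url, 2.5f);
        Map<String, List<String>> anchors = Map.of(url, List.of("example site"));

        System.out.println(contentOnly(url, "hello world"));
        System.out.println(withLinkData(url, "hello world", scores, anchors));
    }
}
```

In this model the content-only path needs nothing beyond the page, which is the case the question is about; the two extra maps matter only if link-derived scoring or anchor text is wanted in the index.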

Hanna Briggs wrote:
>
> I would assume that it needs these for handling the indexing of the
> link scores. Lucene puts no scoring weight on things such as urls,
> page rank and such. Since lucene only indexes documents, and
> calculates its keyword/query relevancy based only on term vectors (or
> whatever) nutch needs to add the url scoring and such to the index.
>
> On 5/1/07, hzhong <[EMAIL PROTECTED]> wrote:
>>
>> Hello,
>>
>> In Indexer.java, index(Path indexDir, Path crawlDb, Path linkDb, Path[]
>> segments), can someone explain to me why crawlDB and linkDB are needed for
>> indexing?
>>
>> In Lucene, there's no crawlDB and linkDB for indexing.
>>
>> Thank you very much
>>
>> Hanna
>> --
>> View this message in context:
>> http://www.nabble.com/Nutch-Indexer-tf3673420.html#a10264625
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>
> --
> "Conscious decisions by conscious minds are what make reality real"
>

--
View this message in context: http://www.nabble.com/Nutch-Indexer-tf3673420.html#a10279417

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
