Re: [Nutch-general] Nutch Indexer

hzhong Tue, 01 May 2007 19:56:38 -0700

Thank you very much for your response.

Without crawlDb and linkDb, wouldn't the Nutch index simply be a plain Lucene
index?  That is, if I do not care about URL scoring, can I index a page
without taking crawlDb and linkDb into consideration?
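
To make the question concrete, here is a rough sketch of how I understand
the suggestion that link scores end up in the index. It is plain Java, not
actual Nutch or Lucene code: the class BoostSketch, its two methods, and the
example URLs are all made up for illustration, and the term-count "relevancy"
is only a stand-in for what Lucene really computes. The map of boosts plays
the role of whatever crawlDb and linkDb would contribute.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Nutch or Lucene.
class BoostSketch {

    // Toy text relevancy: how many times the query term occurs in the page text.
    static int termFrequency(String text, String term) {
        int count = 0;
        String wanted = term.toLowerCase();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    // Final score = text relevancy * per-URL boost.
    // A URL with no entry in the map gets a neutral boost of 1.0,
    // which is the "no crawlDb/linkDb" case I am asking about.
    static double score(String url, String text, String term,
                        Map<String, Double> linkBoosts) {
        double boost = linkBoosts.getOrDefault(url, 1.0);
        return termFrequency(text, term) * boost;
    }

    public static void main(String[] args) {
        // Assumed link-derived boost for one page; the value is invented.
        Map<String, Double> linkBoosts = new HashMap<>();
        linkBoosts.put("http://example.com/b", 3.0);

        double a = score("http://example.com/a", "lucene lucene", "lucene", linkBoosts);
        double b = score("http://example.com/b", "lucene", "lucene", linkBoosts);

        System.out.println("a=" + a + " b=" + b);
    }
}
```

With an empty map every page keeps a boost of 1.0 and the ranking depends on
the text alone, which is the behaviour I would expect from an index built
without crawlDb and linkDb.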


Hanna


Briggs wrote:
> 
> I would assume that it needs these for handling the indexing of the
> link scores.  Lucene puts no scoring weight on things such as URLs,
> page rank and such. Since Lucene only indexes documents and
> calculates its keyword/query relevancy based only on term vectors (or
> whatever), Nutch needs to add the URL scoring and such to the index.
> 
> 
> 
> On 5/1/07, hzhong <[EMAIL PROTECTED]> wrote:
>>
>> Hello,
>>
>> In Indexer.java,  index(Path indexDir, Path crawlDb, Path linkDb, Path[]
>> segments), can someone explain to me why crawlDb and linkDb are needed for
>> indexing?
>>
>> In Lucene, there's no crawlDB and linkDB for indexing.
>>
>> Thank you very much
>>
>> Hanna
>> --
>> View this message in context:
>> http://www.nabble.com/Nutch-Indexer-tf3673420.html#a10264625
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> "Conscious decisions by conscious minds are what make reality real"
> 
> 


