On Sun, 2006-01-29 at 01:04 +0000, Rafit Izhak_Ratzin wrote:
> I thought that by running the fetch command (bin/nutch fetch ...) it already
> does some kind of parsing; otherwise, how does it get the next level of URLs?
You are correct only if the fetcher.parse property is set to true. The parsing is then done in the map part, right after the page is fetched.

> and in this case, in which part is the parsing done: in the mapping or in the
> reducing of the fetch process?
>
> Thanks again,
> Rafit
>
> >From: "Hvard W. Kongsgrd" <[EMAIL PROTECTED]>
> >Reply-To: [email protected]
> >To: [email protected]
> >Subject: Re: The parsing is part of the Map or part of the Reduce?
> >Date: Sat, 28 Jan 2006 23:05:05 +0100
> >
> >So you have been following the quick tutorial for nutch 0.8 and later at
> >media-style....................
> >The author has left out the parse and updatedb part.
> >After fetch simply run bin/nutch parse segment/2006xxxx and then bin/nutch
> >crawldb updatedb segment/2006xxx.
> >
> >Rafit Izhak_Ratzin wrote:
> >
> >>Hi,
> >>In which part of the mapred is the parsing done: in the Map part or in the
> >>Reduce part?
> >>
> >>Thanks,
> >>Rafit

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
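
For reference, the fetcher.parse setting mentioned in the reply can be overridden roughly as follows. This is a minimal sketch that assumes the override goes in conf/nutch-site.xml, like other Nutch property overrides; the default value differs between releases, so check nutch-default.xml in your own installation before relying on it.

```xml
<!-- Assumed location: conf/nutch-site.xml, inside the <configuration> element -->
<property>
  <name>fetcher.parse</name>
  <!-- true: the fetcher parses each page in the map step, right after fetching it.
       false: fetch only; parsing must then be run as a separate step. -->
  <value>true</value>
</property>
```

With the value set to false, the fetch step stores the content without parsing it, and the separate parse step quoted above (bin/nutch parse segment/2006xxxx) is needed before the next level of URLs becomes available.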
