On 11/21/06, Doug Cook <[EMAIL PROTECTED]> wrote:
>
> If you are doing a lot of URL filtering with regular expressions, this can
> take a massive amount of time in reduce. There may be some speedups
> possible, depending upon your usage patterns; some are as simple as config
> changes, others will take a patch (which I haven't contributed back yet, but
> will).
>
> Let me know if you do a lot of filtering, and I'll post a longer list of
> suggestions.

Yes, I would like to know, please.

> -Doug
>
> Benjamin Higgins wrote:
> >
> > I'd like to know what are all the known techniques for speeding up
> > MapReduce for a single user machine.
> >
> > So far, I know of this patch:
> >
> > http://issues.apache.org/jira/browse/NUTCH-395
> >
> > I also am reading that changing hadoop-site.xml can help, but I don't know
> > what changes to make.
> >
> > Please add anything you've found that will help. I am considering going
> > back to 0.7 if I can't get Nutch to go faster. In my case I am also
> > crawling just a single site.
> >
> > Ben
>
> --
> View this message in context:
> http://www.nabble.com/Guide-to-speeding-up-Map-Reduce-on-single-machine-setup-tf2680869.html#a7479019
> Sent from the Nutch - User mailing list archive at Nabble.com.
