Preferred Nutch development environment for OS X?

2006-09-06 Thread Doug Cook
Hey, folks. Sorry to gum up the works with such a banal question. I'm trying to decide which development environment I should use on my Mac (OS X 10.4.7). Until now I've just been using the tried-and-true shell+editor combination (yes, I'm a crusty old Unix engineer), but as I ramp up work with

Re: Problem with logging of Fetcher output in 0.8-dev

2006-08-23 Thread Doug Cook
> asks gets written there as well.
> Not an elegant solution but it works for debugging purposes. The problem seems to be to do with the environment variable ${hadoop.log.dir} not being set when log4j.properties is parsed and so nutch tries to write to a file in "/
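
The failure mode described in the quoted text can be illustrated with a small sketch. It is not Nutch or log4j code. It only mimics `${...}` substitution under the assumption that an unset variable expands to an empty string, which would turn a relative-looking setting into a path at the filesystem root. The template, the `hadoop.log.file` name, and the directory value are hypothetical and chosen for illustration.

```python
import re


def subst_vars(template, props):
    """Replace each ${name} with props[name]; unset names become ''."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: props.get(m.group(1), ""),
        template,
    )


# Hypothetical appender setting; only hadoop.log.dir appears in the thread.
template = "${hadoop.log.dir}/${hadoop.log.file}"

# Case 1: hadoop.log.dir is not defined when the template is expanded.
unset = subst_vars(template, {"hadoop.log.file": "hadoop.log"})

# Case 2: hadoop.log.dir is defined (the directory is an assumed example).
defined = subst_vars(
    template,
    {"hadoop.log.dir": "/tmp/nutch-logs", "hadoop.log.file": "hadoop.log"},
)

print(unset)
print(defined)
```

If the real configuration behaves the same way, defining the property before the logging configuration is read (for example with a `-D` option on the JVM command line) would be one way to keep the log file out of the root directory.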

Re: Problem with logging of Fetcher output in 0.8-dev

2006-08-21 Thread Doug Cook
Hi, Ed- I'm seeing the same problem. If anyone has had a similar experience and solved it, please let me know. In the meantime, I'll keep investigating and post back if I figure out what's going wrong. This may or may not matter, but I'm running everything on a single MP machine w/o DFS. Doug

Re: Best performance approach for single MP machine?

2006-07-21 Thread Doug Cook
Thanks, Håvard (and Doug, in the original email). Those pointers, plus a few other tips from elsewhere, did the trick. I'm now up and running with all CPUs. One thing I found along the way was that if I did not set mapred.child.heap.size, I would run out of heap space in initialization of inject

Best performance approach for single MP machine?

2006-07-19 Thread Doug Cook
Hi, I've recently switched to 0.8 from 0.7, and after some initial fits and starts, I'm past the "get it working at all" stage to the "get reasonable performance" stage. I've got a single machine with 4 CPUs and a lot of memory. URL fetching works great because it's (mostly) multithreaded. But a

Re: .8 svn - fetcher performance..

2006-06-27 Thread Doug Cook
Byron, Did you ever resolve your 0.8 vs 0.7 crawling performance question? I'm running into a similar problem. -- View this message in context: http://www.nabble.com/.8-svnfetcher-performance..-tf1170232.html#a5076764 Sent from the Nutch - User forum at Nabble.com.

Crawl performance v0.7 vs v0.8

2006-06-27 Thread Doug Cook
Hi, I'm experimenting with switching to v0.8 because of the richer set of plugins, and from this point of view, it's great, but so far I have seen much lower crawl performance, and I'm hoping it's just a matter of tuning the right parameters. I'm running on one 4-CPU machine, and under 0.7 I could

Dump of filtered-out URLs?

2006-05-10 Thread Doug Cook
to get Nutch to tell me what URLs have been excluded via URL filters? Thanks for your help, Doug Cook -- View this message in context: http://www.nabble.com/Dump-of-filtered-out-URLs--t1594311.html#a4326336 Sent from the Nutch - User forum at Nabble.com.