We have a Python script, with logging, that fully automates the fetch and update steps, but not the invertlinks or indexing steps. If anybody wants a copy, send me an email and I will send it to you.
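For illustration only (this is not the script mentioned above), a minimal sketch of such a generate/fetch/update loop might look like the following. It assumes the standard `bin/nutch generate`, `fetch` and `updatedb` commands, a `crawl/crawldb` and `crawl/segments` layout, and a current Python 3; the function names and the `runner` / `pick_segment` parameters are hypothetical and exist only to keep the sketch testable.

```python
import logging
import subprocess
from pathlib import Path

log = logging.getLogger("nutch_cycle")

# Assumed layout; adjust these to your own installation.
NUTCH = "bin/nutch"
CRAWLDB = "crawl/crawldb"
SEGMENTS = "crawl/segments"


def cycle_commands(segment, nutch=NUTCH, crawldb=CRAWLDB):
    """Return the fetch and updatedb commands for one segment."""
    return [
        [nutch, "fetch", segment],
        [nutch, "updatedb", crawldb, segment],
    ]


def newest_segment(segments_dir=SEGMENTS):
    """Segment directories are named by timestamp, so the newest sorts last."""
    names = sorted(p.name for p in Path(segments_dir).iterdir() if p.is_dir())
    if not names:
        raise RuntimeError("no segments found in %s" % segments_dir)
    return str(Path(segments_dir) / names[-1])


def run(cmd, runner=subprocess.run):
    """Log a command, then run it; a non-zero exit status raises an error."""
    log.info("running: %s", " ".join(cmd))
    runner(cmd, check=True)


def run_cycles(rounds, runner=subprocess.run, pick_segment=newest_segment):
    """Run `rounds` generate/fetch/updatedb cycles, one after the other."""
    for i in range(rounds):
        log.info("round %d of %d", i + 1, rounds)
        run([NUTCH, "generate", CRAWLDB, SEGMENTS], runner)
        segment = pick_segment()
        for cmd in cycle_commands(segment):
            run(cmd, runner)
```

To use it, call `logging.basicConfig(level=logging.INFO)` and then `run_cycles(3)` from the Nutch directory, for example under `nohup` so it keeps running in the background. The sketch does not handle the case where `generate` produces no new segment, and it leaves out invertlinks and indexing.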
We are currently working on a more in-depth framework for automating these types of job streams in Python, but it is not complete yet. Andrzej, do you think this is something we should post to the wiki?

Dennis Kubes

Justin Hartman wrote:
> Hi all
>
> Just have a couple more questions which remain unclear to me at this stage.
>
> 1. I'm fetching URLs on a P4 2.8 GHz machine with 1 GB RAM and a 100 Mbps
> connection. Based on this config, what would you recommend the maximum
> number of fetcher threads should be?
>
> 2. Does anyone know of a script or plugin that can automate the
> segment/fetch/indexing process? Basically I'm fetching about 20
> million pages and I have to run the segment, fetch and index process
> myself in a shell (which takes some time). I would really like some
> sort of shell script that I can run so that the whole process runs as
> a daemon in the background and I can worry about other issues.
>
> Thank you in advance!

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
