I should also mention that I'm running nutch version 0.9.

On 7/2/07, Lyndon Maydwell <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm a new user to nutch and am wondering about seeding the database by
> running a crawl with a very shallow depth, then growing the database
> every time the periodic update script is done. I have two scripts that
> I'm currently using, but I'm not sure if the update script is actually
> adding searchable data. The initial crawl script is doing a great job,
> and I can verify that it is working by using the search app that comes
> with nutch, but my maintenance script doesn't seem to be adding any
> results, although it throws no errors.
>
> Below are the two small scripts. Am I missing any simple errors?
>
> -- initial crawl script << END1 --
>
> #!/bin/sh
> ./../bin/nutch crawl urls -dir crawl -depth 2 -topN 10000
>
> END1
>
> -- updater script << END2 --
>
> first="crawl"
> second="100000"
>
> ../bin/nutch generate $first/crawldb $first/segments -topN $second
>
> segment=`ls -d $first/segments/* | tail -1 | grep "[a-zA-Z0-9/]*"`
>
> ../bin/nutch fetch $segment
>
> ../bin/nutch updatedb $first/crawldb $segment
>
> rm -r $first/indexes
>
> ../bin/nutch invertlinks $first/linkdb $first/segments/*
>
> ../bin/nutch index $first/indexes $first/crawldb $first/linkdb $first/segments/*
>
> END2
