The upgrade to 0.9x is proceeding. I was able to parse and index just
fine after the first halt, but now I get an error in the logs and an
empty query response in the browser. The only clue is in the Tomcat log,
which throws the error below. Since I'm an admin and not a serious coder,
nothing in the source looks obviously wrong to me at those lines:
From Tomcat logs:
Dec 16, 2006 4:23:36 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NullPointerException
    at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:159)
    at org.apache.nutch.searcher.FetchedSegments$SummaryThread.run(FetchedSegments.java:177)
Catalina.out gives this odd result (marked with asterisks below):
2006-12-16 16:16:55,215 INFO NutchBean - query request from 192.168.2.2
2006-12-16 16:16:55,216 INFO NutchBean - query: hoho
2006-12-16 16:16:55,216 INFO NutchBean - lang: en
2006-12-16 16:16:55,220 INFO NutchBean - searching for 20 raw hits
***2006-12-16 16:16:55,372 INFO NutchBean - re-searching for 40 raw hits, query: sex -site:"www.cooking.com" -site:"www.foodtv.com"***
2006-12-16 16:16:55,732 INFO NutchBean - found 102857 raw hits
How did those sites get into the query? I didn't put them there; I
only searched for hoho.
Again, I'm behind a slow pipe, so re-crawling all of this converted data
is not an option in the short term. The only change from the default
0.8 to 0.9 procedure was the addition of mapred.speculation=false, which
cured the parse error and allowed me to continue and index. The only
other thing may be that the linkdb was not put where it was told, but I
moved it over (it built fine) and the index operation ran without issue.
--
rp