[ http://issues.apache.org/jira/browse/NUTCH-147?page=comments#action_12361198 ]
raghavendra prabhu commented on NUTCH-147:
------------------------------------------

Is this issue caused by needing Cygwin to run the crawl on Windows? Version 0.7.1 had no such dependency.

Can anyone confirm?

> nutch map reduce does not work in windows map reduce runs in a loop
> -------------------------------------------------------------------
>
>          Key: NUTCH-147
>          URL: http://issues.apache.org/jira/browse/NUTCH-147
>      Project: Nutch
>         Type: Bug
>   Components: indexer
>     Versions: 0.8-dev
>  Environment: Windows system Winxp Pro
>     Reporter: raghavendra prabhu
>     Priority: Blocker
>
> Description
> The crawl starts and I am able to see the initial messages.
> Then the map reduce process starts and continues to run in a loop.
> I do not see the same problem on Linux (on Linux it works perfectly).
> Below is the loop I run into:
> clustering.OnlineClusterer)
> 051222 182058 Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
> 051222 182058 Nutch Content Parser (org.apache.nutch.parse.Parser)
> 051222 182058 Ontology Model Loader (org.apache.nutch.ontology.Ontology)
> 051222 182058 Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
> 051222 182058 Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
> 051222 182058 found resource crawl-urlfilter.txt at file:/G:/trunklatest/conf/crawl-urlfilter.txt
> 051222 182058 crawl\url.txt:0+25
> 051222 182059 crawl\url.txt:0+25
> 051222 182059 map -521216%
> 051222 182100 crawl\url.txt:0+25
> 051222 182100 map -1107496%
> 051222 182101 crawl\url.txt:0+25
> 051222 182101 map -1678544%
> 051222 182102 crawl\url.txt:0+25
> 051222 182102 map -2265900%
> 051222 182103 crawl\url.txt:0+25
> 051222 182103 map -2849416%
> 051222 182104 crawl\url.txt:0+25
> 051222 182104 map -3422908%
> 051222 182105 crawl\url.txt:0+25
> The same thing continues.

-- 
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see:
   http://www.atlassian.com/software/jira

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
