[ https://issues.apache.org/jira/browse/NUTCH-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641402#comment-14641402 ]
Lewis John McGibbney commented on NUTCH-2049:
---------------------------------------------

BTW, this is only for 2.4.0, for the same reason as explained in the last issue. This is an upgrade of dependencies and API usage... NOT a mapred --> mapreduce API migration for each NutchJob.
[~markus.jel...@openindex.io] had a great crack at trying to upgrade some... I would also join his ranks and make best efforts to move all jobs to the 2.X mapreduce API if it makes sense.
It would be nice to have a Nutch roadmap TBH. Team, how do we feel here?

Tests are broken as follows
{code}
Testsuite: org.apache.nutch.crawl.TestCrawlDbFilter
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.986 sec
------------- Standard Output ---------------
2015-07-25 01:29:50,852 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-07-25 01:29:51,215 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.deflate]
2015-07-25 01:29:51,231 INFO  compress.CodecPool (CodecPool.java:getCompressor(151)) - Got brand-new compressor [.deflate]
2015-07-25 01:29:51,231 INFO  crawl.CrawlDBTestUtil (CrawlDBTestUtil.java:createCrawlDb(67)) - adding:http://www.example.com
2015-07-25 01:29:51,232 INFO  crawl.CrawlDBTestUtil (CrawlDBTestUtil.java:createCrawlDb(67)) - adding:http://www.example1.com
2015-07-25 01:29:51,235 INFO  crawl.CrawlDBTestUtil (CrawlDBTestUtil.java:createCrawlDb(67)) - adding:http://www.example2.com
------------- ---------------- ---------------
------------- Standard Error -----------------
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/trunk_clean/build/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/trunk_clean/build/test/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
------------- ---------------- ---------------

Testcase: testUrl404Purging took 0.969 sec
	Caused an ERROR
Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
	at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
	at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:832)
	at org.apache.nutch.crawl.TestCrawlDbFilter.testUrl404Purging(TestCrawlDbFilter.java:107)
{code}

> Upgrade Trunk to Hadoop > 2.4 stable
> ------------------------------------
>
>                 Key: NUTCH-2049
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2049
>             Project: Nutch
>          Issue Type: Improvement
>          Components: build
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.11
>
>         Attachments: NUTCH-2049.patch
>
>
> Convo here - http://www.mail-archive.com/dev%40nutch.apache.org/msg18225.html
> I am +1 for taking trunk (or a branch of trunk) to explicit dependency on > Hadoop 2.6.
> We can run our tests, we can validate, we can fix.
> I will be doing validation on 2.X in parallel as this is what I use on my own projects.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)