Hello everyone,

I'm trying to run nutch for the first time and while executing */bin/nutch generate -topN 5* I get the following exception:

GeneratorJob: starting at 2016-02-13 21:01:42
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: true
GeneratorJob: normalizing: true
GeneratorJob: topN: 5
GeneratorJob: java.lang.RuntimeException: job failed: name=apache-nutch-2.3.1.jar, jobid=job_local1061440919_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227)
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330)

Here is the stack trace from *hadoop.log*:

2016-02-13 21:01:44,541 ERROR mapreduce.GoraRecordReader - Error reading Gora records: null
2016-02-13 21:01:44,557 WARN mapred.LocalJobRunner - job_local1061440919_0001
java.lang.Exception: java.lang.RuntimeException: java.util.NoSuchElementException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: java.util.NoSuchElementException
    at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.NoSuchElementException
    at java.util.concurrent.ConcurrentSkipListMap.firstKey(ConcurrentSkipListMap.java:2036)
    at org.apache.gora.memory.store.MemStore.execute(MemStore.java:128)
    at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73)
    at org.apache.gora.mapreduce.GoraRecordReader.executeQuery(GoraRecordReader.java:67)
    at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:109)
    ... 12 more

I've been following this tutorial for setting up Nutch:
https://github.com/renepickhardt/metalcon/wiki/simpleNutchSolrSetup

I've seen a few posts on Stack Overflow and in the Nutch archives with similar exceptions. They suggest that I might be running out of disk space in my /tmp directory, but /tmp only holds about 8MB of data. Other than this, I'm clueless about what is causing this exception.

What could be the cause?

I'm using Nutch 2.3.1 with HBase 1.1.3 as the datastore, and I'm running it on Ubuntu 15.10.

Thanks

--
Regards,
Binoy Dalal