After a couple of tries, the following code worked. I didn't know about the TableMapReduceUtil class, so the solution may be of some interest to others.
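One piece that tripped me up during the migration: under the new API the mapper extends org.apache.hadoop.hbase.mapreduce.TableMapper instead of implementing the old org.apache.hadoop.hbase.mapred.TableMap interface. Roughly what my mapper looks like, as a sketch: the class name, the Text/Text output types, and the column names come from the code in this thread, but the body is elided and everything else is illustrative.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.Text;

    // TableMapper fixes the input types to (ImmutableBytesWritable row key,
    // Result row); the two type parameters are the mapper's output types.
    public class MapChangedTableRowsIntoUrls extends TableMapper<Text, Text> {

        @Override
        protected void map(final ImmutableBytesWritable key, final Result row,
                final Context context) throws IOException, InterruptedException {
            // Pull the two cells requested by the Scan in the job setup.
            // What the real mapper does with them is elided here.
            final byte[] crawled = row.getValue("date".getBytes(), "crawled".getBytes());
            final byte[] indexed = row.getValue("date".getBytes(), "indexed".getBytes());
            // ... decide whether the row changed and emit key/value Text pairs ...
        }
    }

The job setup that now works: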
    final Job job = new Job(this.configuration);
    job.setJarByClass(Indexer.class);
    job.setJobName("NInvestNewsIndexer");

    // Scan only the two columns the mapper actually needs.
    final Scan scan = new Scan();
    scan.addColumn("date".getBytes(), "crawled".getBytes());
    scan.addColumn("date".getBytes(), "indexed".getBytes());

    // initTableMapperJob wires up TableInputFormat and serializes the Scan
    // into the job configuration -- the part I had been doing wrong by hand.
    TableMapReduceUtil.initTableMapperJob(args[0], scan,
        MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);

    job.setOutputFormatClass(NullOutputFormat.class);
    job.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);

On Mon, Aug 3, 2009 at 8:31 PM, Amandeep Khurana <ama...@gmail.com> wrote:

> The implementation in the new package is different from the old one. So if
> you want to use it the same way you used to, you'll have to stick to the
> mapred package until you upgrade your code to the new implementation.
>
>
> On Mon, Aug 3, 2009 at 3:45 PM, Lucas Nazário dos Santos
> <nazario.lu...@gmail.com> wrote:
>
> > Thanks, but I don't get it. Why should I stick with the old mapred
> > package if I'm moving everything to Hadoop and HBase 0.20? Everything
> > in the old mapred package is deprecated.
> >
> >
> > On Mon, Aug 3, 2009 at 7:31 PM, stack <st...@duboce.net> wrote:
> >
> > > Looks like crossed lines.
> > >
> > > In hadoop 0.20.0, there is the mapred package and the mapreduce
> > > package. The latter has the new lump-sum context to which you go for
> > > all things. HBase has something similar: the new mapreduce package in
> > > 0.20.0 hbase is the old mapred redone to fit the new hadoop APIs.
> > > Below in your stack trace I see use of the new hbase mapreduce stuff,
> > > though you would hew to the old interface. Try using the stuff in the
> > > mapred package?
> > >
> > > St.Ack
> > >
> > >
> > > On Mon, Aug 3, 2009 at 2:30 PM, Lucas Nazário dos Santos
> > > <nazario.lu...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm migrating from HBase 0.19 to version 0.20 and facing an error
> > > > regarding the TableInputFormat class. Below is how I'm setting up
> > > > the job, along with the error message I'm getting.
> > > >
> > > > Does anybody have a clue about what may be happening? It used to
> > > > work on HBase 0.19.
> > > >
> > > > Lucas
> > > >
> > > >
> > > > this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
> > > > this.configuration.set(TableInputFormat.SCAN, "date");
> > > > this.configuration.set("index.name", args[1]);
> > > > this.configuration.set("hbase.master", args[2]);
> > > > this.configuration.set("index.replication.level", args[3]);
> > > >
> > > > final Job jobConf = new Job(this.configuration);
> > > > jobConf.setJarByClass(Indexer.class);
> > > > jobConf.setJobName("NInvestNewsIndexer");
> > > >
> > > > FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
> > > >
> > > > jobConf.setInputFormatClass(TableInputFormat.class);
> > > > jobConf.setOutputFormatClass(NullOutputFormat.class);
> > > >
> > > > jobConf.setOutputKeyClass(Text.class);
> > > > jobConf.setOutputValueClass(Text.class);
> > > >
> > > > jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
> > > > jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
> > > >
> > > >
> > > > 09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
> > > > java.io.EOFException
> > > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > > >     at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
> > > >     at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
> > > >     at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
> > > >     at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
> > > >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > > >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > > >     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> > > >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > > >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > > >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > > >     at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >     at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > >
> > > > Exception in thread "main" java.lang.NullPointerException
> > > >     at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
> > > >     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
> > > >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > > >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > > >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > > >     at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >     at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
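P.S. For anyone who finds this thread in the archives: as far as I can tell, the EOFException happens because the new TableInputFormat expects the SCAN property to hold a serialized Scan object (that is what convertStringToScan is decoding at the top of the trace), while my old setup handed it the literal column name "date". The NullPointerException afterwards looks like plain fallout: setConf() bailed out early, so getSplits() ran without a table. In sketch form (the broken lines are from my original setup; initTableMapperJob does the equivalent of the fix internally):

    // Broken: SCAN must contain a serialized Scan, so Scan.readFields()
    // hits end-of-stream when it tries to parse the plain string "date".
    configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
    configuration.set(TableInputFormat.SCAN, "date");

    // Fixed: build a real Scan and let TableMapReduceUtil serialize it
    // into the job configuration (it sets INPUT_TABLE and SCAN for you).
    final Scan scan = new Scan();
    scan.addColumn("date".getBytes(), "crawled".getBytes());
    scan.addColumn("date".getBytes(), "indexed".getBytes());
    TableMapReduceUtil.initTableMapperJob(args[0], scan,
        MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);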