After a couple of tries, the following code worked. I didn't know about the TableMapReduceUtil class, so the solution may be of some interest to others.
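One piece that tripped me up during the migration: under the new API the mapper extends org.apache.hadoop.hbase.mapreduce.TableMapper instead of implementing the old org.apache.hadoop.hbase.mapred.TableMap interface. Roughly what my mapper looks like, as a sketch: the class name, the Text/Text output types, and the column names come from the code in this thread, but the body is elided and everything else is illustrative.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.Text;

    // TableMapper fixes the input types to (ImmutableBytesWritable row key,
    // Result row); the two type parameters are the mapper's output types.
    public class MapChangedTableRowsIntoUrls extends TableMapper<Text, Text> {

        @Override
        protected void map(final ImmutableBytesWritable key, final Result row,
                final Context context) throws IOException, InterruptedException {
            // Pull the two cells requested by the Scan in the job setup.
            // What the real mapper does with them is elided here.
            final byte[] crawled = row.getValue("date".getBytes(), "crawled".getBytes());
            final byte[] indexed = row.getValue("date".getBytes(), "indexed".getBytes());
            // ... decide whether the row changed and emit key/value Text pairs ...
        }
    }

The job setup that now works: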
    final Job job = new Job(this.configuration);
    job.setJarByClass(Indexer.class);
    job.setJobName("NInvestNewsIndexer");

    // Scan only the two columns the mapper actually needs.
    final Scan scan = new Scan();
    scan.addColumn("date".getBytes(), "crawled".getBytes());
    scan.addColumn("date".getBytes(), "indexed".getBytes());

    // initTableMapperJob wires up TableInputFormat and serializes the Scan
    // into the job configuration -- the part I had been doing wrong by hand.
    TableMapReduceUtil.initTableMapperJob(args[0], scan,
        MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);

    job.setOutputFormatClass(NullOutputFormat.class);
    job.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);

On Mon, Aug 3, 2009 at 8:31 PM, Amandeep Khurana <ama...@gmail.com> wrote:

> The implementation in the new package is different from the old one. So if
> you want to use it the same way you used to, you'll have to stick to the
> mapred package until you upgrade your code to the new implementation.
>
>
> On Mon, Aug 3, 2009 at 3:45 PM, Lucas Nazário dos Santos
> <nazario.lu...@gmail.com> wrote:
>
> > Thanks, but I don't get it. Why should I stick with the old mapred
> > package if I'm moving everything to Hadoop and HBase 0.20? Everything
> > in the old mapred package is deprecated.
> >
> >
> > On Mon, Aug 3, 2009 at 7:31 PM, stack <st...@duboce.net> wrote:
> >
> > > Looks like crossed lines.
> > >
> > > In hadoop 0.20.0, there is the mapred package and the mapreduce
> > > package. The latter has the new lump-sum context to which you go for
> > > all things. HBase has something similar: the new mapreduce package in
> > > 0.20.0 hbase is the old mapred redone to fit the new hadoop APIs.
> > > Below in your stack trace I see use of the new hbase mapreduce stuff,
> > > though you would hew to the old interface. Try using the stuff in the
> > > mapred package?
> > >
> > > St.Ack
> > >
> > >
> > > On Mon, Aug 3, 2009 at 2:30 PM, Lucas Nazário dos Santos
> > > <nazario.lu...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm migrating from HBase 0.19 to version 0.20 and facing an error
> > > > regarding the TableInputFormat class. Below is how I'm setting up
> > > > the job, along with the error message I'm getting.
> > > >
> > > > Does anybody have a clue about what may be happening? It used to
> > > > work on HBase 0.19.
> > > >
> > > > Lucas
> > > >
> > > >
> > > > this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
> > > > this.configuration.set(TableInputFormat.SCAN, "date");
> > > > this.configuration.set("index.name", args[1]);
> > > > this.configuration.set("hbase.master", args[2]);
> > > > this.configuration.set("index.replication.level", args[3]);
> > > >
> > > > final Job jobConf = new Job(this.configuration);
> > > > jobConf.setJarByClass(Indexer.class);
> > > > jobConf.setJobName("NInvestNewsIndexer");
> > > >
> > > > FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
> > > >
> > > > jobConf.setInputFormatClass(TableInputFormat.class);
> > > > jobConf.setOutputFormatClass(NullOutputFormat.class);
> > > >
> > > > jobConf.setOutputKeyClass(Text.class);
> > > > jobConf.setOutputValueClass(Text.class);
> > > >
> > > > jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
> > > > jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
> > > >
> > > >
> > > > 09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
> > > > java.io.EOFException
> > > >     at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > > >     at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
> > > >     at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
> > > >     at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
> > > >     at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
> > > >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > > >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > > >     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> > > >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > > >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > > >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > > >     at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >     at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > >
> > > > Exception in thread "main" java.lang.NullPointerException
> > > >     at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
> > > >     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
> > > >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > > >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > > >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > > >     at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > >     at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
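P.S. For anyone who finds this thread in the archives: as far as I can tell, the EOFException happens because the new TableInputFormat expects the SCAN property to hold a serialized Scan object (that is what convertStringToScan is decoding at the top of the trace), while my old setup handed it the literal column name "date". The NullPointerException afterwards looks like plain fallout: setConf() bailed out early, so getSplits() ran without a table. In sketch form (the broken lines are from my original setup; initTableMapperJob does the equivalent of the fix internally):

    // Broken: SCAN must contain a serialized Scan, so Scan.readFields()
    // hits end-of-stream when it tries to parse the plain string "date".
    configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
    configuration.set(TableInputFormat.SCAN, "date");

    // Fixed: build a real Scan and let TableMapReduceUtil serialize it
    // into the job configuration (it sets INPUT_TABLE and SCAN for you).
    final Scan scan = new Scan();
    scan.addColumn("date".getBytes(), "crawled".getBytes());
    scan.addColumn("date".getBytes(), "indexed".getBytes());
    TableMapReduceUtil.initTableMapperJob(args[0], scan,
        MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);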