Thanks for posting the below, Lucas.
St.Ack
On Sat, Aug 8, 2009 at 3:45 PM, Lucas Nazário dos Santos <
[email protected]> wrote:
> After a couple of tries, the following code worked. I didn't know about
> the TableMapReduceUtil class, and the solution may be of interest to
> others.
>
> final Job job = new Job(this.configuration);
> job.setJarByClass(Indexer.class);
> job.setJobName("NInvestNewsIndexer");
>
> final Scan scan = new Scan();
> scan.addColumn("date".getBytes(), "crawled".getBytes());
> scan.addColumn("date".getBytes(), "indexed".getBytes());
> TableMapReduceUtil.initTableMapperJob(args[0], scan,
>         MapChangedTableRowsIntoUrls.class, Text.class, Text.class, job);
>
> job.setOutputFormatClass(NullOutputFormat.class);
> job.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
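
The snippet above stops at job configuration. A minimal sketch of the
remaining submission step might look like the lines below; the
waitForCompletion/System.exit handling is an assumption on my part, not part
of the code above:

    // Submit the configured job and block until it finishes (assumed wrap-up, not from the original).
    final boolean success = job.waitForCompletion(true);
    System.exit(success ? 0 : 1);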
>
>
>
> On Mon, Aug 3, 2009 at 8:31 PM, Amandeep Khurana <[email protected]> wrote:
>
> > The implementation in the new package is different from the old one. So,
> > if you want to use it in the same way as you used to use the old one,
> > you'll have to stick to the mapred package until you upgrade your code to
> > the new implementation.
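
In case it helps anyone reading later, a rough sketch of that old-style,
JobConf-based setup is below. The table name, the column list, and the
MyTableMap class are placeholders of mine, and this assumes the deprecated
org.apache.hadoop.hbase.mapred.TableMapReduceUtil.initTableMapJob signature
from 0.19/0.20:

    // Old-style (deprecated org.apache.hadoop.hbase.mapred) job setup.
    JobConf jobConf = new JobConf(configuration, Indexer.class);
    jobConf.setJobName("NInvestNewsIndexer");
    // Columns are given as a space-separated "family:qualifier" list in the old API.
    org.apache.hadoop.hbase.mapred.TableMapReduceUtil.initTableMapJob(
        "mytable", "date:crawled date:indexed",
        MyTableMap.class,            // hypothetical mapper implementing the old TableMap interface
        Text.class, Text.class, jobConf);
    JobClient.runJob(jobConf);       // old-style blocking submission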
> >
> >
> > On Mon, Aug 3, 2009 at 3:45 PM, Lucas Nazário dos Santos <
> > [email protected]> wrote:
> >
> > > Thanks. But I didn't get it. Why should I stick with the old mapred
> > > package if I'm moving everything to Hadoop and HBase 0.20? Everything in
> > > the old mapred package is deprecated.
> > >
> > >
> > >
> > > On Mon, Aug 3, 2009 at 7:31 PM, stack <[email protected]> wrote:
> > >
> > > > Looks like crossed lines.
> > > >
> > > > In Hadoop 0.20.0 there is the mapred package and the mapreduce
> > > > package. The latter has the new lump-sum Context object to which you
> > > > go for all things. HBase has something similar: the new mapreduce
> > > > package in HBase 0.20.0 is the old mapred package redone to fit the
> > > > new Hadoop APIs. Below in your stacktrace I see use of the new HBase
> > > > mapreduce classes, even though it looks like you meant to stay on the
> > > > old interface. Try using the classes in the mapred package?
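
To make the package split concrete, the two APIs are told apart by their
import paths. A quick illustration (package names only; in a real class you
would pick one side or the other):

    // Old, JobConf-based API (still shipped in 0.20, but deprecated):
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.hbase.mapred.TableInputFormat;

    // New, Job/Context-based API:
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat;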
> > > >
> > > > St.Ack
> > > >
> > > >
> > > > On Mon, Aug 3, 2009 at 2:30 PM, Lucas Nazário dos Santos <
> > > > [email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm migrating from HBase 0.19 to version 0.20 and facing an error
> > > > > regarding the TableInputFormat class. Below is how I'm setting up
> > > > > the job, and also the error message I'm getting.
> > > > >
> > > > > Does anybody have a clue about what may be happening? It used to
> > > > > work on HBase 0.19.
> > > > >
> > > > > Lucas
> > > > >
> > > > >
> > > > > this.configuration.set(TableInputFormat.INPUT_TABLE, args[0]);
> > > > > this.configuration.set(TableInputFormat.SCAN, "date");
> > > > > this.configuration.set("index.name", args[1]);
> > > > > this.configuration.set("hbase.master", args[2]);
> > > > > this.configuration.set("index.replication.level", args[3]);
> > > > >
> > > > > final Job jobConf = new Job(this.configuration);
> > > > > jobConf.setJarByClass(Indexer.class);
> > > > > jobConf.setJobName("NInvestNewsIndexer");
> > > > >
> > > > > FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
> > > > >
> > > > > jobConf.setInputFormatClass(TableInputFormat.class);
> > > > > jobConf.setOutputFormatClass(NullOutputFormat.class);
> > > > >
> > > > > jobConf.setOutputKeyClass(Text.class);
> > > > > jobConf.setOutputValueClass(Text.class);
> > > > >
> > > > > jobConf.setMapperClass(MapChangedTableRowsIntoUrls.class);
> > > > > jobConf.setReducerClass(ReduceUrlsToLuceneIndexIntoKatta.class);
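
For anyone hitting the same EOFException below: the new TableInputFormat.SCAN
property expects a Base64-encoded, serialized Scan object rather than a column
or family name, so setting it to "date" leaves Scan.readFields with nothing
sensible to read. A rough sketch of the serialization the new API expects
follows; it assumes HBase 0.20's Writable-based Scan and the
org.apache.hadoop.hbase.util.Base64 helper, and
TableMapReduceUtil.initTableMapperJob takes care of this for you:

    // Roughly what the new API expects in the SCAN property (sketch, assuming
    // HBase 0.20's Writable Scan and org.apache.hadoop.hbase.util.Base64).
    Scan scan = new Scan();
    scan.addColumn("date".getBytes(), "crawled".getBytes());
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    scan.write(new DataOutputStream(buffer));   // serialize the Scan as a Writable
    this.configuration.set(TableInputFormat.SCAN, Base64.encodeBytes(buffer.toByteArray()));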
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > 09/08/03 18:19:19 ERROR mapreduce.TableInputFormat: An error occurred.
> > > > > java.io.EOFException
> > > > >         at java.io.DataInputStream.readFully(DataInputStream.java:180)
> > > > >         at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:135)
> > > > >         at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:493)
> > > > >         at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertStringToScan(TableMapReduceUtil.java:94)
> > > > >         at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:79)
> > > > >         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > > > >         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > > > >         at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> > > > >         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > > > >         at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > > > >         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > > > >         at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > > > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > >         at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > > > >         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > > > Exception in thread "main" java.lang.NullPointerException
> > > > >         at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:280)
> > > > >         at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
> > > > >         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> > > > >         at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> > > > >         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> > > > >         at com.nash.ninvest.index.indexer.Indexer.run(Unknown Source)
> > > > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > >         at com.nash.ninvest.index.indexer.Indexer.main(Unknown Source)
> > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > > >         at java.lang.reflect.Method.invoke(Method.java:597)
> > > > >         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > > >
> > > >
> > >
> >
>