I've solved this problem, and (believe it or not) it was something I was not 
doing in my code...

I am pretty new to Java, and in the previous languages I've worked in, you could 
simply override a method in a child class without doing anything special. So 
apparently, in every one of the 137 thousand times I read the 
IdentityTableReducer.java file, I completely missed the "@Override" annotation, 
which I would bet tells the compiler and/or the framework to use my function 
instead of the default identity function in the 
org.apache.hadoop.mapreduce.Reducer class. Since the framework was using the 
identity function, it kept passing Text records to TableOutputFormat, as that 
is my map job's output. I was pretty sure my reduce function wasn't being 
called, and I finally figured out why.
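For anyone who hits this later, the effect can be reproduced in plain Java with no Hadoop classes at all (every name below is made up for illustration): a child method whose signature doesn't exactly match the parent's creates an overload rather than an override, so the parent's "identity" version keeps running silently, and @Override turns that silent mistake into a compile error.

```java
// Plain-Java sketch of the pitfall: the child's parameter type differs,
// so the parent's default ("identity") method keeps getting called.
class BaseReducer {
    // this is the signature the "framework" actually calls
    public String reduce(CharSequence key) {
        return "identity:" + key;  // default identity behavior
    }
}

class MyReducer extends BaseReducer {
    // Oops: the parameter is String, not CharSequence, so this method
    // merely overloads reduce(). Adding @Override here would make the
    // compiler reject it and expose the bug immediately.
    public String reduce(String key) {
        return "custom:" + key;
    }
}

public class OverrideDemo {
    public static String callThroughFramework() {
        BaseReducer r = new MyReducer();
        CharSequence key = "row1";
        return r.reduce(key);  // resolves to BaseReducer.reduce: identity wins
    }

    public static void main(String[] args) {
        System.out.println(callThroughFramework());  // prints identity:row1
    }
}
```

The same thing happens with the Hadoop Reducer: if the reduce() parameter types don't line up with the class's generic parameters, the framework falls through to the inherited identity reduce.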

Thanks everyone for the support, sorry to keep bugging the list with silly 
questions. Maybe they'll at least help some more new hbasers/hadoopers down the 
line.

Travis Hegner
http://www.travishegner.com/



-----Original Message-----
From: Travis Hegner <theg...@trilliumit.com>
Reply-to: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>, "Hegner, Travis" <theg...@trilliumit.com>
To: hbase-user@hadoop.apache.org <hbase-user@hadoop.apache.org>
Subject: Re: Pass a Delete or a Put
Date: Mon, 27 Jul 2009 14:49:34 -0400



Andrew,

I did not realize those other settings were implicitly defined by the init* 
functions. Thanks for the tip! I've updated my code with this in mind.

All,

In spite of that mistake, I still cannot get my job to run successfully. I'm 
not sure if there is some kind of ambiguity between TableMapper.Context and 
TableReducer.Context, but whatever is calling 
TableOutputFormat$RecordWriter.write(key, value) is calling it with my MAP 
class output instead of my REDUCE class output.

Anything else I can check?

Thanks,
Travis Hegner
http://www.travishegner.com/

-----Original Message-----
From: Andrew Purtell <apurt...@apache.org>
To: hbase-user@hadoop.apache.org <hbase-user@hadoop.apache.org>, "Hegner, Travis" <theg...@trilliumit.com>
Subject: Re: Pass a Delete or a Put
Date: Mon, 27 Jul 2009 11:41:56 -0400

This is how I would do it. I don't know for sure if it will help or not:

public static class Map extends TableMapper<Text,Text> {
  public void map(ImmutableBytesWritable key, Result value, Mapper.Context 
context) throws IOException, InterruptedException {
    // ...
  }
}

public static class Reduce extends TableReducer<Text,Text,ImmutableBytesWritable> {
  public void reduce(Text key, Iterable<Text> values, Reducer.Context context)
      throws IOException, InterruptedException {
    Iterator<Text> i = values.iterator();
    while (i.hasNext()) {
      Text value = i.next();

      // ...

      byte[] rowKey = Bytes.toBytes(key.toString());
      Put put = new Put(rowKey);

      //...

      // the key for write is ignored by TOF but we need one for the
      // framework
      ImmutableBytesWritable ibw = new ImmutableBytesWritable(rowKey);
      context.write(ibw, put);
    }
  }
}

public static void main(String[] args) throws Exception {
  Job myJob = new Job();
  myJob.setJobName("myJob");
  myJob.setJarByClass(MyClass.class);

  Scan myScan = new Scan("".getBytes(),"12345".getBytes());
  myScan.addColumn("Resume:Text".getBytes());

  TableMapReduceUtil.initTableMapperJob("inputTable", myScan, Map.class, 
Text.class, Text.class, myJob);
  TableMapReduceUtil.initTableReducerJob("outputTable", Reduce.class, myJob);

  // the following are done implicitly by initTableReducerJob
  //   job.setOutputFormatClass(TableOutputFormat.class);
  //   job.setOutputKeyClass(ImmutableBytesWritable.class);
  //   job.setOutputValueClass(Put.class);

  myJob.setNumReduceTasks(12);

  myJob.submit();

  while (!myJob.isComplete()) {
    Thread.sleep(10000);
    System.out.println("Map: " + (myJob.mapProgress() * 100) + "% ... Reduce: "
        + (myJob.reduceProgress() * 100) + "%");
  }

  if(myJob.isSuccessful()) {
    System.out.println("Job Successful.");
  } else {
    System.out.println("Job Failed.");
  }
}





________________________________

From: Travis Hegner <theg...@trilliumit.com>
To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
Sent: Monday, July 27, 2009 6:33:57 AM
Subject: Re: Pass a Delete or a Put

Here is the main function from my m/r class. I've been just piecing what to do 
together through old and new documentation, so forgive me if this is not right.

public static void main(String[] args) throws Exception {
  Job myJob = new Job();
  myJob.setJobName("myJob");
  myJob.setJarByClass(MyClass.class);

  myJob.setMapOutputKeyClass(Text.class);
  myJob.setMapOutputValueClass(Text.class);

  myJob.setOutputKeyClass(Text.class);
  myJob.setOutputValueClass(Put.class);

  Scan myScan = new Scan("".getBytes(), "12345".getBytes());
  myScan.addColumn("Resume:Text".getBytes());

  TableMapReduceUtil.initTableMapperJob("inputTable", myScan, Map.class,
      Text.class, Text.class, myJob);
  TableMapReduceUtil.initTableReducerJob("outputTable", Reduce.class, myJob);

  myJob.setMapperClass(Map.class);
  myJob.setReducerClass(Reduce.class);

  myJob.setInputFormatClass(TableInputFormat.class);
  myJob.setOutputFormatClass(TableOutputFormat.class);

  myJob.setNumReduceTasks(12);

  myJob.submit();

  while (!myJob.isComplete()) {
    Thread.sleep(10000);
    System.out.println("Map: " + (myJob.mapProgress() * 100) + "% ... Reduce: "
        + (myJob.reduceProgress() * 100) + "%");
  }

  if (myJob.isSuccessful()) {
    System.out.println("Job Successful.");
  } else {
    System.out.println("Job Failed.");
  }
}

I originally did not have "myJob.setOutputValueClass(Put.class)" set properly 
(I was looking for something like 'setReduceOutputValueClass'), but found it 
just before reading this email. I changed the context.write statements in both 
my map and reduce classes to output static data, and what seems to be happening 
is that the job framework is calling my map class where it should be calling my 
reduce class. To explain further, I did as stack suggested and modified 
TableOutputFormat.java as follows:

96:    else throw new IOException("Pass a Delete or a Put rather than a " + 
value.getClass() + " = " + value);

It seems that no matter what I put in my reduce function's context.write(), 
the TableOutputFormat.write() function believes that I have passed a Text, and 
the value it contains is the very static value that I am writing for every map 
iteration.

My map/reduce classes are defined as follows:

public static class Map extends TableMapper<Text,Text> {
  public void map(ImmutableBytesWritable key, Result value, Mapper.Context 
context) throws IOException, InterruptedException {
  }
}

public static class Reduce extends TableReducer<Writable,Writable,Put> {
  public void reduce(Text key, Iterable<Text> values, Reducer.Context context) 
throws IOException, InterruptedException {
  }
}

I tried modeling after the identity functions, but apparently I'm doing 
something wrong...

Thanks for any help,

Travis


-----Original Message-----
From: Andrew Purtell <apurt...@apache.org>
Reply-to: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
To: hbase-user@hadoop.apache.org <hbase-user@hadoop.apache.org>
Subject: Re: Pass a Delete or a Put
Date: Sun, 26 Jul 2009 14:33:14 -0400



How is the job configured? Are the TableMapReduceUtil static methods used
or is it done by hand?

This might be missing:

  job.setOutputValueClass(Put.class)

- Andy




________________________________
From: stack <st...@duboce.net>
To: hbase-user@hadoop.apache.org
Sent: Saturday, July 25, 2009 9:08:01 AM
Subject: Re: Pass a Delete or a Put

Hmm... That should work.

Here is code from TOF:

    public void write(KEY key, Writable value)
    throws IOException {
      if (value instanceof Put) this.table.put(new Put((Put)value));
      else if (value instanceof Delete) this.table.delete(new
Delete((Delete)value));
      else throw new IOException("Pass a Delete or a Put");
    }


Maybe change the IOE... to something like:

      else throw new IOException("Pass a Delete or a Put rather than a " +
value);


...compile and retry. What does the exception look like?

We're squashing the type somehow or else context.write and
TOF#RecordWriter#write are not properly hooked up.

Thanks Travis,
St.Ack


On Sat, Jul 25, 2009 at 7:14 AM, Hegner, Travis <theg...@trilliumit.com> wrote:

> Hi All,
>
> I am getting the "Pass a Delete or a Put" exception from my reducer tasks
> (TableOutputFormat.java:96), even though I am actually passing a put...
>
>                        for (int i = 0; i < idList.size(); i++) {
>                                Put thisput = new Put(key.toString().getBytes());
>                                thisput.add("Positions".getBytes(),
>                                        idList.get(i).toString().getBytes(),
>                                        posList.get(i).toString().getBytes());
>                                context.write(key, thisput);
>                        }
>
> Is there anything wrong with this section of code from my reduce()?
>
> I have also tried casting the value with:
>
> context.write(key, (Put)thisput);
>
> Any Ideas?
>
> Travis Hegner
> http://www.travishegner.com/
>
> The information contained in this communication is confidential and is
> intended only for the use of the named recipient.  Unauthorized use,
> disclosure, or copying is strictly prohibited and may be unlawful.  If you
> have received this communication in error, you should know that you are
> bound to confidentiality, and should please immediately notify the sender or
> our IT Department at  866.459.4599.
>










