In my original reduce() function I had declared the third parameter as "Reducer.Context", then later as "TableReducer.Context", trying to avoid ambiguity. After adding "@Override", it would not compile until I declared that parameter as simply "Context". Just before adding "@Override" I had also changed my first two parameters to Writable and Iterable<Writable>, respectively. I did this because, after studying the source a little, I saw that the run() function of org.apache.hadoop.mapreduce.Reducer calls reduce() with those parameter types, and I was trying to avoid accidentally overloading the method and not having my reduce() called. I don't know whether that tidbit aided the solution, but since it worked I left it.
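[Editor's note] The overriding rules at play here can be sketched with plain-Java stand-ins (the Reducer and MyReducer classes below are hypothetical analogues, not the Hadoop types): a method that matches the parent's parameterized signature overrides it, a near-miss signature merely overloads it, and @Override turns that mistake into a compile error.

```java
// Hypothetical minimal analogue of the Reducer hierarchy; not Hadoop code.
class Reducer<K, V> {
    // The framework's run() calls reduce(); the default is an identity pass-through.
    public void reduce(K key, V value, StringBuilder out) {
        out.append("identity:").append(key);
    }
    public void run(K key, V value, StringBuilder out) {
        reduce(key, value, out);
    }
}

class MyReducer extends Reducer<String, Integer> {
    // Matching the class's type parameters (plus @Override to make the
    // compiler verify it) guarantees this method replaces the identity default.
    @Override
    public void reduce(String key, Integer value, StringBuilder out) {
        out.append("custom:").append(key);
    }
    // Had this been declared as reduce(CharSequence key, ...), it would merely
    // OVERLOAD the parent method, @Override would be a compile error, and
    // run() would still invoke the identity default.
}

public class OverrideDemo {
    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        new MyReducer().run("row1", 42, out);
        System.out.println(out);  // custom:row1
    }
}
```
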
Thanks,
Travis Hegner
http://www.travishegner.com/

-----Original Message-----
From: Andrew Purtell <apurt...@apache.org>
To: hbase-user@hadoop.apache.org, Hegner, Travis <theg...@trilliumit.com>
Subject: Re: Pass a Delete or a Put
Date: Tue, 28 Jul 2009 13:25:03 -0400

Hi Travis,

No, that's not silly. When you put the @Override annotation on the reduce method of your subclass, did the compiler accept it straight away, or did you have to fix up the method signature in some way in order to successfully compile the code?

- Andy

________________________________
From: Travis Hegner <theg...@trilliumit.com>
To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
Sent: Tuesday, July 28, 2009 5:54:40 AM
Subject: Re: Pass a Delete or a Put

I've solved this problem, and (believe it or not) it was something I was not doing in my code. I am pretty new to Java, and in the languages I've worked in previously, you could simply override a method in a child class without doing anything special. So in every one of the 137 thousand times I read the IdentityTableReducer.java file, I completely missed the "@Override" annotation, which tells the compiler to check that my function actually overrides, rather than accidentally overloads, the default identity function in the org.apache.hadoop.mapreduce.Reducer class. Since the framework was using the identity function, it kept passing Text records to TableOutputFormat, as that is my map job's output. I was pretty sure my reduce function wasn't being called, and finally figured out why.

Thanks everyone for the support, and sorry to keep bugging the list with silly questions. Maybe they'll at least help some more new HBasers/Hadoopers down the line.
Travis Hegner
http://www.travishegner.com/

-----Original Message-----
From: Travis Hegner <theg...@trilliumit.com>
Reply-to: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>, "Hegner, Travis" <theg...@trilliumit.com>
To: hbase-user@hadoop.apache.org
Subject: Re: Pass a Delete or a Put
Date: Mon, 27 Jul 2009 14:49:34 -0400

Andrew,

I did not realize those other settings were implicitly defined by the init* functions. Thanks for the tip! I've updated my code with this in mind.

All,

In spite of that mistake, I still cannot get my job to run successfully. I'm not sure if there is some kind of ambiguity going on with TableMapper.Context and TableReducer.Context, but whatever is calling TableOutputFormat$RecordWriter.write(Key, Value) is calling it with my MAP class's output instead of my REDUCE class's output. Anything else I can check?
Thanks,
Travis Hegner
http://www.travishegner.com/

-----Original Message-----
From: Andrew Purtell <apurt...@apache.org>
To: hbase-user@hadoop.apache.org, Hegner, Travis <theg...@trilliumit.com>
Subject: Re: Pass a Delete or a Put
Date: Mon, 27 Jul 2009 11:41:56 -0400

This is how I would do it. I don't know for sure if it will help or not:

public static class Map extends TableMapper<Text,Text> {
  public void map(ImmutableBytesWritable key, Result value, Mapper.Context context)
      throws IOException, InterruptedException {
    // ...
  }
}

public static class Reduce extends TableReducer<Text,Text,ImmutableBytesWritable> {
  public void reduce(Text key, Iterable<Text> values, Reducer.Context context)
      throws IOException, InterruptedException {
    Iterator<Text> i = values.iterator();
    while (i.hasNext()) {
      Text value = i.next();
      // ...
      byte[] rowKey = Bytes.toBytes(key.toString());
      Put put = new Put(rowKey);
      // ...
      // the key for write is ignored by TOF, but we need one for the framework
      ImmutableBytesWritable ibw = new ImmutableBytesWritable(rowKey);
      context.write(ibw, put);
    }
  }
}

public static void main(String[] args) throws Exception {
  Job myJob = new Job();
  myJob.setJobName("myJob");
  myJob.setJarByClass(MyClass.class);

  Scan myScan = new Scan("".getBytes(), "12345".getBytes());
  myScan.addColumn("Resume:Text".getBytes());

  TableMapReduceUtil.initTableMapperJob("inputTable", myScan, Map.class, Text.class, Text.class, myJob);
  TableMapReduceUtil.initTableReducerJob("outputTable", Reduce.class, myJob);
  // the following are done implicitly by initTableReducerJob:
  // job.setOutputFormatClass(TableOutputFormat.class);
  // job.setOutputKeyClass(ImmutableBytesWritable.class);
  // job.setOutputValueClass(Put.class);

  myJob.setNumReduceTasks(12);
  myJob.submit();
  while (!myJob.isComplete()) {
    Thread.sleep(10000);
    System.out.println("Map: " + (myJob.mapProgress() * 100) + "% ... Reduce: " + (myJob.reduceProgress() * 100) + "%");
  }
  if (myJob.isSuccessful()) {
    System.out.println("Job Successful.");
  } else {
    System.out.println("Job Failed.");
  }
}

________________________________
From: Travis Hegner <theg...@trilliumit.com>
To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
Sent: Monday, July 27, 2009 6:33:57 AM
Subject: Re: Pass a Delete or a Put

Here is the main function from my m/r class. I've been piecing together what to do from old and new documentation, so forgive me if this is not right.
public static void main(String[] args) throws Exception {
  Job myJob = new Job();
  myJob.setJobName("myJob");
  myJob.setJarByClass(MyClass.class);
  myJob.setMapOutputKeyClass(Text.class);
  myJob.setMapOutputValueClass(Text.class);
  myJob.setOutputKeyClass(Text.class);
  myJob.setOutputValueClass(Put.class);

  Scan myScan = new Scan("".getBytes(), "12345".getBytes());
  myScan.addColumn("Resume:Text".getBytes());

  TableMapReduceUtil.initTableMapperJob("inputTable", myScan, Map.class, Text.class, Text.class, myJob);
  TableMapReduceUtil.initTableReducerJob("outputTable", Reduce.class, myJob);
  myJob.setMapperClass(Map.class);
  myJob.setReducerClass(Reduce.class);
  myJob.setInputFormatClass(TableInputFormat.class);
  myJob.setOutputFormatClass(TableOutputFormat.class);

  myJob.setNumReduceTasks(12);
  myJob.submit();
  while (!myJob.isComplete()) {
    Thread.sleep(10000);
    System.out.println("Map: " + (myJob.mapProgress() * 100) + "% ... Reduce: " + (myJob.reduceProgress() * 100) + "%");
  }
  if (myJob.isSuccessful()) {
    System.out.println("Job Successful.");
  } else {
    System.out.println("Job Failed.");
  }
}

I originally did not have "myJob.setOutputValueClass(Put.class)" set properly (I was looking for something like 'setReduceOutputValueClass'), but found it just before reading this email.

I changed my context.write statements in both my map and reduce classes to output static data, and what seems to be happening is that the job framework is calling my map class where it should be calling my reduce class. To explain further, I did as stack suggested and modified line 96 of TableOutputFormat.java as follows:

96: else throw new IOException("Pass a Delete or a Put rather than a " + value.getClass() + " = " + value);

It seems that no matter what I put in my reduce function's context.write(), the TableOutputFormat.write() function believes that I have passed a Text, and the value it contains is the very static value that I am writing on every map iteration.
My map/reduce classes/functions are defined as follows:

public static class Map extends TableMapper<Text,Text> {
  public void map(ImmutableBytesWritable key, Result value, Mapper.Context context)
      throws IOException, InterruptedException {
  }
}

public static class Reduce extends TableReducer<Writable,Writable,Put> {
  public void reduce(Text key, Iterable<Text> values, Reducer.Context context)
      throws IOException, InterruptedException {
  }
}

I tried modeling these after the identity functions, but apparently I'm doing something wrong...

Thanks for any help,

Travis

-----Original Message-----
From: Andrew Purtell <apurt...@apache.org>
Reply-to: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
To: hbase-user@hadoop.apache.org
Subject: Re: Pass a Delete or a Put
Date: Sun, 26 Jul 2009 14:33:14 -0400

How is the job configured? Are the TableMapReduceUtil static methods used, or is it done by hand? This might be missing:

job.setOutputValueClass(Put.class)

- Andy

________________________________
From: stack <st...@duboce.net>
To: hbase-user@hadoop.apache.org
Sent: Saturday, July 25, 2009 9:08:01 AM
Subject: Re: Pass a Delete or a Put

Hmm... That should work. Here is the code from TOF:

public void write(KEY key, Writable value) throws IOException {
  if (value instanceof Put) this.table.put(new Put((Put) value));
  else if (value instanceof Delete) this.table.delete(new Delete((Delete) value));
  else throw new IOException("Pass a Delete or a Put");
}

Maybe change the IOE...
to something like:

else throw new IOException("Pass a Delete or a Put rather than a " + value);

...compile and retry. What does the exception look like? We're squashing the type somehow, or else context.write and TOF#RecordWriter#write are not properly hooked up.

Thanks Travis,
St.Ack

On Sat, Jul 25, 2009 at 7:14 AM, Hegner, Travis <theg...@trilliumit.com> wrote:
> Hi All,
>
> I am getting the "Pass a Delete or a Put" exception from my reducer tasks
> (TableOutputFormat.java:96), even though I am actually passing a Put...
>
> for (int i = 0; i < idList.size(); i++) {
>   Put thisput = new Put(key.toString().getBytes());
>   thisput.add("Positions".getBytes(), idList.get(i).toString().getBytes(),
>       posList.get(i).toString().getBytes());
>   context.write(key, thisput);
> }
>
> Is there anything wrong with this section of code from my reduce()?
>
> I have also tried casting the value with:
>
> context.write(key, (Put) thisput);
>
> Any ideas?
>
> Travis Hegner
> http://www.travishegner.com/
>
> The information contained in this communication is confidential and is
> intended only for the use of the named recipient. Unauthorized use,
> disclosure, or copying is strictly prohibited and may be unlawful. If you
> have received this communication in error, you should know that you are
> bound to confidentiality, and should please immediately notify the sender or
> our IT Department at 866.459.4599.
>
________________________________
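[Editor's note] The check that produces this exception can be sketched with plain-Java stand-ins (the Put, Delete, and Text classes below are hypothetical placeholders, not the HBase/Hadoop types): the writer dispatches on the runtime class of the value, so anything other than a Put or a Delete, such as the Text coming out of an identity reduce, trips the IOException.

```java
import java.io.IOException;

// Hypothetical stand-ins for the HBase types, just to exercise the dispatch.
class Put {}
class Delete {}
class Text {}

public class DispatchDemo {
    // Mirrors the instanceof dispatch in TOF's RecordWriter.write().
    static String write(Object value) throws IOException {
        if (value instanceof Put) {
            return "put";       // real code: table.put(new Put((Put) value))
        } else if (value instanceof Delete) {
            return "delete";    // real code: table.delete(new Delete((Delete) value))
        } else {
            // stack's suggested diagnostic: name the offending class
            throw new IOException("Pass a Delete or a Put rather than a "
                    + value.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(write(new Put()));   // put
            write(new Text());
        } catch (IOException e) {
            System.out.println(e.getMessage());     // Pass a Delete or a Put rather than a Text
        }
    }
}
```
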