Does this page help? http://wiki.apache.org/hadoop/Hbase/MapReduce
St.Ack

Dan Tamowski wrote:
Hello, and thanks for your time.

I'm trying to run a MapReduce job that outputs to HBase. I have previously run the LineIndexer example as a simple Hadoop job (lab 1 at http://code.google.com/edu/content/submissions/uwspr2007_clustercourse/listing.html), and I am now trying to modify that code to output to an HBase table. The job runs into the reduce phase, but the reduce tasks then fail with a number of ClassNotFoundExceptions and the job attempts to restart them. The job ultimately fails. I have tried this using both the Eclipse plugin and the command line.

All of the exceptions name TableOutputFormat, so I tried to check whether there was an issue with the hbase jar on my computer by placing the following two lines at the beginning of the main function:

    TableOutputFormat testRef = new TableOutputFormat();
    Class testGet = TableOutputFormat.class;

These two lines executed with no issue, and I am at a loss. Any insight would be appreciated. Also, I only subscribe to the digest, so could you please cc me ([EMAIL PROTECTED]) on any responses?
The specs of the cluster, the code I am using, and the output are below.

Thanks,
Dan
[EMAIL PROTECTED]

----------------------------------------Specs
Master: Apple XServe w/ Mac OS X Leopard
Nodes (4): Mac Minis w/ Mac OS X Tiger
Java version: Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-237)
Hadoop version: 0.15.1

----------------------------------------Code
package bkl;

import java.io.*;
import java.util.*;

import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.hbase.io.*;
import org.apache.hadoop.hbase.mapred.*;

public class HBaseTest {

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        TableOutputFormat testRef = new TableOutputFormat();
        Class testGet = TableOutputFormat.class;
        String[] test = {"Thomson-OutlineOfScience-V1"};
        JobConf conf = new JobConf(HBaseTest.class);
        conf.setJobName("LineIndexer");
        conf.setMapperClass(HBaseTestMap.class);
        conf.setReducerClass(HBaseTestReduce.class);
        //conf.setInputPath(new Path(args[0]));
        conf.setInputPath(new Path(test[0]));
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TableOutputFormat.class);
        conf.setOutputKeyClass(Text.class);
        conf.set("hbase.mapred.outputtable", "Test");
        conf.set("hbase.master", "BKLCluster.hadoop:9003");
        JobClient.runJob(conf);
    }

    public static class HBaseTestMap extends MapReduceBase
            implements Mapper<WritableComparable, Writable, Text, Text> {
        private final static Text word = new Text();
        private final static Text summary = new Text();

        public void map(WritableComparable key, Writable val,
                OutputCollector<Text, Text> output, Reporter report)
                throws IOException {
            String line = val.toString();
            summary.set(key.toString() + ":" + line);
            StringTokenizer itr = new StringTokenizer(line.toLowerCase());
            while(itr.hasMoreTokens()) {
                word.set("test1:" + itr.nextToken());
                output.collect(word,
                        summary);
            }
        }
    }

    public static class HBaseTestReduce extends MapReduceBase
            implements Reducer<WritableComparable, Text, WritableComparable, MapWritable> {

        public void reduce(WritableComparable key, Iterator<Text> values,
                OutputCollector<WritableComparable, MapWritable> output,
                Reporter reporter) throws IOException {
            boolean first = true;
            StringBuilder toReturn = new StringBuilder();
            while(values.hasNext()){
                if(!first)
                    toReturn.append('^');
                first=false;
                toReturn.append(values.next().toString());
            }
            byte[] bytes = toReturn.toString().getBytes();
            MapWritable retval = new MapWritable();
            retval.put(new Text(""), new ImmutableBytesWritable(bytes));
            output.collect(key, retval);
        }
    }
}

----------------------------------------Output
08/01/29 13:19:40 INFO mapred.FileInputFormat: Total input paths to process : 1
08/01/29 13:19:40 INFO mapred.JobClient: Running job: job_200801241404_0024
08/01/29 13:19:41 INFO mapred.JobClient: map 0% reduce 0%
08/01/29 13:19:43 INFO mapred.JobClient: map 2% reduce 0%
08/01/29 13:19:44 INFO mapred.JobClient: map 7% reduce 0%
08/01/29 13:19:45 INFO mapred.JobClient: map 12% reduce 0%
08/01/29 13:19:46 INFO mapred.JobClient: map 17% reduce 0%
08/01/29 13:19:47 INFO mapred.JobClient: map 25% reduce 0%
08/01/29 13:19:48 INFO mapred.JobClient: map 35% reduce 0%
08/01/29 13:19:49 INFO mapred.JobClient: map 47% reduce 0%
08/01/29 13:19:50 INFO mapred.JobClient: map 62% reduce 0%
08/01/29 13:19:51 INFO mapred.JobClient: map 74% reduce 0%
08/01/29 13:19:52 INFO mapred.JobClient: map 89% reduce 0%
08/01/29 13:19:53 INFO mapred.JobClient: map 100% reduce 0%
08/01/29 13:20:01 INFO mapred.JobClient: map 100% reduce 3%
08/01/29 13:20:02 INFO mapred.JobClient: map 100% reduce 20%
08/01/29 13:20:03 INFO mapred.JobClient: map 100% reduce 23%
08/01/29 13:20:06 INFO mapred.JobClient: map 100% reduce 11%
08/01/29 13:20:06 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
        ... 4 more
08/01/29 13:20:06 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
        ... 4 more
08/01/29 13:20:06 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000002_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
        ... 4 more
08/01/29 13:20:07 INFO mapred.JobClient: map 100% reduce 0%
08/01/29 13:20:07 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000004_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
        ... 4 more
08/01/29 13:20:07 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000003_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
        ... 4 more
08/01/29 13:20:07 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000005_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
        at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:316)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:374)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
        ... 4 more
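
A note on the two-line check in main(): it runs in the JVM that submits the job, while the traces above come from TaskTracker$Child, the separate JVM that runs each reduce task, where Configuration.getClassByName ends up in Class.forName. A class can be visible to one and missing from the other. The sketch below illustrates only that general point with plain JDK class loaders. It does not use Hadoop or HBase, and the names ClasspathCheck, canLoad, and the "isolated" loader are made up for the illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClasspathCheck {

    /** Returns true if the named class can be found through the given loader. */
    public static boolean canLoad(String className, ClassLoader loader) {
        try {
            // Same style of lookup as in the trace: Class.forName by name.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = ClasspathCheck.class.getName();

        // Stand-in for the submitting JVM: the loader that loaded this class.
        ClassLoader client = ClasspathCheck.class.getClassLoader();

        // Stand-in for a JVM whose classpath lacks the jar (an assumption for
        // illustration): no URLs, and only the bootstrap loader as parent.
        ClassLoader isolated = new URLClassLoader(new URL[0], null);

        System.out.println("client loader:   " + canLoad(name, client));
        System.out.println("isolated loader: " + canLoad(name, isolated));
    }
}
```

The first lookup succeeds and the second does not, even though both ask for the same class name. If that is what is happening here, the hbase jar would need to be visible to the task JVMs on the nodes and not only to the client. That reading is an assumption based on the traces, not something verified on this cluster.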
