Hello and thanks for your time,
I'm trying to run a MapReduce job that outputs to HBase. I have previously
run the LineIndexer example as a plain Hadoop job (lab 1 at
http://code.google.com/edu/content/submissions/uwspr2007_clustercourse/listing.html)
and I am now trying to modify that code to write to an HBase table. The job
gets as far as the reduce phase, but the reduce tasks then fail with
ClassNotFoundExceptions and are retried. The job ultimately fails. I see the
same behavior from both the Eclipse plugin and the command line.
All of the exceptions name TableOutputFormat, so I tried to check whether
the class itself was the problem. I placed the following two lines at the
beginning of the main function to see if there was an issue with the hbase
jar on my computer:
TableOutputFormat testRef = new TableOutputFormat();
Class testGet = TableOutputFormat.class;
These two lines executed with no issue, and I am at a loss. Any insight that
could be provided is appreciated. Also, I only subscribe to the digest, so
could you please cc me ([EMAIL PROTECTED]) on any responses?
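A note on what that two-line check covers: main() runs in the JVM that submits the job, while the stack traces below come from TaskTracker$Child, a separate JVM on a worker node. A class resolving in one does not show that it resolves in the other. The following is a minimal sketch of a probe that reports whether a class can be loaded by name in whichever JVM runs it, and from which jar or directory. ClasspathProbe and describe are hypothetical names for illustration, not part of Hadoop or HBase.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Hadoop or HBase.
final class ClasspathProbe {

    // Returns a one-line report: where className was loaded from in the
    // current JVM, or that it could not be found or linked.
    static String describe(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JVM's own runtime typically have no code source.
            String where = (src == null || src.getLocation() == null)
                    ? "the JVM runtime (no code source)"
                    : src.getLocation().toString();
            return className + ": loaded from " + where;
        } catch (ClassNotFoundException e) {
            return className + ": NOT FOUND on this JVM's classpath";
        } catch (LinkageError e) {
            // The class file exists but something it depends on is missing.
            return className + ": found but failed to link (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // A class that is always present, as a sanity check.
        System.out.println(describe("java.lang.String"));
        // The class named in the exceptions.
        System.out.println(describe("org.apache.hadoop.hbase.mapred.TableOutputFormat"));
        // The classpath this JVM was started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Run from main(), this only repeats the client-side check. To learn anything about the worker nodes, describe(...) would have to be called from inside the map or reduce code so that its result ends up in the task logs.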
The specs of the cluster, the code I am using, and the output are below.
Thanks,
Dan
[EMAIL PROTECTED]
----------------------------------------Specs
Master: Apple XServe w/ Mac OS X Leopard
Nodes (4): Mac Minis w/ Mac OS X Tiger
Java version: Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-237)
Hadoop version: 0.15.1
----------------------------------------Code
package bkl;

import java.io.*;
import java.util.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.hbase.io.*;
import org.apache.hadoop.hbase.mapred.*;

public class HBaseTest {

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        TableOutputFormat testRef = new TableOutputFormat();
        Class testGet = TableOutputFormat.class;
        String[] test = {"Thomson-OutlineOfScience-V1"};
        JobConf conf = new JobConf(HBaseTest.class);
        conf.setJobName("LineIndexer");
        conf.setMapperClass(HBaseTestMap.class);
        conf.setReducerClass(HBaseTestReduce.class);
        //conf.setInputPath(new Path(args[0]));
        conf.setInputPath(new Path(test[0]));
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TableOutputFormat.class);
        conf.setOutputKeyClass(Text.class);
        conf.set("hbase.mapred.outputtable", "Test");
        conf.set("hbase.master", "BKLCluster.hadoop:9003");
        JobClient.runJob(conf);
    }

    public static class HBaseTestMap extends MapReduceBase
            implements Mapper<WritableComparable, Writable, Text, Text> {

        private final static Text word = new Text();
        private final static Text summary = new Text();

        public void map(WritableComparable key, Writable val,
                OutputCollector<Text, Text> output, Reporter report)
                throws IOException {
            String line = val.toString();
            summary.set(key.toString() + ":" + line);
            StringTokenizer itr = new StringTokenizer(line.toLowerCase());
            while (itr.hasMoreTokens()) {
                word.set("test1:" + itr.nextToken());
                output.collect(word, summary);
            }
        }
    }

    public static class HBaseTestReduce extends MapReduceBase
            implements Reducer<WritableComparable, Text, WritableComparable, MapWritable> {

        public void reduce(WritableComparable key, Iterator<Text> values,
                OutputCollector<WritableComparable, MapWritable> output,
                Reporter reporter) throws IOException {
            boolean first = true;
            StringBuilder toReturn = new StringBuilder();
            while (values.hasNext()) {
                if (!first)
                    toReturn.append('^');
                first = false;
                toReturn.append(values.next().toString());
            }
            byte[] bytes = toReturn.toString().getBytes();
            MapWritable retval = new MapWritable();
            retval.put(new Text(""), new ImmutableBytesWritable(bytes));
            output.collect(key, retval);
        }
    }
}
----------------------------------------Output
08/01/29 13:19:40 INFO mapred.FileInputFormat: Total input paths to process : 1
08/01/29 13:19:40 INFO mapred.JobClient: Running job: job_200801241404_0024
08/01/29 13:19:41 INFO mapred.JobClient: map 0% reduce 0%
08/01/29 13:19:43 INFO mapred.JobClient: map 2% reduce 0%
08/01/29 13:19:44 INFO mapred.JobClient: map 7% reduce 0%
08/01/29 13:19:45 INFO mapred.JobClient: map 12% reduce 0%
08/01/29 13:19:46 INFO mapred.JobClient: map 17% reduce 0%
08/01/29 13:19:47 INFO mapred.JobClient: map 25% reduce 0%
08/01/29 13:19:48 INFO mapred.JobClient: map 35% reduce 0%
08/01/29 13:19:49 INFO mapred.JobClient: map 47% reduce 0%
08/01/29 13:19:50 INFO mapred.JobClient: map 62% reduce 0%
08/01/29 13:19:51 INFO mapred.JobClient: map 74% reduce 0%
08/01/29 13:19:52 INFO mapred.JobClient: map 89% reduce 0%
08/01/29 13:19:53 INFO mapred.JobClient: map 100% reduce 0%
08/01/29 13:20:01 INFO mapred.JobClient: map 100% reduce 3%
08/01/29 13:20:02 INFO mapred.JobClient: map 100% reduce 20%
08/01/29 13:20:03 INFO mapred.JobClient: map 100% reduce 23%
08/01/29 13:20:06 INFO mapred.JobClient: map 100% reduce 11%
08/01/29 13:20:06 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
    at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
    ... 4 more
08/01/29 13:20:06 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
    at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
    ... 4 more
08/01/29 13:20:06 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000002_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
    at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
    ... 4 more
08/01/29 13:20:07 INFO mapred.JobClient: map 100% reduce 0%
08/01/29 13:20:07 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000004_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
    at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
    ... 4 more
08/01/29 13:20:07 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000003_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
    at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
    ... 4 more
08/01/29 13:20:07 INFO mapred.JobClient: Task Id : task_200801241404_0024_r_000005_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:576)
    at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:461)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:301)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:544)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:568)
    ... 3 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapred.TableOutputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:316)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:374)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:242)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:542)
    ... 4 more