2012/5/10 Harsh J <ha...@cloudera.com>:
> Christoph,
>
> I still don't get your issue though. A stack trace of the error thrown
> out by the components you used would be good to have :)

ok. ;)
$ export HBASE_CONF_DIR=/etc/hbase/conf/
$ export HADOOP_CLASSPATH=`hbase classpath`
$ hadoop jar test.jar simple.HtablePartitionerTest -wp ...
12/05/10 13:24:37 INFO mapred.JobClient: Running job: job_201205021601_0064
12/05/10 13:24:38 INFO mapred.JobClient:  map 0% reduce 0%
12/05/10 13:24:56 INFO mapred.JobClient: Task Id : attempt_201205021601_0064_m_000000_0, Status : FAILED
java.lang.NullPointerException
        at org.apache.hadoop.hbase.mapreduce.HRegionPartitioner.setConf(HRegionPartitioner.java:128)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:560)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:639)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
12/05/10 13:25:13 INFO mapred.JobClient: Task Id : attempt_201205021601_0064_m_000000_1, Status : FAILED
java.lang.NullPointerException
        (identical stack trace as above)
12/05/10 13:25:30 INFO mapred.JobClient: Task Id : attempt_201205021601_0064_m_000000_2, Status : FAILED
java.lang.NullPointerException
        (identical stack trace as above)
12/05/10 13:25:49 INFO mapred.JobClient: Job complete: job_201205021601_0064
12/05/10 13:25:49 INFO mapred.JobClient: Counters: 8
12/05/10 13:25:49 INFO mapred.JobClient:   Job Counters
12/05/10 13:25:49 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=69823
12/05/10 13:25:49 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/05/10 13:25:49 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/05/10 13:25:49 INFO mapred.JobClient:     Rack-local map tasks=3
12/05/10 13:25:49 INFO mapred.JobClient:     Launched map tasks=4
12/05/10 13:25:49 INFO mapred.JobClient:     Data-local map tasks=1
12/05/10 13:25:49 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
12/05/10 13:25:49 INFO mapred.JobClient:     Failed map tasks=1
$

HtablePartitionerTest.java:

package simple;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HRegionPartitioner;
import org.apache.hadoop.hbase.mapreduce.IdentityTableReducer;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HtablePartitionerTest extends Configured implements Tool {

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new HtablePartitionerTest(), args);
    System.exit(res);
  }

  @Override
  public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    Job job = new Job(conf);
    job.setJobName("simple partitioner test");
    TableMapReduceUtil.initTableMapperJob("test", new Scan(), Importer.class,
        ImmutableBytesWritable.class, Put.class, job);
    TableMapReduceUtil.initTableReducerJob("test2", IdentityTableReducer.class,
        job, HRegionPartitioner.class);
    job.setJarByClass(HtablePartitionerTest.class);
    TableMapReduceUtil.addDependencyJars(job);
    TableMapReduceUtil.addDependencyJars(job.getConfiguration(),
        com.google.common.base.Function.class);
    job.waitForCompletion(true);
    return 0;
  }

  /**
   * Importer converts Results to Puts.
   */
  static class Importer extends TableMapper<ImmutableBytesWritable, Put> {

    @Override
    public void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException {
      try {
        writeResult(row, value, context);
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
    }

    private void writeResult(ImmutableBytesWritable key, Result result, Context context)
        throws IOException, InterruptedException {
      Put put = new Put(key.get());
      for (KeyValue kv : result.raw()) {
        put.add(kv);
      }
      context.write(key, put);
    }
  }
}

test and test2 are identical tables, with 1 row in test.

And from the tasklog (domain names changed):

2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/usr/lib/hadoop-0.20/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u3.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-lang-2.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r09-jarjar.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u3.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12
-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/jobcache/job_201205021601_0065/jars/classes:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/jobcache/job_201205021601_0065/jars/job.jar:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/distcache/-7989996860260536268_1437961863_914797102/sargas.mycompany.net/mnt/data01/hadoop/cache/mapred/mapred/staging/c.bauer/.staging/job_201205021601_0065/libjars/hbase-0.90.4-cdh3u3.jar:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/distcache/-17356112888283949_-1672022207_914797221/sargas.mycompany.net/mnt/data01/hadoop/cache/mapred/mapred/staging/c.bauer/.staging/job_201205021601_0065/libjars/zookeeper-3.3.4-cdh3u3.jar:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/distcache/2846241399872591301_137152222_914797256/sargas.mycompany.net/mnt/data01/hadoop/cache/mapred/mapred/staging/c.bauer/.staging/job_201205021601_0065/libjars/hadoop-core-0.20.2-cdh3u3.jar:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/distcache/168365755990403618_1434190150_914797329/sargas.mycompany.net/mnt/data01/hadoop/cache/mapred/mapred/staging/c.bauer/.staging/job_201205021601_0065/libjars/guava-r06.jar:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/jobcache/job_201205021601_0065/attempt_201205021601_0065_m_000000_0/work
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64:/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/jobcache/job_201205021601_0065/attempt_201205021601_0065_m_000000_0/work
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/um/srv/hadoop/data01/mapred/local/taskTracker/c.bauer/jobcache/job_201205021601_0065/attempt_201205021601_0065_m_000000_0/work/tmp
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.32-38-server
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=mapred
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/usr/lib/hadoop-0.20
2012-05-10 13:33:49,598 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/mnt/data01/hadoop/mapred/local/taskTracker/c.bauer/jobcache/job_201205021601_0065/attempt_201205021601_0065_m_000000_0/work
2012-05-10 13:33:49,600 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=peacock:2222,nusakan:2222,sheratan:2222,sargas:2222,muscida:2222 sessionTimeout=180000 watcher=hconnection
2012-05-10 13:33:49,625 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server muscida/x.x.183.22:2222
2012-05-10 13:33:49,626 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to muscida/x.x.183.22:2222, initiating session
2012-05-10 13:33:49,635 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server muscida/x.x.183.22:2222, sessionid = 0x5370ddaa6a70092, negotiated timeout = 40000
2012-05-10 13:33:49,848 INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat: Created table instance for test2
2012-05-10 13:33:49,870 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2012-05-10 13:33:49,881 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@68a0864f
2012-05-10 13:33:49,974 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=peacock:2222,nusakan:2222,sheratan:2222,sargas:2222,muscida:2222 sessionTimeout=180000 watcher=hconnection
2012-05-10 13:33:49,975 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server sheratan/x.x.183.19:2222
2012-05-10 13:33:49,976 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to sheratan/x.x.183.19:2222, initiating session
2012-05-10 13:33:49,979 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server sheratan/x.x.183.19:2222, sessionid = 0x2370dda9fbd009e, negotiated timeout = 40000
2012-05-10 13:33:50,041 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
2012-05-10 13:33:50,104 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
2012-05-10 13:33:50,104 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
2012-05-10 13:33:50,136 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=180000 watcher=hconnection
2012-05-10 13:33:50,137 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-10 13:33:50,139 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
2012-05-10 13:33:50,295 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181
2012-05-10 13:33:50,296 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
2012-05-10 13:33:51,692 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-10 13:33:51,693 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
... (repeated several times) ...
2012-05-10 13:34:00,362 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
2012-05-10 13:34:00,363 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
2012-05-10 13:34:00,931 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181
2012-05-10 13:34:01,032 INFO org.apache.zookeeper.ZooKeeper: Session: 0x0 closed
2012-05-10 13:34:01,032 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2012-05-10 13:34:01,032 ERROR org.apache.hadoop.hbase.mapreduce.TableInputFormat: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
2012-05-10 13:34:01,038 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-05-10 13:34:01,041 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.NullPointerException
        at org.apache.hadoop.hbase.mapreduce.HRegionPartitioner.setConf(HRegionPartitioner.java:128)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:560)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:639)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-05-10 13:34:01,045 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

>
> On Thu, May 10, 2012 at 3:38 PM, Christoph Bauer
> <christoph.ba...@unbelievable-machine.com> wrote:
>> Hi,
>> thank you for your input.
>>
>> I've been doing it exactly that way. Jars appear in the classpath
>> without problems.
>> But I am unable to transfer hbase-site.xml to the mapper's classpath.
>>
>> So now I will try adding hbase-site.xml to the CP in hadoop-env.sh.
>>
>> The addDependencyJars mechanism does not work for xml files :(
>>
>> 2012/5/9 Harsh J <ha...@cloudera.com>:
>>> Could you share your whole stack trace?
>>>
>>> How do you launch your HBase+MR job? The ideal way is to simply do:
>>>
>>> HADOOP_CLASSPATH=`hbase classpath` hadoop jar <hbase job jar> <args>
>>>
>>> And this will take care of hbase-site.xml location appearing in the
>>> classpath as well. If you're using a package-installed environment,
>>> ensure /etc/hbase/conf/hbase-site.xml is populated with the right
>>> settings and if not, make such a file and:
>>>
>>> export HBASE_CONF_DIR=/dir/that/contains/that/file
>>>
>>> Before running the former command.
>>>
>>> Let us know if this helps.
>>>
>>> On Wed, May 9, 2012 at 9:38 PM, Christoph Bauer
>>> <christoph.ba...@unbelievable-machine.com> wrote:
>>>> Hi,
>>>>
>>>> First, I'm aware of HBASE-4398, though I don't know how that patch
>>>> could work.
>>>>
>>>> I'm on a cdh3u3 cluster with 4 nodes. HBase is 0.90.4.
>>>>
>>>> The problem is that ZooKeeper is running on port 2222.
>>>>
>>>> The following line results in an NPE when the mappers start:
>>>> TableMapReduceUtil.initTableReducerJob("test2",
>>>>     IdentityTableReducer.class, job, HRegionPartitioner.class);
>>>>
>>>> HBaseConfiguration.addHbaseResources in HRegionPartitioner.setConf
>>>> overwrites the quorum and clientPort with hbase-default.xml from the
>>>> hbase jar, maybe more.
>>>> HBaseConfiguration.addHbaseResources also tries to load
>>>> hbase-site.xml, but fails silently (not found as a resource).
>>>>
>>>> Can I make my mapreduce jobs aware of this resource, i.e. pass it to
>>>> all the mappers, or do I have to ask my administrator to make some
>>>> changes?
>>>>
>>>> Thank you,
>>>>
>>>> Christoph Bauer
>>>
>>> --
>>> Harsh J
>
> --
> Harsh J
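PS, for anyone hitting the same thing: as far as I can tell from the tasklog, HBaseConfiguration.addHbaseResources() inside HRegionPartitioner.setConf() cannot find hbase-site.xml on the task classpath, so the quorum silently falls back to localhost:2181 from hbase-default.xml. One possible workaround (a sketch, not verified here) is to make a hbase-site.xml with the real quorum visible as a classpath resource in every task, e.g. by bundling it into the job jar. The hostnames and client port below are the ones from the logs in this thread; adjust them for your own cluster:

```xml
<?xml version="1.0"?>
<!-- Minimal hbase-site.xml sketch. Quorum hosts and client port are
     taken from the logs in this thread; verify against your setup. -->
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>peacock,nusakan,sheratan,sargas,muscida</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
  </property>
</configuration>
```

With the file at the root of the jar (e.g. `jar uf test.jar hbase-site.xml` from the directory containing it), it should be found as a classpath resource by the child JVMs, which is exactly where addDependencyJars cannot help since it only ships jars.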