Thanks a ton - that worked like a charm. I have been struggling with this the whole day! I did not need to specify auxlib or auxpath - just putting the 3 Hive/HBase jars in HADOOP_HOME/lib on the remote job server worked fine. Btw, if I use ADD JAR from Hive, will that obviate the need to put the jars in HADOOP_HOME/lib?
I guess this is not the ideal scenario - but at least I can proceed.

Regards
Abhijit

On 16 March 2011 22:29, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Wed, Mar 16, 2011 at 12:51 PM, Abhijit Sharma
> <abhijit.sha...@gmail.com> wrote:
> > Hi,
> > I am trying to connect the hive shell running on my laptop to a remote
> > hadoop / hbase cluster and test out the HBase/Hive integration. I manage
> > to connect and create the table in hbase from remote Hive shell. I am
> > also passing the auxpath parameter to the shell (specifying the
> > Hive/HBase integration related jars). In addition I have copied over
> > these files to HDFS as well (I am using the user name hadoop - so the
> > jars are stored in HDFS under /user/hadoop).
> > However when I fire a query on the HBase table - select * from h1 where
> > key=12; - the map reduce job launches but the map task fails with the
> > following error:
> > ----
> >
> > java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
> >     at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:333)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> > ----
> > This basically indicates that the Mapper task is unable to locate the
> > Hive/HBase storage handler that it requires when running. This happens
> > even though this has been specified in the auxpath and uploaded to HDFS.
> > Any ideas/pointers/debug options on what I might be doing wrong? Any
> > help is much appreciated.
> > p.s. the exploded jars do get copied too under the taskTracker
> > directory on the cluster node
> > Thanks
>
> I have seen this error. This is oddness between hadoop, hive, and
> map/reduce classpaths.
>
> This is what I do:
> mkdir hive_home/auxlib
> cp all hive and hbase jars here.
> Also copy the hbase handler jar to auxlib.
>
> Auxlib gets pushed out by the distributed cache with each job, and you
> do not need to use ADD JAR XXXX;
>
> But that is not enough! DOH! Planning the job and getting the splits
> happen before the map tasks are launched.
>
> For this I drop all the hbase libs in hadoop_home/lib, only on the
> machine that is launching the job.
>
> You can fiddle around with HADOOP_CLASSPATH and achieve similar results.
>
> Good luck.
>

--
Regards,
Abhijit
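
For reference, the steps Edward describes above could be sketched roughly as
the shell helpers below. This is only a sketch under assumptions: the function
names stage_jars and classpath_for are made up for illustration, every path in
the usage comments is a placeholder, and it assumes HIVE_HOME and HADOOP_HOME
are set on the machine that launches the job. The thread does not list the
exact jar names, so the helpers simply copy whatever .jar files are in the
directory they are given.

```shell
#!/bin/sh
# Sketch only: helper names and all paths are hypothetical, not from the thread.

# Copy every .jar found in directory $1 into directory $2, creating $2 if needed.
stage_jars() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    for jar in "$src_dir"/*.jar; do
        # Skip the unexpanded pattern when the source directory holds no jars.
        [ -e "$jar" ] || continue
        cp "$jar" "$dest_dir"/ || return 1
    done
}

# Print a colon-separated list of the .jar files in directory $1,
# suitable for appending to HADOOP_CLASSPATH.
classpath_for() {
    cp_list=
    for jar in "$1"/*.jar; do
        [ -e "$jar" ] || continue
        cp_list="${cp_list:+$cp_list:}$jar"
    done
    printf '%s\n' "$cp_list"
}

# Example usage (placeholder source directory):
#   stage_jars /path/to/hive-hbase-jars "$HIVE_HOME/auxlib"   # shipped with each job
#   stage_jars /path/to/hive-hbase-jars "$HADOOP_HOME/lib"    # job-launching machine only
# or, instead of copying into HADOOP_HOME/lib:
#   export HADOOP_CLASSPATH="$(classpath_for /path/to/hive-hbase-jars)"
```

The two stage_jars calls correspond to the auxlib step and the hadoop_home/lib
step; the HADOOP_CLASSPATH line is the alternative Edward mentions for the
second step.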