[ https://issues.apache.org/jira/browse/HIVE-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914655#comment-13914655 ]

sridhar commented on HIVE-4515:
-------------------------------

Hi,

We are using HBase Avro tables and would like to access the data through 
Hive, so I modified the code in HBaseStorageHandler, LazyHBaseRow, and 
LazyHBaseCellMap to support Avro schema parsing. Everything works, and I am 
able to see the data with a basic query such as "select * ...". But when I 
query the Hive table with any filter, or select only some columns, I see the 
same error that was reported above.
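
For reference, a minimal sketch of the kind of table involved (the table and 
column names here are made up, and this shows the stock column mapping, not 
my Avro-specific changes):

{noformat}
-- hypothetical table and column names; stock column mapping shown
CREATE EXTERNAL TABLE hbase_avro_events (key string, payload string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:payload")
TBLPROPERTIES ("hbase.table.name" = "events");

-- works: served directly, no MapReduce job
SELECT * FROM hbase_avro_events;

-- fails with the NegativeArraySizeException in the quoted report
SELECT payload FROM hbase_avro_events WHERE key > '0';
{noformat}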
The exact same error also occurs when we access the data in Hive using the 
original HBaseStorageHandler, so I do not think this is something introduced 
by my changes.
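
One observation that may help with diagnosis: a plain "select * ..." is 
served by a fetch task without launching a MapReduce job, while filters, 
column projections, and count(*) all go through MapReduce, and the stack 
trace fails while deserializing the input split (HBaseSplit.readFields -> 
TableSplit.readFields -> Bytes.readByteArray). Bytes.readByteArray reads a 
length prefix and then allocates a byte array of that size, so 
"NegativeArraySizeException: -1" means the reader found -1 where it expected 
a length, i.e. the split bytes written at job submission do not line up with 
what the HBase jar on the task's classpath expects to read. My assumption 
(not a confirmed root cause) is a version mismatch between the HBase jar 
used by the Hive client and the one shipped to the tasks.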
I wanted to check whether there is any workaround or fix available for this. 
We are using CDH 4.4. Any suggestions?
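
For anyone else who hits this: one thing worth ruling out (an assumption on 
my part, not a confirmed fix) is that the HBase jar Hive ships with the job 
differs from the 0.94.x jars the cluster actually runs. Registering the 
cluster's own jar explicitly before the query forces the two to match:

{noformat}
-- the path is illustrative; point it at the hbase jar your cluster runs
ADD JAR /usr/lib/hbase/hbase.jar;
SELECT COUNT(*) FROM hbase_avro_events;
{noformat}

The same can be done more permanently through hive.aux.jars.path, or per 
session with the CLI's --auxpath option.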

> "select count(*) from table" query on hive-0.10.0, hbase-0.94.7 integration 
> throws exceptions
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-4515
>                 URL: https://issues.apache.org/jira/browse/HIVE-4515
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 0.10.0, 0.11.0
>         Environment: hive-0.10.0, hive-0.11.0
> hbase-0.94.7, hbase-0.94.6.1
> zookeeper-3.4.3
> hadoop-1.0.4
> centos-5.7
>            Reporter: Yanhui Ma
>            Assignee: Swarnim Kulkarni
>            Priority: Critical
>
> After integrating hive-0.10.0 with hbase-0.94.7, these commands execute 
> successfully:
> {noformat}
> create table
> insert overwrite table
> select * from table
> {noformat}
> However, executing "select count(*) from table" throws an exception:
> {noformat}
> hive> select count(*) from test;         
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> Starting Job = job_201305061042_0028, Tracking URL = 
> http://master0:50030/jobdetails.jsp?jobid=job_201305061042_0028
> Kill Command = /opt/modules/hadoop/hadoop-1.0.4/libexec/../bin/hadoop job  
> -kill job_201305061042_0028
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2013-05-07 18:41:42,649 Stage-1 map = 0%,  reduce = 0%
> 2013-05-07 18:42:14,789 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201305061042_0028 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: 
> http://master0:50030/jobdetails.jsp?jobid=job_201305061042_0028
> Examining task ID: task_201305061042_0028_m_000002 (and more) from job 
> job_201305061042_0028
> Task with the most failures(4): 
> -----
> Task ID:
>   task_201305061042_0028_m_000000
> URL:
>   
> http://master0:50030/taskdetails.jsp?jobid=job_201305061042_0028&tipid=task_201305061042_0028_m_000000
> -----
> Diagnostic Messages for this Task:
> java.lang.NegativeArraySizeException: -1
>       at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:148)
>       at 
> org.apache.hadoop.hbase.mapreduce.TableSplit.readFields(TableSplit.java:133)
>       at 
> org.apache.hadoop.hive.hbase.HBaseSplit.readFields(HBaseSplit.java:53)
>       at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:150)
>       at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>       at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>       at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:396)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched: 
> Job 0: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
> {noformat}
> The tasktracker's stderr log for the failed attempt:
> {noformat}
> 13/05/07 18:43:20 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 13/05/07 18:43:20 INFO mapred.TaskRunner: Creating symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/distcache/1073284782955556390_-1298160740_2123690974/master0/tmp/hive-hadoop/hive_2013-05-07_18-41-30_290_832140779606816147/-mr-10003/fd22448b-e923-498c-bc00-2164ca68447d
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/HIVE_PLANfd22448b-e923-498c-bc00-2164ca68447d
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/javolution
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/javolution
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/org
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/org
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/hive-exec-log4j.properties
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/hive-exec-log4j.properties
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/META-INF
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/META-INF
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/.job.jar.crc
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/.job.jar.crc
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/job.jar
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/job.jar
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/javax
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/javax
> 13/05/07 18:43:20 INFO filecache.TrackerDistributedCacheManager: Creating 
> symlink: 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/jars/javaewah
>  <- 
> /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201305061042_0028/attempt_201305061042_0028_m_000000_0/work/javaewah
> 13/05/07 18:43:20 INFO impl.MetricsConfig: loaded properties from 
> hadoop-metrics2.properties
> 13/05/07 18:43:21 INFO impl.MetricsSourceAdapter: MBean for source 
> MetricsSystem,sub=Stats registered.
> 13/05/07 18:43:21 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 
> 10 second(s).
> 13/05/07 18:43:21 INFO impl.MetricsSystemImpl: MapTask metrics system started
> 13/05/07 18:43:21 INFO impl.MetricsSourceAdapter: MBean for source ugi 
> registered.
> 13/05/07 18:43:21 WARN impl.MetricsSystemImpl: Source name ugi already exists!
> 13/05/07 18:43:21 INFO impl.MetricsSourceAdapter: MBean for source jvm 
> registered.
> 13/05/07 18:43:21 INFO util.ProcessTree: setsid exited with exit code 0
> 13/05/07 18:43:21 INFO mapred.Task:  Using ResourceCalculatorPlugin : 
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2d7aece8
> 13/05/07 18:43:21 INFO mapred.TaskLogsTruncater: Initializing logs' truncater 
> with mapRetainSize=-1 and reduceRetainSize=-1
> 13/05/07 18:43:21 INFO nativeio.NativeIO: Initialized cache for UID to User 
> mapping with a cache timeout of 14400 seconds.
> 13/05/07 18:43:21 INFO nativeio.NativeIO: Got UserName hadoop for UID 1001 
> from the native implementation
> 13/05/07 18:43:21 WARN mapred.Child: Error running child
> java.lang.NegativeArraySizeException: -1
>       at org.apache.hadoop.hbase.util.Bytes.readByteArray(Bytes.java:148)
>       at 
> org.apache.hadoop.hbase.mapreduce.TableSplit.readFields(TableSplit.java:133)
>       at 
> org.apache.hadoop.hive.hbase.HBaseSplit.readFields(HBaseSplit.java:53)
>       at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:150)
>       at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>       at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>       at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:396)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 13/05/07 18:43:21 INFO mapred.Task: Runnning cleanup for the task
> 13/05/07 18:43:21 INFO impl.MetricsSystemImpl: Stopping MapTask metrics 
> system...
> 13/05/07 18:43:21 INFO impl.MetricsSystemImpl: Stopping metrics source 
> ugi(org.apache.hadoop.security.UgiInstrumentation)
> 13/05/07 18:43:21 INFO impl.MetricsSystemImpl: Stopping metrics source 
> jvm(org.apache.hadoop.metrics2.source.JvmMetricsSource)
> 13/05/07 18:43:21 INFO impl.MetricsSystemImpl: MapTask metrics system stopped.
> {noformat}


