Re: Hive/Hbase Integration issue

2015-05-13 Thread Talat Uyarer
Your ZooKeeper is managed by HBase. Could you check your
hbase.zookeeper.quorum setting? It should be the same as the ZooKeeper
quorum that HBase uses.
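
One quick way to check this is to compare the quorum each side would use. The sketch below is illustrative only: the helper names are made up, the XML strings are trimmed copies of the settings posted in this thread, and it assumes that a missing hbase.zookeeper.quorum falls back to localhost.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the settings posted in this thread (illustrative only).
HBASE_SITE = """
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///usr/local/hbase</value>
  </property>
</configuration>
"""

HIVE_SITE = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def quorum_hosts(props, default="localhost"):
    """Return the sorted ZooKeeper hosts; `default` is an assumed fallback."""
    raw = props.get("hbase.zookeeper.quorum", default)
    return sorted(host.strip() for host in raw.split(",") if host.strip())


hbase_quorum = quorum_hosts(read_properties(HBASE_SITE))
hive_quorum = quorum_hosts(read_properties(HIVE_SITE))
quorums_match = hbase_quorum == hive_quorum

print("HBase side:", hbase_quorum)
print("Hive side: ", hive_quorum)
print("match:", quorums_match)
```

With the settings from this thread the two sides disagree, which is the kind of mismatch worth ruling out first.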

Talat

2015-05-13 23:03 GMT+03:00 Ibrar Ahmed :
> Here is my hbase-site.xml
>
> 
>   
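> <configuration>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>file:///usr/local/hbase</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.property.dataDir</name>
>     <value>/usr/local/hbase/zookeeperdata</value>
>   </property>
> </configuration>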
>   
> 
>
>
> And hive-site.xml
>
> 
>  
> <configuration>
>   <property>
>     <name>hive.aux.jars.path</name>
>     <value>file:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:/usr/local/hive/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hive/lib/guava-11.0.2.jar,file:///usr/local/hbase/lib/hbase-client-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-common-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-protocol-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-server-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-shell-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-thrift-0.98.2-hadoop2.jar</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>zk1,zk2,zk3</value>
>   </property>
>   <property>
>     <name>hive.exec.scratchdir</name>
>     <value>/usr/local/hive/mydir</value>
>     <description>Scratch space for Hive jobs</description>
>   </property>
> </configuration>
> 
>
> 
>
>
>
> Hadoop classpath
>
>
> /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*::/usr/local/hbase/conf/hbase-site.xml:/contrib/capacity-scheduler/*.jar
>
>
> On Thu, May 14, 2015 at 12:28 AM, Ibrar Ahmed  wrote:
>
>> Hi,
>>
>> I am creating a table using hive and getting this error.
>>
>> [127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
>>   > STORED BY
>> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>>   > WITH SERDEPROPERTIES ("hbase.columns.mapping" =
>> ":key,cf1:val")
>>   > TBLPROPERTIES ("hbase.table.name" = "xyz");
>>
>>
>>
>> [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
>> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
>> MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
>> Can't get the locations
>> at
>> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
>> at
>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
>> at
>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
>> at
>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>> at
>> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
>> at
>> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
>> at
>> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
>> at
>> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
>> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
>> at
>> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
>> at
>> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
>> at
>> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
>> at
>> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
>> at
>> org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
>> at
>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
>> at
>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>> at com.sun.proxy.$Proxy7.createTable(Unknown Source)
>> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
>> at
>> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
>> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
>> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>> at
>> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
>> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
>> at org.apache.hado

Re: Hive/Hbase Integration issue

2015-05-13 Thread Ibrar Ahmed
Here is my hbase-site.xml


  
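<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///usr/local/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/hbase/zookeeperdata</value>
  </property>
</configuration>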
  



And hive-site.xml


 
<configuration>
  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:/usr/local/hive/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hive/lib/guava-11.0.2.jar,file:///usr/local/hbase/lib/hbase-client-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-common-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-protocol-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-server-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-shell-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-thrift-0.98.2-hadoop2.jar</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/usr/local/hive/mydir</value>
    <description>Scratch space for Hive jobs</description>
  </property>
</configuration>






Hadoop classpath


/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*::/usr/local/hbase/conf/hbase-site.xml:/contrib/capacity-scheduler/*.jar


On Thu, May 14, 2015 at 12:28 AM, Ibrar Ahmed  wrote:

> Hi,
>
> I am creating a table using hive and getting this error.
>
> [127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
>   > STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>   > WITH SERDEPROPERTIES ("hbase.columns.mapping" =
> ":key,cf1:val")
>   > TBLPROPERTIES ("hbase.table.name" = "xyz");
>
>
>
> [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
> MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Can't get the locations
> at
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> at
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
> at
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
> at
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
> at
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
> at
> org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
> at com.sun.proxy.$Proxy7.createTable(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
> at
> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
> at
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
> at
> org.apache.hadoop.hive.service.ThriftHi

Re: Hive/Hbase Integration issue

2015-05-13 Thread Talat Uyarer
This issue looks like it is caused by some missing settings. What did you do
to set up your Hive HBase integration? Can you give some information about your cluster?

BTW, in [1] someone had the same issue. Maybe it will help you.

[1] 
http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3cce01cda1.9221%25sanjay.subraman...@wizecommerce.com%3E

2015-05-13 22:28 GMT+03:00 Ibrar Ahmed :
> Hi,
>
> I am creating a table using hive and getting this error.
>
> [127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
>   > STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>   > WITH SERDEPROPERTIES ("hbase.columns.mapping" =
> ":key,cf1:val")
>   > TBLPROPERTIES ("hbase.table.name" = "xyz");
>
>
>
> [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
> MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Can't get the locations
> at
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> at
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
> at
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
> at
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
> at
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
> at
> org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
> at com.sun.proxy.$Proxy7.createTable(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
> at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
> at
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
> at
> org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
> at
> org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> )
>
>
> Any help or clue would be appreciated.



-- 
Talat UYARER
Website: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304


Hive/Hbase Integration issue

2015-05-13 Thread Ibrar Ahmed
Hi,

I am creating a table using hive and getting this error.

[127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
  > STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  > WITH SERDEPROPERTIES ("hbase.columns.mapping" =
":key,cf1:val")
  > TBLPROPERTIES ("hbase.table.name" = "xyz");



[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException:
Can't get the locations
at
org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at
org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
at
org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
at
org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
at
org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
at
org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
at
org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
at
org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
at
org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
)


Any help or clue would be appreciated.


Re: hive hbase integration

2014-04-17 Thread Ted Yu
Please read the first sentence under Abstract of
http://hbase.apache.org/book.html

Refer to the following:
http://hbase.apache.org/book.html#arch.overview.nosql

w.r.t. storage handler, hive user mailing list would be a better place to
ask.

Cheers


On Thu, Apr 17, 2014 at 9:46 AM, Shushant Arora
wrote:

> I would like to know why Hive HBase integration is required.
> Is it because HBase cannot provide all the functionality of SQL, and if
> so, why?
> What is a storage handler, and what are the best practices for Hive HBase integration?
>


hive hbase integration

2014-04-17 Thread Shushant Arora
I would like to know why Hive HBase integration is required.
Is it because HBase cannot provide all the functionality of SQL, and if
so, why?
What is a storage handler, and what are the best practices for Hive HBase integration?


Re: Hbase connection closed when query multiple complicated hql with hive+hbase integration

2012-11-06 Thread Cheng Su
The exceptions seem to be a separate problem.
They all happened on one node.
After the task attempts failed on that node,
they were retried on other nodes without any exceptions.
So the exceptions may have nothing to do with the performance issue.

On Wed, Nov 7, 2012 at 11:07 AM, Cheng Su  wrote:
> Hi, all.
>
> I have a Hive + HBase integration cluster and have run into a performance issue.
> I executed several complicated HQL queries, but only the first one is actually running.
> The rest are shown as "running" on the JobTracker web UI, but their task
> attempts failed with the exceptions below:
>
> java.io.IOException: java.io.IOException:
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5117a20
> closed
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Unknown Source)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.IOException:
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5117a20
> closed
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:794)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:782)
> at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:249)
> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:213)
> at 
> org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:92)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:240)
> ... 9 more
>
> So the rest are actually queuing.
>
> I guess it might be because the number of HBase handlers, which
> handle access requests from Hive, is too small,
> so that when a big job occupies all the handlers, the rest have to wait
> until the handlers are released.
>
> Is my assumption right? And which settings should I tune?
>
> Thanks.
>
> --
>
> Regards,
> Cheng Su



-- 

Regards,
Cheng Su


Re: Performance: hive+hbase integration query against the row_key

2012-09-12 Thread Jean-Daniel Cryans
On Tue, Sep 11, 2012 at 6:56 AM, Shengjie Min  wrote:
> 1. If you run a Hive query against the row key, like "select * from
> hive_hbase_test where key='blabla'", it would use the HBase row key
> index, which gives a very quick, nearly real-time response, just like HBase
> does.
> From my test, query 1 doesn't seem fast at all; it still takes ages:
> select * from hive_hbase_test where key='blabla'   36 secs
> vs
> get 'test', 'blabla'  less than 1 sec
> which is still a huge difference.
>
> Has anybody tried this before? Is there any way I can do some sort of query plan
> analysis on a Hive query? Or am I not mapping the Hive table to the HBase
> table correctly?

It doesn't work like that. Every Hive query is translated into a MR
job, so you're still doing a full scan to find that one row key.
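
To illustrate the difference in work, here is a toy model. It is not the HBase or Hive API: the class and method names are invented, and it ignores MapReduce job startup cost entirely. It only contrasts a point lookup on sorted keys with a scan that filters every row.

```python
class SortedRowStore:
    """Toy table kept sorted by row key (a stand-in, not the HBase API)."""

    def __init__(self, rows):
        self._keys = sorted(rows)   # sorted row keys
        self._rows = dict(rows)     # row key -> value

    def get(self, key):
        """Point lookup by binary search; returns (value, keys_examined)."""
        lo, hi, examined = 0, len(self._keys), 0
        while lo < hi:
            mid = (lo + hi) // 2
            examined += 1
            if self._keys[mid] < key:
                lo = mid + 1
            else:
                hi = mid
        found = lo < len(self._keys) and self._keys[lo] == key
        return (self._rows[key] if found else None), examined

    def scan_where_key_equals(self, key):
        """Full scan with a filter; returns (matches, rows_examined)."""
        matches, examined = [], 0
        for row_key in self._keys:
            examined += 1
            if row_key == key:
                matches.append(self._rows[row_key])
        return matches, examined


store = SortedRowStore({"row%05d" % i: "value-%d" % i for i in range(1024)})
value, get_cost = store.get("row00512")
matches, scan_cost = store.scan_where_key_equals("row00512")
print("get: ", value, "keys examined:", get_cost)
print("scan:", matches, "rows examined:", scan_cost)
```

Both calls return the same row, but the scan examines every row in the table while the lookup examines only a handful of keys.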

J-D


Performance: hive+hbase integration query against the row_key

2012-09-11 Thread Shengjie Min
Hi,

I am trying to get hive working on top of my hbase table following the
guide below:
https://cwiki.apache.org/Hive/hbaseintegration.html

CREATE EXTERNAL TABLE hive_hbase_test (key string, a string, b string, c string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:a,cf:b,cf:c")
TBLPROPERTIES ("hbase.table.name" = "test");

This Hive table definition makes my mapping look roughly like this:

hive_hbase_test (Hive)   test (HBase)
key                      row key
column a                 cf:a
column b                 cf:b
column c                 cf:c
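
The mapping above can be sketched in a few lines. This is only an illustration of how the entries in "hbase.columns.mapping" line up positionally with the Hive columns; the function name is hypothetical and this is not the storage handler's own parsing code.

```python
def parse_columns_mapping(hive_columns, mapping):
    """Pair each Hive column with its HBase target, by position.

    Returns a dict: Hive column -> "<row key>" or (family, qualifier).
    """
    entries = [entry.strip() for entry in mapping.split(",")]
    if len(entries) != len(hive_columns):
        raise ValueError("expected one mapping entry per Hive column")
    result = {}
    for column, entry in zip(hive_columns, entries):
        if entry == ":key":
            result[column] = "<row key>"
        else:
            family, _, qualifier = entry.partition(":")
            result[column] = (family, qualifier)
    return result


mapping = parse_columns_mapping(["key", "a", "b", "c"], ":key,cf:a,cf:b,cf:c")
for hive_column, target in mapping.items():
    print(hive_column, "->", target)
```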

From my understanding of how HBaseStorageHandler works, it is supposed to
take advantage of the HBase row key index as much as possible. So I would
expect:

1. If you run a Hive query against the row key, like "select * from
hive_hbase_test where key='blabla'", it would use the HBase row key
index, which gives a very quick, nearly real-time response, just like HBase
does.

2. Of course, if you run a Hive query against a column, like "select * from
hive_hbase_test where a='blabla'", it probably uses MapReduce, because
nothing on the HBase side can be used to speed it up.

From my test, query 1 doesn't seem fast at all; it still takes ages:
select * from hive_hbase_test where key='blabla'   36 secs
vs
get 'test', 'blabla'  less than 1 sec
which is still a huge difference.

Has anybody tried this before? Is there any way I can do some sort of query plan
analysis on a Hive query? Or am I not mapping the Hive table to the HBase
table correctly?


RE: Hive HBase integration scan failing

2010-12-10 Thread Jonathan Gray
Hey,

Need some more info.

Can you paste logs from the MR tasks that fail?  What's going on in the cluster 
while the MR job is running (cpu, io-wait, memory, etc)?

And what is the setup of your cluster... how many nodes, specs of nodes (cores, 
memory, RS heap), and then how many concurrent map tasks you have per node.

JG

> -Original Message-
> From: vlisovsky [mailto:vlisov...@gmail.com]
> Sent: Thursday, December 09, 2010 10:49 PM
> To: user@hbase.apache.org
> Subject: Hive HBase integration scan failing
> 
> Hi Guys,
> 
> > I wonder if anybody could shed some light on how to reduce the load on an
> > HBase cluster when running a full scan.
> > I need to dump everything I have in HBase into a Hive table.
> > The HBase data size is around 500 GB.
> > The job creates 9000 mappers, and after about 1000 maps things go south
> > every time.
> > If I run the insert below, it runs for about 30 minutes and then starts
> > bringing down the HBase cluster, after which the region servers need to be
> > restarted.
> > Is there a way to throttle it somehow, or is there
> > any other method of getting structured data out?
> > Any help is appreciated,
> > Thanks,
> > -Vitaly
> >
> > create external table hbase_linked_table (
> > mykeystring,
> > infomap,
> > )
> > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:")
> > TBLPROPERTIES ("hbase.table.name" = "hbase_table2");
> >
> > set hive.exec.compress.output=true;
> > set io.seqfile.compression.type=BLOCK;
> > set mapred.output.compression.type=BLOCK;
> > set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
> >
> > set mapred.reduce.tasks=40;
> > set mapred.map.tasks=25;
> >
> > INSERT overwrite table tmp_hive_destination select * from
> > hbase_linked_table;
> >


Hive HBase integration scan failing

2010-12-09 Thread vlisovsky
Hi Guys,

> I wonder if anybody could shed some light on how to reduce the load on an HBase
> cluster when running a full scan.
> I need to dump everything I have in HBase into a Hive table. The
> HBase data size is around 500 GB.
> The job creates 9000 mappers, and after about 1000 maps things go south every
> time.
> If I run the insert below, it runs for about 30 minutes and then starts bringing
> down the HBase cluster, after which the region servers need to be restarted.
> Is there a way to throttle it somehow, or is there
> any other method of getting structured data out?
> Any help is appreciated,
> Thanks,
> -Vitaly
>
> create external table hbase_linked_table (
> mykeystring,
> infomap,
> )
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH
> SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:")
> TBLPROPERTIES ("hbase.table.name" = "hbase_table2");
>
> set hive.exec.compress.output=true;
> set io.seqfile.compression.type=BLOCK;
> set mapred.output.compression.type=BLOCK;
> set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>
> set mapred.reduce.tasks=40;
> set mapred.map.tasks=25;
>
> INSERT overwrite table tmp_hive_destination
> select * from hbase_linked_table;
>