Re: Hive/Hbase Integration issue
Your ZooKeeper is managed by HBase. Could you check your hbase.zookeeper.quorum setting? It should match the ZooKeeper quorum that HBase is actually using.

Talat

2015-05-13 23:03 GMT+03:00 Ibrar Ahmed :
> [quoted hbase-site.xml, hive-site.xml, and Hadoop classpath snipped;
> identical to the message below]
>
> On Thu, May 14, 2015 at 12:28 AM, Ibrar Ahmed wrote:
>> Hi,
>>
>> I am creating a table using hive and getting this error.
>> [CREATE TABLE statement and stack trace snipped; identical to the
>> original message below]
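[Editor's sketch of Talat's suggestion, using the host names quoted in this thread: hive-site.xml must name the same ZooKeeper ensemble that HBase manages, otherwise the HBase client inside Hive cannot locate hbase:meta and fails with "Can't get the locations".]

```xml
<!-- hive-site.xml: point Hive's HBase handler at the ZooKeeper ensemble
     HBase actually uses. zk1,zk2,zk3 are the hosts quoted in this thread;
     verify they resolve and match HBase's own quorum setting. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1,zk2,zk3</value>
</property>
```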
Re: Hive/Hbase Integration issue
Here is my hbase-site.xml:

  <property>
    <name>hbase.rootdir</name>
    <value>file:///usr/local/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/hbase/zookeeperdata</value>
  </property>

And hive-site.xml:

  <property>
    <name>hive.aux.jars.path</name>
    <value>file:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:/usr/local/hive/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hive/lib/guava-11.0.2.jar,file:///usr/local/hbase/lib/hbase-client-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-common-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-protocol-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-server-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-shell-0.98.2-hadoop2.jar,file:///usr/local/hbase/lib/hbase-thrift-0.98.2-hadoop2.jar</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/usr/local/hive/mydir</value>
    <description>Scratch space for Hive jobs</description>
  </property>

Hadoop classpath:

/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*::/usr/local/hbase/conf/hbase-site.xml:/contrib/capacity-scheduler/*.jar

On Thu, May 14, 2015 at 12:28 AM, Ibrar Ahmed wrote:
> Hi,
>
> I am creating a table using hive and getting this error.
>
> [127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
>             > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>             > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
>             > TBLPROPERTIES ("hbase.table.name" = "xyz");
>
> [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
> [remainder of stack trace snipped; identical to the original message
> below]
Re: Hive/Hbase Integration issue
This issue looks like some missing settings. What did you do for your Hive/HBase integration? Can you give some information about your cluster? BTW, someone had the same issue in [1]; maybe it will help you.

[1] http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3cce01cda1.9221%25sanjay.subraman...@wizecommerce.com%3E

2015-05-13 22:28 GMT+03:00 Ibrar Ahmed :
> Hi,
>
> I am creating a table using hive and getting this error.
>
> [CREATE TABLE statement and stack trace snipped; identical to the
> original message below]
>
> Any help/clue would be appreciated.

--
Talat UYARER
Websitesi: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
Hive/Hbase Integration issue
Hi,

I am creating a table using hive and getting this error.

[127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string)
              > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
              > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
              > TBLPROPERTIES ("hbase.table.name" = "xyz");

[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:147)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:56)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:823)
    at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:601)
    at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:365)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:281)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:291)
    at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
    at com.sun.proxy.$Proxy7.createTable(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
)

Any help/clue would be appreciated.
Re: hive hbase integration
Please read the first sentence under "Abstract" of http://hbase.apache.org/book.html

Refer to the following: http://hbase.apache.org/book.html#arch.overview.nosql

W.r.t. the storage handler, the hive user mailing list would be a better place to ask.

Cheers

On Thu, Apr 17, 2014 at 9:46 AM, Shushant Arora wrote:
> I want to know why hive hbase integration is required.
> Is it because hbase cannot provide all the functionality of sql, and if
> so, why?
> What is a storage handler, and what are the best practices for hive
> hbase integration?
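[Editor's note: a storage handler is the class named in the STORED BY clause. A minimal HBase-backed Hive table, following the same pattern that appears in other messages of this archive (table and column names are illustrative), looks like:]

```sql
-- Hive DDL: HBaseStorageHandler maps Hive columns onto HBase cells.
-- ":key" binds the Hive column "key" to the HBase row key;
-- "cf1:val" binds "value" to column family cf1, qualifier val.
CREATE TABLE hbase_table_1 (key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
```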
hive hbase integration
I want to know why hive hbase integration is required. Is it because hbase cannot provide all the functionality of sql, and if so, why? What is a storage handler, and what are the best practices for hive hbase integration?
Re: Hbase connection closed when query multiple complicated hql with hive+hbase integration
The exceptions seem to be a separate problem. They all happened on one node, and after the task attempts failed on that node they were retried on other nodes with no exceptions. So the exceptions may have nothing to do with the performance issue.

On Wed, Nov 7, 2012 at 11:07 AM, Cheng Su wrote:
> Hi, all.
>
> I have a hive + hbase integration cluster, and I've hit a performance
> issue. I executed several complicated hqls, but only the first one
> actually runs. The rest are shown as "running" on the job tracker web
> UI, but their task attempts failed with the exceptions below:
>
> java.io.IOException: java.io.IOException:
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5117a20 closed
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.IOException:
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5117a20 closed
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:794)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:782)
>     at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:249)
>     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:213)
>     at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:92)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:240)
>     ... 9 more
>
> So actually the rest are queuing.
>
> I guess it might be because the number of hbase handlers, which serve
> access requests from hive, is too small. So when a big job occupies all
> the handlers, the rest have to wait until the handlers are released.
>
> Is my assumption right? And what settings should I tune?
>
> Thanks.
>
> --
> Regards,
> Cheng Su

--
Regards,
Cheng Su
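[Editor's sketch: if the handler hypothesis above is right, the relevant knob would be hbase.regionserver.handler.count in hbase-site.xml. The property name is standard HBase; the value shown is an illustrative assumption, not something established in this thread.]

```xml
<!-- hbase-site.xml: number of RPC handler threads per region server.
     100 is illustrative; size it to your concurrent-client load. -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>100</value>
</property>
```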
Re: Performance: hive+hbase integration query against the row_key
On Tue, Sep 11, 2012 at 6:56 AM, Shengjie Min wrote:
> 1. if you do a hive query against the row key like "select * from
> hive_hbase_test where key='blabla'", this would utilize the hbase
> row_key index, which gives you a very quick, nearly real-time response,
> just like hbase does.
>
> From my test, query 1 doesn't seem fast at all, still taking ages:
>
>   select * from hive_hbase_test where key='blabla'   -- 36 secs
> vs
>   get 'test', 'blabla'                               -- less than 1 sec
>
> still shows a huge difference.
>
> Has anybody tried this before? Is there any way I can do some sort of
> query plan analysis against a hive query? Or am I not mapping the hive
> table against the hbase table correctly?

It doesn't work like that. Every Hive query is translated into an MR job, so you're still doing a full scan to find that one row key.

J-D
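[Editor's note on the query-plan question: Hive's EXPLAIN statement answers it directly, and makes the full scan J-D describes visible. The table and predicate are taken from the thread; the plan text itself varies by Hive version, so none is reproduced here.]

```sql
-- Show the plan for the row-key lookup; with this era's storage handler
-- the predicate shows up as a filter over a TableScan, not a point get.
EXPLAIN
SELECT * FROM hive_hbase_test WHERE key = 'blabla';

-- EXPLAIN EXTENDED adds input-format and serde details.
EXPLAIN EXTENDED
SELECT * FROM hive_hbase_test WHERE key = 'blabla';
```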
Performance: hive+hbase integration query against the row_key
Hi,

I am trying to get hive working on top of my hbase table following the guide below:
https://cwiki.apache.org/Hive/hbaseintegration.html

CREATE EXTERNAL TABLE hive_hbase_test (key string, a string, b string, c string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping"=":key,cf:a,cf:b,cf:c")
TBLPROPERTIES ("hbase.table.name"="test");

This hive table creation makes my mapping roughly look like this:

hive_hbase_test  vs  test
Hive key        -  hbase row_key
Hive column a   -  hbase cf:a
Hive column b   -  hbase cf:b
Hive column c   -  hbase cf:c

From my understanding of how HBaseStorageHandler works, it's supposed to take advantage of the hbase row_key index as much as possible. So I would expect:

1. if you do a hive query against the row key like "select * from hive_hbase_test where key='blabla'", this would utilize the hbase row_key index, which gives you a very quick, nearly real-time response, just like hbase does.

2. of course, if you do a hive query against a column like "select * from hive_hbase_test where a='blabla'", in this case it queries against a specific column, so it probably uses mapred, because there is nothing from the HBase side that can be utilized.

From my test, query 1 doesn't seem fast at all, still taking ages:

  select * from hive_hbase_test where key='blabla'   -- 36 secs
vs
  get 'test', 'blabla'                               -- less than 1 sec

still shows a huge difference.

Has anybody tried this before? Is there any way I can do some sort of query plan analysis against a hive query? Or am I not mapping the hive table against the hbase table correctly?
RE: Hive HBase integration scan failing
Hey,

Need some more info. Can you paste logs from the MR tasks that fail? What's going on in the cluster while the MR job is running (cpu, io-wait, memory, etc)? And what is the setup of your cluster... how many nodes, specs of nodes (cores, memory, RS heap), and then how many concurrent map tasks you have per node.

JG

> -----Original Message-----
> From: vlisovsky [mailto:vlisov...@gmail.com]
> Sent: Thursday, December 09, 2010 10:49 PM
> To: user@hbase.apache.org
> Subject: Hive HBase integration scan failing
>
> Hi Guys,
>
> Wonder if anybody could shed some light on how to reduce the load on the
> HBase cluster when running a full scan.
> The need is to dump everything I have in HBase into a Hive table.
> The HBase data size is around 500g.
> The job creates 9000 mappers; after about 1000 maps things go south
> every time..
> If I run the insert below, it runs for about 30 minutes, then starts
> bringing down the HBase cluster, after which region servers need to be
> restarted..
> Wonder if there is a way to throttle it somehow, or otherwise if there
> is any other method of getting structured data out?
> Any help is appreciated,
> Thanks,
> -Vitaly
>
> create external table hbase_linked_table (
>   mykey string,
>   info map<string,string>
> )
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:")
> TBLPROPERTIES ("hbase.table.name" = "hbase_table2");
>
> set hive.exec.compress.output=true;
> set io.seqfile.compression.type=BLOCK;
> set mapred.output.compression.type=BLOCK;
> set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>
> set mapred.reduce.tasks=40;
> set mapred.map.tasks=25;
>
> INSERT overwrite table tmp_hive_destination select * from
> hbase_linked_table;
Hive HBase integration scan failing
Hi Guys,

Wonder if anybody could shed some light on how to reduce the load on the HBase cluster when running a full scan.
The need is to dump everything I have in HBase into a Hive table. The HBase data size is around 500g.
The job creates 9000 mappers; after about 1000 maps things go south every time..
If I run the insert below, it runs for about 30 minutes, then starts bringing down the HBase cluster, after which region servers need to be restarted..
Wonder if there is a way to throttle it somehow, or otherwise if there is any other method of getting structured data out?
Any help is appreciated,
Thanks,
-Vitaly

create external table hbase_linked_table (
  mykey string,
  info map<string,string>
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:")
TBLPROPERTIES ("hbase.table.name" = "hbase_table2");

set hive.exec.compress.output=true;
set io.seqfile.compression.type=BLOCK;
set mapred.output.compression.type=BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

set mapred.reduce.tasks=40;
set mapred.map.tasks=25;

INSERT overwrite table tmp_hive_destination
select * from hbase_linked_table;
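[Editor's aside, not an answer given in the thread: on Hadoop of this era, the map-slot count per tasktracker bounds how many scan tasks hit the region servers at once, so lowering it is one crude way to throttle a full-table dump. The property name is standard 2010-era MapReduce; the value 2 is purely illustrative.]

```xml
<!-- mapred-site.xml on each tasktracker: cap concurrent map tasks per
     node; fewer simultaneous mappers means fewer parallel HBase scans. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
```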