Hive - Tez error with big join - Container expired.

2015-06-17 Thread Daniel Klinger
Hi all,

 

I have a pretty big Hive query. I'm joining three Hive tables which have
thousands of rows each, and I'm grouping this join by several columns. In the
Hive shell this query only reaches about 80%. After about 1400 seconds it is
cancelled with the following error:

 

Status: Failed

Vertex failed, vertexName=Map 2, vertexId=vertex_1434357133795_0008_1_01,
diagnostics=[Task failed, taskId=task_1434357133795_0008_1_01_33,
diagnostics=[TaskAttempt 0 failed,
info=[Container container_1434357133795_0008_01_39 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 1 failed,
info=[Container container_1434357133795_0008_01_55 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 2 failed,
info=[Container container_1434357133795_0008_01_72 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 3 failed,
info=[Container container_1434357133795_0008_01_000101 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex
vertex_1434357133795_0008_1_01 [Map 2] killed/failed due to:null]

DAG failed due to vertex failure. failedVertices:1 killedVertices:0

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask

 

My YARN ResourceManager is at 100% during the whole execution (using all of
the 300 GB of memory). I tried to extend the lifetime of my containers with
the following setting in yarn-site.xml, but without success:

 

yarn.resourcemanager.rm.container-allocation.expiry-interval-ms = 120
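
For reference, this is the form I used in yarn-site.xml (a sketch; note that
the value is in milliseconds, and I believe the YARN default is 600000, i.e.
10 minutes, so 120 means 120 ms):

<property>
  <name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
  <!-- expiry interval in milliseconds -->
  <value>120</value>
</property>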

 

After this change my query stays at 0% for thousands of seconds. The query
itself works (tested with less data). How can I solve this problem?

 

Thanks for your help.

 

Greetz

DK



Disabling hdfs and local filesystem cache no longer necessary?

2015-06-17 Thread Ben Tse
The wiki page
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
still advises disabling the HDFS and local filesystem caches. This should
no longer be needed, given that HIVE-4501 has been resolved since 0.13.0.

Is this correct?

referenced configuration:
fs.hdfs.impl.disable.cache
fs.file.impl.disable.cache
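
For context, the wiki-advised form in hive-site.xml, as I read the page:

<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
<property>
  <name>fs.file.impl.disable.cache</name>
  <value>true</value>
</property>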

-Ben


Re: delta file compact take no effect

2015-06-17 Thread Alan Gates
See 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration


Compaction is initiated by the thrift metastore server.  You need to set
the values labeled metastore on the above page in the hive-site.xml of
your metastore server.
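
A minimal sketch of those metastore-side entries, assuming the defaults
described on that page (the worker thread count below is just an example):

<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <!-- must be greater than 0 on at least one metastore instance -->
  <value>1</value>
</property>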


Alan.


r7raul1...@163.com
June 16, 2015 at 23:33
My config is on my client. What is the metastore config?




r7raul1...@163.com


Re: Re: delta file compact take no effect

2015-06-17 Thread r7raul1...@163.com
It works now! But I see some ERRORs and deadlocks:

2015-06-18 09:06:06,509 ERROR [test.oracle-22]: txn.CompactionTxnHandler 
(CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next 
element for compaction, ERROR: could not serialize access due to concurrent 
update 
2015-06-18 09:06:06,509 ERROR [test.oracle-27]: txn.CompactionTxnHandler 
(CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next 
element for compaction, ERROR: could not serialize access due to concurrent 
update 
2015-06-18 09:06:06,509 ERROR [test.oracle-28]: txn.CompactionTxnHandler 
(CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next 
element for compaction, ERROR: could not serialize access due to concurrent 
update 
2015-06-18 09:06:06,509 WARN [test.oracle-22]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(916)) - Deadlock detected in findNextToCompact, 
trying again. 
2015-06-18 09:06:06,509 WARN [test.oracle-27]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(916)) - Deadlock detected in findNextToCompact, 
trying again. 
2015-06-18 09:06:06,509 WARN [test.oracle-28]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(916)) - Deadlock detected in findNextToCompact, 
trying again. 
2015-06-18 09:06:06,544 INFO [test.oracle-26]: compactor.Worker 
(Worker.java:run(140)) - Starting MAJOR compaction for default.u_data_txn 
2015-06-18 09:06:06,874 INFO [test.oracle-26]: impl.TimelineClientImpl 
(TimelineClientImpl.java:serviceInit(123)) - Timeline service address: 
http://192.168.117.117:8188/ws/v1/timeline/ 
2015-06-18 09:06:06,960 INFO [test.oracle-26]: client.RMProxy 
(RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at 
localhost/127.0.0.1:8032 
2015-06-18 09:06:07,175 INFO [test.oracle-26]: impl.TimelineClientImpl 
(TimelineClientImpl.java:serviceInit(123)) - Timeline service address: 
http://192.168.117.117:8188/ws/v1/timeline/ 
2015-06-18 09:06:07,176 INFO [test.oracle-26]: client.RMProxy 
(RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at 
localhost/127.0.0.1:8032 
2015-06-18 09:06:07,298 WARN [test.oracle-26]: mapreduce.JobSubmitter 
(JobSubmitter.java:copyAndConfigureFiles(150)) - Hadoop command-line option 
parsing not performed. Implement the Tool interface and execute your 
application with ToolRunner to remedy this. 
2015-06-18 09:06:07,777 INFO [test.oracle-26]: mapreduce.JobSubmitter 
(JobSubmitter.java:submitJobInternal(401)) - number of splits:2 
2015-06-18 09:06:07,876 INFO [test.oracle-26]: mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(484)) - Submitting tokens for job: 
job_1433398549746_0035 
2015-06-18 09:06:08,021 INFO [test.oracle-26]: impl.YarnClientImpl 
(YarnClientImpl.java:submitApplication(236)) - Submitted application 
application_1433398549746_0035 
2015-06-18 09:06:08,052 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:submit(1299)) - The url to track the job: 
http://localhost:8088/proxy/application_1433398549746_0035/ 
2015-06-18 09:06:08,052 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1344)) - Running job: job_1433398549746_0035 
2015-06-18 09:06:18,174 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1365)) - Job job_1433398549746_0035 running in 
uber mode : false 
2015-06-18 09:06:18,176 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1372)) - map 0% reduce 0% 
2015-06-18 09:06:23,232 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1372)) - map 50% reduce 0% 
2015-06-18 09:06:28,262 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1372)) - map 100% reduce 0% 
2015-06-18 09:06:28,273 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1383)) - Job job_1433398549746_0035 completed 
successfully 
2015-06-18 09:06:28,327 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1390)) - Counters: 30 



r7raul1...@163.com
 
From: r7raul1...@163.com
Date: 2015-06-18 08:37
To: user
Subject: Re: Re: delta file compact take no effect
Thank you! I will try.



r7raul1...@163.com
 
From: Alan Gates
Date: 2015-06-18 08:33
To: user
Subject: Re: delta file compact take no effect
See 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration

Compaction is initiated by the thrift metastore server.  You need to set the 
values labeled metastore in the above page in the hive-site.xml for your 
metastore server.

Alan.

r7raul1...@163.com
June 16, 2015 at 23:33
My config is on my client. What is the metastore config?





r7raul1...@163.com


Re: Running Hive Unit Tests from IntelliJ and Datanucleus

2015-06-17 Thread Ruoxi Sun
Hi Rajat, I used an alternative to the Datanucleus plugin in IntelliJ.
Try creating a run configuration as the attached picture shows, and make
sure you have datanucleus in your module's dependencies.

Hope it can help.
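
In case the picture doesn't come through on the list, the rough idea (a
sketch, not an exact recipe): let Maven run the DataNucleus enhancer on the
metastore model classes, then make sure IntelliJ uses the enhanced output
instead of its own unenhanced compile output:

  # build the metastore module so the model classes get enhanced by the
  # datanucleus plugin bound in its pom
  cd metastore
  mvn -DskipTests clean install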


*孙若曦*

2015-06-18 3:32 GMT+08:00 Rajat Jain rajat...@gmail.com:

 Hi,

 I want to run Hive unit tests from IntelliJ but am unable to do so due to
 datanucleus issues. I tried a lot of options but always seem to be getting
 the same error.

 1. Datanucleus plugin:

 I installed Datanucleus plugin on IntelliJ, enabled the enhancer but got
 error of the type:

 Caused by: org.datanucleus.exceptions.ClassNotPersistableException: The
 class org.apache.hadoop.hive.metastore.model.MVersionTable is not
 persistable. This means that it either hasn't been enhanced, or that the
 enhanced version of the file is not in the CLASSPATH (or is hidden by an
 unenhanced version), or the Meta-Data/annotations for the class are not
 found.
 at
 org.datanucleus.ExecutionContextImpl.assertClassPersistable(ExecutionContextImpl.java:5698)
 at
 org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2123)
 at
 org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065)
 at
 org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913)
 at
 org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
 at
 org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)
 ... 66 more

 When I checked the Datanucleus settings, I noticed that only the MDatabase
 class was registered there and no other classes (like MVersionTable) were
 registered. Is that the issue? If not, is there any way this can be resolved?

 2. I tried pushing the hive-metastore dependency up above Module Source, as
 suggested here:
 http://qnalist.com/questions/5105293/running-tests-in-intellij. That didn't
 work either. I tried this option with the datanucleus plugin both enabled
 and disabled.

 Let me know if someone has any ideas. I have attached sample screenshots
 for reference.

 Thanks,
 Rajat





Maximum number of columns

2015-06-17 Thread Shimpei Kodama
Hi guys,

Let me ask a quick question.

Is there a maximum number of columns for a Hive table?

Thanks,
Shimpei


Query in column name

2015-06-17 Thread Renuka Be
Hi Folks,

I tried a column name containing a space, like (Order Details), and it
causes an error. Is there any way to specify a column name with a space?
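
I have seen backticks suggested; I believe this needs Hive 0.13+ with
hive.support.quoted.identifiers set to column. A sketch with a hypothetical
table name:

  SET hive.support.quoted.identifiers = column;
  CREATE TABLE orders_test (`Order Details` STRING);
  SELECT `Order Details` FROM orders_test;

Would this be the right approach?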

Regards,
Renuka N


Re: Merging small files in partitions

2015-06-17 Thread Mohammad Islam
Hi Edward,
Can we do the same/similar thing for Parquet files? Any pointer?
Regards,
Mohammad


 On Tuesday, June 16, 2015 2:35 PM, Edward Capriolo edlinuxg...@gmail.com 
wrote:
   

 https://github.com/edwardcapriolo/filecrush

On Tue, Jun 16, 2015 at 5:05 PM, Chagarlamudi, Prasanth 
prasanth.chagarlam...@epsilon.com wrote:

Hello,
I am looking for an optimized way to merge small files in Hive partitions
into one big file. I came across Alter Table/Partition Concatenate
(https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate).
The doc says this only works for RCFiles. I wish there were something
similar for the TEXTFILE format. Any suggestions? Thanks in advance.
Prasanth
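
For reference, the CONCATENATE syntax from that page, plus the merge-settings
workaround I have seen suggested for TEXTFILE tables (table, partition and
column names below are hypothetical):

  -- RCFile partitions (per the doc):
  ALTER TABLE page_views PARTITION (dt = '2015-06-16') CONCATENATE;

  -- TEXTFILE workaround: rewrite the partition and let Hive merge the outputs
  SET hive.merge.mapfiles = true;
  SET hive.merge.mapredfiles = true;
  SET hive.merge.smallfiles.avgsize = 134217728;  -- 128 MB
  INSERT OVERWRITE TABLE page_views PARTITION (dt = '2015-06-16')
  SELECT page, views FROM page_views WHERE dt = '2015-06-16';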





  

Query HBase table based on timestamp

2015-06-17 Thread Buntu Dev
My use case is to query time series data ingested into HBase table
containing a web page name or url as row key and related properties as
column qualifiers. The properties for the web page are dynamic, i.e. the
column qualifiers are dynamic for a given timestamp.

I would like to create a Hive managed HBase table to query this time series
data for properties of the web page at a given timestamp. Can anyone
clarify:

* How to create a Hive table in this case, and what to provide as the
hbase.columns.mapping property, given that the columns themselves are
dynamic? (See the sketch after these questions.)

* How to modify the HBase or Hive table schema to be able to query for a
given timestamp, since that doesn't seem to be supported based on the HBase
integration wiki: "there is currently no way to access the HBase timestamp
attribute, and queries always access data with the latest timestamp."
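
For the first question, I believe the integration wiki documents mapping a
whole column family to a Hive MAP, which would cover dynamic qualifiers; a
sketch with hypothetical table and family names:

  CREATE EXTERNAL TABLE pages_hbase (rowkey STRING, props MAP<STRING, STRING>)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:')
  TBLPROPERTIES ('hbase.table.name' = 'pages');

Every qualifier in family f would then appear as a key in the props map.
The timestamp question remains open.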


Thanks!


Re: Alter table statement for create table like-tables

2015-06-17 Thread Julian Keppel
I tried it with an external table in CSV format now. This worked properly,
so it really seems to have to do with Avro as the data format.

The Jira ticket https://issues.apache.org/jira/browse/HIVE-7446 was closed
and the corresponding patch was added to release 0.14. Should I open a new
ticket?
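
In the meantime I'm considering a workaround, assuming the AvroSerDe derives
its column list from the Avro schema rather than from the metastore columns:
update the schema itself instead of using ADD COLUMNS, e.g. with a
hypothetical schema location:

  ALTER TABLE tableB SET TBLPROPERTIES (
    'avro.schema.url' = 'location_of_schema_with_new_column'
  );

where the referenced schema is a copy of the original with the new field
(and a default value) appended.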

2015-06-16 20:24 GMT+02:00 Jason Dere jd...@hortonworks.com:

  Probably has to do with the fact that it is an Avro table.  I don't have
 any experience using Avro, but maybe take a look at
 https://issues.apache.org/jira/browse/HIVE-7446 for some of the issues
 described there, or maybe look at the test that was added for that Jira.


  On Jun 16, 2015, at 2:42 AM, Julian Keppel juliankeppel1...@gmail.com
 wrote:

  *Push* Does no one have an idea, or has anyone hit similar issues?

 2015-06-09 15:46 GMT+02:00 Julian Keppel juliankeppel1...@gmail.com:

 I use Hive Version 1.1.0 in Cloudera CDH 5.4.0.

  I have created an external table:

  CREATE EXTERNAL TABLE *tableA*
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
 STORED AS
 INPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
 OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 LOCATION 'location'
 TBLPROPERTIES ('avro.schema.url'='schema_location');

  Now I wanted to create a managed table with exactly the same columns
 except of one additional column (a generated hash key). The only idea I had
 was to create the table with: CREATE TABLE *tableB* LIKE *tableA*;

  And then add the additional column with: ALTER TABLE *tableB* ADD
 COLUMNS (new_column INT);

  The statements run without any errors or exceptions (even in the log
 files under /var/log/hive) but the new column doesn't appear.

  What am I doing wrong? Or is this not possible? What other ideas do you
 have for my use case?

  Thank you in advance for your help!






Re: Re: delta file compact take no effect

2015-06-17 Thread r7raul1...@163.com
My config is on my client. What is the metastore config?





r7raul1...@163.com
 
From: Alan Gates
Date: 2015-06-17 13:42
To: user
Subject: Re: delta file compact take no effect
Is the config you gave on your metastore or your client?  The worker thread
and initiator must be started on the metastore.

Alan.

r7raul1...@163.com
June 16, 2015 at 22:38
Any help?



r7raul1...@163.com


Query Timeout

2015-06-17 Thread Ibrar Ahmed
Hi,

What's wrong with my settings?

[127.0.0.1:1] hive> CREATE TABLE IF NOT EXISTS pagecounts_hbase (rowkey
STRING, pageviews STRING, bytes STRING) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES
('hbase.table.name' = 'pagecounts');

[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException:
Retried 10 times
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:127)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:88)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at

Re: Query Timeout

2015-06-17 Thread Ibrar Ahmed
I was able to fix that issue, but got another error:


[127.0.0.1:1] hive> CREATE TABLE IF NOT EXISTS pagecounts_hbase (rowkey
STRING, pageviews STRING, bytes STRING) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES
('hbase.table.name' = 'pagecounts');
[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:java.lang.IllegalArgumentException: Not a host:port
pair: PBUF

ibrar-virtual-machine
at
org.apache.hadoop.hbase.util.Addressing.parseHostname(Addressing.java:60)
at org.apache.hadoop.hbase.ServerName.<init>(ServerName.java:96)
at
org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:278)
at
org.apache.hadoop.hbase.MasterAddressTracker.bytesToServerName(MasterAddressTracker.java:77)
at
org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:61)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:631)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:106)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


On Wed, Jun 17, 2015 at 3:51 PM, Ibrar Ahmed ibrar.ah...@gmail.com wrote:

 Hi,

 What's wrong with my settings?

 [127.0.0.1:1] hive> CREATE TABLE IF NOT EXISTS pagecounts_hbase
 (rowkey STRING, pageviews STRING, bytes STRING) STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
 ('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES
 ('hbase.table.name' = 'pagecounts');

 [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
 Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
 MetaException(message:MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException:
 Retried 10 times
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:127)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at 

hive cdh5.4 1.1.0 metastore java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0

2015-06-17 Thread ??????.
My Hive version is hive-cdh5.4.0.


Following these steps, the exception is thrown:

# hive
CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
ALTER TABLE test1 ADD PARTITION (pt='1');
ALTER TABLE test1 CHANGE name name1 string;
ALTER TABLE test1 CHANGE name name1 string cascade; 


Then this exception is thrown:
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table.
java.lang.RuntimeException: commitTransaction was called but
openTransactionCalls = 0. This probably indicates that there are unbalanced
calls to openTransaction/commitTransaction




Metastore log:
MetaException(message:java.lang.RuntimeException: commitTransaction was called
but openTransactionCalls = 0. This probably indicates that there are unbalanced
calls to openTransaction/commitTransaction)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
  at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
  at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
  at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
  at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
  at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: commitTransaction was called but
openTransactionCalls = 0. This probably indicates that there are unbalanced
calls to openTransaction/commitTransaction
  at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
  at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
  at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
  at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
  ... 19 more


I debugged the code; the private method updatePartColumnStatsForAlterColumns
may be wrong. Some transaction is rolled back, but I don't know the exact
error.

Hive counters for records read and written

2015-06-17 Thread Hemanth Meka
Hi,


I can see that two new counters (RECORDS_IN/RECORDS_OUT) were added for Hive
in release 0.14.


Prior to this release, which counters could be used to get the records read
and the records written by a Hive job? I noticed that in Hive 0.14, for a few
Hive jobs, I see map_input_records but the map_output_records counter is 0,
even though the job actually writes something to the output table and the
Hive log also gives that count correctly.


In this case, how else can we get records read and records written in
releases before 0.14?


Regards

Hemanth


Read error : Varchar cannot be cast to string

2015-06-17 Thread Devansh Srivastava
Hi,


I have one table with VARCHAR and CHAR datatypes. While reading data through
Hive, I am getting the error below:


Diagnostic Messages for this Task:
Error: java.io.IOException: java.io.IOException: java.lang.RuntimeException: 
java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar 
cannot be cast to java.lang.String
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:273)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:183)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: java.lang.RuntimeException: 
java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar 
cannot be cast to java.lang.String
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:271)
... 11 more
Caused by: java.lang.RuntimeException: java.lang.ClassCastException: 
org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to 
java.lang.String
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:95)
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:49)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
... 15 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to 
java.lang.String
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:566)
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:90)
... 17 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive>

Can anyone please help me with this?

With Regards
Devansh



Re: Read error : Varchar cannot be cast to string

2015-06-17 Thread Gopal Vijayaraghavan
Hi,

 Caused by: java.lang.ClassCastException:
org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to
java.lang.String
at
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:566)

Is the column marked as varchar a partition column?

Can you write a small test-case and post a bug about this?

I can take a look at this, looks like a simple missed call to toString().
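
If it helps, a minimal sketch of what such a test case might look like (all
names hypothetical; assumes a varchar partition column read through the
vectorized ORC path, and some existing table to select a row from):

  CREATE TABLE t_vc (id INT) PARTITIONED BY (p VARCHAR(10)) STORED AS ORC;
  INSERT OVERWRITE TABLE t_vc PARTITION (p = 'x')
  SELECT 1 FROM existing_table LIMIT 1;
  SET hive.vectorized.execution.enabled = true;
  -- selecting the partition column exercises addPartitionColsToBatch
  SELECT id, p FROM t_vc;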

Cheers,
Gopal