Hive - Tez error with big join - Container expired.

2015-06-17 Thread Daniel Klinger
Hi all,

 

I have a pretty big Hive query. I'm joining three Hive tables which have
thousands of rows each, and I'm grouping this join by several columns. In the
Hive shell this query only reaches about 80%. After about 1400 seconds it is
cancelled with the following error:

 

Status: Failed

Vertex failed, vertexName=Map 2, vertexId=vertex_1434357133795_0008_1_01,
diagnostics=[Task failed, taskId=task_1434357133795_0008_1_01_33,
diagnostics=[TaskAttempt 0 failed,
info=[Container container_1434357133795_0008_01_39 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 1 failed,
info=[Container container_1434357133795_0008_01_55 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 2 failed,
info=[Container container_1434357133795_0008_01_72 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 3 failed,
info=[Container container_1434357133795_0008_01_000101 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex
vertex_1434357133795_0008_1_01 [Map 2] killed/failed due to:null]

DAG failed due to vertex failure. failedVertices:1 killedVertices:0

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask

 

My YARN ResourceManager is at 100% during the whole execution (using all of
the 300 GB of memory). I tried to extend the lifetime of my containers with
the following setting in yarn-site.xml, but without success:

 

yarn.resourcemanager.rm.container-allocation.expiry-interval-ms = 120
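
For reference, this is the form I used in yarn-site.xml (a sketch; note that
the value is in milliseconds, and I believe the YARN default is 600000, i.e.
10 minutes, so 120 means 120 ms):

<property>
  <name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
  <!-- expiry interval in milliseconds -->
  <value>120</value>
</property>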

 

After this change my query stays at 0% for thousands of seconds. The query
itself works (tested with less data). How can I solve this problem?

 

Thanks for your help.

 

Greetz

DK



Disabling hdfs and local filesystem cache no longer necessary?

2015-06-17 Thread Ben Tse
The wiki page
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
still advises disabling the HDFS and local filesystem caches. This should
no longer be needed, given that HIVE-4501 has been resolved since 0.13.0.

Is this correct?

referenced configuration:
fs.hdfs.impl.disable.cache
fs.file.impl.disable.cache
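
For context, the wiki-advised form in hive-site.xml, as I read the page:

<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
<property>
  <name>fs.file.impl.disable.cache</name>
  <value>true</value>
</property>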

-Ben


Re: delta file compact take no effect

2015-06-17 Thread Alan Gates
See 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration


Compaction is initiated by the thrift metastore server.  You need to set
the values labeled metastore on the above page in the hive-site.xml of
your metastore server.
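
A minimal sketch of those metastore-side entries, assuming the defaults
described on that page (the worker thread count below is just an example):

<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <!-- must be greater than 0 on at least one metastore instance -->
  <value>1</value>
</property>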


Alan.


r7raul1...@163.com
June 16, 2015 at 23:33
My config is on my client. What is the metastore config?




r7raul1...@163.com


Re: Re: delta file compact take no effect

2015-06-17 Thread r7raul1...@163.com
It works now! But I see some ERRORs and deadlocks:

2015-06-18 09:06:06,509 ERROR [test.oracle-22]: txn.CompactionTxnHandler 
(CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next 
element for compaction, ERROR: could not serialize access due to concurrent 
update 
2015-06-18 09:06:06,509 ERROR [test.oracle-27]: txn.CompactionTxnHandler 
(CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next 
element for compaction, ERROR: could not serialize access due to concurrent 
update 
2015-06-18 09:06:06,509 ERROR [test.oracle-28]: txn.CompactionTxnHandler 
(CompactionTxnHandler.java:findNextToCompact(194)) - Unable to select next 
element for compaction, ERROR: could not serialize access due to concurrent 
update 
2015-06-18 09:06:06,509 WARN [test.oracle-22]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(916)) - Deadlock detected in findNextToCompact, 
trying again. 
2015-06-18 09:06:06,509 WARN [test.oracle-27]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(916)) - Deadlock detected in findNextToCompact, 
trying again. 
2015-06-18 09:06:06,509 WARN [test.oracle-28]: txn.TxnHandler 
(TxnHandler.java:checkRetryable(916)) - Deadlock detected in findNextToCompact, 
trying again. 
2015-06-18 09:06:06,544 INFO [test.oracle-26]: compactor.Worker 
(Worker.java:run(140)) - Starting MAJOR compaction for default.u_data_txn 
2015-06-18 09:06:06,874 INFO [test.oracle-26]: impl.TimelineClientImpl 
(TimelineClientImpl.java:serviceInit(123)) - Timeline service address: 
http://192.168.117.117:8188/ws/v1/timeline/ 
2015-06-18 09:06:06,960 INFO [test.oracle-26]: client.RMProxy 
(RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at 
localhost/127.0.0.1:8032 
2015-06-18 09:06:07,175 INFO [test.oracle-26]: impl.TimelineClientImpl 
(TimelineClientImpl.java:serviceInit(123)) - Timeline service address: 
http://192.168.117.117:8188/ws/v1/timeline/ 
2015-06-18 09:06:07,176 INFO [test.oracle-26]: client.RMProxy 
(RMProxy.java:createRMProxy(92)) - Connecting to ResourceManager at 
localhost/127.0.0.1:8032 
2015-06-18 09:06:07,298 WARN [test.oracle-26]: mapreduce.JobSubmitter 
(JobSubmitter.java:copyAndConfigureFiles(150)) - Hadoop command-line option 
parsing not performed. Implement the Tool interface and execute your 
application with ToolRunner to remedy this. 
2015-06-18 09:06:07,777 INFO [test.oracle-26]: mapreduce.JobSubmitter 
(JobSubmitter.java:submitJobInternal(401)) - number of splits:2 
2015-06-18 09:06:07,876 INFO [test.oracle-26]: mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(484)) - Submitting tokens for job: 
job_1433398549746_0035 
2015-06-18 09:06:08,021 INFO [test.oracle-26]: impl.YarnClientImpl 
(YarnClientImpl.java:submitApplication(236)) - Submitted application 
application_1433398549746_0035 
2015-06-18 09:06:08,052 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:submit(1299)) - The url to track the job: 
http://localhost:8088/proxy/application_1433398549746_0035/ 
2015-06-18 09:06:08,052 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1344)) - Running job: job_1433398549746_0035 
2015-06-18 09:06:18,174 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1365)) - Job job_1433398549746_0035 running in 
uber mode : false 
2015-06-18 09:06:18,176 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1372)) - map 0% reduce 0% 
2015-06-18 09:06:23,232 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1372)) - map 50% reduce 0% 
2015-06-18 09:06:28,262 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1372)) - map 100% reduce 0% 
2015-06-18 09:06:28,273 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1383)) - Job job_1433398549746_0035 completed 
successfully 
2015-06-18 09:06:28,327 INFO [test.oracle-26]: mapreduce.Job 
(Job.java:monitorAndPrintJob(1390)) - Counters: 30 



r7raul1...@163.com
 
From: r7raul1...@163.com
Date: 2015-06-18 08:37
To: user
Subject: Re: Re: delta file compact take no effect
Thank you! I will try.



r7raul1...@163.com
 
From: Alan Gates
Date: 2015-06-18 08:33
To: user
Subject: Re: delta file compact take no effect
See 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration

Compaction is initiated by the thrift metastore server.  You need to set the 
values labeled metastore in the above page in the hive-site.xml for your 
metastore server.

Alan.

r7raul1...@163.com
June 16, 2015 at 23:33
My config is on my client. What is the metastore config?





r7raul1...@163.com


Re: Running Hive Unit Tests from IntelliJ and Datanucleus

2015-06-17 Thread Ruoxi Sun
Hi Rajat, I used an alternative to the Datanucleus plugin in IntelliJ.
Try creating a run configuration as the attached picture shows, and make
sure you have datanucleus in your module's dependencies.

Hope it can help.
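
In case the picture doesn't come through on the list, the rough idea (a
sketch, not an exact recipe): let Maven run the DataNucleus enhancer on the
metastore model classes, then make sure IntelliJ uses the enhanced output
instead of its own unenhanced compile output:

  # build the metastore module so the model classes get enhanced by the
  # datanucleus plugin bound in its pom
  cd metastore
  mvn -DskipTests clean install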


*孙若曦*

2015-06-18 3:32 GMT+08:00 Rajat Jain rajat...@gmail.com:

 Hi,

 I want to run Hive unit tests from IntelliJ but am unable to do so due to
 datanucleus issues. I tried a lot of options but always seem to be getting
 the same error.

 1. Datanucleus plugin:

 I installed Datanucleus plugin on IntelliJ, enabled the enhancer but got
 error of the type:

 Caused by: org.datanucleus.exceptions.ClassNotPersistableException: The
 class org.apache.hadoop.hive.metastore.model.MVersionTable is not
 persistable. This means that it either hasn't been enhanced, or that the
 enhanced version of the file is not in the CLASSPATH (or is hidden by an
 unenhanced version), or the Meta-Data/annotations for the class are not
 found.
 at
 org.datanucleus.ExecutionContextImpl.assertClassPersistable(ExecutionContextImpl.java:5698)
 at
 org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2123)
 at
 org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065)
 at
 org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913)
 at
 org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
 at
 org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)
 ... 66 more

 When I checked the Datanucleus settings, I noticed that only the MDatabase
 class was registered there and no other classes (like MVersionTable) were
 registered. Is that the issue? If not, is there any way this can be resolved?

 2. I tried pushing the hive-metastore dependency up above Module Source, as
 suggested here:
 http://qnalist.com/questions/5105293/running-tests-in-intellij. That didn't
 work either. I tried this option with the datanucleus plugin both enabled
 and disabled.

 Let me know if someone has any ideas. I have attached sample screenshots
 for reference.

 Thanks,
 Rajat





Maximum number of columns

2015-06-17 Thread Shimpei Kodama
Hi guys,

Let me ask a quick question.

Is there a maximum number of columns for a Hive table?

Thanks,
Shimpei


Query in column name

2015-06-17 Thread Renuka Be
Hi Folks,

I tried a column name containing a space, like (Order Details), and it
causes an error. Is there any way to specify a column name with a space?
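
I have seen backticks suggested; I believe this needs Hive 0.13+ with
hive.support.quoted.identifiers set to column. A sketch with a hypothetical
table name:

  SET hive.support.quoted.identifiers = column;
  CREATE TABLE orders_test (`Order Details` STRING);
  SELECT `Order Details` FROM orders_test;

Would this be the right approach?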

Regards,
Renuka N


Re: Merging small files in partitions

2015-06-17 Thread Mohammad Islam
Hi Edward,
Can we do the same/similar thing for Parquet files? Any pointer?
Regards,
Mohammad


 On Tuesday, June 16, 2015 2:35 PM, Edward Capriolo edlinuxg...@gmail.com 
wrote:
   

 https://github.com/edwardcapriolo/filecrush

On Tue, Jun 16, 2015 at 5:05 PM, Chagarlamudi, Prasanth 
prasanth.chagarlam...@epsilon.com wrote:

Hello,
I am looking for an optimized way to merge small files in Hive partitions
into one big file. I came across Alter Table/Partition Concatenate
(https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionConcatenate).
The doc says this only works for RCFiles. I wish there were something
similar for the TEXTFILE format. Any suggestions? Thanks in advance.
Prasanth
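
For reference, the CONCATENATE syntax from that page, plus the merge-settings
workaround I have seen suggested for TEXTFILE tables (table, partition and
column names below are hypothetical):

  -- RCFile partitions (per the doc):
  ALTER TABLE page_views PARTITION (dt = '2015-06-16') CONCATENATE;

  -- TEXTFILE workaround: rewrite the partition and let Hive merge the outputs
  SET hive.merge.mapfiles = true;
  SET hive.merge.mapredfiles = true;
  SET hive.merge.smallfiles.avgsize = 134217728;  -- 128 MB
  INSERT OVERWRITE TABLE page_views PARTITION (dt = '2015-06-16')
  SELECT page, views FROM page_views WHERE dt = '2015-06-16';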





  

Query HBase table based on timestamp

2015-06-17 Thread Buntu Dev
My use case is to query time series data ingested into HBase table
containing a web page name or url as row key and related properties as
column qualifiers. The properties for the web page are dynamic, i.e. the
column qualifiers are dynamic for a given timestamp.

I would like to create a Hive managed HBase table to query this time series
data for properties of the web page at a given timestamp. Can anyone
clarify:

* How to create a Hive table in this case, and what to provide as the
hbase.columns.mapping property, given that the columns themselves are
dynamic? (See the sketch after these questions.)

* How to modify the HBase or Hive table schema to be able to query for a
given timestamp, since that doesn't seem to be supported based on the HBase
integration wiki: "there is currently no way to access the HBase timestamp
attribute, and queries always access data with the latest timestamp."
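
For the first question, I believe the integration wiki documents mapping a
whole column family to a Hive MAP, which would cover dynamic qualifiers; a
sketch with hypothetical table and family names:

  CREATE EXTERNAL TABLE pages_hbase (rowkey STRING, props MAP<STRING, STRING>)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:')
  TBLPROPERTIES ('hbase.table.name' = 'pages');

Every qualifier in family f would then appear as a key in the props map.
The timestamp question remains open.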


Thanks!


Re: Alter table statement for create table like-tables

2015-06-17 Thread Julian Keppel
I tried it with an external table in CSV format now. This worked properly,
so it really seems to have to do with Avro as the data format.

The Jira ticket https://issues.apache.org/jira/browse/HIVE-7446 was closed
and the corresponding patch was added to release 0.14. Should I open a new
ticket?
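
In the meantime I'm considering a workaround, assuming the AvroSerDe derives
its column list from the Avro schema rather than from the metastore columns:
update the schema itself instead of using ADD COLUMNS, e.g. with a
hypothetical schema location:

  ALTER TABLE tableB SET TBLPROPERTIES (
    'avro.schema.url' = 'location_of_schema_with_new_column'
  );

where the referenced schema is a copy of the original with the new field
(and a default value) appended.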

2015-06-16 20:24 GMT+02:00 Jason Dere jd...@hortonworks.com:

  Probably has to do with the fact that it is an Avro table.  I don't have
 any experience using Avro, but maybe take a look at
 https://issues.apache.org/jira/browse/HIVE-7446 for some of the issues
 described there, or maybe look at the test that was added for that Jira.


  On Jun 16, 2015, at 2:42 AM, Julian Keppel juliankeppel1...@gmail.com
 wrote:

  *Push* Does no one have an idea, or has anyone hit similar issues?

 2015-06-09 15:46 GMT+02:00 Julian Keppel juliankeppel1...@gmail.com:

 I use Hive Version 1.1.0 in Cloudera CDH 5.4.0.

  I have created an external table:

  CREATE EXTERNAL TABLE *tableA*
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
 STORED AS
 INPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
 OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 LOCATION 'location'
 TBLPROPERTIES ('avro.schema.url'='schema_location');

  Now I wanted to create a managed table with exactly the same columns
 except of one additional column (a generated hash key). The only idea I had
 was to create the table with: CREATE TABLE *tableB* LIKE *tableA*;

  And then add the additional column with: ALTER TABLE *tableB* ADD
 COLUMNS (new_column INT);

  The statements run without any errors or exceptions (even in the log
 files under /var/log/hive) but the new column doesn't appear.

  What am I doing wrong? Or is this not possible? What other ideas do you
 have for my use case?

  Thank you in advance for your help!






Re: Re: delta file compact take no effect

2015-06-17 Thread r7raul1...@163.com
My config is on my client. What is the metastore config?





r7raul1...@163.com
 
From: Alan Gates
Date: 2015-06-17 13:42
To: user
Subject: Re: delta file compact take no effect
Is the config you gave on your metastore or your client?  The worker thread
and initiator must be started on the metastore.

Alan.

r7raul1...@163.com
June 16, 2015 at 22:38
Any help?



r7raul1...@163.com


Query Timeout

2015-06-17 Thread Ibrar Ahmed
Hi,

What's wrong with my settings?

[127.0.0.1:1] hive> CREATE TABLE IF NOT EXISTS pagecounts_hbase (rowkey
STRING, pageviews STRING, bytes STRING) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES
('hbase.table.name' = 'pagecounts');

[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException:
Retried 10 times
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:127)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:88)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at

Re: Query Timeout

2015-06-17 Thread Ibrar Ahmed
I was able to fix that issue, but got another error:


[127.0.0.1:1] hive> CREATE TABLE IF NOT EXISTS pagecounts_hbase (rowkey
STRING, pageviews STRING, bytes STRING) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES
('hbase.table.name' = 'pagecounts');
[Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:java.lang.IllegalArgumentException: Not a host:port
pair: PBUF

ibrar-virtual-machine
at
org.apache.hadoop.hbase.util.Addressing.parseHostname(Addressing.java:60)
at org.apache.hadoop.hbase.ServerName.<init>(ServerName.java:96)
at
org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:278)
at
org.apache.hadoop.hbase.MasterAddressTracker.bytesToServerName(MasterAddressTracker.java:77)
at
org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:61)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:631)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:106)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84)
at
org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy7.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
at
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


On Wed, Jun 17, 2015 at 3:51 PM, Ibrar Ahmed ibrar.ah...@gmail.com wrote:

 Hi,

 What's wrong with my settings?

 [127.0.0.1:1] hive> CREATE TABLE IF NOT EXISTS pagecounts_hbase
 (rowkey STRING, pageviews STRING, bytes STRING) STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
 ('hbase.columns.mapping' = ':key,f:c1,f:c2') TBLPROPERTIES
 ('hbase.table.name' = 'pagecounts');

 [Hive Error]: Query returned non-zero code: 1, cause: FAILED: Execution
 Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
 MetaException(message:MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException:
 Retried 10 times
 at
 org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:127)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84)
 at
 org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:554)
 at
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at 

hive cdh5.4 1.1.0 metastore java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0

2015-06-17 Thread ??????.
My Hive version is hive-cdh5.4.0.


Following these steps, the exception is thrown:

# hive
CREATE TABLE test1 (name string) PARTITIONED BY (pt string);
ALTER TABLE test1 ADD PARTITION (pt='1');
ALTER TABLE test1 CHANGE name name1 string;
ALTER TABLE test1 CHANGE name name1 string cascade; 


Then this exception is thrown:
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table.
java.lang.RuntimeException: commitTransaction was called but
openTransactionCalls = 0. This probably indicates that there are unbalanced
calls to openTransaction/commitTransaction




Metastore log:
MetaException(message:java.lang.RuntimeException: commitTransaction was called
but openTransactionCalls = 0. This probably indicates that there are unbalanced
calls to openTransaction/commitTransaction)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5257)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3338)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3290)
  at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
  at com.sun.proxy.$Proxy5.alter_table_with_cascade(Unknown Source)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9131)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_cascade.getResult(ThriftHiveMetastore.java:9115)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
  at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
  at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
  at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: commitTransaction was called but
openTransactionCalls = 0. This probably indicates that there are unbalanced
calls to openTransaction/commitTransaction
  at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:448)
  at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
  at com.sun.proxy.$Proxy0.commitTransaction(Unknown Source)
  at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:242)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3318)
  ... 19 more


I debugged the code; the private method updatePartColumnStatsForAlterColumns
may be wrong. Some transaction is rolled back, but I don't know the exact
error.

Hive counters for records read and written

2015-06-17 Thread Hemanth Meka
Hi,


I can see that two new counters (RECORDS_IN/RECORDS_OUT) were added for Hive
in release 0.14.


Prior to this release, which counters could be used to get the records read
and the records written by a Hive job? I noticed that in Hive 0.14, for a few
Hive jobs, I see map_input_records but the map_output_records counter is 0,
even though the job actually writes something to the output table and the
Hive log also gives that count correctly.


In this case, how else can we get records read and records written in
releases before 0.14?


Regards

Hemanth


Read error : Varchar cannot be cast to string

2015-06-17 Thread Devansh Srivastava
Hi,


I have one table with VARCHAR and CHAR datatypes. While reading data through
Hive, I am getting the error below:


Diagnostic Messages for this Task:
Error: java.io.IOException: java.io.IOException: java.lang.RuntimeException: 
java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar 
cannot be cast to java.lang.String
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:273)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:183)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: java.lang.RuntimeException: 
java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar 
cannot be cast to java.lang.String
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:271)
... 11 more
Caused by: java.lang.RuntimeException: java.lang.ClassCastException: 
org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to 
java.lang.String
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:95)
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:49)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
... 15 more
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to 
java.lang.String
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:566)
at 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:90)
... 17 more


FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive>

Can anyone please help me with this?

With Regards
Devansh



Re: Read error : Varchar cannot be cast to string

2015-06-17 Thread Gopal Vijayaraghavan
Hi,

 Caused by: java.lang.ClassCastException:
org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to
java.lang.String
at
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(VectorizedRowBatchCtx.java:566)

Is the column marked as varchar a partition column?

Can you write a small test-case and post a bug about this?

I can take a look at this, looks like a simple missed call to toString().
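
If it helps, a minimal sketch of what such a test case might look like (all
names hypothetical; assumes a varchar partition column read through the
vectorized ORC path, and some existing table to select a row from):

  CREATE TABLE t_vc (id INT) PARTITIONED BY (p VARCHAR(10)) STORED AS ORC;
  INSERT OVERWRITE TABLE t_vc PARTITION (p = 'x')
  SELECT 1 FROM existing_table LIMIT 1;
  SET hive.vectorized.execution.enabled = true;
  -- selecting the partition column exercises addPartitionColsToBatch
  SELECT id, p FROM t_vc;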

Cheers,
Gopal