Re: import sql file
I exported a SQL table into a .sql file and would like to import this into Hive.

Best, Patcharee

On 23. nov. 2016 10:40, Markovitz, Dudu wrote:
> Hi Patcharee
> The question is not clear.
> Dudu
>
> -----Original Message-----
> From: patcharee [mailto:patcharee.thong...@uni.no]
> Sent: Wednesday, November 23, 2016 11:37 AM
> To: user@hive.apache.org
> Subject: import sql file
>
> Hi, how can I import a .sql file into Hive?
> Best, Patcharee
import sql file
Hi, how can I import a .sql file into Hive? Best, Patcharee
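One common approach, assuming the .sql file contains Hive-compatible statements (the file path below is a placeholder), is to run the script with the CLI's `-f` option or the `source` command:

```sql
-- From the shell (paths and JDBC URL are placeholders):
--   hive -f /path/to/export.sql
--   beeline -u jdbc:hive2://host:10000 -f /path/to/export.sql
-- Or from inside an already-running Hive CLI session:
source /path/to/export.sql;
```

Note that if the file was exported from another RDBMS, its dialect (e.g. MySQL-specific DDL or data types) may need translating to HiveQL before it will run.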
Re: hiveserver2 java heap space
It works on the Hive CLI.

Patcharee

On 10/24/2016 11:51 AM, Mich Talebzadeh wrote:
> Does this work OK through the Hive CLI?
>
> Dr Mich Talebzadeh
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> http://talebzadehmich.wordpress.com
>
> On 24 October 2016 at 10:43, Patcharee Thongtra <patcharee.thong...@uni.no> wrote:
>> Hi, I tried to query an ORC file with Beeline and with a Java program using JDBC ("select * from orcfileTable limit 1"). Both failed with: Caused by: java.lang.OutOfMemoryError: Java heap space. The HiveServer2 heap size is 1024m. I guess I need to increase it? However, I wonder why I got this error at all, since I query just ONE row. Any ideas?
>> Thanks, Patcharee
hiveserver2 java heap space
Hi, I tried to query an ORC file with Beeline and with a Java program using JDBC ("select * from orcfileTable limit 1"). Both failed with: Caused by: java.lang.OutOfMemoryError: Java heap space. The HiveServer2 heap size is 1024m. I guess I need to increase it? However, I wonder why I got this error at all, since I query just ONE row. Any ideas? Thanks, Patcharee
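A sketch of raising the HiveServer2 heap via hive-env.sh (variable names vary by distribution and version, and Ambari-managed clusters set this in the UI instead; the value is illustrative):

```shell
# hive-env.sh sketch (assumption: a distribution where HiveServer2
# inherits the Hadoop client heap setting unless overridden)
export HADOOP_HEAPSIZE=4096   # MB, up from the default 1024
```

As to why `limit 1` can still exhaust the heap: the server side reads ORC footer and stripe metadata before returning any rows, so a table with very wide schemas or many files can consume far more memory than the single returned row suggests.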
hiveserver2 GC overhead limit exceeded
Hi, I use Beeline to connect to HiveServer2. I tested with a simple command and got a GC overhead limit exceeded error:

0: jdbc:hive2://service-10-1:10010/default> drop table testhivedrivertable;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. GC overhead limit exceeded (state=08S01,code=1)

How can I solve this? And how can I tell whether this error comes from the client (Beeline) or from HiveServer2?

Thanks, Patcharee
Re: query orc file by hive
Hi,

It works after I altered the table to add the partitions. Thanks! My partitioned ORC file (directory) was created by Spark, therefore Hive is not aware of the partitions automatically.

Best, Patcharee

On 13. nov. 2015 13:08, Elliot West wrote:
> Have you added the partitions to the metastore?
>
> ALTER TABLE ... ADD PARTITION ...
>
> If using Spark, I believe it has good support to do this automatically with the HiveContext, although I have not used it myself.
> Elliot.
>
> On Friday, 13 November 2015, patcharee <patcharee.thong...@uni.no> wrote:
>> Hi, it works with non-partitioned ORC, but does not work with (2-column) partitioned ORC.
>> Thanks, Patcharee
>>
>> On 09. nov. 2015 10:55, Elliot West wrote:
>>> Hi, you can create a table and point the location property to the folder containing your ORC file:
>>>
>>> CREATE EXTERNAL TABLE orc_table ( ... )
>>> STORED AS ORC
>>> LOCATION '/hdfs/folder/containing/orc/file';
>>>
>>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
>>> Thanks - Elliot.
>>>
>>> On 9 November 2015 at 09:44, patcharee wrote:
>>>> Hi, how can I query an ORC file (*.orc) with Hive? This ORC file is created by other apps, like Spark or MR.
>>>> Thanks, Patcharee
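For partition directories written outside Hive, the partitions can be registered either one at a time or in bulk (table, partition, and path names below are illustrative):

```sql
-- Register a single partition written by an external writer such as Spark:
ALTER TABLE orc_table ADD IF NOT EXISTS
  PARTITION (year=2015, month=11)
  LOCATION '/hdfs/folder/containing/orc/file/year=2015/month=11';

-- Or scan the table's location and register every partition directory found:
MSCK REPAIR TABLE orc_table;
```

MSCK REPAIR is convenient when Spark has written many partitions in the standard `key=value` directory layout; ALTER TABLE ... ADD PARTITION is needed when the directory names do not follow that convention.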
Re: query orc file by hive
Hi, it works with non-partitioned ORC, but does not work with (2-column) partitioned ORC. Thanks, Patcharee

On 09. nov. 2015 10:55, Elliot West wrote:
> Hi, you can create a table and point the location property to the folder containing your ORC file:
>
> CREATE EXTERNAL TABLE orc_table ( ... )
> STORED AS ORC
> LOCATION '/hdfs/folder/containing/orc/file';
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
> Thanks - Elliot.
>
> On 9 November 2015 at 09:44, patcharee <patcharee.thong...@uni.no> wrote:
>> Hi, how can I query an ORC file (*.orc) with Hive? This ORC file is created by other apps, like Spark or MR.
>> Thanks, Patcharee
query orc file by hive
Hi, how can I query an ORC file (*.orc) with Hive? This ORC file is created by other applications, like Spark or MR. Thanks, Patcharee
Min-Max Index vs Bloom filter
Hi, for the ORC format, in which scenarios is a Bloom filter better than the min-max index? Best, Patcharee
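Roughly: min-max indexes skip stripes by value range, so they shine when the data is sorted or naturally clustered on the filtered column; Bloom filters help point lookups (`col = value`) on high-cardinality columns whose values are scattered, where almost every stripe's min-max range overlaps the predicate. A sketch of enabling Bloom filters at table-creation time (table and column names are illustrative):

```sql
-- Bloom filters are built per stripe for the listed columns; they add write
-- cost and file size, so list only columns used in equality predicates.
CREATE TABLE events (user_id bigint, ts timestamp, payload string)
STORED AS ORC
TBLPROPERTIES (
  'orc.bloom.filter.columns' = 'user_id',
  'orc.bloom.filter.fpp'     = '0.05'   -- target false-positive rate
);
```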
the column names removed after insert select
Hi, I populated a table from a select (insert into table newtable select date, hh, x, y from oldtable). After the insert, the column names of the table appear to have been replaced; see the output below from hive --orcfiledump:

- Type: struct<_col0:int,_col1:int,_col2:int,_col3:int>

while it is supposed to be:

- Type: struct<date:int,hh:int,x:int,y:int>

Any ideas how this happened and how I can fix it? Please advise.

BR, Patcharee
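This is, as far as I know, expected behavior for Hive-written ORC files of that era rather than corruption: the writer records positional internal names (_col0, _col1, ...) in the file footer, and at read time Hive resolves column names by position from the table's metastore schema. So queries by name should still work even though orcfiledump shows _colN (table name taken from the message, behavior hedged as an assumption):

```sql
-- The metastore schema, not the ORC footer, supplies the column names:
DESCRIBE newtable;                          -- should still list the real names
SELECT `date`, hh FROM newtable LIMIT 10;   -- resolved by position, works fine
```

If the real names must appear in the files themselves, rewriting the data with a writer that preserves schema (e.g. Spark writing ORC directly) is the usual workaround.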
the number of files after merging
Hi, I am using the alter command below to merge the ORC files in one partition of a partitioned table:

alter table X partition(zone=1,z=1,year=2009,month=1) CONCATENATE;

- How can I control the number of files after merging? I would like to get only one file per partition.
- Is it possible to concatenate the whole table, rather than one partition at a time?

Thanks, Patcharee
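A sketch, under the assumption that CONCATENATE honors the standard merge-size settings (values illustrative): raising the per-task merge target above the partition's total size pushes the merge toward a single output file.

```sql
-- Assumption: the merge task groups inputs up to hive.merge.size.per.task,
-- so a target larger than the partition's total bytes yields one file.
SET hive.merge.orcfile.stripe.level=true;
SET hive.merge.size.per.task=1073741824;       -- 1 GB target per merged file
SET hive.merge.smallfiles.avgsize=1073741824;
ALTER TABLE X PARTITION (zone=1, z=1, year=2009, month=1) CONCATENATE;
```

On the second question: for a partitioned table, CONCATENATE is issued per partition; the usual workaround is to script a loop over the output of SHOW PARTITIONS.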
orc table with sorted field
Hi, how can I create a partitioned ORC table with sorted field(s)? I tried to use the SORTED BY keyword, but got a parse exception:

CREATE TABLE peoplesort (name string, age int) partition by (bddate int) SORTED BY (age) stored as orc

Is it possible to have some sorted columns? From the Hive DDL page, it seems only bucketed tables can be sorted. Any suggestions please.

BR, Patcharee
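The DDL page has it right: SORTED BY is only valid together with CLUSTERED BY ... INTO n BUCKETS, and the partition clause is PARTITIONED BY. A corrected sketch of the statement above (bucket column and count are illustrative choices):

```sql
-- SORTED BY requires a bucketing clause:
CREATE TABLE peoplesort (name string, age int)
PARTITIONED BY (bddate int)
CLUSTERED BY (name) SORTED BY (age) INTO 4 BUCKETS
STORED AS ORC;

-- Alternatively, skip bucketing and sort at write time instead:
-- INSERT OVERWRITE TABLE peoplesort PARTITION (bddate=20150101)
-- SELECT name, age FROM people SORT BY age;
```

Note that on older Hive versions the write-time sort within buckets is only enforced when hive.enforce.sorting is enabled.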
Re: character '' not supported here
Hi,

I created a Hive table stored as ORC (partitioned and compressed with ZLIB) from the Hive CLI, then added data into this table with a Spark application. After adding, I was able to query the data and everything looked fine. Then I concatenated the table from the Hive CLI. After that I am no longer able to query data (e.g. select count(*) from the table); I just get the error "line 1:1 character '' not supported here", no matter whether the engine is Tez or MR. How did you solve the problem in your case?

BR, Patcharee

On 18. juli 2015 21:26, Nitin Pawar wrote:
> Can you tell exactly what steps you did? Also, did you try running the query with MR instead of Tez? I am not sure this is an issue with the ORC file format; I once faced issues on ALTER TABLE for ORC-backed tables when adding a new column.
>
> On Sun, Jul 19, 2015 at 12:05 AM, pth001 <patcharee.thong...@uni.no> wrote:
>> Hi,
>> The query result:
>> (five sample rows followed; their numeric columns were run together in the archive and are omitted here)
>> I stored the table in ORC format, partitioned and compressed with ZLIB. The problem happened just after I concatenated the table.
>> BR, Patcharee
>>
>> On 18/07/15 12:46, Nitin Pawar wrote:
>>> select * without a where clause will work because it does not involve file processing. I suspect the problem is with the field delimiter, which is why I asked for records, so that we can see what the data in each column is. Are you using a CSV file with columns delimited by some char, with numeric data in quotes?
>>>
>>> On Sat, Jul 18, 2015 at 3:58 PM, patcharee <patcharee.thong...@uni.no> wrote:
>>>> This works: select * from table limit 5; but not others. So?
>>>> Patcharee
>>>>
>>>> On 18. juli 2015 12:08, Nitin Pawar wrote:
>>>>> Can you do select * from table limit 5;
>>>>>
>>>>> On Sat, Jul 18, 2015 at 3:35 PM, patcharee <patcharee.thong...@uni.no> wrote:
>>>>>> Hi, I am using Hive 0.14 with the Tez engine and found a weird problem. Any suggestions?
>>>>>>
>>>>>> hive> select count(*) from 4D;
>>>>>> line 1:1 character '' not supported here
>>>>>> line 1:2 character '' not supported here
>>>>>> ...
>>>>>> line 1:146 character '' not supported here
>>>>>>
>>>>>> BR, Patcharee
>
> --
> Nitin Pawar
Re: character '' not supported here
This works: select * from table limit 5; but not others. So?

Patcharee

On 18. juli 2015 12:08, Nitin Pawar wrote:
> Can you do select * from table limit 5;
>
> On Sat, Jul 18, 2015 at 3:35 PM, patcharee <patcharee.thong...@uni.no> wrote:
>> Hi, I am using Hive 0.14 with the Tez engine and found a weird problem. Any suggestions?
>>
>> hive> select count(*) from 4D;
>> line 1:1 character '' not supported here
>> line 1:2 character '' not supported here
>> ...
>> line 1:146 character '' not supported here
>>
>> BR, Patcharee

--
Nitin Pawar
character '' not supported here
Hi, I am using Hive 0.14 with the Tez engine and found a weird problem. Any suggestions?

hive> select count(*) from 4D;
line 1:1 character '' not supported here
line 1:2 character '' not supported here
line 1:3 character '' not supported here
...
line 1:144 character '' not supported here
line 1:145 character '' not supported here
line 1:146 character '' not supported here

BR, Patcharee
Re: fails to alter table concatenate
Actually it works on MR, so the problem is with Tez. Thanks!

BR, Patcharee

On 30. juni 2015 10:23, Nitin Pawar wrote:
> Can you try doing the same after changing the query engine from Tez to MR? Not sure if it's a Hive bug or a Tez bug.
>
> On Tue, Jun 30, 2015 at 1:46 PM, patcharee <patcharee.thong...@uni.no> wrote:
>> Hi, I am using Hive 0.14. ALTER TABLE ... CONCATENATE fails occasionally (see the exception below). It is strange that it fails from time to time, unpredictably. Any suggestion/clue?
>>
>> hive> alter table 4dim partition(zone=2,z=15,year=2005,month=4) CONCATENATE;
>> ...
>> Status: Failed
>> Vertex failed, vertexName=File Merge, vertexId=vertex_1435307579867_0041_1_00, diagnostics=[Vertex vertex_1435307579867_0041_1_00 [File Merge] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [hdfs://service-10-0.local:8020/apps/hive/warehouse/wrf_tables/4dim/zone=2/z=15/year=2005/month=4] initializer failed, vertex=vertex_1435307579867_0041_1_00 [File Merge], java.lang.NullPointerException
>> at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
>> at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452)
>> at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441)
>> (remainder of the Tez stack trace as in the original message below)
>> DAG failed due to vertex failure. failedVertices:1 killedVertices:0
>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask
>> BR, Patcharee

--
Nitin Pawar
fails to alter table concatenate
Hi, I am using Hive 0.14. ALTER TABLE ... CONCATENATE fails occasionally (see the exception below). It is strange that it fails from time to time, unpredictably. Any suggestion/clue?

hive> alter table 4dim partition(zone=2,z=15,year=2005,month=4) CONCATENATE;
VERTICES    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
File Merge  FAILED     -1          0        0       -1       0       0
VERTICES: 00/01 [>>--] 0% ELAPSED TIME: 1435651968.00 s
Status: Failed
Vertex failed, vertexName=File Merge, vertexId=vertex_1435307579867_0041_1_00, diagnostics=[Vertex vertex_1435307579867_0041_1_00 [File Merge] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [hdfs://service-10-0.local:8020/apps/hive/warehouse/wrf_tables/4dim/zone=2/z=15/year=2005/month=4] initializer failed, vertex=vertex_1435307579867_0041_1_00 [File Merge], java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452)
at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441)
at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295)
at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask

BR, Patcharee
MetaException(message:java.security.AccessControlException: Permission denied
Hi,

I was running a Spark job to insert overwrite a Hive table and got Permission denied. My question is: why did the Spark job do the insert as user 'hive', not as myself who ran the job? How can I fix the problem?

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._
hiveContext.sql("INSERT OVERWRITE table 4dim ... ")

Caused by: MetaException(message:java.security.AccessControlException: Permission denied: user=hive, access=WRITE, inode="/apps/hive/warehouse/wrf_tables/4dim/zone=2/z=1/year=2009/month=1":patcharee:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6795)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6777)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6702)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:9529)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1516)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1433)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_partition_result$alter_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_partition_result$alter_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_partition_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_partition(ThriftHiveMetastore.java:2033)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_partition(ThriftHiveMetastore.java:2018)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partition(HiveMetaStoreClient.java:1091)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy37.alter_partition(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:469)
... 26 more

BR, Patcharee
Re: cast column float
Hi,

There is really some data, but these two queries return nothing... so weird:

select count(*) from u where xlong_u = 7.1578474 and xlat_u = 55.192524;
select count(*) from u where xlong_u = cast(7.1578474 as float) and xlat_u = cast(55.192524 as float);

Here is sample data:

select xlong_u, xlat_u from u order by abs(xlong_u - 7.1578474), abs(xlat_u - 55.192524) limit 5;
Query ID = patcharee_20150529115353_4eeacf47-d2d1-468e-811c-729be80bb750
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1432633634512_0109)
(Tez progress output elided) ELAPSED TIME: 391.59 s
OK
7.1578474    55.192524
7.1578474    55.192524
7.1578474    55.192524
7.1578474    55.192524
7.1578474    55.192524

Patcharee

On 27. mai 2015 18:12, Bhagwan S. Soni wrote:
> Could you also provide a sample dataset for these two columns?
>
> On Wed, May 27, 2015 at 7:17 PM, patcharee <patcharee.thong...@uni.no> wrote:
>> Hi, I queried a table on the value of two float columns:
>>
>> select count(*) from u where xlong_u = 7.1578474 and xlat_u = 55.192524;
>> select count(*) from u where xlong_u = cast(7.1578474 as float) and xlat_u = cast(55.192524 as float);
>>
>> Both queries returned 0 records, even though some records match the condition. What can be wrong? I am using Hive 0.14.
>> BR, Patcharee
cast column float
Hi, I queried a table on the value of two float columns:

select count(*) from u where xlong_u = 7.1578474 and xlat_u = 55.192524;
select count(*) from u where xlong_u = cast(7.1578474 as float) and xlat_u = cast(55.192524 as float);

Both queries returned 0 records, even though some records match the condition. What can be wrong? I am using Hive 0.14.

BR, Patcharee
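A likely culprit (hedged, since literal typing has varied across Hive versions): the decimal literal is parsed as DOUBLE, and the stored FLOAT value widened to DOUBLE for the comparison is not bit-identical to that literal, so exact equality never holds. Comparing within a tolerance sidesteps the representation issue (tolerance value is an illustrative choice):

```sql
-- Compare with a small tolerance instead of exact float equality:
SELECT count(*) FROM u
WHERE abs(xlong_u - 7.1578474) < 1e-6
  AND abs(xlat_u - 55.192524) < 1e-6;

-- Equivalent range form, which may allow predicate pushdown:
SELECT count(*) FROM u
WHERE xlong_u BETWEEN 7.1578464 AND 7.1578484
  AND xlat_u  BETWEEN 55.192514 AND 55.192534;
```

Declaring such columns as DOUBLE (or DECIMAL, when exact matching matters) avoids the problem at the schema level.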
Re: hive on Tez - merging orc files
Hi,

I generated the new hive-exec.jar as you suggested. On the sandbox, Hive 0.14 with the new jar now uses Tez for alter table concatenate, and it concatenates files correctly on Tez. Thanks!

However, I also tested on the production cluster (also Hive 0.14); there, merging did not work and produced the exception below:

2015-04-24 13:01:52,259 INFO [main] app.DAGAppMaster: Running DAG: alter table orc_merge5a partit...concatenate
2015-04-24 13:01:52,355 INFO [IPC Server handler 0 on 46526] ipc.Server: IPC Server handler 0 on 46526, call org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.getDAGStatus from 10.2.1.254:39356 Call#361 Retry#0
org.apache.tez.dag.api.TezException: No running dag at present
at org.apache.tez.dag.api.client.DAGClientHandler.getDAG(DAGClientHandler.java:84)
at org.apache.tez.dag.api.client.DAGClientHandler.getACLManager(DAGClientHandler.java:151)
at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:94)
at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7375)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

What could be the cause of this exception? Any ideas?

BR, Patcharee

On 24. april 2015 10:27, Prasanth Jayachandran wrote:
> You can download the branch-0.14 source code from https://github.com/apache/hive/tree/branch-0.14, apply HIVE-9529-branch-1.0.0.patch from https://issues.apache.org/jira/browse/HIVE-9529 and compile it using "mvn clean install -DskipTests -Phadoop-2,dist". This will generate a tar file under hive/packaging/target. You can extract the tar file and copy hive-exec-x.x.x.jar into /usr/hdp/2.2.*.*/hive/lib/ (take a backup of hive-exec.jar and replace it with the new one). Rerunning the Hive CLI should use the new hive-exec jar with the patch.
> Thanks, Prasanth
>
> On Apr 24, 2015, at 1:15 AM, patcharee wrote:
>> Hi, the sandbox 2.2 comes with Hive 0.14. Does it also have the bug? If so, how can I patch Hive on the sandbox?
>> BR, Patcharee
>>
>> On 24. april 2015 09:42, Prasanth Jayachandran wrote:
>>> Hi, this has been fixed recently: https://issues.apache.org/jira/browse/HIVE-9529. Merging is triggered in two different ways: INSERT/CTAS can trigger merging of small files, and CONCATENATE can trigger merging of small files. The latter had a bug which generated an MR task instead of a Tez task, which was fixed recently. The former always uses a Tez task.
>>> Thanks, Prasanth
>>>
>>> On Apr 24, 2015, at 12:33 AM, patcharee wrote:
>>>> Hi, is there anyone using the Hortonworks sandbox 2.2? I am trying to use Hive on Tez on the sandbox. I set the execution engine in hive-site.xml to Tez (hive.execution.engine = tez). Then I ran the script that alters a table to merge small ORC files (alter table orc_merge5a partition(st=0.8) concatenate;). The merging feature worked, but Hive did not use Tez, it used MapReduce, so weird! Another point: I tried to run the same script on the production cluster, which always runs on Tez; there the merging feature sometimes worked, sometimes did not. I would appreciate any suggestions.
>>>> BR, Patcharee
Re: hive on Tez - merging orc files
Hi,

The sandbox 2.2 comes with Hive 0.14. Does it also have the bug? If so, how can I patch Hive on the sandbox?

BR, Patcharee

On 24. april 2015 09:42, Prasanth Jayachandran wrote:
> Hi, this has been fixed recently: https://issues.apache.org/jira/browse/HIVE-9529. Merging is triggered in two different ways: INSERT/CTAS can trigger merging of small files, and CONCATENATE can trigger merging of small files. The latter had a bug which generated an MR task instead of a Tez task, which was fixed recently. The former always uses a Tez task.
> Thanks, Prasanth
>
> On Apr 24, 2015, at 12:33 AM, patcharee wrote:
>> Hi, is there anyone using the Hortonworks sandbox 2.2? I am trying to use Hive on Tez on the sandbox. I set the execution engine in hive-site.xml to Tez (hive.execution.engine = tez). Then I ran the script that alters a table to merge small ORC files (alter table orc_merge5a partition(st=0.8) concatenate;). The merging feature worked, but Hive did not use Tez, it used MapReduce, so weird! Another point: I tried to run the same script on the production cluster, which always runs on Tez; there the merging feature sometimes worked, sometimes did not. I would appreciate any suggestions.
>> BR, Patcharee
hive on Tez - merging orc files
Hi, is there anyone using the Hortonworks sandbox 2.2? I am trying to use Hive on Tez on the sandbox. I set the execution engine in hive-site.xml to Tez (hive.execution.engine = tez). Then I ran the script that alters a table to merge small ORC files (alter table orc_merge5a partition(st=0.8) concatenate;). The merging feature worked, but Hive did not use Tez, it used MapReduce, so weird! Another point: I tried to run the same script on the production cluster, which always runs on Tez; there the merging feature sometimes worked, sometimes did not. I would appreciate any suggestions. BR, Patcharee
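The property name and value mentioned above were flattened in the archive; as a hive-site.xml entry they would look like this (a config sketch):

```xml
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
```

The same setting can be applied per session with `set hive.execution.engine=tez;` (or `mr`), which is handy for comparing the two engines on the same statement.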
Re: merge small orc files
Hi Gopal,

The table created is not a bucketed table, but a dynamically partitioned table. I took the test script from https://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/orc_merge7.q

- create table orc_merge5 (userid bigint, string1 string, subtype double, decimal1 decimal, ts timestamp) stored as orc;
- create table orc_merge5a (userid bigint, string1 string, subtype double, decimal1 decimal, ts timestamp) partitioned by (st double) stored as orc;

I sent you the desc formatted output and the application log. I just found out that there are some TezExceptions which could be the cause of the problem. Please let me know how to fix it.

BR, Patcharee

On 21. april 2015 13:10, Gopal Vijayaraghavan wrote:
>> alter table concatenate does not work? I have a dynamically partitioned table (stored as orc). I tried to alter concatenate, but it did not work. See my test result.
>
> ORC fast concatenate does work on partitioned tables, but it doesn't work on bucketed tables. Bucketed tables cannot merge files, since the file count is capped by the numBuckets parameter.
>
>> hive> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/orc_merge5a/st=0.8/;
>> Found 2 items
>> -rw-r--r-- 3 patcharee hdfs 534 2015-04-21 12:33 /apps/hive/warehouse/orc_merge5a/st=0.8/00_0
>> -rw-r--r-- 3 patcharee hdfs 533 2015-04-21 12:33 /apps/hive/warehouse/orc_merge5a/st=0.8/01_0
>
> Is this a bucketed table? From the point of view of split generation and cluster parallelism, bucketing is an anti-pattern, since in most query schemas it significantly slows down the slowest task. Making the fastest task faster isn't often worth it if the overall query time goes up.
>
> Also, if you want to, you can send me the yarn logs -applicationId output and the desc formatted of the table, which will help me understand what's happening better.
>
> Cheers, Gopal

Attached container log excerpt (the GC statistics and heap-layout dump were garbled in the archive and are elided):

Container: container_1424363133313_0082_01_03 on compute-test-1-2.testlocal_45454
LogType:syslog Log Upload Time:21-Apr-2015 14:17:54 LogLength:1355
Log Contents:
2015-04-21 14:17:40,208 INFO [main] task.TezChild: TezChild starting
2015-04-21 14:17:41,856 INFO [main] task.TezChild: PID, containerIdentifier: 15169, container_1424363133313_0082_01_03
2015-04-21 14:17:41,985 INFO [main] impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-04-21 14:17:42,146 INFO [main] impl.MetricsSystemImpl: Scheduled snapshot period at 60 second(s).
2015-04-21 14:17:42,146 INFO [main] impl.MetricsSystemImpl: TezTask metrics system started
(log truncated in the archive)
Re: merge small orc files
Hi Gopal,

Thanks for your explanation. What could be the reason that SET hive.merge.orcfile.stripe.level=true && alter table concatenate do not work? I have a dynamically partitioned table (stored as orc). I tried alter table concatenate, but it did not work. See my test result:

hive> SET hive.merge.orcfile.stripe.level=true;
hive> alter table orc_merge5a partition(st=0.8) concatenate;
Starting Job = job_1424363133313_0053, Tracking URL = http://service-test-1-2.testlocal:8088/proxy/application_1424363133313_0053/
Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1424363133313_0053
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2015-04-21 12:32:56,165 null map = 0%, reduce = 0%
2015-04-21 12:33:05,964 null map = 100%, reduce = 0%
Ended Job = job_1424363133313_0053
Loading data to table default.orc_merge5a partition (st=0.8)
Moved: 'hdfs://service-test-1-0.testlocal:8020/apps/hive/warehouse/orc_merge5a/st=0.8/00_0' to trash at: hdfs://service-test-1-0.testlocal:8020/user/patcharee/.Trash/Current
Moved: 'hdfs://service-test-1-0.testlocal:8020/apps/hive/warehouse/orc_merge5a/st=0.8/02_0' to trash at: hdfs://service-test-1-0.testlocal:8020/user/patcharee/.Trash/Current
Partition default.orc_merge5a{st=0.8} stats: [numFiles=2, numRows=0, totalSize=1067, rawDataSize=0]
MapReduce Jobs Launched:
Stage-null: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 22.839 seconds
hive> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/orc_merge5a/st=0.8/;
Found 2 items
-rw-r--r-- 3 patcharee hdfs 534 2015-04-21 12:33 /apps/hive/warehouse/orc_merge5a/st=0.8/00_0
-rw-r--r-- 3 patcharee hdfs 533 2015-04-21 12:33 /apps/hive/warehouse/orc_merge5a/st=0.8/01_0

It seems nothing happened when I ran alter table concatenate. Any ideas?

BR, Patcharee

On 21. april 2015 04:41, Gopal Vijayaraghavan wrote:

> Hi, How to set the configuration hive-site.xml to automatically merge
> small orc files (output from a mapreduce job) in hive 0.14?

Hive cannot add work-stages to a map-reduce job. Hive follows merge.mapfiles=true when Hive generates a plan, by adding more work to the plan as a conditional task.

> -rwxr-xr-x 1 root hdfs 29072 2015-04-20 15:23 /apps/hive/warehouse/coordinate/zone=2/part-r-0

This looks like it was written by an MRv2 Reducer and not by the Hive FileSinkOperator, & handled by the MR outputcommitter instead of the Hive MoveTask.

But 0.14 has an option which helps: "hive.merge.orcfile.stripe.level". If that is true (like your setting), then do "alter table concatenate", which effectively concatenates ORC blocks (without decompressing them), while maintaining metadata linkage of start/end offsets in the footer.

Cheers, Gopal
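Gopal's two steps can be condensed into a short Hive session. This is a sketch only (table and partition are taken from the orc_merge5a example in this thread; output not verified):

```sql
-- Enable stripe-level merging so CONCATENATE can stitch ORC files
-- together without decompressing their stripes
SET hive.merge.orcfile.stripe.level=true;

-- Merge the small ORC files of one partition in place
ALTER TABLE orc_merge5a PARTITION (st=0.8) CONCATENATE;

-- Check the resulting file count
dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/orc_merge5a/st=0.8/;
```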
merge small orc files
Hi,

How to set the configuration in hive-site.xml to automatically merge small orc files (output from a mapreduce job) in hive 0.14? This is my current configuration:

hive.merge.mapfiles = true
hive.merge.mapredfiles = true
hive.merge.orcfile.stripe.level = true

However the output from a mapreduce job, which is stored into an orc file, was not merged. This is the output:

-rwxr-xr-x 1 root hdfs 0 2015-04-20 15:23 /apps/hive/warehouse/coordinate/zone=2/_SUCCESS
-rwxr-xr-x 1 root hdfs 29072 2015-04-20 15:23 /apps/hive/warehouse/coordinate/zone=2/part-r-0
-rwxr-xr-x 1 root hdfs 29049 2015-04-20 15:23 /apps/hive/warehouse/coordinate/zone=2/part-r-1
-rwxr-xr-x 1 root hdfs 29075 2015-04-20 15:23 /apps/hive/warehouse/coordinate/zone=2/part-r-2

Any ideas?

BR, Patcharee
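Following Gopal's explanation earlier in this thread, the hive.merge.* settings only take effect in plans Hive itself generates, so files written directly by an external MapReduce job are never merged. A hedged workaround sketch (the columns x, y stand in for the real schema of the coordinate table) is to rewrite the partition through Hive so its merge stage runs:

```sql
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET hive.merge.orcfile.stripe.level=true;

-- Rewriting the MR-produced partition through Hive lets the
-- conditional merge task consolidate the small files.
INSERT OVERWRITE TABLE coordinate PARTITION (zone=2)
SELECT x, y FROM coordinate WHERE zone=2;  -- column list is illustrative
```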
Orc file input format
Hi,

Is there any example of reading/querying an orc file, using the orc file input format, from a Map-Reduce job or a Spark job?

BR, Patcharee
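Not a Map-Reduce/Spark answer as such, but for ad-hoc reads of an existing ORC directory, one option is to mount it as a Hive external table and query it from there. A sketch with a placeholder schema and path:

```sql
-- Schema and location are hypothetical; match them to the actual ORC data
CREATE EXTERNAL TABLE orc_probe (id BIGINT, val STRING)
STORED AS ORC
LOCATION '/apps/hive/warehouse/some_orc_dir';

SELECT * FROM orc_probe LIMIT 10;
```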
Load columns changed name
Hi,

I have a hive table with a column whose name was changed. Pig is not able to load data from this column; it is all empty. Any ideas how to fix it?

BR, Patcharee
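A first diagnostic step (a sketch, not a confirmed fix for this problem) is to compare what the metastore reports for the renamed column with what Hive itself returns; if Hive reads the column but Pig does not, rewriting the data under the new schema is one possible workaround. Table and column names below are placeholders:

```sql
-- Inspect the schema the metastore reports after the rename
DESCRIBE FORMATTED mytable;

-- Does Hive itself see data in the renamed column?
SELECT new_col FROM mytable LIMIT 10;

-- Possible workaround: rewrite the data under the new schema
-- INSERT OVERWRITE TABLE mytable SELECT * FROM mytable;
```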
Re: cannot store value into partition column
After I changed org.apache.hcatalog.pig.HCatStorer() to org.apache.hive.hcatalog.pig.HCatStorer(), it worked.

Patcharee

On 01/14/2015 02:57 PM, Patcharee Thongtra wrote:

> Hi, I am having a weird problem. I created a table in orc format:
>
> create external table cossin (x int, y int, cos float, sin float)
> PARTITIONED BY (zone int)
> stored as orc
> location '/apps/hive/warehouse/wrf_tables/cossin'
> tblproperties ("orc.compress"="ZLIB");
>
> I ran the pig script below to import data into this table 'cossin':
>
> ...
> r_three_dim = FOREACH result_three_dim GENERATE $ZONE as zone: int, result::x as x: int, result::y as y: int, result::cos as cos: float, result::sin as sin: float;
> x = FILTER r_three_dim by x < 5 and y < 5;
> dump x;
> describe x;
> store x into 'cossin' using org.apache.hcatalog.pig.HCatStorer();
>
> Dump x:
> (2,3,3,0.9883806,-0.15199915)
> (2,3,4,0.98836243,-0.15211758)
> (2,4,1,0.98830783,-0.15247186)
> (2,4,2,0.9882811,-0.15264522)
> (2,4,3,0.9882628,-0.15276346)
> (2,4,4,0.98824626,-0.15287022)
> x: {zone: int,x: int,y: int,cos: float,sin: float}
>
> But when I checked the table 'cossin', zone is NULL instead of 2. Any ideas?
>
> BR, Patcharee
cannot store value into partition column
Hi,

I am having a weird problem. I created a table in orc format:

Create table:
create external table cossin (x int, y int, cos float, sin float)
PARTITIONED BY (zone int)
stored as orc
location '/apps/hive/warehouse/wrf_tables/cossin'
tblproperties ("orc.compress"="ZLIB");

I ran the pig script below to import data into this table 'cossin'.

Pig script:
...
r_three_dim = FOREACH result_three_dim GENERATE $ZONE as zone: int, result::x as x: int, result::y as y: int, result::cos as cos: float, result::sin as sin: float;
x = FILTER r_three_dim by x < 5 and y < 5;
dump x;
describe x;
store x into 'cossin' using org.apache.hcatalog.pig.HCatStorer();

Dump x:
(2,3,3,0.9883806,-0.15199915)
(2,3,4,0.98836243,-0.15211758)
(2,4,1,0.98830783,-0.15247186)
(2,4,2,0.9882811,-0.15264522)
(2,4,3,0.9882628,-0.15276346)
(2,4,4,0.98824626,-0.15287022)
x: {zone: int,x: int,y: int,cos: float,sin: float}

But when I checked the table 'cossin', zone is NULL instead of 2. Any ideas?

BR, Patcharee
Re: compare float column
It works. Thanks!

Patcharee

On 01/13/2015 10:15 AM, Devopam Mittra wrote:

Please try the following and report the observation:

WHERE long = CAST(-41.338276 AS FLOAT)

regards
Devopam

On Tue, Jan 13, 2015 at 2:25 PM, Patcharee Thongtra <patcharee.thong...@uni.no> wrote:

> Hi, I have a table with float columns. I tried to query based on a
> condition on a float column (called 'long'), but it failed (nothing
> returned).
>
> hive> select * from test_float where long == -41.338276;
> Status: Finished successfully
> OK
> Time taken: 14.262 seconds
> hive> select long from test_float;
> Status: Finished successfully
> OK
> -41.338276
> Time taken: 6.843 seconds, Fetched: 1 row(s)
>
> Any ideas? I am using hive version 0.13.
>
> BR, Patcharee

--
Devopam Mittra
Life and Relations are not binary
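Devopam's cast works because a bare decimal literal such as -41.338276 is treated as a DOUBLE, and the FLOAT column value widened to DOUBLE does not equal it bit-for-bit. Two hedged variants (the tolerance value is illustrative):

```sql
-- Force the literal down to the column's FLOAT precision
SELECT * FROM test_float WHERE long = CAST(-41.338276 AS FLOAT);

-- Or sidestep exact floating-point equality with a tolerance
SELECT * FROM test_float WHERE abs(long - (-41.338276)) < 1e-4;
```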
compare float column
Hi,

I have a table with float columns. I tried to query based on a condition on a float column (called 'long'), but it failed (nothing returned).

hive> select * from test_float where long == -41.338276;
Status: Finished successfully
OK
Time taken: 14.262 seconds
hive> select long from test_float;
Status: Finished successfully
OK
-41.338276
Time taken: 6.843 seconds, Fetched: 1 row(s)

Any ideas? I am using hive version 0.13.

BR, Patcharee
Backend error : java.lang.String cannot be cast to java.lang.Integer
Hi,

I am using hive with pig. In my pig script I tried to query and sort data from a hive table as below:

query = load 'fino31' USING org.apache.hive.hcatalog.pig.HCatLoader();
z_query = FILTER query BY zone==2 and z >= 2 and z <= 9;
z_order = ORDER z_query by z;
dump z_order;

The column 'z' is an integer, but I got an error:

ERROR grunt.Grunt: ERROR 1066: Unable to open iterator for alias z_order. Backend error : java.lang.String cannot be cast to java.lang.Integer

Any ideas?

Patcharee
Re: alter table add more columns
Hi,

Can I add a partition column?

Patcharee

On 06/02/2014 10:25 PM, Mohammad Tariq wrote:

Hi Patcharee,

You can definitely add new columns. This is how it is done:

ALTER TABLE table_name ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)

For more info on Hive DDL you can visit this link <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL>.

Warm regards,
Mohammad Tariq
cloudfront.blogspot.com

On Tue, Jun 3, 2014 at 1:36 AM, Patcharee Thongtra <patcharee.thong...@uni.no> wrote:

> Hi, I have a table named fino31 with the following schema:
>
> hive> describe fino31;
> OK
> date    timestamp
> zone    int
> z       int
> u       float
> v       float
> th      float
> qv      float
> ph      float
>
> # Partition Information
> # col_name    data_type    comment
> zone    int
> z       int
> u       float
> v       float
> th      float
> qv      float
> ph      float
>
> Is it possible to alter this table later by adding more columns?
>
> Patcharee
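As a sketch of the distinction in this thread (the column name w is hypothetical): regular columns can be appended with ALTER TABLE, but to my knowledge Hive offers no statement for adding a new partition key to an existing table, so the usual route is a new table plus a reload:

```sql
-- Appending a regular column works
ALTER TABLE fino31 ADD COLUMNS (w FLOAT COMMENT 'hypothetical new column');

-- Adding a partition key does not; a common workaround is to recreate:
-- CREATE TABLE fino31_new (...) PARTITIONED BY (zone INT, z INT, extra INT) STORED AS ORC;
-- then INSERT ... SELECT from the old table.
```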
alter table add more columns
Hi,

I have a table named fino31 with the following schema:

hive> describe fino31;
OK
date    timestamp
zone    int
z       int
u       float
v       float
th      float
qv      float
ph      float

# Partition Information
# col_name    data_type    comment
zone    int
z       int
u       float
v       float
th      float
qv      float
ph      float

Is it possible to alter this table later by adding more columns?

Patcharee
java.lang.TypeNotPresentException: Type timestamp not present
Hi,

I am using hive 0.13 and I created a table with a column 'date' in timestamp format. When I used a pig script to store data into the table, I got the exception:

java.lang.TypeNotPresentException: Type timestamp not present

This is my pig script:

result = FOREACH result_raw GENERATE ToDate(dateString, '-MM-dd HH:mm:ss') as date:DateTime, zone, z, u, v, th, qv, ph;
store result into 'fino31' using org.apache.hcatalog.pig.HCatStorer();

This link https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-DataTypesSupportedinHive0.13.0andLaterReleases.1 mentions that hive 0.13 supports timestamp, and that in pig we should provide DateTime to store into a timestamp column. I did as mentioned but I still got the exception. Any suggestion is appreciated.

Patcharee