AppId is missing in the latest hive version (3.1.2)
Hi Folks, I just found that the YARN appId is missing in the beeline log when using hive version 3.1.2. This is very inconvenient for troubleshooting. What was the rationale for removing it? Could you add it back to the beeline log? -- Best Regards Jeff Zhang
no mr-job execution info in beeline
In beeline, I could not see the job execution info (like job progress), even though I have already set the following property in hive-site.xml. Could anyone help me figure out how to diagnose such issues? How can I check whether hive server2 picked up the correct configuration? hive.server2.logging.operation.level=VERBOSE I only see the following log in beeline: WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. -- Best Regards Jeff Zhang
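For reference, a sketch of the hive-site.xml settings involved: operation logging must be enabled in addition to choosing the level, so it is worth double-checking that both properties are set on the HiveServer2 side.

```xml
<!-- hive-site.xml on the HiveServer2 host; restart HiveServer2 after changing -->
<property>
  <name>hive.server2.logging.operation.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.logging.operation.level</name>
  <value>VERBOSE</value>
</property>
```

One way to check which value HiveServer2 actually picked up is to run `set hive.server2.logging.operation.level;` from a beeline session.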
Does hive metastore service support proxy user access ?
I tried to access the hive metastore service using a proxy user, but did not succeed. I just wonder whether the hive metastore supports this kind of access?

16/03/15 08:57:57 DEBUG security.UserGroupInformation: PrivilegedAction as:jeff (auth:PROXY) via l...@example.com (auth:KERBEROS) from:org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
16/03/15 08:57:57 ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
        at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
        at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:420)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
        at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:166)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)

-- Best Regards Jeff Zhang
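For what it's worth, Hadoop services (including a Kerberos-secured metastore) only honor impersonation if the authenticated principal is whitelisted as a proxy user. A sketch of the relevant core-site.xml entries on the metastore side; the short name `superuser` is a placeholder, not taken from the original thread:

```xml
<!-- core-site.xml read by the metastore service; "superuser" is a
     placeholder for the Kerberos principal's short name -->
<property>
  <name>hadoop.proxyuser.superuser.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.superuser.groups</name>
  <value>*</value>
</property>
```

That said, the GSSException above ("Failed to find any Kerberos tgt") is raised before any proxy-user check, so it is worth confirming first (e.g. with klist) that the real user has a valid TGT.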
Re: [ANNOUNCE] New Hive Committer - Wei Zheng
Congratulations, Wei ! On Thu, Mar 10, 2016 at 3:27 PM, Lefty Leverenz <leftylever...@gmail.com> wrote: > Congratulations! > > -- Lefty > > On Wed, Mar 9, 2016 at 10:30 PM, Dmitry Tolpeko <dmtolp...@gmail.com> wrote: >> Congratulations, Wei! >> On Thu, Mar 10, 2016 at 5:48 AM, Chao Sun <sunc...@apache.org> wrote: >>> Congratulations! >>> On Wed, Mar 9, 2016 at 6:44 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote: >>>> Congratulations Wei! >>>> On Mar 9, 2016, at 8:43 PM, Sergey Shelukhin <ser...@hortonworks.com> wrote: >>>> Congrats! >>>> From: Szehon Ho <sze...@cloudera.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Wednesday, March 9, 2016 at 17:40 To: "user@hive.apache.org" <user@hive.apache.org> Cc: "d...@hive.apache.org" <d...@hive.apache.org>, "w...@apache.org" <w...@apache.org> Subject: Re: [ANNOUNCE] New Hive Committer - Wei Zheng >>>> Congratulations Wei! >>>> On Wed, Mar 9, 2016 at 5:26 PM, Vikram Dixit K <vik...@apache.org> wrote: The Apache Hive PMC has voted to make Wei Zheng a committer on the Apache Hive Project. Please join me in congratulating Wei. >>>> Thanks >>>> Vikram. -- Best Regards Jeff Zhang
Fwd: Wrong column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join
+ hive mail list, more likely a hive issue. -- Forwarded message -- From: GAO Chi <chi@microfun.com> Date: Tue, Mar 1, 2016 at 12:24 PM Subject: Wrong column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join To: u...@tez.apache.org Hi all, We encountered strange behavior after upgrading to HIVE 2.0.0 + TEZ 0.8.2. I simplified our query to this:

SELECT a.key, a.a_one, b.b_one, a.a_zero, b.b_zero
FROM ( SELECT 11 key, 0 confuse_you, 1 a_one, 0 a_zero ) a
LEFT JOIN ( SELECT 11 key, 0 confuse_you, 1 b_one, 0 b_zero ) b
ON a.key = b.key;

The query above produces this unexpected result:

INFO : Status: Running (Executing on YARN cluster with App id application_1456723490535_3653)
INFO : Map 1: 0/1 Map 2: 0/1
INFO : Map 1: 0/1 Map 2: 0(+1)/1
INFO : Map 1: 0(+1)/1 Map 2: 0(+1)/1
INFO : Map 1: 0(+1)/1 Map 2: 1/1
INFO : Map 1: 1/1 Map 2: 1/1
INFO : Completed executing command(queryId=hive_20160301115630_0a0dbee5-ba4b-45e7-b027-085f655640fd); Time taken: 10.225 seconds
INFO : OK
+--------+----------+----------+-----------+-----------+
| a.key  | a.a_one  | b.b_one  | a.a_zero  | b.b_zero  |
+--------+----------+----------+-----------+-----------+
| 11     | 1        | 0        | 0         | 1         |
+--------+----------+----------+-----------+-----------+

If you change the constant value of subquery b's confuse_you column from 0 to 2, the problem disappears. The plan returned from EXPLAIN shows the incorrect one picking _col1 and _col2, while the correct one picks _col2 and _col3 from subquery b. It seems it cannot distinguish two columns with the same constant value? Has anyone encountered a similar problem? Thanks! Chi -- Best Regards Jeff Zhang
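For reference, this is the result the query should produce; the snippet below reproduces it with Python's built-in sqlite3 (not Hive, but the left-join semantics here are standard SQL):

```python
import sqlite3

# The simplified query from the report, run against an in-memory SQLite DB.
query = """
SELECT a.key, a.a_one, b.b_one, a.a_zero, b.b_zero
FROM (SELECT 11 key, 0 confuse_you, 1 a_one, 0 a_zero) a
LEFT JOIN (SELECT 11 key, 0 confuse_you, 1 b_one, 0 b_zero) b
ON a.key = b.key
"""
row = sqlite3.connect(":memory:").execute(query).fetchone()
print(row)  # expected: (11, 1, 1, 0, 0); the Hive 2.0.0 + Tez run above returned (11, 1, 0, 0, 1)
```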
Re: Getting dot files for DAGs
Hi James, It is under the working directory of the yarn container (it should be the first container, which is the AM). Best Regard, Jeff Zhang From: James Pirz <james.p...@gmail.com> Reply-To: "u...@tez.apache.org" <u...@tez.apache.org> Date: Thursday, October 1, 2015 at 8:29 AM To: "u...@tez.apache.org" <u...@tez.apache.org> Cc: "user@hive.apache.org" <user@hive.apache.org> Subject: Getting dot files for DAGs I am using Tez 0.7.0 on Hadoop 2.6 to run Hive queries. I am interested in checking the DAGs for my queries visually, and I realized that I can do that with graphviz once I get the "dot" files of my DAGs. My issue is that I cannot find those files; they are not in the log directory of Yarn or Hadoop or under /tmp. Any hint as to where I can find those files would be great. Do I need to add any settings to my tez-site.xml in order to enable generating them? Thanks.
Re: Hive Compile mode
Use explain https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain On Thu, Sep 10, 2015 at 11:07 AM, Raajay <raaja...@gmail.com> wrote: > Is it possible to use Hive only in compile mode ( and not execute the > queries) ? > > The output here would be a DAG say to be executed on TEZ later. > > Thanks > Raajay > -- Best Regards Jeff Zhang
Re: Getting error while performing Insert query
What is your hive sql? And please check hive.log, which may have more info. (By default hive.log is located at /tmp/${user}/hive.log) Best Regard, Jeff Zhang From: Sateesh Karuturi <sateesh.karutu...@gmail.com> Reply-To: "u...@tez.apache.org" <u...@tez.apache.org> Date: Wednesday, September 9, 2015 at 1:39 PM To: "user@hive.apache.org" <user@hive.apache.org>, "u...@tez.apache.org" <u...@tez.apache.org> Subject: Getting error while performing Insert query Hello, I am using hive 1.1 and tez 0.7. Whenever I try to INSERT data into a hive table using tez via java I get the following error: Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_TABLE_OR_COL not supported in insert/values
Re: Is YSmart integrated into Hive on tez ?
+ dev mail list The original correlation optimization might be designed for mr engine. But similar optimization could be applied for tez too. Is there any existing jira to track that ? On Tue, Sep 1, 2015 at 1:58 PM, Jeff Zhang <zjf...@gmail.com> wrote: > Hi Pengcheng, > > Is there reason why the correlation optimization disabled in tez ? > > And even when I change the code to enable the correlation optimization in > tez. I still get the same query plan. > > >>> Vertex dependency in root stage > >>> Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE) > >>> Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > > On Tue, Sep 1, 2015 at 1:14 AM, Pengcheng Xiong <pxi...@apache.org> wrote: > >> Hi Jeff, >> >> From code base point of view, YSmart is integrated into Hive on Tez >> because it is one of the optimization of the current Hive. However, from >> the execution point of view, it is now disabled when Hive is running on >> Tez. You may take look at the source code of Hive >> >> Optimizer.java, L175-180: >> {code} >> >> if(HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVEOPTCORRELATION) && >> >> !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVEGROUPBYSKEW) >> && >> >> !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars. >> HIVE_OPTIMIZE_SKEWJOIN_COMPILETIME) && >> >> !isTezExecEngine) { >> >> transformations.add(new CorrelationOptimizer()); >> >> } >> {code} >> >> Hope it helps. >> >> Best >> Pengcheng Xiong >> >> >> On Mon, Aug 31, 2015 at 12:56 AM, Jeff Zhang <zjf...@gmail.com> wrote: >> >>> The reason why I ask this question is that when I execute the following >>> sql, it will generated a query plan with 4 vertices. But as my >>> understanding if YSmart is integrated into hive, it should only take 3 >>> vertices since the join key and group by key are the same. Anybody know >>> this ? 
Thanks >>> >>> >>> >> insert overwrite directory '/tmp/jzhang/1' select o.o_orderkey as >>> orderkey,count(1) from lineitem l >> join orders o on >>> l.l_orderkey=o.o_orderkey group by o.o_orderkey; >>> >>> *YSmart Hive Jira* >>> >>> https://issues.apache.org/jira/browse/HIVE-2206 >>> >>> >>> >>> >>> -- >>> Best Regards >>> >>> Jeff Zhang >>> >> >> > > > -- > Best Regards > > Jeff Zhang > -- Best Regards Jeff Zhang
Re: Hive on tez error
Please refer to this wiki page: https://cwiki.apache.org/confluence/display/TEZ/How+to+Diagnose+Tez+App Best Regard, Jeff Zhang From: Sateesh Karuturi <sateesh.karutu...@gmail.com> Reply-To: "u...@tez.apache.org" <u...@tez.apache.org> Date: Friday, August 28, 2015 at 12:33 AM To: "user@hive.apache.org" <user@hive.apache.org>, "u...@tez.apache.org" <u...@tez.apache.org> Subject: Hive on tez error I am trying to connect to the hive database (execution.engine value changed to tez) using Java code... In the case of a select query it works, but in the case of INSERT I get an error. The error looks like: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask Please help me out
Re: tez error
+ hive mail list. Might be a hive bug. Best Regard, Jeff Zhang From: hanked...@emar.com Reply-To: user <u...@tez.apache.org> Date: Thursday, August 27, 2015 at 5:51 PM To: user <u...@tez.apache.org> Subject: tez error Hi all, I built tez-0.7.0 against 2.6.0-cdh5.4.0, deployed it, and the wordcount demo runs successfully. But hive on tez has problems. Error information:

Status: Running (Executing on YARN cluster with App id application_1440664563598_0005)

VERTICES    STATUS   TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
Map 1       FAILED      -1          0        0       -1       0       0
Reducer 2   KILLED       1          0        0        1       0       0
VERTICES: 00/02 [----------] 0%  ELAPSED TIME: 0.24 s

Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1440664563598_0005_1_00, diagnostics=[Vertex vertex_1440664563598_0005_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: test initializer failed, vertex=vertex_1440664563598_0005_1_00 [Map 1], java.lang.IllegalArgumentException: Illegal Capacity: -12185
        at java.util.ArrayList.<init>(ArrayList.java:156)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:330)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:129)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1440664563598_0005_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1440664563598_0005_1_01 [Reducer 2] killed/failed due to:null]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

hanked...@emar.com
Re: execution error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.teztask error on hive on tez
Please check the hive log for more useful info; it is located at /tmp/${user}/hive.log by default. Best Regard, Jeff Zhang From: Sateesh Karuturi <sateesh.karutu...@gmail.com> Reply-To: "u...@tez.apache.org" <u...@tez.apache.org> Date: Wednesday, July 29, 2015 at 5:54 PM To: "user@hive.apache.org" <user@hive.apache.org>, "u...@tez.apache.org" <u...@tez.apache.org> Subject: execution error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.teztask error on hive on tez I am using hive 1.0 and tez 0.7. Whenever I perform an insert query it returns the following error: execution error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Re: multiple users for hive access
Have you tried to start the hive cli as these 2 users? What issue did you see? On Wed, Jul 8, 2015 at 11:50 AM, Jack Yang <j...@uow.edu.au> wrote: Thanks, mate. I have mysql running as my metadata store. What is the next step? When I start hive (0.13 version), I just type "hive" in my command line. Now, say I have two users: A and B. I would like A and B to access hive tables using hive-cli. How can I do that? *From:* Jeff Zhang [mailto:zjf...@gmail.com] *Sent:* Tuesday, 7 July 2015 4:22 PM *To:* user@hive.apache.org *Subject:* Re: multiple users for hive access Hive supports the multiple-user scenario as long as the metadata store supports multiple-user access. By default hive uses derby in embedded mode, which doesn't support multiple-user access. You can configure derby in server mode or use another metadata store like mysql. Here's the tutorial for how to configure derby server mode: https://cwiki.apache.org/confluence/display/Hive/HiveDerbyServerMode On Tue, Jul 7, 2015 at 1:50 PM, Jack Yang <j...@uow.edu.au> wrote: Hi all, I would like to have multiple users access hive. Has anyone tried that before? Is there any tutorial or link I can study? Best regards, Jack -- Best Regards Jeff Zhang
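As a concrete illustration of the MySQL-backed metastore mentioned in the thread, these are the standard hive-site.xml connection properties (the host, database name, and credentials below are hypothetical):

```xml
<!-- hive-site.xml: point the metastore at a shared MySQL database
     instead of embedded Derby; values are placeholders -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-host:3306/hive_metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>
```

With a shared metastore like this, users A and B can each run hive-cli under their own OS account, provided both can reach the MySQL server and have appropriate HDFS permissions on the warehouse directory.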
Re: Hive With tez
Regarding the mapper task number, Hive on tez is very similar to Hive on MapReduce. One difference is that hive on tez can group splits together, which may use fewer tasks than mapreduce. What issues did you see when you used hive on tez? On Sun, Jul 5, 2015 at 10:39 PM, saurabh <mpp.databa...@gmail.com> wrote: Hi, We are in the process of exploring TEZ for Hive 0.14. Needed some pointers to start on Hive with Tez. E.g. in Hive the HDFS block size plays a vital role in determining the number of mappers, and later the independent execution of mappers can accelerate processing substantially. I understand this is a very vast topic and cannot be fully described here, however some quick pointers would be helpful. I am currently working on: query vectorization and CBO with ORC tables. Thanks, Saurabh -- Best Regards Jeff Zhang
Re: hive -e run tez query error
But keeping that client cache disabled when running against trunk generally kills queries all the time with occasional errors like these. I think the tez staging directory is supposed to be deleted when the JVM exits (deleteOnExit), so why would killing the query cause the path to be deleted? On Sat, Jun 27, 2015 at 11:48 AM, Gopal Vijayaraghavan <gop...@apache.org> wrote: > perhaps deleteOnExit() is set somewhere fs.cache.disable settings from hdfs-site.xml are usually to blame for that. Till hive-1.0, HiveServer2 used to leak filesystem objects, so the cache was disabled. 2015-06-25 15:54:33,673 INFO FSNamesystem.audit: allowed=true ugi=lujian (auth:SIMPLE) ip=/10.17.28.11 cmd=delete src=/user/lujian/lujian/_tez_session_dir/abb91da9-ac07-4024-a09f-8622ee1caedf dst=null perm=null But keeping that client cache disabled when running against trunk generally kills queries all the time with occasional errors like these. Cheers, Gopal -- Best Regards Jeff Zhang
Is it expected behavior ?
I used the following command to check the command-line options of beeline: beeline --help It does display the help info, but after that it continues into beeline. I think it should only display the help info and should not go into beeline. Is this expected behavior?

   --truncateTable=[true/false]       truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER        specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL                  set the transaction isolation level
   --nullemptystring=[true/false]     set to true to get historic behavior of printing null as empty string
   --addlocaldriverjar=DRIVERJARNAME  Add driver jar file in the beeline client side
   --addlocaldrivername=DRIVERNAME    Add drvier name needs to be supported in the beeline client side
   --help                             display this message
Beeline version 2.0.0-SNAPSHOT by Apache Hive
beeline>

-- Best Regards Jeff Zhang
Re: hive -e run tez query error
to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]], TaskAttempt 2 failed, info=[Container container_1433219182593_252390_01_05 finished with diagnostics set to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]], TaskAttempt 3 failed, info=[Container container_1433219182593_252390_01_06 finished with diagnostics set to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1433219182593_252390_2_00 [Map 1] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:0 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask *Using hive cli to execute the same query, no exception is thrown.* *Using hive -e to execute the same query on mr, no exception is thrown.* -- r7raul1...@163.com -- Best Regards Jeff Zhang
Re: Hive - Tez error with big join - Container expired.
Tez will hold idle containers for a while, but it will also expire a container when it reaches some threshold. Have you set the property tez.am.container.idle.release-timeout-max.millis in tez-site.xml? And can you attach the yarn app log? Best Regard, Jeff Zhang From: Daniel Klinger <d...@web-computing.de> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Thursday, June 18, 2015 at 5:35 AM To: "user@hive.apache.org" <user@hive.apache.org> Subject: Hive - Tez error with big join - Container expired. Hi all, I have a pretty big Hive query. I'm joining over 3 Hive tables which have thousands of lines each, and grouping this join by several columns. In the Hive shell this query only reaches about 80%. After about 1400 seconds it cancels with the following error:

Status: Failed
Vertex failed, vertexName=Map 2, vertexId=vertex_1434357133795_0008_1_01, diagnostics=[Task failed, taskId=task_1434357133795_0008_1_01_33, diagnostics=[TaskAttempt 0 failed, info=[Container container_1434357133795_0008_01_39 finished while trying to launch. Diagnostics: [Container failed. Container expired since it was unused]], TaskAttempt 1 failed, info=[Container container_1434357133795_0008_01_55 finished while trying to launch. Diagnostics: [Container failed. Container expired since it was unused]], TaskAttempt 2 failed, info=[Container container_1434357133795_0008_01_72 finished while trying to launch. Diagnostics: [Container failed. Container expired since it was unused]], TaskAttempt 3 failed, info=[Container container_1434357133795_0008_01_000101 finished while trying to launch. Diagnostics: [Container failed. Container expired since it was unused]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1434357133795_0008_1_01 [Map 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

My yarn resource manager is at 100% during the whole execution (using all of the 300 GB of memory). I tried to extend the lifetime of my containers with the following setting in yarn-site.xml, but with no success: yarn.resourcemanager.rm.container-allocation.expiry-interval-ms = 120 After this change my query stays at 0% over thousands of seconds. The query itself works (tested with less data). How can I solve this problem? Thanks for your help. Greetz DK
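For reference, a sketch of the tez-site.xml knobs the reply refers to; the values here are illustrative placeholders, not recommendations:

```xml
<!-- tez-site.xml: how long the Tez AM holds idle containers before releasing them -->
<property>
  <name>tez.am.container.idle.release-timeout-min.millis</name>
  <value>10000</value>
</property>
<property>
  <name>tez.am.container.idle.release-timeout-max.millis</name>
  <value>20000</value>
</property>
```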
Re: error on hive insert query
You can refer to the following link to figure out whether this is a tez problem: https://cwiki.apache.org/confluence/display/TEZ/How+to+Diagnose+Tez+App Best Regard, Jeff Zhang From: Sateesh Karuturi <sateesh.karutu...@gmail.com> Reply-To: "u...@tez.apache.org" <u...@tez.apache.org> Date: Wednesday, June 17, 2015 at 1:19 AM To: "user@hive.apache.org" <user@hive.apache.org>, "u...@tez.apache.org" <u...@tez.apache.org> Subject: error on hive insert query I am using hive 1.0.0 and tez 0.5.2. When I set the hive.execution.engine value in hive-site.xml to tez, select queries work well, but in the case of insert I get an error. The query is: insert into table tablename values(intvalue,'string value'); and the error is: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Re: hive tez error
Sorry, to make it more clear: the applicationId is a placeholder for the applicationId of your hive session. You can look for it in the RM web UI. Best Regard, Jeff Zhang From: Sateesh Karuturi <sateesh.karutu...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Monday, June 8, 2015 at 3:52 PM To: "user@hive.apache.org" <user@hive.apache.org> Subject: Re: hive tez error When I try to check, I get this: hadoop@localhost:~$ yarn logs -applicationId options parsing failed: Missing argument for option: applicationId Retrieve logs for completed YARN applications. usage: yarn logs -applicationId <application ID> [OPTIONS] general options are: -appOwner <Application Owner> AppOwner (assumed to be current user if not specified) -containerId <Container ID> ContainerId (must be specified if node address is specified) -nodeAddress <Node Address> NodeAddress in the format nodename:port (must be specified if container id is specified) On Mon, Jun 8, 2015 at 1:20 PM, Jianfeng (Jeff) Zhang <jzh...@hortonworks.com> wrote: Could you check the yarn app logs? By invoking the command: yarn logs -applicationId <applicationId> Best Regard, Jeff Zhang From: Sateesh Karuturi <sateesh.karutu...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Monday, June 8, 2015 at 3:45 PM To: "user@hive.apache.org" <user@hive.apache.org> Subject: hive tez error Getting a FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask error when I am trying to perform an insert operation on hive (set hive.execution.engine=tez). Whenever the hive.execution.engine value is set to mr it works fine. Please help me out
Re: drop table command hang
I think I found the error. When I started hive, it threw a mysql "key too long" exception (only in the log file and not in the client output; this is very unfriendly IMO): 2015-03-14 14:18:49,588 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes On Sat, Mar 14, 2015 at 1:44 PM, Jeff Zhang <zjf...@gmail.com> wrote: It doesn't matter whether I truncate the table, it always hangs there. Very weird. On Wed, Mar 11, 2015 at 3:06 PM, Mich Talebzadeh <m...@peridale.co.uk> wrote: Have you truncated the table before dropping it? I.e. Truncate table table_name Drop table table_name Mich Talebzadeh http://talebzadehmich.wordpress.com *Publications due shortly:* *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache* NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility. *From:* Jeff Zhang [mailto:zjf...@gmail.com] *Sent:* 11 March 2015 06:56 *To:* user@hive.apache.org *Subject:* drop table command hang I invoke a drop table command and it hangs there. Here's the log. I am using mysql, and I can invoke describe and create table commands through the mysql console, so I assume mysql works properly. Can anyone help with this?
Thanks

2015-03-11 14:48:09,441 INFO [main]: ql.Driver (Driver.java:checkConcurrency(161)) - Concurrency mode is disabled, not creating a lock manager
2015-03-11 14:48:09,441 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2015-03-11 14:48:09,441 INFO [main]: ql.Driver (Driver.java:execute(1321)) - Starting command: drop table student_bucketed_s1
2015-03-11 14:48:09,441 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=TimeToSubmit start=1426056489421 end=1426056489441 duration=20 from=org.apache.hadoop.hive.ql.Driver>
2015-03-11 14:48:09,442 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2015-03-11 14:48:09,442 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2015-03-11 14:48:09,442 INFO [main]: ql.Driver (Driver.java:launchTask(1640)) - Starting task [Stage-0:DDL] in serial mode
2015-03-11 14:48:09,442 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_table : db=default tbl=student_bucketed_s1
2015-03-11 14:48:09,443 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=student_bucketed_s1
2015-03-11 14:48:09,458 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_table : db=default tbl=student_bucketed_s1
2015-03-11 14:48:09,458 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=student_bucketed_s1
2015-03-11 14:48:09,474 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: drop_table : db=default tbl=student_bucketed_s1
2015-03-11 14:48:09,474 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=drop_table : db=default tbl=student_bucketed_s1

-- Best Regards Jeff Zhang
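For what it's worth, the "Specified key was too long; max key length is 767 bytes" DataNucleus error is commonly reported when the MySQL metastore database was created with a multi-byte (e.g. UTF-8) default character set. A frequently cited workaround is to alter the metastore database to a latin1 character set; the database name `hive` below is an assumption, not taken from the thread:

```sql
-- run in the MySQL console; "hive" is a placeholder for the metastore database name
ALTER DATABASE hive CHARACTER SET latin1 COLLATE latin1_bin;
```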
Re: drop table command hang
It doesn't matter whether I truncate the table, it always hangs there. Very werid. On Wed, Mar 11, 2015 at 3:06 PM, Mich Talebzadeh m...@peridale.co.uk wrote: Have you truncated the table before dropping it? I Truncate table table_name Drop table rable_name Mich Talebzadeh http://talebzadehmich.wordpress.com *Publications due shortly:* *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache* NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility. *From:* Jeff Zhang [mailto:zjf...@gmail.com] *Sent:* 11 March 2015 06:56 *To:* user@hive.apache.org *Subject:* drop table command hang I invoke a drop table command and it hangs there. Here's the log. I am using mysql and I can invoke describe command and create table through mysql console, so I assume mysql works properly. Can anyone help this ? 
Thanks 2015-03-11 14:48:09,441 INFO [main]: ql.Driver (Driver.java:checkConcurrency(161)) - Concurrency mode is disabled, not creating a lock manager 2015-03-11 14:48:09,441 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,441 INFO [main]: ql.Driver (Driver.java:execute(1321)) - Starting command: drop table student_bucketed_s1 2015-03-11 14:48:09,441 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - /PERFLOG method=TimeToSubmit start=1426056489421 end=1426056489441 duration=20 from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,442 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,442 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,442 INFO [main]: ql.Driver (Driver.java:launchTask(1640)) - Starting task [Stage-0:DDL] in serial mode 2015-03-11 14:48:09,442 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,443 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,458 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,458 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,474 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: drop_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,474 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang 
ip=unknown-ip-addr cmd=drop_table : db=default tbl=student_bucketed_s1 -- Best Regards Jeff Zhang
Re: when start hive could not generate log file
By default, hive.log is located in /tmp/${user}/hive.log Best Regards, Jeff Zhang From: zhangjp smart...@hotmail.com Reply-To: user@hive.apache.org Date: Wednesday, March 11, 2015 at 7:12 PM To: user@hive.apache.org Subject: when start hive could not generate log file When I run the command hive, the message is as follows: [@xxx/]# hive log4j:WARN No appenders could be found for logger (org.apache.hadoop.hive.common.LogUtils). log4j:WARN Please initialize the log4j system properly. Logging initialized using configuration in file:/search/apache-hive-0.13.1-bin/conf/hive-log4j.properties My hive-log4j.properties uses the default template, but when I run find -name hive.log I couldn't find any file.
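As a quick sanity check, the default location can be computed from the stock hive-log4j.properties template, where hive.log.dir defaults to ${java.io.tmpdir}/${user.name}. A minimal sketch (the override shown in the comment assumes the standard hive.log.dir / hive.root.logger properties, not anything specific to this poster's install):

```shell
# Default hive.log location: /tmp/<user>/hive.log
default_log="/tmp/${USER}/hive.log"
echo "Expected default log path: ${default_log}"

# To relocate the log (or raise verbosity), override at startup, e.g.:
#   hive -hiveconf hive.log.dir=/var/log/hive -hiveconf hive.root.logger=INFO,DRFA
```

If `find -name hive.log` turns up nothing, it is worth running it against /tmp explicitly (e.g. `find /tmp -name hive.log`), since `find` only searches the current directory by default.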
drop table command hang
I invoke a drop table command and it hangs there. Here's the log. I am using MySQL, and I can run describe and create table commands through the mysql console, so I assume MySQL works properly. Can anyone help with this? Thanks 2015-03-11 14:48:09,441 INFO [main]: ql.Driver (Driver.java:checkConcurrency(161)) - Concurrency mode is disabled, not creating a lock manager 2015-03-11 14:48:09,441 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,441 INFO [main]: ql.Driver (Driver.java:execute(1321)) - Starting command: drop table student_bucketed_s1 2015-03-11 14:48:09,441 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - /PERFLOG method=TimeToSubmit start=1426056489421 end=1426056489441 duration=20 from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,442 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,442 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver 2015-03-11 14:48:09,442 INFO [main]: ql.Driver (Driver.java:launchTask(1640)) - Starting task [Stage-0:DDL] in serial mode 2015-03-11 14:48:09,442 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,443 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,458 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,458 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,474 INFO [main]:
metastore.HiveMetaStore (HiveMetaStore.java:logInfo(743)) - 0: drop_table : db=default tbl=student_bucketed_s1 2015-03-11 14:48:09,474 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(368)) - ugi=jzhang ip=unknown-ip-addr cmd=drop_table : db=default tbl=student_bucketed_s1 -- Best Regards Jeff Zhang
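Since the log shows the hang happens inside the metastore's drop_table call, one common check (a diagnostic sketch using standard MySQL commands, not something from this thread) is to look for a blocked transaction in the MySQL backend while the DROP TABLE is stuck:

```
-- Run in the mysql console against the metastore database:
SHOW FULL PROCESSLIST;      -- look for a long-running statement touching TBLS/PARTITIONS
SHOW ENGINE INNODB STATUS;  -- the TRANSACTIONS section lists lock waits
```

If another session holds a lock on the metastore tables, the drop will wait on it indefinitely, which matches the symptom described above.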
Any tutorial document about how to use the example data
Hi, I noticed there's an examples folder which contains sample data and sample queries, but I didn't find any documentation about how to use these data and queries. Could anyone point me to it? Thanks
Where does hive do sampling in order by ?
Order by usually involves 2 steps (a sampling job and a repartition job), but hive only runs one MR job for order by, so I'm wondering when and where hive does the sampling. Client side? -- Best Regards Jeff Zhang
Re: Hive CLI question
Bala's right, you can execute any shell command in the hive CLI by using !shell_command; On Fri, Jan 2, 2015 at 4:30 AM, Bala Krishna Gangisetty b...@altiscale.com wrote: !clear is another option too. --Bala G. On Thu, Jan 1, 2015 at 12:23 PM, Mohan Krishna mohan.25fe...@gmail.com wrote: Hi Louis I use Ctrl+L as a keyboard shortcut to clear the Hive screen. Thanks Mohan On Fri, Jan 2, 2015 at 1:41 AM, Louis Vincent Frolio frol...@yahoo.com wrote: Does anyone know how to issue a clear screen at the hive prompt? Is this even possible? I am looking for something similar to system clear in MySQL. Thank you, Louis. -- Best Regards Jeff Zhang
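For reference, the ! escape runs an arbitrary shell command from inside the CLI session. A short sketch of what such a session looks like (output elided; the commands after ! are ordinary shell commands, not Hive syntax):

```
hive> !clear;        -- clears the terminal, like `system clear` in MySQL
hive> !date;         -- runs the shell's date command
hive> !ls /tmp;      -- any shell command works
```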
What's official site for howl ?
Hi all, Sorry for bothering this mailing list, but this is the only list I know of that may relate to Howl. Just wondering what the official site or mailing list for Howl is. I'd like to get more information about Howl. -- Best Regards Jeff Zhang
Re: What's official site for howl ?
Thanks for your quick replies, really appreciate it. On Thu, May 5, 2011 at 10:13 AM, Alan Gates ga...@yahoo-inc.com wrote: http://incubator.apache.org/hcatalog/ Howl has been renamed to HCatalog (due to naming conflicts with an existing ow2 project called Howl). Alan. On May 4, 2011, at 7:04 PM, Jeff Zhang wrote: Hi all, Sorry for bothering this mailing list, but this is the only list I know of that may relate to Howl. Just wondering what the official site or mailing list for Howl is. I'd like to get more information about Howl -- Best Regards Jeff Zhang -- Best Regards Jeff Zhang
Find a case that does not make sense in hive
Hi all, I created a table with a partition. Then I imported data into this table without specifying a partition; in this case I cannot retrieve data from the table using a select statement. But I can retrieve data when I import it with a partition specified. So I think that since hive does not allow me to retrieve the data in the first case, it should not allow me to import data without specifying a partition. -- Best Regards Jeff Zhang
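To make the two cases concrete, here is a hypothetical HiveQL sketch (table, column, and file names are made up for illustration):

```sql
CREATE TABLE logs (msg STRING) PARTITIONED BY (dt STRING);

-- Works: the target partition is named, and SELECT finds the rows.
LOAD DATA LOCAL INPATH '/tmp/logs-2010-11-01.txt'
  INTO TABLE logs PARTITION (dt='2010-11-01');

-- The problematic case described above: no PARTITION clause, so in the
-- version discussed the load succeeds but the rows are invisible to SELECT.
LOAD DATA LOCAL INPATH '/tmp/logs-2010-11-01.txt' INTO TABLE logs;
```

The suggestion in this post amounts to rejecting the second statement outright, which is indeed what later Hive versions do for partitioned tables.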
Re: Does hive have batch processing mode ?
Thanks, it works. On Thu, Nov 25, 2010 at 3:49 PM, james warren ja...@rockyou.com wrote: Try the following: % hive -f myhive.file cheers, -James On Wed, Nov 24, 2010 at 11:37 PM, Jeff Zhang zjf...@gmail.com wrote: Hi all, I have a bunch of files and want to import them into one table, one partition per file. Currently, I have to enter each add partition statement in the cli. So I wonder whether hive has a batch processing mode, so that I can put the sql statements in one file and execute the file with one command. Thanks in advance. -- Best Regards Jeff Zhang -- Best Regards Jeff Zhang
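The whole workflow can be sketched as a small shell script (the table name, partition column, and paths are hypothetical): generate one ADD PARTITION statement per data file, then run the resulting script in a single batch with hive -f.

```shell
# Build a script with one ALTER TABLE ... ADD PARTITION statement per file.
sql="/tmp/add_partitions.hql"
: > "${sql}"
for dt in 2010-11-01 2010-11-02 2010-11-03; do
  echo "ALTER TABLE logs ADD PARTITION (dt='${dt}') LOCATION '/data/logs/${dt}';" >> "${sql}"
done
cat "${sql}"

# Execute every statement in one go (requires a working Hive install):
#   hive -f "${sql}"
```

For a one-off statement, `hive -e "ALTER TABLE ..."` works as well without an intermediate file.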
Does hive have batch processing mode ?
Hi all, I have a bunch of files and want to import them into one table, one partition per file. Currently, I have to enter each add partition statement in the cli. So I wonder whether hive has a batch processing mode, so that I can put the sql statements in one file and execute the file with one command. Thanks in advance. -- Best Regards Jeff Zhang