I am running Hive on Tez. I am able to get the YARN application ID for a Hive 
query by submitting the query through Hive JDBC and reading the query log from 
HiveStatement.

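import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import org.apache.hive.jdbc.HiveStatement;

Connection con =
    DriverManager.getConnection("jdbc:hive2://abc:10000/default", "xyz", "");
HiveStatement stmt = (HiveStatement) con.createStatement();
String sql = "SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID";
ResultSet res = stmt.executeQuery(sql);
String yarn_app_id = "";

// Scan the query log for the line that reports the YARN application ID.
for (String log : stmt.getQueryLog()) {
    if (log.contains("App id")) {
        // Skip past "App id " and drop the trailing character of the line.
        yarn_app_id = log.substring(log.indexOf("App id") + 7, log.length() - 1);
    }
}

System.out.println("YARN Application ID: " + yarn_app_id);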

Now I am trying to find the Tez DAG ID for the query.
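
The substring arithmetic above depends on the exact layout of the log line. A regular expression is a little more tolerant. The following is a minimal Python 3 sketch, not a tested recipe: the sample log lines are illustrative rather than captured from a real cluster, and the helper names are hypothetical. It also assumes that Tez DAG IDs take the form dag_<clusterTimestamp>_<sequence>_<dagIndex>, reusing the two numeric parts of the application ID, so only the prefix can be derived. The DAG index itself has to come from somewhere else, because a single Tez session can run several DAGs.

```python
import re

# YARN application IDs look like application_<clusterTimestamp>_<sequence>.
APP_ID_RE = re.compile(r"application_(\d+)_(\d+)")


def extract_app_id(log_lines):
    """Return the first YARN application ID found in the log lines, or None."""
    for line in log_lines:
        match = APP_ID_RE.search(line)
        if match:
            return match.group(0)
    return None


def dag_id_prefix(app_id):
    """Build the 'dag_<clusterTimestamp>_<sequence>_' prefix for an application ID.

    Assumption: Tez DAG IDs reuse the application's timestamp and sequence
    number and append a per-DAG index, which this function cannot know.
    """
    match = APP_ID_RE.fullmatch(app_id)
    if match is None:
        raise ValueError("not a YARN application ID: %r" % app_id)
    return "dag_%s_%s_" % match.groups()


# Illustrative log lines (hypothetical values).
sample_log = [
    "INFO  : Tez session hasn't been created yet. Opening session",
    "INFO  : Status: Running (Executing on YARN cluster with App id "
    "application_1468000000000_0042)",
]
app_id = extract_app_id(sample_log)
print(app_id)
print(dag_id_prefix(app_id))
```

Any DAG ID reported elsewhere (for example in the Tez UI) that starts with this prefix belongs to the same YARN application.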


From: Gerber, Bryan W [mailto:bryan.ger...@pnnl.gov]
Sent: Monday, July 18, 2016 1:47 PM
To: user@hive.apache.org
Subject: RE: Yarn Application ID for Hive query

Making Hive look like a normal SQL database is the goal of libraries like this, 
so it makes sense that the abstraction wouldn't leak a concept like an 
application ID, especially because not all Hive queries generate a YARN 
application.

That said, we went through this with JDBC access to Hive a while back to allow 
our user interface to cancel a query. The only relevant discussion I found was 
here: 
http://grokbase.com/t/cloudera/hue-user/1373c258xg/how-hue-beeswax-is-able-to-read-the-hadoop-job-id-that-gets-generated-by-hiveserver2

We are using this method, plus a background task that polls the YARN resource 
manager API to find the job with the corresponding hive.session.id. It is a lot 
of work for something that seems very simple. It would be nice to have access 
to a command or API call in HiveServer2 similar to MySQL's "SHOW PROCESSLIST" 
(and equivalent commands in most other databases).
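
The polling step described above could be sketched roughly as follows in Python 3. This is only an illustration under stated assumptions: it uses the ResourceManager REST endpoint /ws/v1/cluster/apps, it assumes the Hive session ID shows up in the YARN application name (as in "HIVE-<session id>" for Hive on Tez; other engines may need a different field), and the function names, sample response and session ID are all hypothetical.

```python
import json
import urllib.parse
import urllib.request


def fetch_running_apps(rm_base_url):
    """Fetch RUNNING applications from the ResourceManager REST API.

    rm_base_url is something like "http://resourcemanager:8088" (assumed).
    """
    query = urllib.parse.urlencode({"states": "RUNNING"})
    url = rm_base_url.rstrip("/") + "/ws/v1/cluster/apps?" + query
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def apps_for_session(apps_response, session_id):
    """Return IDs of applications whose name contains the Hive session ID."""
    # The "apps" value can be null when no application matches the filter.
    apps = (apps_response.get("apps") or {}).get("app") or []
    return [app["id"] for app in apps if session_id in app.get("name", "")]


# Hypothetical response body, trimmed to the fields used here.
sample_response = {
    "apps": {
        "app": [
            {"id": "application_1468000000000_0041", "name": "some other job"},
            {
                "id": "application_1468000000000_0042",
                "name": "HIVE-0f3c2a1e-5b7d-4c8e-9a6f-1d2e3f4a5b6c",
            },
        ]
    }
}
print(apps_for_session(sample_response, "0f3c2a1e-5b7d-4c8e-9a6f-1d2e3f4a5b6c"))
```

In a background task, fetch_running_apps would be called on a timer and its result passed to apps_for_session. Keeping the matching logic separate from the HTTP call makes it easy to change which field is compared.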

From: Amit Bajpai [mailto:amit.baj...@flextronics.com]
Sent: Thursday, July 14, 2016 10:22 PM
To: user@hive.apache.org
Subject: Yarn Application ID for Hive query

Hi,

I am using the Python program below to run a Hive query. How can I get the YARN 
application ID for the query execution from within the Python program?

import pyhs2

with pyhs2.connect(host='abc.sac.com',
               port=10000,
               authMechanism="PLAIN",
               user='amit',
               password='amit',
               database='default') as conn:
    with conn.cursor() as cur:
        # Execute the query
        cur.execute("SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID")

        # Fetch and print the result rows
        for i in cur.fetch():
            print(i)

Thanks
Amit


Legal Disclaimer:
The information contained in this message may be privileged and confidential. 
It is intended to be read only by the individual or entity to whom it is 
addressed or by their designee. If the reader of this message is not the 
intended recipient, you are on notice that any distribution of this message, in 
any form, is strictly prohibited. If you have received this message in error, 
please immediately notify the sender and delete or destroy any copy of this 
message!

