>>(1)- For every TEZ AM it is possible to launch just a single query/DAG at a 
>>time. So within a given AM several DAGs can be executed only in sequential 
>>order (a.k.a. a session), not in parallel. To execute DAGs in parallel we 
>>always need several AMs.

Correct. Today a single AM accepts a new DAG and runs it only when the AM is 
idle. An AM is idle when no DAG is running.
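
To make the idle rule concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not Tez code: the SessionAM class and its submit/finish methods are names made up for illustration.

```python
class SessionAM:
    """Toy model of a session AM: at most one DAG runs at a time."""

    def __init__(self, name):
        self.name = name
        self.running = None   # DAG currently running, or None when idle
        self.completed = []   # DAGs finished so far, in submission order

    def is_idle(self):
        # The AM is idle exactly when no DAG is running.
        return self.running is None

    def submit(self, dag):
        # A new DAG is accepted only while the AM is idle.
        if not self.is_idle():
            raise RuntimeError(f"{self.name} is busy running {self.running}")
        self.running = dag

    def finish(self):
        # Completing the running DAG makes the AM idle again.
        if self.is_idle():
            raise RuntimeError(f"{self.name} has no running DAG")
        self.completed.append(self.running)
        self.running = None
```

In this model a second submit() raises until finish() has been called, so DAGs inside one AM can only run one after another.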

>>(2)- The AM is user-specific, and each user is expected to run queries 
>>through its own AM (or on multiple AMs if there is a need for parallelism).

Correct in a secure cluster. In a non-secure cluster an AM runs as the yarn 
user, which is common to all AMs. In a secure cluster, any entity that has been 
given a client token (for that app attempt) by the RM can communicate with the 
AM. In a non-secure cluster, any entity that has obtained the AM's connection 
information from the RM can communicate with the AM. The AM has an additional 
set of ACLs that determine who can submit, view, or modify DAGs.

>>(3)- Several users can submit their DAGs as the same user (e.g.: through 
>>hiveserver2), but in this case we will still have several AM.

Correct. However, the number of AMs will be determined by the policy of the 
mediating server. It may choose to launch a new AM for every new DAG, or to 
queue DAGs up and round robin through a limited set of AMs, etc.
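
The two policies mentioned above could look roughly like this. Again this is a hedged Python sketch rather than anything HiveServer2 or Tez actually ships: RoundRobinPool and am_per_dag are hypothetical names, and AMs and DAGs are represented by plain strings.

```python
from collections import deque
from itertools import cycle


class RoundRobinPool:
    """Hypothetical policy: queue DAGs over a fixed set of AMs in turn."""

    def __init__(self, am_names):
        # One FIFO queue per AM; each AM still runs its DAGs sequentially.
        self.queues = {name: deque() for name in am_names}
        self._next_am = cycle(list(am_names))

    def submit(self, dag):
        # Pick the next AM in rotation and queue the DAG behind its others.
        am = next(self._next_am)
        self.queues[am].append(dag)
        return am


def am_per_dag(dags):
    """Alternative policy: launch a fresh AM for every DAG."""
    return {f"am-{i}": [dag] for i, dag in enumerate(dags, start=1)}
```

With a pool of two AMs, three submitted DAGs land on am-1, am-2 and am-1 again, so at most two run in parallel. With am_per_dag every DAG gets its own AM and all of them can run at once, at the cost of more AMs.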

Bikas

From: Fabio C. [mailto:anyte...@gmail.com]
Sent: Monday, March 09, 2015 4:31 AM
To: u...@tez.apache.org; user@hive.apache.org
Subject: Parallel queries/dags running in same AM?

Hi all,
I've been using Tez on hive, and I had a chance to hear a conversation that 
mismatches with my present knowledge, can anyone confirm the following 
statement?
(1)- For every TEZ AM it is possible to launch just a single query/DAG at a 
time. So within a given AM several DAGs can be executed only in sequential 
order (a.k.a. a session), not in parallel. To execute DAGs in parallel we 
always need several AMs.
(2)- The AM is user-specific, and each user is expected to run queries through 
its own AM (or on multiple AMs if there is a need for parallelism).
(3)- Several users can submit their DAGs as the same user (e.g.: through 
hiveserver2), but in this case we will still have several AM.

Thanks in advance

Fabio
