Shohei Okumiya created TEZ-4569:
-----------------------------------
Summary: SCATTER_GATHER + BROADCAST hangs on DAG Recovery
Key: TEZ-4569
URL: https://issues.apache.org/jira/browse/TEZ-4569
Project: Apache Tez
Issue Type: Improvement
Affects Versions: 0.10.3
Reporter: Shohei Okumiya
Assignee: Shohei Okumiya
Attachments: image-2024-06-11-20-45-12-540.png
A Tez DAG can fail to initialize during DAG recovery when the Application
Master is preempted at an unlucky moment.
The problem typically happens with a Hive Map Join (Broadcast Hash Join) when
the broadcast edge is multi-staged. In the following case, the smaller side
includes one aggregation, so the condition is satisfied.
{code:java}
CREATE TABLE small AS SELECT 1 AS id;
CREATE TABLE big AS SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id;
SELECT *
FROM big
JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
{code}
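One way to check that the query really compiles to a multi-staged broadcast edge is to inspect its plan before running it. The following is only a sketch: it assumes Hive runs on Tez and that automatic Map Join conversion is enabled (both settings are shown explicitly), and the exact plan text depends on the Hive version.
{code:sql}
-- Assumption: these settings are not part of the original report; they only
-- make the preconditions for a Map Join explicit.
SET hive.execution.engine=tez;
SET hive.auto.convert.join=true;

-- Print the plan only; the query itself is not executed.
EXPLAIN
SELECT *
FROM big
JOIN (SELECT id, count(*) AS num FROM small GROUP BY id) s ON big.id = s.id;
{code}
If the condition is met, the edge list in the plan would be expected to show the aggregation on the small side running in its own reducer, which is then broadcast to the vertex that scans big.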
Once it happens, a retried AM fails to configure the Map Join vertex. In the
following case, Map 1 never starts.
{code:java}
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 2 .......... container     SUCCEEDED      1          1        0        0       0       1
Reducer 3 ...... container     SUCCEEDED      1          1        0        0       0       0
Map 1            container  INITIALIZING     -1          0        0       -1       0       0
----------------------------------------------------------------------------------------------
{code}
Tez starts Map 2 and Map 1 once their splits are configured. The hang occurs
when the AM is retried before it starts Reducer 3.
!image-2024-06-11-20-45-12-540.png!