Abhinav Chandel created ATLAS-5274:
--------------------------------------
Summary: [Impala Hook] Self-referencing INSERT OVERWRITE produces
impala_process with empty outputs[], breaking lineage
Key: ATLAS-5274
URL: https://issues.apache.org/jira/browse/ATLAS-5274
Project: Atlas
Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Abhinav Chandel
Assignee: Abhinav Chandel
Problem
When a user runs a self-referencing DML statement in Impala (i.e., the source
and destination table are the same), the Atlas Impala hook creates an
impala_process entity whose outputs[] is empty. The target table is recorded
only in inputs[], not in outputs[]. This breaks the lineage graph for that
operation: the table has inputToProcesses=1 but outputFromProcesses=0, so the
write to the table is missing from lineage and its data lifecycle cannot be
tracked.
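The effect on the table's relationship counts can be illustrated with a small model. This is only a sketch, not Atlas code: the `Process` class and the `lineage_counts` helper are hypothetical, and only the attribute names inputToProcesses and outputFromProcesses come from this report.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical stand-in for an impala_process entity."""
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def lineage_counts(table, processes):
    """Count how many processes read from and how many write to `table`."""
    return {
        "inputToProcesses": sum(table in p.inputs for p in processes),
        "outputFromProcesses": sum(table in p.outputs for p in processes),
    }


# What the hook records today for the self-referencing INSERT OVERWRITE.
actual = [Process(inputs=["target_self_ref"], outputs=[])]
# What the report expects: the same table on both sides of the process.
expected = [Process(inputs=["target_self_ref"], outputs=["target_self_ref"])]

print(lineage_counts("target_self_ref", actual))
print(lineage_counts("target_self_ref", expected))
```

With the empty outputs[] the table is never counted as written by any process, which is why it drops out of the downstream side of the lineage graph.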
Steps to Reproduce
Run the following on an Impala cluster with the Atlas hook enabled:
{code:sql}
CREATE DATABASE IF NOT EXISTS atlas_test_self_only;
CREATE TABLE IF NOT EXISTS atlas_test_self_only.target_self_ref (
  id INT,
  amount INT
);
INSERT INTO atlas_test_self_only.target_self_ref VALUES (1, 100), (2, 200);
INSERT OVERWRITE TABLE atlas_test_self_only.target_self_ref
  SELECT id, CAST(amount + 50 AS INT)
  FROM atlas_test_self_only.target_self_ref
  WHERE amount > 0;
{code}
Expected Behavior
An impala_process entity is created in Atlas with:
* inputs: [target_self_ref] ← source table
* outputs: [target_self_ref] ← same table, as destination
Actual Behavior
An impala_process entity IS created in Atlas, but:
* inputs: [target_self_ref] ← correct
* outputs: [] ← EMPTY — target table missing
Atlas entity observed
* typeName: impala_process
* inputs: ['c16fc913-3cbc-4d86-9c2a-7610b49e212b']
* outputs: []
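A check along the following lines could flag affected entities. It is a minimal sketch that assumes a simplified entity dictionary with only the three fields listed above; the function name `has_broken_lineage` is hypothetical, and a real Atlas REST response may nest these fields differently.

```python
def has_broken_lineage(entity):
    """Return True for an impala_process that records inputs but no outputs."""
    if entity.get("typeName") != "impala_process":
        return False
    return bool(entity.get("inputs")) and not entity.get("outputs")


# Simplified copy of the entity observed in this report.
observed = {
    "typeName": "impala_process",
    "inputs": ["c16fc913-3cbc-4d86-9c2a-7610b49e212b"],
    "outputs": [],
}

print(has_broken_lineage(observed))  # True
```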
Impact
* Lineage graph is broken for all self-referencing ETL patterns in Impala.
* target_self_ref.outputFromProcesses = 0 (should be 1).
* Users cannot track data transformation history for in-place update patterns
such as incremental aggregation, SCD updates, and self-join enrichment.