jinqi long created HIVE-29614:
---------------------------------
Summary: Incorrect column lineage for multiple window functions
with identical partition keys
Key: HIVE-29614
URL: https://issues.apache.org/jira/browse/HIVE-29614
Project: Hive
Issue Type: Bug
Components: lineage
Affects Versions: 1.1.0
Environment:
Reporter: jinqi long
Fix For: 4.3.0
Column lineage is computed incorrectly when a query contains multiple PTFs
(window functions) with identical partition and order keys. For example:
{code:sql}
create table table_2 as
select
sum(id1) over(partition by key ) sum1,
sum(id2) over(partition by key ) sum2
from table_1;{code}
The current result is:
{code:java}
{
"version": "1.0",
"engine": "tez",
"database": "default",
"hash": "f81777f9774d12cc77dd583ea9ff99b3",
"queryText": "create table table_2 as select\nsum(id1) over(partition by key ) sum1,\nsum(id2) over(partition by key ) sum2\nfrom table_1",
"edges": [
{
"sources": [
2,
3
],
"targets": [
0,
1
],
"expression": "sum(table_1.id1) over (partition by table_1.key order by table_1.key ROWS between unbounded and unbounded)",
"edgeType": "PROJECTION"
}
],
"vertices": [
{
"id": 0,
"vertexType": "COLUMN",
"vertexId": "default.table_2.sum1"
},
{
"id": 1,
"vertexType": "COLUMN",
"vertexId": "default.table_2.sum2"
},
{
"id": 2,
"vertexType": "COLUMN",
"vertexId": "default.table_1.id1"
},
{
"id": 3,
"vertexType": "COLUMN",
"vertexId": "default.table_1.key"
}
]
}{code}
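The current result merges both window functions into a single edge: id2 is
missing from the sources, and sum2 is attributed to the sum(id1) expression.
The correct result should contain two separate PROJECTION edges, one per
output column:
"sources": [id1, key], "targets": [sum1]
"sources": [id2, key], "targets": [sum2]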
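For illustration, the expected edges and vertices might look roughly like this
in the lineage JSON. This is a sketch and not actual Hive output: the vertex id
assigned to id2 (4), the ordering of the source ids, and the second expression
string are assumptions modeled on the current result shown above.
{code:java}
"edges": [
{
"sources": [2, 3],
"targets": [0],
"expression": "sum(table_1.id1) over (partition by table_1.key order by table_1.key ROWS between unbounded and unbounded)",
"edgeType": "PROJECTION"
},
{
"sources": [3, 4],
"targets": [1],
"expression": "sum(table_1.id2) over (partition by table_1.key order by table_1.key ROWS between unbounded and unbounded)",
"edgeType": "PROJECTION"
}
],
"vertices": [
{ "id": 0, "vertexType": "COLUMN", "vertexId": "default.table_2.sum1" },
{ "id": 1, "vertexType": "COLUMN", "vertexId": "default.table_2.sum2" },
{ "id": 2, "vertexType": "COLUMN", "vertexId": "default.table_1.id1" },
{ "id": 3, "vertexType": "COLUMN", "vertexId": "default.table_1.key" },
{ "id": 4, "vertexType": "COLUMN", "vertexId": "default.table_1.id2" }
]
{code}
The point of the sketch is that each target column gets its own edge and that
default.table_1.id2 appears as a source vertex, which the current output lacks.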