Vsevolod Ostapenko created KYLIN-4350:
-----------------------------------------
Summary: Pushdown improperly rewrites the query causing it to fail
Key: KYLIN-4350
URL: https://issues.apache.org/jira/browse/KYLIN-4350
Project: Kylin
Issue Type: Bug
Components: Query Engine
Affects Versions: v2.6.4
Environment: HDP 2.6.5, Kylin 2.6.4, CentOS 7.6
Reporter: Vsevolod Ostapenko
A query that uses WITH clause and is subject for pushdown to Hive (or Impala)
for execution is incorrectly rewritten before being submitted to the execution
engine. Table aliases are attributed with database name, with makes query
invalid.
Sample log excerpts are below:
{quote}2020-01-17 12:12:21,997 INFO [Query
e844b846-c589-4729-5a04-483f6d73c834-31163] service.QueryService:404 : The
original query: with
t as
(
SELECT ZETTICSDW.A_VL_HOURLY_V.IMSIID "ZETTICSDW_A_VL_HOURLY_V_IMSIID",
ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
"ZETTICSDW_A_VL_HOURLY_V_MEDIA_GAP_CALL_ID",
count(*) cnt
FROM ZETTICSDW.A_VL_HOURLY_V
WHERE ((ZETTICSDW.A_VL_HOURLY_V.THEDATE = '20200117')
AND ((ZETTICSDW.A_VL_HOURLY_V.THEHOUR >= '10')
AND (ZETTICSDW.A_VL_HOURLY_V.THEHOUR <= '10')))
GROUP BY ZETTICSDW.A_VL_HOURLY_V.IMSIID,
ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
)
select t.ZETTICSDW_A_VL_HOURLY_V_IMSIID,
count(*) "vl_aggs_model___CD_MEDIA_GAP_CALL_ID"
*from t*
group by t.ZETTICSDW_A_VL_HOURLY_V_IMSIID
ORDER BY "vl_aggs_model___CD_MEDIA_GAP_CALL_ID" desc
LIMIT 500
....
2020-01-17 12:12:22,073 INFO [Query e844b846-c589-4729-5a04-483f6d73c834-31163]
adhocquery.AbstractPushdownRunner:37 : the query is converted to with
t as
(
SELECT ZETTICSDW.A_VL_HOURLY_V.IMSIID `ZETTICSDW_A_VL_HOURLY_V_IMSIID`,
ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
`ZETTICSDW_A_VL_HOURLY_V_MEDIA_GAP_CALL_ID`,
count(*) cnt
FROM ZETTICSDW.A_VL_HOURLY_V
WHERE ((ZETTICSDW.A_VL_HOURLY_V.THEDATE = '20200117')
AND ((ZETTICSDW.A_VL_HOURLY_V.THEHOUR >= '10')
AND (ZETTICSDW.A_VL_HOURLY_V.THEHOUR <= '10')))
GROUP BY ZETTICSDW.A_VL_HOURLY_V.IMSIID,
ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
)
select t.ZETTICSDW_A_VL_HOURLY_V_IMSIID,
count(*) `vl_aggs_model___CD_MEDIA_GAP_CALL_ID`
*{color:#FF0000}from ZETTICSDW.t{color}*
group by t.ZETTICSDW_A_VL_HOURLY_V_IMSIID
ORDER BY `vl_aggs_model___CD_MEDIA_GAP_CALL_ID` desc
LIMIT 500 after applying converter
org.apache.kylin.source.adhocquery.HivePushDownConverter
2020-01-17 12:12:22,108 ERROR [Query
e844b846-c589-4729-5a04-483f6d73c834-31163] service.QueryService:989 : pushdown
engine failed current query too
org.apache.hive.service.cli.HiveSQLException: AnalysisException: Could not
resolve table reference: '*zetticsdw.t*'
{quote}
Pushdown query should be submitted into query engine as written by the user.
As the best effort Kylin push down executor should issue "use <database>" over
the same JDBC connection right before submitting the query.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)