[ 
https://issues.apache.org/jira/browse/HIVE-29542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-29542:
----------------------------------
    Labels: pull-request-available  (was: )

> Large unexpected NULL outputs for LEFT JOIN + aggregated subquery during full 
> INSERT (MR execution consistency issue in MapredLocalTask/ExecMapper)
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-29542
>                 URL: https://issues.apache.org/jira/browse/HIVE-29542
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.3
>            Reporter: zhuoyu.chen
>            Priority: Major
>              Labels: pull-request-available
>
> Background
> In Hive 3.1.3, we observed a query pattern where single-key queries return 
> correct results,
> but full-volume INSERT OVERWRITE produces a large number of unexpected NULLs 
> in the right-side metric column of a LEFT JOIN.
> Repro SQL (simplified)
> WITH user_data AS (
>   SELECT user_id, mobile
>   FROM features_dw.dwd_feature_user_df
>   WHERE dt = '2026-03-10'
> )
> INSERT OVERWRITE TABLE 
> features_dw.dwd_feature_loan_balance_behavior_df_balance_std_v31
> PARTITION (country_code='NG', dt='2026-03-10')
> SELECT
>   uu.mobile,
>   w180.opay_balance_wealth_std_180
> FROM user_data uu
> LEFT JOIN (
>   SELECT
>     b.mobile,
>     stddev(a.balance_wealth_decimal) AS opay_balance_wealth_std_180
>   FROM test_db.tmp_debug_wealth_180_list_explode_v2 a
>   JOIN user_data b
>     ON a.user_id = b.user_id
>   GROUP BY b.mobile
> ) w180
> ON uu.mobile = w180.mobile;
> Observed behavior
> 1) Single-key query results are correct;
> 2) After full INSERT OVERWRITE, most values of opay_balance_wealth_std_180 
> become NULL;
> 3) The issue still reproduces after disabling common optimizations 
> (CBO/vectorization/auto join conversion).
> Key plan characteristics
> - Subquery path: JOIN + GROUP BY(stddev)
> - Outer path: LEFT OUTER JOIN
> - Execution path includes local hash-table build and map-side processing:
>   MapredLocalTask -> HashTableSinkOperator -> ExecMapper
> Root cause
> This is caused by MR execution-path consistency gaps:
> 1) In MapredLocalTask local hash-table build, transactional internal columns
>    (ROW__ID / INPUT__FILE__NAME / BLOCK__OFFSET__INSIDE__FILE) may remain in 
> expressions,
>    which can shift key/value expression layout and lead to large-scale probe 
> mismatches;
> 2) In ExecMapper, bucketing version handling across related operators may be 
> inconsistent,
>    increasing join mismatch risk.
> Fix
> 1) MapredLocalTask
>    - Remove transactional internal columns before local hash-table build;
>    - Apply the cleanup in both executeInChildVM and executeInProcess paths.
> 2) ExecMapper
>    - Add bucket-version balancing across related 
> ReduceSinkOperator/TableScanOperator nodes during map-side initialization.
> Result
> - Full INSERT no longer produces large-scale unexpected NULLs;
> - Consistency between single-key and full-volume output is restored;
> - Behavior matches expected LEFT JOIN semantics.
> Scope
> - Changes are localized to MR execution internals in ql module;
> - No SQL semantic changes;
> - Limited impact on queries that do not use the affected local hash-table 
> build path.
> Files touched
> - ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
> - ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to