zabetak commented on code in PR #6382:
URL: https://github.com/apache/hive/pull/6382#discussion_r3065427664


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java:
##########
@@ -1977,8 +1977,9 @@ private void removeSemijoinOptimizationByBenefit(OptimizeTezProcContext procCtx)
           LOG.debug("Old stats for {}: {}", roi.filterOperator, roi.filterStats);
           LOG.debug("Number of rows reduction: {}/{}", newNumRows, roi.filterStats.getNumRows());
         }
+        boolean useColStats = roi.filterStats.getColumnStats() != null;
         StatsUtils.updateStats(roi.filterStats, newNumRows,
-            true, roi.filterOperator, roi.colNames);
+            useColStats, roi.filterOperator, roi.colNames);

Review Comment:
   The modifications to the test pushed the query closer to the failure, 
but they were not enough to trigger the NPE and hit the problematic code. I 
experimented with the code and managed to trigger the NPE using the test added 
in 
https://github.com/apache/hive/pull/6382/commits/7328e0f5a0a5cc1157433ce5ca23956904ae5270
   
   In addition, I removed various redundant properties and renamed the 
tables slightly to make the test more readable.
   With these changes, the PR should be ready to merge.
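
For readers following the thread, here is a minimal, self-contained sketch of the guard that the hunk above introduces. `FakeStats` and both `updateStats*` methods are hypothetical stand-ins written for illustration only; they are not Hive's `Statistics` or `StatsUtils` classes. The sketch assumes that the update routine iterates over the column statistics whenever `useColStats` is true, which is what would produce the NPE when those statistics are missing.

```java
import java.util.Arrays;
import java.util.List;

public class ColStatsGuardSketch {

  /** Hypothetical stand-in for a statistics object; not Hive's Statistics class. */
  static final class FakeStats {
    long numRows;
    // null models a table for which no column statistics were collected
    final List<String> columnStats;

    FakeStats(long numRows, List<String> columnStats) {
      this.numRows = numRows;
      this.columnStats = columnStats;
    }

    List<String> getColumnStats() {
      return columnStats;
    }
  }

  /**
   * Hypothetical stand-in for the update routine. Assumption: it walks the
   * column statistics only when useColStats is true.
   * Returns the number of column entries it visited.
   */
  static int updateStats(FakeStats stats, long newNumRows, boolean useColStats) {
    stats.numRows = newNumRows;
    int visited = 0;
    if (useColStats) {
      // Iterating a null list throws NullPointerException.
      for (String col : stats.getColumnStats()) {
        visited++;
      }
    }
    return visited;
  }

  /** Mirrors the patched call site: derive the flag instead of hard-coding true. */
  static int updateStatsGuarded(FakeStats stats, long newNumRows) {
    boolean useColStats = stats.getColumnStats() != null;
    return updateStats(stats, newNumRows, useColStats);
  }

  public static void main(String[] args) {
    FakeStats withCols = new FakeStats(100, Arrays.asList("a", "b"));
    FakeStats withoutCols = new FakeStats(100, null);

    // With column stats present, the guarded call behaves like the old one.
    System.out.println("visited=" + updateStatsGuarded(withCols, 10));

    // With column stats missing, the guarded call still updates the row count.
    updateStatsGuarded(withoutCols, 10);
    System.out.println("rows=" + withoutCols.numRows);

    // The unguarded call (flag hard-coded to true) fails on missing stats.
    try {
      updateStats(new FakeStats(100, null), 10, true);
    } catch (NullPointerException e) {
      System.out.println("unguarded call threw NullPointerException");
    }
  }
}
```

The point of the sketch is only that the row count can still be updated when column statistics are absent; how the real code treats the affected columns is not modeled here.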



##########
ql/src/test/queries/clientpositive/semijoin_stats_missing_colstats.q:
##########
@@ -0,0 +1,45 @@
+-- HIVE-29516: Test that semijoin optimization handles missing column statistics gracefully

Review Comment:
   Fixed by 
https://github.com/apache/hive/pull/6382/commits/7328e0f5a0a5cc1157433ce5ca23956904ae5270



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

