[ 
https://issues.apache.org/jira/browse/HIVE-29646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18088539#comment-18088539
 ] 

Stamatis Zampetakis commented on HIVE-29646:
--------------------------------------------

The proposal here changes the meaning of the 
"hive.disable.unsafe.external.table.operations" property which currently is 
defined as follows:
{noformat}
Whether to disable certain optimizations and operations on external tables, on 
the assumption that data changes by external applications may have negative 
effects on these operations.{noformat}
In addition, it is essentially a partial revert of HIVE-19329 
(HIVE-19335/HIVE-19336). In HIVE-19335 and HIVE-19336 I don't see concrete 
elements justifying the deactivation of the optimizations, but I assume that the 
authors added this restriction because they found some performance regressions.

On the other hand, this ticket re-enables the optimizations by arguing that 
when stats are up to date they are worth doing, whether the tables are external 
or not. However, when dealing with external tables I assume that in most 
real-world use cases the stats are not going to be up to date (even if they 
appear as such). 

So basically the question shifts to: should we perform these optimizations 
for external tables (no matter the stats)?

> Enable semijoin reduction and map-join conversion on external tables with 
> accurate statistics
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-29646
>                 URL: https://issues.apache.org/jira/browse/HIVE-29646
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Denys Kuzmenko
>            Priority: Major
>              Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
