[ 
https://issues.apache.org/jira/browse/FLINK-37329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated FLINK-37329:
------------------------------------------
    Description: 
Currently, when "table.optimizer.source.report-statistics-enabled" is set to 
false, statistics collection is not disabled in all cases. When running a 
batch workload that reads the TPC-DS data set from Hive tables with 
"table.optimizer.source.report-statistics-enabled" set to false, both table 
and column statistics were still collected.

 

This contradicts the configuration description:
{code:java}
@Documentation.TableOption(execMode = Documentation.ExecMode.BATCH_STREAMING)
public static final ConfigOption<Boolean> TABLE_OPTIMIZER_SOURCE_REPORT_STATISTICS_ENABLED =
        key("table.optimizer.source.report-statistics-enabled")
                .booleanType()
                .defaultValue(true)
                .withDescription(
                        "When it is true, the optimizer will collect and use the statistics from source connectors"
                                + " if the source extends from SupportsStatisticReport and the statistics from catalog is UNKNOWN."
                                + "Default value is true.");
{code}
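
The behavior the description implies can be modeled in a small, dependency-free sketch. This is not Flink's planner code: the class SourceStatsGuardSketch, the interface StatsReportingSource (a stand-in for SupportsStatisticReport), and the method resolveRowCount are hypothetical names used only for illustration. The sketch also assumes that statistics already known to the catalog are still used when the option is false.

```java
import java.util.Optional;

/** Hypothetical model of the expected guard; not Flink's planner code. */
public class SourceStatsGuardSketch {

    /** Stand-in for a source that can report statistics (models SupportsStatisticReport). */
    interface StatsReportingSource {
        long reportRowCount();
    }

    /** Counts how often it is asked, so that a skipped collection can be observed. */
    static final class CountingSource implements StatsReportingSource {
        int calls = 0;

        @Override
        public long reportRowCount() {
            calls++;
            return 1_000L;
        }
    }

    /**
     * Returns the row count to plan with, or empty when it is unknown.
     * The source is consulted only if the option is enabled and the
     * catalog statistics are unknown (empty).
     */
    static Optional<Long> resolveRowCount(
            boolean reportStatisticsEnabled,
            Optional<Long> catalogRowCount,
            StatsReportingSource source) {
        if (catalogRowCount.isPresent()) {
            // Catalog statistics are known: the source is not asked (assumption).
            return catalogRowCount;
        }
        if (!reportStatisticsEnabled) {
            // Option is false: skip source statistics collection entirely.
            return Optional.empty();
        }
        return Optional.of(source.reportRowCount());
    }

    public static void main(String[] args) {
        CountingSource source = new CountingSource();

        Optional<Long> disabled = resolveRowCount(false, Optional.empty(), source);
        System.out.println("disabled: " + disabled + ", source calls: " + source.calls);

        Optional<Long> enabled = resolveRowCount(true, Optional.empty(), source);
        System.out.println("enabled: " + enabled + ", source calls: " + source.calls);
    }
}
```

With the option set to false, the source is never called in this model, which is the outcome expected from the planner for the Hive batch case described above.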

  was: Currently, when "table.optimizer.source.report-statistics-enabled" is 
set to false, statistics collection is not disabled in all cases. When running 
a batch workload that reads the TPC-DS data set from Hive tables with 
"table.optimizer.source.report-statistics-enabled" set to false, both table 
and column statistics were still collected.


> Skip Source Stats Collection When 
> "table.optimizer.source.report-statistics-enabled" is False
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-37329
>                 URL: https://issues.apache.org/jira/browse/FLINK-37329
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>            Reporter: Syed Shameerur Rahman
>            Priority: Major



