[ 
https://issues.apache.org/jira/browse/FLINK-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165823#comment-16165823
 ] 

ASF GitHub Bot commented on FLINK-5859:
---------------------------------------

GitHub user godfreyhe opened a pull request:

    https://github.com/apache/flink/pull/4667

    [FLINK-5859] [table] Add PartitionableTableSource for partition pruning

    ## What is the purpose of the change
    
    This pull request adds PartitionableTableSource, which enables partition
    pruning while the query plan is optimized. Pruning can significantly reduce
    both query optimization time and execution time, especially for large
    partitioned tables.
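
As a rough illustration of the idea (not the code in this pull request), partition pruning can be sketched as filtering a list of partition descriptions before any data is read. All names below (`PartitionPruningSketch`, `partition`, `prune`, the `dt` and `region` columns) are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration only: none of these names come from the pull request.
class PartitionPruningSketch {

    // Builds one partition description: partition column name -> value.
    static Map<String, String> partition(String dt, String region) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("dt", dt);
        p.put("region", region);
        return p;
    }

    // Keeps only the partitions whose column values satisfy the predicate;
    // every other partition is skipped before any of its files are read.
    static List<Map<String, String>> prune(
            List<Map<String, String>> allPartitions,
            Predicate<Map<String, String>> partitionPredicate) {
        List<Map<String, String>> remaining = new ArrayList<>();
        for (Map<String, String> p : allPartitions) {
            if (partitionPredicate.test(p)) {
                remaining.add(p);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<Map<String, String>> all = Arrays.asList(
                partition("2017-09-13", "eu"),
                partition("2017-09-14", "eu"),
                partition("2017-09-14", "us"));
        // Corresponds to a filter such as: WHERE dt = '2017-09-14'
        List<Map<String, String>> kept =
                prune(all, p -> "2017-09-14".equals(p.get("dt")));
        System.out.println(kept.size() + " of " + all.size() + " partitions remain");
    }
}
```

In this sketch the source only ever scans the partitions returned by `prune`, which is where the saving in execution time would come from.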
    
    ## Brief change log
    
      - *Adds PartitionableTableSource which extends FilterableTableSource*
      - *Adds setRelBuilder method in FilterableTableSource class*
      - *Adds implementation for partition pruning and extracting partition predicates*
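
One way to picture "extracting partition predicates" is to split the conjuncts of a filter into those that reference only partition columns and those that do not. The sketch below is an assumption about the general technique, not the implementation in this pull request; `PartitionPredicateSketch`, `Conjunct` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not the classes added by the pull request.
class PartitionPredicateSketch {

    // One AND-ed term of a filter, reduced to its text and the fields it reads.
    static final class Conjunct {
        final String expression;
        final Set<String> referencedFields;

        Conjunct(String expression, String... referencedFields) {
            this.expression = expression;
            this.referencedFields = new HashSet<>(Arrays.asList(referencedFields));
        }
    }

    // Returns two lists: index 0 holds conjuncts that reference only partition
    // fields (usable for pruning), index 1 holds all remaining conjuncts.
    static List<List<Conjunct>> split(List<Conjunct> conjuncts, Set<String> partitionFields) {
        List<Conjunct> partitionPredicates = new ArrayList<>();
        List<Conjunct> remaining = new ArrayList<>();
        for (Conjunct c : conjuncts) {
            boolean onlyPartitionFields = !c.referencedFields.isEmpty()
                    && partitionFields.containsAll(c.referencedFields);
            if (onlyPartitionFields) {
                partitionPredicates.add(c);
            } else {
                remaining.add(c);
            }
        }
        return Arrays.asList(partitionPredicates, remaining);
    }

    public static void main(String[] args) {
        Set<String> partitionFields = new HashSet<>(Arrays.asList("dt", "region"));
        List<Conjunct> where = Arrays.asList(
                new Conjunct("dt = '2017-09-14'", "dt"),
                new Conjunct("amount > 100", "amount"),
                new Conjunct("region = name", "region", "name"));
        List<List<Conjunct>> parts = split(where, partitionFields);
        for (Conjunct c : parts.get(0)) {
            System.out.println("partition predicate: " + c.expression);
        }
        for (Conjunct c : parts.get(1)) {
            System.out.println("remaining predicate: " + c.expression);
        }
    }
}
```

A conjunct that mixes partition and non-partition fields stays in the remaining list here, because it cannot be evaluated from partition values alone.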
    
    ## Verifying this change
    
    This change added tests and can be verified as follows:
    
      - *Added integration tests for PartitionableTableSource on batch and stream SQL*
      - *Added test that validates the correctness of partition pruning*
      - *Added test that validates the correctness of extracting partition predicates*
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (yes)
      - If yes, how is the feature documented? (JavaDocs)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/godfreyhe/flink FLINK-5859

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4667.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4667
    
----
commit 2fc5f9d6e6a1625a0d7784d320f8dde5df8a7f5e
Author: godfreyhe <godfre...@163.com>
Date:   2017-09-14T06:00:54Z

    [FLINK-5859] [table] Add PartitionableTableSource for partition pruning

----


> support partition pruning on Table API & SQL
> --------------------------------------------
>
>                 Key: FLINK-5859
>                 URL: https://issues.apache.org/jira/browse/FLINK-5859
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: godfrey he
>            Assignee: godfrey he
>
> Many data sources are backed by partitionable storage, e.g. HDFS or Druid,
> and many queries need to read only a small subset of the total data. We can
> use partition information to prune or skip files that are irrelevant to the
> user’s queries. This can significantly reduce both query optimization time
> and execution time, especially for large partitioned tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
