[ 
https://issues.apache.org/jira/browse/TAJO-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991356#comment-14991356
 ] 

ASF GitHub Bot commented on TAJO-1952:
--------------------------------------

GitHub user blrunner opened a pull request:

    https://github.com/apache/tajo/pull/846

    TAJO-1952: Implement PartitionedFileFragment

    This patch contains the following modifications:
    
    * Remove partition paths from PartitionedTableScanNode
    * Implement PartitionedFileFragment
    * Move the method for pruning partition paths from
      PartitionedTableRewriter to PartitionedTableUtil
    * Build a tuple from the partition name, which contains the partition
      keys and values
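
    As a rough illustration of the last item, a partition name can be split
    into its keys and values. This is a minimal sketch, not the code in this
    pull request: it assumes partition names follow the "key=value/key=value"
    directory layout, and the class and method names (PartitionNameSketch,
    parsePartitionName) are hypothetical. The patch itself builds a tuple via
    PartitionedTableUtil::buildTupleFromPartitionName, whose signature is not
    shown in this message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Tajo.
final class PartitionNameSketch {
  private PartitionNameSketch() {}

  /**
   * Splits a partition name such as "year=2015/month=11" into an ordered
   * key -> value map. Assumes each path segment has the form key=value.
   */
  static Map<String, String> parsePartitionName(String partitionName) {
    Map<String, String> keyValues = new LinkedHashMap<>();
    for (String segment : partitionName.split("/")) {
      if (segment.isEmpty()) {
        continue; // tolerate a leading slash or doubled slashes
      }
      int eq = segment.indexOf('=');
      if (eq <= 0) {
        throw new IllegalArgumentException("Not a key=value segment: " + segment);
      }
      keyValues.put(segment.substring(0, eq), segment.substring(eq + 1));
    }
    return keyValues;
  }
}
```

    The map preserves the order of the partition columns, which matters when
    the values are later matched positionally against a partition schema.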


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/blrunner/tajo TAJO-1952

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/846.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #846
    
----
commit 0d2a634d2353efdeecced4729be9f585789acdb1
Author: JaeHwa Jung <[email protected]>
Date:   2015-10-28T08:22:40Z

    Implement PartitionedFileFragment

commit 4d7e73b7b20d1e6721b0f6b2ee53c4d04eb278d4
Author: JaeHwa Jung <[email protected]>
Date:   2015-10-28T09:10:53Z

    Add unit test cases for PartitionedFileFragment

commit 6fab5adadb303e690f7377547f842f84eb1f9286
Author: JaeHwa Jung <[email protected]>
Date:   2015-10-29T07:25:47Z

    Add PartitionedTableUtil for finding filtered partition directories.

commit b3bbcd188b0afc3b977f85005c0dffa20a8312dc
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T06:57:39Z

    Remove the array of partition directories of rewrite rule and apply 
PartitionedFileFragment.

commit 25163d0cdade5f45e7e524db4ceac4250b7ea805
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:01:56Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into 
TAJO-1952
    
    Conflicts:
        
tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestPhysicalPlanner.java

commit 4f711fa2ff7a18979198d80a70f283f73b91edf9
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:14:49Z

    Remove unnecessary method

commit 33dc1407a3d1417a81895e5e36d528f64c88bbbe
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:22:24Z

    Update comments

commit dede3e2957a2cee7bccd235a3f873aac0ab40377
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:40:15Z

    Remove unnecessary constructor parameter of PhysicalPlannerImpl

commit ccc4f6cb2e12bd642d00be08f393f6754e74db7f
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:55:16Z

    Remove unnecessary parameter of 
PartitionedTableUtil::buildTupleFromPartitionName

commit d5f563a1d6764f21f80e91a2540a9de5330a38cf
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:59:32Z

    Update wrong indent

commit 086b02beb700e125a6ba37cbe275965150a89183
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T07:59:57Z

    Remove unused package

commit 22731ec4a13f1ad0e75d7987966c17715afbeb52
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T08:20:01Z

    Update wrong comparison operator

commit 437f5ecdc7fad8b056bb638ea0897cd6e455b9b8
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-02T08:24:05Z

    Update log message

commit d76f41aac39e7536f4acac265559fb136aa05b71
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-03T00:33:16Z

    When rewriting PartitionedTableScanNode, set partition paths and table 
volume.

commit 126f5e06de3aa88563281fd0c382d03f4afab5bf
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-03T00:36:27Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into 
TAJO-1952

commit 9112ceb61547667020423bf4fbe18f99c07c2539
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-03T01:47:32Z

    Update the result message of partition pruning

commit 71d65a5dec1571852448f0b349e121a9a0268a5e
Author: JaeHwa Jung <[email protected]>
Date:   2015-11-05T08:57:54Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into 
TAJO-1952

----


> Implement PartitionedFileFragment
> ---------------------------------
>
>                 Key: TAJO-1952
>                 URL: https://issues.apache.org/jira/browse/TAJO-1952
>             Project: Tajo
>          Issue Type: Improvement
>          Components: Planner/Optimizer, Storage
>            Reporter: Jaehwa Jung
>            Assignee: Jaehwa Jung
>
> Currently, PartitionedTableScanNode contains the list of partitions, and it 
> seems to me that this list causes the following problems:
> 1. Duplicate information: A Task contains Fragments, which specify the target 
> directory or target file for scanning, so the partition paths are already 
> written to the Fragments. 
> 2. Network resources: When scanning lots of partitions, the list occupies 
> network resources, for example, several hundred kilobytes or more. This looks 
> unnecessary because the Fragments already have the partition paths.
> I want to address the above problems by implementing a new Fragment called 
> PartitionedFileFragment. Currently, I'm planning the implementation as 
> follows:
> * PartitionedFileFragment will borrow from FileFragment and will contain the 
> partition path and the partition key values.  
> * Remove the path array of partitions from PartitionedTableScanNode. 
> * Implement a method for getting filtered partition directories in 
> FileTableSpace.
> * Implement a method for making a PartitionedFileFragment array.
> * Before making splits, call the above method and use its result for making 
> splits.
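
The plan quoted above can be pictured with a small sketch. It is not the class from the pull request: the field names, the `of` factory, and `partitionKeysOf` are hypothetical, and it assumes data files live under the table path in "key=value" partition directories. The point is only that a fragment carrying its own partition key values makes a separate partition path list on the scan node redundant.

```java
// Hypothetical sketch for illustration only; the real PartitionedFileFragment
// in Tajo may differ in fields, types, and construction.
final class PartitionedFragmentSketch {
  final String filePath;      // file to scan
  final long startOffset;     // first byte of the split
  final long length;          // number of bytes in the split
  final String partitionKeys; // e.g. "year=2015/month=11"

  private PartitionedFragmentSketch(String filePath, long startOffset,
                                    long length, String partitionKeys) {
    this.filePath = filePath;
    this.startOffset = startOffset;
    this.length = length;
    this.partitionKeys = partitionKeys;
  }

  /** Derives the partition part of a file path relative to the table path. */
  static String partitionKeysOf(String tablePath, String filePath) {
    String prefix = tablePath + "/";
    if (!filePath.startsWith(prefix)) {
      throw new IllegalArgumentException(filePath + " is not under " + tablePath);
    }
    String relative = filePath.substring(prefix.length());
    int lastSlash = relative.lastIndexOf('/');
    // A file directly under the table path has no partition directories.
    return lastSlash < 0 ? "" : relative.substring(0, lastSlash);
  }

  /** Builds a fragment that carries its own partition key values. */
  static PartitionedFragmentSketch of(String tablePath, String filePath,
                                      long startOffset, long length) {
    return new PartitionedFragmentSketch(
        filePath, startOffset, length, partitionKeysOf(tablePath, filePath));
  }
}
```

With fragments shaped like this, a scanner could read the partition column values from the fragment it was handed instead of from a list shipped with the plan node.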



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
