[ https://issues.apache.org/jira/browse/TAJO-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099728#comment-14099728 ]

ASF GitHub Bot commented on TAJO-931:
-------------------------------------

GitHub user hyunsik opened a pull request:

    https://github.com/apache/tajo/pull/119

    TAJO-931: Output file can be punctuated depending on the file size.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hyunsik/tajo TAJO-931

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/119.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #119
    
----
commit a3b78642abb6c160b147eae2f29a10e362c14cac
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-08T08:47:42Z

    Improve session variables to affect the query config.

commit 0a0035d9b259a1a05ba790b7a778a745251d27bd
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-08T12:54:32Z

    Fixed.

commit 3fb54a6dde89d2d8e972253c1eccd17f334180d4
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-09T02:23:28Z

    Completed output file rotating.

commit 8028f5f876af2050bb602e277026e76ca802619a
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-15T03:57:29Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into OUTPUT_ROTATING
    
    Conflicts:
        tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        tajo-core/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
        tajo-core/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java

commit 50f6af418b42704ba14a4c7a084372f80c7ce1ec
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-15T06:25:09Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into OUTPUT_ROTATING

commit 4d0abc0dfbf6c5898bce6bd0e1ecd4c995108571
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-15T11:13:55Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into OUTPUT_ROTATING
    
    Conflicts:
        tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/HashBasedColPartitionStoreExec.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java

commit dd79f666d81875bf6a547478b76fc55b60f37d09
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-15T12:31:11Z

    Added estimatedwrittensize.

commit da231ca89e5cf3638ea16faad281f8296854a9dd
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-17T03:03:37Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into OUTPUT_ROTATING

commit c006382a3b16973872d753c9a0e0150da1c0f687
Author: Hyunsik Choi <[email protected]>
Date:   2014-07-17T03:10:20Z

    Reflect session variables to GlobalPlanner, Repartitioner, and PhysicalPlannerImpl.

commit 681aa25916f8de8a45f2b953215de76b023393a0
Author: Hyunsik Choi <[email protected]>
Date:   2014-08-11T07:56:37Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into OUTPUT_ROTATING
    
    Conflicts:
        tajo-core/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreTableExec.java
        tajo-core/src/main/java/org/apache/tajo/engine/query/QueryContext.java
        tajo-core/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
        tajo-core/src/main/java/org/apache/tajo/worker/TaskAttemptContext.java
        tajo-storage/src/main/java/org/apache/tajo/storage/Appender.java

commit b7a73bb22df1198010e2b18f3e67aaeeec30f52f
Author: Hyunsik Choi <[email protected]>
Date:   2014-08-11T08:33:51Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into OUTPUT_ROTATING
    
    Conflicts:
        tajo-core/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/SortBasedColPartitionStoreExec.java
        tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/StoreTableExec.java
        tajo-core/src/main/java/org/apache/tajo/engine/query/QueryContext.java
        tajo-core/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
        tajo-core/src/main/java/org/apache/tajo/worker/TaskAttemptContext.java
        tajo-storage/src/main/java/org/apache/tajo/storage/Appender.java

commit 803fb6a677b6831faf5e602bf77961b31128b7cd
Author: Hyunsik Choi <[email protected]>
Date:   2014-08-15T14:20:55Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-931

commit 8a6f782bb3c423ee7ef17f8522fd9206803da0cb
Author: Hyunsik Choi <[email protected]>
Date:   2014-08-16T18:05:20Z

    TAJO-931: Output file can be punctuated depending on the file size.

----


> Output file can be punctuated depending on the file size.
> ---------------------------------------------------------
>
>                 Key: TAJO-931
>                 URL: https://issues.apache.org/jira/browse/TAJO-931
>             Project: Tajo
>          Issue Type: Improvement
>          Components: physical operator
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.9.0
>
>
> Some file formats (e.g., Parquet) are not splittable. If one such file is
> very large, it spans multiple HDFS blocks. This causes remote HDFS access
> and limits the degree of parallelism, resulting in significant performance
> degradation.
> We can solve this problem if StoreTableExec or
> {HashBased|SortBased}ColPartitionStoreExec can punctuate the final output,
> i.e., start a new output file, according to the written size.
> In addition, we need to support a session variable that determines the
> per-file size of final output files. So, TAJO-928 blocks this issue.
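
The rotation idea described above can be illustrated with a minimal,
standalone Java sketch. It is not the code in this pull request and does not
use Tajo's Appender or StoreTableExec APIs; the class name RotatingFileWriter,
the "prefix-NNN" file naming, and the maxBytesPerFile parameter (standing in
for the per-file size session variable) are all assumptions made for
illustration. The writer checks the written size before each row and starts a
new file once the current one has reached the limit, so a file may exceed the
limit by at most one row.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of size-based output rotation; not Tajo's actual API. */
final class RotatingFileWriter implements AutoCloseable {
  private final Path dir;
  private final String prefix;
  private final long maxBytesPerFile; // assumed to come from a session variable
  private OutputStream out;           // currently open output file, or null
  private long writtenBytes;          // bytes written to the current file
  private int fileSeq;                // number of files opened so far

  RotatingFileWriter(Path dir, String prefix, long maxBytesPerFile) {
    if (maxBytesPerFile <= 0) {
      throw new IllegalArgumentException("maxBytesPerFile must be positive");
    }
    this.dir = dir;
    this.prefix = prefix;
    this.maxBytesPerFile = maxBytesPerFile;
  }

  /** Writes one row, rotating first if the current file reached the limit. */
  void write(String row) throws IOException {
    if (out != null && writtenBytes >= maxBytesPerFile) {
      close(); // finish the current file; the next one is opened below
    }
    if (out == null) {
      openNext();
    }
    byte[] bytes = (row + "\n").getBytes(StandardCharsets.UTF_8);
    out.write(bytes);
    writtenBytes += bytes.length;
  }

  /** Returns how many output files have been opened so far. */
  int fileCount() {
    return fileSeq;
  }

  private void openNext() throws IOException {
    Path file = dir.resolve(String.format("%s-%03d", prefix, fileSeq));
    out = Files.newOutputStream(file);
    fileSeq++;
    writtenBytes = 0;
  }

  @Override
  public void close() throws IOException {
    if (out != null) {
      out.close();
      out = null;
    }
  }
}
```

With a 25-byte limit and ten 11-byte rows (ten characters plus a newline),
this sketch produces four files holding 3, 3, 3, and 1 rows. A real
implementation for a format such as Parquet would likely need an estimate of
the written size rather than an exact byte count, since data is buffered and
encoded before it reaches the file.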



--
This message was sent by Atlassian JIRA
(v6.2#6252)
