[ https://issues.apache.org/jira/browse/TAJO-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891214#comment-13891214 ]

Hudson commented on TAJO-36:
----------------------------

SUCCESS: Integrated in Tajo-master-build #55 (See [https://builds.apache.org/job/Tajo-master-build/55/])
TAJO-36: Improve ExternalSortExec with N-merge sort and final pass omission. (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=5177dcfa4b44e953919f47b94d39f9c5f7afb38b)
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashSemiJoinExec.java
* CHANGES.txt
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/UnaryPhysicalExec.java
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestPhysicalPlanner.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/ExprTestBase.java
* tajo-common/src/main/java/org/apache/tajo/datum/Float8Datum.java
* tajo-storage/src/test/java/org/apache/tajo/storage/v2/TestStorages.java
* tajo-storage/src/main/java/org/apache/tajo/storage/RowStoreUtil.java
* tajo-storage/src/main/java/org/apache/tajo/storage/RowFile.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/worker/TestRangeRetrieverHandler.java
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalPlanningException.java
* tajo-storage/src/main/java/org/apache/tajo/storage/RawFile.java
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/PhysicalExec.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashJoinExec.java
* tajo-storage/src/main/java/org/apache/tajo/storage/LazyTuple.java
* tajo-common/src/main/java/org/apache/tajo/util/ClassSize.java
* tajo-storage/src/main/java/org/apache/tajo/storage/MemoryUtil.java
* tajo-common/src/main/java/org/apache/tajo/util/CommonTestingUtil.java
* tajo-jdbc/src/main/java/org/apache/tajo/jdbc/MetaDataTuple.java
* tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestExternalSortExec.java
* tajo-storage/src/main/java/org/apache/tajo/storage/Tuple.java
* tajo-storage/src/main/java/org/apache/tajo/storage/VTuple.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestBNLJoinExec.java
* tajo-storage/src/main/java/org/apache/tajo/storage/FrameTuple.java
* tajo-jdbc/src/main/java/org/apache/tajo/jdbc/TajoDatabaseMetaData.java
* tajo-storage/src/test/java/org/apache/tajo/storage/TestVTuple.java
* tajo-core/tajo-core-pullserver/src/main/java/org/apache/tajo/storage/Tuple.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashAntiJoinExec.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestNLJoinExec.java
* tajo-storage/src/test/java/org/apache/tajo/storage/TestLazyTuple.java
* tajo-storage/src/test/java/org/apache/tajo/storage/TestStorages.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestMergeJoinExec.java


> Improve ExternalSortExec with N-merge sort and final pass omission
> ------------------------------------------------------------------
>
>                 Key: TAJO-36
>                 URL: https://issues.apache.org/jira/browse/TAJO-36
>             Project: Tajo
>          Issue Type: Improvement
>          Components: physical operator
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>            Priority: Critical
>             Fix For: 0.8-incubating
>
>         Attachments: TAJO-36.patch, TAJO-36_140204_163419.patch, 
> TAJO-36_140204_172847.patch, TAJO-36_20140205_00:21:44.patch, 
> TAJO-36_final.patch
>
>
> Background:
> The current ExternalSortExec uses only the binary external merge sort 
> algorithm 
> (http://en.wikipedia.org/wiki/External_sorting#External_merge_sort). In other 
> words, in each pass, ExternalSortExec merges just two files into one sorted 
> file.
> Proposal:
> The goal of this proposal is to improve ExternalSortExec in the following 
> ways:
> * N-merge sort - by using more memory, we can merge N files in each pass. 
> This reduces the number of passes and, consequently, considerably reduces 
> I/O overhead.
> * Final pass omission - a physical operator is pipelined with its parent 
> operator, so the final pass of the merge sort can be driven by the parent 
> physical operator as it pulls tuples. We can therefore omit the final pass 
> as a separate step of the merge sort.
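
The two ideas in the proposal above can be illustrated with a minimal Java sketch. It is not taken from the TAJO-36 patch: the names NWayMergeSketch, MergingIterator and mergePasses are hypothetical, and in-memory iterators stand in for the sorted run files that a real external sort would read from disk. mergePasses shows how a larger fan-in reduces the number of passes, and MergingIterator shows how the last merge can be produced lazily, one element per next() call from the consumer, instead of being written out as a separate pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Illustrative sketch only; this is not Tajo's ExternalSortExec. */
class NWayMergeSketch {

  /** Passes needed to reduce `runs` sorted runs to one, merging `fanIn` runs at a time. */
  static int mergePasses(int runs, int fanIn) {
    if (runs < 1 || fanIn < 2) {
      throw new IllegalArgumentException("need runs >= 1 and fanIn >= 2");
    }
    int passes = 0;
    while (runs > 1) {
      runs = (runs + fanIn - 1) / fanIn; // ceil(runs / fanIn) runs remain after a pass
      passes++;
    }
    return passes;
  }

  /** Lazily merges N sorted inputs; the consumer drives the merge by calling next(). */
  static final class MergingIterator<T> implements Iterator<T> {

    /** The current smallest unread element of one run, plus the run it came from. */
    private static final class Head<T> {
      final T value;
      final Iterator<T> source;

      Head(T value, Iterator<T> source) {
        this.value = value;
        this.source = source;
      }
    }

    private final PriorityQueue<Head<T>> heap;

    MergingIterator(List<Iterator<T>> sortedRuns, Comparator<? super T> cmp) {
      Comparator<Head<T>> byValue = (a, b) -> cmp.compare(a.value, b.value);
      heap = new PriorityQueue<>(Math.max(1, sortedRuns.size()), byValue);
      for (Iterator<T> run : sortedRuns) {
        if (run.hasNext()) {
          heap.add(new Head<>(run.next(), run)); // seed the heap with each run's first element
        }
      }
    }

    @Override
    public boolean hasNext() {
      return !heap.isEmpty();
    }

    @Override
    public T next() {
      Head<T> smallest = heap.poll();
      if (smallest == null) {
        throw new NoSuchElementException();
      }
      if (smallest.source.hasNext()) {
        // Refill from the run that just supplied the smallest element.
        heap.add(new Head<>(smallest.source.next(), smallest.source));
      }
      return smallest.value;
    }
  }

  public static void main(String[] args) {
    List<Iterator<Integer>> runs = new ArrayList<>();
    runs.add(Arrays.asList(1, 4, 9).iterator());
    runs.add(Arrays.asList(2, 3, 10).iterator());
    runs.add(Arrays.asList(5, 6, 7, 8).iterator());

    Iterator<Integer> merged =
        new MergingIterator<>(runs, Comparator.<Integer>naturalOrder());
    while (merged.hasNext()) {
      System.out.print(merged.next() + " ");
    }
    System.out.println();

    System.out.println(mergePasses(64, 2)); // binary merge
    System.out.println(mergePasses(64, 8)); // 8-way merge
  }
}
```

With 64 initial runs, mergePasses returns 6 for a binary merge and 2 for an 8-way merge. If the parent consumes a MergingIterator over the last group of runs directly, one of those passes no longer needs to write its result to a file.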



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
