[ https://issues.apache.org/jira/browse/TAJO-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888442#comment-13888442 ]
Hyunsik Choi commented on TAJO-574:
-----------------------------------
Created a review request against branch master in reviewboard
https://reviews.apache.org/r/17633/
> Add a sort-based physical executor for column partition store
> -------------------------------------------------------------
>
> Key: TAJO-574
> URL: https://issues.apache.org/jira/browse/TAJO-574
> Project: Tajo
> Issue Type: New Feature
> Components: physical operator
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.8-incubating
>
> Attachments: TAJO-574.patch
>
>
> ColumnPartitionStoreExec keeps numerous files open while it stores all
> data. In addition, its random writes put a burden on the HDFS NameNode.
> To solve this problem, I would like to propose a sort-based physical executor
> for column partition store. It assumes that input tuples are sorted in
> ascending or descending order of the partition keys, which means that it needs
> an extra sort operation. However, it keeps only one file open at a time and
> writes all data sequentially. In many cases, it would be the best choice for
> column partition store.
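
For illustration only, the idea described above can be sketched roughly as
follows. This is not the code in TAJO-574.patch. The class
SortedPartitionWriter, the Row type, the "key=<value>" directory layout, and
the "part-0" file name are all hypothetical, and the sketch writes to the
local file system rather than HDFS. It assumes the input has already been
sorted by partition key.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a sort-based partition writer; not Tajo's executor. */
final class SortedPartitionWriter {

    /** Illustrative row type: a partition key plus one line of payload. */
    static final class Row {
        final String partitionKey;
        final String payload;

        Row(String partitionKey, String payload) {
            this.partitionKey = partitionKey;
            this.payload = payload;
        }
    }

    /**
     * Writes rows that are ALREADY sorted by partition key, keeping at most
     * one file open at any time. Returns the number of files opened.
     * If the input is not sorted, a key that reappears would overwrite its
     * earlier file, so the sort is a hard precondition of this sketch.
     */
    static int write(Iterable<Row> sortedRows, Path baseDir) throws IOException {
        BufferedWriter current = null;
        String currentKey = null;
        int filesOpened = 0;
        try {
            for (Row row : sortedRows) {
                if (!row.partitionKey.equals(currentKey)) {
                    // Key changed: finish the previous partition before
                    // opening the next one.
                    if (current != null) {
                        current.close();
                    }
                    Path dir = baseDir.resolve("key=" + row.partitionKey);
                    Files.createDirectories(dir);
                    current = Files.newBufferedWriter(dir.resolve("part-0"));
                    currentKey = row.partitionKey;
                    filesOpened++;
                }
                // Sequential append to the single open file.
                current.write(row.payload);
                current.newLine();
            }
        } finally {
            if (current != null) {
                current.close();
            }
        }
        return filesOpened;
    }
}
```

With sorted input, the number of files opened equals the number of distinct
partition keys, and each file is written from start to finish before the next
one is created.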