[
https://issues.apache.org/jira/browse/TAJO-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jaehwa Jung updated TAJO-673:
-----------------------------
Attachment: TAJO-673_3.patch
I modified the patch as follows:
- Renamed the new shuffle type to scattered hash shuffle.
- Set TajoConf:SHUFFLE_TASK_NUM_VOLUME to 512MB
For reference, I tested this patch on my test cluster with the TPC-H dataset.
I found a problem: the patch produces many empty tasks. Luckily, the empty
tasks don't affect the query result.
But we still need to resolve this problem. I think we should refactor the
query result stats so that empty tasks can be removed.
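
To illustrate the 512MB setting mentioned above, here is a rough sketch. It
assumes the value is the data volume handled by one task, and that the task
count is the shuffle volume divided by that value, rounded up. The function
name and the rounding rule are my assumptions for illustration; this is not
code from the patch.

```python
MB = 1024 * 1024

# Assumed meaning of TajoConf:SHUFFLE_TASK_NUM_VOLUME: bytes handled per task.
TASK_VOLUME_BYTES = 512 * MB


def estimate_task_count(shuffle_bytes, task_volume_bytes=TASK_VOLUME_BYTES):
    """Hypothetical helper: tasks needed for a given shuffle volume."""
    if task_volume_bytes <= 0:
        raise ValueError("task_volume_bytes must be positive")
    if shuffle_bytes <= 0:
        # Nothing to shuffle, so no task is created (avoids an empty task).
        return 0
    # Integer ceiling division: a partial last chunk still needs one task.
    return (shuffle_bytes + task_volume_bytes - 1) // task_volume_bytes


if __name__ == "__main__":
    for size in (0, 100 * MB, 512 * MB, 10 * 1024 * MB):
        print(size // MB, "MB ->", estimate_task_count(size), "task(s)")
```

With a rule like this, the task count follows the data volume rather than
the number of partitions, which is the case the issue description is about.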
> Assign proper number of tasks when inserting into partitioned table
> -------------------------------------------------------------------
>
> Key: TAJO-673
> URL: https://issues.apache.org/jira/browse/TAJO-673
> Project: Tajo
> Issue Type: Improvement
> Components: planner/optimizer
> Reporter: Hyoungjun Kim
> Assignee: Jaehwa Jung
> Fix For: 0.9.0
>
> Attachments: TAJO-673.patch, TAJO-673_2.patch, TAJO-673_3.patch
>
>
> When inserting into a partitioned table, if the number of partitions is smaller
> than the cluster's concurrency capacity, query execution is too slow.