[
https://issues.apache.org/jira/browse/TAJO-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jaehwa Jung updated TAJO-673:
-----------------------------
Attachment: TAJO-673_7.patch
I updated the patch as follows:
- Divide the fetch URIs into a proper number of tasks based on the
IntermediateData output volume. The default split volume is 256MB, but you can
change it in the Tajo configuration file. The property name is
tajo.scattered.hash.shuffle.split.volume.
- Add the shuffle output volume to TajoWorkerProtocol. When a task completes,
Task::getTaskCompletionReport sets this property.
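
To illustrate the idea of splitting by output volume, here is a minimal sketch. It is not the code in the patch: the class and method names are hypothetical, and the greedy in-order packing is only an assumption about how fetches might be grouped. Volumes are in bytes, and the 256MB default is the one mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Tajo's actual implementation.
final class ShuffleSplitSketch {
    // Default split volume described above: 256MB.
    static final long DEFAULT_SPLIT_VOLUME_BYTES = 256L * 1024 * 1024;

    // Number of tasks needed to cover totalBytes at splitBytes per task (at least 1).
    static int numTasks(long totalBytes, long splitBytes) {
        if (splitBytes <= 0) {
            throw new IllegalArgumentException("splitBytes must be positive");
        }
        if (totalBytes <= 0) {
            return 1;
        }
        // Ceiling division without floating point.
        return (int) ((totalBytes + splitBytes - 1) / splitBytes);
    }

    // Greedily packs fetch volumes, in order, into groups (one group per task).
    // A new group starts when adding the next fetch would exceed splitBytes.
    // A single fetch larger than splitBytes gets a group of its own.
    static List<List<Long>> groupByVolume(List<Long> fetchVolumes, long splitBytes) {
        if (splitBytes <= 0) {
            throw new IllegalArgumentException("splitBytes must be positive");
        }
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long volume : fetchVolumes) {
            if (!current.isEmpty() && currentBytes + volume > splitBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(volume);
            currentBytes += volume;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

With this sketch, 600MB of intermediate output at the default split volume would yield three tasks, since 600 / 256 rounds up to 3.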
For reference, I tested many cases on a TPC-H benchmarking cluster, and the
patch ran successfully.
> Assign proper number of tasks when inserting into partitioned table
> -------------------------------------------------------------------
>
> Key: TAJO-673
> URL: https://issues.apache.org/jira/browse/TAJO-673
> Project: Tajo
> Issue Type: Improvement
> Components: planner/optimizer
> Reporter: Hyoungjun Kim
> Assignee: Jaehwa Jung
> Fix For: 0.9.0
>
> Attachments: TAJO-673.patch, TAJO-673_2.patch, TAJO-673_3.patch,
> TAJO-673_4.patch, TAJO-673_5.patch, TAJO-673_6.patch, TAJO-673_7.patch
>
>
> When inserting into a partitioned table, if the number of partitions is smaller
> than the cluster's concurrency capacity, query execution is too slow.
--
This message was sent by Atlassian JIRA
(v6.2#6252)