[ https://issues.apache.org/jira/browse/TEZ-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061602#comment-14061602 ]
Gopal V commented on TEZ-1260:
------------------------------
[~rohini]: the API is something we can work with before the 0.5.0 release,
iterating towards a better-performing sorter in the 0.5.1 release.
In that context, can you try running your GBY (group-by) queries with
"tez.runtime.sort.threads=2"?
> Allow KeyValueWriter to support writing list of values also
> -----------------------------------------------------------
>
> Key: TEZ-1260
> URL: https://issues.apache.org/jira/browse/TEZ-1260
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rohini Palaniswamy
> Assignee: Rajesh Balamohan
> Attachments: TEZ-1260.1.patch
>
>
> TEZ-1228 adds support to IFile for storing K,L<V>. Currently, KeyValueWriter
> only allows writing K,V:
> public void write(Object key, Object value) throws IOException;
> We should add support for
> public void write(Object key, Iterable<Object> values) throws IOException;
> taking advantage of TEZ-1228. In a few cases, Pig unwraps key, list<values> and
> writes them as separate K,V pairs. The new method can avoid that overhead. It may
> even enable us to add something similar to hash-based partial aggregation for
> joins, like what we do for group-by.
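
To illustrate the overhead described above, here is a minimal sketch that is not taken from the Tez code base. The `Writer` interface and `CountingWriter` class are hypothetical stand-ins that only mirror the two method signatures quoted in the issue. The sketch counts how often the key is emitted when a key and its list of values are unwrapped into separate K,V pairs, compared with a single K,L<V> write.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyValuesWriterSketch {

    /** Hypothetical interface; it only mirrors the two signatures from the issue. */
    interface Writer {
        void write(Object key, Object value) throws IOException;

        void write(Object key, Iterable<Object> values) throws IOException;
    }

    /** Toy in-memory writer that counts how many times a key is emitted. */
    static class CountingWriter implements Writer {
        final List<String> records = new ArrayList<>();
        int keysWritten = 0;

        @Override
        public void write(Object key, Object value) {
            keysWritten++;
            records.add(key + "=" + value);
        }

        @Override
        public void write(Object key, Iterable<Object> values) {
            keysWritten++; // the key is emitted once for the whole list
            List<Object> copy = new ArrayList<>();
            for (Object v : values) {
                copy.add(v);
            }
            records.add(key + "=" + copy);
        }
    }

    public static void main(String[] args) throws IOException {
        // Declared as List<Object> so the Iterable<Object> overload is chosen;
        // a List<String> argument would resolve to write(Object, Object).
        List<Object> values = Arrays.<Object>asList("a", "b", "c");

        // Unwrapped: one K,V record per value, so the key is repeated.
        CountingWriter unwrapped = new CountingWriter();
        for (Object v : values) {
            unwrapped.write("k", v);
        }

        // Grouped: a single K,L<V> record.
        CountingWriter grouped = new CountingWriter();
        grouped.write("k", values);

        System.out.println("unwrapped keys written: " + unwrapped.keysWritten);
        System.out.println("grouped keys written:   " + grouped.keysWritten);
        System.out.println("grouped record:         " + grouped.records.get(0));
    }
}
```

One design point worth noting: because the two methods are overloads, Java picks between them at compile time from the static type of the second argument. A caller holding a `List<String>` would silently get the single-value method, which may be a reason to prefer a distinct method name or a wildcard type in a real API.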