[ 
https://issues.apache.org/jira/browse/TEZ-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061602#comment-14061602
 ] 

Gopal V commented on TEZ-1260:
------------------------------

[~rohini]: the API is something we can work with for the pre-0.5.0 release, and 
we can iterate towards a better-performing sorter in the 0.5.1 release.

In that context, can you try running your GBY queries with 
"tez.runtime.sort.threads=2"?

> Allow KeyValueWriter to support writing list of values also
> -----------------------------------------------------------
>
>                 Key: TEZ-1260
>                 URL: https://issues.apache.org/jira/browse/TEZ-1260
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-1260.1.patch
>
>
> TEZ-1228 adds support to IFile for storing K,L<V>. Currently, KeyValueWriter
> only allows writing K,V:
> public void write(Object key, Object value) throws IOException;
> We should add support for
> public void write(Object key, Iterable<Object> values) throws IOException;
> taking advantage of TEZ-1228. In a few cases, Pig unwraps key, list<values> and
> writes them as separate K,V pairs; the new method would avoid that overhead. It
> may even enable us to add something similar to hash-based partial aggregation
> for joins, like what we do for groupby.
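
To make the two write paths in the description concrete, here is a minimal, self-contained sketch. It is not the Tez API or the attached patch: SketchWriter, PairWriter, GroupedWriter and the helper methods are hypothetical names invented for illustration, and records are kept as in-memory strings rather than written to an IFile. Only the two write signatures are taken from the issue text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyValuesWriterSketch {

    /** Hypothetical stand-in for the writer discussed in the issue. */
    interface SketchWriter {
        // Existing single-value form, as quoted in the issue.
        void write(Object key, Object value) throws IOException;

        // Proposed multi-value form. This default falls back to unrolling
        // the values into separate K,V writes, which is the overhead the
        // issue wants to avoid.
        default void write(Object key, Iterable<Object> values) throws IOException {
            for (Object v : values) {
                write(key, v);
            }
        }
    }

    /** Only knows K,V: one record per value (assumed behaviour today). */
    static final class PairWriter implements SketchWriter {
        final List<String> records = new ArrayList<>();

        @Override
        public void write(Object key, Object value) {
            records.add(key + "=" + value);
        }
    }

    /** Stores the key once with all of its values (assumed K,L<V> layout). */
    static final class GroupedWriter implements SketchWriter {
        final List<String> records = new ArrayList<>();

        @Override
        public void write(Object key, Object value) {
            records.add(key + "=[" + value + "]");
        }

        @Override
        public void write(Object key, Iterable<Object> values) {
            List<Object> copy = new ArrayList<>();
            for (Object v : values) {
                copy.add(v);
            }
            records.add(key + "=" + copy);
        }
    }

    /** Writes one key with many values through the unrolling path. */
    static List<String> writeUnrolled(Object key, List<Object> values) {
        PairWriter w = new PairWriter();
        SketchWriter asWriter = w;
        try {
            asWriter.write(key, values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w.records;
    }

    /** Writes one key with many values through the grouped path. */
    static List<String> writeGrouped(Object key, List<Object> values) {
        GroupedWriter w = new GroupedWriter();
        w.write(key, values);
        return w.records;
    }

    public static void main(String[] args) {
        List<Object> values = Arrays.<Object>asList(1, 2, 3);
        System.out.println(writeUnrolled("k", values)); // [k=1, k=2, k=3]
        System.out.println(writeGrouped("k", values));  // [k=[1, 2, 3]]
    }
}
```

In this sketch the unrolling path emits three records that each repeat the key, while the grouped path emits one; that difference is the saving the description is after.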



--
This message was sent by Atlassian JIRA
(v6.2#6252)
