[ 
https://issues.apache.org/jira/browse/TAJO-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976539#comment-13976539
 ] 

Hyunsik Choi commented on TAJO-789:
-----------------------------------

To explain the issue in more detail: currently, QueryMaster collects the URIs 
of all shuffled data sets written by all tasks and then sends them directly to 
each worker. Each worker pulls the shuffled data residing on each node via 
these URIs. However, a URI is usually long, so sending many of them may consume 
a lot of network bandwidth. The main objective of this issue is to reduce the 
amount of information needed to indicate the locations of intermediate data.
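
As a rough illustration of the possible saving, here is a minimal sketch of 
the base-128 varint encoding used by protocol buffers, which the issue 
description proposes for the numeric parameters. The function names and the 
sample parameter values are hypothetical and are not taken from Tajo's code or 
its actual shuffle URI format.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F      # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # set the continuation bit: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7
    raise ValueError("truncated varint")


# Hypothetical numeric shuffle parameters (assumed values, for illustration only).
params = [3, 12, 300]

# The same numbers sent as comma-separated text versus packed varints.
as_text = ",".join(str(p) for p in params).encode("ascii")
as_varints = b"".join(encode_varint(p) for p in params)

print(len(as_text), len(as_varints))
```

Small numbers fit in a single byte, and the text form also needs separators, 
so the packed form is shorter whenever most parameters are small integers.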

> Improve shuffle URI
> -------------------
>
>                 Key: TAJO-789
>                 URL: https://issues.apache.org/jira/browse/TAJO-789
>             Project: Tajo
>          Issue Type: Improvement
>          Components: data shuffle
>            Reporter: Jinho Kim
>            Assignee: Jinho Kim
>             Fix For: 0.9
>
>
> Currently, the shuffle URI uses a string field, but most parameters in the 
> URI are numbers.
> We need to change them to protocol buffer varints.
> https://developers.google.com/protocol-buffers/docs/encoding#varints



--
This message was sent by Atlassian JIRA
(v6.2#6252)
