[jira] [Commented] (HAMA-531) Data re-partitioning in BSPJobClient

Thomas Jungblut (JIRA) Mon, 21 May 2012 06:20:45 -0700

    [ https://issues.apache.org/jira/browse/HAMA-531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280134#comment-13280134 ]


Thomas Jungblut commented on HAMA-531:
--------------------------------------

I'll take a first shot at this for the graph algorithms.
I think we should distinguish between pre-job partitioning and runtime 
partitioning. For graph algorithms we can use runtime partitioning.
For other algorithms this might not be suitable.
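
To make the distinction concrete, here is a rough sketch of what runtime
partitioning could look like for graph input. It is only an illustration,
not the Hama implementation: the class and method names
(RuntimePartitioningSketch, partitionFor, repartition) are made up, the
peers are simulated with in-memory lists instead of real BSP tasks, and
hash partitioning on the vertex ID is an assumed strategy. The idea is that
each task reads whatever split it was given and forwards every vertex to the
peer that owns it, so the client never has to rewrite the input files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RuntimePartitioningSketch {

    /** Maps a vertex ID to the index of its owning peer, in [0, numPeers). */
    static int partitionFor(String vertexId, int numPeers) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(vertexId.hashCode(), numPeers);
    }

    /**
     * Simulates the exchange step: each peer scans the records of its own
     * input split and routes every vertex to the peer that owns it.
     * localInputs.get(i) holds the vertex IDs that peer i read from its split.
     */
    static List<List<String>> repartition(List<List<String>> localInputs) {
        int numPeers = localInputs.size();
        List<List<String>> owned = new ArrayList<>();
        for (int i = 0; i < numPeers; i++) {
            owned.add(new ArrayList<>());
        }
        for (List<String> split : localInputs) {
            for (String vertexId : split) {
                // In a real job this would be a message sent to another peer.
                owned.get(partitionFor(vertexId, numPeers)).add(vertexId);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        // Hypothetical input: three peers with arbitrarily assigned splits.
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e"),
                Arrays.asList("f"));
        List<List<String>> owned = repartition(splits);
        for (int i = 0; i < owned.size(); i++) {
            System.out.println("peer " + i + " owns " + owned.get(i));
        }
    }
}
```

Pre-job partitioning would instead run the same partitionFor logic before
the job starts and write one file per partition, which is the step that is
expensive when done sequentially from the client.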
                
> Data re-partitioning in BSPJobClient
> ------------------------------------
>
>                 Key: HAMA-531
>                 URL: https://issues.apache.org/jira/browse/HAMA-531
>             Project: Hama
>          Issue Type: Improvement
>            Reporter: Edward J. Yoon
>            Assignee: Thomas Jungblut
>
> Re-partitioning the data is a very expensive operation. Currently, we 
> process read/write operations sequentially using the HDFS API in 
> BSPJobClient on the client side. This can cause "too many open files" 
> errors, adds HDFS overhead, and performs slowly.
> We have to find another way to re-partition the data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
