[jira] [Updated] (HAMA-531) Data re-partitioning in BSPJobClient

Thomas Jungblut (JIRA) Mon, 21 May 2012 11:41:43 -0700

     [ 
https://issues.apache.org/jira/browse/HAMA-531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Thomas Jungblut updated HAMA-531:
---------------------------------

    Attachment: HAMA-531_final.patch

Okay, it works now. The problem was that the adjacent edges in the PageRank 
input were marked as double instead of null, which broke the serialization.

Fixed, and the build is fine. I'd like to commit this tomorrow. 

However, we should think about how we build the pre-job partitioner.
                
> Data re-partitioning in BSPJobClient
> ------------------------------------
>
>                 Key: HAMA-531
>                 URL: https://issues.apache.org/jira/browse/HAMA-531
>             Project: Hama
>          Issue Type: Improvement
>            Reporter: Edward J. Yoon
>            Assignee: Thomas Jungblut
>         Attachments: HAMA-531_1.patch, HAMA-531_2.patch, HAMA-531_final.patch
>
>
> Re-partitioning the data is a very expensive operation. Currently, we process 
> read/write operations sequentially from the client side, using the HDFS API in 
> BSPJobClient. This can cause "too many open files" errors, incurs HDFS 
> overhead, and performs slowly.
> We have to find another way to re-partition the data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
