To my knowledge, BSP seems better suited to Hama's model than M/R, since Hama focuses on matrix computation rather than data processing.
Chanwit

On Fri, Sep 18, 2009 at 11:19, Edward J. Yoon <[email protected]> wrote:
> As we discussed before, we are considering refactoring the packages as below:
>
> org.apache.hama.bsp
> org.apache.hama.mapred
> org.apache.hama.matrix
> org.apache.hama.graph
> org.apache.hama.examples
>
> So, I'd like to rearrange the architecture page and discuss it
> before committing the changes.
>
> This is my rough idea.
>
> - Our Goal
> - About BSP and Map/Reduce
> - Matrix Computing Strategies
> - Graph Computing Strategies
> - Examples in the matrix and graph computation areas
>
> What do you think?
>
> On Fri, Sep 18, 2009 at 6:51 PM, Edward J. Yoon <[email protected]> wrote:
>> In a distributed system, matrix and graph computation both need
>> a lot of communication between nodes. IMO, there is no way to
>> avoid it. Of course, it could be performed by M/R iterations, but that
>> seems very slow and there's an overhead cost. I think that's why we'd
>> like to survey and consider the BSP (bulk synchronous parallel) model.
>>
>> - We need to explain the theory of BSP and how to apply it to our
>> project.
>>
>> And, regarding matrix and graph, they are closely connected. I expect
>> synergy between the two. However, I think we should clarify the
>> relationship between matrix and graph, and our main goal.
>>
>> Any advice is welcome.
>>
>> On Fri, Sep 18, 2009 at 6:30 PM, Edward J. Yoon <[email protected]>
>> wrote:
>>> Firstly, we need to share our plans and consider the overall architecture.
>>>
>>> What is BSP? What's the relationship between matrix and graph?
>>> What's the plan for the matrix and graph packages? What's our main
>>> goal?
>>>
>>> On Fri, Sep 18, 2009 at 5:52 PM, Apache Wiki <[email protected]> wrote:
>>>> Dear Wiki user,
>>>>
>>>> You have subscribed to a wiki page or wiki category on "Hama Wiki" for
>>>> change notification.
>>>>
>>>> The following page has been changed by HyunsikChoi:
>>>> http://wiki.apache.org/hama/GraphPackage
>>>>
>>>> New page:
>>>> = The Graph Package (Angrapa) =
>>>> The graph package, called Angrapa, is a large-scale graph data management
>>>> framework for analytical processing. It is still an ongoing project. It
>>>> will employ massive parallelism on Hadoop. It aims to achieve
>>>> scalability for processing terabytes or petabytes of graph data. Angrapa
>>>> will be used in a variety of scientific and industrial areas, such as data
>>>> mining, machine learning, information retrieval, bioinformatics, and
>>>> social networks, that require processing large-scale graph data.
>>>>
>>>> = Description =
>>>> The graph package is a new programming framework for graph processing.
>>>>
>>>> = The Main Goal =
>>>> * Easy APIs familiar to graph features
>>>> * Storage structure suited to graph data, taking into account the
>>>> connectivity of graph data
>>>> * Applying a data communication method (i.e., BSP) without degrading
>>>> graph data locality
>>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon @ NHN, corp.
>>> [email protected]
>>> http://blog.udanax.org
>>>

--
Chanwit Kaewkasi
PhD Candidate, Centre for Novel Computing
School of Computer Science
The University of Manchester
Oxford Road
Manchester M13 9PL, UK
