As we discussed before, we are considering refactoring the packages as below:

  org.apache.hama.bsp
  org.apache.hama.mapred
  org.apache.hama.matrix
  org.apache.hama.graph
  org.apache.hama.examples

So, I'd like to rearrange the architecture page and discuss it before committing the changes. This is my rough idea:

- Our Goal
- About BSP and Map/Reduce
- Matrix Computing Strategies
- Graph Computing Strategies
- Examples in matrix and graph computation areas

What do you think?

On Fri, Sep 18, 2009 at 6:51 PM, Edward J. Yoon <[email protected]> wrote:
> In a distributed system, matrix and graph computations both need
> a lot of communication between nodes. IMO, there is no way to
> avoid it. Of course, they could be performed by M/R iterations, but that
> seems very slow and there's an overhead cost. I think that's why we'd
> like to survey and consider the BSP (bulk synchronous parallel) model.
>
> - We need to explain BSP theoretically and how to apply it to our project.
>
> And, regarding matrix and graph, they are closely connected. I expect
> synergy between the two. However, I think we should clarify the
> relationship between matrix and graph, and our main goal.
>
> Any advice is welcome.
>
> On Fri, Sep 18, 2009 at 6:30 PM, Edward J. Yoon <[email protected]> wrote:
>> Firstly, we need to share our plans and consider the overall architecture.
>>
>> What's the BSP? What's the relationship between matrix and graph?
>> What's the plan for the matrix and graph packages? What's our main
>> goal?
>>
>> On Fri, Sep 18, 2009 at 5:52 PM, Apache Wiki <[email protected]> wrote:
>>> Dear Wiki user,
>>>
>>> You have subscribed to a wiki page or wiki category on "Hama Wiki" for
>>> change notification.
>>>
>>> The following page has been changed by HyunsikChoi:
>>> http://wiki.apache.org/hama/GraphPackage
>>>
>>> New page:
>>> = The Graph Package (Angrapa) =
>>> The graph package, called Angrapa, is a large-scale graph data management
>>> framework for analytical processing. It is still an ongoing project. It
>>> will employ massive parallelism on Hadoop. It aims to achieve the
>>> scalability for processing terabytes or petabytes of graph data.
>>> Angrapa will be used in a variety of scientific and industrial areas, such as data
>>> mining, machine learning, information retrieval, bioinformatics, and social
>>> networks, that require processing large-scale graph data.
>>>
>>> = Description =
>>> The graph package is a new programming framework for graph processing.
>>>
>>> = The Main Goal =
>>>  * Easy APIs familiar to graph features
>>>  * Storage structure suited to graph data, taking into account the
>>> connectivity of graph data
>>>  * Applying a data communication method (i.e., BSP) without deterioration of
>>> graph data locality
>>>
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> [email protected]
>> http://blog.udanax.org
>
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> [email protected]
> http://blog.udanax.org

--
Best Regards, Edward J. Yoon @ NHN, corp.
[email protected]
http://blog.udanax.org
