As we discussed before, we are considering refactoring the packages as below:

org.apache.hama.bsp
org.apache.hama.mapred
org.apache.hama.matrix
org.apache.hama.graph
org.apache.hama.examples

So, I'd like to rearrange the architecture page and discuss it
before committing the changes.

This is my rough idea.

- Our Goal
- About BSP and Map/Reduce
- Matrix Computing Strategies
- Graph Computing Strategies
- Examples in the matrix and graph computation areas
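
For the "About BSP and Map/Reduce" item, a small sketch might help readers
see what a superstep is. The Java below is only an illustration under my own
assumptions: the class BspSketch, the method run, and the ring layout are
hypothetical and are not Hama APIs. Each peer sends its current value to its
right neighbour, and incoming messages become visible only after the barrier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of BSP supersteps; not part of Hama.
class BspSketch {

    // Runs `supersteps` rounds over peers arranged in a ring (an assumption
    // made for this sketch). Each peer starts with one value and ends with
    // the largest value that has reached it so far.
    static int[] run(int[] initial, int supersteps) {
        int n = initial.length;
        int[] local = Arrays.copyOf(initial, n);

        for (int step = 0; step < supersteps; step++) {
            // Messages sent in this superstep are buffered per receiver.
            List<List<Integer>> inbox = new ArrayList<>();
            for (int p = 0; p < n; p++) {
                inbox.add(new ArrayList<>());
            }

            // Communication phase: every peer sends its current value
            // to its right neighbour.
            for (int p = 0; p < n; p++) {
                inbox.get((p + 1) % n).add(local[p]);
            }

            // Barrier: in this single-threaded sketch, finishing the loop
            // above stands in for the global synchronization point.

            // Local computation on the delivered messages.
            for (int p = 0; p < n; p++) {
                for (int m : inbox.get(p)) {
                    local[p] = Math.max(local[p], m);
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // With 4 peers in a ring, 3 supersteps are enough for the maximum
        // to reach every peer.
        int[] result = run(new int[] {3, 9, 4, 1}, 3);
        System.out.println(Arrays.toString(result)); // prints [9, 9, 9, 9]
    }
}
```

The point of the sketch is that peers keep their state between supersteps
and exchange only small messages, whereas an iterative M/R version would
typically launch a new job and re-read its input for every round.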

What do you think?

On Fri, Sep 18, 2009 at 6:51 PM, Edward J. Yoon <[email protected]> wrote:
> In a distributed system, matrix and graph computations both need
> a lot of communication between nodes. IMO, there is no way to
> avoid it. Of course, they could be performed by M/R iterations, but that
> seems very slow and adds overhead. I think that's why we'd
> like to survey and consider the BSP (bulk synchronous parallel) model.
>
> - We need to explain the theory behind BSP and how to apply it to our project.
>
> Also, matrix and graph computations are closely connected, and I expect
> synergy between the two. However, I think we should clarify the
> relationship between matrix and graph, and our main goal.
>
> Any advice is welcome.
>
> On Fri, Sep 18, 2009 at 6:30 PM, Edward J. Yoon <[email protected]> wrote:
>> First, we need to share our plans and consider the overall architecture.
>>
>> What is BSP? What's the relationship between matrix and graph?
>> What's the plan for the matrix and graph packages? What's our main
>> goal?
>>
>> On Fri, Sep 18, 2009 at 5:52 PM, Apache Wiki <[email protected]> wrote:
>>> Dear Wiki user,
>>>
>>> You have subscribed to a wiki page or wiki category on "Hama Wiki" for 
>>> change notification.
>>>
>>> The following page has been changed by HyunsikChoi:
>>> http://wiki.apache.org/hama/GraphPackage
>>>
>>> New page:
>>> = The Graph Package (Angrapa) =
>>> The graph package, called Angrapa, is a large-scale graph data management
>>> framework for analytical processing. It is still an ongoing project. It
>>> will employ massive parallelism on Hadoop. It aims to scale to
>>> terabytes or petabytes of graph data. Angrapa
>>> will be used in a variety of scientific and industrial areas, such as data
>>> mining, machine learning, information retrieval, bioinformatics, and social
>>> networks, that need to process large-scale graph data.
>>>
>>> = Description =
>>> The graph package is a new programming framework for graph processing.
>>>
>>> = The Main Goal =
>>>  * Easy APIs that match familiar graph features
>>>  * A storage structure suited to graph data, taking the connectivity of
>>> the graph into account
>>>  * A data communication method (i.e., BSP) applied without degrading
>>> graph data locality
>>>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
>> [email protected]
>> http://blog.udanax.org
>>
>
>
>
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> [email protected]
> http://blog.udanax.org
>



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
[email protected]
http://blog.udanax.org
