Hi Edward,

Thank you for the reply. I will look into BSPLib. I am also looking into the Parallel Boost Graph Library. The discussion at http://stackoverflow.com/questions/3010805/scalable-parallel-large-graph-analysis-library is useful to me, and I am also considering the libraries mentioned there as alternatives :).

Regards,
Raghava.

On Sun, Sep 12, 2010 at 9:25 PM, Edward J. Yoon <[email protected]> wrote:

> I estimate a month or two months for 0.2.0 release.
>
> If input/output system and fault tolerant mechanism are added to BSP
> package in the future, the graph specified programming model and
> framework will be implemented easily. I guess, we can implement
> input/output system and FT mechanism within this year.
>
> > Do you know of any alternate parallel graph processing frameworks
> > similar to Pregel & Hama?
>
> Nope, but I was used BSPLib for simulate of Pregel concept. It might
> be good solution for you.
>
> > About the data split -- if, for example, there are 10 nodes in the
> > cluster and the data be divided into 10 splits (split-1 to split-10),
> > then can we control which split goes to which node as local data? In
> > case of MR splits, this cannot be controlled isn't it, can we do that
> > here?
>
> I understood your question.
>
> It is depending on {"how to designing the data structure", "how to
> storing, organizing and re-using data"} on "somewhere". We don't have
> a plan for graph data store yet.
>
> Thanks. :)
>
> On Fri, Sep 10, 2010 at 1:38 PM, Raghava Mutharaju
> <[email protected]> wrote:
> > Hello Edward,
> >
> > Thank you for the reply. Please correct me if I am wrong about what
> > I am going to say.
> >
> > If the BSP computing framework is in place, how much more of a work
> > would it be to place a graph processing framework on top of it? I
> > guess some parts of the graph processing framework (Angrapa) is in
> > place?
> >
> > While I was searching for parallel graph processing frameworks, I
> > came across Pregel and also Hama :). Pregel development would have
> > taken lot of time, Hama is just starting out, so it would be
> > unrealistic to make it as robust with as many features as Pregel,
> > but it would be great to have something in place to test out my
> > ideas.
> >
> > When is the release of 0.2.0 scheduled?
> >
> > Do you know of any alternate parallel graph processing frameworks
> > similar to Pregel & Hama?
> >
> > About the data split -- if, for example, there are 10 nodes in the
> > cluster and the data be divided into 10 splits (split-1 to split-10),
> > then can we control which split goes to which node as local data? In
> > case of MR splits, this cannot be controlled isn't it, can we do that
> > here?
> >
> > Thank you.
> >
> > Regards,
> > Raghava.
> >
> > On Thu, Sep 9, 2010 at 10:47 PM, Edward J. Yoon <[email protected]> wrote:
> >
> >> Hello,
> >>
> >> > 1) What is the status of the project, specifically the graph
> >> > processing part (Angrapa?). Is it sufficiently stable to be used?
> >> > Although this is an academic research project, it would be better
> >> > to work on a stable one.
> >>
> >> At present, we're focussing on a framework for more general-purpose
> >> BSP computing, so yet far from the graph processing framework such as
> >> Google Pregel.
> >>
> >> We have a release plan for 0.2.0 version and we're working on it. The
> >> release 0.2.0 will include:
> >>
> >> * BSP computing framework (no fault tolerant mechanism, no data
> >>   input-output API)
> >> * and its examples
> >>
> >> > 2) I haven't come across any installation/building steps for Hama.
> >> > How to integrate with HDFS/HBase?
> >>
> >> We'll create a input-output system that can be used to process data.
> >> You can think it as a M/R computing framework on HDFS/HBase.
> >>
> >> > 3) Are there more extensive performance tests say w.r.t the latest
> >> > branch of development? Do they have better performance?
> >>
> >> Not yet.
> >>
> >> > 4) Can the data assigned to each partition (cluster) be split
> >> > according to some condition i.e. can it be controlled unlike a MR
> >> > split?
> >>
> >> Do you mean, whether it can assign a task to slaves according to other
> >> condition (not based on local)? Then, no.
> >>
> >> The all splits should be loaded and computed locally. Otherwise, it
> >> will cause meaningless huge data-copy overhead among servers.
> >>
> >> Thanks :)
> >>
> >> On Fri, Sep 10, 2010 at 7:09 AM, Raghava Mutharaju
> >> <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I am working on a research project where I faced the issues that
> >> > formed the motivation for Hama (Hamburg) -- the splits in the data
> >> > depend on each other and data locality issue in case of multiple
> >> > MR iterations. I was thinking of checking other alternatives to MR
> >> > when I came across Hama. I am in the process of checking whether
> >> > Hama would fit our project needs and I need your help in that
> >> > regard.
> >> >
> >> > I am interested in the graph processing part of Hama.
> >> >
> >> > I have the following questions
> >> >
> >> > 1) What is the status of the project, specifically the graph
> >> > processing part (Angrapa?). Is it sufficiently stable to be used?
> >> > Although this is an academic research project, it would be better
> >> > to work on a stable one.
> >> > 2) I haven't come across any installation/building steps for Hama.
> >> > How to integrate with HDFS/HBase?
> >> > 3) Are there more extensive performance tests say w.r.t the latest
> >> > branch of development? Do they have better performance?
> >> > 4) Can the data assigned to each partition (cluster) be split
> >> > according to some condition i.e. can it be controlled unlike a MR
> >> > split?
> >> >
> >> > Thank you.
> >> >
> >> > Regards,
> >> > Raghava.
> >> >
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> [email protected]
> >> http://blog.udanax.org
> >>
> >
>
> --
> Best Regards, Edward J. Yoon
> [email protected]
> http://blog.udanax.org
>
