Please welcome our newest committer, Igor Kabiljo!

2015-02-10 Thread Maja Kabiljo
I am pleased to announce that Igor Kabiljo has been invited to become a committer by the Project Management Committee (PMC) of Apache Giraph, and he accepted. Igor's most important contributions are implementing reduce/broadcast that generalizes aggregators and working on primitive

Please welcome our newest committer, Sergey Edunov!

2014-12-03 Thread Maja Kabiljo
I am happy to announce that the Project Management Committee (PMC) for Apache Giraph has elected Sergey Edunov to become a committer, and he accepted. Sergey has been an active member of the Giraph community, finding issues, submitting patches and reviewing code. We’re looking forward to Sergey’s

Re: [RESULT] [VOTE] Apache Giraph 1.1.0 RC2

2014-11-24 Thread Maja Kabiljo
1.1.0 RC2 as the 1.1.0 release of Apache Giraph passes. Thanks to everybody who spent time on validating the bits! The vote tally is +1s: Claudio Martella (binding), Maja Kabiljo (binding), Eli Reisman (binding), Roman Shaposhnik (non-binding). I'll do

Re: [VOTE] Apache Giraph 1.1.0 RC2

2014-11-13 Thread Maja Kabiljo
+1, thanks Roman! From: Claudio Martella claudio.marte...@gmail.com Reply-To: user@giraph.apache.org Date: Thursday, November 13, 2014 at 5:53 AM To:

Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-11-03 Thread Maja Kabiljo
We've been running code which is the same as the release candidate plus the fix for GIRAPH-961 in production for 5 days now, no problems. This is the hadoop_facebook profile, using only hive-io from all io modules. On 11/1/14, 3:49 PM, Roman Shaposhnik ro...@shaposhnik.org wrote: Ping! Any progress on

Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-10-29 Thread Maja Kabiljo
Roman, again thanks for taking care of the release. We found one issue https://issues.apache.org/jira/browse/GIRAPH-961 - any application using MasterLoggingAggregator fails without this fix. Can we backport it to the release? Thanks, Maja On 10/26/14, 12:25 AM, Roman Shaposhnik

Re: Running one compute function after another..

2014-01-11 Thread Maja Kabiljo
Hi Jyoti, A cleaner way to do this is to switch the Computation class that is used at the moment your condition is satisfied. So you can have an aggregator to check whether the condition is met, and then in your MasterCompute you call setComputation(SecondComputationClass.class) when needed.
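
The control flow described in this reply can be illustrated with a small plain-Java toy. It does not use the Giraph API: PhaseSwitchDemo, FIRST, SECOND and the threshold condition are all hypothetical names invented for this sketch. The loop plays the role of the master, the boolean stands in for an aggregator, and swapping the operator stands in for setComputation(...).

```java
import java.util.function.IntUnaryOperator;

// Toy model only: no Giraph classes are involved.
class PhaseSwitchDemo {
  // Hypothetical first and second per-vertex computations.
  static final IntUnaryOperator FIRST = v -> v + 1;
  static final IntUnaryOperator SECOND = v -> v * 2;

  static int[] run(int[] initialValues, int threshold, int supersteps) {
    int[] vertexValues = initialValues.clone();
    IntUnaryOperator computation = FIRST;
    // Stand-in for an aggregated value: true once any vertex reached the threshold.
    boolean conditionMet = false;

    for (int superstep = 0; superstep < supersteps; superstep++) {
      // "Master" step: runs before the vertices and sees the value
      // aggregated during the previous superstep.
      if (conditionMet) {
        computation = SECOND; // analogous to switching the Computation class
      }

      // "Worker" step: every vertex runs the currently selected computation
      // and contributes to the aggregated condition.
      boolean aggregated = false;
      for (int i = 0; i < vertexValues.length; i++) {
        vertexValues[i] = computation.applyAsInt(vertexValues[i]);
        aggregated |= vertexValues[i] >= threshold;
      }
      conditionMet = aggregated;
    }
    return vertexValues;
  }
}
```

Because the master only sees what was aggregated in the previous superstep, the switch takes effect one superstep after the condition first becomes true.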

Re: About writing our own aggregator..

2014-01-09 Thread Maja Kabiljo
Hi Jyoti, You can take a look inside the org.apache.giraph.aggregators package, there are many implementations already there. Some simple, like LongSumAggregator, and some more complex ones inside the matrix package. Please look through that and let me know if you need additional help. When you

Re: Problem with Giraph (please help me)

2014-01-09 Thread Maja Kabiljo
Hi Chadi, That does seem like a serialization issue. Which OutEdges class are you using, is it something you implemented? Regards, Maja From: chadi jaber chadijaber...@hotmail.com Reply-To: user@giraph.apache.org

Re: A Vertex Holds Other Than Text

2014-01-08 Thread Maja Kabiljo
Hi Agrta, Take a look at IntIntTextVertexValueInputFormat for example, where vertex values are ints. If your vertex values are complex objects, you need to create a class which implements the Writable interface and holds all your data, and then extend the input format to read all the
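
A minimal sketch of such a value class follows. To keep it compilable without Hadoop on the classpath, it declares a local stand-in interface with the same two methods as org.apache.hadoop.io.Writable; in a real job you would implement the Hadoop interface instead. RankAndLabelValue, its fields and the roundTrip helper are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in mirroring the two methods of org.apache.hadoop.io.Writable.
interface Writable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical vertex value that holds more than one field.
class RankAndLabelValue implements Writable {
  double rank;
  String label = "";

  // No-argument constructor so an instance can be created before readFields().
  RankAndLabelValue() { }

  RankAndLabelValue(double rank, String label) {
    this.rank = rank;
    this.label = label;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeDouble(rank);
    out.writeUTF(label);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    // Fields must be read in exactly the order they were written.
    rank = in.readDouble();
    label = in.readUTF();
  }

  // Serializes a value to bytes and reads it back, as a quick consistency check.
  static RankAndLabelValue roundTrip(RankAndLabelValue value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      value.write(new DataOutputStream(bytes));
      RankAndLabelValue copy = new RankAndLabelValue();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A round trip like this is a cheap way to catch a mismatch between write() and readFields() before running a job.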

Re: Extending AbstractComputation

2014-01-08 Thread Maja Kabiljo
Hi Pushparaj and Peter, There is going to be one Computation per partition in each of the supersteps. Each partition is processed by a single thread, so accessing any data inside of your Computation is thread-safe. Multiple threads are going to be executing computation on multiple partitions,

Re: MultiVertexInputFormat

2013-08-21 Thread Maja Kabiljo
Hi Yasser, You can do this through the Configuration parameters. You should call: description1.addParameter("myApplication.vertexInputPath", "file1.txt"); and description2.addParameter("myApplication.vertexInputPath", "file2.txt"); Then from the code of your InputFormat class you can get this parameter

Re: Multiple Data Sources

2013-07-16 Thread Maja Kabiljo
Hi Tom, We recently added something like this, please take a look at MultiVertexInputFormat. That one can basically wrap any number of vertex input formats, coming from any sources. You can also take a look at HiveGiraphRunner to see how it's used there. As for multiple vertex types, we don't

Re: Regarding multiple values of a vertex

2013-07-09 Thread Maja Kabiljo
Hi Harsh, The other thing you can do at the moment is make another implementation of Partition (similar to SimplePartition) which is going to do a different thing when a duplicate vertex is encountered, and then set giraph.partitionClass to your Partition. Maja From: Alessandro Presta

Re: Are new vertices active?

2013-07-01 Thread Maja Kabiljo
Hi Christian, As the javadoc for getTotalNumVertices() says, it returns the number of vertices which existed in the previous superstep, so newly created vertices are not going to be counted there. In the code, mutations are applied before the next superstep starts. The way it's currently implemented,

Re: What if the resulting graph is larger than the memory?

2013-05-17 Thread Maja Kabiljo
Hi JU, One thing you can try is to use the out-of-core graph (giraph.useOutOfCoreGraph option). I don't know what your exact use case is: is the graph itself huge, or is it the data that your application calculates? In the second case, there is 'giraph.doOutputDuringComputation'

Re: Broadcast of large aggregated value is slow.

2013-05-16 Thread Maja Kabiljo
Eric, Can you please take a look at the logs of one of the workers listed (13, 34, 38, 50, 48, 52, 58, 56), what are they doing? The fact that a worker is waiting on aggregator can have different causes, it doesn’t necessarily mean that sending aggregators is slow. It can for example mean that

Re: Broadcast of large aggregated value is slow.

2013-05-16 Thread Maja Kabiljo
Agility. Ingenuity. Integrity. From: Maja Kabiljo majakabi...@fb.com Reply-To: user@giraph.apache.org

Re: Custom halt condition

2013-03-29 Thread Maja Kabiljo
Hi Nicolas, You are right, using aggregators and master compute is the way to go. Please take a look at https://cwiki.apache.org/confluence/display/GIRAPH/Aggregators to learn more about aggregators. From MasterCompute.compute() you will be calling haltComputation() when you decide it's time

Re: Waiting for times required to be 19 (currently 18)

2013-02-21 Thread Maja Kabiljo
Nate, Are all the workers waiting for a request from the same worker? (In the log, "waitSomeRequests: Waiting for request" with its destTask is what you should look at.) If so, check if there is some exception on that worker. You can also try decreasing giraph.maxRequestMilliseconds and see what happens

Re: Waiting for times required to be 19 (currently 18)

2013-02-21 Thread Maja Kabiljo
Nate, Great, glad to hear it works! We resend open requests after 10 minutes, so that's why you were seeing supersteps taking that long. Have fun with Giraph and let us know if you have any other questions. Maja From: Nate touring_...@msn.com Reply-To:

Re: Where can I find a simple Hello World example for Giraph

2013-02-21 Thread Maja Kabiljo
Hi Ryan, Before running the job, you need to set Vertex and input/output format classes on it. Please take a look at one of the benchmarks to see how to do that. Alternatively, you can try using GiraphRunner, where you pass these classes as command line arguments. Maja On 2/21/13 2:43 PM, Ryan

Re: InputFormat for the example SimpleMasterComputeVertex

2013-02-21 Thread Maja Kabiljo
That sounds great to me, maybe just a mention in the wiki that the two functionalities are tied together will help the idea click for people. Either way this will be a big help I think. On Thu, Feb 21, 2013 at 3:24 PM, Maja Kabiljo majakabi...@fb.com wrote: Eli, that's

Re: InputFormat for the example SimpleMasterComputeVertex

2013-02-14 Thread Maja Kabiljo
A Progressable exception can have many different causes (it's totally unrelated to aggregators), and by looking at the exception that caused it, users should get a better sense of what's going on. What you are suggesting about providing default master compute is not doable, since the

Re: Can Giraph handle graphs with very large number of edges per vertex?

2012-09-13 Thread Maja Kabiljo
Hi Jeyendran, As Paolo mentioned, there were two patches to deal with out-of-core: GIRAPH-249 for out-of-core graph and GIRAPH-45 for out-of-core messages. For the graph part, the current assumption is that you have enough memory to keep at least one whole partition at a time. Options you need to set

Re: How to register aggregators with the 'new' Giraph?

2012-09-12 Thread Maja Kabiljo
. Thanks again! On Tue, Sep 11, 2012 at 9:36 AM, Paolo Castagna castagna.li...@gmail.com wrote: Hi Maja, yep, your explanation makes sense. Clear now. Paiki On 11 September 2012 16:09, Maja Kabiljo majakabi...@fb.com wrote: Hi Paolo, Glad