Re: Please welcome our newest PMC member, Maja!

2014-04-22 Thread Sebastian Schelter
Great to have you in the PMC! On 04/22/2014 05:43 PM, Avery Ching wrote: Maja has been working on Giraph for over a year and is one of our biggest contributors. Adding her to the Giraph PMC in recognition of her impressive work is long overdue. Some of her major contributions include

Re: Powered-by Giraph page

2014-04-09 Thread Sebastian Schelter
We have a guy from Spotify on the list, who seems to be evaluating Giraph. You could ask whether they want to be on that page. On 04/09/2014 03:29 PM, Claudio Martella wrote: Hello giraphers, as Giraph is getting more visibility and users, I think it would be nice to add a Powered-by page on

Re: Information

2014-03-26 Thread Sebastian Schelter
For such a small graph, using a single-machine graph processing system makes more sense imho. Should be faster and easier to program. Google for cassovary. On 26.03.2014 10:12, Angelo Immediata angelo...@gmail.com wrote: Hi there In my project I have to implement a routing system with good

Re: Information

2014-03-26 Thread Sebastian Schelter
Hi Angelo, It very much depends on your use case. Do you want to precompute paths offline in batch or are you looking for a system that answers online? Giraph has been built for the first scenario. --sebastian On 03/26/2014 02:48 PM, Angelo Immediata wrote: hi Claudio so, if I understood

Re: Using Giraph

2014-03-11 Thread Sebastian Schelter
Try mvn -Phadoop_1 -DskipTests clean package That should build for Hadoop 1.x On 03/11/2014 07:23 PM, Arko Provo Mukherjee wrote: To add on to the last post, I am using Hadoop 1.2.1. I want to be sure that is not causing an issue? Thanks regards Arko On Tue, Mar 11, 2014 at 1:19 PM, Arko

Re: Different num supersteps

2014-03-03 Thread Sebastian Schelter
that one gives me different outputs for the same input graph. cheers Martin On Mon, Mar 3, 2014 at 8:06 AM, Sebastian Schelter s...@apache.org wrote: Martin, can you write a MapReduce job that creates your graph and run it with a simpler inputformat? I really suspect that the bug lies

Re: Sample data for Single Source shortest path

2014-03-01 Thread Sebastian Schelter
Hi Jyoti, You can find a couple of very large graphs in KONECT [1] and on the website of the laboratory for web algorithmics from the University of Milan [2]. You will probably have to convert them to an appropriate format for Giraph. Best, Sebastian [1] http://konect.uni-koblenz.de/ [2]

Re: Library of ML and Graph Mining algorithms

2014-03-01 Thread Sebastian Schelter
. On 2/27/14, 11:29 PM, Sebastian Schelter wrote: Hi, It seems a team from Telefonica built a machine learning library on top of Giraph: http://grafos.ml/ Looks pretty interesting to me :) Best, Sebastian

Re: Running Giraph-854 SimpleShortestPathsComputation

2014-02-27 Thread Sebastian Schelter
Can you provide the exact arguments you use to start the job? On 02/27/2014 10:50 AM, Agrta Rawat wrote: Hi All, I have built Giraph for patch-854 on hadoop1.0.0. I have run the example of SimpleShortestPathsComputation successfully but when I try to run some other code by following the

Re: Different num supersteps

2014-02-27 Thread Sebastian Schelter
Hi Martin, You are right, this should not happen, your code looks correct. One way to check the output would be to simply count the number of vertices per component and see if that number stays stable. Do you supply all vertices in your input data or are some vertices created during the
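The check suggested here (counting vertices per component and comparing across runs) could be sketched roughly as follows. This assumes, hypothetically, that each run writes one tab-separated `vertexId<TAB>componentId` pair per line; the output format and the function names are illustrative and not part of Giraph.

```python
from collections import Counter

def component_sizes(lines):
    """Count vertices per component from 'vertex<TAB>component' lines (assumed format)."""
    sizes = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _vertex, component = line.split("\t")
        sizes[component] += 1
    return sizes

def same_components(run_a, run_b):
    """Compare the sorted component sizes of two runs.

    Component labels may differ between runs, so only the sizes are compared.
    """
    sizes_a = sorted(component_sizes(run_a).values())
    sizes_b = sorted(component_sizes(run_b).values())
    return sizes_a == sizes_b
```

If `same_components` returns False for two runs on the same input graph, the runs really produced different groupings and not just different labels.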

Re: Different num supersteps

2014-02-27 Thread Sebastian Schelter
Hi Martin I don't think that there are problems with comparing and sorting Text writables as Hadoop is basically a big external sorting system. I'm not sure I understand your edge input reader, it looks very complex, maybe there's a bug somewhere. You could try to preprocess your data using

Re: pagerank in giraph.

2014-02-26 Thread Sebastian Schelter
Hi Suijian, Giraph has several PageRank implementations. I suggest that you use org.apache.giraph.examples.PageRankComputation which will automatically check convergence for you and correctly handle dangling vertices (vertices without any outlinks). It relies on
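For readers unfamiliar with the two points mentioned (convergence checking and dangling vertices), here is a minimal single-machine sketch. It is not the Giraph implementation; it assumes one common treatment where the rank of dangling vertices is redistributed uniformly, and all names and default parameters are illustrative.

```python
def pagerank(out_links, damping=0.85, tol=1e-9, max_iter=100):
    """Iterative PageRank on a dict {vertex: [targets]}.

    Assumption: every target also appears as a key of out_links.
    """
    vertices = list(out_links)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(max_iter):
        # Rank held by vertices without outlinks is spread over all vertices.
        dangling = sum(rank[v] for v in vertices if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in vertices}
        for v in vertices:
            targets = out_links[v]
            for t in targets:
                new_rank[t] += damping * rank[v] / len(targets)
        # Convergence check: total absolute change between iterations.
        delta = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Without the dangling term, rank would leak out of the graph at every iteration and the values would no longer sum to one.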

Re: overriding vertex value

2014-02-22 Thread Sebastian Schelter
Hi Apostolos, can you provide a few more details on what exactly you're trying to achieve? Best, Sebastian On 02/22/2014 01:07 PM, Apostolos Koutras wrote: Hi to all, can you please direct me to an older post of how to override the vertex value and implement the serializer? I am still in the

Re: overriding vertex value

2014-02-22 Thread Sebastian Schelter
a small step are needed... Thanks.. On Sat, Feb 22, 2014 at 2:32 PM, Sebastian Schelter s...@apache.org wrote: Hi Apostolos, can you provide a few more details on what exactly you're trying to achieve? Best, Sebastian On 02/22/2014 01:07 PM, Apostolos Koutras wrote: Hi to all, can you

Re: overriding vertex value

2014-02-22 Thread Sebastian Schelter
in a stalemate Any ideas so that I can make even a small step are needed... Thanks.. On Sat, Feb 22, 2014 at 2:32 PM, Sebastian Schelter s...@apache.org wrote: Hi Apostolos, can you provide a few more details on what exactly you're trying to achieve? Best, Sebastian On 02/22/2014 01:07 PM

Re: Basic questions about Giraph internals

2014-02-07 Thread Sebastian Schelter
I tried the setup with one multithreaded worker per machine for the first time a few minutes ago on a cluster of 25 machines, and my job (closeness centrality estimation on a billion edge graph) ran twice as fast! On 02/07/2014 12:21 PM, Claudio Martella wrote: Yes, I think this is the

Re: Basic questions about Giraph internals

2014-02-06 Thread Sebastian Schelter
Yes, this is correct. On 02/06/2014 12:15 PM, Alexander Frolov wrote: On Thu, Feb 6, 2014 at 3:00 PM, Claudio Martella claudio.marte...@gmail.com wrote: On Thu, Feb 6, 2014 at 11:56 AM, Alexander Frolov alexndr.fro...@gmail.com wrote: Hi Claudio, thank you. If I understood

Re: Problem with giraph deployment on the cluster

2014-02-05 Thread Sebastian Schelter
No need to apologize :) On 02/05/2014 07:15 PM, Alexander Frolov wrote: I think I have solved the problem. Configuration of Hadoop was messy. Sorry. On Wed, Feb 5, 2014 at 5:55 PM, Alexander Frolov alexndr.fro...@gmail.com wrote: On Wed, Feb 5, 2014 at 5:49 PM, Alexander Frolov

Re: IntIntNullTextInputFormat problem

2014-02-03 Thread Sebastian Schelter
Hi Alexander, what do you use as the type for your vertex ids? It looks like you are trying to use longs, while IntIntNullTextInputFormat only provides ints; that could be the error. --sebastian On 02/03/2014 05:06 PM, Alexander Frolov wrote: Hello, I am trying to load data in from file in

Re: IntIntNullTextInputFormat problem

2014-02-03 Thread Sebastian Schelter
What types does the vertex class that you use in your computation require? On 02/03/2014 05:28 PM, Alexander Frolov wrote: On Mon, Feb 3, 2014 at 8:20 PM, Sebastian Schelter s...@apache.org wrote: Hi Alexander, what do you use as the type for your vertex ids? It looks like you are trying to use

Re: IntIntNullTextInputFormat problem

2014-02-03 Thread Sebastian Schelter
:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) On Mon, Feb 3, 2014 at 8:38 PM, Alexander Frolov alexndr.fro...@gmail.com wrote: On Mon, Feb 3, 2014 at 8:30 PM, Sebastian Schelter s...@apache.org wrote What types does

Re: IntIntNullTextInputFormat problem

2014-02-03 Thread Sebastian Schelter
. PageRank and SingleSourceShortestPath both need a weighted graph to compute. I just missed it somehow. Thank you for the help. On Mon, Feb 3, 2014 at 9:25 PM, Alexander Frolov alexndr.fro...@gmail.com wrote: On Mon, Feb 3, 2014 at 9:06 PM, Sebastian Schelter s...@apache.org wrote: If you just

Re: About LineRank algo ..

2014-01-20 Thread Sebastian Schelter
Sure :) On 01/20/2014 09:39 AM, Claudio Martella wrote: do you plan to share it when you're done? :) On Mon, Jan 20, 2014 at 9:15 AM, Sebastian Schelter s...@apache.org wrote: I have a student working on an implementation, do you have questions? On 01/20/2014 08:11 AM, Jyoti Yadav wrote

Re: About LineRank algo ..

2014-01-20 Thread Sebastian Schelter
an undirected edge by two directed ones. Regarding your problems with convergence, I can give you access to my matlab code and some toy data that it converges on, so that you can test your implementation. --sebastian Thanks On Mon, Jan 20, 2014 at 2:40 PM, Sebastian Schelter s...@apache.org

Re: About LineRank algo ..

2014-01-20 Thread Sebastian Schelter
-- On 01/20/2014 05:07 PM, Jyoti Yadav wrote: Thanks Sebastian.. You pls send your code,I will also check where i went wrong.. On Mon, Jan 20, 2014 at 8:51 PM, Sebastian Schelter s...@apache.org wrote: On 01/20/2014

Intermediate output

2014-01-18 Thread Sebastian Schelter
Hi, Did we have a way to write out the state of the graph after each superstep? I have an algorithm that requires this and I don't want to buffer the intermediate results in memory until the algorithm finishes. --sebastian

Re:

2014-01-09 Thread Sebastian Schelter
Did you try to increase the number of map slots in your cluster, as suggested? On 01/09/2014 11:20 AM, Jyoti Yadav wrote: Hi.. Is anyone familiar with the error mentioned below? ERROR: org.apache.giraph.master.BspServiceMaster: checkWorkers: Did not receive enough processes in time (only 0

Re: Which partition scheme is used by default in giraph 1.0?

2013-12-09 Thread Sebastian Schelter
By hash on vertex id. On 09.12.2013 11:01, Yi Lu wrote: Hi, I have a question on load balance in Giraph. How does Giraph load data if only VertexInputFormat is used? By hash on vertex id, or range based? Thank you. -Yi
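To illustrate what "by hash on vertex id" means in practice, here is a small sketch. It is not Giraph's partitioner code; the function names and the plain modulo scheme are assumptions made for the example.

```python
def partition_for(vertex_id: int, num_partitions: int) -> int:
    """Pick a partition by hashing the vertex id (illustrative modulo scheme)."""
    # Python's % with a positive modulus always yields a value in [0, num_partitions).
    return hash(vertex_id) % num_partitions

def partition_vertices(vertex_ids, num_partitions):
    """Group vertex ids into num_partitions lists using partition_for."""
    partitions = [[] for _ in range(num_partitions)]
    for v in vertex_ids:
        partitions[partition_for(v, num_partitions)].append(v)
    return partitions
```

Unlike a range-based scheme, consecutive ids end up in different partitions, which tends to balance the vertex count per partition regardless of how the ids are ordered in the input.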

Re: Need your help to serialize an boolean array....

2013-11-20 Thread Sebastian Schelter
What errors exactly do you get? Can you show the whole implementation of your vertex? On 20.11.2013 08:42, Jyoti Yadav wrote: Hi folks.. I am implementing one program where I need to pass message as boolean array.. While implementing my MyMessageWritable.java class,I need to define

Re: About Giraph Design

2013-10-28 Thread Sebastian Schelter
Centrality algo then what should we do? Regards Jyoti On Mon, Oct 28, 2013 at 6:10 PM, Sebastian Schelter s...@apache.org wrote: Hi Jyoti, All-Pairs-Shortest-Path is a problem with a solution quadratic in the number of vertices of the input graph. For a reasonably large graph, you

Re: How to specify parameters in order to run giraph job in parallel

2013-10-18 Thread Sebastian Schelter
Da, Holding objects in serialized form as bytes in byte arrays consumes much less memory than holding them as Java objects (which have a huge overhead), I think that is the other main reason for serialization. --sebastian On 18.10.2013 19:28, YAN Da wrote: Dear Claudio Martella, According
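The memory argument can be illustrated outside of Java as well. The sketch below uses Python only to show the principle (per-object overhead versus a packed byte representation); the absolute numbers differ from the JVM, and the helper names are made up for this example.

```python
import struct
import sys

def object_footprint(values):
    """Approximate bytes used by a list of int objects (container plus elements)."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

def serialize(values):
    """Pack the integers as big-endian 8-byte signed longs into one bytes object."""
    return struct.pack(f">{len(values)}q", *values)

def deserialize(data):
    """Unpack the bytes back into a list of integers."""
    return list(struct.unpack(f">{len(data) // 8}q", data))
```

Comparing `len(serialize(values))` with `object_footprint(values)` for a few thousand values shows the packed form taking 8 bytes per value, while each individual object carries its own header on top of the payload.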

Re: workload used to measure Giraph performance number

2013-10-02 Thread Sebastian Schelter
Another option is to use the Koblenz network collection [1], which offers even more (and larger) datasets than Snap. Best, Sebastian [1] http://konect.uni-koblenz.de/ On 02.10.2013 17:41, Alok Kumbhare wrote: There are a number of real (medium sized) graphs at

Re: Using Giraph at Facebook

2013-08-14 Thread Sebastian Schelter
Just ran into it, great read. I hope that I will be able to contribute again in the future. Awesome job done! On 14.08.2013 16:55, Avery Ching ach...@apache.org wrote: Hi Giraphers, We recently released an article on how we can use Giraph at the scale of a trillion edges at Facebook. If you're

[JOBS] Postdoctoral Research Position at TU Berlin

2013-06-02 Thread Sebastian Schelter
Hi there, my research group is looking for a PostDoc, if you know anyone interested, please forward this. Thank you, Sebastian Postdoctoral Research Position in Big Data Analytics Systems at TU Berlin Area of Work The

Re: Trying to implement program to find betweenness centrality in giraph

2013-02-08 Thread Sebastian Schelter
How large is the graph for which you are trying to compute betweenness centrality? On 08.02.2013 13:22, Claudio Martella wrote: Unfortunately there is no way to disable the counter limit completely. Counters are very expensive as they require the jobtracker to keep a lot of information for the

Re: GreenMarl

2013-02-05 Thread Sebastian Schelter
: https://github.com/stanford-ppl/Green-Marl The BC algorithm: https://github.com/stanford-ppl/Green-Marl/blob/master/apps/src/bc_random.gm On Tue, Feb 5, 2013 at 8:07 PM, Sebastian Schelter s...@apache.org wrote: Hello Pradeep, the standard betweenness and closeness measures do not scale

Re: Deadlock when running on Hadoop 1.0.4

2013-01-25 Thread Sebastian Schelter
, Sebastian Schelter s...@apache.org wrote: Hi, I'm testing a custom PageRank implementation using trunk on Hadoop 1.0.4. I seem to run into a deadlock after the input superstep. The workers report finishSuperstep: (all workers done) WORKER_ONLY - Attempt=0, Superstep=0 and the master reports that all

Deadlock when running on Hadoop 1.0.4

2013-01-21 Thread Sebastian Schelter
Hi, I'm testing a custom PageRank implementation using trunk on Hadoop 1.0.4. I seem to run into a deadlock after the input superstep. The workers report finishSuperstep: (all workers done) WORKER_ONLY - Attempt=0, Superstep=0 and the master reports that all workers are done with superstep -1.

Re: trunk on hadoop 1.0.4

2013-01-16 Thread Sebastian Schelter
I haven't but I intend to try running on a 26 machine 1.0.4 cluster next week. /s On 16.01.2013 16:27, Claudio Martella wrote: Hey guys, is anybody of you running successfully trunk on hadoop 1.0.4? I'm failing the tests (both on a real working cluster and in pseudo-distributed) and simple

Trouble building trunk with profile hadoop_1.0

2013-01-15 Thread Sebastian Schelter
Hi guys, long time no see :) I'm trying to test the current trunk on my Hadoop 1.0.4 cluster, but I have trouble building it: mvn -Phadoop_1.0 -DskipTests clean package [INFO] Apache Giraph Parent .. SUCCESS [0.164s] [INFO] Apache Giraph

Re: Trouble building trunk with profile hadoop_1.0

2013-01-15 Thread Sebastian Schelter
Oops. That could be the problem :) Thank you! On 15.01.2013 12:37, Claudio Martella wrote: Hi, btw are you getting trunk from git? On Tue, Jan 15, 2013 at 11:50 AM, Sebastian Schelter s...@apache.org wrote: Hi guys, long time no see :) I'm trying to test the current trunk on my Hadoop

Re: Minimum superstep time

2012-11-27 Thread Sebastian Schelter
Are the vertices that are active in the supersteps dependent on each other? If this is not the case, you could try to execute the messaging simultaneously. Could you give a few more details about the problem you are trying to solve? /s On 27.11.2012 19:07, Jonathan Bishop wrote: Hi,

Re: All pairs shortest paths

2012-09-24 Thread Sebastian Schelter
Hi Venkata, The result of All-Pairs-Shortest-Paths is quadratic in the number of vertices which makes it infeasible for sufficiently large graphs. Best, Sebastian On 24.09.2012 20:24, Venkata Sastry Malladi wrote: I'm thinking of modifying the simple shortest paths algorithm so that it
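A quick back-of-the-envelope calculation shows why the quadratic result size is the limiting factor. The sketch assumes one distance per ordered pair of distinct vertices and 8 bytes per distance; both are assumptions for illustration only.

```python
def apsp_result_entries(num_vertices: int) -> int:
    """Number of (source, target) distance entries, excluding self-pairs."""
    return num_vertices * (num_vertices - 1)

def apsp_result_bytes(num_vertices: int, bytes_per_entry: int = 8) -> int:
    """Raw size of the full result, ignoring any storage overhead."""
    return apsp_result_entries(num_vertices) * bytes_per_entry
```

For a graph with one million vertices this gives 999,999,000,000 entries, roughly 8 TB of raw distances, before any vertex ids or framework overhead are counted.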

Re: Trying to run the Connected Components example.

2012-08-06 Thread Sebastian Schelter
1239 1 2989 1 3961 2 5417 2 7350 What am I doing wrong? Also, in general does the graph have to have int values for nodes? Or can I have strings? Appreciate your help! Vishal On Mon, Aug 6, 2012 at 2:22 PM, Sebastian Schelter s...@apache.org wrote