Re: Maybe a bug in PartitionBalancer.java

2016-04-13 Thread lukas nalezenec
> > > - Original Message - > From: "lukas nalezenec" <lukas.naleze...@gmail.com> > To: user@giraph.apache.org > Sent: Wednesday, 13 April, 2016 02:51:10 > Subject: RE: Maybe a bug in PartitionBalancer.java > > Hi, > > I have already fixed that, patch

RE: Maybe a bug in PartitionBalancer.java

2016-04-13 Thread lukas nalezenec
Hi, I have already fixed that; a patch is available. See ticket GIRAPH-886. Apply the patch, test it, and then please contact a Giraph committer to merge it. Lukas Hi there, We would like to change the balance algorithm. For example, we

Re: Graph job self-killed after superstep 0 with large input

2015-05-22 Thread Lukas Nalezenec
On 22.5.2015 12:25, Hai Lan wrote: Missing chosen workers [Worker(hostname=bespin05.umiacs.umd.edu, MRtaskID=2, port=30002), Worker(hostname=bespin04d.umiacs.umd.edu, MRtaskID=6, port=30006),

Re: Giraph + HBase – How to define HBase connectivity properties?

2015-05-12 Thread Lukas Nalezenec
Hi, How about hive.zookeeper.quorum? Lukas On 11.5.2015 19:37, G.W. wrote: Hi, org.apache.giraph.io.hbase has a couple of classes to load graph data from HBase. It has an HBaseVertexInputFormat class, but does anyone know how the HBase location is to be passed to this class? G

Re: Custom assignment of partitions to workers

2015-03-25 Thread Lukas Nalezenec
Hi, There are two interfaces: WorkerGraphPartitioner maps vertexes to partitions; MasterGraphPartitioner maps partitions to workers. So you need a custom MasterGraphPartitioner. You don't need any external preprocessing step. Lukas On 25.3.2015 19:51, Arjun Sharma wrote: Hi, I understand we
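
A minimal sketch of the two-level mapping described in this reply, in plain Java. It does not use the Giraph interfaces; the class and method names (TwoLevelPartitioningSketch, partitionOf, workerOf) and the block-assignment rule are assumptions made for illustration only.

```java
public class TwoLevelPartitioningSketch {
    // Level 1 (the role the reply assigns to WorkerGraphPartitioner):
    // vertex id -> partition id. Hypothetical rule: modulo on the id.
    static int partitionOf(long vertexId, int partitionCount) {
        return (int) Math.floorMod(vertexId, (long) partitionCount);
    }

    // Level 2 (the role the reply assigns to MasterGraphPartitioner):
    // partition id -> worker index. Hypothetical custom rule: each worker
    // owns a contiguous block of partitions.
    static int workerOf(int partitionId, int partitionCount, int workerCount) {
        int partitionsPerWorker = (partitionCount + workerCount - 1) / workerCount;
        return partitionId / partitionsPerWorker;
    }

    public static void main(String[] args) {
        int partitions = 8;
        int workers = 4;
        for (long vertexId = 0; vertexId < 10; vertexId++) {
            int partition = partitionOf(vertexId, partitions);
            int worker = workerOf(partition, partitions, workers);
            System.out.println("vertex " + vertexId + " -> partition " + partition
                    + " -> worker " + worker);
        }
    }
}
```

Only the second function would change for a custom assignment of partitions to workers; the vertex-to-partition rule stays untouched, which is why no external preprocessing step is needed.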

Re: Custom assignment of partitions to workers

2015-03-25 Thread Lukas Nalezenec
at 12:54 PM, Lukas Nalezenec lukas.naleze...@firma.seznam.cz wrote: Hi, There are two interfaces: WorkerGraphPartitioner maps vertexes to partitions; MasterGraphPartitioner maps partitions to workers. So you need a custom MasterGraphPartitioner. You

Re: Number of concurrent workers

2015-01-27 Thread Lukas Nalezenec
On 23.1.2015 00:40, Walaa Eldin Moustafa wrote: Hi, I am experimenting with a memory-intensive Giraph application on top of a large graph (50 million nodes), on a 14 node cluster. When setting the number of workers to a large number (500 in this example), I get errors for not being able to

Re: Giraph and Ruby?

2014-11-25 Thread Lukas Nalezenec
Hi, There is only Python support. If you want to use Ruby, just prepare the files and submit the MR job from the command line. Lukas On 25.11.2014 00:24, Adam Fields wrote: Hi - I’m just getting started with Giraph, and the documentation is vague on this point - is there any way to load data into the

Re: Multiple sendMessage calls vs. sendMessageToMultipleEdges

2014-10-22 Thread Lukas Nalezenec
Hi Matthew, See class SendMessageToAllCache. It's in the same directory as SendMessageCache. The first class is not used by Giraph unless you set the property giraph.oneToAllMsgSending to true. Lukas On 22.10.2014 20:10, Matthew Saltz wrote: Hi everyone, I have two questions: *Question 1)*

Re: The relation between the number of partitions, number of workers, number of mappers

2014-09-22 Thread Lukas Nalezenec
Hi, Number of mappers = number of workers. Number of partitions = multiplier * (number of workers)^2 by default (multiplier = 1 by default). Lukas On 22.9.2014 23:18, xuhong zhang wrote: I know that the number of mappers equals the number of worker *
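
A small worked example of the default stated in this reply, in plain Java. The class and method names are hypothetical, and the sketch only evaluates the formula as quoted; the real Giraph code may apply further limits that this message does not mention.

```java
public class PartitionCountSketch {
    // Formula from the reply: partitions = multiplier * workers^2.
    // The floor of the product is taken, with a minimum of 1 (an assumption
    // added so the sketch never returns zero partitions).
    static int defaultPartitionCount(int workers, float multiplier) {
        return Math.max((int) (multiplier * workers * workers), 1);
    }

    public static void main(String[] args) {
        float multiplier = 1.0f; // default multiplier according to the reply
        for (int workers : new int[] {1, 4, 10, 30}) {
            System.out.println(workers + " workers -> "
                    + defaultPartitionCount(workers, multiplier) + " partitions");
        }
    }
}
```

The quadratic growth is the practical point: going from 10 to 30 workers raises the default from 100 to 900 partitions.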

Re: concept of vertex in giraph

2014-07-25 Thread Lukas Nalezenec
Hi, AFAIK vertex IDs must be unique, but you can combine vertexes with the same ID into one using VertexValueCombiner. Lukas On 25.7.2014 10:33, Carmen Manzulli wrote: Hi experts, I would like to ask you if, in the graph representation, every time a vertexId is repeated, would giraph consider

Re: clustering coefficient (counting triangles) in giraph.

2014-05-06 Thread Lukas Nalezenec
at 2:30 PM, Lukas Nalezenec lukas.naleze...@firma.seznam.cz wrote: Hi, Check Okapi ML library from Grafos: http://grafos.ml/okapi.html#collaborative-als It needs some tuning but it will work. Regards Lukas On 17.3.2014 20:17

Re: GiraphJob

2014-05-06 Thread Lukas Nalezenec
Hi, You can read the String array of parameters given to Giraph in the main method, as in any other Java application; you can do the same in the run method if your driver implements org.apache.hadoop.util.Tool. IMHO the best way is to use GenericOptionsParser - the parameters will be copied to configuration,

Re: GiraphJob

2014-05-06 Thread Lukas Nalezenec
Update: Using GenericOptionsParser is not the best way. The best way is to implement org.apache.hadoop.util.Tool and use the format -Dname=value for parameters. Regards Lukas On 6.5.2014 13:38, Lukas Nalezenec wrote: Hi, You can read String array of parameters given to Giraph in method main

Re: Difference in Input format and Vertex type

2014-04-19 Thread Lukas Nalezenec
Hi, You have to customize the input format. Regards, Lukas On 19.4.2014 07:36, Akshay Trivedi wrote: Hi, Using giraph, I am taking input in the format TextDoubleDoubleAdjacencyListVertexInputFormat and I want my Vertex to be of type Vertex<Text,Text,DoubleWritable,Text>, i.e. vertex value in

Re: Changing index of a graph

2014-04-15 Thread Lukas Nalezenec
Hi, I did the same thing in two M/R jobs during preprocessing; it was pretty powerful for web graphs but a little bit slow. The solution for Giraph is: 1. Implement your own partition which will iterate vertices in order. Use an appropriate partitioner. 2. During the first iteration you need to rename vertexes in

Re: Can a vertex belong to more than one partition

2014-04-07 Thread Lukas Nalezenec
Hi, No, a vertex can belong to only one partition. Can you describe the algorithm you are solving? How many of those vertexes belonging to all partitions do you have? Why do you need such strict partitioning? Regards Lukas On 6.4.2014 12:38, Akshay Trivedi wrote: In order to custom partition the

Re: Master/Agreggators

2014-04-07 Thread Lukas Nalezenec
Hi, You have a bug in class MyArrayListWritable, in method write. Your code writes LONG but tries to read INT. Regards Lukas On 3.4.2014 18:45, ghufran malik wrote: Sorry, I sent that email by accident before finishing it. I tested the compute method with just: public void
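
The kind of mismatch named in this reply can be illustrated with a minimal sketch in plain java.io. The poster's MyArrayListWritable is not shown in the archive, so the class below is hypothetical; it only demonstrates the rule that every value must be read back with the same width it was written with.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SymmetricSerializationSketch {
    // Writes the element count as an int, then each element as a long (8 bytes).
    static void write(DataOutput out, List<Long> values) throws IOException {
        out.writeInt(values.size());
        for (long value : values) {
            out.writeLong(value);
        }
    }

    // Mirrors write() exactly: readInt for the count, readLong for each element.
    // Calling readInt() for the elements here would consume only 4 of the 8
    // bytes per value and leave the rest of the stream misaligned.
    static List<Long> readFields(DataInput in) throws IOException {
        int size = in.readInt();
        List<Long> values = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            values.add(in.readLong());
        }
        return values;
    }

    // Serializes to a byte array and deserializes again.
    static List<Long> roundTrip(List<Long> values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(new DataOutputStream(bytes), values);
            return readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Arrays.asList(1L, 2L, 3L)));
    }
}
```

A round-trip check like roundTrip is a cheap way to catch such bugs before the class is used inside a job.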

Re: Breadth First Search (BFS)

2014-04-07 Thread Lukas Nalezenec
nishantgandh...@gmail.com wrote: Check this out for BFS.. http://stackoverflow.com/questions/12253794/breadth-first-implentation-in-giraph-graphchi-or-pregel Nishant Gandhi M.Tech CSE IIT Patna On Thu, Apr 3, 2014 at 3:18 PM, Lukas Nalezenec

Re: Giraph job hangs indefinitely and is eventually killed by JobTracker

2014-04-07 Thread Lukas Nalezenec
Hi, Try making and analyzing a memory dump after the exception (JVM param -XX:+HeapDumpOnOutOfMemoryError). What configuration (mainly Partition class) do you use? Lukas On 7.4.2014 11:45, Vikesh Khanna wrote: Hi, Any ideas why Giraph waits indefinitely? I've been stuck on this for a long time

Re: Custom partitioning among workers

2014-04-05 Thread Lukas Nalezenec
partitioner, then using BFS I can label vertices in it and using these labels I can distribute my graph. If not, what to do? Regards, Akshay On Sat, Apr 5, 2014 at 4:07 AM, Lukas Nalezenec lukas.naleze...@firma.seznam.cz wrote: Hi, Make labels be part of vertex id (I know, it's limiting

Re: zookeeper problem in giraph..

2014-04-04 Thread Lukas Nalezenec
Hi, I had a similar issue; it was caused by long GC pauses. I patched NettyClient so that when a reconnect fails it sleeps for some time before the next try. Patch is enclosed. Let me know if it works for you. I would try tuning GC. You can also try to use giraph.waitForRequestsConfirmation and
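
The enclosed patch is not part of this archive. As a rough illustration of the idea only (sleep between failed reconnect attempts instead of retrying immediately), here is a generic sketch in plain Java. It is not the NettyClient change; the names retry and demo, the attempt limit and the sleep time are all assumptions.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryWithSleepSketch {
    // Runs the action until it succeeds or maxAttempts is reached,
    // sleeping for sleepMillis after every failed attempt except the last.
    static <T> T retry(Callable<T> action, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give the peer (for example a JVM in a long GC pause) time to recover.
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }

    // Demo: a fake "connect" that fails twice and succeeds on the third call.
    // Returns the number of calls that were needed.
    static int demo() {
        int[] calls = {0};
        Callable<Integer> flakyConnect = () -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IOException("connection refused");
            }
            return calls[0];
        };
        try {
            Integer result = retry(flakyConnect, 5, 10L);
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected after " + demo() + " attempts");
    }
}
```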

Re: using partition class

2014-04-03 Thread Lukas Nalezenec
PartitionerFactoryClass. Best Regards, Liannet 2014-04-02 17:32 GMT+02:00 Lukas Nalezenec lukas.naleze...@firma.seznam.cz: Hi, 1. It has to work if you set giraph.vertexKeySpaceSize just after or before the other property. But it can interfere

Re: Breadth First Search (BFS)

2014-04-03 Thread Lukas Nalezenec
Hi, It looks like you are using the wrong algorithm. If you are doing simple BFS you should not need to remember vertex ids. Lukas On 2.4.2014 20:30, ghufran malik wrote: Hi I am trying to implement the BFS algorithm using Giraph 1.1.0. I have partly implemented it and am stuck on just
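
The poster's code is not visible in this archive, so the following is only a sketch of the point made in the reply: in a level-by-level BFS each vertex needs to keep just its own depth, not a list of visited vertex ids. It is plain Java that imitates supersteps with a loop; it does not use the Giraph API, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BfsDepthSketch {
    // Returns the BFS depth of every vertex from the source, or -1 if unreachable.
    // The only per-vertex state is one int: the depth.
    static int[] bfsDepths(List<List<Integer>> adjacency, int source) {
        int[] depth = new int[adjacency.size()];
        Arrays.fill(depth, -1);
        depth[source] = 0;

        List<Integer> frontier = new ArrayList<>();
        frontier.add(source);
        int superstep = 0;

        // Each loop iteration plays the role of one superstep: vertices reached
        // in the previous step "message" their neighbours, and a neighbour that
        // has no depth yet adopts the current superstep number as its depth.
        while (!frontier.isEmpty()) {
            superstep++;
            List<Integer> next = new ArrayList<>();
            for (int vertex : frontier) {
                for (int neighbour : adjacency.get(vertex)) {
                    if (depth[neighbour] == -1) {
                        depth[neighbour] = superstep;
                        next.add(neighbour);
                    }
                }
            }
            frontier = next;
        }
        return depth;
    }

    public static void main(String[] args) {
        // Edges 0-1, 0-2, 1-3; vertex 4 is isolated.
        List<List<Integer>> graph = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(0, 3),
                Arrays.asList(0),
                Arrays.asList(1),
                new ArrayList<Integer>());
        System.out.println(Arrays.toString(bfsDepths(graph, 0)));
    }
}
```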

Re: loading graph stuck.

2014-04-03 Thread Lukas Nalezenec
Hi, Try finding the master and check what it is doing in the JobTracker. Lukas On 2.4.2014 23:58, Suijian Zhou wrote: Hi, Why does the giraph program get stuck when loading the input graph (the size of the graph is 500MB, not so big)? No matter how I try different number of workers (from -w 2 to -w 30) or

Re: how to input two graphs using giraph

2014-04-02 Thread Lukas Nalezenec
Hi, You can also store the query graph in the MR distributed cache. Regards Lukas On 2.4.2014 14:15, Lukas Nalezenec wrote: Hi, I would try to load the query graph in a custom MasterComputation and distribute it to computations/workers using a persistent aggregator. Regards Lukas On 2.4.2014 14

Re: using partition class

2014-04-02 Thread Lukas Nalezenec
but it didn't work 2- Which unit is this PARTITION_VERTEX_KEY_SPACE_SIZE? Bytes?? Regards, Liannet 2014-04-01 20:55 GMT+02:00 Lukas Nalezenec lukas.naleze...@firma.seznam.cz: Hi, Partition is for storing vertexes, Partitioner is for distributing

Re: Graph stats in giraph.

2014-04-02 Thread Lukas Nalezenec
Hi, IMHO it's the sum of bytes sent in messages in all iterations. It is possible; it depends on the number of edges in your graph, the size of messages and the number of iterations. Regards, Lukas On 2.4.2014 17:36, Suijian Zhou wrote: Hi, Does anybody know what Aggregate sent message message bytes means

Re: Graph stats in giraph.

2014-04-02 Thread Lukas Nalezenec
Hi, If you want to reduce the number, you can replace hash partitioning with range partitioning. You can also add some cache to your computation and combine messages. Regards, Lukas On 2.4.2014 17:36, Suijian Zhou wrote: Hi, Does anybody know what Aggregate sent message message bytes
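
To make the difference between the two schemes concrete, here is a minimal sketch in plain Java. It is not Giraph's partitioner code; the names, the modulo-based hash rule and the assumption that ids lie in [0, keySpaceSize) are all illustrative. It shows how range partitioning keeps neighbouring ids in the same partition while hashing scatters them.

```java
public class HashVsRangeSketch {
    // Hash partitioning: consecutive ids usually land in different partitions.
    static int hashPartition(long vertexId, int partitions) {
        return Math.floorMod(Long.hashCode(vertexId), partitions);
    }

    // Range partitioning: the key space [0, keySpaceSize) is cut into equal
    // contiguous ranges, so consecutive ids share a partition.
    static int rangePartition(long vertexId, long keySpaceSize, int partitions) {
        long rangeSize = (keySpaceSize + partitions - 1) / partitions;
        return (int) (vertexId / rangeSize);
    }

    public static void main(String[] args) {
        int partitions = 4;
        long keySpaceSize = 100;
        for (long id = 22; id <= 27; id++) {
            System.out.println("id " + id
                    + ": hash -> " + hashPartition(id, partitions)
                    + ", range -> " + rangePartition(id, keySpaceSize, partitions));
        }
    }
}
```

Whether this reduces the bytes sent depends on the ids: it only helps if vertices with nearby ids tend to be connected, so that more messages stay within one partition.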

Re: using partition class

2014-04-01 Thread Lukas Nalezenec
Hi, Partition is for storing vertexes, Partitioner is for distributing vertexes between Partitions. Try this: -Dgiraph.graphPartitionerFactoryClass=org.apache.giraph.partition.SimpleLongRangePartitionerFactory It's a good idea to switch Partition to ByteArrayPartition (or better). Lukas On

Edge/Vertex Balancing

2014-03-31 Thread Lukas Nalezenec
Hi, Is anybody successfully using edge/vertex balancing? I need some answers; you can answer yes/no. Lukas

Re: Building Giraph Jar for Hadoop 2.0.0 Version

2014-03-31 Thread Lukas Nalezenec
Open the pom.xml file from the giraph-parent project, choose the corresponding Maven profile, activate it from the command line and recompile Giraph. Lukas On 31.3.2014 14:29, Agrta Rawat wrote: Hi All, I need to build Giraph 1.0.0 for Hadoop version 2.0.0. How can I build the Giraph jar for a specific version of

Re: help in compiling

2014-03-30 Thread Lukas Nalezenec
On 30.3.2014 12:37, nishant gandhi wrote: I recently shifted my work on giraph. I ran already available example but not able to run my own code. can you suggest something what i should be missing? I normally get error of ClassNotFound while compiling even though i give classpath of

Integration tests

2014-03-25 Thread Lukas Nalezenec
Hi, Are there any integration tests for Giraph? Something for testing Partitions, Edges, Rebalancing etc. It could be a simple job with known correct results. Lukas

Re: clustering coefficient (counting triangles) in giraph.

2014-03-18 Thread Lukas Nalezenec
Hi, Check Okapi ML library from Grafos: http://grafos.ml/okapi.html#collaborative-als It needs some tuning but it will work. Regards Lukas On 17.3.2014 20:17, Suijian Zhou wrote: Hi, Experts, Does anybody know if there are examples of implementation in giraph for clustering coefficient

Re: Help needed in Compilation and execution of a giraph code(Giraph noob)

2014-01-29 Thread Lukas Nalezenec
Go to the ../giraph-core directory and run mvn clean install there. Lukas On 29.1.2014 16:00, Deepankar Patra wrote: Hi Jyoti, Thanks for the help. I tried your steps, while trying mvn compile I get build failure message with these two warnings: [INFO]

Re:

2014-01-09 Thread Lukas Nalezenec
I think your cluster is busy; you need to increase the timeout: -Dgiraph.maxMasterSuperstepWaitMsecs=... On 9.1.2014 11:20, Jyoti Yadav wrote: Hi.. Is anyone familiar with the error mentioned below? ERROR: org.apache.giraph.master.BspServiceMaster: checkWorkers: Did not receive enough processes

Re: Exception Already has missing vertex on this worker

2013-09-26 Thread Lukas Nalezenec
Hi, Do you use partition balancing? Lukas On 09/26/13 05:16, Yingyi Bu wrote: Hi, I got this exception when I ran a Giraph-1.0.0 PageRank job over a 60 machine cluster with 28GB input data: java.lang.IllegalStateException: run: Caught an unrecoverable exception

Re: Out of memory with giraph-release-1.0.0-RC3, used to work on old Giraph

2013-09-04 Thread Lukas Nalezenec
be using it like this if, as described, they have billions of vertices and a trillion edges. So do you, or Avery, have any idea how you might initialize this in a more reasonable way, and how??? On Mon, Sep 2, 2013 at 6:08 AM, Lukas Nalezenec lukas.naleze...@firma.seznam.cz

Serialization Error

2013-07-16 Thread Lukas Nalezenec
Hi, I have a problem I cannot solve. When one node loads the given data and sends vertexes belonging to another node, the other node throws this exception. It looks like it expects the number of edges but gets random bytes. Has anybody solved something similar? Thanks Lukas 2013-07-16