Re: Vertices reading the file..

2014-04-15 Thread Jyoti Yadav
Thanks, Agrta, for your reply. You understood correctly: I want this extra value as the vertex value. But I am giving the input in the form of a JSON array. At input time, the vertices do not have this extra content. I want to initialize this value in superstep 0. And vertices will read this

Re: giraph memory usage info

2014-04-15 Thread Agrta Rawat
Hi Arun, I would suggest you check two things: 1. Does your code run successfully for some small data? 2. Have you modified the hadoop-env.sh and mapred-site.xml files for all 6 clusters? And it is showing Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java

Optimal number of Workers

2014-04-15 Thread chadi jaber
Hello! Can anybody explain how threads are used by a worker in Giraph? For which purposes? How does a worker determine the number of threads to use? I often get the following error: org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: unable to create new native

Changing index of a graph

2014-04-15 Thread Martin Neumann
Hej, I have a huge edge list (several billion edges) where node IDs are URLs. The algorithm I want to run needs the IDs to be longs, and there should be no holes in the ID space (so I can't simply hash the URLs). Is anyone aware of a simple solution that does not require an impractically huge hash

Re: Changing index of a graph

2014-04-15 Thread Claudio Martella
The only solution I know is usually done via a so-called dictionary outside of Giraph (e.g. for semantic web graphs, which also have URIs as IDs), through a datastore like HBase/Cassandra, basically the hashmap you mentioned. While initially computationally expensive, it allows you to scale in the
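The dictionary idea described in this reply can be sketched in a few lines of Java. This is only an illustration under stated assumptions: an in-memory HashMap stands in for the external datastore (HBase/Cassandra) mentioned above, so it would not hold several billion URLs, and the names DenseIdDictionary, idFor, and translateEdge are hypothetical and not part of Giraph or any datastore API. IDs are handed out in first-seen order starting at 0, which leaves no holes in the ID space.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical in-memory dictionary mapping URL vertex IDs to dense long IDs.
 * Assumption: a HashMap stands in for an external store such as HBase/Cassandra.
 */
final class DenseIdDictionary {
    // URL -> assigned ID; every ID in [0, size) is used exactly once.
    private final Map<String, Long> ids = new HashMap<>();

    /** Returns the ID already assigned to the URL, or assigns the next unused one. */
    long idFor(String url) {
        Long existing = ids.get(url);
        if (existing != null) {
            return existing;
        }
        long next = ids.size(); // next free ID, so the ID space stays contiguous
        ids.put(url, next);
        return next;
    }

    /** Translates one edge from URL endpoints to {sourceId, targetId}. */
    long[] translateEdge(String sourceUrl, String targetUrl) {
        // Java evaluates array initializers left to right: source is looked up first.
        return new long[] { idFor(sourceUrl), idFor(targetUrl) };
    }

    /** Number of distinct URLs seen so far. */
    int size() {
        return ids.size();
    }
}
```

Running every edge of the edge list through translateEdge once, before loading the graph, would produce an edge list keyed by long IDs; the map itself is what a real datastore would have to persist and serve at scale.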

Re: Changing index of a graph

2014-04-15 Thread Lukas Nalezenec
Hi, I did the same thing in two M/R jobs during preprocessing - it was pretty powerful for web graphs but a little bit slow. The solution for Giraph is: 1. Implement your own partition which will iterate vertices in order. Use an appropriate partitioner. 2. During the first iteration you need to rename vertices in

Re: Changing index of a graph

2014-04-15 Thread Martin Neumann
I have a pipeline that creates a graph and then does some transformations on it (with Giraph). In the end I want to dump it into Neo4j to allow for Cypher queries. I was told that I could make the batch import for Neo4j a lot faster if I used Long identifiers without holes, and therefore

Re: Can a vertex belong to more than one partition

2014-04-15 Thread Akshay Trivedi
Hi, I am solving graph isomorphism between a large graph and a query graph. The large graph is partitioned, so the query graph should be available to all partitions. Apart from this, some of the large graph's vertices (such as those which have edges between partitions) also have to be duplicated. On