Thanks, Agrta, for your reply. You understood correctly: I want this extra
value as the vertex value. But I am giving the input in the form of a JSON
array, and at input time the vertices do not have this extra content. I
want to initialize this value in superstep 0. And vertices will read this
Hi Arun,
I would suggest you check two things:
1. Does your code run successfully on some small data?
2. Have you modified the hadoop-env.sh and mapred-site.xml files on all 6
clusters?
And it is showing: Caused by: java.util.concurrent.ExecutionException:
java.lang.OutOfMemoryError: Java
Hello! Can anybody explain how threads are used by a worker in Giraph, and
for which purposes? How does a worker determine the number of threads to use?
I often get the following error: org.apache.hadoop.mapred.Child: Error running
child : java.lang.OutOfMemoryError: unable to create new native
Hi,
I have a huge edge list (several billion edges) where the node IDs are URLs.
The algorithm I want to run needs the IDs to be longs, and there should be
no holes in the ID space (so I can't simply hash the URLs).
Is anyone aware of a simple solution that does not require an impractically
huge hash
The only solution I know of is usually implemented via a so-called dictionary
outside of Giraph (e.g. for semantic web graphs, which also have URIs as IDs),
through a datastore like HBase/Cassandra: basically the hashmap you
mentioned.
While initially computationally expensive, it allows you to scale in the
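The dictionary idea above can be sketched in a few lines of Java. This is only an in-memory illustration, assuming the whole URL set fits in one JVM; the class name DenseIdDictionary and its methods are hypothetical, and a real setup for billions of edges would back the same lookup with an external store such as HBase or Cassandra, as the message suggests.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps each distinct URL to a consecutive long ID (0, 1, 2, ...). */
class DenseIdDictionary {
    private final Map<String, Long> ids = new HashMap<>();

    /** Returns the ID already assigned to the URL, or assigns the next unused one. */
    long idFor(String url) {
        Long existing = ids.get(url);
        if (existing != null) {
            return existing;
        }
        // The next free ID equals the number of URLs seen so far, so no holes appear.
        long next = ids.size();
        ids.put(url, next);
        return next;
    }

    /** Number of distinct URLs seen, which is also the size of the ID space. */
    int size() {
        return ids.size();
    }
}
```

Each edge (sourceUrl, targetUrl) would be rewritten as (idFor(sourceUrl), idFor(targetUrl)) before the graph is loaded into Giraph.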
Hi,
I did the same thing in two M/R jobs during preprocessing. It was pretty
powerful for web graphs but a little bit slow.
A solution for Giraph is:
1. Implement your own partition that iterates over vertices in order. Use
an appropriate partitioner.
2. During the first iteration you need to rename vertices in
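The message is cut off here, so the following is not the author's method, only one common way to get hole-free IDs once every partition iterates its vertices in a fixed order: count the vertices per partition, take an exclusive prefix sum to get each partition's starting offset, and give each vertex the ID offset + position. The class and method names are hypothetical and no Giraph API is used.

```java
/** Hypothetical helper: turns per-partition vertex counts into contiguous ID ranges. */
class PartitionOffsets {

    /** Exclusive prefix sum: offsets[p] is the total size of partitions 0..p-1. */
    static long[] startOffsets(long[] partitionSizes) {
        long[] offsets = new long[partitionSizes.length];
        long running = 0;
        for (int p = 0; p < partitionSizes.length; p++) {
            offsets[p] = running;
            running += partitionSizes[p];
        }
        return offsets;
    }

    /** New ID of the vertex at position localIndex (0-based) within the given partition. */
    static long newId(long[] offsets, int partition, long localIndex) {
        return offsets[partition] + localIndex;
    }
}
```

With partition sizes {3, 0, 2}, the offsets are {0, 3, 3}, so the five vertices receive the IDs 0 through 4 with no gaps.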
I have a pipeline that creates a graph then does some transformations on it
(with Giraph).
In the end I want to dump it into Neo4j to allow for Cypher queries.
I was told that I could make the batch import for Neo4j a lot faster if I
used Long identifiers without holes, and therefore
Hi,
I am solving graph isomorphism between a large graph and a query graph.
The large graph is partitioned, so the query graph should be
available to all partitions. Apart from this, some of the large graph's
vertices (such as those which have edges between partitions) also have
to be duplicated.