I looked at the code again, and it does not seem that workerList is sorted, so
knowing a worker number gives no consistent way to identify the actual worker
each time. Lukas was working on such a diff some time back. Perhaps he
can say more.
From: pava...@outlook.com
To: user@giraph.a
I wrote a diff some time ago where you can easily do that.
You can find implementation details at
https://issues.apache.org/jira/browse/GIRAPH-908 &
https://reviews.apache.org/r/22234/
Some options you can use are
-Dgiraph.mappingStoreClass=org.apache.giraph.mapping.LongByteMappingStore
You can also look at https://issues.apache.org/jira/browse/GIRAPH-908, which
solves the case where you have a partition map and would like the graph to be
partitioned that way after loading the input. It does not, however, solve the
"do not shuffle data" part.
From: claudio.marte...@gmail.com
Date: Tue,
scenario.
Thanks,
Charith
On Mon, Sep 29, 2014 at 3:34 PM, Pavan Kumar A wrote:
We have two inputs: vertices and edges. If we partition edges and vertices
based on a map, then when we want to send messages we should be able to know
which partition a vertex is on.
Typically we send message
4 at 8:29 AM, Pavan Kumar A wrote:
I worked on this feature some time back, but I only worked on inputting from
Hive files, not HDFS.
You can use logic outside Giraph to select which partition file to use; this
is possible because you input the number of workers anyway. For instance, in the
scr
If you are using hash partitioning, then as long as the number of workers is
the same, the partitions will remain unchanged, though they might run on a
different worker. However, yes, the graph is always partitioned.
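
To illustrate why hash partitions stay stable, here is a minimal sketch in plain Java. It is not Giraph's actual partitioner code; the class name, the method name and the partition count of 8 are assumptions made for illustration. The point is only that the partition index depends on the vertex id and the partition count, not on which worker happens to host the partition.

```java
public class HashPartitionSketch {
    /**
     * Hypothetical helper: maps a vertex id to a partition index using
     * only the id and the partition count.
     */
    static int partitionFor(long vertexId, int partitionCount) {
        // Math.floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(Long.hashCode(vertexId), partitionCount);
    }

    public static void main(String[] args) {
        // Assumed value; with a fixed count, the mapping below never changes.
        int partitionCount = 8;
        for (long id : new long[] {1L, 42L, -5L}) {
            System.out.println(id + " -> partition " + partitionFor(id, partitionCount));
        }
    }
}
```

Running this twice with the same partition count gives the same assignment for every id. Changing the count changes the assignment.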
Date: Mon, 29 Sep 2014 15:01:37 -0400
Subject: Graph re-partitioning
From: xuhongne...@gmail.com
T
I worked on this feature some time back, but I only worked on inputting from
Hive files, not HDFS.
You can use logic outside Giraph to select which partition file to use; this
is possible because you input the number of workers anyway. For instance, in the
script that you use to launch a giraph job hav
Can you give more context? What are the types of messages, a patch of your
compute method, etc.? You will not receive messages that were not sent, but one
thing that can happen is this: a message can have multiple parameters. Suppose
a message object m has 2 parameters, a and b. Say in m's write(out) you do no
http://www.manning.com/martella/
I am not sure if there is any example in the documentation. Claudio might know
more.
From: khaled.am...@gmail.com
Date: Fri, 19 Sep 2014 02:11:45 -0400
Subject: looking for a User guide
To: user@giraph.apache.org
Hi all,
I was looking for a user guide for Giraph 1.0.0
help!
--
Andrew
On Mon, Sep 8, 2014, at 05:31 PM, Pavan Kumar A wrote:
ByteArrayEdges and any of the other edge stores that use array-based or
map-based storage will all encounter this exception when the size of the array
approaches Integer.MAX_VALUE. Some things to consider for the time being: what
do your edges look like? If they have long ids and null values, you can use LongNullArr
ated'. The 'date created' property belongs to A -> C -> B. Can I represent this
in Giraph? Also, does Giraph have a querying mechanism, so that I can retrieve
triplets which were created before a particular date?
Sujan Perera
On Wednesday, May 21, 2014 3:51 PM, Pavan Kumar A
wrote:
C
Can you please provide more context.
vertex -> edge (edge value can store any properties required of that edge) ->
vertex (vertex value can store any property required for the vertex)
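
As a rough illustration of that vertex -> edge -> vertex model, here is a plain-Java sketch that does not use Giraph's API. All class, field and method names are hypothetical. The 'date created' property is stored as the edge value, and a simple filter stands in for the "created before a date" lookup. Dates are assumed to be ISO-8601 strings, which compare correctly as plain text.

```java
import java.util.ArrayList;
import java.util.List;

public class PropertyGraphSketch {
    /** An out-edge whose value carries the edge's own property. */
    static final class Edge {
        final String targetId;
        final String dateCreated; // edge value, assumed ISO-8601 (yyyy-MM-dd)

        Edge(String targetId, String dateCreated) {
            this.targetId = targetId;
            this.dateCreated = dateCreated;
        }
    }

    /** A vertex owns its id, its value, and its out-edges. */
    static final class Vertex {
        final String id;
        final String value; // vertex value: any property of the vertex
        final List<Edge> outEdges = new ArrayList<>();

        Vertex(String id, String value) {
            this.id = id;
            this.value = value;
        }

        void addEdge(String targetId, String dateCreated) {
            outEdges.add(new Edge(targetId, dateCreated));
        }

        /** Returns targets of out-edges created strictly before the given date. */
        List<String> targetsCreatedBefore(String isoDate) {
            List<String> result = new ArrayList<>();
            for (Edge e : outEdges) {
                if (e.dateCreated.compareTo(isoDate) < 0) {
                    result.add(e.targetId);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Vertex a = new Vertex("A", "example vertex value");
        a.addEdge("B", "2014-01-15");
        a.addEdge("C", "2014-06-01");
        System.out.println(a.targetsCreatedBefore("2014-03-01")); // prints [B]
    }
}
```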
Date: Wed, 21 May 2014 13:50:34 -0700
From: sujanu...@yahoo.com
Subject: n-ary relationship on Giraph
To: user@gi
.org/giraph-core/apidocs/org/apache/giraph/counters/GiraphTimers.html
Thanks,
Ghufran
On Fri, Apr 18, 2014 at 3:25 PM, Pavan Kumar A wrote:
I wrote the Initialize counter :) Please tell me if the name seems confusing.
So, Initialize = the time spent by the job waiting for resources. In a shared
look up the
meanings there.
https://giraph.apache.org/giraph-core/apidocs/org/apache/giraph/counters/GiraphTimers.html
Thanks,
Ghufran
On Fri, Apr 18, 2014 at 3:25 PM, Pavan Kumar A wrote:
I wrote the Initialize counter :) Please tell me if the name seems confusing.
So, Initialize =
itialize (ms)=775
Setup (ms)=105
Shutdown (ms)=12537
Total (ms)=27075
Thanks,
Ghufran
On Thu, Apr 17, 2014 at 9:10 PM, Pavan Kumar A wrote:
Input consists of:
> reading the input (vertices and/or edges as provided) into memory on
  individual workers
> assigning vertices to partitions and partitions to workers
> moving all partitions (i.e., vertices & their out-edges) to a worker (which
  owns the partition)
> doing some bookkeeping of inte
Giraph uses threads for compute, netty server, netty client on workers,
execution pools, input, output, etc. You can see most of these options in
org.apache.giraph.conf.GiraphConstants, for instance:
/** Netty client threads */
IntConfOption NETTY_CLIENT_THREADS = new
IntConfOption("giraph.n
,
Agrta Rawat
On Wed, Apr 16, 2014 at 12:44 PM, Pavan Kumar A wrote:
What do you mean by buffer size? Just as a note, please ensure that the Xmx and
Xms values are properly set for the mapper using mapred.child.java.opts or
mapred.map.child.java.opts.
Also, what does the error message show? Please
It totally depends on the input distribution. One very simple thing that can be
done is:
> Define a VertexResolver that, upon every vertex creation, sets its id =
  domain of the URL and value = "set" of URLs in the domain; it keeps appending
  as more vertices with the same id (i.e., domain) are read from inpu
partitioned and so the query graph should be
> available to all partitions. Apart from this, some of the large graph
> vertices(such as those which have edges between partitions) also have
> to be duplicated.
>
> On Mon, Apr 7, 2014 at 9:53 PM, Pavan Kumar A wrote:
> > If you wa
What do you mean by buffer size? Just as a note, please ensure that the Xmx and
Xms values are properly set for the mapper using mapred.child.java.opts or
mapred.map.child.java.opts.
Also, what does the error message show? Please use pastebin and post the link
here.
Date: Wed, 16 Apr 2014 12:13:29 +0530
Sub
Hi Vikesh,
It seems that you are trying to run benchmarks on Giraph. We had a lot of
improvements in 1.1.0-SNAPSHOT (though it is not released publicly in Maven,
at Facebook we run all our applications on the snapshot version). So, you can
pull the latest trunk from Giraph: git clone
https://git-
If you want the vertex.value to be available to all vertices, then you can
store it in an aggregator. A vertex can belong to exactly one partition. But
please answer Lukas's questions so we can answer more appropriately.
> Date: Mon, 7 Apr 2014 11:23:58 +0200
> From: lukas.naleze...@firma.seznam.
If what you need is
http://en.wikipedia.org/wiki/Clustering_coefficient#Local_clustering_coefficient
then I have implemented it in Giraph and will submit a patch soon.
Date: Mon, 17 Mar 2014 15:33:07 -0400
Subject: Re: clustering coefficient (counting triangles) in giraph.
From: kaushikpatn...@gmail.com
T
Jyoti - I recently did a similar thing. In fact, my approach was exactly what
Maja suggested. However, there is a caveat. You can switch computation class
for workers in mastercompute's compute method but that requires the messages
sent by computation class active before switching and messages r
Hi Pankaj,
Note that in Giraph, the vertex is the first-class citizen, while edges are
just data associated with a vertex. So, when you delete a vertex you delete all
data associated with it, i.e., its outgoing edges, its value, its id, etc.
However, it is not trivial to delete all incoming edges to a
@David
You can have a look at
http://researcher.watson.ibm.com/researcher/files/us-ytian/giraph++.pdf
This work was done by
http://researcher.watson.ibm.com/researcher/view.php?person=us-ytian
In this she talks about alternative partitioning schemes she implemented on top
of Giraph and the showst