Pavan Kumar created GIRAPH-909:
----------------------------------
Summary: support succinct representation of messages in
messagestores
Key: GIRAPH-909
URL: https://issues.apache.org/jira/browse/GIRAPH-909
Project: Giraph
Issue Type: Improvement
Reporter: Pavan Kumar
Assignee: Pavan Kumar
Currently we use the ByteArrayVertexIdMessages data structure to store each
vertex id and its messages. Even when messages arrive as
ByteArrayOneToAllMessages, we always convert them to ByteArrayVertexIdMessages
before storing them in the message store. As a result, if many vertices on a
worker receive the same message, it is stored once per receiving vertex. This
uses up a lot of memory. Message stores that avoid this duplication could
reduce the memory footprint substantially.
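To make the duplication concrete, here is a minimal sketch of the two storage
layouts. It does not use Giraph's classes: OneToAllMessage, storePerVertex and
SharedStore are hypothetical names invented for this illustration, and the
actual patch may be organised quite differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a message addressed to many vertices at once. */
final class OneToAllMessage {
  final byte[] payload;
  final long[] targetIds;

  OneToAllMessage(byte[] payload, long[] targetIds) {
    this.payload = payload;
    this.targetIds = targetIds;
  }
}

final class MessageStoreSketch {
  /** Per-vertex layout: every target vertex gets its own copy of the payload. */
  static Map<Long, List<byte[]>> storePerVertex(List<OneToAllMessage> messages) {
    Map<Long, List<byte[]>> store = new HashMap<>();
    for (OneToAllMessage m : messages) {
      for (long id : m.targetIds) {
        store.computeIfAbsent(id, k -> new ArrayList<>()).add(m.payload.clone());
      }
    }
    return store;
  }

  /** Total payload bytes held by the per-vertex layout. */
  static long payloadBytes(Map<Long, List<byte[]>> store) {
    long total = 0;
    for (List<byte[]> copies : store.values()) {
      for (byte[] copy : copies) {
        total += copy.length;
      }
    }
    return total;
  }

  /** Shared layout: each payload is kept once; vertices hold indexes into it. */
  static final class SharedStore {
    private final List<byte[]> payloads = new ArrayList<>();
    private final Map<Long, List<Integer>> indexesByVertex = new HashMap<>();

    void add(OneToAllMessage m) {
      int index = payloads.size();
      payloads.add(m.payload);
      for (long id : m.targetIds) {
        indexesByVertex.computeIfAbsent(id, k -> new ArrayList<>()).add(index);
      }
    }

    /** Messages for one vertex, resolved through the shared payload list. */
    List<byte[]> messagesFor(long vertexId) {
      List<byte[]> result = new ArrayList<>();
      for (int index : indexesByVertex.getOrDefault(vertexId, new ArrayList<>())) {
        result.add(payloads.get(index));
      }
      return result;
    }

    /** Total payload bytes held; the per-vertex index overhead is not counted. */
    long payloadBytes() {
      long total = 0;
      for (byte[] p : payloads) {
        total += p.length;
      }
      return total;
    }
  }
}
```

With one 100-byte message sent to 1000 vertices on the same worker, the
per-vertex layout holds 100,000 payload bytes while the shared layout holds
100, plus one small index entry per receiving vertex.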
Note: the prerequisite is that the graph is partitioned such that a vertex
sends messages to vertices on only a few other workers (not all of them, as
happens with HashPartitioning).
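The effect of partitioning can be sketched with a small calculation. The code
below is an illustration under stated assumptions, not Giraph code:
hashWorker models hash partitioning as id modulo the number of workers, and
rangeWorker is a hypothetical locality-preserving alternative that assigns
contiguous id ranges to workers. The number of workers a message reaches is
the number of payload copies the cluster must hold even with a deduplicating
store.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongToIntFunction;

final class PartitioningSketch {
  /** Assumed model of hash partitioning: worker = id mod numWorkers. */
  static int hashWorker(long vertexId, int numWorkers) {
    return (int) Math.floorMod(vertexId, (long) numWorkers);
  }

  /** Hypothetical range partitioning: contiguous blocks of ids per worker. */
  static int rangeWorker(long vertexId, long verticesPerWorker) {
    return (int) (vertexId / verticesPerWorker);
  }

  /** Number of distinct workers that host at least one target vertex. */
  static int workersReached(long[] targetIds, LongToIntFunction partitioner) {
    Set<Integer> workers = new HashSet<>();
    for (long id : targetIds) {
      workers.add(partitioner.applyAsInt(id));
    }
    return workers.size();
  }

  /** Target ids first, first + 1, ..., first + count - 1. */
  static long[] consecutiveIds(long first, int count) {
    long[] ids = new long[count];
    for (int i = 0; i < count; i++) {
      ids[i] = first + i;
    }
    return ids;
  }
}
```

For example, with 16 workers and a message sent to the 32 vertices with ids
100 to 131, workersReached returns 16 under hashWorker (about two targets per
worker, so little to deduplicate) and 1 under rangeWorker with 100 vertices
per worker (all 32 targets share a single stored copy).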
This change depends heavily on GIRAPH-907 and GIRAPH-908.
I already have a patch for it. I need to tidy up a few things and will put it
up by the end of this week.