Re: Multiple sendMessage calls vs. sendMessageToMultipleEdges

2014-10-22 Thread Matthew Saltz
Actually, one more question: are there any disadvantages to enabling
giraph.oneToAllMsgSending? Is there any reason not to enable it by default?

Best,
Matthew


Re: Multiple sendMessage calls vs. sendMessageToMultipleEdges

2014-10-22 Thread Matthew Saltz
Lukas,

Thank you so much for the help. By 'the first class', you mean that
SendMessageToAllCache is not used unless I set the property to true, right?
I actually do have giraph.oneToAllMsgSending=true, so if that means it's
using SendMessageToAllCache, then everything makes much more sense. So I
guess it makes sense that case (b) would be much faster than case (a)? I
really appreciate it. And do you have any ideas about the second question I
asked? I think the answer is no, but I'm kind of hoping it's not.

Best,
Matthew





Re: Multiple sendMessage calls vs. sendMessageToMultipleEdges

2014-10-22 Thread Lukas Nalezenec

Hi Matthew,

See class SendMessageToAllCache. It's in the same directory as
SendMessageCache. The first class is not used by Giraph unless you set
the property giraph.oneToAllMsgSending to true.


Lukas
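
To illustrate why the choice of cache could matter, here is a minimal sketch. It rests on an assumption that is not confirmed anywhere in this thread: that a one-to-all cache writes a message once per batch of target ids, while a one-to-one cache writes a copy per target. MessageCacheSketch, bytesOneToOne, bytesOneToAll and the byte layouts are hypothetical stand-ins, not Giraph classes or Giraph's wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; none of these names come from Giraph.
final class MessageCacheSketch {

  /** One-to-one layout: every target id is followed by its own copy of the message. */
  static int bytesOneToOne(int[] targetIds, long[] message) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      for (int targetId : targetIds) {
        out.writeInt(targetId);
        writeMessage(out, message);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.size();
  }

  /** One-to-all layout (assumed): the message is written once, then the list of target ids. */
  static int bytesOneToAll(int[] targetIds, long[] message) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      writeMessage(out, message);
      out.writeInt(targetIds.length);
      for (int targetId : targetIds) {
        out.writeInt(targetId);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.size();
  }

  /** Writes a length-prefixed array of longs as the message payload. */
  private static void writeMessage(DataOutputStream out, long[] message) throws IOException {
    out.writeInt(message.length);
    for (long value : message) {
      out.writeLong(value);
    }
  }

  public static void main(String[] args) {
    int[] targets = new int[10];    // ten target vertices
    long[] message = new long[100]; // a fairly large message, e.g. an adjacency list
    System.out.println("one-to-one bytes: " + bytesOneToOne(targets, message));
    System.out.println("one-to-all bytes: " + bytesOneToAll(targets, message));
  }
}
```

Under that assumption the gap grows with both the message size and the number of targets, so it would matter most for large messages sent to many neighbors.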


Multiple sendMessage calls vs. sendMessageToMultipleEdges

2014-10-22 Thread Matthew Saltz
Hi everyone,

I have two questions:

*Question 1)* I'm using release 1.1.0 and I'm really confused about the
fact that I'm having massive performance differences in the following
scenario. I need to send one message from each vertex to a subset of its
neighbors (all that satisfy a certain condition). For that, I see two basic
options:

   a) Loop over all edges, calling sendMessage(target, message) whenever
      the target satisfies my condition, and reusing the same IntWritable
      for the target vertex by calling target.set(_).
   b) Loop over all edges, building up an ArrayList (or whatever) of the
      targets that satisfy the condition, and calling
      sendMessageToMultipleEdges(targets.iterator(), message) at the end.
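
The two options can be sketched roughly as follows. This is only an illustration: the Sender interface is a hypothetical stand-in that borrows the method names sendMessage and sendMessageToMultipleEdges, with the target id assumed to be the first argument, and it uses plain ints and longs instead of Writables so that it runs without Giraph on the classpath.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.IntPredicate;

final class SendOptionsSketch {

  /** Hypothetical stand-in for the two sending calls; not Giraph's real interface. */
  interface Sender {
    void sendMessage(int targetId, long message);
    void sendMessageToMultipleEdges(Iterator<Integer> targetIds, long message);
  }

  /** Option (a): one send call per neighbor that passes the filter. */
  static void sendOneByOne(int[] neighborIds, IntPredicate wanted, long message, Sender sender) {
    for (int targetId : neighborIds) {
      if (wanted.test(targetId)) {
        sender.sendMessage(targetId, message);
      }
    }
  }

  /** Option (b): collect the matching neighbors, then make one batched call. */
  static void sendBatched(int[] neighborIds, IntPredicate wanted, long message, Sender sender) {
    List<Integer> targets = new ArrayList<>();
    for (int targetId : neighborIds) {
      if (wanted.test(targetId)) {
        targets.add(targetId);
      }
    }
    sender.sendMessageToMultipleEdges(targets.iterator(), message);
  }

  /** Records how many calls of each kind were made and how many targets were reached. */
  static final class CountingSender implements Sender {
    int singleCalls;
    int batchCalls;
    int targetsReached;

    @Override
    public void sendMessage(int targetId, long message) {
      singleCalls++;
      targetsReached++;
    }

    @Override
    public void sendMessageToMultipleEdges(Iterator<Integer> targetIds, long message) {
      batchCalls++;
      while (targetIds.hasNext()) {
        targetIds.next();
        targetsReached++;
      }
    }
  }

  public static void main(String[] args) {
    int[] neighbors = {1, 2, 3, 4, 5, 6};
    IntPredicate isEven = id -> id % 2 == 0;

    CountingSender a = new CountingSender();
    sendOneByOne(neighbors, isEven, 42L, a);
    CountingSender b = new CountingSender();
    sendBatched(neighbors, isEven, 42L, b);

    System.out.println("(a) single calls: " + a.singleCalls + ", targets: " + a.targetsReached);
    System.out.println("(b) batch calls: " + b.batchCalls + ", targets: " + b.targetsReached);
  }
}
```

Both options reach the same targets; they differ only in how many calls are made and in what the receiving side can do with a whole batch at once.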

Surprisingly, I get much, much worse performance using option (a), which I
would have expected to be much faster. So I looked in the code and
eventually found my way to SendMessageCache, where it turns out that
sendMessageToMultipleEdges -> sendMessageToAllRequest(Iterator, Message)
actually just loops over the iterator, repeatedly calling
sendMessageRequest (which is what I thought I was doing in scenario (a)).
I might have traced the code incorrectly, though. Can anyone tell me what
might be going on? I'm really puzzled by this.

*Question 2)* Is there a good way of sending a vertex's adjacency list to
its neighbors without building up your own copy of the adjacency list and
then sending that? I'm going through the Edge iterable and building an
ArrayPrimitiveWritable of ids, but it would be nice if I could somehow
access the underlying data structure behind the iterable, or just wrap the
iterable as a writable.
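
For reference, the copy described above might look roughly like this. It is a sketch that uses an Iterable<Integer> as a hypothetical stand-in for the edge iterable; NeighborIdsSketch and collectTargetIds are made-up names. It does not avoid the copy. It only sizes the primitive array up front so that the resulting int[] could then be handed to a wrapper such as ArrayPrimitiveWritable.

```java
import java.util.Arrays;

final class NeighborIdsSketch {

  /** Copies the ids from an iterable into a primitive array, sized up front when possible. */
  static int[] collectTargetIds(Iterable<Integer> targetIds, int expectedCount) {
    int[] ids = new int[Math.max(expectedCount, 0)];
    int count = 0;
    for (int id : targetIds) {
      if (count == ids.length) {
        // The size hint was too small, so grow the array.
        ids = Arrays.copyOf(ids, Math.max(1, ids.length * 2));
      }
      ids[count++] = id;
    }
    // Trim the array if the size hint was too large or the array was over-grown.
    return count == ids.length ? ids : Arrays.copyOf(ids, count);
  }

  public static void main(String[] args) {
    int[] ids = collectTargetIds(Arrays.asList(7, 3, 9), 3);
    System.out.println(Arrays.toString(ids));
  }
}
```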

Thanks so much for the help,
Matthew Saltz