[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698423#comment-14698423
 ] 

Gabor Gevay commented on FLINK-2527:
------------------------------------

I'm also leaning towards (1) now. I actually implemented (2) in the 
meantime, but then I realized that it sets a subtle trap for the user, one 
that I immediately fell into :)

In my user function, I loop over the msgs, and for each msg I decide whether 
to set a new vertex value. This loop might set a new value multiple times, 
and only the last one should be retained.

At first I liked (2) better: with (1), I essentially have to implement (2) 
inside my user function anyway. This situation is probably common, so why 
have everyone reimplement (2) in their user functions if we can do it in 
Gelly? The trap, however, is that my code inside the loop implicitly assumed 
that setNewVertexValue also updates the vertex variable (the first parameter 
of the UDF), which it does not. We could of course make setNewVertexValue 
perform this update, but that is getting complicated. So it is probably best 
to go with (1) and keep the API nice and simple.
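
To make the trap concrete, here is a minimal, self-contained sketch in plain 
Java. It does not use the real Gelly classes: Vertex, UpdateFunction, NaiveMin 
and LocalMin are hypothetical stand-ins, and the "keep the smallest message" 
logic is only an assumed example, not my actual user function. UpdateFunction 
mimics (2) by recording the value and emitting the last one after updateVertex 
returns.

```java
import java.util.Arrays;
import java.util.List;

public class DeferredUpdateSketch {

    /** Hypothetical stand-in for a Gelly vertex; not the real Flink class. */
    static final class Vertex {
        final long id;
        final long value;
        Vertex(long id, long value) { this.id = id; this.value = value; }
    }

    /** Hypothetical stand-in for the update UDF base class under option (2). */
    static abstract class UpdateFunction {
        private Long outVal; // last value recorded by setNewVertexValue, if any

        abstract void updateVertex(Vertex vertex, List<Long> msgs);

        // Option (2): only record the value. The `vertex` argument is NOT updated.
        void setNewVertexValue(long newValue) { outVal = newValue; }

        // Runs the UDF, then returns the value that would be emitted afterwards.
        long run(Vertex vertex, List<Long> msgs) {
            outVal = null;
            updateVertex(vertex, msgs);
            return outVal != null ? outVal : vertex.value;
        }
    }

    /** The trap: every message is compared against the stale vertex.value. */
    static final class NaiveMin extends UpdateFunction {
        void updateVertex(Vertex vertex, List<Long> msgs) {
            for (long msg : msgs) {
                if (msg < vertex.value) {
                    setNewVertexValue(msg); // may be called several times
                }
            }
        }
    }

    /** Tracks the running value locally and calls setNewVertexValue once. */
    static final class LocalMin extends UpdateFunction {
        void updateVertex(Vertex vertex, List<Long> msgs) {
            long best = vertex.value;
            for (long msg : msgs) {
                if (msg < best) {
                    best = msg;
                }
            }
            if (best < vertex.value) {
                setNewVertexValue(best);
            }
        }
    }

    public static void main(String[] args) {
        Vertex v = new Vertex(1L, 10L);
        List<Long> msgs = Arrays.asList(3L, 7L);
        System.out.println(new NaiveMin().run(v, msgs)); // prints 7
        System.out.println(new LocalMin().run(v, msgs)); // prints 3
    }
}
```

NaiveMin ends up with 7 rather than 3, because the second comparison still 
sees the original value 10. LocalMin is what a user function has to look like 
under (1) in any case, which is why (2) buys less than it first seems.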

> If a VertexUpdateFunction calls setNewVertexValue more than once, the 
> MessagingFunction will only see the first value set
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-2527
>                 URL: https://issues.apache.org/jira/browse/FLINK-2527
>             Project: Flink
>          Issue Type: Bug
>          Components: Gelly
>            Reporter: Gabor Gevay
>            Assignee: Gabor Gevay
>             Fix For: 0.10, 0.9.1
>
>
> The problem is that if setNewVertexValue is called more than once, it sends 
> each new value to the out Collector, and these all end up in the workset, but 
> then the coGroups in the two descendants of MessagingUdfWithEdgeValues use 
> only the first value in the state Iterable. I see three ways to resolve this:
> 1. Add it to the documentation that setNewVertexValue should only be called 
> once, and optionally add a check for this.
> 2. In setNewVertexValue, do not send the newValue to the out Collector 
> immediately; only record it in outVal, and send the last recorded value 
> after updateVertex returns.
> 3. Iterate over the entire Iterable in MessagingUdfWithEVsSimpleVV.coGroup 
> and MessagingUdfWithEVsVVWithDegrees.coGroup. (This would probably still need 
> some documentation addition.)
> I like 2. the best. What are your opinions?
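
The reported behavior can be illustrated with a small plain-Java sketch. 
Nothing here is Flink code: emitAll, readFirst and readLast are hypothetical 
helpers, a List stands in for the workset, and the sketch assumes that the 
values arrive in the order in which they were set, which the real runtime may 
not guarantee.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class FirstValueWinsSketch {

    // Hypothetical stand-in for the out Collector: each call is emitted at once,
    // so several setNewVertexValue calls produce several workset records.
    static List<Long> emitAll(long... newValues) {
        List<Long> workset = new ArrayList<>();
        for (long v : newValues) {
            workset.add(v);
        }
        return workset;
    }

    // Mirrors the reported problem: only the first element of the state is read.
    static long readFirst(Iterable<Long> state) {
        return state.iterator().next();
    }

    // Option 3: iterate over the entire Iterable and keep the last element.
    // Assumes a non-empty state and preserved emission order.
    static long readLast(Iterable<Long> state) {
        Iterator<Long> it = state.iterator();
        long last = it.next();
        while (it.hasNext()) {
            last = it.next();
        }
        return last;
    }

    public static void main(String[] args) {
        List<Long> state = emitAll(5L, 2L, 9L); // three calls in one updateVertex
        System.out.println(readFirst(state)); // prints 5
        System.out.println(readLast(state));  // prints 9
    }
}
```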



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
