[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-15 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698305#comment-14698305
 ] 

Stephan Ewen commented on FLINK-2527:
-

My first intuition is to allow {{setNewVertexValue}} to be called only once. After 
all, in this vertex-centric model, a vertex has only one value, and the solution 
set should not keep duplicates but take only the latest value.



> If a VertexUpdateFunction calls setNewVertexValue more than once, the 
> MessagingFunction will only see the first value set
> -
>
> Key: FLINK-2527
> URL: https://issues.apache.org/jira/browse/FLINK-2527
> Project: Flink
> Issue Type: Bug
> Components: Gelly
> Reporter: Gabor Gevay
> Assignee: Gabor Gevay
> Fix For: 0.10, 0.9.1
>
>
> The problem is that if setNewVertexValue is called more than once, it sends 
> each new value to the out Collector, and all of them end up in the workset. 
> The coGroups in the two descendants of MessagingUdfWithEdgeValues then use 
> only the first value in the state Iterable. I see three ways to resolve this:
> 1. Document that setNewVertexValue should only be called once, and 
> optionally add a check for this.
> 2. In setNewVertexValue, do not send newValue to the out Collector 
> immediately; only record it in outVal, and send the last recorded value after 
> updateVertex returns.
> 3. Iterate over the entire Iterable in MessagingUdfWithEVsSimpleVV.coGroup 
> and MessagingUdfWithEVsVVWithDegrees.coGroup. (This would probably still need 
> some additional documentation.)
> I like option 2 best. What are your opinions?
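
Option 2 from the description above could be sketched roughly as follows. This is a simplified model, not Gelly's actual code: SketchUpdateFunction, updateVertexFromIteration, MaxUpdate and DeferredEmitDemo are hypothetical names, and a plain Consumer stands in for the out Collector. The setter only records the value, and the framework side emits at most one value after updateVertex returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the update function; not Gelly's actual class.
abstract class SketchUpdateFunction<V, M> {
    private V outVal;            // last value recorded during the current call
    private boolean hasNewValue; // whether the setter was called at all

    // User logic: may call setNewVertexValue any number of times.
    abstract void updateVertex(V currentValue, Iterable<M> messages);

    // Option 2: only record the value, do not emit it yet.
    void setNewVertexValue(V newValue) {
        outVal = newValue;
        hasNewValue = true;
    }

    // Framework side (assumed name): run the user logic, then emit at most one value.
    void updateVertexFromIteration(V currentValue, Iterable<M> messages, Consumer<V> out) {
        outVal = null;
        hasNewValue = false;
        updateVertex(currentValue, messages);
        if (hasNewValue) {
            out.accept(outVal);
        }
    }
}

// Example user function: keeps the maximum of the current value and all messages.
class MaxUpdate extends SketchUpdateFunction<Integer, Integer> {
    @Override
    void updateVertex(Integer currentValue, Iterable<Integer> messages) {
        int best = currentValue;
        for (int msg : messages) {
            if (msg > best) {
                best = msg;
                setNewVertexValue(best); // may run several times per call
            }
        }
    }
}

class DeferredEmitDemo {
    // Returns everything that reached the (stand-in) collector.
    static List<Integer> run(int current, List<Integer> msgs) {
        List<Integer> emitted = new ArrayList<>();
        new MaxUpdate().updateVertexFromIteration(current, msgs, emitted::add);
        return emitted;
    }

    public static void main(String[] args) {
        // The setter runs twice (for 3 and for 5), but only the last value is emitted.
        System.out.println(run(1, Arrays.asList(3, 2, 5))); // prints [5]
    }
}
```

With this shape, the state Iterable seen downstream would hold a single value per vertex, so reading only its first element would no longer lose updates.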



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-15 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698399#comment-14698399
 ] 

Vasia Kalavri commented on FLINK-2527:
--

I think (3) would break the model semantics. I'm leaning towards (1). [~ggevay] 
do you have a case in mind that (2) would let you implement but (1) wouldn't?



[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-15 Thread Gabor Gevay (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698423#comment-14698423
 ] 

Gabor Gevay commented on FLINK-2527:


I'm also leaning towards (1) now. I actually implemented (2) in the 
meantime, but then realized that it sets a subtle trap for the user, one that I 
immediately fell into :)

In my user function, I loop over the msgs, and for each msg I decide whether to 
set a new vertex value. This loop might set a new value multiple 
times, and the last one should be retained.

At first, I liked (2) better, because with (1) I essentially have 
to implement (2) inside my user function anyway. I thought that this situation 
is probably common, so why have everyone reimplement (2) inside their 
user functions if we can do it in Gelly? However, the trap is that my code 
inside the loop implicitly assumed that setNewVertexValue updates 
the vertex variable (the first parameter of the UDF), but it does not. Of 
course, we could make setNewVertexValue do this update, but this is getting 
complicated. So it is probably best to go with (1), to keep the API nice 
and simple.
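
The trap described above might look like the following minimal sketch. It is not the actual user function or Gelly's API: TrapSketch, buggyMax and fixedMax are hypothetical names, and the vertex is reduced to a plain int value. The buggy variant assumes the setter updates the value it compares against; the fixed variant tracks the candidate in a local variable and calls the setter once after the loop, which is what option (1) asks of user code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an update function; names are illustrative only.
class TrapSketch {
    private Integer recorded; // last value passed to setNewVertexValue, or null

    // Records the new value but does NOT change the vertex value that the
    // user function was handed as a parameter.
    void setNewVertexValue(int newValue) {
        recorded = newValue;
    }

    // Buggy: every message is compared with the original vertexValue, because
    // the setter never updates it. A later, smaller message can overwrite a
    // larger value that was set earlier.
    Integer buggyMax(int vertexValue, List<Integer> msgs) {
        recorded = null;
        for (int msg : msgs) {
            if (msg > vertexValue) {
                setNewVertexValue(msg);
            }
        }
        return recorded;
    }

    // Fixed: track the candidate locally and call the setter at most once.
    Integer fixedMax(int vertexValue, List<Integer> msgs) {
        recorded = null;
        int best = vertexValue;
        for (int msg : msgs) {
            if (msg > best) {
                best = msg;
            }
        }
        if (best > vertexValue) {
            setNewVertexValue(best);
        }
        return recorded;
    }

    public static void main(String[] args) {
        List<Integer> msgs = Arrays.asList(5, 3);
        System.out.println(new TrapSketch().buggyMax(1, msgs)); // prints 3
        System.out.println(new TrapSketch().fixedMax(1, msgs)); // prints 5
    }
}
```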



[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698670#comment-14698670
 ] 

ASF GitHub Bot commented on FLINK-2527:
---

GitHub user ggevay opened a pull request:

https://github.com/apache/flink/pull/1027

[FLINK-2527] [gelly] Ensure that VertexUpdateFunction.setNewVertexValue is 
called at most once

I implemented (1), with a check to enforce that it is called at most 
once. Unfortunately, I had to add an exception specification to a number of 
methods, and users might also have to do this in their existing code. If 
you think this is not worth it, then I can remove the check.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ggevay/flink setNewVertexValueFix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1027


commit 9b514633d868b28d628785a0d099115134599cee
Author: Gabor Gevay 
Date:   2015-08-16T13:26:59Z

[FLINK-2527] [gelly] Ensure that VertexUpdateFunction.setNewVertexValue is 
called at most once per updateVertex






[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698738#comment-14698738
 ] 

ASF GitHub Bot commented on FLINK-2527:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1027#issuecomment-131576936
  
I think you can make this non-API-breaking by simply throwing an 
`IllegalStateException`, which is a `RuntimeException` and therefore does not 
need to be declared in any method signature.
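
A minimal sketch of such a check, assuming a simplified update function rather than the code in the pull request (GuardedUpdateFunction, run, IncrementOnce and SetTwice are hypothetical names). Because `IllegalStateException` is unchecked, neither the setter nor the user's `updateVertex` needs a `throws` clause.

```java
// Hypothetical stand-in for the update function; not the code from the pull request.
abstract class GuardedUpdateFunction<V> {
    private boolean setNewVertexValueCalled;
    private V newValue;

    // User logic; note that no throws clause is required here.
    abstract void updateVertex(V currentValue);

    void setNewVertexValue(V value) {
        if (setNewVertexValueCalled) {
            // Unchecked exception: callers' signatures stay unchanged.
            throw new IllegalStateException(
                "setNewVertexValue may be called at most once per updateVertex");
        }
        setNewVertexValueCalled = true;
        newValue = value;
    }

    // Framework side (assumed name): reset the flag before each invocation.
    V run(V currentValue) {
        setNewVertexValueCalled = false;
        newValue = null;
        updateVertex(currentValue);
        return newValue;
    }
}

// Well-behaved user function: sets the value once.
class IncrementOnce extends GuardedUpdateFunction<Integer> {
    @Override
    void updateVertex(Integer currentValue) {
        setNewVertexValue(currentValue + 1);
    }
}

// Misbehaving user function: the second call trips the check.
class SetTwice extends GuardedUpdateFunction<Integer> {
    @Override
    void updateVertex(Integer currentValue) {
        setNewVertexValue(currentValue + 1);
        setNewVertexValue(currentValue + 2);
    }
}

class GuardDemo {
    public static void main(String[] args) {
        System.out.println(new IncrementOnce().run(1)); // prints 2
        try {
            new SetTwice().run(1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Resetting the flag in `run` keeps the limit per `updateVertex` invocation rather than per function instance.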




[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698834#comment-14698834
 ] 

ASF GitHub Bot commented on FLINK-2527:
---

Github user ggevay commented on the pull request:

https://github.com/apache/flink/pull/1027#issuecomment-131612652
  
Done.




[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699283#comment-14699283
 ] 

ASF GitHub Bot commented on FLINK-2527:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1027#issuecomment-131753704
  
Looks good, will merge this...




[jira] [Commented] (FLINK-2527) If a VertexUpdateFunction calls setNewVertexValue more than once, the MessagingFunction will only see the first value set

2015-08-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699441#comment-14699441
 ] 

ASF GitHub Bot commented on FLINK-2527:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1027

