[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2016-01-26 Thread markus-h
Github user markus-h closed the pull request at:

https://github.com/apache/flink/pull/598


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2016-01-16 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-172196328
  
Hey @markus-h,
is there any news regarding this PR? If not, would you mind closing it?
Thanks!




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-08-09 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-129223234
  
Hi @markus-h,

I'm so sorry it took me so long to look into this. I agree with Stephan's 
comment, and it would also be great if we could add this option to 
gather-sum-apply.
Would you like to try to rebase?




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-05-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-101321028
  
I had a look at this, and it actually looks quite good. The basic idea 
seems to be that you emit the original vertex if no update happens.

It would be nice to not have the `isLastCollected` flag in the user-facing 
classes. If you could have a dedicated vertex-centric bulk coGroup, with its 
own output collector, you can track this in the OutputCollector. I think that 
would be cleaner with respect to the user-facing API.

Otherwise, I think this is a good addition...
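
A minimal sketch of the collector idea above, in plain Java with no Flink dependency. The names `TrackingCollector`, `BulkCoGroupSketch`, and `runVertex` are hypothetical and are not part of Gelly or of this pull request; the point is only that the "was anything emitted?" state can live in the collector rather than in a user-facing class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical collector: remembers whether the user function emitted anything.
final class TrackingCollector<T> {
    private final List<T> out;
    private boolean collected = false;

    TrackingCollector(List<T> out) {
        this.out = out;
    }

    void collect(T record) {
        collected = true;
        out.add(record);
    }

    boolean hasCollected() {
        return collected;
    }
}

// Hypothetical stand-in for a dedicated bulk coGroup wrapper.
final class BulkCoGroupSketch {
    // Runs the user's update function for one vertex value. If the function
    // emits nothing, the original value is emitted so the vertex survives
    // into the next superstep.
    static List<Long> runVertex(long original,
                                BiConsumer<Long, TrackingCollector<Long>> updateFn) {
        List<Long> out = new ArrayList<>();
        TrackingCollector<Long> collector = new TrackingCollector<>(out);
        updateFn.accept(original, collector);
        if (!collector.hasCollected()) {
            out.add(original);
        }
        return out;
    }
}
```

With this shape, the user function only ever sees `collect`, and the fallback to the original vertex stays an internal detail of the wrapper.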




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-04-17 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-93985063
  
Interesting idea. Are there use cases that require that, or is that 
basically to allow for an easy comparison of the bulk vs delta performance?




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-04-17 Thread markus-h
Github user markus-h commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-93987347
  
There is no specific use case, but when you try to process big graphs 
locally, you often run out of memory with delta iterations.
But the reason I needed this change is a different one. I am doing research 
on failure recovery methods in graph analysis. Most Pregel-like systems just do 
full checkpointing of all vertices. This was much easier to implement with a 
bulk iteration than with delta iterations in Flink, so I decided to just provide 
Gelly with this mode.




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-04-17 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-94002207
  
Hi @markus-h!
I see the point in having a bulk iteration in Gelly; however, I'm not sure I 
would add it as a mode in vertex-centric iteration. VertexCentricIteration 
implements the Pregel model, and it might be confusing to change its semantics 
like this (or maybe not). 
Also, I am not sure whether the messaging-vertexUpdate abstraction is what 
you would like to have in a bulk graph iteration.
It might be better to add a bulk graph iteration, where the gathering of 
neighborhoods is abstracted and the user only provides the step function, i.e. 
something like the neighborhood methods, but iterative.
What do you think?
In any case, I think that a use-case / example would really help motivate 
adding this :-)




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-04-17 Thread markus-h
Github user markus-h commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-94008273
  
Hi @vasia,
thanks for your comments! I thought about this extension in a different 
way. Whenever you have a graph that is too big to process with a delta 
iteration, you could just turn on bulk mode to get the computation done. It will 
be a lot slower, but sometimes this might be better than not getting any 
results.
I don't think a dedicated bulk operator would be very useful. People can 
just use plain Flink if they don't need the Pregel abstraction. And in most 
cases it would be much slower than using the current solution.

You know Gelly and its use cases a lot better than I do. If you don't think 
that a mode like this would be useful, I am totally fine with that. It is a 
very small change anyway.




[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-04-17 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/598#issuecomment-94013714
  
Aha I see! I totally misunderstood your intention :-)
I'll take a look as soon as I finish a few more reviews. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1885] [gelly] Added bulk execution mode...

2015-04-14 Thread markus-h
GitHub user markus-h opened a pull request:

https://github.com/apache/flink/pull/598

[FLINK-1885] [gelly] Added bulk execution mode to gellys vertex centric 
iterations

See https://issues.apache.org/jira/browse/FLINK-1885

I essentially exchanged the delta iteration with a bulk iteration and made 
the coGroup of the VertexUpdateUdf kind of an outer join so that the vertices 
that are not changed in one superstep are kept around in the next one.
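
The outer-join behaviour described above can be sketched in plain Java without any Flink dependency. This is an illustration under stated assumptions, not the code in the patch: `BulkSuperstepSketch` and `superstep` are hypothetical names, vertices are modelled as a map from ID to value, and the update rule (take the minimum of the current value and all incoming messages) is chosen only as an example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of one bulk superstep; not Gelly code.
final class BulkSuperstepSketch {

    // Joins every vertex with its (possibly missing) messages. A vertex with
    // no messages is emitted unchanged, which is the "outer" side of the join;
    // in a delta iteration such a vertex would simply not be re-emitted.
    static Map<Long, Long> superstep(Map<Long, Long> vertices,
                                     Map<Long, List<Long>> messages) {
        Map<Long, Long> next = new TreeMap<>();
        for (Map.Entry<Long, Long> vertex : vertices.entrySet()) {
            long value = vertex.getValue();
            List<Long> inbox = messages.get(vertex.getKey());
            if (inbox != null) {
                // Example update rule (an assumption): keep the smallest value seen.
                for (long message : inbox) {
                    value = Math.min(value, message);
                }
            }
            // Emitted in both cases, so the full vertex set reaches the next superstep.
            next.put(vertex.getKey(), value);
        }
        return next;
    }
}
```

Because every superstep returns the complete vertex set, the result can be fed straight into the next round, which is what makes a bulk iteration usable in place of the delta iteration's solution set.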

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markus-h/incubator-flink gellyBulkMode

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/598.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #598


commit e3641c88ea260dbb533015adfb6ef44272a2e615
Author: Markus Holzemer markus.holze...@gmx.de
Date:   2015-04-13T15:55:03Z

Added bulk execution mode to gellys vertex centric iterations



