[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Hi @vasia , thank you for merging my PR. Thank you for the reminder about the documentation. I've created the JIRA for it: https://issues.apache.org/jira/browse/FLINK-5311 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2564 Thanks for the reminder @vasia. The separate JIRA sub-task does allow for a discussion of how best to document the full set of proposed bipartite functionality. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user vasia commented on the issue: https://github.com/apache/flink/pull/2564 Thank you both for your work @mushketyk and @greghogan! Please, keep in mind that we should always add documentation for every new feature; especially a big one such as supporting a new graph type. We've added the checklist template for each new PR so that we don't forget about it :) Can you please open a JIRA to track that docs for bipartite graphs are missing? Thank you! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Hi @greghogan , I've fixed the PR according to your review. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 @vasia , @greghogan I've created a new package, moved new classes there and update PR according to your latest comments. Best regards, Ivan. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2564 Yes, you are right, `bipartite`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user vasia commented on the issue: https://github.com/apache/flink/pull/2564 I would go for `org.apache.flink.graph.bipartite`. I think that `bidirectional` simply suggests that each edge exists in both directions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2564 @mushketyk @vasia, thoughts on package naming? Should we create a new `org.apache.flink.bigraph` package? Another option would be `org.apache.flink.graph.bidirectional` which would suggest future package names like `org.apache.flink.graph.multi` and `org.apache.flink.graph.temporal`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Hi @greghogan, I've updated the PR according to your review and rebased it on top of the `master` branch. The only thing that I didn't change is the message in the `assertEquals` you pointed to since it is not very helpful to receive an error message like: "Wrong number of elements result. Expected 4, actual 3." I think it is much more helpful for the debugging purposes to see contents of the arrays to figure out why their lengths are different. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Hi @greghogan , thank you for your review. I'll try to fix them in the next couple of days. Best regards, Ivan. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 @vasia I don't think anybody is shepherding this PR :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user vasia commented on the issue: https://github.com/apache/flink/pull/2564 Thank @mushketyk. @greghogan are you shepherding this PR or shall I? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 @greghogan Thank you for the suggestion. The build is passing now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2564 Try switching to `ExecutionEnvironment.createCollectionsEnvironment()`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 New **gelly** tests failed with errors like: > Caused by: java.io.IOException: Insufficient number of network buffers: required 32, but only 3 available. The total number of network buffers is currently set to 2048. You can increase this number by setting the configuration key 'taskmanager.network.numberOfBuffers'. Do you know what is causing this error? Should I update the code somehow? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 @vasia @greghogan I've updated the PR. Could you please give it another look? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user vasia commented on the issue: https://github.com/apache/flink/pull/2564 Thanks for the update @mushketyk and for the review @greghogan. I agree with your suggestions. For the type parameters I would go for ``. Let me know if there's any other issue you'd like my opinion on. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Hi @greghogan. Thank you for clarification. I'll update code accordingly. Do you have any other comments regarding the PR? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2564 The advantage to joining on vertex values before the grouped cross is that the number of projected vertices is quadratic in the vertex degree. The projected graphs will usually be much larger than the bipartite graph. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 @greghogan @vasia I've update the code according to your suggestion. The only thing that I did differently: I return Tuple4 from a more complete version of a bottom/top projections it contains vertex key, vertex value and values of two vertices. I assumed that to get values of two other vertices I would need to perform two other joins which will make the method much slower, while a user can do with the result of the method if needed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Makes sense to me. I'll implement this during weekend. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2564 Agreed, I would amend my earlier suggestion to say we only need to start with two projection methods (for each of top and bottom), something like `public Graph> topProjectionSimple() {` and `public Graph> topProjection() {` `TopProjection` (we can find better names) would be a `Tuple6` with POJO accessors as with `Vertex`, `Edge`, etc. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user vasia commented on the issue: https://github.com/apache/flink/pull/2564 What I meant is simply creating an edge with a Tuple2 label containing the labels of the two edges in the bipartite graph. Makes sense? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Tuple8 does not seem to friendly to me either. What do you mean by "attaching the labels"? Is it something similar to what we do with Edge/Vertex classes right, inheriting Tuple and providing getters and setters to access values in it? Or is there some other way to attach labels to tuples? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user vasia commented on the issue: https://github.com/apache/flink/pull/2564 Providing a flattened tuple is certainly better than having the user implement the reduce, but I think we should provide separate methods for the default and custom operations. A projection is a very well-defined operation: create a graph where there is an edge between any pair of vertices with a common neighbor in the bipartite graph. If the user wants to apply mappers or other transformations on the vertices and edges, they can do so afterwards, using the graph methods. The problem is that with a projections, some information is lost, e.g. the edge values. For these cases, we can provide a custom projection method where we give the labels in a flattened tuple as @greghogan suggested, but I'm afraid the API will look ugly with a Tuple8 there. Another, maybe more friendly solution, would be attaching the labels on the projection graph edges. What do you think? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2564 Hi @greghogan, I like your ideas about providing different API for projections. This should be better than my approach. @vasia What do you think about this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---