[jira] [Commented] (FLINK-2310) Add an Adamic-Adar Similarity example
[ https://issues.apache.org/jira/browse/FLINK-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742234#comment-14742234 ]

ASF GitHub Bot commented on FLINK-2310:
---------------------------------------

Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/892#issuecomment-139822550

I think it's safe to open a fresh PR... fixing this one would be overkill. You can also update and rebase #923 so that we can review them.

> Add an Adamic-Adar Similarity example
> -------------------------------------
>
> Key: FLINK-2310
> URL: https://issues.apache.org/jira/browse/FLINK-2310
> Project: Flink
> Issue Type: Task
> Components: Gelly
> Reporter: Andra Lungu
> Assignee: Shivani Ghatge
> Priority: Minor
>
> Like Jaccard, the Adamic-Adar algorithm measures the similarity between pairs of nodes. However, instead of counting the common neighbors and dividing them by the total number of neighbors, the similarity is weighted according to the vertex degrees: each vertex contributes log(1/degree), i.e. the log of its inverse degree.
> The Adamic-Adar algorithm can be broken into three steps:
> 1). For each vertex, compute the log of its inverse degree (with the formula above) and set it as the vertex value.
> 2). Each vertex then sends this newly computed value, along with a list of its neighbors, to the targets of its out-edges.
> 3). Weigh the edges with the Adamic-Adar index: the sum over n in CN of log(1/k_n), where CN is the set of all common neighbors of two vertices x, y and k_n is the degree of node n. See [2].
> Prerequisites:
> - Full understanding of the Jaccard Similarity Measure algorithm
> - Reading the associated literature:
> [1] http://social.cs.uiuc.edu/class/cs591kgk/friendsadamic.pdf
> [2] http://stackoverflow.com/questions/22565620/fast-algorithm-to-compute-adamic-adar

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
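To make the weighting concrete, here is a minimal, self-contained Java sketch (not the Gelly implementation from the pull request) that accumulates the Adamic-Adar score for every pair of vertices sharing a neighbor. It follows the log(1/k_n) weighting given in the issue description; the more commonly cited variant uses 1/log(k_n) instead, differing only in the weight function.

{code:java}
import java.util.*;

public class AdamicAdarSketch {

    // adjacency: vertex id -> set of neighbor ids (undirected graph)
    static Map<String, Double> adamicAdar(Map<Long, Set<Long>> adjacency) {
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> entry : adjacency.entrySet()) {
            // weight contributed by this vertex as a common neighbor: log(1/k_n)
            double weight = Math.log(1.0 / entry.getValue().size());
            Long[] neighbors = entry.getValue().toArray(new Long[0]);
            // every pair (x, y) of this vertex's neighbors has it as a common neighbor
            for (int i = 0; i < neighbors.length; i++) {
                for (int j = i + 1; j < neighbors.length; j++) {
                    long x = Math.min(neighbors[i], neighbors[j]);
                    long y = Math.max(neighbors[i], neighbors[j]);
                    scores.merge(x + "," + y, weight, Double::sum);
                }
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // triangle 1-2-3 plus an extra edge 3-4
        Map<Long, Set<Long>> g = new HashMap<>();
        g.put(1L, new HashSet<>(Arrays.asList(2L, 3L)));
        g.put(2L, new HashSet<>(Arrays.asList(1L, 3L)));
        g.put(3L, new HashSet<>(Arrays.asList(1L, 2L, 4L)));
        g.put(4L, new HashSet<>(Collections.singletonList(3L)));
        System.out.println(adamicAdar(g)); // e.g. pair "1,2" is scored via common neighbor 3
    }
}
{code}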
[GitHub] flink pull request: [FLINK-2310] Add an Adamic Adar Similarity exa...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/892#issuecomment-139822550 I think it's safe to open a fresh PR... fixing this one would be overkill. You can also update and rebase #923 so that we can review them. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2634) Add a Vertex-centric Version of the Triangle Count Library Method
[ https://issues.apache.org/jira/browse/FLINK-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742229#comment-14742229 ]

ASF GitHub Bot commented on FLINK-2634:
---------------------------------------

Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/1105#issuecomment-139820404

If it's okay with you, I'd like to see what @vasia has to say about adding the Triangle Count example from the DataSet API + a reduce as a library method. IMO, it's a better addition, but for some reason we preferred the Pregel/GSA implementations at a certain point (it was because in my thesis I take Pregel as a baseline). Also, the generic K instead of Long makes perfect sense. However, if we decide to change it, I'll have to open a JIRA to revise the entire set of library methods, because apart from PageRank I think they all restrict themselves to one common key type. I would be kind of scared of type erasure in the K implements Key case :-S

> Add a Vertex-centric Version of the Triangle Count Library Method
> ------------------------------------------------------------------
>
> Key: FLINK-2634
> URL: https://issues.apache.org/jira/browse/FLINK-2634
> Project: Flink
> Issue Type: Task
> Components: Gelly
> Affects Versions: 0.10
> Reporter: Andra Lungu
> Assignee: Andra Lungu
> Priority: Minor
>
> The vertex-centric version of this algorithm receives an undirected graph as input and outputs the total number of triangles formed by the graph's edges.
> The implementation consists of three phases:
> 1). Select neighbors with an id greater than the current vertex id.
> 2). Propagate each received value to neighbors with a higher id.
> 3). Compute the number of triangles by verifying whether the final vertex contains the sender's id in its list.
> As opposed to the GAS version, all three steps are performed via message passing.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
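The three phases above can be illustrated with a minimal, self-contained Java sketch that simulates the message rounds with plain maps; it is only an illustration of the counting logic, not the Gelly vertex-centric implementation proposed in the pull request.

{code:java}
import java.util.*;

public class VertexCentricTriangleCountSketch {

    static long countTriangles(Map<Long, Set<Long>> adjacency) {
        // Phase 1: every vertex sends its own id to neighbors with a higher id.
        Map<Long, List<Long>> inbox1 = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : adjacency.entrySet()) {
            long u = e.getKey();
            for (long v : e.getValue()) {
                if (v > u) {
                    inbox1.computeIfAbsent(v, k -> new ArrayList<>()).add(u);
                }
            }
        }

        // Phase 2: every vertex forwards each received id to neighbors with a higher id.
        Map<Long, List<Long>> inbox2 = new HashMap<>();
        for (Map.Entry<Long, List<Long>> e : inbox1.entrySet()) {
            long v = e.getKey();
            for (long u : e.getValue()) {
                for (long w : adjacency.getOrDefault(v, Collections.emptySet())) {
                    if (w > v) {
                        inbox2.computeIfAbsent(w, k -> new ArrayList<>()).add(u);
                    }
                }
            }
        }

        // Phase 3: a vertex closes a triangle if the forwarded id is in its own neighbor list.
        long triangles = 0;
        for (Map.Entry<Long, List<Long>> e : inbox2.entrySet()) {
            Set<Long> neighbors = adjacency.getOrDefault(e.getKey(), Collections.emptySet());
            for (long u : e.getValue()) {
                if (neighbors.contains(u)) {
                    triangles++;
                }
            }
        }
        return triangles;
    }

    public static void main(String[] args) {
        // triangle 1-2-3 plus a dangling edge 3-4
        Map<Long, Set<Long>> g = new HashMap<>();
        g.put(1L, new HashSet<>(Arrays.asList(2L, 3L)));
        g.put(2L, new HashSet<>(Arrays.asList(1L, 3L)));
        g.put(3L, new HashSet<>(Arrays.asList(1L, 2L, 4L)));
        g.put(4L, new HashSet<>(Collections.singletonList(3L)));
        System.out.println(countTriangles(g)); // prints 1
    }
}
{code}

Each triangle is counted exactly once because messages only ever travel toward higher vertex ids, which is the point of restricting the first two phases to higher-id neighbors.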
[GitHub] flink pull request: [FLINK-2634] [gelly] Added a vertex-centric Tr...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/1105#issuecomment-139820404 If it's okay with you, I'd like to see what @vasia has to say about adding the Triangle Count example from the DataSet API + a reduce as a library method. IMO, it's a better addition, but for some reason, we preferred the Pregel/GSA implementations at a certain point (it was because in my thesis, I take Pregel as a baseline). Also the generic K instead of Long makes perfect sense. However, if we decide to change it, I'll have to open a JIRA to revise the entire set of library methods because apart from PageRank, I think they all restrict themselves to one common key type. I would be kind of scared of type erasure in the K implements Key case :-S --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2661) Add a Node Splitting Technique to Overcome the Limitations of Skewed Graphs
[ https://issues.apache.org/jira/browse/FLINK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742216#comment-14742216 ]

ASF GitHub Bot commented on FLINK-2661:
---------------------------------------

GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/1124

[FLINK-2661] Added Node Splitting Methods

Social media graphs, citation networks and even protein networks have a common property: their degree distribution follows a power-law curve. This structure raises challenges for both the vertex-centric and the GSA/GAS models because they process vertices uniformly, regardless of their degree distribution. This leads to large stalls in execution time: vertices wait for skewed nodes to finish their computation (in the synchronous model).

This PR aims to diminish the impact of high-degree nodes by proposing four main functions: `determinieSkewedVertices`, `treeDeAggregate` (splits a node into subnodes, recursively, in levels), `propagateValuesToSplitVertices` (useful when the algorithm performs more than one superstep), and `treeAggregate` (brings the graph back to its initial state). These functions modify a graph at a high level, making its degree distribution more uniform. The method does not need any special partitioning or runtime modification and, for skewed networks and computationally intensive algorithms, can speed up the run time by a factor of two.

I added an example, NodeSplittingJaccardSimilarityMeasure, for which I needed to split the overall sequence of operations into two functions to be able to test the result. Calling the entire main method would have resulted in the "Too few memory segments" exception - too many operations called within one test, in other words. For more info, please consult the additional entry in the documentation.

If we reach a common point and this PR gets merged, I will also follow @fhueske's suggestion from the mailing list - adding a Split version of each of the library methods to allow users to verify whether their run time improves.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andralungu/flink splitJaccardFlink

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1124.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1124

commit b02d0917edcd5f3c8846fe01044afd7444a58c08
Author: Andra Lungu
Date: 2015-09-12T10:25:20Z

    [FLINK-2661] Added Node Splitting Methods
    [FLINK-2661] Minor modifications in the docs
    [FLINK-2661] pdf to png

> Add a Node Splitting Technique to Overcome the Limitations of Skewed Graphs
> ----------------------------------------------------------------------------
>
> Key: FLINK-2661
> URL: https://issues.apache.org/jira/browse/FLINK-2661
> Project: Flink
> Issue Type: Task
> Components: Gelly
> Affects Versions: 0.10
> Reporter: Andra Lungu
> Assignee: Andra Lungu
>
> Skewed graphs raise unique challenges to computation models such as Gelly's vertex-centric or GSA iterations. This is mainly because these approaches process vertices uniformly, regardless of their degree distribution.
> In vertex-centric, for instance, a skewed node will take more time to process its neighbors than the other nodes in the graph. The former acts as a straggler, causing the latter to remain idle until it finishes its computation.
> This issue can be mitigated by splitting a high-degree node into subnodes and evenly distributing the edges to the resulting subvertices. The computation is then performed on the split vertex.
> To this end, we should add a Splitting API on top of Gelly which can help:
> - determine skewed nodes
> - split them
> - merge them back at the end of the computation, given a user-defined combiner.
> To illustrate the usage of these methods, we should add an example as well as a separate entry in the documentation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2661] Added Node Splitting Methods
GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/1124

[FLINK-2661] Added Node Splitting Methods

Social media graphs, citation networks and even protein networks have a common property: their degree distribution follows a power-law curve. This structure raises challenges for both the vertex-centric and the GSA/GAS models because they process vertices uniformly, regardless of their degree distribution. This leads to large stalls in execution time: vertices wait for skewed nodes to finish their computation (in the synchronous model).

This PR aims to diminish the impact of high-degree nodes by proposing four main functions: `determinieSkewedVertices`, `treeDeAggregate` (splits a node into subnodes, recursively, in levels), `propagateValuesToSplitVertices` (useful when the algorithm performs more than one superstep), and `treeAggregate` (brings the graph back to its initial state). These functions modify a graph at a high level, making its degree distribution more uniform. The method does not need any special partitioning or runtime modification and, for skewed networks and computationally intensive algorithms, can speed up the run time by a factor of two.

I added an example, NodeSplittingJaccardSimilarityMeasure, for which I needed to split the overall sequence of operations into two functions to be able to test the result. Calling the entire main method would have resulted in the "Too few memory segments" exception - too many operations called within one test, in other words. For more info, please consult the additional entry in the documentation.

If we reach a common point and this PR gets merged, I will also follow @fhueske's suggestion from the mailing list - adding a Split version of each of the library methods to allow users to verify whether their run time improves.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andralungu/flink splitJaccardFlink

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1124.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1124

commit b02d0917edcd5f3c8846fe01044afd7444a58c08
Author: Andra Lungu
Date: 2015-09-12T10:25:20Z

    [FLINK-2661] Added Node Splitting Methods
    [FLINK-2661] Minor modifications in the docs
    [FLINK-2661] pdf to png

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
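The split/aggregate idea can be sketched in a few lines of plain Java (the helper names below are only illustrative and are not the API added by this PR): a high-degree vertex's edges are distributed round-robin over subvertices, each subvertex produces a partial result, and a user-defined combiner merges the partials back into a single value for the original vertex.

{code:java}
import java.util.*;
import java.util.function.BinaryOperator;

public class NodeSplittingSketch {

    // Split a skewed vertex's neighbor list evenly over numSubVertices subvertices
    // (a flat version of what a treeDeAggregate-style step would do).
    static List<List<Long>> splitNeighbors(List<Long> neighbors, int numSubVertices) {
        List<List<Long>> parts = new ArrayList<>();
        for (int i = 0; i < numSubVertices; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < neighbors.size(); i++) {
            parts.get(i % numSubVertices).add(neighbors.get(i)); // round-robin assignment
        }
        return parts;
    }

    public static void main(String[] args) {
        // A skewed vertex with ten neighbors; each subvertex computes a partial
        // result (here simply its partial degree) and the user-defined combiner
        // merges the partials back, as a treeAggregate-style step would.
        List<Long> neighbors = new ArrayList<>();
        for (long i = 1; i <= 10; i++) {
            neighbors.add(i);
        }
        List<List<Long>> subVertices = splitNeighbors(neighbors, 3);
        BinaryOperator<Integer> combiner = Integer::sum; // user-defined combiner
        int merged = subVertices.stream()
                .map(List::size)        // per-subvertex partial result
                .reduce(0, combiner);   // merge back to the original vertex
        System.out.println(merged);     // prints 10, the original degree
    }
}
{code}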
[jira] [Created] (FLINK-2662) CompilerException: "Bug: Plan generation for Unions picked a ship strategy between binary plan operators."
Gabor Gevay created FLINK-2662:
----------------------------------

Summary: CompilerException: "Bug: Plan generation for Unions picked a ship strategy between binary plan operators."
Key: FLINK-2662
URL: https://issues.apache.org/jira/browse/FLINK-2662
Project: Flink
Issue Type: Bug
Components: Optimizer
Affects Versions: master
Reporter: Gabor Gevay

I have a Flink program which throws the exception in the JIRA title. Full text:

Exception in thread "main" org.apache.flink.optimizer.CompilerException: Bug: Plan generation for Unions picked a ship strategy between binary plan operators.
    at org.apache.flink.optimizer.traversals.BinaryUnionReplacer.collect(BinaryUnionReplacer.java:113)
    at org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:72)
    at org.apache.flink.optimizer.traversals.BinaryUnionReplacer.postVisit(BinaryUnionReplacer.java:41)
    at org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:170)
    at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
    at org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163)
    at org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:163)
    at org.apache.flink.optimizer.plan.WorksetIterationPlanNode.acceptForStepFunction(WorksetIterationPlanNode.java:194)
    at org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:49)
    at org.apache.flink.optimizer.traversals.BinaryUnionReplacer.preVisit(BinaryUnionReplacer.java:41)
    at org.apache.flink.optimizer.plan.DualInputPlanNode.accept(DualInputPlanNode.java:162)
    at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
    at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
    at org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:127)
    at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:520)
    at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:402)
    at org.apache.flink.client.LocalExecutor.getOptimizerPlanAsJSON(LocalExecutor.java:202)
    at org.apache.flink.api.java.LocalEnvironment.getExecutionPlan(LocalEnvironment.java:63)
    at malom.Solver.main(Solver.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

The execution plan: http://compalg.inf.elte.hu/~ggevay/flink/plan_3_4_0_0_without_verif.txt (I obtained this by commenting out the line that throws the exception)

The code is here: https://github.com/ggevay/flink/tree/plan-generation-bug

The class to run is "Solver". It needs a command-line argument: a directory where it will write its output. (On the first run it generates some lookup tables for a few minutes, which are then placed in /tmp/movegen.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-2537) Add scala examples.jar to build-target/examples
[ https://issues.apache.org/jira/browse/FLINK-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742087#comment-14742087 ]

ASF GitHub Bot commented on FLINK-2537:
---------------------------------------

GitHub user chenliang613 opened a pull request: https://github.com/apache/flink/pull/1123

[FLINK-2537] Add scala examples.jar to build-target/examples

Scala, as a functional programming language, is being adopted by more and more developers, and some newcomers may want to modify the Scala examples' code to better understand Flink's mechanics. After changing the Scala code, they may follow the steps below to check the result:
1. go to "build-target/bin" and start the server
2. use the web UI to upload the Scala examples' jar
3. at this point they would be confused about why their changes are not picked up.
Because build-target/examples only contains the Java examples, I suggest adding the Scala examples as well. The new directory layout would look like this:
build-target/examples/java
build-target/examples/scala

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenliang613/flink FLINK-2537

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1123.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1123

commit 5a6735ce8cdf1555714377dc94f6ad90c41673c3
Author: chenliang613
Date: 2015-09-12T15:18:10Z

    FLINK-2537 Add scala examples.jar to build-target/examples

> Add scala examples.jar to build-target/examples
> ------------------------------------------------
>
> Key: FLINK-2537
> URL: https://issues.apache.org/jira/browse/FLINK-2537
> Project: Flink
> Issue Type: Improvement
> Components: Examples
> Affects Versions: 0.10
> Reporter: chenliang
> Assignee: chenliang
> Priority: Minor
> Labels: maven
> Fix For: 0.10
>
> Scala, as a functional programming language, is being adopted by more and more developers, and some newcomers may want to modify the Scala examples' code to better understand Flink's mechanics. After changing the Scala code, they may follow these steps to check the result:
> 1. go to "build-target/bin" and start the server
> 2. use the web UI to upload the Scala examples' jar
> 3. at this point they would be confused about why their changes are not picked up.
> Because build-target/examples only contains the Java examples, I suggest adding the Scala examples as well. The new directory layout would look like this:
> build-target/examples/java
> build-target/examples/scala

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2537] Add scala examples.jar to build-t...
GitHub user chenliang613 opened a pull request: https://github.com/apache/flink/pull/1123

[FLINK-2537] Add scala examples.jar to build-target/examples

Scala, as a functional programming language, is being adopted by more and more developers, and some newcomers may want to modify the Scala examples' code to better understand Flink's mechanics. After changing the Scala code, they may follow the steps below to check the result:
1. go to "build-target/bin" and start the server
2. use the web UI to upload the Scala examples' jar
3. at this point they would be confused about why their changes are not picked up.
Because build-target/examples only contains the Java examples, I suggest adding the Scala examples as well. The new directory layout would look like this:
build-target/examples/java
build-target/examples/scala

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenliang613/flink FLINK-2537

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1123.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1123

commit 5a6735ce8cdf1555714377dc94f6ad90c41673c3
Author: chenliang613
Date: 2015-09-12T15:18:10Z

    FLINK-2537 Add scala examples.jar to build-target/examples

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
[jira] [Commented] (FLINK-2659) Object reuse in UnionWithTempOperator
[ https://issues.apache.org/jira/browse/FLINK-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742076#comment-14742076 ] Stephan Ewen commented on FLINK-2659: - Yes, this clearly looks like a bug. Thank you for so thoroughly checking out the object-reuse mode... > Object reuse in UnionWithTempOperator > - > > Key: FLINK-2659 > URL: https://issues.apache.org/jira/browse/FLINK-2659 > Project: Flink > Issue Type: Bug > Components: Distributed Runtime >Affects Versions: master >Reporter: Greg Hogan > > The first loop in UnionWithTempOperator.run() executes until null, then the > second loop attempts to reuse this null value. [~StephanEwen], would you like > me to submit a pull request? > Stack trace: > {noformat} > org.apache.flink.client.program.ProgramInvocationException: The program > execution failed: Job execution failed. > at org.apache.flink.client.program.Client.run(Client.java:381) > at org.apache.flink.client.program.Client.run(Client.java:319) > at org.apache.flink.client.program.Client.run(Client.java:312) > at > org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:63) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:790) > at Driver.main(Driver.java:376) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353) > at org.apache.flink.client.program.Client.run(Client.java:278) > at > org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:630) > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:318) > at > org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:953) > at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1003) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed. 
> at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:418) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) > at > org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:40) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28) > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) > at > org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28) > at akka.actor.Actor$class.aroundReceive(Actor.scala:465) > at > org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:104) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) > at akka.actor.ActorCell.invoke(ActorCell.scala:487) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) > at akka.dispatch.Mailbox.run(Mailbox.scala:221) > at akka.dispatch.Mailbox.exec(Mailbox.scala:231) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.lang.NullPointerException > at > org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:136) > at > org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:30) > at > org.apache.flink.runtime.plugable.ReusingDeserializationDelegate.read(ReusingDeserializationDelegate.java:57) > at > org.apache.flink.
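The reported reuse problem follows a common pattern with reusing deserialization; the following is a simplified, hypothetical Java sketch of that pattern and of the obvious fix, not the actual UnionWithTempOperator source:

{code:java}
import java.util.function.Consumer;
import java.util.function.Supplier;

// Illustrative stand-in for a MutableObjectIterator-style input: next(reuse)
// fills and returns the reuse object, or returns null once the input is exhausted.
interface ReusingInput<T> {
    T next(T reuse);
}

class UnionWithTempSketch<T> {

    void run(ReusingInput<T> cachedInput, ReusingInput<T> directInput,
             Supplier<T> newInstance, Consumer<T> output) {

        T reuse = newInstance.get();

        // First loop: drain the first input. On the final call next() returns
        // null, and that null overwrites the reuse object.
        while ((reuse = cachedInput.next(reuse)) != null) {
            output.accept(reuse);
        }

        // Without this line, the second loop would pass null as the reuse
        // object and deserialization would fail with a NullPointerException.
        reuse = newInstance.get();

        // Second loop: drain the second input with a fresh reuse object.
        while ((reuse = directInput.next(reuse)) != null) {
            output.accept(reuse);
        }
    }
}
{code}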
[jira] [Created] (FLINK-2661) Add a Node Splitting Technique to Overcome the Limitations of Skewed Graphs
Andra Lungu created FLINK-2661:
----------------------------------

Summary: Add a Node Splitting Technique to Overcome the Limitations of Skewed Graphs
Key: FLINK-2661
URL: https://issues.apache.org/jira/browse/FLINK-2661
Project: Flink
Issue Type: Task
Components: Gelly
Affects Versions: 0.10
Reporter: Andra Lungu
Assignee: Andra Lungu

Skewed graphs raise unique challenges to computation models such as Gelly's vertex-centric or GSA iterations. This is mainly because these approaches process vertices uniformly, regardless of their degree distribution.

In vertex-centric, for instance, a skewed node will take more time to process its neighbors than the other nodes in the graph. The former acts as a straggler, causing the latter to remain idle until it finishes its computation.

This issue can be mitigated by splitting a high-degree node into subnodes and evenly distributing the edges to the resulting subvertices. The computation is then performed on the split vertex.

To this end, we should add a Splitting API on top of Gelly which can help:
- determine skewed nodes
- split them
- merge them back at the end of the computation, given a user-defined combiner.

To illustrate the usage of these methods, we should add an example as well as a separate entry in the documentation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[GitHub] flink pull request: Parameter Server: Distributed Key-Value store,...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1003#issuecomment-139742195

Having interfaces for a Parameter Server service in Flink is a very good idea, IMO. This interface can be implemented for different backends, such as Ignite or our own lightweight implementation. However, I doubt that it is really necessary to bake the Parameter Server master into the JobManager. Can't this be a completely stand-alone service that Flink programs write to and read from via the provided interfaces?

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
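Purely as an illustration of what a backend-agnostic interface could look like (the names below are hypothetical and not taken from the pull request), a minimal key-value parameter store might be as small as:

{code:java}
// Hypothetical, backend-agnostic parameter-store interface; one implementation
// could delegate to Ignite, another to a lightweight stand-alone store.
public interface ParameterStore<K, V> {

    /** Write or overwrite the value for a key. */
    void put(K key, V value);

    /** Read the current value for a key, or null if absent. */
    V get(K key);

    /** Apply an incremental update (e.g. a gradient) to the stored value. */
    void update(K key, V delta);
}
{code}

Whether the master lives inside the JobManager or in a separate process then becomes an implementation detail behind such an interface.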
[GitHub] flink pull request: [FLINK-2557] TypeExtractor properly returns Mi...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1045#issuecomment-139741907

I am not familiar with the TypeExtractor in detail, but I would support fixing the bug first and opening a separate issue to refactor the extractor, if that is possible. @StephanEwen, @tillrohrmann What do you think?

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
[jira] [Commented] (FLINK-2557) Manual type information via "returns" fails in DataSet API
[ https://issues.apache.org/jira/browse/FLINK-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741981#comment-14741981 ]

ASF GitHub Bot commented on FLINK-2557:
---------------------------------------

Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1045#issuecomment-139741907

I am not familiar with the TypeExtractor in detail, but I would support fixing the bug first and opening a separate issue to refactor the extractor, if that is possible. @StephanEwen, @tillrohrmann What do you think?

> Manual type information via "returns" fails in DataSet API
> -----------------------------------------------------------
>
> Key: FLINK-2557
> URL: https://issues.apache.org/jira/browse/FLINK-2557
> Project: Flink
> Issue Type: Bug
> Components: Java API
> Reporter: Matthias J. Sax
> Assignee: Chesnay Schepler
>
> I changed the WordCount example as below and get an exception.
> Tokenizer is changed to this (removed generics and added a cast to String):
> {code:java}
> public static final class Tokenizer implements FlatMapFunction {
>     public void flatMap(Object value, Collector out) {
>         String[] tokens = ((String) value).toLowerCase().split("\\W+");
>         for (String token : tokens) {
>             if (token.length() > 0) {
>                 out.collect(new Tuple2(token, 1));
>             }
>         }
>     }
> }
> {code}
> I added a call to "returns()" here:
> {code:java}
> DataSet<Tuple2<String, Integer>> counts =
>     text.flatMap(new Tokenizer()).returns("Tuple2<String, Integer>")
>         .groupBy(0).sum(1);
> {code}
> The exception is:
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: The types of the interface org.apache.flink.api.common.functions.FlatMapFunction could not be inferred. Support for synthetic interfaces, lambdas, and generic types is limited at this point.
>     at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:686)
>     at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:710)
>     at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:673)
>     at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:365)
>     at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:279)
>     at org.apache.flink.api.java.typeutils.TypeExtractor.getFlatMapReturnTypes(TypeExtractor.java:120)
>     at org.apache.flink.api.java.DataSet.flatMap(DataSet.java:262)
>     at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:69)
> {noformat}
> Fix:
> This should not immediately fail, but only give a "MissingTypeInfo" so that type hints would work.
> The error message is also wrong, btw: it should state that raw types are not supported.
> The issue has been reported here:
> http://stackoverflow.com/questions/32122495/stuck-with-type-hints-in-clojure-for-generic-class

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
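For contrast, here is a minimal sketch of the fully generic Tokenizer (essentially the standard WordCount one; the class name below is only illustrative), for which the TypeExtractor can infer Tuple2<String, Integer> on its own, so no returns() hint is needed:

{code:java}
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public final class GenericTokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {

    @Override
    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
        // split into lowercase words and emit (word, 1) pairs
        for (String token : value.toLowerCase().split("\\W+")) {
            if (token.length() > 0) {
                out.collect(new Tuple2<>(token, 1));
            }
        }
    }
}
{code}

With the raw, non-generic variant from the issue, the extractor cannot infer these types; the report's point is that it should then fall back to a MissingTypeInfo so the returns() hint can still take effect, instead of failing immediately.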