[jira] [Created] (FLINK-2007) Initial data point in Delta function needs to be serializable
Márton Balassi created FLINK-2007: - Summary: Initial data point in Delta function needs to be serializable Key: FLINK-2007 URL: https://issues.apache.org/jira/browse/FLINK-2007 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Márton Balassi Fix For: 0.9 Currently we expect the data point passed to the delta function to be serializable, which breaks the serialization assumptions provided in other parts of the code. This information should be properly serialized by Flink serializers instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
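The fix direction described in the issue can be sketched as follows. This is a hypothetical illustration, not Flink's actual API: `ValueSerializer` is a stand-in for a framework serializer such as Flink's `TypeSerializer`, and the idea is to eagerly snapshot the initial data point into a `byte[]`, so the user's type itself never has to implement `java.io.Serializable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in for a framework serializer such as Flink's TypeSerializer (hypothetical).
interface ValueSerializer<T> {
    void serialize(T value, DataOutputStream out) throws IOException;
    T deserialize(DataInputStream in) throws IOException;
}

// Holds the initial data point as bytes, which are trivially Java-serializable,
// instead of requiring the user's type T to implement java.io.Serializable.
final class InitialValue<T> implements java.io.Serializable {
    private final byte[] bytes;

    InitialValue(T value, ValueSerializer<T> serializer) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        serializer.serialize(value, new DataOutputStream(bos));
        this.bytes = bos.toByteArray();
    }

    T get(ValueSerializer<T> serializer) throws IOException {
        return serializer.deserialize(new DataInputStream(new ByteArrayInputStream(bytes)));
    }
}
```

The user function then ships only the byte array through Java serialization and reconstructs the value at runtime with the framework serializer.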
[jira] [Created] (FLINK-2008) PersistentKafkaSource is sometimes emitting tuples multiple times
Robert Metzger created FLINK-2008: - Summary: PersistentKafkaSource is sometimes emitting tuples multiple times Key: FLINK-2008 URL: https://issues.apache.org/jira/browse/FLINK-2008 Project: Flink Issue Type: Bug Components: Kafka Connector, Streaming Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger The PersistentKafkaSource is expected to emit records exactly once. Two test cases of the KafkaITCase are sporadically failing because records are emitted multiple times. Affected tests: {{testPersistentSourceWithOffsetUpdates()}}, after the offsets have been changed manually in ZK: {code} java.lang.RuntimeException: Expected v to be 3, but was 4 on element 0 array=[4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2] {code} {{brokerFailureTest()}} also fails: {code} 05/13/2015 08:13:16 Custom source - Stream Sink(1/1) switched to FAILED java.lang.AssertionError: Received tuple with value 21 twice at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertFalse(Assert.java:64) at org.apache.flink.streaming.connectors.kafka.KafkaITCase$15.invoke(KafkaITCase.java:877) at org.apache.flink.streaming.connectors.kafka.KafkaITCase$15.invoke(KafkaITCase.java:859) at org.apache.flink.streaming.api.operators.StreamSink.callUserFunction(StreamSink.java:39) at org.apache.flink.streaming.api.operators.StreamOperator.callUserFunctionAndLogException(StreamOperator.java:137) at org.apache.flink.streaming.api.operators.ChainableStreamOperator.collect(ChainableStreamOperator.java:54) at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39) at org.apache.flink.streaming.connectors.kafka.api.persistent.PersistentKafkaSource.run(PersistentKafkaSource.java:173) at org.apache.flink.streaming.api.operators.StreamSource.callUserFunction(StreamSource.java:40) at 
org.apache.flink.streaming.api.operators.StreamOperator.callUserFunctionAndLogException(StreamOperator.java:137) at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:34) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:139) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
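The failing assertion ("Received tuple with value 21 twice") boils down to an exactly-once check like the following minimal sketch; the class name is illustrative, not the test's actual code.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal exactly-once checker mirroring the assertion in the failing test:
// receiving the same value twice indicates the source re-emitted a record.
final class ExactlyOnceChecker {
    private final Set<Integer> seen = new HashSet<>();

    // Returns true the first time a value arrives, false on a duplicate.
    boolean receive(int value) {
        return seen.add(value);
    }
}
```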
[jira] [Commented] (FLINK-2007) Initial data point in Delta function needs to be serializable
[ https://issues.apache.org/jira/browse/FLINK-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541668#comment-14541668 ] Márton Balassi commented on FLINK-2007: --- I have a working prototype; the API needs cleanup before opening the PR. https://github.com/mbalassi/flink/commit/84f5cb75e5aa4616baea322a3d508433c86c1de7 Initial data point in Delta function needs to be serializable - Key: FLINK-2007 URL: https://issues.apache.org/jira/browse/FLINK-2007 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Márton Balassi Labels: serialization, windowing Fix For: 0.9 Currently we expect the data point passed to the delta function to be serializable, which breaks the serialization assumptions provided in other parts of the code. This information should be properly serialized by Flink serializers instead.
[jira] [Closed] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
[ https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1857. --- Flakey SimpleRecoveryITCase#testRestartMultipleTimes test - Key: FLINK-1857 URL: https://issues.apache.org/jira/browse/FLINK-1857 Project: Flink Issue Type: Bug Components: Build System Affects Versions: master Reporter: Ufuk Celebi Fix For: 0.9 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an unrelated change to the documentation): https://travis-ci.org/uce/incubator-flink/builds/57814190 https://travis-ci.org/uce/incubator-flink/jobs/57814194 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt https://travis-ci.org/uce/incubator-flink/jobs/57814195 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
[ https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1857. - Resolution: Done Fix Version/s: 0.9 Flakey SimpleRecoveryITCase#testRestartMultipleTimes test - Key: FLINK-1857 URL: https://issues.apache.org/jira/browse/FLINK-1857 Project: Flink Issue Type: Bug Components: Build System Affects Versions: master Reporter: Ufuk Celebi Fix For: 0.9 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an unrelated change to the documentation): https://travis-ci.org/uce/incubator-flink/builds/57814190 https://travis-ci.org/uce/incubator-flink/jobs/57814194 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt https://travis-ci.org/uce/incubator-flink/jobs/57814195 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
[ https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541682#comment-14541682 ] Stephan Ewen commented on FLINK-1857: - I think this has been fixed by the combination of multiple fixes to timeouts, task initialization, and message robustness. Flakey SimpleRecoveryITCase#testRestartMultipleTimes test - Key: FLINK-1857 URL: https://issues.apache.org/jira/browse/FLINK-1857 Project: Flink Issue Type: Bug Components: Build System Affects Versions: master Reporter: Ufuk Celebi SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an unrelated change to the documentation): https://travis-ci.org/uce/incubator-flink/builds/57814190 https://travis-ci.org/uce/incubator-flink/jobs/57814194 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt https://travis-ci.org/uce/incubator-flink/jobs/57814195 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt
[jira] [Resolved] (FLINK-1784) KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1784. - Resolution: Not A Problem KafkaITCase --- Key: FLINK-1784 URL: https://issues.apache.org/jira/browse/FLINK-1784 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor I observed a non-deterministic failure of the {{KafkaITCase}} on Travis: https://travis-ci.org/fhueske/flink/jobs/55815808 {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 03/25/2015 16:08:15 Job execution switched to status RUNNING. 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.KafkaSink.open(KafkaSink.java:118) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:202) at 
org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILING. 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to CANCELING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.simple.PersistentKafkaSource.open(PersistentKafkaSource.java:159) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:198) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILED. Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.774 sec FAILURE! - in org.apache.flink.streaming.connectors.kafka.KafkaITCase test(org.apache.flink.streaming.connectors.kafka.KafkaITCase) Time elapsed: 10.604 sec FAILURE! 
java.lang.AssertionError: Test failed with: null at org.junit.Assert.fail(Assert.java:88) at org.apache.flink.streaming.connectors.kafka.KafkaITCase.test(KafkaITCase.java:104) {code}
[jira] [Commented] (FLINK-1784) KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541689#comment-14541689 ] Stephan Ewen commented on FLINK-1784: - Subsumed by FLINK-2008 KafkaITCase --- Key: FLINK-1784 URL: https://issues.apache.org/jira/browse/FLINK-1784 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor I observed a non-deterministic failure of the {{KafkaITCase}} on Travis: https://travis-ci.org/fhueske/flink/jobs/55815808 {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 03/25/2015 16:08:15 Job execution switched to status RUNNING. 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.KafkaSink.open(KafkaSink.java:118) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at 
org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:202) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILING. 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to CANCELING 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.Iterator$$anon$2.next(Iterator.scala:37) at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30) at org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83) at org.apache.flink.streaming.connectors.kafka.api.simple.PersistentKafkaSource.open(PersistentKafkaSource.java:159) at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33) at org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158) at org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:198) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209) at java.lang.Thread.run(Thread.java:701) 03/25/2015 16:08:17 Job execution switched to status FAILED. Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.774 sec FAILURE! - in org.apache.flink.streaming.connectors.kafka.KafkaITCase test(org.apache.flink.streaming.connectors.kafka.KafkaITCase) Time elapsed: 10.604 sec FAILURE! 
java.lang.AssertionError: Test failed with: null at org.junit.Assert.fail(Assert.java:88) at org.apache.flink.streaming.connectors.kafka.KafkaITCase.test(KafkaITCase.java:104) {code}
[jira] [Commented] (FLINK-1690) ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541692#comment-14541692 ] Stephan Ewen commented on FLINK-1690: - I think this one is stable by now... ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis -- Key: FLINK-1690 URL: https://issues.apache.org/jira/browse/FLINK-1690 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Priority: Minor I got the following error on Travis. {code} ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure:244 The program did not finish in time {code} I think we have to increase the timeouts for this test case to make it reliably run on Travis. The log of the failed Travis build can be found [here|https://api.travis-ci.org/jobs/53952486/log.txt?deansi=true] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/667#issuecomment-101617351 For some reason it doesn't like the keyword2Parser when you use String Interpolation. If you replace the line with: ``` ("(?i)\\Q" + kw.key + "\\E").r ``` it should work. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
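The suggested pattern combines the `(?i)` inline case-insensitivity flag with `\Q...\E` literal quoting. In plain Java (equivalent to the Scala snippet above, with an illustrative class name) it looks like this:

```java
import java.util.regex.Pattern;

public class KeywordPattern {
    // Builds a case-insensitive pattern that matches the keyword literally:
    // (?i) turns on case-insensitive matching, and \Q...\E quotes the keyword
    // so any regex metacharacters in it are not interpreted.
    static Pattern forKeyword(String keyword) {
        return Pattern.compile("(?i)\\Q" + keyword + "\\E");
    }
}
```

`Pattern.quote(keyword)` produces the same `\Q...\E` quoting programmatically.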
[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs
[ https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541624#comment-14541624 ] ASF GitHub Bot commented on FLINK-1525: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/664#issuecomment-101579496 Do you want to allow empty keys like in `-- value` or even `--`? I think the parser should throw an exception in this case. Provide utils to pass -D parameters to UDFs Key: FLINK-1525 URL: https://issues.apache.org/jira/browse/FLINK-1525 Project: Flink Issue Type: Improvement Components: flink-contrib Reporter: Robert Metzger Assignee: Robert Metzger Labels: starter Hadoop users are used to setting job configuration through -D on the command line. Right now, Flink users have to manually parse command line arguments and pass them to the methods. It would be nice to provide a standard args parser that takes care of such stuff.
[GitHub] flink pull request: [FLINK-1525]Introduction of a small input para...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/664#issuecomment-101579496 Do you want to allow empty keys like in `-- value` or even `--`? I think the parser should throw an exception in this case.
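The behavior under discussion can be sketched as follows. This is a hypothetical minimal parser, not Flink's actual API: `--key value` pairs are accepted, a key with no value maps to "true", and an empty key such as `-- value` or a bare `--` throws an exception, as the comment proposes.

```java
import java.util.HashMap;
import java.util.Map;

public class ArgsParser {
    // Parses "--key value" pairs; a "--" with no key is rejected.
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected --key, got: " + args[i]);
            }
            String key = args[i].substring(2);
            if (key.isEmpty()) {
                throw new IllegalArgumentException("Empty key in arguments");
            }
            if (i + 1 < args.length && !args[i + 1].startsWith("--")) {
                params.put(key, args[++i]);  // consume the value token
            } else {
                params.put(key, "true");     // flag without a value
            }
        }
        return params;
    }
}
```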
[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541684#comment-14541684 ] Stephan Ewen commented on FLINK-1865: - With the rewrite of the PersistentKafkaSource, this has been subsumed by FLINK-2008 Unstable test KafkaITCase - Key: FLINK-1865 URL: https://issues.apache.org/jira/browse/FLINK-1865 Project: Flink Issue Type: Bug Components: Streaming, Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 04/10/2015 13:46:53 Job execution switched to status RUNNING. 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at 
org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54) at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39) at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196) at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 4 more Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166) at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141) at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 9 more 04/10/2015 13:47:04 Job execution switched to status FAILING. 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELED 04/10/2015 13:47:04 Job execution switched to status FAILED. 04/10/2015 13:47:05 Job execution switched to status RUNNING. 
04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:15 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at
[jira] [Closed] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-1865. --- Resolution: Not A Problem Unstable test KafkaITCase - Key: FLINK-1865 URL: https://issues.apache.org/jira/browse/FLINK-1865 Project: Flink Issue Type: Bug Components: Streaming, Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger {code} Running org.apache.flink.streaming.connectors.kafka.KafkaITCase 04/10/2015 13:46:53 Job execution switched to status RUNNING. 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:46:53 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54) at 
org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39) at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196) at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 4 more Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166) at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141) at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41) at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139) ... 9 more 04/10/2015 13:47:04 Job execution switched to status FAILING. 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELING 04/10/2015 13:47:04 Custom Source - Stream Sink(1/1) switched to CANCELED 04/10/2015 13:47:04 Job execution switched to status FAILED. 04/10/2015 13:47:05 Job execution switched to status RUNNING. 
04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to SCHEDULED 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to DEPLOYING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:05 Custom Source - Stream Sink(1/1) switched to RUNNING 04/10/2015 13:47:15 Custom Source - Stream Sink(1/1) switched to FAILED java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37) at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:701) Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145) at
[jira] [Resolved] (FLINK-1661) No Documentation for the isWeaklyConnected() Method
[ https://issues.apache.org/jira/browse/FLINK-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasia Kalavri resolved FLINK-1661. -- Resolution: Not A Problem Fix Version/s: 0.9 The method was removed by http://github.com/apache/flink/pull/657. No Documentation for the isWeaklyConnected() Method --- Key: FLINK-1661 URL: https://issues.apache.org/jira/browse/FLINK-1661 Project: Flink Issue Type: Wish Components: Gelly Affects Versions: 0.9 Reporter: Andra Lungu Priority: Trivial Labels: easyfix Fix For: 0.9 The current Gelly documentation does not include an entry for the isWeaklyConnected method. It would be nice to also include this description. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/671#issuecomment-101653784 It would mean the state with which that operator instance is started. It is just a personal preference - just keep it as it is if you want...
[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/671#issuecomment-101652722 But what does restoreInitialState mean? What is the initial state? The state when the operator was first created?
[jira] [Created] (FLINK-2011) Improve error message when user-defined serialization logic is wrong
Stephan Ewen created FLINK-2011: --- Summary: Improve error message when user-defined serialization logic is wrong Key: FLINK-2011 URL: https://issues.apache.org/jira/browse/FLINK-2011 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9
[jira] [Commented] (FLINK-1977) Rework Stream Operators to always be push based
[ https://issues.apache.org/jira/browse/FLINK-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541908#comment-14541908 ] ASF GitHub Bot commented on FLINK-1977: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/659#issuecomment-101663155 I rebased this PR on top of the latest work on the checkpointing. What are the opinions on merging this? Rework Stream Operators to always be push based --- Key: FLINK-1977 URL: https://issues.apache.org/jira/browse/FLINK-1977 Project: Flink Issue Type: Improvement Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek This is a result of the discussion on the mailing list. This is an excerpt from the mailing list that gives the basic idea of the change: I propose to change all streaming operators to be push based, with a slightly improved interface: In addition to collect(), which I would call receiveElement(), I would add receivePunctuation() and receiveBarrier(). The first operator in the chain would also get data from the outside invokable that reads from the input iterator and calls receiveElement() for the first operator in a chain.
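The interface proposed in the excerpt can be sketched as follows. The method names receiveElement/receivePunctuation/receiveBarrier come from the proposal itself; the type parameters, the long-valued arguments, and the driver loop are illustrative assumptions, not the actual implementation.

```java
import java.util.Iterator;

// Push-based operator interface as proposed: elements, punctuations, and
// checkpoint barriers are all pushed into the operator.
interface PushOperator<IN> {
    void receiveElement(IN element);
    void receivePunctuation(long timestamp);
    void receiveBarrier(long checkpointId);
}

final class SourceDriver {
    // The outside invokable for the head of a chain: it reads from the input
    // iterator and pushes each element into the first operator.
    static <T> void drive(Iterator<T> input, PushOperator<T> head) {
        while (input.hasNext()) {
            head.receiveElement(input.next());
        }
    }
}
```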
[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/671#issuecomment-101651301 Looks good. I actually liked the function name `restoreInitialState`. This makes it very clear that this method sets only the initial state, not any state whensoever.
[jira] [Commented] (FLINK-1962) Add Gelly Scala API
[ https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541827#comment-14541827 ] Aljoscha Krettek commented on FLINK-1962: - Sorry, your comment did not appear yet when I wrote my reply. My fix should work this time. Even though it looks quite ugly because of the dual casting. The reason for this is that the TupleTypeInfo constructor has type bounds on T to only allow subclasses of Tuple. The T I have in the TypeInformationGen method does not have these type bounds, so I have to cast things around, but the end result should be correct this time. I'm sorry this took so long. I've never looked at the graph API in depth before. Add Gelly Scala API --- Key: FLINK-1962 URL: https://issues.apache.org/jira/browse/FLINK-1962 Project: Flink Issue Type: Task Components: Gelly, Scala API Affects Versions: 0.9 Reporter: Vasia Kalavri
[GitHub] flink pull request: [FLINK-1977] Rework Stream Operators to always...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/659#issuecomment-101663155 I rebased this PR on top of the latest work on the checkpointing. What are the opinions on merging this?
[jira] [Commented] (FLINK-1962) Add Gelly Scala API
[ https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541810#comment-14541810 ] Aljoscha Krettek commented on FLINK-1962: - I'm sorry. It seems that the Graph API also needs the correct type class in the TupleTypeInfo. I pushed a fix to my pull request that should address this. Add Gelly Scala API --- Key: FLINK-1962 URL: https://issues.apache.org/jira/browse/FLINK-1962 Project: Flink Issue Type: Task Components: Gelly, Scala API Affects Versions: 0.9 Reporter: Vasia Kalavri
[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541818#comment-14541818 ] Peter Schrott commented on FLINK-1731: -- As discussed with [~aalexandrov], the initial centroids are given to the kmeans within the parameter map. For convenience fluent syntax is added. Add kMeans clustering algorithm to machine learning library --- Key: FLINK-1731 URL: https://issues.apache.org/jira/browse/FLINK-1731 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Alexander Alexandrov Labels: ML The Flink repository already contains a kMeans implementation but it is not yet ported to the machine learning library. I assume that only the used data types have to be adapted and then it can be more or less directly moved to flink-ml. The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better implementation because they improve the initial seeding phase to achieve near optimal clustering. It might be worthwhile to implement kMeans||. Resources: [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
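The improved seeding that the issue refers to is the D² sampling step of k-means++ (reference [1] above): each new center is chosen with probability proportional to its squared distance from the nearest center already picked. A minimal illustrative sketch, restricted to 1-D points for brevity and unrelated to any actual flink-ml code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class KMeansPlusPlus {
    // D^2 seeding: pick k initial centers from the given points.
    static List<Double> seed(double[] points, int k, Random rnd) {
        List<Double> centers = new ArrayList<>();
        centers.add(points[rnd.nextInt(points.length)]); // first center uniform at random
        while (centers.size() < k) {
            // Squared distance of every point to its nearest chosen center.
            double[] d2 = new double[points.length];
            double total = 0;
            for (int i = 0; i < points.length; i++) {
                double best = Double.POSITIVE_INFINITY;
                for (double c : centers) {
                    best = Math.min(best, (points[i] - c) * (points[i] - c));
                }
                d2[i] = best;
                total += best;
            }
            // Sample the next center proportionally to d2 (roulette-wheel selection).
            double r = rnd.nextDouble() * total, acc = 0;
            int chosen = points.length - 1;
            for (int i = 0; i < points.length; i++) {
                acc += d2[i];
                if (acc >= r) { chosen = i; break; }
            }
            centers.add(points[chosen]);
        }
        return centers;
    }
}
```

kMeans|| ([2]) parallelizes this by oversampling several centers per pass instead of one, which is why it suits a distributed setting like Flink better.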
[jira] [Closed] (FLINK-1595) Add a complex integration test for Streaming API
[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Márton Balassi closed FLINK-1595. - Resolution: Implemented Fix Version/s: 0.9 Implemented via 4786b56 and 48e21a1. Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Assignee: Péter Szabó Labels: Starter Fix For: 0.9 The streaming tests currently lack a sophisticated integration test that would test many api features at once. This should include different merging, partitioning, grouping, aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect by unit-tests.
[jira] [Commented] (FLINK-1962) Add Gelly Scala API
[ https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541783#comment-14541783 ] PJ Van Aeken commented on FLINK-1962: - So the line I mentioned is definitely the problem. What you'd really need to call there is the other constructor for TupleTypeInfo which also takes a Class as constructor parameter. Unfortunately, due to type erasure, the class T is not available. Since T : WeakTypeTag, perhaps there is a way to pass the class parameter after all? I am not that familiar with how WeakTypeTags work exactly. Add Gelly Scala API --- Key: FLINK-1962 URL: https://issues.apache.org/jira/browse/FLINK-1962 Project: Flink Issue Type: Task Components: Gelly, Scala API Affects Versions: 0.9 Reporter: Vasia Kalavri
[jira] [Created] (FLINK-2010) Add test that verifies that chained tasks are properly checkpointed
Stephan Ewen created FLINK-2010: --- Summary: Add test that verifies that chained tasks are properly checkpointed Key: FLINK-2010 URL: https://issues.apache.org/jira/browse/FLINK-2010 Project: Flink Issue Type: Bug Components: Streaming Reporter: Stephan Ewen Because this is critical logic, we should have a unit test for the {{StreamingTask}} validating that the state from all chained tasks is taken into account. It needs to check - The trigger checkpoint method, and that the state handle in the checkpoint acknowledgement contains all the necessary state - The restore state method, that state is reset properly
[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/671#issuecomment-101654445 Nah, you can change it of course. :smile_cat:
[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API
[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541940#comment-14541940 ] ASF GitHub Bot commented on FLINK-1595: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/520 Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Assignee: Péter Szabó Labels: Starter The streaming tests currently lack a sophisticated integration test that would test many api features at once. This should include different merging, partitioning, grouping, aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect by unit-tests.
[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/520
[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression
[ https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542011#comment-14542011 ] ASF GitHub Bot commented on FLINK-1990: --- Github user chhao01 commented on the pull request: https://github.com/apache/flink/pull/667#issuecomment-101688740 Thank you @aljoscha, eventually I got the IDE works with unit test, and I update the code, let's see what travis will say. Uppercase AS keyword not allowed in select expression --- Key: FLINK-1990 URL: https://issues.apache.org/jira/browse/FLINK-1990 Project: Flink Issue Type: Bug Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Aljoscha Krettek Priority: Minor Fix For: 0.9 Table API select expressions do not allow an uppercase AS keyword. The following expression fails with an {{ExpressionException}}: {{table.groupBy(request).select(request, request.count AS cnt)}}
[jira] [Commented] (FLINK-1977) Rework Stream Operators to always be push based
[ https://issues.apache.org/jira/browse/FLINK-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541985#comment-14541985 ] ASF GitHub Bot commented on FLINK-1977: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/659#issuecomment-101684152 I'll do another pass tomorrow. Rework Stream Operators to always be push based --- Key: FLINK-1977 URL: https://issues.apache.org/jira/browse/FLINK-1977 Project: Flink Issue Type: Improvement Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek This is a result of the discussion on the mailing list. This is an excerpt from the mailing list that gives the basic idea of the change: I propose to change all streaming operators to be push based, with a slightly improved interface: In addition to collect(), which I would call receiveElement() I would add receivePunctuation() and receiveBarrier(). The first operator in the chain would also get data from the outside invokable that reads from the input iterator and calls receiveElement() for the first operator in a chain.
[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...
Github user chhao01 commented on the pull request: https://github.com/apache/flink/pull/667#issuecomment-101688740 Thank you @aljoscha, eventually I got the IDE works with unit test, and I update the code, let's see what travis will say.
[GitHub] flink pull request: [FLINK-1977] Rework Stream Operators to always...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/659#issuecomment-101684152 I'll do another pass tomorrow.
[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs
[ https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542129#comment-14542129 ] ASF GitHub Bot commented on FLINK-1525: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/664#issuecomment-101720728 Good catch, thank you. I added test cases that validate we're throwing an exception for `-` and `--` inputs. Provide utils to pass -D parameters to UDFs Key: FLINK-1525 URL: https://issues.apache.org/jira/browse/FLINK-1525 Project: Flink Issue Type: Improvement Components: flink-contrib Reporter: Robert Metzger Assignee: Robert Metzger Labels: starter Hadoop users are used to setting job configuration through -D on the command line. Right now, Flink users have to manually parse command line arguments and pass them to the methods. It would be nice to provide a standard args parser which takes care of such stuff.
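The Hadoop-style parsing discussed here, including the rejection of bare `-` and `--` that the added test cases cover, can be sketched as below. This is an illustrative stand-in, not Flink's actual utility class; the name ArgsParser and the flag-defaults-to-"true" behavior are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class ArgsParser {
    // Parse "-Dkey=value" style arguments into a property map.
    static Map<String, String> parse(String[] args) {
        Map<String, String> props = new HashMap<>();
        for (String arg : args) {
            // A dangling "-" or "--" carries no key and is rejected outright.
            if (arg.equals("-") || arg.equals("--")) {
                throw new IllegalArgumentException("Dangling option: " + arg);
            }
            if (arg.startsWith("-D")) {
                String kv = arg.substring(2);
                int eq = kv.indexOf('=');
                if (eq < 0) {
                    props.put(kv, "true"); // "-Dflag" without '=' acts as a boolean flag
                } else {
                    props.put(kv.substring(0, eq), kv.substring(eq + 1));
                }
            }
        }
        return props;
    }
}
```

A UDF could then read typed values from the resulting map instead of parsing the command line by hand.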
[jira] [Commented] (FLINK-1977) Rework Stream Operators to always be push based
[ https://issues.apache.org/jira/browse/FLINK-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542134#comment-14542134 ] ASF GitHub Bot commented on FLINK-1977: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/659#issuecomment-101721428 I will also try to have a closer look later today or tomorrow. First rough glance looks good! Rework Stream Operators to always be push based --- Key: FLINK-1977 URL: https://issues.apache.org/jira/browse/FLINK-1977 Project: Flink Issue Type: Improvement Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek This is a result of the discussion on the mailing list. This is an excerpt from the mailing list that gives the basic idea of the change: I propose to change all streaming operators to be push based, with a slightly improved interface: In addition to collect(), which I would call receiveElement() I would add receivePunctuation() and receiveBarrier(). The first operator in the chain would also get data from the outside invokable that reads from the input iterator and calls receiveElement() for the first operator in a chain.
[GitHub] flink pull request: [FLINK-1977] Rework Stream Operators to always...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/659#issuecomment-101721428 I will also try to have a closer look later today or tomorrow. First rough glance looks good!
[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression
[ https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542108#comment-14542108 ] ASF GitHub Bot commented on FLINK-1990: --- Github user chhao01 commented on the pull request: https://github.com/apache/flink/pull/667#issuecomment-101719363 Failed in `flink-streaming-connectors` with java 8, does that related to this change? Uppercase AS keyword not allowed in select expression --- Key: FLINK-1990 URL: https://issues.apache.org/jira/browse/FLINK-1990 Project: Flink Issue Type: Bug Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Aljoscha Krettek Priority: Minor Fix For: 0.9 Table API select expressions do not allow an uppercase AS keyword. The following expression fails with an {{ExpressionException}}: {{table.groupBy(request).select(request, request.count AS cnt)}}
[jira] [Commented] (FLINK-2003) Building on some encrypted filesystems leads to File name too long error
[ https://issues.apache.org/jira/browse/FLINK-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542183#comment-14542183 ] Theodore Vasiloudis commented on FLINK-2003: As a note, in my case this was caused by the flink-yarn package, so adding the compilation arg to its pom fixed the problems. Building on some encrypted filesystems leads to File name too long error -- Key: FLINK-2003 URL: https://issues.apache.org/jira/browse/FLINK-2003 Project: Flink Issue Type: Bug Components: Build System Reporter: Theodore Vasiloudis Priority: Minor Labels: build, starter The classnames generated from the build system can be too long. Creating too long filenames in some encrypted filesystems is not possible, including encfs which is what Ubuntu uses. This is the same as this [Spark issue|https://issues.apache.org/jira/browse/SPARK-4820] The workaround (taken from the linked issue) is to add in Maven under the compile options: {code} <arg>-Xmax-classfile-name</arg> <arg>128</arg> {code} And in SBT add: {code} scalacOptions in Compile ++= Seq("-Xmax-classfile-name", "128"), {code}
[jira] [Commented] (FLINK-1711) Replace all usages off commons.Validate with guava.check
[ https://issues.apache.org/jira/browse/FLINK-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543112#comment-14543112 ] ASF GitHub Bot commented on FLINK-1711: --- GitHub user lokeshrajaram opened a pull request: https://github.com/apache/flink/pull/673 [FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for Java classes), Scala predef require(for Scala classes) [FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for Java classes), Scala predef require(for Scala classes) You can merge this pull request into a Git repository by running: $ git pull https://github.com/lokeshrajaram/flink all_guava Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/673.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #673 commit 04e1695d3b8414616216264a5b0972d762664ec7 Author: lrajaram lokesh_raja...@intuit.com Date: 2015-05-10T01:57:36Z converted all usages of Commons Validate to Guava Checks commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa Author: lrajaram lokesh_raja...@intuit.com Date: 2015-05-14T02:29:03Z converted all usages of commons validate to guava checks(for Java classes), scala predef require(for scala classes) Replace all usages off commons.Validate with guava.check Key: FLINK-1711 URL: https://issues.apache.org/jira/browse/FLINK-1711 Project: Flink Issue Type: Improvement Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Lokesh Rajaram Priority: Minor Labels: easyfix, starter Fix For: 0.9 Per discussion on the mailing list, we decided to increase homogeneity. One part is to consistently use the Guava methods {{checkNotNull}} and {{checkArgument}}, rather than Apache Commons Lang3 {{Validate}}.
[GitHub] flink pull request: [FLINK-1711] - Converted all usages of Commons...
GitHub user lokeshrajaram opened a pull request: https://github.com/apache/flink/pull/673 [FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for Java classes), Scala predef require(for Scala classes) [FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for Java classes), Scala predef require(for Scala classes) You can merge this pull request into a Git repository by running: $ git pull https://github.com/lokeshrajaram/flink all_guava Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/673.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #673 commit 04e1695d3b8414616216264a5b0972d762664ec7 Author: lrajaram lokesh_raja...@intuit.com Date: 2015-05-10T01:57:36Z converted all usages of Commons Validate to Guava Checks commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa Author: lrajaram lokesh_raja...@intuit.com Date: 2015-05-14T02:29:03Z converted all usages of commons validate to guava checks(for Java classes), scala predef require(for scala classes)
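The migration pattern is mechanical: Commons Lang3's `Validate.notNull`/`Validate.isTrue` become Guava's `checkNotNull`/`checkArgument`. The real methods live in Guava's `com.google.common.base.Preconditions`; the one-line stand-ins below only mirror their contracts so this sketch is self-contained, and the BufferPool class is purely hypothetical.

```java
// Minimal stand-ins mirroring the contracts of Guava's Preconditions methods.
class Preconditions {
    static <T> T checkNotNull(T ref, String msg) {
        if (ref == null) throw new NullPointerException(msg);
        return ref; // returns its argument, so it can wrap assignments inline
    }
    static void checkArgument(boolean cond, String msg) {
        if (!cond) throw new IllegalArgumentException(msg);
    }
}

// Hypothetical class showing the before/after of the migration.
class BufferPool {
    private final int capacity;
    BufferPool(Integer capacity) {
        // Before: Validate.notNull(capacity); Validate.isTrue(capacity > 0);
        Preconditions.checkNotNull(capacity, "capacity must not be null");
        Preconditions.checkArgument(capacity > 0, "capacity must be positive");
        this.capacity = capacity;
    }
    int capacity() { return capacity; }
}
```

In Scala classes the same checks become Predef's built-in `require(cond, msg)`, which is why the PR treats the two languages separately.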
[jira] [Assigned] (FLINK-1932) Add L-BFGS to the optimization framework
[ https://issues.apache.org/jira/browse/FLINK-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Theodore Vasiloudis reassigned FLINK-1932: -- Assignee: Theodore Vasiloudis Add L-BFGS to the optimization framework Key: FLINK-1932 URL: https://issues.apache.org/jira/browse/FLINK-1932 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Theodore Vasiloudis Labels: ML A good candidate to add to the new optimization framework could be L-BFGS [1, 2]. Resources: [1] http://papers.nips.cc/paper/5333-large-scale-l-bfgs-using-mapreduce.pdf [2] http://en.wikipedia.org/wiki/Limited-memory_BFGS
[GitHub] flink pull request: [FLINK-1907] Scala shell
GitHub user nikste opened a pull request: https://github.com/apache/flink/pull/672 [FLINK-1907] Scala shell implementation of Scala Shell, for Scala versions 2.10 and 2.11. Also includes change to eager print() You can merge this pull request into a Git repository by running: $ git pull https://github.com/nikste/flink Scala_shell Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/672.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #672 commit d28f69bb5ed6ce71657f62cf515839ded93f5deb Author: fleischhauf nikolaas.steenber...@googlemail.com Date: 2015-05-12T19:54:20Z Added custom Scala shell with ScalaShellRemoteEnvironment from Remote… … …Environment Either started locally, or connecting to remote sever added startup sh script for scala shell works with Scala 2.10 and 2.11 commit 68e8c087f678c53715f65f74dabe367eeb49ada1 Author: Aljoscha Krettek aljoscha.kret...@gmail.com Date: 2015-04-28T08:52:29Z [hotfix] Change print() to eager execution print() now uses collect() internally commit 320c95749af97c33bf2245b6db3d659bc3fc0379 Author: Nikolaas Steenbergen nikolaas.steenber...@googlemail.com Date: 2015-05-05T08:49:54Z Changed tests and examples for not failing with eager print method Added lastJobExecutionResult for getting the result of the last execution, when executing eager print commit 0b7cdc5fbd3f3271f23e781722edb75d380b3945 Author: fleischhauf nikolaas.steenber...@googlemail.com Date: 2015-05-12T19:58:57Z Added flink scala shell documentation with wordcount example
[jira] [Closed] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2006. --- TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b
[jira] [Closed] (FLINK-2011) Improve error message when user-defined serialization logic is wrong
[ https://issues.apache.org/jira/browse/FLINK-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2011. --- Improve error message when user-defined serialization logic is wrong Key: FLINK-2011 URL: https://issues.apache.org/jira/browse/FLINK-2011 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9
[jira] [Resolved] (FLINK-2011) Improve error message when user-defined serialization logic is wrong
[ https://issues.apache.org/jira/browse/FLINK-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2011. - Resolution: Pending Closed Fixed in 113b20b7f8717b12c5f0dfa691da582d426fbae0 Improve error message when user-defined serialization logic is wrong Key: FLINK-2011 URL: https://issues.apache.org/jira/browse/FLINK-2011 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9
[jira] [Resolved] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
[ https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2006. - Resolution: Pending Closed Fix Version/s: 0.9 Assignee: Stephan Ewen Fixed in 9da2f1f725473f1242067acb546607888e3a5015 TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED - Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Stephan Ewen Fix For: 0.9 I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b
[jira] [Commented] (FLINK-1907) Scala Interactive Shell
[ https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542326#comment-14542326 ] ASF GitHub Bot commented on FLINK-1907: --- GitHub user nikste opened a pull request: https://github.com/apache/flink/pull/672 [FLINK-1907] Scala shell implementation of Scala Shell, for Scala versions 2.10 and 2.11. Also includes change to eager print() You can merge this pull request into a Git repository by running: $ git pull https://github.com/nikste/flink Scala_shell Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/672.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #672 commit d28f69bb5ed6ce71657f62cf515839ded93f5deb Author: fleischhauf nikolaas.steenber...@googlemail.com Date: 2015-05-12T19:54:20Z Added custom Scala shell with ScalaShellRemoteEnvironment from Remote… … …Environment Either started locally, or connecting to remote sever added startup sh script for scala shell works with Scala 2.10 and 2.11 commit 68e8c087f678c53715f65f74dabe367eeb49ada1 Author: Aljoscha Krettek aljoscha.kret...@gmail.com Date: 2015-04-28T08:52:29Z [hotfix] Change print() to eager execution print() now uses collect() internally commit 320c95749af97c33bf2245b6db3d659bc3fc0379 Author: Nikolaas Steenbergen nikolaas.steenber...@googlemail.com Date: 2015-05-05T08:49:54Z Changed tests and examples for not failing with eager print method Added lastJobExecutionResult for getting the result of the last execution, when executing eager print commit 0b7cdc5fbd3f3271f23e781722edb75d380b3945 Author: fleischhauf nikolaas.steenber...@googlemail.com Date: 2015-05-12T19:58:57Z Added flink scala shell documentation with wordcount example Scala Interactive Shell --- Key: FLINK-1907 URL: https://issues.apache.org/jira/browse/FLINK-1907 Project: Flink Issue Type: New Feature Components: 
Scala API Reporter: Nikolaas Steenbergen Assignee: Nikolaas Steenbergen Priority: Minor Build an interactive Shell for the Scala api.
[jira] [Comment Edited] (FLINK-1731) Add kMeans clustering algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541516#comment-14541516 ] Peter Schrott edited comment on FLINK-1731 at 5/13/15 7:29 AM: --- Hi [~chiwanpark], the thing is, to fit the model, the KMeans uses two datasets. One is the training data, the other are the initial centroids. This means, the {{fit}}-method should take two attributes at that point. This is the reason why I suggested to use the parameter map for passing the initial centroids. The training dataset will be passed as argument to the {{fit}}-method, equally to the CoCoA implementation. was (Author: peedeex21): Hi [~chiwanpark], the thing is, to fit the model, the KMeans uses two datasets. One is the training data, the other are the initial centroids. This means, the {code:java}fit{code}-method should take two attributes at that point. This is the reason why I suggested to use the parameter map for passing the initial centroids. The training dataset will be passed as argument to the {code:java}fit{code}-method, equally to the CoCoA implementation. Add kMeans clustering algorithm to machine learning library --- Key: FLINK-1731 URL: https://issues.apache.org/jira/browse/FLINK-1731 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Alexander Alexandrov Labels: ML The Flink repository already contains a kMeans implementation but it is not yet ported to the machine learning library. I assume that only the used data types have to be adapted and then it can be more or less directly moved to flink-ml. The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better implementation because they improve the initial seeding phase to achieve near optimal clustering. It might be worthwhile to implement kMeans||. 
Resources: [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
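The API shape discussed in this thread, fit() taking a single training data set while the initial centroids travel in a parameter map with fluent setters, could look roughly like this. All class names here are illustrative stand-ins, not flink-ml's actual types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parameter map with fluent setters ("fluent syntax" per the thread).
class ParameterMap {
    private final Map<String, Object> params = new HashMap<>();
    <T> ParameterMap set(String key, T value) { params.put(key, value); return this; }
    @SuppressWarnings("unchecked")
    <T> T get(String key) { return (T) params.get(key); }
}

// Hypothetical estimator: fit() keeps one data-set argument, like CoCoA,
// because the second "data set" (the centroids) arrives via the parameter map.
class KMeans {
    private List<double[]> centroids;
    void fit(List<double[]> trainingData, ParameterMap params) {
        this.centroids = params.get("initialCentroids");
        // ... iterative refinement over trainingData would go here ...
    }
    List<double[]> centroids() { return centroids; }
}
```

Usage would then read naturally: `new KMeans().fit(data, new ParameterMap().set("initialCentroids", seeds))`.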
[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression
[ https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541505#comment-14541505 ] ASF GitHub Bot commented on FLINK-1990: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/667#issuecomment-101544907 I'll check it out. Uppercase AS keyword not allowed in select expression --- Key: FLINK-1990 URL: https://issues.apache.org/jira/browse/FLINK-1990 Project: Flink Issue Type: Bug Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Aljoscha Krettek Priority: Minor Fix For: 0.9 Table API select expressions do not allow an uppercase AS keyword. The following expression fails with an {{ExpressionException}}: {{table.groupBy(request).select(request, request.count AS cnt)}}
[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/667#issuecomment-101544907 I'll check it out. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
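For context, the fix under review makes the expression parser accept the keyword regardless of case. A minimal standalone sketch of that idea (this is not the actual Flink Table API parser code; the class and method names here are hypothetical):

```java
import java.util.regex.Pattern;

// Hypothetical sketch of case-insensitive keyword matching, the behavior
// FLINK-1990 asks for: "as" and "AS" should both be accepted as the alias
// keyword in a select expression.
public class KeywordMatcher {

    // CASE_INSENSITIVE makes the pattern match "as", "AS", "As", "aS".
    private static final Pattern AS = Pattern.compile("as", Pattern.CASE_INSENSITIVE);

    public static boolean isAsKeyword(String token) {
        return AS.matcher(token).matches();
    }

    public static void main(String[] args) {
        // Both spellings are recognized; other tokens are not.
        System.out.println(isAsKeyword("AS"));
        System.out.println(isAsKeyword("as"));
        System.out.println(isAsKeyword("select"));
    }
}
```

The same effect can be had by lowercasing each token before comparison; a case-insensitive pattern keeps the intent visible at the match site.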
[jira] [Comment Edited] (FLINK-1731) Add kMeans clustering algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541516#comment-14541516 ] Peter Schrott edited comment on FLINK-1731 at 5/13/15 7:33 AM: --- Hi [~chiwanpark], the thing is, to fit the model, KMeans uses two datasets. One is the training data, the other is the initial centroids. The initial centroids are used to create the appropriate clusters on the training dataset. These clusters define the fitted model. This means the {{fit}}-method should take two arguments at that point. This is why I suggested using the parameter map for passing the initial centroids. The training dataset will be passed as an argument to the {{fit}}-method, as in the CoCoA implementation. The test dataset will be applied to the trained model afterwards. was (Author: peedeex21): Hi [~chiwanpark], the thing is, to fit the model, KMeans uses two datasets. One is the training data, the other is the initial centroids. This means the {{fit}}-method should take two arguments at that point. This is why I suggested using the parameter map for passing the initial centroids. The training dataset will be passed as an argument to the {{fit}}-method, as in the CoCoA implementation. Add kMeans clustering algorithm to machine learning library --- Key: FLINK-1731 URL: https://issues.apache.org/jira/browse/FLINK-1731 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Alexander Alexandrov Labels: ML The Flink repository already contains a kMeans implementation, but it is not yet ported to the machine learning library. I assume that only the used data types have to be adapted and then it can be more or less directly moved to flink-ml. The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better implementation because they improve the initial seeding phase to achieve near-optimal clustering. 
It might be worthwhile to implement kMeans||. Resources: [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
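The seeding improvement referenced above is compact enough to sketch. This is a standalone illustration of kMeans++ initialization as described in [1] (first centroid drawn uniformly, each further centroid with probability proportional to its squared distance from the nearest centroid chosen so far), not code from flink-ml; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative kMeans++ seeding, following [1]. Not flink-ml code.
public class KMeansPlusPlusSeeding {

    static double sqDist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    public static List<double[]> seed(double[][] points, int k, Random rnd) {
        List<double[]> centroids = new ArrayList<>();
        // First centroid: uniform over the input points.
        centroids.add(points[rnd.nextInt(points.length)]);
        while (centroids.size() < k) {
            // d2[i] = squared distance of point i to its nearest chosen centroid.
            double[] d2 = new double[points.length];
            double total = 0;
            for (int i = 0; i < points.length; i++) {
                double best = Double.MAX_VALUE;
                for (double[] c : centroids) {
                    best = Math.min(best, sqDist(points[i], c));
                }
                d2[i] = best;
                total += best;
            }
            // Sample the next centroid index proportional to d2.
            double r = rnd.nextDouble() * total;
            int idx = 0;
            double acc = d2[0];
            while (acc < r && idx < points.length - 1) {
                idx++;
                acc += d2[idx];
            }
            centroids.add(points[idx]);
        }
        return centroids;
    }
}
```

Because already-chosen points have squared distance zero, they carry no sampling weight, which is what pushes the seeds apart and gives the near-optimal clustering guarantee of [1].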
[jira] [Created] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED
Robert Metzger created FLINK-2006: - Summary: TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED Key: FLINK-2006 URL: https://issues.apache.org/jira/browse/FLINK-2006 Project: Flink Issue Type: Bug Components: TaskManager Affects Versions: 0.9 Reporter: Robert Metzger I've seen this test failing multiple times now. Example: https://travis-ci.org/rmetzger/flink/jobs/62355627 [~till.rohrmann] tried to fix the issue a while ago with this commit http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541570#comment-14541570 ] Alexander Alexandrov commented on FLINK-1731: - I suggest trying to add the initial centroids as a proper parameter and not as part of the ParameterMap, since they are an actual input to the algorithm (as opposed to an algorithm hyper-parameter). Add kMeans clustering algorithm to machine learning library --- Key: FLINK-1731 URL: https://issues.apache.org/jira/browse/FLINK-1731 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Alexander Alexandrov Labels: ML The Flink repository already contains a kMeans implementation, but it is not yet ported to the machine learning library. I assume that only the used data types have to be adapted and then it can be more or less directly moved to flink-ml. The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better implementation because they improve the initial seeding phase to achieve near-optimal clustering. It might be worthwhile to implement kMeans||. Resources: [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf -- This message was sent by Atlassian JIRA (v6.3.4#6332)
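The distinction drawn in the comment above could look roughly like this in code. A hypothetical sketch, not the flink-ml API: the initial centroids travel as an explicit {{fit}} argument because they are an input to the algorithm, while genuine hyper-parameters (here the iteration count) are constructor/ParameterMap configuration. All names are illustrative:

```java
import java.util.List;

// Hypothetical estimator illustrating the suggested API shape:
// data inputs go to fit(), hyper-parameters go to configuration.
public class KMeansEstimator {

    private final int numIterations;  // hyper-parameter: would live in the ParameterMap
    private List<double[]> centroids; // model state produced by fit()

    public KMeansEstimator(int numIterations) {
        this.numIterations = numIterations;
    }

    // Initial centroids are passed alongside the training data,
    // not hidden in a configuration map.
    public void fit(List<double[]> trainingData, List<double[]> initialCentroids) {
        this.centroids = initialCentroids;
        // ... numIterations Lloyd iterations over trainingData would go here ...
    }

    public List<double[]> centroids() {
        return centroids;
    }
}
```

The payoff of this shape is type safety: a missing or mistyped centroid dataset fails at compile time, whereas a ParameterMap entry can only fail at runtime.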