[jira] [Created] (FLINK-2007) Initial data point in Delta function needs to be serializable

2015-05-13 Thread JIRA
Márton Balassi created FLINK-2007:
-

 Summary: Initial data point in Delta function needs to be 
serializable
 Key: FLINK-2007
 URL: https://issues.apache.org/jira/browse/FLINK-2007
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Márton Balassi
 Fix For: 0.9


Currently we expect the initial data point passed to the delta function to be 
serializable, which breaks the serialization assumptions made in other 
parts of the code.

This data should instead be serialized by Flink's own serializers.
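
A minimal sketch of the intended direction (illustrative only; the ValueSerializer interface below is a hypothetical stand-in, not Flink's actual TypeSerializer API): instead of requiring the user's initial data point to implement {{java.io.Serializable}}, the policy can hold the point as bytes produced by the element type's serializer.

{code}
import java.io.*;

// Hypothetical stand-in for a Flink-style per-type serializer.
interface ValueSerializer<T> extends Serializable {
    void serialize(T value, DataOutput out) throws IOException;
    T deserialize(DataInput in) throws IOException;
}

// Delta-like policy that keeps the initial data point in serialized form,
// so T itself never needs to implement java.io.Serializable.
class DeltaPolicy<T> implements Serializable {
    private final ValueSerializer<T> serializer;
    private final byte[] initialData;

    DeltaPolicy(T initialValue, ValueSerializer<T> serializer) throws IOException {
        this.serializer = serializer;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        serializer.serialize(initialValue, new DataOutputStream(bytes));
        this.initialData = bytes.toByteArray();
    }

    T initialValue() throws IOException {
        return serializer.deserialize(new DataInputStream(new ByteArrayInputStream(initialData)));
    }
}
{code}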



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2008) PersistentKafkaSource is sometimes emitting tuples multiple times

2015-05-13 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2008:
-

 Summary: PersistentKafkaSource is sometimes emitting tuples 
multiple times
 Key: FLINK-2008
 URL: https://issues.apache.org/jira/browse/FLINK-2008
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector, Streaming
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger


The PersistentKafkaSource is expected to emit records exactly once.

Two test cases of the KafkaITCase are sporadically failing because records are 
emitted multiple times.
Affected tests:
{{testPersistentSourceWithOffsetUpdates()}}, after the offsets have been 
changed manually in ZK:
{code}
java.lang.RuntimeException: Expected v to be 3, but was 4 on element 0 
array=[4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2]
{code}

{{brokerFailureTest()}} also fails:
{code}
05/13/2015 08:13:16 Custom source - Stream Sink(1/1) switched to FAILED 
java.lang.AssertionError: Received tuple with value 21 twice
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertFalse(Assert.java:64)
at 
org.apache.flink.streaming.connectors.kafka.KafkaITCase$15.invoke(KafkaITCase.java:877)
at 
org.apache.flink.streaming.connectors.kafka.KafkaITCase$15.invoke(KafkaITCase.java:859)
at 
org.apache.flink.streaming.api.operators.StreamSink.callUserFunction(StreamSink.java:39)
at 
org.apache.flink.streaming.api.operators.StreamOperator.callUserFunctionAndLogException(StreamOperator.java:137)
at 
org.apache.flink.streaming.api.operators.ChainableStreamOperator.collect(ChainableStreamOperator.java:54)
at 
org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
at 
org.apache.flink.streaming.connectors.kafka.api.persistent.PersistentKafkaSource.run(PersistentKafkaSource.java:173)
at 
org.apache.flink.streaming.api.operators.StreamSource.callUserFunction(StreamSource.java:40)
at 
org.apache.flink.streaming.api.operators.StreamOperator.callUserFunctionAndLogException(StreamOperator.java:137)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:34)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:139)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:745)
{code}
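
The failing assertion comes from a validating sink that remembers which values it has already seen; conceptually the check boils down to something like the following sketch (simplified, not the actual KafkaITCase code):

{code}
import java.util.BitSet;

// Simplified duplicate detector mirroring the exactly-once check in the test.
public class ExactlyOnceValidator {
    private final BitSet seen = new BitSet();

    public void receive(int value) {
        if (seen.get(value)) {
            throw new AssertionError("Received tuple with value " + value + " twice");
        }
        seen.set(value);
    }
}
{code}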



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2007) Initial data point in Delta function needs to be serializable

2015-05-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541668#comment-14541668
 ] 

Márton Balassi commented on FLINK-2007:
---

I have a working prototype; the API needs cleanup before opening the PR. 

https://github.com/mbalassi/flink/commit/84f5cb75e5aa4616baea322a3d508433c86c1de7

 Initial data point in Delta function needs to be serializable
 -

 Key: FLINK-2007
 URL: https://issues.apache.org/jira/browse/FLINK-2007
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Márton Balassi
  Labels: serialization, windowing
 Fix For: 0.9


 Currently we expect the initial data point passed to the delta function to be 
 serializable, which breaks the serialization assumptions made in other 
 parts of the code.
 This data should instead be serialized by Flink's own serializers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-1857.
---

 Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
 -

 Key: FLINK-1857
 URL: https://issues.apache.org/jira/browse/FLINK-1857
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: master
Reporter: Ufuk Celebi
 Fix For: 0.9


 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an 
 unrelated change to the documentation):
 https://travis-ci.org/uce/incubator-flink/builds/57814190
 https://travis-ci.org/uce/incubator-flink/jobs/57814194
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt
 https://travis-ci.org/uce/incubator-flink/jobs/57814195
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1857.
-
   Resolution: Done
Fix Version/s: 0.9

 Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
 -

 Key: FLINK-1857
 URL: https://issues.apache.org/jira/browse/FLINK-1857
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: master
Reporter: Ufuk Celebi
 Fix For: 0.9


 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an 
 unrelated change to the documentation):
 https://travis-ci.org/uce/incubator-flink/builds/57814190
 https://travis-ci.org/uce/incubator-flink/jobs/57814194
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt
 https://travis-ci.org/uce/incubator-flink/jobs/57814195
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1857) Flakey SimpleRecoveryITCase#testRestartMultipleTimes test

2015-05-13 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541682#comment-14541682
 ] 

Stephan Ewen commented on FLINK-1857:
-

I think this has been fixed by a combination of multiple fixes to timeouts, 
task initialization, and message robustness.

 Flakey SimpleRecoveryITCase#testRestartMultipleTimes test
 -

 Key: FLINK-1857
 URL: https://issues.apache.org/jira/browse/FLINK-1857
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: master
Reporter: Ufuk Celebi

 SimpleRecoveryITCase#testRestartMultipleTimes failed on Travis (with an 
 unrelated change to the documentation):
 https://travis-ci.org/uce/incubator-flink/builds/57814190
 https://travis-ci.org/uce/incubator-flink/jobs/57814194
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814194/log.txt
 https://travis-ci.org/uce/incubator-flink/jobs/57814195
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/57814195/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1784) KafkaITCase

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-1784.
-
Resolution: Not A Problem

 KafkaITCase
 ---

 Key: FLINK-1784
 URL: https://issues.apache.org/jira/browse/FLINK-1784
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor

 I observed a non-deterministic failure of the {{KafkaITCase}} on Travis:
 https://travis-ci.org/fhueske/flink/jobs/55815808
 {code}
 Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
 03/25/2015 16:08:15 Job execution switched to status RUNNING.
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING
 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED
 java.util.NoSuchElementException: next on empty iterator
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
 at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62)
 at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
 at 
 org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83)
 at 
 org.apache.flink.streaming.connectors.kafka.api.KafkaSink.open(KafkaSink.java:118)
 at 
 org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33)
 at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:202)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209)
 at java.lang.Thread.run(Thread.java:701)
 03/25/2015 16:08:17 Job execution switched to status FAILING.
 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to CANCELING
 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED
 java.util.NoSuchElementException: next on empty iterator
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
 at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62)
 at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
 at 
 org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83)
 at 
 org.apache.flink.streaming.connectors.kafka.api.simple.PersistentKafkaSource.open(PersistentKafkaSource.java:159)
 at 
 org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33)
 at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:198)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209)
 at java.lang.Thread.run(Thread.java:701)
 03/25/2015 16:08:17 Job execution switched to status FAILED.
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.774 sec 
 <<< FAILURE! - in org.apache.flink.streaming.connectors.kafka.KafkaITCase
 test(org.apache.flink.streaming.connectors.kafka.KafkaITCase) Time elapsed: 
 10.604 sec <<< FAILURE!
 java.lang.AssertionError: Test failed with: null
 at org.junit.Assert.fail(Assert.java:88)
 at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase.test(KafkaITCase.java:104)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1784) KafkaITCase

2015-05-13 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541689#comment-14541689
 ] 

Stephan Ewen commented on FLINK-1784:
-

Subsumed by FLINK-2008

 KafkaITCase
 ---

 Key: FLINK-1784
 URL: https://issues.apache.org/jira/browse/FLINK-1784
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor

 I observed a non-deterministic failure of the {{KafkaITCase}} on Travis:
 https://travis-ci.org/fhueske/flink/jobs/55815808
 {code}
 Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
 03/25/2015 16:08:15 Job execution switched to status RUNNING.
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to SCHEDULED
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to DEPLOYING
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING
 03/25/2015 16:08:15 Custom Source - Stream Sink(1/1) switched to RUNNING
 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED
 java.util.NoSuchElementException: next on empty iterator
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
 at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62)
 at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
 at 
 org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83)
 at 
 org.apache.flink.streaming.connectors.kafka.api.KafkaSink.open(KafkaSink.java:118)
 at 
 org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33)
 at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:202)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209)
 at java.lang.Thread.run(Thread.java:701)
 03/25/2015 16:08:17 Job execution switched to status FAILING.
 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to CANCELING
 03/25/2015 16:08:17 Custom Source - Stream Sink(1/1) switched to FAILED
 java.util.NoSuchElementException: next on empty iterator
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
 at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
 at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:62)
 at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
 at 
 org.apache.flink.streaming.connectors.kafka.api.simple.KafkaTopicUtils.getLeaderBrokerAddressForTopic(KafkaTopicUtils.java:83)
 at 
 org.apache.flink.streaming.connectors.kafka.api.simple.PersistentKafkaSource.open(PersistentKafkaSource.java:159)
 at 
 org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:33)
 at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.open(StreamInvokable.java:158)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.openOperator(StreamVertex.java:198)
 at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:165)
 at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:209)
 at java.lang.Thread.run(Thread.java:701)
 03/25/2015 16:08:17 Job execution switched to status FAILED.
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.774 sec 
 <<< FAILURE! - in org.apache.flink.streaming.connectors.kafka.KafkaITCase
 test(org.apache.flink.streaming.connectors.kafka.KafkaITCase) Time elapsed: 
 10.604 sec <<< FAILURE!
 java.lang.AssertionError: Test failed with: null
 at org.junit.Assert.fail(Assert.java:88)
 at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase.test(KafkaITCase.java:104)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1690) ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously fails on Travis

2015-05-13 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541692#comment-14541692
 ] 

Stephan Ewen commented on FLINK-1690:
-

I think this one is stable by now...

 ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure spuriously 
 fails on Travis
 --

 Key: FLINK-1690
 URL: https://issues.apache.org/jira/browse/FLINK-1690
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann
Priority: Minor

 I got the following error on Travis.
 {code}
 ProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure:244 The 
 program did not finish in time
 {code}
 I think we have to increase the timeouts for this test case to make it 
 reliably run on Travis.
 The log of the failed Travis build can be found 
 [here|https://api.travis-ci.org/jobs/53952486/log.txt?deansi=true]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...

2015-05-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-101617351
  
For some reason it doesn't like the keyword2Parser when you use string 
interpolation. If you replace the line with:
```
("(?i)\\Q" + kw.key + "\\E").r
```

it should work.
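
For reference, the same mechanics in plain `java.util.regex` terms (an illustrative sketch, not the parser code itself): `Pattern.quote` wraps its argument in the `\Q...\E` literal quoting used above, and `(?i)` turns on case-insensitive matching.

```
import java.util.regex.Pattern;

public class KeywordMatch {
    public static void main(String[] args) {
        String kw = "AS";
        // Pattern.quote("AS") yields the literal-quoted form \QAS\E;
        // the (?i) flag makes the match case-insensitive.
        Pattern p = Pattern.compile("(?i)" + Pattern.quote(kw));
        System.out.println(p.matcher("as").matches());  // true
        System.out.println(p.matcher("As").matches());  // true
    }
}
```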



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541624#comment-14541624
 ] 

ASF GitHub Bot commented on FLINK-1525:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/664#issuecomment-101579496
  
Do you want to allow empty keys like in `-- value` or even `--`? I think 
the parser should throw an exception in this case.


 Provide utils to pass -D parameters to UDFs 
 

 Key: FLINK-1525
 URL: https://issues.apache.org/jira/browse/FLINK-1525
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib
Reporter: Robert Metzger
Assignee: Robert Metzger
  Labels: starter

 Hadoop users are used to setting job configuration through -D on the 
 command line.
 Right now, Flink users have to manually parse command line arguments and pass 
 them to the methods.
 It would be nice to provide a standard args parser that takes care of this.
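
As an illustration of the idea (a sketch only, not the actual utility contributed for this issue), a minimal parser that understands Hadoop-style {{-Dkey=value}} and {{--key value}} pairs and rejects empty keys, as discussed in the comment above, could look like this:

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a -D / --key value argument parser; not the actual Flink utility.
public class SimpleArgs {
    public static Map<String, String> parse(String[] args) {
        Map<String, String> params = new HashMap<>();
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            if (arg.startsWith("-D")) {                       // -Dkey=value, Hadoop style
                String[] kv = arg.substring(2).split("=", 2);
                if (kv[0].isEmpty()) {
                    throw new IllegalArgumentException("Empty key in argument: " + arg);
                }
                params.put(kv[0], kv.length > 1 ? kv[1] : "");
                i++;
            } else if (arg.startsWith("--")) {                // --key value
                String key = arg.substring(2);
                if (key.isEmpty()) {
                    throw new IllegalArgumentException("Empty key in argument: " + arg);
                }
                String value =
                    (i + 1 < args.length && !args[i + 1].startsWith("-")) ? args[++i] : "";
                params.put(key, value);
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected argument: " + arg);
            }
        }
        return params;
    }
}
{code}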



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1525]Introduction of a small input para...

2015-05-13 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/664#issuecomment-101579496
  
Do you want to allow empty keys like in `-- value` or even `--`? I think 
the parser should throw an exception in this case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1865) Unstable test KafkaITCase

2015-05-13 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541684#comment-14541684
 ] 

Stephan Ewen commented on FLINK-1865:
-

With the rewrite of the PersistentKafkaSource, this has been subsumed by 
FLINK-2008

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Robert Metzger

 {code}
 Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
 04/10/2015 13:46:53   Job execution switched to status RUNNING.
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:04   Custom Source - Stream Sink(1/1) switched to FAILED 
 java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
   at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
   at 
 org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
   at 
 org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
   ... 4 more
 Caused by: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
   at 
 org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
   ... 9 more
 04/10/2015 13:47:04   Job execution switched to status FAILING.
 04/10/2015 13:47:04   Custom Source - Stream Sink(1/1) switched to CANCELING 
 04/10/2015 13:47:04   Custom Source - Stream Sink(1/1) switched to CANCELED 
 04/10/2015 13:47:04   Job execution switched to status FAILED.
 04/10/2015 13:47:05   Job execution switched to status RUNNING.
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:15   Custom Source - Stream Sink(1/1) switched to FAILED 
 java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
   at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 

[jira] [Closed] (FLINK-1865) Unstable test KafkaITCase

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-1865.
---
Resolution: Not A Problem

 Unstable test KafkaITCase
 -

 Key: FLINK-1865
 URL: https://issues.apache.org/jira/browse/FLINK-1865
 Project: Flink
  Issue Type: Bug
  Components: Streaming, Tests
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Robert Metzger

 {code}
 Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
 04/10/2015 13:46:53   Job execution switched to status RUNNING.
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:46:53   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:04   Custom Source - Stream Sink(1/1) switched to FAILED 
 java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
   at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
   at 
 org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
   at 
 org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
   ... 4 more
 Caused by: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
   at 
 org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
   ... 9 more
 04/10/2015 13:47:04   Job execution switched to status FAILING.
 04/10/2015 13:47:04   Custom Source - Stream Sink(1/1) switched to CANCELING 
 04/10/2015 13:47:04   Custom Source - Stream Sink(1/1) switched to CANCELED 
 04/10/2015 13:47:04   Job execution switched to status FAILED.
 04/10/2015 13:47:05   Job execution switched to status RUNNING.
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to SCHEDULED 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to DEPLOYING 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:05   Custom Source - Stream Sink(1/1) switched to RUNNING 
 04/10/2015 13:47:15   Custom Source - Stream Sink(1/1) switched to FAILED 
 java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
   at java.lang.Thread.run(Thread.java:701)
 Caused by: java.lang.RuntimeException: 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
   at 
 

[jira] [Resolved] (FLINK-1661) No Documentation for the isWeaklyConnected() Method

2015-05-13 Thread Vasia Kalavri (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasia Kalavri resolved FLINK-1661.
--
   Resolution: Not A Problem
Fix Version/s: 0.9

The method was removed by http://github.com/apache/flink/pull/657.

 No Documentation for the isWeaklyConnected() Method
 ---

 Key: FLINK-1661
 URL: https://issues.apache.org/jira/browse/FLINK-1661
 Project: Flink
  Issue Type: Wish
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Priority: Trivial
  Labels: easyfix
 Fix For: 0.9


 The current Gelly documentation does not include an entry for the 
 isWeaklyConnected method. It would be nice to also include this description.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces

2015-05-13 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/671#issuecomment-101653784
  
It would mean the state with which that operator instance is started.
It is just a personal preference - just keep it as it is if you want...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces

2015-05-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/671#issuecomment-101652722
  
But what does restoreInitialState mean? What is the initial state? The 
state when the operator was first created?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2011) Improve error message when user-defined serialization logic is wrong

2015-05-13 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2011:
---

 Summary: Improve error message when user-defined serialization 
logic is wrong
 Key: FLINK-2011
 URL: https://issues.apache.org/jira/browse/FLINK-2011
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1977) Rework Stream Operators to always be push based

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541908#comment-14541908
 ] 

ASF GitHub Bot commented on FLINK-1977:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/659#issuecomment-101663155
  
I rebased this PR on top of the latest work on the checkpointing. What are 
the opinions on merging this?


 Rework Stream Operators to always be push based
 ---

 Key: FLINK-1977
 URL: https://issues.apache.org/jira/browse/FLINK-1977
 Project: Flink
  Issue Type: Improvement
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek

 This is a result of the discussion on the mailing list. This is an excerpt 
 from the mailing list that gives the basic idea of the change:
 I propose to change all streaming operators to be push based, with a
 slightly improved interface: In addition to collect(), which I would
 call receiveElement() I would add receivePunctuation() and
 receiveBarrier(). The first operator in the chain would also get data
 from the outside invokable that reads from the input iterator and
 calls receiveElement() for the first operator in a chain.
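
Sketched as code, the proposed interface (method names taken from the excerpt above; everything else is hypothetical) would look roughly like this:

{code}
// Hypothetical push-based operator interface, following the names proposed above.
interface PushOperator<IN> {
    void receiveElement(IN element) throws Exception;        // replaces collect()
    void receivePunctuation(long timestamp) throws Exception;
    void receiveBarrier(long checkpointId) throws Exception;
}

// The outside invokable drives the first operator of a chain from the input iterator.
public class ChainDriver {
    public static <T> void drive(java.util.Iterator<T> input, PushOperator<T> head) throws Exception {
        while (input.hasNext()) {
            head.receiveElement(input.next());
        }
    }
}
{code}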



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces

2015-05-13 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/671#issuecomment-101651301
  
Looks good

I actually liked the function name `restoreInitialState`. This makes it 
very clear that this method sets only the initial state, not any state 
whatsoever.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-05-13 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541827#comment-14541827
 ] 

Aljoscha Krettek commented on FLINK-1962:
-

Sorry, your comment had not appeared yet when I wrote my reply.

My fix should work this time, even though it looks quite ugly because of the 
dual casting. The reason for this is that the TupleTypeInfo constructor has 
type bounds on T to only allow subclasses of Tuple. The T I have in the 
TypeInformationGen method does not have these type bounds, so I have to cast 
things around, but the end result should be correct this time.

I'm sorry this took so long. I've never looked at the graph API in depth before.

 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1977] Rework Stream Operators to always...

2015-05-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/659#issuecomment-101663155
  
I rebased this PR on top of the latest work on the checkpointing. What are 
the opinions on merging this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-05-13 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541810#comment-14541810
 ] 

Aljoscha Krettek commented on FLINK-1962:
-

I'm sorry. It seems that the Graph API also needs the correct type class in the 
TupleTypeInfo. I pushed a fix to my pull request that should address this.

 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-13 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541818#comment-14541818
 ] 

Peter Schrott commented on FLINK-1731:
--

As discussed with [~aalexandrov], the initial centroids are given to the kMeans 
algorithm via the parameter map. For convenience, a fluent syntax is added.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and kMeans|| [2] algorithms constitute a better 
 implementation because they improve the initial seeding phase to achieve 
 near-optimal clustering. It might be worthwhile to implement kMeans|| (see the 
 seeding sketch below).
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
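
To make the seeding idea above concrete, here is a minimal, self-contained sketch of the kMeans++ (D^2 sampling) seeding step (illustrative only, not the flink-ml implementation):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative kMeans++ seeding (D^2 sampling); not the flink-ml implementation.
public class KMeansPlusPlusSeeding {

    public static List<double[]> chooseCenters(List<double[]> points, int k, Random rnd) {
        List<double[]> centers = new ArrayList<>();
        centers.add(points.get(rnd.nextInt(points.size())));   // first center: uniform at random
        while (centers.size() < k) {
            double[] d2 = new double[points.size()];
            double sum = 0;
            for (int i = 0; i < points.size(); i++) {
                d2[i] = squaredDistanceToNearest(points.get(i), centers);
                sum += d2[i];
            }
            double r = rnd.nextDouble() * sum;                  // sample proportional to D^2
            int chosen = 0;
            for (double acc = 0; chosen < points.size() - 1; chosen++) {
                acc += d2[chosen];
                if (acc >= r) {
                    break;
                }
            }
            centers.add(points.get(chosen));
        }
        return centers;
    }

    private static double squaredDistanceToNearest(double[] p, List<double[]> centers) {
        double best = Double.MAX_VALUE;
        for (double[] c : centers) {
            double d = 0;
            for (int j = 0; j < p.length; j++) {
                d += (p[j] - c[j]) * (p[j] - c[j]);
            }
            best = Math.min(best, d);
        }
        return best;
    }
}
{code}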



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-1595) Add a complex integration test for Streaming API

2015-05-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi closed FLINK-1595.
-
   Resolution: Implemented
Fix Version/s: 0.9

Implemented via 4786b56 and 48e21a1.

 Add a complex integration test for Streaming API
 

 Key: FLINK-1595
 URL: https://issues.apache.org/jira/browse/FLINK-1595
 Project: Flink
  Issue Type: Task
  Components: Streaming
Reporter: Gyula Fora
Assignee: Péter Szabó
  Labels: Starter
 Fix For: 0.9


 The streaming tests currently lack a sophisticated integration test that 
 would test many API features at once. 
 This should include different merging, partitioning, grouping, and aggregation 
 types, as well as windowing and connected operators.
 The results should be tested for correctness.
 A test like this would help identify bugs that are hard to detect with 
 unit tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-05-13 Thread PJ Van Aeken (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541783#comment-14541783
 ] 

PJ Van Aeken commented on FLINK-1962:
-

So the line I mentioned is definitely the problem. What you'd really need to 
call there is the other constructor for TupleTypeInfo which also takes a Class 
as constructor parameter. Unfortunately, due to type erasure, the class T is 
not available. Since T : WeakTypeTag, perhaps there is a way to pass the class 
parameter after all? I am not that familiar with how WeakTypeTags work exactly.

 Add Gelly Scala API
 ---

 Key: FLINK-1962
 URL: https://issues.apache.org/jira/browse/FLINK-1962
 Project: Flink
  Issue Type: Task
  Components: Gelly, Scala API
Affects Versions: 0.9
Reporter: Vasia Kalavri





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2010) Add test that verifies that chained tasks are properly checkpointed

2015-05-13 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2010:
---

 Summary: Add test that verifies that chained tasks are properly 
checkpointed
 Key: FLINK-2010
 URL: https://issues.apache.org/jira/browse/FLINK-2010
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Stephan Ewen


Because this is critical logic, we should have a unit test for the 
{{StreamingTask}} validating that the state from all chained tasks is taken 
into account.

It needs to check
  - the trigger checkpoint method, and that the state handle in the 
checkpoint acknowledgement contains all the necessary state

  - the restore state method, and that the state is reset properly



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Clean up naming of State/Checkpoint Interfaces

2015-05-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/671#issuecomment-101654445
  
Nah, you can change it of course. :smile_cat: 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541940#comment-14541940
 ] 

ASF GitHub Bot commented on FLINK-1595:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/520


 Add a complex integration test for Streaming API
 

 Key: FLINK-1595
 URL: https://issues.apache.org/jira/browse/FLINK-1595
 Project: Flink
  Issue Type: Task
  Components: Streaming
Reporter: Gyula Fora
Assignee: Péter Szabó
  Labels: Starter

 The streaming tests currently lack a sophisticated integration test that 
 would test many API features at once. 
 This should include different merging, partitioning, grouping, and aggregation 
 types, as well as windowing and connected operators.
 The results should be tested for correctness.
 A test like this would help identify bugs that are hard to detect with 
 unit tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...

2015-05-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/520


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542011#comment-14542011
 ] 

ASF GitHub Bot commented on FLINK-1990:
---

Github user chhao01 commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-101688740
  
Thank you @aljoscha, eventually I got the IDE working with the unit test, and I 
updated the code; let's see what Travis will say.


 Uppercase AS keyword not allowed in select expression
 ---

 Key: FLINK-1990
 URL: https://issues.apache.org/jira/browse/FLINK-1990
 Project: Flink
  Issue Type: Bug
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Aljoscha Krettek
Priority: Minor
 Fix For: 0.9


 Table API select expressions do not allow an uppercase AS keyword.
 The following expression fails with an {{ExpressionException}}:
  {{table.groupBy("request").select("request, request.count AS cnt")}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1977) Rework Stream Operators to always be push based

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541985#comment-14541985
 ] 

ASF GitHub Bot commented on FLINK-1977:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/659#issuecomment-101684152
  
I'll do another pass tomorrow.


 Rework Stream Operators to always be push based
 ---

 Key: FLINK-1977
 URL: https://issues.apache.org/jira/browse/FLINK-1977
 Project: Flink
  Issue Type: Improvement
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek

 This is a result of the discussion on the mailing list. This is an excerpt 
 from the mailing list that gives the basic idea of the change:
 I propose to change all streaming operators to be push based, with a
 slightly improved interface: In addition to collect(), which I would
 call receiveElement() I would add receivePunctuation() and
 receiveBarrier(). The first operator in the chain would also get data
 from the outside invokable that reads from the input iterator and
 calls receiveElement() for the first operator in a chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...

2015-05-13 Thread chhao01
Github user chhao01 commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-101688740
  
Thank you @aljoscha, eventually I got the IDE working with the unit test, and I 
updated the code; let's see what Travis will say.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1977] Rework Stream Operators to always...

2015-05-13 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/659#issuecomment-101684152
  
I'll do another pass tomorrow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542129#comment-14542129
 ] 

ASF GitHub Bot commented on FLINK-1525:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/664#issuecomment-101720728
  
Good catch, thank you.
I added test cases that validate we're throwing an exception for `-` and 
`--` inputs.


 Provide utils to pass -D parameters to UDFs 
 

 Key: FLINK-1525
 URL: https://issues.apache.org/jira/browse/FLINK-1525
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib
Reporter: Robert Metzger
Assignee: Robert Metzger
  Labels: starter

 Hadoop users are used to setting job configuration through -D on the 
 command line.
 Right now, Flink users have to manually parse command line arguments and pass 
 them to the methods.
 It would be nice to provide a standard args parser that takes care of this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1977) Rework Stream Operators to always be push based

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542134#comment-14542134
 ] 

ASF GitHub Bot commented on FLINK-1977:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/659#issuecomment-101721428
  
I will also try to have a closer look later today or tomorrow.
First rough glance looks good!


 Rework Stream Operators to always be push based
 ---

 Key: FLINK-1977
 URL: https://issues.apache.org/jira/browse/FLINK-1977
 Project: Flink
  Issue Type: Improvement
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek

 This is a result of the discussion on the mailing list. This is an excerpt 
 from the mailing list that gives the basic idea of the change:
 I propose to change all streaming operators to be push based, with a
 slightly improved interface: In addition to collect(), which I would
 call receiveElement() I would add receivePunctuation() and
 receiveBarrier(). The first operator in the chain would also get data
 from the outside invokable that reads from the input iterator and
 calls receiveElement() for the first operator in a chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1977] Rework Stream Operators to always...

2015-05-13 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/659#issuecomment-101721428
  
I will also try to have a closer look later today or tomorrow.
First rough glance looks good!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542108#comment-14542108
 ] 

ASF GitHub Bot commented on FLINK-1990:
---

Github user chhao01 commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-101719363
  
The build failed in `flink-streaming-connectors` with Java 8; is that related to 
this change?


 Uppercase AS keyword not allowed in select expression
 ---

 Key: FLINK-1990
 URL: https://issues.apache.org/jira/browse/FLINK-1990
 Project: Flink
  Issue Type: Bug
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Aljoscha Krettek
Priority: Minor
 Fix For: 0.9


 Table API select expressions do not allow an uppercase AS keyword.
 The following expression fails with an {{ExpressionException}}:
  {{table.groupBy("request").select("request, request.count AS cnt")}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2003) Building on some encrypted filesystems leads to File name too long error

2015-05-13 Thread Theodore Vasiloudis (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542183#comment-14542183
 ] 

Theodore Vasiloudis commented on FLINK-2003:


As a note, in my case this was caused by the flink-yarn package, so adding the 
compilation arg to its pom fixed the problem.

 Building on some encrypted filesystems leads to File name too long error
 --

 Key: FLINK-2003
 URL: https://issues.apache.org/jira/browse/FLINK-2003
 Project: Flink
  Issue Type: Bug
  Components: Build System
Reporter: Theodore Vasiloudis
Priority: Minor
  Labels: build, starter

 The class names generated by the build system can be too long.
 Some encrypted filesystems, including encfs (which Ubuntu uses), cannot create 
 such long file names.
 This is the same as this [Spark 
 issue|https://issues.apache.org/jira/browse/SPARK-4820].
 The workaround (taken from the linked issue) is to add the following to the 
 Maven compiler options: 
 {code}
 +  <arg>-Xmax-classfile-name</arg>
 +  <arg>128</arg>
 {code}
 And in SBT add:
 {code}
 +scalacOptions in Compile ++= Seq("-Xmax-classfile-name", "128"),
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1711) Replace all usages off commons.Validate with guava.check

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543112#comment-14543112
 ] 

ASF GitHub Bot commented on FLINK-1711:
---

GitHub user lokeshrajaram opened a pull request:

https://github.com/apache/flink/pull/673

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lokeshrajaram/flink all_guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit 04e1695d3b8414616216264a5b0972d762664ec7
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-10T01:57:36Z

converted all usages of Commons Validate to Guava Checks

commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T02:29:03Z

converted all usages of commons validate to guava checks(for Java classes), 
scala predef require(for scala classes)




 Replace all usages off commons.Validate with guava.check
 

 Key: FLINK-1711
 URL: https://issues.apache.org/jira/browse/FLINK-1711
 Project: Flink
  Issue Type: Improvement
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Lokesh Rajaram
Priority: Minor
  Labels: easyfix, starter
 Fix For: 0.9


 Per discussion on the mailing list, we decided to increase homogeneity. One 
 part is to consistently use the Guava methods {{checkNotNull}} and 
 {{checkArgument}}, rather than Apache Commons Lang3 {{Validate}}.
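
For illustration, a typical before/after of such a conversion (a generic sketch, not taken from the actual pull request; the ConnectionConfig class is made up):

{code}
import static com.google.common.base.Preconditions.checkArgument;
import static com.google.common.base.Preconditions.checkNotNull;

public class ConnectionConfig {
    private final String host;
    private final int port;

    public ConnectionConfig(String host, int port) {
        // Before (Apache Commons Lang3):
        //   Validate.notNull(host, "host must not be null");
        //   Validate.isTrue(port > 0, "port must be positive");
        // After (Guava):
        this.host = checkNotNull(host, "host must not be null");
        checkArgument(port > 0, "port must be positive, but was %s", port);
        this.port = port;
    }
}
{code}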



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1711] - Converted all usages of Commons...

2015-05-13 Thread lokeshrajaram
GitHub user lokeshrajaram opened a pull request:

https://github.com/apache/flink/pull/673

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lokeshrajaram/flink all_guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit 04e1695d3b8414616216264a5b0972d762664ec7
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-10T01:57:36Z

converted all usages of Commons Validate to Guava Checks

commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T02:29:03Z

converted all usages of commons validate to guava checks(for Java classes), 
scala predef require(for scala classes)




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-1932) Add L-BFGS to the optimization framework

2015-05-13 Thread Theodore Vasiloudis (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Theodore Vasiloudis reassigned FLINK-1932:
--

Assignee: Theodore Vasiloudis

 Add L-BFGS to the optimization framework
 

 Key: FLINK-1932
 URL: https://issues.apache.org/jira/browse/FLINK-1932
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Theodore Vasiloudis
  Labels: ML

 A good candidate to add to the new optimization framework could be L-BFGS [1, 
 2].
 Resources:
 [1] http://papers.nips.cc/paper/5333-large-scale-l-bfgs-using-mapreduce.pdf
 [2] http://en.wikipedia.org/wiki/Limited-memory_BFGS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-13 Thread nikste
GitHub user nikste opened a pull request:

https://github.com/apache/flink/pull/672

[FLINK-1907] Scala shell

implementation of Scala Shell, for Scala versions 2.10 and 2.11.
Also includes change to eager print()

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nikste/flink Scala_shell

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/672.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #672


commit d28f69bb5ed6ce71657f62cf515839ded93f5deb
Author: fleischhauf nikolaas.steenber...@googlemail.com
Date:   2015-05-12T19:54:20Z

Added custom Scala shell with ScalaShellRemoteEnvironment from RemoteEnvironment

Either started locally, or connecting to a remote server
added startup sh script for scala shell
works with Scala 2.10 and 2.11

commit 68e8c087f678c53715f65f74dabe367eeb49ada1
Author: Aljoscha Krettek aljoscha.kret...@gmail.com
Date:   2015-04-28T08:52:29Z

[hotfix] Change print() to eager execution

print() now uses collect() internally
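
For a rough picture of what the eager print() change means, here is a hedged 
Scala sketch; DataSetLike and its collect() are illustrative stand-ins, not 
Flink's actual classes:

{code}
// Illustrative stand-in, not Flink's actual DataSet class.
trait DataSetLike[T] {
  // Assumed to trigger execution of the program and return the result
  // elements to the client, as Flink's collect() does.
  def collect(): Seq[T]

  // Eager print(): runs the job immediately via collect() and prints the
  // elements, instead of registering a sink and waiting for env.execute().
  def print(): Unit = collect().foreach(println)
}

// Tiny in-memory usage example.
object EagerPrintExample extends App {
  val ds = new DataSetLike[Int] { def collect(): Seq[Int] = Seq(1, 2, 3) }
  ds.print()   // executes eagerly and prints 1, 2, 3
}
{code}

The following commit then keeps the result of that implicit execution available 
as lastJobExecutionResult.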

commit 320c95749af97c33bf2245b6db3d659bc3fc0379
Author: Nikolaas Steenbergen nikolaas.steenber...@googlemail.com
Date:   2015-05-05T08:49:54Z

Changed tests and examples for not failing with eager print method
Added lastJobExecutionResult for getting the result of the last execution, 
when executing eager print

commit 0b7cdc5fbd3f3271f23e781722edb75d380b3945
Author: fleischhauf nikolaas.steenber...@googlemail.com
Date:   2015-05-12T19:58:57Z

Added flink scala shell documentation with wordcount example




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2006.
---

 TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but 
 was:FAILED
 -

 Key: FLINK-2006
 URL: https://issues.apache.org/jira/browse/FLINK-2006
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Stephan Ewen
 Fix For: 0.9


 I've seen this test failing multiple times now.
 Example: https://travis-ci.org/rmetzger/flink/jobs/62355627
 [~till.rohrmann] tried to fix the issue a while ago with this commit 
 http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2011) Improve error message when user-defined serialization logic is wrong

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2011.
---

 Improve error message when user-defined serialization logic is wrong
 

 Key: FLINK-2011
 URL: https://issues.apache.org/jira/browse/FLINK-2011
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2011) Improve error message when user-defined serialization logic is wrong

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2011.
-
Resolution: Pending Closed

Fixed in 113b20b7f8717b12c5f0dfa691da582d426fbae0

 Improve error message when user-defined serialization logic is wrong
 

 Key: FLINK-2011
 URL: https://issues.apache.org/jira/browse/FLINK-2011
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED

2015-05-13 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2006.
-
   Resolution: Pending Closed
Fix Version/s: 0.9
 Assignee: Stephan Ewen

Fixed in 9da2f1f725473f1242067acb546607888e3a5015

 TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but 
 was:FAILED
 -

 Key: FLINK-2006
 URL: https://issues.apache.org/jira/browse/FLINK-2006
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Stephan Ewen
 Fix For: 0.9


 I've seen this test failing multiple times now.
 Example: https://travis-ci.org/rmetzger/flink/jobs/62355627
 [~till.rohrmann] tried to fix the issue a while ago with this commit 
 http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14542326#comment-14542326
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

GitHub user nikste opened a pull request:

https://github.com/apache/flink/pull/672

[FLINK-1907] Scala shell

implementation of Scala Shell, for Scala versions 2.10 and 2.11.
Also includes change to eager print()

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nikste/flink Scala_shell

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/672.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #672


commit d28f69bb5ed6ce71657f62cf515839ded93f5deb
Author: fleischhauf nikolaas.steenber...@googlemail.com
Date:   2015-05-12T19:54:20Z

Added custom Scala shell with ScalaShellRemoteEnvironment from RemoteEnvironment

Either started locally, or connecting to a remote server
added startup sh script for scala shell
works with Scala 2.10 and 2.11

commit 68e8c087f678c53715f65f74dabe367eeb49ada1
Author: Aljoscha Krettek aljoscha.kret...@gmail.com
Date:   2015-04-28T08:52:29Z

[hotfix] Change print() to eager execution

print() now uses collect() internally

commit 320c95749af97c33bf2245b6db3d659bc3fc0379
Author: Nikolaas Steenbergen nikolaas.steenber...@googlemail.com
Date:   2015-05-05T08:49:54Z

Changed tests and examples for not failing with eager print method
Added lastJobExecutionResult for getting the result of the last execution, 
when executing eager print

commit 0b7cdc5fbd3f3271f23e781722edb75d380b3945
Author: fleischhauf nikolaas.steenber...@googlemail.com
Date:   2015-05-12T19:58:57Z

Added flink scala shell documentation with wordcount example




 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive shell for the Scala API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-13 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541516#comment-14541516
 ] 

Peter Schrott edited comment on FLINK-1731 at 5/13/15 7:29 AM:
---

Hi [~chiwanpark],

the thing is, to fit the model, KMeans uses two datasets: one is the training 
data, the other is the initial centroids. This means the {{fit}}-method should 
take two arguments at that point. This is why I suggested using the parameter 
map for passing the initial centroids. The training dataset will be passed as 
an argument to the {{fit}}-method, as in the CoCoA implementation.




was (Author: peedeex21):
Hi [~chiwanpark],

the thing is, to fit the model, KMeans uses two datasets: one is the training 
data, the other is the initial centroids. This means the 
{code:java}fit{code}-method should take two arguments at that point. This is 
why I suggested using the parameter map for passing the initial centroids. The 
training dataset will be passed as an argument to the 
{code:java}fit{code}-method, as in the CoCoA implementation.



 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better 
 implementation because they improve the initial seeding phase to achieve 
 near-optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
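
For context on the seeding improvement mentioned above, a small hedged sketch of 
the k-means++ seeding step from [1]; the names and types are illustrative only, 
not flink-ml code:

{code}
import scala.util.Random
import scala.collection.mutable.ArrayBuffer

// Sketch of k-means++ seeding: the first centre is drawn uniformly at random,
// every further centre with probability proportional to the squared distance
// to the nearest centre chosen so far.
object KMeansPlusPlusSeeding {
  type Point = Array[Double]

  private def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  def chooseCentres(points: Seq[Point], k: Int, rnd: Random = new Random()): Seq[Point] = {
    require(points.nonEmpty && k > 0 && k <= points.size, "need 0 < k <= number of points")
    val centres = ArrayBuffer(points(rnd.nextInt(points.size)))
    while (centres.size < k) {
      // Squared distance of every point to its nearest already-chosen centre.
      val weights = points.map(p => centres.map(c => sqDist(p, c)).min)
      // Sample the next centre index proportionally to these weights.
      val r = rnd.nextDouble() * weights.sum
      val idx = weights.scanLeft(0.0)(_ + _).tail.indexWhere(_ >= r)
      centres += points(if (idx >= 0) idx else points.size - 1)
    }
    centres.toSeq
  }
}
{code}

kMeans|| [2] follows the same idea but oversamples several candidate centres per 
pass, so the seeding needs only a few passes over a distributed dataset.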



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression

2015-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541505#comment-14541505
 ] 

ASF GitHub Bot commented on FLINK-1990:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-101544907
  
I'll check it out.


 Uppercase AS keyword not allowed in select expression
 ---

 Key: FLINK-1990
 URL: https://issues.apache.org/jira/browse/FLINK-1990
 Project: Flink
  Issue Type: Bug
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Aljoscha Krettek
Priority: Minor
 Fix For: 0.9


 Table API select expressions do not allow an uppercase AS keyword.
 The following expression fails with an {{ExpressionException}}:
  {{table.groupBy(request).select(request, request.count AS cnt)}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...

2015-05-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-101544907
  
I'll check it out.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-13 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541516#comment-14541516
 ] 

Peter Schrott edited comment on FLINK-1731 at 5/13/15 7:30 AM:
---

Hi [~chiwanpark],

the thing is, to fit the model, KMeans uses two datasets: one is the training 
data, the other is the initial centroids.

This means the {{fit}}-method should take two arguments at that point. This is 
why I suggested using the parameter map for passing the initial centroids. The 
training dataset will be passed as an argument to the {{fit}}-method, as in 
the CoCoA implementation.




was (Author: peedeex21):
Hi [~chiwanpark],

the thing is, to fit the model, KMeans uses two datasets: one is the training 
data, the other is the initial centroids. This means the {{fit}}-method should 
take two arguments at that point. This is why I suggested using the parameter 
map for passing the initial centroids. The training dataset will be passed as 
an argument to the {{fit}}-method, as in the CoCoA implementation.



 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better 
 implementation because they improve the initial seeding phase to achieve 
 near-optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-13 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541516#comment-14541516
 ] 

Peter Schrott edited comment on FLINK-1731 at 5/13/15 7:33 AM:
---

Hi [~chiwanpark],

the thing is, to fit the model, KMeans uses two datasets: one is the training 
data, the other is the initial centroids. The initial centroids are used to 
build the appropriate clusters on the training dataset, and these clusters 
define the fitted model.

This means the {{fit}}-method should take two arguments at that point. This is 
why I suggested using the parameter map for passing the initial centroids. The 
training dataset will be passed as an argument to the {{fit}}-method, as in 
the CoCoA implementation.

The test dataset will be applied to the trained model afterwards.


was (Author: peedeex21):
Hi [~chiwanpark],

the thing is, to fit the model, KMeans uses two datasets: one is the training 
data, the other is the initial centroids.

This means the {{fit}}-method should take two arguments at that point. This is 
why I suggested using the parameter map for passing the initial centroids. The 
training dataset will be passed as an argument to the {{fit}}-method, as in 
the CoCoA implementation.



 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better 
 implementation because they improve the initial seeding phase to achieve 
 near-optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2006) TaskManagerTest.testRunJobWithForwardChannel:432 expected:FINISHED but was:FAILED

2015-05-13 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2006:
-

 Summary: TaskManagerTest.testRunJobWithForwardChannel:432 
expected:FINISHED but was:FAILED
 Key: FLINK-2006
 URL: https://issues.apache.org/jira/browse/FLINK-2006
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 0.9
Reporter: Robert Metzger


I've seen this test failing multiple times now.

Example: https://travis-ci.org/rmetzger/flink/jobs/62355627

[~till.rohrmann] tried to fix the issue a while ago with this commit 
http://git-wip-us.apache.org/repos/asf/flink/commit/3897b47b



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-13 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541570#comment-14541570
 ] 

Alexander Alexandrov commented on FLINK-1731:
-

I suggest trying to add the initial centroids as a proper parameter and not as 
part of the ParameterMap, since they are an actual input to the algorithm (as 
opposed to an algorithm hyper-parameter).
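
To make the two alternatives concrete, a hypothetical Scala sketch; the type and 
method names below are stand-ins, not the actual flink-ml API:

{code}
// Hypothetical stand-ins for the flink-ml types; names are illustrative only.
case class Vector(values: Array[Double])
case class DataSet[T](elements: Seq[T])

class ParameterMap(val entries: Map[String, Any] = Map.empty) {
  def add(key: String, value: Any): ParameterMap = new ParameterMap(entries + (key -> value))
  def get[T](key: String): Option[T] = entries.get(key).map(_.asInstanceOf[T])
}

class KMeans {
  // Alternative 1: the initial centroids travel inside the ParameterMap, so
  // fit() keeps the single-DataSet signature used by the other learners.
  def fit(training: DataSet[Vector], parameters: ParameterMap): Unit = {
    val centroids = parameters
      .get[DataSet[Vector]]("initialCentroids")
      .getOrElse(throw new IllegalArgumentException("initial centroids missing"))
    runLloydIterations(training, centroids)
  }

  // Alternative 2: the initial centroids are a proper input of the algorithm
  // and are passed explicitly, since they are data rather than a hyper-parameter.
  def fit(training: DataSet[Vector], initialCentroids: DataSet[Vector]): Unit =
    runLloydIterations(training, initialCentroids)

  // Placeholder for the actual clustering iterations.
  private def runLloydIterations(data: DataSet[Vector], centroids: DataSet[Vector]): Unit = ()
}
{code}

With alternative 1, a caller would first do something like 
new ParameterMap().add("initialCentroids", centroids) and then call 
fit(training, parameters); with alternative 2 the centroids are simply the 
second argument.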

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better 
 implementation because they improve the initial seeding phase to achieve 
 near-optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)