[GitHub] flink pull request: [FLINK-1874] [streaming] Connector breakup

2015-05-28 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/719#issuecomment-106203949
  
I added the `parent` suffix to the main pom, because that is how we tend to 
name modules that do not contain source code themselves. Then I realized that 
the name was becoming too long and partially irrelevant, for the reasons Stephan 
mentioned.

I'd go for `flink-connector-kafka` then and have the modules in streaming 
for now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1874) Break up streaming connectors into submodules

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562403#comment-14562403
 ] 

ASF GitHub Bot commented on FLINK-1874:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/719#issuecomment-106203949
  
I added the `parent` suffix to the main pom, because that is how we tend to 
name modules that do not contain source code themselves. Then I realized that 
the name was becoming too long and partially irrelevant, for the reasons Stephan 
mentioned.

I'd go for `flink-connector-kafka` then and have the modules in streaming 
for now.


> Break up streaming connectors into submodules
> -
>
> Key: FLINK-1874
> URL: https://issues.apache.org/jira/browse/FLINK-1874
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Streaming
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Márton Balassi
>  Labels: starter
>
> As per: 
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Break-up-streaming-connectors-into-subprojects-td5001.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [streaming] Source sync

2015-05-28 Thread mbalassi
GitHub user mbalassi opened a pull request:

https://github.com/apache/flink/pull/738

[streaming] Source sync

This builds on #521 and fixes some issues in it:

  * The pull request was rebased
  * fromCollection, generateSequence, and their parallel versions have the 
expected behaviour
  * Removed unnecessary mentions of Serializable from both the batch and 
streaming execution environments.

I consider this a 0.9 blocker. 
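
(Editorial illustration: a minimal sketch of the kind of sources this PR touches, 
assuming the 0.9-era streaming Scala API; treat the exact signatures as approximate 
rather than as the PR's final API.)

```scala
import org.apache.flink.streaming.api.scala._

object SourceSyncSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Source backed by a fixed collection.
    val fromColl = env.fromCollection(Seq(1, 2, 3, 4))

    // Sequence source; per the discussion further down, the parallel and
    // non-parallel variants share the same underlying source, which is simply
    // instantiated with parallelism one in the non-parallel case.
    val sequence = env.generateSequence(1, 100)

    fromColl.print()
    sequence.print()

    env.execute("Source sync sketch")
  }
}
```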

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink source-sync

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/738.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #738


commit 5bf18af599ec924403a887ba8756a5d94769b692
Author: szape 
Date:   2015-04-16T12:21:16Z

[FLINK-1687] [streaming] [api-extending] Synchronizing streaming source API 
to batch source API

Closes #521

commit b570d6dc812de610b3702a5cf4b7fe1fa2a58b26
Author: mbalassi 
Date:   2015-05-23T20:34:06Z

[FLINK-1687] [streaming] [api-breaking] fromCollection and generateSequence 
rework

commit 101aa563846a2015d0c84e13d5b57296692e8e60
Author: mbalassi 
Date:   2015-05-24T15:51:21Z

[docs] Removed unnecessary Serializable from ExecutionEnvironment JavaDocs

Also did for StreamExecutionEnvironment

commit c4a38bad2c2e9ac81ecc8ccf873f866cfdf91e95
Author: mbalassi 
Date:   2015-05-27T10:40:33Z

[streaming] [scala] [api-breaking] StreamExecutionEnvironment API update






[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562442#comment-14562442
 ] 

Márton Balassi commented on FLINK-2088:
---

Thanks for the report, fixing it.

> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> The {{name}} function of the {{DataStream}} Scala class has the wrong return 
> type: {code}DataStream[_]{code}, when it should return 
> {code}DataStream[T]{code}.





[jira] [Assigned] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi reassigned FLINK-2088:
-

Assignee: Márton Balassi

> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> The {{name}} function of the {{DataStream}} Scala class has the wrong return 
> type: {code}DataStream[_]{code}, when it should return 
> {code}DataStream[T]{code}.





[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562448#comment-14562448
 ] 

Márton Balassi commented on FLINK-2088:
---

We could do it, but currently streaming does not have all the distinct 
operators, just the ones that were necessary. The most specific type we could 
return at the moment would be the SingleOutputStreamOperator.
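
(For illustration only: a stripped-down sketch with hypothetical classes, not 
Flink's actual API, showing why returning the parameterized type matters for 
chaining.)

{code}
// Hypothetical, simplified stand-ins for the real stream classes.
class SimpleStream[T](val ops: List[String]) {
  // Returning SimpleStream[_] here would erase T and break further typed
  // transformations; returning SimpleStream[T] keeps the chain typed.
  def name(opName: String): SimpleStream[T] = new SimpleStream[T](ops :+ opName)

  def mapToString(f: T => String): SimpleStream[String] =
    new SimpleStream[String](ops :+ "map")
}

object NameTypingSketch {
  def main(args: Array[String]): Unit = {
    val s = new SimpleStream[Int](Nil)
    // Compiles because name() preserves the element type Int.
    val named: SimpleStream[String] = s.name("numbers").mapToString(_.toString)
    println(named.ops)
  }
}
{code}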

> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> The {{name}} function of the {{DataStream}} Scala class has the wrong return 
> type: {code}DataStream[_]{code}, when it should return 
> {code}DataStream[T]{code}.





[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562459#comment-14562459
 ] 

ASF GitHub Bot commented on FLINK-2088:
---

GitHub user mbalassi opened a pull request:

https://github.com/apache/flink/pull/739

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed Da…

…taStream

Just a quickfix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink flink-2088

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #739


commit 31efaa171080fd1820fd5051e6d59699ac26c332
Author: mbalassi 
Date:   2015-05-28T07:58:40Z

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed 
DataStream




> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> The {{name}} function of the {{DataStream}} Scala class has the wrong return 
> type: {code}DataStream[_]{code}, when it should return 
> {code}DataStream[T]{code}.





[GitHub] flink pull request: [FLINK-2088] [streaming] [scala] DataStream.na...

2015-05-28 Thread mbalassi
GitHub user mbalassi opened a pull request:

https://github.com/apache/flink/pull/739

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed Da…

…taStream

Just a quickfix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink flink-2088

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #739


commit 31efaa171080fd1820fd5051e6d59699ac26c332
Author: mbalassi 
Date:   2015-05-28T07:58:40Z

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed 
DataStream






[jira] [Commented] (FLINK-1430) Add test for streaming scala api completeness

2015-05-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562464#comment-14562464
 ] 

Márton Balassi commented on FLINK-1430:
---

Hey [~qmlmoon], are you working on this issue currently, or is it OK if I take 
it over?

> Add test for streaming scala api completeness
> -
>
> Key: FLINK-1430
> URL: https://issues.apache.org/jira/browse/FLINK-1430
> Project: Flink
>  Issue Type: Test
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Mingliang Qi
>
> Currently the completeness of the streaming scala api is not tested.





[GitHub] flink pull request: [streaming] Source sync

2015-05-28 Thread gyfora
Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/738#issuecomment-106224522
  
Looks good, but we should probably modify (or add) some tests for the 
newly added functions, because I think none of the new functionality is tested 
properly.




[GitHub] flink pull request: [streaming] Source sync

2015-05-28 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/738#issuecomment-106225424
  
Thanks for the review. The test coverage is the following:

  * The newly introduced `StreamExecutionEnvironmentTest` covers the 
parallelism issues with the new API.
  * The functionality of the parallel versions is covered by the 
`ComplexIntegrationTest`.
  * The underlying sources are actually the same for the parallel and 
non-parallel cases, they are just instantiated with parallelism one, so I would 
not bother with adding specific test cases for that.




[GitHub] flink pull request: [ml] [WIP] Consolidation of loss function

2015-05-28 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/740

[ml] [WIP] Consolidation of loss function

This WIP PR shows a prototypical consolidation of the `LossFunction`. 
Furthermore, it introduces some syntactic sugar to make Scala programming more 
fun :-)
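
(As context for readers, one possible shape of such a consolidation — a sketch 
under assumed names; `PartialLoss`, `SquaredLoss` and `LossFunction` as written 
here are hypothetical and not necessarily the design in this PR.)

```scala
// Hypothetical names, not taken from the PR: a partial loss that only looks at
// (prediction, label), plus a prediction rule, combined into one LossFunction
// abstraction that an optimizer such as SGD can use uniformly.
trait PartialLoss {
  def loss(prediction: Double, label: Double): Double
  def derivative(prediction: Double, label: Double): Double
}

object SquaredLoss extends PartialLoss {
  def loss(prediction: Double, label: Double): Double = {
    val d = prediction - label
    0.5 * d * d
  }
  def derivative(prediction: Double, label: Double): Double = prediction - label
}

case class LossFunction(partial: PartialLoss,
                        predict: (Vector[Double], Vector[Double]) => Double) {
  // Loss value and the scalar factor that, multiplied with the feature vector,
  // gives the gradient of a linear model.
  def lossAndDerivative(weights: Vector[Double],
                        features: Vector[Double],
                        label: Double): (Double, Double) = {
    val p = predict(weights, features)
    (partial.loss(p, label), partial.derivative(p, label))
  }
}

object LossSketch {
  def main(args: Array[String]): Unit = {
    val dot = (w: Vector[Double], x: Vector[Double]) =>
      w.zip(x).map { case (a, b) => a * b }.sum
    val squared = LossFunction(SquaredLoss, dot)
    println(squared.lossAndDerivative(Vector(0.5, 0.5), Vector(1.0, 2.0), 1.0))
  }
}
```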

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink mlSugar

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/740.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #740


commit cffd8727c8fb2c25acd15b6952bcbdbf93c11ea5
Author: Till Rohrmann 
Date:   2015-05-28T01:03:24Z

Adds syntactic sugar for map with single broadcast element. Starts 
rewriting the optimization framework.

commit fd77f8febe9414fca92ad58b0d87068612719c58
Author: Till Rohrmann 
Date:   2015-05-28T08:16:46Z

Finished reworking SGD

commit 6e08118f38529e002f7aa7d4c23bb7fe892a1d15
Author: Till Rohrmann 
Date:   2015-05-28T08:17:47Z

Removed line breaks






[jira] [Created] (FLINK-2103) Expose partitionBy to the user in Stream API

2015-05-28 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-2103:
---

 Summary: Expose partitionBy to the user in Stream API
 Key: FLINK-2103
 URL: https://issues.apache.org/jira/browse/FLINK-2103
 Project: Flink
  Issue Type: Improvement
Affects Versions: 0.9
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek


Is there a reason why this is not exposed to the user? I could see cases where 
this would be useful to have.





[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562512#comment-14562512
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106232545
  
I second Ufuk's comments: merge it and deactivate it by default. I can see a 
0.9.1 or 0.10.0 release coming very soon afterwards, because we still have a big 
set of issues in the pipeline.

Initially, activate hinting in the local environment (what people use 
during debugging anyway) and have it deactivated in the "production" 
environments (remote and context).

Other comments:
  - How about printing the hints to sysout? I can see them getting lost 
among the logging statements. Also, people often do not have logging activated in 
the IDE.
  - Package-based exclusions never worked; they were always an issue with the 
quickstarts. I assume you want the exclusion to make sure you do not analyze 
the built-in default join function, for example? What you can do is add an 
annotation that says "DoNotAnalyze" to those functions, and then simply analyze 
everything.





> Add static code analysis for UDFs
> -
>
> Key: FLINK-1319
> URL: https://issues.apache.org/jira/browse/FLINK-1319
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Minor
>
> Flink's Optimizer takes information that tells it for UDFs which fields of 
> the input elements are accessed, modified, or forwarded/copied. This 
> information frequently helps to reuse partitionings, sorts, etc. It may speed 
> up programs significantly, as it can frequently eliminate sorts and shuffles, 
> which are costly.
> Right now, users can add lightweight annotations to UDFs to provide this 
> information (such as adding {{@ConstantFields("0->3, 1, 2->1")}}).
> We worked with static code analysis of UDFs before, to determine this 
> information automatically. This is an incredible feature, as it "magically" 
> makes programs faster.
> For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
> works surprisingly well in many cases. We used the "Soot" toolkit for the 
> static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
> not include any of the code so far.
> I propose to add this functionality to Flink, in the form of a drop-in 
> addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
> simply download a special "flink-code-analysis.jar" and drop it into the 
> "lib" folder to enable this functionality. We may even add a script to 
> "tools" that downloads that library automatically into the lib folder. This 
> should be legally fine, since we do not redistribute LGPL code and only 
> dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
> patentability, if I remember correctly).
> Prior work on this has been done by [~aljoscha] and [~skunert], which could 
> provide a code base to start with.
> *Appendix*
> Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
> Papers on static analysis and for optimization: 
> http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
> http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
> Quick introduction to the Optimizer: 
> http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
> (Section 6)
> Optimizer for Iterations: 
> http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
> (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-05-28 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106232545
  
I second Ufuk's comments: merge it and deactivate it by default. I can see a 
0.9.1 or 0.10.0 release coming very soon afterwards, because we still have a big 
set of issues in the pipeline.

Initially, activate hinting in the local environment (what people use 
during debugging anyway) and have it deactivated in the "production" 
environments (remote and context).

Other comments:
  - How about printing the hints to sysout? I can see them getting lost 
among the logging statements. Also, people often do not have logging activated in 
the IDE.
  - Package-based exclusions never worked; they were always an issue with the 
quickstarts. I assume you want the exclusion to make sure you do not analyze 
the built-in default join function, for example? What you can do is add an 
annotation that says "DoNotAnalyze" to those functions, and then simply analyze 
everything.







[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-05-28 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106233384
  
I disabled the analyzer for all classes starting with "org.apache.flink", 
because I wanted to reduce the output shown to the user for built-in UDFs (e.g. 
`org.apache.flink.api.java.Utils$CollectHelper` or UDFs within the Graph API). 
Initially I thought about an annotation `@SkipCodeAnalysis`, but there are too 
many UDFs where this annotation would have to be placed. I think we can assume 
that UDFs shipped with Flink are already implemented efficiently, or inefficiently 
for example purposes only.

Object creation "in method" means that these objects are created directly 
in, e.g., `map()`. The analyzer also follows method calls; "transitively" created 
objects are objects created in the nested method calls.
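
(Editorial sketch of the two exclusion strategies discussed here — the 
package-prefix check described above and the annotation-style opt-out Stephan 
suggested. The names are illustrative, not the analyzer's real API, and a real 
opt-out would likely be a Java annotation with runtime retention rather than a 
marker trait.)

```scala
// Illustrative only; not the analyzer's real API.
object AnalyzerGateSketch {

  // Strategy 1: skip everything under the Flink packages (current approach).
  def excludedByPackage(udfClass: Class[_]): Boolean =
    udfClass.getName.startsWith("org.apache.flink")

  // Strategy 2: analyze everything, but let individual built-in UDFs opt out.
  // Approximated with a marker trait; a production version would probably use
  // a Java annotation retained at runtime.
  trait DoNotAnalyze
  def excludedByMarker(udf: AnyRef): Boolean = udf.isInstanceOf[DoNotAnalyze]

  def shouldAnalyze(udf: AnyRef): Boolean =
    !excludedByPackage(udf.getClass) && !excludedByMarker(udf)

  def main(args: Array[String]): Unit = {
    class UserMapper { def map(x: Int): Int = x + 1 }
    class BuiltInHelper extends DoNotAnalyze
    println(shouldAnalyze(new UserMapper))    // true
    println(shouldAnalyze(new BuiltInHelper)) // false
  }
}
```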




[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562523#comment-14562523
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106233384
  
I disabled the analyzer for all classes starting with "org.apache.flink", 
because I wanted to reduce the output shown to the user for built-in UDFs (e.g. 
`org.apache.flink.api.java.Utils$CollectHelper` or UDFs within the Graph API). 
Initially I thought about an annotation `@SkipCodeAnalysis`, but there are too 
many UDFs where this annotation would have to be placed. I think we can assume 
that UDFs shipped with Flink are already implemented efficiently, or inefficiently 
for example purposes only.

Object creation "in method" means that these objects are created directly 
in, e.g., `map()`. The analyzer also follows method calls; "transitively" created 
objects are objects created in the nested method calls.


> Add static code analysis for UDFs
> -
>
> Key: FLINK-1319
> URL: https://issues.apache.org/jira/browse/FLINK-1319
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Minor
>
> Flink's Optimizer takes information that tells it for UDFs which fields of 
> the input elements are accessed, modified, or forwarded/copied. This 
> information frequently helps to reuse partitionings, sorts, etc. It may speed 
> up programs significantly, as it can frequently eliminate sorts and shuffles, 
> which are costly.
> Right now, users can add lightweight annotations to UDFs to provide this 
> information (such as adding {{@ConstantFields("0->3, 1, 2->1")}}).
> We worked with static code analysis of UDFs before, to determine this 
> information automatically. This is an incredible feature, as it "magically" 
> makes programs faster.
> For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
> works surprisingly well in many cases. We used the "Soot" toolkit for the 
> static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
> not include any of the code so far.
> I propose to add this functionality to Flink, in the form of a drop-in 
> addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
> simply download a special "flink-code-analysis.jar" and drop it into the 
> "lib" folder to enable this functionality. We may even add a script to 
> "tools" that downloads that library automatically into the lib folder. This 
> should be legally fine, since we do not redistribute LGPL code and only 
> dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
> patentability, if I remember correctly).
> Prior work on this has been done by [~aljoscha] and [~skunert], which could 
> provide a code base to start with.
> *Appendix*
> Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
> Papers on static analysis and for optimization: 
> http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
> http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
> Quick introduction to the Optimizer: 
> http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
> (Section 6)
> Optimizer for Iterations: 
> http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
> (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-05-28 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106236797
  
+1 for activating hinting locally. 
I thought logging was the standard way to print, but I can change it to 
sysout.




[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562532#comment-14562532
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106236797
  
+1 for activating hinting locally. 
I thought logging was the standard way to print, but I can change it to 
sysout.


> Add static code analysis for UDFs
> -
>
> Key: FLINK-1319
> URL: https://issues.apache.org/jira/browse/FLINK-1319
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Minor
>
> Flink's Optimizer takes information that tells it for UDFs which fields of 
> the input elements are accessed, modified, or forwarded/copied. This 
> information frequently helps to reuse partitionings, sorts, etc. It may speed 
> up programs significantly, as it can frequently eliminate sorts and shuffles, 
> which are costly.
> Right now, users can add lightweight annotations to UDFs to provide this 
> information (such as adding {{@ConstantFields("0->3, 1, 2->1")}}).
> We worked with static code analysis of UDFs before, to determine this 
> information automatically. This is an incredible feature, as it "magically" 
> makes programs faster.
> For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
> works surprisingly well in many cases. We used the "Soot" toolkit for the 
> static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
> not include any of the code so far.
> I propose to add this functionality to Flink, in the form of a drop-in 
> addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
> simply download a special "flink-code-analysis.jar" and drop it into the 
> "lib" folder to enable this functionality. We may even add a script to 
> "tools" that downloads that library automatically into the lib folder. This 
> should be legally fine, since we do not redistribute LGPL code and only 
> dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
> patentability, if I remember correctly).
> Prior work on this has been done by [~aljoscha] and [~skunert], which could 
> provide a code base to start with.
> *Appendix*
> Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
> Papers on static analysis and for optimization: 
> http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
> http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
> Quick introduction to the Optimizer: 
> http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
> (Section 6)
> Optimizer for Iterations: 
> http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
> (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-2096] Remove implicit conversion from W...

2015-05-28 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/726#issuecomment-106237798
  
manually merged




[jira] [Commented] (FLINK-2096) Remove implicit conversions in Streaming Scala API

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562540#comment-14562540
 ] 

ASF GitHub Bot commented on FLINK-2096:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/726#issuecomment-106237798
  
manually merged


> Remove implicit conversions in Streaming Scala API
> --
>
> Key: FLINK-2096
> URL: https://issues.apache.org/jira/browse/FLINK-2096
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 0.9
>
>
> At least one user ran into a problem with this: 
> http://stackoverflow.com/questions/30461809/in-flink-stream-windowing-does-not-seem-to-work
> Also, the implicit conversions from Java DataStreams to Scala Streams could 
> be problematic. I think we should at least make them private to the flink 
> package.
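
(For illustration, a minimal sketch of the package-private implicit pattern 
mentioned above, using simplified stand-in types rather than the actual Flink 
conversions.)

{code}
package org.apache.flink.streaming.api.scala

// Stand-ins for the Java and Scala stream types; illustrative only.
class JavaLikeStream[T]
class ScalaLikeStream[T](val underlying: JavaLikeStream[T])

object StreamConversions {
  // Restricting the implicit to the flink package prevents user code from
  // silently converting between the two APIs, while internal code can still
  // import StreamConversions._ and use it.
  private[flink] implicit def asScalaLikeStream[T](
      js: JavaLikeStream[T]): ScalaLikeStream[T] = new ScalaLikeStream[T](js)
}
{code}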





[jira] [Closed] (FLINK-2096) Remove implicit conversions in Streaming Scala API

2015-05-28 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed FLINK-2096.
---
   Resolution: Fixed
Fix Version/s: 0.9

Fixed in 
https://github.com/apache/flink/commit/7ec95c974d5cbcd83af8bc1bee370031cd09a88b

> Remove implicit conversions in Streaming Scala API
> --
>
> Key: FLINK-2096
> URL: https://issues.apache.org/jira/browse/FLINK-2096
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 0.9
>
>
> At least one user ran into a problem with this: 
> http://stackoverflow.com/questions/30461809/in-flink-stream-windowing-does-not-seem-to-work
> Also, the implicit conversions from Java DataStreams to Scala Streams could 
> be problematic. I think we should at least make them private to the flink 
> package.





[GitHub] flink pull request: [FLINK-2096] Remove implicit conversion from W...

2015-05-28 Thread aljoscha
Github user aljoscha closed the pull request at:

https://github.com/apache/flink/pull/726




[jira] [Commented] (FLINK-2096) Remove implicit conversions in Streaming Scala API

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562539#comment-14562539
 ] 

ASF GitHub Bot commented on FLINK-2096:
---

Github user aljoscha closed the pull request at:

https://github.com/apache/flink/pull/726


> Remove implicit conversions in Streaming Scala API
> --
>
> Key: FLINK-2096
> URL: https://issues.apache.org/jira/browse/FLINK-2096
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 0.9
>
>
> At least one user ran into a problem with this: 
> http://stackoverflow.com/questions/30461809/in-flink-stream-windowing-does-not-seem-to-work
> Also, the implicit conversions from Java DataStreams to Scala Streams could 
> be problematic. I think we should at least make them private to the flink 
> package.





[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-05-28 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106241536
  
- I agree with hints going to sysout and activating by default.
- For simple functions, most of the transitive allocations will be Flink 
internal, right? E.g. after calling the collect method. Would it make sense to 
exclude transitive allocations reached via collect?




[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562547#comment-14562547
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106241536
  
- I agree with hints going to sysout and activating by default.
- For simple functions, most of the transitive allocations will be Flink 
internal, right? E.g. after calling the collect method. Would it make sense to 
exclude transitive allocations reached via collect?


> Add static code analysis for UDFs
> -
>
> Key: FLINK-1319
> URL: https://issues.apache.org/jira/browse/FLINK-1319
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Minor
>
> Flink's Optimizer takes information that tells it for UDFs which fields of 
> the input elements are accessed, modified, or forwarded/copied. This 
> information frequently helps to reuse partitionings, sorts, etc. It may speed 
> up programs significantly, as it can frequently eliminate sorts and shuffles, 
> which are costly.
> Right now, users can add lightweight annotations to UDFs to provide this 
> information (such as adding {{@ConstantFields("0->3, 1, 2->1")}}).
> We worked with static code analysis of UDFs before, to determine this 
> information automatically. This is an incredible feature, as it "magically" 
> makes programs faster.
> For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
> works surprisingly well in many cases. We used the "Soot" toolkit for the 
> static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
> not include any of the code so far.
> I propose to add this functionality to Flink, in the form of a drop-in 
> addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
> simply download a special "flink-code-analysis.jar" and drop it into the 
> "lib" folder to enable this functionality. We may even add a script to 
> "tools" that downloads that library automatically into the lib folder. This 
> should be legally fine, since we do not redistribute LGPL code and only 
> dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
> patentability, if I remember correctly).
> Prior work on this has been done by [~aljoscha] and [~skunert], which could 
> provide a code base to start with.
> *Appendix*
> Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
> Papers on static analysis and for optimization: 
> http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
> http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
> Quick introduction to the Optimizer: 
> http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
> (Section 6)
> Optimizer for Iterations: 
> http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
> (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-1319][core] Add static code analysis fo...

2015-05-28 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106243626
  
`collect()` is a special case; the analyzer does not follow it. But after 
thinking about it, I realized that I forgot `getRuntimeContext()`. I will 
fix this ;)




[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562553#comment-14562553
 ] 

ASF GitHub Bot commented on FLINK-1319:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/729#issuecomment-106243626
  
`collect()` is a special case; the analyzer does not follow it. But after 
thinking about it, I realized that I forgot `getRuntimeContext()`. I will 
fix this ;)


> Add static code analysis for UDFs
> -
>
> Key: FLINK-1319
> URL: https://issues.apache.org/jira/browse/FLINK-1319
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Minor
>
> Flink's Optimizer takes information that tells it for UDFs which fields of 
> the input elements are accessed, modified, or forwarded/copied. This 
> information frequently helps to reuse partitionings, sorts, etc. It may speed 
> up programs significantly, as it can frequently eliminate sorts and shuffles, 
> which are costly.
> Right now, users can add lightweight annotations to UDFs to provide this 
> information (such as adding {{@ConstantFields("0->3, 1, 2->1")}}).
> We worked with static code analysis of UDFs before, to determine this 
> information automatically. This is an incredible feature, as it "magically" 
> makes programs faster.
> For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
> works surprisingly well in many cases. We used the "Soot" toolkit for the 
> static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
> not include any of the code so far.
> I propose to add this functionality to Flink, in the form of a drop-in 
> addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
> simply download a special "flink-code-analysis.jar" and drop it into the 
> "lib" folder to enable this functionality. We may even add a script to 
> "tools" that downloads that library automatically into the lib folder. This 
> should be legally fine, since we do not redistribute LGPL code and only 
> dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
> patentability, if I remember correctly).
> Prior work on this has been done by [~aljoscha] and [~skunert], which could 
> provide a code base to start with.
> *Appendix*
> Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
> Papers on static analysis and for optimization: 
> http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
> http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
> Quick introduction to the Optimizer: 
> http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
> (Section 6)
> Optimizer for Iterations: 
> http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
> (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-2047] [ml] Rename CoCoA to SVM

2015-05-28 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/733#issuecomment-106249838
  
Great. Then I'll merge it today.

On Wed, May 27, 2015 at 4:38 PM, Theodore Vasiloudis <
notificati...@github.com> wrote:

> Just noticed a small mistake in the example, I'll fix it and push again.





[jira] [Commented] (FLINK-2047) Rename CoCoA to SVM

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562580#comment-14562580
 ] 

ASF GitHub Bot commented on FLINK-2047:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/733#issuecomment-106249838
  
Great. Then I'll merge it today.

On Wed, May 27, 2015 at 4:38 PM, Theodore Vasiloudis <
notificati...@github.com> wrote:

> Just noticed a small mistake in the example, I'll fix it and push again.



> Rename CoCoA to SVM
> ---
>
> Key: FLINK-2047
> URL: https://issues.apache.org/jira/browse/FLINK-2047
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
>Priority: Trivial
> Fix For: 0.9
>
>
> The CoCoA algorithm as implemented functions as an SVM classifier.
> As CoCoA mostly concerns the optimization process and not the actual learning 
> algorithm, it makes sense to rename the learner to SVM which users are more 
> familiar with.
> In the future we would like to use the CoCoA algorithm to solve more large 
> scale optimization problems for other learning algorithms.





[jira] [Commented] (FLINK-2072) Add a quickstart guide for FlinkML

2015-05-28 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562582#comment-14562582
 ] 

Till Rohrmann commented on FLINK-2072:
--

Really good idea. I like it a lot :-) This is also a good litmus test to see if 
our stuff works.

> Add a quickstart guide for FlinkML
> --
>
> Key: FLINK-2072
> URL: https://issues.apache.org/jira/browse/FLINK-2072
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation, Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
> Fix For: 0.9
>
>
> We need a quickstart guide that introduces users to the core concepts of 
> FlinkML to get them up and running quickly.





[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562593#comment-14562593
 ] 

Till Rohrmann commented on FLINK-687:
-

How do you want to realize the outer join based on a hash join variant? Isn't 
it easier to realize it as a sort-merge variant? Otherwise you would have to 
keep track of which elements from the build side have matched at least one 
element of the probe side.

For the left/right outer join it should be possible to implement both a 
sort-merge and a hash join strategy.
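
(To make the bookkeeping concrete, a small in-memory sketch using plain Scala 
collections — not Flink's join operators — of a hash-based full outer join that 
has to remember which build-side elements found a match.)

{code}
object HashOuterJoinSketch {
  // Full outer join of two key/value sequences using a hash table built on
  // one side. The extra bookkeeping mentioned above is the `matched` array.
  def fullOuterJoin[K, L, R](build: Seq[(K, L)],
                             probe: Seq[(K, R)]): Seq[(K, Option[L], Option[R])] = {
    val table = build.zipWithIndex.groupBy { case ((k, _), _) => k }
    val matched = Array.fill(build.size)(false)

    val probeSide = probe.flatMap { case (k, r) =>
      table.get(k) match {
        case Some(entries) =>
          entries.map { case ((_, l), idx) =>
            matched(idx) = true
            (k, Some(l), Some(r))
          }
        case None => Seq((k, None, Some(r)))
      }
    }
    // Build-side elements that never matched any probe-side element.
    val unmatchedBuild = build.zipWithIndex.collect {
      case ((k, l), idx) if !matched(idx) => (k, Some(l), None)
    }
    probeSide ++ unmatchedBuild
  }

  def main(args: Array[String]): Unit = {
    val left = Seq((1, "a"), (2, "b"))
    val right = Seq((2, "x"), (3, "y"))
    fullOuterJoin(left, right).foreach(println)
  }
}
{code}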

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open





[jira] [Commented] (FLINK-2094) Implement Word2Vec

2015-05-28 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562596#comment-14562596
 ] 

Till Rohrmann commented on FLINK-2094:
--

Great pointers [~jkirsch]. We can also use these implementations to verify our 
implementation and benchmark against them.

> Implement Word2Vec
> --
>
> Key: FLINK-2094
> URL: https://issues.apache.org/jira/browse/FLINK-2094
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Nikolaas Steenbergen
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>
> implement Word2Vec
> http://arxiv.org/pdf/1402.3722v1.pdf





[jira] [Updated] (FLINK-2094) Implement Word2Vec

2015-05-28 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-2094:
-
Assignee: Nikolaas Steenbergen  (was: Till Rohrmann)

> Implement Word2Vec
> --
>
> Key: FLINK-2094
> URL: https://issues.apache.org/jira/browse/FLINK-2094
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Nikolaas Steenbergen
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>  Labels: ML
>
> implement Word2Vec
> http://arxiv.org/pdf/1402.3722v1.pdf





[jira] [Updated] (FLINK-2094) Implement Word2Vec

2015-05-28 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-2094:
-
Labels: ML  (was: )

> Implement Word2Vec
> --
>
> Key: FLINK-2094
> URL: https://issues.apache.org/jira/browse/FLINK-2094
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Nikolaas Steenbergen
>Assignee: Till Rohrmann
>Priority: Minor
>  Labels: ML
>
> implement Word2Vec
> http://arxiv.org/pdf/1402.3722v1.pdf





[jira] [Commented] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-05-28 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562597#comment-14562597
 ] 

Aljoscha Krettek commented on FLINK-2098:
-

I started working on this. I will first develop a test that spots the current 
faulty behaviour and then I will fix it.

> Checkpoint barrier initiation at source is not aligned with snapshotting
> 
>
> Key: FLINK-2098
> URL: https://issues.apache.org/jira/browse/FLINK-2098
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 0.9
>
>
> The stream source does not properly align the emission of checkpoint barriers 
> with the drawing of snapshots.





[jira] [Assigned] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-05-28 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek reassigned FLINK-2098:
---

Assignee: Aljoscha Krettek  (was: Stephan Ewen)

> Checkpoint barrier initiation at source is not aligned with snapshotting
> 
>
> Key: FLINK-2098
> URL: https://issues.apache.org/jira/browse/FLINK-2098
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 0.9
>
>
> The stream source does not properly align the emission of checkpoint barriers 
> with the drawing of snapshots.





[jira] [Commented] (FLINK-2044) Implementation of Gelly HITS Algorithm

2015-05-28 Thread Mohammad Fahim Azizi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562654#comment-14562654
 ] 

Mohammad Fahim Azizi commented on FLINK-2044:
-

Hey [~vkalavri],
The HITS algorithm has two phases, the hub rule and the authority rule, and the 
two phases must be computed simultaneously.
To use runVertexCentricIteration() for this algorithm we tried the following two 
approaches, but we ran into problems (inside the updateVertex function).

1. To switch between the hub rule and authority rule phases we used 
getSuperstepNumber() inside the updateVertex function, but the problem is that 
updateVertex offers no way to change the direction of the edges.

2. To work around this, we used an undirected graph with edges labeled "H" and 
"A" (meaning we first label the edges of the directed graph as "A", then create 
the reverse of this graph, label its edges as "H", and union the two). But then 
the problem is accessing the edge values inside the updateVertex function. 
Is there any way to access the edges inside the updateVertex function?
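
(As background for readers, and independent of the Gelly API question above, a 
tiny self-contained sketch of the two alternating HITS update rules on a plain 
edge list — not Gelly code.)

{code}
object HitsSketch {
  // edges are directed (src -> dst). Hub and authority scores are computed
  // from each other, which is why the two rules have to be interleaved.
  def hits(edges: Seq[(Int, Int)],
           iterations: Int): (Map[Int, Double], Map[Int, Double]) = {
    val nodes = edges.flatMap { case (s, d) => Seq(s, d) }.distinct
    var hub = nodes.map(n => n -> 1.0).toMap
    var auth = nodes.map(n => n -> 1.0).toMap

    def normalize(m: Map[Int, Double]): Map[Int, Double] = {
      val norm = math.sqrt(m.values.map(v => v * v).sum)
      m.map { case (k, v) => k -> v / norm }
    }

    for (_ <- 1 to iterations) {
      // Authority rule: sum of hub scores over incoming edges.
      auth = normalize(nodes.map(n =>
        n -> edges.collect { case (s, d) if d == n => hub(s) }.sum).toMap)
      // Hub rule: sum of authority scores over outgoing edges.
      hub = normalize(nodes.map(n =>
        n -> edges.collect { case (s, d) if s == n => auth(d) }.sum).toMap)
    }
    (hub, auth)
  }

  def main(args: Array[String]): Unit = {
    val (hub, auth) = hits(Seq((1, 2), (1, 3), (2, 3), (3, 1)), iterations = 10)
    println(s"hub = $hub")
    println(s"auth = $auth")
  }
}
{code}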

> Implementation of Gelly HITS Algorithm
> --
>
> Key: FLINK-2044
> URL: https://issues.apache.org/jira/browse/FLINK-2044
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Reporter: Ahamd Javid
>Priority: Minor
>
> Implementation of the HITS algorithm in the Gelly API using Java. The feature 
> branch can be found here: (https://github.com/JavidMayar/flink/commits/HITS)





[jira] [Created] (FLINK-2104) Fallback implicit values for PredictOperation and TransformOperation don't work if Nothing is inferred as the output type

2015-05-28 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-2104:


 Summary: Fallback implicit values for PredictOperation and 
TransformOperation don't work if Nothing is inferred as the output type
 Key: FLINK-2104
 URL: https://issues.apache.org/jira/browse/FLINK-2104
 Project: Flink
  Issue Type: Bug
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 0.9


If one calls a {{Predictor}} or {{Transformer}} with a wrong input type, then 
the Scala compiler tries to apply the fallback implicit value for this 
operation type. However, since the return type of the operations is 
parameterized, it will infer it to be {{Nothing}}. The problem is then that the 
implicit value {{Operation[Self, Input, Nothing]}} cannot be unified with the 
implicit parameter {{Operation[Self, Input, Output]}}. This seems to be a known 
Scala issue [https://issues.scala-lang.org/browse/SI-1570].

I propose to fix the output type of the implicit values to {{Any}}, which 
avoids {{Nothing}} being inferred. This should solve the problem.
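
For readers unfamiliar with the pattern, a small standalone sketch of the 
{{Any}} trick with generic names (the low-priority trait used to order the 
implicits here is only an illustration, not necessarily how FlinkML arranges 
them):

{code}
object FallbackImplicitSketch {

  trait Operation[Self, Input, Output] {
    def run(self: Self, in: Input): Output
  }

  trait LowPriorityOperations {
    // Fixing the output type to Any (instead of leaving it a free parameter)
    // prevents the compiler from inferring Nothing for unsupported inputs, so
    // this fallback is actually found and can report a meaningful error.
    implicit def fallbackOperation[Self, Input]: Operation[Self, Input, Any] =
      new Operation[Self, Input, Any] {
        def run(self: Self, in: Input): Any =
          throw new RuntimeException(s"No operation defined for input: $in")
      }
  }

  object Operation extends LowPriorityOperations {
    // The "real" operation for the supported input type.
    implicit val predictFeatures: Operation[Predictor, Features, Double] =
      new Operation[Predictor, Features, Double] {
        def run(self: Predictor, in: Features): Double = in.values.sum
      }
  }

  case class Features(values: Seq[Double])

  class Predictor {
    def predict[Input, Output](in: Input)(
        implicit op: Operation[Predictor, Input, Output]): Output = op.run(this, in)
  }

  def main(args: Array[String]): Unit = {
    val p = new Predictor
    println(p.predict(Features(Seq(1.0, 2.0)))) // resolves the typed operation: 3.0
    // p.predict("oops") would resolve to the fallback and fail with a clear error.
  }
}
{code}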





[jira] [Commented] (FLINK-2104) Fallback implicit values for PredictOperation and TransformOperation don't work if Nothing is inferred as the output type

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562676#comment-14562676
 ] 

ASF GitHub Bot commented on FLINK-2104:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/741

[FLINK-2104] [ml] Fixes problem with type inference for fallback implicits

The solution seems to be to fix the output type to `Any`. This has the 
consequence that all following operations will also be fallback operations if 
they don't support `Any` inputs. But this should be fine since only the first 
fallback operation will throw an exception stating why the operator could not 
find a suitable operation.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixFallbackImplicits

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/741.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #741


commit 48d3003036a0d049b73fd40278a1470b99f87c2b
Author: Till Rohrmann 
Date:   2015-05-27T18:08:19Z

[FLINK-2104] [ml] Fixes problem with type inference for fallback implicits 
where Nothing is not correctly treated (see SI-1570)




> Fallback implicit values for PredictOperation and TransformOperation don't 
> work if Nothing is inferred as the output type
> -
>
> Key: FLINK-2104
> URL: https://issues.apache.org/jira/browse/FLINK-2104
> Project: Flink
>  Issue Type: Bug
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: ML
> Fix For: 0.9
>
>
> If one calls a {{Predictor}} or {{Transformer}} with a wrong input type, then 
> the Scala compiler tries to apply the fallback implicit value for this 
> operation type. However, since the return type of the operations is 
> parameterized, it will infer it to be {{Nothing}}. The problem is then that 
> the implicit value {{Operation[Self, Input, Nothing]}} cannot be unified with 
> the implicit parameter {{Operation[Self, Input, Output]}}. This seems to be a 
> known Scala issue [https://issues.scala-lang.org/browse/SI-1570].
> I propose to fix the output type of the implicit values to {{Any}}, which 
> avoids {{Nothing}} being inferred. This should solve the problem.





[GitHub] flink pull request: [FLINK-2104] [ml] Fixes problem with type infe...

2015-05-28 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/741

[FLINK-2104] [ml] Fixes problem with type inference for fallback implicits

The solution seems to be to fix the output type to `Any`. This has the 
consequence that all following operations will also be fallback operations if 
they don't support `Any` inputs. But this should be fine since only the first 
fallback operation will throw an exception stating why the operator could not 
find a suitable operation.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixFallbackImplicits

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/741.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #741


commit 48d3003036a0d049b73fd40278a1470b99f87c2b
Author: Till Rohrmann 
Date:   2015-05-27T18:08:19Z

[FLINK-2104] [ml] Fixes problem with type inference for fallback implicits 
where Nothing is not correctly treated (see SI-1570)






[jira] [Commented] (FLINK-1874) Break up streaming connectors into submodules

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562714#comment-14562714
 ] 

ASF GitHub Bot commented on FLINK-1874:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/719#issuecomment-106285952
  
Okay, I agree with you that we can name the modules the way Marton suggested.


> Break up streaming connectors into submodules
> -
>
> Key: FLINK-1874
> URL: https://issues.apache.org/jira/browse/FLINK-1874
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Streaming
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Márton Balassi
>  Labels: starter
>
> As per: 
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Break-up-streaming-connectors-into-subprojects-td5001.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1874] [streaming] Connector breakup

2015-05-28 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/719#issuecomment-106285952
  
Okay, I agree with you that we can name the modules the way Marton suggested.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2061] CSVReader: quotedStringParsing an...

2015-05-28 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/734#issuecomment-106286147
  
Hi @chiwanpark , thanks for the PR!
I'm a bit busy right now but will have a look at the PR soon.
Thanks, Fabian


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2061) CSVReader: quotedStringParsing and includeFields yields ParseException

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562716#comment-14562716
 ] 

ASF GitHub Bot commented on FLINK-2061:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/734#issuecomment-106286147
  
Hi @chiwanpark , thanks for the PR!
I'm a bit busy right now but will have a look at the PR soon.
Thanks, Fabian


> CSVReader: quotedStringParsing and includeFields yields ParseException
> --
>
> Key: FLINK-2061
> URL: https://issues.apache.org/jira/browse/FLINK-2061
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Chiwan Park
>
> Fields in a CSV file with quoted String cannot be skipped.
> Parsing a line such as: 
> {code}
> "20:41:52-1-3-2015"|"Re: Taskmanager memory error in Eclipse"|"Stephan Ewen 
> "|"bla"|"blubb"
> {code}
> with a CSVReader configured as: 
> {code}
> DataSet<Tuple2<String, String>> data =
>   env.readCsvFile("/path/to/my/data")
>   .lineDelimiter("\n")
>   .fieldDelimiter("|")
>   .parseQuotedStrings('"')
>   .includeFields("101")
>   .types(String.class, String.class);
> {code}
> gives a {{ParseException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562727#comment-14562727
 ] 

ASF GitHub Bot commented on FLINK-2088:
---

Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/739#issuecomment-106286739
  
Looks good, let's merge this soon.


> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> {{name}} function of {{DataStream}} scala class has a wrong return type: 
> {code} DataStream[_] {code} when it should return a:{code}DataStream[T] 
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2088] [streaming] [scala] DataStream.na...

2015-05-28 Thread gyfora
Github user gyfora commented on the pull request:

https://github.com/apache/flink/pull/739#issuecomment-106286739
  
Looks good, let's merge this soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562728#comment-14562728
 ] 

Fabian Hueske commented on FLINK-687:
-

[~till.rohrmann] +1 For starting with a Sort-Merge Join strategy. It can cover 
left, right, and full outer joins and should be easy to implement. Hash 
variants for left and right outer joins can be easily added later.

I'll open sub-issues for a Sort-Merge Outer Join algorithms and integrating it 
into API, optimizer, and runtime.
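
For illustration, here is a tiny plain-Scala sketch (collections only, no Flink runtime involved) of the semantics these strategies have to produce, i.e. unmatched tuples are padded with null instead of being dropped:

{code}
// Plain-Scala illustration of left outer join semantics.
object OuterJoinSemantics {
  def main(args: Array[String]): Unit = {
    val left  = Seq((1, "a"), (2, "b"), (3, "c"))
    val right = Seq((2, "x"), (3, "y"), (4, "z"))

    val rightByKey = right.groupBy(_._1)

    // Every left tuple is emitted at least once; tuples without a match on the
    // right side are padded with null instead of being filtered out.
    val leftOuter = left.flatMap { case (k, lv) =>
      rightByKey.getOrElse(k, Seq((k, null))).map { case (_, rv) => (k, lv, rv) }
    }

    leftOuter.foreach(println)
    // (1,a,null)  <- an inner join would drop this tuple
    // (2,b,x)
    // (3,c,y)
    // A full outer join would additionally emit (4,null,z) for the right side.
  }
}
{code}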

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2088] [streaming] [scala] DataStream.na...

2015-05-28 Thread mbalassi
Github user mbalassi closed the pull request at:

https://github.com/apache/flink/pull/739


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2088] [streaming] [scala] DataStream.na...

2015-05-28 Thread mbalassi
GitHub user mbalassi reopened a pull request:

https://github.com/apache/flink/pull/739

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed Da…

…taStream

Just a quickfix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink flink-2088

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #739


commit 31efaa171080fd1820fd5051e6d59699ac26c332
Author: mbalassi 
Date:   2015-05-28T07:58:40Z

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed 
DataStream




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562731#comment-14562731
 ] 

ASF GitHub Bot commented on FLINK-2088:
---

Github user mbalassi closed the pull request at:

https://github.com/apache/flink/pull/739


> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> {{name}} function of {{DataStream}} scala class has a wrong return type: 
> {code} DataStream[_] {code} when it should return a:{code}DataStream[T] 
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562732#comment-14562732
 ] 

ASF GitHub Bot commented on FLINK-2088:
---

GitHub user mbalassi reopened a pull request:

https://github.com/apache/flink/pull/739

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed Da…

…taStream

Just a quickfix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink flink-2088

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #739


commit 31efaa171080fd1820fd5051e6d59699ac26c332
Author: mbalassi 
Date:   2015-05-28T07:58:40Z

[FLINK-2088] [streaming] [scala] DataStream.name() returns a typed 
DataStream




> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> {{name}} function of {{DataStream}} scala class has a wrong return type: 
> {code} DataStream[_] {code} when it should return a:{code}DataStream[T] 
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2105) Implement Sort-Merge Outer Join algorithm

2015-05-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2105:


 Summary: Implement Sort-Merge Outer Join algorithm
 Key: FLINK-2105
 URL: https://issues.apache.org/jira/browse/FLINK-2105
 Project: Flink
  Issue Type: Sub-task
  Components: Local Runtime
Reporter: Fabian Hueske
Priority: Minor


Flink does not natively support outer joins at the moment. 
This issue proposes to implement a sort-merge outer join algorithm that can 
cover left, right, and full outer joins.

The implementation can be based on the regular sort-merge join iterator 
({{ReusingMergeMatchIterator}} and {{NonReusingMergeMatchIterator}}, see also 
{{MatchDriver}} class)

The Reusing and NonReusing variants differ in whether object instances are 
reused or new objects are created. I would start with the NonReusing variant 
which is safer from a user's point of view and should also be easier to 
implement.
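
For illustration, a compact single-machine sketch of the core merge step in plain Scala (not the iterator interfaces named above, and simplified to unique keys per input, whereas the real implementation additionally has to build cross products per key group):

{code}
// Merge step of a full outer join over two inputs already sorted by key.
def fullOuterMerge[K: Ordering, L, R](
    left: List[(K, L)],
    right: List[(K, R)]): List[(K, Option[L], Option[R])] = {
  val ord = implicitly[Ordering[K]]
  (left, right) match {
    case (Nil, rs) => rs.map { case (k, r) => (k, None, Some(r)) }
    case (ls, Nil) => ls.map { case (k, l) => (k, Some(l), None) }
    case ((lk, lv) :: lt, (rk, rv) :: rt) =>
      if (ord.lt(lk, rk))      (lk, Some(lv), None) :: fullOuterMerge(lt, right)
      else if (ord.gt(lk, rk)) (rk, None, Some(rv)) :: fullOuterMerge(left, rt)
      else                     (lk, Some(lv), Some(rv)) :: fullOuterMerge(lt, rt)
  }
}

// fullOuterMerge(List(1 -> "a", 2 -> "b"), List(2 -> "x", 3 -> "y"))
//   == List((1,Some(a),None), (2,Some(b),Some(x)), (3,None,Some(y)))
// Dropping the results of the first branch yields a right outer join,
// dropping those of the second branch yields a left outer join.
{code}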



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2106) Add outer joins to API, Optimizer, and Runtime

2015-05-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2106:


 Summary: Add outer joins to API, Optimizer, and Runtime
 Key: FLINK-2106
 URL: https://issues.apache.org/jira/browse/FLINK-2106
 Project: Flink
  Issue Type: Sub-task
  Components: Java API, Local Runtime, Optimizer, Scala API
Reporter: Fabian Hueske
Priority: Minor


Add left/right/full outer join methods to the DataSet APIs (Java, Scala), to 
the optimizer, and the runtime of Flink.

Initially, the execution strategy should be a sort-merge outer join 
(FLINK-2105) but can later be extended to hash joins for left/right outer joins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1874) Break up streaming connectors into submodules

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562838#comment-14562838
 ] 

ASF GitHub Bot commented on FLINK-1874:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/719#issuecomment-106299970
  
Thanks, merging as soon as travis verifies.


> Break up streaming connectors into submodules
> -
>
> Key: FLINK-1874
> URL: https://issues.apache.org/jira/browse/FLINK-1874
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Streaming
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Márton Balassi
>  Labels: starter
>
> As per: 
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Break-up-streaming-connectors-into-subprojects-td5001.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1874] [streaming] Connector breakup

2015-05-28 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/719#issuecomment-106299970
  
Thanks, merging as soon as travis verifies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2092) Document (new) behavior of print() and execute()

2015-05-28 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562862#comment-14562862
 ] 

Maximilian Michels commented on FLINK-2092:
---

As far as I know, the behavior of execute() hasn't changed.

> Document (new) behavior of print() and execute()
> 
>
> Key: FLINK-2092
> URL: https://issues.apache.org/jira/browse/FLINK-2092
> Project: Flink
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Blocker
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Ricky Pogalz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562879#comment-14562879
 ] 

Ricky Pogalz commented on FLINK-687:


Hi, during the project course IMPRO3 at TU-Berlin we planned to work on this 
operator as a bigger task over the next weeks. Unfortunately, I just noticed that 
this parent issue is already assigned, although the subtasks are not.

[~chiwanpark] Are you planning to work on the sort-merge based outer join, or is 
it okay with you if we take this issue?

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Ricky Pogalz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562882#comment-14562882
 ] 

Ricky Pogalz commented on FLINK-687:


Hi, during the project course IMPRO3 at TU-Berlin we planned to work on this 
operator as a bigger task over the next weeks. Unfortunately, I just noticed that 
this parent issue is already assigned, although the subtasks are not.

[~chiwanpark] Are you planning to work on the sort-merge based outer join, or is 
it okay with you if we take this issue?

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Ricky Pogalz (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ricky Pogalz updated FLINK-687:
---
Comment: was deleted

(was: Hi, during the project course IMPRO3 at TU-Berlin we planned to work on 
this operator as a bigger task in the next weeks. Unfortunately, I just 
recognized that this parent issue is already assigned, but not the subtasks 
though.

[~chiwanpark] Are you about to work on the sort-merge based outer join or is it 
okay for you if we take this issue? )

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1874) Break up streaming connectors into submodules

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562906#comment-14562906
 ] 

ASF GitHub Bot commented on FLINK-1874:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/719


> Break up streaming connectors into submodules
> -
>
> Key: FLINK-1874
> URL: https://issues.apache.org/jira/browse/FLINK-1874
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Streaming
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Márton Balassi
>  Labels: starter
>
> As per: 
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Break-up-streaming-connectors-into-subprojects-td5001.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2088] [streaming] [scala] DataStream.na...

2015-05-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/739


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2061) CSVReader: quotedStringParsing and includeFields yields ParseException

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562911#comment-14562911
 ] 

ASF GitHub Bot commented on FLINK-2061:
---

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/734#issuecomment-106313923
  
Okay. :)
Because Stephan's email address is in the test code, I modified the test code.


> CSVReader: quotedStringParsing and includeFields yields ParseException
> --
>
> Key: FLINK-2061
> URL: https://issues.apache.org/jira/browse/FLINK-2061
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Chiwan Park
>
> Fields in a CSV file with quoted String cannot be skipped.
> Parsing a line such as: 
> {code}
> "20:41:52-1-3-2015"|"Re: Taskmanager memory error in Eclipse"|"Stephan Ewen 
> "|"bla"|"blubb"
> {code}
> with a CSVReader configured as: 
> {code}
> DataSet<Tuple2<String, String>> data =
>   env.readCsvFile("/path/to/my/data")
>   .lineDelimiter("\n")
>   .fieldDelimiter("|")
>   .parseQuotedStrings('"')
>   .includeFields("101")
>   .types(String.class, String.class);
> {code}
> gives a {{ParseException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2061] CSVReader: quotedStringParsing an...

2015-05-28 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/734#issuecomment-106313923
  
Okay. :)
Because Stephan's email address is in the test code, I modified the test code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562907#comment-14562907
 ] 

ASF GitHub Bot commented on FLINK-2088:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/739


> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> {{name}} function of {{DataStream}} scala class has a wrong return type: 
> {code} DataStream[_] {code} when it should return a:{code}DataStream[T] 
> {code} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1874] [streaming] Connector breakup

2015-05-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/719


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562921#comment-14562921
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/672


> Scala Interactive Shell
> ---
>
> Key: FLINK-1907
> URL: https://issues.apache.org/jira/browse/FLINK-1907
> Project: Flink
>  Issue Type: New Feature
>  Components: Scala API
>Reporter: Nikolaas Steenbergen
>Assignee: Nikolaas Steenbergen
>Priority: Minor
>
> Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/672


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562939#comment-14562939
 ] 

Chiwan Park commented on FLINK-687:
---

[~r-pogalz] Okay, you can take this issue. :) I haven't started working on the 
sort-merge based outer join. If there is a chance after the sort-merge based 
outer join is done, I would like to implement the hash-join based outer join.

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Chiwan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chiwan Park updated FLINK-687:
--
Comment: was deleted

(was: [~r-pogalz] Okay. You can take this issue. :) I don't start work about 
the sort-merge based outer join. If there is a chance after the sort-merge 
based outer join, I would like to implement the hash-join based outer join.)

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562937#comment-14562937
 ] 

Chiwan Park commented on FLINK-687:
---

[~r-pogalz] Okay, you can take this issue. :) I haven't started working on the 
sort-merge based outer join. If there is a chance after the sort-merge based 
outer join is done, I would like to implement the hash-join based outer join.

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562963#comment-14562963
 ] 

Fabian Hueske commented on FLINK-687:
-

How about both of you implement an outer join algorithm? I'll open another 
subtask for a Hash Outer Join, and you two can sync on who implements what. Once 
this is done, we can still decide who will work on the API, optimizer, and 
runtime integration.

Sounds good?

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Chiwan Park (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562976#comment-14562976
 ] 

Chiwan Park commented on FLINK-687:
---

Good. I'll take subtask for a hash-join based outer join.

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-687) Add support for outer-joins

2015-05-28 Thread Ricky Pogalz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562987#comment-14562987
 ] 

Ricky Pogalz commented on FLINK-687:


Yeah, I guess this makes sense :)

> Add support for outer-joins
> ---
>
> Key: FLINK-687
> URL: https://issues.apache.org/jira/browse/FLINK-687
> Project: Flink
>  Issue Type: New Feature
>Reporter: GitHub Import
>Assignee: Chiwan Park
>Priority: Minor
>  Labels: github-import
> Fix For: pre-apache
>
>
> There are three types of outer-joins:
> - left outer,
> - right outer, and
> - full outer
> joins.
> An outer-join does not "filter" tuples of the outer-side that do not find a 
> matching tuple on the other side. Instead, it is joined with a NULL value.
> Supporting outer-joins requires some modifications in the join execution 
> strategies.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/687
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime, 
> Created at: Mon Apr 14 12:09:00 CEST 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2047) Rename CoCoA to SVM

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562993#comment-14562993
 ] 

ASF GitHub Bot commented on FLINK-2047:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/733


> Rename CoCoA to SVM
> ---
>
> Key: FLINK-2047
> URL: https://issues.apache.org/jira/browse/FLINK-2047
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
>Priority: Trivial
> Fix For: 0.9
>
>
> The CoCoA algorithm as implemented functions as an SVM classifier.
> As CoCoA mostly concerns the optimization process and not the actual learning 
> algorithm, it makes sense to rename the learner to SVM which users are more 
> familiar with.
> In the future we would like to use the CoCoA algorithm to solve more large 
> scale optimization problems for other learning algorithms.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2047] [ml] Rename CoCoA to SVM

2015-05-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/733


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-2047) Rename CoCoA to SVM

2015-05-28 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-2047.
--
Resolution: Fixed

Done with 995f8f9693c4fe2e40efcf6a82c10ccab11c37ba

> Rename CoCoA to SVM
> ---
>
> Key: FLINK-2047
> URL: https://issues.apache.org/jira/browse/FLINK-2047
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
>Priority: Trivial
> Fix For: 0.9
>
>
> The CoCoA algorithm as implemented functions as an SVM classifier.
> As CoCoA mostly concerns the optimization process and not the actual learning 
> algorithm, it makes sense to rename the learner to SVM which users are more 
> familiar with.
> In the future we would like to use the CoCoA algorithm to solve more large 
> scale optimization problems for other learning algorithms.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2104) Fallback implicit values for PredictOperation and TransformOperation don't work if Nothing is inferred as the output type

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563005#comment-14563005
 ] 

ASF GitHub Bot commented on FLINK-2104:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/741#issuecomment-106367425
  
If there are no more objections, then I would merge it in the next hours.


> Fallback implicit values for PredictOperation and TransformOperation don't 
> work if Nothing is inferred as the output type
> -
>
> Key: FLINK-2104
> URL: https://issues.apache.org/jira/browse/FLINK-2104
> Project: Flink
>  Issue Type: Bug
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: ML
> Fix For: 0.9
>
>
> If one calls a {{Predictor}} or {{Transformer}} with a wrong input type, then 
> the Scala compiler tries to apply the fallback implicit value for this 
> operation type. However, since the return type of the operations is 
> parameterized, it will infer it to be {{Nothing}}. The problem is then that 
> the implicit value {{Operation[Self, Input, Nothing]}} cannot be unified with 
> the implicit parameter {{Operation[Self, Input, Output]}}. This seems to be a 
> known Scala issue [https://issues.scala-lang.org/browse/SI-1570].
> I propose to fix the output type of the implicit values to {{Any}} which will 
> avoid that {{Nothing}} is inferred. This should solve the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2104] [ml] Fixes problem with type infe...

2015-05-28 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/741#issuecomment-106367425
  
If there are no more objections, then I would merge it in the next hours.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2107) Implement Hash Outer Join algorithm

2015-05-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-2107:


 Summary: Implement Hash Outer Join algorithm
 Key: FLINK-2107
 URL: https://issues.apache.org/jira/browse/FLINK-2107
 Project: Flink
  Issue Type: Sub-task
  Components: Local Runtime
Reporter: Fabian Hueske
Priority: Minor


Flink does not natively support outer joins at the moment.
This issue proposes to implement a hash outer join algorithm that can cover 
left and right outer joins.

The implementation can be based on the regular hash join iterators (for example 
`ReusingBuildFirstHashMatchIterator` and 
`NonReusingBuildFirstHashMatchIterator`, see also `MatchDriver` class)

The Reusing and NonReusing variants differ in whether object instances are 
reused or new objects are created. I would start with the NonReusing variant 
which is safer from a user's point of view and should also be easier to 
implement.
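
For illustration, a compact single-machine sketch of the build/probe idea in plain Scala (not the iterator interfaces named above):

{code}
// Hash-based left outer join: build a hash table on the right input,
// probe it with the left input, and pad probe-side misses with None.
def leftOuterHashJoin[K, L, R](
    left: Seq[(K, L)],
    right: Seq[(K, R)]): Seq[(K, L, Option[R])] = {
  // Build phase: key -> all right-side values with that key.
  val buildSide: Map[K, Seq[R]] =
    right.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2) }
  // Probe phase: every left tuple is emitted, matched or not.
  left.flatMap { case (k, l) =>
    buildSide.get(k) match {
      case Some(ms) => ms.map(r => (k, l, Some(r)))
      case None     => Seq((k, l, None))
    }
  }
}

// leftOuterHashJoin(Seq(1 -> "a", 2 -> "b"), Seq(2 -> "x"))
//   == Seq((1,"a",None), (2,"b",Some(x)))
// A right outer join is obtained by swapping build and probe sides; a full
// outer join would additionally require tracking unmatched build-side entries,
// which is why the hash strategy here only targets one-sided outer joins.
{code}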




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2107) Implement Hash Outer Join algorithm

2015-05-28 Thread Chiwan Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chiwan Park reassigned FLINK-2107:
--

Assignee: Chiwan Park

> Implement Hash Outer Join algorithm
> ---
>
> Key: FLINK-2107
> URL: https://issues.apache.org/jira/browse/FLINK-2107
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime
>Reporter: Fabian Hueske
>Assignee: Chiwan Park
>Priority: Minor
> Fix For: pre-apache
>
>
> Flink does not natively support outer joins at the moment.
> This issue proposes to implement a hash outer join algorithm that can cover 
> left and right outer joins.
> The implementation can be based on the regular hash join iterators (for 
> example `ReusingBuildFirstHashMatchIterator` and 
> `NonReusingBuildFirstHashMatchIterator`, see also `MatchDriver` class)
> The Reusing and NonReusing variants differ in whether object instances are 
> reused or new objects are created. I would start with the NonReusing variant 
> which is safer from a user's point of view and should also be easier to 
> implement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request:

2015-05-28 Thread thvasilo
Github user thvasilo commented on the pull request:


https://github.com/apache/flink/commit/995f8f9693c4fe2e40efcf6a82c10ccab11c37ba#commitcomment-11408683
  
In 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/SVM.scala 
on line 84:
Just noticed: The example still uses CoCoA() instead of SVM()


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-2105) Implement Sort-Merge Outer Join algorithm

2015-05-28 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-2105:


Assignee: Ricky Pogalz

> Implement Sort-Merge Outer Join algorithm
> -
>
> Key: FLINK-2105
> URL: https://issues.apache.org/jira/browse/FLINK-2105
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime
>Reporter: Fabian Hueske
>Assignee: Ricky Pogalz
>Priority: Minor
> Fix For: pre-apache
>
>
> Flink does not natively support outer joins at the moment. 
> This issue proposes to implement a sort-merge outer join algorithm that can 
> cover left, right, and full outer joins.
> The implementation can be based on the regular sort-merge join iterator 
> ({{ReusingMergeMatchIterator}} and {{NonReusingMergeMatchIterator}}, see also 
> {{MatchDriver}} class)
> The Reusing and NonReusing variants differ in whether object instances are 
> reused or new objects are created. I would start with the NonReusing variant 
> which is safer from a user's point of view and should also be easier to 
> implement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2102) Add predict operation for LabeledVector

2015-05-28 Thread Theodore Vasiloudis (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Theodore Vasiloudis reassigned FLINK-2102:
--

Assignee: Theodore Vasiloudis

> Add predict operation for LabeledVector
> ---
>
> Key: FLINK-2102
> URL: https://issues.apache.org/jira/browse/FLINK-2102
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> Currently we can only call predict on DataSet[V <: Vector].
> A lot of times though we have a DataSet[LabeledVector] that we split into a 
> train and test set.
> We should be able to make predictions on the test DataSet[LabeledVector] 
> without having to transform it into a DataSet[Vector]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2102) Add predict operation for LabeledVector

2015-05-28 Thread Theodore Vasiloudis (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563040#comment-14563040
 ] 

Theodore Vasiloudis commented on FLINK-2102:


My suggestion for an initial version of this is to add a PredictOperation that 
looks like this to each implemented Predictor:
{code}
implicit def predictLabeledValues = {
  new PredictOperation[PredictorType, LabeledVector, (Double, Double)] {
    ...
  }
}
{code}
i.e. for each example in the dataset, we make a prediction, and then return a 
tuple with the true value and the predicted value. That {{DataSet[(Double, 
Double)]}} could then be passed on to a scoring function.
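
For illustration, a plain-Scala sketch of the proposed data flow (collections stand in for DataSet, a made-up linear model stands in for a trained Predictor, and the local LabeledVector is just a stand-in case class):

{code}
object PredictLabeledVectorSketch extends App {
  case class LabeledVector(label: Double, vector: Vector[Double])

  // Made-up "model": a fixed weight vector.
  val weights = Vector(0.5, -1.0)
  def predict(v: Vector[Double]): Double =
    v.zip(weights).map { case (x, w) => x * w }.sum

  val testSet = Seq(
    LabeledVector(1.0, Vector(2.0, 0.0)),
    LabeledVector(-1.0, Vector(0.0, 1.0)))

  // Proposed output of predict on LabeledVector: (true label, prediction).
  val truthAndPrediction: Seq[(Double, Double)] =
    testSet.map(lv => (lv.label, predict(lv.vector)))

  println(truthAndPrediction) // List((1.0,1.0), (-1.0,-1.0))
}
{code}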

> Add predict operation for LabeledVector
> ---
>
> Key: FLINK-2102
> URL: https://issues.apache.org/jira/browse/FLINK-2102
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> Currently we can only call predict on DataSet[V <: Vector].
> A lot of times though we have a DataSet[LabeledVector] that we split into a 
> train and test set.
> We should be able to make predictions on the test DataSet[LabeledVector] 
> without having to transform it into a DataSet[Vector]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2091) Lock contention during release of network buffer pools

2015-05-28 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi closed FLINK-2091.
--
Resolution: Not A Problem

The core issue was not lock contention, but having too many tasks running at the 
same time. I think the tests that provoked this issue were running around a 
thousand tasks on task managers with 8 slots.

We can think about improving the registration/unregistration logic, but I don't 
think that it is a problem at the moment (especially with regard to the 
release).

> Lock contention during release of network buffer pools
> --
>
> Key: FLINK-2091
> URL: https://issues.apache.org/jira/browse/FLINK-2091
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Runtime
>Affects Versions: master
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> [~rmetzger] reported the following stack traces during cancelling of high 
> parallelism jobs:
> {code}
> 13:43:46,803 WARN  org.apache.flink.runtime.taskmanager.Task  
>- Task 'DataSource (at main(Job.java:59) 
> (org.apache.flink.api.java.io.TextInputFormat)) (4/16)' did not react to 
> cancelling signal, but is stuck in method:
>  
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.setNumBuffers(LocalBufferPool.java:238)
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.redistributeBuffers(NetworkBufferPool.java:268)
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.destroyBufferPool(NetworkBufferPool.java:218)
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.lazyDestroy(LocalBufferPool.java:221)
> org.apache.flink.runtime.io.network.partition.ResultPartition.destroyBufferPool(ResultPartition.java:302)
> org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:366)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:647)
> java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> 13:42:57,595 WARN  org.apache.flink.runtime.taskmanager.Task  
>- Task 'DataSource (at main(Job.java:59) 
> (org.apache.flink.api.java.io.TextInputFormat)) (16/16)' did not react to 
> cancelling signal, but is stuck in method:
>  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.destroyBufferPool(NetworkBufferPool.java:212)
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.lazyDestroy(LocalBufferPool.java:221)
> org.apache.flink.runtime.io.network.partition.ResultPartition.destroyBufferPool(ResultPartition.java:302)
> org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:366)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:647)
> java.lang.Thread.run(Thread.java:745)
> {code}
> The issue is that during cancelling of high parallelism jobs the locks for 
> buffer pool management are highly contended.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2104] [ml] Fixes problem with type infe...

2015-05-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/741


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2104) Fallback implicit values for PredictOperation and TransformOperation don't work if Nothing is inferred as the output type

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563086#comment-14563086
 ] 

ASF GitHub Bot commented on FLINK-2104:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/741


> Fallback implicit values for PredictOperation and TransformOperation don't 
> work if Nothing is inferred as the output type
> -
>
> Key: FLINK-2104
> URL: https://issues.apache.org/jira/browse/FLINK-2104
> Project: Flink
>  Issue Type: Bug
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: ML
> Fix For: 0.9
>
>
> If one calls a {{Predictor}} or {{Transformer}} with a wrong input type, then 
> the Scala compiler tries to apply the fallback implicit value for this 
> operation type. However, since the return type of the operations is 
> parameterized, it will infer it to be {{Nothing}}. The problem is then that 
> the implicit value {{Operation[Self, Input, Nothing]}} cannot be unified with 
> the implicit parameter {{Operation[Self, Input, Output]}}. This seems to be a 
> known Scala issue [https://issues.scala-lang.org/browse/SI-1570].
> I propose to fix the output type of the implicit values to {{Any}} which will 
> avoid that {{Nothing}} is inferred. This should solve the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2104) Fallback implicit values for PredictOperation and TransformOperation don't work if Nothing is inferred as the output type

2015-05-28 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-2104.

Resolution: Fixed

Fixed via 3e860e7fd5ef9c4aba10f738ce95b12d40654cce

> Fallback implicit values for PredictOperation and TransformOperation don't 
> work if Nothing is inferred as the output type
> -
>
> Key: FLINK-2104
> URL: https://issues.apache.org/jira/browse/FLINK-2104
> Project: Flink
>  Issue Type: Bug
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: ML
> Fix For: 0.9
>
>
> If one calls a {{Predictor}} or {{Transformer}} with a wrong input type, then 
> the Scala compiler tries to apply the fallback implicit value for this 
> operation type. However, since the return type of the operations is 
> parameterized, it will infer it to be {{Nothing}}. The problem is then that 
> the implicit value {{Operation[Self, Input, Nothing]}} cannot be unified with 
> the implicit parameter {{Operation[Self, Input, Output]}}. This seems to be a 
> known Scala issue [https://issues.scala-lang.org/browse/SI-1570].
> I propose to fix the output type of the implicit values to {{Any}} which will 
> avoid that {{Nothing}} is inferred. This should solve the problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2108) Add score function for Predictors

2015-05-28 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2108:
--

 Summary: Add score function for Predictors
 Key: FLINK-2108
 URL: https://issues.apache.org/jira/browse/FLINK-2108
 Project: Flink
  Issue Type: Improvement
  Components: Machine Learning Library
Reporter: Theodore Vasiloudis
Priority: Minor


A score function for Predictor implementations should take a DataSet[(I, O)] 
and an (optional) scoring measure and return a score.

The DataSet[(I, O)] would probably be the output of the predict function.

For example, in MultipleLinearRegression we can call predict on a labeled 
dataset and get back predictions for each item in the data. Calling score with 
the resulting dataset as an argument should then return a score for the 
prediction quality, such as the R^2 score.
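
A hedged sketch of what such an API could look like is below; the trait name {{ScoreMeasure}} and the mean-squared-error stand-in are illustrative assumptions, not the eventual FlinkML interface.

{code}
import org.apache.flink.api.scala._

// Illustrative sketch only: a scoring measure consumes (truth, prediction)
// pairs, such as those produced by predict on labeled data, and returns a score.
trait ScoreMeasure[O] extends Serializable {
  def score(truthAndPrediction: DataSet[(O, O)]): DataSet[Double]
}

// A simple mean squared error measure, standing in for measures such as R^2.
object MeanSquaredError extends ScoreMeasure[Double] {
  def score(truthAndPrediction: DataSet[(Double, Double)]): DataSet[Double] =
    truthAndPrediction
      .map { case (truth, prediction) => (math.pow(truth - prediction, 2), 1L) }
      .reduce((a, b) => (a._1 + b._1, a._2 + b._2))
      .map(sumAndCount => sumAndCount._1 / sumAndCount._2)
}
{code}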



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2102) Add predict operation for LabeledVector

2015-05-28 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563067#comment-14563067
 ] 

Till Rohrmann commented on FLINK-2102:
--

This should work as a first version. We should make sure that this is
properly documented somewhere because it makes strong assumptions about the
types.

On Thu, May 28, 2015 at 5:08 PM, Theodore Vasiloudis (JIRA) wrote:

> Add predict operation for LabeledVector
> ---
>
> Key: FLINK-2102
> URL: https://issues.apache.org/jira/browse/FLINK-2102
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Theodore Vasiloudis
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> Currently we can only call predict on a DataSet[V <: Vector].
> Often, however, we have a DataSet[LabeledVector] that we split into a train 
> and a test set.
> We should be able to make predictions on the test DataSet[LabeledVector] 
> without having to transform it into a DataSet[Vector] first.
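
A minimal Scala sketch of one way this could be derived, assuming FlinkML's {{LabeledVector(label, vector)}} case class; the {{PredictOp}} trait below is a simplified stand-in for the library's predict type class, not its actual signature.

{code}
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.Vector

// Simplified stand-in for the predict type class: given an operation that can
// predict on plain vectors, derive one for LabeledVector that keeps the true
// label next to the prediction, so no conversion to DataSet[Vector] is needed.
trait PredictOp[Self, Testing, Prediction] {
  def predict(self: Self, testing: Testing): Prediction
}

object PredictOp {
  implicit def labeledVectorPredict[Self](
      implicit vectorOp: PredictOp[Self, Vector, Double])
    : PredictOp[Self, LabeledVector, (Double, Double)] =
    new PredictOp[Self, LabeledVector, (Double, Double)] {
      def predict(self: Self, testing: LabeledVector): (Double, Double) =
        (testing.label, vectorOp.predict(self, testing.vector))
    }
}
{code}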



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2088) change return type of name function in DataStream scala class

2015-05-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi resolved FLINK-2088.
---
Resolution: Fixed

Fixed via 31efaa1

> change return type of name function in DataStream scala class
> -
>
> Key: FLINK-2088
> URL: https://issues.apache.org/jira/browse/FLINK-2088
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Faye Beligianni
>Assignee: Márton Balassi
>  Labels: Streaming
> Fix For: 0.9
>
>
> The {{name}} function of the {{DataStream}} Scala class has a wrong return 
> type: {code}DataStream[_]{code}, when it should return 
> {code}DataStream[T]{code}
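
A sketch of the corrected signature (class body trimmed; not the actual DataStream implementation):

{code}
// Returning DataStream[T] instead of DataStream[_] keeps the element type,
// so further typed operations can be chained after naming the operator.
class DataStream[T] {
  def name(operatorName: String): DataStream[T] = {
    // set the name on the underlying operator here (omitted in this sketch)
    this
  }
}
{code}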



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-1874) Break up streaming connectors into submodules

2015-05-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi resolved FLINK-1874.
---
   Resolution: Fixed
Fix Version/s: 0.9

Fixed via 665bcec and a302817.

> Break up streaming connectors into submodules
> -
>
> Key: FLINK-1874
> URL: https://issues.apache.org/jira/browse/FLINK-1874
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Streaming
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Márton Balassi
>  Labels: starter
> Fix For: 0.9
>
>
> As per: 
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Break-up-streaming-connectors-into-subprojects-td5001.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2109) CancelTaskException leads to FAILED task state

2015-05-28 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2109:
--

 Summary: CancelTaskException leads to FAILED task state
 Key: FLINK-2109
 URL: https://issues.apache.org/jira/browse/FLINK-2109
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi


The {{CancelTaskException}} is thrown to trigger canceling of the executing 
task. It is intended to cause a cancelled status, rather than a failed status.

Currently, it leads to a {{FAILED}} state instead of the expected {{CANCELED}} 
state.
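
A small, self-contained Scala sketch of the intended semantics; this is not the actual Task code, and the exception class below is a stand-in for the runtime's CancelTaskException:

{code}
object CancelSemanticsSketch {
  // Stand-in for Flink's runtime CancelTaskException.
  class CancelTaskException extends RuntimeException

  sealed trait TaskState
  case object FINISHED extends TaskState
  case object CANCELED extends TaskState
  case object FAILED   extends TaskState

  // The behaviour FLINK-2109 asks for: CancelTaskException ends the task as
  // CANCELED, any other throwable ends it as FAILED.
  def finalState(userCode: () => Unit): TaskState =
    try { userCode(); FINISHED }
    catch {
      case _: CancelTaskException => CANCELED
      case _: Throwable           => FAILED
    }

  def main(args: Array[String]): Unit = {
    println(finalState(() => throw new CancelTaskException)) // CANCELED
    println(finalState(() => throw new RuntimeException))    // FAILED
  }
}
{code}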



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2110) Early slot release after Execution failure

2015-05-28 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-2110:
--

 Summary: Early slot release after Execution failure
 Key: FLINK-2110
 URL: https://issues.apache.org/jira/browse/FLINK-2110
 Project: Flink
  Issue Type: Bug
  Components: Scheduler
Affects Versions: master
Reporter: Ufuk Celebi


In {{Execution#processFail(Throwable, boolean)}}, the slot is released before 
the cancel message is sent to the task manager assigned to this Execution.

Cancelling on the TM can take some time, which can result in the slot being 
released on the JM too early.
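
A conceptual Scala sketch of the ordering this asks for; the names {{sendCancel}}, {{Slot}} and {{release}} are hypothetical, not the JobManager/Execution API:

{code}
import scala.concurrent.{ExecutionContext, Future}

object SlotReleaseSketch {
  // Hypothetical slot handle; stands in for the scheduler's slot abstraction.
  trait Slot { def release(): Unit }

  // Release the slot only after the TaskManager has acknowledged the cancel
  // message, instead of releasing it immediately when the failure is processed.
  def failAndRelease(sendCancel: () => Future[Unit], slot: Slot)
                    (implicit ec: ExecutionContext): Future[Unit] =
    sendCancel().map(_ => slot.release())
}
{code}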



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2098] Ensure checkpoints and element em...

2015-05-28 Thread aljoscha
GitHub user aljoscha opened a pull request:

https://github.com/apache/flink/pull/742

[FLINK-2098] Ensure checkpoints and element emission are in order

Before, it could happen that a streaming source would keep emitting
elements while a checkpoint is being performed. This can lead to
inconsistencies in the checkpoints with other operators.

This also adds a test that checks whether only one checkpoint is
executed at a time and the serial behaviour of checkpointing and
emission.
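
A conceptual Scala sketch of the serial behaviour described above (not the code in this pull request): element emission and state snapshotting synchronize on the same lock, so a snapshot never interleaves with an element being emitted.

{code}
class SourceSketch {
  private val checkpointLock = new Object
  private var emitted: Long = 0L  // stands in for the operator state to snapshot

  // Called by the source loop for every element it wants to emit.
  def emit(element: String, collect: String => Unit): Unit =
    checkpointLock.synchronized {
      collect(element)
      emitted += 1
    }

  // Called when a checkpoint is triggered; sees a consistent state because no
  // element can be emitted while the lock is held.
  def snapshotState(): Long =
    checkpointLock.synchronized(emitted)
}
{code}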

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink checkpoint-hardening

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/742.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #742


commit 5df7410f44c640f3a183b525f697bbc11a69c69b
Author: Aljoscha Krettek 
Date:   2015-05-28T08:24:37Z

[FLINK-2098] Ensure checkpoints and element emission are in order

Before, it could happen that a streaming source would keep emitting
elements while a checkpoint is being performed. This can lead to
inconsistencies in the checkpoints with other operators.

This also adds a test that checks whether only one checkpoint is
executed at a time and the serial behaviour of checkpointing and
emission.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563209#comment-14563209
 ] 

ASF GitHub Bot commented on FLINK-2098:
---

GitHub user aljoscha opened a pull request:

https://github.com/apache/flink/pull/742

[FLINK-2098] Ensure checkpoints and element emission are in order

Before, it could happen that a streaming source would keep emitting
elements while a checkpoint is being performed. This can lead to
inconsistencies in the checkpoints with other operators.

This also adds a test that checks whether only one checkpoint is
executed at a time and the serial behaviour of checkpointing and
emission.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink checkpoint-hardening

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/742.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #742


commit 5df7410f44c640f3a183b525f697bbc11a69c69b
Author: Aljoscha Krettek 
Date:   2015-05-28T08:24:37Z

[FLINK-2098] Ensure checkpoints and element emission are in order

Before, it could happen that a streaming source would keep emitting
elements while a checkpoint is being performed. This can lead to
inconsistencies in the checkpoints with other operators.

This also adds a test that checks whether only one checkpoint is
executed at a time and the serial behaviour of checkpointing and
emission.




> Checkpoint barrier initiation at source is not aligned with snapshotting
> 
>
> Key: FLINK-2098
> URL: https://issues.apache.org/jira/browse/FLINK-2098
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 0.9
>
>
> The stream source does not properly align the emission of checkpoint barriers 
> with the drawing of snapshots.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-05-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563211#comment-14563211
 ] 

ASF GitHub Bot commented on FLINK-2098:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/742#issuecomment-106471093
  
This also enables the exactly once checkpointing test added earlier by 
@StephanEwen.


> Checkpoint barrier initiation at source is not aligned with snapshotting
> 
>
> Key: FLINK-2098
> URL: https://issues.apache.org/jira/browse/FLINK-2098
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 0.9
>
>
> The stream source does not properly align the emission of checkpoint barriers 
> with the drawing of snapshots.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

