[GitHub] flink pull request: [FLINK-2098] Improvements on checkpoint-aligne...

2015-06-03 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/755#issuecomment-108218506
  
@StephanEwen I took the changes, added them on top of my PR and added some 
more refinements.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2098) Checkpoint barrier initiation at source is not aligned with snapshotting

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570392#comment-14570392
 ] 

ASF GitHub Bot commented on FLINK-2098:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/755#issuecomment-108218506
  
@StephanEwen I took the changes, added them on top of my PR and added some 
more refinements.


> Checkpoint barrier initiation at source is not aligned with snapshotting
> 
>
> Key: FLINK-2098
> URL: https://issues.apache.org/jira/browse/FLINK-2098
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 0.9
>
>
> The stream source does not properly align the emission of checkpoint barriers 
> with the drawing of snapshots.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2030) Implement an online histogram with Merging and equalization features

2015-06-03 Thread Theodore Vasiloudis (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570462#comment-14570462
 ] 

Theodore Vasiloudis commented on FLINK-2030:


Is there a PR for this issue?

> Implement an online histogram with Merging and equalization features
> 
>
> Key: FLINK-2030
> URL: https://issues.apache.org/jira/browse/FLINK-2030
> Project: Flink
>  Issue Type: Sub-task
>  Components: Machine Learning Library
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>Priority: Minor
>  Labels: ML
>
> For the implementation of the decision tree in 
> https://issues.apache.org/jira/browse/FLINK-1727, we need to implement a 
> histogram with online updates, merging and equalization features. A reference 
> implementation is provided in [1].
> [1] http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf





[jira] [Created] (FLINK-2140) Access the number of vertices from within the GSA functions

2015-06-03 Thread Andra Lungu (JIRA)
Andra Lungu created FLINK-2140:
--

 Summary: Access the number of vertices from within the GSA 
functions
 Key: FLINK-2140
 URL: https://issues.apache.org/jira/browse/FLINK-2140
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu


Similarly to the vertex-centric approach, we would like to allow the user to 
access the number of vertices from the Gather, Sum and Apply functions. 
This property will become available by setting the numVertices option to true 
via setOptNumVertices(). 

The number of vertices can then be accessed in the gather, sum and apply 
functions using the getNumberOfVertices() method. If the option is not set in 
the configuration, this method will return -1. 





[GitHub] flink pull request: [streaming] Consolidate streaming API method n...

2015-06-03 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/761#issuecomment-108249291
  
Looks good-to-merge




[jira] [Created] (FLINK-2141) Allow GSA's Gather to perform this operation in more than one direction

2015-06-03 Thread Andra Lungu (JIRA)
Andra Lungu created FLINK-2141:
--

 Summary: Allow GSA's Gather to perform this operation in more than 
one direction
 Key: FLINK-2141
 URL: https://issues.apache.org/jira/browse/FLINK-2141
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu


For the time being, a vertex only gathers information from its in-edges.

Similarly to the vertex-centric approach, we would like to allow users to 
gather data from out-edges and from all edges as well. 

This property should be set using a setDirection() method.  





[jira] [Commented] (FLINK-1993) Replace MultipleLinearRegression's custom SGD with optimization framework's SGD

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570492#comment-14570492
 ] 

ASF GitHub Bot commented on FLINK-1993:
---

Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604106
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/pipeline/Predictor.scala
 ---
@@ -35,7 +35,7 @@ import org.apache.flink.ml.common.{FlinkMLTools, 
ParameterMap, WithParameters}
   *
   * @tparam Self Type of the implementing class
   */
-trait Predictor[Self] extends Estimator[Self] with WithParameters with 
Serializable {
+trait Predictor[Self] extends Estimator[Self] with WithParameters {
--- End diff --

Why is Serializable no longer needed?


> Replace MultipleLinearRegression's custom SGD with optimization framework's 
> SGD
> ---
>
> Key: FLINK-1993
> URL: https://issues.apache.org/jira/browse/FLINK-1993
> Project: Flink
>  Issue Type: Task
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> The current implementation of MultipleLinearRegression uses a custom SGD 
> implementation. Flink's optimization framework also contains a SGD optimizer 
> which should replace the custom implementation once the framework is merged.





[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604106
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/pipeline/Predictor.scala
 ---
@@ -35,7 +35,7 @@ import org.apache.flink.ml.common.{FlinkMLTools, 
ParameterMap, WithParameters}
   *
   * @tparam Self Type of the implementing class
   */
-trait Predictor[Self] extends Estimator[Self] with WithParameters with 
Serializable {
+trait Predictor[Self] extends Estimator[Self] with WithParameters {
--- End diff --

Why is Serializable no longer needed?




[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604529
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -87,11 +89,11 @@ import org.apache.flink.ml.pipeline.{FitOperation, 
PredictOperation, Predictor}
   *
   */
 class MultipleLinearRegression extends Predictor[MultipleLinearRegression] 
{
-
+  import org.apache.flink.ml._
--- End diff --

Line 49 typo: iteratinos -> iterations




[jira] [Commented] (FLINK-1993) Replace MultipleLinearRegression's custom SGD with optimization framework's SGD

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570503#comment-14570503
 ] 

ASF GitHub Bot commented on FLINK-1993:
---

Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604529
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -87,11 +89,11 @@ import org.apache.flink.ml.pipeline.{FitOperation, 
PredictOperation, Predictor}
   *
   */
 class MultipleLinearRegression extends Predictor[MultipleLinearRegression] 
{
-
+  import org.apache.flink.ml._
--- End diff --

Line 49 typo: iteratinos -> iterations


> Replace MultipleLinearRegression's custom SGD with optimization framework's 
> SGD
> ---
>
> Key: FLINK-1993
> URL: https://issues.apache.org/jira/browse/FLINK-1993
> Project: Flink
>  Issue Type: Task
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> The current implementation of MultipleLinearRegression uses a custom SGD 
> implementation. Flink's optimization framework also contains a SGD optimizer 
> which should replace the custom implementation once the framework is merged.





[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604864
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -309,8 +207,10 @@ object MultipleLinearRegression {
   : DataSet[LabeledVector] = {
--- End diff --

Docstring for the return type?




[jira] [Commented] (FLINK-1993) Replace MultipleLinearRegression's custom SGD with optimization framework's SGD

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570509#comment-14570509
 ] 

ASF GitHub Bot commented on FLINK-1993:
---

Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/760#discussion_r31604864
  
--- Diff: 
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/regression/MultipleLinearRegression.scala
 ---
@@ -309,8 +207,10 @@ object MultipleLinearRegression {
   : DataSet[LabeledVector] = {
--- End diff --

Docstring for the return type?


> Replace MultipleLinearRegression's custom SGD with optimization framework's 
> SGD
> ---
>
> Key: FLINK-1993
> URL: https://issues.apache.org/jira/browse/FLINK-1993
> Project: Flink
>  Issue Type: Task
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> The current implementation of MultipleLinearRegression uses a custom SGD 
> implementation. Flink's optimization framework also contains a SGD optimizer 
> which should replace the custom implementation once the framework is merged.





[GitHub] flink pull request: [FLINK-1993] [ml] Replaces custom SGD in Multi...

2015-06-03 Thread thvasilo
Github user thvasilo commented on the pull request:

https://github.com/apache/flink/pull/760#issuecomment-108254611
  
Looks good, some minor comments.




[jira] [Commented] (FLINK-1993) Replace MultipleLinearRegression's custom SGD with optimization framework's SGD

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570511#comment-14570511
 ] 

ASF GitHub Bot commented on FLINK-1993:
---

Github user thvasilo commented on the pull request:

https://github.com/apache/flink/pull/760#issuecomment-108254611
  
Looks good, some minor comments.


> Replace MultipleLinearRegression's custom SGD with optimization framework's 
> SGD
> ---
>
> Key: FLINK-1993
> URL: https://issues.apache.org/jira/browse/FLINK-1993
> Project: Flink
>  Issue Type: Task
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Theodore Vasiloudis
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> The current implementation of MultipleLinearRegression uses a custom SGD 
> implementation. Flink's optimization framework also contains a SGD optimizer 
> which should replace the custom implementation once the framework is merged.





[GitHub] flink pull request: [contrib] Storm compatibility

2015-06-03 Thread szape
Github user szape commented on the pull request:

https://github.com/apache/flink/pull/764#issuecomment-108256209
  
This branch has a broken history. I will rewrite the commits. Please do not 
push anything until then.




[jira] [Commented] (FLINK-2130) RabbitMQ source does not fail when failing to retrieve elements

2015-06-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570515#comment-14570515
 ] 

Márton Balassi commented on FLINK-2130:
---

Fair enough.

> RabbitMQ source does not fail when failing to retrieve elements
> ---
>
> Key: FLINK-2130
> URL: https://issues.apache.org/jira/browse/FLINK-2130
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Streaming Connectors
>Reporter: Stephan Ewen
>Assignee: Márton Balassi
>
> The RMQ source only logs when elements cannot be retrieved. Failures are not 
> propagated.





[jira] [Assigned] (FLINK-2130) RabbitMQ source does not fail when failing to retrieve elements

2015-06-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi reassigned FLINK-2130:
-

Assignee: Márton Balassi

> RabbitMQ source does not fail when failing to retrieve elements
> ---
>
> Key: FLINK-2130
> URL: https://issues.apache.org/jira/browse/FLINK-2130
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Streaming Connectors
>Reporter: Stephan Ewen
>Assignee: Márton Balassi
>
> The RMQ source only logs when elements cannot be retrieved. Failures are not 
> propagated.





[GitHub] flink pull request: [contrib] Storm compatibility

2015-06-03 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/764#issuecomment-108260360
  
What exactly is broken with the commits?




[jira] [Assigned] (FLINK-1993) Replace MultipleLinearRegression's custom SGD with optimization framework's SGD

2015-06-03 Thread Theodore Vasiloudis (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Theodore Vasiloudis reassigned FLINK-1993:
--

Assignee: Till Rohrmann  (was: Theodore Vasiloudis)

> Replace MultipleLinearRegression's custom SGD with optimization framework's 
> SGD
> ---
>
> Key: FLINK-1993
> URL: https://issues.apache.org/jira/browse/FLINK-1993
> Project: Flink
>  Issue Type: Task
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Minor
>  Labels: ML
> Fix For: 0.9
>
>
> The current implementation of MultipleLinearRegression uses a custom SGD 
> implementation. Flink's optimization framework also contains a SGD optimizer 
> which should replace the custom implementation once the framework is merged.





[GitHub] flink pull request: [docs/javadoc][hotfix] Corrected Join hint and...

2015-06-03 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/763#issuecomment-108262606
  
looks good, +1.




[GitHub] flink pull request: [FLINK-2130] [streaming] RMQ Source properly p...

2015-06-03 Thread mbalassi
GitHub user mbalassi opened a pull request:

https://github.com/apache/flink/pull/767

[FLINK-2130] [streaming] RMQ Source properly propagates exceptions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink flink-2130

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/767.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #767


commit 2b784f09493767ca5b6388ac692406466dc55575
Author: mbalassi 
Date:   2015-06-03T09:16:48Z

[FLINK-2130] [streaming] RMQ Source properly propagates exceptions






[jira] [Commented] (FLINK-2130) RabbitMQ source does not fail when failing to retrieve elements

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570522#comment-14570522
 ] 

ASF GitHub Bot commented on FLINK-2130:
---

GitHub user mbalassi opened a pull request:

https://github.com/apache/flink/pull/767

[FLINK-2130] [streaming] RMQ Source properly propagates exceptions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink flink-2130

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/767.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #767


commit 2b784f09493767ca5b6388ac692406466dc55575
Author: mbalassi 
Date:   2015-06-03T09:16:48Z

[FLINK-2130] [streaming] RMQ Source properly propagates exceptions




> RabbitMQ source does not fail when failing to retrieve elements
> ---
>
> Key: FLINK-2130
> URL: https://issues.apache.org/jira/browse/FLINK-2130
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Streaming Connectors
>Reporter: Stephan Ewen
>Assignee: Márton Balassi
>
> The RMQ source only logs when elements cannot be retrieved. Failures are not 
> propagated.





[jira] [Commented] (FLINK-2137) Expose partitionByHash for WindowedDataStream

2015-06-03 Thread Gyula Fora (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570525#comment-14570525
 ] 

Gyula Fora commented on FLINK-2137:
---

I don't think this makes much sense for the windowing case. The groupBy with 
a key selector should be enough.

> Expose partitionByHash for WindowedDataStream
> -
>
> Key: FLINK-2137
> URL: https://issues.apache.org/jira/browse/FLINK-2137
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Gábor Hermann
>
> This functionality has been recently exposed for DataStreams and 
> ConnectedDataStreams, but not for WindowedDataStreams yet.





[jira] [Commented] (FLINK-2137) Expose partitionByHash for WindowedDataStream

2015-06-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570528#comment-14570528
 ] 

Márton Balassi commented on FLINK-2137:
---

I am personally fine with not having it. If there are no objections, please 
mark it as Not A Problem.

> Expose partitionByHash for WindowedDataStream
> -
>
> Key: FLINK-2137
> URL: https://issues.apache.org/jira/browse/FLINK-2137
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Gábor Hermann
>
> This functionality has been recently exposed for DataStreams and 
> ConnectedDataStreams, but not for WindowedDataStreams yet.





[jira] [Closed] (FLINK-2137) Expose partitionByHash for WindowedDataStream

2015-06-03 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-2137.
-
Resolution: Not A Problem

> Expose partitionByHash for WindowedDataStream
> -
>
> Key: FLINK-2137
> URL: https://issues.apache.org/jira/browse/FLINK-2137
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Gábor Hermann
>
> This functionality has been recently exposed for DataStreams and 
> ConnectedDataStreams, but not for WindowedDataStreams yet.





[jira] [Created] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2142:
--

 Summary: GSoC project: Exact and Approximate Statistics for Data 
Streams and Windows
 Key: FLINK-2142
 URL: https://issues.apache.org/jira/browse/FLINK-2142
 Project: Flink
  Issue Type: New Feature
  Components: Streaming
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor


The goal of this project is to implement basic statistics of data streams and 
windows (like average, median, variance, correlation, etc.) in a 
computationally efficient manner. This involves designing custom preaggregators.

The exact calculation of some statistics (e.g. frequencies, or the number of 
distinct elements) would require memory proportional to the number of elements 
in the input (the window or the entire stream). However, there are efficient 
algorithms and data structures using less memory for calculating the same 
statistics only approximately, with user-specified error bounds.





[jira] [Created] (FLINK-2143) Add an overload to reduceWindow which takes the inverse of the reduceFunction as a second parameter

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2143:
--

 Summary: Add an overload to reduceWindow which takes the inverse 
of the reduceFunction as a second parameter
 Key: FLINK-2143
 URL: https://issues.apache.org/jira/browse/FLINK-2143
 Project: Flink
  Issue Type: Sub-task
Reporter: Gabor Gevay
Assignee: Gabor Gevay


If the inverse of the reduceFunction is also available (for example subtraction 
when summing numbers), then a PreReducer can maintain the aggregate in O(1) 
memory and O(1) time for evict, store, and emitWindow.
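
A minimal sketch of the idea, not Flink's actual PreReducer interface: the 
class name InvertibleWindowReducer and its method signatures are hypothetical, 
chosen only to mirror the store, evict and emitWindow operations named above. 
It assumes the caller passes the evicted element back in (the window buffer 
already holds it), so the reducer itself keeps only the running aggregate and 
an element count.

```java
import java.util.function.BinaryOperator;

// Hypothetical illustration; names and signatures are assumptions, not Flink API.
final class InvertibleWindowReducer<T> {
    private final BinaryOperator<T> reduce;   // e.g. (a, b) -> a + b
    private final BinaryOperator<T> inverse;  // e.g. (a, b) -> a - b, undoes reduce
    private T aggregate;                      // aggregate of the current window, null when empty
    private long count;                       // number of elements currently in the window

    InvertibleWindowReducer(BinaryOperator<T> reduce, BinaryOperator<T> inverse) {
        this.reduce = reduce;
        this.inverse = inverse;
    }

    // O(1): fold the new element into the running aggregate.
    void store(T element) {
        aggregate = (count == 0) ? element : reduce.apply(aggregate, element);
        count++;
    }

    // O(1): remove the contribution of an element that leaves the window.
    void evict(T element) {
        if (count == 0) {
            throw new IllegalStateException("window is empty");
        }
        count--;
        aggregate = (count == 0) ? null : inverse.apply(aggregate, element);
    }

    // O(1): the aggregate is always up to date, so emitting is a plain read.
    T emitWindow() {
        return aggregate;
    }
}
```

With summation as the reduce function and subtraction as its inverse, storing 
1, 2 and 3 and then evicting 1 leaves an aggregate of 5 without re-reducing 
the remaining elements.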





[jira] [Updated] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2142:
---
Description: 
The goal of this project is to implement basic statistics of data streams and 
windows (like average, median, variance, correlation, etc.) in a 
computationally efficient manner. This involves designing custom PreReducers.

The exact calculation of some statistics (e.g. frequencies, or the number of 
distinct elements) would require memory proportional to the number of elements 
in the input (the window or the entire stream). However, there are efficient 
algorithms and data structures using less memory for calculating the same 
statistics only approximately, with user-specified error bounds.

  was:
The goal of this project is to implement basic statistics of data streams and 
windows (like average, median, variance, correlation, etc.) in a 
computationally efficient manner. This involves designing custom preaggregators.

The exact calculation of some statistics (e.g. frequencies, or the number of 
distinct elements) would require memory proportional to the number of elements 
in the input (the window or the entire stream). However, there are efficient 
algorithms and data structures using less memory for calculating the same 
statistics only approximately, with user-specified error bounds.


> GSoC project: Exact and Approximate Statistics for Data Streams and Windows
> ---
>
> Key: FLINK-2142
> URL: https://issues.apache.org/jira/browse/FLINK-2142
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: gsoc2015, statistics, streaming
>
> The goal of this project is to implement basic statistics of data streams and 
> windows (like average, median, variance, correlation, etc.) in a 
> computationally efficient manner. This involves designing custom PreReducers.
> The exact calculation of some statistics (e.g. frequencies, or the number of 
> distinct elements) would require memory proportional to the number of 
> elements in the input (the window or the entire stream). However, there are 
> efficient algorithms and data structures using less memory for calculating 
> the same statistics only approximately, with user-specified error bounds.





[jira] [Created] (FLINK-2144) Implement count, average, and variance for windows

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2144:
--

 Summary: Implement count, average, and variance for windows
 Key: FLINK-2144
 URL: https://issues.apache.org/jira/browse/FLINK-2144
 Project: Flink
  Issue Type: Sub-task
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor


By count I mean the number of elements in the window.

These can be implemented very efficiently building on FLINK-2143.
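
One possible way to read this, sketched under assumptions: count, sum and sum 
of squares are each invertible aggregates (add on store, subtract on evict), 
and average and variance can be derived from them. The class name 
WindowMoments is hypothetical and not part of Flink. The variance here is the 
population variance, computed with the simple sum-of-squares formula, which 
can lose precision for large values with a small spread.

```java
// Hypothetical illustration; not Flink API.
final class WindowMoments {
    private long count;           // number of elements in the window
    private double sum;           // sum of the elements
    private double sumOfSquares;  // sum of the squared elements

    // O(1): add the element's contribution to every running total.
    void store(double x) {
        count++;
        sum += x;
        sumOfSquares += x * x;
    }

    // O(1): subtract the contribution of an element leaving the window.
    void evict(double x) {
        if (count == 0) {
            throw new IllegalStateException("window is empty");
        }
        count--;
        sum -= x;
        sumOfSquares -= x * x;
    }

    long count() {
        return count;
    }

    double average() {
        if (count == 0) {
            throw new IllegalStateException("window is empty");
        }
        return sum / count;
    }

    // Population variance E[x^2] - (E[x])^2, clamped at 0 against rounding error.
    double variance() {
        double mean = average();
        return Math.max(0.0, sumOfSquares / count - mean * mean);
    }
}
```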





[jira] [Assigned] (FLINK-2136) Test the streaming scala API

2015-06-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gábor Hermann reassigned FLINK-2136:


Assignee: Gábor Hermann

> Test the streaming scala API
> 
>
> Key: FLINK-2136
> URL: https://issues.apache.org/jira/browse/FLINK-2136
> Project: Flink
>  Issue Type: Test
>  Components: Scala API, Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Gábor Hermann
>
> There are no tests covering the streaming Scala API. I would suggest testing 
> whether the StreamGraph created by a certain operation looks as expected. 
> Deeper layers and runtime should not be tested here; that is done in 
> streaming-core.





[jira] [Commented] (FLINK-1526) Add Minimum Spanning Tree library method and example

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570554#comment-14570554
 ] 

ASF GitHub Bot commented on FLINK-1526:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/434#issuecomment-108278253
  
Hey @andralungu!

I think we should close this one. We can't really continue from this state 
anyway. I guess we'll have to revisit this problem once we have for-loop 
iteration support.


> Add Minimum Spanning Tree library method and example
> 
>
> Key: FLINK-1526
> URL: https://issues.apache.org/jira/browse/FLINK-1526
> Project: Flink
>  Issue Type: Task
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Andra Lungu
>
> This issue proposes the addition of a library method and an example for 
> distributed minimum spanning tree in Gelly.
> The DMST algorithm is very interesting because it is quite different from 
> PageRank-like iterative graph algorithms. It consists of distinct phases 
> inside the same iteration and requires a mechanism to detect convergence of 
> one phase to proceed to the next one. Current implementations in 
> vertex-centric models are quite long (>1000 lines) and hard to understand.
> You can find a description of the algorithm [here | 
> http://ilpubs.stanford.edu:8090/1077/3/p535-salihoglu.pdf] and [here | 
> http://www.vldb.org/pvldb/vol7/p1047-han.pdf].





[GitHub] flink pull request: [FLINK-1526][gelly] [work in progress] Added M...

2015-06-03 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/434#issuecomment-108278253
  
Hey @andralungu!

I think we should close this one. We can't really continue from this state 
anyway. I guess we'll have to revisit this problem once we have for-loop 
iteration support.




[jira] [Created] (FLINK-2145) Median calculation for windows

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2145:
--

 Summary: Median calculation for windows
 Key: FLINK-2145
 URL: https://issues.apache.org/jira/browse/FLINK-2145
 Project: Flink
  Issue Type: Sub-task
  Components: Streaming
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor


The PreReducer for this has the following algorithm: We maintain two multisets 
(as, for example, balanced binary search trees), that always partition the 
elements of the current window to smaller-than-median and larger-than-median 
elements. At each store and evict, we can maintain this invariant with only 
O(1) multiset operations.
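
A minimal sketch of the two-multiset idea, written in Python rather than against Flink's PreReducer interface. The class and method names are illustrative assumptions, and sorted Python lists stand in for the multisets, so an insert costs O(N) here instead of the O(log N) a balanced search tree would give. The point is the invariant: after every store or evict, at most one element has to move between the two halves.

```python
import bisect


class WindowMedian:
    """Tracks the (lower) median of a window with two sorted halves."""

    def __init__(self):
        self.low = []   # sorted; elements <= median, median is low[-1]
        self.high = []  # sorted; elements >= median

    def _rebalance(self):
        # Invariant: len(low) == len(high) or len(low) == len(high) + 1.
        if len(self.low) > len(self.high) + 1:
            bisect.insort(self.high, self.low.pop())    # move max of low
        elif len(self.high) > len(self.low):
            bisect.insort(self.low, self.high.pop(0))   # move min of high

    def store(self, x):
        if not self.low or x <= self.low[-1]:
            bisect.insort(self.low, x)
        else:
            bisect.insort(self.high, x)
        self._rebalance()

    def evict(self, x):
        # Remove one occurrence of x from whichever half holds it.
        side = self.low if self.low and x <= self.low[-1] else self.high
        i = bisect.bisect_left(side, x)
        if i == len(side) or side[i] != x:
            raise KeyError(x)
        side.pop(i)
        self._rebalance()

    def median(self):
        return self.low[-1]
```

With an even number of elements this returns the lower of the two middle values; averaging low[-1] and high[0] instead would be an equally valid choice.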



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2144) Implement count, average, and variance for windows

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2144:
---
Labels: statistics  (was: )

> Implement count, average, and variance for windows
> --
>
> Key: FLINK-2144
> URL: https://issues.apache.org/jira/browse/FLINK-2144
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: statistics
>
> By count I mean the number of elements in the window.
> These can be implemented very efficiently building on FLINK-2143.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2145) Median calculation for windows

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2145:
---
Labels: statistics  (was: )

> Median calculation for windows
> --
>
> Key: FLINK-2145
> URL: https://issues.apache.org/jira/browse/FLINK-2145
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: statistics
>
> The PreReducer for this has the following algorithm: We maintain two 
> multisets (as, for example, balanced binary search trees), that always 
> partition the elements of the current window to smaller-than-median and 
> larger-than-median elements. At each store and evict, we can maintain this 
> invariant with only O(1) multiset operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2146) Fast calculation of min/max with arbitrary eviction and triggers

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2146:
--

 Summary: Fast calculation of min/max with arbitrary eviction and 
triggers
 Key: FLINK-2146
 URL: https://issues.apache.org/jira/browse/FLINK-2146
 Project: Flink
  Issue Type: Sub-task
Reporter: Gabor Gevay
Priority: Minor


The last algorithm described here could be used:
http://codercareer.blogspot.com/2012/02/no-33-maximums-in-sliding-windows.html
It is based on a double-ended queue which maintains a sorted list of elements 
of the current window that have the possibility of being the maximal element in 
the future.

Store: O(1) amortized
Evict: O(1)
emitWindow: O(1)
memory: O(N)
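
A minimal sketch of the deque-based approach for the maximum, in Python. It assumes that eviction always removes the oldest element still in the window; the class and method names are illustrative, not Flink API. Each stored element is tagged with a sequence number so that evict can tell whether the oldest element is the one at the front of the deque.

```python
from collections import deque


class WindowMax:
    """Sliding-window maximum using a deque of candidate elements."""

    def __init__(self):
        # (sequence number, value) pairs; values are strictly decreasing
        # from front to back, so the front is always the current maximum.
        self._candidates = deque()
        self._next_seq = 0     # sequence number of the next stored element
        self._oldest_seq = 0   # sequence number of the oldest window element

    def store(self, value):
        # Older elements that are not larger can never be the maximum again.
        while self._candidates and self._candidates[-1][1] <= value:
            self._candidates.pop()
        self._candidates.append((self._next_seq, value))
        self._next_seq += 1

    def evict(self):
        if self._oldest_seq == self._next_seq:
            raise IndexError("window is empty")
        if self._candidates[0][0] == self._oldest_seq:
            self._candidates.popleft()
        self._oldest_seq += 1

    def emit_window(self):
        return self._candidates[0][1]
```

Every element is appended once and popped at most once, which is where the amortized O(1) store comes from. The minimum works the same way with the comparison reversed.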



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2145) Median calculation for windows

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2145:
---
Description: 
The PreReducer for this has the following algorithm: We maintain two multisets 
(as, for example, balanced binary search trees), that always partition the 
elements of the current window to smaller-than-median and larger-than-median 
elements. At each store and evict, we can maintain this invariant with only 
O(1) multiset operations.

Store: O(log N)
Evict: O(log N)
emitWindow: O(1)
memory: O(N)

  was:The PreReducer for this has the following algorithm: We maintain two 
multisets (as, for example, balanced binary search trees), that always 
partition the elements of the current window to smaller-than-median and 
larger-than-median elements. At each store and evict, we can maintain this 
invariant with only O(1) multiset operations.


> Median calculation for windows
> --
>
> Key: FLINK-2145
> URL: https://issues.apache.org/jira/browse/FLINK-2145
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: statistics
>
> The PreReducer for this has the following algorithm: We maintain two 
> multisets (as, for example, balanced binary search trees), that always 
> partition the elements of the current window to smaller-than-median and 
> larger-than-median elements. At each store and evict, we can maintain this 
> invariant with only O(1) multiset operations.
> Store: O(log N)
> Evict: O(log N)
> emitWindow: O(1)
> memory: O(N)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1528][Gelly] Added Local Clustering Coe...

2015-06-03 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/420#issuecomment-108283573
  
Hey @balidani!
Would you like to finish this up?
It's not really urgent, but it's almost finished and it'd be a pity to 
abandon :)
Someone else could also take over of course. Just let us know!




[jira] [Commented] (FLINK-1528) Add local clustering coefficient library method and example

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570567#comment-14570567
 ] 

ASF GitHub Bot commented on FLINK-1528:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/420#issuecomment-108283573
  
Hey @balidani!
Would you like to finish this up?
It's not really urgent, but it's almost finished and it'd be a pity to 
abandon :)
Someone else could also take over of course. Just let us know!


> Add local clustering coefficient library method and example
> ---
>
> Key: FLINK-1528
> URL: https://issues.apache.org/jira/browse/FLINK-1528
> Project: Flink
>  Issue Type: Task
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>
> Add a gelly library method and example to compute the local clustering 
> coefficient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2144) Implement count, average, and variance for windows

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2144:
---
Description: 
By count I mean the number of elements in the window.

These can be implemented very efficiently building on FLINK-2143:
Store: O(1)
Evict: O(1)
emitWindow: O(1)
memory: O(1)

  was:
By count I mean the number of elements in the window.

These can be implemented very efficiently building on FLINK-2143.


> Implement count, average, and variance for windows
> --
>
> Key: FLINK-2144
> URL: https://issues.apache.org/jira/browse/FLINK-2144
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: statistics
>
> By count I mean the number of elements in the window.
> These can be implemented very efficiently building on FLINK-2143:
> Store: O(1)
> Evict: O(1)
> emitWindow: O(1)
> memory: O(1)
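
A minimal sketch of how all three statistics fit in O(1) time and memory, in Python. It assumes that evict is told the value being removed and keeps running sums that can be updated in both directions; the names are illustrative, and nothing here is taken from FLINK-2143 itself.

```python
class WindowStats:
    """Count, average and population variance from three running sums."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0

    def store(self, x):
        self.count += 1
        self.total += x
        self.total_sq += x * x

    def evict(self, x):
        self.count -= 1
        self.total -= x
        self.total_sq -= x * x

    def average(self):
        return self.total / self.count

    def variance(self):
        mean = self.average()
        # Clamp at zero: rounding can push the difference slightly negative.
        return max(self.total_sq / self.count - mean * mean, 0.0)
```

The sum-of-squares formula is simple but can lose precision when the values are large relative to their spread; a production version would probably want a numerically more careful update.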



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1707][WIP]Add an Affinity Propagation L...

2015-06-03 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108284020
  
Hey @joey001!

Are you still working on this? Let us know if you need help!




[jira] [Commented] (FLINK-1707) Add an Affinity Propagation Library Method

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570572#comment-14570572
 ] 

ASF GitHub Bot commented on FLINK-1707:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108284020
  
Hey @joey001!

Are you still working on this? Let us know if you need help!


> Add an Affinity Propagation Library Method
> --
>
> Key: FLINK-1707
> URL: https://issues.apache.org/jira/browse/FLINK-1707
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: joey
>Priority: Minor
>
> This issue proposes adding an implementation of the Affinity Propagation 
> algorithm as a Gelly library method and a corresponding example.
> The algorithm is described in paper [1] and a description of a vertex-centric 
> implementation can be found in [2].
> [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
> [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1528) Add local clustering coefficient library method and example

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570573#comment-14570573
 ] 

ASF GitHub Bot commented on FLINK-1528:
---

Github user balidani commented on the pull request:

https://github.com/apache/flink/pull/420#issuecomment-108284218
  
Yeah, I should definitely finish this! I'll take a look tonight, sorry 
about that :)


> Add local clustering coefficient library method and example
> ---
>
> Key: FLINK-1528
> URL: https://issues.apache.org/jira/browse/FLINK-1528
> Project: Flink
>  Issue Type: Task
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Daniel Bali
>
> Add a gelly library method and example to compute the local clustering 
> coefficient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1528][Gelly] Added Local Clustering Coe...

2015-06-03 Thread balidani
Github user balidani commented on the pull request:

https://github.com/apache/flink/pull/420#issuecomment-108284218
  
Yeah, I should definitely finish this! I'll take a look tonight, sorry 
about that :)




[GitHub] flink pull request: [contrib] Storm compatibility

2015-06-03 Thread szape
Github user szape commented on the pull request:

https://github.com/apache/flink/pull/764#issuecomment-108284180
  
The first 10 commits are not rebased on the current master.




[jira] [Created] (FLINK-2147) Approximate calculation of frequencies in data streams

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2147:
--

 Summary: Approximate calculation of frequencies in data streams
 Key: FLINK-2147
 URL: https://issues.apache.org/jira/browse/FLINK-2147
 Project: Flink
  Issue Type: Sub-task
Reporter: Gabor Gevay
Priority: Minor


Count-Min sketch is a hashing-based algorithm for approximately keeping track 
of the frequencies of elements in a data stream. It is described by Cormode et 
al. in the following paper:
http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf
Note that this algorithm can be conveniently implemented in a distributed way, 
as described in section 3.2 of the paper.

The paper
http://www.vldb.org/conf/2002/S10P03.pdf
also describes algorithms for approximately keeping track of frequencies, but 
here the user can specify a threshold below which she is not interested in the 
frequency of an element. The error bounds also differ from those of the 
Count-Min sketch algorithm.
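
A minimal sketch of a Count-Min sketch in Python, to make the data structure concrete. The class name, the parameters and the SHA-256-based row hashes are illustrative assumptions (the paper uses pairwise-independent hash functions), and nothing here is Flink API. Because the sketch is just a table of counters, two sketches built with the same dimensions and hash functions can be combined by adding them cell by cell, which is what makes a distributed implementation convenient.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counts; estimates never undercount."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting with the row number.
        digest = hashlib.sha256(f"{row}:{item!r}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions only ever inflate a counter, so take the minimum.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))

    def merge(self, other):
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketch dimensions differ")
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]
```

In a parallel job, each subtask could keep its own sketch and the partial sketches could be merged at the end.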



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-03 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570586#comment-14570586
 ] 

Vasia Kalavri commented on FLINK-1520:
--

Hey [~cebe]! One more ping to you :)
If you're not working on this, can I "release" this issue? Thanks!

> Read edges and vertices from CSV files
> --
>
> Key: FLINK-1520
> URL: https://issues.apache.org/jira/browse/FLINK-1520
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Carsten Brandt
>Priority: Minor
>  Labels: easyfix, newbie
>
> Add methods to create Vertex and Edge Datasets directly from CSV file inputs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2148) Approximately calculate the number of distinct elements of a stream

2015-06-03 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-2148:
--

 Summary: Approximately calculate the number of distinct elements 
of a stream
 Key: FLINK-2148
 URL: https://issues.apache.org/jira/browse/FLINK-2148
 Project: Flink
  Issue Type: Sub-task
Reporter: Gabor Gevay
Priority: Minor


In the paper
http://people.seas.harvard.edu/~minilek/papers/f0.pdf
Kane et al. describe an optimal algorithm for estimating the number of 
distinct elements in a data stream.
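
To illustrate the general shape of such an estimator, here is a minimal Python sketch of a much simpler technique, K-Minimum-Values. This is explicitly not the algorithm of Kane et al. and does not share its optimality guarantees; it is swapped in only because it fits in a few lines. The class name, the parameter k and the SHA-256-based hash are illustrative assumptions.

```python
import hashlib
import heapq


class KMinValues:
    """Distinct-count estimate from the k smallest 64-bit hash values."""

    def __init__(self, k):
        self.k = k
        self._heap = []        # negated hashes: a max-heap of the k smallest
        self._members = set()  # hashes currently held in the heap

    def _hash(self, item):
        digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        h = self._hash(item)
        if h in self._members:
            return  # duplicate of a tracked element
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            evicted = -heapq.heapreplace(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            return len(self._heap)  # fewer than k distinct hashes seen
        kth_fraction = (-self._heap[0] + 1) / 2 ** 64
        return int((self.k - 1) / kth_fraction)
```

Memory is O(k) regardless of stream length, and a larger k gives a tighter estimate.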



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2147) Approximate calculation of frequencies in data streams

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2147:
---
Labels: statistics  (was: )

> Approximate calculation of frequencies in data streams
> --
>
> Key: FLINK-2147
> URL: https://issues.apache.org/jira/browse/FLINK-2147
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Gabor Gevay
>Priority: Minor
>  Labels: statistics
>
> Count-Min sketch is a hashing-based algorithm for approximately keeping track 
> of the frequencies of elements in a data stream. It is described by Cormode 
> et al. in the following paper:
> http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf
> Note that this algorithm can be conveniently implemented in a distributed 
> way, as described in section 3.2 of the paper.
> The paper
> http://www.vldb.org/conf/2002/S10P03.pdf
> also describes algorithms for approximately keeping track of frequencies, but 
> here the user can specify a threshold below which she is not interested in 
> the frequency of an element. The error bounds also differ from those of the 
> Count-Min sketch algorithm.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-06-03 Thread PJ Van Aeken (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570601#comment-14570601
 ] 

PJ Van Aeken commented on FLINK-1962:
-

[~ssc], you can find an implementation to play with in my fork (branch 
scala-gelly-api). It has all of the functionality of the Java API except 
for a few utility methods for creating graphs, and I am also still working on 
Vertex Centric Iterations and Gather Sum Apply Iterations. Other than that, most 
of it should be there, although I am a couple of commits behind.

> Add Gelly Scala API
> ---
>
> Key: FLINK-1962
> URL: https://issues.apache.org/jira/browse/FLINK-1962
> Project: Flink
>  Issue Type: Task
>  Components: Gelly, Scala API
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Assignee: PJ Van Aeken
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Implemented TwitterSourceFilter and adapted Tw...

2015-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/695




[GitHub] flink pull request: Implemented TwitterSourceFilter and adapted Tw...

2015-06-03 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/695#issuecomment-108290010
  
Thanks for your work. :smile: 




[jira] [Commented] (FLINK-2048) Enhance Twitter Stream support

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570605#comment-14570605
 ] 

ASF GitHub Bot commented on FLINK-2048:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/695


> Enhance Twitter Stream support
> --
>
> Key: FLINK-2048
> URL: https://issues.apache.org/jira/browse/FLINK-2048
> Project: Flink
>  Issue Type: Task
>  Components: Streaming
>Affects Versions: master
>Reporter: Hilmi Yildirim
>Assignee: Hilmi Yildirim
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Flink does not have real Twitter support. It only has a TwitterSource, which 
> uses a sample stream that cannot be used properly for analysis. It is 
> possible to use external tools to create streams (e.g. Kafka), but it would be 
> beneficial to create a proper Twitter stream in Flink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1962) Add Gelly Scala API

2015-06-03 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570607#comment-14570607
 ] 

Vasia Kalavri commented on FLINK-1962:
--

Awesome news [~vanaepi]!

> Add Gelly Scala API
> ---
>
> Key: FLINK-1962
> URL: https://issues.apache.org/jira/browse/FLINK-1962
> Project: Flink
>  Issue Type: Task
>  Components: Gelly, Scala API
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Assignee: PJ Van Aeken
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2149) Simplify Gelly Jaccard similarity example

2015-06-03 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-2149:


 Summary: Simplify Gelly Jaccard similarity example
 Key: FLINK-2149
 URL: https://issues.apache.org/jira/browse/FLINK-2149
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 0.9
Reporter: Vasia Kalavri
Priority: Trivial


The Gelly Jaccard similarity example can be simplified by replacing the 
groupReduceOnEdges method with the simpler reduceOnEdges.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1759) Execution statistics for vertex-centric iterations

2015-06-03 Thread Vasia Kalavri (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasia Kalavri updated FLINK-1759:
-
Labels: easyfix starter  (was: )

> Execution statistics for vertex-centric iterations
> --
>
> Key: FLINK-1759
> URL: https://issues.apache.org/jira/browse/FLINK-1759
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Priority: Minor
>  Labels: easyfix, starter
>
> It would be nice to add an option for gathering execution statistics from 
> VertexCentricIteration.
> In particular, the following metrics could be useful:
> - total number of supersteps
> - number of messages sent (total / per superstep)
> - bytes of messages exchanged (total / per superstep)
> - execution time (total / per superstep)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [contrib] Storm compatibility

2015-06-03 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/764#issuecomment-108292671
  
I thought this is a clean branch. I am working on this currently... 




[GitHub] flink pull request: [contrib] Storm compatibility

2015-06-03 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/764#issuecomment-108292481
  
You can just add one commit at the end; please do not rewrite the complete 
history for this. That is unnecessary overhead. If you feel confident about it, 
we can even put it in the 0.9 release.




[jira] [Updated] (FLINK-1759) Execution statistics for vertex-centric iterations

2015-06-03 Thread Vasia Kalavri (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasia Kalavri updated FLINK-1759:
-
Labels:   (was: easyfix starter)

> Execution statistics for vertex-centric iterations
> --
>
> Key: FLINK-1759
> URL: https://issues.apache.org/jira/browse/FLINK-1759
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Priority: Minor
>
> It would be nice to add an option for gathering execution statistics from 
> VertexCentricIteration.
> In particular, the following metrics could be useful:
> - total number of supersteps
> - number of messages sent (total / per superstep)
> - bytes of messages exchanged (total / per superstep)
> - execution time (total / per superstep)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2149) Simplify Gelly Jaccard similarity example

2015-06-03 Thread Vasia Kalavri (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasia Kalavri updated FLINK-2149:
-
Labels: easyfix starter  (was: )

> Simplify Gelly Jaccard similarity example
> -
>
> Key: FLINK-2149
> URL: https://issues.apache.org/jira/browse/FLINK-2149
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Priority: Trivial
>  Labels: easyfix, starter
>
> The Gelly Jaccard similarity example can be simplified by replacing the 
> groupReduceOnEdges method with the simpler reduceOnEdges.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [streaming] Consolidate streaming API method n...

2015-06-03 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/761#issuecomment-108295020
  
Merging.




[jira] [Created] (FLINK-2150) Add a library method that assigns unique Long values to vertices

2015-06-03 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-2150:


 Summary: Add a library method that assigns unique Long values to 
vertices
 Key: FLINK-2150
 URL: https://issues.apache.org/jira/browse/FLINK-2150
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Priority: Minor


In some graph algorithms, it is required to initialize the vertex values with 
unique values (e.g. label propagation).
This issue proposes adding a Gelly library method that receives an input graph 
and initializes its vertex values with unique Long values.
This method can then also be used to improve the MusicProfiles example.
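
One possible way to generate such values without coordination between parallel subtasks, sketched in Python. The scheme is an assumption for illustration only (the issue does not prescribe one), and the function name and input shape are hypothetical: subtask p out of P numbers its vertices p, p + P, p + 2P, and so on, so values from different subtasks can never collide.

```python
def assign_unique_ids(partitions):
    """Map each vertex key to a unique integer.

    `partitions` is a list with one list of vertex keys per parallel subtask.
    """
    parallelism = len(partitions)
    ids = {}
    for subtask, vertices in enumerate(partitions):
        for local_index, vertex in enumerate(vertices):
            # Values of subtask p are all congruent to p modulo parallelism.
            ids[vertex] = subtask + local_index * parallelism
    return ids
```

The resulting values are unique but not contiguous when the partitions have different sizes.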



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [contrib] Storm compatibility

2015-06-03 Thread szape
Github user szape commented on the pull request:

https://github.com/apache/flink/pull/764#issuecomment-108298555
  
Okay then, it will remain as it is.




[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570676#comment-14570676
 ] 

ASF GitHub Bot commented on FLINK-1731:
---

Github user FGoessler commented on the pull request:

https://github.com/apache/flink/pull/700#issuecomment-108310843
  
The travis build is failing on Oracle JDK 8. Maven or Flink are hanging 
according to the build log. Can anyone help or at least restart the build? 
Are there any known "flipping tests"? Imo the failure isn't related to our 
changes.


> Add kMeans clustering algorithm to machine learning library
> ---
>
> Key: FLINK-1731
> URL: https://issues.apache.org/jira/browse/FLINK-1731
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Peter Schrott
>  Labels: ML
>
> The Flink repository already contains a kMeans implementation but it is not 
> yet ported to the machine learning library. I assume that only the used data 
> types have to be adapted and then it can be more or less directly moved to 
> flink-ml.
> The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
> implementation because they improve the initial seeding phase to achieve near 
> optimal clustering. It might be worthwhile to implement kMeans||.
> Resources:
> [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
> [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1731] [ml] Implementation of Feature K-...

2015-06-03 Thread FGoessler
Github user FGoessler commented on the pull request:

https://github.com/apache/flink/pull/700#issuecomment-108310843
  
The travis build is failing on Oracle JDK 8. Maven or Flink are hanging 
according to the build log. Can anyone help or at least restart the build? 
Are there any known "flipping tests"? Imo the failure isn't related to our 
changes.




[GitHub] flink pull request: [FLINK-1526][gelly] [work in progress] Added M...

2015-06-03 Thread andralungu
Github user andralungu closed the pull request at:

https://github.com/apache/flink/pull/434




[GitHub] flink pull request: [FLINK-1707][WIP]Add an Affinity Propagation L...

2015-06-03 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108364900
  
Hi @vasia ,

I believe the problem @joey001 had was that in one of the test modes, he 
got the classical "Too few memory segments provided. You need at least 33 
segments, etc". As far as I remember, @StephanEwen said there is ongoing work 
in that direction. We probably cannot merge anything with failing tests anyway 
:) 




[GitHub] flink pull request: [FLINK-2103] Expose partitionBy to user

2015-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/743




[GitHub] flink pull request: [FLINK-2130] [streaming] RMQ Source properly p...

2015-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/767




[jira] [Assigned] (FLINK-2149) Simplify Gelly Jaccard similarity example

2015-06-03 Thread Andra Lungu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andra Lungu reassigned FLINK-2149:
--

Assignee: Andra Lungu

> Simplify Gelly Jaccard similarity example
> -
>
> Key: FLINK-2149
> URL: https://issues.apache.org/jira/browse/FLINK-2149
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Assignee: Andra Lungu
>Priority: Trivial
>  Labels: easyfix, starter
>
> The Gelly Jaccard similarity example can be simplified by replacing the 
> groupReduceOnEdges method with the simpler reduceOnEdges.





[jira] [Closed] (FLINK-2130) RabbitMQ source does not fail when failing to retrieve elements

2015-06-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi closed FLINK-2130.
-
   Resolution: Fixed
Fix Version/s: 0.9

Fixed via 39ec54f

> RabbitMQ source does not fail when failing to retrieve elements
> ---
>
> Key: FLINK-2130
> URL: https://issues.apache.org/jira/browse/FLINK-2130
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Streaming Connectors
>Reporter: Stephan Ewen
>Assignee: Márton Balassi
> Fix For: 0.9
>
>
> The RMQ source only logs when elements cannot be retrieved. Failures are not 
> propagated.





[jira] [Created] (FLINK-2151) Provide interface to distinguish close() calls in error and regular cases

2015-06-03 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2151:
-

 Summary: Provide interface to distinguish close() calls in error 
and regular cases
 Key: FLINK-2151
 URL: https://issues.apache.org/jira/browse/FLINK-2151
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9
Reporter: Robert Metzger


I was talking to somebody who is interested in contributing a 
{{flink-cassandra}} connector.

The connector will create Cassandra files locally (on the TaskManagers) and 
bulk-load them in the {{close()}} method.
For user functions, it is currently not possible to find out whether the 
function is closed due to an error or a regular end.

The simplest approach would be passing an additional argument (enum or boolean) 
into the close() method, indicating the type of closing.
But that would break all existing code.

Another approach would be to add an interface, e.g. {{RichCloseFunction}}, 
that has such an extended close method.
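
A minimal sketch of the second approach, under stated assumptions: only the name `RichCloseFunction` comes from this issue. The `CloseReason` enum, the `close(CloseReason)` signature, and the stand-in classes `BulkLoadingSink` and `CloseDispatcher` are hypothetical and not part of Flink.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: why a function is being closed. */
enum CloseReason { REGULAR, ERROR }

/** Hypothetical opt-in interface; only its name is taken from the issue text. */
interface RichCloseFunction {
    void close(CloseReason reason);
}

/** Stand-in for a sink that buffers locally and bulk-loads on a regular close. */
class BulkLoadingSink implements RichCloseFunction {
    final List<String> buffered = new ArrayList<>();
    final List<String> loaded = new ArrayList<>();

    void write(String record) {
        buffered.add(record);
    }

    @Override
    public void close(CloseReason reason) {
        if (reason == CloseReason.REGULAR) {
            loaded.addAll(buffered); // bulk-load only after a clean finish
        }
        buffered.clear(); // local data is dropped in both cases
    }
}

/** Stand-in for the runtime side that decides which close variant to call. */
class CloseDispatcher {
    static void closeFunction(Object function, boolean failed) {
        if (function instanceof RichCloseFunction) {
            CloseReason reason = failed ? CloseReason.ERROR : CloseReason.REGULAR;
            ((RichCloseFunction) function).close(reason);
        }
        // Functions without the interface would keep using the existing close() path.
    }
}
```

Because the interface is opt-in, existing user functions would not need to change, which is the property the issue asks for.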





[jira] [Closed] (FLINK-2103) Expose partitionBy to the user in Stream API

2015-06-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi closed FLINK-2103.
-
   Resolution: Implemented
Fix Version/s: 0.9

Implemented via a43e0de and 6b28bdf.

> Expose partitionBy to the user in Stream API
> 
>
> Key: FLINK-2103
> URL: https://issues.apache.org/jira/browse/FLINK-2103
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.9
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 0.9
>
>
> Is there a reason why this is not exposed to the user? I could see cases 
> where this would be useful to have.





[jira] [Commented] (FLINK-2142) GSoC project: Exact and Approximate Statistics for Data Streams and Windows

2015-06-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570785#comment-14570785
 ] 

Márton Balassi commented on FLINK-2142:
---

Thanks for adding the tickets to track your progress, [~ggevay].

> GSoC project: Exact and Approximate Statistics for Data Streams and Windows
> ---
>
> Key: FLINK-2142
> URL: https://issues.apache.org/jira/browse/FLINK-2142
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: gsoc2015, statistics, streaming
>
> The goal of this project is to implement basic statistics of data streams and 
> windows (like average, median, variance, correlation, etc.) in a 
> computationally efficient manner. This involves designing custom PreReducers.
> The exact calculation of some statistics (e.g. frequencies, or the number of 
> distinct elements) would require memory proportional to the number of 
> elements in the input (the window or the entire stream). However, there are 
> efficient algorithms and data structures using less memory for calculating 
> the same statistics only approximately, with user-specified error bounds.
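
As an illustration of computing such statistics in constant memory, here is a minimal sketch of Welford's single-pass algorithm for mean and variance. The class name `RunningStats` is hypothetical, and this is not the project's PreReducer design; it only covers adding elements, not evicting them from a window.

```java
/** Single-pass mean and population variance (Welford's algorithm). */
class RunningStats {
    private long count;
    private double mean;
    private double m2; // sum of squared deviations from the current mean

    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean); // uses the updated mean
    }

    double mean() {
        return mean;
    }

    /** Population variance; defined as 0 when no values have been added. */
    double variance() {
        return count == 0 ? 0.0 : m2 / count;
    }
}
```

The state is three numbers regardless of how many elements are seen, in contrast to statistics such as distinct counts, which need approximation to stay within bounded memory.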





[GitHub] flink pull request: [streaming] Consolidate streaming API method n...

2015-06-03 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/761#issuecomment-108405297
  
Merged manually.




[GitHub] flink pull request: [streaming] Consolidate streaming API method n...

2015-06-03 Thread mbalassi
Github user mbalassi closed the pull request at:

https://github.com/apache/flink/pull/761




[jira] [Commented] (FLINK-2126) Scala shell tests sporadically fail on travis

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570807#comment-14570807
 ] 

ASF GitHub Bot commented on FLINK-2126:
---

GitHub user nikste opened a pull request:

https://github.com/apache/flink/pull/768

[FLINK-2126] Fixed Flink scala shell ITSuite sporadic failures

Add-numbers from 1..10 now does not check for "Job execution switched to 
status FINISHED.", which sometimes was not printed.

Fixed typo in add numbers test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nikste/flink Scala_shell_ITSuite_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/768.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #768


commit 267714126a606076b44e45c7e55be1f1d2fed996
Author: Nikolaas Steenbergen 
Date:   2015-06-03T10:39:02Z

Add-numbers from 1..10 now does not check for "Job execution switched to 
status FINISHED.", which sometimes did not show up.
Fixed typo in add numbers test




> Scala shell tests sporadically fail on travis
> -
>
> Key: FLINK-2126
> URL: https://issues.apache.org/jira/browse/FLINK-2126
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 0.9
>Reporter: Robert Metzger
>
> See https://travis-ci.org/rmetzger/flink/jobs/64893149





[GitHub] flink pull request: [FLINK-2126] Fixed Flink scala shell ITSuite s...

2015-06-03 Thread nikste
GitHub user nikste opened a pull request:

https://github.com/apache/flink/pull/768

[FLINK-2126] Fixed Flink scala shell ITSuite sporadic failures

Add-numbers from 1..10 now does not check for "Job execution switched to 
status FINISHED.", which sometimes was not printed.

Fixed typo in add numbers test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nikste/flink Scala_shell_ITSuite_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/768.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #768


commit 267714126a606076b44e45c7e55be1f1d2fed996
Author: Nikolaas Steenbergen 
Date:   2015-06-03T10:39:02Z

Add-numbers from 1..10 now does not check for "Job execution switched to 
status FINISHED.", which sometimes did not show up.
Fixed typo in add numbers test






[GitHub] flink pull request: [FLINK-1707][WIP]Add an Affinity Propagation L...

2015-06-03 Thread joey001
Github user joey001 commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108422058
  
Hello, I will submit a correct version of the AP algorithm ASAP. 
Unfortunately, I am currently occupied with two urgent papers. For now, I have 
three main questions about the work.
1. The first is how to rebase the branch. As @andralungu commented, I must 
have merged the work in the wrong way, and I am now confused by my git state. 
Should I open a brand-new branch for committing?
2. The second is "Too few memory segments provided". As @andralungu 
said, it seems that this problem can't be solved immediately?
3. In the AP algorithm in the original paper, small normally distributed values 
are added to the edge weights to remove degeneracies. However, generating 
normally distributed data is not supported yet. Currently all the 
tests pass, but I guess this is still a problem.
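
For question 3, one possible workaround is to generate the noise inside a user function with the JDK's own generator. The sketch below assumes that approach; the class `EdgeNoise`, the `sigma` parameter, and the fixed seed are hypothetical and not part of the PR.

```java
import java.util.Random;

class EdgeNoise {
    /**
     * Returns a copy of the weights with zero-mean Gaussian noise added.
     * The noise has standard deviation sigma; a fixed seed makes runs repeatable.
     */
    static double[] perturb(double[] weights, double sigma, long seed) {
        Random random = new Random(seed);
        double[] result = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            result[i] = weights[i] + sigma * random.nextGaussian();
        }
        return result;
    }
}
```

A very small `sigma` relative to the weights keeps the perturbation from changing the clustering while still breaking ties.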




[jira] [Commented] (FLINK-1707) Add an Affinity Propagation Library Method

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570836#comment-14570836
 ] 

ASF GitHub Bot commented on FLINK-1707:
---

Github user joey001 commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108422058
  
Hello, I will submit a correct version of the AP algorithm ASAP. 
Unfortunately, I am currently occupied with two urgent papers. For now, I have 
three main questions about the work.
1. The first is how to rebase the branch. As @andralungu commented, I must 
have merged the work in the wrong way, and I am now confused by my git state. 
Should I open a brand-new branch for committing?
2. The second is "Too few memory segments provided". As @andralungu 
said, it seems that this problem can't be solved immediately?
3. In the AP algorithm in the original paper, small normally distributed values 
are added to the edge weights to remove degeneracies. However, generating 
normally distributed data is not supported yet. Currently all the 
tests pass, but I guess this is still a problem.


> Add an Affinity Propagation Library Method
> --
>
> Key: FLINK-1707
> URL: https://issues.apache.org/jira/browse/FLINK-1707
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: joey
>Priority: Minor
>
> This issue proposes adding an implementation of the Affinity Propagation 
> algorithm as a Gelly library method and a corresponding example.
> The algorithm is described in paper [1] and a description of a vertex-centric 
> implementation can be found in [2].
> [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
> [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf





[jira] [Commented] (FLINK-1707) Add an Affinity Propagation Library Method

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570854#comment-14570854
 ] 

ASF GitHub Bot commented on FLINK-1707:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108431985
  
For 1, I would open a fresh PR; I guess this one is too messy to try to fix. 

Normally, you should:
change to master
git remote -v
pull the latest state - git pull upstream master
and then change back to your branch
git rebase master

then when pushing 

git push -f origin 

For 3. Open the new PR and we will have a look at the current solution :)


> Add an Affinity Propagation Library Method
> --
>
> Key: FLINK-1707
> URL: https://issues.apache.org/jira/browse/FLINK-1707
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: joey
>Priority: Minor
>
> This issue proposes adding an implementation of the Affinity Propagation 
> algorithm as a Gelly library method and a corresponding example.
> The algorithm is described in paper [1] and a description of a vertex-centric 
> implementation can be found in [2].
> [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
> [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf





[GitHub] flink pull request: [FLINK-1707][WIP]Add an Affinity Propagation L...

2015-06-03 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-108431985
  
For 1, I would open a fresh PR; I guess this one is too messy to try to fix. 

Normally, you should:
change to master
git remote -v
pull the latest state - git pull upstream master
and then change back to your branch
git rebase master

then when pushing 

git push -f origin 

For 3. Open the new PR and we will have a look at the current solution :)




[GitHub] flink pull request: [FLINK-2135] Fix faulty cast to GroupReduceFun...

2015-06-03 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/769

[FLINK-2135] Fix faulty cast to GroupReduceFunction

There seem to be two variants of that tuple unwrapping thing, one 
combinable, the other one non-combinable.
These unwrapping things use a helper class called `WrappingFunction`, which 
expects the type of the user function.
In this case, we need a combinable group reduce function. It seems that 
only RichGroupReduceFunction is combinable AND a GroupReduceFunction at the 
same time.
But users can also implement non-rich combinable group reduce functions (by 
implementing both interfaces).
The `PlanUnwrappingSortedReduceGroupOperator` was just casting the user 
function to a `RichGroupReduceFunction`, which will fail if the UDF is not a 
Rich one.

To avoid making the code uglier than it already is, I've decided to cast 
the wrapped function to a `GroupCombineFunction` for calling the combiner. The 
`TupleUnwrappingGroupCombinableGroupReducer` is only called for Combinable UDFs 
anyways.
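
The failure and the fix described above can be reproduced in isolation. The sketch below uses simplified stand-in interfaces with the same names as the Flink types; their method signatures are assumptions made for brevity and do not match the real API.

```java
/** Stand-ins for the Flink interfaces named in this PR; signatures are simplified. */
interface GroupReduceFunction {
    String reduce(String values);
}

interface GroupCombineFunction {
    String combine(String values);
}

/** Stand-in for the rich base class, which implements both interfaces. */
abstract class RichGroupReduceFunction implements GroupReduceFunction, GroupCombineFunction {
}

/** A non-rich but combinable UDF: it implements both interfaces directly. */
class PlainCombinableReducer implements GroupReduceFunction, GroupCombineFunction {
    public String reduce(String values) {
        return "reduced:" + values;
    }

    public String combine(String values) {
        return "combined:" + values;
    }
}

class CastDemo {
    /** Old behavior: assumes every combinable UDF extends the rich class. */
    static String combineViaRichCast(GroupReduceFunction udf, String values) {
        // Throws ClassCastException for PlainCombinableReducer.
        return ((RichGroupReduceFunction) udf).combine(values);
    }

    /** Fix described in this PR: cast only to the interface that is needed. */
    static String combineViaInterfaceCast(GroupReduceFunction udf, String values) {
        return ((GroupCombineFunction) udf).combine(values);
    }
}
```

Casting to the narrowest interface that declares the needed method works for both rich and non-rich combinable functions.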

Let's see if Travis gives me the green light for this fix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink2135

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/769.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #769


commit f4e66bd84ab7f0c47505336217f1ed3e86d4069c
Author: Robert Metzger 
Date:   2015-06-03T13:52:12Z

[FLINK-2135] Fix faulty cast to GroupReduceFunction






[jira] [Commented] (FLINK-2135) Java plan translation fails with ClassCastException (probably in first())

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570875#comment-14570875
 ] 

ASF GitHub Bot commented on FLINK-2135:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/769

[FLINK-2135] Fix faulty cast to GroupReduceFunction

There seem to be two variants of that tuple unwrapping thing, one 
combinable, the other one non-combinable.
These unwrapping things use a helper class called `WrappingFunction`, which 
expects the type of the user function.
In this case, we need a combinable group reduce function. It seems that 
only RichGroupReduceFunction is combinable AND a GroupReduceFunction at the 
same time.
But users can also implement non-rich combinable group reduce functions (by 
implementing both interfaces).
The `PlanUnwrappingSortedReduceGroupOperator` was just casting the user 
function to a `RichGroupReduceFunction`, which will fail if the UDF is not a 
Rich one.

To avoid making the code uglier than it already is, I've decided to cast 
the wrapped function to a `GroupCombineFunction` for calling the combiner. The 
`TupleUnwrappingGroupCombinableGroupReducer` is only called for Combinable UDFs 
anyways.

Let's see if Travis gives me the green light for this fix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink2135

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/769.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #769


commit f4e66bd84ab7f0c47505336217f1ed3e86d4069c
Author: Robert Metzger 
Date:   2015-06-03T13:52:12Z

[FLINK-2135] Fix faulty cast to GroupReduceFunction




> Java plan translation fails with ClassCastException (probably in first())
> -
>
> Key: FLINK-2135
> URL: https://issues.apache.org/jira/browse/FLINK-2135
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> A user reported the following error
> {code}
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.flink.api.java.functions.FirstReducer cannot be cast to 
> org.apache.flink.api.common.functions.RichGroupReduceFunction
>   at 
> org.apache.flink.api.java.operators.translation.PlanUnwrappingSortedReduceGroupOperator.<init>(PlanUnwrappingSortedReduceGroupOperator.java:40)
>   at 
> org.apache.flink.api.java.operators.GroupReduceOperator.translateSelectorFunctionSortedReducer(GroupReduceOperator.java:278)
>   at 
> org.apache.flink.api.java.operators.GroupReduceOperator.translateToDataFlow(GroupReduceOperator.java:177)
>   at 
> org.apache.flink.api.java.operators.GroupReduceOperator.translateToDataFlow(GroupReduceOperator.java:50)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translateSingleInputOperator(OperatorTranslation.java:124)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translate(OperatorTranslation.java:86)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translateSingleInputOperator(OperatorTranslation.java:122)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translate(OperatorTranslation.java:86)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translateSingleInputOperator(OperatorTranslation.java:122)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translate(OperatorTranslation.java:86)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translate(OperatorTranslation.java:61)
>   at 
> org.apache.flink.api.java.operators.OperatorTranslation.translateToPlan(OperatorTranslation.java:49)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:925)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:893)
>   at 
> org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:50)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:789)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:411)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1346)
>   at com.dataartisans.GroupReduceBug.main(GroupReduceBug.java:43)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces

[jira] [Created] (FLINK-2152) Provide zipWithIndex utility in flink-contrib

2015-06-03 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-2152:
-

 Summary: Provide zipWithIndex utility in flink-contrib
 Key: FLINK-2152
 URL: https://issues.apache.org/jira/browse/FLINK-2152
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Robert Metzger
Priority: Trivial


We should provide a simple utility method for zipping elements in a data set 
with a dense index.
It's up for discussion whether we want it directly in the API or if we should 
provide it only as a utility from {{flink-contrib}}.

I would put it in {{flink-contrib}}.

See my answer on SO: 
http://stackoverflow.com/questions/30596556/zipwithindex-on-apache-flink
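
One common way to build a dense index over partitioned data is a two-pass scheme: count the elements per partition, turn the counts into start offsets, then let each partition number its own elements. The sketch below shows that idea on plain Java lists. The class `ZipWithIndexSketch` is hypothetical and is not necessarily the approach from the linked answer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ZipWithIndexSketch {
    /** Assigns indices 0..n-1 without gaps; each inner list stands for one partition. */
    static <T> List<List<Map.Entry<Long, T>>> zipWithIndex(List<List<T>> partitions) {
        // Pass 1: per-partition counts become start offsets (a prefix sum).
        long[] offsets = new long[partitions.size()];
        long running = 0;
        for (int i = 0; i < partitions.size(); i++) {
            offsets[i] = running;
            running += partitions.get(i).size();
        }
        // Pass 2: each partition numbers its elements starting at its own offset.
        List<List<Map.Entry<Long, T>>> result = new ArrayList<>();
        for (int i = 0; i < partitions.size(); i++) {
            List<Map.Entry<Long, T>> out = new ArrayList<>();
            long next = offsets[i];
            for (T element : partitions.get(i)) {
                out.add(new SimpleEntry<>(next++, element));
            }
            result.add(out);
        }
        return result;
    }
}
```

In a distributed setting only the small array of counts has to be shared between workers; the data itself stays in place.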





[GitHub] flink pull request: [FLINK-1981] add support for GZIP files

2015-06-03 Thread sekruse
Github user sekruse commented on the pull request:

https://github.com/apache/flink/pull/762#issuecomment-108443527
  
I replaced the Validate check in that part with Preconditions.




[jira] [Commented] (FLINK-1981) Add GZip support

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570882#comment-14570882
 ] 

ASF GitHub Bot commented on FLINK-1981:
---

Github user sekruse commented on the pull request:

https://github.com/apache/flink/pull/762#issuecomment-108443527
  
I replaced the Validate check in that part with Preconditions.


> Add GZip support
> 
>
> Key: FLINK-1981
> URL: https://issues.apache.org/jira/browse/FLINK-1981
> Project: Flink
>  Issue Type: New Feature
>  Components: Core
>Reporter: Sebastian Kruse
>Assignee: Sebastian Kruse
>Priority: Minor
>
> GZip, as a commonly used compression format, should be supported in the same 
> way as the already supported deflate files. This allows GZip files to be used 
> with any subclass of FileInputFormat.





[jira] [Updated] (FLINK-2080) Execute Flink with sbt

2015-06-03 Thread Christian Wuertz (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Wuertz updated FLINK-2080:

Attachment: sbt.patch

I added a few lines to the scala quickstart page. I hope the patch file was 
generated correctly. If not or in case this should go to a different place, 
just let me know. 

> Execute Flink with sbt
> --
>
> Key: FLINK-2080
> URL: https://issues.apache.org/jira/browse/FLINK-2080
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 0.8.1
>Reporter: Christian Wuertz
>Priority: Minor
> Attachments: sbt.patch
>
>
> I tried to execute some of the flink example applications on my local machine 
> using sbt. To get this running without class loading issues it was important 
> to make sure that Flink is executed in its own JVM and not in the sbt JVM. 
> This can be done very easily, but it would have been nice to know that in 
> advance. So maybe you guys want to add this to the Flink documentation.
> An example can be found here: https://github.com/Teots/flink-sbt
> (The trick was to add "fork in run := true" to the build.sbt)





[jira] [Commented] (FLINK-2149) Simplify Gelly Jaccard similarity example

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570891#comment-14570891
 ] 

ASF GitHub Bot commented on FLINK-2149:
---

GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/770

[FLINK-2149][gelly] Simplified Jaccard Example

This PR simplifies Gelly's Jaccard example by using the more efficient 
reduceOnNeighbors rather than groupReduceOnNeighbors. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink jaccardImprovement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #770


commit 0e189c6af9a5fb80b4999a60a431d60cf95944db
Author: andralungu 
Date:   2015-06-03T14:12:16Z

[FLINK-2149][gelly] Simplified Jaccard Example




> Simplify Gelly Jaccard similarity example
> -
>
> Key: FLINK-2149
> URL: https://issues.apache.org/jira/browse/FLINK-2149
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 0.9
>Reporter: Vasia Kalavri
>Assignee: Andra Lungu
>Priority: Trivial
>  Labels: easyfix, starter
>
> The Gelly Jaccard similarity example can be simplified by replacing the 
> groupReduceOnEdges method with the simpler reduceOnEdges.





[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-03 Thread andralungu
GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/770

[FLINK-2149][gelly] Simplified Jaccard Example

This PR simplifies Gelly's Jaccard example by using the more efficient 
reduceOnNeighbors rather than groupReduceOnNeighbors. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink jaccardImprovement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #770


commit 0e189c6af9a5fb80b4999a60a431d60cf95944db
Author: andralungu 
Date:   2015-06-03T14:12:16Z

[FLINK-2149][gelly] Simplified Jaccard Example






[jira] [Commented] (FLINK-2150) Add a library method that assigns unique Long values to vertices

2015-06-03 Thread Andra Lungu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570924#comment-14570924
 ] 

Andra Lungu commented on FLINK-2150:


Hey [~vkalavri],

I would like to reserve this issue for a new contributor. It looks like a very 
good starter task :)

> Add a library method that assigns unique Long values to vertices
> 
>
> Key: FLINK-2150
> URL: https://issues.apache.org/jira/browse/FLINK-2150
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Priority: Minor
>
> In some graph algorithms, it is required to initialize the vertex values with 
> unique values (e.g. label propagation).
> This issue proposes adding a Gelly library method that receives an input 
> graph and initializes its vertex values with unique Long values.
> This method can then also be used to improve the MusicProfiles example.
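
If the values only need to be unique and not dense, no coordination between workers is required. A minimal sketch, assuming each parallel worker knows its own index and the total parallelism; the class `UniqueIdSketch` and its parameters are hypothetical, not a Gelly API.

```java
import java.util.ArrayList;
import java.util.List;

class UniqueIdSketch {
    /**
     * The worker with index subtask (0-based, out of parallelism workers)
     * labels its k-th vertex with k * parallelism + subtask.
     */
    static List<Long> idsForSubtask(int subtask, int parallelism, int vertexCount) {
        List<Long> ids = new ArrayList<>();
        for (long k = 0; k < vertexCount; k++) {
            ids.add(k * parallelism + subtask);
        }
        return ids;
    }
}
```

Two workers can never produce the same value because their ids differ modulo the parallelism. The ids may contain gaps when workers hold different numbers of vertices.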





[jira] [Updated] (FLINK-2150) Add a library method that assigns unique Long values to vertices

2015-06-03 Thread Andra Lungu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andra Lungu updated FLINK-2150:
---
Labels: starter  (was: )

> Add a library method that assigns unique Long values to vertices
> 
>
> Key: FLINK-2150
> URL: https://issues.apache.org/jira/browse/FLINK-2150
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Priority: Minor
>  Labels: starter
>
> In some graph algorithms, it is required to initialize the vertex values with 
> unique values (e.g. label propagation).
> This issue proposes adding a Gelly library method that receives an input 
> graph and initializes its vertex values with unique Long values.
> This method can then also be used to improve the MusicProfiles example.





[jira] [Commented] (FLINK-2030) Implement an online histogram with Merging and equalization features

2015-06-03 Thread Sachin Goel (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570934#comment-14570934
 ] 

Sachin Goel commented on FLINK-2030:


Yes, it is contained inside the Decision tree implementation. The PR is here: 
https://github.com/apache/flink/pull/710

I'm waiting for Till's review for now.

> Implement an online histogram with Merging and equalization features
> 
>
> Key: FLINK-2030
> URL: https://issues.apache.org/jira/browse/FLINK-2030
> Project: Flink
>  Issue Type: Sub-task
>  Components: Machine Learning Library
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>Priority: Minor
>  Labels: ML
>
> For the implementation of the decision tree in 
> https://issues.apache.org/jira/browse/FLINK-1727, we need to implement a 
> histogram with online updates, merging and equalization features. A reference 
> implementation is provided in [1].
> [1] http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf
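
A minimal sketch of the update and merge procedures in the style of the referenced paper: bins are (centroid, count) pairs, and whenever the bin budget is exceeded the two closest bins are replaced by their weighted average. The class `OnlineHistogram` is hypothetical and is not the code in the PR; the equalization step is omitted.

```java
import java.util.ArrayList;
import java.util.List;

class OnlineHistogram {
    private final int maxBins;
    // Each bin is {centroid, count}; the list is kept sorted by centroid.
    private final List<double[]> bins = new ArrayList<>();

    OnlineHistogram(int maxBins) {
        if (maxBins < 1) throw new IllegalArgumentException("maxBins must be >= 1");
        this.maxBins = maxBins;
    }

    /** Online update: add one observation. */
    void add(double value) {
        insert(value, 1.0);
        shrink();
    }

    /** Merge: fold all bins of another histogram into this one. */
    void merge(OnlineHistogram other) {
        for (double[] bin : other.bins) insert(bin[0], bin[1]);
        shrink();
    }

    private void insert(double centroid, double count) {
        int i = 0;
        while (i < bins.size() && bins.get(i)[0] < centroid) i++;
        if (i < bins.size() && bins.get(i)[0] == centroid) {
            bins.get(i)[1] += count;
        } else {
            bins.add(i, new double[] {centroid, count});
        }
    }

    /** Repeatedly merges the two closest neighbours until the budget is met. */
    private void shrink() {
        while (bins.size() > maxBins) {
            int best = 0;
            for (int i = 1; i < bins.size() - 1; i++) {
                double gap = bins.get(i + 1)[0] - bins.get(i)[0];
                if (gap < bins.get(best + 1)[0] - bins.get(best)[0]) best = i;
            }
            double[] a = bins.get(best);
            double[] b = bins.remove(best + 1);
            double total = a[1] + b[1];
            a[0] = (a[0] * a[1] + b[0] * b[1]) / total; // weighted centroid
            a[1] = total;
        }
    }

    int binCount() {
        return bins.size();
    }

    double total() {
        double t = 0;
        for (double[] bin : bins) t += bin[1];
        return t;
    }
}
```

Since merging reuses the same shrink step as updating, histograms built on separate partitions can be combined without revisiting the data.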





[jira] [Created] (FLINK-2153) Exclude dependency on hbase annotations module

2015-06-03 Thread JIRA
Márton Balassi created FLINK-2153:
-

 Summary: Exclude dependency on hbase annotations module
 Key: FLINK-2153
 URL: https://issues.apache.org/jira/browse/FLINK-2153
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 0.9
Reporter: Márton Balassi


[ERROR] Failed to execute goal on project flink-hbase: Could not resolve
dependencies for project org.apache.flink:flink-hbase:jar:0.9-SNAPSHOT:
Could not find artifact jdk.tools:jdk.tools:jar:1.7 at specified path
/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/../lib/tools.jar

There is a Spark issue for this [1] with a solution [2].

[1] https://issues.apache.org/jira/browse/SPARK-4455
[2] https://github.com/apache/spark/pull/3286/files





[jira] [Assigned] (FLINK-2116) Make pipeline extension require less coding

2015-06-03 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-2116:


Assignee: Till Rohrmann

> Make pipeline extension require less coding
> ---
>
> Key: FLINK-2116
> URL: https://issues.apache.org/jira/browse/FLINK-2116
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Mikio Braun
>Assignee: Till Rohrmann
>Priority: Minor
>
> Right now, implementing pipeline methods for new types, or even 
> adding new methods to pipelines, requires many steps:
> 1) implementing methods for new types
>   - implement an implicit of the corresponding class encapsulating the 
>     operation in the companion object
> 2) adding methods to the pipeline
>   - add a method
>   - add a trait for the operation
>   - implement an implicit in the companion object
> These are all objects with many generic parameters, so reducing the 
> work would be great.
> The goal should be that you can focus on the code to add, with as 
> little boilerplate code as possible.





[jira] [Commented] (FLINK-2150) Add a library method that assigns unique Long values to vertices

2015-06-03 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570954#comment-14570954
 ] 

Vasia Kalavri commented on FLINK-2150:
--

Great! Let me know if you want to discuss the implementation :)

> Add a library method that assigns unique Long values to vertices
> 
>
> Key: FLINK-2150
> URL: https://issues.apache.org/jira/browse/FLINK-2150
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Priority: Minor
>  Labels: starter
>
> In some graph algorithms, it is required to initialize the vertex values with 
> unique values (e.g. label propagation).
> This issue proposes adding a Gelly library method that receives an input 
> graph and initializes its vertex values with unique Long values.
> This method can then also be used to improve the MusicProfiles example.
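
The issue does not say how the unique values should be generated. One common coordination-free scheme, shown here only as an assumption and not as Gelly's API, is to let each of p parallel workers hand out the values i, i + p, i + 2p, ... where i is the worker's 0-based index. The class name UniqueIdAssigner and its parameters are hypothetical.

```java
/** Hypothetical sketch, not part of Gelly: unique Long ids without coordination between workers. */
final class UniqueIdAssigner {
    private final long workerIndex;  // 0-based index of this parallel worker
    private final long parallelism;  // total number of parallel workers
    private long localCounter;       // number of ids this worker has handed out so far

    UniqueIdAssigner(long workerIndex, long parallelism) {
        if (parallelism <= 0 || workerIndex < 0 || workerIndex >= parallelism) {
            throw new IllegalArgumentException("need 0 <= workerIndex < parallelism");
        }
        this.workerIndex = workerIndex;
        this.parallelism = parallelism;
    }

    /**
     * Returns workerIndex + parallelism * k for k = 0, 1, 2, ...
     * Ids of different workers differ modulo parallelism, so they never collide
     * (overflow of long is ignored in this sketch).
     */
    long nextId() {
        return workerIndex + parallelism * localCounter++;
    }
}
```

With two workers, the first produces 0, 2, 4, ... and the second produces 1, 3, 5, ..., so the resulting vertex values are unique but not necessarily contiguous if the workers see different numbers of vertices.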





[jira] [Assigned] (FLINK-2139) Test Streaming Outputformats

2015-06-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi reassigned FLINK-2139:
-

Assignee: Márton Balassi

> Test Streaming Outputformats
> 
>
> Key: FLINK-2139
> URL: https://issues.apache.org/jira/browse/FLINK-2139
> Project: Flink
>  Issue Type: Test
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Márton Balassi
> Fix For: 0.9
>
>
> Currently the only tested streaming core output is writeAsText, and that 
> is only tested indirectly in integration tests.





[jira] [Updated] (FLINK-2144) Implement count, average, and variance for windows

2015-06-03 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-2144:
---
Description: 
By count I mean the number of elements in the window.

These can be implemented very efficiently building on FLINK-2143:
Store: O(1)
Evict: O(1)
emitWindow: O(1)

  was:
By count I mean the number of elements in the window.

These can be implemented very efficiently building on FLINK-2143:
Store: O(1)
Evict: O(1)
emitWindow: O(1)
memory: O(1)


> Implement count, average, and variance for windows
> --
>
> Key: FLINK-2144
> URL: https://issues.apache.org/jira/browse/FLINK-2144
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Reporter: Gabor Gevay
>Assignee: Gabor Gevay
>Priority: Minor
>  Labels: statistics
>
> By count I mean the number of elements in the window.
> These can be implemented very efficiently building on FLINK-2143:
> Store: O(1)
> Evict: O(1)
> emitWindow: O(1)
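
As a rough illustration of why all three operations can be O(1), the sketch below keeps a running count, sum and sum of squares. It is an assumption about one possible approach, not the FLINK-2143 interface: the class WindowStats and its method names are hypothetical, and evict expects the caller to pass the value that leaves the window.

```java
/** Hypothetical sketch, not Flink code: O(1) count, average and variance for a window. */
final class WindowStats {
    private long count;    // number of elements currently in the window
    private double sum;    // sum of the elements in the window
    private double sumSq;  // sum of the squared elements in the window

    /** An element enters the window: O(1). */
    void store(double x) {
        count++;
        sum += x;
        sumSq += x * x;
    }

    /** An element leaves the window; the caller supplies its value: O(1). */
    void evict(double x) {
        count--;
        sum -= x;
        sumSq -= x * x;
    }

    long count() {
        return count;
    }

    /** Mean of the window, or NaN if the window is empty: O(1). */
    double average() {
        return count == 0 ? Double.NaN : sum / count;
    }

    /** Population variance E[x^2] - E[x]^2, clamped at 0 against rounding error: O(1). */
    double variance() {
        if (count == 0) {
            return Double.NaN;
        }
        double mean = sum / count;
        return Math.max(0.0, sumSq / count - mean * mean);
    }
}
```

The sum-of-squares formula is simple but can lose precision when the values are large relative to their spread, so a production version might prefer a numerically more stable update.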





[jira] [Commented] (FLINK-2136) Test the streaming scala API

2015-06-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570958#comment-14570958
 ] 

ASF GitHub Bot commented on FLINK-2136:
---

GitHub user gaborhermann opened a pull request:

https://github.com/apache/flink/pull/771

[wip] [FLINK-2136] Adding DataStream tests for Scala API

* Added tests for scala DataStream
* Added tests for setting parallelism
* Fixed some small bugs in scala API

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gaborhermann/flink FLINK-2136

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/771.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #771


commit 34707118fec52536c87febac237429f1ab71925e
Author: Gábor Hermann 
Date:   2015-06-03T10:55:41Z

[FLINK-2136] [streaming] Added test for scala DataStream

commit 000e29b78646bbb347ea5ae9a77b9f60dad2e46b
Author: Gábor Hermann 
Date:   2015-06-03T14:39:46Z

[FLINK-2136] [streaming] Added parallelism test to DataStream




> Test the streaming scala API
> 
>
> Key: FLINK-2136
> URL: https://issues.apache.org/jira/browse/FLINK-2136
> Project: Flink
>  Issue Type: Test
>  Components: Scala API, Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Assignee: Gábor Hermann
>
> There are no tests covering the streaming Scala API. I suggest testing 
> whether the StreamGraph created by a given operation looks as expected. 
> Deeper layers and the runtime should not be tested here; that is done in 
> streaming-core.





[GitHub] flink pull request: [wip] [FLINK-2136] Adding DataStream tests for...

2015-06-03 Thread gaborhermann
GitHub user gaborhermann opened a pull request:

https://github.com/apache/flink/pull/771

[wip] [FLINK-2136] Adding DataStream tests for Scala API

* Added tests for scala DataStream
* Added tests for setting parallelism
* Fixed some small bugs in scala API

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gaborhermann/flink FLINK-2136

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/771.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #771


commit 34707118fec52536c87febac237429f1ab71925e
Author: Gábor Hermann 
Date:   2015-06-03T10:55:41Z

[FLINK-2136] [streaming] Added test for scala DataStream

commit 000e29b78646bbb347ea5ae9a77b9f60dad2e46b
Author: Gábor Hermann 
Date:   2015-06-03T14:39:46Z

[FLINK-2136] [streaming] Added parallelism test to DataStream




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

