[jira] [Commented] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251365#comment-15251365 ] Fabian Hueske commented on FLINK-3781: -- [~readman] Yes, definitely :-) Thanks for your contribution. I gave you contributor permissions as well. You can now assign issues to yourself if you'd like to continue contributing.
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Chenguang He
> Priority: Minor
> Fix For: 1.1.0
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
>   delete(key);
>   BlobClient bc = createClient();
>   bc.delete(key);
>   bc.close();
> }
> {code}
> If delete() throws an IOException, the BlobClient would be left unclosed.
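For context, a minimal sketch of how such a leak can be avoided with try-with-resources, assuming {{BlobClient}} implements {{java.io.Closeable}} (its {{close()}} method suggests it does); this is illustrative, not necessarily the exact patch that was merged:
{code}
public void deleteGlobal(BlobKey key) throws IOException {
  // remove the blob from the local cache first
  delete(key);
  // try-with-resources guarantees bc.close() runs even if delete() throws
  try (BlobClient bc = createClient()) {
    bc.delete(key);
  }
}
{code}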
[jira] [Updated] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-3781: - Assignee: Chenguang He
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Chenguang He
> Priority: Minor
> Fix For: 1.1.0
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
>   delete(key);
>   BlobClient bc = createClient();
>   bc.delete(key);
>   bc.close();
> }
> {code}
> If delete() throws an IOException, the BlobClient would be left unclosed.
[jira] [Commented] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251355#comment-15251355 ] ASF GitHub Bot commented on FLINK-3229: --- Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/1911#discussion_r60531075
--- Diff: flink-streaming-connectors/pom.xml ---
@@ -45,6 +45,7 @@ under the License.
 flink-connector-rabbitmq
 flink-connector-twitter
 flink-connector-nifi
+flink-connector-kinesis
--- End diff --
Thanks, I missed the "include-kinesis" profile defined below. We'll probably need a more general profile name in the future though (e.g. include-aws-connectors), for example when we start including more Amazon-licensed libraries for other connectors, such as DynamoDB.
> Kinesis streaming consumer with integration of Flink's checkpointing mechanics
> ---
>
> Key: FLINK-3229
> URL: https://issues.apache.org/jira/browse/FLINK-3229
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming Connectors
> Affects Versions: 1.0.0
> Reporter: Tzu-Li (Gordon) Tai
> Assignee: Tzu-Li (Gordon) Tai
>
> Opening a sub-task to implement the data source consumer for the Kinesis streaming
> connector (https://issues.apache.org/jira/browse/FLINK-3211).
> An example of the planned user API for the Flink Kinesis Consumer:
> {code}
> Properties kinesisConfig = new Properties();
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, "BASIC");
> kinesisConfig.put(
>   KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID,
>   "aws_access_key_id_here");
> kinesisConfig.put(
>   KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY,
>   "aws_secret_key_here");
> kinesisConfig.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, "LATEST"); // or TRIM_HORIZON
> DataStream<String> kinesisRecords = env.addSource(new FlinkKinesisConsumer<>(
>   "kinesis_stream_name",
>   new SimpleStringSchema(),
>   kinesisConfig));
> {code}
[GitHub] flink pull request: [FLINK-3229] Flink streaming consumer for AWS ...
Github user tzulitai commented on a diff in the pull request: https://github.com/apache/flink/pull/1911#discussion_r60531075
--- Diff: flink-streaming-connectors/pom.xml ---
@@ -45,6 +45,7 @@ under the License.
 flink-connector-rabbitmq
 flink-connector-twitter
 flink-connector-nifi
+flink-connector-kinesis
--- End diff --
Thanks, I missed the "include-kinesis" profile defined below. We'll probably need a more general profile name in the future though (e.g. include-aws-connectors), for example when we start including more Amazon-licensed libraries for other connectors, such as DynamoDB.
[jira] [Closed] (FLINK-3629) In wikiedits Quick Start example, "The first call, .window()" should be "The first call, .timeWindow()"
[ https://issues.apache.org/jira/browse/FLINK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Fanxi closed FLINK-3629.
> In wikiedits Quick Start example, "The first call, .window()" should be "The
> first call, .timeWindow()"
> ---
>
> Key: FLINK-3629
> URL: https://issues.apache.org/jira/browse/FLINK-3629
> Project: Flink
> Issue Type: Bug
> Components: Quickstarts
> Affects Versions: 1.0.0
> Reporter: Li Fanxi
> Priority: Trivial
> Fix For: 1.0.0, 1.0.1
>
> Original Estimate: 5m
> Remaining Estimate: 5m
>
> On the Quick Start page
> (https://ci.apache.org/projects/flink/flink-docs-release-1.0/quickstart/run_example_quickstart.html):
> {quote}
> The first call, .window(), specifies that we want to have tumbling
> (non-overlapping) windows of five seconds.
> {quote}
> This is not consistent with the sample code. In the sample code, the
> {code:java}KeyedStream.timeWindow(){code} function is used.
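For reference, the quickstart code that the corrected sentence describes looks roughly like the snippet below (a close paraphrase of the quickstart, not a verbatim copy); it is the {{.timeWindow()}} call that creates the tumbling five-second windows:
{code}
DataStream<Tuple2<String, Long>> result = keyedEdits
  // tumbling (non-overlapping) windows of five seconds
  .timeWindow(Time.seconds(5))
  // sum up the byte diff per user within each window
  .fold(new Tuple2<>("", 0L), new FoldFunction<WikipediaEditEvent, Tuple2<String, Long>>() {
    @Override
    public Tuple2<String, Long> fold(Tuple2<String, Long> acc, WikipediaEditEvent event) {
      acc.f0 = event.getUser();
      acc.f1 += event.getByteDiff();
      return acc;
    }
  });
{code}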
[jira] [Commented] (FLINK-3789) Overload methods which trigger program execution to allow naming job
[ https://issues.apache.org/jira/browse/FLINK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251023#comment-15251023 ] Greg Hogan commented on FLINK-3789: --- I was thinking about Clustering Coefficient, for which we return the local clustering coefficient for each vertex as a DataSet via a GraphAlgorithm, and it would also be nice to compute the global clustering coefficient, which would need to access accumulators. Both the local and the global clustering coefficient count triangles, so there is certainly an advantage in computing the two simultaneously, but there is extra cost for each, so we should allow separate computation. So there is a need to do similar things as collect and count, but still allow the user to perform the execute themselves (which of course allows direct configuration of the job name) so that they can compose multiple algorithms and analytics. Perhaps instead of overloading these functions we can provide alternative, slightly more sophisticated options which would allow configuring a job name. In many ways the current implementation of count, collect, print, and checksum is very limiting because you can only perform that single action per job. You can't print and count, or print and write. The current DataSet API works well because it's simple, but I think we could expand on this.
> Overload methods which trigger program execution to allow naming job
> ---
>
> Key: FLINK-3789
> URL: https://issues.apache.org/jira/browse/FLINK-3789
> Project: Flink
> Issue Type: Improvement
> Components: Java API
> Affects Versions: 1.1.0
> Reporter: Greg Hogan
> Assignee: Greg Hogan
> Priority: Minor
>
> Overload the following functions to additionally accept a job name to pass to
> {{ExecutionEnvironment.execute(String)}}:
> * {{DataSet.collect()}}
> * {{DataSet.count()}}
> * {{DataSetUtils.checksumHashCode(DataSet)}}
> * {{GraphUtils.checksumHashCode(Graph)}}
> Once the deprecated {{DataSet.print(String)}} and
> {{DataSet.printToErr(String)}} are removed, we can overload
> {{DataSet.print()}}.
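For illustration, a sketch of what one such overload could look like; it mirrors the body of the existing {{DataSet.count()}} implementation and differs only in forwarding a caller-supplied job name (the {{String}} parameter is the proposal, not existing API):
{code}
public long count(String jobName) throws Exception {
  final String id = new AbstractID().toString();
  // attach an accumulator-backed sink that counts all records
  output(new Utils.CountHelper<T>(id)).name("count()");
  // trigger execution under the caller-supplied job name
  JobExecutionResult res = getExecutionEnvironment().execute(jobName);
  return res.<Long>getAccumulatorResult(id);
}
{code}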
[jira] [Commented] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250980#comment-15250980 ] Chenguang He commented on FLINK-3781: - Hello Fabian, could you please assign this to me? This is the first time I have tried to contribute via JIRA, and I really want to remember this moment.
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Priority: Minor
> Fix For: 1.1.0
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
>   delete(key);
>   BlobClient bc = createClient();
>   bc.delete(key);
>   bc.close();
> }
> {code}
> If delete() throws an IOException, the BlobClient would be left unclosed.
[jira] [Closed] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske closed FLINK-3665. Resolution: Implemented
Implemented with d8fb23085839a514fbd0b56cc681dd52aac503ac
Thanks for the contribution [~dawidwys]!
> Range partitioning lacks support to define sort orders
> ---
>
> Key: FLINK-3665
> URL: https://issues.apache.org/jira/browse/FLINK-3665
> Project: Flink
> Issue Type: Improvement
> Components: DataSet API
> Affects Versions: 1.0.0
> Reporter: Fabian Hueske
> Assignee: Dawid Wysakowicz
> Fix For: 1.1.0
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of
> fields. This is fine if range partitioning is used to reduce skewed
> partitioning.
> However, it is not sufficient if range partitioning is used to sort a data
> set in parallel.
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily
> changed, I propose to add a method {{withOrders(Order... orders)}} to
> {{PartitionOperator}}. The method should throw an exception if the
> partitioning method of {{PartitionOperator}} is not range partitioning.
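For illustration, a minimal usage sketch of the new method (the data set here is made up; {{withOrders()}} takes one {{Order}} per partitioning key):
{code}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
DataSet<Tuple2<Integer, String>> data = env.fromElements(
    Tuple2.of(3, "c"), Tuple2.of(1, "a"), Tuple2.of(2, "b"));

// Range-partition on field 0 with a fixed sort order, then sort each
// partition locally: together this yields a parallel total sort.
data.partitionByRange(0)
    .withOrders(Order.ASCENDING)
    .sortPartition(0, Order.ASCENDING)
    .print();
{code}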
[jira] [Resolved] (FLINK-2998) Support range partition comparison for multi input nodes.
[ https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-2998. Resolution: Implemented Fix Version/s: 1.1.0
Implemented with 605b6d870d0da38fb5446675709c7243127cdff1
Thanks for the contribution!
> Support range partition comparison for multi input nodes.
> ---
>
> Key: FLINK-2998
> URL: https://issues.apache.org/jira/browse/FLINK-2998
> Project: Flink
> Issue Type: New Feature
> Components: Optimizer
> Reporter: Chengxiang Li
> Assignee: GaoLun
> Priority: Minor
> Fix For: 1.1.0
>
> The optimizer may have an opportunity to optimize the DAG when it finds that
> two input range partitionings are equivalent; we do not support this
> comparison yet.
[jira] [Updated] (FLINK-2998) Support range partition comparison for multi input nodes.
[ https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2998: - Assignee: GaoLun
> Support range partition comparison for multi input nodes.
> ---
>
> Key: FLINK-2998
> URL: https://issues.apache.org/jira/browse/FLINK-2998
> Project: Flink
> Issue Type: New Feature
> Components: Optimizer
> Reporter: Chengxiang Li
> Assignee: GaoLun
> Priority: Minor
>
> The optimizer may have an opportunity to optimize the DAG when it finds that
> two input range partitionings are equivalent; we do not support this
> comparison yet.
[jira] [Resolved] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske resolved FLINK-3781. Resolution: Fixed Fix Version/s: 1.1.0
Fixed with a43bade0d87d373ff236152d13d8d56e17220605
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Priority: Minor
> Fix For: 1.1.0
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
>   delete(key);
>   BlobClient bc = createClient();
>   bc.delete(key);
>   bc.close();
> }
> {code}
> If delete() throws an IOException, the BlobClient would be left unclosed.
[jira] [Commented] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250953#comment-15250953 ] ASF GitHub Bot commented on FLINK-3665: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1848
> Range partitioning lacks support to define sort orders
> ---
>
> Key: FLINK-3665
> URL: https://issues.apache.org/jira/browse/FLINK-3665
> Project: Flink
> Issue Type: Improvement
> Components: DataSet API
> Affects Versions: 1.0.0
> Reporter: Fabian Hueske
> Assignee: Dawid Wysakowicz
> Fix For: 1.1.0
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of
> fields. This is fine if range partitioning is used to reduce skewed
> partitioning.
> However, it is not sufficient if range partitioning is used to sort a data
> set in parallel.
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily
> changed, I propose to add a method {{withOrders(Order... orders)}} to
> {{PartitionOperator}}. The method should throw an exception if the
> partitioning method of {{PartitionOperator}} is not range partitioning.
[jira] [Commented] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250955#comment-15250955 ] ASF GitHub Bot commented on FLINK-3781: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1908
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Priority: Minor
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
>   delete(key);
>   BlobClient bc = createClient();
>   bc.delete(key);
>   bc.close();
> }
> {code}
> If delete() throws an IOException, the BlobClient would be left unclosed.
[jira] [Commented] (FLINK-2998) Support range partition comparison for multi input nodes.
[ https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250954#comment-15250954 ] ASF GitHub Bot commented on FLINK-2998: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1838
> Support range partition comparison for multi input nodes.
> ---
>
> Key: FLINK-2998
> URL: https://issues.apache.org/jira/browse/FLINK-2998
> Project: Flink
> Issue Type: New Feature
> Components: Optimizer
> Reporter: Chengxiang Li
> Priority: Minor
>
> The optimizer may have an opportunity to optimize the DAG when it finds that
> two input range partitionings are equivalent; we do not support this
> comparison yet.
[GitHub] flink pull request: [FLINK-2998] Support range partition compariso...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1838
[GitHub] flink pull request: [FLINK-3781] BlobClient may be left unclosed i...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1908
[GitHub] flink pull request: [FLINK-3665] Implemented sort orders support i...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1848
[GitHub] flink pull request: Flink 3691 extend avroinputformat to support g...
GitHub user gna-phetsarath opened a pull request: https://github.com/apache/flink/pull/1920
Flink 3691 extend avroinputformat to support generic records
Thanks for contributing to Apache Flink. Before you open your pull request, please take the following checklist into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes.
- [ ] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message
- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added
- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis build has passed
You can merge this pull request into a Git repository by running:
  $ git pull https://github.com/gna-phetsarath/flink FLINK-3691-extend_avroinputformat_to_support_generic_records
Alternatively you can review and apply these changes as the patch at:
  https://github.com/apache/flink/pull/1920.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
  This closes #1920
commit 78b16a080105a188fcfc0f2a1731b87857f4080f
  Author: Phetsarath, Sourigna
  Date: 2016-04-05T22:01:24Z
  [FLINK-3691] Extend AvroInputFormat to support Avro GenericRecord
commit d122c6b0e125af39163514d286fe7abdbf16765d
  Author: Phetsarath, Sourigna
  Date: 2016-04-06T19:21:06Z
  [FLINK-3691] Extend AvroInputFormat to support Avro GenericRecord. Fixed style issue after running verify.
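The PR title suggests usage along the lines of the sketch below; the {{AvroInputFormat(Path, Class)}} constructor already exists, while the path and surrounding setup are made-up illustrations (the actual API changes live in the PR diff, not shown here):
{code}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// read Avro files as GenericRecord, without generated record classes
AvroInputFormat<GenericRecord> format = new AvroInputFormat<>(
    new Path("hdfs:///data/events.avro"), GenericRecord.class);
DataSet<GenericRecord> records = env.createInput(format);
{code}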
[jira] [Commented] (FLINK-2157) Create evaluation framework for ML library
[ https://issues.apache.org/jira/browse/FLINK-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250825#comment-15250825 ] ASF GitHub Bot commented on FLINK-2157: --- Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/1849#issuecomment-212633912
Also two quick issues.
*pipelines*
```scala
val scaler = MinMaxScaler()
val pipeline = scaler.chainPredictor(mlr)
val evaluationDS = survivalLV.map(x => (x.vector, x.label))
pipeline.fit(survivalLV)
scorer.evaluate(evaluationDS, pipeline).collect().head
```
When using this with a ChainedPredictor as the predictor I get the following error:
error: could not find implicit value for parameter evaluateOperation: org.apache.flink.ml.pipeline.EvaluateDataSetOperation[org.apache.flink.ml.pipeline.ChainedPredictor[org.apache.flink.ml.preprocessing.MinMaxScaler,org.apache.flink.ml.regression.MultipleLinearRegression],(org.apache.flink.ml.math.Vector, Double),Double]
*MinMaxScaler()*
Merging for me broke the following code:
```scala
val scaler = MinMaxScaler()
val scaledSurvivalLV = scaler.transform(survivalLV)
```
With the following error (omitting part of the stack trace):
Caused by: java.lang.NoSuchMethodError: breeze.linalg.Vector$.scalarOf()Lbreeze/linalg/support/ScalarOf;
at org.apache.flink.ml.preprocessing.MinMaxScaler$$anonfun$3.apply(MinMaxScaler.scala:156)
at org.apache.flink.ml.preprocessing.MinMaxScaler$$anonfun$3.apply(MinMaxScaler.scala:154)
at org.apache.flink.api.scala.DataSet$$anon$7.reduce(DataSet.scala:584)
at org.apache.flink.runtime.operators.chaining.ChainedAllReduceDriver.collect(ChainedAllReduceDriver.java:93)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
at org.apache.flink.runtime.operators.MapDriver.run(MapDriver.java:97)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
I'm looking for a workaround. Just saying I found a regression. Other than that, it looks/works AWESOME. Well done.
> Create evaluation framework for ML library
> ---
>
> Key: FLINK-2157
> URL: https://issues.apache.org/jira/browse/FLINK-2157
> Project: Flink
> Issue Type: New Feature
> Components: Machine Learning Library
> Reporter: Till Rohrmann
> Assignee: Theodore Vasiloudis
> Labels: ML
> Fix For: 1.0.0
>
> Currently, FlinkML lacks the means to evaluate the performance of trained models.
> It would be great to add some {{Evaluators}} which can calculate a score
> based on the information about true and predicted labels. This could also be
> used for cross-validation to choose the right hyperparameters.
> Possible scores could be the F score [1], zero-one-loss score, etc.
> Resources
> [1] [http://en.wikipedia.org/wiki/F1_score]
[GitHub] flink pull request: [FLINK-2157] [ml] Create evaluation framework ...
Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/1849#issuecomment-212633912
Also two quick issues.
*pipelines*
```scala
val scaler = MinMaxScaler()
val pipeline = scaler.chainPredictor(mlr)
val evaluationDS = survivalLV.map(x => (x.vector, x.label))
pipeline.fit(survivalLV)
scorer.evaluate(evaluationDS, pipeline).collect().head
```
When using this with a ChainedPredictor as the predictor I get the following error:
error: could not find implicit value for parameter evaluateOperation: org.apache.flink.ml.pipeline.EvaluateDataSetOperation[org.apache.flink.ml.pipeline.ChainedPredictor[org.apache.flink.ml.preprocessing.MinMaxScaler,org.apache.flink.ml.regression.MultipleLinearRegression],(org.apache.flink.ml.math.Vector, Double),Double]
*MinMaxScaler()*
Merging for me broke the following code:
```scala
val scaler = MinMaxScaler()
val scaledSurvivalLV = scaler.transform(survivalLV)
```
With the following error (omitting part of the stack trace):
Caused by: java.lang.NoSuchMethodError: breeze.linalg.Vector$.scalarOf()Lbreeze/linalg/support/ScalarOf;
at org.apache.flink.ml.preprocessing.MinMaxScaler$$anonfun$3.apply(MinMaxScaler.scala:156)
at org.apache.flink.ml.preprocessing.MinMaxScaler$$anonfun$3.apply(MinMaxScaler.scala:154)
at org.apache.flink.api.scala.DataSet$$anon$7.reduce(DataSet.scala:584)
at org.apache.flink.runtime.operators.chaining.ChainedAllReduceDriver.collect(ChainedAllReduceDriver.java:93)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
at org.apache.flink.runtime.operators.MapDriver.run(MapDriver.java:97)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
I'm looking for a workaround. Just saying I found a regression. Other than that, it looks/works AWESOME. Well done.
[jira] [Updated] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-3665: - Assignee: Dawid Wysakowicz
> Range partitioning lacks support to define sort orders
> ---
>
> Key: FLINK-3665
> URL: https://issues.apache.org/jira/browse/FLINK-3665
> Project: Flink
> Issue Type: Improvement
> Components: DataSet API
> Affects Versions: 1.0.0
> Reporter: Fabian Hueske
> Assignee: Dawid Wysakowicz
> Fix For: 1.1.0
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of
> fields. This is fine if range partitioning is used to reduce skewed
> partitioning.
> However, it is not sufficient if range partitioning is used to sort a data
> set in parallel.
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily
> changed, I propose to add a method {{withOrders(Order... orders)}} to
> {{PartitionOperator}}. The method should throw an exception if the
> partitioning method of {{PartitionOperator}} is not range partitioning.
[jira] [Commented] (FLINK-2998) Support range partition comparison for multi input nodes.
[ https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250727#comment-15250727 ] ASF GitHub Bot commented on FLINK-2998: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1838#issuecomment-212608296 Merging this PR
> Support range partition comparison for multi input nodes.
> ---
>
> Key: FLINK-2998
> URL: https://issues.apache.org/jira/browse/FLINK-2998
> Project: Flink
> Issue Type: New Feature
> Components: Optimizer
> Reporter: Chengxiang Li
> Priority: Minor
>
> The optimizer may have an opportunity to optimize the DAG when it finds that
> two input range partitionings are equivalent; we do not support this
> comparison yet.
[GitHub] flink pull request: [FLINK-3665] Implemented sort orders support i...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-212608472 Merging this PR
[jira] [Commented] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250730#comment-15250730 ] ASF GitHub Bot commented on FLINK-3781: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1908#issuecomment-212608557 Merging this PR
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Priority: Minor
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
>   delete(key);
>   BlobClient bc = createClient();
>   bc.delete(key);
>   bc.close();
> }
> {code}
> If delete() throws an IOException, the BlobClient would be left unclosed.
[GitHub] flink pull request: [FLINK-3781] BlobClient may be left unclosed i...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1908#issuecomment-212608557 Merging this PR
[jira] [Commented] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250729#comment-15250729 ] ASF GitHub Bot commented on FLINK-3665: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-212608472 Merging this PR
> Range partitioning lacks support to define sort orders
> ---
>
> Key: FLINK-3665
> URL: https://issues.apache.org/jira/browse/FLINK-3665
> Project: Flink
> Issue Type: Improvement
> Components: DataSet API
> Affects Versions: 1.0.0
> Reporter: Fabian Hueske
> Fix For: 1.1.0
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of
> fields. This is fine if range partitioning is used to reduce skewed
> partitioning.
> However, it is not sufficient if range partitioning is used to sort a data
> set in parallel.
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily
> changed, I propose to add a method {{withOrders(Order... orders)}} to
> {{PartitionOperator}}. The method should throw an exception if the
> partitioning method of {{PartitionOperator}} is not range partitioning.
[GitHub] flink pull request: [FLINK-2998] Support range partition compariso...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1838#issuecomment-212608296 Merging this PR
[jira] [Commented] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250645#comment-15250645 ] ASF GitHub Bot commented on FLINK-3665: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-212590297 Thanks @dawidwys, unrelated test failures are OK. I'll run the tests again after rebasing and merging the PR.
> Range partitioning lacks support to define sort orders
> ---
>
> Key: FLINK-3665
> URL: https://issues.apache.org/jira/browse/FLINK-3665
> Project: Flink
> Issue Type: Improvement
> Components: DataSet API
> Affects Versions: 1.0.0
> Reporter: Fabian Hueske
> Fix For: 1.1.0
>
> {{DataSet.partitionByRange()}} does not allow specifying the sort order of
> fields. This is fine if range partitioning is used to reduce skewed
> partitioning.
> However, it is not sufficient if range partitioning is used to sort a data
> set in parallel.
> Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily
> changed, I propose to add a method {{withOrders(Order... orders)}} to
> {{PartitionOperator}}. The method should throw an exception if the
> partitioning method of {{PartitionOperator}} is not range partitioning.
[GitHub] flink pull request: [FLINK-3665] Implemented sort orders support i...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-212590297 Thanks @dawidwys, unrelated test failures are OK. I'll run the tests again after rebasing and merging the PR.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250610#comment-15250610 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1917#issuecomment-212583555 Thanks for the PR. Looks good. I had a few minor comments. I also marked a few tests that could be removed because they do not increase test coverage, IMO. In general, we should be more careful when adding tests. Builds start to fail on Travis because the 2h threshold is exceeded.
> Add support for embedded streaming SQL (projection, filter, union)
> ---
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this
> issue tracks the support for SQL embedded in stream Table API programs. The
> only currently supported operations on streaming Tables are projection,
> filtering, and union.
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1917#issuecomment-212583555 Thanks for the PR. Looks good. I had a few minor comments. I also marked a few tests that could be removed because they do not increase test coverage, IMO. In general, we should be more careful when adding tests. Builds start to fail on Travis because the 2h threshold is exceeded.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250602#comment-15250602 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477843
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/FilterITCase.scala ---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.sql.streaming.test
+
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData}
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class FilterITCase extends StreamingMultipleProgramsTestBase {
+
+  @Test
+  def testSimpleFilter(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM * FROM MyTable WHERE a = 3"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList("3,2,Hello world")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testAllRejectingFilter(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM * FROM MyTable WHERE false"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    assertEquals(true, StreamITCase.testResults.isEmpty)
+  }
+
+  @Test
+  def testAllPassingFilter(): Unit = {
--- End diff --
This test might be removed.
> Add support for embedded streaming SQL (projection, filter, union)
> ---
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this
> issue tracks the support for SQL embedded in stream Table API programs. The
> only currently supported operations on streaming Tables are projection,
> filtering, and union.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250606#comment-15250606 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477908
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/SelectITCase.scala ---
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.sql.streaming.test
+
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData}
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class SelectITCase extends StreamingMultipleProgramsTestBase {
+
+  @Test
+  def testSelectStarFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM * FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList(
+      "1,1,Hi",
+      "2,2,Hello",
+      "3,2,Hello world")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testSelectFirstFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM a FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList("1", "2", "3")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testSelectExpressionFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM a * 2, b - 1 FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList("2,0", "4,1", "6,1")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+
+  @Test
+  def testSelectSecondFromDataStream(): Unit = {
--- End diff --
This test might be removed.
> Add support for embedded streaming SQL (projection, filter, union)
> ---
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this
> issue tracks the support for SQL embedded in stream Table API programs. The
> only currently supported operations on streaming Tables are projection,
> filtering, and union.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250604#comment-15250604 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477897
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/SelectITCase.scala ---
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.sql.streaming.test
+
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData}
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class SelectITCase extends StreamingMultipleProgramsTestBase {
+
+  @Test
+  def testSelectStarFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM * FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList(
+      "1,1,Hi",
+      "2,2,Hello",
+      "3,2,Hello world")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testSelectFirstFromTable(): Unit = {
--- End diff --
This test might be removed.
> Add support for embedded streaming SQL (projection, filter, union)
> ---
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this
> issue tracks the support for SQL embedded in stream Table API programs. The
> only currently supported operations on streaming Tables are projection,
> filtering, and union.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250600#comment-15250600 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477818
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/FilterITCase.scala ---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.sql.streaming.test
+
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData}
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class FilterITCase extends StreamingMultipleProgramsTestBase {
+
+  @Test
+  def testSimpleFilter(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM * FROM MyTable WHERE a = 3"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList("3,2,Hello world")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testAllRejectingFilter(): Unit = {
--- End diff --
This test might be removed.
> Add support for embedded streaming SQL (projection, filter, union)
> ---
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this
> issue tracks the support for SQL embedded in stream Table API programs. The
> only currently supported operations on streaming Tables are projection,
> filtering, and union.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250603#comment-15250603 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477888
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/SelectITCase.scala ---
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.sql.streaming.test
+
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData}
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class SelectITCase extends StreamingMultipleProgramsTestBase {
+
+  @Test
+  def testSelectStarFromTable(): Unit = {
--- End diff --
This test might be removed.
> Add support for embedded streaming SQL (projection, filter, union)
> ---
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this
> issue tracks the support for SQL embedded in stream Table API programs. The
> only currently supported operations on streaming Tables are projection,
> filtering, and union.
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477908
--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/SelectITCase.scala ---
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.sql.streaming.test
+
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData}
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class SelectITCase extends StreamingMultipleProgramsTestBase {
+
+  @Test
+  def testSelectStarFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM * FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList(
+      "1,1,Hi",
+      "2,2,Hello",
+      "3,2,Hello world")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testSelectFirstFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM a FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList("1", "2", "3")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+  @Test
+  def testSelectExpressionFromTable(): Unit = {
+
+    val env = StreamExecutionEnvironment.getExecutionEnvironment
+    val tEnv = TableEnvironment.getTableEnvironment(env)
+    StreamITCase.testResults = mutable.MutableList()
+
+    val sqlQuery = "SELECT STREAM a * 2, b - 1 FROM MyTable"
+
+    val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c)
+    tEnv.registerTable("MyTable", t)
+
+    val result = tEnv.sql(sqlQuery).toDataStream[Row]
+    result.addSink(new StreamITCase.StringSink)
+    env.execute()
+
+    val expected = mutable.MutableList("2,0", "4,1", "6,1")
+    assertEquals(expected.sorted, StreamITCase.testResults.sorted)
+  }
+
+
+  @Test
+  def testSelectSecondFromDataStream(): Unit = {
--- End diff --
This test might be removed.
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477888 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/SelectITCase.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.sql.streaming.test + +import org.apache.flink.api.scala._ +import org.apache.flink.api.scala.table._ +import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData} +import org.apache.flink.api.table.{Row, TableEnvironment} +import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment +import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase +import org.junit.Assert._ +import org.junit._ + +import scala.collection.mutable + +class SelectITCase extends StreamingMultipleProgramsTestBase { + + @Test + def testSelectStarFromTable(): Unit = { --- End diff -- This test might be removed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477897 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/SelectITCase.scala --- @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.sql.streaming.test + +import org.apache.flink.api.scala._ +import org.apache.flink.api.scala.table._ +import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData} +import org.apache.flink.api.table.{Row, TableEnvironment} +import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment +import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase +import org.junit.Assert._ +import org.junit._ + +import scala.collection.mutable + +class SelectITCase extends StreamingMultipleProgramsTestBase { + + @Test + def testSelectStarFromTable(): Unit = { + +val env = StreamExecutionEnvironment.getExecutionEnvironment +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.testResults = mutable.MutableList() + +val sqlQuery = "SELECT STREAM * FROM MyTable" + +val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv) +tEnv.registerTable("MyTable", t) + +val result = tEnv.sql(sqlQuery).toDataStream[Row] +result.addSink(new StreamITCase.StringSink) +env.execute() + +val expected = mutable.MutableList( + "1,1,Hi", + "2,2,Hello", + "3,2,Hello world") +assertEquals(expected.sorted, StreamITCase.testResults.sorted) + } + + @Test + def testSelectFirstFromTable(): Unit = { --- End diff -- This test might be removed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Assigned] (FLINK-3794) Add checks for unsupported operations in streaming table API
[ https://issues.apache.org/jira/browse/FLINK-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasia Kalavri reassigned FLINK-3794: Assignee: Vasia Kalavri > Add checks for unsupported operations in streaming table API > > > Key: FLINK-3794 > URL: https://issues.apache.org/jira/browse/FLINK-3794 > Project: Flink > Issue Type: Improvement > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Unsupported operations on streaming tables currently fail during plan > translation. It would be nicer to add checks in the Table API methods and > fail with an informative message that the operation is not supported. The > operations that are not currently supported are: > - aggregations inside select > - groupBy > - distinct > - join > We can simply check if the Table's environment is a streaming environment and > throw an unsupported operation exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
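For illustration, a minimal, self-contained sketch of the proposed check; the stub classes below are hypothetical stand-ins for Flink's real `Table` and environment classes, and the exact message wording is assumed:
```scala
// Stub environments standing in for the real classes, to keep the sketch self-contained.
abstract class TableEnvironmentStub
class BatchTableEnvironmentStub extends TableEnvironmentStub
class StreamTableEnvironmentStub extends TableEnvironmentStub

class TableStub(val tableEnv: TableEnvironmentStub) {

  // The check the issue proposes: reject unsupported operators eagerly,
  // with an informative message, instead of failing during plan translation.
  private def checkNotStreaming(operation: String): Unit = tableEnv match {
    case _: StreamTableEnvironmentStub =>
      throw new UnsupportedOperationException(
        s"$operation is currently not supported on streaming tables.")
    case _ => // batch environment: operation is supported
  }

  def groupBy(fields: Symbol*): TableStub = {
    checkNotStreaming("groupBy")
    this // existing translation logic would follow here
  }
}
```
The same one-line guard would be repeated in `distinct`, `join`, and the aggregating `select` variants listed above.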
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477843 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/FilterITCase.scala --- @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.sql.streaming.test + +import org.apache.flink.api.scala._ +import org.apache.flink.api.scala.table._ +import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData} +import org.apache.flink.api.table.{Row, TableEnvironment} +import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment +import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase +import org.junit.Assert._ +import org.junit._ + +import scala.collection.mutable + +class FilterITCase extends StreamingMultipleProgramsTestBase { + + @Test + def testSimpleFilter(): Unit = { + +val env = StreamExecutionEnvironment.getExecutionEnvironment +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.testResults = mutable.MutableList() + +val sqlQuery = "SELECT STREAM * FROM MyTable WHERE a = 3" + +val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c) +tEnv.registerTable("MyTable", t) + +val result = tEnv.sql(sqlQuery).toDataStream[Row] +result.addSink(new StreamITCase.StringSink) +env.execute() + +val expected = mutable.MutableList("3,2,Hello world") +assertEquals(expected.sorted, StreamITCase.testResults.sorted) + } + + @Test + def testAllRejectingFilter(): Unit = { + +val env = StreamExecutionEnvironment.getExecutionEnvironment +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.testResults = mutable.MutableList() + +val sqlQuery = "SELECT STREAM * FROM MyTable WHERE false" + +val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv) +tEnv.registerTable("MyTable", t) + +val result = tEnv.sql(sqlQuery).toDataStream[Row] +result.addSink(new StreamITCase.StringSink) +env.execute() + +assertEquals(true, StreamITCase.testResults.isEmpty) + } + + @Test + def testAllPassingFilter(): Unit = { --- End diff -- This test might be removed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60477818 --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/sql/streaming/test/FilterITCase.scala --- @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.scala.sql.streaming.test + +import org.apache.flink.api.scala._ +import org.apache.flink.api.scala.table._ +import org.apache.flink.api.scala.table.streaming.test.utils.{StreamITCase, StreamTestData} +import org.apache.flink.api.table.{Row, TableEnvironment} +import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment +import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase +import org.junit.Assert._ +import org.junit._ + +import scala.collection.mutable + +class FilterITCase extends StreamingMultipleProgramsTestBase { + + @Test + def testSimpleFilter(): Unit = { + +val env = StreamExecutionEnvironment.getExecutionEnvironment +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.testResults = mutable.MutableList() + +val sqlQuery = "SELECT STREAM * FROM MyTable WHERE a = 3" + +val t = StreamTestData.getSmall3TupleDataStream(env).toTable(tEnv).as('a, 'b, 'c) +tEnv.registerTable("MyTable", t) + +val result = tEnv.sql(sqlQuery).toDataStream[Row] +result.addSink(new StreamITCase.StringSink) +env.execute() + +val expected = mutable.MutableList("3,2,Hello world") +assertEquals(expected.sorted, StreamITCase.testResults.sorted) + } + + @Test + def testAllRejectingFilter(): Unit = { --- End diff -- This test might be removed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-3665] Implemented sort orders support i...
Github user dawidwys commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-212581664 I think I have applied all the changes. Unfortunately, the Travis build fails and I don't know why; the test that fails has nothing in common with my changes, and it also passes locally. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3665) Range partitioning lacks support to define sort orders
[ https://issues.apache.org/jira/browse/FLINK-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250596#comment-15250596 ] ASF GitHub Bot commented on FLINK-3665: --- Github user dawidwys commented on the pull request: https://github.com/apache/flink/pull/1848#issuecomment-212581664 I think I have applied all the changes. Unfortunately, the Travis build fails and I don't know why; the test that fails has nothing in common with my changes, and it also passes locally. > Range partitioning lacks support to define sort orders > -- > > Key: FLINK-3665 > URL: https://issues.apache.org/jira/browse/FLINK-3665 > Project: Flink > Issue Type: Improvement > Components: DataSet API >Affects Versions: 1.0.0 >Reporter: Fabian Hueske > Fix For: 1.1.0 > > > {{DataSet.partitionByRange()}} does not allow specifying the sort order of > fields. This is fine if range partitioning is used to reduce partition skew. > However, it is not sufficient if range partitioning is used to sort a data > set in parallel. > Since {{DataSet.partitionByRange()}} is {{@Public}} API and cannot be easily > changed, I propose to add a method {{withOrders(Order... orders)}} to > {{PartitionOperator}}. The method should throw an exception if the > partitioning method of {{PartitionOperator}} is not range partitioning. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
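For context, a hedged sketch of how the proposed method would read. `partitionByRange()` is existing DataSet API; `withOrders()` exists only as a proposal in this issue, so the snippet does not compile against the 1.0 API and its final shape may differ:
```scala
// Uses the Java DataSet API from Scala, since the proposal targets the Java
// PartitionOperator class. `withOrders` is the proposed addition.
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.java.ExecutionEnvironment
import org.apache.flink.api.java.tuple.Tuple2

val env = ExecutionEnvironment.getExecutionEnvironment
val data = env.fromElements(Tuple2.of(3, "c"), Tuple2.of(1, "a"), Tuple2.of(2, "b"))

val partitioned = data
  .partitionByRange(0)          // existing API: returns a PartitionOperator
  .withOrders(Order.ASCENDING)  // proposed: one Order per range-partition key field
```
Combined with a per-partition sort, this would give a parallel, globally ordered result, which is the use case the issue describes.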
[jira] [Assigned] (FLINK-3790) Rolling File sink does not pick up hadoop configuration
[ https://issues.apache.org/jira/browse/FLINK-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-3790: - Assignee: Gyula Fora > Rolling File sink does not pick up hadoop configuration > --- > > Key: FLINK-3790 > URL: https://issues.apache.org/jira/browse/FLINK-3790 > Project: Flink > Issue Type: Bug > Components: Streaming Connectors >Affects Versions: 1.0.2 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > > In the rolling file sink function, a new hadoop configuration is created to > get the FileSystem every time, which completely ignores the hadoop config set > in flink-conf.yaml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
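As a rough illustration of the fix direction (a sketch under stated assumptions, not the actual change in PR #1919), the sink would need to pick up the cluster's Hadoop resources instead of instantiating a bare default configuration:
```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Resolve the cluster's Hadoop settings instead of relying on the defaults
// that a bare `new Configuration()` picks up from the classpath. How the
// config directory is located is an assumption here.
def hadoopConfFrom(hadoopConfDir: String): Configuration = {
  val conf = new Configuration()
  conf.addResource(new Path(hadoopConfDir, "core-site.xml"))
  conf.addResource(new Path(hadoopConfDir, "hdfs-site.xml"))
  conf
}

// Usage: new Path("hdfs:///out").getFileSystem(hadoopConfFrom("/etc/hadoop/conf"))
```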
[jira] [Assigned] (FLINK-3798) Clean up RocksDB state backend access modifiers
[ https://issues.apache.org/jira/browse/FLINK-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-3798: - Assignee: Gyula Fora > Clean up RocksDB state backend access modifiers > --- > > Key: FLINK-3798 > URL: https://issues.apache.org/jira/browse/FLINK-3798 > Project: Flink > Issue Type: Improvement > Components: Streaming Connectors >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Minor > > The RocksDB state backend uses a lot of package-private methods and fields, > which makes it very hard to subclass the different parts for added functionality. > I think these should be protected instead. > Also, the AbstractRocksDBState declares some methods final when there are > use cases where a subclass might want to override them. > Just to give an example: I am creating a version of the value state which > would keep a small cache on heap. For this it would be enough to subclass the > RocksDBStateBackend and RocksDBValueState classes if the above-mentioned > changes were made. As it stands, I have to use reflection to access > package-private fields and actually copy classes due to the final methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
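To make the motivation concrete, a self-contained sketch of the kind of on-heap caching layer that `protected`, non-`final` members would allow; all names below are hypothetical stand-ins, not the actual RocksDB backend classes:
```scala
// All names here are hypothetical stand-ins for the RocksDB backend classes.
abstract class SketchValueState[V] {
  // If this were package-private (Java default access), a subclass in another
  // package could not touch it; `protected` keeps it open to extensions.
  protected var defaultValue: Option[V] = None

  // Deliberately non-final, so a subclass can layer caching on top.
  def value(): V
}

// The caching extension the issue describes: check a small on-heap cache
// first, fall back to the (simulated) RocksDB lookup only on a miss.
class CachedSketchValueState[V](lookup: () => V) extends SketchValueState[V] {
  private var cached: Option[V] = None

  override def value(): V = cached.getOrElse {
    val v = lookup() // would be the RocksDB read in the real backend
    cached = Some(v)
    v
  }
}
```
With package-private fields and `final` methods, this extension is only possible via reflection and copied classes, which is exactly the pain point reported above.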
[GitHub] flink pull request: [FLINK-3790] [streaming] Use proper hadoop con...
GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/1919 [FLINK-3790] [streaming] Use proper hadoop config in rolling sink You can merge this pull request into a Git repository by running: $ git pull https://github.com/gyfora/flink rolling-conf Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1919.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1919 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3790) Rolling File sink does not pick up hadoop configuration
[ https://issues.apache.org/jira/browse/FLINK-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250570#comment-15250570 ] ASF GitHub Bot commented on FLINK-3790: --- GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/1919 [FLINK-3790] [streaming] Use proper hadoop config in rolling sink You can merge this pull request into a Git repository by running: $ git pull https://github.com/gyfora/flink rolling-conf Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1919.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1919 > Rolling File sink does not pick up hadoop configuration > --- > > Key: FLINK-3790 > URL: https://issues.apache.org/jira/browse/FLINK-3790 > Project: Flink > Issue Type: Bug > Components: Streaming Connectors >Affects Versions: 1.0.2 >Reporter: Gyula Fora >Priority: Critical > > In the rolling file sink function, a new hadoop configuration is created to > get the FileSystem every time, which completely ignores the hadoop config set > in flink-conf.yaml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250559#comment-15250559 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60473805 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala --- @@ -98,8 +98,42 @@ abstract class TableEnvironment(val config: TableConfig) { checkValidTableName(name) -val tableTable = new TableTable(table.getRelNode) -registerTableInternal(name, tableTable) +table.tableEnv match { + case e: BatchTableEnvironment => +val tableTable = new TableTable(table.getRelNode) +registerTableInternal(name, tableTable) + case e: StreamTableEnvironment => +val sTableTable = new TransStreamTable(table.getRelNode, true) +tables.add(name, sTableTable) +} + + } + + protected def registerStreamTableInternal(name: String, table: AbstractTable): Unit = { + +if (isRegistered(name)) { + throw new TableException(s"Table \'$name\' already exists. " + +s"Please, choose a different name.") +} else { + tables.add(name, table) +} + } + + /** + * Replaces a registered Table with another Table under the same name. + * We use this method to replace a [[org.apache.flink.api.table.plan.schema.DataStreamTable]] + * with a [[org.apache.calcite.schema.TranslatableTable]]. + * + * @param name + * @param table + */ + protected def replaceRegisteredStreamTable(name: String, table: AbstractTable): Unit = { --- End diff -- rename to `replaceRegisteredTable()` since it is not specific for stream tables? > Add support for embedded streaming SQL (projection, filter, union) > -- > > Key: FLINK-3727 > URL: https://issues.apache.org/jira/browse/FLINK-3727 > Project: Flink > Issue Type: Sub-task > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Similar to the support for SQL embedded in batch Table API programs, this > issue tracks the support for SQL embedded in stream Table API programs. The > only currently supported operations on streaming Tables are projection, > filtering, and union. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60473805 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala --- @@ -98,8 +98,42 @@ abstract class TableEnvironment(val config: TableConfig) { checkValidTableName(name) -val tableTable = new TableTable(table.getRelNode) -registerTableInternal(name, tableTable) +table.tableEnv match { + case e: BatchTableEnvironment => +val tableTable = new TableTable(table.getRelNode) +registerTableInternal(name, tableTable) + case e: StreamTableEnvironment => +val sTableTable = new TransStreamTable(table.getRelNode, true) +tables.add(name, sTableTable) +} + + } + + protected def registerStreamTableInternal(name: String, table: AbstractTable): Unit = { + +if (isRegistered(name)) { + throw new TableException(s"Table \'$name\' already exists. " + +s"Please, choose a different name.") +} else { + tables.add(name, table) +} + } + + /** + * Replaces a registered Table with another Table under the same name. + * We use this method to replace a [[org.apache.flink.api.table.plan.schema.DataStreamTable]] + * with a [[org.apache.calcite.schema.TranslatableTable]]. + * + * @param name + * @param table + */ + protected def replaceRegisteredStreamTable(name: String, table: AbstractTable): Unit = { --- End diff -- rename to `replaceRegisteredTable()` since it is not specific for stream tables? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250557#comment-15250557 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60473701 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala --- @@ -98,8 +98,42 @@ abstract class TableEnvironment(val config: TableConfig) { checkValidTableName(name) -val tableTable = new TableTable(table.getRelNode) -registerTableInternal(name, tableTable) +table.tableEnv match { + case e: BatchTableEnvironment => +val tableTable = new TableTable(table.getRelNode) +registerTableInternal(name, tableTable) + case e: StreamTableEnvironment => +val sTableTable = new TransStreamTable(table.getRelNode, true) +tables.add(name, sTableTable) +} + + } + + protected def registerStreamTableInternal(name: String, table: AbstractTable): Unit = { --- End diff -- same as `registerTableInternal()`, no? So can be removed, IMO. > Add support for embedded streaming SQL (projection, filter, union) > -- > > Key: FLINK-3727 > URL: https://issues.apache.org/jira/browse/FLINK-3727 > Project: Flink > Issue Type: Sub-task > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Similar to the support for SQL embedded in batch Table API programs, this > issue tracks the support for SQL embedded in stream Table API programs. The > only currently supported operations on streaming Tables are projection, > filtering, and union. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60473701 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/TableEnvironment.scala --- @@ -98,8 +98,42 @@ abstract class TableEnvironment(val config: TableConfig) { checkValidTableName(name) -val tableTable = new TableTable(table.getRelNode) -registerTableInternal(name, tableTable) +table.tableEnv match { + case e: BatchTableEnvironment => +val tableTable = new TableTable(table.getRelNode) +registerTableInternal(name, tableTable) + case e: StreamTableEnvironment => +val sTableTable = new TransStreamTable(table.getRelNode, true) +tables.add(name, sTableTable) +} + + } + + protected def registerStreamTableInternal(name: String, table: AbstractTable): Unit = { --- End diff -- same as `registerTableInternal()`, no? So can be removed, IMO. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3798) Clean up RocksDB state backend access modifiers
[ https://issues.apache.org/jira/browse/FLINK-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250554#comment-15250554 ] ASF GitHub Bot commented on FLINK-3798: --- GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/1918 [FLINK-3798] [streaming] Clean up RocksDB backend field/method access The RocksDB state backend uses a lot of package-private methods and fields, which makes it very hard to subclass the different parts for added functionality. I think these should be protected instead. Also, the AbstractRocksDBState declares some methods final when there are use cases where a subclass might want to override them. Just to give an example: I am creating a version of the value state which would keep a small cache on heap. For this it would be enough to subclass the RocksDBStateBackend and RocksDBValueState classes if the above-mentioned changes were made. As it stands, I have to use reflection to access package-private fields and actually copy classes due to the final methods. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gyfora/flink rocks-access Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1918.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1918 commit a1e323650bd5f603fbf3809fe1fe901cbab1a56b Author: Gyula Fora Date: 2016-04-20T19:33:30Z [FLINK-3798] [streaming] Clean up RocksDB backend field/method access > Clean up RocksDB state backend access modifiers > --- > > Key: FLINK-3798 > URL: https://issues.apache.org/jira/browse/FLINK-3798 > Project: Flink > Issue Type: Improvement > Components: Streaming Connectors >Reporter: Gyula Fora >Priority: Minor > > The RocksDB state backend uses a lot of package-private methods and fields, > which makes it very hard to subclass the different parts for added functionality. > I think these should be protected instead. > Also, the AbstractRocksDBState declares some methods final when there are > use cases where a subclass might want to override them. > Just to give an example: I am creating a version of the value state which > would keep a small cache on heap. For this it would be enough to subclass the > RocksDBStateBackend and RocksDBValueState classes if the above-mentioned > changes were made. As it stands, I have to use reflection to access > package-private fields and actually copy classes due to the final methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3798] [streaming] Clean up RocksDB back...
GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/1918 [FLINK-3798] [streaming] Clean up RocksDB backend field/method access The RocksDB state backend uses a lot of package-private methods and fields, which makes it very hard to subclass the different parts for added functionality. I think these should be protected instead. Also, the AbstractRocksDBState declares some methods final when there are use cases where a subclass might want to override them. Just to give an example: I am creating a version of the value state which would keep a small cache on heap. For this it would be enough to subclass the RocksDBStateBackend and RocksDBValueState classes if the above-mentioned changes were made. As it stands, I have to use reflection to access package-private fields and actually copy classes due to the final methods. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gyfora/flink rocks-access Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1918.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1918 commit a1e323650bd5f603fbf3809fe1fe901cbab1a56b Author: Gyula Fora Date: 2016-04-20T19:33:30Z [FLINK-3798] [streaming] Clean up RocksDB backend field/method access --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250550#comment-15250550 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60473002 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/StreamTableEnvironment.scala --- @@ -112,19 +122,31 @@ abstract class StreamTableEnvironment(config: TableConfig) extends TableEnvironm * * @param name The name under which the table is registered in the catalog. * @param dataStream The [[DataStream]] to register as table in the catalog. +* @param wrapper True if the registration has to wrap the datastreamTable + *into a [[org.apache.calcite.schema.StreamableTable]] --- End diff -- comment indention > Add support for embedded streaming SQL (projection, filter, union) > -- > > Key: FLINK-3727 > URL: https://issues.apache.org/jira/browse/FLINK-3727 > Project: Flink > Issue Type: Sub-task > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Similar to the support for SQL embedded in batch Table API programs, this > issue tracks the support for SQL embedded in stream Table API programs. The > only currently supported operations on streaming Tables are projection, > filtering, and union. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60473002 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/StreamTableEnvironment.scala --- @@ -112,19 +122,31 @@ abstract class StreamTableEnvironment(config: TableConfig) extends TableEnvironm * * @param name The name under which the table is registered in the catalog. * @param dataStream The [[DataStream]] to register as table in the catalog. +* @param wrapper True if the registration has to wrap the datastreamTable + *into a [[org.apache.calcite.schema.StreamableTable]] --- End diff -- comment indention --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60472854 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/StreamTableEnvironment.scala --- @@ -86,6 +86,8 @@ abstract class StreamTableEnvironment(config: TableConfig) extends TableEnvironm if (isRegistered(tableName)) { relBuilder.scan(tableName) + //val delta: LogicalDelta = LogicalDelta.create(relBuilder.build()) --- End diff -- can be removed? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250549#comment-15250549 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60472854 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/StreamTableEnvironment.scala --- @@ -86,6 +86,8 @@ abstract class StreamTableEnvironment(config: TableConfig) extends TableEnvironm if (isRegistered(tableName)) { relBuilder.scan(tableName) + //val delta: LogicalDelta = LogicalDelta.create(relBuilder.build()) --- End diff -- can be removed? > Add support for embedded streaming SQL (projection, filter, union) > -- > > Key: FLINK-3727 > URL: https://issues.apache.org/jira/browse/FLINK-3727 > Project: Flink > Issue Type: Sub-task > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Similar to the support for SQL embedded in batch Table API programs, this > issue tracks the support for SQL embedded in stream Table API programs. The > only currently supported operations on streaming Tables are projection, > filtering, and union. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250547#comment-15250547 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60472476 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/java/table/StreamTableEnvironment.scala --- @@ -83,11 +83,54 @@ class StreamTableEnvironment( .toArray val name = createUniqueTableName() -registerDataStreamInternal(name, dataStream, exprs) +registerDataStreamInternal(name, dataStream, exprs, false) ingest(name) } /** + * Registers the given [[DataStream]] as table in the --- End diff -- nitpicking, ScalaDoc indents the `*` under the second `*` of the first line, i.e., ``` /** * */ ``` > Add support for embedded streaming SQL (projection, filter, union) > -- > > Key: FLINK-3727 > URL: https://issues.apache.org/jira/browse/FLINK-3727 > Project: Flink > Issue Type: Sub-task > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Similar to the support for SQL embedded in batch Table API programs, this > issue tracks the support for SQL embedded in stream Table API programs. The > only currently supported operations on streaming Tables are projection, > filtering, and union. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60472476 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/java/table/StreamTableEnvironment.scala --- @@ -83,11 +83,54 @@ class StreamTableEnvironment( .toArray val name = createUniqueTableName() -registerDataStreamInternal(name, dataStream, exprs) +registerDataStreamInternal(name, dataStream, exprs, false) ingest(name) } /** + * Registers the given [[DataStream]] as table in the --- End diff -- nitpicking, ScalaDoc indents the `*` under the second `*` of the first line, i.e., ``` /** * */ ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (FLINK-3798) Clean up RocksDB state backend access modifiers
Gyula Fora created FLINK-3798: - Summary: Clean up RocksDB state backend access modifiers Key: FLINK-3798 URL: https://issues.apache.org/jira/browse/FLINK-3798 Project: Flink Issue Type: Improvement Components: Streaming Connectors Reporter: Gyula Fora Priority: Minor The RocksDB state backend uses a lot of package-private methods and fields, which makes it very hard to subclass the different parts for added functionality. I think these should be protected instead. Also, the AbstractRocksDBState declares some methods final when there are use cases where a subclass might want to override them. Just to give an example: I am creating a version of the value state which would keep a small cache on heap. For this it would be enough to subclass the RocksDBStateBackend and RocksDBValueState classes if the above-mentioned changes were made. As it stands, I have to use reflection to access package-private fields and actually copy classes due to the final methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60471750 --- Diff: docs/apis/batch/libs/table.md --- @@ -419,8 +438,8 @@ Only the types `LONG` and `STRING` can be casted to `DATE` and vice versa. A `LO SQL The Table API also supports embedded SQL queries. -In order to use a `Table` or `DataSet` in a SQL query, it has to be registered in the `TableEnvironment`, using a unique name. -A registered `Table` can be retrieved back from the `TableEnvironment` using the `scan()` method: +In order to use a `Table`, `DataSet`, or `DataStream` in a SQL query, it has to be registered in the `TableEnvironment`, using a unique name. +A registered Dataset `Table` can be retrieved back from the `TableEnvironment` using the `scan()` method: --- End diff -- Add a sentence about the `ingest()` method. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250542#comment-15250542 ] ASF GitHub Bot commented on FLINK-3727: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1917#discussion_r60471750 --- Diff: docs/apis/batch/libs/table.md --- @@ -419,8 +438,8 @@ Only the types `LONG` and `STRING` can be casted to `DATE` and vice versa. A `LO SQL The Table API also supports embedded SQL queries. -In order to use a `Table` or `DataSet` in a SQL query, it has to be registered in the `TableEnvironment`, using a unique name. -A registered `Table` can be retrieved back from the `TableEnvironment` using the `scan()` method: +In order to use a `Table`, `DataSet`, or `DataStream` in a SQL query, it has to be registered in the `TableEnvironment`, using a unique name. +A registered Dataset `Table` can be retrieved back from the `TableEnvironment` using the `scan()` method: --- End diff -- Add a sentence about the `ingest()` method. > Add support for embedded streaming SQL (projection, filter, union) > -- > > Key: FLINK-3727 > URL: https://issues.apache.org/jira/browse/FLINK-3727 > Project: Flink > Issue Type: Sub-task > Components: Table API >Affects Versions: 1.1.0 >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > > Similar to the support for SQL embedded in batch Table API programs, this > issue tracks the support for SQL embedded in stream Table API programs. The > only currently supported operations on streaming Tables are projection, > filtering, and union. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
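For reference, the suggested sentence would contrast the two retrieval paths roughly as follows; this is a sketch based on the API in this PR, with `ingest()` taken from the PR description:
```scala
import org.apache.flink.api.scala._
import org.apache.flink.api.table.TableEnvironment
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)

// ...register a DataStream under the name "MyTable" as shown in this PR...

// scan("MyTable") retrieves a registered DataSet table in batch programs;
// ingest("MyTable") is the streaming counterpart the comment asks to document.
val t = tEnv.ingest("MyTable")
```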
[jira] [Commented] (FLINK-2157) Create evaluation framework for ML library
[ https://issues.apache.org/jira/browse/FLINK-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250529#comment-15250529 ] ASF GitHub Bot commented on FLINK-2157: --- Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/1849#issuecomment-212563858 Are there going to be usage docs on this? > Create evaluation framework for ML library > -- > > Key: FLINK-2157 > URL: https://issues.apache.org/jira/browse/FLINK-2157 > Project: Flink > Issue Type: New Feature > Components: Machine Learning Library >Reporter: Till Rohrmann >Assignee: Theodore Vasiloudis > Labels: ML > Fix For: 1.0.0 > > > Currently, FlinkML lacks the means to evaluate the performance of trained models. > It would be great to add some {{Evaluators}} which can calculate a score > based on the true and predicted labels. This could also be > used for cross-validation to choose the right hyperparameters. > Possible scores could be the F score [1], the zero-one loss, etc. > Resources > [1] [http://en.wikipedia.org/wiki/F1_score] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
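For readers unfamiliar with the proposed scores, a minimal self-contained sketch (not FlinkML's eventual evaluation API) of the two metrics the issue names, computed from (trueLabel, predictedLabel) pairs:
```scala
// Degenerate cases (empty input, no positive predictions) are ignored for brevity.
def zeroOneLoss(pairs: Seq[(Int, Int)]): Double =
  pairs.count { case (t, p) => t != p }.toDouble / pairs.size

def f1Score(pairs: Seq[(Int, Int)], positive: Int = 1): Double = {
  val tp = pairs.count { case (t, p) => t == positive && p == positive }
  val fp = pairs.count { case (t, p) => t != positive && p == positive }
  val fn = pairs.count { case (t, p) => t == positive && p != positive }
  val precision = tp.toDouble / (tp + fp)
  val recall    = tp.toDouble / (tp + fn)
  2 * precision * recall / (precision + recall)
}

// Example: zeroOneLoss(Seq((1, 1), (0, 1), (1, 0))) == 2.0 / 3
```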
[GitHub] flink pull request: [FLINK-2157] [ml] Create evaluation framework ...
Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/1849#issuecomment-212563858 Are there going to be usage docs on this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-3708] Scala API for CEP (initial).
Github user StefanRRichter commented on a diff in the pull request: https://github.com/apache/flink/pull/1905#discussion_r60452526 --- Diff: flink-libraries/flink-cep-scala/src/main/scala/org/apache/flink/cep/scala/pattern/package.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.cep.scala + +import org.apache.flink.cep.pattern.{FollowedByPattern => JFollowedByPattern, Pattern => JPattern} + +import _root_.scala.reflect.ClassTag + +package object pattern { + /** +* Utility method to wrap { @link org.apache.flink.cep.pattern.Pattern} and its subclasses +* for usage with the Scala API. +* +* @param javaPattern The underlying pattern from the Java API +* @tparam T Base type of the elements appearing in the pattern +* @tparam F Subtype of T to which the current pattern operator is constrained +* @return A pattern from the Scala API which wraps the pattern from the Java API +*/ + private[flink] def wrapPattern[ + T: ClassTag, F <: T : ClassTag](javaPattern: JPattern[T, F]) + : Pattern[T, F] = javaPattern match { +case f: JFollowedByPattern[T, F] => FollowedByPattern[T, F](f) +case p: JPattern[T, F] => Pattern[T, F](p) +case _ => null --- End diff -- I am not sure if this case should trigger an exception, because the Java API is actually allowed to return null here in case there is no previous pattern. Our pattern will then wrap this as an undefined Option. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
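The "undefined Option" behavior mentioned here comes from Scala's `Option.apply`, which maps `null` to `None`; a tiny generic illustration (not the actual CEP classes):
```scala
// Option(x) yields None when x is null, so a null from the Java side is
// absorbed as an "undefined Option" rather than surfacing as an error.
def wrapNullable[A <: AnyRef](fromJava: A): Option[A] = Option(fromJava)

object NullWrapDemo extends App {
  println(wrapNullable[String](null)) // None -> no previous pattern
  println(wrapNullable("previous"))   // Some(previous)
}
```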
[jira] [Commented] (FLINK-3708) Scala API for CEP
[ https://issues.apache.org/jira/browse/FLINK-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250334#comment-15250334 ] ASF GitHub Bot commented on FLINK-3708: --- Github user StefanRRichter commented on a diff in the pull request: https://github.com/apache/flink/pull/1905#discussion_r60452526 --- Diff: flink-libraries/flink-cep-scala/src/main/scala/org/apache/flink/cep/scala/pattern/package.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.cep.scala + +import org.apache.flink.cep.pattern.{FollowedByPattern => JFollowedByPattern, Pattern => JPattern} + +import _root_.scala.reflect.ClassTag + +package object pattern { + /** +* Utility method to wrap { @link org.apache.flink.cep.pattern.Pattern} and its subclasses +* for usage with the Scala API. +* +* @param javaPattern The underlying pattern from the Java API +* @tparam T Base type of the elements appearing in the pattern +* @tparam F Subtype of T to which the current pattern operator is constrained +* @return A pattern from the Scala API which wraps the pattern from the Java API +*/ + private[flink] def wrapPattern[ + T: ClassTag, F <: T : ClassTag](javaPattern: JPattern[T, F]) + : Pattern[T, F] = javaPattern match { +case f: JFollowedByPattern[T, F] => FollowedByPattern[T, F](f) +case p: JPattern[T, F] => Pattern[T, F](p) +case _ => null --- End diff -- I am not sure if this case should trigger an exception, because the Java API is actually allowed to return null here in case there is no previous pattern. Our pattern will then wrap this as an undefined Option. > Scala API for CEP > - > > Key: FLINK-3708 > URL: https://issues.apache.org/jira/browse/FLINK-3708 > Project: Flink > Issue Type: Improvement > Components: CEP >Affects Versions: 1.1.0 >Reporter: Till Rohrmann >Assignee: Stefan Richter > > Currently, the CEP library does not support Scala case classes, because the > {{TypeExtractor}} cannot handle them. In order to support them, it would be > necessary to offer a Scala API for the CEP library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3797) Restart flow button
Maxim Dobryakov created FLINK-3797: -- Summary: Restart flow button Key: FLINK-3797 URL: https://issues.apache.org/jira/browse/FLINK-3797 Project: Flink Issue Type: Improvement Components: Web Client Affects Versions: 1.0.1 Reporter: Maxim Dobryakov Priority: Minor It would be better to have a "Restart" button in the Flink Dashboard for restarting a flow (create a savepoint, cancel the flow, run a new flow from the created savepoint). Such functionality can be useful for starting a flow on new task managers after downtime or after adding new nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3796) FileSourceFunction doesn't respect InputFormat's life cycle methods
Maximilian Michels created FLINK-3796: - Summary: FileSourceFunction doesn't respect InputFormat's life cycle methods Key: FLINK-3796 URL: https://issues.apache.org/jira/browse/FLINK-3796 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 1.0.1, 1.0.0, 1.0.2 Reporter: Maximilian Michels Assignee: Maximilian Michels Fix For: 1.1.0 The {{FileSourceFunction}} wraps {{InputFormat}} but doesn't execute its life cycle functions correctly. 1) It doesn't call {{close()}} before reading the next InputSplit via {{open(InputSplit split)}}. 2) It calls {{close()}} even if no InputSplit has been read (and thus open(..) hasn't been called previously). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
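For clarity, a hedged sketch of the lifecycle contract the issue describes, written as a simplified driver loop rather than Flink's actual `FileSourceFunction`:
```scala
import org.apache.flink.api.common.io.InputFormat
import org.apache.flink.core.io.InputSplit

// Simplified driver loop, not the real FileSourceFunction: every open(split)
// is paired with exactly one close(), and close() never runs for a format
// that was never opened (the two violations the issue lists).
def readAllSplits[T](format: InputFormat[T, InputSplit], splits: Seq[InputSplit]): Unit = {
  for (split <- splits) {
    format.open(split)
    try {
      while (!format.reachedEnd()) {
        val record = format.nextRecord(null.asInstanceOf[T])
        // ... emit `record` downstream ...
      }
    } finally {
      format.close() // 1) close the current split before the next open()
    }
  }
  // 2) if `splits` is empty, open() never ran, so close() is never called either
}
```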
[jira] [Created] (FLINK-3795) Cancel button is hidden on MacBook Pro 13''
Maxim Dobryakov created FLINK-3795: -- Summary: Cancel button is hidden on MacBook Pro 13'' Key: FLINK-3795 URL: https://issues.apache.org/jira/browse/FLINK-3795 Project: Flink Issue Type: Improvement Components: Web Client Affects Versions: 1.0.1 Reporter: Maxim Dobryakov Priority: Minor The Flink Dashboard doesn't show the "Cancel" button on a MacBook Pro 13'', so it is not possible to use it. Screenshot [here|http://i.imgur.com/5IH7caH.png]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3794) Add checks for unsupported operations in streaming table API
Vasia Kalavri created FLINK-3794: Summary: Add checks for unsupported operations in streaming table API Key: FLINK-3794 URL: https://issues.apache.org/jira/browse/FLINK-3794 Project: Flink Issue Type: Improvement Components: Table API Affects Versions: 1.1.0 Reporter: Vasia Kalavri Unsupported operations on streaming tables currently fail during plan translation. It would be nicer to add checks in the Table API methods and fail with an informative message that the operation is not supported. The operations that are not currently supported are: - aggregations inside select - groupBy - distinct - join We can simply check if the Table's environment is a streaming environment and throw an unsupported operation exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3793) Re-organize the Table API and SQL docs
Vasia Kalavri created FLINK-3793: Summary: Re-organize the Table API and SQL docs Key: FLINK-3793 URL: https://issues.apache.org/jira/browse/FLINK-3793 Project: Flink Issue Type: Bug Components: Documentation, Table API Affects Versions: 1.1.0 Reporter: Vasia Kalavri Now that we have added SQL and soon streaming SQL support, we need to reorganize the Table API documentation. - The current guide is under "apis/batch/libs". We should either split it into a streaming and a batch part or move it to under "apis". The second option might be preferable, as the batch and stream APIs have a lot in common. - The current guide has separate sections for Java and Scala APIs. These can be merged and organized with tabs, like other parts of the docs. - Mentions of "Table API" can be renamed to "Table API and SQL", e.g. in the software stack figure and homepage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250214#comment-15250214 ] ASF GitHub Bot commented on FLINK-3229: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/1911#discussion_r60443897 --- Diff: flink-streaming-connectors/pom.xml --- @@ -45,6 +45,7 @@ under the License. flink-connector-rabbitmq flink-connector-twitter flink-connector-nifi + flink-connector-kinesis --- End diff -- I think we have to remove this line again. The module is included in the profile below (you have to activate the "include-kinesis" maven build profile) > Kinesis streaming consumer with integration of Flink's checkpointing mechanics > -- > > Key: FLINK-3229 > URL: https://issues.apache.org/jira/browse/FLINK-3229 > Project: Flink > Issue Type: Sub-task > Components: Streaming Connectors >Affects Versions: 1.0.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > > Opening a sub-task to implement data source consumer for Kinesis streaming > connector (https://issues.apache.org/jira/browser/FLINK-3211). > An example of the planned user API for Flink Kinesis Consumer: > {code} > Properties kinesisConfig = new Properties(); > config.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1"); > config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, > "BASIC"); > config.put( > KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, > "aws_access_key_id_here"); > config.put( > KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY, > "aws_secret_key_here"); > config.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, > "LATEST"); // or TRIM_HORIZON > DataStream kinesisRecords = env.addSource(new FlinkKinesisConsumer<>( > "kinesis_stream_name", > new SimpleStringSchema(), > kinesisConfig)); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3229] Flink streaming consumer for AWS ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/1911#discussion_r60443897 --- Diff: flink-streaming-connectors/pom.xml --- @@ -45,6 +45,7 @@ under the License. flink-connector-rabbitmq flink-connector-twitter flink-connector-nifi + flink-connector-kinesis --- End diff -- I think we have to remove this line again. The module is included in the profile below (you have to activate the "include-kinesis" maven build profile) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-3229) Kinesis streaming consumer with integration of Flink's checkpointing mechanics
[ https://issues.apache.org/jira/browse/FLINK-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250211#comment-15250211 ] ASF GitHub Bot commented on FLINK-3229: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1911#issuecomment-212504465 Great, thank you. I'll review the PR soon. > Kinesis streaming consumer with integration of Flink's checkpointing mechanics > -- > > Key: FLINK-3229 > URL: https://issues.apache.org/jira/browse/FLINK-3229 > Project: Flink > Issue Type: Sub-task > Components: Streaming Connectors >Affects Versions: 1.0.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai > > Opening a sub-task to implement data source consumer for Kinesis streaming > connector (https://issues.apache.org/jira/browser/FLINK-3211). > An example of the planned user API for Flink Kinesis Consumer: > {code} > Properties kinesisConfig = new Properties(); > config.put(KinesisConfigConstants.CONFIG_AWS_REGION, "us-east-1"); > config.put(KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_TYPE, > "BASIC"); > config.put( > KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID, > "aws_access_key_id_here"); > config.put( > KinesisConfigConstants.CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY, > "aws_secret_key_here"); > config.put(KinesisConfigConstants.CONFIG_STREAM_INIT_POSITION_TYPE, > "LATEST"); // or TRIM_HORIZON > DataStream kinesisRecords = env.addSource(new FlinkKinesisConsumer<>( > "kinesis_stream_name", > new SimpleStringSchema(), > kinesisConfig)); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3229] Flink streaming consumer for AWS ...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1911#issuecomment-212504465
Great, thank you. I'll review the PR soon.
[jira] [Commented] (FLINK-3727) Add support for embedded streaming SQL (projection, filter, union)
[ https://issues.apache.org/jira/browse/FLINK-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250196#comment-15250196 ] ASF GitHub Bot commented on FLINK-3727: --- GitHub user vasia opened a pull request: https://github.com/apache/flink/pull/1917
[FLINK-3727] Embedded streaming SQL projection, filtering, union
This PR adds support for embedded streaming SQL (projection, filtering, union):
- methods to register DataStreams
- sql translation method in StreamTableEnvironment
- a custom rule to convert to streamable table
- docs for streaming table and sql
- java SQL tests

A streaming SQL query can be executed on a streaming Table by simply adding the `STREAM` keyword in front of the table name. Registering DataStream tables and conversions work in a similar way to that of DataSet tables. Here's a filtering example:
```
val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)

val dataStream = env.addSource(...)
val t = dataStream.toTable(tEnv).as('a, 'b, 'c)
tEnv.registerTable("MyTable", t)

val sqlQuery = "SELECT STREAM * FROM MyTable WHERE a = 3"
val result = tEnv.sql(sqlQuery).toDataStream[Row]
```
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/vasia/flink stream-sql
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/flink/pull/1917.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #1917

commit 6b747bd6902074d0475de9519c1c3bd693487eef
Author: vasia
Date: 2016-04-15T11:35:24Z

    [FLINK-3727] Add support for embedded streaming SQL (projection, filter, union)

    - add methods to register DataStreams
    - add sql translation method in StreamTableEnvironment
    - add a custom rule to convert to streamable table
    - add docs for streaming table and sql
> Add support for embedded streaming SQL (projection, filter, union)
> --
>
> Key: FLINK-3727
> URL: https://issues.apache.org/jira/browse/FLINK-3727
> Project: Flink
> Issue Type: Sub-task
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
> Assignee: Vasia Kalavri
>
> Similar to the support for SQL embedded in batch Table API programs, this issue tracks the support for SQL embedded in stream Table API programs. The only currently supported operations on streaming Tables are projection, filtering, and union.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3727] Embedded streaming SQL projection...
GitHub user vasia opened a pull request: https://github.com/apache/flink/pull/1917
[FLINK-3727] Embedded streaming SQL projection, filtering, union
This PR adds support for embedded streaming SQL (projection, filtering, union):
- methods to register DataStreams
- sql translation method in StreamTableEnvironment
- a custom rule to convert to streamable table
- docs for streaming table and sql
- java SQL tests

A streaming SQL query can be executed on a streaming Table by simply adding the `STREAM` keyword in front of the table name. Registering DataStream tables and conversions work in a similar way to that of DataSet tables. Here's a filtering example:
```
val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)

val dataStream = env.addSource(...)
val t = dataStream.toTable(tEnv).as('a, 'b, 'c)
tEnv.registerTable("MyTable", t)

val sqlQuery = "SELECT STREAM * FROM MyTable WHERE a = 3"
val result = tEnv.sql(sqlQuery).toDataStream[Row]
```
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/vasia/flink stream-sql
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/flink/pull/1917.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #1917

commit 6b747bd6902074d0475de9519c1c3bd693487eef
Author: vasia
Date: 2016-04-15T11:35:24Z

    [FLINK-3727] Add support for embedded streaming SQL (projection, filter, union)

    - add methods to register DataStreams
    - add sql translation method in StreamTableEnvironment
    - add a custom rule to convert to streamable table
    - add docs for streaming table and sql
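Since the PR also covers union, here is a sketch of the same API applied to a streaming UNION ALL query. It reuses only the methods shown in the example above (`toTable`, `as`, `registerTable`, `sql`, `toDataStream`); `stream1` and `stream2` are assumed to be DataStreams with matching schemas:
```scala
val t1 = stream1.toTable(tEnv).as('a, 'b, 'c)
val t2 = stream2.toTable(tEnv).as('a, 'b, 'c)
tEnv.registerTable("T1", t1)
tEnv.registerTable("T2", t2)

// STREAM marks the tables as streaming, exactly as in the filtering example
val union = tEnv.sql(
  "SELECT STREAM a, b, c FROM T1 UNION ALL (SELECT STREAM a, b, c FROM T2)")
val result = union.toDataStream[Row]
```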
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250154#comment-15250154 ] ASF GitHub Bot commented on FLINK-3717: --- Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212492749
Thanks for the comments @aljoscha! I will integrate them and get back to you.
> Add functionality to be able to restore from a specific point in a FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Reporter: Kostas Kloudas
> Assignee: Kostas Kloudas
>
> This is the first step in order to make the File Sources fault-tolerant. We have to be able to start reading from a specific point in a file, despite any caching performed during reading. This will guarantee that the task that takes over the execution of the failed one can start from the correct point in the file.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212492749
Thanks for the comments @aljoscha! I will integrate them and get back to you.
[jira] [Commented] (FLINK-3792) RowTypeInfo equality should not depend on field names
[ https://issues.apache.org/jira/browse/FLINK-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250140#comment-15250140 ] Fabian Hueske commented on FLINK-3792: --
I think this should be checked at the API level, not when the plan is optimized and the DataSet / DataStream program is constructed. Field names are not relevant when the program (Table or SQL) is executed. Also, {{Row}} does not provide access by name but only by position.
> RowTypeInfo equality should not depend on field names
> -
>
> Key: FLINK-3792
> URL: https://issues.apache.org/jira/browse/FLINK-3792
> Project: Flink
> Issue Type: Bug
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
>
> Currently, two Rows with the same field types but different field names are not considered equal by the Table API and SQL. This behavior might create problems, e.g. it makes the following union query fail:
> {code}
> SELECT STREAM a, b, c FROM T1 UNION ALL
> (SELECT STREAM d, e, f FROM T2 WHERE d < 3)
> {code}
> where a, b, c and d, e, f are fields of corresponding types.
> {code}
> Cannot union streams of different types: org.apache.flink.api.table.Row(a: Integer, b: Long, c: String) and org.apache.flink.api.table.Row(d: Integer, e: Long, f: String)
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3777] Managed closeInputFormat()
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1903#issuecomment-212483482
Thanks for this PR @fpompermaier. We need to check all places where `InputFormat.configure()` is called to ensure that we free resources as promised in the JavaDocs of `RichInputFormat.closeInputFormat()`. I found a few more spots where we need to call `closeInputFormat()` if the IF is a `RichInputFormat`:
- `ReplicatingInputFormat`, which is a wrapper for other input formats.
- `DataSourceNode.computeOperatorSpecificDefaultEstimates()`, which tries to obtain some input statistics and initializes the input format with `configure()`.
- `InputFormatVertex.initializeOnMaster()`, which calls `configure()` as well.

It would also be good to add a test that checks that `closeInputFormat()` is actually called.
[jira] [Commented] (FLINK-3754) Add a validation phase before constructing RelNode using the Table API
[ https://issues.apache.org/jira/browse/FLINK-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250128#comment-15250128 ] ASF GitHub Bot commented on FLINK-3754: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1916#issuecomment-212485745
Thanks @yjshen for working on this issue! Unified validation and exceptions are a big improvement, IMO. I'll try to have a look soon.
@twalthr, can you have a look at the changes done on the expressions?
> Add a validation phase before constructing RelNode using the Table API
> --
>
> Key: FLINK-3754
> URL: https://issues.apache.org/jira/browse/FLINK-3754
> Project: Flink
> Issue Type: Improvement
> Components: Table API
> Affects Versions: 1.0.0
> Reporter: Yijie Shen
> Assignee: Yijie Shen
>
> Unlike SQL string execution, which has a separate validation phase before RelNode construction, the Table API lacks a counterpart and its validation is scattered across many places.
> I suggest adding a single validation phase to detect problems as early as possible.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3754][Table]Add a validation phase befo...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1916#issuecomment-212485745
Thanks @yjshen for working on this issue! Unified validation and exceptions are a big improvement, IMO. I'll try to have a look soon.
@twalthr, can you have a look at the changes done on the expressions?
[jira] [Commented] (FLINK-3758) Add possibility to register accumulators in custom triggers
[ https://issues.apache.org/jira/browse/FLINK-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250106#comment-15250106 ] Konstantin Knauf commented on FLINK-3758: -
{quote}
In the Trigger we would not know when to call addAccumulator()
{quote}
That's what I thought, yes. {getAccumulator(String name, Accumulator defaultAccumulator)} sounds good. I will have a look at it.
> Add possibility to register accumulators in custom triggers
> ---
>
> Key: FLINK-3758
> URL: https://issues.apache.org/jira/browse/FLINK-3758
> Project: Flink
> Issue Type: Improvement
> Reporter: Konstantin Knauf
> Priority: Minor
>
> For monitoring purposes it would be nice to be able to use accumulators in custom trigger functions.
> Basically, the trigger context could just expose {{getAccumulator}} of {{RuntimeContext}} - or does this create problems I am not aware of?
> Adding accumulators in a trigger function is more difficult, I think, but that's not really necessary, as the accumulator could just be added in some other upstream operator.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3777) Add open and close methods to manage IF lifecycle
[ https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250112#comment-15250112 ] ASF GitHub Bot commented on FLINK-3777: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1903#issuecomment-212483482
Thanks for this PR @fpompermaier. We need to check all places where `InputFormat.configure()` is called to ensure that we free resources as promised in the JavaDocs of `RichInputFormat.closeInputFormat()`. I found a few more spots where we need to call `closeInputFormat()` if the IF is a `RichInputFormat`:
- `ReplicatingInputFormat`, which is a wrapper for other input formats.
- `DataSourceNode.computeOperatorSpecificDefaultEstimates()`, which tries to obtain some input statistics and initializes the input format with `configure()`.
- `InputFormatVertex.initializeOnMaster()`, which calls `configure()` as well.

It would also be good to add a test that checks that `closeInputFormat()` is actually called.
> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.0.1
> Reporter: Flavio Pompermaier
> Assignee: Flavio Pompermaier
> Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an InputFormat are not managed, although open() could be (improperly, IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database connections) and to manage their release.
> Probably the best option would be to add 2 methods (i.e. openInputFormat() and closeInputFormat()) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename the current open() and close() to openSplit() and closeSplit() respectively, while using open() and close() for the IF lifecycle management, but this would cause a backward compatibility issue...
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
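As a reading aid, a minimal sketch of the call-site pattern the review asks for, assuming the openInputFormat()/closeInputFormat() hooks proposed in this issue; the helper class below is illustrative, not Flink code:
{code}
import org.apache.flink.api.common.io.InputFormat;
import org.apache.flink.api.common.io.RichInputFormat;
import org.apache.flink.configuration.Configuration;

public final class InputFormatLifecycleSketch {

    // Runs an action against a configured input format and guarantees that
    // resources are released again when the format is a RichInputFormat.
    public static void withConfigured(InputFormat<?, ?> format,
                                      Configuration parameters,
                                      Runnable action) {
        format.configure(parameters);
        try {
            action.run(); // e.g. compute statistics, create input splits, ...
        } finally {
            if (format instanceof RichInputFormat) {
                ((RichInputFormat<?, ?>) format).closeInputFormat();
            }
        }
    }
}
{code}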
[jira] [Comment Edited] (FLINK-3758) Add possibility to register accumulators in custom triggers
[ https://issues.apache.org/jira/browse/FLINK-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250106#comment-15250106 ] Konstantin Knauf edited comment on FLINK-3758 at 4/20/16 3:42 PM: --
{quote}
In the Trigger we would not know when to call addAccumulator()
{quote}
That's what I thought, yes. {{getAccumulator(String name, Accumulator defaultAccumulator)}} sounds good. I will have a look at it.

was (Author: knaufk):
{quote}
In the Trigger we would not know when to call addAccumulator()
{quote}
That's what I thought, yes. {getAccumulator(String name, Accumulator defaultAccumulator)} sounds good. I will have a look at it.
> Add possibility to register accumulators in custom triggers
> ---
>
> Key: FLINK-3758
> URL: https://issues.apache.org/jira/browse/FLINK-3758
> Project: Flink
> Issue Type: Improvement
> Reporter: Konstantin Knauf
> Priority: Minor
>
> For monitoring purposes it would be nice to be able to use accumulators in custom trigger functions.
> Basically, the trigger context could just expose {{getAccumulator}} of {{RuntimeContext}} - or does this create problems I am not aware of?
> Adding accumulators in a trigger function is more difficult, I think, but that's not really necessary, as the accumulator could just be added in some other upstream operator.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
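A sketch of how the proposed method could look from inside a custom trigger. Note that TriggerContext.getAccumulator(name, default) is the hypothetical API discussed above and does not exist in Flink; the rest follows the standard Trigger interface:
{code}
import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.Window;

public class CountingTrigger<W extends Window> extends Trigger<Object, W> {

    @Override
    public TriggerResult onElement(Object element, long timestamp, W window,
                                   TriggerContext ctx) throws Exception {
        // hypothetical: returns the accumulator, creating it on first use
        LongCounter firings =
            ctx.getAccumulator("trigger-firings", new LongCounter());
        firings.add(1L);
        return TriggerResult.FIRE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, W window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }
}
{code}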
[jira] [Commented] (FLINK-3750) Make JDBCInputFormat a parallel source
[ https://issues.apache.org/jira/browse/FLINK-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250099#comment-15250099 ] ASF GitHub Bot commented on FLINK-3750: --- Github user fpompermaier commented on the pull request: https://github.com/apache/flink/pull/1885#issuecomment-212479891
Depends on FLINK-3777 (https://github.com/apache/flink/pull/1903) to properly manage the single connection instantiation.
> Make JDBCInputFormat a parallel source
> --
>
> Key: FLINK-3750
> URL: https://issues.apache.org/jira/browse/FLINK-3750
> Project: Flink
> Issue Type: Improvement
> Components: Batch
> Affects Versions: 1.0.1
> Reporter: Flavio Pompermaier
> Assignee: Flavio Pompermaier
> Priority: Minor
> Labels: connector, jdbc
>
> At the moment the batch JDBC InputFormat does not support parallelism (NonParallelInput). I'd like to remove this limitation.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: Improved JDBCInputFormat (FLINK-3750) and othe...
Github user fpompermaier commented on the pull request: https://github.com/apache/flink/pull/1885#issuecomment-212479891
Depends on FLINK-3777 (https://github.com/apache/flink/pull/1903) to properly manage the single connection instantiation.
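A sketch of what that dependency buys, assuming FLINK-3777's openInputFormat()/closeInputFormat() hooks land as proposed; class and field names below are illustrative, not the actual JDBCInputFormat code:
```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// One connection per parallel task instance, reused across all input
// splits that the instance processes, instead of one connection per split.
public class JdbcLifecycleSketch {

    private final String dbUrl; // illustrative: the JDBC URL to connect to
    private transient Connection connection;

    public JdbcLifecycleSketch(String dbUrl) {
        this.dbUrl = dbUrl;
    }

    public void openInputFormat() throws SQLException {
        // called once per parallel instance, before the first split is opened
        connection = DriverManager.getConnection(dbUrl);
    }

    public void closeInputFormat() throws SQLException {
        // called once per parallel instance, after the last split is closed
        if (connection != null) {
            connection.close();
        }
    }
}
```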
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250085#comment-15250085 ] ASF GitHub Bot commented on FLINK-3717: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212477715
I like that you're adding comments to existing parts of the code. This will help other people who might be looking at it in the future. 😃
> Add functionality to be able to restore from a specific point in a FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Reporter: Kostas Kloudas
> Assignee: Kostas Kloudas
>
> This is the first step in order to make the File Sources fault-tolerant. We have to be able to start reading from a specific point in a file, despite any caching performed during reading. This will guarantee that the task that takes over the execution of the failed one can start from the correct point in the file.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212477715
I like that you're adding comments to existing parts of the code. This will help other people who might be looking at it in the future. 😃
[jira] [Commented] (FLINK-3792) RowTypeInfo equality should not depend on field names
[ https://issues.apache.org/jira/browse/FLINK-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250083#comment-15250083 ] Vasia Kalavri commented on FLINK-3792: --
That's a valid point. I don't know what Calcite does, but the SQL validation phase does not fail. Renaming the fields does not work either.
> RowTypeInfo equality should not depend on field names
> -
>
> Key: FLINK-3792
> URL: https://issues.apache.org/jira/browse/FLINK-3792
> Project: Flink
> Issue Type: Bug
> Components: Table API
> Affects Versions: 1.1.0
> Reporter: Vasia Kalavri
>
> Currently, two Rows with the same field types but different field names are not considered equal by the Table API and SQL. This behavior might create problems, e.g. it makes the following union query fail:
> {code}
> SELECT STREAM a, b, c FROM T1 UNION ALL
> (SELECT STREAM d, e, f FROM T2 WHERE d < 3)
> {code}
> where a, b, c and d, e, f are fields of corresponding types.
> {code}
> Cannot union streams of different types: org.apache.flink.api.table.Row(a: Integer, b: Long, c: String) and org.apache.flink.api.table.Row(d: Integer, e: Long, f: String)
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
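A small sketch of the equality semantics the ticket asks for (an illustrative helper, not the actual RowTypeInfo code): two row types compare equal exactly when their field types match position by position, with field names ignored.
{code}
import java.util.Arrays;

import org.apache.flink.api.common.typeinfo.TypeInformation;

final class RowTypeEqualitySketch {

    // Field names are deliberately not part of the comparison; only the
    // positional field types decide equality.
    static boolean sameRowType(TypeInformation<?>[] fieldTypesA,
                               TypeInformation<?>[] fieldTypesB) {
        return Arrays.equals(fieldTypesA, fieldTypesB);
    }
}
{code}
Under these semantics, Row(a: Integer, b: Long, c: String) and Row(d: Integer, e: Long, f: String) would be union-compatible.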
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250075#comment-15250075 ] ASF GitHub Bot commented on FLINK-3717: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212473493
Could it happen that an `InputFormat` is opened with some `InputSplit` but then restored to a position that would lie in another split? Maybe it would make sense to have, instead of a `restore` method, an alternative `open()` method that takes not only an `InputSplit` but also a read position that was snapshotted earlier. This would make the "open from snapshot" consistent.
> Add functionality to be able to restore from a specific point in a FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Reporter: Kostas Kloudas
> Assignee: Kostas Kloudas
>
> This is the first step in order to make the File Sources fault-tolerant. We have to be able to start reading from a specific point in a file, despite any caching performed during reading. This will guarantee that the task that takes over the execution of the failed one can start from the correct point in the file.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212473493
Could it happen that an `InputFormat` is opened with some `InputSplit` but then restored to a position that would lie in another split? Maybe it would make sense to have, instead of a `restore` method, an alternative `open()` method that takes not only an `InputSplit` but also a read position that was snapshotted earlier. This would make the "open from snapshot" consistent.
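A sketch of that alternative as a hypothetical interface (all names illustrative, not Flink API): passing the split and the snapshotted read position to a single open() call means the two can never disagree, which a separate restore() step cannot guarantee.
```java
import java.io.IOException;
import org.apache.flink.core.io.InputSplit;

interface RestorableReaderSketch<S extends InputSplit, P> {

    /** Plain open: start reading the given split from its beginning. */
    void open(S split) throws IOException;

    /** Open from a snapshot: resume the given split at a saved read position. */
    void open(S split, P snapshottedPosition) throws IOException;

    /** Capture the current read position for checkpointing. */
    P snapshotReadPosition();
}
```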
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212468812
I will also squash the commits as soon as I integrate any possible comments!
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250064#comment-15250064 ] ASF GitHub Bot commented on FLINK-3717: --- Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212468812
I will also squash the commits as soon as I integrate any possible comments!
> Add functionality to be able to restore from a specific point in a FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Reporter: Kostas Kloudas
> Assignee: Kostas Kloudas
>
> This is the first step in order to make the File Sources fault-tolerant. We have to be able to start reading from a specific point in a file, despite any caching performed during reading. This will guarantee that the task that takes over the execution of the failed one can start from the correct point in the file.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3781] BlobClient may be left unclosed i...
Github user gaoyike commented on the pull request: https://github.com/apache/flink/pull/1908#issuecomment-212460975
Done!
[jira] [Commented] (FLINK-3781) BlobClient may be left unclosed in BlobCache#deleteGlobal()
[ https://issues.apache.org/jira/browse/FLINK-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250035#comment-15250035 ] ASF GitHub Bot commented on FLINK-3781: --- Github user gaoyike commented on the pull request: https://github.com/apache/flink/pull/1908#issuecomment-212460975
Done!
> BlobClient may be left unclosed in BlobCache#deleteGlobal()
> ---
>
> Key: FLINK-3781
> URL: https://issues.apache.org/jira/browse/FLINK-3781
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
> Priority: Minor
>
> {code}
> public void deleteGlobal(BlobKey key) throws IOException {
> 	delete(key);
> 	BlobClient bc = createClient();
> 	bc.delete(key);
> 	bc.close();
> {code}
> If delete() throws IOException, BlobClient would be left unclosed.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
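For reference, one way to guarantee the client is closed on every path is a try/finally around the remote delete. This is a sketch of the obvious fix, not necessarily the patch that was merged:
{code}
public void deleteGlobal(BlobKey key) throws IOException {
	// delete the local copy first; if this throws, no client was created yet
	delete(key);
	BlobClient bc = createClient();
	try {
		bc.delete(key);
	} finally {
		bc.close(); // runs even when the remote delete throws
	}
}
{code}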
[jira] [Commented] (FLINK-3717) Add functionality to be able to restore from a specific point in a FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250028#comment-15250028 ] ASF GitHub Bot commented on FLINK-3717: --- Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212457887
Could somebody review this? This is also part of the FLINK-2314 issue.
> Add functionality to be able to restore from a specific point in a FileInputFormat
> --
>
> Key: FLINK-3717
> URL: https://issues.apache.org/jira/browse/FLINK-3717
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Reporter: Kostas Kloudas
> Assignee: Kostas Kloudas
>
> This is the first step in order to make the File Sources fault-tolerant. We have to be able to start reading from a specific point in a file, despite any caching performed during reading. This will guarantee that the task that takes over the execution of the failed one can start from the correct point in the file.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/1895#issuecomment-212457887
Could somebody review this? This is also part of the FLINK-2314 issue.