[jira] [Updated] (FLINK-7642) Upgrade maven surefire plugin to 2.19.1

2017-10-15 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-7642:
--
Description: 
The Surefire 2.19 release introduced more useful test filters which let us 
run a subset of the tests.


This issue is for upgrading the Maven Surefire plugin to 2.19.1.
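
For illustration, two hypothetical invocations of the filtering syntax 
available since 2.19 (the class and method names are placeholders):
{code}
# run a single test method of one class
mvn test -Dtest=CsvInputFormatTest#testReadCsvFile
# run all matching test methods across matching classes
mvn test -Dtest="Csv*Test#test*"
{code}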

  was:
The Surefire 2.19 release introduced more useful test filters which let us 
run a subset of the tests.

This issue is for upgrading the Maven Surefire plugin to 2.19.1.


> Upgrade maven surefire plugin to 2.19.1
> ---
>
> Key: FLINK-7642
> URL: https://issues.apache.org/jira/browse/FLINK-7642
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>
> The Surefire 2.19 release introduced more useful test filters which let us 
> run a subset of the tests.
> This issue is for upgrading the Maven Surefire plugin to 2.19.1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2017-10-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205358#comment-16205358
 ] 

Ted Yu commented on FLINK-7525:
---

Thanks for the feedback, Chesnay.

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja asked for a way to control how users cancel job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7844) Fine Grained Recovery triggers checkpoint timeout failure

2017-10-15 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu updated FLINK-7844:

Description: 
Context: 
We are using the "individual" (fine-grained) failover recovery strategy for our 
embarrassingly parallel router use case. The topic has over 2000 partitions, 
and parallelism is set to ~180, dispatched to over 20 task managers with 
around 180 slots.

Observations:
We've noticed that after one task manager termination, even though the individual 
recovery happens correctly and the workload is re-dispatched to a newly 
available task manager instance, the checkpoint takes 10 minutes to 
eventually time out, preventing all other task managers from committing 
checkpoints. In a worst-case scenario, if the job gets restarted for other 
reasons (e.g. job manager termination), this causes more messages to be 
re-processed/duplicated than in a job without fine-grained recovery enabled.

I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
by the old task manager and is thus taking a long time to time out.
Two questions:
1. Is there a configuration that controls this checkpoint timeout?
2. Is there any reason why, when the Job Manager realizes that a Task Manager is 
gone and the workload is redispatched, it still needs to wait for the checkpoint 
initiated by the old task manager?


See the checkpoint screenshot in the attachments.
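
Regarding question 1: the observed 10-minute delay matches Flink's default 
checkpoint timeout. A minimal sketch of adjusting it, assuming the Scala 
DataStream API (the interval and timeout values below are placeholders):
{code}
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.enableCheckpointing(60000) // trigger a checkpoint every 60 s
// Lower the timeout so a checkpoint stuck on a lost task manager
// fails faster than the 10-minute default.
env.getCheckpointConfig.setCheckpointTimeout(120000)
{code}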



  was:
Context: 
We are using the "individual" (fine-grained) failover recovery strategy for our 
embarrassingly parallel router use case. The topic has over 2000 partitions, 
and parallelism is set to ~180, dispatched to over 20 task managers with 
around 180 slots.

We've noticed that after one task manager termination, even though the individual 
recovery happens correctly and the workload is re-dispatched to a newly 
available task manager instance, the checkpoint takes 10 minutes to 
eventually time out, preventing all other task managers from committing 
checkpoints. In a worst-case scenario, if the job gets restarted for other 
reasons (e.g. job manager termination), this causes more messages to be 
re-processed/duplicated than in a job without fine-grained recovery enabled.

I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
by the old task manager and is thus taking a long time to time out.
Two questions:
1. Is there a configuration that controls this checkpoint timeout?
2. Is there any reason why, when the Job Manager realizes that a Task Manager is 
gone and the workload is redispatched, it still needs to wait for the checkpoint 
initiated by the old task manager?


See the checkpoint screenshot in the attachments.




> Fine Grained Recovery triggers checkpoint timeout failure
> -
>
> Key: FLINK-7844
> URL: https://issues.apache.org/jira/browse/FLINK-7844
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.3.2
>Reporter: Zhenzhong Xu
> Attachments: screenshot-1.png
>
>
> Context: 
> We are using the "individual" (fine-grained) failover recovery strategy for our 
> embarrassingly parallel router use case. The topic has over 2000 partitions, 
> and parallelism is set to ~180, dispatched to over 20 task managers with 
> around 180 slots.
> Observations:
> We've noticed that after one task manager termination, even though the 
> individual recovery happens correctly and the workload is re-dispatched to a 
> newly available task manager instance, the checkpoint takes 10 minutes to 
> eventually time out, preventing all other task managers from committing 
> checkpoints. In a worst-case scenario, if the job gets restarted for other 
> reasons (e.g. job manager termination), this causes more messages to be 
> re-processed/duplicated than in a job without fine-grained recovery enabled.
> I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
> by the old task manager and is thus taking a long time to time out.
> Two questions:
> 1. Is there a configuration that controls this checkpoint timeout?
> 2. Is there any reason why, when the Job Manager realizes that a Task Manager 
> is gone and the workload is redispatched, it still needs to wait for the 
> checkpoint initiated by the old task manager?
> See the checkpoint screenshot in the attachments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7844) Fine Grained Recovery triggers checkpoint timeout failure

2017-10-15 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu updated FLINK-7844:

Description: 
Context: 
We are using the "individual" (fine-grained) failover recovery strategy for our 
embarrassingly parallel router use case. The topic has over 2000 partitions, 
and parallelism is set to ~180, dispatched to over 20 task managers with 
around 180 slots.

We've noticed that after one task manager termination, even though the individual 
recovery happens correctly and the workload is re-dispatched to a newly 
available task manager instance, the checkpoint takes 10 minutes to 
eventually time out, preventing all other task managers from committing 
checkpoints. In a worst-case scenario, if the job gets restarted for other 
reasons (e.g. job manager termination), this causes more messages to be 
re-processed/duplicated than in a job without fine-grained recovery enabled.

I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
by the old task manager and is thus taking a long time to time out.
Two questions:
1. Is there a configuration that controls this checkpoint timeout?
2. Is there any reason why, when the Job Manager realizes that a Task Manager is 
gone and the workload is redispatched, it still needs to wait for the checkpoint 
initiated by the old task manager?


See the checkpoint screenshot in the attachments.



  was:
Context: 
We are using the "individual" (fine-grained) failover recovery strategy for our 
embarrassingly parallel router use case. The topic has over 2000 partitions, 
and parallelism is set to ~180, dispatched to over 20 task managers with 
around 180 slots.

We've noticed that after one task manager termination, even though the individual 
recovery happens correctly and the workload is re-dispatched to a newly 
available task manager instance, the checkpoint takes 10 minutes to 
eventually time out, preventing all other task managers from committing 
checkpoints. In a worst-case scenario, if the job gets restarted for other 
reasons (e.g. job manager termination), this causes more messages to be 
re-processed/duplicated than in a job without fine-grained recovery enabled.

I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
by the old task manager and is thus taking a long time to time out.
Two questions:
1. Is there a configuration that controls this checkpoint timeout?
2. Is there any reason why, when the Job Manager realizes that a Task Manager is 
gone and the workload is redispatched, it still needs to wait for the checkpoint 
initiated by the old task manager?



> Fine Grained Recovery triggers checkpoint timeout failure
> -
>
> Key: FLINK-7844
> URL: https://issues.apache.org/jira/browse/FLINK-7844
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.3.2
>Reporter: Zhenzhong Xu
> Attachments: screenshot-1.png
>
>
> Context: 
> We are using the "individual" (fine-grained) failover recovery strategy for our 
> embarrassingly parallel router use case. The topic has over 2000 partitions, 
> and parallelism is set to ~180, dispatched to over 20 task managers with 
> around 180 slots.
> We've noticed that after one task manager termination, even though the 
> individual recovery happens correctly and the workload is re-dispatched to a 
> newly available task manager instance, the checkpoint takes 10 minutes to 
> eventually time out, preventing all other task managers from committing 
> checkpoints. In a worst-case scenario, if the job gets restarted for other 
> reasons (e.g. job manager termination), this causes more messages to be 
> re-processed/duplicated than in a job without fine-grained recovery enabled.
> I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
> by the old task manager and is thus taking a long time to time out.
> Two questions:
> 1. Is there a configuration that controls this checkpoint timeout?
> 2. Is there any reason why, when the Job Manager realizes that a Task Manager 
> is gone and the workload is redispatched, it still needs to wait for the 
> checkpoint initiated by the old task manager?
> See the checkpoint screenshot in the attachments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7844) Fine Grained Recovery triggers checkpoint timeout failure

2017-10-15 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu updated FLINK-7844:

Attachment: screenshot-1.png

> Fine Grained Recovery triggers checkpoint timeout failure
> -
>
> Key: FLINK-7844
> URL: https://issues.apache.org/jira/browse/FLINK-7844
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.3.2
>Reporter: Zhenzhong Xu
> Attachments: screenshot-1.png
>
>
> Context: 
> We are using the "individual" (fine-grained) failover recovery strategy for our 
> embarrassingly parallel router use case. The topic has over 2000 partitions, 
> and parallelism is set to ~180, dispatched to over 20 task managers with 
> around 180 slots.
> We've noticed that after one task manager termination, even though the 
> individual recovery happens correctly and the workload is re-dispatched to a 
> newly available task manager instance, the checkpoint takes 10 minutes to 
> eventually time out, preventing all other task managers from committing 
> checkpoints. In a worst-case scenario, if the job gets restarted for other 
> reasons (e.g. job manager termination), this causes more messages to be 
> re-processed/duplicated than in a job without fine-grained recovery enabled.
> I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
> by the old task manager and is thus taking a long time to time out.
> Two questions:
> 1. Is there a configuration that controls this checkpoint timeout?
> 2. Is there any reason why, when the Job Manager realizes that a Task Manager 
> is gone and the workload is redispatched, it still needs to wait for the 
> checkpoint initiated by the old task manager?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7844) Fine Grained Recovery triggers checkpoint timeout failure

2017-10-15 Thread Zhenzhong Xu (JIRA)
Zhenzhong Xu created FLINK-7844:
---

 Summary: Fine Grained Recovery triggers checkpoint timeout failure
 Key: FLINK-7844
 URL: https://issues.apache.org/jira/browse/FLINK-7844
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 1.3.2
Reporter: Zhenzhong Xu


Context: 
We are using the "individual" (fine-grained) failover recovery strategy for our 
embarrassingly parallel router use case. The topic has over 2000 partitions, 
and parallelism is set to ~180, dispatched to over 20 task managers with 
around 180 slots.

We've noticed that after one task manager termination, even though the individual 
recovery happens correctly and the workload is re-dispatched to a newly 
available task manager instance, the checkpoint takes 10 minutes to 
eventually time out, preventing all other task managers from committing 
checkpoints. In a worst-case scenario, if the job gets restarted for other 
reasons (e.g. job manager termination), this causes more messages to be 
re-processed/duplicated than in a job without fine-grained recovery enabled.

I suspect the uber checkpoint is waiting for a previous checkpoint initiated 
by the old task manager and is thus taking a long time to time out.
Two questions:
1. Is there a configuration that controls this checkpoint timeout?
2. Is there any reason why, when the Job Manager realizes that a Task Manager is 
gone and the workload is redispatched, it still needs to wait for the checkpoint 
initiated by the old task manager?




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7051) Bump up Calcite version to 1.14

2017-10-15 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205312#comment-16205312
 ] 

Haohui Mai commented on FLINK-7051:
---

I plan to work on it this week to make sure it happens before the Flink 1.4 
release.



> Bump up Calcite version to 1.14
> ---
>
> Key: FLINK-7051
> URL: https://issues.apache.org/jira/browse/FLINK-7051
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>Priority: Critical
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.14 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7730) TableFunction LEFT OUTER joins with ON predicates are broken

2017-10-15 Thread Xingcan Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205239#comment-16205239
 ] 

Xingcan Cui commented on FLINK-7730:


Thanks for the discussion, [~fhueske]. I've checked that the left rows can be 
padded with {{null}} values in the correlate operator when the table function 
produces no result. Thus the semantics can be preserved if the {{ON}} clause is 
forbidden.

The existence check for join conditions is performed in 
{{org.apache.calcite.sql.validate.SqlValidatorImpl.java:3049}}. I'm not sure it 
is proper to "override" such a huge file, since it may affect the validation 
for other joins. Maybe we can just restrict users to putting a single constant 
{{TRUE}} in the {{ON}} clause?
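
For illustration, a sketch of the restricted form, reusing the {{tfunc}} 
example from the issue description:
{code}
-- only the constant TRUE would be allowed in the ON clause
SELECT a, b, c, d
FROM MyTable
LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS T(d) ON TRUE
{code}
With only {{TRUE}} allowed, the correlate operator can pad unmatched left rows 
with {{null}} values as described above.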

> TableFunction LEFT OUTER joins with ON predicates are broken
> 
>
> Key: FLINK-7730
> URL: https://issues.apache.org/jira/browse/FLINK-7730
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Critical
>
> TableFunction left outer joins with predicates in the ON clause are broken. 
> Apparently, there are no tests for this and it has never worked. I observed 
> issues on several layers:
> - Table function joins do not correctly validate the equality predicate: 
> {{leftOuterJoin(func1('c) as 'd,  'a.cast(Types.STRING) === 'd)}} is rejected 
> because the predicate is not considered an equality predicate (the cast 
> needs to be pushed down).
> - Plans cannot be correctly translated: {{leftOuterJoin(func1('c) as 'd,  'c 
> === 'd)}} gives an optimizer exception.
> - SQL queries get translated but produce incorrect results. For example, 
> {{SELECT a, b, c, d FROM MyTable LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS 
> T(d) ON d = c}} returns an empty result if the condition {{d = c}} never 
> returns true. However, the outer side should be preserved and padded with 
> nulls.
> So there seem to be many issues with table function outer joins. In 
> particular, the wrong results produced by SQL queries need to be fixed quickly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4827: [FLINK-7840] [build] Shade netty in akka

2017-10-15 Thread StephanEwen
GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/4827

[FLINK-7840] [build] Shade netty in akka

## What is the purpose of the change

This change shades Akka's dependency on Netty away.

**Note:** Akka itself cannot be shaded away: Scala adds some magic in the form 
of annotations to preserve compile-time information, which makes shading 
impossible (it results in inconsistent code). That creates some problems.

I tried several approaches to shading Akka's Netty:

1. Add all Akka dependencies to `flink-runtime`. The dependencies disappear 
due to shading, but the classes are present in the same namespace.
  This leads to problems. For example, tests pull in the regular Akka 
dependencies as well (transitively from `akka-testkit`), and for some reason 
this creates problems with the mixing of classes loaded from `flink-runtime` 
and `akka-actor`. One could fix that by adding all other Akka dependencies as 
`provided` wherever `akka-testkit` is used.
  Similarly, users who want to use Akka have to add their own Akka dependency 
and will get a similar clash of classes.
  This can be summed up as: adding a dependency to a shaded jar (hence 
removing the dependency) without relocating the classes creates problems with 
dependency management.

  2. Add only `akka-remote` to the `flink-runtime` jar. That is the minimum 
we need to add to shade the Netty dependency. It softens the problem mentioned 
in (1), because only one of the Akka dependencies is present in the 
`flink-runtime` jar, but it does not solve it completely.

We shade all Akka dependencies into the `flink-dist` jar anyway, so anyone 
running a job with Flink will need to set all of their Akka dependencies to 
`provided` and assume Flink's version anyway (setting aside the workaround of 
reloading in a different classloader through child-first loading).

So from the user's perspective, Akka is always provided. Child-first class 
loading can save the day for some users (allowing them to have a different Akka 
version in the user code class loader than Flink has in the system class 
loader), but we should not strictly rely on that - it's a pretty delicate thing.

## Brief change log

  - Set all Akka compile-time dependencies outside `flink-runtime` to 
*provided*.
  - Add `akka-remote_*`, `io.netty`, and `org.uncommons.math` to the shaded 
`flink-runtime` jar.
  - Relocate `org.jboss.netty` and `org.uncommons.math` (a sketch of such a 
relocation follows this list).
  - Add Netty as an explicit dependency to `flink-testutils` for the 
network proxy and shade it there as well.
  - Add a check for unshaded Netty to the Travis CI script.
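
A sketch of what such a relocation could look like in the shade-plugin 
configuration (the `shadedPattern` package names are placeholders, not 
necessarily the ones used in this PR):

```xml
<!-- maven-shade-plugin configuration excerpt -->
<relocations>
    <relocation>
        <!-- Akka's Netty 3.x classes live under org.jboss.netty -->
        <pattern>org.jboss.netty</pattern>
        <shadedPattern>org.apache.flink.shaded.akka.org.jboss.netty</shadedPattern>
    </relocation>
    <relocation>
        <pattern>org.uncommons.math</pattern>
        <shadedPattern>org.apache.flink.shaded.akka.org.uncommons.math</shadedPattern>
    </relocation>
</relocations>
```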

## Verifying this change

  - The tests for this change pass on Travis.
  - One can simply test this by manually building the change and starting a 
simple standalone cluster and running one of the sample jobs.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**): it 
removes dependencies.
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes** / no / don't know): 
It affects Akka, which is used for communication and thus for failure detection.

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink shade_akka_netty

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4827


commit 14eeb5b3f0bef73460e0c541381e0718cf71e641
Author: Stephan Ewen 
Date:   2017-10-13T23:49:26Z

[FLINK-7840] [build] Shade netty in akka




---


[jira] [Commented] (FLINK-7840) Shade Akka's Netty Dependency

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205220#comment-16205220
 ] 

ASF GitHub Bot commented on FLINK-7840:
---

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/4827

[FLINK-7840] [build] Shade netty in akka

## What is the purpose of the change

This change shades Akka's dependency on Netty away.

**Note:** Akka itself cannot be shaded away: Scala adds some magic in the form 
of annotations to preserve compile-time information, which makes shading 
impossible (it results in inconsistent code). That creates some problems.

I tried several approaches to shading Akka's Netty:

1. Add all Akka dependencies to `flink-runtime`. The dependencies disappear 
due to shading, but the classes are present in the same namespace.
  This leads to problems. For example, tests pull in the regular Akka 
dependencies as well (transitively from `akka-testkit`), and for some reason 
this creates problems with the mixing of classes loaded from `flink-runtime` 
and `akka-actor`. One could fix that by adding all other Akka dependencies as 
`provided` wherever `akka-testkit` is used.
  Similarly, users who want to use Akka have to add their own Akka dependency 
and will get a similar clash of classes.
  This can be summed up as: adding a dependency to a shaded jar (hence 
removing the dependency) without relocating the classes creates problems with 
dependency management.

  2. Add only `akka-remote` to the `flink-runtime` jar. That is the minimum 
we need to add to shade the Netty dependency. It softens the problem mentioned 
in (1), because only one of the Akka dependencies is present in the 
`flink-runtime` jar, but it does not solve it completely.

We shade all Akka dependencies into the `flink-dist` jar anyway, so anyone 
running a job with Flink will need to set all of their Akka dependencies to 
`provided` and assume Flink's version anyway (setting aside the workaround of 
reloading in a different classloader through child-first loading).

So from the user's perspective, Akka is always provided. Child-first class 
loading can save the day for some users (allowing them to have a different Akka 
version in the user code class loader than Flink has in the system class 
loader), but we should not strictly rely on that - it's a pretty delicate thing.

## Brief change log

  - Set all Akka compile-time dependencies outside `flink-runtime` to 
*provided*.
  - Add `akka-remote_*`, `io.netty`, and `org.uncommons.math` to the shaded 
`flink-runtime` jar.
  - Relocate `org.jboss.netty` and `org.uncommons.math`.
  - Add Netty as an explicit dependency to `flink-testutils` for the 
network proxy and shade it there as well.
  - Add a check for unshaded Netty to the Travis CI script.

## Verifying this change

  - The tests for this change pass on Travis.
  - One can simply test this by manually building the change and starting a 
simple standalone cluster and running one of the sample jobs.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**): it 
removes dependencies.
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes** / no / don't know): 
It affects Akka, which is used for communication and thus for failure detection.

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink shade_akka_netty

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4827


commit 14eeb5b3f0bef73460e0c541381e0718cf71e641
Author: Stephan Ewen 
Date:   2017-10-13T23:49:26Z

[FLINK-7840] [build] Shade netty in akka




> Shade Akka's Netty Dependency
> -
>
> Key: FLINK-7840
> URL: https://issues.apache.org/jira/browse/FLINK-7840
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
>   

[jira] [Assigned] (FLINK-7371) user defined aggregator assumes nr of arguments smaller or equal than number of row fields

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-7371:


Assignee: Timo Walther  (was: Fabian Hueske)

> user defined aggregator assumes nr of arguments smaller or equal than number 
> of row fields
> --
>
> Key: FLINK-7371
> URL: https://issues.apache.org/jira/browse/FLINK-7371
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Stefano Bortoli
>Assignee: Timo Walther
>Priority: Blocker
>
> The definition of user-defined aggregations with more parameters than row 
> fields causes an ArrayIndexOutOfBoundsException, because the 
> indexing is based on a linear iteration over row fields. This does not 
> consider cases where fields can be used more than once and constant values 
> are passed to the aggregation function.
> For example:
> {code}
> window(partition {} order by [2] rows between $5 PRECEDING and CURRENT ROW 
> aggs [myAgg($0, $1, $3, $0, $4)])
> {code}
> where $3 and $4 are references to constants and $0 and $1 are fields, causes:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.flink.table.plan.schema.RowSchema.mapIndex(RowSchema.scala:134)
>   at 
> org.apache.flink.table.plan.schema.RowSchema$$anonfun$mapAggregateCall$1.apply(RowSchema.scala:147)
>   at 
> org.apache.flink.table.plan.schema.RowSchema$$anonfun$mapAggregateCall$1.apply(RowSchema.scala:147)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>   at 
> org.apache.flink.table.plan.schema.RowSchema.mapAggregateCall(RowSchema.scala:147)
>   at 
> org.apache.flink.table.plan.nodes.datastream.DataStreamOverAggregate$$anonfun$9.apply(DataStreamOverAggregate.scala:362)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7426) Table API does not support null values in keys

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-7426:


Assignee: Timo Walther  (was: Fabian Hueske)

> Table API does not support null values in keys
> --
>
> Key: FLINK-7426
> URL: https://issues.apache.org/jira/browse/FLINK-7426
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.2
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
>
> The Table API uses {{keyBy}} internally; however, the generated 
> {{KeySelector}} uses instances of {{Tuple}}. The {{TupleSerializer}} is not 
> able to serialize null values. This causes issues during checkpointing or 
> when using the RocksDB state backend. We need to replace all {{keyBy}} calls 
> with a custom {{RowKeySelector}} (a sketch follows the code block below).
> {code}
> class AggregateITCase extends StreamingWithStateTestBase {
>   private val queryConfig = new StreamQueryConfig()
>   queryConfig.withIdleStateRetentionTime(Time.hours(1), Time.hours(2))
>   @Test
>   def testDistinct(): Unit = {
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> env.setStateBackend(getStateBackend)
> val tEnv = TableEnvironment.getTableEnvironment(env)
> StreamITCase.clear
> val t = StreamTestData.get3TupleDataStream(env).toTable(tEnv, 'a, 'b, 'c)
>   .select('b, Null(Types.LONG)).distinct()
> val results = t.toRetractStream[Row](queryConfig)
> results.addSink(new StreamITCase.RetractingSink).setParallelism(1)
> env.execute()
> val expected = mutable.MutableList("1,null", "2,null", "3,null", 
> "4,null", "5,null", "6,null")
> assertEquals(expected.sorted, StreamITCase.retractedResults.sorted)
>   }
> }
> {code}
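
A minimal sketch of what such a {{RowKeySelector}} could look like (the class 
shape and constructor are assumptions, not the final implementation):
{code}
import org.apache.flink.api.java.functions.KeySelector
import org.apache.flink.types.Row

// Extracts the key fields into a Row; the Row serializer supports null
// fields via a null mask, unlike the Tuple-based selector described above.
class RowKeySelector(keyFields: Array[Int]) extends KeySelector[Row, Row] {
  override def getKey(value: Row): Row = {
    val key = new Row(keyFields.length)
    for (i <- keyFields.indices) {
      key.setField(i, value.getField(keyFields(i))) // null values are fine here
    }
    key
  }
}
{code}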



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-6584) Support multiple consecutive windows in SQL

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-6584:


Assignee: Timo Walther  (was: Fabian Hueske)

> Support multiple consecutive windows in SQL
> ---
>
> Key: FLINK-6584
> URL: https://issues.apache.org/jira/browse/FLINK-6584
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
>
> Right now, the Table API supports multiple consecutive windows as follows:
> {code}
> val table = stream.toTable(tEnv, 'rowtime.rowtime, 'int, 'double, 'float, 
> 'bigdec, 'string)
> val t = table
>   .window(Tumble over 2.millis on 'rowtime as 'w)
>   .groupBy('w)
>   .select('w.rowtime as 'rowtime, 'int.count as 'int)
>   .window(Tumble over 4.millis on 'rowtime as 'w2)
>   .groupBy('w2)
>   .select('w2.rowtime, 'w2.end, 'int.count)
> {code}
> Similar behavior should be supported by the SQL API as well. We need to 
> introduce a new auxiliary group function, but this should happen in sync with 
> Apache Calcite.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7798) Add support for windowed joins to Table API

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-7798:


Assignee: Xingcan Cui  (was: Fabian Hueske)

> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6703) Document how to take a savepoint on YARN

2017-10-15 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-6703.
---
   Resolution: Fixed
Fix Version/s: 1.4.0

1.4: 89394ec8094205cb3fd3a47a95dacd1b18c31088

> Document how to take a savepoint on YARN
> 
>
> Key: FLINK-6703
> URL: https://issues.apache.org/jira/browse/FLINK-6703
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, YARN
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
> Fix For: 1.4.0
>
>
> The documentation should have a separate entry for savepoint-related CLI 
> commands in combination with YARN. It is currently not documented that you 
> have to supply the application id, nor how you can pass it.
> {code}
> ./bin/flink savepoint  -m yarn-cluster (-yid|-yarnapplicationId) 
> 
> {code}
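
A hypothetical invocation, with both the job ID and the YARN application ID as 
placeholders:
{code}
./bin/flink savepoint <jobId> -m yarn-cluster -yid <yarnApplicationId>
{code}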



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7774) Deserializers are not cleaned up when closing input streams

2017-10-15 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-7774.
---
   Resolution: Fixed
Fix Version/s: 1.4.0

1.4: 4b6f05585bb548d1538ada43ed33149acbc9e6d4

> Deserializers are not cleaned up when closing input streams
> ---
>
> Key: FLINK-7774
> URL: https://issues.apache.org/jira/browse/FLINK-7774
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Nico Kruber
>Assignee: Nico Kruber
> Fix For: 1.4.0
>
>
> On cleanup of the {{AbstractRecordReader}}, {{StreamInputProcessor}}, and 
> {{StreamTwoInputProcessor}}, the deserializers' current buffers are cleaned 
> up but not their internal {{spanningWrapper}} and {{nonSpanningWrapper}} via 
> {{RecordDeserializer#clear}}. This call should be added.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6805) Flink Cassandra connector dependency on Netty disagrees with Flink

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205193#comment-16205193
 ] 

ASF GitHub Bot commented on FLINK-6805:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4545


> Flink Cassandra connector dependency on Netty disagrees with Flink
> --
>
> Key: FLINK-6805
> URL: https://issues.apache.org/jira/browse/FLINK-6805
> Project: Flink
>  Issue Type: Bug
>  Components: Cassandra Connector
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Shannon Carey
>Assignee: Michael Fong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> The Flink Cassandra connector has a dependency on Netty libraries (via 
> promotion of transitive dependencies by the Maven shade plugin) at version 
> 4.0.33.Final, which disagrees with the version included in Flink, 
> 4.0.27.Final, which is managed by the parent POM via a dependency on 
> netty-all.
> Due to the use of netty-all, the dependency management doesn't take effect on 
> the individual libraries such as netty-handler, netty-codec, etc.
> I suggest that dependency management of Netty should be added for all Netty 
> libraries individually (netty-handler, etc.) so that all Flink modules use 
> the same version, and similarly I suggest that exclusions be added to the 
> quickstart example POM for the individual Netty libraries so that fat JARs 
> don't include conflicting versions of Netty.
> It seems like this problem started when FLINK-6084 was implemented: 
> transitive dependencies of the flink-connector-cassandra were previously 
> omitted, and now that they are included we must make sure that they agree 
> with the Flink distribution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7791) Integrate LIST command into REST client

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205195#comment-16205195
 ] 

ASF GitHub Bot commented on FLINK-7791:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4802


> Integrate LIST command into REST client
> ---
>
> Key: FLINK-7791
> URL: https://issues.apache.org/jira/browse/FLINK-7791
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client, REST
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6805) Flink Cassandra connector dependency on Netty disagrees with Flink

2017-10-15 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-6805.
---
Resolution: Fixed

1.4: 81d7c4e8b111c6bf52000f68b64c402783a6ae74

> Flink Cassandra connector dependency on Netty disagrees with Flink
> --
>
> Key: FLINK-6805
> URL: https://issues.apache.org/jira/browse/FLINK-6805
> Project: Flink
>  Issue Type: Bug
>  Components: Cassandra Connector
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Shannon Carey
>Assignee: Michael Fong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> The Flink Cassandra connector has a dependency on Netty libraries (via 
> promotion of transitive dependencies by the Maven shade plugin) at version 
> 4.0.33.Final, which disagrees with the version included in Flink, 
> 4.0.27.Final, which is managed by the parent POM via a dependency on 
> netty-all.
> Due to the use of netty-all, the dependency management doesn't take effect on 
> the individual libraries such as netty-handler, netty-codec, etc.
> I suggest that dependency management of Netty should be added for all Netty 
> libraries individually (netty-handler, etc.) so that all Flink modules use 
> the same version, and similarly I suggest that exclusions be added to the 
> quickstart example POM for the individual Netty libraries so that fat JARs 
> don't include conflicting versions of Netty.
> It seems like this problem started when FLINK-6084 was implemented: 
> transitive dependencies of the flink-connector-cassandra were previously 
> omitted, and now that they are included we must make sure that they agree 
> with the Flink distribution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7791) Integrate LIST command into REST client

2017-10-15 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-7791.
---
Resolution: Fixed

1.4: e572961749dd45e3f7444664d9ba967fa438ab55 & 
a76de2860d2ee78e673c97fba3394f099389ad75

> Integrate LIST command into REST client
> ---
>
> Key: FLINK-7791
> URL: https://issues.apache.org/jira/browse/FLINK-7791
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client, REST
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6703) Document how to take a savepoint on YARN

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205192#comment-16205192
 ] 

ASF GitHub Bot commented on FLINK-6703:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4721


> Document how to take a savepoint on YARN
> 
>
> Key: FLINK-6703
> URL: https://issues.apache.org/jira/browse/FLINK-6703
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, YARN
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Bowen Li
>
> The documentation should have a separate entry for savepoint-related CLI 
> commands in combination with YARN. It is currently not documented that you 
> have to supply the application id, nor how you can pass it.
> {code}
> ./bin/flink savepoint  -m yarn-cluster (-yid|-yarnapplicationId) 
> 
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7774) Deserializers are not cleaned up when closing input streams

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205194#comment-16205194
 ] 

ASF GitHub Bot commented on FLINK-7774:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4783


> Deserializers are not cleaned up when closing input streams
> ---
>
> Key: FLINK-7774
> URL: https://issues.apache.org/jira/browse/FLINK-7774
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> On cleanup of the {{AbstractRecordReader}}, {{StreamInputProcessor}}, and 
> {{StreamTwoInputProcessor}}, the deserializers' current buffers are cleaned 
> up but not their internal {{spanningWrapper}} and {{nonSpanningWrapper}} via 
> {{RecordDeserializer#clear}}. This call should be added.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4545: [FLINK-6805] [Cassandra-Connector] Shade indirect ...

2017-10-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4545


---


[GitHub] flink pull request #4783: [FLINK-7774][network] fix not clearing deserialize...

2017-10-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4783


---


[GitHub] flink pull request #4721: [FLINK-6703][savepoint/doc] Document how to take a...

2017-10-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4721


---


[GitHub] flink pull request #4802: [FLINK-7791] [REST][client] Integrate LIST command...

2017-10-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4802


---


[jira] [Commented] (FLINK-7495) AbstractUdfStreamOperator#initializeState() should be called in AsyncWaitOperator#initializeState()

2017-10-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205179#comment-16205179
 ] 

Ted Yu commented on FLINK-7495:
---

[~aljoscha]:
Can you take a look?

> AbstractUdfStreamOperator#initializeState() should be called in 
> AsyncWaitOperator#initializeState()
> ---
>
> Key: FLINK-7495
> URL: https://issues.apache.org/jira/browse/FLINK-7495
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Ted Yu
>Assignee: Fang Yong
>Priority: Minor
>
> {code}
> recoveredStreamElements = context
>   .getOperatorStateStore()
>   .getListState(new ListStateDescriptor<>(STATE_NAME, 
> inStreamElementSerializer));
> {code}
> A call to AbstractUdfStreamOperator#initializeState() should be added at the 
> beginning.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7679) Upgrade maven enforcer plugin to 3.0.0-M1

2017-10-15 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-7679:
--
Description: 
I got the following build error against Java 9:
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce (enforce-maven) on 
project flink-parent: Execution enforce-maven of goal 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce failed: An API 
incompatibility was encountered while executing 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce: 
java.lang.ExceptionInInitializerError: null
[ERROR] -
[ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4.1
[ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] = 
file:/home/hbase/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4.1/maven-enforcer-plugin-1.4.1.jar
{code}
Upgrading the Maven Enforcer plugin to 3.0.0-M1 would get past the above error.

  was:
I got the following build error against Java 9:

{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce (enforce-maven) on 
project flink-parent: Execution enforce-maven of goal 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce failed: An API 
incompatibility was encountered while executing 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce: 
java.lang.ExceptionInInitializerError: null
[ERROR] -
[ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4.1
[ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] = 
file:/home/hbase/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4.1/maven-enforcer-plugin-1.4.1.jar
{code}
Upgrading the Maven Enforcer plugin to 3.0.0-M1 would get past the above error.


> Upgrade maven enforcer plugin to 3.0.0-M1
> -
>
> Key: FLINK-7679
> URL: https://issues.apache.org/jira/browse/FLINK-7679
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Ted Yu
>
> I got the following build error against Java 9:
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce (enforce-maven) 
> on project flink-parent: Execution enforce-maven of goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce failed: An API 
> incompatibility was encountered while executing 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce: 
> java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4.1
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = 
> file:/home/hbase/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4.1/maven-enforcer-plugin-1.4.1.jar
> {code}
> Upgrading the Maven Enforcer plugin to 3.0.0-M1 would get past the above error.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7730) TableFunction LEFT OUTER joins with ON predicates are broken

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-7730:
-
Priority: Critical  (was: Major)

> TableFunction LEFT OUTER joins with ON predicates are broken
> 
>
> Key: FLINK-7730
> URL: https://issues.apache.org/jira/browse/FLINK-7730
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Critical
>
> TableFunction left outer joins with predicates in the ON clause are broken. 
> Apparently, there are no tests for this and it has never worked. I observed 
> issues on several layers:
> - Table function joins do not correctly validate the equality predicate: 
> {{leftOuterJoin(func1('c) as 'd,  'a.cast(Types.STRING) === 'd)}} is rejected 
> because the predicate is not considered an equality predicate (the cast 
> needs to be pushed down).
> - Plans cannot be correctly translated: {{leftOuterJoin(func1('c) as 'd,  'c 
> === 'd)}} gives an optimizer exception.
> - SQL queries get translated but produce incorrect results. For example, 
> {{SELECT a, b, c, d FROM MyTable LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS 
> T(d) ON d = c}} returns an empty result if the condition {{d = c}} never 
> returns true. However, the outer side should be preserved and padded with 
> nulls.
> So there seem to be many issues with table function outer joins. In 
> particular, the wrong results produced by SQL queries need to be fixed quickly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7730) TableFunction LEFT OUTER joins with ON predicates are broken

2017-10-15 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205157#comment-16205157
 ] 

Fabian Hueske commented on FLINK-7730:
--

This is mainly a problem in Calcite (see [this thread on the Calcite dev 
list|https://lists.apache.org/thread.html/16caeb8b1649c4da85f9915ea723c6c5b3ced0b96914cadc24ee4e15@%3Cdev.calcite.apache.org%3E]).
It seems there is no easy fix for this issue. We might want to check whether 
we can forbid predicates in the {{ON}} clause of an {{OUTER JOIN}} against a 
{{LATERAL TABLE}}. This would probably involve copying and modifying Calcite 
code until the issue is fixed in a newer Calcite release.

> TableFunction LEFT OUTER joins with ON predicates are broken
> 
>
> Key: FLINK-7730
> URL: https://issues.apache.org/jira/browse/FLINK-7730
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>
> TableFunction left outer joins with predicates in the ON clause are broken. 
> Apparently, there are no tests for this and it has never worked. I observed 
> issues on several layers:
> - Table function joins do not correctly validate the equality predicate: 
> {{leftOuterJoin(func1('c) as 'd,  'a.cast(Types.STRING) === 'd)}} is rejected 
> because the predicate is not considered an equality predicate (the cast 
> needs to be pushed down).
> - Plans cannot be correctly translated: {{leftOuterJoin(func1('c) as 'd,  'c 
> === 'd)}} gives an optimizer exception.
> - SQL queries get translated but produce incorrect results. For example, 
> {{SELECT a, b, c, d FROM MyTable LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS 
> T(d) ON d = c}} returns an empty result if the condition {{d = c}} never 
> returns true. However, the outer side should be preserved and padded with 
> nulls.
> So there seem to be many issues with table function outer joins. In 
> particular, the wrong results produced by SQL queries need to be fixed quickly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7371) user defined aggregator assumes nr of arguments smaller or equal than number of row fields

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-7371:
-
Priority: Blocker  (was: Major)

> user defined aggregator assumes nr of arguments smaller or equal than number 
> of row fields
> --
>
> Key: FLINK-7371
> URL: https://issues.apache.org/jira/browse/FLINK-7371
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Stefano Bortoli
>Assignee: Fabian Hueske
>Priority: Blocker
>
> The definition of user-defined aggregations with more parameters than row 
> fields causes an ArrayIndexOutOfBoundsException, because the 
> indexing is based on a linear iteration over row fields. This does not 
> consider cases where fields can be used more than once and constant values 
> are passed to the aggregation function.
> For example:
> {code}
> window(partition {} order by [2] rows between $5 PRECEDING and CURRENT ROW 
> aggs [myAgg($0, $1, $3, $0, $4)])
> {code}
> where $3 and $4 are references to constants and $0 and $1 are fields, causes:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.flink.table.plan.schema.RowSchema.mapIndex(RowSchema.scala:134)
>   at 
> org.apache.flink.table.plan.schema.RowSchema$$anonfun$mapAggregateCall$1.apply(RowSchema.scala:147)
>   at 
> org.apache.flink.table.plan.schema.RowSchema$$anonfun$mapAggregateCall$1.apply(RowSchema.scala:147)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>   at 
> org.apache.flink.table.plan.schema.RowSchema.mapAggregateCall(RowSchema.scala:147)
>   at 
> org.apache.flink.table.plan.nodes.datastream.DataStreamOverAggregate$$anonfun$9.apply(DataStreamOverAggregate.scala:362)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7371) user defined aggregator assumes nr of arguments smaller or equal than number of row fields

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-7371:


Assignee: Fabian Hueske  (was: Timo Walther)

> user defined aggregator assumes nr of arguments smaller or equal than number 
> of row fields
> --
>
> Key: FLINK-7371
> URL: https://issues.apache.org/jira/browse/FLINK-7371
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Stefano Bortoli
>Assignee: Fabian Hueske
>
> The definition of user-defined aggregations with more parameters than row 
> fields causes an ArrayIndexOutOfBoundsException, because the 
> indexing is based on a linear iteration over row fields. This does not 
> consider cases where fields can be used more than once and constant values 
> are passed to the aggregation function.
> For example:
> {code}
> window(partition {} order by [2] rows between $5 PRECEDING and CURRENT ROW 
> aggs [myAgg($0, $1, $3, $0, $4)])
> {code}
> where $3 and $4 are references to constants and $0 and $1 are fields, causes:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.flink.table.plan.schema.RowSchema.mapIndex(RowSchema.scala:134)
>   at 
> org.apache.flink.table.plan.schema.RowSchema$$anonfun$mapAggregateCall$1.apply(RowSchema.scala:147)
>   at 
> org.apache.flink.table.plan.schema.RowSchema$$anonfun$mapAggregateCall$1.apply(RowSchema.scala:147)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>   at 
> org.apache.flink.table.plan.schema.RowSchema.mapAggregateCall(RowSchema.scala:147)
>   at 
> org.apache.flink.table.plan.nodes.datastream.DataStreamOverAggregate$$anonfun$9.apply(DataStreamOverAggregate.scala:362)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7051) Bump up Calcite version to 1.14

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-7051:
-
Priority: Critical  (was: Major)

> Bump up Calcite version to 1.14
> ---
>
> Key: FLINK-7051
> URL: https://issues.apache.org/jira/browse/FLINK-7051
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>Priority: Critical
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.14 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7791) Integrate LIST command into REST client

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205149#comment-16205149
 ] 

ASF GitHub Bot commented on FLINK-7791:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4802
  
merging.


> Integrate LIST command into REST client
> ---
>
> Key: FLINK-7791
> URL: https://issues.apache.org/jira/browse/FLINK-7791
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client, REST
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4802: [FLINK-7791] [REST][client] Integrate LIST command into R...

2017-10-15 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4802
  
merging.


---


[jira] [Commented] (FLINK-6094) Implement stream-stream proctime non-window inner join

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205147#comment-16205147
 ] 

ASF GitHub Bot commented on FLINK-6094:
---

Github user hequn8128 commented on the issue:

https://github.com/apache/flink/pull/4471
  
@fhueske Hi Fabian, sorry for the late update. I will resolve the 
conflicts ASAP; it has been a busy weekend :)


> Implement stream-stream proctime non-window  inner join
> ---
>
> Key: FLINK-6094
> URL: https://issues.apache.org/jira/browse/FLINK-6094
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Hequn Cheng
>
> This includes:
> 1. Implement the stream-stream proctime non-window inner join
> 2. Implement the retraction processing logic for the join
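> For illustration only (not from this issue), the query shape this enables is a 
> plain, unbounded equi-join; {{tEnv}} and the table names below are assumptions:
> {code}
> import org.apache.flink.table.api.{Table, TableEnvironment}
>
> // a minimal sketch: tables A and B are assumed to be registered on tEnv
> def nonWindowJoin(tEnv: TableEnvironment): Table =
>   tEnv.sql("SELECT a.id, a.v, b.w FROM A a INNER JOIN B b ON a.id = b.id")
> {code}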



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7802) Exception occur when empty field collection was pushed into CSVTableSource

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-7802:
-
Priority: Critical  (was: Major)

> Exception occur when empty field collection was pushed into CSVTableSource
> --
>
> Key: FLINK-7802
> URL: https://issues.apache.org/jira/browse/FLINK-7802
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: godfrey he
>Priority: Critical
>
> Consider SQL such as: select count(1) from csv_table. 
> When the above SQL is executed, an exception occurs:
> java.lang.IllegalArgumentException: At least one field must be specified
> at 
> org.apache.flink.api.java.io.RowCsvInputFormat.<init>(RowCsvInputFormat.java:50)
> So even if no fields are used, we should keep at least one column so that 
> CSVTableSource can determine the row count.
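> A minimal sketch of the proposed guard, assuming a projection push-down hook 
> (names are illustrative, not the actual CsvTableSource code):
> {code}
> // if the query projects no fields, keep one column so that the
> // RowCsvInputFormat can still be constructed and rows can be counted
> def safeProjection(requestedFields: Array[Int]): Array[Int] =
>   if (requestedFields.isEmpty) Array(0) else requestedFields
> {code}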



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4471: [FLINK-6094] [table] Implement stream-stream proctime non...

2017-10-15 Thread hequn8128
Github user hequn8128 commented on the issue:

https://github.com/apache/flink/pull/4471
  
@fhueske Hi Fabian, sorry for the late update. I will resolve the 
conflicts ASAP; it has been a busy weekend :)


---


[jira] [Assigned] (FLINK-6584) Support multiple consecutive windows in SQL

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-6584:


Assignee: Fabian Hueske  (was: Timo Walther)

> Support multiple consecutive windows in SQL
> ---
>
> Key: FLINK-6584
> URL: https://issues.apache.org/jira/browse/FLINK-6584
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>Priority: Blocker
>
> Right now, the Table API supports multiple consecutive windows as follows:
> {code}
> val table = stream.toTable(tEnv, 'rowtime.rowtime, 'int, 'double, 'float, 
> 'bigdec, 'string)
> val t = table
>   .window(Tumble over 2.millis on 'rowtime as 'w)
>   .groupBy('w)
>   .select('w.rowtime as 'rowtime, 'int.count as 'int)
>   .window(Tumble over 4.millis on 'rowtime as 'w2)
>   .groupBy('w2)
>   .select('w2.rowtime, 'w2.end, 'int.count)
> {code}
> Similar behavior should be supported by the SQL API as well. We need to 
> introduce a new auxiliary group function, but this should happen in sync with 
> Apache Calcite.
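> One possible SQL shape for this, sketched with an assumed auxiliary function 
> {{TUMBLE_ROWTIME}} that carries the rowtime of the first window into the 
> second one (the final syntax depends on Calcite); {{tEnv}} and {{t1}} are 
> assumptions:
> {code}
> import org.apache.flink.table.api.{Table, TableEnvironment}
>
> def consecutiveWindows(tEnv: TableEnvironment): Table =
>   tEnv.sql(
>     "SELECT TUMBLE_START(rt2, INTERVAL '4' SECOND), SUM(cnt) FROM (" +
>     "  SELECT TUMBLE_ROWTIME(rowtime, INTERVAL '2' SECOND) AS rt2, " +
>     "         COUNT(*) AS cnt FROM t1" +
>     "  GROUP BY TUMBLE(rowtime, INTERVAL '2' SECOND)" +
>     ") GROUP BY TUMBLE(rt2, INTERVAL '4' SECOND)")
> {code}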



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7426) Table API does not support null values in keys

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-7426:


Assignee: Fabian Hueske  (was: Timo Walther)

> Table API does not support null values in keys
> --
>
> Key: FLINK-7426
> URL: https://issues.apache.org/jira/browse/FLINK-7426
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.2
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>Priority: Blocker
>
> The Table API uses {{keyBy}} internally, however, the generated 
> {{KeySelector}} uses instances of {{Tuple}}. The {{TupleSerializer}} is not 
> able to serialize null values. This causes issues during checkpointing or 
> when using the RocksDB state backend. We need to replace all {{keyBy}} calls 
> with a custom {{RowKeySelector}}.
> {code}
> class AggregateITCase extends StreamingWithStateTestBase {
>   private val queryConfig = new StreamQueryConfig()
>   queryConfig.withIdleStateRetentionTime(Time.hours(1), Time.hours(2))
>
>   @Test
>   def testDistinct(): Unit = {
>     val env = StreamExecutionEnvironment.getExecutionEnvironment
>     env.setStateBackend(getStateBackend)
>     val tEnv = TableEnvironment.getTableEnvironment(env)
>     StreamITCase.clear
>     val t = StreamTestData.get3TupleDataStream(env).toTable(tEnv, 'a, 'b, 'c)
>       .select('b, Null(Types.LONG)).distinct()
>     val results = t.toRetractStream[Row](queryConfig)
>     results.addSink(new StreamITCase.RetractingSink).setParallelism(1)
>     env.execute()
>     val expected = mutable.MutableList("1,null", "2,null", "3,null",
>       "4,null", "5,null", "6,null")
>     assertEquals(expected.sorted, StreamITCase.retractedResults.sorted)
>   }
> }
> {code}
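> A rough sketch of such a {{RowKeySelector}} (assuming the key is itself 
> represented as a {{Row}}, whose serializer supports null fields):
> {code}
> import org.apache.flink.api.java.functions.KeySelector
> import org.apache.flink.types.Row
>
> class RowKeySelector(keyFields: Array[Int]) extends KeySelector[Row, Row] {
>   override def getKey(value: Row): Row = {
>     // copy the key fields into a Row, which (unlike Tuple) may hold nulls
>     val key = new Row(keyFields.length)
>     var i = 0
>     while (i < keyFields.length) {
>       key.setField(i, value.getField(keyFields(i)))
>       i += 1
>     }
>     key
>   }
> }
> {code}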



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7798) Add support for windowed joins to Table API

2017-10-15 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-7798:


Assignee: Fabian Hueske  (was: Xingcan Cui)

> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.
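> For reference, the SQL form that already works bounds the join with a time 
> predicate roughly like the following; the Table API should offer an 
> equivalent. Table and column names below are assumptions:
> {code}
> import org.apache.flink.table.api.{Table, TableEnvironment}
>
> def windowedJoin(tEnv: TableEnvironment): Table =
>   tEnv.sql(
>     "SELECT o.id, p.v FROM Orders o, Payments p " +
>     "WHERE o.id = p.oid " +
>     "AND p.ptime BETWEEN o.otime - INTERVAL '5' MINUTE AND o.otime")
> {code}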



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7737) On HCFS systems, FSDataOutputStream does not issue hsync only hflush which leads to data loss

2017-10-15 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205136#comment-16205136
 ] 

Stephan Ewen commented on FLINK-7737:
-

Okay, I think the root cause here is that {{hflush()}} and {{hsync()}} 
semantics are somewhat different across file systems.

We could solve this in one of the following ways:

  1. Always {{hsync()}} in the bucketing sink, but I fear that introduces a 
performance hit for other cases (like HDFS).

  2. Always pass SYNC_BLOCK in create - will that have the same effect as 
(1)?

  3. Make it configurable for the user whether they need sync or only 
flush. But I fear most users will get that wrong.

  4. Make the file systems obey a stricter definition of {{flush()}}, meaning 
it needs to guarantee persistence across the loss of a TaskManager. Then it is 
up to the file system implementer or the wrapper to forward these calls 
properly. BTW, it has just gotten a lot easier to plug in new file systems. A 
file system based on Hadoop can also be explicitly exposed to handle certain 
situations differently than the other file systems loaded through Hadoop's 
abstraction: https://github.com/apache/flink/pull/4781 (already merged into 
master)

I believe that this is an issue for more cases. For example, in order to 
guarantee recoverability of a task manager, the file system mounted under 
{{file://}} needs only {{flush()}} if it is a local file system, but needs to 
{{sync()}} if it is a mounted NFS style file system.
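
As a rough sketch of option (4), a thin helper could map the weaker call to 
the stronger one whenever durability is required; {{requireDurability}} is an 
assumed configuration switch, not existing code:

{code}
import org.apache.hadoop.fs.FSDataOutputStream

// force durable persistence when flush() alone is not strong enough
// for the underlying file system (e.g. NFS-style mounts)
def flushOrSync(os: FSDataOutputStream, requireDurability: Boolean): Unit = {
  if (requireDurability) {
    os.hsync()   // persist to disk on the datanodes
  } else {
    os.hflush()  // flush to datanode memory only
  }
}
{code}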

> On HCFS systems, FSDataOutputStream does not issue hsync only hflush which 
> leads to data loss
> -
>
> Key: FLINK-7737
> URL: https://issues.apache.org/jira/browse/FLINK-7737
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.3.2
> Environment: Dev
>Reporter: Ryan Hobbs
>Priority: Blocker
> Fix For: 1.4.0
>
>
> During several tests where we simulated failure conditions, we have observed 
> that on HCFS systems where the data stream is of type FSDataOutputStream, 
> Flink will issue hflush() and not hsync() which results in data loss.
> In the class *StreamWriterBase.java* the code below will execute hsync if the 
> output stream is of type *HdfsDataOutputStream* but not for streams of type 
> *FSDataOutputStream*.  Is this by design?
> {code}
> protected void hflushOrSync(FSDataOutputStream os) throws IOException {
>     try {
>         // At this point the refHflushOrSync cannot be null,
>         // since register method would have thrown if it was.
>         this.refHflushOrSync.invoke(os);
>         if (os instanceof HdfsDataOutputStream) {
>             ((HdfsDataOutputStream) os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
>         }
>     } catch (InvocationTargetException e) {
>         String msg = "Error while trying to hflushOrSync!";
>         LOG.error(msg + " " + e.getCause());
>         Throwable cause = e.getCause();
>         if (cause instanceof IOException) {
>             throw (IOException) cause;
>         }
>         throw new RuntimeException(msg, e);
>     } catch (Exception e) {
>         String msg = "Error while trying to hflushOrSync!";
>         LOG.error(msg + " " + e);
>         throw new RuntimeException(msg, e);
>     }
> }
> {code}
> Could a potential fix be to perform a sync even on streams of type 
> *FSDataOutputStream*?
> {code}
> if (os instanceof HdfsDataOutputStream) {
>     ((HdfsDataOutputStream) os).hsync(EnumSet.of(HdfsDataOutputStream.SyncFlag.UPDATE_LENGTH));
> } else if (os instanceof FSDataOutputStream) {
>     os.hsync();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-2747) TypeExtractor does not correctly analyze Scala Immutables (AnyVal)

2017-10-15 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205131#comment-16205131
 ] 

Stephan Ewen commented on FLINK-2747:
-

One way we can fix this is by obeying the {{enableObjectReuse()}} flags in the 
Network Stack deserializers as well. For immutable types, the non-reusing 
deserialization path would then be taken, which does not require an empty 
instance.

A more involved fix is to detect in the {{TypeInformation}} the immutable types 
that cannot be instantiated empty, and force non-reuse mode for them.

> TypeExtractor does not correctly analyze Scala Immutables (AnyVal)
> --
>
> Key: FLINK-2747
> URL: https://issues.apache.org/jira/browse/FLINK-2747
> Project: Flink
>  Issue Type: Bug
>  Components: Type Serialization System
>Affects Versions: 0.10.0, 1.0.0
>Reporter: Aljoscha Krettek
> Fix For: 1.0.0
>
>
> This example program only works correctly if Kryo is force-enabled.
> {code}
> object Test {
>   class Id(val underlying: Int) extends AnyVal
>   class X(var id: Id) {
>     def this() { this(new Id(0)) }
>   }
>   class MySource extends SourceFunction[X] {
>     def run(ctx: SourceFunction.SourceContext[X]) {
>       ctx.collect(new X(new Id(1)))
>     }
>     def cancel() {}
>   }
>   def main(args: Array[String]) {
>     val env = StreamExecutionEnvironment.getExecutionEnvironment
>     env.addSource(new MySource).print
>     env.execute("Test")
>   }
> }
> {code}
> The program fails with this:
> {code}
> Caused by: java.lang.RuntimeException: Cannot instantiate class.
>   at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.createInstance(PojoSerializer.java:227)
>   at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:421)
>   at 
> org.apache.flink.streaming.runtime.streamrecord.StreamRecordSerializer.deserialize(StreamRecordSerializer.java:110)
>   at 
> org.apache.flink.streaming.runtime.streamrecord.StreamRecordSerializer.deserialize(StreamRecordSerializer.java:41)
>   at 
> org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
>   at 
> org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:125)
>   at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:136)
>   at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:55)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:198)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6615) tmp directory not cleaned up on shutdown

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205123#comment-16205123
 ] 

ASF GitHub Bot commented on FLINK-6615:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4787
  
@bowenli86 I hope you don't mind that I pushed back a bit. It's my job to 
be careful about such things...


> tmp directory not cleaned up on shutdown
> 
>
> Key: FLINK-6615
> URL: https://issues.apache.org/jira/browse/FLINK-6615
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.2.0
>Reporter: Andrey
>Assignee: Bowen Li
>
> Steps to reproduce:
> 1) Stop task manager gracefully (kill -6 )
> 2) In the logs:
> {code}
> 2017-05-17 13:35:50,147 INFO  org.apache.zookeeper.ClientCnxn 
>   - EventThread shut down [main-EventThread]
> 2017-05-17 13:35:50,200 ERROR 
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - IOManager 
> failed to properly clean up temp file directory: 
> /mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47 
> [flink-akka.actor.default-dispatcher-2]
> java.nio.file.DirectoryNotEmptyException: 
> /mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47
>   at 
> sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:242)
>   at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
>   at java.nio.file.Files.delete(Files.java:1126)
>   at org.apache.flink.util.FileUtils.deleteDirectory(FileUtils.java:154)
>   at 
> org.apache.flink.runtime.io.disk.iomanager.IOManager.shutdown(IOManager.java:109)
>   at 
> org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:185)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:241)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
> {code}
> Expected:
> * on shutdown, delete the non-empty directory anyway. 
> Notes:
> * after the process terminated, I checked the 
> "/mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47" directory and 
> didn't find anything there. So it looks like a timing issue.
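> A minimal sketch of an order-independent recursive delete via java.nio 
> (illustrative, not the current FileUtils code):
> {code}
> import java.io.IOException
> import java.nio.file._
> import java.nio.file.attribute.BasicFileAttributes
>
> // delete children first, then the directory itself, so that a directory
> // which is still non-empty at shutdown no longer fails the cleanup
> def deleteRecursively(dir: Path): Unit = {
>   Files.walkFileTree(dir, new SimpleFileVisitor[Path] {
>     override def visitFile(f: Path, a: BasicFileAttributes): FileVisitResult = {
>       Files.deleteIfExists(f)
>       FileVisitResult.CONTINUE
>     }
>     override def postVisitDirectory(d: Path, e: IOException): FileVisitResult = {
>       Files.deleteIfExists(d)
>       FileVisitResult.CONTINUE
>     }
>   })
> }
> {code}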



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4787: [FLINK-6615][core] simplify FileUtils

2017-10-15 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4787
  
@bowenli86 I hope you don't mind that I pushed back a bit. It's my job to 
be careful about such things...


---


[jira] [Commented] (FLINK-6615) tmp directory not cleaned up on shutdown

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205093#comment-16205093
 ] 

ASF GitHub Bot commented on FLINK-6615:
---

Github user bowenli86 closed the pull request at:

https://github.com/apache/flink/pull/4787


> tmp directory not cleaned up on shutdown
> 
>
> Key: FLINK-6615
> URL: https://issues.apache.org/jira/browse/FLINK-6615
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.2.0
>Reporter: Andrey
>Assignee: Bowen Li
>
> Steps to reproduce:
> 1) Stop task manager gracefully (kill -6 )
> 2) In the logs:
> {code}
> 2017-05-17 13:35:50,147 INFO  org.apache.zookeeper.ClientCnxn 
>   - EventThread shut down [main-EventThread]
> 2017-05-17 13:35:50,200 ERROR 
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - IOManager 
> failed to properly clean up temp file directory: 
> /mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47 
> [flink-akka.actor.default-dispatcher-2]
> java.nio.file.DirectoryNotEmptyException: 
> /mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47
>   at 
> sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:242)
>   at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
>   at java.nio.file.Files.delete(Files.java:1126)
>   at org.apache.flink.util.FileUtils.deleteDirectory(FileUtils.java:154)
>   at 
> org.apache.flink.runtime.io.disk.iomanager.IOManager.shutdown(IOManager.java:109)
>   at 
> org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:185)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:241)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
> {code}
> Expected:
> * on shutdown, delete the non-empty directory anyway. 
> Notes:
> * after the process terminated, I checked the 
> "/mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47" directory and 
> didn't find anything there. So it looks like a timing issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4787: [FLINK-6615][core] simplify FileUtils

2017-10-15 Thread bowenli86
Github user bowenli86 closed the pull request at:

https://github.com/apache/flink/pull/4787


---


[jira] [Commented] (FLINK-6615) tmp directory not cleaned up on shutdown

2017-10-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205092#comment-16205092
 ] 

ASF GitHub Bot commented on FLINK-6615:
---

Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/4787
  
@StephanEwen Thanks for letting me know. I'll close this PR


> tmp directory not cleaned up on shutdown
> 
>
> Key: FLINK-6615
> URL: https://issues.apache.org/jira/browse/FLINK-6615
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.2.0
>Reporter: Andrey
>Assignee: Bowen Li
>
> Steps to reproduce:
> 1) Stop task manager gracefully (kill -6 )
> 2) In the logs:
> {code}
> 2017-05-17 13:35:50,147 INFO  org.apache.zookeeper.ClientCnxn 
>   - EventThread shut down [main-EventThread]
> 2017-05-17 13:35:50,200 ERROR 
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - IOManager 
> failed to properly clean up temp file directory: 
> /mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47 
> [flink-akka.actor.default-dispatcher-2]
> java.nio.file.DirectoryNotEmptyException: 
> /mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47
>   at 
> sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:242)
>   at 
> sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
>   at java.nio.file.Files.delete(Files.java:1126)
>   at org.apache.flink.util.FileUtils.deleteDirectory(FileUtils.java:154)
>   at 
> org.apache.flink.runtime.io.disk.iomanager.IOManager.shutdown(IOManager.java:109)
>   at 
> org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:185)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.postStop(TaskManager.scala:241)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
> {code}
> Expected:
> * on shutdown, delete the non-empty directory anyway. 
> Notes:
> * after the process terminated, I checked the 
> "/mnt/flink/tmp/flink-io-66f1d0ec-8976-41bf-9575-f80b181b0e47" directory and 
> didn't find anything there. So it looks like a timing issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4787: [FLINK-6615][core] simplify FileUtils

2017-10-15 Thread bowenli86
Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/4787
  
@StephanEwen Thanks for letting me know. I'll close this PR


---


[jira] [Closed] (FLINK-6684) Remove AsyncCheckpointRunnable from StreamTask

2017-10-15 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li closed FLINK-6684.
---
Resolution: Invalid

> Remove AsyncCheckpointRunnable from StreamTask
> --
>
> Key: FLINK-6684
> URL: https://issues.apache.org/jira/browse/FLINK-6684
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Stefan Richter
>Assignee: Bowen Li
>
> Right now, {{StreamTask}} executes {{AsyncCheckpointRunnable}} to run the 
> async part of a snapshot. However, it seems that currently the main reason 
> for executing this code in a separate thread is to avoid its execution under 
> the checkpoint lock, so that processing can proceed.
> Actually, the checkpoint is already triggered asynchronously, in 
> {{Task::triggerCheckpointBarrier}}. We could also execute the checkpointing 
> without executing {{AsyncCheckpointRunnable}}, by just running the code 
> inside the thread that is spawned in {{Task::triggerCheckpointBarrier}}. We 
> could simply:
> 1) Run the synchronous part of the checkpoint under the checkpointing lock.
> 2) Run the asynchronous part of the checkpoint without holding the 
> checkpointing lock.
> 3) Return a {{Future}} from {{StatefulTask::triggerCheckpoint}} when 
> called from {{Task::triggerCheckpointBarrier}}.
> This would simplify the code and make the usage of the 
> {{SafetyNetCloseableRegistry}} possible as intended. 
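> A rough sketch of this flow (all names below are assumptions, not the actual 
> {{StreamTask}} code):
> {code}
> // runs inside the thread spawned by Task::triggerCheckpointBarrier
> class CheckpointSketch(checkpointLock: AnyRef) {
>   def runSynchronousPart(): Unit = ()   // stub for the sync snapshot part
>   def runAsynchronousPart(): Unit = ()  // stub for the async snapshot part
>
>   def triggerCheckpoint(): Unit = {
>     checkpointLock.synchronized {
>       runSynchronousPart()   // 1) under the checkpointing lock
>     }
>     runAsynchronousPart()    // 2) outside the lock, processing can proceed
>   }
> }
> {code}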



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7021) Flink Task Manager hangs on startup if one Zookeeper node is unresolvable

2017-10-15 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-7021:

Priority: Blocker  (was: Major)

> Flink Task Manager hangs on startup if one Zookeeper node is unresolvable
> -
>
> Key: FLINK-7021
> URL: https://issues.apache.org/jira/browse/FLINK-7021
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.0, 1.3.0, 1.2.1, 1.3.1
> Environment: Kubernetes cluster running:
> * Flink 1.3.0 Job Manager & Task Manager on Java 8u131
> * Zookeeper 3.4.10 cluster with 3 nodes
>Reporter: Scott Kidder
>Assignee: Scott Kidder
>Priority: Blocker
> Fix For: 1.4.0
>
>
> h2. Problem
> Flink Task Manager will hang during startup if one of the Zookeeper nodes in 
> the Zookeeper connection string is unresolvable.
> h2. Expected Behavior
> Flink should retry name resolution & connection to Zookeeper nodes with 
> exponential back-off.
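> A minimal sketch of such a retry loop (the attempt count and delays are 
> illustrative only):
> {code}
> import java.net.{InetAddress, UnknownHostException}
>
> // retry hostname resolution with exponential back-off instead of hanging
> // or failing fast on the first unresolvable Zookeeper node
> def resolveWithBackoff(host: String, maxAttempts: Int = 10): InetAddress = {
>   var delayMs = 100L
>   var attempt = 1
>   while (true) {
>     try {
>       return InetAddress.getByName(host)
>     } catch {
>       case e: UnknownHostException =>
>         if (attempt >= maxAttempts) throw e
>         Thread.sleep(delayMs)
>         delayMs = math.min(delayMs * 2, 10000L)
>         attempt += 1
>     }
>   }
>   throw new IllegalStateException("unreachable")
> }
> {code}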
> h2. Environment Details
> We're running Flink and Zookeeper in Kubernetes on CoreOS. CoreOS can run in 
> a configuration that automatically detects and applies operating system 
> updates. We have a Zookeeper node running on the same CoreOS instance as 
> Flink. It's possible that the Zookeeper node will not yet be started when the 
> Flink components are started. This could cause hostname resolution of the 
> Zookeeper nodes to fail.
> h3. Flink Task Manager Logs
> {noformat}
> 2017-06-27 15:38:51,713 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Using 
> configured hostname/address for TaskManager: 10.2.45.11
> 2017-06-27 15:38:51,714 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager
> 2017-06-27 15:38:51,714 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager actor system at 10.2.45.11:6122.
> 2017-06-27 15:38:52,950 INFO  akka.event.slf4j.Slf4jLogger
>   - Slf4jLogger started
> 2017-06-27 15:38:53,079 INFO  Remoting
>   - Starting remoting
> 2017-06-27 15:38:53,573 INFO  Remoting
>   - Remoting started; listening on addresses 
> :[akka.tcp://flink@10.2.45.11:6122]
> 2017-06-27 15:38:53,576 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager actor
> 2017-06-27 15:38:53,660 INFO  
> org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig 
> [server address: /10.2.45.11, server port: 6121, ssl enabled: false, memory 
> segment size (bytes): 32768, transport type: NIO, number of server threads: 2 
> (manual), number of client threads: 2 (manual), server connect backlog: 0 
> (use Netty's default), client connect timeout (sec): 120, send/receive buffer 
> size (bytes): 0 (use Netty's default)]
> 2017-06-27 15:38:53,682 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration  - Messages 
> have a max timeout of 1 ms
> 2017-06-27 15:38:53,688 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerServices - Temporary 
> file directory '/tmp': total 49 GB, usable 42 GB (85.71% usable)
> 2017-06-27 15:38:54,071 INFO  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool  - Allocated 96 
> MB for network buffer pool (number of memory segments: 3095, bytes per 
> segment: 32768).
> 2017-06-27 15:38:54,564 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Starting the 
> network environment and its components.
> 2017-06-27 15:38:54,576 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> initialization (took 4 ms).
> 2017-06-27 15:38:54,677 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> initialization (took 101 ms). Listening on SocketAddress /10.2.45.11:6121.
> 2017-06-27 15:38:54,981 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerServices - Limiting 
> managed memory to 0.7 of the currently free heap space (612 MB), memory will 
> be allocated lazily.
> 2017-06-27 15:38:55,050 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> uses directory /tmp/flink-io-ca01554d-f25e-4c17-a828-96d82b43d4a7 for spill 
> files.
> 2017-06-27 15:38:55,061 INFO  org.apache.flink.runtime.metrics.MetricRegistry 
>   - Configuring StatsDReporter with {interval=10 SECONDS, 
> port=8125, host=localhost, 
> class=org.apache.flink.metrics.statsd.StatsDReporter}.
> 2017-06-27 15:38:55,065 INFO  org.apache.flink.metrics.statsd.StatsDReporter  
>   - Configured StatsDReporter with {host:localhost, port:8125}
> 2017-06-27 15:38:55,065 INFO  org.apache.flin

[jira] [Updated] (FLINK-7021) Flink Task Manager hangs on startup if one Zookeeper node is unresolvable

2017-10-15 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-7021:

Fix Version/s: 1.4.0

> Flink Task Manager hangs on startup if one Zookeeper node is unresolvable
> -
>
> Key: FLINK-7021
> URL: https://issues.apache.org/jira/browse/FLINK-7021
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.0, 1.3.0, 1.2.1, 1.3.1
> Environment: Kubernetes cluster running:
> * Flink 1.3.0 Job Manager & Task Manager on Java 8u131
> * Zookeeper 3.4.10 cluster with 3 nodes
>Reporter: Scott Kidder
>Assignee: Scott Kidder
> Fix For: 1.4.0
>
>
> h2. Problem
> Flink Task Manager will hang during startup if one of the Zookeeper nodes in 
> the Zookeeper connection string is unresolvable.
> h2. Expected Behavior
> Flink should retry name resolution & connection to Zookeeper nodes with 
> exponential back-off.
> h2. Environment Details
> We're running Flink and Zookeeper in Kubernetes on CoreOS. CoreOS can run in 
> a configuration that automatically detects and applies operating system 
> updates. We have a Zookeeper node running on the same CoreOS instance as 
> Flink. It's possible that the Zookeeper node will not yet be started when the 
> Flink components are started. This could cause hostname resolution of the 
> Zookeeper nodes to fail.
> h3. Flink Task Manager Logs
> {noformat}
> 2017-06-27 15:38:51,713 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Using 
> configured hostname/address for TaskManager: 10.2.45.11
> 2017-06-27 15:38:51,714 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager
> 2017-06-27 15:38:51,714 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager actor system at 10.2.45.11:6122.
> 2017-06-27 15:38:52,950 INFO  akka.event.slf4j.Slf4jLogger
>   - Slf4jLogger started
> 2017-06-27 15:38:53,079 INFO  Remoting
>   - Starting remoting
> 2017-06-27 15:38:53,573 INFO  Remoting
>   - Remoting started; listening on addresses 
> :[akka.tcp://flink@10.2.45.11:6122]
> 2017-06-27 15:38:53,576 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager actor
> 2017-06-27 15:38:53,660 INFO  
> org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig 
> [server address: /10.2.45.11, server port: 6121, ssl enabled: false, memory 
> segment size (bytes): 32768, transport type: NIO, number of server threads: 2 
> (manual), number of client threads: 2 (manual), server connect backlog: 0 
> (use Netty's default), client connect timeout (sec): 120, send/receive buffer 
> size (bytes): 0 (use Netty's default)]
> 2017-06-27 15:38:53,682 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration  - Messages 
> have a max timeout of 1 ms
> 2017-06-27 15:38:53,688 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerServices - Temporary 
> file directory '/tmp': total 49 GB, usable 42 GB (85.71% usable)
> 2017-06-27 15:38:54,071 INFO  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool  - Allocated 96 
> MB for network buffer pool (number of memory segments: 3095, bytes per 
> segment: 32768).
> 2017-06-27 15:38:54,564 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Starting the 
> network environment and its components.
> 2017-06-27 15:38:54,576 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> initialization (took 4 ms).
> 2017-06-27 15:38:54,677 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> initialization (took 101 ms). Listening on SocketAddress /10.2.45.11:6121.
> 2017-06-27 15:38:54,981 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerServices - Limiting 
> managed memory to 0.7 of the currently free heap space (612 MB), memory will 
> be allocated lazily.
> 2017-06-27 15:38:55,050 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> uses directory /tmp/flink-io-ca01554d-f25e-4c17-a828-96d82b43d4a7 for spill 
> files.
> 2017-06-27 15:38:55,061 INFO  org.apache.flink.runtime.metrics.MetricRegistry 
>   - Configuring StatsDReporter with {interval=10 SECONDS, 
> port=8125, host=localhost, 
> class=org.apache.flink.metrics.statsd.StatsDReporter}.
> 2017-06-27 15:38:55,065 INFO  org.apache.flink.metrics.statsd.StatsDReporter  
>   - Configured StatsDReporter with {host:localhost, port:8125}
> 2017-06-27 15:38:55,065 INFO  org.apache.flink.runtime.metrics.MetricRegistry 
>   

[jira] [Created] (FLINK-7843) Improve and enhance documentation for system metrics

2017-10-15 Thread Hai Zhou UTC+8 (JIRA)
Hai Zhou UTC+8 created FLINK-7843:
-

 Summary: Improve and enhance documentation for system metrics
 Key: FLINK-7843
 URL: https://issues.apache.org/jira/browse/FLINK-7843
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.3.2
Reporter: Hai Zhou UTC+8
Assignee: Hai Zhou UTC+8
Priority: Critical
 Fix For: 1.4.0


I think we should make the following improvements to the system metrics section 
of the documentation:

# Add a column for the *Type* of each metric, e.g. Counters, Gauges, Histograms 
and Meters
# Extend the *Description* of each metric with its unit, e.g. in bytes, in 
megabytes, in nanoseconds, in milliseconds
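
For illustration, a row of the improved table could look like this (the metric 
and values are examples only):

{noformat}
Metric          | Type    | Description
numBytesInLocal | Counter | The total number of bytes read from a local source, in bytes
{noformat}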



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7686) Add Flink Forward Berlin 2017 conference slides to the flink website

2017-10-15 Thread Hai Zhou UTC+8 (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205062#comment-16205062
 ] 

Hai Zhou UTC+8 commented on FLINK-7686:
---

PR: [Update community.md|https://github.com/apache/flink-web/pull/87]

> Add Flink Forward Berlin 2017 conference slides to the flink website
> 
>
> Key: FLINK-7686
> URL: https://issues.apache.org/jira/browse/FLINK-7686
> Project: Flink
>  Issue Type: Wish
>  Components: Project Website
>Reporter: Hai Zhou UTC+8
>Priority: Trivial
>
> I recently watched the [Flink Forward Berlin 
> 2017|https://berlin.flink-forward.org/sessions/] conference slides; the 
> content is very good.
> I think we should add them to the [flink 
> website|http://flink.apache.org/community.html] so that more people can find 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (FLINK-7686) Add Flink Forward Berlin 2017 conference slides to the flink website

2017-10-15 Thread Hai Zhou UTC+8 (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou UTC+8 reopened FLINK-7686:
---

> Add Flink Forward Berlin 2017 conference slides to the flink website
> 
>
> Key: FLINK-7686
> URL: https://issues.apache.org/jira/browse/FLINK-7686
> Project: Flink
>  Issue Type: Wish
>  Components: Project Website
>Reporter: Hai Zhou UTC+8
>Priority: Trivial
>
> I recently watched the [Flink Forward Berlin 
> 2017|https://berlin.flink-forward.org/sessions/] conference slides; the 
> content is very good.
> I think we should add them to the [flink 
> website|http://flink.apache.org/community.html] so that more people can find 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-7799) Improve performance of windowed joins

2017-10-15 Thread Xingcan Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204554#comment-16204554
 ] 

Xingcan Cui edited comment on FLINK-7799 at 10/15/17 8:11 AM:
--

Hi [~fhueske], I've reconsidered this problem and found a drawback of the time 
block solution. 

Specifically, if we merge all the rows belonging to the same time block into an 
entry (the value of which is either a {{Map}} or a {{List}}) of the 
{{MapState}}, the minimum operating unit of the state becomes a collection. 
That means every time we store/remove a single row, all the data in the same 
block must also be rewritten, which will definitely bring a lot of extra cost. 
Maybe we can set up two states (one for the real data and the other for the 
block index) for that?

Anyway, since the RocksDB backend should be widely used in real applications 
and the {{MapState}} entries are ordered in it, I wonder if something like a 
hint mechanism could be provided in the state API, so that the join function 
can be aware of the ordering.


was (Author: xccui):
Hi [~fhueske], I've reconsidered this problem and found a drawback of the time 
block solution. 

Specifically, if we merge all the rows belonging to the same time block into an 
entry (the value of which is either a {{Map}} or a {{List}}) of the 
{{MapState}}, the minimum operating unit of the state becomes a collection. 
That means every time we store/remove a single row, all the data in the same 
block must also be rewritten, which will definitely bring a lot of extra cost.

If that drawback cannot be eliminated, I wonder if we could improve the join 
performance from another point of view. Since the RocksDB backend should be 
widely used in real applications and the {{MapState}} entries are ordered in 
it, can we provide something like a hint mechanism in the state API, so that 
the join function can be aware of the ordering?

> Improve performance of windowed joins
> -
>
> Key: FLINK-7799
> URL: https://issues.apache.org/jira/browse/FLINK-7799
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Critical
>
> The performance of windowed joins can be improved by changing the state 
> access patterns.
> Right now, rows are inserted into a MapState with their timestamp as key. 
> Since we use a time resolution of 1ms, this means that the full key space of 
> the state must be iterated and many map entries must be accessed when joining 
> or evicting rows. 
> A better strategy would be to block the time into larger intervals and 
> register the rows in their respective interval. Another benefit would be that 
> we can directly access the state entries because we know exactly which 
> timestamps to look up. Hence, we can limit the state access to the relevant 
> section during joining and state eviction. 
> A good interval size needs to be identified and might depend on the 
> size of the window.
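> A sketch of the proposed bucketing (the interval size is an open parameter):
> {code}
> // map a row's timestamp to the start of its interval; rows are then
> // registered in the MapState under the interval key instead of the raw
> // millisecond timestamp, so joins and eviction touch far fewer entries
> def intervalStart(timestamp: Long, intervalSizeMs: Long): Long =
>   timestamp - (timestamp % intervalSizeMs)
> {code}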



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)