[jira] [Assigned] (FLINK-8331) FieldParsers do not correctly set EMPT_COLUMN error state

2017-12-29 Thread sunjincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng reassigned FLINK-8331:
--

Assignee: sunjincheng

> FieldParsers do not correctly set EMPT_COLUMN error state
> -
>
> Key: FLINK-8331
> URL: https://issues.apache.org/jira/browse/FLINK-8331
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.5.0, 1.4.1
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> Some {{FieldParser}} do not correctly set the EMPTY_COLUMN error state if a 
> field is empty.
> Instead, they try to parse the field value from an empty String which fails, 
> e.g., in case of the {{DoubleParser}} with a {{NumberFormatException}}.
> The {{RowCsvInputFormat}} has a flag to interpret empty fields as {{null}} 
> values. The implementation requires that all {{FieldParser}} correctly return 
> the EMPTY_COLUMN error state in case of an empty field.
> Affected {{FieldParser}}:
> - BigDecParser
> - BigIntParser
> - DoubleParser
> - FloatParser
> - SqlDateParser
> - SqlTimeParser
> - SqlTimestampParser



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8331) FieldParsers do not correctly set EMPT_COLUMN error state

2017-12-29 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8331:


 Summary: FieldParsers do not correctly set EMPT_COLUMN error state
 Key: FLINK-8331
 URL: https://issues.apache.org/jira/browse/FLINK-8331
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 1.5.0, 1.4.1
Reporter: Fabian Hueske


Some {{FieldParser}} do not correctly set the EMPTY_COLUMN error state if a 
field is empty.
Instead, they try to parse the field value from an empty String which fails, 
e.g., in case of the {{DoubleParser}} with a {{NumberFormatException}}.

The {{RowCsvInputFormat}} has a flag to interpret empty fields as {{null}} 
values. The implementation requires that all {{FieldParser}} correctly return 
the EMPTY_COLUMN error state in case of an empty field.

Affected {{FieldParser}}:

- BigDecParser
- BigIntParser
- DoubleParser
- FloatParser
- SqlDateParser
- SqlTimeParser
- SqlTimestampParser



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat

2017-12-29 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske reassigned FLINK-3655:


Assignee: Fabian Hueske

> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat
> -
>
> Key: FLINK-3655
> URL: https://issues.apache.org/jira/browse/FLINK-3655
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Gna Phetsarath
>Assignee: Fabian Hueske
>  Labels: starter
> Fix For: 1.5.0
>
>
> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat so that a DataSource will process the directories 
> sequentially.
>
> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
> in Scala
>env.readFile(paths: Seq[String])
> or 
>   env.readFile(path: String, otherPaths: String*)
> Wildcard support would be a bonus.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat

2017-12-29 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-3655:
-
Fix Version/s: 1.5.0

> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat
> -
>
> Key: FLINK-3655
> URL: https://issues.apache.org/jira/browse/FLINK-3655
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Gna Phetsarath
>Priority: Minor
>  Labels: starter
> Fix For: 1.5.0
>
>
> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat so that a DataSource will process the directories 
> sequentially.
>
> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
> in Scala
>env.readFile(paths: Seq[String])
> or 
>   env.readFile(path: String, otherPaths: String*)
> Wildcard support would be a bonus.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat

2017-12-29 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-3655:
-
Priority: Major  (was: Minor)

> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat
> -
>
> Key: FLINK-3655
> URL: https://issues.apache.org/jira/browse/FLINK-3655
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Gna Phetsarath
>  Labels: starter
> Fix For: 1.5.0
>
>
> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat so that a DataSource will process the directories 
> sequentially.
>
> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
> in Scala
>env.readFile(paths: Seq[String])
> or 
>   env.readFile(path: String, otherPaths: String*)
> Wildcard support would be a bonus.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat

2017-12-29 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306527#comment-16306527
 ] 

Fabian Hueske commented on FLINK-3655:
--

Hi [~sjwiesman], sorry for the delay. I just had a look at the PR. 
The changes look mostly good but it breaks the public API in some places. 
IMO, it could go into Flink 1.5.0 after some adjustments.

Because it's more than 1.5 years since the PR has been opened, I'll rework it 
myself and will open a new PR with my changes put on top of the initial PR.
It would be great if you could give it a try once I've opened my PR.

Thanks, Fabian

> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat
> -
>
> Key: FLINK-3655
> URL: https://issues.apache.org/jira/browse/FLINK-3655
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Gna Phetsarath
>Priority: Minor
>  Labels: starter
> Fix For: 1.5.0
>
>
> Allow comma-separated or multiple directories to be specified for 
> FileInputFormat so that a DataSource will process the directories 
> sequentially.
>
> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
> in Scala
>env.readFile(paths: Seq[String])
> or 
>   env.readFile(path: String, otherPaths: String*)
> Wildcard support would be a bonus.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-8171) Remove work arounds in Flip6LocalStreamEnvironment

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306404#comment-16306404
 ] 

ASF GitHub Bot commented on FLINK-8171:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/5101
  
Thanks for the review @zentol. I've rebased this PR and once Travis gives 
green light, I'll merge it.


> Remove work arounds in Flip6LocalStreamEnvironment
> --
>
> Key: FLINK-8171
> URL: https://issues.apache.org/jira/browse/FLINK-8171
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> After adding FLINK-7956, it is no longer necessary that the 
> {{Flip6LocalStreamEnvironment}} waits for the registration of TaskManagers 
> before submitting a job. Moreover, it is also possible to use slot sharing 
> when submitting jobs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #5101: [FLINK-8171] [flip6] Remove work arounds from Flip6LocalS...

2017-12-29 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/5101
  
Thanks for the review @zentol. I've rebased this PR and once Travis gives 
green light, I'll merge it.


---


[jira] [Commented] (FLINK-8330) Remove FlinkYarnCLI

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306402#comment-16306402
 ] 

ASF GitHub Bot commented on FLINK-8330:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5217

[FLINK-8330] [flip6] Remove FlinkYarnCLI

## What is the purpose of the change

The `FlinkYarnCLI` is not needed and is, thus, being removed.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink removeYarnCli

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5217.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5217


commit 9504dd313f7842aae4d1cd326a99e47a774df067
Author: Till Rohrmann 
Date:   2017-12-08T09:12:23Z

[FLINK-8330] [flip6] Remove FlinkYarnCLI

The FlinkYarnCLI is not needed and is, thus, being removed.




> Remove FlinkYarnCLI
> ---
>
> Key: FLINK-8330
> URL: https://issues.apache.org/jira/browse/FLINK-8330
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> Remove the {{FlinkYarnCLI}} because it is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5217: [FLINK-8330] [flip6] Remove FlinkYarnCLI

2017-12-29 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5217

[FLINK-8330] [flip6] Remove FlinkYarnCLI

## What is the purpose of the change

The `FlinkYarnCLI` is not needed and is, thus, being removed.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink removeYarnCli

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5217.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5217


commit 9504dd313f7842aae4d1cd326a99e47a774df067
Author: Till Rohrmann 
Date:   2017-12-08T09:12:23Z

[FLINK-8330] [flip6] Remove FlinkYarnCLI

The FlinkYarnCLI is not needed and is, thus, being removed.




---


[jira] [Created] (FLINK-8330) Remove FlinkYarnCLI

2017-12-29 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8330:


 Summary: Remove FlinkYarnCLI
 Key: FLINK-8330
 URL: https://issues.apache.org/jira/browse/FLINK-8330
 Project: Flink
  Issue Type: Improvement
  Components: Client
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


Remove the {{FlinkYarnCLI}} because it is no longer needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-8329) Move YarnClient out of YarnClusterClient

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306398#comment-16306398
 ] 

ASF GitHub Bot commented on FLINK-8329:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5216

[FLINK-8329] [flip6] Move YarnClient to AbstractYarnClusterDescriptor

## What is the purpose of the change

Moves the YarnClient from the YarnClusterClient to the 
AbstractYarnClusterDescriptor.
This makes the latter responsible for the lifecycle management of the 
client and gives
a better separation of concerns.

This PR is based on #5215 

## Brief change log

- Move the `YarnClient` from the `YarnClusterClient` to the 
`AbstractYarnClusterDescriptor`

## Verifying this change

This change is already covered by existing tests

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
moveYarnClientToYarnDescriptor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5216.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5216


commit 69ef9787f1278f40bfeaa685379648297e1cb6f0
Author: Till Rohrmann 
Date:   2017-12-07T12:57:24Z

[FLINK-8328] [flip6] Move Yarn ApplicationStatus polling out of 
YarnClusterClient

Introduce YarnApplicationStatusMonitor which does the Yarn 
ApplicationStatus polling in
the FlinkYarnSessionCli. This decouples the YarnClusterClient from the 
actual communication
with Yarn and, thus, gives a better separation of concerns.

commit fa3b31feba7f2dd5a2fb07cc35d56c8f1a63f75b
Author: Till Rohrmann 
Date:   2017-12-20T15:43:21Z

[FLINK-8329] [flip6] Move YarnClient to AbstractYarnClusterDescriptor

Moves the YarnClient from the YarnClusterClient to the 
AbstractYarnClusterDescriptor.
This makes the latter responsible for the lifecycle management of the 
client and gives
a better separation of concerns.




> Move YarnClient out of YarnClusterClient
> 
>
> Key: FLINK-8329
> URL: https://issues.apache.org/jira/browse/FLINK-8329
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> Move the {{YarnClient}} from the {{YarnClusterClient}} to the 
> {{AbstractYarnClusterDescriptor}} which will be responsible for the lifecycle 
> management of the {{YarnClient}}. This change is a clean up task which will 
> better structure the client code.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5216: [FLINK-8329] [flip6] Move YarnClient to AbstractYa...

2017-12-29 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5216

[FLINK-8329] [flip6] Move YarnClient to AbstractYarnClusterDescriptor

## What is the purpose of the change

Moves the YarnClient from the YarnClusterClient to the 
AbstractYarnClusterDescriptor.
This makes the latter responsible for the lifecycle management of the 
client and gives
a better separation of concerns.

This PR is based on #5215 

## Brief change log

- Move the `YarnClient` from the `YarnClusterClient` to the 
`AbstractYarnClusterDescriptor`

## Verifying this change

This change is already covered by existing tests

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
moveYarnClientToYarnDescriptor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5216.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5216


commit 69ef9787f1278f40bfeaa685379648297e1cb6f0
Author: Till Rohrmann 
Date:   2017-12-07T12:57:24Z

[FLINK-8328] [flip6] Move Yarn ApplicationStatus polling out of 
YarnClusterClient

Introduce YarnApplicationStatusMonitor which does the Yarn 
ApplicationStatus polling in
the FlinkYarnSessionCli. This decouples the YarnClusterClient from the 
actual communication
with Yarn and, thus, gives a better separation of concerns.

commit fa3b31feba7f2dd5a2fb07cc35d56c8f1a63f75b
Author: Till Rohrmann 
Date:   2017-12-20T15:43:21Z

[FLINK-8329] [flip6] Move YarnClient to AbstractYarnClusterDescriptor

Moves the YarnClient from the YarnClusterClient to the 
AbstractYarnClusterDescriptor.
This makes the latter responsible for the lifecycle management of the 
client and gives
a better separation of concerns.




---


[jira] [Created] (FLINK-8329) Move YarnClient out of YarnClusterClient

2017-12-29 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8329:


 Summary: Move YarnClient out of YarnClusterClient
 Key: FLINK-8329
 URL: https://issues.apache.org/jira/browse/FLINK-8329
 Project: Flink
  Issue Type: Improvement
  Components: Client
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


Move the {{YarnClient}} from the {{YarnClusterClient}} to the 
{{AbstractYarnClusterDescriptor}} which will be responsible for the lifecycle 
management of the {{YarnClient}}. This change is a clean up task which will 
better structure the client code.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-8328) Pull Yarn ApplicationStatus polling out of YarnClusterClient

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306391#comment-16306391
 ] 

ASF GitHub Bot commented on FLINK-8328:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5215

[FLINK-8328] [flip6] Move Yarn ApplicationStatus polling out of 
YarnClusterClient

## What is the purpose of the change

Introduce YarnApplicationStatusMonitor which does the Yarn 
ApplicationStatus polling in
the FlinkYarnSessionCli. This decouples the YarnClusterClient from the 
actual communication
with Yarn and, thus, gives a better separation of concerns.

## Brief change log

- Replace the `PollingThread` with the `YarnApplicationStatusMonitor`
- Decouple `YarnClusterClient` from Yarn `ApplicationStatus` polling

## Verifying this change

- Changes covered by existing tests

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink removeSpecialClusterClients

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5215


commit 69ef9787f1278f40bfeaa685379648297e1cb6f0
Author: Till Rohrmann 
Date:   2017-12-07T12:57:24Z

[FLINK-8328] [flip6] Move Yarn ApplicationStatus polling out of 
YarnClusterClient

Introduce YarnApplicationStatusMonitor which does the Yarn 
ApplicationStatus polling in
the FlinkYarnSessionCli. This decouples the YarnClusterClient from the 
actual communication
with Yarn and, thus, gives a better separation of concerns.




> Pull Yarn ApplicationStatus polling out of YarnClusterClient
> 
>
> Key: FLINK-8328
> URL: https://issues.apache.org/jira/browse/FLINK-8328
> Project: Flink
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> In order to make the {{FlinkYarnSessionCli}} work with Flip-6, we have to 
> pull the Yarn {{ApplicationStatus}} polling out of the {{YarnClusterClient}}. 
> I propose to introduce a dedicated {{YarnApplicationStatusMonitor}}. This has 
> also the benefit of separating concerns better.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5215: [FLINK-8328] [flip6] Move Yarn ApplicationStatus p...

2017-12-29 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5215

[FLINK-8328] [flip6] Move Yarn ApplicationStatus polling out of 
YarnClusterClient

## What is the purpose of the change

Introduce YarnApplicationStatusMonitor which does the Yarn 
ApplicationStatus polling in
the FlinkYarnSessionCli. This decouples the YarnClusterClient from the 
actual communication
with Yarn and, thus, gives a better separation of concerns.

## Brief change log

- Replace the `PollingThread` with the `YarnApplicationStatusMonitor`
- Decouple `YarnClusterClient` from Yarn `ApplicationStatus` polling

## Verifying this change

- Changes covered by existing tests

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink removeSpecialClusterClients

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5215


commit 69ef9787f1278f40bfeaa685379648297e1cb6f0
Author: Till Rohrmann 
Date:   2017-12-07T12:57:24Z

[FLINK-8328] [flip6] Move Yarn ApplicationStatus polling out of 
YarnClusterClient

Introduce YarnApplicationStatusMonitor which does the Yarn 
ApplicationStatus polling in
the FlinkYarnSessionCli. This decouples the YarnClusterClient from the 
actual communication
with Yarn and, thus, gives a better separation of concerns.




---


[jira] [Created] (FLINK-8328) Pull Yarn ApplicationStatus polling out of YarnClusterClient

2017-12-29 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8328:


 Summary: Pull Yarn ApplicationStatus polling out of 
YarnClusterClient
 Key: FLINK-8328
 URL: https://issues.apache.org/jira/browse/FLINK-8328
 Project: Flink
  Issue Type: Improvement
  Components: Client
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


In order to make the {{FlinkYarnSessionCli}} work with Flip-6, we have to pull 
the Yarn {{ApplicationStatus}} polling out of the {{YarnClusterClient}}. I 
propose to introduce a dedicated {{YarnApplicationStatusMonitor}}. This has 
also the benefit of separating concerns better.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7797) Add support for windowed outer joins for streaming tables

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306321#comment-16306321
 ] 

ASF GitHub Bot commented on FLINK-7797:
---

Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/5140#discussion_r159069304
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/TimeBoundedStreamJoin.scala
 ---
@@ -182,16 +196,64 @@ abstract class TimeBoundedStreamInnerJoin(
 if (rightTime >= rightQualifiedLowerBound && rightTime <= 
rightQualifiedUpperBound) {
   val rightRows = rightEntry.getValue
   var i = 0
+  var markEmitted = false
   while (i < rightRows.size) {
-joinFunction.join(leftRow, rightRows.get(i), cRowWrapper)
+joinCollector.resetThisTurn()
+val tuple = rightRows.get(i)
+joinFunction.join(leftRow, tuple.f0, joinCollector)
+if (joinType == JoinType.RIGHT_OUTER || joinType == 
JoinType.FULL_OUTER) {
+  if (!tuple.f1 && joinCollector.everEmittedThisTurn) {
+// Mark the right row as being successfully joined and 
emitted.
+tuple.f1 = true
+markEmitted = true
+  }
+}
 i += 1
   }
+  if (markEmitted) {
+// Write back the edited entry (mark emitted) for the right 
cache.
+rightEntry.setValue(rightRows)
+  }
 }
 
 if (rightTime <= rightExpirationTime) {
+  if (joinType == JoinType.LEFT_OUTER || joinType == 
JoinType.FULL_OUTER) {
--- End diff --

Yes, I should have added a harness test for that.


> Add support for windowed outer joins for streaming tables
> -
>
> Key: FLINK-7797
> URL: https://issues.apache.org/jira/browse/FLINK-7797
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>
> Currently, only windowed inner joins for streaming tables are supported.
> This issue is about adding support for windowed LEFT, RIGHT, and FULL OUTER 
> joins.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5140: [FLINK-7797] [table] Add support for windowed oute...

2017-12-29 Thread xccui
Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/5140#discussion_r159069304
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/TimeBoundedStreamJoin.scala
 ---
@@ -182,16 +196,64 @@ abstract class TimeBoundedStreamInnerJoin(
 if (rightTime >= rightQualifiedLowerBound && rightTime <= 
rightQualifiedUpperBound) {
   val rightRows = rightEntry.getValue
   var i = 0
+  var markEmitted = false
   while (i < rightRows.size) {
-joinFunction.join(leftRow, rightRows.get(i), cRowWrapper)
+joinCollector.resetThisTurn()
+val tuple = rightRows.get(i)
+joinFunction.join(leftRow, tuple.f0, joinCollector)
+if (joinType == JoinType.RIGHT_OUTER || joinType == 
JoinType.FULL_OUTER) {
+  if (!tuple.f1 && joinCollector.everEmittedThisTurn) {
+// Mark the right row as being successfully joined and 
emitted.
+tuple.f1 = true
+markEmitted = true
+  }
+}
 i += 1
   }
+  if (markEmitted) {
+// Write back the edited entry (mark emitted) for the right 
cache.
+rightEntry.setValue(rightRows)
+  }
 }
 
 if (rightTime <= rightExpirationTime) {
+  if (joinType == JoinType.LEFT_OUTER || joinType == 
JoinType.FULL_OUTER) {
--- End diff --

Yes, I should have added a harness test for that.


---


[jira] [Commented] (FLINK-7797) Add support for windowed outer joins for streaming tables

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306318#comment-16306318
 ] 

ASF GitHub Bot commented on FLINK-7797:
---

Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/5140#discussion_r159069074
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinAwareCollector.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+
+/**
+  * Collector to track whether there's a joined result.
+  */
+class JoinAwareCollector extends Collector[Row]{
+
+  private var emitted = false
+  private var emittedThisTurn = false
+  private var innerCollector: Collector[CRow] = _
+  private val cRow: CRow = new CRow()
--- End diff --

I'll add a function to set this value in the Collector.


> Add support for windowed outer joins for streaming tables
> -
>
> Key: FLINK-7797
> URL: https://issues.apache.org/jira/browse/FLINK-7797
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>
> Currently, only windowed inner joins for streaming tables are supported.
> This issue is about adding support for windowed LEFT, RIGHT, and FULL OUTER 
> joins.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5140: [FLINK-7797] [table] Add support for windowed oute...

2017-12-29 Thread xccui
Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/5140#discussion_r159069074
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinAwareCollector.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+
+/**
+  * Collector to track whether there's a joined result.
+  */
+class JoinAwareCollector extends Collector[Row]{
+
+  private var emitted = false
+  private var emittedThisTurn = false
+  private var innerCollector: Collector[CRow] = _
+  private val cRow: CRow = new CRow()
--- End diff --

I'll add a function to set this value in the Collector.


---


[jira] [Commented] (FLINK-8227) Optimize the performance of SharedBufferSerializer

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306257#comment-16306257
 ] 

ASF GitHub Bot commented on FLINK-8227:
---

Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/5142
  
Merged it. Thanks for the contribution!


> Optimize the performance of SharedBufferSerializer
> --
>
> Key: FLINK-8227
> URL: https://issues.apache.org/jira/browse/FLINK-8227
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> Currently {{SharedBufferSerializer.serialize()}} will create a HashMap and 
> put all the {{SharedBufferEntry}} into it. Usually this is not a problem. But 
> we obverse that in some cases the calculation of hashCode may become the 
> bottleneck. The performance will decrease as the number of 
> {{SharedBufferEdge}} increases. For looping pattern {{A*}}, if the number of 
> {{SharedBufferEntry}} is {{N}}, the number of {{SharedBufferEdge}} is about 
> {{N * N}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-8227) Optimize the performance of SharedBufferSerializer

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306256#comment-16306256
 ] 

ASF GitHub Bot commented on FLINK-8227:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5142


> Optimize the performance of SharedBufferSerializer
> --
>
> Key: FLINK-8227
> URL: https://issues.apache.org/jira/browse/FLINK-8227
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> Currently {{SharedBufferSerializer.serialize()}} will create a HashMap and 
> put all the {{SharedBufferEntry}} into it. Usually this is not a problem. But 
> we obverse that in some cases the calculation of hashCode may become the 
> bottleneck. The performance will decrease as the number of 
> {{SharedBufferEdge}} increases. For looping pattern {{A*}}, if the number of 
> {{SharedBufferEntry}} is {{N}}, the number of {{SharedBufferEdge}} is about 
> {{N * N}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #5142: [FLINK-8227] Optimize the performance of SharedBufferSeri...

2017-12-29 Thread dawidwys
Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/5142
  
Merged it. Thanks for the contribution!


---


[GitHub] flink pull request #5142: [FLINK-8227] Optimize the performance of SharedBuf...

2017-12-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5142


---


[jira] [Commented] (FLINK-8226) Dangling reference generated after NFA clean up timed out SharedBufferEntry

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306254#comment-16306254
 ] 

ASF GitHub Bot commented on FLINK-8226:
---

Github user dawidwys commented on a diff in the pull request:

https://github.com/apache/flink/pull/5141#discussion_r159057128
  
--- Diff: 
flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/SharedBuffer.java
 ---
@@ -191,22 +191,28 @@ public boolean isEmpty() {
 */
public boolean prune(long pruningTimestamp) {
Iterator>> iter = 
pages.entrySet().iterator();
-   boolean pruned = false;
+   List> prunedEntries = new ArrayList<>();
 
while (iter.hasNext()) {
SharedBufferPage page = iter.next().getValue();
 
-   if (page.prune(pruningTimestamp)) {
-   pruned = true;
-   }
+   page.prune(pruningTimestamp, prunedEntries);
 
if (page.isEmpty()) {
// delete page if it is empty
iter.remove();
}
}
 
-   return pruned;
+   if (!prunedEntries.isEmpty()) {
+   for (Map.Entry> entry : 
pages.entrySet()) {
+   entry.getValue().removeEdges(prunedEntries);
+   }
+   prunedEntries.clear();
--- End diff --

As prunedEntries is local now, you don't need to clear it any longer.


> Dangling reference generated after NFA clean up timed out SharedBufferEntry
> ---
>
> Key: FLINK-8226
> URL: https://issues.apache.org/jira/browse/FLINK-8226
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Reporter: Dian Fu
>Assignee: Dian Fu
> Fix For: 1.5.0, 1.4.1
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5141: [FLINK-8226] [cep] Dangling reference generated af...

2017-12-29 Thread dawidwys
Github user dawidwys commented on a diff in the pull request:

https://github.com/apache/flink/pull/5141#discussion_r159057128
  
--- Diff: 
flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/SharedBuffer.java
 ---
@@ -191,22 +191,28 @@ public boolean isEmpty() {
 */
public boolean prune(long pruningTimestamp) {
Iterator>> iter = 
pages.entrySet().iterator();
-   boolean pruned = false;
+   List> prunedEntries = new ArrayList<>();
 
while (iter.hasNext()) {
SharedBufferPage page = iter.next().getValue();
 
-   if (page.prune(pruningTimestamp)) {
-   pruned = true;
-   }
+   page.prune(pruningTimestamp, prunedEntries);
 
if (page.isEmpty()) {
// delete page if it is empty
iter.remove();
}
}
 
-   return pruned;
+   if (!prunedEntries.isEmpty()) {
+   for (Map.Entry> entry : 
pages.entrySet()) {
+   entry.getValue().removeEdges(prunedEntries);
+   }
+   prunedEntries.clear();
--- End diff --

As prunedEntries is local now, you don't need to clear it any longer.


---


[jira] [Closed] (FLINK-8312) Fix ScalarFunction varargs length exceeds 254 for SQL

2017-12-29 Thread sunjincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng closed FLINK-8312.
--
Resolution: Fixed

Fixed in  91f00ec91b55b28a59114e88127af2f85f279eb5

> Fix ScalarFunction varargs length exceeds 254 for SQL
> -
>
> Key: FLINK-8312
> URL: https://issues.apache.org/jira/browse/FLINK-8312
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>
> With Varargs, TableAPI can handle scalar function call with parameters 
> exceeds 254 correctly.
> This issue is intend to support long parameters for SQL



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-8312) Fix ScalarFunction varargs length exceeds 254 for SQL

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306220#comment-16306220
 ] 

ASF GitHub Bot commented on FLINK-8312:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5206


> Fix ScalarFunction varargs length exceeds 254 for SQL
> -
>
> Key: FLINK-8312
> URL: https://issues.apache.org/jira/browse/FLINK-8312
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>
> With Varargs, TableAPI can handle scalar function call with parameters 
> exceeds 254 correctly.
> This issue is intend to support long parameters for SQL



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #5206: [FLINK-8312][TableAPI && SQL] Fix ScalarFunction v...

2017-12-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5206


---