[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * c37ae4075bc8c158a658d1e861467a319e9e6c6b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13304)
 
   * cf343b402d811409c9fca8db5a3e3982207a95f2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13307)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14743:
URL: https://github.com/apache/flink/pull/14743#issuecomment-766412125


   
   ## CI report:
   
   * f3d0ce7680adcb60b6f31cdc169447d68d2b80f2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13271)
 
   * ddfdd750a01fd31b928217912ecbb4e901304b75 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13306)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14935: [FLINK-21318][sql-client] Disable hihive catalog tests in ExecutionCo…

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14935:
URL: https://github.com/apache/flink/pull/14935#issuecomment-778544524


   
   ## CI report:
   
   * 32736efe8ae86679478627ab5f3c51c279aec1f9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13303)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14743:
URL: https://github.com/apache/flink/pull/14743#issuecomment-766412125


   
   ## CI report:
   
   * f3d0ce7680adcb60b6f31cdc169447d68d2b80f2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13271)
 
   * ddfdd750a01fd31b928217912ecbb4e901304b75 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] sv3ndk commented on pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sv3ndk commented on pull request #14743:
URL: https://github.com/apache/flink/pull/14743#issuecomment-778573015


   @sjwiesman by the way, `hugo` is a massive improvement over the previous 
method. If that's your doing, thanks and congrats!







[GitHub] [flink] sv3ndk commented on pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sv3ndk commented on pull request #14743:
URL: https://github.com/apache/flink/pull/14743#issuecomment-778572805


   There you go.
   
   By the way, I noticed that the formatting of the Maven dependencies has reverted to the format of a few weeks ago; is this intentional?
   
   In the latest release of the doc, the "maven dependency" column contains an 
easily copy-able snippet of XML:
   
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/debezium.html
   
   Whereas the same thing in the current master branch shows the artifact id 
only, as it did in version 1.11 of the doc: 
   
https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/formats/debezium/
   
   I'm the author of those recent XML snippets. If they've been removed intentionally, that's fine by me, although maybe that is a regression?
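
   For context, the copy-able form being discussed looks roughly like the following (illustrative only; the exact groupId, artifactId, and version for a given format may differ):

```xml
<!-- Illustrative Maven dependency snippet of the copy-able kind discussed
     above; the coordinates shown are examples, not authoritative. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-json</artifactId>
  <version>1.12.1</version>
</dependency>
```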







[GitHub] [flink] flinkbot edited a comment on pull request #14936: [FLINK-20580][core] Don't support nullable value for SerializedValue

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14936:
URL: https://github.com/apache/flink/pull/14936#issuecomment-778569873


   
   ## CI report:
   
   * 1a04c8736dde22b11869da230f1d7a868ec0a9a8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13305)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778421913


   
   ## CI report:
   
   * 7a43bcdc8a64ea8d090ce61dadeba262ff05fb67 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13302)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Updated] (FLINK-19910) Table API & SQL Data Types Document error

2021-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19910:
---
Labels: pull-request-available  (was: )

> Table API & SQL Data Types Document error
> -
>
> Key: FLINK-19910
> URL: https://issues.apache.org/jira/browse/FLINK-19910
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.11.1, 1.12.0
>Reporter: ZiHaoDeng
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2020-11-01-16-52-07-420.png, 
> image-2020-11-01-16-54-30-989.png
>
>
> The source code:
> !image-2020-11-01-16-52-07-420.png!
> but the document is wrong:
> !image-2020-11-01-16-54-30-989.png!
> url:[https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/types.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zihaodeng closed pull request #13868: [FLINK-19910][Documentation] Table API & SQL Data Types Document error

2021-02-12 Thread GitBox


zihaodeng closed pull request #13868:
URL: https://github.com/apache/flink/pull/13868


   







[GitHub] [flink] jjiey edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-12 Thread GitBox


jjiey edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-778570539


   Hi @sjwiesman 
   
   Just finished. Thanks for your suggestion.







[GitHub] [flink] jjiey commented on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-12 Thread GitBox


jjiey commented on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-778570539


   Hi @sjwiesman 
   
   It's done already. Thanks for your suggestion.







[GitHub] [flink] flinkbot commented on pull request #14936: [FLINK-20580][core] Don't support nullable value for SerializedValue

2021-02-12 Thread GitBox


flinkbot commented on pull request #14936:
URL: https://github.com/apache/flink/pull/14936#issuecomment-778569873


   
   ## CI report:
   
   * 1a04c8736dde22b11869da230f1d7a868ec0a9a8 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * 3fc334c1fe49196ce9c9be922b62296b7b1e0f09 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13087)
 
   * c37ae4075bc8c158a658d1e861467a319e9e6c6b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13304)
 
   * cf343b402d811409c9fca8db5a3e3982207a95f2 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * 3fc334c1fe49196ce9c9be922b62296b7b1e0f09 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13087)
 
   * c37ae4075bc8c158a658d1e861467a319e9e6c6b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14936: [FLINK-20580][core] Don't support nullable value for SerializedValue

2021-02-12 Thread GitBox


flinkbot commented on pull request #14936:
URL: https://github.com/apache/flink/pull/14936#issuecomment-778567892


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 3687cf97d34e5a13014c6b8c8a160bd372968e46 (Sat Feb 13 
05:43:59 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] kezhuw opened a new pull request #14936: [FLINK-20580][core] Don't support nullable value for SerializedValue

2021-02-12 Thread GitBox


kezhuw opened a new pull request #14936:
URL: https://github.com/apache/flink/pull/14936


   ## What is the purpose of the change
   Drop nullable-value support from `SerializedValue` to simplify its usages. Callers that need an absent value should resort to a nullable or `Optional` variable of type `SerializedValue`; they can no longer treat `SerializedValue` itself as an optional container.
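
   The caller-side pattern described above can be sketched as follows (a minimal sketch with hypothetical names; this is not Flink's actual `SerializedValue` implementation):

```java
import java.io.*;
import java.util.Optional;

// Minimal sketch of the pattern the PR describes: the serialized container
// rejects null outright, and callers that need an absent value model it
// outside the container, e.g. as Optional<SerializedValueSketch<T>> instead
// of a SerializedValueSketch holding null.
final class SerializedValueSketch<T> implements Serializable {
    private final byte[] bytes;

    SerializedValueSketch(T value) {
        if (value == null) {
            // Nullable payloads are no longer supported inside the container.
            throw new NullPointerException("value must not be null");
        }
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(value);
            }
            this.bytes = bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @SuppressWarnings("unchecked")
    T deserialize() {
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Caller-side replacement for the old null-tolerant behavior: absence
    // is expressed by Optional.empty(), not by a container holding null.
    static <T> Optional<SerializedValueSketch<T>> ofNullable(T maybeValue) {
        return maybeValue == null
                ? Optional.empty()
                : Optional.of(new SerializedValueSketch<>(maybeValue));
    }
}
```

   Under this scheme an "absent" serialized value is `Optional.empty()` at the call site, so the container class itself never has to reason about null payloads.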
   
   ## Brief change log
   
 - Migrate usage of `SerializedValue` in the rpc module to its own wire value class. This makes the module self-contained and insulates it from other usages of `SerializedValue`.
 - Dropping nullable-value support from `SerializedValue`.
   
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as:
   * `SerializedValueTest.testSimpleValue`.
   
   This change added tests and can be verified as follows:
   - `SerializedValueTest.testNullValue`, 
`SerializedValueTest.testFromNullBytes`, 
`SerializedValueTest.testFromEmptyBytes`
   - `AkkaRpcSerializedValueTest`
   - `AkkaRpcActorTest.canRespondWithSerializedValueLocally`, 
`RemoteAkkaRpcActorTest.canRespondWithSerializedValueRemotely`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (don't know)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   
   ## WARNING
   **This change would break TaskManager/JobManager rolling updates if we ever support them.** I found nothing in the docs about such support and assume there is no such guarantee.
   







[GitHub] [flink] flinkbot edited a comment on pull request #13912: [FLINK-19466][FLINK-19467][runtime / state backends] Add Flip-142 public interfaces and methods

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #13912:
URL: https://github.com/apache/flink/pull/13912#issuecomment-721398037


   
   ## CI report:
   
   * c638fb8340fb07c493e236ee7ddecb1e89789c8d Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13301)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14934: Upgrade the os-maven-plugin depency, to version 1.7.0

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14934:
URL: https://github.com/apache/flink/pull/14934#issuecomment-778470257


   
   ## CI report:
   
   * e4195e86d0396dd605118d61c2e98a0743614e12 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13299)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14935: [FLINK-21318][sql-client] Disable hihive catalog tests in ExecutionCo…

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14935:
URL: https://github.com/apache/flink/pull/14935#issuecomment-778544524


   
   ## CI report:
   
   * 32736efe8ae86679478627ab5f3c51c279aec1f9 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13303)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778421913


   
   ## CI report:
   
   * a0ff0d11299e7db7a8a4e450d6bd852277f7df7e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13300)
 
   * 7a43bcdc8a64ea8d090ce61dadeba262ff05fb67 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13302)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14935: [FLINK-21318][sql-client] Disable hihive catalog tests in ExecutionCo…

2021-02-12 Thread GitBox


flinkbot commented on pull request #14935:
URL: https://github.com/apache/flink/pull/14935#issuecomment-778544524


   
   ## CI report:
   
   * 32736efe8ae86679478627ab5f3c51c279aec1f9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14935: [FLINK-21318][sql-client] Disable hihive catalog tests in ExecutionCo…

2021-02-12 Thread GitBox


flinkbot commented on pull request #14935:
URL: https://github.com/apache/flink/pull/14935#issuecomment-778542018


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 32736efe8ae86679478627ab5f3c51c279aec1f9 (Sat Feb 13 
01:44:52 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-21318).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] flinkbot edited a comment on pull request #14933: [FLINK-21370][Connectors-JDBC] flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not cons

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14933:
URL: https://github.com/apache/flink/pull/14933#issuecomment-778445164


   
   ## CI report:
   
   * e5e2d0e5f0b191127c801641c4e26b2286997a11 UNKNOWN
   * 00f354286c00bdbc713eab5fd14e063c5f620027 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13295)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Updated] (FLINK-21318) ExecutionContextTest.testCatalogs fail

2021-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21318:
---
Labels: pull-request-available test-stability  (was: test-stability)

> ExecutionContextTest.testCatalogs fail
> --
>
> Key: FLINK-21318
> URL: https://issues.apache.org/jira/browse/FLINK-21318
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13064=logs=51fed01c-4eb0-5511-d479-ed5e8b9a7820=e5682198-9e22-5770-69f6-7551182edea8
> {code:java}
> 2021-02-07T22:25:13.3526341Z [ERROR] 
> testDatabases(org.apache.flink.table.client.gateway.local.ExecutionContextTest)
>   Time elapsed: 0.044 s  <<< ERROR!
> 2021-02-07T22:25:13.3526885Z 
> org.apache.flink.table.client.gateway.SqlExecutionException: Could not create 
> execution context.
> 2021-02-07T22:25:13.3527484Z  at 
> org.apache.flink.table.client.gateway.local.ExecutionContext$Builder.build(ExecutionContext.java:972)
> 2021-02-07T22:25:13.3528070Z  at 
> org.apache.flink.table.client.gateway.local.ExecutionContextTest.createExecutionContext(ExecutionContextTest.java:398)
> 2021-02-07T22:25:13.3528676Z  at 
> org.apache.flink.table.client.gateway.local.ExecutionContextTest.createCatalogExecutionContext(ExecutionContextTest.java:434)
> 2021-02-07T22:25:13.3529280Z  at 
> org.apache.flink.table.client.gateway.local.ExecutionContextTest.testDatabases(ExecutionContextTest.java:230)
> 2021-02-07T22:25:13.3529775Z  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-02-07T22:25:13.3530232Z  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-02-07T22:25:13.3530773Z  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-02-07T22:25:13.3531246Z  at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 2021-02-07T22:25:13.3531641Z  at 
> org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
> 2021-02-07T22:25:13.3532223Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:326)
> 2021-02-07T22:25:13.3532909Z  at 
> org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:89)
> 2021-02-07T22:25:13.3533369Z  at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:97)
> 2021-02-07T22:25:13.3534119Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:310)
> 2021-02-07T22:25:13.3534995Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:131)
> 2021-02-07T22:25:13.3535784Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.access$100(PowerMockJUnit47RunnerDelegateImpl.java:59)
> 2021-02-07T22:25:13.3536590Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner$TestExecutorStatement.evaluate(PowerMockJUnit47RunnerDelegateImpl.java:147)
> 2021-02-07T22:25:13.3537395Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.evaluateStatement(PowerMockJUnit47RunnerDelegateImpl.java:107)
> 2021-02-07T22:25:13.3538183Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
> 2021-02-07T22:25:13.3538998Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:298)
> 2021-02-07T22:25:13.3539633Z  at 
> org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:87)
> 2021-02-07T22:25:13.3540039Z  at 
> org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:50)
> 2021-02-07T22:25:13.3540592Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:218)
> 2021-02-07T22:25:13.3541269Z  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:160)
> 2021-02-07T22:25:13.3541969Z  at 
> 

[GitHub] [flink] lirui-apache opened a new pull request #14935: [FLINK-21318][sql-client] Disable hihive catalog tests in ExecutionCo…

2021-02-12 Thread GitBox


lirui-apache opened a new pull request #14935:
URL: https://github.com/apache/flink/pull/14935


   …ntextTest for jdk 11
   
   
   
   ## What is the purpose of the change
   
   Fix test failures `testCatalogs` and `testDatabases` in 
`ExecutionContextTest`.
   
   
   ## Brief change log
   
 - Add `@Category(FailsOnJava11.class)` annotation for these tests
   
   
   ## Verifying this change
   
   NA
   
   ## Does this pull request potentially affect one of the following parts:
   
   NA
   
   ## Documentation
   
   NA
   







[GitHub] [flink] flinkbot edited a comment on pull request #13912: [FLINK-19466][FLINK-19467][runtime / state backends] Add Flip-142 public interfaces and methods

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #13912:
URL: https://github.com/apache/flink/pull/13912#issuecomment-721398037


   
   ## CI report:
   
   * 7ac6d8a1b4a0bbe2fd222f4403cbc967b7f3f7ef Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13291)
 
   * c638fb8340fb07c493e236ee7ddecb1e89789c8d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13301)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778421913


   
   ## CI report:
   
   * 95e05ce40b2f091a0b3c5d1e80fea7182363f913 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13293)
 
   * a0ff0d11299e7db7a8a4e450d6bd852277f7df7e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13300)
 
   * 7a43bcdc8a64ea8d090ce61dadeba262ff05fb67 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13912: [FLINK-19466][FLINK-19467][runtime / state backends] Add Flip-142 public interfaces and methods

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #13912:
URL: https://github.com/apache/flink/pull/13912#issuecomment-721398037


   
   ## CI report:
   
   * 7ac6d8a1b4a0bbe2fd222f4403cbc967b7f3f7ef Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13291)
 
   * c638fb8340fb07c493e236ee7ddecb1e89789c8d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Comment Edited] (FLINK-21364) piggyback finishedSplitIds in RequestSplitEvent

2021-02-12 Thread Steven Zhen Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284024#comment-17284024
 ] 

Steven Zhen Wu edited comment on FLINK-21364 at 2/12/21, 10:57 PM:
---

For a pull-based source (like file/Iceberg), it is probably more 
natural/efficient to piggyback the `finishedSplitIds` in the 
`RequestSplitEvent`. A reader requests a new split when the current split is 
done.

It doesn't mean that a reader has to request a new split when finishing 
some splits, as in the bounded Kafka source case.

You have a good point that some sources (like Kafka/Kinesis) may still need to 
communicate the watermark info to the coordinator/enumerator. In this case, it 
definitely will be a separate type of event (like `WatermarkUpdateEvent`). 

In our Iceberg source use cases, readers don't actually report watermarks. They 
just need to report which splits are finished. All the ordering/watermark 
tracking is centralized in the Iceberg source coordinator. But I can see that this 
may not be a very generic scenario to change the `RequestSplitEvent` in 
flink-runtime.

cc [~sundaram]


was (Author: stevenz3wu):
For a pull-based source (like file/Iceberg), it is probably more 
natural/efficient to piggyback the `finishedSplitIds` in the 
`RequestSplitEvent`. A reader requests a new split when the current split is 
done.

It doesn't mean that a reader has to request a new split when finishing 
some splits, as in the bounded Kafka source case.

You have a good point that some sources (like Kafka/Kinesis) may still need to 
communicate the watermark info to the coordinator/enumerator. In this case, it 
definitely will be a separate type of event (like `WatermarkUpdateEvent`). 

In our Iceberg source use cases, readers don't actually report watermarks. They 
just need to report which splits are finished. But I can see that this may not 
be a very generic scenario to change the `RequestSplitEvent` in flink-runtime.

cc [~sundaram]

> piggyback finishedSplitIds in RequestSplitEvent
> ---
>
> Key: FLINK-21364
> URL: https://issues.apache.org/jira/browse/FLINK-21364
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Affects Versions: 1.12.1
>Reporter: Steven Zhen Wu
>Priority: Major
>  Labels: pull-request-available
>
> For some split assignment strategies, the enumerator/assigner needs to track 
> the completed splits to advance the watermark for event time alignment or rough 
> ordering. Right now, `RequestSplitEvent` for FLIP-27 sources doesn't support 
> passing along the `finishedSplitIds` info and hence we have to create our own 
> custom source event type for the Iceberg source. 
> Here is the proposal to add such optional info to `RequestSplitEvent`:
> {code}
> public RequestSplitEvent(
>     @Nullable String hostName, 
>     @Nullable Collection<String> finishedSplitIds)
> {code}
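The constructor proposed above can be modeled as a plain value class. The following is a minimal, framework-free Java sketch (it does not implement Flink's actual `SourceEvent` interface; the class here is only illustrative) showing how an optional `finishedSplitIds` collection could ride along with a split request, defaulting to empty when absent:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Minimal sketch of the proposed event. In Flink this would implement the
// runtime's source-event interface, which is omitted here for self-containment.
public class RequestSplitEvent {
    private final String hostName;                     // may be null
    private final Collection<String> finishedSplitIds; // normalized to empty if null

    public RequestSplitEvent(String hostName, Collection<String> finishedSplitIds) {
        this.hostName = hostName;
        this.finishedSplitIds =
                finishedSplitIds == null ? Collections.emptyList() : finishedSplitIds;
    }

    public String hostName() { return hostName; }

    public Collection<String> finishedSplitIds() { return finishedSplitIds; }

    public static void main(String[] args) {
        // A reader reports the splits it just completed while asking for more work.
        RequestSplitEvent event =
                new RequestSplitEvent("worker-1", List.of("split-3", "split-7"));
        assert event.finishedSplitIds().size() == 2;

        // An old-style request carrying no finished-split info still works.
        RequestSplitEvent legacy = new RequestSplitEvent(null, null);
        assert legacy.finishedSplitIds().isEmpty();
    }
}
```

Treating a null collection as empty keeps the extra field backwards compatible: enumerators that ignore it behave exactly as before.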



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778421913


   
   ## CI report:
   
   * 95e05ce40b2f091a0b3c5d1e80fea7182363f913 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13293)
 
   * a0ff0d11299e7db7a8a4e450d6bd852277f7df7e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-21364) piggyback finishedSplitIds in RequestSplitEvent

2021-02-12 Thread Steven Zhen Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284024#comment-17284024
 ] 

Steven Zhen Wu commented on FLINK-21364:


For a pull-based source (like file/Iceberg), it is probably more 
natural/efficient to piggyback the `finishedSplitIds` in the 
`RequestSplitEvent`. A reader requests a new split when the current split is 
done.

It doesn't mean that a reader has to request a new split when finishing 
some splits, as in the bounded Kafka source case.

You have a good point that some sources (like Kafka/Kinesis) may still need to 
communicate the watermark info to the coordinator/enumerator. In this case, it 
definitely will be a separate type of event (like `WatermarkUpdateEvent`). 

In our Iceberg source use cases, readers don't actually report watermarks. They 
just need to report which splits are finished. But I can see that this may not 
be a very generic scenario to change the `RequestSplitEvent` in flink-runtime.

cc [~sundaram]

> piggyback finishedSplitIds in RequestSplitEvent
> ---
>
> Key: FLINK-21364
> URL: https://issues.apache.org/jira/browse/FLINK-21364
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Affects Versions: 1.12.1
>Reporter: Steven Zhen Wu
>Priority: Major
>  Labels: pull-request-available
>
> For some split assignment strategies, the enumerator/assigner needs to track 
> the completed splits to advance the watermark for event time alignment or rough 
> ordering. Right now, `RequestSplitEvent` for FLIP-27 sources doesn't support 
> passing along the `finishedSplitIds` info and hence we have to create our own 
> custom source event type for the Iceberg source. 
> Here is the proposal to add such optional info to `RequestSplitEvent`:
> {code}
> public RequestSplitEvent(
>     @Nullable String hostName, 
>     @Nullable Collection<String> finishedSplitIds)
> {code}





[GitHub] [flink] flinkbot edited a comment on pull request #14934: Upgrade the os-maven-plugin dependency to version 1.7.0

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14934:
URL: https://github.com/apache/flink/pull/14934#issuecomment-778470257


   
   ## CI report:
   
   * e4195e86d0396dd605118d61c2e98a0743614e12 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13299)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14934: Upgrade the os-maven-plugin dependency to version 1.7.0

2021-02-12 Thread GitBox


flinkbot commented on pull request #14934:
URL: https://github.com/apache/flink/pull/14934#issuecomment-778470257


   
   ## CI report:
   
   * e4195e86d0396dd605118d61c2e98a0743614e12 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14933: [FLINK-21370][Connectors-JDBC] flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14933:
URL: https://github.com/apache/flink/pull/14933#issuecomment-778445164


   
   ## CI report:
   
   * e5e2d0e5f0b191127c801641c4e26b2286997a11 UNKNOWN
   * 00f354286c00bdbc713eab5fd14e063c5f620027 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13295)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-21359) CompatibilityResult issues with Flink 1.9.0

2021-02-12 Thread Siva (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283996#comment-17283996
 ] 

Siva commented on FLINK-21359:
--

We are using BytesDeserializerSchema, which implements 
KeyedDeserializationSchema.

> CompatibilityResult issues with Flink 1.9.0
> ---
>
> Key: FLINK-21359
> URL: https://issues.apache.org/jira/browse/FLINK-21359
> Project: Flink
>  Issue Type: Bug
> Environment: DEV
>Reporter: Siva
>Priority: Major
>
> I am using emr 5.28.0 and flink 1.9.0
>  
> Source code is working fine with emr 5.11.0 and flink 1.3.2, but the same 
> source code is throwing the following stack track with emr 5.28.0 and flink 
> 1.9.0
>  
> java.lang.NoClassDefFoundError: 
> org/apache/flink/api/common/typeutils/CompatibilityResult
>  at java.lang.Class.getDeclaredMethods0(Native Method)
>  at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>  at java.lang.Class.getDeclaredMethod(Class.java:2128)
>  at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1643)
>  at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:79)
>  at java.io.ObjectStreamClass$3.run(ObjectStreamClass.java:520)
>  at java.io.ObjectStreamClass$3.run(ObjectStreamClass.java:494)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:494)
>  at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:391)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
>  at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
>  at 
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>  at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>  at 
> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:586)
>  at 
> org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:515)
>  at 
> org.apache.flink.streaming.api.graph.StreamConfig.setTypeSerializer(StreamConfig.java:193)
>  at 
> org.apache.flink.streaming.api.graph.StreamConfig.setTypeSerializerIn1(StreamConfig.java:143)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.setVertexConfig(StreamingJobGraphGenerator.java:438)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:272)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:238)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:238)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createChain(StreamingJobGraphGenerator.java:243)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.setChaining(StreamingJobGraphGenerator.java:207)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:159)
>  at 
> org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:94)
>  at 
> org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:737)
>  at 
> org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:88)
>  at 
> org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:122)
>  at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:227)
>  at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
>  at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
>  at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>  at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.flink.api.common.typeutils.CompatibilityResult
>  at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:351)

[GitHub] [flink] sjwiesman commented on a change in pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sjwiesman commented on a change in pull request #14743:
URL: https://github.com/apache/flink/pull/14743#discussion_r575525435



##
File path: docs/content.zh/docs/connectors/table/kafka.md
##
@@ -499,12 +499,13 @@ If `timestamp` is specified, another config option 
`scan.startup.timestamp-milli
 If `specific-offsets` is specified, another config option 
`scan.startup.specific-offsets` is required to specify specific startup offsets 
for each partition,
 e.g. an option value `partition:0,offset:42;partition:1,offset:300` indicates 
offset `42` for partition `0` and offset `300` for partition `1`.
 
-### Changelog Source
+### CDC Changelog Source
+
Flink natively supports Kafka as a CDC changelog source. If the messages in a Kafka 
topic are change events captured from other databases using CDC tools, then you 
can use a CDC format to interpret the messages as INSERT/UPDATE/DELETE messages 
in the Flink SQL system.
+
+Flink provides three CDC formats: [debezium-json]({% link 
dev/table/connectors/formats/debezium.md %}), [canal-json]({% link 
dev/table/connectors/formats/canal.md %}) and [maxwell-json]({% link 
dev/table/connectors/formats/maxwell.md %}) to interpret change events captured 
by [Debezium](https://debezium.io/), 
[Canal](https://github.com/alibaba/canal/wiki) and 
[Maxwell](https://maxwells-daemon.io/).

Review comment:
   Sounds good to me!









[GitHub] [flink] tweise commented on pull request #14929: [FLINK-21364][connector] piggyback finishedSplitIds in RequestSplitEv…

2021-02-12 Thread GitBox


tweise commented on pull request #14929:
URL: https://github.com/apache/flink/pull/14929#issuecomment-778460956


   @stevenzwu see my comment on the JIRA. I guess there is no harm piggybacking 
finished splits when requesting new splits but there could be situations where 
that isn't sufficient? 







[GitHub] [flink] sv3ndk commented on pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sv3ndk commented on pull request #14743:
URL: https://github.com/apache/flink/pull/14743#issuecomment-778460353


   Hi @sjwiesman,
   Got it, thanks for the pointer. I had noticed the change of structure when 
resolving conflicts, though I'm only now learning that the docs are based on Hugo.
   I'll update the links to ref in my next commit.







[jira] [Commented] (FLINK-21364) piggyback finishedSplitIds in RequestSplitEvent

2021-02-12 Thread Thomas Weise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283981#comment-17283981
 ] 

Thomas Weise commented on FLINK-21364:
--

I wonder if finished splits should be communicated separately to the 
enumerator? Theoretically, splits could finish without the reader (yet) requesting 
new splits.

Regarding the event time alignment: For the file source the split boundary may 
provide sufficient granularity. But for sources like Kafka and Kinesis where 
readers work on their splits "forever", this won't be the case. There would 
need to be a different mechanism to synchronize. In the old Kinesis source that 
is accomplished by exchanging the actual watermark information. 
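The watermark exchange described above can be sketched as a small coordinator-side tracker. All names here (`WatermarkAlignmentTracker`, the notion of a `WatermarkUpdateEvent` feeding it) are hypothetical and not part of any Flink API; the sketch only illustrates computing a global watermark as the minimum across readers:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a coordinator-side tracker for event time alignment.
// Class and method names are hypothetical, not taken from Flink.
public class WatermarkAlignmentTracker {
    private final Map<Integer, Long> readerWatermarks = new HashMap<>();

    // Called when a reader reports progress (e.g. via some watermark-update event).
    // Watermarks never regress, hence merge with max.
    public void updateWatermark(int readerId, long watermark) {
        readerWatermarks.merge(readerId, watermark, Math::max);
    }

    // The global watermark is the minimum across all known readers:
    // it can only advance as fast as the slowest reader.
    public long globalWatermark() {
        return readerWatermarks.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        WatermarkAlignmentTracker tracker = new WatermarkAlignmentTracker();
        tracker.updateWatermark(0, 1_000L);
        tracker.updateWatermark(1, 400L);
        assert tracker.globalWatermark() == 400L; // held back by the slowest reader
        tracker.updateWatermark(1, 1_200L);
        assert tracker.globalWatermark() == 1_000L;
    }
}
```

For "forever"-running splits like Kafka/Kinesis shards, readers would keep calling `updateWatermark` periodically, which is exactly why split-completion events alone are not sufficient there.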

> piggyback finishedSplitIds in RequestSplitEvent
> ---
>
> Key: FLINK-21364
> URL: https://issues.apache.org/jira/browse/FLINK-21364
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Affects Versions: 1.12.1
>Reporter: Steven Zhen Wu
>Priority: Major
>  Labels: pull-request-available
>
> For some split assignment strategies, the enumerator/assigner needs to track 
> the completed splits to advance the watermark for event time alignment or rough 
> ordering. Right now, `RequestSplitEvent` for FLIP-27 sources doesn't support 
> passing along the `finishedSplitIds` info and hence we have to create our own 
> custom source event type for the Iceberg source. 
> Here is the proposal to add such optional info to `RequestSplitEvent`:
> {code}
> public RequestSplitEvent(
>     @Nullable String hostName, 
>     @Nullable Collection<String> finishedSplitIds)
> {code}





[GitHub] [flink] flinkbot commented on pull request #14934: Upgrade the os-maven-plugin dependency to version 1.7.0

2021-02-12 Thread GitBox


flinkbot commented on pull request #14934:
URL: https://github.com/apache/flink/pull/14934#issuecomment-778457304


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit e4195e86d0396dd605118d61c2e98a0743614e12 (Fri Feb 12 
21:11:32 UTC 2021)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **Invalid pull request title: No valid Jira ID provided**
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] advancedwebdeveloper opened a new pull request #14934: Upgrade the os-maven-plugin dependency to version 1.7.0

2021-02-12 Thread GitBox


advancedwebdeveloper opened a new pull request #14934:
URL: https://github.com/apache/flink/pull/14934


   That would ensure that the RISC-V architecture is supported.
   I tested it on a U54-MC based board and it worked.
   
   Hence 
https://github.com/trustin/os-maven-plugin/commit/43df41a94dab780d03f47f17c5cbc8e2ef400c3b
 
   
   CC @trustin 







[GitHub] [flink] sv3ndk commented on a change in pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sv3ndk commented on a change in pull request #14743:
URL: https://github.com/apache/flink/pull/14743#discussion_r575520426



##
File path: docs/content.zh/docs/connectors/table/kafka.md
##
@@ -499,12 +499,13 @@ If `timestamp` is specified, another config option 
`scan.startup.timestamp-milli
 If `specific-offsets` is specified, another config option 
`scan.startup.specific-offsets` is required to specify specific startup offsets 
for each partition,
 e.g. an option value `partition:0,offset:42;partition:1,offset:300` indicates 
offset `42` for partition `0` and offset `300` for partition `1`.
 
-### Changelog Source
+### CDC Changelog Source
+
Flink natively supports Kafka as a CDC changelog source. If the messages in a Kafka 
topic are change events captured from other databases using CDC tools, then you 
can use a CDC format to interpret the messages as INSERT/UPDATE/DELETE messages 
in the Flink SQL system.
+
+Flink provides three CDC formats: [debezium-json]({% link 
dev/table/connectors/formats/debezium.md %}), [canal-json]({% link 
dev/table/connectors/formats/canal.md %}) and [maxwell-json]({% link 
dev/table/connectors/formats/maxwell.md %}) to interpret change events captured 
by [Debezium](https://debezium.io/), 
[Canal](https://github.com/alibaba/canal/wiki) and 
[Maxwell](https://maxwells-daemon.io/).

Review comment:
   Thanks for the feedback.
   How about I make it a list and omit the details of the specific flavors 
of the format, i.e. we choose not to mention the distinction between 
`json` and `avro-confluent` and others here? That way the user can find all the 
specific details in the linked pages, and this part of the documentation has 
less chance of becoming obsolete over time.
   
   Something like this:
   ```
   Flink provides several CDC formats: 
   * [debezium](...)
   * [canal](...)
   * [maxwell](...)
   ```
   
   What do you think?









[GitHub] [flink] flinkbot edited a comment on pull request #14933: [FLINK-21370][Connectors-JDBC] flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14933:
URL: https://github.com/apache/flink/pull/14933#issuecomment-778445164


   
   ## CI report:
   
   * e5e2d0e5f0b191127c801641c4e26b2286997a11 UNKNOWN
   * 00f354286c00bdbc713eab5fd14e063c5f620027 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Comment Edited] (FLINK-11838) Create RecoverableWriter for GCS

2021-02-12 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283950#comment-17283950
 ] 

Galen Warren edited comment on FLINK-11838 at 2/12/21, 9:01 PM:


One more thought/question. While we can potentially compose temp blobs along 
the way, to avoid having to compose them all at commit time, it seems to me 
that we can't safely delete any of the temporary blobs along the way, because it's 
possible that we might restore to a checkpoint prior to some or all of the 
incremental compose operations having occurred. In that case, we'd have to 
repeat the compose operations, which means the underlying temp blobs would need 
to be there.

If that's right, then we'd necessarily have to wait until the end to delete the 
temp blobs. I was wondering, would it be allowable to do any of those delete 
operations in parallel? The coding 
[guidelines|https://flink.apache.org/contributing/code-style-and-quality-common.html#concurrency-and-threading]
 would seem to discourage this, but they don't outright prohibit it, so I 
thought I'd ask first.

EDIT: Never mind regarding the parallelism question; I didn't notice that the 
GCS API provides a method to delete several blobs in a single call. I think 
that will suffice. 
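The deferred-deletion reasoning above can be modeled in a few lines. This is an illustrative, in-memory sketch under assumed names (it does not use the real GCS client; a real writer would issue the client's batch-delete call at commit time): temp blobs stay tracked across compose steps and are only dropped on commit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: models why temp blobs must survive until commit.
// All names are hypothetical; the Set below is a stand-in for the GCS bucket.
public class DeferredTempBlobCleanup {
    private final Set<String> bucket = new HashSet<>();       // stand-in for GCS
    private final List<String> tempBlobs = new ArrayList<>(); // recoverable state

    public void writeTempBlob(String name) {
        bucket.add(name);
        tempBlobs.add(name); // tracked so a restore can re-run compose steps
    }

    // Compose merges temp blobs into a target but must NOT delete them yet:
    // restoring to an earlier checkpoint may require repeating this compose.
    public void compose(String targetName) {
        bucket.add(targetName);
    }

    // Only on commit is it safe to drop all temp blobs, ideally in one batch call.
    public void commit() {
        bucket.removeAll(tempBlobs);
        tempBlobs.clear();
    }

    public boolean exists(String name) {
        return bucket.contains(name);
    }

    public static void main(String[] args) {
        DeferredTempBlobCleanup writer = new DeferredTempBlobCleanup();
        writer.writeTempBlob("tmp-1");
        writer.writeTempBlob("tmp-2");
        writer.compose("part-0");
        assert writer.exists("tmp-1"); // still present: a restore may need it
        writer.commit();
        assert !writer.exists("tmp-1") && writer.exists("part-0");
    }
}
```

Keeping the temp-blob list as part of the recoverable state is what makes the eventual single batch delete safe.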


was (Author: galenwarren):
One more thought/question. While we can potentially compose temp blobs along 
the way, to avoid having to compose them all at commit time, it seems to me 
that we can't safely delete any of the temporary blobs along the way, because it's 
possible that we might restore to a checkpoint prior to some or all of the 
incremental compose operations having occurred. In that case, we'd have to 
repeat the compose operations, which means the underlying temp blobs would need 
to be there.

If that's right, then we'd necessarily have to wait until the end to delete the 
temp blobs. I was wondering, would it be allowable to do any of those delete 
operations in parallel? The coding 
[guidelines|https://flink.apache.org/contributing/code-style-and-quality-common.html#concurrency-and-threading]
 would seem to discourage this, but they don't outright prohibit it, so I 
thought I'd ask first.

> Create RecoverableWriter for GCS
> 
>
> Key: FLINK-11838
> URL: https://issues.apache.org/jira/browse/FLINK-11838
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Affects Versions: 1.8.0
>Reporter: Fokko Driesprong
>Assignee: Galen Warren
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.13.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> GCS supports resumable uploads, which we can use to create a RecoverableWriter 
> similar to the S3 implementation:
> https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
> After using the Hadoop compatible interface: 
> https://github.com/apache/flink/pull/7519
> We've noticed that the current implementation relies heavily on the renaming 
> of the files on the commit: 
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
> This is suboptimal on an object store such as GCS. Therefore we would like to 
> implement a more GCS-native RecoverableWriter.





[GitHub] [flink] flinkbot commented on pull request #14933: [FLINK-21370][Connectors-JDBC] flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread GitBox


flinkbot commented on pull request #14933:
URL: https://github.com/apache/flink/pull/14933#issuecomment-778445164


   
   ## CI report:
   
   * e5e2d0e5f0b191127c801641c4e26b2286997a11 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #14933: [FLINK-21370][Connectors-JDBC] flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread GitBox


flinkbot commented on pull request #14933:
URL: https://github.com/apache/flink/pull/14933#issuecomment-778437768


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit e5e2d0e5f0b191127c801641c4e26b2286997a11 (Fri Feb 12 
20:29:56 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-1).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[jira] [Updated] (FLINK-21370) flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21370:
---
Labels: pull-request-available  (was: )

> flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory 
> default config is not consistent
> 
>
> Key: FLINK-21370
> URL: https://issues.apache.org/jira/browse/FLINK-21370
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.1
>Reporter: 谢波
>Priority: Major
>  Labels: pull-request-available
>
> When I tested the JDBC sink with a Kafka source, the data was not written to 
> the database, and I then found that the JdbcExecutionOptions defaults are not 
> consistent with the JDBCDynamicTableFactory defaults.
> JDBCExecutionOptions
> public static final int DEFAULT_MAX_RETRY_TIMES = 3;
> private static final int DEFAULT_INTERVAL_MILLIS = 0;
> public static final int DEFAULT_SIZE = 5000;
>  
> JDBCDynamicTableFactory
>  
> // write config options
> private static final ConfigOption<Integer> SINK_BUFFER_FLUSH_MAX_ROWS =
>  ConfigOptions.key("sink.buffer-flush.max-rows")
>  .intType()
>  .defaultValue(100)
>  .withDescription(
>  "the flush max size (includes all append, upsert and delete records), over 
> this number"
>  + " of records, will flush data. The default value is 100.");
> private static final ConfigOption<Duration> SINK_BUFFER_FLUSH_INTERVAL =
>  ConfigOptions.key("sink.buffer-flush.interval")
>  .durationType()
>  .defaultValue(Duration.ofSeconds(1))
>  .withDescription(
>  "the flush interval mills, over this time, asynchronous threads will flush 
> data. The "
>  + "default value is 1s.");
> private static final ConfigOption<Integer> SINK_MAX_RETRIES =
>  ConfigOptions.key("sink.max-retries")
>  .intType()
>  .defaultValue(3)
>  .withDescription("the max retry times if writing records to database 
> failed.");





[GitHub] [flink] MyLanPangzi opened a new pull request #14933: [FLINK-21370][Connectors-JDBC] flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consiste

2021-02-12 Thread GitBox


MyLanPangzi opened a new pull request #14933:
URL: https://github.com/apache/flink/pull/14933


   fix https://issues.apache.org/jira/projects/FLINK/issues/FLINK-21370
   
   
   ## What is the purpose of the change
   
   This change makes the JdbcExecutionOptions defaults consistent with the 
JDBCDynamicTableFactory defaults.
   
   ## Brief change log
   
   - 
org.apache.flink.connector.jdbc.JdbcExecutionOptions#DEFAULT_INTERVAL_MILLIS 
is now public and its value is 1000 
   - org.apache.flink.connector.jdbc.JdbcExecutionOptions#DEFAULT_SIZE is now 
100 
   
   
   ## Verifying this change
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes 
 - The serializers:  no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   







[GitHub] [flink] pnowojski merged pull request #14931: [hotfix] Fix ASCII art in the OperatorChain JavaDoc

2021-02-12 Thread GitBox


pnowojski merged pull request #14931:
URL: https://github.com/apache/flink/pull/14931


   







[GitHub] [flink] flinkbot edited a comment on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778421913


   
   ## CI report:
   
   * 95e05ce40b2f091a0b3c5d1e80fea7182363f913 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13293)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-21370) flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-21370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283953#comment-17283953
 ] 

谢波 commented on FLINK-21370:


[~jark] Can you assign this problem to me?

> flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory 
> default config is not consistent
> 
>
> Key: FLINK-21370
> URL: https://issues.apache.org/jira/browse/FLINK-21370
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.1
>Reporter: 谢波
>Priority: Major
>
> When I tested the JDBC sink with a Kafka source, the data was not written to 
> the database, and I then found that the JdbcExecutionOptions defaults are not 
> consistent with the JDBCDynamicTableFactory defaults.
> JDBCExecutionOptions
> public static final int DEFAULT_MAX_RETRY_TIMES = 3;
> private static final int DEFAULT_INTERVAL_MILLIS = 0;
> public static final int DEFAULT_SIZE = 5000;
>  
> JDBCDynamicTableFactory
>  
> // write config options
> private static final ConfigOption<Integer> SINK_BUFFER_FLUSH_MAX_ROWS =
>  ConfigOptions.key("sink.buffer-flush.max-rows")
>  .intType()
>  .defaultValue(100)
>  .withDescription(
>  "the flush max size (includes all append, upsert and delete records), over 
> this number"
>  + " of records, will flush data. The default value is 100.");
> private static final ConfigOption<Duration> SINK_BUFFER_FLUSH_INTERVAL =
>  ConfigOptions.key("sink.buffer-flush.interval")
>  .durationType()
>  .defaultValue(Duration.ofSeconds(1))
>  .withDescription(
>  "the flush interval mills, over this time, asynchronous threads will flush 
> data. The "
>  + "default value is 1s.");
> private static final ConfigOption<Integer> SINK_MAX_RETRIES =
>  ConfigOptions.key("sink.max-retries")
>  .intType()
>  .defaultValue(3)
>  .withDescription("the max retry times if writing records to database 
> failed.");





[jira] [Created] (FLINK-21370) flink-1.12.1 - JDBCExecutionOptions defaults config JDBCDynamicTableFactory default config is not consistent

2021-02-12 Thread Jira
谢波 created FLINK-21370:
--

 Summary: flink-1.12.1 - JDBCExecutionOptions defaults config 
JDBCDynamicTableFactory default config is not consistent
 Key: FLINK-21370
 URL: https://issues.apache.org/jira/browse/FLINK-21370
 Project: Flink
  Issue Type: Bug
  Components: Connectors / JDBC
Affects Versions: 1.12.1
Reporter: 谢波


When I tested the JDBC sink with a Kafka source, the data was not written to 
the database, and I then found that the JdbcExecutionOptions defaults are not 
consistent with the JDBCDynamicTableFactory defaults.

JDBCExecutionOptions

public static final int DEFAULT_MAX_RETRY_TIMES = 3;
private static final int DEFAULT_INTERVAL_MILLIS = 0;
public static final int DEFAULT_SIZE = 5000;

 

JDBCDynamicTableFactory

 


// write config options
private static final ConfigOption<Integer> SINK_BUFFER_FLUSH_MAX_ROWS =
 ConfigOptions.key("sink.buffer-flush.max-rows")
 .intType()
 .defaultValue(100)
 .withDescription(
 "the flush max size (includes all append, upsert and delete records), over 
this number"
 + " of records, will flush data. The default value is 100.");


private static final ConfigOption<Duration> SINK_BUFFER_FLUSH_INTERVAL =
 ConfigOptions.key("sink.buffer-flush.interval")
 .durationType()
 .defaultValue(Duration.ofSeconds(1))
 .withDescription(
 "the flush interval mills, over this time, asynchronous threads will flush 
data. The "
 + "default value is 1s.");


private static final ConfigOption<Integer> SINK_MAX_RETRIES =
 ConfigOptions.key("sink.max-retries")
 .intType()
 .defaultValue(3)
 .withDescription("the max retry times if writing records to database failed.");
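One way to keep the two sets of defaults from drifting apart again is for both classes to read a single set of shared constants. A simplified standalone sketch (class and method names here are illustrative only, not Flink's actual fields; the values mirror the table factory defaults from the report — flush interval 1 s, max rows 100):

```java
public class JdbcDefaultsSketch {

    // Hypothetical single source of truth for both configuration paths.
    static final int DEFAULT_MAX_RETRY_TIMES = 3;
    static final long DEFAULT_INTERVAL_MILLIS = 1000L; // matches sink.buffer-flush.interval = 1s
    static final int DEFAULT_BATCH_SIZE = 100;         // matches sink.buffer-flush.max-rows = 100

    // Both sides reference the shared constants instead of declaring their
    // own literals, so the defaults cannot disagree.
    static int executionOptionsBatchSize() {
        return DEFAULT_BATCH_SIZE;
    }

    static int tableFactoryBatchSize() {
        return DEFAULT_BATCH_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(executionOptionsBatchSize() == tableFactoryBatchSize()); // prints "true"
    }
}
```

With one source of truth, a change to either default is automatically reflected in the other, which is exactly the consistency the fix aims for.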





[GitHub] [flink] flinkbot commented on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot commented on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778421913


   
   ## CI report:
   
   * 95e05ce40b2f091a0b3c5d1e80fea7182363f913 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-11838) Create RecoverableWriter for GCS

2021-02-12 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283950#comment-17283950
 ] 

Galen Warren commented on FLINK-11838:
--

One more thought/question. While we can potentially compose temp blobs along 
the way, to avoid having to compose them all at commit time, it seems to me 
that we can't safely delete any of the temporary blobs along the way, because it's 
possible that we might restore to a checkpoint prior to some or all of the 
incremental compose operations having occurred. In that case, we'd have to 
repeat the compose operations, which means the underlying temp blobs would need 
to be there.

If that's right, then we'd necessarily have to wait to the end to delete the 
temp blobs. I was wondering, would it be allowable to do any of those delete 
operations in parallel? The coding 
[guidelines|https://flink.apache.org/contributing/code-style-and-quality-common.html#concurrency-and-threading]
 would seem to discourage this, but they don't outright prohibit it, so I 
thought I'd ask first.
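If the deletes do all have to wait until commit time, nothing in principle prevents issuing them concurrently, as long as the commit blocks until every delete has completed. A minimal standalone sketch (hypothetical class and method names, not Flink's actual RecoverableWriter code; the delete callback stands in for a real GCS delete call):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class DeferredBlobCleanup {

    // Names of temporary blobs created along the way. Nothing is deleted
    // until commit, so restoring to an earlier checkpoint can still replay
    // the compose operations over these blobs.
    private final List<String> tempBlobs = new CopyOnWriteArrayList<>();

    public void track(String blobName) {
        tempBlobs.add(blobName);
    }

    // At commit time, issue all deletes in parallel on a dedicated pool and
    // block until every one has completed (join() surfaces any failure).
    public void deleteAllOnCommit(ExecutorService pool, Consumer<String> deleteFn) {
        CompletableFuture<?>[] deletes = tempBlobs.stream()
                .map(blob -> CompletableFuture.runAsync(() -> deleteFn.accept(blob), pool))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(deletes).join();
    }

    public static void main(String[] args) {
        DeferredBlobCleanup cleanup = new DeferredBlobCleanup();
        cleanup.track("tmp/part-0");
        cleanup.track("tmp/part-1");

        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<String> deleted = new CopyOnWriteArrayList<>();
        cleanup.deleteAllOnCommit(pool, deleted::add); // stand-in for a GCS delete
        pool.shutdown();

        System.out.println("deleted " + deleted.size() + " temp blobs"); // prints "deleted 2 temp blobs"
    }
}
```

Confining the concurrency to one dedicated executor and joining before the commit returns keeps the pattern within the spirit of the coding guidelines: no shared mutable state escapes, and the caller still sees a synchronous commit.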

> Create RecoverableWriter for GCS
> 
>
> Key: FLINK-11838
> URL: https://issues.apache.org/jira/browse/FLINK-11838
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Affects Versions: 1.8.0
>Reporter: Fokko Driesprong
>Assignee: Galen Warren
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.13.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> GCS supports the resumable upload which we can use to create a Recoverable 
> writer similar to the S3 implementation:
> https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
> After using the Hadoop compatible interface: 
> https://github.com/apache/flink/pull/7519
> We've noticed that the current implementation relies heavily on the renaming 
> of the files on the commit: 
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
> This is suboptimal on an object store such as GCS. Therefore we would like to 
> implement a more GCS native RecoverableWriter 





[jira] [Updated] (FLINK-21369) Document Checkpoint Storage

2021-02-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21369:
---
Labels: pull-request-available  (was: )

> Document Checkpoint Storage
> ---
>
> Key: FLINK-21369
> URL: https://issues.apache.org/jira/browse/FLINK-21369
> Project: Flink
>  Issue Type: Improvement
>Reporter: Seth Wiesman
>Assignee: Seth Wiesman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>






[GitHub] [flink] flinkbot commented on pull request #14932: [FLINK-21369][docs] Document Checkpoint Storage

2021-02-12 Thread GitBox


flinkbot commented on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778414061


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 95e05ce40b2f091a0b3c5d1e80fea7182363f913 (Fri Feb 12 
19:46:25 UTC 2021)
   
✅no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] sjwiesman commented on pull request #14932: Checkpoint storage docs

2021-02-12 Thread GitBox


sjwiesman commented on pull request #14932:
URL: https://github.com/apache/flink/pull/14932#issuecomment-778412989


   cc @rkhachatryan & @alpinegizmo 







[GitHub] [flink] sjwiesman opened a new pull request #14932: Checkpoint storage docs

2021-02-12 Thread GitBox


sjwiesman opened a new pull request #14932:
URL: https://github.com/apache/flink/pull/14932


   ## What is the purpose of the change
   
   Add documentation for the new Checkpoint Storage abstraction and new State 
Backend apis (HashMap and EmbeddedRocksDB). 
   







[jira] [Created] (FLINK-21369) Document Checkpoint Storage

2021-02-12 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-21369:


 Summary: Document Checkpoint Storage
 Key: FLINK-21369
 URL: https://issues.apache.org/jira/browse/FLINK-21369
 Project: Flink
  Issue Type: Improvement
Reporter: Seth Wiesman
Assignee: Seth Wiesman
 Fix For: 1.13.0








[GitHub] [flink] flinkbot edited a comment on pull request #14921: [FLINK-21100][coordination] Add DeclarativeScheduler

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14921:
URL: https://github.com/apache/flink/pull/14921#issuecomment-777147472


   
   ## CI report:
   
   * 6ce75263caba486446116acd952d5201a4bb2d39 UNKNOWN
   * 415370d41c01975434e8778edcfb4ce8904d4a32 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13286)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-18496) Anchors are not generated based on ZH characters

2021-02-12 Thread Zhilong Hong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283923#comment-17283923
 ] 

Zhilong Hong commented on FLINK-18496:
--

Since the documentation was migrated from Jekyll to Hugo in FLINK-21193, I 
think this issue is no longer valid and can be closed. cc [~zhuzh]

> Anchors are not generated based on ZH characters
> 
>
> Key: FLINK-18496
> URL: https://issues.apache.org/jira/browse/FLINK-18496
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Zhu Zhu
>Assignee: Zhilong Hong
>Priority: Major
>  Labels: starter
>
> On the ZH pages of flink-web, anchors are not generated based on ZH 
> characters. The anchor name would be like 'section-1', 'section-2' if there 
> are no EN characters. An example is the links in the navigator of 
> https://flink.apache.org/zh/contributing/contribute-code.html
> This makes it impossible to reference an anchor from the content because the 
> anchor name might change unexpectedly if a new section is added.
> Note that it is a problem for flink-web only. The docs generated from the 
> flink repo can properly generate ZH anchors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-10320) Introduce JobMaster schedule micro-benchmark

2021-02-12 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283914#comment-17283914
 ] 

Zhu Zhu commented on FLINK-10320:
-

I think it is a duplicate and we can close it since FLINK-20612 is done. 
Regarding the benchmarks for RPC requests, I prefer to open a separate ticket 
to do it when needed.
cc [~tison]

> Introduce JobMaster schedule micro-benchmark
> 
>
> Key: FLINK-10320
> URL: https://issues.apache.org/jira/browse/FLINK-10320
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Tests
>Reporter: Zili Chen
>Priority: Major
>
> Based on the {{org.apache.flink.streaming.runtime.io.benchmark}} code and the 
> repo [flink-benchmark|https://github.com/dataArtisans/flink-benchmarks], I 
> propose introducing another micro-benchmark that focuses on {{JobMaster}} 
> scheduling performance
> h3. Target
> Benchmark how long it takes from {{JobMaster}} startup (receiving the 
> {{JobGraph}} and initializing) until all tasks are RUNNING. Technically we 
> use a bounded stream and the TM finishes tasks as soon as they arrive, so the 
> real interval we measure ends when all tasks are FINISHED.
> h3. Case
> 1. JobGraph that cover EAGER + PIPELINED edges
> 2. JobGraph that cover LAZY_FROM_SOURCES + PIPELINED edges
> 3. JobGraph that cover LAZY_FROM_SOURCES + BLOCKING edges
> ps: maybe also benchmark the case where the source is read from an {{InputSplit}}?
> h3. Implement
> Based on the flink-benchmark repo, we finally run the benchmarks using JMH. So the 
> whole test suite is split across two repos. The testing environment could be 
> located in the main repo, maybe under 
> flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/benchmark.
> To measure the performance of {{JobMaster}} scheduling, we need to simulate 
> an environment that:
> 1. has a real {{JobMaster}}
> 2. has a mock/testing {{ResourceManager}} that has infinite resources and 
> reacts immediately.
> 3. has one (or many?) mock/testing {{TaskExecutor}} instances that deploy and 
> finish tasks immediately.
> [~trohrm...@apache.org] [~GJL] [~pnowojski] could you please review this 
> proposal to help clarify the goal and concrete details? Thanks in advance.
> Any suggestions are welcome.





[jira] [Comment Edited] (FLINK-20612) Add benchmarks for scheduler

2021-02-12 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283906#comment-17283906
 ] 

Zhu Zhu edited comment on FLINK-20612 at 2/12/21, 6:48 PM:
---

[~pnowojski] The main problem in scheduler performance is that a 
high-complexity process can be introduced unexpectedly, which can raise the 
computational complexity from O(N) to O(N^2) or even O(N^3). From this 
perspective, I think the scheduler benchmarks do not need to be that stable; a 
variance within 50% may be acceptable. Especially after the optimization of 
some processes in FLINK-21110, the benchmark times may drop to much smaller 
numbers, and the curve may become more unstable as a result. Even so, the 
benchmarks still serve the initial goal: a very steep curve will show up if an 
unexpectedly high-complexity process is introduced.
Besides that, I think we can also increase the parallelism from 4000 to 1, 
given that the benchmarks are currently a bit short (within 10 s). Previously 
it was set to 4000 to keep the benchmarks from running too long, but now it 
seems the runtime at parallelism 1 is still acceptable. A larger base time 
can make the curve more stable.


was (Author: zhuzh):
[~pnowojski] The main problem in scheduler performance is that a 
high-complexity process can be introduced unexpectedly, which can raise the 
computational complexity from O(N) to O(N^2) or even O(N^3). From this 
perspective, I think the scheduler benchmarks do not need to be that stable; a 
variance within 50% may be acceptable. Especially after the optimization of 
some processes in FLINK-21110, the benchmark times may drop to much smaller 
numbers, and the curve may become more unstable as a result. Even so, the 
benchmarks still serve the initial goal: a very steep curve will show up if an 
unexpectedly high-complexity process is introduced.
Besides that, I think we can also increase the parallelism from 4000 to 1, 
given that the benchmarks are currently a bit short (within 10 s). Previously 
it was set to 4000 to keep the benchmarks from running too long, but now it 
seems the runtime at parallelism 1 is still acceptable.

> Add benchmarks for scheduler
> 
>
> Key: FLINK-20612
> URL: https://issues.apache.org/jira/browse/FLINK-20612
> Project: Flink
>  Issue Type: Improvement
>  Components: Benchmarks, Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> With Flink 1.12, we failed to run large-scale jobs on our cluster. When we 
> tried to run the jobs, we hit exceptions such as heap out-of-memory errors 
> and taskmanager heartbeat timeouts. Even after we increased the heap size 
> and extended the heartbeat timeout, the jobs still failed. After 
> troubleshooting, we found some performance bottlenecks in the 
> jobmaster. These bottlenecks are highly related to the complexity of the 
> topology.
> We implemented several benchmarks on these bottlenecks based on 
> flink-benchmark. The topology of the benchmarks is a simple graph, which 
> consists of only two vertices: one source vertex and one sink vertex. They 
> are both connected with all-to-all blocking edges. The parallelisms of the 
> vertices are both 8000. The execution mode is batch. The results of the 
> benchmarks are illustrated below:
> Table 1: The result of benchmarks on bottlenecks in the jobmaster
> | |*Time spent*|
> |Build topology|45725.466 ms|
> |Init scheduling strategy|38960.602 ms|
> |Deploy tasks|17472.884 ms|
> |Calculate failover region to restart|12960.912 ms|
> We'd like to propose these benchmarks for procedures related to the 
> scheduler. There are three main benefits:
>  # They help us to understand the current status of task deployment 
> performance and locate where the bottleneck is.
>  # We can use the benchmarks to evaluate the optimization in the future.
>  # As we run the benchmarks daily, they will help us to trace how the 
> performance changes and locate the commit that introduces the performance 
> regression if there is any.
> In the first version of the benchmarks, we mainly focus on the procedures we 
> mentioned above. The methods corresponding to the procedures are:
>  # Building topology: {{ExecutionGraph#attachJobGraph}}
>  # Initializing scheduling strategies: 
> {{PipelinedRegionSchedulingStrategy#init}}
>  # Deploying tasks: {{Execution#deploy}}
>  # Calculating failover regions: 
> {{RestartPipelinedRegionFailoverStrategy#getTasksNeedingRestart}}
> In the benchmarks, the topology consists of two vertices: source -> sink. 
> They are connected 

[jira] [Commented] (FLINK-20612) Add benchmarks for scheduler

2021-02-12 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283906#comment-17283906
 ] 

Zhu Zhu commented on FLINK-20612:
-

[~pnowojski] The main problem in scheduler performance is that a 
high-complexity process can be introduced unexpectedly, which can raise the 
computational complexity from O(N) to O(N^2) or even O(N^3). From this 
perspective, I think the scheduler benchmarks do not need to be that stable; a 
variance within 50% may be acceptable. Especially after the optimization of 
some processes in FLINK-21110, the benchmark times may drop to much smaller 
numbers, and the curve may become more unstable as a result. Even so, the 
benchmarks still serve the initial goal: a very steep curve will show up if an 
unexpectedly high-complexity process is introduced.
Besides that, I think we can also increase the parallelism from 4000 to 1, 
given that the benchmarks are currently a bit short (within 10 s). Previously 
it was set to 4000 to keep the benchmarks from running too long, but now it 
seems the runtime at parallelism 1 is still acceptable.
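The "steep curve" signal described above can be illustrated without the actual flink-benchmarks JMH setup: compare how cost grows when the input size doubles. A deterministic standalone sketch (using operation counts instead of wall-clock time; all names here are illustrative):

```java
import java.util.function.IntToLongFunction;

public class ComplexityProbe {

    // O(N) scheduler step: one unit of work per vertex.
    static long linearOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++) {
            ops++;
        }
        return ops;
    }

    // Accidentally O(N^2) variant, e.g. scanning every edge for every vertex.
    static long quadraticOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                ops++;
            }
        }
        return ops;
    }

    // Doubling n should roughly double an O(N) cost but quadruple an O(N^2)
    // cost; a jump in this ratio is the steep curve the daily benchmarks
    // are meant to expose.
    static double growthRatio(IntToLongFunction f, int n) {
        return (double) f.applyAsLong(2 * n) / f.applyAsLong(n);
    }

    public static void main(String[] args) {
        System.out.println(growthRatio(ComplexityProbe::linearOps, 1000));    // prints 2.0
        System.out.println(growthRatio(ComplexityProbe::quadraticOps, 1000)); // prints 4.0
    }
}
```

This is also why moderate variance in the benchmark times is tolerable: an accidental jump from O(N) to O(N^2) at parallelism in the thousands dwarfs a 50% noise band.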

> Add benchmarks for scheduler
> 
>
> Key: FLINK-20612
> URL: https://issues.apache.org/jira/browse/FLINK-20612
> Project: Flink
>  Issue Type: Improvement
>  Components: Benchmarks, Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Assignee: Zhilong Hong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> With Flink 1.12, we failed to run large-scale jobs on our cluster. When we 
> tried to run the jobs, we hit exceptions such as heap out-of-memory errors 
> and taskmanager heartbeat timeouts. Even after we increased the heap size 
> and extended the heartbeat timeout, the jobs still failed. After 
> troubleshooting, we found some performance bottlenecks in the 
> jobmaster. These bottlenecks are highly related to the complexity of the 
> topology.
> We implemented several benchmarks on these bottlenecks based on 
> flink-benchmark. The topology of the benchmarks is a simple graph, which 
> consists of only two vertices: one source vertex and one sink vertex. They 
> are both connected with all-to-all blocking edges. The parallelisms of the 
> vertices are both 8000. The execution mode is batch. The results of the 
> benchmarks are illustrated below:
> Table 1: The result of benchmarks on bottlenecks in the jobmaster
> |*Procedure*|*Time spent*|
> |Build topology|45725.466 ms|
> |Init scheduling strategy|38960.602 ms|
> |Deploy tasks|17472.884 ms|
> |Calculate failover region to restart|12960.912 ms|
> We'd like to propose these benchmarks for procedures related to the 
> scheduler. There are three main benefits:
>  # They help us to understand the current status of task deployment 
> performance and locate where the bottleneck is.
>  # We can use the benchmarks to evaluate the optimization in the future.
>  # As we run the benchmarks daily, they will help us to trace how the 
> performance changes and locate the commit that introduces the performance 
> regression if there is any.
> In the first version of the benchmarks, we mainly focus on the procedures we 
> mentioned above. The methods corresponding to the procedures are:
>  # Building topology: {{ExecutionGraph#attachJobGraph}}
>  # Initializing scheduling strategies: 
> {{PipelinedRegionSchedulingStrategy#init}}
>  # Deploying tasks: {{Execution#deploy}}
>  # Calculating failover regions: 
> {{RestartPipelinedRegionFailoverStrategy#getTasksNeedingRestart}}
> In the benchmarks, the topology consists of two vertices: source -> sink. 
> They are connected with all-to-all edges. The result partition type 
> ({{PIPELINED}} and {{BLOCKING}}) should be considered separately.
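The quadratic blow-up these benchmarks are designed to catch can be sketched with a JDK-only toy (illustrative only, not Flink code or the actual benchmark harness): connecting two vertex groups all-to-all creates parallelism² edges, which is what dominates procedures such as `ExecutionGraph#attachJobGraph` for the 8000×8000 topology described above.

```java
/**
 * Toy illustration (not Flink code) of why all-to-all topologies make
 * scheduler procedures quadratic in the parallelism.
 */
public class AllToAllTopologySketch {

    /** Counts the edges an all-to-all connection between two vertex groups creates. */
    static long countAllToAllEdges(int parallelism) {
        long edges = 0;
        for (int src = 0; src < parallelism; src++) {
            for (int snk = 0; snk < parallelism; snk++) {
                edges++; // the real graph allocates an edge object here, hence the O(N^2) cost
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // Doubling the parallelism roughly quadruples the measured work.
        for (int p : new int[] {1000, 2000, 4000}) {
            long start = System.nanoTime();
            long edges = countAllToAllEdges(p);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("parallelism " + p + " -> " + edges + " edges (" + micros + " us)");
        }
    }
}
```

A benchmark over such a procedure shows a steep curve as soon as an accidentally superlinear step is introduced, which is exactly the regression signal the daily runs are meant to provide.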



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13912: [FLINK-19466][FLINK-19467][runtime / state backends] Add Flip-142 public interfaces and methods

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #13912:
URL: https://github.com/apache/flink/pull/13912#issuecomment-721398037


   
   ## CI report:
   
   * 7ac6d8a1b4a0bbe2fd222f4403cbc967b7f3f7ef Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13291)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-12031) the registerFactory method of TypeExtractor Should not be private

2021-02-12 Thread Ben La Monica (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283883#comment-17283883
 ] 

Ben La Monica commented on FLINK-12031:
---

I'm trying to figure out why LocalDates aren't being serialized, and noticed 
this method. As far as I can tell, `registeredTypeInfoFactories` is NEVER able 
to have anything in it, so should we just remove it? How exactly does one 
register TypeInfo factories?
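For reference, the supported registration path is annotating the type itself with `@TypeInfo(MyFactory.class)` rather than the private `registerFactory` method. The JDK-only toy below imitates that annotation-based lookup; the annotation and classes here are illustrative stand-ins, not Flink's actual `org.apache.flink.api.common.typeinfo.TypeInfo`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Simplified, JDK-only imitation of annotation-based factory registration:
 * the type declares its factory via an annotation, and the extractor reads
 * the annotation instead of consulting a mutable static registry.
 */
public class AnnotationLookupSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface TypeInfo {
        Class<?> value();
    }

    /** Stand-in for a user-defined TypeInfoFactory implementation. */
    static class LocalDateFactory {}

    @TypeInfo(LocalDateFactory.class)
    static class MyPojo {}

    /** Mirrors how an extractor would discover the factory for a type. */
    static Class<?> lookupFactory(Class<?> type) {
        TypeInfo annotation = type.getAnnotation(TypeInfo.class);
        return annotation == null ? null : annotation.value();
    }

    public static void main(String[] args) {
        System.out.println(lookupFactory(MyPojo.class).getSimpleName()); // prints "LocalDateFactory"
    }
}
```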

> the registerFactory method of TypeExtractor  Should not be private
> --
>
> Key: FLINK-12031
> URL: https://issues.apache.org/jira/browse/FLINK-12031
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Reporter: frank wang
>Priority: Minor
>
> [https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java]
> {code:java}
> /**
>  * Registers a type information factory globally for a certain type. Every following type
>  * extraction operation will use the provided factory for this type. The factory will have
>  * highest precedence for this type. In a hierarchy of types the registered factory has
>  * higher precedence than annotations at the same level but lower precedence than factories
>  * defined down the hierarchy.
>  *
>  * @param t type for which a new factory is registered
>  * @param factory type information factory that will produce {@link TypeInformation}
>  */
> private static void registerFactory(Type t, Class<? extends TypeInfoFactory> factory) {
>    Preconditions.checkNotNull(t, "Type parameter must not be null.");
>    Preconditions.checkNotNull(factory, "Factory parameter must not be null.");
>    if (!TypeInfoFactory.class.isAssignableFrom(factory)) {
>       throw new IllegalArgumentException("Class is not a TypeInfoFactory.");
>    }
>    if (registeredTypeInfoFactories.containsKey(t)) {
>       throw new InvalidTypesException("A TypeInfoFactory for type '" + t + "' is already registered.");
>    }
>    registeredTypeInfoFactories.put(t, factory);
> }
> {code}
>  





[GitHub] [flink] flinkbot edited a comment on pull request #14931: [hotfix] Fix ASCII art in the OperatorChain JavaDoc

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14931:
URL: https://github.com/apache/flink/pull/14931#issuecomment-778328817


   
   ## CI report:
   
   * 9c7e0b188939cb4482748a9e7da11019ee90bd91 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13290)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #13912: [FLINK-19466][FLINK-19467][runtime / state backends] Add Flip-142 public interfaces and methods

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #13912:
URL: https://github.com/apache/flink/pull/13912#issuecomment-721398037


   
   ## CI report:
   
   * 7705bf347fbf193d02630a58bbb75128c60025ed Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13268)
 
   * 7ac6d8a1b4a0bbe2fd222f4403cbc967b7f3f7ef UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14931: [hotfix] Fix ASCII art in the OperatorChain JavaDoc

2021-02-12 Thread GitBox


flinkbot commented on pull request #14931:
URL: https://github.com/apache/flink/pull/14931#issuecomment-778328817


   
   ## CI report:
   
   * 9c7e0b188939cb4482748a9e7da11019ee90bd91 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14931: [hotfix] Fix ASCII art in the OperatorChain JavaDoc

2021-02-12 Thread GitBox


flinkbot commented on pull request #14931:
URL: https://github.com/apache/flink/pull/14931#issuecomment-778322177


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 9c7e0b188939cb4482748a9e7da11019ee90bd91 (Fri Feb 12 
17:11:15 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[GitHub] [flink] pnowojski opened a new pull request #14931: [hotfix] Fix ASCII art in the OperatorChain JavaDoc

2021-02-12 Thread GitBox


pnowojski opened a new pull request #14931:
URL: https://github.com/apache/flink/pull/14931


   This is a simple JavaDoc fix, after it was broken by RufusRefactor 
https://issues.apache.org/jira/browse/FLINK-20651
   







[GitHub] [flink] flinkbot edited a comment on pull request #14930: [FLINK-21339][tests] Enable and fix ExceptionUtilsITCase

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14930:
URL: https://github.com/apache/flink/pull/14930#issuecomment-778161603


   
   ## CI report:
   
   * 2871d67e16ca1a58ab47320fcac05f36c1011662 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13282)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14919: [FLINK-21338][test] Relax ITCase naming constraints

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14919:
URL: https://github.com/apache/flink/pull/14919#issuecomment-776686704


   
   ## CI report:
   
   * 358404dbd7096506751ab384064d7b0a8261e48d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13281)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] sjwiesman commented on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-12 Thread GitBox


sjwiesman commented on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-778297029


   Hi @jjey 
   
   Thank you for working on this translation! Can you please rebase this change 
on master? We recently migrated Flink's documentation from Jekyll to Hugo. You 
can find full details on building the new documentation locally, along with our 
provided "shortcodes", in `docs/README.md`. 







[GitHub] [flink] sjwiesman commented on a change in pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sjwiesman commented on a change in pull request #14743:
URL: https://github.com/apache/flink/pull/14743#discussion_r575343972



##
File path: docs/content.zh/docs/connectors/table/kafka.md
##
@@ -499,12 +499,13 @@ If `timestamp` is specified, another config option 
`scan.startup.timestamp-milli
 If `specific-offsets` is specified, another config option 
`scan.startup.specific-offsets` is required to specify specific startup offsets 
for each partition,
 e.g. an option value `partition:0,offset:42;partition:1,offset:300` indicates 
offset `42` for partition `0` and offset `300` for partition `1`.
 
-### Changelog Source
+### CDC Changelog Source
+
+Flink natively supports Kafka as a CDC changelog source. If messages in a 
Kafka topic are change events captured from other databases using CDC tools, 
then you can use a CDC format to interpret the messages as 
INSERT/UPDATE/DELETE messages in the Flink SQL system.
+
+Flink provides three CDC formats: [debezium-json]({% link 
dev/table/connectors/formats/debezium.md %}), [canal-json]({% link 
dev/table/connectors/formats/canal.md %}) and [maxwell-json]({% link 
dev/table/connectors/formats/maxwell.md %}) to interpret change events captured 
by [Debezium](https://debezium.io/), 
[Canal](https://github.com/alibaba/canal/wiki) and 
[Maxwell](https://maxwells-daemon.io/).

Review comment:
   Since this list is getting longer, why don't we make it an actual list?
   
   ```
   Flink provides several CDC formats: 
   * [debezium-json]()
   * [canal-json]()
   * others
   ```
   
   That way it's easier to expand this as we add more formats in the future. 
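For readers of this thread, a minimal, hypothetical example of the kind of table the documentation describes. The topic name, broker address, and schema below are placeholders, not taken from the Flink docs:

```sql
-- Hypothetical table reading Debezium change events from Kafka.
CREATE TABLE orders (
  order_id BIGINT,
  amount DECIMAL(10, 2)
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders-cdc',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'debezium-json'
);
```

With `'format' = 'debezium-json'`, the rows of this table are interpreted as a changelog (INSERT/UPDATE/DELETE) rather than append-only records; `canal-json` and `maxwell-json` plug in the same way.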









[GitHub] [flink] sjwiesman commented on pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


sjwiesman commented on pull request #14743:
URL: https://github.com/apache/flink/pull/14743#issuecomment-778294624


   Hi @sv3ndk 
   
   Thank you for making this change. You may have noticed that we recently (two 
days ago) migrated from Jekyll to Hugo. That means there are some changes in 
the tags. Instead of link tags, please use "ref" as you will find on the rest of 
the page. There are full instructions on building the Hugo documentation 
locally in `docs/README.md`. 







[GitHub] [flink] flinkbot edited a comment on pull request #14839: [FLINK-21353][state] Add DFS-based StateChangelog

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14839:
URL: https://github.com/apache/flink/pull/14839#issuecomment-772060196


   
   ## CI report:
   
   * e62585ad4b21f689825dbb26032337b3cab3a3c5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13280)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-13103) Graceful Shutdown Handling by UDFs

2021-02-12 Thread David Anderson (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283799#comment-17283799
 ] 

David Anderson commented on FLINK-13103:


Is FLIP-46 still relevant, or has it been subsumed by more recent developments?

> Graceful Shutdown Handling by UDFs
> --
>
> Key: FLINK-13103
> URL: https://issues.apache.org/jira/browse/FLINK-13103
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.8.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>Priority: Major
>
> This is an umbrella issue for 
> [FLIP-46|https://cwiki.apache.org/confluence/display/FLINK/FLIP-46%3A+Graceful+Shutdown+Handling+by+UDFs]
> and it will be broken down into more fine-grained JIRAs as the discussion on 
> the FLIP evolves.
>  
> For more details on what the FLIP is about, please refer to the link above.





[jira] [Commented] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Andrew Roberts (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283794#comment-17283794
 ] 

Andrew Roberts commented on FLINK-14034:


I see. Re-reading that interface now, I realize I misunderstood the way the 
various {{invoke}}s cascade. If I go back to selective exception handling, I'll 
try this wrapping strategy.

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  





[jira] [Commented] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283790#comment-17283790
 ] 

Piotr Nowojski commented on FLINK-14034:


{quote}
wouldn't this break Semantic.EXACTLY_ONCE behavior?
{quote}
No, everything should be fine. There is nothing special about 
{{TwoPhaseCommitSinkFunction}}. It's just a helper class that implements two-phase 
commit on top of the pre-existing Flink APIs (CheckpointedFunction, 
CheckpointListener), which provide the basic required functionality.
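To make the "helper class" point concrete, here is a JDK-only sketch of the two-phase commit pattern that class encapsulates. The method names deliberately mirror the Flink callbacks, but nothing below is actual Flink code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * JDK-only sketch of two-phase commit: records accumulate in a transaction,
 * a checkpoint pre-commits the open transaction, and only checkpoint
 * completion publishes the pre-committed data.
 */
public class TwoPhaseCommitSketch {

    static class LoggingTxn {
        final StringBuilder log = new StringBuilder();
        void write(String record) { log.append(record).append(';'); }
    }

    private LoggingTxn current = new LoggingTxn();
    private final Deque<LoggingTxn> pending = new ArrayDeque<>();
    final StringBuilder committed = new StringBuilder();

    /** Like SinkFunction#invoke: write into the open transaction. */
    void invoke(String record) { current.write(record); }

    /** Like CheckpointedFunction#snapshotState: pre-commit and open a new txn. */
    void snapshotState() {
        pending.add(current);       // flushed but not yet visible downstream
        current = new LoggingTxn(); // later records belong to the next transaction
    }

    /** Like CheckpointListener#notifyCheckpointComplete: publish pre-committed txns. */
    void notifyCheckpointComplete() {
        while (!pending.isEmpty()) {
            committed.append(pending.poll().log);
        }
    }

    public static void main(String[] args) {
        TwoPhaseCommitSketch sink = new TwoPhaseCommitSketch();
        sink.invoke("a");
        sink.snapshotState();           // "a" is pre-committed
        sink.invoke("b");               // "b" belongs to the next transaction
        sink.notifyCheckpointComplete();
        System.out.println(sink.committed); // prints "a;"
    }
}
```

A custom sink built directly on CheckpointedFunction/CheckpointListener would follow the same shape, which is why bypassing `TwoPhaseCommitSinkFunction` does not by itself break exactly-once guarantees.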

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  





[GitHub] [flink] XComp commented on pull request #14798: [FLINK-21187] Provide exception history for root causes

2021-02-12 Thread GitBox


XComp commented on pull request #14798:
URL: https://github.com/apache/flink/pull/14798#issuecomment-778281634


   The test instability is caused by the change adding the requirement that 
`failureCause` needs to be set if the `executionState` is `FAILED`. The failing 
test runs a task but doesn't wait for the task to finish. The 
`TaskExecutor` triggers the cancellation of the `Task` through 
[Task.failExternally](https://github.com/XComp/flink/blob/354eb574c98bbc5fdaedabc9f3ca7e4acfaa3746/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L1129)
 passing in a `FlinkException: Disconnect from JobManager responsible for 
[...]`. This will trigger the `Task` to transition into `FAILED`. A context 
switch happens after the state is set but before the `failureCause` is set. 
Then, the `Task` finishes in the main thread, cleans up, and calls 
[notifyFinalState](https://github.com/XComp/flink/blob/354eb574c98bbc5fdaedabc9f3ca7e4acfaa3746/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L904)
 as part of this, which initializes a `TaskExecutionState` and runs into the 
`IllegalStateException` since the `failureCause` is still not set.
   
   The solution would be to perform the state change and set the `failureCause` atomically. 
But I'm not sure whether this would be a performance problem. We would have to 
switch from 
[compareAndSet](https://github.com/XComp/flink/blob/354eb574c98bbc5fdaedabc9f3ca7e4acfaa3746/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java#L1064)
 to a normal lock, I guess. Alternatively, we could remove the invariant again 
and handle the `null` in the exception history. @tillrohrmann what are your 
thoughts on that?
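One possible shape for the "atomic" option, purely illustrative (not the actual `Task` code): store the state and the cause in a single immutable object so both are published by one `compareAndSet`, avoiding a full lock.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Sketch: state and failure cause travel together in one immutable object,
 * so no reader can ever observe FAILED with a null cause.
 */
public class AtomicStateSketch {
    enum ExecutionState { RUNNING, FAILED }

    static final class StateWithCause {
        final ExecutionState state;
        final Throwable failureCause;
        StateWithCause(ExecutionState state, Throwable failureCause) {
            this.state = state;
            this.failureCause = failureCause;
        }
    }

    private final AtomicReference<StateWithCause> current =
            new AtomicReference<>(new StateWithCause(ExecutionState.RUNNING, null));

    /** Publishes FAILED and the cause in a single compareAndSet. */
    boolean failExternally(Throwable cause) {
        StateWithCause prev = current.get();
        if (prev.state == ExecutionState.FAILED) {
            return false; // already failed
        }
        return current.compareAndSet(prev, new StateWithCause(ExecutionState.FAILED, cause));
    }

    StateWithCause get() { return current.get(); }

    public static void main(String[] args) {
        AtomicStateSketch task = new AtomicStateSketch();
        task.failExternally(new RuntimeException("Disconnect from JobManager"));
        StateWithCause observed = task.get();
        System.out.println(observed.state + " / " + observed.failureCause.getMessage());
        // prints "FAILED / Disconnect from JobManager"
    }
}
```

This keeps the lock-free `compareAndSet` style at the cost of one extra allocation per transition, which may be an acceptable trade-off compared with introducing a lock.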







[jira] [Commented] (FLINK-21225) OverConvertRule does not consider distinct

2021-02-12 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283783#comment-17283783
 ] 

Timo Walther commented on FLINK-21225:
--

[~jark] shouldn't this be 1.12.2?

> OverConvertRule does not consider distinct
> --
>
> Key: FLINK-21225
> URL: https://issues.apache.org/jira/browse/FLINK-21225
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0, 1.12.3
>
>
> We don't support OVER window distinct aggregates in Table API. Even though 
> this is explicitly documented:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html#aggregations
> {code}
> // Distinct aggregation on over window
> Table result = orders
> .window(Over
> .partitionBy($("a"))
> .orderBy($("rowtime"))
> .preceding(UNBOUNDED_RANGE)
> .as("w"))
> .select(
> $("a"), $("b").avg().distinct().over($("w")),
> $("b").max().over($("w")),
> $("b").min().over($("w"))
> );
> {code}
> The distinct flag is set to false in {{OverConvertRule}}.
> See also
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Unknown-call-expression-avg-amount-when-use-distinct-in-Flink-Thanks-td40905.html





[GitHub] [flink] tillrohrmann commented on a change in pull request #14904: [FLINK-20663][runtime] Instantly release unsafe memory on freeing segment.

2021-02-12 Thread GitBox


tillrohrmann commented on a change in pull request #14904:
URL: https://github.com/apache/flink/pull/14904#discussion_r575294320



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/HybridMemorySegment.java
##
@@ -357,4 +385,9 @@ public final void put(int offset, ByteBuffer source, int 
numBytes) {
 }
 }
 }
+
+@Override
+public <T> T processAsByteBuffer(Function<ByteBuffer, T> processFunction) {
+    return Preconditions.checkNotNull(processFunction).apply(wrapInternal(0, size));
+}

Review comment:
   Which calls to `wrap` cannot be replaced with this call?

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/HybridMemorySegment.java
##
@@ -57,6 +59,10 @@
  */
 @Nullable private ByteBuffer offHeapBuffer;
 
+@Nullable private final Runnable cleaner;
+
+private final boolean allowWrap;

Review comment:
   It might not be super clear which segments are allowed to be wrapped and 
which not. Adding some explanation could help.

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/HybridMemorySegment.java
##
@@ -357,4 +385,9 @@ public final void put(int offset, ByteBuffer source, int 
numBytes) {
 }
 }
 }
+
+@Override
+public <T> T processAsByteBuffer(Function<ByteBuffer, T> processFunction) {
+    return Preconditions.checkNotNull(processFunction).apply(wrapInternal(0, size));
+}

Review comment:
   Do we know what the performance penalty for the additional `Function` is?

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegment.java
##
@@ -1386,4 +1387,17 @@ public final boolean equalTo(MemorySegment seg2, int 
offset1, int offset2, int l
 public byte[] getHeapMemory() {
 return heapMemory;
 }
+
+/**
+ * Applies the given process function on a {@link ByteBuffer} that 
represents this entire
+ * segment.
+ *
+ * Note: The {@link ByteBuffer} passed into the process function is 
temporary and could
+ * become invalid after the processing. Thus, the process function should 
not try to keep any
+ * reference of the {@link ByteBuffer}.
+ *
+ * @param processFunction to be applied to the segment as {@link 
ByteBuffer}.
+ * @return the value that the process function returns.
+ */
+public abstract <T> T processAsByteBuffer(Function<ByteBuffer, T> processFunction);

Review comment:
   We could also add an `acceptAsByteBuffer(Consumer<ByteBuffer> consumer)` 
for the case where we don't want to return a value.

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/HybridMemorySegment.java
##
@@ -94,26 +126,22 @@
 @Override
 public void free() {
 super.free();
+if (cleaner != null) {
+cleaner.run();
+}

Review comment:
   If we had a `DirectMemorySegment`, then we could only allow `wrap` on 
this type, for example.

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/HybridMemorySegment.java
##
@@ -94,26 +126,22 @@
 @Override
 public void free() {
 super.free();
+if (cleaner != null) {
+cleaner.run();
+}

Review comment:
   Looking at the special casing of this class, why don't we introduce a 
`DirectMemorySegment` and an `UnsafeMemorySegment`? Also, why do we have the 
`HeapMemorySegment` and still allow the `HybridMemorySegment` to be used with 
heap memory?
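For context, a JDK-only sketch of the `processAsByteBuffer` pattern under discussion: the segment applies a caller-supplied function to a temporary `ByteBuffer` view instead of exposing `wrap()`, so references cannot outlive `free()`. This is a simplification, not the real `HybridMemorySegment`.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

/**
 * Simplified segment: the ByteBuffer view is scoped to one function call,
 * so callers cannot retain it past free().
 */
public class SegmentSketch {
    private byte[] heapMemory = new byte[16];

    <T> T processAsByteBuffer(Function<ByteBuffer, T> processFunction) {
        if (heapMemory == null) {
            throw new IllegalStateException("segment already freed");
        }
        // The buffer is only valid for the duration of this call.
        return processFunction.apply(ByteBuffer.wrap(heapMemory));
    }

    void free() { heapMemory = null; }

    public static void main(String[] args) {
        SegmentSketch segment = new SegmentSketch();
        int capacity = segment.processAsByteBuffer(ByteBuffer::capacity);
        System.out.println(capacity); // prints 16
        segment.free();
    }
}
```

The cost of the extra `Function` indirection is one lambda invocation per call, which the JIT typically inlines; the gain is that escape of the buffer reference becomes an API-level decision rather than a caller convention.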









[GitHub] [flink] flinkbot edited a comment on pull request #14921: [FLINK-21100][coordination] Add DeclarativeScheduler

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14921:
URL: https://github.com/apache/flink/pull/14921#issuecomment-777147472


   
   ## CI report:
   
   * 6ce75263caba486446116acd952d5201a4bb2d39 UNKNOWN
   * 6456a865c84b46d91fe4151a910bfaf778087af9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13278)
 
   * 415370d41c01975434e8778edcfb4ce8904d4a32 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13286)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Comment Edited] (FLINK-21225) OverConvertRule does not consider distinct

2021-02-12 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282209#comment-17282209
 ] 

Jark Wu edited comment on FLINK-21225 at 2/12/21, 3:21 PM:
---

Fixed in 
 - master: 31346e8cdf225bec0ea4745a504c765a1fcc0edf
 - 1.12: 1345c0f9a606a6e5ccffda59bb28a6ccfe054263


was (Author: jark):
Fixed in 
 - master: 31346e8cdf225bec0ea4745a504c765a1fcc0edf
 - 1.12: TODO

> OverConvertRule does not consider distinct
> --
>
> Key: FLINK-21225
> URL: https://issues.apache.org/jira/browse/FLINK-21225
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> We don't support OVER window distinct aggregates in Table API. Even though 
> this is explicitly documented:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html#aggregations
> {code}
> // Distinct aggregation on over window
> Table result = orders
> .window(Over
> .partitionBy($("a"))
> .orderBy($("rowtime"))
> .preceding(UNBOUNDED_RANGE)
> .as("w"))
> .select(
> $("a"), $("b").avg().distinct().over($("w")),
> $("b").max().over($("w")),
> $("b").min().over($("w"))
> );
> {code}
> The distinct flag is set to false in {{OverConvertRule}}.
> See also
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Unknown-call-expression-avg-amount-when-use-distinct-in-Flink-Thanks-td40905.html





[jira] [Updated] (FLINK-21225) OverConvertRule does not consider distinct

2021-02-12 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-21225:

Fix Version/s: 1.12.3

> OverConvertRule does not consider distinct
> --
>
> Key: FLINK-21225
> URL: https://issues.apache.org/jira/browse/FLINK-21225
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0, 1.12.3
>
>
> We don't support OVER window distinct aggregates in Table API. Even though 
> this is explicitly documented:
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html#aggregations
> {code}
> // Distinct aggregation on over window
> Table result = orders
> .window(Over
> .partitionBy($("a"))
> .orderBy($("rowtime"))
> .preceding(UNBOUNDED_RANGE)
> .as("w"))
> .select(
> $("a"), $("b").avg().distinct().over($("w")),
> $("b").max().over($("w")),
> $("b").min().over($("w"))
> );
> {code}
> The distinct flag is set to false in {{OverConvertRule}}.
> See also
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Unknown-call-expression-avg-amount-when-use-distinct-in-Flink-Thanks-td40905.html





[jira] [Commented] (FLINK-10320) Introduce JobMaster schedule micro-benchmark

2021-02-12 Thread Zhilong Hong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283767#comment-17283767
 ] 

Zhilong Hong commented on FLINK-10320:
--

Thanks for the reminder, [~chesnay] and [~pnowojski].

I think the current scheduler benchmark already covers goal #1 (i.e. "How fast 
the job switches into RUNNING", or in other words, whether we could start the 
job faster). It involves several procedures with high computational complexity. 
In fact, after the optimization we made in FLINK-21110, the main bottleneck in 
deploying a job lies in computing the pipelined regions. We will try to optimize 
it in the future. 

For goal #2 (i.e. the throughput at which the JobMaster reacts to RPC requests), 
we are still thinking about it. The first concern that comes to mind is that the 
RPC we mock locally is different from the real situation. First, we cannot 
simulate the network connection latency. I think this may greatly impact the 
performance of RPC if the communication reaches the maximum bandwidth (in the 
worst-case scenario). Second, the thread model is totally different. Currently 
the future executor in the JobMaster has a thread pool that uses all the CPU 
cores on the machine. If we start threads to simulate TaskExecutors on the same 
machine, the mocked TMs may impact the performance of the JobMaster. For example, 
{{Execution#submitTask}} runs on the future executor, whereas 
{{TaskExecutor#submitTask}} runs on the main thread of the TaskExecutor.
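
The timing pattern behind goal #1 can be sketched in plain Java. This is 
illustrative only: `deployJob` below is a stand-in workload, not a real Flink 
call, and the harness just shows the warmup-then-measure structure that the 
scheduler benchmarks rely on.

```java
import java.util.concurrent.TimeUnit;

public class ScheduleTimingSketch {

    /** Times an action over several runs and returns the average in nanoseconds. */
    static long averageNanos(Runnable action, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            action.run(); // warmup iterations are discarded (JIT, caches)
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            action.run();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }

    public static void main(String[] args) {
        // Stand-in for "submit the JobGraph and wait until all tasks are RUNNING".
        Runnable deployJob = () -> {
            long acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += i;
            }
            if (acc == Long.MIN_VALUE) {
                throw new IllegalStateException("unreachable; defeats dead-code elimination");
            }
        };
        long avg = averageNanos(deployJob, 5, 20);
        System.out.println("average: " + TimeUnit.NANOSECONDS.toMicros(avg) + " us");
    }
}
```

In the real benchmarks, JMH takes over this warmup/measurement bookkeeping, but 
the measured interval is the same idea: startup to all tasks RUNNING (or 
FINISHED, for bounded sources).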

> Introduce JobMaster schedule micro-benchmark
> 
>
> Key: FLINK-10320
> URL: https://issues.apache.org/jira/browse/FLINK-10320
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Tests
>Reporter: Zili Chen
>Priority: Major
>
> Based on the {{org.apache.flink.streaming.runtime.io.benchmark}} stuff and the 
> repo [flink-benchmark|https://github.com/dataArtisans/flink-benchmarks], I 
> propose to introduce another micro-benchmark which focuses on {{JobMaster}} 
> scheduling performance.
> h3. Target
> Benchmark how long it takes from {{JobMaster}} startup (receiving the 
> {{JobGraph}} and initializing) until all tasks are RUNNING. Technically we use 
> a bounded stream and the TM finishes tasks as soon as they arrive, so the real 
> interval we measure ends when all tasks are FINISHED.
> h3. Case
> 1. JobGraph that covers EAGER + PIPELINED edges
> 2. JobGraph that covers LAZY_FROM_SOURCES + PIPELINED edges
> 3. JobGraph that covers LAZY_FROM_SOURCES + BLOCKING edges
> P.S.: maybe also benchmark the case where the source reads from {{InputSplit}}?
> h3. Implement
> Based on the flink-benchmark repo, we ultimately run the benchmarks using JMH, 
> so the whole test suite is separated into two repos. The testing environment 
> could be located in the main repo, maybe under 
> flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/benchmark.
> To measure the performance of {{JobMaster}} scheduling, we need to simulate 
> an environment that:
> 1. has a real {{JobMaster}}
> 2. has a mock/testing {{ResourceManager}} that has infinite resources and 
> reacts immediately.
> 3. has one (or many?) mock/testing {{TaskExecutor}}s that deploy and finish 
> tasks immediately.
> [~trohrm...@apache.org] [~GJL] [~pnowojski] could you please review this 
> proposal to help clarify the goal and concrete details? Thanks in advance.
> Any suggestions are welcome.





[GitHub] [flink] wuchong commented on a change in pull request #14743: [FLINK-21366][doc] mentions Maxwell as CDC tool in Kafka connector documentation

2021-02-12 Thread GitBox


wuchong commented on a change in pull request #14743:
URL: https://github.com/apache/flink/pull/14743#discussion_r575297850



##
File path: docs/content.zh/docs/connectors/table/kafka.md
##
@@ -499,12 +499,13 @@ If `timestamp` is specified, another config option 
`scan.startup.timestamp-milli
 If `specific-offsets` is specified, another config option 
`scan.startup.specific-offsets` is required to specify specific startup offsets 
for each partition,
 e.g. an option value `partition:0,offset:42;partition:1,offset:300` indicates 
offset `42` for partition `0` and offset `300` for partition `1`.
 
-### Changelog Source
+### CDC Changelog Source
+
Flink natively supports Kafka as a CDC changelog source. If the messages in a 
Kafka topic are change events captured from other databases using CDC tools, you 
can use a CDC format to interpret the messages as INSERT/UPDATE/DELETE rows in 
the Flink SQL system.
+
+Flink provides three CDC formats: [debezium-json]({% link 
dev/table/connectors/formats/debezium.md %}), [canal-json]({% link 
dev/table/connectors/formats/canal.md %}) and [maxwell-json]({% link 
dev/table/connectors/formats/maxwell.md %}) to interpret change events captured 
by [Debezium](https://debezium.io/), 
[Canal](https://github.com/alibaba/canal/wiki) and 
[Maxwell](https://maxwells-daemon.io/).

Review comment:
   We also support `debezium-avro-confluent` now. The documentation is also 
located in 
https://ci.apache.org/projects/flink/flink-docs-master/zh/docs/connectors/table/formats/debezium/









[jira] [Assigned] (FLINK-21366) Kafka connector documentation should mentions Maxwell as CDC mechanism

2021-02-12 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-21366:
---

Assignee: Svend Vanderveken

> Kafka connector documentation should mentions Maxwell as CDC mechanism
> --
>
> Key: FLINK-21366
> URL: https://issues.apache.org/jira/browse/FLINK-21366
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Svend Vanderveken
>Assignee: Svend Vanderveken
>Priority: Minor
>  Labels: pull-request-available
>
> The current [Kafka connector changelog section of the 
> documentation|https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/kafka.html#changelog-source]
>  mentions Debezium and Canal CDC tools but not the recently added Maxwell 
> format.
> The PR linked to this ticket edits the text to add it.





[jira] [Commented] (FLINK-20952) Changelog json formats should support inherit options from JSON format

2021-02-12 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283760#comment-17283760
 ] 

Jark Wu commented on FLINK-20952:
-

Thanks [~harveyyue] for the work. However, I don't think including all the 
attributes in JsonOptions is a good idea. It might be error-prone to set 
debezium attributes in JSON format. 
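
One possible shape for such a reuse mechanism can be sketched in plain Java. 
This is purely illustrative, not Flink API: the helper name is an assumption, 
and the idea is simply that a changelog format forwards its prefixed options 
(e.g. a hypothetical `debezium-json.ignore-parse-errors`) to the wrapped json 
format with the prefix stripped, instead of duplicating every option definition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionForwarding {

    /**
     * Returns the subset of options whose keys start with the given prefix,
     * with the prefix stripped, so they can be handed to the inner json format.
     */
    static Map<String, String> forwardWithPrefix(Map<String, String> options, String prefix) {
        Map<String, String> forwarded = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                forwarded.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return forwarded;
    }
}
```

With this shape, debezium-json, canal-json, and maxwell-json would not need to 
redeclare each new json option; they would only choose a prefix.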

> Changelog json formats should support inherit options from JSON format
> --
>
> Key: FLINK-20952
> URL: https://issues.apache.org/jira/browse/FLINK-20952
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.13.0
>
>
> Recently, we introduced several config options for the json format, e.g. in 
> FLINK-20861. This reveals a potential problem: adding a small config option 
> to json may require touching the debezium-json, canal-json, and maxwell-json 
> formats as well. This is verbose and error-prone. We need an abstraction 
> mechanism to support reusable options. 





[jira] [Commented] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Andrew Roberts (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283755#comment-17283755
 ] 

Andrew Roberts commented on FLINK-14034:


Nothing prevents it, but then my sink is not a {{TwoPhaseCommitSinkFunction}} 
instance - wouldn't this break {{Semantic.EXACTLY_ONCE}} behavior?

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  





[jira] [Commented] (FLINK-21146) [SQL] Flink SQL Client does not support specifying the queue to submit the job

2021-02-12 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283754#comment-17283754
 ] 

Jark Wu commented on FLINK-21146:
-

This ticket can be addressed by FLIP-163. 

> [SQL] Flink SQL Client does not support specifying the queue to submit the job
> -
>
> Key: FLINK-21146
> URL: https://issues.apache.org/jira/browse/FLINK-21146
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.12.0
>Reporter: zhisheng
>Priority: Major
>
> In Hive, we can submit a job to a specific YARN queue like this: 
> {code:java}
> set mapreduce.job.queuename=queue1;
> {code}
>  
> In spark-sql, we can submit a job to a specific YARN queue like this: 
> {code:java}
> spark-sql --queue xxx
> {code}
>  
> But the Flink SQL Client cannot specify which queue the job is submitted to; 
> the default is the `default` queue. This is not friendly in a production 
> environment.





[jira] [Comment Edited] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283750#comment-17283750
 ] 

Piotr Nowojski edited comment on FLINK-14034 at 2/12/21, 3:07 PM:
--

But what's preventing you from trying something like that:

{code:java}
public class ExceptionIgnoringSinkWrapper<IN> extends RichSinkFunction<IN>
        implements CheckpointedFunction, CheckpointListener {
  private final RichSinkFunction<IN> wrapped;

  public ExceptionIgnoringSinkWrapper(RichSinkFunction<IN> wrapped) {
    this.wrapped = wrapped;
  }

  ...

  @Override
  public final void invoke(IN value) throws Exception {
    try {
      wrapped.invoke(value);
    } catch (Exception ex) {
      // deliberately swallow the failure
    }
  }

  @Override
  public final void invoke(IN value, Context context) throws Exception {
    try {
      wrapped.invoke(value, context);
    } catch (Exception ex) {
      // deliberately swallow the failure
    }
  }
}

dataStream.addSink(new ExceptionIgnoringSinkWrapper<>(new FlinkKafkaProducer<>(...)));
{code}
?

As long as you proxy/forward all of the public methods from the interfaces 
(RichSinkFunction, CheckpointedFunction, CheckpointListener), it should work 
(modulo the fact that the underlying KafkaProducer or FlinkKafkaProducer might 
not be able to recover from exceptions). But even in that case, you could try 
to not only ignore the exceptions, but also re-initialize the wrapped function. 


was (Author: pnowojski):
But what's preventing you from trying something like that:

{code:java}
public class ExceptionIgnoringSinkWrapper implements RichFunction, 
CheckpointedFunction, CheckpointListener{
  private final RichFunction wrapped;

  public ExceptionIgnoringSinkWrapper(RichFunction wrapped) {
this.wrapped = wrapped;
  }

  ...

  @Override
  public final void invoke(IN value) throws Exception {
try {
  wrapped.invoke(value);
} catch (Exception ex) {
}
  }

  @Override
  public final void invoke(IN value, Context context) throws Exception {
try {
  wrapped.invoke(value, context);
} catch (Exception ex) {
}
  }
}

dataStream.addSink(new ExceptionIgnoringSinkWrapper(new 
FlinkKafkaProducer(...));
{code}
?

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  





[jira] [Comment Edited] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283750#comment-17283750
 ] 

Piotr Nowojski edited comment on FLINK-14034 at 2/12/21, 3:07 PM:
--

But what's preventing you from trying something like that:

{code:java}
public class ExceptionIgnoringSinkWrapper<IN> extends RichSinkFunction<IN>
        implements CheckpointedFunction, CheckpointListener {
  private final RichSinkFunction<IN> wrapped;

  public ExceptionIgnoringSinkWrapper(RichSinkFunction<IN> wrapped) {
    this.wrapped = wrapped;
  }

  ...

  @Override
  public final void invoke(IN value) throws Exception {
    try {
      wrapped.invoke(value);
    } catch (Exception ex) {
      // deliberately swallow the failure
    }
  }

  @Override
  public final void invoke(IN value, Context context) throws Exception {
    try {
      wrapped.invoke(value, context);
    } catch (Exception ex) {
      // deliberately swallow the failure
    }
  }

  ...
}

dataStream.addSink(new ExceptionIgnoringSinkWrapper<>(new FlinkKafkaProducer<>(...)));
{code}
?

As long as you proxy/forward all of the public methods from the interfaces 
(RichSinkFunction, CheckpointedFunction, CheckpointListener), it should work 
(modulo the fact that the underlying KafkaProducer or FlinkKafkaProducer might 
not be able to recover from exceptions). But even in that case, you could try 
to not only ignore the exceptions, but also re-initialize the wrapped function. 


was (Author: pnowojski):
But what's preventing you from trying something like that:

{code:java}
public class ExceptionIgnoringSinkWrapper implements RichFunction, 
CheckpointedFunction, CheckpointListener{
  private final RichFunction wrapped;

  public ExceptionIgnoringSinkWrapper(RichFunction wrapped) {
this.wrapped = wrapped;
  }

  ...

  @Override
  public final void invoke(IN value) throws Exception {
try {
  wrapped.invoke(value);
} catch (Exception ex) {
}
  }

  @Override
  public final void invoke(IN value, Context context) throws Exception {
try {
  wrapped.invoke(value, context);
} catch (Exception ex) {
}
  }
}

dataStream.addSink(new ExceptionIgnoringSinkWrapper(new 
FlinkKafkaProducer(...));
{code}
?

As long as you proxy/forward all of the public methods from the interfaces: 
RichSinkFunction, CheckpointedFunction, CheckpointListener it should work 
(% that underlying KafkaProducer or FlinkKafkaProducer might not be able to 
recover from exceptions. But even in that case, you could try to not only 
ignore the exceptions, but re-initialize the wrapped function. 

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  





[GitHub] [flink] flinkbot edited a comment on pull request #14798: [FLINK-21187] Provide exception history for root causes

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #14798:
URL: https://github.com/apache/flink/pull/14798#issuecomment-769223911


   
   ## CI report:
   
   * af8de65a4905ae47ab704a75acfcd1b2e897d915 UNKNOWN
   * 772e076ed9d773e16713236c942f4e30b659d6eb UNKNOWN
   * 354eb574c98bbc5fdaedabc9f3ca7e4acfaa3746 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13285)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283750#comment-17283750
 ] 

Piotr Nowojski commented on FLINK-14034:


But what's preventing you from trying something like that:

{code:java}
public class ExceptionIgnoringSinkWrapper<IN> extends RichSinkFunction<IN>
        implements CheckpointedFunction, CheckpointListener {
  private final RichSinkFunction<IN> wrapped;

  public ExceptionIgnoringSinkWrapper(RichSinkFunction<IN> wrapped) {
    this.wrapped = wrapped;
  }

  ...

  @Override
  public final void invoke(IN value) throws Exception {
    try {
      wrapped.invoke(value);
    } catch (Exception ex) {
      // deliberately swallow the failure
    }
  }

  @Override
  public final void invoke(IN value, Context context) throws Exception {
    try {
      wrapped.invoke(value, context);
    } catch (Exception ex) {
      // deliberately swallow the failure
    }
  }
}

dataStream.addSink(new ExceptionIgnoringSinkWrapper<>(new FlinkKafkaProducer<>(...)));
{code}
?

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  





[GitHub] [flink] flinkbot edited a comment on pull request #13845: [FLINK-19801] Adding virtual channels for rescaling unaligned checkpoints.

2021-02-12 Thread GitBox


flinkbot edited a comment on pull request #13845:
URL: https://github.com/apache/flink/pull/13845#issuecomment-718853990


   
   ## CI report:
   
   * f63fa7264bda0bafc3e97ac79b920eb34cae3e95 UNKNOWN
   * 3d418c42a5ffe60a99d17a22a6aaba9955e347f9 UNKNOWN
   * 6cc4a796b26b66b761f50730f7534c36afad5afa UNKNOWN
   * dff9f25ac4086acf4b2dbe650a0ed80dd0385ddb UNKNOWN
   * 6ff570d417423ac84ad5d906900758fbce2b8f43 UNKNOWN
   * 894e952378dd3ae2c7e92f65b90689ec6c989c8b UNKNOWN
   * f3aff0c1259abd05d4c18d404874e687431a157c UNKNOWN
   * e07c51a3c4cf17ad447312889227e33e4b13d4f3 UNKNOWN
   * 23323a7b983fac19fdd620b0cc82eace73cc587f UNKNOWN
   * e4bdf60dbf7a9939a9a6ad36f99feb7b4408952a UNKNOWN
   * d42e17ff93fab9f5e276fe257e687ac254bcd032 UNKNOWN
   * 8750e7df6a86754a5471aab506328113c62a6b7f Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13288)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-11838) Create RecoverableWriter for GCS

2021-02-12 Thread Galen Warren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283747#comment-17283747
 ] 

Galen Warren commented on FLINK-11838:
--

Hi [~xintongsong], sorry for the delay here, I had some other, unrelated work I 
had to focus on.

I like your idea of using {{WriteChannel}} for the uploads, but to close each 
one when {{RecoverableFsDataOutputStream.persist}} is called, so that we're not 
relying on {{WriteChannel}} for recoverability. {{WriteChannel}} allows one to 
control how frequently it flushes data via 
[setChunkSize|https://googleapis.dev/java/google-cloud-clients/0.90.0-alpha/com/google/cloud/WriteChannel.html#setChunkSize-int-],
 perhaps we could expose this chunk size as a Flink option to give the user 
some control over the process, i.e. how much memory is used for buffering? It 
could be optional; not setting it would mean using the Google default.

Yes, it would be straightforward to compose blobs at any point in the process, 
i.e. on commit and/or at {{persist}} calls along the way. Maybe we compose them 
on commit (of course) and also whenever {{persist}} is called when there are at 
least 32 temp blobs to be composed? That way, we spread the compose work out 
over time but also call {{compose}} as few times as possible, composing as many 
blobs as possible in each call, which seems like it would be the most efficient.

Shall we go with this approach?
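
The batching described above can be sketched in plain Java. This is illustrative 
only: the 32-source limit comes from the GCS {{compose}} API, and the planning 
helper below is an assumption for the sketch, not part of any Flink or GCS API.

```java
import java.util.ArrayList;
import java.util.List;

public class ComposePlanner {

    /** GCS compose() accepts at most 32 source objects per call. */
    static final int MAX_SOURCES_PER_COMPOSE = 32;

    /**
     * Groups temp blob names into the fewest compose calls, each taking at most
     * 32 sources. Running this on commit (and on persist once at least 32 temp
     * blobs exist) spreads the compose work out while minimizing the call count.
     */
    static List<List<String>> planComposeCalls(List<String> tempBlobs) {
        List<List<String>> calls = new ArrayList<>();
        for (int start = 0; start < tempBlobs.size(); start += MAX_SOURCES_PER_COMPOSE) {
            int end = Math.min(start + MAX_SOURCES_PER_COMPOSE, tempBlobs.size());
            calls.add(new ArrayList<>(tempBlobs.subList(start, end)));
        }
        return calls;
    }
}
```

For example, 70 temp blobs would yield three compose calls of 32, 32, and 6 
sources; the three intermediate results could then be composed in one final call.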

 

> Create RecoverableWriter for GCS
> 
>
> Key: FLINK-11838
> URL: https://issues.apache.org/jira/browse/FLINK-11838
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Affects Versions: 1.8.0
>Reporter: Fokko Driesprong
>Assignee: Galen Warren
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.13.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> GCS supports the resumable upload which we can use to create a Recoverable 
> writer similar to the S3 implementation:
> https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload
> After using the Hadoop compatible interface: 
> https://github.com/apache/flink/pull/7519
> We've noticed that the current implementation relies heavily on the renaming 
> of the files on the commit: 
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
> This is suboptimal on an object store such as GCS. Therefore we would like to 
> implement a more GCS-native RecoverableWriter.





[jira] [Comment Edited] (FLINK-14034) In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke should be made final

2021-02-12 Thread Andrew Roberts (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283740#comment-17283740
 ] 

Andrew Roberts edited comment on FLINK-14034 at 2/12/21, 2:57 PM:
--

[~pnowojski] the catch here is, ultimately something implementing a sink 
interface must be passed to {{addSink()}}. It is not possible to wrap the 
{{TwoPhaseCommitSink}} version of invoke, because I can't reproduce the method 
signature (due to the private type at the root of this conversation). I suppose 
I could wrap a different invoke method, but my confidence in those other 
methods being called is low, since they all seem to funnel to the main 
implementation (which again is inaccessible).

Your point about assuming the underlying instance can recover after an 
exception is a good one - I hadn't considered that.

Edit: This is pushing me towards a full-throated commitment to simply ignoring 
all exceptions via the provided method - it means that we'll experience 
"unnecessary" message loss if a transient exception occurs when pushing a 
message, but if the alternative is complete loss of the job when a message 
deterministically cannot be sent, then we'll have to take it as the less-bad 
choice.

Is there any interest in a more structural (i.e. within Flink) solution to this 
transient/non-transient error issue? Kafka provides 
{{org.apache.kafka.common.errors.RetriableException}}, which we were using to 
power this selective retry: allowing exceptions extending that type to fail the 
job (restarting from the last checkpoint), while other ({{control.NonFatal}}) 
exceptions were logged instead.
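
The selective policy described here can be sketched in plain Java. This is 
illustrative only: the marker class below is a local stand-in for Kafka's 
{{org.apache.kafka.common.errors.RetriableException}}, and the helper names are 
assumptions, not part of any Flink or Kafka API.

```java
public class SelectiveFailurePolicy {

    /** Local stand-in for Kafka's RetriableException marker type. */
    static class RetriableException extends RuntimeException {
        RetriableException(String message) {
            super(message);
        }
    }

    /**
     * The policy from the comment: transient (retriable) errors fail the job so
     * it restarts from the last checkpoint; everything else is logged instead.
     */
    static boolean shouldFailJob(Exception ex) {
        return ex instanceof RetriableException;
    }

    static void handle(Exception ex) throws Exception {
        if (shouldFailJob(ex)) {
            throw ex; // fail the job; it restarts from the last checkpoint
        }
        // Non-retriable: log and drop the record rather than kill the job.
        System.err.println("Dropping record after non-retriable error: " + ex);
    }
}
```

A wrapper sink's invoke() could route caught exceptions through handle() to get 
this behavior without ignoring every failure unconditionally.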


was (Author: arobe...@fuze.com):
[~pnowojski] the catch here is, ultimately something implementing a sink 
interface must be passed to {{addSink()}}. It is not possible to wrap the 
{{TwoPhaseCommitSink}} version of invoke, because I can't reproduce the method 
signature (due to the private type at the root of this conversation). I suppose 
I could wrap a different invoke method, but my confidence in those other 
methods being called is low, since they all seem to funnel to the main 
implementation (which again is inaccessible).

Your point about assuming the underlying instance can recover after an 
exception is a good one - I hadn't considered that.

> In FlinkKafkaProducer, KafkaTransactionState should be made public or invoke 
> should be made final
> -
>
> Key: FLINK-14034
> URL: https://issues.apache.org/jira/browse/FLINK-14034
> Project: Flink
>  Issue Type: Wish
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Niels van Kaam
>Priority: Trivial
>
> It is not possible to override the invoke method of the FlinkKafkaProducer, 
> because the first parameter, KafkaTransactionState, is a private inner class. 
>  It is not possible to override the original invoke of SinkFunction, because 
> TwoPhaseCommitSinkFunction (which is implemented by FlinkKafkaProducer) does 
> override the original invoke method with final.
> [https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java]
> If there is a particular reason for this, it would be better to make the 
> invoke method in FlinkKafkaProducer final as well, and document the reason 
> such that it is clear this is by design (I don't see any overrides in the 
> same package).
> Otherwise, I would make the KafkaTransactionState publicly visible. I would 
> like to override the Invoke method to create a custom KafkaProducer which 
> performs some additional generic validations and transformations. (which can 
> also be done in a process-function, but a custom sink would simplify the code 
> of jobs)
>  




