[jira] [Commented] (FLINK-21413) TtlMapState and TtlListState cannot be clean completely with Filesystem StateBackend

2021-02-20 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287862#comment-17287862
 ] 

Yu Li commented on FLINK-21413:
---

Checking the [state TTL FLIP 
document|https://cwiki.apache.org/confluence/display/FLINK/FLIP-25%3A+Support+User+State+TTL+Natively#FLIP25:SupportUserStateTTLNatively-TTLbehaviour],
 I cannot find a description of whether TTL for the whole map is supported for 
{{MapState}}, but according to the current implementation the answer is no (TTL 
is only checked against the value of each map entry). What's more, in 
[HeapMapState#remove|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/heap/HeapMapState.java#L130-L132]
 we can see that the whole map is removed once it becomes empty, so I don't think 
{{TtlIncrementalCleanup}} needs to take care of the empty-map case.

Accordingly, I think we should add a fast path in 
{{TtlMapState#getUnexpiredOrNull}} that checks whether {{ttlValue}} is empty and 
returns it directly (instead of returning {{NULL}}) if so, returning 
{{NULL}} only if {{ttlValue}} is non-empty but all of its values have expired 
({{unexpired.size()}} is zero).

And similar logic should be applied to {{TtlListState#getUnexpiredOrNull}}.
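A minimal, self-contained sketch of the proposed control flow (plain maps and a predicate stand in for Flink's {{TtlValue}} and serializer machinery here, so all names below are illustrative rather than Flink's actual API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

public class TtlCleanupSketch {
    /**
     * Simplified model of the proposed TtlMapState#getUnexpiredOrNull logic:
     * an already-empty map is returned as-is (fast path), because
     * HeapMapState#remove drops empty maps on its own; null is returned only
     * when the map was non-empty but every entry expired, signalling that the
     * whole state entry can be cleaned up.
     */
    static <K, V> Map<K, V> getUnexpiredOrNull(Map<K, V> map, Predicate<V> expired) {
        if (map.isEmpty()) {
            return map; // fast path: leave removal of the empty map to HeapMapState
        }
        Map<K, V> unexpired = new HashMap<>();
        for (Map.Entry<K, V> e : map.entrySet()) {
            if (!expired.test(e.getValue())) {
                unexpired.put(e.getKey(), e.getValue());
            }
        }
        if (unexpired.isEmpty()) {
            return null; // non-empty input, all values expired: drop the state entry
        }
        // keep the original instance when nothing expired, as the current code does
        return unexpired.size() == map.size() ? map : unexpired;
    }
}
```

The same shape would apply to {{TtlListState#getUnexpiredOrNull}}, with a list in place of the map.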

Please let me know your thoughts, [~wind_ljy]. Thanks.

> TtlMapState and TtlListState cannot be clean completely with Filesystem 
> StateBackend
> 
>
> Key: FLINK-21413
> URL: https://issues.apache.org/jira/browse/FLINK-21413
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.9.0
>Reporter: Jiayi Liao
>Priority: Major
> Attachments: image-2021-02-19-11-13-58-672.png
>
>
> Take the #TtlMapState as an example,
>  
> {code:java}
> public Map<UK, TtlValue<UV>> getUnexpiredOrNull(@Nonnull Map<UK, TtlValue<UV>> ttlValue) {
>     Map<UK, TtlValue<UV>> unexpired = new HashMap<>();
>     TypeSerializer<TtlValue<UV>> valueSerializer =
>             ((MapSerializer<UK, TtlValue<UV>>) original.getValueSerializer()).getValueSerializer();
>     for (Map.Entry<UK, TtlValue<UV>> e : ttlValue.entrySet()) {
>         if (!expired(e.getValue())) {
>             // we have to do the defensive copy to update the value
>             unexpired.put(e.getKey(), valueSerializer.copy(e.getValue()));
>         }
>     }
>     return ttlValue.size() == unexpired.size() ? ttlValue : unexpired;
> }
> {code}
>  
> The returned value will never be null, so the #StateEntry will exist 
> forever, which leads to a memory leak if the key range of the stream is very 
> large. Below we can see that 20+ million uncleared TtlStateMap entries can take 
> up several GB of memory.
>  
> !image-2021-02-19-11-13-58-672.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14678: [FLINK-20833][runtime] Add pluggable failure listener in job manager

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14678:
URL: https://github.com/apache/flink/pull/14678#issuecomment-761905721


   
   ## CI report:
   
   * cade20e85b29ca63c51383dca04976c1d9801042 UNKNOWN
   * 7e811ff57a72897e49425a53b5d956f5a1f32ea9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13539)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14678: [FLINK-20833][runtime] Add pluggable failure listener in job manager

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14678:
URL: https://github.com/apache/flink/pull/14678#issuecomment-761905721


   
   ## CI report:
   
   * cade20e85b29ca63c51383dca04976c1d9801042 UNKNOWN
   * 34720c9f7ea37afb5d7f3d2a824b78fee916b755 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13120)
 
   * 7e811ff57a72897e49425a53b5d956f5a1f32ea9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782742788


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 870fc0477ced03a1d2e84bf4ce5d578ba781f579 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13537)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782742788


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 870fc0477ced03a1d2e84bf4ce5d578ba781f579 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] cooper-xiang commented on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


cooper-xiang commented on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782793371


   @flinkbot run azure







[GitHub] [flink] kezhuw commented on pull request #14831: [FLINK-21086][runtime][checkpoint] CheckpointBarrierHandler Insert barriers for channels received EndOfPartition

2021-02-20 Thread GitBox


kezhuw commented on pull request #14831:
URL: https://github.com/apache/flink/pull/14831#issuecomment-782788879


   @gaoyunhaii @pnowojski I want to describe an example showing the necessity of 
`EndOfPartitionEvent` tolerance during checkpointing.
   
   Assume that operator `C` has two upstream operators, `A` and `B`. During a checkpoint, 
`A` finishes with an `EndOfPartitionEvent` while `B` keeps running. `C` could receive the 
checkpoint barrier from `B` and then the `EndOfPartitionEvent` from `A`. In this 
case, FLINK-21081 does not help.
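
   That interleaving can be modelled with a toy alignment check (a sketch with made-up names, not Flink's actual `CheckpointBarrierHandler`): unless a barrier is inserted or assumed for the channel that delivered `EndOfPartitionEvent`, `C`'s alignment for that checkpoint can never complete.

```java
import java.util.List;

public class CheckpointAlignmentSketch {
    enum Event { RECORD, CHECKPOINT_BARRIER, END_OF_PARTITION }

    /** True if this input channel eventually contributes a barrier to the alignment. */
    static boolean contributesBarrier(List<Event> channel, boolean barrierOnEndOfPartition) {
        for (Event e : channel) {
            if (e == Event.CHECKPOINT_BARRIER) {
                return true;
            }
            if (e == Event.END_OF_PARTITION) {
                // a finished channel sends nothing further; the discussed fix
                // effectively inserts a barrier when END_OF_PARTITION arrives
                return barrierOnEndOfPartition;
            }
        }
        return false;
    }

    /** Alignment completes only when every input channel contributes a barrier. */
    static boolean alignmentCompletes(List<List<Event>> channels, boolean barrierOnEndOfPartition) {
        return channels.stream().allMatch(c -> contributesBarrier(c, barrierOnEndOfPartition));
    }
}
```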







[GitHub] [flink] flinkbot edited a comment on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782742788


   
   ## CI report:
   
   * 47c4dfd4a83fde2ef39532a944b5b309d522f5d8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13530)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Closed] (FLINK-21417) Separate type specific memory segments.

2021-02-20 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-21417.

Fix Version/s: 1.13.0
   Resolution: Fixed

Fixed via:
 * master (1.13): 86015c766ba186f18c2b3b41c3900ea4f809a1c2

> Separate type specific memory segments.
> ---
>
> Key: FLINK-21417
> URL: https://issues.apache.org/jira/browse/FLINK-21417
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>






[GitHub] [flink] xintongsong closed pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


xintongsong closed pull request #14966:
URL: https://github.com/apache/flink/pull/14966


   







[GitHub] [flink] flinkbot edited a comment on pull request #14975: [FLINK-21426][docs] adds details to DataStream JdbcSink documentation

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14975:
URL: https://github.com/apache/flink/pull/14975#issuecomment-782752772


   
   ## CI report:
   
   * 1f9617bcbdf3080f6b2da5de6e911d783f2133b5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13535)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14975: [FLINK-21426][docs] adds details to DataStream JdbcSink documentation

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14975:
URL: https://github.com/apache/flink/pull/14975#issuecomment-782752772


   
   ## CI report:
   
   * 1f9617bcbdf3080f6b2da5de6e911d783f2133b5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13535)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14975: [FLINK-21426][docs] adds details to DataStream JdbcSink documentation

2021-02-20 Thread GitBox


flinkbot commented on pull request #14975:
URL: https://github.com/apache/flink/pull/14975#issuecomment-782752772


   
   ## CI report:
   
   * 1f9617bcbdf3080f6b2da5de6e911d783f2133b5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14975: [FLINK-21426][docs] adds details to DataStream JdbcSink documentation

2021-02-20 Thread GitBox


flinkbot commented on pull request #14975:
URL: https://github.com/apache/flink/pull/14975#issuecomment-782750572


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 1f9617bcbdf3080f6b2da5de6e911d783f2133b5 (Sat Feb 20 
21:15:01 UTC 2021)
   
   **Warnings:**
* Documentation files were touched, but no `docs/content.zh/` files: Update 
Chinese documentation or file Jira ticket.
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-21426).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[jira] [Updated] (FLINK-21426) Documentation of DataStream JdbcSink should be more detailed

2021-02-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21426:
---
Labels: pull-request-available  (was: )

> Documentation of DataStream JdbcSink should be more detailed
> 
>
> Key: FLINK-21426
> URL: https://issues.apache.org/jira/browse/FLINK-21426
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.12.1
>Reporter: Svend Vanderveken
>Priority: Minor
>  Labels: pull-request-available
>
> Currently the [DataStream JdbcSink 
> documentation|https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/connectors/jdbc.html]
>  is very terse and, beyond a short example, only points to the javadoc, 
> which does not provide much more information.
> Several aspects could make the documentation more user-friendly:
>  * the batching and retry behavior could be mentioned
>  * default values could be mentioned
>  * the mandatory/optional nature of parameters could be mentioned
>  * a longer example showing all parameters
> I'll submit a short PR with suggested edits.
>  





[GitHub] [flink] sv3ndk opened a new pull request #14975: [FLINK-21426][docs] adds details to DataStream JdbcSink

2021-02-20 Thread GitBox


sv3ndk opened a new pull request #14975:
URL: https://github.com/apache/flink/pull/14975


   
   
   ## What is the purpose of the change
   
   *Make the configuration options as well as the behavior of the DataStream 
JdbcSink more explicit in the documentation*
   
   
   ## Brief change log
   
 - *Added description of the batching behavior of the connector*
 - *Mentioned which parameters are optional and what are the default values*
 - *Made the code example longer by showing all configuration possibilities*
   
   I've not updated the Chinese documentation yet, although I'm happy to 
copy/paste this English version there if/when it's accepted.
   
   ## Verifying this change
   
   I validated the visual aspect of the documentation with `hugo -b "" serve`. 
I don't know how to validate the links to the javadoc, though.
   
   I also validated that the example code works.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? no
   







[GitHub] [flink-web] leewallen opened a new pull request #419: Missing ending doublequote for defaultFlinkVersion.

2021-02-20 Thread GitBox


leewallen opened a new pull request #419:
URL: https://github.com/apache/flink-web/pull/419


   The "Run the quickstart script" gradle quickstart example fails because of a 
missing ending quotation mark.







[jira] [Created] (FLINK-21426) Documentation of DataStream JdbcSink should be more detailed

2021-02-20 Thread Svend Vanderveken (Jira)
Svend Vanderveken created FLINK-21426:
-

 Summary: Documentation of DataStream JdbcSink should be more 
detailed
 Key: FLINK-21426
 URL: https://issues.apache.org/jira/browse/FLINK-21426
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.12.1
Reporter: Svend Vanderveken


Currently the [DataStream JdbcSink 
documentation|https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/connectors/jdbc.html]
 is very terse and, beyond a short example, only points to the javadoc, which 
does not provide much more information.

Several aspects could make the documentation more user-friendly:
 * the batching and retry behavior could be mentioned
 * default values could be mentioned
 * the mandatory/optional nature of parameters could be mentioned
 * a longer example showing all parameters

I'll submit a short PR with suggested edits.
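
To illustrate the batching behavior the ticket asks to document, here is a toy size-triggered buffer (illustrative only: the real sink's `JdbcExecutionOptions` also cover a flush interval and retries, which this sketch deliberately omits, and all names here are made up):

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a batching sink: records are buffered and flushed in groups. */
public class BatchingSinkSketch {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushedBatches = new ArrayList<>();

    BatchingSinkSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Buffer a record; flush automatically once batchSize records accumulated. */
    void write(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Emit the buffered records as one batch (one executeBatch call in a real JDBC sink). */
    void flush() {
        if (!buffer.isEmpty()) {
            flushedBatches.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    List<List<String>> batches() {
        return flushedBatches;
    }
}
```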

 





[GitHub] [flink] flinkbot edited a comment on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782742788


   
   ## CI report:
   
   * 47c4dfd4a83fde2ef39532a944b5b309d522f5d8 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13530)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14847:
URL: https://github.com/apache/flink/pull/14847#issuecomment-772387941


   
   ## CI report:
   
   * 9a2ea20ce0803e48edfc3ab7bcc02078b7410fbf UNKNOWN
   * 4aecfae5f4889de59c7f1d71d39647b6ee6f9ad8 UNKNOWN
   * 349614b952807dff55a50d95f0ff54be09c578f3 UNKNOWN
   * aa8e93dbb2f90060baa27999c8c8c90d140502dd Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13498)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


flinkbot commented on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782742788


   
   ## CI report:
   
   * 47c4dfd4a83fde2ef39532a944b5b309d522f5d8 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14847:
URL: https://github.com/apache/flink/pull/14847#issuecomment-772387941


   
   ## CI report:
   
   * 9a2ea20ce0803e48edfc3ab7bcc02078b7410fbf UNKNOWN
   * 4aecfae5f4889de59c7f1d71d39647b6ee6f9ad8 UNKNOWN
   * 349614b952807dff55a50d95f0ff54be09c578f3 UNKNOWN
   * f35f85a9b57dfccf659502c24837f678d9cfd908 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13528)
 
   * aa8e93dbb2f90060baa27999c8c8c90d140502dd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


flinkbot commented on pull request #14974:
URL: https://github.com/apache/flink/pull/14974#issuecomment-782740899


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 47c4dfd4a83fde2ef39532a944b5b309d522f5d8 (Sat Feb 20 
19:58:49 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-21425).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   







[jira] [Updated] (FLINK-21425) Incorrect implicit conversions in equals and notEquals filter predicates lead to not correctly evaluated by Table API

2021-02-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21425:
---
Labels: pull-request-available  (was: )

> Incorrect implicit conversions in equals and notEquals filter predicates lead 
> to not correctly evaluated by Table API
> -
>
> Key: FLINK-21425
> URL: https://issues.apache.org/jira/browse/FLINK-21425
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.12.1
>Reporter: Suxing Lee
>Priority: Major
>  Labels: pull-request-available
>
> Equality and NotEquality filter predicates do not work on Tables when the 
> leftTerm and rightTerm types mismatch (one is a string, the other is an integer)
> column : a,b,c
> the test data is as follows :
> 1,1L,"Hi"
> 2,2L,"Hello"
> 3,2L,"Hello world"
> For the query: SELECT a,b,c FROM table WHERE a <> '1'
> With the expected implicit conversion, the correct query result is: 
> 2,2L,"Hello"
> 3,2L,"Hello world"
> But in fact the current query result is:
> 1,1L,"Hi"
> 2,2L,"Hello"
> 3,2L,"Hello world"





[GitHub] [flink] cooper-xiang opened a new pull request #14974: [FLINK-21425][table-planner-blink] Fix equals and notEquals predicates implicit conversions

2021-02-20 Thread GitBox


cooper-xiang opened a new pull request #14974:
URL: https://github.com/apache/flink/pull/14974


   ## What is the purpose of the change
   This pull request fixes the bug that implicit conversions in equals and 
notEquals filter predicates do not work
   
   
   ## Brief change log
   Cast the other type to a string when comparing a string type with another 
type in `=` and `<>` predicates
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   JavaSqlITCase#testImplicitConvertAtEqual
   JavaSqlITCase#testImplicitConvertAtNotEqual
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
   







[jira] [Created] (FLINK-21425) Incorrect implicit conversions in equals and notEquals filter predicates lead to not correctly evaluated by Table API

2021-02-20 Thread Suxing Lee (Jira)
Suxing Lee created FLINK-21425:
--

 Summary: Incorrect implicit conversions in equals and notEquals 
filter predicates lead to not correctly evaluated by Table API
 Key: FLINK-21425
 URL: https://issues.apache.org/jira/browse/FLINK-21425
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.12.1
Reporter: Suxing Lee


Equality and NotEquality filter predicates do not work on Tables when the leftTerm 
and rightTerm types mismatch (one is a string, the other is an integer)

column : a,b,c
the test data is as follows :
1,1L,"Hi"
2,2L,"Hello"
3,2L,"Hello world"

For the query: SELECT a,b,c FROM table WHERE a <> '1'

With the expected implicit conversion, the correct query result is: 
2,2L,"Hello"
3,2L,"Hello world"

But in fact the current query result is:
1,1L,"Hi"
2,2L,"Hello"
3,2L,"Hello world"
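
The comparison semantics at stake can be sketched as follows (an illustrative model, not the planner's actual code: when exactly one side is a string, cast the other side to a string before comparing, which is the direction the attached pull request takes):

```java
public class ImplicitCompareSketch {
    /**
     * Models SQL '=' with string/numeric implicit conversion: if exactly one
     * side is a String, the other side is cast to a string first. Under these
     * semantics, WHERE a <> '1' correctly filters out the row with a = 1.
     */
    static boolean sqlEquals(Object left, Object right) {
        if (left instanceof String ^ right instanceof String) {
            return String.valueOf(left).equals(String.valueOf(right));
        }
        return left.equals(right);
    }
}
```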





[GitHub] [flink] flinkbot edited a comment on pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14847:
URL: https://github.com/apache/flink/pull/14847#issuecomment-772387941


   
   ## CI report:
   
   * 9a2ea20ce0803e48edfc3ab7bcc02078b7410fbf UNKNOWN
   * 4aecfae5f4889de59c7f1d71d39647b6ee6f9ad8 UNKNOWN
   * 349614b952807dff55a50d95f0ff54be09c578f3 UNKNOWN
   * f35f85a9b57dfccf659502c24837f678d9cfd908 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13528)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14966:
URL: https://github.com/apache/flink/pull/14966#issuecomment-781952312


   
   ## CI report:
   
   * 5d9ed72db749af16d0e40ed7a117ab624cdff73d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13527)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * a5f4e508be47fe26e8d0c416dd5104dbb0afbe89 UNKNOWN
   * b968335352d52a7e16f5875c0cc8557605a4cd26 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13529)
 
   
   




[jira] [Commented] (FLINK-21133) FLIP-27 Source does not work with synchronous savepoint

2021-02-20 Thread Kezhu Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287735#comment-17287735
 ] 

Kezhu Wang commented on FLINK-21133:


[~becket_qin] [~pnowojski] [~sewen] Thanks all for the recent discussions. They 
helped me a lot in understanding :P

[~sewen] I like the idea of a clear distinction between the suspend and 
pipeline-draining cases, and of "shut down the dataflow pipeline with one 
checkpoint in total". With a clean definition of the pipeline-draining case, the 
semantics of `StreamOperator.close` go back to throwing no exception again.

[~sewen] [~becket_qin] [~pnowojski] I am somewhat unsure what the difference is 
between the pipeline-draining case and the existing "terminate" case. Where should 
the checkpoint happen in the pipeline-draining case: at trigger/barrier, or after 
the end-of-stream flush? In the trigger/barrier case (e.g. just like the existing 
"terminate" case), there is no big code-path difference compared to the suspend 
case, apart from the cleaner definition. In the after-end-of-stream-flush case, it 
is nearly a real terminal operation, and the savepoint may be hard to resume from; 
I think that case is tightly related to FLIP-147. In either case, the FLIP-143 
sink currently does not work out with a single checkpoint.


{quote}Maybe we would need to spin using 
StreamTask#runSynchronousSavepointMailboxLoop while triggering the checkpoint, 
and thus also blocking source thread from making any progress after triggering 
the checkpoint?
{quote}
[~pnowojski] I think it is a must; otherwise, if the downstream closes before the 
source, {{SourceContext.collect}} will fail before {{notifyCheckpointComplete}}.


{quote}We would have to make sure that downstream/upstream task would cancel 
correctly, without mis-leading error messages, if they receive network 
connection closed before processing {{notifyCheckpointComplete()}}.
{quote}
[~pnowojski] Currently, {{RecordWriter}} reports errors by lazy checking. I think 
that makes sense, as a network error may be caused intentionally by the caller 
module. But I also saw exceptions swallowed in 
{{PartitionRequestQueue.handleException}}. Anyway, it deserves more attention.

Another issue, concerning the chained source in 
{{MultipleInputStreamTask.triggerCheckpointAsync}}: {{advanceToEndOfEventTime}} 
is ignored.

> FLIP-27 Source does not work with synchronous savepoint
> ---
>
> Key: FLINK-21133
> URL: https://issues.apache.org/jira/browse/FLINK-21133
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, API / DataStream, Runtime / Checkpointing
>Affects Versions: 1.11.3, 1.12.1
>Reporter: Kezhu Wang
>Priority: Critical
> Fix For: 1.11.4, 1.13.0, 1.12.3
>
>
> I have pushed branch 
> [synchronous-savepoint-conflict-with-bounded-end-input-case|https://github.com/kezhuw/flink/commits/synchronous-savepoint-conflict-with-bounded-end-input-case]
>  in my repository. {{SavepointITCase.testStopSavepointWithFlip27Source}} 
> failed due to timeout.
> See also FLINK-21132 and 
> [apache/iceberg#2033|https://github.com/apache/iceberg/issues/2033]..





[GitHub] [flink] flinkbot edited a comment on pull request #14969: [FLINK-21402] Introduce SlotPoolServiceSchedulerFactory to bundle SlotPoolService and Scheduler factories

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14969:
URL: https://github.com/apache/flink/pull/14969#issuecomment-782223904


   
   ## CI report:
   
   * c92881cb4485e7826f56d48284867111d280046b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13525)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14970: [FLINK-21390] Rename DeclarativeScheduler to AdaptiveScheduler

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14970:
URL: https://github.com/apache/flink/pull/14970#issuecomment-782244636


   
   ## CI report:
   
   * 08cce5cbc57a8af1ac8699e615cf68d9b0fb8150 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13526)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * cf343b402d811409c9fca8db5a3e3982207a95f2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13307)
 
   * a5f4e508be47fe26e8d0c416dd5104dbb0afbe89 UNKNOWN
   * b968335352d52a7e16f5875c0cc8557605a4cd26 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13529)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * cf343b402d811409c9fca8db5a3e3982207a95f2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13307)
 
   * a5f4e508be47fe26e8d0c416dd5104dbb0afbe89 UNKNOWN
   * b968335352d52a7e16f5875c0cc8557605a4cd26 UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14847:
URL: https://github.com/apache/flink/pull/14847#issuecomment-772387941


   
   ## CI report:
   
   * 9a2ea20ce0803e48edfc3ab7bcc02078b7410fbf UNKNOWN
   * 4aecfae5f4889de59c7f1d71d39647b6ee6f9ad8 UNKNOWN
   * 349614b952807dff55a50d95f0ff54be09c578f3 UNKNOWN
   * 72c3ecf27dd1475192d296674fab463f1a4ae6aa Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13500)
 
   * f35f85a9b57dfccf659502c24837f678d9cfd908 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13528)
 
   
   




[GitHub] [flink] jjiey commented on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


jjiey commented on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-782693420


   Thanks @fsk119 for your review comments. I've updated and committed the 
modifications. 
   And Happy Niu Year to @fsk119 and @wuchong.







[GitHub] [flink] flinkbot edited a comment on pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14896:
URL: https://github.com/apache/flink/pull/14896#issuecomment-775006223


   
   ## CI report:
   
   * cf343b402d811409c9fca8db5a3e3982207a95f2 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13307)
 
   * a5f4e508be47fe26e8d0c416dd5104dbb0afbe89 UNKNOWN
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14847:
URL: https://github.com/apache/flink/pull/14847#issuecomment-772387941


   
   ## CI report:
   
   * 9a2ea20ce0803e48edfc3ab7bcc02078b7410fbf UNKNOWN
   * 4aecfae5f4889de59c7f1d71d39647b6ee6f9ad8 UNKNOWN
   * 349614b952807dff55a50d95f0ff54be09c578f3 UNKNOWN
   * 72c3ecf27dd1475192d296674fab463f1a4ae6aa Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13500)
 
   * f35f85a9b57dfccf659502c24837f678d9cfd908 UNKNOWN
   
   




[GitHub] [flink] jjiey commented on a change in pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


jjiey commented on a change in pull request #14896:
URL: https://github.com/apache/flink/pull/14896#discussion_r579665915



##
File path: docs/content.zh/docs/connectors/table/formats/avro-confluent.md
##
@@ -115,41 +115,40 @@ CREATE TABLE user_created (
   'topic' = 'user_events_example2',
   'properties.bootstrap.servers' = 'localhost:9092',
 
-  -- Watch out: schema evolution in the context of a Kafka key is almost never 
backward nor
-  -- forward compatible due to hash partitioning.
+  -- 注意:由于哈希分区,在 Kafka key 的上下文中,schema 升级几乎从不向后也不向前兼容。
   'key.format' = 'avro-confluent',
   'key.avro-confluent.schema-registry.url' = 'http://localhost:8082',
   'key.fields' = 'kafka_key_id',
 
-  -- In this example, we want the Avro types of both the Kafka key and value 
to contain the field 'id'
-  -- => adding a prefix to the table column associated to the Kafka key field 
avoids clashes
+  -- 在本例中,我们希望 Kafka 的 key 和 value 的 Avro 类型都包含 'id' 字段
+  -- => 给表中与 Kafka key 字段关联的列添加一个前缀来避免冲突
   'key.fields-prefix' = 'kafka_key_',
 
   'value.format' = 'avro-confluent',
   'value.avro-confluent.schema-registry.url' = 'http://localhost:8082',
   'value.fields-include' = 'EXCEPT_KEY',

-  -- subjects have a default value since Flink 1.13, though can be overriden:
+  -- 自 Flink 1.13 起,subjects 具有一个默认值, 但是可以被覆盖:
   'key.avro-confluent.schema-registry.subject' = 'user_events_example2-key2',
   'value.avro-confluent.schema-registry.subject' = 
'user_events_example2-value2'
 )
 ```
 
 ---
-Example of a table using the upsert connector with the Kafka value registered 
as an Avro record in the Schema Registry:
+使用 upsert 连接器,Kafka 的 value 在 Schema Registry 中注册为 Avro 记录的表的示例:

Review comment:
   Yes, I agree. Thanks for your suggestion.









[GitHub] [flink] XComp commented on a change in pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


XComp commented on a change in pull request #14847:
URL: https://github.com/apache/flink/pull/14847#discussion_r579664779



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/StopWithSavepointTerminationHandlerImpl.java
##
@@ -0,0 +1,258 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.checkpoint.CheckpointScheduling;
+import org.apache.flink.runtime.checkpoint.CompletedCheckpoint;
+import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.util.FlinkException;
+
+import org.apache.commons.lang3.StringUtils;
+import org.slf4j.Logger;
+
+import javax.annotation.Nonnull;
+
+import java.util.Collection;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * {@code StopWithSavepointTerminationHandlerImpl} implements {@link
+ * StopWithSavepointTerminationHandler}.
+ *
+ * The operation only succeeds if both steps, the savepoint creation and 
the successful
+ * termination of the job, succeed. If the former step fails, the operation 
fails exceptionally
+ * without any further actions. If the latter one fails, a global fail-over is 
triggered before
+ * failing the operation.
+ */
+public class StopWithSavepointTerminationHandlerImpl
+implements StopWithSavepointTerminationHandler {
+
+private final Logger log;
+
+private final SchedulerNG scheduler;
+private final CheckpointScheduling checkpointScheduling;
+private final JobID jobId;
+
+private final CompletableFuture result = new CompletableFuture<>();
+
+private State state = new WaitingForSavepoint();
+
+public  
StopWithSavepointTerminationHandlerImpl(
+@Nonnull JobID jobId, @Nonnull S schedulerWithCheckpointing, 
@Nonnull Logger log) {
+this(jobId, schedulerWithCheckpointing, schedulerWithCheckpointing, 
log);
+}
+
+@VisibleForTesting
+StopWithSavepointTerminationHandlerImpl(
+@Nonnull JobID jobId,
+@Nonnull SchedulerNG scheduler,
+@Nonnull CheckpointScheduling checkpointScheduling,
+@Nonnull Logger log) {
+this.jobId = checkNotNull(jobId);
+this.scheduler = checkNotNull(scheduler);
+this.checkpointScheduling = checkNotNull(checkpointScheduling);
+this.log = checkNotNull(log);
+}
+
+@Override
+public CompletableFuture handlesStopWithSavepointTermination(
+CompletableFuture completedSavepointFuture,
+CompletableFuture> 
terminatedExecutionsFuture,
+ComponentMainThreadExecutor mainThreadExecutor) {
+completedSavepointFuture
+.whenCompleteAsync(
+(completedSavepoint, throwable) -> {
+if (throwable != null) {
+handleSavepointCreationFailure(throwable);
+} else {
+handleSavepointCreation(completedSavepoint);
+}
+},
+mainThreadExecutor)
+.thenCompose(
+aVoid ->
+terminatedExecutionsFuture.thenAcceptAsync(

Review comment:
   It's not. Correct me if I'm wrong: the execution will happen in the main 
thread if we use a non-async operation (like the `thenRun` you suggested), since 
the previous operation is already forced to run in the main thread through 
`handleAsync`.
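That argument can be checked with a small stand-alone `CompletableFuture` sketch (the `mainThreadExecutor` below is a plain single-thread executor standing in for Flink's `ComponentMainThreadExecutor`; names are illustrative): a non-async continuation chained after an async stage runs on the thread that completed that stage.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MainThreadContinuation {
    // Returns the name of the thread that runs a non-async continuation
    // chained after an *async* stage bound to a dedicated executor.
    static String continuationThreadName() throws Exception {
        ExecutorService mainThreadExecutor =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "main-thread"));
        try {
            CompletableFuture<Void> source = new CompletableFuture<>();
            CompletableFuture<String> observed = new CompletableFuture<>();

            source.whenCompleteAsync(
                            (v, t) -> { /* forced onto mainThreadExecutor */ },
                            mainThreadExecutor)
                  // Non-async continuation: executes on the thread that completed
                  // the previous stage, i.e. the executor's "main-thread".
                  .thenRun(() -> observed.complete(Thread.currentThread().getName()));

            source.complete(null); // completed from the caller thread
            return observed.get(5, TimeUnit.SECONDS);
        } finally {
            mainThreadExecutor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(continuationThreadName()); // prints "main-thread"
    }
}
```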









[GitHub] [flink] XComp commented on a change in pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


XComp commented on a change in pull request #14847:
URL: https://github.com/apache/flink/pull/14847#discussion_r579664565



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/StopWithSavepointTerminationHandlerImpl.java
##
@@ -0,0 +1,258 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.checkpoint.CheckpointScheduling;
+import org.apache.flink.runtime.checkpoint.CompletedCheckpoint;
+import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.util.FlinkException;
+
+import org.apache.commons.lang3.StringUtils;
+import org.slf4j.Logger;
+
+import javax.annotation.Nonnull;
+
+import java.util.Collection;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * {@code StopWithSavepointTerminationHandlerImpl} implements {@link
+ * StopWithSavepointTerminationHandler}.
+ *
+ * The operation only succeeds if both steps, the savepoint creation and 
the successful
+ * termination of the job, succeed. If the former step fails, the operation 
fails exceptionally
+ * without any further actions. If the latter one fails, a global fail-over is 
triggered before
+ * failing the operation.
+ */
+public class StopWithSavepointTerminationHandlerImpl
+implements StopWithSavepointTerminationHandler {
+
+private final Logger log;
+
+private final SchedulerNG scheduler;
+private final CheckpointScheduling checkpointScheduling;
+private final JobID jobId;
+
+private final CompletableFuture result = new CompletableFuture<>();
+
+private State state = new WaitingForSavepoint();
+
+public  
StopWithSavepointTerminationHandlerImpl(
+@Nonnull JobID jobId, @Nonnull S schedulerWithCheckpointing, 
@Nonnull Logger log) {
+this(jobId, schedulerWithCheckpointing, schedulerWithCheckpointing, 
log);
+}
+
+@VisibleForTesting
+StopWithSavepointTerminationHandlerImpl(
+@Nonnull JobID jobId,
+@Nonnull SchedulerNG scheduler,
+@Nonnull CheckpointScheduling checkpointScheduling,
+@Nonnull Logger log) {
+this.jobId = checkNotNull(jobId);
+this.scheduler = checkNotNull(scheduler);
+this.checkpointScheduling = checkNotNull(checkpointScheduling);
+this.log = checkNotNull(log);
+}
+
+@Override
+public CompletableFuture handlesStopWithSavepointTermination(
+CompletableFuture completedSavepointFuture,
+CompletableFuture> 
terminatedExecutionsFuture,
+ComponentMainThreadExecutor mainThreadExecutor) {
+completedSavepointFuture
+.whenCompleteAsync(
+(completedSavepoint, throwable) -> {
+if (throwable != null) {
+handleSavepointCreationFailure(throwable);
+} else {
+handleSavepointCreation(completedSavepoint);
+}
+},
+mainThreadExecutor)
+.thenCompose(
+aVoid ->
+terminatedExecutionsFuture.thenAcceptAsync(
+this::handleExecutionsTermination, 
mainThreadExecutor));

Review comment:
   Good point. I refactored it and introduced a 
`StopWithSavepointTerminationManager` to enforce the right execution order.









[GitHub] [flink] flinkbot edited a comment on pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14966:
URL: https://github.com/apache/flink/pull/14966#issuecomment-781952312


   
   ## CI report:
   
   * 8ca7a5eb4c4b076438663f4105e51be69bad29f9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13495)
 
   * 5d9ed72db749af16d0e40ed7a117ab624cdff73d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13527)
 
   
   




[GitHub] [flink] XComp commented on a change in pull request #14847: [FLINK-21030][runtime] Add global failover in case of a stop-with-savepoint failure

2021-02-20 Thread GitBox


XComp commented on a change in pull request #14847:
URL: https://github.com/apache/flink/pull/14847#discussion_r579663983



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/StopWithSavepointTerminationHandlerImpl.java
##
@@ -0,0 +1,258 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.runtime.checkpoint.CheckpointScheduling;
+import org.apache.flink.runtime.checkpoint.CompletedCheckpoint;
+import org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.util.FlinkException;
+
+import org.apache.commons.lang3.StringUtils;
+import org.slf4j.Logger;
+
+import javax.annotation.Nonnull;
+
+import java.util.Collection;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * {@code StopWithSavepointTerminationHandlerImpl} implements {@link
+ * StopWithSavepointTerminationHandler}.
+ *
+ * The operation only succeeds if both steps, the savepoint creation and 
the successful
+ * termination of the job, succeed. If the former step fails, the operation 
fails exceptionally
+ * without any further actions. If the latter one fails, a global fail-over is 
triggered before
+ * failing the operation.
+ */
+public class StopWithSavepointTerminationHandlerImpl
+implements StopWithSavepointTerminationHandler {
+
+private final Logger log;
+
+private final SchedulerNG scheduler;
+private final CheckpointScheduling checkpointScheduling;
+private final JobID jobId;
+
+private final CompletableFuture result = new CompletableFuture<>();
+
+private State state = new WaitingForSavepoint();
+
+public  
StopWithSavepointTerminationHandlerImpl(
+@Nonnull JobID jobId, @Nonnull S schedulerWithCheckpointing, 
@Nonnull Logger log) {
+this(jobId, schedulerWithCheckpointing, schedulerWithCheckpointing, 
log);
+}
+
+@VisibleForTesting
+StopWithSavepointTerminationHandlerImpl(
+@Nonnull JobID jobId,
+@Nonnull SchedulerNG scheduler,
+@Nonnull CheckpointScheduling checkpointScheduling,
+@Nonnull Logger log) {
+this.jobId = checkNotNull(jobId);
+this.scheduler = checkNotNull(scheduler);
+this.checkpointScheduling = checkNotNull(checkpointScheduling);
+this.log = checkNotNull(log);
+}
+
+@Override
+public CompletableFuture handlesStopWithSavepointTermination(
+CompletableFuture completedSavepointFuture,
+CompletableFuture> 
terminatedExecutionsFuture,
+ComponentMainThreadExecutor mainThreadExecutor) {
+completedSavepointFuture
+.whenCompleteAsync(
+(completedSavepoint, throwable) -> {
+if (throwable != null) {
+handleSavepointCreationFailure(throwable);
+} else {
+handleSavepointCreation(completedSavepoint);
+}
+},
+mainThreadExecutor)
+.thenCompose(
+aVoid ->
+terminatedExecutionsFuture.thenAcceptAsync(
+this::handleExecutionsTermination, 
mainThreadExecutor));
+
+return result;
+}
+
+private synchronized void handleSavepointCreation(CompletedCheckpoint 
completedCheckpoint) {
+final State oldState = state;
+state = state.onSavepointCreation(completedCheckpoint);
+
+log.debug(
+"Stop-with-savepoint transitioned from {} to {} on savepoint 
creation handling for job {}.",
+oldState,
+state,
+jobId);
+}
+
+private synchronized void handleSavepointCreationFailure(Throwable 
throwable) {
+final 

[GitHub] [flink] flinkbot edited a comment on pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14966:
URL: https://github.com/apache/flink/pull/14966#issuecomment-781952312


   
   ## CI report:
   
   * 8ca7a5eb4c4b076438663f4105e51be69bad29f9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13495)
 
   * 5d9ed72db749af16d0e40ed7a117ab624cdff73d UNKNOWN
   
   




[GitHub] [flink] xintongsong commented on pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


xintongsong commented on pull request #14966:
URL: https://github.com/apache/flink/pull/14966#issuecomment-782665432


   Thanks for the quick response and review, @tillrohrmann.
   Comments addressed. I'll merge this once AZP gives green light.







[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463


   
   ## CI report:
   
   * b8d2fd246e442fe92285103908ea6595c64d1c5c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13523)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] LadyForest commented on pull request #14944: [FLINK-21297] Support 'LOAD/UNLOAD MODULE' syntax

2021-02-20 Thread GitBox


LadyForest commented on pull request #14944:
URL: https://github.com/apache/flink/pull/14944#issuecomment-782651885


   Many thanks, @wuchong. I hope I have addressed all your comments. 







[GitHub] [flink] xintongsong commented on a change in pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


xintongsong commented on a change in pull request #14966:
URL: https://github.com/apache/flink/pull/14966#discussion_r579655822



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/UnsafeMemorySegment.java
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.core.memory;
+
+import org.apache.flink.annotation.Internal;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.nio.ByteBuffer;
+
+/**
+ * This class represents a piece of unsafe off-heap memory managed by Flink.
+ *
+ * Note that memory segments should usually not be allocated manually, but rather through the
+ * {@link MemorySegmentFactory}.
+ */
+@Internal
+public class UnsafeMemorySegment extends OffHeapMemorySegment {
+@Nullable private final Runnable cleaner;
+
+/**
+ * Creates a new memory segment that represents the memory backing the given unsafe byte buffer.
+ * Note that the given ByteBuffer must be direct {@link
+ * java.nio.ByteBuffer#allocateDirect(int)}, otherwise this method will throw an
+ * IllegalArgumentException.
+ *
+ * The memory segment references the given owner.
+ *
+ * @param buffer The byte buffer whose memory is represented by this memory segment.
+ * @param owner The owner referenced by this memory segment.
+ * @param cleaner The cleaner to be called on free segment.
+ * @throws IllegalArgumentException Thrown, if the given ByteBuffer is not direct.
+ */
+UnsafeMemorySegment(
+@Nonnull ByteBuffer buffer, @Nullable Object owner, @Nullable Runnable cleaner) {

Review comment:
   Nice catch. We should no longer have null cleaner.









[GitHub] [flink] xintongsong commented on a change in pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


xintongsong commented on a change in pull request #14966:
URL: https://github.com/apache/flink/pull/14966#discussion_r579655684



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegmentFactory.java
##
@@ -130,7 +130,7 @@ public static MemorySegment allocateUnpooledOffHeapMemory(int size) {
  */
 public static MemorySegment allocateUnpooledOffHeapMemory(int size, Object owner) {

Review comment:
   Addressed in later commit.









[GitHub] [flink] flinkbot edited a comment on pull request #14973: [FLINK-21412][python] Fix Decimal type which doesn't work in UDAF and expression DSL

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14973:
URL: https://github.com/apache/flink/pull/14973#issuecomment-782584606


   
   ## CI report:
   
   * a514485c23cddf2aaba4741241f5fa3bd0764cf0 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13521)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14953: [FLINK-21351][checkpointing] Don't subsume last checkpoint

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14953:
URL: https://github.com/apache/flink/pull/14953#issuecomment-780752668


   
   ## CI report:
   
   * 941374ecdd03b5ceb9245e4d8a8370671b691a34 UNKNOWN
   * 1ed5093d34a24e1784a7fc2573a009f49e71813b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13522)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-20332) Add workers recovered from previous attempt to pending resources

2021-02-20 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287684#comment-17287684
 ] 

Xintong Song commented on FLINK-20332:
--

bq. Does this mean that in the Yarn case we might request some additional 
containers?

Yes. That is the current behavior.

bq. Once the recovered workers register at the RM they announce their resource 
spec and if they can be used for the job, the YarnResourceManager might cancel 
some of the newly requested resources?

Currently, the newly requested resources have to wait for the idle timeout before being released. FLINK-18229 proposes cancelling the resources before they're allocated/registered.

> Add workers recovered from previous attempt to pending resources
> 
>
> Key: FLINK-20332
> URL: https://issues.apache.org/jira/browse/FLINK-20332
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>
> For active deployments (Native K8s/Yarn/Mesos), after a JM failover, workers 
> from previous attempt should register to the new JM. Depending on the order 
> that slot requests and TM registrations arrive at the RM, it could happen 
> that RM allocates unnecessary new resources while there are recovered 
> resources that can be reused.
> A potential improvement is to add recovered workers to pending resources, so 
> that RM knows what resources are expected to be available soon and decide 
> whether to allocate new resources accordingly.
> See also the discussion in FLINK-20249.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] leonardBang commented on a change in pull request #14863: [FLINK-21203]Don’t collect -U&+U Row When they are equals In the LastRowFunction

2021-02-20 Thread GitBox


leonardBang commented on a change in pull request #14863:
URL: https://github.com/apache/flink/pull/14863#discussion_r579648904



##
File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/operators/deduplicate/DeduplicateFunctionHelper.java
##
@@ -86,7 +97,9 @@ static void processLastRowOnChangelog(
 RowData currentRow,
 boolean generateUpdateBefore,
 ValueState state,
-Collector out)
+Collector out,
+boolean isStateTtlEnabled,
+RecordEqualiser equaliser)

Review comment:
   Please add notes for the added parameters.

##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/stream/StreamExecDeduplicate.java
##
@@ -260,14 +265,23 @@ protected RowtimeDeduplicateOperatorTranslator(
 /** Translator to create process time deduplicate operator. */
 private static class ProcTimeDeduplicateOperatorTranslator
 extends DeduplicateOperatorTranslator {
+private final GeneratedRecordEqualiser generatedEqualiser;
 
 protected ProcTimeDeduplicateOperatorTranslator(
 TableConfig tableConfig,
 InternalTypeInfo rowTypeInfo,
 TypeSerializer typeSerializer,
+RowType inputRowType,
 boolean keepLastRow,
 boolean generateUpdateBefore) {
  super(tableConfig, rowTypeInfo, typeSerializer, keepLastRow, generateUpdateBefore);
+final EqualiserCodeGenerator equaliserCodeGen =
+new EqualiserCodeGenerator(
+inputRowType.getFields().stream()
+.map(RowType.RowField::getType)
+.toArray(LogicalType[]::new));
+generatedEqualiser =
+equaliserCodeGen.generateRecordEqualiser("DeduplicateRowEqualiser");

Review comment:
   ```
   this.generatedEqualiser =
   new EqualiserCodeGenerator(
   inputRowType.getFields().stream()
   .map(RowType.RowField::getType)
   .toArray(LogicalType[]::new))
   .generateRecordEqualiser("DeduplicateRowEqualiser");
   ```

##
File path: 
flink-table/flink-table-runtime-blink/src/test/java/org/apache/flink/table/runtime/operators/deduplicate/ProcTimeDeduplicateFunctionTestBase.java
##
@@ -45,4 +48,15 @@
 inputRowType.toRowFieldTypes(),
 new GenericRowRecordSortComparator(
  rowKeyIdx, inputRowType.toRowFieldTypes()[rowKeyIdx]));
+
+static GeneratedRecordEqualiser generatedEqualiser =
+new GeneratedRecordEqualiser("", "", new Object[0]) {
+
+private static final long serialVersionUID = 8932260133849746733L;

Review comment:
   Flink uses `1L` as the default `serialVersionUID`.
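
   A minimal, self-contained sketch of that convention (the class name below is invented purely for illustration, it is not the generated class from the PR):

   ```java
   import java.io.ObjectStreamClass;
   import java.io.Serializable;

   // Illustrative stub following the convention: a fixed serialVersionUID of 1L,
   // rather than a compiler-generated random value, since serialization
   // compatibility is managed explicitly in the code base.
   class ExampleRecordEqualiser implements Serializable {
       private static final long serialVersionUID = 1L;
   }
   ```

   The explicit constant also keeps the UID stable when the class body changes.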

##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/stream/StreamExecChangelogNormalize.java
##
@@ -81,6 +84,15 @@ public StreamExecChangelogNormalize(
 tableConfig
 .getConfiguration()
 
.getBoolean(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED);
+
+final EqualiserCodeGenerator equaliserCodeGen =
+new EqualiserCodeGenerator(
+rowTypeInfo.toRowType().getFields().stream()
+.map(RowType.RowField::getType)
+.toArray(LogicalType[]::new));
+GeneratedRecordEqualiser generatedEqualiser =
+equaliserCodeGen.generateRecordEqualiser("DeduplicateRowEqualiser");
+

Review comment:
   we can optimize to:
   ```
   final GeneratedRecordEqualiser generatedEqualiser =
   new EqualiserCodeGenerator(
   rowTypeInfo.toRowType().getFields().stream()
   .map(RowType.RowField::getType)
   .toArray(LogicalType[]::new))
   .generateRecordEqualiser("DeduplicateRowEqualiser");
   ```

##
File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/operators/deduplicate/DeduplicateFunctionHelper.java
##
@@ -96,12 +109,20 @@ static void processLastRowOnChangelog(
 currentRow.setRowKind(RowKind.INSERT);
 out.collect(currentRow);
 } else {
-if (generateUpdateBefore) {
-preRow.setRowKind(RowKind.UPDATE_BEFORE);
-out.collect(preRow);
+if (!isStateTtlEnabled && equaliser.equals(preRow, currentRow)) {
+

[GitHub] [flink] flinkbot edited a comment on pull request #14971: [FLINK-20536][tests] Update migration tests in master to cover migration from release-1.12

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14971:
URL: https://github.com/apache/flink/pull/14971#issuecomment-782581091


   
   ## CI report:
   
   * 250b424bb564a0a787d32afcb30bed2e2b2a4b91 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13519)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-21420) Add path templating to the DataStream API

2021-02-20 Thread Miguel Araujo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287658#comment-17287658
 ] 

Miguel Araujo commented on FLINK-21420:
---

Hi [~tzulitai], [~igal],

I was considering opening a PR to fix this issue; here is what I would change:
 * Adding StatefulFunctionDataStreamBuilder.withRequestReplyNamespace (naming 
suggestions welcome..).
 ** HttpFunctionEndpointSpec already supports path templating, so this would 
just create one with a NamespaceTarget and a UrlPathTemplate.
 ** I think adding a new method to the StatefulFunctionDataStreamBuilder is 
better than overloading the behavior of .withRequestReplyRemoteFunction as that 
could be confusing.
 * Renaming StatefulFunctionDataStreamBuilder.requestReplyFunctions to 
specificTypeEndpointSpecs.
 * Adding StatefulFunctionDataStreamBuilder.perNamespaceEndpointSpecs.
 * Make SerializableHttpFunctionProvider's behavior with namespace endpoints 
akin to the behavior of HttpFunctionProvider. This would mean changing 
(overloading) the constructor to receive an additional `Map perNamespaceEndpointSpecs`.
 * Changing the build()  method of StatefulFunctionDataStreamBuilder to use the 
new SerializableHttpFunctionProvider's constructor.

WDYT?

Unfortunately, I think DataStreams + Remote functions are not being tested 
anywhere, or at least I couldn't find a suitable test to modify.
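
A rough sketch of the shape of the proposed namespace-based entry point (all class and method names below are simplified stand-ins for the real StateFun types, used only to illustrate the change, not the actual API):

```java
// Simplified stand-ins for StateFun's Target and UrlPathTemplate (illustration only).
final class UrlPathTemplate {
    final String template;
    UrlPathTemplate(String template) { this.template = template; }
}

final class Target {
    final String kind;   // "functionType" or "namespace"
    final String value;
    private Target(String kind, String value) { this.kind = kind; this.value = value; }
    static Target functionType(String fqType) { return new Target("functionType", fqType); }
    static Target namespace(String ns) { return new Target("namespace", ns); }
}

// Sketch of the builder: next to the existing function-type-based constructor,
// a namespace-based factory would pass Target.namespace(...) when building the
// endpoint spec, mirroring what the YAML module spec already allows.
final class RequestReplyFunctionBuilderSketch {
    final Target target;
    final UrlPathTemplate pathTemplate;

    private RequestReplyFunctionBuilderSketch(Target target, UrlPathTemplate pathTemplate) {
        this.target = target;
        this.pathTemplate = pathTemplate;
    }

    // Existing behavior: bind a single fully qualified function type.
    static RequestReplyFunctionBuilderSketch forFunctionType(String fqType, String endpoint) {
        return new RequestReplyFunctionBuilderSketch(
                Target.functionType(fqType), new UrlPathTemplate(endpoint));
    }

    // Proposed addition: bind a whole namespace with a templated path.
    static RequestReplyFunctionBuilderSketch forNamespace(String ns, String endpoint) {
        return new RequestReplyFunctionBuilderSketch(
                Target.namespace(ns), new UrlPathTemplate(endpoint));
    }
}
```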

> Add path templating to the DataStream API
> -
>
> Key: FLINK-21420
> URL: https://issues.apache.org/jira/browse/FLINK-21420
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Reporter: Miguel Araujo
>Priority: Major
>
> Path Template was introduced in FLINK-20264 with a new module YAML 
> specification being added in FLINK-20334.
> However, that possibility was not added to the DataStream API.
> The main problem is that RequestReplyFunctionBuilder can only receive 
> FunctionTypes which it then turns into FunctionTypeTarget's for the 
> HttpFunctionEndpointSpec builder:
> {code:java}
> private RequestReplyFunctionBuilder(FunctionType functionType, URI endpoint) {
>   this.builder =
>   HttpFunctionEndpointSpec.builder(
>   Target.functionType(functionType), new UrlPathTemplate(endpoint.toASCIIString()));
> }
> {code}
> It should also be possible for the RequestReplyFunctionBuilder to receive a 
> namespace instead of a function type and to use `Target.namespace(namespace)` 
> to initialize the HttpFunctionEndpointSpec Builder instead.
>  





[GitHub] [flink] tillrohrmann commented on a change in pull request #14966: [FLINK-21417][core] Separate type-specific memory segments.

2021-02-20 Thread GitBox


tillrohrmann commented on a change in pull request #14966:
URL: https://github.com/apache/flink/pull/14966#discussion_r579644550



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegmentFactory.java
##
@@ -31,7 +31,7 @@
 import static org.apache.flink.util.Preconditions.checkArgument;
 
 /**
- * A factory for (hybrid) memory segments ({@link HybridMemorySegment}).
+ * A factory for (hybrid) memory segments ({@link OffHeapMemorySegment}).

Review comment:
   `A factory for {@link OffHeapMemorySegment}`.

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/OffHeapMemorySegment.java
##
@@ -24,33 +24,22 @@
 import javax.annotation.Nonnull;
 import javax.annotation.Nullable;
 
-import java.io.DataInput;
-import java.io.DataOutput;
-import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.util.function.Consumer;
 import java.util.function.Function;
 
 import static org.apache.flink.core.memory.MemoryUtils.getByteBufferAddress;
 
 /**
- * This class represents a piece of memory managed by Flink.
+ * This class represents a piece of off-heap memory managed by Flink.
  *
- * The memory can be on-heap, off-heap direct or off-heap unsafe, this is 
transparently handled
- * by this class.
- *
- * This class specializes byte access and byte copy calls for heap memory, 
while reusing the
- * multi-byte type accesses and cross-segment operations from the 
MemorySegment.
- *
- * This class subsumes the functionality of the {@link
- * org.apache.flink.core.memory.HeapMemorySegment}, but is a bit less 
efficient for operations on
- * individual bytes.
+ * The memory can direct or unsafe, this is transparently handled by this 
class.

Review comment:
   `can be`

##
File path: 
flink-core/src/test/java/org/apache/flink/core/memory/HybridOffHeapDirectMemorySegmentTest.java
##
@@ -51,7 +51,7 @@ MemorySegment createSegment(int size, Object owner) {
 @Test
 public void testHybridHeapSegmentSpecifics() {
 final int bufSize = 411;
-HybridMemorySegment seg = (HybridMemorySegment) createSegment(bufSize);
+OffHeapMemorySegment seg = (OffHeapMemorySegment) createSegment(bufSize);

Review comment:
   The cast should not be necessary if we change the return type of 
`createSegment`.

##
File path: 
flink-core/src/test/java/org/apache/flink/core/memory/EndiannessAccessChecks.java
##
@@ -37,19 +37,18 @@ public void testHeapSegment() {
 
 @Test
 public void testHybridOnHeapSegment() {
-testBigAndLittleEndianAccessUnaligned(MemorySegmentFactory.wrap(new byte[1]));
+testBigAndLittleEndianAccessUnaligned(
+MemorySegmentFactory.wrapHeapSegment(new byte[1]));
 }
 
 @Test
 public void testHybridOffHeapSegment() {
-testBigAndLittleEndianAccessUnaligned(
-MemorySegmentFactory.allocateUnpooledOffHeapMemory(1));
+testBigAndLittleEndianAccessUnaligned(MemorySegmentFactory.allocateDirectSegment(1));
 }
 
 @Test
 public void testHybridOffHeapUnsafeSegment() {

Review comment:
   Could be renamed into `testUnsafeOffHeapSegment`

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegmentFactory.java
##
@@ -130,7 +130,7 @@ public static MemorySegment allocateUnpooledOffHeapMemory(int size) {
  */
 public static MemorySegment allocateUnpooledOffHeapMemory(int size, Object owner) {

Review comment:
   Should we rename the factory methods to reflect which type of off heap 
memory they allocate?

##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/UnsafeMemorySegment.java
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.core.memory;
+
+import org.apache.flink.annotation.Internal;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.nio.ByteBuffer;
+
+/**
+ * This class represents a piece of unsafe off-heap memory managed by Flink.
+ *
+ * Note that memory segments should usually not be allocated manually, but 
rather through the
+ * {@link 

[GitHub] [flink] flinkbot edited a comment on pull request #14972: [FLINK-20536][tests] Update migration tests in master to cover migration from release-1.12

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14972:
URL: https://github.com/apache/flink/pull/14972#issuecomment-782581149


   
   ## CI report:
   
   * f8b2ed1dc83642f1e241470e8630c2890a09fa6d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13520)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Updated] (FLINK-21421) Add coMapWithState to ConnectedStreams

2021-02-20 Thread Aris Koliopoulos (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aris Koliopoulos updated FLINK-21421:
-
Description: 
Currently there is no syntactic sugar for stateful functions in 
`ConnectedStreams` in Scala. 

This makes stateful joins (aka `connect`) more verbose and exposes users to 
Java interfaces (by requiring a `RichCoMapFunction` implementation to access 
state in `ConnectedStreams`).

Looking at DriveTribe's codebase, we have implemented ~80% of our 
ConnectedStreams operators using this `coMapWithState` implementation:

[https://github.com/ariskk/flink-stream-join/blob/main/src/main/scala/com/ariskk/streamjoin/ConnectedStreamsOps.scala#L15]

A `coFlatMapWithState` can be trivially implemented on top.

This has been in production for so long I forgot it was our code and not 
Flink's.

-I can easily add it if this is of interest.- 

I did the work, here is the diff 
https://github.com/apache/flink/compare/master...ariskk:FLINK-21421

 

  was:
Currently there is no syntactic sugar for stateful functions in 
`ConnectedStreams` in Scala. 

This makes stateful joins (aka `connect`) more verbose and exposes users to 
Java interfaces (by requiring a `RichCoMapFunction` implementation to access 
state in `ConnectedStreams`).

Looking at DriveTribe's codebase, we have implemented ~80% of our 
ConnectedStreams operators using this `coMapWithState` implementation:

[https://github.com/ariskk/flink-stream-join/blob/main/src/main/scala/com/ariskk/streamjoin/ConnectedStreamsOps.scala#L15]

A `coFlatMapWithState` can be trivially implemented on top.

This has been in production for so long I forgot it was our code and not 
Flink's.

I can easily add it if this is of interest. 

 


> Add coMapWithState to ConnectedStreams
> --
>
> Key: FLINK-21421
> URL: https://issues.apache.org/jira/browse/FLINK-21421
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Scala
>Reporter: Aris Koliopoulos
>Priority: Minor
>
> Currently there is no syntactic sugar for stateful functions in 
> `ConnectedStreams` in Scala. 
> This makes stateful joins (aka `connect`) more verbose and exposes users to 
> Java interfaces (by requiring a `RichCoMapFunction` implementation to access 
> state in `ConnectedStreams`).
> Looking at DriveTribe's codebase, we have implemented ~80% of our 
> ConnectedStreams operators using this `coMapWithState` implementation:
> [https://github.com/ariskk/flink-stream-join/blob/main/src/main/scala/com/ariskk/streamjoin/ConnectedStreamsOps.scala#L15]
> A `coFlatMapWithState` can be trivially implemented on top.
> This has been in production for so long I forgot it was our code and not 
> Flink's.
> -I can easily add it if this is of interest.- 
> I did the work, here is the diff 
> https://github.com/apache/flink/compare/master...ariskk:FLINK-21421
>  





[GitHub] [flink] flinkbot edited a comment on pull request #14969: [FLINK-21402] Introduce SlotPoolServiceSchedulerFactory to bundle SlotPoolService and Scheduler factories

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14969:
URL: https://github.com/apache/flink/pull/14969#issuecomment-782223904


   
   ## CI report:
   
   * 210d106621b94b14e90cfca736e07500eb452455 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13506)
 
   * c92881cb4485e7826f56d48284867111d280046b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13525)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14970: [FLINK-21390] Rename DeclarativeScheduler to AdaptiveScheduler

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14970:
URL: https://github.com/apache/flink/pull/14970#issuecomment-782244636


   
   ## CI report:
   
   * c12ff62506a06a4316876810bd96cd2a144fcc26 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13507)
 
   * 08cce5cbc57a8af1ac8699e615cf68d9b0fb8150 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13526)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] fsk119 commented on a change in pull request #14896: [FLINK-21176][docs] translate the updates of 'avro-confluent.zh.md' to Chinese

2021-02-20 Thread GitBox


fsk119 commented on a change in pull request #14896:
URL: https://github.com/apache/flink/pull/14896#discussion_r579642542



##
File path: docs/content.zh/docs/connectors/table/formats/avro-confluent.md
##
@@ -115,41 +115,40 @@ CREATE TABLE user_created (
   'topic' = 'user_events_example2',
   'properties.bootstrap.servers' = 'localhost:9092',
 
-  -- Watch out: schema evolution in the context of a Kafka key is almost never 
backward nor
-  -- forward compatible due to hash partitioning.
+  -- 注意:由于哈希分区,在 Kafka key 的上下文中,schema 升级几乎从不向后也不向前兼容。
   'key.format' = 'avro-confluent',
   'key.avro-confluent.schema-registry.url' = 'http://localhost:8082',
   'key.fields' = 'kafka_key_id',
 
-  -- In this example, we want the Avro types of both the Kafka key and value 
to contain the field 'id'
-  -- => adding a prefix to the table column associated to the Kafka key field 
avoids clashes
+  -- 在本例中,我们希望 Kafka 的 key 和 value 的 Avro 类型都包含 'id' 字段
+  -- => 给表中与 Kafka key 字段关联的列添加一个前缀来避免冲突
   'key.fields-prefix' = 'kafka_key_',
 
   'value.format' = 'avro-confluent',
   'value.avro-confluent.schema-registry.url' = 'http://localhost:8082',
   'value.fields-include' = 'EXCEPT_KEY',

-  -- subjects have a default value since Flink 1.13, though can be overriden:
+  -- 自 Flink 1.13 起,subjects 具有一个默认值, 但是可以被覆盖:
   'key.avro-confluent.schema-registry.subject' = 'user_events_example2-key2',
   'value.avro-confluent.schema-registry.subject' = 
'user_events_example2-value2'
 )
 ```
 
 ---
-Example of a table using the upsert connector with the Kafka value registered 
as an Avro record in the Schema Registry:
+使用 upsert 连接器,Kafka 的 value 在 Schema Registry 中注册为 Avro 记录的表的示例:

Review comment:
   I think it's better to use the fully qualified name `upsert-kafka`; `upsert` alone may confuse users.









[GitHub] [flink] flinkbot edited a comment on pull request #14969: [FLINK-21402] Introduce SlotPoolServiceSchedulerFactory to bundle SlotPoolService and Scheduler factories

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14969:
URL: https://github.com/apache/flink/pull/14969#issuecomment-782223904


   
   ## CI report:
   
   * 210d106621b94b14e90cfca736e07500eb452455 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13506)
 
   * c92881cb4485e7826f56d48284867111d280046b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14970: [FLINK-21390] Rename DeclarativeScheduler to AdaptiveScheduler

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14970:
URL: https://github.com/apache/flink/pull/14970#issuecomment-782244636


   
   ## CI report:
   
   * c12ff62506a06a4316876810bd96cd2a144fcc26 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13507)
 
   * 08cce5cbc57a8af1ac8699e615cf68d9b0fb8150 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-20332) Add workers recovered from previous attempt to pending resources

2021-02-20 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287644#comment-17287644
 ] 

Till Rohrmann commented on FLINK-20332:
---

Does this mean that in the Yarn case we might request some additional 
containers? Once the recovered workers register at the RM they announce their 
resource spec and if they can be used for the job, the {{YarnResourceManager}} 
might cancel some of the newly requested resources?

> Add workers recovered from previous attempt to pending resources
> 
>
> Key: FLINK-20332
> URL: https://issues.apache.org/jira/browse/FLINK-20332
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>
> For active deployments (Native K8s/Yarn/Mesos), after a JM failover, workers 
> from previous attempt should register to the new JM. Depending on the order 
> that slot requests and TM registrations arrive at the RM, it could happen 
> that RM allocates unnecessary new resources while there are recovered 
> resources that can be reused.
> A potential improvement is to add recovered workers to pending resources, so 
> that RM knows what resources are expected to be available soon and decide 
> whether to allocate new resources accordingly.
> See also the discussion in FLINK-20249.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-21187) RootException history implementation in DefaultScheduler

2021-02-20 Thread Till Rohrmann (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-21187.
-
Fix Version/s: 1.13.0
   Resolution: Fixed

Fixed via c77a686c195d1742c276f4a9e75899c8b85377bb

> RootException history implementation in DefaultScheduler
> 
>
> Key: FLINK-21187
> URL: https://issues.apache.org/jira/browse/FLINK-21187
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Matthias
>Assignee: Matthias
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>






[GitHub] [flink] tillrohrmann closed pull request #14798: [FLINK-21187] Provide exception history for root causes

2021-02-20 Thread GitBox


tillrohrmann closed pull request #14798:
URL: https://github.com/apache/flink/pull/14798


   







[GitHub] [flink] xintongsong commented on a change in pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-20 Thread GitBox


xintongsong commented on a change in pull request #14629:
URL: https://github.com/apache/flink/pull/14629#discussion_r579634054



##
File path: 
flink-kubernetes/src/test/java/org/apache/flink/kubernetes/kubeclient/factory/KubernetesJobManagerFactoryWithPodTemplateTest.java
##
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.kubeclient.factory;
+
+import org.apache.flink.kubernetes.KubernetesPodTemplateTestUtils;
+import org.apache.flink.kubernetes.configuration.KubernetesConfigOptionsInternal;
+import org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint;
+import org.apache.flink.kubernetes.kubeclient.FlinkPod;
+import org.apache.flink.kubernetes.kubeclient.KubernetesJobManagerSpecification;
+import org.apache.flink.kubernetes.kubeclient.KubernetesJobManagerTestBase;
+import org.apache.flink.kubernetes.utils.Constants;
+import org.apache.flink.kubernetes.utils.KubernetesUtils;
+
+import io.fabric8.kubernetes.api.model.Container;
+import io.fabric8.kubernetes.api.model.PodTemplateSpec;
+import org.junit.Test;
+
+import static org.hamcrest.Matchers.hasItems;
+import static org.hamcrest.Matchers.is;
+import static org.junit.Assert.assertThat;
+
+/**
+ * General tests for the {@link KubernetesJobManagerFactory} with pod template. These tests will
+ * ensure that init container, volumes, sidecar containers from pod template should be kept after
+ * all decorators.
+ */
+public class KubernetesJobManagerFactoryWithPodTemplateTest extends KubernetesJobManagerTestBase {

Review comment:
   The two classes `KubernetesJobManagerFactoryWithPodTemplateTest` and 
`KubernetesTaskManagerFactoryWithPodTemplateTest` are practically identical, 
except for `onSetup`.
   I think they can be deduplicated.

##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/configuration/KubernetesConfigOptions.java
##
@@ -367,6 +368,46 @@
 
code("FlinkKubeClient#checkAndUpdateConfigMap"))
 .build());
 
+    public static final ConfigOption<String> JOB_MANAGER_POD_TEMPLATE =
+            key("kubernetes.jobmanager.pod-template-file")
+                    .stringType()
+                    .noDefaultValue()
+                    .withFallbackKeys(KUBERNETES_POD_TEMPLATE_FILE_KEY)
+                    .withDescription(
+                            "Specify a local file that contains the jobmanager pod template. It will be used to "
+                                    + "initialize the jobmanager pod. If not explicitly configured, config option '"
+                                    + KUBERNETES_POD_TEMPLATE_FILE_KEY
+                                    + "' will be used.");
+
+    public static final ConfigOption<String> TASK_MANAGER_POD_TEMPLATE =
+            key("kubernetes.taskmanager.pod-template-file")
+                    .stringType()
+                    .noDefaultValue()
+                    .withFallbackKeys(KUBERNETES_POD_TEMPLATE_FILE_KEY)
+                    .withDescription(
+                            "Specify a local file that contains the taskmanager pod template. It will be used to "
+                                    + "initialize the taskmanager pod. If not explicitly configured, config option '"
+                                    + KUBERNETES_POD_TEMPLATE_FILE_KEY
+                                    + "' will be used.");
+
+    /**
+     * This option is here only for documentation generation, it is the fallback key of
+     * JOB_MANAGER_POD_TEMPLATE and TASK_MANAGER_POD_TEMPLATE.
+     */
+    @SuppressWarnings("unused")
+    public static final ConfigOption<String> KUBERNETES_POD_TEMPLATE =
+            key(KUBERNETES_POD_TEMPLATE_FILE_KEY)
+                    .stringType()
+                    .noDefaultValue()
+                    .withDescription(

Review comment:
   Shall we explain in the configuration option description that the 
template should use `Constants.MAIN_CONTAINER_NAME` to refer the main container?

##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/configuration/KubernetesConfigOptions.java

[GitHub] [flink] kezhuw commented on pull request #14831: [FLINK-21086][runtime][checkpoint] CheckpointBarrierHandler Insert barriers for channels received EndOfPartition

2021-02-20 Thread GitBox


kezhuw commented on pull request #14831:
URL: https://github.com/apache/flink/pull/14831#issuecomment-78253


   @gaoyunhaii @pnowojski I have some immature and unverified thoughts.
   
   I think handling of `EndOfPartitionEvent` during checkpoint and overtaking buffers before `EndOfPartitionEvent` are kind of orthogonal and could coexist. We will encounter `EndOfPartitionEvent` during checkpoint regardless, and canceling the checkpoint in this case should be avoided after FLIP-147.
   
   But instead of "Insert barriers for channels received EndOfPartition", I would suggest counting `EndOfPartitionEvent` directly in the checkpoint barrier handler. That is, in implementation, we could count/trigger the checkpoint in `CheckpointBarrierHandler.processEndOfPartition` and `ChannelStatePersister.checkForBarrier` if there is a pending checkpoint. Personally, I think it is the same as `FinalizeBarrierComplementProcessor` but much more straightforward. The point is that if we decide to support `EndOfPartitionEvent` during checkpoint, we should burn this knowledge directly into `CheckpointBarrierHandler`.
   
   To overtake buffers before `EndOfPartitionEvent` for unaligned checkpoints, I think the minimal requirement is flushing all buffers to the network stack before `EndOfPartitionEvent`. An asynchronous completable flush operation on `PipelinedSubpartition` should meet this requirement. Before that flush operation completes, an unaligned checkpoint can take place as normal; after it, there will be no output buffers to overtake. Alternatively, a request-response pair of events (e.g. `EndOfXyzEvent`, `XyzConsumedEvent`) would fulfill this requirement. I am not sure how viable it is to introduce `FINISHING` for checkpointing, as I think `ExecutionState` is tackled by `Task` while checkpointing is tackled inside `StreamTask`. But I think this "how to overtake buffers for unaligned checkpoint before `EndOfPartitionEvent`" could be a separate issue.
   
   Besides this, I think a checkpoint after `EndOfInputEvent` (e.g. at `FINISHING` or after all buffers are flushed) is similar to the last checkpoint for 2pc. I am somewhat worried about buffer duplication after recovering from a successful checkpoint created in that period. A checkpoint in that period will include the last buffers from a possible end-of-stream flush operation; after recovery, will that end-of-stream flush still be executed?
   
   I saw there are other JIRAs in this PR or planned PRs, so I want to write down my thoughts as well:
   * FLINK-21085: I think the current solution will undermine what we try to follow in FLINK-21133. In that JIRA, we try to unify handling of stop-with-savepoint in one place, while the current approach tends to duplicate the checkpoint trigger across many code paths. I would suggest enhancing and generalizing `MultipleInputStreamTask.triggerCheckpointAsync` to `StreamTask`. This way we will have only one place to trigger checkpoints.
   * FLINK-21081: Just another approach to list/consider/evaluate. Maybe we could send the checkpoint trigger RPC to all running tasks? If tasks with active inputs receive the checkpoint trigger RPC, they just bookkeep it. If all active inputs are at `EndOfPartitionEvent` in the meantime, the task will trigger the checkpoint itself. If checkpoint triggering is coded in one place, this bookkeeping and lazy triggering will only be coded once. But this approach may be network-consuming in a large job topology.
   
   @pnowojski I am not sure how the finishing handshake will be delivered, so basically I am also not sure how this handshaking will combine with stop-with-savepoint. But if the handshaking happens at the `StreamTask` level, I think it will be no big deal. Still unsure, though.
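
   A toy sketch of the "count `EndOfPartitionEvent` directly in the barrier handler" idea above (purely illustrative; this is not Flink's actual `CheckpointBarrierHandler`, and the class and method names are invented for this sketch). A channel that has delivered `EndOfPartitionEvent` counts toward alignment, so a pending checkpoint can still complete instead of being cancelled:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: treat EndOfPartition like a barrier for a pending checkpoint. */
public class BarrierCounterSketch {
    private final int totalChannels;
    private final Set<Integer> alignedChannels = new HashSet<>();
    private final Set<Integer> finishedChannels = new HashSet<>();

    public BarrierCounterSketch(int totalChannels) {
        this.totalChannels = totalChannels;
    }

    /** A barrier arrived on a channel; returns true if the checkpoint can now complete. */
    public boolean onBarrier(int channel) {
        alignedChannels.add(channel);
        return isComplete();
    }

    /** EndOfPartition counts toward alignment instead of cancelling the checkpoint. */
    public boolean onEndOfPartition(int channel) {
        finishedChannels.add(channel);
        return isComplete();
    }

    private boolean isComplete() {
        // The checkpoint completes once every channel has either delivered a
        // barrier or reached EndOfPartition.
        Set<Integer> seen = new HashSet<>(alignedChannels);
        seen.addAll(finishedChannels);
        return seen.size() == totalChannels;
    }
}
```

   The real handler would additionally have to persist channel state and notify the checkpoint coordinator, which this sketch deliberately omits.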







[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463


   
   ## CI report:
   
   * b7790eb7008ad83fb5f8cc3a24b5e8ce5abfe259 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13490)
 
   * b8d2fd246e442fe92285103908ea6595c64d1c5c Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13523)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[jira] [Commented] (FLINK-20332) Add workers recovered from previous attempt to pending resources

2021-02-20 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287613#comment-17287613
 ] 

Xintong Song commented on FLINK-20332:
--

To add recovered workers to pending resources, we need to get the 
{{TaskExecutorProcessSpec}} or {{WorkerResourceSpec}} of the recovered workers 
before they register to the RM. There are several options to do this.
# Parse it from the TM starting command, which contains the TM resource specifications as dynamic properties. Starting commands of recovered workers are only available for the native K8s deployment.
# Attach (serialized) resource specifications to workers as metadata: K8s annotations, or Yarn allocation tags (version 3.1+, at the price of handling different versions with reflection, and the Yarn RM keeps the tags in memory).
# Maintain worker resource specs in HA, at the price of more HA space (linear in the number of living TMs) and IO cost during TM allocation/release.

2) does not provide much extra benefit compared to 1), and 3) might be a bit 
too heavy. Therefore, as a first step, I propose to do this improvement with 
1), for the native K8s deployment only.
- Yarn can be supported with 2) later, if needed. Recovering resource specifications with different approaches for Kubernetes / Yarn shouldn't be a problem.
- ATM, we do not support TMs with different resources on Mesos. Thus, TM resource specifications on Mesos should always be the default.
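
As a rough illustration of option 1, here is a hypothetical helper that extracts `-Dkey=value` dynamic properties from a recovered container's start command. The class name and the regex are assumptions for this sketch, not Flink's actual parser, and it ignores quoting and escaping in real shell commands:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: recover TM resource settings from a start command's -D properties. */
public class DynamicPropertyParser {

    // Matches occurrences like "-Dtaskmanager.memory.process.size=1728m".
    private static final Pattern DYNAMIC_PROPERTY = Pattern.compile("-D([^=\\s]+)=(\\S+)");

    public static Map<String, String> parse(String startCommand) {
        Map<String, String> props = new HashMap<>();
        Matcher m = DYNAMIC_PROPERTY.matcher(startCommand);
        while (m.find()) {
            props.put(m.group(1), m.group(2));
        }
        return props;
    }
}
```

The recovered properties could then be mapped back to a resource spec before the worker registers, which is exactly the information option 1 relies on being present in the command line.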

> Add workers recovered from previous attempt to pending resources
> 
>
> Key: FLINK-20332
> URL: https://issues.apache.org/jira/browse/FLINK-20332
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>
> For active deployments (Native K8s/Yarn/Mesos), after a JM failover, workers 
> from previous attempt should register to the new JM. Depending on the order 
> that slot requests and TM registrations arrive at the RM, it could happen 
> that RM allocates unnecessary new resources while there are recovered 
> resources that can be reused.
> A potential improvement is to add recovered workers to pending resources, so 
> that RM knows what resources are expected to be available soon and decide 
> whether to allocate new resources accordingly.
> See also the discussion in FLINK-20249.





[jira] [Updated] (FLINK-21424) Flink view cannot use hive temporal

2021-02-20 Thread HideOnBush (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HideOnBush updated FLINK-21424:
---
Description: 
{code:java}
Flink view cannot use hive temporal{code}
The kafka table can temporal join Hive Table.

CREATE TABLE kfk_fact_bill_master_pup (

   payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),

    proctime as PROCTIME()

) WITH (

 'connector' = 'kafka',

 'topic' = 'topic_name',

 'format' = 'json',

);

 

But the next operation will report an error, e.g.:

create view test_view as 

  select * from kfk_fact_bill_master_pup;

 

Using test_view in a temporal join with the Hive table will report an error, 
even though the view has a proctime field.

 

This is the error message:

 

!企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png!

 

  was:
{code:java}
Flink view cannot use hive temporal{code}
The kafka table can temporal join Hive Table.

CREATE TABLE kfk_fact_bill_master_pup (

   payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),

    proctime as PROCTIME()

) WITH (

 'connector' = 'kafka',

 'topic' = 'topic_name',

 'format' = 'json',

);

 

But the next operation will report an error。e.g..

create view test_view as 

  select * from kfk_fact_bill_master_pup;

 

use test_view  temporal join Hive Table will  report error.  and view have 
proctime field.

 

!企业微信截图_86542a7e-9bab-4078-82c5-583b79f01670.png!

 

!企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png!

 


> Flink view cannot use hive temporal
> ---
>
> Key: FLINK-21424
> URL: https://issues.apache.org/jira/browse/FLINK-21424
> Project: Flink
>  Issue Type: Wish
>  Components: Table SQL / API
>Affects Versions: 1.12.0
>Reporter: HideOnBush
>Priority: Major
> Attachments: 企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png
>
>
> {code:java}
> Flink view cannot use hive temporal{code}
> The kafka table can temporal join Hive Table.
> CREATE TABLE kfk_fact_bill_master_pup (
>    payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),
>     proctime as PROCTIME()
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'topic_name',
>  'format' = 'json',
> );
>  
> But the next operation will report an error。e.g..
> create view test_view as 
>   select * from kfk_fact_bill_master_pup;
>  
> use test_view  temporal join Hive Table will  report error.  and view have 
> proctime field.
>  
> This is Error Message. 
>  
> !企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png!
>  





[jira] [Updated] (FLINK-21424) Flink view cannot use hive temporal

2021-02-20 Thread HideOnBush (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HideOnBush updated FLINK-21424:
---
Description: 
{code:java}
Flink view cannot use hive temporal{code}
The kafka table can temporal join Hive Table.

CREATE TABLE kfk_fact_bill_master_pup (

   payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),

    proctime as PROCTIME()

) WITH (

 'connector' = 'kafka',

 'topic' = 'topic_name',

 'format' = 'json',

);

 

But the next operation will report an error, e.g.:

create view test_view as 

  select * from kfk_fact_bill_master_pup;

 

Using test_view in a temporal join with the Hive table will report an error, 
even though the view has a proctime field.

 

!企业微信截图_86542a7e-9bab-4078-82c5-583b79f01670.png!

 

!企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png!

 

  was:
{code:java}
Flink view cannot use hive temporal{code}
The kafka table can temporal join Hive Table.

CREATE TABLE kfk_fact_bill_master_pup (

   payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),

    proctime as PROCTIME()

) WITH (

 'connector' = 'kafka',

 'topic' = 'topic_name',

 'format' = 'json',

);

 

But the next operation will report an error。e.g..

create view test_view as 

  select * from kfk_fact_bill_master_pup;

 

use test_view  temporal join Hive Table will  report error.  and view have 
proctime field.

 

 


> Flink view cannot use hive temporal
> ---
>
> Key: FLINK-21424
> URL: https://issues.apache.org/jira/browse/FLINK-21424
> Project: Flink
>  Issue Type: Wish
>  Components: Table SQL / API
>Affects Versions: 1.12.0
>Reporter: HideOnBush
>Priority: Major
> Attachments: 企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png
>
>
> {code:java}
> Flink view cannot use hive temporal{code}
> The kafka table can temporal join Hive Table.
> CREATE TABLE kfk_fact_bill_master_pup (
>    payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),
>     proctime as PROCTIME()
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'topic_name',
>  'format' = 'json',
> );
>  
> But the next operation will report an error。e.g..
> create view test_view as 
>   select * from kfk_fact_bill_master_pup;
>  
> use test_view  temporal join Hive Table will  report error.  and view have 
> proctime field.
>  
> !企业微信截图_86542a7e-9bab-4078-82c5-583b79f01670.png!
>  
> !企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png!
>  





[jira] [Updated] (FLINK-21424) Flink view cannot use hive temporal

2021-02-20 Thread HideOnBush (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HideOnBush updated FLINK-21424:
---
Attachment: 企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png

> Flink view cannot use hive temporal
> ---
>
> Key: FLINK-21424
> URL: https://issues.apache.org/jira/browse/FLINK-21424
> Project: Flink
>  Issue Type: Wish
>  Components: Table SQL / API
>Affects Versions: 1.12.0
>Reporter: HideOnBush
>Priority: Major
> Attachments: 企业微信截图_35dfdc08-4f3f-4f34-962f-6db9afcf7cb0.png
>
>
> {code:java}
> Flink view cannot use hive temporal{code}
> The kafka table can temporal join Hive Table.
> CREATE TABLE kfk_fact_bill_master_pup (
>    payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),
>     proctime as PROCTIME()
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'topic_name',
>  'format' = 'json',
> );
>  
> But the next operation will report an error。e.g..
> create view test_view as 
>   select * from kfk_fact_bill_master_pup;
>  
> use test_view  temporal join Hive Table will  report error.  and view have 
> proctime field.
>  
>  





[jira] [Created] (FLINK-21424) Flink view cannot use hive temporal

2021-02-20 Thread HideOnBush (Jira)
HideOnBush created FLINK-21424:
--

 Summary: Flink view cannot use hive temporal
 Key: FLINK-21424
 URL: https://issues.apache.org/jira/browse/FLINK-21424
 Project: Flink
  Issue Type: Wish
  Components: Table SQL / API
Affects Versions: 1.12.0
Reporter: HideOnBush


{code:java}
Flink view cannot use hive temporal{code}
The kafka table can temporal join Hive Table.

CREATE TABLE kfk_fact_bill_master_pup (

   payLst ROW(groupID int, shopID int, shopName String, isJoinReceived int),

    proctime as PROCTIME()

) WITH (

 'connector' = 'kafka',

 'topic' = 'topic_name',

 'format' = 'json',

);

 

But the next operation will report an error, e.g.:

create view test_view as 

  select * from kfk_fact_bill_master_pup;

 

Using test_view in a temporal join with the Hive table will report an error, 
even though the view has a proctime field.

 

 





[GitHub] [flink] flinkbot edited a comment on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-759361463


   
   ## CI report:
   
   * b7790eb7008ad83fb5f8cc3a24b5e8ce5abfe259 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13490)
 
   * b8d2fd246e442fe92285103908ea6595c64d1c5c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] wangyang0918 commented on a change in pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-20 Thread GitBox


wangyang0918 commented on a change in pull request #14629:
URL: https://github.com/apache/flink/pull/14629#discussion_r579628708



##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/FlinkPod.java
##
@@ -48,6 +48,12 @@ public Container getMainContainer() {
 return mainContainer;
 }
 
+public FlinkPod clone() {
+return new FlinkPod(
+new PodBuilder(this.getPod()).build(),
+new ContainerBuilder(this.getMainContainer()).build());

Review comment:
   Nice suggestion. I will add a new commit for this change.









[GitHub] [flink] wangyang0918 commented on pull request #14629: [FLINK-15656][k8s] Support pod template for native kubernetes integration

2021-02-20 Thread GitBox


wangyang0918 commented on pull request #14629:
URL: https://github.com/apache/flink/pull/14629#issuecomment-782592922


   @xintongsong Thanks for the review. I have addressed your concerns inline and updated the PR.
   
   > It seems the current implementation cannot support setting the same 
template file for both JM / TM, due to the different main container names. 
Maybe it's not necessary to have different main container names for JM/TM.
   
   Yes, it did not support sharing the pod template in my first implementation, because I used to believe only the JobManager needs an init container, which would make the JobManager and TaskManager pod templates always different. Inspired by you and after more consideration, I think it makes sense to let them share the same pod template, since it is easier to use and harmless. I have unified the main container name and introduced a common config option `kubernetes.pod-template-file` in this update.
   
   > It's not clear to me what kind of compatibility guarantees we provide for 
the pod templates. Would it be possible that something user defined in the 
template works well in the current Flink version and is overwritten in a new 
version? I understand it's hard to guarantee that all templates are compatible 
across versions, but maybe we can define a set of commonly used functions that 
is guaranteed (guarded by tests) compatible across versions.
   
   I think it is possible that a pod template works well in the current Flink version and does not work in a new Flink version. This only happens when we introduce a new config option and overwrite (not merge) the corresponding Kubernetes field. This should rarely happen unless we have no other choice. In that case we will document it and publish it in the release notes.
   
   We could guarantee to our users that a whitelist of fields (e.g. annotations, labels, envs, init containers, sidecar containers, volumes) in the pod template will always be respected. And I have enriched the tests in `KubernetesJobManagerFactoryWithPodTemplateTest` and `KubernetesTaskManagerFactoryWithPodTemplateTest` to guard this.







[jira] [Commented] (FLINK-21413) TtlMapState and TtlListState cannot be clean completely with Filesystem StateBackend

2021-02-20 Thread Jiayi Liao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287600#comment-17287600
 ] 

Jiayi Liao commented on FLINK-21413:


Sure. I'm still thinking about the solution. I've come up with two proposals:

1. #TtlMapState#getUnexpiredOrNull returns null directly if {{unexpired.size() == 0}}, but this will also clean up the empty map.
2. Add a lastAccessTimestamp (long) in #TtlMapState, like what Flink does in TtlValue, and use the timestamp to check the map's lifecycle. But this may break state compatibility.

What do you think? [~liyu]
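
To make proposal 1 concrete, here is a minimal, hypothetical sketch (not Flink's actual implementation: the generic `Predicate` stands in for Flink's internal expiry check, and the defensive serializer copy is omitted). It returns null only when a non-empty map has fully expired, so the state entry can be dropped, while an already-empty map is returned as-is:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of proposal 1 plus an empty-map fast path; not Flink's actual code. */
public class TtlMapSketch {

    public static <K, V> Map<K, V> getUnexpiredOrNull(Map<K, V> ttlValue, Predicate<V> expired) {
        // Fast path: an already-empty map is returned as-is; the backend
        // (e.g. HeapMapState#remove) is expected to drop it once empty.
        if (ttlValue.isEmpty()) {
            return ttlValue;
        }
        Map<K, V> unexpired = new HashMap<>();
        for (Map.Entry<K, V> e : ttlValue.entrySet()) {
            if (!expired.test(e.getValue())) {
                unexpired.put(e.getKey(), e.getValue());
            }
        }
        if (unexpired.isEmpty()) {
            // A non-empty map whose entries have all expired: signal that the
            // whole state entry can be removed.
            return null;
        }
        return unexpired.size() == ttlValue.size() ? ttlValue : unexpired;
    }
}
```

Whether the empty-map fast path is actually needed depends on whether the backend removes empty maps itself, which is the open question in this thread.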

> TtlMapState and TtlListState cannot be clean completely with Filesystem 
> StateBackend
> 
>
> Key: FLINK-21413
> URL: https://issues.apache.org/jira/browse/FLINK-21413
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.9.0
>Reporter: Jiayi Liao
>Priority: Major
> Attachments: image-2021-02-19-11-13-58-672.png
>
>
> Take the #TtlMapState as an example,
>  
> {code:java}
> public Map<UK, TtlValue<UV>> getUnexpiredOrNull(@Nonnull Map<UK, TtlValue<UV>> ttlValue) {
>     Map<UK, TtlValue<UV>> unexpired = new HashMap<>();
>     TypeSerializer<TtlValue<UV>> valueSerializer =
>             ((MapSerializer<UK, TtlValue<UV>>) original.getValueSerializer()).getValueSerializer();
>     for (Map.Entry<UK, TtlValue<UV>> e : ttlValue.entrySet()) {
>         if (!expired(e.getValue())) {
>             // we have to do the defensive copy to update the value
>             unexpired.put(e.getKey(), valueSerializer.copy(e.getValue()));
>         }
>     }
>     return ttlValue.size() == unexpired.size() ? ttlValue : unexpired;
> }
> {code}
>  
> The returned value will never be null and the #StateEntry will exist 
> forever, which leads to a memory leak if the key range of the stream is very 
> large. Below we can see that 20+ million uncleared TtlStateMaps could take 
> up several GB of memory.
>  
> !image-2021-02-19-11-13-58-672.png!





[GitHub] [flink] flinkbot edited a comment on pull request #14953: [FLINK-21351][checkpointing] Don't subsume last checkpoint

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14953:
URL: https://github.com/apache/flink/pull/14953#issuecomment-780752668


   
   ## CI report:
   
   * 941374ecdd03b5ceb9245e4d8a8370671b691a34 UNKNOWN
   * 494f9fe80a0b2f1d7bb2559e2cba3283d5e2d89d Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13509)
 
   * 1ed5093d34a24e1784a7fc2573a009f49e71813b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13522)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14973: [FLINK-21412][python] Fix Decimal type which doesn't work in UDAF and expression DSL

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14973:
URL: https://github.com/apache/flink/pull/14973#issuecomment-782584606


   
   ## CI report:
   
   * a514485c23cddf2aaba4741241f5fa3bd0764cf0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13521)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot commented on pull request #14973: [FLINK-21412][python] Fix Decimal type which doesn't work in UDAF and expression DSL

2021-02-20 Thread GitBox


flinkbot commented on pull request #14973:
URL: https://github.com/apache/flink/pull/14973#issuecomment-782584606


   
   ## CI report:
   
   * a514485c23cddf2aaba4741241f5fa3bd0764cf0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   







[GitHub] [flink] flinkbot edited a comment on pull request #14972: [FLINK-20536][tests] Update migration tests in master to cover migration from release-1.12

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14972:
URL: https://github.com/apache/flink/pull/14972#issuecomment-782581149


   
   ## CI report:
   
   * f8b2ed1dc83642f1e241470e8630c2890a09fa6d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13520)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14971: [FLINK-20536][tests] Update migration tests in master to cover migration from release-1.12

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14971:
URL: https://github.com/apache/flink/pull/14971#issuecomment-782581091


   
   ## CI report:
   
   * 250b424bb564a0a787d32afcb30bed2e2b2a4b91 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13519)
 
   
   




[GitHub] [flink] flinkbot edited a comment on pull request #14953: [FLINK-21351][checkpointing] Don't subsume last checkpoint

2021-02-20 Thread GitBox


flinkbot edited a comment on pull request #14953:
URL: https://github.com/apache/flink/pull/14953#issuecomment-780752668


   
   ## CI report:
   
   * 941374ecdd03b5ceb9245e4d8a8370671b691a34 UNKNOWN
   * 494f9fe80a0b2f1d7bb2559e2cba3283d5e2d89d Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=13509)
 
   * 1ed5093d34a24e1784a7fc2573a009f49e71813b UNKNOWN
   
   




[GitHub] [flink] dianfu commented on pull request #14920: [FLINK-21358][docs] Adds savepoint 1.12.x to savepoint compatibility diagram

2021-02-20 Thread GitBox


dianfu commented on pull request #14920:
URL: https://github.com/apache/flink/pull/14920#issuecomment-782583991


   @tillrohrmann @XComp Sorry for the late response. There are no known 
backwards compatibility issues that I'm aware of.







[jira] [Updated] (FLINK-21412) pyflink DataTypes.DECIMAL is not available

2021-02-20 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-21412:

Fix Version/s: 1.12.3
   1.13.0

> pyflink DataTypes.DECIMAL is not available
> --
>
> Key: FLINK-21412
> URL: https://issues.apache.org/jira/browse/FLINK-21412
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.12.1
> Environment: python 3.7.5
> pyflink 1.12.1
>Reporter: awayne
>Assignee: Dian Fu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.13.0, 1.12.3
>
>
> When I use DataTypes.DECIMAL in a UDAF, the following error occurs:
> File "/home/ubuntu/pyflenv/lib/python3.7/site-packages/pyflink/table/types.py", line 2025, in _to_java_data_type
>   _to_java_data_type(data_type._element_type))
> File "/home/ubuntu/pyflenv/lib/python3.7/site-packages/pyflink/table/types.py", line 1964, in _to_java_data_type
>   j_data_type = JDataTypes.Decimal(data_type.precision, data_type.scale)
> File "/home/ubuntu/pyflenv/lib/python3.7/site-packages/py4j/java_gateway.py", line 1516, in __getattr__
>   "{0}.{1} does not exist in the JVM".format(self._fqn, name))
> py4j.protocol.Py4JError: org.apache.flink.table.api.DataTypes.Decimal does not exist in the JVM
>
> In pyflink\table\types.py, lines 1963-1964:
> elif isinstance(data_type, DecimalType):
>     j_data_type = JDataTypes.Decimal(data_type.precision, data_type.scale)
> The Java factory method is named "DECIMAL", so the call should be JDataTypes.DECIMAL instead of JDataTypes.Decimal.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21412) pyflink DataTypes.DECIMAL is not available

2021-02-20 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-21412:

Affects Version/s: (was: 1.12.1)
   1.12.0

> pyflink DataTypes.DECIMAL is not available
> --
>
> Key: FLINK-21412
> URL: https://issues.apache.org/jira/browse/FLINK-21412
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.12.0
> Environment: python 3.7.5
> pyflink 1.12.1
>Reporter: awayne
>Assignee: Dian Fu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.13.0, 1.12.3
>
>


