[GitHub] [flink] rmetzger commented on pull request #13690: [FLINK-16595][YARN]support more HDFS nameServices in yarn mode when security enabled. Is…

2020-10-23 Thread GitBox


rmetzger commented on pull request #13690:
URL: https://github.com/apache/flink/pull/13690#issuecomment-715016591


   >> Have you tested this change with multiple kerberos-enabled NameNodes?
   
   > Yes, it has worked in my production environment for a long time.
   
   Okay, that's great to hear! If extending the e2e tests turns out to be 
extremely difficult, I would also be fine with merging this change without 
tests, given that this is a non-critical add-on feature that is difficult to 
test.
   But first it would be nice if you could look into the e2e tests. In my 
opinion it should be possible. Sneaking in the right 2nd namenode config could 
be a bit difficult, but there's probably a HADOOP_CONF_DIR (or 
HADOOP_CLASSPATH) environment variable that you need to temporarily override 
for the 2nd NameNode.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13458: FLINK-18851 [runtime] Add checkpoint type to checkpoint history entries in Web UI

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13458:
URL: https://github.com/apache/flink/pull/13458#issuecomment-697041219


   
   ## CI report:
   
   * 7311b0d12d19a645391ea0359a9aa6318806363b UNKNOWN
   * 9bc87ad6b2a8cbad2f226c483c74f1fa1523b70e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8079)
 
   * 9c8c867b1ba1f9a236534718c4d4e0371b3cc5a3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13703: [FLINK-19696] Add runtime batch committer operators for the new sink API

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13703:
URL: https://github.com/apache/flink/pull/13703#issuecomment-712821004


   
   ## CI report:
   
   * 58ee54d456da57f99b6124c787b7cbaeedcfb261 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8081)
 
   * 66e9e845c43bf6ae4e03e3673407158677094801 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8154)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] danny0405 commented on pull request #12919: [FLINK-16048][avro] Support read/write confluent schema registry avro…

2020-10-23 Thread GitBox


danny0405 commented on pull request #12919:
URL: https://github.com/apache/flink/pull/12919#issuecomment-715024308


   Thanks, I think this is a bug; I have logged an issue for it. See 
https://issues.apache.org/jira/browse/FLINK-19779



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-19779) Remove the "record_" field name prefix for Confluent Avro format deserialization

2020-10-23 Thread Danny Chen (Jira)
Danny Chen created FLINK-19779:
--

 Summary: Remove the "record_" field name prefix for Confluent Avro 
format deserialization
 Key: FLINK-19779
 URL: https://issues.apache.org/jira/browse/FLINK-19779
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.12.0
Reporter: Danny Chen
 Fix For: 1.12.0


Reported by Maciej Bryński:

The problem is that this is not compatible. I'm unable to read anything from Kafka 
using the Confluent Registry. Example:
I have data in Kafka with the following value schema:


{code:java}
{
  "type": "record",
  "name": "myrecord",
  "fields": [
{
  "name": "f1",
  "type": "string"
}
  ]
}
{code}

I'm creating a table using the avro-confluent format:


{code:sql}
create table `test` (
`f1` STRING
) WITH (
  'connector' = 'kafka', 
  'topic' = 'test', 
  'properties.bootstrap.servers' = 'localhost:9092', 
  'properties.group.id' = 'test1234', 
  'scan.startup.mode' = 'earliest-offset', 
  'format' = 'avro-confluent',
  'avro-confluent.schema-registry.url' = 'http://localhost:8081'
);
{code}

When trying to select data I get the following error:


{code:noformat}
SELECT * FROM test;
[ERROR] Could not execute SQL statement. Reason:
org.apache.avro.AvroTypeException: Found myrecord, expecting record, missing 
required field record_f1
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
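
To make the schema mismatch concrete, here is a minimal, hypothetical Java sketch 
(an illustration, not Flink's actual deserializer code) that builds the two Avro 
schemas involved; Avro schema resolution fails because neither the record name 
nor the field names line up:

{code:java}
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class SchemaMismatchSketch {
    public static void main(String[] args) {
        // Writer schema: what is actually registered in the Confluent Schema
        // Registry for the topic.
        Schema writer = SchemaBuilder.record("myrecord").fields()
                .requiredString("f1")
                .endRecord();

        // Reader schema, per the error message: a record named "record" whose
        // field names carry a "record_" prefix.
        Schema reader = SchemaBuilder.record("record").fields()
                .requiredString("record_f1")
                .endRecord();

        // Resolving writer against reader fails with:
        // "Found myrecord, expecting record, missing required field record_f1"
        System.out.println(writer);
        System.out.println(reader);
    }
}
{code}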


[GitHub] [flink] flinkbot edited a comment on pull request #13744: [FLINK-19766][table-runtime] Introduce File streaming compaction operators

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13744:
URL: https://github.com/apache/flink/pull/13744#issuecomment-714369975


   
   ## CI report:
   
   * d257594ae6a62374f9727f97dad3254ea7c25ef2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8144)
 
   * 4d7e601658c0dacc100c572f4fe8878c058e035f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13758: [FLINK-18676] [FileSystem] Bump s3 aws version to handle WebIdentityTokenCredentialsProvider

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13758:
URL: https://github.com/apache/flink/pull/13758#issuecomment-714986378


   
   ## CI report:
   
   * 935ad11a8e3067394204de1df2bbb33235fc0e86 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8155)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] godfreyhe opened a new pull request #13760: [FLINK-19627][table-runtime] Introduce multiple input operator for batch

2020-10-23 Thread GitBox


godfreyhe opened a new pull request #13760:
URL: https://github.com/apache/flink/pull/13760


   
   ## What is the purpose of the change
   
   *This PR aims to introduce a multiple-input operator for batch. The logic in a 
multiple-input operator is similar to OperatorChain, mainly including: 
initializing all sub-operators, initializing the Input list of the multiple-input 
operator, connecting each sub-operator via Output, and delegating the 
open/close/dispose actions to each sub-operator (see the sketch below).*
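
   As a rough, self-contained sketch of that delegation pattern (illustrative 
names only, not the actual BatchMultipleInputStreamOperator code):

{code:java}
import java.util.List;

/** Simplified stand-in for a chained operator's lifecycle; names are illustrative. */
interface SubOperator {
    void open() throws Exception;
    void close() throws Exception;
    void dispose() throws Exception;
}

/**
 * The multiple-input wrapper delegates lifecycle calls to its sub-operators,
 * opening them from the output side backwards and closing them from the input
 * side forwards, mirroring what OperatorChain does.
 */
class MultipleInputOperatorSketch {
    private final List<SubOperator> subOperators; // topological order, inputs first

    MultipleInputOperatorSketch(List<SubOperator> subOperators) {
        this.subOperators = subOperators;
    }

    void open() throws Exception {
        for (int i = subOperators.size() - 1; i >= 0; i--) {
            subOperators.get(i).open(); // open downstream operators first
        }
    }

    void close() throws Exception {
        for (SubOperator op : subOperators) {
            op.close(); // close upstream operators first
        }
    }

    void dispose() throws Exception {
        for (SubOperator op : subOperators) {
            op.dispose();
        }
    }
}
{code}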
   
   
   
   ## Brief change log
 - *Introduce OneInput, FirstInputOfTwoInput, SecondInputOfTwoInput, 
InputSelectionHandler for multiple input operator*
 - *Introduce different Output sub-classes for multiple input operator*
 - *Introduce multiple input operator for batch*
   
   
   ## Verifying this change
   
   
   This change added tests and can be verified as follows:
   
 - *Added InputTest to verify different kinds of Input*
 - *Added OutputTest to verify different kinds of Output*
 - *Added BatchMultipleInputStreamOperatorTest to verify all actions of 
BatchMultipleInputStreamOperator*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**yes** / no)
 - If yes, how is the feature documented? (not applicable / docs / 
**JavaDocs** / not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13759: [FLINK-19412][python] Re-layer Python Operation Make it Possible to Provide only Python implementation

2020-10-23 Thread GitBox


flinkbot commented on pull request #13759:
URL: https://github.com/apache/flink/pull/13759#issuecomment-715027223


   
   ## CI report:
   
   * d05c438cf0d44def860358741ca30151d8a3dd6a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19627) Introduce multi-input operator for batch

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19627:
---
Labels: pull-request-available  (was: )

> Introduce multi-input operator for batch
> 
>
> Key: FLINK-19627
> URL: https://issues.apache.org/jira/browse/FLINK-19627
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Reporter: Caizhi Weng
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> After the planner is ready for multi-input, we should introduce a multi-input 
> operator for batch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19779) Remove the "record_" field name prefix for Confluent Avro format deserialization

2020-10-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219507#comment-17219507
 ] 

Maciej Bryński commented on FLINK-19779:


Thanks [~danny0405] for filing this bug.

I was using Flink from the master branch, commit 
78b3f2e7d5d0a5a4d341587951b2dc54a5ef427b.

> Remove the "record_" field name prefix for Confluent Avro format 
> deserialization
> 
>
> Key: FLINK-19779
> URL: https://issues.apache.org/jira/browse/FLINK-19779
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0
>Reporter: Danny Chen
>Priority: Major
> Fix For: 1.12.0
>
>
> Reported by Maciej Bryński:
> The problem is that this is not compatible. I'm unable to read anything from Kafka 
> using the Confluent Registry. Example:
> I have data in Kafka with the following value schema:
> {code:java}
> {
>   "type": "record",
>   "name": "myrecord",
>   "fields": [
> {
>   "name": "f1",
>   "type": "string"
> }
>   ]
> }
> {code}
> I'm creating a table using the avro-confluent format:
> {code:sql}
> create table `test` (
>   `f1` STRING
> ) WITH (
>   'connector' = 'kafka', 
>   'topic' = 'test', 
>   'properties.bootstrap.servers' = 'localhost:9092', 
>   'properties.group.id' = 'test1234', 
>   'scan.startup.mode' = 'earliest-offset', 
>   'format' = 'avro-confluent',
>   'avro-confluent.schema-registry.url' = 'http://localhost:8081'
> );
> {code}
> When trying to select data I get the following error:
> {code:noformat}
> SELECT * FROM test;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.avro.AvroTypeException: Found myrecord, expecting record, missing 
> required field record_f1
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] QingdongZeng3 commented on a change in pull request #13690: [FLINK-16595][YARN]support more HDFS nameServices in yarn mode when security enabled. Is…

2020-10-23 Thread GitBox


QingdongZeng3 commented on a change in pull request #13690:
URL: https://github.com/apache/flink/pull/13690#discussion_r510678304



##
File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnConfigOptions.java
##
@@ -294,6 +294,13 @@
"they doesn't need to be downloaded every time 
for each application. An example could be " +
"hdfs://$namenode_address/path/of/flink/lib");
 
+   public static final ConfigOption<List<String>> YARN_ACCESS =
+   key("yarn.access.hadoopFileSystems")

Review comment:
   Thanks for your suggestion, I will change it.
   
   > I believe it would be good to look into covering this change in our 
Kerberos end to end tests, to guarantee this is properly working.
   > 
   > You should be able to extend the end to end test defined in 
`test_yarn_job_kerberos_docker.sh` for this.
   > You will probably need to set up a secondary namenode configuration in 
flink-end-to-end-tests/test-scripts/docker-hadoop-secure-cluster/config, create 
additional keys in `bootstrap.sh` and launch a second NameNode there.
   
   Okay, I will follow your advice. Thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
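
For context, a minimal sketch of how a list-typed option like the one under 
review can be declared with Flink's ConfigOptions builder (the key is the one 
from the diff above; the description text is illustrative):

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

import java.util.List;

public class YarnAccessOptionSketch {
    // List-typed option naming extra Hadoop filesystems (e.g. additional HDFS
    // nameservices) for which delegation tokens should be fetched.
    public static final ConfigOption<List<String>> YARN_ACCESS =
            ConfigOptions.key("yarn.access.hadoopFileSystems")
                    .stringType()
                    .asList()
                    .noDefaultValue()
                    .withDescription(
                            "Extra Hadoop filesystems to access when security is "
                                    + "enabled, e.g. hdfs://nameservice2/.");
}
{code}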




[GitHub] [flink] flinkbot commented on pull request #13760: [FLINK-19627][table-runtime] Introduce multiple input operator for batch

2020-10-23 Thread GitBox


flinkbot commented on pull request #13760:
URL: https://github.com/apache/flink/pull/13760#issuecomment-715030373


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 6bbd2cf570cdda1fe9ea98b93fa290ac4d497fbd (Fri Oct 23 
07:08:36 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-11544) Add option to manually set job ID for job submissions via REST API

2020-10-23 Thread Eui Heo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219508#comment-17219508
 ] 

Eui Heo commented on FLINK-11544:
-

[~uce] May I ask you to check the my comment?

> Add option to manually set job ID for job submissions via REST API
> --
>
> Key: FLINK-11544
> URL: https://issues.apache.org/jira/browse/FLINK-11544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add an option to specify the job ID during job submissions via the REST API.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-11544) Add option to manually set job ID for job submissions via REST API

2020-10-23 Thread Eui Heo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216879#comment-17216879
 ] 

Eui Heo edited comment on FLINK-11544 at 10/23/20, 7:11 AM:


The API documentation seems lacking in explanation about this feature.

Is this an official or experimental feature?

I want to use this feature to track Flink job status with the ID provided by my 
application.

 


was (Author: elanv):
The API documentation seems lacking in explanation, is this an official or 
experimental feature?

I want to use this feature to track Flink job by the ID provided by my 
application.

 

> Add option to manually set job ID for job submissions via REST API
> --
>
> Key: FLINK-11544
> URL: https://issues.apache.org/jira/browse/FLINK-11544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add an option to specify the job ID during job submissions via the REST API.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-11544) Add option to manually set job ID for job submissions via REST API

2020-10-23 Thread Eui Heo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17216879#comment-17216879
 ] 

Eui Heo edited comment on FLINK-11544 at 10/23/20, 7:12 AM:


The API documentation seems lacking in explanation about this feature.

Is this official or experimental feature?

I want to use this feature to track Flink job status with the ID provided by my 
application.

 


was (Author: elanv):
The API documentation seems lacking in explanation about this feature.

Is this an official or experimental feature?

I want to use this feature to track Flink job status with the ID provided by my 
application.

 

> Add option to manually set job ID for job submissions via REST API
> --
>
> Key: FLINK-11544
> URL: https://issues.apache.org/jira/browse/FLINK-11544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add an option to specify the job ID during job submissions via the REST API.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19779) Remove the "record_" field name prefix for Confluent Avro format deserialization

2020-10-23 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219512#comment-17219512
 ] 

Jark Wu commented on FLINK-19779:
-

Not compatible with what? 

> Remove the "record_" field name prefix for Confluent Avro format 
> deserialization
> 
>
> Key: FLINK-19779
> URL: https://issues.apache.org/jira/browse/FLINK-19779
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0
>Reporter: Danny Chen
>Priority: Major
> Fix For: 1.12.0
>
>
> Reported by Maciej Bryński:
> The problem is that this is not compatible. I'm unable to read anything from Kafka 
> using the Confluent Registry. Example:
> I have data in Kafka with the following value schema:
> {code:java}
> {
>   "type": "record",
>   "name": "myrecord",
>   "fields": [
> {
>   "name": "f1",
>   "type": "string"
> }
>   ]
> }
> {code}
> I'm creating a table using the avro-confluent format:
> {code:sql}
> create table `test` (
>   `f1` STRING
> ) WITH (
>   'connector' = 'kafka', 
>   'topic' = 'test', 
>   'properties.bootstrap.servers' = 'localhost:9092', 
>   'properties.group.id' = 'test1234', 
>   'scan.startup.mode' = 'earliest-offset', 
>   'format' = 'avro-confluent',
>   'avro-confluent.schema-registry.url' = 'http://localhost:8081'
> );
> {code}
> When trying to select data I get the following error:
> {code:noformat}
> SELECT * FROM test;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.avro.AvroTypeException: Found myrecord, expecting record, missing 
> required field record_f1
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-11544) Add option to manually set job ID for job submissions via REST API

2020-10-23 Thread Eui Heo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219508#comment-17219508
 ] 

Eui Heo edited comment on FLINK-11544 at 10/23/20, 7:12 AM:


[~uce] May I ask you to check my comment?


was (Author: elanv):
[~uce] May I ask you to check the my comment?

> Add option to manually set job ID for job submissions via REST API
> --
>
> Key: FLINK-11544
> URL: https://issues.apache.org/jira/browse/FLINK-11544
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add an option to specify the job ID during job submissions via the REST API.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-14606) Simplify params of Execution#processFail

2020-10-23 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-14606.
---
Resolution: Won't Do

Closed because the values of {{inCallback}} and {{releasePartitions}} are not 
exactly aligned.
{{inCallback}} only needs to be true if the task is already deployed and is 
failed by the JM. However, even if a task is not deployed, {{releasePartitions}} 
still needs to be true, since the partition may have been created in external 
shuffle services.
{{fromSchedulerNG}} will be removed along with the legacy scheduler, so we do 
not need to change it right now.

> Simplify params of Execution#processFail
> 
>
> Key: FLINK-14606
> URL: https://issues.apache.org/jira/browse/FLINK-14606
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.10.0
>Reporter: Zhu Zhu
>Priority: Major
>
> The 3 params fromSchedulerNg/releasePartitions/isCallback of 
> Execution#processFail are quite a mess, while they seem to be correlated. 
> I'd propose to simplify the params of processFail by using a single 
> {{isInternalError}} to replace those 3 params. {{isInternalError}} is true 
> iff the failure is from the TM (strictly speaking, notified from SchedulerBase). 
> This also hardens the handling of cases where a task is successfully deployed 
> but the JM does not realize it (see #3 below).
> Here's why these 3 params can be simplified:
> 1. {{fromSchedulerNg}}, true iff the failure is from TM and 
> isLegacyScheduling==false.
> It's only used like this: {{if (!fromSchedulerNg && 
> !isLegacyScheduling()))}}. So it's the same to use {{!isInternalFailure}} to 
> replace it.
> 2. {{releasePartitions}}, true iff the failure is from TM.
>   Now the value is exactly the same as {{isInternalFailure}}, we can drop it 
> and use {{isInternalFailure}} instead.
> 3. {{isCallback}}, true iff the failure is from TM or the task is not 
> deployed.
> It's only used like this: {{(!isCallback && (current == RUNNING || 
> current == DEPLOYING))}}.
> So using {{!isInternalFailure}} to replace it would be enough. It is a 
> bit different for the case that a task deployment to a task manager fails, 
> which set {{isCallback}} to true previously. However, it would be safer to 
> signal a cancel call, in case the deployment is actually a success but the 
> response is lost on network.
> cc [~GJL]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
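
Schematically, the proposed simplification would have collapsed the three 
correlated flags into one (signatures abbreviated for illustration; the real 
Execution#processFail takes more parameters):

{code:java}
// Schematic before/after of the proposal discussed above; bodies elided.
class ProcessFailSketch {
    // Before: three correlated flags that every caller must keep consistent.
    void processFail(Throwable cause,
                     boolean isCallback,
                     boolean releasePartitions,
                     boolean fromSchedulerNg) { /* ... */ }

    // After: a single flag, true iff the failure was reported from the TM
    // (strictly speaking, notified via SchedulerBase).
    void processFail(Throwable cause, boolean isInternalError) { /* ... */ }
}
{code}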


[GitHub] [flink] vthinkxie commented on a change in pull request #13458: FLINK-18851 [runtime] Add checkpoint type to checkpoint history entries in Web UI

2020-10-23 Thread GitBox


vthinkxie commented on a change in pull request #13458:
URL: https://github.com/apache/flink/pull/13458#discussion_r510680612



##
File path: 
flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/detail/job-checkpoints-detail.component.ts
##
@@ -80,5 +83,25 @@ export class JobCheckpointsDetailComponent implements OnInit 
{
   this.cdr.markForCheck();
   this.refresh();
 });
+this.jobService.loadCheckpointConfig(this.jobDetail.jid).subscribe(config 
=> {
+  this.checkPointConfig = config;
+  this.cdr.markForCheck();
+});
+  }
+
+  checkPointType(){

Review comment:
   it would be better to make checkPointType a value rather than a function 
here, and then move the calculation to line 65





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-12130) Apply command line options to configuration before installing security modules

2020-10-23 Thread silence (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219515#comment-17219515
 ] 

silence commented on FLINK-12130:
-

Is there any progress on this issue recently, [~victor-wong]?

> Apply command line options to configuration before installing security modules
> --
>
> Key: FLINK-12130
> URL: https://issues.apache.org/jira/browse/FLINK-12130
> Project: Flink
>  Issue Type: Improvement
>  Components: Command Line Client
>Reporter: Victor Wong
>Assignee: Victor Wong
>Priority: Major
>
> Currently, if the user configures Kerberos credentials through the command line, 
> it won't work.
> {code:java}
> // flink run -m yarn-cluster -yD 
> security.kerberos.login.keytab=/path/to/keytab -yD 
> security.kerberos.login.principal=xxx /path/to/test.jar
> {code}
> The above command causes a security failure if you do not have a ticket cache 
> obtained with kinit.
> Maybe we could call 
> _org.apache.flink.client.cli.AbstractCustomCommandLine#applyCommandLineOptionsToConfiguration_
>   before _SecurityUtils.install(new 
> SecurityConfiguration(cli.configuration));_
> Here is a demo patch: 
> [https://github.com/jiasheng55/flink/commit/ef6880dba8a1f36849f5d1bb308405c421b29986]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
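
Schematically, the reordering suggested in the issue looks as follows (a sketch 
built from the method names cited above; the surrounding CliFrontend code is 
elided and signatures may differ across Flink versions):

{code:java}
import org.apache.commons.cli.CommandLine;
import org.apache.flink.client.cli.AbstractCustomCommandLine;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.security.SecurityConfiguration;
import org.apache.flink.runtime.security.SecurityUtils;

class SecurityInstallOrderSketch {
    // Fold -yD/-D command line options into the Configuration first, then
    // install the security modules, so that keytab/principal given on the
    // command line are visible to the Kerberos security module.
    static void installSecurity(AbstractCustomCommandLine cli, CommandLine commandLine)
            throws Exception {
        Configuration effectiveConfiguration =
                cli.applyCommandLineOptionsToConfiguration(commandLine);
        SecurityUtils.install(new SecurityConfiguration(effectiveConfiguration));
    }
}
{code}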


[jira] [Updated] (FLINK-19777) Fix NullPointException for WindowOperator.close()

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-19777:

Component/s: Table SQL / Runtime

> Fix NullPointException for WindowOperator.close()
> -
>
> Key: FLINK-19777
> URL: https://issues.apache.org/jira/browse/FLINK-19777
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
> Environment: jdk 1.8.0_262
> flink 1.11.1
>Reporter: frank wang
>Priority: Major
>
> I use Flink SQL to run a job; the SQL and metadata are:
>  meta:
> 1> source: kafka
>  create table metric_source_window_table(
> `metricName` String,
> `namespace` String,
> `timestamp` BIGINT,
> `doubleValue` DOUBLE,
> `longValue` BIGINT,
> `metricsValue` String,
> `tags` MAP<String, String>,
> `meta` MAP<String, String>,
> t as TO_TIMESTAMP(FROM_UNIXTIME(`timestamp`/1000,'yyyy-MM-dd HH:mm:ss')),
> WATERMARK FOR t AS t) WITH (
> 'connector' = 'kafka',
> 'topic' = 'ai-platform',
> 'properties.bootstrap.servers' = 'xxx',
> 'properties.group.id' = 'metricgroup',
> 'scan.startup.mode'='earliest-offset',
> 'format' = 'json',
> 'json.fail-on-missing-field' = 'false',
> 'json.ignore-parse-errors' = 'true')
> 2> sink to ClickHouse (the clickhouse-connector was developed by ourselves)
> create table flink_metric_window_table(
> `timestamp` BIGINT,
> `longValue` BIGINT,
> `metricName` String,
> `metricsValueSum` DOUBLE,
> `metricsValueMin` DOUBLE,
> `metricsValueMax` DOUBLE,
> `tag_record_id` String,
> `tag_host_ip` String,
> `tag_instance` String,
> `tag_job_name` String,
> `tag_ai_app_name` String,
> `tag_namespace` String,
> `tag_ai_type` String,
> `tag_host_name` String,
> `tag_alarm_domain` String) WITH (
> 'connector.type' = 'clickhouse',
> 'connector.property-version' = '1',
> 'connector.url' = 'jdbc:clickhouse://xxx:8123/dataeye',
> 'connector.cluster'='ck_cluster',
> 'connector.write.flush.max-rows'='6000',
> 'connector.write.flush.interval'='1000',
> 'connector.table' = 'flink_metric_table_all')
> my SQL is:
> insert into
>  hive.temp_vipflink.flink_metric_window_table
> select
>  cast(HOP_ROWTIME(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE) AS BIGINT) 
> AS `timestamps`,
>  sum(COALESCE( `longValue`, 0)) AS longValue,
>  metricName,
>  sum(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueSum,
>  min(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMin,
>  max(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMax,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain']
> from
>  hive.temp_vipflink.metric_source_window_table
>  group by 
>  metricName,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain'],
>  HOP(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE)
>  
> When I run this SQL for long hours, an exception like this appears:
> [2020-10-22 20:54:52.089] [ERROR] [GroupWindowAggregate(groupBy=[metricName, 
> $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9], window=[SlidingGroupWindow('w$, 
> t, 90, 6)], properties=[w$start, w$end, w$rowtime, w$proctime], 
> select=[metricName, $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, SUM($f11) AS 
> longValue, SUM($f12) AS metricsValueSum, MIN($f12) AS metricsValueMin, 
> MAX($f12) AS metricsValueMax, start('w$) AS w$start, end('w$) AS w$end, 
> rowtime('w$) AS w$rowtime, proctime('w$) AS w$proctime]) -> 
> Calc(select=[CAST(CAST(w$rowtime)) AS timestamps, longValue, metricName, 
> metricsValueSum, metricsValueMin, metricsValueMax, $f1 AS EXPR$6, $f2 AS 
> EXPR$7, $f3 AS EXPR$8, $f4 AS EXPR$9, $f5 AS EXPR$10, $f6 AS EXPR$11, $f7 AS 
> EXPR$12, $f8 AS EXPR$13, $f9 AS EXPR$14]) -> SinkConversionToTuple2 -> Sink: 
> JdbcUpsertTableSink(timestamp, longValue, metricName, metricsValueSum, 
> metricsValueMin, metricsValueMax, tag_record_id, tag_host_ip, tag_instance, 
> tag_job_name, tag_ai_app_name, tag_namespace, tag_ai_type, tag_host_name, 
> tag_alarm_domain) (23/44)] 
> [org.apache.flink.streaming.runtime.tasks.StreamTask] >>> Error during 
> disposal of stream operator. java.lang.NullPointerException: null at 
> org.apache.flink.table.runtime.operators.window.WindowOperator.dispose(WindowOperator.java:318)
>  ~[flink-table-blink_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:729)
>  [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:645)
>  [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
>
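
A plausible shape for a fix (a hedged sketch only; the actual patch and the real 
WindowOperator field names may differ): make dispose() tolerate fields that 
open() never got to initialize, since dispose() also runs when a task fails 
before or during open():

{code:java}
// Hypothetical, self-contained sketch of the defensive pattern.
abstract class DisposableWindowOperatorSketch {
    // Initialized in open(); still null if the task failed before open() ran.
    protected AutoCloseable windowFunction;

    public void open() throws Exception {
        windowFunction = createWindowFunction();
    }

    public void dispose() throws Exception {
        // Guard against dispose() running after a failed deployment/open().
        if (windowFunction != null) {
            windowFunction.close();
        }
    }

    protected abstract AutoCloseable createWindowFunction() throws Exception;
}
{code}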

[jira] [Updated] (FLINK-19777) Fix NullPointException for WindowOperator.close()

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-19777:

Affects Version/s: 1.11.2

> Fix NullPointException for WindowOperator.close()
> -
>
> Key: FLINK-19777
> URL: https://issues.apache.org/jira/browse/FLINK-19777
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.2
> Environment: jdk 1.8.0_262
> flink 1.11.1
>Reporter: frank wang
>Priority: Major
>
> I use Flink SQL to run a job; the SQL and metadata are:
>  meta:
> 1> source: kafka
>  create table metric_source_window_table(
> `metricName` String,
> `namespace` String,
> `timestamp` BIGINT,
> `doubleValue` DOUBLE,
> `longValue` BIGINT,
> `metricsValue` String,
> `tags` MAP<String, String>,
> `meta` MAP<String, String>,
> t as TO_TIMESTAMP(FROM_UNIXTIME(`timestamp`/1000,'yyyy-MM-dd HH:mm:ss')),
> WATERMARK FOR t AS t) WITH (
> 'connector' = 'kafka',
> 'topic' = 'ai-platform',
> 'properties.bootstrap.servers' = 'xxx',
> 'properties.group.id' = 'metricgroup',
> 'scan.startup.mode'='earliest-offset',
> 'format' = 'json',
> 'json.fail-on-missing-field' = 'false',
> 'json.ignore-parse-errors' = 'true')
> 2> sink to ClickHouse (the clickhouse-connector was developed by ourselves)
> create table flink_metric_window_table(
> `timestamp` BIGINT,
> `longValue` BIGINT,
> `metricName` String,
> `metricsValueSum` DOUBLE,
> `metricsValueMin` DOUBLE,
> `metricsValueMax` DOUBLE,
> `tag_record_id` String,
> `tag_host_ip` String,
> `tag_instance` String,
> `tag_job_name` String,
> `tag_ai_app_name` String,
> `tag_namespace` String,
> `tag_ai_type` String,
> `tag_host_name` String,
> `tag_alarm_domain` String) WITH (
> 'connector.type' = 'clickhouse',
> 'connector.property-version' = '1',
> 'connector.url' = 'jdbc:clickhouse://xxx:8123/dataeye',
> 'connector.cluster'='ck_cluster',
> 'connector.write.flush.max-rows'='6000',
> 'connector.write.flush.interval'='1000',
> 'connector.table' = 'flink_metric_table_all')
> my SQL is:
> insert into
>  hive.temp_vipflink.flink_metric_window_table
> select
>  cast(HOP_ROWTIME(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE) AS BIGINT) 
> AS `timestamps`,
>  sum(COALESCE( `longValue`, 0)) AS longValue,
>  metricName,
>  sum(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueSum,
>  min(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMin,
>  max(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMax,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain']
> from
>  hive.temp_vipflink.metric_source_window_table
>  group by 
>  metricName,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain'],
>  HOP(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE)
>  
> When I run this SQL for long hours, an exception like this appears:
> [2020-10-22 20:54:52.089] [ERROR] [GroupWindowAggregate(groupBy=[metricName, 
> $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9], window=[SlidingGroupWindow('w$, 
> t, 90, 6)], properties=[w$start, w$end, w$rowtime, w$proctime], 
> select=[metricName, $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, SUM($f11) AS 
> longValue, SUM($f12) AS metricsValueSum, MIN($f12) AS metricsValueMin, 
> MAX($f12) AS metricsValueMax, start('w$) AS w$start, end('w$) AS w$end, 
> rowtime('w$) AS w$rowtime, proctime('w$) AS w$proctime]) -> 
> Calc(select=[CAST(CAST(w$rowtime)) AS timestamps, longValue, metricName, 
> metricsValueSum, metricsValueMin, metricsValueMax, $f1 AS EXPR$6, $f2 AS 
> EXPR$7, $f3 AS EXPR$8, $f4 AS EXPR$9, $f5 AS EXPR$10, $f6 AS EXPR$11, $f7 AS 
> EXPR$12, $f8 AS EXPR$13, $f9 AS EXPR$14]) -> SinkConversionToTuple2 -> Sink: 
> JdbcUpsertTableSink(timestamp, longValue, metricName, metricsValueSum, 
> metricsValueMin, metricsValueMax, tag_record_id, tag_host_ip, tag_instance, 
> tag_job_name, tag_ai_app_name, tag_namespace, tag_ai_type, tag_host_name, 
> tag_alarm_domain) (23/44)] 
> [org.apache.flink.streaming.runtime.tasks.StreamTask] >>> Error during 
> disposal of stream operator. java.lang.NullPointerException: null at 
> org.apache.flink.table.runtime.operators.window.WindowOperator.dispose(WindowOperator.java:318)
>  ~[flink-table-blink_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:729)
>  [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:645)
>  [flink-dist_2.11-1.11-SNAPSHOT.j

[jira] [Updated] (FLINK-18758) Support debezium data type in debezium format

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-18758:

Fix Version/s: 1.12.0

> Support debezium data type in debezium format 
> --
>
> Key: FLINK-18758
> URL: https://issues.apache.org/jira/browse/FLINK-18758
> Project: Flink
>  Issue Type: Sub-task
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Leonard Xu
>Priority: Major
> Fix For: 1.12.0
>
>
> Currently the debezium-json format wraps the json format to 
> serialize/deserialize Debezium JSON data.
> But the Debezium JSON format has its own data types, which consist of a Literal 
> Type and a Semantic Type [1]; e.g. the Date type in Debezium is an integer that 
> represents the number of days since the epoch, rather than a string with a 
> 'yyyy-MM-dd' pattern.  
> {code:java}
> {   "schema":{
>   "fields":[  
>    {
> "fields":[
>    { 
>  "type":"int32",  //Literal Type
>   "optional":false,
>   "name":"io.debezium.time.Date", //semantic Type
>   "version":1,
>   "field":"order_date"
>    },
>    {  "type":"int32",
>   "optional":false,
>   "field":"quantity"
>    }
> ]
>  }
>   ]
>    },
>    "payload":{ 
>   "before":null,
>   "after":{ 
>  "order_date":16852, // Literal Value, the number of days since epoch
> "quantity":1
>   },
>   "op":"c",
>   "ts_ms":1596081813474
>    }
> } {code}
>  
> I think we need to obtain the Debezium data type from the schema information and 
> then serialize/deserialize the data in the payload.
> [1][https://debezium.io/documentation/reference/1.2/connectors/mysql.html]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
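
As a small worked example of the Literal/Semantic split described above (plain 
Java, independent of Flink): an io.debezium.time.Date arrives as an int32 count 
of days since the epoch, which maps to a calendar date like so:

{code:java}
import java.time.LocalDate;

public class DebeziumDateSketch {
    public static void main(String[] args) {
        // Literal value from the payload above: days since 1970-01-01.
        int epochDays = 16852;
        // Semantic type io.debezium.time.Date: interpret the int as a date.
        LocalDate date = LocalDate.ofEpochDay(epochDays);
        System.out.println(date); // prints 2016-02-21
    }
}
{code}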


[jira] [Assigned] (FLINK-19777) Fix NullPointException for WindowOperator.close()

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-19777:
---

Assignee: Jark Wu

> Fix NullPointException for WindowOperator.close()
> -
>
> Key: FLINK-19777
> URL: https://issues.apache.org/jira/browse/FLINK-19777
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.2
> Environment: jdk 1.8.0_262
> flink 1.11.1
>Reporter: frank wang
>Assignee: Jark Wu
>Priority: Major
> Fix For: 1.12.0, 1.11.3
>
>
> I use Flink SQL to run a job; the SQL and metadata are:
>  meta:
> 1> source: kafka
>  create table metric_source_window_table(
> `metricName` String,
> `namespace` String,
> `timestamp` BIGINT,
> `doubleValue` DOUBLE,
> `longValue` BIGINT,
> `metricsValue` String,
> `tags` MAP<String, String>,
> `meta` MAP<String, String>,
> t as TO_TIMESTAMP(FROM_UNIXTIME(`timestamp`/1000,'yyyy-MM-dd HH:mm:ss')),
> WATERMARK FOR t AS t) WITH (
> 'connector' = 'kafka',
> 'topic' = 'ai-platform',
> 'properties.bootstrap.servers' = 'xxx',
> 'properties.group.id' = 'metricgroup',
> 'scan.startup.mode'='earliest-offset',
> 'format' = 'json',
> 'json.fail-on-missing-field' = 'false',
> 'json.ignore-parse-errors' = 'true')
> 2> sink to ClickHouse (the clickhouse-connector was developed by ourselves)
> create table flink_metric_window_table(
> `timestamp` BIGINT,
> `longValue` BIGINT,
> `metricName` String,
> `metricsValueSum` DOUBLE,
> `metricsValueMin` DOUBLE,
> `metricsValueMax` DOUBLE,
> `tag_record_id` String,
> `tag_host_ip` String,
> `tag_instance` String,
> `tag_job_name` String,
> `tag_ai_app_name` String,
> `tag_namespace` String,
> `tag_ai_type` String,
> `tag_host_name` String,
> `tag_alarm_domain` String) WITH (
> 'connector.type' = 'clickhouse',
> 'connector.property-version' = '1',
> 'connector.url' = 'jdbc:clickhouse://xxx:8123/dataeye',
> 'connector.cluster'='ck_cluster',
> 'connector.write.flush.max-rows'='6000',
> 'connector.write.flush.interval'='1000',
> 'connector.table' = 'flink_metric_table_all')
> my SQL is:
> insert into
>  hive.temp_vipflink.flink_metric_window_table
> select
>  cast(HOP_ROWTIME(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE) AS BIGINT) 
> AS `timestamps`,
>  sum(COALESCE( `longValue`, 0)) AS longValue,
>  metricName,
>  sum(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueSum,
>  min(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMin,
>  max(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMax,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain']
> from
>  hive.temp_vipflink.metric_source_window_table
>  group by 
>  metricName,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain'],
>  HOP(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE)
>  
> When I run this SQL for long hours, an exception like this appears:
> [2020-10-22 20:54:52.089] [ERROR] [GroupWindowAggregate(groupBy=[metricName, 
> $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9], window=[SlidingGroupWindow('w$, 
> t, 90, 6)], properties=[w$start, w$end, w$rowtime, w$proctime], 
> select=[metricName, $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, SUM($f11) AS 
> longValue, SUM($f12) AS metricsValueSum, MIN($f12) AS metricsValueMin, 
> MAX($f12) AS metricsValueMax, start('w$) AS w$start, end('w$) AS w$end, 
> rowtime('w$) AS w$rowtime, proctime('w$) AS w$proctime]) -> 
> Calc(select=[CAST(CAST(w$rowtime)) AS timestamps, longValue, metricName, 
> metricsValueSum, metricsValueMin, metricsValueMax, $f1 AS EXPR$6, $f2 AS 
> EXPR$7, $f3 AS EXPR$8, $f4 AS EXPR$9, $f5 AS EXPR$10, $f6 AS EXPR$11, $f7 AS 
> EXPR$12, $f8 AS EXPR$13, $f9 AS EXPR$14]) -> SinkConversionToTuple2 -> Sink: 
> JdbcUpsertTableSink(timestamp, longValue, metricName, metricsValueSum, 
> metricsValueMin, metricsValueMax, tag_record_id, tag_host_ip, tag_instance, 
> tag_job_name, tag_ai_app_name, tag_namespace, tag_ai_type, tag_host_name, 
> tag_alarm_domain) (23/44)] 
> [org.apache.flink.streaming.runtime.tasks.StreamTask] >>> Error during 
> disposal of stream operator. java.lang.NullPointerException: null at 
> org.apache.flink.table.runtime.operators.window.WindowOperator.dispose(WindowOperator.java:318)
>  ~[flink-table-blink_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:729)
>  [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.

[jira] [Updated] (FLINK-19777) Fix NullPointException for WindowOperator.close()

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-19777:

Summary: Fix NullPointException for WindowOperator.close()  (was: 
java.lang.NullPointerException)

> Fix NullPointException for WindowOperator.close()
> -
>
> Key: FLINK-19777
> URL: https://issues.apache.org/jira/browse/FLINK-19777
> Project: Flink
>  Issue Type: Bug
> Environment: jdk 1.8.0_262
> flink 1.11.1
>Reporter: frank wang
>Priority: Major
>
> I use Flink SQL to run a job; the SQL and metadata are:
>  meta:
> 1> source: kafka
>  create table metric_source_window_table(
> `metricName` String,
> `namespace` String,
> `timestamp` BIGINT,
> `doubleValue` DOUBLE,
> `longValue` BIGINT,
> `metricsValue` String,
> `tags` MAP<String, String>,
> `meta` MAP<String, String>,
> t as TO_TIMESTAMP(FROM_UNIXTIME(`timestamp`/1000,'yyyy-MM-dd HH:mm:ss')),
> WATERMARK FOR t AS t) WITH (
> 'connector' = 'kafka',
> 'topic' = 'ai-platform',
> 'properties.bootstrap.servers' = 'xxx',
> 'properties.group.id' = 'metricgroup',
> 'scan.startup.mode'='earliest-offset',
> 'format' = 'json',
> 'json.fail-on-missing-field' = 'false',
> 'json.ignore-parse-errors' = 'true')
> 2> sink to ClickHouse (the clickhouse-connector was developed by ourselves)
> create table flink_metric_window_table(
> `timestamp` BIGINT,
> `longValue` BIGINT,
> `metricName` String,
> `metricsValueSum` DOUBLE,
> `metricsValueMin` DOUBLE,
> `metricsValueMax` DOUBLE,
> `tag_record_id` String,
> `tag_host_ip` String,
> `tag_instance` String,
> `tag_job_name` String,
> `tag_ai_app_name` String,
> `tag_namespace` String,
> `tag_ai_type` String,
> `tag_host_name` String,
> `tag_alarm_domain` String) WITH (
> 'connector.type' = 'clickhouse',
> 'connector.property-version' = '1',
> 'connector.url' = 'jdbc:clickhouse://xxx:8123/dataeye',
> 'connector.cluster'='ck_cluster',
> 'connector.write.flush.max-rows'='6000',
> 'connector.write.flush.interval'='1000',
> 'connector.table' = 'flink_metric_table_all')
> my SQL is:
> insert into
>  hive.temp_vipflink.flink_metric_window_table
> select
>  cast(HOP_ROWTIME(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE) AS BIGINT) 
> AS `timestamps`,
>  sum(COALESCE( `longValue`, 0)) AS longValue,
>  metricName,
>  sum(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueSum,
>  min(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMin,
>  max(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMax,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain']
> from
>  hive.temp_vipflink.metric_source_window_table
>  group by 
>  metricName,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain'],
>  HOP(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE)
>  
> When I run this SQL for long hours, an exception like this appears:
> [2020-10-22 20:54:52.089] [ERROR] [GroupWindowAggregate(groupBy=[metricName, 
> $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9], window=[SlidingGroupWindow('w$, 
> t, 90, 6)], properties=[w$start, w$end, w$rowtime, w$proctime], 
> select=[metricName, $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, SUM($f11) AS 
> longValue, SUM($f12) AS metricsValueSum, MIN($f12) AS metricsValueMin, 
> MAX($f12) AS metricsValueMax, start('w$) AS w$start, end('w$) AS w$end, 
> rowtime('w$) AS w$rowtime, proctime('w$) AS w$proctime]) -> 
> Calc(select=[CAST(CAST(w$rowtime)) AS timestamps, longValue, metricName, 
> metricsValueSum, metricsValueMin, metricsValueMax, $f1 AS EXPR$6, $f2 AS 
> EXPR$7, $f3 AS EXPR$8, $f4 AS EXPR$9, $f5 AS EXPR$10, $f6 AS EXPR$11, $f7 AS 
> EXPR$12, $f8 AS EXPR$13, $f9 AS EXPR$14]) -> SinkConversionToTuple2 -> Sink: 
> JdbcUpsertTableSink(timestamp, longValue, metricName, metricsValueSum, 
> metricsValueMin, metricsValueMax, tag_record_id, tag_host_ip, tag_instance, 
> tag_job_name, tag_ai_app_name, tag_namespace, tag_ai_type, tag_host_name, 
> tag_alarm_domain) (23/44)] 
> [org.apache.flink.streaming.runtime.tasks.StreamTask] >>> Error during 
> disposal of stream operator. java.lang.NullPointerException: null at 
> org.apache.flink.table.runtime.operators.window.WindowOperator.dispose(WindowOperator.java:318)
>  ~[flink-table-blink_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:729)
>  [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:645)
>  [flink-dist_2.11-1.11-SNAPSHOT.j

[jira] [Updated] (FLINK-19777) Fix NullPointException for WindowOperator.close()

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-19777:

Fix Version/s: 1.11.3
   1.12.0

> Fix NullPointException for WindowOperator.close()
> -
>
> Key: FLINK-19777
> URL: https://issues.apache.org/jira/browse/FLINK-19777
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.2
> Environment: jdk 1.8.0_262
> flink 1.11.1
>Reporter: frank wang
>Priority: Major
> Fix For: 1.12.0, 1.11.3
>
>
> I use Flink SQL to run a job; the SQL and metadata are:
>  meta:
> 1> source: kafka
>  create table metric_source_window_table(
> `metricName` String,
> `namespace` String,
> `timestamp` BIGINT,
> `doubleValue` DOUBLE,
> `longValue` BIGINT,
> `metricsValue` String,
> `tags` MAP<String, String>,
> `meta` MAP<String, String>,
> t as TO_TIMESTAMP(FROM_UNIXTIME(`timestamp`/1000,'yyyy-MM-dd HH:mm:ss')),
> WATERMARK FOR t AS t) WITH (
> 'connector' = 'kafka',
> 'topic' = 'ai-platform',
> 'properties.bootstrap.servers' = 'xxx',
> 'properties.group.id' = 'metricgroup',
> 'scan.startup.mode'='earliest-offset',
> 'format' = 'json',
> 'json.fail-on-missing-field' = 'false',
> 'json.ignore-parse-errors' = 'true')
> 2> sink to ClickHouse (the clickhouse-connector was developed by ourselves)
> create table flink_metric_window_table(
> `timestamp` BIGINT,
> `longValue` BIGINT,
> `metricName` String,
> `metricsValueSum` DOUBLE,
> `metricsValueMin` DOUBLE,
> `metricsValueMax` DOUBLE,
> `tag_record_id` String,
> `tag_host_ip` String,
> `tag_instance` String,
> `tag_job_name` String,
> `tag_ai_app_name` String,
> `tag_namespace` String,
> `tag_ai_type` String,
> `tag_host_name` String,
> `tag_alarm_domain` String) WITH (
> 'connector.type' = 'clickhouse',
> 'connector.property-version' = '1',
> 'connector.url' = 'jdbc:clickhouse://xxx:8123/dataeye',
> 'connector.cluster'='ck_cluster',
> 'connector.write.flush.max-rows'='6000',
> 'connector.write.flush.interval'='1000',
> 'connector.table' = 'flink_metric_table_all')
> my SQL is:
> insert into
>  hive.temp_vipflink.flink_metric_window_table
> select
>  cast(HOP_ROWTIME(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE) AS BIGINT) 
> AS `timestamps`,
>  sum(COALESCE( `longValue`, 0)) AS longValue,
>  metricName,
>  sum(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueSum,
>  min(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMin,
>  max(IF(IS_DIGIT(metricsValue), cast(metricsValue AS DOUBLE), 0)) AS 
> metricsValueMax,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain']
> from
>  hive.temp_vipflink.metric_source_window_table
>  group by 
>  metricName,
>  tags ['record_id'],
>  tags ['host_ip'],
>  tags ['instance'],
>  tags ['job_name'],
>  tags ['ai_app_name'],
>  tags ['namespace'],
>  tags ['ai_type'],
>  tags ['host_name'],
>  tags ['alarm_domain'],
>  HOP(t, INTERVAL '60' SECOND, INTERVAL '15' MINUTE)
>  
> When I run this SQL for long hours, an exception like this appears:
> [2020-10-22 20:54:52.089] [ERROR] [GroupWindowAggregate(groupBy=[metricName, 
> $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9], window=[SlidingGroupWindow('w$, 
> t, 90, 6)], properties=[w$start, w$end, w$rowtime, w$proctime], 
> select=[metricName, $f1, $f2, $f3, $f4, $f5, $f6, $f7, $f8, $f9, SUM($f11) AS 
> longValue, SUM($f12) AS metricsValueSum, MIN($f12) AS metricsValueMin, 
> MAX($f12) AS metricsValueMax, start('w$) AS w$start, end('w$) AS w$end, 
> rowtime('w$) AS w$rowtime, proctime('w$) AS w$proctime]) -> 
> Calc(select=[CAST(CAST(w$rowtime)) AS timestamps, longValue, metricName, 
> metricsValueSum, metricsValueMin, metricsValueMax, $f1 AS EXPR$6, $f2 AS 
> EXPR$7, $f3 AS EXPR$8, $f4 AS EXPR$9, $f5 AS EXPR$10, $f6 AS EXPR$11, $f7 AS 
> EXPR$12, $f8 AS EXPR$13, $f9 AS EXPR$14]) -> SinkConversionToTuple2 -> Sink: 
> JdbcUpsertTableSink(timestamp, longValue, metricName, metricsValueSum, 
> metricsValueMin, metricsValueMax, tag_record_id, tag_host_ip, tag_instance, 
> tag_job_name, tag_ai_app_name, tag_namespace, tag_ai_type, tag_host_name, 
> tag_alarm_domain) (23/44)] 
> [org.apache.flink.streaming.runtime.tasks.StreamTask] >>> Error during 
> disposal of stream operator. java.lang.NullPointerException: null at 
> org.apache.flink.table.runtime.operators.window.WindowOperator.dispose(WindowOperator.java:318)
>  ~[flink-table-blink_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:729)
>  [flink-dist_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT] at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpI

[jira] [Assigned] (FLINK-18758) Support debezium data type in debezium format

2020-10-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-18758:
---

Assignee: Jark Wu

> Support debezium data type in debezium format 
> --
>
> Key: FLINK-18758
> URL: https://issues.apache.org/jira/browse/FLINK-18758
> Project: Flink
>  Issue Type: Sub-task
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Leonard Xu
>Assignee: Jark Wu
>Priority: Major
> Fix For: 1.12.0
>
>
> Currently the debezium-json format wraps the json format to 
> serialize/deserialize debezium json data.
> But the debezium json format has its own data types, each consisting of a Literal Type 
> and a Semantic Type[1], e.g. the Date type in debezium is an integer which 
> represents the number of days since epoch rather than a string with a 
> 'yyyy-MM-dd' pattern.  
> {code:java}
> {
>   "schema": {
>     "fields": [
>       {
>         "fields": [
>           {
>             "type": "int32",                 // Literal Type
>             "optional": false,
>             "name": "io.debezium.time.Date", // Semantic Type
>             "version": 1,
>             "field": "order_date"
>           },
>           {
>             "type": "int32",
>             "optional": false,
>             "field": "quantity"
>           }
>         ]
>       }
>     ]
>   },
>   "payload": {
>     "before": null,
>     "after": {
>       "order_date": 16852, // Literal Value: the number of days since epoch
>       "quantity": 1
>     },
>     "op": "c",
>     "ts_ms": 1596081813474
>   }
> }
> {code}
>  
> I think we need to obtain the debezium data type from the schema information and 
> then serialize/deserialize the data in the payload.
> [1][https://debezium.io/documentation/reference/1.2/connectors/mysql.html]
>  
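> As a side note (not from the original report): once the schema identifies the
> field as io.debezium.time.Date, decoding the literal value is a plain
> epoch-day conversion. A minimal Java sketch:
> {code:java}
> import java.time.LocalDate;
> 
> public class DebeziumDateDemo {
>     public static void main(String[] args) {
>         // io.debezium.time.Date stores an int32 holding the number of days
>         // since the Unix epoch, so decoding is a single epoch-day lookup.
>         int literalValue = 16852;      // value from the payload above
>         LocalDate orderDate = LocalDate.ofEpochDay(literalValue);
>         System.out.println(orderDate); // prints 2016-02-21
>     }
> }
> {code}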



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wangyang0918 commented on a change in pull request #13644: [FLINK-19542][k8s]Implement LeaderElectionService and LeaderRetrievalService based on Kubernetes API

2020-10-23 Thread GitBox


wangyang0918 commented on a change in pull request #13644:
URL: https://github.com/apache/flink/pull/13644#discussion_r510123914



##
File path: 
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesLeaderElectionService.java
##
@@ -0,0 +1,219 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.highavailability;
+
+import org.apache.flink.kubernetes.kubeclient.FlinkKubeClient;
+import 
org.apache.flink.kubernetes.kubeclient.KubernetesLeaderElectionConfiguration;
+import org.apache.flink.kubernetes.kubeclient.resources.KubernetesConfigMap;
+import 
org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector;
+import org.apache.flink.kubernetes.kubeclient.resources.KubernetesWatch;
+import org.apache.flink.kubernetes.utils.KubernetesUtils;
+import org.apache.flink.runtime.leaderelection.AbstractLeaderElectionService;
+import org.apache.flink.runtime.leaderelection.LeaderContender;
+import org.apache.flink.util.function.FunctionUtils;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Optional;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.Executor;
+
+import static 
org.apache.flink.kubernetes.utils.Constants.LABEL_CONFIGMAP_TYPE_HIGH_AVAILABILITY;
+import static org.apache.flink.kubernetes.utils.Constants.LEADER_ADDRESS_KEY;
+import static 
org.apache.flink.kubernetes.utils.Constants.LEADER_SESSION_ID_KEY;
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * Leader election service for multiple JobManagers. The active JobManager is 
elected using Kubernetes.
+ * The current leader's address as well as its leader session ID is published 
via Kubernetes ConfigMap.
+ * Note that the contending lock and leader storage are using the same 
ConfigMap. And every component(e.g.
+ * ResourceManager, Dispatcher, RestEndpoint, JobManager for each job) will 
have a separate ConfigMap.
+ */
+public class KubernetesLeaderElectionService extends 
AbstractLeaderElectionService {
+
+   private final FlinkKubeClient kubeClient;
+
+   private final Executor executor;
+
+   private final String configMapName;
+
+   private final KubernetesLeaderElector leaderElector;
+
+   private KubernetesWatch kubernetesWatch;
+
+   // Labels will be used to clean up the ha related ConfigMaps.
+   private Map<String, String> configMapLabels;
+
+   KubernetesLeaderElectionService(
+   FlinkKubeClient kubeClient,
+   Executor executor,
+   KubernetesLeaderElectionConfiguration leaderConfig) {
+
+   this.kubeClient = checkNotNull(kubeClient, "Kubernetes client 
should not be null.");
+   this.executor = checkNotNull(executor, "Executor should not be 
null.");
+   this.configMapName = leaderConfig.getConfigMapName();
+   this.leaderElector = 
kubeClient.createLeaderElector(leaderConfig, new LeaderCallbackHandlerImpl());
+   this.leaderContender = null;
+   this.configMapLabels = KubernetesUtils.getConfigMapLabels(
+   leaderConfig.getClusterId(), 
LABEL_CONFIGMAP_TYPE_HIGH_AVAILABILITY);
+   }
+
+   @Override
+   public void internalStart(LeaderContender contender) {
+   CompletableFuture.runAsync(leaderElector::run, executor);
+   kubernetesWatch = kubeClient.watchConfigMaps(configMapName, new 
ConfigMapCallbackHandlerImpl());
+   }
+
+   @Override
+   public void internalStop() {
+   if (kubernetesWatch != null) {
+   kubernetesWatch.close();
+   }
+   }
+
+   @Override
+   protected void writeLeaderInformation() {
+   try {
+   kubeClient.checkAndUpdateConfigMap(
+   configMapName,
+   configMap -> {
+   if 
(leaderElector.hasLeadership(configMap)) {
+   // Get the updated ConfigMap 
with new leader information
+   

[GitHub] [flink] JingsongLi merged pull request #13716: [FLINK-19706][table-runtime] Add WARN logs when hive table partition …

2020-10-23 Thread GitBox


JingsongLi merged pull request #13716:
URL: https://github.com/apache/flink/pull/13716


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingsongLi commented on pull request #13716: [FLINK-19706][table-runtime] Add WARN logs when hive table partition …

2020-10-23 Thread GitBox


JingsongLi commented on pull request #13716:
URL: https://github.com/apache/flink/pull/13716#issuecomment-715059216


   Thanks @shouweikun , merged.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-19706) Add WARN logs when hive table partition has existed before commit in `MetastoreCommitPolicy`

2020-10-23 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-19706.

Resolution: Fixed

master (1.12): 7b04b29e182c6245298b2f032dcbbaf25fc7dbe2

> Add WARN logs when hive table partition has existed before commit in 
> `MetastoreCommitPolicy`   
> ---
>
> Key: FLINK-19706
> URL: https://issues.apache.org/jira/browse/FLINK-19706
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Connectors / Hive, Table SQL / 
> Runtime
>Reporter: Lsw_aka_laplace
>Assignee: Lsw_aka_laplace
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.12.0
>
> Attachments: image-2020-10-19-16-47-39-354.png, 
> image-2020-10-19-16-57-02-661.png, image-2020-10-19-17-00-27-255.png, 
> image-2020-10-19-17-03-21-558.png, image-2020-10-19-18-16-35-083.png
>
>
> Hi all,
>       Recently we have been working on using Hive streaming writing to 
> accelerate the data sync of our Hive-based data warehouse, and eventually we 
> made it work. 
>        For production purposes, a lot of metrics/logs/measures were added in 
> order to help us analyze runtime info or fix some unexpected problems. Among 
> those mentioned above, we found that checking for repeated partition commits is 
> the most important one. So here, we are willing to contribute 
> this back to the community.
>      If this proposal is meaningful, I am happy to introduce my design and 
> implementation.
>  
> Looking forward to ANY opinion~
>  
>  
> UPDATE 
>  
> One of our users (using our own platform to build their own Flink job) raised some 
> requests. One of the requests is that once a partition is committed, the data 
> in this partition is regarded as frozen or completed. [Committing a partition] 
> seems like a guarantee (but we all know it is hard to be a promise) in some way 
> which tells us this partition is completed. Certainly, we take a lot of 
> measures to try to achieve that [partition-commit means completed]. So if a 
> partition is committed twice or more times, for us, there must be something wrong 
> or our measures are insufficient. On the other hand, it also prompts us to do 
> something to make up for it and avoid data loss or data incompletion.  
>  
> So first of all, it is important to let us know that a certain 
> partition has been committed repeatedly, so that we can do the following things ASAP:
>    1. analyze the reason or the cause 
>    2. do some trade-off operations
>    3. improve our code/measures
>  
> — Design and Implementation--- 
> There are basically two ways, both of them have been used in prod-env
> Approach1
> Add measures in CommitPolicy to be called before partition commit
> !image-2020-10-19-16-47-39-354.png|width=576,height=235!
> //{color:#ffab00}Newly posted, see here{color}
> !image-2020-10-19-18-16-35-083.png|width=725,height=313!
>  1.1 As the pic shows, add `checkPartitionExists` and implement it in 
> sub-class
>   !image-2020-10-19-17-03-21-558.png|width=1203,height=88!
>  1.2 call checkPartitionExists before partition commit
> --- 
> Approach2
> Build a bounded cache of committed partitions and check it every time before 
> partition commit 
> (actually this cache is supposed to be an operator state)
> !image-2020-10-19-16-57-02-661.png|width=1298,height=57!
>   2.1 build a cache
> !image-2020-10-19-17-00-27-255.png|width=1235,height=116!
>   2.2 check before commit 
>  
>  
> — UPDATE —
> After discussing with [~lzljs3620320], `Repeated partition check` seems a 
> little misleading in semantics, so only some WARN logs will be added in 
> `MetastoreCommitPolicy` to make users aware of repeated commits 
>  
>  
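> A minimal sketch of the WARN-log idea (the client and method names below are
> illustrative assumptions, not the actual MetastoreCommitPolicy API):
> {code:java}
> // Illustrative only: check whether the partition already exists before
> // committing, and log a WARN so repeated commits can be investigated.
> Partition existing = metastoreClient.getPartition(tableName, partitionSpec); // hypothetical call
> if (existing != null) {
>     LOG.warn("Partition {} of table {} already exists before commit; "
>             + "it has probably been committed before.", partitionSpec, tableName);
> }
> metastoreClient.commitPartition(tableName, partitionSpec); // hypothetical call
> {code}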



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19778) Failed job reinitiated with wrong checkpoint after a ZK reconnection

2020-10-23 Thread Jiayi Liao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219520#comment-17219520
 ] 

Jiayi Liao commented on FLINK-19778:


In the JM log, the key point is why the ZK store cannot read any checkpoints from ZooKeeper. 

 
{code:java}
2020-10-23 06:17:59,635 INFO  
org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore  - 
Recovering checkpoints from ZooKeeper.
2020-10-23 06:17:59,706 INFO  
org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore  - Found 
0 checkpoints in ZooKeeper.
2020-10-23 06:17:59,706 INFO  
org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore  - Trying 
to fetch 0 checkpoints from storage.
2020-10-23 06:17:59,706 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Starting job 
e70c0d75910ce83ea50cc34ce62a241a from savepoint 
hdfs://cosmos/flink/user_10342/savepoints/5d6e4d1574aabf4fab5281f4/savepoint-bbbc5a-f381255749ac
 (allowing non restored state)
{code}
 Do you find any clues in ZooKeeper's logs? BTW, can you query the ZK data on the 
command line? 

The ZK path should be ${high-availability.zookeeper.path.root} + 
${high-availability.cluster-id} + 
${high-availability.zookeeper.path.checkpoints} + ${jobId}. 
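
For illustration, the checkpoint znodes could also be inspected programmatically 
with Curator (the connection string is a placeholder and the paths assume the 
default option values):

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

import java.util.List;

public class CheckpointZnodeCheck {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk-host:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        // Defaults: path.root = /flink, cluster-id = /default, path.checkpoints = /checkpoints
        List<String> checkpoints = client.getChildren()
                .forPath("/flink/default/checkpoints/e70c0d75910ce83ea50cc34ce62a241a");
        System.out.println(checkpoints); // one child znode per retained checkpoint
        client.close();
    }
}
{code}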

 

> Failed job reinitiated with wrong checkpoint after a ZK reconnection
> 
>
> Key: FLINK-19778
> URL: https://issues.apache.org/jira/browse/FLINK-19778
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0
>Reporter: Paul Lin
>Priority: Critical
> Attachments: jm_log
>
>
> We have a job of Flink 1.11.0 running on YARN that reached FAILED state 
> because its jobmanager lost leadership during a ZK full GC. But after the ZK 
> connection was recovered, somehow the job was reinitiated again with no 
> checkpoints found in ZK, and hence an earlier savepoint was used to restore 
> the job, which rewound the job unexpectedly.
>   
>  For details please see the jobmanager logs in the attachment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13744: [FLINK-19766][table-runtime] Introduce File streaming compaction operators

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13744:
URL: https://github.com/apache/flink/pull/13744#issuecomment-714369975


   
   ## CI report:
   
   * 4d7e601658c0dacc100c572f4fe8878c058e035f Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8156)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13759: [FLINK-19412][python] Re-layer Python Operation Make it Possible to Provide only Python implementation

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13759:
URL: https://github.com/apache/flink/pull/13759#issuecomment-715027223


   
   ## CI report:
   
   * d05c438cf0d44def860358741ca30151d8a3dd6a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8157)
 
   * 2e62d962b92ff2c18567a80dcae46fdcfe61d10d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13760: [FLINK-19627][table-runtime] Introduce multiple input operator for batch

2020-10-23 Thread GitBox


flinkbot commented on pull request #13760:
URL: https://github.com/apache/flink/pull/13760#issuecomment-715078238


   
   ## CI report:
   
   * 6bbd2cf570cdda1fe9ea98b93fa290ac4d497fbd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-19778) Failed job reinitiated with wrong checkpoint after a ZK reconnection

2020-10-23 Thread Paul Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219521#comment-17219521
 ] 

Paul Lin commented on FLINK-19778:
--

I think it's because the job turned into FAILED state, which is a global 
terminated state, so the checkpoint entries were removed from zookeeper.

And unfortunately, the job is canceled afterwards, and the application path on 
zookeeper was cleaned up, so we can't get more information from zookeeper.

> Failed job reinitiated with wrong checkpoint after a ZK reconnection
> 
>
> Key: FLINK-19778
> URL: https://issues.apache.org/jira/browse/FLINK-19778
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0
>Reporter: Paul Lin
>Priority: Critical
> Attachments: jm_log
>
>
> We have a job of Flink 1.11.0 running on YARN that reached FAILED state 
> because its jobmanager lost leadership during a ZK full GC. But after the ZK 
> connection was recovered, somehow the job was reinitiated again with no 
> checkpoints found in ZK, and hence an earlier savepoint was used to restore 
> the job, which rewound the job unexpectedly.
>   
>  For details please see the jobmanager logs in the attachment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19774) Introduce sub partition view version for approximate Failover

2020-10-23 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219524#comment-17219524
 ] 

Yuan Mei commented on FLINK-19774:
--

Places that need to be changed:

1. set the parent of the view -> invalid

2. the view is released before being set to null in the subpartition (done)

 

> Introduce sub partition view version for approximate Failover
> -
>
> Key: FLINK-19774
> URL: https://issues.apache.org/jira/browse/FLINK-19774
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Yuan Mei
>Priority: Major
>
>  
> This ticket is to solve a corner case where a downstream task continuously 
> fails multiple times, or an orphan task execution may exist for a short 
> period of time after the new execution is running (as described in the FLIP)
>  
> Here is an idea of how to cleanly and thoroughly solve this kind of problem:
>  1. We go with the simplified release-view version: only release the view before a 
> new creation (in thread 2). That is, we won't clean up the view when the downstream 
> task disconnects ({{releaseView}} would not be called from the reference copy 
> of the view) (in thread 1 or 2).
>  ** This would greatly simplify the threading model.
>  ** This won't cause any resource leak, since the view release is only to notify 
> the upstream result partition to releaseOnConsumption when all subpartitions 
> are consumed in PipelinedSubPartitionView. In our case, we do not release the 
> result partition on consumption anyway (the result partition is tracked 
> in the JobMaster, similar to the ResultPartition.blocking type).
>  2. Each view is associated with a downstream task execution version.
>  ** This makes sense because we actually have different versions of the view 
> now, corresponding to the vertex.version of the downstream task.
>  ** createView is performed only if the new version to create is greater than 
> the existing one.
>  ** If we decide to create a new view, the old view should be released.
> I think this way, we can completely disconnect the old view from the 
> subpartition. Besides that, the working handler in use would always hold the 
> freshest view reference.
>  
> Point 1 has already been addressed in FLINK-19632. This ticket is to address 
> Point 2.
> Details discussion in [https://github.com/apache/flink/pull/13648]
>  
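> A rough sketch of the version-gated view creation described in point 2 (field
> and method names are invented for illustration, not taken from the patch):
> {code:java}
> // Invented names; this only illustrates the "create only if newer" gating.
> synchronized (buffers) {
>     if (newViewVersion > currentViewVersion) {
>         if (readView != null) {
>             readView.releaseAllResources(); // disconnect the stale view first
>         }
>         readView = new PipelinedSubpartitionView(this, availabilityListener);
>         currentViewVersion = newViewVersion; // downstream execution version
>     }
>     // else: a stale (older-version) request must not replace the fresh view
> }
> {code}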



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19067) FileNotFoundException when run flink examples

2020-10-23 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219525#comment-17219525
 ] 

Robert Metzger commented on FLINK-19067:


The logs you are providing are somehow incomplete. I don't see a single log 
event relating to the job submission, or the execution graph. A valid case 
where the blob files get deleted is when the job execution has completed 
successfully. But I cannot check for that if there are no job submission 
lifecycle events in the log file.
Either you are directly requesting BLOBs from the JobManager, causing these 
error messages, or you have a custom logging configuration that is filtering 
out some of the log messages.
If you are really just trying to submit a wordcount to a cluster, and these are 
the logs you are using to debug this, then you need to first fix your logging 
setup.

I would recommend that you get a vanilla Flink distribution, submit wordcount 
there, and check out how the Flink logs normally look. You'll quickly see a 
difference to the logs you've provided here.

> FileNotFoundException when run flink examples
> -
>
> Key: FLINK-19067
> URL: https://issues.apache.org/jira/browse/FLINK-19067
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.11.1
>Reporter: JieFang.He
>Priority: Major
> Attachments: flink-jobmanager-deployer-hejiefang01.log, 
> flink-jobmanager-deployer-hejiefang02.log, 
> flink-taskmanager-deployer-hejiefang01.log, 
> flink-taskmanager-deployer-hejiefang02.log
>
>
> 1、When run examples/batch/WordCount.jar,it will fail with the exception:
> Caused by: java.io.FileNotFoundException: 
> /data2/flink/storageDir/default/blob/job_d29414828f614d5466e239be4d3889ac/blob_p-a2ebe1c5aa160595f214b4bd0f39d80e42ee2e93-f458f1c12dc023e78d25f191de1d7c4b
>  (No such file or directory)
>  at java.io.FileInputStream.open0(Native Method)
>  at java.io.FileInputStream.open(FileInputStream.java:195)
>  at java.io.FileInputStream.(FileInputStream.java:138)
>  at 
> org.apache.flink.core.fs.local.LocalDataInputStream.(LocalDataInputStream.java:50)
>  at 
> org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:143)
>  at 
> org.apache.flink.runtime.blob.FileSystemBlobStore.get(FileSystemBlobStore.java:105)
>  at 
> org.apache.flink.runtime.blob.FileSystemBlobStore.get(FileSystemBlobStore.java:87)
>  at 
> org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:501)
>  at 
> org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
>  at 
> org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] tzulitai opened a new pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


tzulitai opened a new pull request #13761:
URL: https://github.com/apache/flink/pull/13761


   ## What is the purpose of the change
   
   This PR provides a temporary solution for the following issue:
   - On restore, the timer services **_always_** attempt to read timers from 
raw keyed state streams, regardless of whether or not timers were actually 
written to raw keyed state. This is done to allow users to swap from heap-based 
timers to RocksDB timers across savepoint restores. Since there is no 
information about the previous timer configuration in savepoints, the only way 
to support the timer swapping is to always see if something was written.
   - The issue is that this implementation **assumes that there are no other 
writers to the raw keyed streams**.
   - If an `AbstractStreamOperator` implementation uses the raw keyed streams, 
on restore the timer services would assume that they wrote the checkpointed 
data and try to read it, and then the restore obviously fails (because of 
read errors).
   
   ## Solution
   
   This PR solves this by adding a flag `isUsingCustomRawKeyedState` to the 
`AbstractStreamOperator`, which by default is `false`:
   ```
   protected boolean isUsingCustomRawKeyedState() {
   return false;
   }
   ```
   
   If an `AbstractStreamOperator` implementation writes to raw keyed streams, it 
should also override this to return `true`.
   On restore, the timer services would respect this flag and skip reading from 
the raw keyed streams.
   
   Note that this works due to the fact that there could only ever be one 
writer to raw keyed streams.
   If there are multiple writers attempting to write to the raw keyed stream 
(i.e. legacy heap-based timers + some custom writing in user code), 
checkpointing would have already failed consistently, since the raw keyed 
stream API (`KeyedStateCheckpointOutputStream`) only allows calling 
`startNewKeyGroup` once for the same stream.
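   
   For illustration, a subclass that writes its own raw keyed state would opt out 
of timer restoration roughly like this (a sketch; the operator body is elided):
   ```java
   public class MyRawStateOperator extends AbstractStreamOperator<String> {
   
       @Override
       protected boolean isUsingCustomRawKeyedState() {
           // Tell the timer services on restore that the raw keyed state
           // streams were written by this operator, not by legacy timers.
           return true;
       }
   
       // snapshotState(...) writes to the raw keyed stream obtained via
       // context.getRawKeyedOperatorStateOutput() (elided).
   }
   ```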
   
   ## Brief change log
   
   - 10bfda6 introduces the `isUsingCustomRawKeyedState` flag to 
`AbstractStreamOperator`
   - 2eab38b allows passing in the flag when creating the 
`StreamOperatorStateContext`
   - c8b5f04 wires in the flag to be respected by the 
`StreamOperatorStateContext` instantiation, so that timer services do not read 
from raw keyed state if they weren't the ones who wrote to it.
   - 1ca873f Adds a test that uses an `AbstractStreamOperator` implementation 
that writes to raw keyed state and verifies that snapshotting and restoring 
works, and the new flag is being respected. If you alter the flag to be `false` 
in the test, the test would fail.
   - c415739 Add an extra notice in docs mentioning that checkpoint would fail 
if you're using RocksDB + heap timers + writing to custom raw keyed state in 
some operators.
   
   ## Verifying this change
   
   The new test 
`AbstractStreamOperatorTest#testCustomRawKeyedStateSnapshotAndRestore` should 
cover this fix.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **NO**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**YES** / no)
 - The serializers: (yes / **NO** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **NO** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (**YES** / no / 
don't know)
 - The S3 file system connector: (yes / **NO** / don't know)
   
   ## Documentation
   
   While technically this PR introduces a new feature, we deliberately do not 
mention it in the docs as raw keyed state is intended as an internal feature 
that users should not be using (unless in some very edge cases).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19741) InternalTimeServiceManager fails to restore due to corrupt reads if there are other users of raw keyed state streams

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19741:
---
Labels: pull-request-available  (was: )

> InternalTimeServiceManager fails to restore due to corrupt reads if there are 
> other users of raw keyed state streams
> 
>
> Key: FLINK-19741
> URL: https://issues.apache.org/jira/browse/FLINK-19741
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.9.3, 1.10.2, 1.11.2
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.3
>
>
> h2. *Diagnosis*
> Currently, when restoring a {{InternalTimeServiceManager}}, we always attempt 
> to read from the provided raw keyed state streams (using 
> {{InternalTimerServiceSerializationProxy}}):
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/InternalTimeServiceManagerImpl.java#L117
> This is incorrect, since we don't write with the 
> {{InternalTimerServiceSerializationProxy}} if the timers do not require 
> legacy synchronous snapshots:
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/InternalTimeServiceManagerImpl.java#L192
> (we currently only require that when users use RocksDB backend + heap timers).
> Therefore, the {{InternalTimeServiceManager}} can fail to be created on 
> restore due to corrupt reads in the case where:
> * a checkpoint was taken where {{useLegacySynchronousSnapshots}} is false 
> (hence nothing was written, and the time service manager does not use the raw 
> keyed stream)
> * the raw keyed stream is used elsewhere (e.g. in the Flink application's 
> user code)
> * on restore from the checkpoint, {{InternalTimeServiceManagerImpl.create()}} 
> attempts to read from the raw keyed stream with the 
> {{InternalTimerServiceSerializationProxy}}.
> Full error stack trace (with Flink 1.11.1):
> {code}
> 2020-10-21 13:16:51
> java.lang.Exception: Exception while creating StreamOperatorStateContext.
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:204)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:247)
>   at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:290)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:479)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:475)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:528)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.EOFException
>   at java.io.DataInputStream.readFully(DataInputStream.java:197)
>   at java.io.DataInputStream.readUTF(DataInputStream.java:609)
>   at java.io.DataInputStream.readUTF(DataInputStream.java:564)
>   at 
> org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy.read(InternalTimerServiceSerializationProxy.java:110)
>   at 
> org.apache.flink.core.io.PostVersionedIOReadableWritable.read(PostVersionedIOReadableWritable.java:76)
>   at 
> org.apache.flink.streaming.api.operators.InternalTimeServiceManager.restoreStateForKeyGroup(InternalTimeServiceManager.java:217)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.internalTimeServiceManager(StreamTaskStateInitializerImpl.java:234)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:167)
>   ... 9 more
> {code}
> h2. *Reproducing*
> - Have an application with any operator that uses and writes to raw keyed 
> state streams
> - Use heap backend + any timer factory or RocksDB backend + RocksDB timers
> - Take a savepoint or wait for a checkpoint, and trigger a restore
> h2. *Proposed Fix*
> The fix would be to also respect the {{useLegacySynchronousSnapshots}} flag 
> in:
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/InternalTimeServiceManagerImpl.java#L231



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] tzulitai opened a new pull request #13762: [backport-1.11] [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


tzulitai opened a new pull request #13762:
URL: https://github.com/apache/flink/pull/13762


   This is a backport of #13761 to `release-1.11`.
   Please look there for details of this change.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


flinkbot commented on pull request #13761:
URL: https://github.com/apache/flink/pull/13761#issuecomment-715099581


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit c415739bc73f135820267a1f59b334ad76a5123e (Fri Oct 23 
07:47:50 UTC 2020)
   
   **Warnings:**
* Documentation files were touched, but no `.zh.md` files: Update Chinese 
documentation or file Jira ticket.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13762: [backport-1.11] [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


flinkbot commented on pull request #13762:
URL: https://github.com/apache/flink/pull/13762#issuecomment-715103196


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit c1cbfaa264ba1e9921438b9709e9d1abe1d223d5 (Fri Oct 23 
07:49:51 UTC 2020)
   
   **Warnings:**
* Documentation files were touched, but no `.zh.md` files: Update Chinese 
documentation or file Jira ticket.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-19778) Failed job reinitiated with wrong checkpoint after a ZK reconnection

2020-10-23 Thread Paul Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219521#comment-17219521
 ] 

Paul Lin edited comment on FLINK-19778 at 10/23/20, 7:51 AM:
-

I think it's because the job turned into FAILED state, which is a global 
terminated state, so the checkpoint entries were removed from zookeeper (see 
logs at 2020-10-23 06:17:59).

And unfortunately, the mistakenly restarted job is canceled afterward, and the 
application path on zookeeper was cleaned up, so we can't get more information 
from zookeeper.


was (Author: paul lin):
I think it's because the job turned into FAILED state, which is a global 
terminated state, so the checkpoint entries were removed from zookeeper.

And unfortunately, the job is canceled afterwards, and the application path on 
zookeeper was cleaned up, so we can't get more information from zookeeper.

> Failed job reinitiated with wrong checkpoint after a ZK reconnection
> 
>
> Key: FLINK-19778
> URL: https://issues.apache.org/jira/browse/FLINK-19778
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.0
>Reporter: Paul Lin
>Priority: Critical
> Attachments: jm_log
>
>
> We have a job of Flink 1.11.0 running on YARN that reached FAILED state 
> because its jobmanager lost leadership during a ZK full GC. But after the ZK 
> connection was recovered, somehow the job was reinitiated again with no 
> checkpoints found in ZK, and hence an earlier savepoint was used to restore 
> the job, which rewound the job unexpectedly.
>   
>  For details please see the jobmanager logs in the attachment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13458: FLINK-18851 [runtime] Add checkpoint type to checkpoint history entries in Web UI

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13458:
URL: https://github.com/apache/flink/pull/13458#issuecomment-697041219


   
   ## CI report:
   
   * 7311b0d12d19a645391ea0359a9aa6318806363b UNKNOWN
   * 9bc87ad6b2a8cbad2f226c483c74f1fa1523b70e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8079)
 
   * 9c8c867b1ba1f9a236534718c4d4e0371b3cc5a3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8159)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] danny0405 opened a new pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…

2020-10-23 Thread GitBox


danny0405 opened a new pull request #13763:
URL: https://github.com/apache/flink/pull/13763


   …t Avro format deserialization
   
   ## What is the purpose of the change
   
   Add prefix for the field name only when the record and field have the same 
name.
   
   
   ## Brief change log
   
 - Modify `AvroSchemaConverter.convertToSchema` to only append the field prefix 
when the record and field have the same name (see the sketch after this list)
 - Modify the existing test
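   
   A minimal sketch of what the change means for the converted schema (this is my 
reading of the fix; the row type below mirrors the issue example and is not taken 
from the patch):
   ```java
   import org.apache.avro.Schema;
   import org.apache.flink.formats.avro.typeutils.AvroSchemaConverter;
   import org.apache.flink.table.api.DataTypes;
   import org.apache.flink.table.types.logical.LogicalType;
   
   public class ConverterSketch {
       public static void main(String[] args) {
           LogicalType rowType = DataTypes.ROW(
                   DataTypes.FIELD("f1", DataTypes.STRING())).getLogicalType();
           Schema schema = AvroSchemaConverter.convertToSchema(rowType);
           // With the fix, the field keeps its plain name "f1"; the "record_"
           // prefix is only appended when a record and one of its fields
           // would otherwise collide on the same name.
           System.out.println(schema.toString(true));
       }
   }
   ```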
   
   
   ## Verifying this change
   
   Added UT.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19779) Remove the "record_" field name prefix for Confluent Avro format deserialization

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19779:
---
Labels: pull-request-available  (was: )

> Remove the "record_" field name prefix for Confluent Avro format 
> deserialization
> 
>
> Key: FLINK-19779
> URL: https://issues.apache.org/jira/browse/FLINK-19779
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0
>Reporter: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Reported by Maciej Bryński :
> The problem is that this is not compatible. I'm unable to read anything from Kafka 
> using the Confluent Registry. Example:
> I have data in Kafka with following value schema:
> {code:java}
> {
>   "type": "record",
>   "name": "myrecord",
>   "fields": [
> {
>   "name": "f1",
>   "type": "string"
> }
>   ]
> }
> {code}
> I'm creating table using this avro-confluent format:
> {code:sql}
> create table `test` (
>   `f1` STRING
> ) WITH (
>   'connector' = 'kafka', 
>   'topic' = 'test', 
>   'properties.bootstrap.servers' = 'localhost:9092', 
>   'properties.group.id' = 'test1234', 
>'scan.startup.mode' = 'earliest-offset', 
>   'format' = 'avro-confluent',
>   'avro-confluent.schema-registry.url' = 'http://localhost:8081'
> );
> {code}
> When trying to select data I'm getting error:
> {code:noformat}
> SELECT * FROM test;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.avro.AvroTypeException: Found myrecord, expecting record, missing 
> required field record_f1
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13676: [FLINK-19326][cep] Allow explicitly configuring time behaviour on CEP PatternStream

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13676:
URL: https://github.com/apache/flink/pull/13676#issuecomment-711145212


   
   ## CI report:
   
   * 7c8f33c27cb6cdfa1682585c9611251d8dcad5a7 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8143)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] danny0405 commented on pull request #12919: [FLINK-16048][avro] Support read/write confluent schema registry avro…

2020-10-23 Thread GitBox


danny0405 commented on pull request #12919:
URL: https://github.com/apache/flink/pull/12919#issuecomment-715117170


   I have filed a fix https://github.com/apache/flink/pull/13763/files, could you 
help check it if possible @maver1ck :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…

2020-10-23 Thread GitBox


flinkbot commented on pull request #13763:
URL: https://github.com/apache/flink/pull/13763#issuecomment-715120169


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit b92a32d37a8eaa446f24f44e467e588cdabcd9f5 (Fri Oct 23 
07:59:21 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-19779).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13690: [FLINK-16595][YARN]support more HDFS nameServices in yarn mode when security enabled. Is…

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13690:
URL: https://github.com/apache/flink/pull/13690#issuecomment-712304240


   
   ## CI report:
   
   * 518581ab52a2d976ff344404a6828cec23c260f9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8054)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8039)
 
   * a689eb093d75ad658d56b4f1d1eb2701689f9e3b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-19779) Remove the "record_" field name prefix for Confluent Avro format deserialization

2020-10-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219539#comment-17219539
 ] 

Maciej Bryński commented on FLINK-19779:


[~jark] 
This was an answer to this comment:

https://github.com/apache/flink/pull/12919#issuecomment-714889392

> Remove the "record_" field name prefix for Confluent Avro format 
> deserialization
> 
>
> Key: FLINK-19779
> URL: https://issues.apache.org/jira/browse/FLINK-19779
> Project: Flink
>  Issue Type: Bug
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0
>Reporter: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Reported by Maciej Bryński :
> The problem is that this is not compatible. I'm unable to read anything from Kafka 
> using the Confluent Registry. Example:
> I have data in Kafka with following value schema:
> {code:java}
> {
>   "type": "record",
>   "name": "myrecord",
>   "fields": [
> {
>   "name": "f1",
>   "type": "string"
> }
>   ]
> }
> {code}
> I'm creating table using this avro-confluent format:
> {code:sql}
> create table `test` (
>   `f1` STRING
> ) WITH (
>   'connector' = 'kafka', 
>   'topic' = 'test', 
>   'properties.bootstrap.servers' = 'localhost:9092', 
>   'properties.group.id' = 'test1234', 
>'scan.startup.mode' = 'earliest-offset', 
>   'format' = 'avro-confluent',
>   'avro-confluent.schema-registry.url' = 'http://localhost:8081'
> );
> {code}
> When trying to select data I'm getting error:
> {code:noformat}
> SELECT * FROM test;
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.avro.AvroTypeException: Found myrecord, expecting record, missing 
> required field record_f1
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13756: [FLINK-19770][python][test] Changed the PythonProgramOptionTest to be an ITCase.

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13756:
URL: https://github.com/apache/flink/pull/13756#issuecomment-714895505


   
   ## CI report:
   
   * d994782b0d8d72c5a59379643d6e588a02b333ad Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8145)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13760: [FLINK-19627][table-runtime] Introduce multiple input operator for batch

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13760:
URL: https://github.com/apache/flink/pull/13760#issuecomment-715078238


   
   ## CI report:
   
   * 6bbd2cf570cdda1fe9ea98b93fa290ac4d497fbd Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8160)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


flinkbot commented on pull request #13761:
URL: https://github.com/apache/flink/pull/13761#issuecomment-715133167


   
   ## CI report:
   
   * c415739bc73f135820267a1f59b334ad76a5123e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-19780) FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc) will always return 0 when number of rows are large

2020-10-23 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-19780:
---

 Summary: FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc) will 
always return 0 when number of rows are large
 Key: FLINK-19780
 URL: https://issues.apache.org/jira/browse/FLINK-19780
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: Caizhi Weng
 Fix For: 1.12.0


Due to CALCITE-4351 {{FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc)}} 
will always return 0 when the number of rows is large.

What I would suggest is to introduce our own {{FlinkRelMdUtil#numDistinctVals}} 
to treat small and large inputs in different ways. For small inputs we use the 
more precise {{RelMdUtil#numDistinctVals}} and for large inputs we copy the 
old, approximated implementation of {{RelMdUtil#numDistinctVals}}.

This is a temporary solution. When CALCITE-4351 is fixed we should revert this 
commit.
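
A sketch of the intended dispatch (the threshold and the large-input 
approximation below are my assumptions about the workaround, not the actual 
patch):

{code:java}
import org.apache.calcite.rel.metadata.RelMdUtil;

public final class FlinkRelMdUtilSketch {
    // Assumed cutoff; the real patch may pick a different boundary.
    private static final double SMALL_INPUT_THRESHOLD = 1e15;

    public static Double numDistinctVals(Double domainSize, Double numSelected) {
        if (domainSize == null || numSelected == null) {
            return null;
        }
        if (domainSize < SMALL_INPUT_THRESHOLD) {
            // Small inputs: delegate to Calcite's precise implementation.
            return RelMdUtil.numDistinctVals(domainSize, numSelected);
        }
        // Large inputs: expected number of distinct values when drawing
        // numSelected items from a domain of domainSize values, computed in
        // log space so (1 - 1/n)^k does not collapse and yield 0.
        double n = domainSize;
        double k = numSelected;
        return n * (1.0 - Math.exp(k * Math.log1p(-1.0 / n)));
    }
}
{code}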



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] maver1ck commented on pull request #12919: [FLINK-16048][avro] Support read/write confluent schema registry avro…

2020-10-23 Thread GitBox


maver1ck commented on pull request #12919:
URL: https://github.com/apache/flink/pull/12919#issuecomment-715143420


   I will check... mvn is running



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TsReaper opened a new pull request #13764: [FLINK-19780][table-planner-blink] Introduce FlinkRelMdUtil#numDistinctVals to work around precision problem in Calcite's implementation

2020-10-23 Thread GitBox


TsReaper opened a new pull request #13764:
URL: https://github.com/apache/flink/pull/13764


   ## What is the purpose of the change
   
   Due to CALCITE-4351 `FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc)` 
will always return 0 when the number of rows is large.
   
   This PR introduces our own `FlinkRelMdUtil#numDistinctVals` to treat small 
and large inputs in different ways. For small inputs we use the more precise 
`RelMdUtil#numDistinctVals` and for large inputs we copy the old, approximated 
implementation of `RelMdUtil#numDistinctVals`.
   
   This is a temporary solution. When CALCITE-4351 is fixed we should revert 
this commit.
   
   ## Brief change log
   
- Introduce `FlinkRelMdUtil#numDistinctVals`
   
   ## Verifying this change
   
   This change is already covered by existing tests; it also adds new tests and 
can be verified by running them.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19780) FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc) will always return 0 when number of rows are large

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19780:
---
Labels: pull-request-available  (was: )

> FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc) will always return 0 
> when number of rows are large
> ---
>
> Key: FLINK-19780
> URL: https://issues.apache.org/jira/browse/FLINK-19780
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Due to CALCITE-4351, {{FlinkRelMdDistinctRowCount#getDistinctRowCount(Calc)}} 
> will always return 0 when the number of rows is large.
> What I would suggest is to introduce our own 
> {{FlinkRelMdUtil#numDistinctVals}} to treat small and large inputs in 
> different ways. For small inputs we use the more precise 
> {{RelMdUtil#numDistinctVals}}, and for large inputs we copy the old, 
> approximate implementation of {{RelMdUtil#numDistinctVals}}.
> This is a temporary solution. When CALCITE-4351 is fixed, we should revert 
> this commit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #13764: [FLINK-19780][table-planner-blink] Introduce FlinkRelMdUtil#numDistinctVals to work around precision problem in Calcite's implementation

2020-10-23 Thread GitBox


flinkbot commented on pull request #13764:
URL: https://github.com/apache/flink/pull/13764#issuecomment-715179402


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 954b1d5490d4355fde4a175b1c4023f56b872260 (Fri Oct 23 
08:32:41 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-19780).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13690: [FLINK-16595][YARN]support more HDFS nameServices in yarn mode when security enabled. Is…

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13690:
URL: https://github.com/apache/flink/pull/13690#issuecomment-712304240


   
   ## CI report:
   
   * 518581ab52a2d976ff344404a6828cec23c260f9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8054)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8039)
 
   * a689eb093d75ad658d56b4f1d1eb2701689f9e3b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8161)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13696: [FLINK-19726][table] Implement new providers for blink planner

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13696:
URL: https://github.com/apache/flink/pull/13696#issuecomment-712617122


   
   ## CI report:
   
   * d1440ea836055d51bb808033dbc2e28c11a1599a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8146)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13648:
URL: https://github.com/apache/flink/pull/13648#issuecomment-709153806


   
   ## CI report:
   
   * 1dcce3b9301d178ae2e3ab05888260b05f24b15a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8016)
 
   * d3709faf9515053ca9873382f5b0ce83c127ff2d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-19781) Upgrade commons_codec to 1.13 or newer

2020-10-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-19781:
-

 Summary: Upgrade commons_codec to 1.13 or newer
 Key: FLINK-19781
 URL: https://issues.apache.org/jira/browse/FLINK-19781
 Project: Flink
  Issue Type: Task
  Components: Table SQL / Planner
Affects Versions: 1.11.2, 1.12.0
Reporter: Till Rohrmann
 Fix For: 1.12.0, 1.11.3


A user reported a dependency vulnerability which affects {{commons_codec}} [1]. 
We should try to upgrade this dependency to 1.13 or newer.

[1] 
https://lists.apache.org/thread.html/r0dd7ff197b2e3bdd80a0326587ca3d0c22e10d1dba17c769d6da7d7a%40%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong commented on pull request #13721: [FLINK-19694][table] Support Upsert ChangelogMode for ScanTableSource

2020-10-23 Thread GitBox


wuchong commented on pull request #13721:
URL: https://github.com/apache/flink/pull/13721#issuecomment-715192030


   I have rebased the branch to resolve conflicts. I would appreciate it if you 
could have a look, @leonardBang.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-10680) Unable to set negative offsets for TumblingEventTimeWindow

2020-10-23 Thread Paul Lin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Lin closed FLINK-10680.

Resolution: Duplicate

> Unable to set negative offsets for TumblingEventTimeWindow
> --
>
> Key: FLINK-10680
> URL: https://issues.apache.org/jira/browse/FLINK-10680
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.6.1
>Reporter: Paul Lin
>Assignee: Paul Lin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following code, given in the documentation, throws an IllegalArgumentException: 
> TumblingEventTimeWindows parameters must satisfy 0 <= offset < size.
> > TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8));
> By design, the offset could be negative to fit in different time zones.
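
For readers hitting this on the affected versions, a hedged workaround sketch: 
tumbling windows start at n * size + offset, so for a one-day window an offset 
of -8 hours aligns identically to +16 hours (they are congruent modulo 24 
hours). The class name below is illustrative:

```java
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class NegativeOffsetWorkaround {
    public static void main(String[] args) {
        // Throws IllegalArgumentException on the affected versions:
        // TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8));

        // Equivalent alignment with a non-negative offset, since window starts
        // are n * 24h + offset and -8h equals +16h modulo 24h:
        TumblingEventTimeWindows assigner =
                TumblingEventTimeWindows.of(Time.days(1), Time.hours(16));
        System.out.println(assigner);
    }
}
```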



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13761:
URL: https://github.com/apache/flink/pull/13761#issuecomment-715133167


   
   ## CI report:
   
   * c415739bc73f135820267a1f59b334ad76a5123e Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8162)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13762: [backport-1.11] [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


flinkbot commented on pull request #13762:
URL: https://github.com/apache/flink/pull/13762#issuecomment-715194794


   
   ## CI report:
   
   * c1cbfaa264ba1e9921438b9709e9d1abe1d223d5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-19782) Upgrade antlr to 4.7.1 or newer

2020-10-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-19782:
-

 Summary: Upgrade antlr to 4.7.1 or newer
 Key: FLINK-19782
 URL: https://issues.apache.org/jira/browse/FLINK-19782
 Project: Flink
  Issue Type: Task
  Components: API / Python
Affects Versions: 1.11.2, 1.12.0
Reporter: Till Rohrmann
 Fix For: 1.12.0, 1.11.3


A user reported dependency vulnerabilities which affect {{antlr}} [1]. We 
should upgrade this dependency to {{4.7.1}} or newer.

[1] 
https://lists.apache.org/thread.html/r0dd7ff197b2e3bdd80a0326587ca3d0c22e10d1dba17c769d6da7d7a%40%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…

2020-10-23 Thread GitBox


flinkbot commented on pull request #13763:
URL: https://github.com/apache/flink/pull/13763#issuecomment-715195599


   
   ## CI report:
   
   * b92a32d37a8eaa446f24f44e467e588cdabcd9f5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-19783) Upgrade mesos to 1.7 or newer

2020-10-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-19783:
-

 Summary: Upgrade mesos to 1.7 or newer
 Key: FLINK-19783
 URL: https://issues.apache.org/jira/browse/FLINK-19783
 Project: Flink
  Issue Type: Task
  Components: Deployment / Mesos
Affects Versions: 1.11.2, 1.12.0
Reporter: Till Rohrmann
 Fix For: 1.12.0, 1.11.3


A user reported a dependency vulnerability which affects {{mesos}} [1]. We 
should upgrade {{mesos}} to {{1.7.0}} or newer.

[1] 
https://lists.apache.org/thread.html/r0dd7ff197b2e3bdd80a0326587ca3d0c22e10d1dba17c769d6da7d7a%40%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] twalthr commented on pull request #13694: [FLINK-19720][table-api] Introduce new Providers and parallelism API

2020-10-23 Thread GitBox


twalthr commented on pull request #13694:
URL: https://github.com/apache/flink/pull/13694#issuecomment-715197041


   Thanks for working on this issue @JingsongLi. I pushed a commit with a 
couple of suggestions. Feel free to undo whatever you don't like.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-19784) Upgrade okhttp to 3.13.0 or newer

2020-10-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-19784:
-

 Summary: Upgrade okhttp to 3.13.0 or newer
 Key: FLINK-19784
 URL: https://issues.apache.org/jira/browse/FLINK-19784
 Project: Flink
  Issue Type: Task
  Components: Runtime / Metrics
Affects Versions: 1.11.2, 1.12.0
Reporter: Till Rohrmann
 Fix For: 1.12.0, 1.11.3


A user reported a dependency vulnerability which affects {{okhttp}} [1]. We 
should upgrade this dependency to {{3.13.0}} or newer. The dependency is used 
by the Datadog reporter.

[1] 
https://lists.apache.org/thread.html/r0dd7ff197b2e3bdd80a0326587ca3d0c22e10d1dba17c769d6da7d7a%40%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-19278) Bump Scala Macros Version to 2.1.1

2020-10-23 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-19278:
--

Assignee: Robert Metzger

> Bump Scala Macros Version to 2.1.1
> --
>
> Key: FLINK-19278
> URL: https://issues.apache.org/jira/browse/FLINK-19278
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Robert Metzger
>Priority: Major
> Fix For: 1.12.0
>
>
> Scala Macros 2.1.0 does not support newer Scala versions, in particular not 
> 2.12.12.
> Scala Macros 2.1.1 seems to support virtually all Scala 2.12 minor versions 
> as well as the latest 2.11 versions (like 2.11.12).
> See here for version compatibility: 
> https://mvnrepository.com/artifact/org.scalamacros/paradise



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19785) Upgrade commons-io to 2.7 or newer

2020-10-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-19785:
-

 Summary: Upgrade commons-io to 2.7 or newer
 Key: FLINK-19785
 URL: https://issues.apache.org/jira/browse/FLINK-19785
 Project: Flink
  Issue Type: Task
  Components: Runtime / Coordination
Affects Versions: 1.11.2, 1.12.0
Reporter: Till Rohrmann
 Fix For: 1.12.0, 1.11.3


A user reported a dependency vulnerability which affects {{commons-io}} [1]. We 
should try to upgrade this dependency to {{2.7}} or newer.

[1] 
https://lists.apache.org/thread.html/r0dd7ff197b2e3bdd80a0326587ca3d0c22e10d1dba17c769d6da7d7a%40%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-18122) Kubernetes test fails with "error: timed out waiting for the condition on jobs/flink-job-cluster"

2020-10-23 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-18122:
--

Assignee: Robert Metzger

> Kubernetes test fails with "error: timed out waiting for the condition on 
> jobs/flink-job-cluster"
> -
>
> Key: FLINK-18122
> URL: https://issues.apache.org/jira/browse/FLINK-18122
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2697&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5
> {code}
> 2020-06-04T09:25:40.7205843Z service/flink-job-cluster created
> 2020-06-04T09:25:40.9661515Z job.batch/flink-job-cluster created
> 2020-06-04T09:25:41.2189123Z deployment.apps/flink-task-manager created
> 2020-06-04T10:32:32.6402983Z error: timed out waiting for the condition on 
> jobs/flink-job-cluster
> 2020-06-04T10:32:33.8057757Z error: unable to upgrade connection: container 
> not found ("flink-task-manager")
> 2020-06-04T10:32:33.8111302Z sort: cannot read: 
> '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-56335570120/out/kubernetes_wc_out*':
>  No such file or directory
> 2020-06-04T10:32:33.8124455Z FAIL WordCount: Output hash mismatch.  Got 
> d41d8cd98f00b204e9800998ecf8427e, expected e682ec6622b5e83f2eb614617d5ab2cf.
> 2020-06-04T10:32:33.8125379Z head hexdump of actual:
> 2020-06-04T10:32:33.8136133Z head: cannot open 
> '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-56335570120/out/kubernetes_wc_out*'
>  for reading: No such file or directory
> 2020-06-04T10:32:33.8344715Z Debugging failed Kubernetes test:
> 2020-06-04T10:32:33.8345469Z Currently existing Kubernetes resources
> 2020-06-04T10:32:36.4977853Z I0604 10:32:36.497383   13191 request.go:621] 
> Throttling request took 1.198606989s, request: 
> GET:https://10.1.0.4:8443/apis/rbac.authorization.k8s.io/v1?timeout=32s
> 2020-06-04T10:32:46.6975735Z I0604 10:32:46.697234   13191 request.go:621] 
> Throttling request took 4.398107353s, request: 
> GET:https://10.1.0.4:8443/apis/authorization.k8s.io/v1?timeout=32s
> 2020-06-04T10:32:57.4978637Z I0604 10:32:57.497209   13191 request.go:621] 
> Throttling request took 1.198449167s, request: 
> GET:https://10.1.0.4:8443/apis/apps/v1?timeout=32s
> 2020-06-04T10:33:07.4980104Z I0604 10:33:07.497320   13191 request.go:621] 
> Throttling request took 4.198274438s, request: 
> GET:https://10.1.0.4:8443/apis/apiextensions.k8s.io/v1?timeout=32s
> 2020-06-04T10:33:18.4976060Z I0604 10:33:18.497258   13191 request.go:621] 
> Throttling request took 1.19871495s, request: 
> GET:https://10.1.0.4:8443/apis/apps/v1?timeout=32s
> 2020-06-04T10:33:28.4979129Z I0604 10:33:28.497276   13191 request.go:621] 
> Throttling request took 4.198369672s, request: 
> GET:https://10.1.0.4:8443/apis/rbac.authorization.k8s.io/v1?timeout=32s
> 2020-06-04T10:33:30.9182069Z NAME READY   
> STATUS  RESTARTS   AGE
> 2020-06-04T10:33:30.9184099Z pod/flink-job-cluster-dtb67  0/1 
> ErrImageNeverPull   0  67m
> 2020-06-04T10:33:30.9184869Z pod/flink-task-manager-74ccc9bd9-psqwm   0/1 
> ErrImageNeverPull   0  67m
> 2020-06-04T10:33:30.9185226Z 
> 2020-06-04T10:33:30.9185926Z NAMETYPE
> CLUSTER-IP  EXTERNAL-IP   PORT(S) 
>   AGE
> 2020-06-04T10:33:30.9186832Z service/flink-job-cluster   NodePort
> 10.111.92.199   
> 6123:32501/TCP,6124:31360/TCP,6125:30025/TCP,8081:30081/TCP   67m
> 2020-06-04T10:33:30.9187545Z service/kubernetes  ClusterIP   
> 10.96.0.1   443/TCP 
>   68m
> 2020-06-04T10:33:30.9187976Z 
> 2020-06-04T10:33:30.9188472Z NAME READY   
> UP-TO-DATE   AVAILABLE   AGE
> 2020-06-04T10:33:30.9189179Z deployment.apps/flink-task-manager   0/1 1   
>  0   67m
> 2020-06-04T10:33:30.9189508Z 
> 2020-06-04T10:33:30.9189815Z NAME   
> DESIRED   CURRENT   READY   AGE
> 2020-06-04T10:33:30.9190418Z replicaset.apps/flink-task-manager-74ccc9bd9   1 
> 1 0   67m
> 2020-06-04T10:33:30.9190662Z 
> 2020-06-04T10:33:30.9190891Z NAME  COMPLETIONS   
> DURATION   AGE
> 2020-06-04T10:33:30.9191423Z job.batch/flink-job-cluster   0/1   67m  
>   67m
> 2020-06-04T10:33:33.7840921Z I0604 10:

[GitHub] [flink] gaoyunhaii commented on a change in pull request #13595: [FLINK-19582][network] Introduce sort-merge based blocking shuffle to Flink

2020-10-23 Thread GitBox


gaoyunhaii commented on a change in pull request #13595:
URL: https://github.com/apache/flink/pull/13595#discussion_r510737752



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/PartitionedFile.java
##
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition;
+
+import org.apache.flink.util.IOUtils;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.nio.channels.FileChannel;
+import java.nio.file.Path;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * {@link PartitionedFile} is the persistent file type of sort-merge blocking shuffle. Each {@link PartitionedFile}
+ * contains two files: one is the data file and the other is the index file. Both the data file and the index file
+ * have multiple regions. Each data region stores the shuffle data in subpartition index order, and the corresponding
+ * index region contains index entries of all subpartitions. Each index entry is a (long, integer) tuple of which the
+ * long value is the file offset of the corresponding subpartition and the integer value is the number of buffers.
+ */
+public class PartitionedFile {
+
+   public static final String DATA_FILE_SUFFIX = ".shuffle.data";
+
+   public static final String INDEX_FILE_SUFFIX = ".shuffle.index";
+
+   public static final ByteOrder DEFAULT_BYTE_ORDER = ByteOrder.BIG_ENDIAN;
+
+   /** Size of each index entry in the index file. 8 bytes for offset and 4 bytes for number of buffers. */
+   public static final int INDEX_ENTRY_SIZE = 8 + 4;
+
+   /** Number of data regions in this {@link PartitionedFile}. */
+   private final int numRegions;
+
+   /** Number of subpartitions of this {@link PartitionedFile}. */
+   private final int numSubpartitions;
+
+   /** Path of the data file which stores all data in this {@link PartitionedFile}. */
+   private final Path dataFilePath;
+
+   /** Path of the index file which stores indexes of all regions in this {@link PartitionedFile}. */
+   private final Path indexFilePath;
+
+   /** Used to accelerate index data access. */
+   private final ByteBuffer indexDataCache;
+
+   public PartitionedFile(
+   int numRegions,
+   int numSubpartitions,
+   Path dataFilePath,
+   Path indexFilePath,
+   ByteBuffer indexDataCache) {
+   checkArgument(numRegions >= 0, "Illegal number of data regions.");
+   checkArgument(numSubpartitions > 0, "Illegal number of subpartitions.");
+
+   this.numRegions = numRegions;
+   this.numSubpartitions = numSubpartitions;
+   this.dataFilePath = checkNotNull(dataFilePath);
+   this.indexFilePath = checkNotNull(indexFilePath);
+   this.indexDataCache = indexDataCache;
+   }
+
+   public Path getDataFilePath() {
+   return dataFilePath;
+   }
+
+   public Path getIndexFilePath() {
+   return indexFilePath;
+   }
+
+   public int getNumRegions() {
+   return numRegions;
+   }
+
+   /**
+    * Returns the index entry offset of the target region and subpartition in the index file. Both region index
+    * and subpartition index start from 0.
+*/
+   private long getIndexEntryOffset(int region, int subpartition) {

Review comment:
   I think it does not make much difference to return a long here, since we 
always cast it to an integer first when using the result. We could change the 
return type to int and use `MathUtils.checkedCastDown` in the return statement.
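
   A hedged sketch of the suggested change, assuming `MathUtils.checkedDownCast` 
from `org.apache.flink.util` is the helper meant; the offset formula is 
inferred from the javadoc (one `INDEX_ENTRY_SIZE` entry per subpartition per 
region), not taken from this PR:

```java
// Fields refer to the enclosing PartitionedFile class above.
private int getIndexEntryOffset(int region, int subpartition) {
    long offset = ((long) region * numSubpartitions + subpartition) * INDEX_ENTRY_SIZE;
    return MathUtils.checkedDownCast(offset);
}
```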





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] maver1ck commented on pull request #12919: [FLINK-16048][avro] Support read/write confluent schema registry avro…

2020-10-23 Thread GitBox


maver1ck commented on pull request #12919:
URL: https://github.com/apache/flink/pull/12919#issuecomment-715206532


   OK. It's working. I'm able to read data.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13648:
URL: https://github.com/apache/flink/pull/13648#issuecomment-709153806


   
   ## CI report:
   
   * 1dcce3b9301d178ae2e3ab05888260b05f24b15a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8016)
 
   * d3709faf9515053ca9873382f5b0ce83c127ff2d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8163)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-19068) Filter verbose pod events for KubernetesResourceManagerDriver

2020-10-23 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-19068:


Assignee: Xintong Song

> Filter verbose pod events for KubernetesResourceManagerDriver
> -
>
> Key: FLINK-19068
> URL: https://issues.apache.org/jira/browse/FLINK-19068
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>
> The status of a Kubernetes pod consists of many detailed fields. Currently, 
> Flink receives pod {{MODIFIED}} events from the {{KubernetesPodsWatcher}} on 
> every single change to these fields, many of which Flink does not care about.
> The verbose events will not affect the functionality of Flink, but will 
> pollute the logs with repeated messages, because Flink only looks into the 
> fields it is interested in, and those fields are identical.
> E.g., when a task manager is stopped due to idle timeout, Flink receives 3 
> events:
> * MODIFIED: container terminated
> * MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a 
> Kubernetes internal status change after containers are gracefully terminated
> * DELETED: Flink removes metadata of the terminated pod
> Among the 3 messages, Flink is only interested in the 1st MODIFIED message, 
> but will try to process all of them because the container status is 
> terminated.
> I propose to filter the verbose events in 
> {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}, to only process 
> the status changes Flink is interested in. This probably requires recording 
> the status of all living pods, to compare with the incoming events for 
> detecting status changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19068) Filter verbose pod events for KubernetesResourceManagerDriver

2020-10-23 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219560#comment-17219560
 ] 

Xintong Song commented on FLINK-19068:
--

Taking a closer look at this issue, we found that it is not trivial to filter 
out events in {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}. It 
requires properly maintaining the status of non-terminated pods, which is 
against the idea behind FLINK-18620 that all worker statuses are maintained in 
{{ActiveResourceManager}}.

An alternative approach is to adjust the logs: print info logs only for the 
non-duplicated events, and print the duplicated events at debug level, as 
sketched below.

I'm opening a PR with the alternative approach.
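
A minimal sketch of that logging adjustment, assuming a simple deduplication 
set; the class and method names below are illustrative, not the actual 
{{PodCallbackHandlerImpl}} code:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class TerminatedPodLogDeduplicator {

    private static final Logger LOG = LoggerFactory.getLogger(TerminatedPodLogDeduplicator.class);

    /** Pods whose termination has already been logged at info level. */
    private final Set<String> loggedTerminatedPods = ConcurrentHashMap.newKeySet();

    public void onPodTerminated(String podName) {
        if (loggedTerminatedPods.add(podName)) {
            // First termination event for this pod: worth an info log.
            LOG.info("TaskManager pod {} terminated.", podName);
        } else {
            // Duplicated MODIFIED/DELETED events for the same pod stay at debug.
            LOG.debug("Received duplicated termination event for pod {}.", podName);
        }
    }
}
```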

> Filter verbose pod events for KubernetesResourceManagerDriver
> -
>
> Key: FLINK-19068
> URL: https://issues.apache.org/jira/browse/FLINK-19068
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Xintong Song
>Priority: Major
>
> The status of a Kubernetes pod consists of many detailed fields. Currently, 
> Flink receives pod {{MODIFIED}} events from the {{KubernetesPodsWatcher}} on 
> every single change to these fields, many of which Flink does not care about.
> The verbose events will not affect the functionality of Flink, but will 
> pollute the logs with repeated messages, because Flink only looks into the 
> fields it is interested in, and those fields are identical.
> E.g., when a task manager is stopped due to idle timeout, Flink receives 3 
> events:
> * MODIFIED: container terminated
> * MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a 
> Kubernetes internal status change after containers are gracefully terminated
> * DELETED: Flink removes metadata of the terminated pod
> Among the 3 messages, Flink is only interested in the 1st MODIFIED message, 
> but will try to process all of them because the container status is 
> terminated.
> I propose to filter the verbose events in 
> {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}, to only process 
> the status changes Flink is interested in. This probably requires recording 
> the status of all living pods, to compare with the incoming events for 
> detecting status changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] xintongsong opened a new pull request #13765: [FLINK-19068][k8s] Improve log readability against duplicated pod termination events.

2020-10-23 Thread GitBox


xintongsong opened a new pull request #13765:
URL: https://github.com/apache/flink/pull/13765


   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluser with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-19068) Filter verbose pod events for KubernetesResourceManagerDriver

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-19068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-19068:
---
Labels: pull-request-available  (was: )

> Filter verbose pod events for KubernetesResourceManagerDriver
> -
>
> Key: FLINK-19068
> URL: https://issues.apache.org/jira/browse/FLINK-19068
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>  Labels: pull-request-available
>
> The status of a Kubernetes pod consists of many detailed fields. Currently, 
> Flink receives pod {{MODIFIED}} events from the {{KubernetesPodsWatcher}} on 
> every single change to these fields, many of which Flink does not care about.
> The verbose events will not affect the functionality of Flink, but will 
> pollute the logs with repeated messages, because Flink only looks into the 
> fields it is interested in, and those fields are identical.
> E.g., when a task manager is stopped due to idle timeout, Flink receives 3 
> events:
> * MODIFIED: container terminated
> * MODIFIED: {{deletionGracePeriodSeconds}} changes from 30 to 0, which is a 
> Kubernetes internal status change after containers are gracefully terminated
> * DELETED: Flink removes metadata of the terminated pod
> Among the 3 messages, Flink is only interested in the 1st MODIFIED message, 
> but will try to process all of them because the container status is 
> terminated.
> I propose to filter the verbose events in 
> {{KubernetesResourceManagerDriver.PodCallbackHandlerImpl}}, to only process 
> the status changes Flink is interested in. This probably requires recording 
> the status of all living pods, to compare with the incoming events for 
> detecting status changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #13694: [FLINK-19720][table-api] Introduce new Providers and parallelism API

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13694:
URL: https://github.com/apache/flink/pull/13694#issuecomment-712569945


   
   ## CI report:
   
   * 32dd70d47f901c7ce393852a890518b06bbb236b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=7916)
 
   * f31b6312ad11aa10b773528bda15c0ca9460c0d4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13765: [FLINK-19068][k8s] Improve log readability against duplicated pod termination events.

2020-10-23 Thread GitBox


flinkbot commented on pull request #13765:
URL: https://github.com/apache/flink/pull/13765#issuecomment-715212446


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 0cb48d5f01219de970695d458a6f206b8a21966f (Fri Oct 23 
09:12:11 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13721: [FLINK-19694][table] Support Upsert ChangelogMode for ScanTableSource

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13721:
URL: https://github.com/apache/flink/pull/13721#issuecomment-713489063


   
   ## CI report:
   
   * d5e77b25de2f0446255b6faca7ed17afc438143c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8093)
 
   * b03cd2b69573ff31561807574f0d9d9f2af803b3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] shizhengchao commented on pull request #13717: [FLINK-19723][Connector/JDBC] Solve the problem of repeated data submission in the failure retry

2020-10-23 Thread GitBox


shizhengchao commented on pull request #13717:
URL: https://github.com/apache/flink/pull/13717#issuecomment-715213638


   Hi @wuchong, could you help me review this? :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13736: [FLINK-19654][python][e2e] Reduce pyflink e2e test parallelism

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13736:
URL: https://github.com/apache/flink/pull/13736#issuecomment-714183394


   
   ## CI report:
   
   * e59eeba8f9eafdaf6874d96ffe934ae8cfe55ea3 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8147)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13762: [backport-1.11] [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13762:
URL: https://github.com/apache/flink/pull/13762#issuecomment-715194794


   
   ## CI report:
   
   * c1cbfaa264ba1e9921438b9709e9d1abe1d223d5 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8165)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13748: [FLINK-19769][streaming] Reuse StreamRecord in SourceOutputWithWatermarks#collect and BatchTimestampsAndWatermarks.TimestampsOnlyOutp

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13748:
URL: https://github.com/apache/flink/pull/13748#issuecomment-714518838


   
   ## CI report:
   
   * ab95658b962916b08339896aa8d922cadca20cc5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8115)
 
   * cccf1dc98166f248116cf05c305b97eaf0525787 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13759: [FLINK-19412][python] Re-layer Python Operation Make it Possible to Provide only Python implementation

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13759:
URL: https://github.com/apache/flink/pull/13759#issuecomment-715027223


   
   ## CI report:
   
   * d05c438cf0d44def860358741ca30151d8a3dd6a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8157)
 
   * 2e62d962b92ff2c18567a80dcae46fdcfe61d10d Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8164)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…

2020-10-23 Thread GitBox


flinkbot edited a comment on pull request #13763:
URL: https://github.com/apache/flink/pull/13763#issuecomment-715195599


   
   ## CI report:
   
   * b92a32d37a8eaa446f24f44e467e588cdabcd9f5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8166)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #13764: [FLINK-19780][table-planner-blink] Introduce FlinkRelMdUtil#numDistinctVals to work around precision problem in Calcite's implementation

2020-10-23 Thread GitBox


flinkbot commented on pull request #13764:
URL: https://github.com/apache/flink/pull/13764#issuecomment-715216405


   
   ## CI report:
   
   * 954b1d5490d4355fde4a175b1c4023f56b872260 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on a change in pull request #13595: [FLINK-19582][network] Introduce sort-merge based blocking shuffle to Flink

2020-10-23 Thread GitBox


gaoyunhaii commented on a change in pull request #13595:
URL: https://github.com/apache/flink/pull/13595#discussion_r510750829



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/SortMergeSubpartitionReader.java
##
@@ -0,0 +1,193 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition;
+
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.core.memory.MemorySegmentFactory;
+import org.apache.flink.runtime.io.network.buffer.Buffer;
+import org.apache.flink.runtime.io.network.buffer.BufferRecycler;
+import org.apache.flink.runtime.io.network.partition.ResultSubpartition.BufferAndBacklog;
+import org.apache.flink.util.ExceptionUtils;
+import org.apache.flink.util.IOUtils;
+
+import javax.annotation.Nullable;
+
+import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Queue;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * Subpartition data reader for {@link SortMergeResultPartition}.
+ */
+public class SortMergeSubpartitionReader implements ResultSubpartitionView, BufferRecycler {
+
+   private static final int NUM_READ_BUFFERS = 2;
+
+   /** Target {@link SortMergeResultPartition} to read data from. */
+   private final SortMergeResultPartition partition;
+
+   /** Listener to notify when data is available. */
+   private final BufferAvailabilityListener availabilityListener;
+
+   /** Result {@link PartitionedFile} to read. */
+   private final PartitionedFile partitionedFile;
+
+   /** Unmanaged memory used as read buffers. */
+   private final Queue<MemorySegment> readBuffers = new ArrayDeque<>();
+
+   /** Buffers read by the file reader. */
+   private final Queue<Buffer> buffersRead = new ArrayDeque<>();
+
+   /** Target subpartition to read. */
+   private final int subpartitionIndex;

Review comment:
   This variable is only used in the constructor.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] maver1ck commented on pull request #12919: [FLINK-16048][avro] Support read/write confluent schema registry avro…

2020-10-23 Thread GitBox


maver1ck commented on pull request #12919:
URL: https://github.com/apache/flink/pull/12919#issuecomment-715222930


   @danny0405 
   I think we have one more problem.
   When Flink creates the schema in the registry, nullability is not properly 
set for logical types.
   Example table:
   ```
   create table `test_logical_null` (
`string_field` STRING,
`timestamp_field` TIMESTAMP(3)
   ) WITH (
 'connector' = 'kafka', 
 'topic' = 'test-logical-null', 
 'properties.bootstrap.servers' = 'localhost:9092', 
 'properties.group.id' = 'test12345', 
  'scan.startup.mode' = 'earliest-offset', 
 'format' = 'avro-confluent', -- Must be set to 'avro-confluent' to 
configure this format.
 'avro-confluent.schema-registry.url' = 'http://localhost:8081', -- URL to 
connect to Confluent Schema Registry
 'avro-confluent.schema-registry.subject' = 'test-logical-null' -- Subject 
name to write to the Schema Registry service; required for sinks
   )
   ```
   Schema:
   ```
   {
 "type": "record",
 "name": "record",
 "fields": [
   {
 "name": "string_field",
 "type": [
   "string",
   "null"
 ]
   },
   {
 "name": "timestamp_field",
 "type": {
   "type": "long",
   "logicalType": "timestamp-millis"
 }
   }
 ]
   }
   ```
   For not null fields:
   ```
   create table `test_logical_notnull` (
`string_field` STRING NOT NULL,
`timestamp_field` TIMESTAMP(3) NOT NULL
   ) WITH (
 'connector' = 'kafka', 
 'topic' = 'test-logical-notnull', 
 'properties.bootstrap.servers' = 'localhost:9092', 
 'properties.group.id' = 'test12345', 
  'scan.startup.mode' = 'earliest-offset', 
 'format' = 'avro-confluent', -- Must be set to 'avro-confluent' to 
configure this format.
 'avro-confluent.schema-registry.url' = 'http://localhost:8081', -- URL to 
connect to Confluent Schema Registry
 'avro-confluent.schema-registry.subject' = 'test-logical-notnull-value' -- 
Subject name to write to the Schema Registry service; required for sinks
   );
   ```
   Schema
   ```
   {
 "type": "record",
 "name": "record",
 "fields": [
   {
 "name": "string_field",
 "type": "string"
   },
   {
 "name": "timestamp_field",
 "type": {
   "type": "long",
   "logicalType": "timestamp-millis"
 }
   }
 ]
   }
   ```
   As you can see, for string_field we have a proper union with null (for the 
nullable field). For timestamp_field the union is missing in both examples.
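
   For comparison, a hedged sketch using the plain Avro API (not Flink's 
converter code) of the schema the nullable `TIMESTAMP(3)` column would be 
expected to map to, i.e. with a union around the logical type:

```java
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class NullableTimestampSchema {
    public static void main(String[] args) {
        // timestamp-millis logical type on a long, wrapped in a union with null.
        Schema timestampMillis =
                LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
        Schema nullableTimestamp =
                Schema.createUnion(Schema.create(Schema.Type.NULL), timestampMillis);

        Schema record = SchemaBuilder.record("record").fields()
                .name("string_field").type().unionOf().stringType().and().nullType().endUnion().noDefault()
                .name("timestamp_field").type(nullableTimestamp).noDefault()
                .endRecord();

        System.out.println(record.toString(true));
    }
}
```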



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-6657) Auto-generate HistoryServer REST API reference docs

2020-10-23 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-6657.
-
Resolution: Won't Fix

Closing since the API is the same.

> Auto-generate HistoryServer REST API reference docs
> ---
>
> Key: FLINK-6657
> URL: https://issues.apache.org/jira/browse/FLINK-6657
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / Coordination
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> The HistoryServer documentation contains a full reference to all supported 
> REST calls.
> Given that all handlers that the HistoryServer uses are available through 
> {{WebRuntimeMonitor#getJsonArchivists}} we can easily auto-generate this 
> table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-7937) Add pagination to Flink History view

2020-10-23 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-7937:
--
Component/s: (was: Runtime / REST)
 Runtime / Web Frontend

> Add pagination to Flink History view
> 
>
> Key: FLINK-7937
> URL: https://issues.apache.org/jira/browse/FLINK-7937
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Reporter: Andrew Roberts
>Priority: Minor
>
> We have enough historical jobs that the browser chokes when trying to render 
> them all on one page. The history server should have pagination added, so 
> it's only trying to render some small subset at a time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19703) A result partition is not untracked after its producer task failed in TaskManager

2020-10-23 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219576#comment-17219576
 ] 

Chesnay Schepler commented on FLINK-19703:
--

Could you explain why this happens? I can't seem to find a code-path where this 
can happen.

> A result partition is not untracked after its producer task failed in 
> TaskManager
> -
>
> Key: FLINK-19703
> URL: https://issues.apache.org/jira/browse/FLINK-19703
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.10.2, 1.12.0, 1.11.2
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.12.0
>
>
> {{Execution#maybeReleasePartitionsAndSendCancelRpcCall(...)}} will not be 
> invoked when a task is reported as failed in the TaskManager, which results 
> in its partitions still being tracked by the job manager partition tracker. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

