[GitHub] [flink] flinkbot commented on pull request #14419: [FLINK-20650][k8s] Rename the command "native-k8s" to "run" based on the changes in docker-entrypoint.sh
flinkbot commented on pull request #14419: URL: https://github.com/apache/flink/pull/14419#issuecomment-747932005 ## CI report: * 3906f10b3bb3896e8370f5e9a8d8014c96be9076 UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14415: [FLINK-19880][formats] Fix JsonRowFormatFactory's support for the format.ignore-parse-errors property.
flinkbot edited a comment on pull request #14415: URL: https://github.com/apache/flink/pull/14415#issuecomment-747584734 ## CI report: * 91b26c741c5cc1b578123830c6981e5e3fca9767 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10997) * bd5a9d8b6b779c3a0148b93740c4da40adae1d4a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11008)
[GitHub] [flink] flinkbot edited a comment on pull request #14417: [FLINK-20668][table-planner-blink] Introduce translateToExecNode method for FlinkPhysicalRel
flinkbot edited a comment on pull request #14417: URL: https://github.com/apache/flink/pull/14417#issuecomment-747849155 ## CI report: * 67c73bdacc77bab17d60526487c13fe6693e4904 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10999) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11009)
[GitHub] [flink] flinkbot edited a comment on pull request #14416: [hotfix][k8s] Fix k8s ha service doc.
flinkbot edited a comment on pull request #14416: URL: https://github.com/apache/flink/pull/14416#issuecomment-747844139 ## CI report: * baee37d34012ad05bf5414ab9b6a81cfe19f881a Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10998) * 44ff90598896340971c0ebe1a8e89715c5bcdfdb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11007)
[jira] [Commented] (FLINK-10868) Flink's JobCluster ResourceManager doesn't use maximum-failed-containers as limit of resource acquirement
[ https://issues.apache.org/jira/browse/FLINK-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251576#comment-17251576 ] Xintong Song commented on FLINK-10868: -- [~ZhenqiuHuang], I think adding counting metrics for container failures is a good idea. Another question is how we define a *container failure*:
* In your PR, a container failure is recorded in {{ActiveResourceManager#requestNewWorker}} when the {{requestResourceFuture}} completes exceptionally.
* There are also cases where the task manager process is launched but fails immediately during initialization. In such cases, the {{requestResourceFuture}} completes successfully, but the {{onWorkerTerminated}} callback is invoked before the worker has registered.
I would suggest recording a failure for every container that is requested but fails or terminates before registering with the RM. WDYT? Apart from this question, I think we are on the same page.
> Flink's JobCluster ResourceManager doesn't use maximum-failed-containers as limit of resource acquirement
> -
>
> Key: FLINK-10868
> URL: https://issues.apache.org/jira/browse/FLINK-10868
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Mesos, Deployment / YARN
> Affects Versions: 1.6.2, 1.7.0
> Reporter: Zhenqiu Huang
> Assignee: Zhenqiu Huang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently, YarnResourceManager does not use yarn.maximum-failed-containers as a limit on resource acquirement. In the worst case, when newly started containers consistently fail, YarnResourceManager goes into an infinite resource acquirement loop without failing the job. Together with https://issues.apache.org/jira/browse/FLINK-10848, it will quickly occupy all resources of the YARN queue.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
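For illustration, a rough sketch of the suggested accounting: count a failure both when the resource request completes exceptionally and when a worker terminates before it has registered. The class and method names are hypothetical and do not mirror Flink's actual {{ActiveResourceManager}}.
{code:java}
// Hypothetical sketch of the suggested failure accounting; names are illustrative only.
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class WorkerFailureTracker<W> {
    private final AtomicLong failedContainers = new AtomicLong();
    private final Set<W> registeredWorkers = ConcurrentHashMap.newKeySet();

    void onWorkerRequested(CompletableFuture<W> requestResourceFuture) {
        // Path 1: the resource request itself fails.
        requestResourceFuture.whenComplete((worker, error) -> {
            if (error != null) {
                failedContainers.incrementAndGet();
            }
        });
    }

    void onWorkerRegistered(W worker) {
        registeredWorkers.add(worker);
    }

    void onWorkerTerminated(W worker) {
        // Path 2: the process started but died before registering with the RM.
        if (!registeredWorkers.remove(worker)) {
            failedContainers.incrementAndGet();
        }
    }

    long getNumberOfFailedContainers() {
        return failedContainers.get();
    }
}
{code}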
[GitHub] [flink] flinkbot edited a comment on pull request #13876: [FLINK-18382][docs] Translate Kafka SQL connector documentation into Chinese
flinkbot edited a comment on pull request #13876: URL: https://github.com/apache/flink/pull/13876#issuecomment-720337850 ## CI report: * cdcae6f6464fff54bd1040031039d688370d8866 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11006)
[jira] [Comment Edited] (FLINK-20671) Partition doesn't commit until the end of partition
[ https://issues.apache.org/jira/browse/FLINK-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251570#comment-17251570 ] zhuxiaoshang edited comment on FLINK-20671 at 12/18/20, 7:51 AM: - https://issues.apache.org/jira/browse/FLINK-20213 didn't solve this problem; it still exists in 1.12. I think the reason is that the bucket-activity check happens in `notifyCheckpointComplete`, which is an asynchronous callback, so `Bucket#isActive` still returns true.
was (Author: zhushang): Maybe the reason is that the bucket stays active until no more data arrives.
> Partition doesn't commit until the end of partition
> ---
>
> Key: FLINK-20671
> URL: https://issues.apache.org/jira/browse/FLINK-20671
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.12.0, 1.11.1
> Reporter: zhuxiaoshang
> Priority: Major
>
> According to the doc [https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/filesystem.html#partition-commit] the partition can be committed multiple times.
> Actually, the partition is not committed until the end of the partition. For example, with hourly partitions, the 10th partition is committed at 11 o'clock, even when `sink.partition-commit.delay` is 0s.
> In theory, if the delay is 0s the partition should be committed on every checkpoint.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
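For reference, a minimal sketch of the kind of setup under discussion, assuming the filesystem connector options described on the linked documentation page; the table name, schema and path are made up.
{code:java}
// Illustrative sketch only: an hourly-partitioned filesystem sink whose partitions
// would, in theory, be committed on every checkpoint when the delay is 0s.
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class PartitionCommitExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // partition commits piggyback on checkpoints
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        tEnv.executeSql(
            "CREATE TABLE fs_sink (" +
            "  user_id STRING," +
            "  order_amount DOUBLE," +
            "  dt STRING," +
            "  hr STRING" +
            ") PARTITIONED BY (dt, hr) WITH (" +
            "  'connector' = 'filesystem'," +
            "  'path' = '/tmp/fs_sink'," +
            "  'format' = 'parquet'," +
            "  'sink.partition-commit.trigger' = 'process-time'," +
            "  'sink.partition-commit.delay' = '0s'," +
            "  'sink.partition-commit.policy.kind' = 'success-file'" +
            ")");
    }
}
{code}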
[jira] [Updated] (FLINK-20630) [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest
[ https://issues.apache.org/jira/browse/FLINK-20630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20630: - Fix Version/s: 1.13.0 > [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest > -- > > Key: FLINK-20630 > URL: https://issues.apache.org/jira/browse/FLINK-20630 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.12.0 >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Blocker > Labels: pull-request-available > Fix For: 1.13.0, 1.12.1 > > > *Background* > When consuming from {{LATEST}}, the {{KinesisDataFetcher}} converts the shard > iterator type into an {{AT_TIMESTAMP}} to ensure all shards start from the > same position. When {{LATEST}} is used each shared would effectively start > from a different point in the time. > DynamoDB streams do not support {{AT_TIMESTAMP}} iterator type. > *Scope* > Remove shard iterator type transform for DynamoDB streams consumer. > *Reproduction Steps* > Create a simple application that consumer from {{LATEST}} using > {{FlinkDynamoDBStreamsConsumer}} > *Expected Results* > Consumer starts reading records from the head of the stream > *Actual Results* > An exception is thrown: > {code} > Caused by: > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: > 1 validation error detected: Value 'AT_TIMESTAMP' at 'shardIteratorType' > failed to satisfy constraint: Member must satisfy enum value set: > [AFTER_SEQUENCE_NUMBER, LATEST, AT_SEQUENCE_NUMBER, TRIM_HORIZON] (Service: > AmazonDynamoDBStreams; Status Code: 400; Error Code: ValidationException; > Request ID: AFQ8KCJAP74IN5MR5KD2FP0CTBVV4KQNSO5AEMVJF66Q9ASUAAJG) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1799) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.doInvoke(AmazonDynamoDBStreamsClient.java:686) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.invoke(AmazonDynamoDBStreamsClient.java:653) > at > 
org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.invoke(AmazonDynamoDBStreamsClient.java:642) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.executeGetShardIterator(AmazonDynamoDBStreamsClient.java:544) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.getShardIterator(AmazonDynamoDBStreamsClient.java:515) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.streamsadapter.AmazonDynamoDBStreamsAdapterClient.getShardIterator(AmazonDynamoDBStreamsAdapterClient.java:355) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardIterator(KinesisProxy.java:311) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardIterator(KinesisProxy.java:302) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.getShardIterator(PollingRecordPublisher.java:173) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.(
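For context on the reproduction steps above, a hedged sketch of a job that consumes a DynamoDB stream from {{LATEST}} using {{FlinkDynamoDBStreamsConsumer}}. The stream identifier and region are placeholders, and this only illustrates the scenario; it is not code from the issue.
{code:java}
// Hedged reproduction sketch: consume a DynamoDB stream starting from LATEST.
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkDynamoDBStreamsConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

public class DynamoDbStreamsLatestRepro {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
        // LATEST is what triggers the AT_TIMESTAMP transform described above.
        props.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

        env.addSource(new FlinkDynamoDBStreamsConsumer<>(
                "my-dynamodb-stream", // placeholder stream identifier
                new SimpleStringSchema(),
                props))
           .print();

        env.execute("dynamodb-streams-latest-repro");
    }
}
{code}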
[jira] [Updated] (FLINK-20629) [Kinesis][EFO] Migrate from DescribeStream to DescribeStreamSummary
[ https://issues.apache.org/jira/browse/FLINK-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20629: - Fix Version/s: 1.13.0 > [Kinesis][EFO] Migrate from DescribeStream to DescribeStreamSummary > --- > > Key: FLINK-20629 > URL: https://issues.apache.org/jira/browse/FLINK-20629 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kinesis >Affects Versions: 1.12.0 >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Minor > Labels: pull-request-available > Fix For: 1.13.0, 1.12.1 > > > *Background* > The Kinesis EFO connector invokes {{DescribeStream}} during startup to > acquire the stream ARN. This call also includes the shard information and has > a TPS of 10. A similar service exists, {{DescribeStreamSummary}} that has a > TPS of 20 and a lighter response payload size. > During startup sources with high parallelism compete to call this service (in > {{LAZY}} mode), resulting in backoff and retry. Essentially the startup time > can grow by 1s for every 10 parallelism, due to the 10 TPS. Migrating to > {{DescribeStreamSummary}} will improve startup time. > *Scope* > Migrate call to {{DescribeStream}} to use {{DescribeStreamSummary}} instead. > *Note* > I have targeted {{1.12.1}}, let me know if we should instead target {{1.13}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
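For illustration, a hedged sketch of the proposed lookup using the plain AWS SDK v1 API. The connector itself goes through its own shaded proxy classes, so this is not the actual Flink code path, just the shape of the change.
{code:java}
// Hedged sketch: fetch the stream ARN via DescribeStreamSummary (TPS 20, lighter
// payload) instead of DescribeStream (TPS 10).
import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.DescribeStreamSummaryRequest;
import com.amazonaws.services.kinesis.model.DescribeStreamSummaryResult;

public class StreamArnLookup {
    static String lookupStreamArn(String streamName) {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();
        DescribeStreamSummaryResult result = kinesis.describeStreamSummary(
                new DescribeStreamSummaryRequest().withStreamName(streamName));
        // The summary carries the ARN without the per-shard details that
        // DescribeStream would also return.
        return result.getStreamDescriptionSummary().getStreamARN();
    }
}
{code}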
[jira] [Commented] (FLINK-20476) New File Sink end-to-end test Failed
[ https://issues.apache.org/jira/browse/FLINK-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251572#comment-17251572 ] Yun Gao commented on FLINK-20476: - Yes, something probably changed around that time, but we have not found a likely cause yet... For the metric acquisition, the problem could come from three places:
# The actual execution logic. However, the log shows the task switching from RUNNING to FAILED and the job switching from RUNNING to RESTARTING normally; the metric numberOfRestarts is updated between those two actions, so that code path should have executed.
# The periodic fetching of the latest numberOfRestarts into the metric store. In this job the update interval is reduced to 2s, so as long as the akka thread of the JobMaster is not blocked, the fetch should work. The fetch is also configured with a 10min timeout while the job waits for the metrics for 15min, so if the thread were indeed blocked there should be a timeout exception; we do not see such exceptions in the log.
# The https request to fetch the metric. This is the most suspicious, since all the failed cases use the OPENSSL static provider for SSL and they all had issues fetching flink-shaded-netty-tcnative-static. However, the test sets up SSL twice, once for internal communication and once for the REST API, and both attempts build the same jar file, so as long as one of them succeeds, SSL works for both. The job logs show that SSL was set up correctly, and there are no errors about REST requests. Besides, another contradiction is that, before the metric acquisition, waiting for the dispatcher to start up is also done via the REST API, which suggests the REST API was working.
The failure also did not reproduce after that night, nor in manual tests locally or in the Azure pipeline. So the most suspected cause at the moment is some temporary problem in the environment (such as the network).
> New File Sink end-to-end test Failed
>
> Key: FLINK-20476
> URL: https://issues.apache.org/jira/browse/FLINK-20476
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Tests
> Affects Versions: 1.13.0, 1.12.1
> Reporter: Huang Xingbo
> Priority: Blocker
> Labels: test-stability
> Fix For: 1.13.0, 1.12.1
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10502&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529]
> {code}
> 2020-12-03T23:22:43.8578352Z Dec 03 23:22:43 Starting taskexecutor daemon on host fv-az586-109.
> 2020-12-03T23:22:43.8587276Z Dec 03 23:22:43 Waiting for restart to happen
> 2020-12-03T23:22:43.8587669Z Dec 03 23:22:43 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:22:48.9939434Z Dec 03 23:22:48 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:22:54.1236439Z Dec 03 23:22:54 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:22:59.2469617Z Dec 03 23:22:59 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:23:04.3730041Z Dec 03 23:23:04 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:23:09.5227739Z Dec 03 23:23:09 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:23:14.6572986Z Dec 03 23:23:14 Still waiting for restarts. Expected: 1 Current: 0
> 2020-12-03T23:23:19.7762483Z Dec 03 23:23:19 Still waiting for restarts.
> Expected: 1 Current: 0 > 2020-12-03T23:23:24.8973187Z Dec 03 23:23:24 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:23:30.0272934Z Dec 03 23:23:30 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:23:35.2332771Z Dec 03 23:23:35 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:23:40.3766421Z Dec 03 23:23:40 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:23:45.5103677Z Dec 03 23:23:45 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:23:50.6382894Z Dec 03 23:23:50 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:23:55.7908088Z Dec 03 23:23:55 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:24:00.9276393Z Dec 03 23:24:00 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:24:06.0966785Z Dec 03 23:24:06 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:24:11.2497761Z Dec 03 23:24:11 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:24:16.4118742Z Dec 03 23:24:16 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:24:21.5640591Z Dec 03 23:24:21 Still waiting for restarts. > Expected: 1 Current: 0 > 2020-12-03T23:
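As a side note, the metric acquisition discussed above goes through the JobManager's monitoring REST API. A rough sketch of that kind of poll is below; the endpoint path and the metric name are assumptions based on Flink's REST API and may not match the test script exactly.
{code:java}
// Hedged illustration of polling a job-level metric over HTTP.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RestartMetricPoller {
    static String fetchRestartMetric(String restEndpoint, String jobId) throws Exception {
        URL url = new URL(restEndpoint + "/jobs/" + jobId + "/metrics?get=numRestarts");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            return reader.readLine(); // e.g. [{"id":"numRestarts","value":"1"}]
        } finally {
            conn.disconnect();
        }
    }
}
{code}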
[jira] [Commented] (FLINK-20671) Partition doesn't commit until the end of partition
[ https://issues.apache.org/jira/browse/FLINK-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251570#comment-17251570 ] zhuxiaoshang commented on FLINK-20671: -- Maybe the reason is that the bucket stays active until no more data arrives.
> Partition doesn't commit until the end of partition
> ---
>
> Key: FLINK-20671
> URL: https://issues.apache.org/jira/browse/FLINK-20671
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.12.0, 1.11.1
> Reporter: zhuxiaoshang
> Priority: Major
>
> According to the doc [https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/filesystem.html#partition-commit] the partition can be committed multiple times.
> Actually, the partition is not committed until the end of the partition. For example, with hourly partitions, the 10th partition is committed at 11 o'clock, even when `sink.partition-commit.delay` is 0s.
> In theory, if the delay is 0s the partition should be committed on every checkpoint.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] godfreyhe commented on pull request #14417: [FLINK-20668][table-planner-blink] Introduce translateToExecNode method for FlinkPhysicalRel
godfreyhe commented on pull request #14417: URL: https://github.com/apache/flink/pull/14417#issuecomment-747924128 @flinkbot run azure
[GitHub] [flink] flinkbot edited a comment on pull request #14417: [FLINK-20668][table-planner-blink] Introduce translateToExecNode method for FlinkPhysicalRel
flinkbot edited a comment on pull request #14417: URL: https://github.com/apache/flink/pull/14417#issuecomment-747849155 ## CI report: * 67c73bdacc77bab17d60526487c13fe6693e4904 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10999)
[GitHub] [flink] flinkbot edited a comment on pull request #14415: [FLINK-19880][formats] Fix JsonRowFormatFactory's support for the format.ignore-parse-errors property.
flinkbot edited a comment on pull request #14415: URL: https://github.com/apache/flink/pull/14415#issuecomment-747584734 ## CI report: * 91b26c741c5cc1b578123830c6981e5e3fca9767 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10997) * bd5a9d8b6b779c3a0148b93740c4da40adae1d4a UNKNOWN
[GitHub] [flink] flinkbot edited a comment on pull request #14416: [hotfix][k8s] Fix k8s ha service doc.
flinkbot edited a comment on pull request #14416: URL: https://github.com/apache/flink/pull/14416#issuecomment-747844139 ## CI report: * baee37d34012ad05bf5414ab9b6a81cfe19f881a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10998) * 44ff90598896340971c0ebe1a8e89715c5bcdfdb Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11007)
[GitHub] [flink] wuchong commented on a change in pull request #14415: [FLINK-19880][formats] Fix JsonRowFormatFactory's support for the format.ignore-parse-errors property.
wuchong commented on a change in pull request #14415: URL: https://github.com/apache/flink/pull/14415#discussion_r545620784 ## File path: flink-formats/flink-json/src/test/java/org/apache/flink/formats/json/JsonRowFormatFactoryTest.java ## @@ -128,6 +152,14 @@ private void testSchemaDeserializationSchema(Map properties) { assertEquals(expected2, actual2); } + private void testSchemaDeserializationSchemaIgnoreParseErrors(Map properties) { Review comment: We don't need to extract a method if it is only invoked in one place. ## File path: flink-formats/flink-json/src/test/java/org/apache/flink/formats/json/JsonRowFormatFactoryTest.java ## @@ -91,6 +103,18 @@ public void testJsonSchema() { testJsonSchemaDeserializationSchema(properties); } + @Test + public void testJsonSchemaIgnoreParseErrors() { Review comment: I think `testSchemaIgnoreParseErrors()` is enough to cover the bug. We can't permute all possible combinations of configurations.
[GitHub] [flink] flinkbot commented on pull request #14419: [FLINK-20650][k8s] Rename the command "native-k8s" to "run" based on the changes in docker-entrypoint.sh
flinkbot commented on pull request #14419: URL: https://github.com/apache/flink/pull/14419#issuecomment-747922179 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 3906f10b3bb3896e8370f5e9a8d8014c96be9076 (Fri Dec 18 07:34:20 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] wangyang0918 opened a new pull request #14419: [FLINK-20650][k8s] Rename the command "native-k8s" to "run" based on the changes in docker-entrypoint.sh
wangyang0918 opened a new pull request #14419: URL: https://github.com/apache/flink/pull/14419 ## What is the purpose of the change The `native-k8s` command has been removed in https://github.com/apache/flink-docker/pull/49, so when the Flink client creates the JobManager deployment and TaskManager pod, the arguments need to be changed accordingly. A typical container spec looks like the following.
```
Containers:
  flink-job-manager:
    Container ID:  docker://e3d57ca68e90c2efcbccff5ddf109c3008e292636cca8b9355b7dd2f73c56ae2
    Image:         registry.cn-beijing.aliyuncs.com/streamcompute/flink:1.12.0-debug
    Image ID:      docker-pullable://registry.cn-beijing.aliyuncs.com/streamcompute/flink@sha256:dfed8ea21ae56607286688629211cb29d5aee65fca26402904590383ee968331
    Ports:         8081/TCP, 6123/TCP, 6124/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /docker-entrypoint.sh
    Args:
      run
      bash -c $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx654311424 -Xms654311424 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=654311424b -D jobmanager.memory.jvm-overhead.max=201326592b
    State:          Running
      Started:      Fri, 18 Dec 2020 15:10:00 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  1200Mi
    Requests:
      cpu:     1
      memory:  1200Mi
    Environment:
      ENABLE_BUILT_IN_PLUGINS: flink-oss-fs-hadoop-1.12.0.jar
      _POD_IP_ADDRESS:         (v1:status.podIP)
    Mounts:
      /opt/flink/conf from flink-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-t8nfg (ro)
```
## Brief change log * Rename the command "native-k8s" to "run" based on the changes in docker-entrypoint.sh ## Verifying this change * The unit tests have been updated to make them pass * All the K8s e2e tests should work without any change ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (**yes** / no / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
[jira] [Created] (FLINK-20671) Partition doesn't commit until the end of partition
zhuxiaoshang created FLINK-20671: Summary: Partition doesn't commit until the end of partition Key: FLINK-20671 URL: https://issues.apache.org/jira/browse/FLINK-20671 Project: Flink Issue Type: Bug Components: Connectors / FileSystem Affects Versions: 1.11.1, 1.12.0 Reporter: zhuxiaoshang According to the doc [https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/filesystem.html#partition-commit] the partition can be committed multiple times. Actually, the partition is not committed until the end of the partition. For example, with hourly partitions, the 10th partition is committed at 11 o'clock, even when `sink.partition-commit.delay` is 0s. In theory, if the delay is 0s the partition should be committed on every checkpoint. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tzulitai commented on a change in pull request #14406: [FLINK-20629][Kinesis] Migrate from DescribeStream to DescribeStreamSummary
tzulitai commented on a change in pull request #14406: URL: https://github.com/apache/flink/pull/14406#discussion_r545617043 ## File path: docs/dev/connectors/kinesis.zh.md ## @@ -1,5 +1,5 @@ --- -title: "Amazon AWS Kinesis Streams Connector" +title: "Amazon Kinesis Data Streams Connector" Review comment: nit: It would have been nice if these changes were a separate commit (could be a `[hotfix]`), since they are unrelated to the PR topic at hand.
[GitHub] [flink] tzulitai commented on pull request #14406: [FLINK-20629][Kinesis] Migrate from DescribeStream to DescribeStreamSummary
tzulitai commented on pull request #14406: URL: https://github.com/apache/flink/pull/14406#issuecomment-747917896 Since I don't think this changes anything in the user interface, I'm fine with merging this for 1.12.1 as well.
[jira] [Assigned] (FLINK-20630) [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest
[ https://issues.apache.org/jira/browse/FLINK-20630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai reassigned FLINK-20630: --- Assignee: Danny Cranmer > [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest > -- > > Key: FLINK-20630 > URL: https://issues.apache.org/jira/browse/FLINK-20630 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.12.0 >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Blocker > Labels: pull-request-available > Fix For: 1.12.1 > > > *Background* > When consuming from {{LATEST}}, the {{KinesisDataFetcher}} converts the shard > iterator type into an {{AT_TIMESTAMP}} to ensure all shards start from the > same position. When {{LATEST}} is used each shared would effectively start > from a different point in the time. > DynamoDB streams do not support {{AT_TIMESTAMP}} iterator type. > *Scope* > Remove shard iterator type transform for DynamoDB streams consumer. > *Reproduction Steps* > Create a simple application that consumer from {{LATEST}} using > {{FlinkDynamoDBStreamsConsumer}} > *Expected Results* > Consumer starts reading records from the head of the stream > *Actual Results* > An exception is thrown: > {code} > Caused by: > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: > 1 validation error detected: Value 'AT_TIMESTAMP' at 'shardIteratorType' > failed to satisfy constraint: Member must satisfy enum value set: > [AFTER_SEQUENCE_NUMBER, LATEST, AT_SEQUENCE_NUMBER, TRIM_HORIZON] (Service: > AmazonDynamoDBStreams; Status Code: 400; Error Code: ValidationException; > Request ID: AFQ8KCJAP74IN5MR5KD2FP0CTBVV4KQNSO5AEMVJF66Q9ASUAAJG) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1799) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.doInvoke(AmazonDynamoDBStreamsClient.java:686) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.invoke(AmazonDynamoDBStreamsClient.java:653) > at > 
org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.invoke(AmazonDynamoDBStreamsClient.java:642) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.executeGetShardIterator(AmazonDynamoDBStreamsClient.java:544) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.getShardIterator(AmazonDynamoDBStreamsClient.java:515) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.streamsadapter.AmazonDynamoDBStreamsAdapterClient.getShardIterator(AmazonDynamoDBStreamsAdapterClient.java:355) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardIterator(KinesisProxy.java:311) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardIterator(KinesisProxy.java:302) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.getShardIterator(PollingRecordPublisher.java:173) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRe
[GitHub] [flink] flinkbot edited a comment on pull request #14416: [hotfix][k8s] Fix k8s ha service doc.
flinkbot edited a comment on pull request #14416: URL: https://github.com/apache/flink/pull/14416#issuecomment-747844139 ## CI report: * baee37d34012ad05bf5414ab9b6a81cfe19f881a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10998) * 44ff90598896340971c0ebe1a8e89715c5bcdfdb UNKNOWN
[jira] [Assigned] (FLINK-20629) [Kinesis][EFO] Migrate from DescribeStream to DescribeStreamSummary
[ https://issues.apache.org/jira/browse/FLINK-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai reassigned FLINK-20629: --- Assignee: Danny Cranmer > [Kinesis][EFO] Migrate from DescribeStream to DescribeStreamSummary > --- > > Key: FLINK-20629 > URL: https://issues.apache.org/jira/browse/FLINK-20629 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kinesis >Affects Versions: 1.12.0 >Reporter: Danny Cranmer >Assignee: Danny Cranmer >Priority: Minor > Labels: pull-request-available > Fix For: 1.12.1 > > > *Background* > The Kinesis EFO connector invokes {{DescribeStream}} during startup to > acquire the stream ARN. This call also includes the shard information and has > a TPS of 10. A similar service exists, {{DescribeStreamSummary}} that has a > TPS of 20 and a lighter response payload size. > During startup sources with high parallelism compete to call this service (in > {{LAZY}} mode), resulting in backoff and retry. Essentially the startup time > can grow by 1s for every 10 parallelism, due to the 10 TPS. Migrating to > {{DescribeStreamSummary}} will improve startup time. > *Scope* > Migrate call to {{DescribeStream}} to use {{DescribeStreamSummary}} instead. > *Note* > I have targeted {{1.12.1}}, let me know if we should instead target {{1.13}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20630) [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest
[ https://issues.apache.org/jira/browse/FLINK-20630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-20630: Priority: Blocker (was: Major) > [Kinesis][DynamoDB] DynamoDB Streams Consumer fails to consume from Latest > -- > > Key: FLINK-20630 > URL: https://issues.apache.org/jira/browse/FLINK-20630 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.12.0 >Reporter: Danny Cranmer >Priority: Blocker > Labels: pull-request-available > Fix For: 1.12.1 > > > *Background* > When consuming from {{LATEST}}, the {{KinesisDataFetcher}} converts the shard > iterator type into an {{AT_TIMESTAMP}} to ensure all shards start from the > same position. When {{LATEST}} is used each shared would effectively start > from a different point in the time. > DynamoDB streams do not support {{AT_TIMESTAMP}} iterator type. > *Scope* > Remove shard iterator type transform for DynamoDB streams consumer. > *Reproduction Steps* > Create a simple application that consumer from {{LATEST}} using > {{FlinkDynamoDBStreamsConsumer}} > *Expected Results* > Consumer starts reading records from the head of the stream > *Actual Results* > An exception is thrown: > {code} > Caused by: > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: > 1 validation error detected: Value 'AT_TIMESTAMP' at 'shardIteratorType' > failed to satisfy constraint: Member must satisfy enum value set: > [AFTER_SEQUENCE_NUMBER, LATEST, AT_SEQUENCE_NUMBER, TRIM_HORIZON] (Service: > AmazonDynamoDBStreams; Status Code: 400; Error Code: ValidationException; > Request ID: AFQ8KCJAP74IN5MR5KD2FP0CTBVV4KQNSO5AEMVJF66Q9ASUAAJG) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1799) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544) > at > org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.doInvoke(AmazonDynamoDBStreamsClient.java:686) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.invoke(AmazonDynamoDBStreamsClient.java:653) > at > 
org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.invoke(AmazonDynamoDBStreamsClient.java:642) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.executeGetShardIterator(AmazonDynamoDBStreamsClient.java:544) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient.getShardIterator(AmazonDynamoDBStreamsClient.java:515) > at > org.apache.flink.kinesis.shaded.com.amazonaws.services.dynamodbv2.streamsadapter.AmazonDynamoDBStreamsAdapterClient.getShardIterator(AmazonDynamoDBStreamsAdapterClient.java:355) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardIterator(KinesisProxy.java:311) > at > org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardIterator(KinesisProxy.java:302) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.getShardIterator(PollingRecordPublisher.java:173) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.(PollingRecordPublishe
[jira] [Updated] (FLINK-20670) Support query hints for Flink SQL
[ https://issues.apache.org/jira/browse/FLINK-20670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuo Cheng updated FLINK-20670: --- Description: Flink now supports dynamic table options based on the hint framework of Calcite. Besides this, hint syntax is also very useful for query optimization, as there is no perfect planner, and query hints provide a mechanism to direct the optimizer to choose an efficient query execution plan based on specific criteria. Currently, most popular databases and big-data frameworks support query hints; for example, join hints in Oracle are used as follows:
{code:sql}
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
{code}
Flink also has several join strategies for batch jobs, i.e., Nested-Loop, Sort-Merge and Hash Join, so it will be convenient for users to produce an efficient join execution plan if Flink supports join hints. Besides joins in batch jobs, it is also possible to use join hints to support partitioned temporal table joins in the future. In a word, query hints, especially join hints, will benefit Flink users a lot.
was: Flink now supports dynamic table options based on the hint framework of Calcite. Besides this, hint syntax is also very useful for query optimization, as there is no perfect planner, and query hints provide a mechanism to direct the optimizer to choose an efficient query execution plan based on specific criteria. Currently, most popular databases and big-data frameworks support query hints; for example, join hints in Oracle are used as follows:
{code:sql}
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
{code}
Flink also has several join strategies for batch jobs, i.e., Nested-Loop, Sort-Merge and Hash Join, so it will be convenient for users to produce an efficient join execution plan if Flink supports join hints. Besides joins in batch jobs, it is also possible to use join hints to support partitioned temporal table joins in the future. In a word, query hints, especially join hints, will benefit Flink users a lot.
> Support query hints for Flink SQL
> -
>
> Key: FLINK-20670
> URL: https://issues.apache.org/jira/browse/FLINK-20670
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / API
> Reporter: Shuo Cheng
> Priority: Major
> Fix For: 1.13.0
>
> Flink now supports dynamic table options based on the hint framework of Calcite. Besides this, hint syntax is also very useful for query optimization, as there is no perfect planner, and query hints provide a mechanism to direct the optimizer to choose an efficient query execution plan based on specific criteria.
> Currently, most popular databases and big-data frameworks support query hints; for example, join hints in Oracle are used as follows:
> {code:sql}
> SELECT /*+ USE_MERGE(employees departments) */ *
> FROM employees, departments
> WHERE employees.department_id = departments.department_id;
> {code}
> Flink also has several join strategies for batch jobs, i.e., Nested-Loop, Sort-Merge and Hash Join, so it will be convenient for users to produce an efficient join execution plan if Flink supports join hints. Besides joins in batch jobs, it is also possible to use join hints to support partitioned temporal table joins in the future. In a word, query hints, especially join hints, will benefit Flink users a lot.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20670) Support query hints for Flink SQL
Shuo Cheng created FLINK-20670: -- Summary: Support query hints for Flink SQL Key: FLINK-20670 URL: https://issues.apache.org/jira/browse/FLINK-20670 Project: Flink Issue Type: New Feature Components: Table SQL / API Reporter: Shuo Cheng Fix For: 1.13.0 Flink now supports dynamic table options based on the hint framework of Calcite. Besides this, hint syntax is also very useful for query optimization, as there is no perfect planner, and query hints provide a mechanism to direct the optimizer to choose an efficient query execution plan based on specific criteria. Currently, most popular databases and big-data frameworks support query hints; for example, join hints in Oracle are used as follows:
{code:sql}
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
{code}
Flink also has several join strategies for batch jobs, i.e., Nested-Loop, Sort-Merge and Hash Join, so it will be convenient for users to produce an efficient join execution plan if Flink supports join hints. Besides joins in batch jobs, it is also possible to use join hints to support partitioned temporal table joins in the future. In a word, query hints, especially join hints, will benefit Flink users a lot. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink-docker] tzulitai commented on pull request #47: Add GPG key for 1.11.3 release
tzulitai commented on pull request #47: URL: https://github.com/apache/flink-docker/pull/47#issuecomment-747916108 +1, please make sure to rerun the tests before merging
[GitHub] [flink] fsk119 commented on a change in pull request #14371: [FLINK-20546][kafka]fix method misuse in DynamicTableFactoryTest
fsk119 commented on a change in pull request #14371: URL: https://github.com/apache/flink/pull/14371#discussion_r545611977 ## File path: flink-connectors/flink-connector-kafka/src/test/java/org/apache/flink/streaming/connectors/kafka/table/KafkaDynamicTableFactoryTest.java ## @@ -585,7 +585,6 @@ public void testPrimaryKeyValidation() { "I;UA;UB;D")); // pk can be defined on cdc table, should pass createTableSink(pkSchema, options1); - createTableSink(pkSchema, options1); Review comment: It should be `createTableSource(...)`.
[GitHub] [flink-web] tzulitai commented on pull request #399: Add Apache Flink release 1.11.3
tzulitai commented on pull request #399: URL: https://github.com/apache/flink-web/pull/399#issuecomment-747915017 +1, LGTM
[GitHub] [flink] cxiiiiiii commented on pull request #14415: [FLINK-19880][formats] Fix JsonRowFormatFactory's support for the format.ignore-parse-errors property.
cxiii commented on pull request #14415: URL: https://github.com/apache/flink/pull/14415#issuecomment-747912845 @wuchong Hi Jark, could you help review this PR now that the CI has passed? Thanks!
[GitHub] [flink] flinkbot edited a comment on pull request #14412: [FLINK-20639][python] Use list to optimize the Row used by Python UDAF intermediate results
flinkbot edited a comment on pull request #14412: URL: https://github.com/apache/flink/pull/14412#issuecomment-747440830 ## CI report: * 0fdc8def8ceaee2359cbabf9b93b3c50ff4f0348 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10996)
[GitHub] [flink] wanglijie95 commented on a change in pull request #14416: [hotfix][k8s] Fix k8s ha service doc.
wanglijie95 commented on a change in pull request #14416: URL: https://github.com/apache/flink/pull/14416#discussion_r545607574 ## File path: flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesHaServices.java ## @@ -42,7 +42,7 @@ import static org.apache.flink.util.Preconditions.checkNotNull; /** - * An implementation of the {@link AbstractHaServices} using Apache ZooKeeper. + * An implementation of the {@link AbstractHaServices} based on kubernetes API. Review comment: Thanks for the review. I have updated it according to your suggestion.
[GitHub] [flink] flinkbot edited a comment on pull request #13876: [FLINK-18382][docs] Translate Kafka SQL connector documentation into Chinese
flinkbot edited a comment on pull request #13876: URL: https://github.com/apache/flink/pull/13876#issuecomment-720337850 ## CI report: * a6e72cdc72f44909872f7324c62833b7726a44a9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8769) * cdcae6f6464fff54bd1040031039d688370d8866 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11006) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] shuiqiangchen commented on a change in pull request #14401: [FLINK-20621][python] Refactor the TypeInformation implementation in Python DataStream API.
shuiqiangchen commented on a change in pull request #14401: URL: https://github.com/apache/flink/pull/14401#discussion_r545606417 ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -431,45 +430,47 @@ def from_internal_type(self, obj): return tuple(values) -class TupleTypeInfo(WrapperTypeInfo): +class TupleTypeInfo(TypeInformation): """ TypeInformation for Tuple. """ def __init__(self, types: List[TypeInformation]): self.types = types +super(TupleTypeInfo, self).__init__() + +def get_field_types(self) -> List[TypeInformation]: +return self.types + +def get_java_type_info(self) -> JavaObject: j_types_array = get_gateway().new_array( - get_gateway().jvm.org.apache.flink.api.common.typeinfo.TypeInformation, len(types)) + get_gateway().jvm.org.apache.flink.api.common.typeinfo.TypeInformation, len(self.types)) -for i in range(len(types)): -field_type = types[i] -if isinstance(field_type, WrapperTypeInfo): +for i in range(len(self.types)): +field_type = self.types[i] +if isinstance(field_type, TypeInformation): j_types_array[i] = field_type.get_java_type_info() -j_typeinfo = get_gateway().jvm \ +self._j_typeinfo = get_gateway().jvm \ .org.apache.flink.api.java.typeutils.TupleTypeInfo(j_types_array) -super(TupleTypeInfo, self).__init__(j_typeinfo=j_typeinfo) - -def get_field_types(self) -> List[TypeInformation]: -return self.types +return self._j_typeinfo def __eq__(self, other) -> bool: -return self._j_typeinfo.equals(other._j_typeinfo) - -def __hash__(self) -> int: -return self._j_typeinfo.hashCode() +if isinstance(other, TupleTypeInfo): +return self.types == other.types +return False def __str__(self) -> str: -return "TupleTypeInfo(%s)" % ', '.join([field_type.__str__() for field_type in self.types]) +return "TupleTypeInfo(%s)" % ', '.join([str(field_type) for field_type in self.types]) -class DateTypeInfo(WrapperTypeInfo): +class DateTypeInformation(TypeInformation): Review comment: Sorry, I changed it mistakenly. I'll rename it back. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-18983) Job doesn't changed to failed if close function has blocked
[ https://issues.apache.org/jira/browse/FLINK-18983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251540#comment-17251540 ] Yuan Mei edited comment on FLINK-18983 at 12/18/20, 6:40 AM: - Hey, [~liuyufei]. You are absolutely right: Having failed tasks and task cancelation with inconsistent behavior is misleading. And a valid argument (as you mentioned in the description) is to have a similar watchdog to timeout the process of task failure as well. However, after looking through the code a bit, we think the change is more than just simply adding a watchdog when handing invokeException in `StreamTask#invoke`. Instead, we probably need to design a way to properly unify * `Task#cancelOrFailAndCancelInvokableInternal` where cancelation is issued outside from user invokable, and * invokeException handing in `StreamTask#invoke` (user invokable) and as you can see, the change itself looks non-trivial. And In the future, we do have plans to refactor `Task` and `StreamTask`, so we do want to avoid hacky fixes right now. So, based on your last comment, do you think my previous suggestion to "add timeout logic or bounded retry numbers in the close function" can solve your problem in short term? Please let me know! Thanks! was (Author: ym): Hey, [~liuyufei]. You are absolutely right: Having failed tasks and task cancelation with inconsistent behavior is misleading. And a valid argument (as you mentioned in the description) is to have a similar watchdog to timeout the process of task failure as well. However, after looking through the code a bit, we think the change is more than just simply adding a watchdog when handing invokeException in `StreamTask#invoke`. Instead, we probably need to design a way to properly unify * `Task#cancelOrFailAndCancelInvokableInternal` where cancelation is issued outside from user invokable, and * invokeException handing in `StreamTask#invoke` (user invokable) and as you can see, the change itself looks non-trivial. And In the future, we do have plans to refactor `Task` and `StreamTask`, so we do want to avoid hacky fixes right now. So, based on your last comment, do you think my previous suggestion to "add timeout logic or bounded retry numbers in the close function" can solve your problem in short term? Please let me know! Thanks! > Job doesn't changed to failed if close function has blocked > --- > > Key: FLINK-18983 > URL: https://issues.apache.org/jira/browse/FLINK-18983 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.11.0, 1.12.0 >Reporter: YufeiLiu >Priority: Major > > If a operator throw a exception, it will break process loop and dispose all > operator. But state will never switch to FAILED if block in Function.close, > and JobMaster can't know the final state and do restart. > Task have {{TaskCancelerWatchDog}} to kill process if cancellation timeout, > but it doesn't work for FAILED task.TAskThread will allways hang at: > org.apache.flink.streaming.runtime.tasks.StreamTask#cleanUpInvoke > Test case: > {code:java} > Configuration configuration = new Configuration(); > configuration.set(TaskManagerOptions.TASK_CANCELLATION_TIMEOUT, 1L); > StreamExecutionEnvironment env = > StreamExecutionEnvironment.createLocalEnvironment(2, configuration); > env.addSource(...) 
> .process(new ProcessFunction() { > @Override > public void processElement(String value, Context ctx, > Collector out) throws Exception { > if (getRuntimeContext().getIndexOfThisSubtask() == 0) { > throw new RuntimeException(); > } > } > @Override > public void close() throws Exception { > if (getRuntimeContext().getIndexOfThisSubtask() == 0) { > Thread.sleep(1000); > } > } > }).setParallelism(2) > .print(); > {code} > In this case, job will block at close action and never change to FAILED. > If change thread which subtaskIndex == 1 to sleep, TM will exit after > TASK_CANCELLATION_TIMEOUT. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-18983) Job doesn't changed to failed if close function has blocked
[ https://issues.apache.org/jira/browse/FLINK-18983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251540#comment-17251540 ] Yuan Mei edited comment on FLINK-18983 at 12/18/20, 6:40 AM: - Hey, [~liuyufei]. You are absolutely right: having failed tasks and task cancelation with inconsistent behavior is misleading. And a valid argument (as you mentioned in the description) is to have a similar watchdog to timeout the process of task failure as well. However, after looking through the code a bit, we think the change is more than just simply adding a watchdog when handing invokeException in `StreamTask#invoke`. Instead, we probably need to design a way to properly unify * `Task#cancelOrFailAndCancelInvokableInternal` where cancelation is issued outside from user invokable, and * invokeException handing in `StreamTask#invoke` (user invokable) and as you can see, the change itself looks non-trivial. And In the future, we do have plans to refactor `Task` and `StreamTask`, so we do want to avoid hacky fixes right now. So, based on your last comment, do you think my previous suggestion to "add timeout logic or bounded retry numbers in the close function" can solve your problem in short term? Please let me know! Thanks! was (Author: ym): Hey, [~liuyufei]. You are absolutely right: Having failed tasks and task cancelation with inconsistent behavior is misleading. And a valid argument (as you mentioned in the description) is to have a similar watchdog to timeout the process of task failure as well. However, after looking through the code a bit, we think the change is more than just simply adding a watchdog when handing invokeException in `StreamTask#invoke`. Instead, we probably need to design a way to properly unify * `Task#cancelOrFailAndCancelInvokableInternal` where cancelation is issued outside from user invokable, and * invokeException handing in `StreamTask#invoke` (user invokable) and as you can see, the change itself looks non-trivial. And In the future, we do have plans to refactor `Task` and `StreamTask`, so we do want to avoid hacky fixes right now. So, based on your last comment, do you think my previous suggestion to "add timeout logic or bounded retry numbers in the close function" can solve your problem in short term? Please let me know! Thanks! > Job doesn't changed to failed if close function has blocked > --- > > Key: FLINK-18983 > URL: https://issues.apache.org/jira/browse/FLINK-18983 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.11.0, 1.12.0 >Reporter: YufeiLiu >Priority: Major > > If a operator throw a exception, it will break process loop and dispose all > operator. But state will never switch to FAILED if block in Function.close, > and JobMaster can't know the final state and do restart. > Task have {{TaskCancelerWatchDog}} to kill process if cancellation timeout, > but it doesn't work for FAILED task.TAskThread will allways hang at: > org.apache.flink.streaming.runtime.tasks.StreamTask#cleanUpInvoke > Test case: > {code:java} > Configuration configuration = new Configuration(); > configuration.set(TaskManagerOptions.TASK_CANCELLATION_TIMEOUT, 1L); > StreamExecutionEnvironment env = > StreamExecutionEnvironment.createLocalEnvironment(2, configuration); > env.addSource(...) 
> .process(new ProcessFunction() { > @Override > public void processElement(String value, Context ctx, > Collector out) throws Exception { > if (getRuntimeContext().getIndexOfThisSubtask() == 0) { > throw new RuntimeException(); > } > } > @Override > public void close() throws Exception { > if (getRuntimeContext().getIndexOfThisSubtask() == 0) { > Thread.sleep(1000); > } > } > }).setParallelism(2) > .print(); > {code} > In this case, job will block at close action and never change to FAILED. > If change thread which subtaskIndex == 1 to sleep, TM will exit after > TASK_CANCELLATION_TIMEOUT. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18983) Job doesn't changed to failed if close function has blocked
[ https://issues.apache.org/jira/browse/FLINK-18983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251540#comment-17251540 ] Yuan Mei commented on FLINK-18983: -- Hey, [~liuyufei]. You are absolutely right: Having failed tasks and task cancelation with inconsistent behavior is misleading. And a valid argument (as you mentioned in the description) is to have a similar watchdog to timeout the process of task failure as well. However, after looking through the code a bit, we think the change is more than just simply adding a watchdog when handing invokeException in `StreamTask#invoke`. Instead, we probably need to design a way to properly unify * `Task#cancelOrFailAndCancelInvokableInternal` where cancelation is issued outside from user invokable, and * invokeException handing in `StreamTask#invoke` (user invokable) and as you can see, the change itself looks non-trivial. And In the future, we do have plans to refactor `Task` and `StreamTask`, so we do want to avoid hacky fixes right now. So, based on your last comment, do you think my previous suggestion to "add timeout logic or bounded retry numbers in the close function" can solve your problem in short term? Please let me know! Thanks! > Job doesn't changed to failed if close function has blocked > --- > > Key: FLINK-18983 > URL: https://issues.apache.org/jira/browse/FLINK-18983 > Project: Flink > Issue Type: Bug > Components: Runtime / Task >Affects Versions: 1.11.0, 1.12.0 >Reporter: YufeiLiu >Priority: Major > > If a operator throw a exception, it will break process loop and dispose all > operator. But state will never switch to FAILED if block in Function.close, > and JobMaster can't know the final state and do restart. > Task have {{TaskCancelerWatchDog}} to kill process if cancellation timeout, > but it doesn't work for FAILED task.TAskThread will allways hang at: > org.apache.flink.streaming.runtime.tasks.StreamTask#cleanUpInvoke > Test case: > {code:java} > Configuration configuration = new Configuration(); > configuration.set(TaskManagerOptions.TASK_CANCELLATION_TIMEOUT, 1L); > StreamExecutionEnvironment env = > StreamExecutionEnvironment.createLocalEnvironment(2, configuration); > env.addSource(...) > .process(new ProcessFunction() { > @Override > public void processElement(String value, Context ctx, > Collector out) throws Exception { > if (getRuntimeContext().getIndexOfThisSubtask() == 0) { > throw new RuntimeException(); > } > } > @Override > public void close() throws Exception { > if (getRuntimeContext().getIndexOfThisSubtask() == 0) { > Thread.sleep(1000); > } > } > }).setParallelism(2) > .print(); > {code} > In this case, job will block at close action and never change to FAILED. > If change thread which subtaskIndex == 1 to sleep, TM will exit after > TASK_CANCELLATION_TIMEOUT. -- This message was sent by Atlassian Jira (v8.3.4#803005)
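The comment above suggests, as a short-term workaround, adding timeout logic or bounded retries inside the user's close function. The following sketch is only an illustration of that suggestion under assumed names: `BoundedCloseFunction` and `tryReleaseExternalResources()` are hypothetical and not part of Flink; only `ProcessFunction` and `Collector` are real Flink APIs.

{code:java}
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

import java.time.Duration;

/**
 * Hypothetical sketch: bound the time spent in close() so a blocked
 * cleanup cannot keep the task from reaching a terminal state forever.
 */
public class BoundedCloseFunction extends ProcessFunction<String, String> {

    private static final Duration CLOSE_TIMEOUT = Duration.ofSeconds(30);
    private static final int MAX_CLOSE_RETRIES = 3;

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) {
        out.collect(value);
    }

    @Override
    public void close() throws Exception {
        long deadline = System.currentTimeMillis() + CLOSE_TIMEOUT.toMillis();
        int attempts = 0;
        // Retry the (potentially blocking) cleanup, but give up after a
        // bounded number of attempts or once the deadline has passed.
        while (attempts < MAX_CLOSE_RETRIES && System.currentTimeMillis() < deadline) {
            attempts++;
            if (tryReleaseExternalResources()) {
                return;
            }
            Thread.sleep(1_000L);
        }
        // Giving up here lets the task proceed to its terminal state instead
        // of hanging indefinitely during cleanup.
    }

    /** Placeholder for the user-specific cleanup; returns true on success. */
    private boolean tryReleaseExternalResources() {
        return true;
    }
}
{code}

This only bounds the user code's own cleanup; it does not change how StreamTask handles a failing invokable, which is the larger refactoring discussed above.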
[GitHub] [flink] wuchong commented on pull request #14361: [FLINK-19435][connectors/jdbc] Fix deadlock when loading different driver classes concurrently using Class.forName
wuchong commented on pull request #14361: URL: https://github.com/apache/flink/pull/14361#issuecomment-747901269 Sorry for the late response. We were on a business trip for Flink Forward Asia. I will review this PR later today. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14415: [FLINK-19880][formats] Fix JsonRowFormatFactory's support for the format.ignore-parse-errors property.
flinkbot edited a comment on pull request #14415: URL: https://github.com/apache/flink/pull/14415#issuecomment-747584734 ## CI report: * 91b26c741c5cc1b578123830c6981e5e3fca9767 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10997) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13876: [FLINK-18382][docs] Translate Kafka SQL connector documentation into Chinese
flinkbot edited a comment on pull request #13876: URL: https://github.com/apache/flink/pull/13876#issuecomment-720337850 ## CI report: * a6e72cdc72f44909872f7324c62833b7726a44a9 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8769) * cdcae6f6464fff54bd1040031039d688370d8866 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] HuangZhenQiu commented on pull request #8952: [FLINK-10868][flink-yarn] Add failure rater for resource manager
HuangZhenQiu commented on pull request #8952: URL: https://github.com/apache/flink/pull/8952#issuecomment-747899670 @xintongsong Thanks for reviewing the PR. I replied to your comments on the Jira ticket. Once we have an agreement on the implementation, I will revise the PR accordingly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-10868) Flink's JobCluster ResourceManager doesn't use maximum-failed-containers as limit of resource acquirement
[ https://issues.apache.org/jira/browse/FLINK-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251539#comment-17251539 ] Zhenqiu Huang commented on FLINK-10868: --- Hi [~xintongsong] I just found you commented on the jira ticket. Your summary of the problem and solution is accurate. Without the failure rate limit, the worst case we saw is that when a bad job fails to download its job jar from HDFS, the Flink resource manager will keep asking for more containers from YARN and block the whole queue. In that outage, it blocked another critical pipeline from upgrading its job and submitting to the same queue. Thus, in the current implementation, I chose to cancel all of the pending requests and fail the job. I agree that it could be a generic solution for both YARN and Kubernetes. Besides leveraging FailureRater for cool-down time management, I would also suggest adding count metrics for container failures, so that on-call engineers can handle the worst situation in time. What do you think? If we are on the same page, I will change the PR accordingly. Thanks. > Flink's JobCluster ResourceManager doesn't use maximum-failed-containers as > limit of resource acquirement > - > > Key: FLINK-10868 > URL: https://issues.apache.org/jira/browse/FLINK-10868 > Project: Flink > Issue Type: Bug > Components: Deployment / Mesos, Deployment / YARN >Affects Versions: 1.6.2, 1.7.0 >Reporter: Zhenqiu Huang >Assignee: Zhenqiu Huang >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, YarnResourceManager doesn't use yarn.maximum-failed-containers as > a limit of resource acquirement. In the worst case, when newly started containers > consistently fail, YarnResourceManager will go into an infinite resource > acquirement process without failing the job. Together with > https://issues.apache.org/jira/browse/FLINK-10848, it will quickly occupy all > resources of the yarn queue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
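To make the idea in the comment above concrete, here is a minimal sketch of a failure rater with a sliding-window rate check and a running counter that could back a metric. The class and method names are assumptions for illustration only; this is not the FailureRater implementation in the PR.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch of a failure rater: records container failures,
 * exposes a total count (which could be exported as a counter metric),
 * and reports when the failure rate within a sliding window exceeds a
 * limit so the resource manager can stop requesting containers and fail the job.
 */
public class ContainerFailureRater {

    private final int maxFailuresPerInterval;
    private final long intervalMillis;
    private final Deque<Long> failureTimestamps = new ArrayDeque<>();
    private long totalFailures; // candidate for a counter metric

    public ContainerFailureRater(int maxFailuresPerInterval, long intervalMillis) {
        this.maxFailuresPerInterval = maxFailuresPerInterval;
        this.intervalMillis = intervalMillis;
    }

    public void recordFailure(long nowMillis) {
        totalFailures++;
        failureTimestamps.addLast(nowMillis);
        // Drop failures that fall outside the sliding window.
        while (!failureTimestamps.isEmpty()
                && nowMillis - failureTimestamps.peekFirst() > intervalMillis) {
            failureTimestamps.removeFirst();
        }
    }

    /** True if the recent failure rate exceeds the configured limit. */
    public boolean exceedsRate() {
        return failureTimestamps.size() > maxFailuresPerInterval;
    }

    public long getTotalFailures() {
        return totalFailures;
    }
}
{code}

A resource manager could call recordFailure() whenever a container fails and, when exceedsRate() returns true, cancel pending requests and fail the job instead of requesting more containers.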
[GitHub] [flink] PatrickRen commented on a change in pull request #13876: [FLINK-18382][docs] Translate Kafka SQL connector documentation into Chinese
PatrickRen commented on a change in pull request #13876: URL: https://github.com/apache/flink/pull/13876#discussion_r545600084 ## File path: docs/dev/table/connectors/kafka.zh.md ## @@ -75,185 +72,181 @@ CREATE TABLE kafkaTable ( -Connector Options +连接器配置项 - Option - Required - Default - Type - Description + 选项 + 是否必需 + 默认值 + 类型 + 描述 connector - required - (none) + 必需 + (无) String - Specify what connector to use, for Kafka use: 'kafka'. + 使用的连接器, Kafka 连接器使用: 'kafka'. topic - required for sink, optional for source(use 'topic-pattern' instead if not set) - (none) + sink 表必需 + (无) String - Topic name(s) to read data from when the table is used as source. It also supports topic list for source by separating topic by semicolon like 'topic-1;topic-2'. Note, only one of "topic-pattern" and "topic" can be specified for sources. When the table is used as sink, the topic name is the topic to write data to. Note topic list is not supported for sinks. + 当表用作 source 时读取数据的 topic 名,亦支持用分号间隔的 topic 列表,如 'topic-1;topic-2'。注意对 source 表而言,'topic' 和 'topic-pattern' 两个选项只能使用其中一个当表被用作 sink 时,该配置表示写入的 topic 名。注意 sink 表不支持 topic 列表 topic-pattern - optional - (none) + 可选 + (无) String - The regular expression for a pattern of topic names to read from. All topics with names that match the specified regular expression will be subscribed by the consumer when the job starts running. Note, only one of "topic-pattern" and "topic" can be specified for sources. + 匹配读取 topic 名称的正则表达式。在作业开始运行时,所有匹配该正则表达式的 topic 都将被 Kafka consumer 订阅。注意对 source 表而言,'topic' 和 'topic-pattern' 两个选项只能使用其中一个 properties.bootstrap.servers - required - (none) + 必需 + (无) String - Comma separated list of Kafka brokers. + 逗号分隔的 Kafka broker 列表 properties.group.id - required by source - (none) + source 表必需 + (无) String - The id of the consumer group for Kafka source, optional for Kafka sink. + Kafka source 的 consumer 组 ID,对 Kafka sink 可选填 format - required - (none) + 必需 + (无) String - The format used to deserialize and serialize Kafka messages. - The supported formats are 'csv', 'json', 'avro', 'debezium-json' and 'canal-json'. - Please refer to Formats page for more details and more format options. + 用来序列化或反序列化 Kafka 消息的格式。 + 支持的格式有 'csv'、'json'、'avro'、'debezium-json' 和 'canal-json'。 + 请参阅格式页面以获取更多关于 format 的细节和相关配置项。 scan.startup.mode - optional + 可选 group-offsets String - Startup mode for Kafka consumer, valid values are 'earliest-offset', 'latest-offset', 'group-offsets', 'timestamp' and 'specific-offsets'. - See the following Start Reading Position for more details. + Kafka consumer 的启动模式。有效的值有:'earliest-offset'、'latest-offset'、'group-offsets'、'timestamp' 和 'specific-offsets'。 + 请参阅下方起始消费位点一节以获取更多细节。 scan.startup.specific-offsets - optional - (none) + 可选 + (无) String - Specify offsets for each partition in case of 'specific-offsets' startup mode, e.g. 'partition:0,offset:42;partition:1,offset:300'. + 在使用 'specific-offsets' 启动模式时为每个 partition 指定位点,例如 'partition:0,offset:42;partition:1,offset:300'。 scan.startup.timestamp-millis - optional - (none) + 可选 + (无) Long - Start from the specified epoch timestamp (milliseconds) used in case of 'timestamp' startup mode. + 在使用 'timestamp' 启动模式时指定启动的时间戳(毫秒单位) scan.topic-partition-discovery.interval - optional - (none) + 可选 + (无) Duration - Interval for consumer to discover dynamically created Kafka topics and partitions periodically. + Consumer 定期探测动态创建的 Kafka topic 和 partition 的时间间隔 sink.partitioner - optional - (none) + 可选 + (无) String - Output partitioning from Flink's partitions into Kafka's partitions. 
Valid values are + Flink partition 到 Kafka partition 的分区映射关系,有效值有: -fixed: each Flink partition ends up in at most one Kafka partition. -round-robin: a Flink partition is distributed to Kafka partitions round-robin. -Custom FlinkKafkaPartitioner subclass: e.g. 'org.mycompany.MyPartitioner'. +fixed: 每个 Flink partition 最终对应最多一个 Kafka partition。 +round-robin: Flink partition 按轮循 (round-robin) 的模式对应到 Kafka partition。 +自定义 FlinkKafkaPartitioner 的子类: 例如 'org.mycompany.MyPartitioner'。 sink.semantic - optional + 可选 at-least
[GitHub] [flink] PatrickRen commented on pull request #13876: [FLINK-18382][docs] Translate Kafka SQL connector documentation into Chinese
PatrickRen commented on pull request #13876: URL: https://github.com/apache/flink/pull/13876#issuecomment-747898339 Hi @wuchong and @V1ncentzzZ , I re-translated Kafka SQL connector's documentation and also fixed a typo in the original English doc. Since the sql-connector-download-table part has been extracted to an individual HTML file in FLINK-20093, I also created a Chinese version for it. Could you have a review on these changes? Thanks~ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14377: [FLINK-19905][Connector][jdbc] The Jdbc-connector's 'lookup.max-retries' option initial value is 1 in JdbcLookupFunction
flinkbot edited a comment on pull request #14377: URL: https://github.com/apache/flink/pull/14377#issuecomment-744303277 ## CI report: * c0e212ed51c103765ada525c54ba3d83614f929a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10969) * ffcb3c699eb099caccb20aa38f372557e8a59306 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11003) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14387: [FLINK-19691][Connector][jdbc] Expose `CONNECTION_CHECK_TIMEOUT_SECONDS` as a configurable option in Jdbc connector
flinkbot edited a comment on pull request #14387: URL: https://github.com/apache/flink/pull/14387#issuecomment-745165624 ## CI report: * 6f24479fca0414f440f2bf2590fdbb4308db8bb8 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11002) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20552) JdbcDynamicTableSink doesn't sink buffered data on checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251536#comment-17251536 ] Jark Wu commented on FLINK-20552: - Hi [~meijies], are you still working on this? > JdbcDynamicTableSink doesn't sink buffered data on checkpoint > - > > Key: FLINK-20552 > URL: https://issues.apache.org/jira/browse/FLINK-20552 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC, Table SQL / Ecosystem >Reporter: mei jie >Assignee: mei jie >Priority: Major > Labels: starter > Fix For: 1.13.0 > > > JdbcBatchingOutputFormat is wrapped to OutputFormatSinkFunction``` when > createSinkTransformation at CommonPhysicalSink class. but > OutputFormatSinkFunction don't implement CheckpointedFunction interface, so > the flush method of JdbcBatchingOutputFormat can't be called when checkpoint -- This message was sent by Atlassian Jira (v8.3.4#803005)
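The description above hinges on the wrapper not implementing CheckpointedFunction, so its buffered rows are never flushed when a checkpoint is taken. The sketch below shows, under assumed names (`BufferingJdbcSink`, `flush()` are hypothetical), why implementing CheckpointedFunction matters; it is an illustration of the mechanism, not the actual fix in Flink.

{code:java}
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: a sink that buffers records and flushes the buffer
 * when a checkpoint is taken. If the wrapper did not implement
 * CheckpointedFunction, snapshotState() would never be called and the
 * buffered rows would be lost on failover.
 */
public class BufferingJdbcSink extends RichSinkFunction<String> implements CheckpointedFunction {

    private final List<String> buffer = new ArrayList<>();

    @Override
    public void invoke(String value, Context context) {
        buffer.add(value);
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // Push all buffered rows to the database before the checkpoint completes.
        flush();
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // No managed state needed for this sketch.
    }

    private void flush() {
        // Placeholder for executing the batched JDBC statements.
        buffer.clear();
    }
}
{code}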
[jira] [Updated] (FLINK-20552) JdbcDynamicTableSink doesn't sink buffered data on checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20552: Fix Version/s: 1.13.0 > JdbcDynamicTableSink doesn't sink buffered data on checkpoint > - > > Key: FLINK-20552 > URL: https://issues.apache.org/jira/browse/FLINK-20552 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC, Table SQL / Ecosystem >Reporter: mei jie >Assignee: mei jie >Priority: Major > Labels: starter > Fix For: 1.13.0 > > > JdbcBatchingOutputFormat is wrapped to OutputFormatSinkFunction``` when > createSinkTransformation at CommonPhysicalSink class. but > OutputFormatSinkFunction don't implement CheckpointedFunction interface, so > the flush method of JdbcBatchingOutputFormat can't be called when checkpoint -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20644) Check return type of ScalarFunction eval method shouldn't be void
[ https://issues.apache.org/jira/browse/FLINK-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20644: Summary: Check return type of ScalarFunction eval method shouldn't be void (was: Check return type of ScalarFunction eval method) > Check return type of ScalarFunction eval method shouldn't be void > - > > Key: FLINK-20644 > URL: https://issues.apache.org/jira/browse/FLINK-20644 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.11.1 > Environment: groupId:org.apache.flink > artifactId:flink-table-api-scala-bridge_2.11 > version:1.11.1 >Reporter: shiyu >Priority: Major > Labels: starter > Attachments: image-2020-12-17-16-04-15-131.png, > image-2020-12-17-16-07-39-827.png > > > flink-table-api-scala-bridge_2.11 > !image-2020-12-17-16-07-39-827.png! > !image-2020-12-17-16-04-15-131.png! > console: > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Failed > to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Defaulting to > no-operation (NOP) logger implementationSLF4J: See > http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.ERROR > StatusLogger Log4j2 could not find a logging implementation. Please add > log4j-core to the classpath. Using SimpleLogger to log to the console.../* 1 > *//* 2 */ public class StreamExecCalc$13 extends > org.apache.flink.table.runtime.operators.AbstractProcessStreamOperator/* 3 */ > implements > org.apache.flink.streaming.api.operators.OneInputStreamOperator \{/* 4 *//* 5 > */ private final Object[] references;/* 6 */ private transient > org.apache.flink.table.runtime.typeutils.StringDataSerializer > typeSerializer$6;/* 7 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.StringConverter > converter$9;/* 8 */ private transient > cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58;/* > 9 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.GenericConverter > converter$12;/* 10 */ org.apache.flink.table.data.BoxedWrapperRowData > out = new org.apache.flink.table.data.BoxedWrapperRowData(3);/* 11 */ > private final org.apache.flink.streaming.runtime.streamrecord.StreamRecord > outElement = new > org.apache.flink.streaming.runtime.streamrecord.StreamRecord(null);/* 12 *//* > 13 */ public StreamExecCalc$13(/* 14 */ Object[] > references,/* 15 */ > org.apache.flink.streaming.runtime.tasks.StreamTask task,/* 16 */ > org.apache.flink.streaming.api.graph.StreamConfig config,/* 17 */ > org.apache.flink.streaming.api.operators.Output output,/* 18 */ > org.apache.flink.streaming.runtime.tasks.ProcessingTimeService > processingTimeService) throws Exception {/* 19 */ this.references = > references;/* 20 */ typeSerializer$6 = > (((org.apache.flink.table.runtime.typeutils.StringDataSerializer) > references[0]));/* 21 */ converter$9 = > (((org.apache.flink.table.data.util.DataFormatConverters.StringConverter) > references[1]));/* 22 */ > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58 > = (((cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode) > references[2]));/* 23 */ converter$12 = > (((org.apache.flink.table.data.util.DataFormatConverters.GenericConverter) > references[3]));/* 24 */ this.setup(task, config, output);/* 25 */ > if (this instanceof > org.apache.flink.streaming.api.operators.AbstractStreamOperator) {/* 26 */ > 
((org.apache.flink.streaming.api.operators.AbstractStreamOperator) > this)/* 27 */ > .setProcessingTimeService(processingTimeService);/* 28 */ }/* 29 */ > }/* 30 *//* 31 */ @Override/* 32 */ public void open() > throws Exception \{/* 33 */ super.open();/* 34 */ /* 35 */ > > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58.open(new > org.apache.flink.table.functions.FunctionContext(getRuntimeContext()));/* 36 > */ /* 37 */ }/* 38 *//* 39 */ @Override/* 40 */ > public void > processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord > element) throws Exception \{/* 41 */ > org.apache.flink.table.data.RowData in1 = > (org.apache.flink.table.data.RowData) element.getValue();/* 42 */ /* > 43 */ org.apache.flink.table.data.binary.BinaryStringData field$5;/* > 44 */ boolean i
[jira] [Updated] (FLINK-20644) Check return type of ScalarFunction eval method
[ https://issues.apache.org/jira/browse/FLINK-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20644: Summary: Check return type of ScalarFunction eval method (was: table api use UDF(ScalarFunction ) if eval return Unit) > Check return type of ScalarFunction eval method > --- > > Key: FLINK-20644 > URL: https://issues.apache.org/jira/browse/FLINK-20644 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.11.1 > Environment: groupId:org.apache.flink > artifactId:flink-table-api-scala-bridge_2.11 > version:1.11.1 >Reporter: shiyu >Priority: Major > Labels: starter > Attachments: image-2020-12-17-16-04-15-131.png, > image-2020-12-17-16-07-39-827.png > > > flink-table-api-scala-bridge_2.11 > !image-2020-12-17-16-07-39-827.png! > !image-2020-12-17-16-04-15-131.png! > console: > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Failed > to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Defaulting to > no-operation (NOP) logger implementationSLF4J: See > http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.ERROR > StatusLogger Log4j2 could not find a logging implementation. Please add > log4j-core to the classpath. Using SimpleLogger to log to the console.../* 1 > *//* 2 */ public class StreamExecCalc$13 extends > org.apache.flink.table.runtime.operators.AbstractProcessStreamOperator/* 3 */ > implements > org.apache.flink.streaming.api.operators.OneInputStreamOperator \{/* 4 *//* 5 > */ private final Object[] references;/* 6 */ private transient > org.apache.flink.table.runtime.typeutils.StringDataSerializer > typeSerializer$6;/* 7 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.StringConverter > converter$9;/* 8 */ private transient > cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58;/* > 9 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.GenericConverter > converter$12;/* 10 */ org.apache.flink.table.data.BoxedWrapperRowData > out = new org.apache.flink.table.data.BoxedWrapperRowData(3);/* 11 */ > private final org.apache.flink.streaming.runtime.streamrecord.StreamRecord > outElement = new > org.apache.flink.streaming.runtime.streamrecord.StreamRecord(null);/* 12 *//* > 13 */ public StreamExecCalc$13(/* 14 */ Object[] > references,/* 15 */ > org.apache.flink.streaming.runtime.tasks.StreamTask task,/* 16 */ > org.apache.flink.streaming.api.graph.StreamConfig config,/* 17 */ > org.apache.flink.streaming.api.operators.Output output,/* 18 */ > org.apache.flink.streaming.runtime.tasks.ProcessingTimeService > processingTimeService) throws Exception {/* 19 */ this.references = > references;/* 20 */ typeSerializer$6 = > (((org.apache.flink.table.runtime.typeutils.StringDataSerializer) > references[0]));/* 21 */ converter$9 = > (((org.apache.flink.table.data.util.DataFormatConverters.StringConverter) > references[1]));/* 22 */ > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58 > = (((cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode) > references[2]));/* 23 */ converter$12 = > (((org.apache.flink.table.data.util.DataFormatConverters.GenericConverter) > references[3]));/* 24 */ this.setup(task, config, output);/* 25 */ > if (this instanceof > org.apache.flink.streaming.api.operators.AbstractStreamOperator) {/* 26 */ > 
((org.apache.flink.streaming.api.operators.AbstractStreamOperator) > this)/* 27 */ > .setProcessingTimeService(processingTimeService);/* 28 */ }/* 29 */ > }/* 30 *//* 31 */ @Override/* 32 */ public void open() > throws Exception \{/* 33 */ super.open();/* 34 */ /* 35 */ > > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58.open(new > org.apache.flink.table.functions.FunctionContext(getRuntimeContext()));/* 36 > */ /* 37 */ }/* 38 *//* 39 */ @Override/* 40 */ > public void > processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord > element) throws Exception \{/* 41 */ > org.apache.flink.table.data.RowData in1 = > (org.apache.flink.table.data.RowData) element.getValue();/* 42 */ /* > 43 */ org.apache.flink.table.data.binary.BinaryStringData field$5;/* > 44 */ boolean isNull$5;/* 45 */ > org.apache.flink.
[jira] [Commented] (FLINK-20495) Elasticsearch6DynamicSinkITCase Hang
[ https://issues.apache.org/jira/browse/FLINK-20495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251532#comment-17251532 ] Huang Xingbo commented on FLINK-20495: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10995&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20 > Elasticsearch6DynamicSinkITCase Hang > > > Key: FLINK-20495 > URL: https://issues.apache.org/jira/browse/FLINK-20495 > Project: Flink > Issue Type: Bug > Components: Connectors / ElasticSearch >Affects Versions: 1.13.0 >Reporter: Huang Xingbo >Priority: Major > Labels: test-stability > > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10535&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20] > > {code:java} > 2020-12-04T22:39:33.9748225Z [INFO] Running > org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch6DynamicSinkITCase > 2020-12-04T22:54:51.9486410Z > == > 2020-12-04T22:54:51.9488766Z Process produced no output for 900 seconds. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #14377: [FLINK-19905][Connector][jdbc] The Jdbc-connector's 'lookup.max-retries' option initial value is 1 in JdbcLookupFunction
flinkbot edited a comment on pull request #14377: URL: https://github.com/apache/flink/pull/14377#issuecomment-744303277 ## CI report: * c0e212ed51c103765ada525c54ba3d83614f929a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10969) * ffcb3c699eb099caccb20aa38f372557e8a59306 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14387: [FLINK-19691][Connector][jdbc] Expose `CONNECTION_CHECK_TIMEOUT_SECONDS` as a configurable option in Jdbc connector
flinkbot edited a comment on pull request #14387: URL: https://github.com/apache/flink/pull/14387#issuecomment-745165624 ## CI report: * 1f2f6e1075b11988e06013f648d1796b4e978a48 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10918) * 6f24479fca0414f440f2bf2590fdbb4308db8bb8 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11002) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20632) Missing docker images for 1.12 release
[ https://issues.apache.org/jira/browse/FLINK-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251526#comment-17251526 ] Yang Wang commented on FLINK-20632: --- I remember [~chesnay] said in the ML that apache does not allow publishing snapshot images. Correct me if I am wrong. > Missing docker images for 1.12 release > -- > > Key: FLINK-20632 > URL: https://issues.apache.org/jira/browse/FLINK-20632 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: Piotr Gwiazda >Priority: Critical > > Images for Flink 1.12 are missing in Docker hub > https://hub.docker.com/_/flink. As a result Kubernetes deployment as in the > documentation example is not working. > https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/native_kubernetes.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] xiaoHoly commented on a change in pull request #14377: [FLINK-19905][Connector][jdbc] The Jdbc-connector's 'lookup.max-retries' option initial value is 1 in JdbcLookupFunction
xiaoHoly commented on a change in pull request #14377: URL: https://github.com/apache/flink/pull/14377#discussion_r545584101 ## File path: flink-connectors/flink-connector-jdbc/src/test/java/org/apache/flink/connector/jdbc/table/JdbcLookupTableITCase.java ## @@ -120,7 +120,7 @@ public void testLookup() throws Exception { .build()); if (useCache) { Review comment: Thank you for your careful review, I have fixed the code This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14387: [FLINK-19691][Connector][jdbc] Expose `CONNECTION_CHECK_TIMEOUT_SECONDS` as a configurable option in Jdbc connector
flinkbot edited a comment on pull request #14387: URL: https://github.com/apache/flink/pull/14387#issuecomment-745165624 ## CI report: * 1f2f6e1075b11988e06013f648d1796b4e978a48 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10918) * 6f24479fca0414f440f2bf2590fdbb4308db8bb8 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251524#comment-17251524 ] zhuxiaoshang commented on FLINK-20665: -- [~lzljs3620320],yes you are write,if it has been compacted, the target file is exists,then will not compact. sorry for misunderstanding > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Assignee: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > 
org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
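The comment above notes that if the target file already exists, the unit has already been compacted and should be skipped. The following sketch illustrates that idempotence check under assumed names (`CompactionGuard` and `needsCompaction` are hypothetical); it is not the actual CompactCoordinator code.

{code:java}
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

import java.io.IOException;

/**
 * Hypothetical sketch of the idempotence check discussed above: if the
 * compacted target file already exists, the uncompacted input may already
 * have been removed, so the coordinator should skip the unit instead of
 * failing with FileNotFoundException after a restore.
 */
public class CompactionGuard {

    /** Returns true if compaction still needs to run for this target. */
    public static boolean needsCompaction(FileSystem fs, Path target, Path uncompacted)
            throws IOException {
        if (fs.exists(target)) {
            // Already compacted in a previous attempt; nothing to do.
            return false;
        }
        if (!fs.exists(uncompacted)) {
            // Neither file exists: the state is inconsistent and should be surfaced.
            throw new IOException("Missing uncompacted file: " + uncompacted);
        }
        return true;
    }
}
{code}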
[jira] [Commented] (FLINK-20648) Unable to restore job from savepoint when using Kubernetes based HA services
[ https://issues.apache.org/jira/browse/FLINK-20648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251523#comment-17251523 ] Yang Wang commented on FLINK-20648: --- [~dmvk] Thanks for creating this issue and debugging the root cause. I think you are right. Currently, when recovering from savepoint, Flink will add a new checkpoint to the HA storage. So it needs to update the ConfigMap. However, the ConfigMap has not be created since the leader election service is not started. In the Kubernetes HA implementation, we have a very important assumption, only the active leader could update the HA ConfigMap. I do not tend to let {{KubernetesStateHandleStore}} could support adding checkpoints before leader is granted. It will cause some issues when we have multiple JobManagers. But I am also not sure whether we could start the leader election service before jobmaster service. I will dig more and hope to find a more reasonable solution. > Unable to restore job from savepoint when using Kubernetes based HA services > > > Key: FLINK-20648 > URL: https://issues.apache.org/jira/browse/FLINK-20648 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: David Morávek >Assignee: Yang Wang >Priority: Critical > Fix For: 1.12.1 > > > When restoring job from savepoint, we always end up with following error: > {code} > Caused by: org.apache.flink.runtime.client.JobInitializationException: Could > not instantiate JobManager. > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:463) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764) > ~[?:?] > ... 3 more > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Stopped > retrying the operation because the error is not retryable. > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > ~[?:?] > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2063) ~[?:?] 
> at > org.apache.flink.kubernetes.highavailability.KubernetesStateHandleStore.addAndLock(KubernetesStateHandleStore.java:150) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStore.addCheckpoint(DefaultCompletedCheckpointStore.java:211) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1479) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.tryRestoreExecutionGraphFromSavepoint(SchedulerBase.java:325) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:266) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.(SchedulerBase.java:238) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.DefaultScheduler.(DefaultScheduler.java:134) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:108) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:323) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:310) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:96) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:41) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.(JobManagerRunnerImpl.java:141) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:80) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:450) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764) > ~[?:?] > ... 3 more > Caused by: org.apache.flink.runtime.concurrent.Futu
[GitHub] [flink] flinkbot edited a comment on pull request #14415: [FLINK-19880][formats] Fix JsonRowFormatFactory's support for the format.ignore-parse-errors property.
flinkbot edited a comment on pull request #14415: URL: https://github.com/apache/flink/pull/14415#issuecomment-747584734 ## CI report: * 054c57cbbcab6c640bac4681fc9e17c7971d9b7d Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10993) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10985) * 91b26c741c5cc1b578123830c6981e5e3fca9767 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10997) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] xiaoHoly commented on pull request #14387: [FLINK-19691][Connector][jdbc] Expose `CONNECTION_CHECK_TIMEOUT_SECONDS` as a configurable option in Jdbc connector
xiaoHoly commented on pull request #14387: URL: https://github.com/apache/flink/pull/14387#issuecomment-747874089 @wuchong hi Jark, I hope to get your advice on this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] xiaoHoly commented on pull request #14387: [FLINK-19691][Connector][jdbc] Expose `CONNECTION_CHECK_TIMEOUT_SECONDS` as a configurable option in Jdbc connector
xiaoHoly commented on pull request #14387: URL: https://github.com/apache/flink/pull/14387#issuecomment-747872906 @wangxlong I have added the ConfigOption in JdbcDynamicTableFactory following your advice and Jark's advice, but I have no idea how to verify this ConfigOption in JdbcDynamicTableFactoryTest. Like the other ConfigOptions, the tests either check whether the configuration parameters are complete, or check whether the parameters are negative. Following this pattern, I don't think we should check completeness or negativity for max-retry-timeout (see the sketch below). WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
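For context, a timeout option like the one discussed could be declared with Flink's ConfigOptions API roughly as sketched below. The option key, default value, and validation are assumptions for illustration; they are not necessarily what the PR uses.

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

import java.time.Duration;

/** Hypothetical declaration and sanity check for a connection timeout option. */
public class JdbcConnectionOptionsSketch {

    // Key and default are assumptions for illustration only.
    public static final ConfigOption<Duration> CONNECTION_MAX_RETRY_TIMEOUT =
            ConfigOptions.key("connection.max-retry-timeout")
                    .durationType()
                    .defaultValue(Duration.ofSeconds(60))
                    .withDescription("Maximum timeout between retries.");

    /** A validation step a factory (or its test) could apply. */
    public static void validateTimeout(Duration timeout) {
        if (timeout.getSeconds() < 1) {
            throw new IllegalArgumentException(
                    "The value of 'connection.max-retry-timeout' must be at least 1 second.");
        }
    }
}
```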
[jira] [Commented] (FLINK-20645) Extra space typo when assembling dynamic configs
[ https://issues.apache.org/jira/browse/FLINK-20645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251508#comment-17251508 ] Yang Wang commented on FLINK-20645: --- Hmm. I am not sure why you need to pass arguments to the TaskManager process. Usually, the arguments need to be set to JobManager. {{jobmanager.sh}} and {{standalone-job.sh}} could parse the user specified main arguments successfully. > Extra space typo when assembling dynamic configs > > > Key: FLINK-20645 > URL: https://issues.apache.org/jira/browse/FLINK-20645 > Project: Flink > Issue Type: Bug >Reporter: Li Wang >Priority: Major > Labels: pull-request-available > Attachments: jm.log, tm.log > > > while starting task manager, I got the error below > {code:java} > "Caused by: org.apache.flink.configuration.IllegalConfigurationException: The > required configuration option Key: 'taskmanager.cpu.cores' , default: null > (fallback keys: []) is not set > {code} > > Checking TM starting command I got something like > {code:java} > -D taskmanager.memory.framework.off-heap.size=134217728b -D > taskmanager.memory.network.max=67108864b -D > taskmanager.memory.network.min=67108864b -D > taskmanager.memory.framework.heap.size=134217728b -D > taskmanager.memory.managed.size=57346620b -D taskmanager.cpu.cores=1.0 -D > taskmanager.memory.task.heap.size=1518663108b -D > taskmanager.memory.task.off-heap.size=0b > {code} > Looks like the extra spaces between "-D" and parameters causes those dynamic > options can't be passed in correctly. > Related code is > [here|https://github.com/apache/flink/blob/2ad92169a4a4ffc92d32783f8777a132e1dac7c4/flink-core/src/main/java/org/apache/flink/configuration/ConfigurationUtils.java#L180] > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
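For illustration of the issue described above, the sketch below assembles dynamic properties without a space after -D, which avoids the key being parsed as a separate shell token when the command line is re-tokenized. The class name is hypothetical, and whether this is the right fix for Flink's startup scripts is exactly what the discussion above is about.

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.stream.Collectors;

/**
 * Hypothetical sketch of assembling dynamic properties for a launch command.
 * Emitting "-Dkey=value" (no space after -D) keeps each property as a single token.
 */
public class DynamicPropertiesAssembler {

    public static String assemble(Map<String, String> dynamicProperties) {
        return dynamicProperties.entrySet().stream()
                .map(e -> "-D" + e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(assemble(Collections.singletonMap("taskmanager.cpu.cores", "1.0")));
        // prints: -Dtaskmanager.cpu.cores=1.0
    }
}
{code}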
[jira] [Closed] (FLINK-20669) Add the jzlib LICENSE file in flink-python module
[ https://issues.apache.org/jira/browse/FLINK-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-20669. --- Assignee: Huang Xingbo Resolution: Fixed Fixed in - master via 0a5632fd6de2dcab3e248ed1284830ef420614e5 - release-1.12 via 8755b33bc1da3ba9b368aaa3aaf6f09ab422e525 > Add the jzlib LICENSE file in flink-python module > - > > Key: FLINK-20669 > URL: https://issues.apache.org/jira/browse/FLINK-20669 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.12.0, 1.13.0 >Reporter: Huang Xingbo >Assignee: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.13.0, 1.12.1 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] dianfu closed pull request #14418: [FLINK-20669][python] Add the jzlib LICENSE file in flink-python module
dianfu closed pull request #14418: URL: https://github.com/apache/flink/pull/14418 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dianfu commented on a change in pull request #14401: [FLINK-20621][python] Refactor the TypeInformation implementation in Python DataStream API.
dianfu commented on a change in pull request #14401: URL: https://github.com/apache/flink/pull/14401#discussion_r54555 ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -91,66 +73,117 @@ def from_internal_type(self, obj): return obj -class BasicTypeInfo(TypeInformation, ABC): +class BasicTypes(Enum): +STRING = "String" +BYTE = "Byte" +BOOLEAN = "Boolean" +SHORT = "Short" +INT = "Integer" +LONG = "Long" +FLOAT = "Float" +DOUBLE = "Double" +CHAR = "Char" +BIG_INT = "BigInteger" +BIG_DEC = "BigDecimal" + + +class BasicTypeInformation(TypeInformation, ABC): """ Type information for primitive types (int, long, double, byte, ...), String, BigInteger, and BigDecimal. """ +def __init__(self, basic_type: BasicTypes): +self._basic_type = basic_type +super(BasicTypeInformation, self).__init__() + +def get_java_type_info(self) -> JavaObject: Review comment: Could we refactor it a bit to make it more simple, e.g. introducing JBasicTypeInfo? ``` JBasicTypeInfo = get_gateway().jvm.org.apache.flink.api.common.typeinfo.BasicTypeInfo ``` ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -431,45 +430,47 @@ def from_internal_type(self, obj): return tuple(values) -class TupleTypeInfo(WrapperTypeInfo): +class TupleTypeInfo(TypeInformation): """ TypeInformation for Tuple. """ def __init__(self, types: List[TypeInformation]): self.types = types +super(TupleTypeInfo, self).__init__() + +def get_field_types(self) -> List[TypeInformation]: +return self.types + +def get_java_type_info(self) -> JavaObject: j_types_array = get_gateway().new_array( - get_gateway().jvm.org.apache.flink.api.common.typeinfo.TypeInformation, len(types)) + get_gateway().jvm.org.apache.flink.api.common.typeinfo.TypeInformation, len(self.types)) -for i in range(len(types)): -field_type = types[i] -if isinstance(field_type, WrapperTypeInfo): +for i in range(len(self.types)): +field_type = self.types[i] +if isinstance(field_type, TypeInformation): j_types_array[i] = field_type.get_java_type_info() -j_typeinfo = get_gateway().jvm \ +self._j_typeinfo = get_gateway().jvm \ .org.apache.flink.api.java.typeutils.TupleTypeInfo(j_types_array) -super(TupleTypeInfo, self).__init__(j_typeinfo=j_typeinfo) - -def get_field_types(self) -> List[TypeInformation]: -return self.types +return self._j_typeinfo def __eq__(self, other) -> bool: -return self._j_typeinfo.equals(other._j_typeinfo) - -def __hash__(self) -> int: -return self._j_typeinfo.hashCode() +if isinstance(other, TupleTypeInfo): +return self.types == other.types +return False def __str__(self) -> str: -return "TupleTypeInfo(%s)" % ', '.join([field_type.__str__() for field_type in self.types]) +return "TupleTypeInfo(%s)" % ', '.join([str(field_type) for field_type in self.types]) -class DateTypeInfo(WrapperTypeInfo): +class DateTypeInformation(TypeInformation): Review comment: Why rename it? ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -484,17 +485,25 @@ def from_internal_type(self, v): if v is not None: return datetime.date.fromordinal(v + self.EPOCH_ORDINAL) +def get_java_type_info(self) -> JavaObject: +self._j_typeinfo = get_gateway().jvm\ Review comment: I guess it could be cached to avoid computed again if called multiple times? ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -48,30 +48,12 @@ class acts as the tool to generate serializers and comparators, and to perform s nested types). """ - -class WrapperTypeInfo(TypeInformation): -""" -A wrapper class for java TypeInformation Objects. 
-""" - -def __init__(self, j_typeinfo): -self._j_typeinfo = j_typeinfo +def __init__(self): +self._j_typeinfo = None def get_java_type_info(self) -> JavaObject: Review comment: +1. I have the same feeling. ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -91,66 +73,117 @@ def from_internal_type(self, obj): return obj -class BasicTypeInfo(TypeInformation, ABC): +class BasicTypes(Enum): +STRING = "String" +BYTE = "Byte" +BOOLEAN = "Boolean" +SHORT = "Short" +INT = "Integer" +LONG = "Long" +FLOAT = "Float" +DOUBLE = "Double" +CHAR = "Char" +BIG_INT = "BigInteger" +BIG_DEC = "BigDecimal" + + +class BasicTypeInformation(TypeInformation, ABC): Review comment: Why rename it to BasicTypeInformation?
[GitHub] [flink] flinkbot commented on pull request #14418: [FLINK-20669][python] Add the jzlib LICENSE file in flink-python module
flinkbot commented on pull request #14418: URL: https://github.com/apache/flink/pull/14418#issuecomment-747869405 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 181cb4c0778d4ee08f1dd210d801607d5ccb0657 (Fri Dec 18 05:02:49 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-20669).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] HuangXingBo commented on pull request #14418: [FLINK-20669][python] Add the jzlib LICENSE file in flink-python module
HuangXingBo commented on pull request #14418: URL: https://github.com/apache/flink/pull/14418#issuecomment-747868969 @zentol @dianfu Could you help review it? Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-20669) Add the jzlib LICENSE file in flink-python module
[ https://issues.apache.org/jira/browse/FLINK-20669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-20669: --- Labels: pull-request-available (was: ) > Add the jzlib LICENSE file in flink-python module > - > > Key: FLINK-20669 > URL: https://issues.apache.org/jira/browse/FLINK-20669 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.12.0, 1.13.0 >Reporter: Huang Xingbo >Priority: Major > Labels: pull-request-available > Fix For: 1.13.0, 1.12.1 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] HuangXingBo opened a new pull request #14418: [FLINK-20669][python] Add the jzlib LICENSE file in flink-python module
HuangXingBo opened a new pull request #14418: URL: https://github.com/apache/flink/pull/14418 ## What is the purpose of the change *This pull request will add the jzlib LICENSE file in flink-python module* ## Brief change log - *Add LICENSE file `LICENSE.jzlib`* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-docker] wangyang0918 edited a comment on pull request #49: [FLINK-20650] Rename "native-k8s" command to a more general name in docker-entrypoint.sh
wangyang0918 edited a comment on pull request #49: URL: https://github.com/apache/flink-docker/pull/49#issuecomment-747409750 @tillrohrmann Yeah. I will attach the Flink PR here once it is created. Moreover, I am not sure whether we should wait for a response from the Docker folks before merging this PR, so that we are not blocked on image publishing next time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wangyang0918 commented on a change in pull request #14416: [hotfix][k8s] Fix k8s ha service doc.
wangyang0918 commented on a change in pull request #14416: URL: https://github.com/apache/flink/pull/14416#discussion_r545571110 ## File path: flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesHaServices.java ## @@ -42,7 +42,7 @@ import static org.apache.flink.util.Preconditions.checkNotNull; /** - * An implementation of the {@link AbstractHaServices} using Apache ZooKeeper. + * An implementation of the {@link AbstractHaServices} based on kubernetes API. Review comment: ```suggestion * An implementation of the {@link AbstractHaServices} using Kubernetes. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20645) Extra space typo when assembling dynamic configs
[ https://issues.apache.org/jira/browse/FLINK-20645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251491#comment-17251491 ] Li Wang commented on FLINK-20645: - Thanks [~karmagyz] for the reference, we will look into converting to this more k8s native way to deploy our job cluster. Also thank you [~fly_in_gis] for pointing out the root cause. I tried another flink app without any arguments, TM is able to be started without any issue. This proves those unrecognized arguments is the killer. Just curious how to pass in jar arguments in Flink 1.11 since the same scripts we're using works for Flink 1.6 and 1.7. Or maybe I have to make them as system properties and pass in using "-D"? > Extra space typo when assembling dynamic configs > > > Key: FLINK-20645 > URL: https://issues.apache.org/jira/browse/FLINK-20645 > Project: Flink > Issue Type: Bug >Reporter: Li Wang >Priority: Major > Labels: pull-request-available > Attachments: jm.log, tm.log > > > while starting task manager, I got the error below > {code:java} > "Caused by: org.apache.flink.configuration.IllegalConfigurationException: The > required configuration option Key: 'taskmanager.cpu.cores' , default: null > (fallback keys: []) is not set > {code} > > Checking TM starting command I got something like > {code:java} > -D taskmanager.memory.framework.off-heap.size=134217728b -D > taskmanager.memory.network.max=67108864b -D > taskmanager.memory.network.min=67108864b -D > taskmanager.memory.framework.heap.size=134217728b -D > taskmanager.memory.managed.size=57346620b -D taskmanager.cpu.cores=1.0 -D > taskmanager.memory.task.heap.size=1518663108b -D > taskmanager.memory.task.off-heap.size=0b > {code} > Looks like the extra spaces between "-D" and parameters causes those dynamic > options can't be passed in correctly. > Related code is > [here|https://github.com/apache/flink/blob/2ad92169a4a4ffc92d32783f8777a132e1dac7c4/flink-core/src/main/java/org/apache/flink/configuration/ConfigurationUtils.java#L180] > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
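On the last question above, one commonly used workaround is to pass values as JVM system properties (for example via env.java.opts) and read them in the job's main method instead of relying on positional jar arguments. A minimal sketch with made-up property names, not taken from the ticket:

{code:java}
public final class MyJob {
    public static void main(String[] args) {
        // Hypothetical property names, supplied e.g. as -Djob.input.topic=... on the JVM command line.
        String inputTopic = System.getProperty("job.input.topic", "default-topic");
        String outputPath = System.getProperty("job.output.path", "/tmp/out");

        // ... build and execute the Flink pipeline with these values ...
        System.out.println("inputTopic=" + inputTopic + ", outputPath=" + outputPath);
    }
}
{code}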
[jira] [Updated] (FLINK-20567) Document Error in UDTF section
[ https://issues.apache.org/jira/browse/FLINK-20567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20567: Summary: Document Error in UDTF section (was: Document Error) > Document Error in UDTF section > -- > > Key: FLINK-20567 > URL: https://issues.apache.org/jira/browse/FLINK-20567 > Project: Flink > Issue Type: Bug > Components: Documentation, Table SQL / Ecosystem >Reporter: appleyuchi >Assignee: appleyuchi >Priority: Major > Attachments: screenshot-1.png > > > ||item||Content|| > |Document|[Link|https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html]| > |part|Inner Join with Table Function (UDTF)| > |origin|TableFunction split = new MySplitUDTF();| > |change to|TableFunction> split = new > MySplitUDTF();| > I have run the following the codes successfully > that contain all the contents from the above. > ①[InnerJoinwithTableFunction.java|https://paste.ubuntu.com/p/MMXJPrfRWC] > ②[MySplitUDTF.java|https://paste.ubuntu.com/p/Q6fDHxw4Td/] > Reason: > In this part, > it says: > joinLateral(call("split", $("c")).as("s", "t", "v")) > it means: > the udtf has 1 input "c", > and 3 outputs "s", "t", "v" > So: > these outputs should have 3 types. > such as TableFunction> > instead of only > -TableFunction split- > !screenshot-1.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
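To make the reasoning above concrete: because the call is joined with .as("s", "t", "v"), the TableFunction's type parameter has to describe all three output fields. A minimal sketch of such a function; the concrete field types here are assumptions for illustration, not taken from the ticket:

{code:java}
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.table.functions.TableFunction;

// One input column, three output fields: the declared type parameter covers all three.
public class MySplitUDTF extends TableFunction<Tuple3<String, String, Integer>> {
    public void eval(String str) {
        for (String token : str.split(",")) {
            collect(new Tuple3<>(token, str, token.length()));
        }
    }
}
{code}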
[GitHub] [flink] flinkbot edited a comment on pull request #14417: [FLINK-20668][table-planner-blink] Introduce translateToExecNode method for FlinkPhysicalRel
flinkbot edited a comment on pull request #14417: URL: https://github.com/apache/flink/pull/14417#issuecomment-747849155 ## CI report: * 67c73bdacc77bab17d60526487c13fe6693e4904 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10999) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-20567) Document Error
[ https://issues.apache.org/jira/browse/FLINK-20567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-20567: --- Assignee: appleyuchi > Document Error > -- > > Key: FLINK-20567 > URL: https://issues.apache.org/jira/browse/FLINK-20567 > Project: Flink > Issue Type: Bug > Components: Documentation, Table SQL / Ecosystem >Reporter: appleyuchi >Assignee: appleyuchi >Priority: Major > Attachments: screenshot-1.png > > > ||item||Content|| > |Document|[Link|https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html]| > |part|Inner Join with Table Function (UDTF)| > |origin|TableFunction split = new MySplitUDTF();| > |change to|TableFunction> split = new > MySplitUDTF();| > I have run the following the codes successfully > that contain all the contents from the above. > ①[InnerJoinwithTableFunction.java|https://paste.ubuntu.com/p/MMXJPrfRWC] > ②[MySplitUDTF.java|https://paste.ubuntu.com/p/Q6fDHxw4Td/] > Reason: > In this part, > it says: > joinLateral(call("split", $("c")).as("s", "t", "v")) > it means: > the udtf has 1 input "c", > and 3 outputs "s", "t", "v" > So: > these outputs should have 3 types. > such as TableFunction> > instead of only > -TableFunction split- > !screenshot-1.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-20576) Flink Temporal Join Hive Dim Error
[ https://issues.apache.org/jira/browse/FLINK-20576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-20576: --- Assignee: Leonard Xu > Flink Temporal Join Hive Dim Error > -- > > Key: FLINK-20576 > URL: https://issues.apache.org/jira/browse/FLINK-20576 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.12.0 >Reporter: HideOnBush >Assignee: Leonard Xu >Priority: Major > Fix For: 1.13.0, 1.12.1 > > > > KAFKA DDL > {code:java} > CREATE TABLE hive_catalog.flink_db_test.kfk_master_test ( > master Row String, action int, orderStatus int, orderKey String, actionTime bigint, > areaName String, paidAmount double, foodAmount double, startTime String, > person double, orderSubType int, checkoutTime String>, > proctime as PROCTIME() > ) WITH (properties ..){code} > > FLINK client query sql > {noformat} > SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl > JOIN hive_catalog.gauss.dim_extend_shop_info /*+ > OPTIONS('streaming-source.enable'='true', > 'streaming-source.partition.include' = 'latest', >'streaming-source.monitor-interval' = '12 > h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME > AS OF kafk_tbl.proctime AS dim >ON kafk_tbl.groupID = dim.group_id where kafk_tbl.groupID is not > null;{noformat} > When I execute the above statement, these stack error messages are returned > Caused by: java.lang.NullPointerException: bufferCaused by: > java.lang.NullPointerException: buffer at > org.apache.flink.core.memory.MemorySegment.(MemorySegment.java:161) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.HybridMemorySegment.(HybridMemorySegment.java:86) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) > ~[flink-table_2.11-1.12.0.jar:1.12.0] > > Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table > into cache after 3 retriesCaused by: > org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache > after 3 retries at > org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20576) Flink Temporal Join Hive Dim Error
[ https://issues.apache.org/jira/browse/FLINK-20576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20576: Fix Version/s: 1.12.1 > Flink Temporal Join Hive Dim Error > -- > > Key: FLINK-20576 > URL: https://issues.apache.org/jira/browse/FLINK-20576 > Project: Flink > Issue Type: Bug > Components: Connectors / Hive, Table SQL / Ecosystem >Affects Versions: 1.12.0 >Reporter: HideOnBush >Priority: Major > Fix For: 1.13.0, 1.12.1 > > > > KAFKA DDL > {code:java} > CREATE TABLE hive_catalog.flink_db_test.kfk_master_test ( > master Row String, action int, orderStatus int, orderKey String, actionTime bigint, > areaName String, paidAmount double, foodAmount double, startTime String, > person double, orderSubType int, checkoutTime String>, > proctime as PROCTIME() > ) WITH (properties ..){code} > > FLINK client query sql > {noformat} > SELECT * FROM hive_catalog.flink_db_test.kfk_master_test AS kafk_tbl > JOIN hive_catalog.gauss.dim_extend_shop_info /*+ > OPTIONS('streaming-source.enable'='true', > 'streaming-source.partition.include' = 'latest', >'streaming-source.monitor-interval' = '12 > h','streaming-source.partition-order' = 'partition-name') */ FOR SYSTEM_TIME > AS OF kafk_tbl.proctime AS dim >ON kafk_tbl.groupID = dim.group_id where kafk_tbl.groupID is not > null;{noformat} > When I execute the above statement, these stack error messages are returned > Caused by: java.lang.NullPointerException: bufferCaused by: > java.lang.NullPointerException: buffer at > org.apache.flink.core.memory.MemorySegment.(MemorySegment.java:161) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.HybridMemorySegment.(HybridMemorySegment.java:86) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.core.memory.MemorySegmentFactory.wrap(MemorySegmentFactory.java:55) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.data.binary.BinaryStringData.fromBytes(BinaryStringData.java:98) > ~[flink-table_2.11-1.12.0.jar:1.12.0] > > Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to load table > into cache after 3 retriesCaused by: > org.apache.flink.util.FlinkRuntimeException: Failed to load table into cache > after 3 retries at > org.apache.flink.table.filesystem.FileSystemLookupFunction.checkCacheReload(FileSystemLookupFunction.java:143) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.filesystem.FileSystemLookupFunction.eval(FileSystemLookupFunction.java:103) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > LookupFunction$1577.flatMap(Unknown Source) ~[?:?] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:82) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.table.runtime.operators.join.lookup.LookupJoinRunner.processElement(LookupJoinRunner.java:36) > ~[flink-table-blink_2.11-1.12.0.jar:1.12.0] at > org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251490#comment-17251490 ] Jingsong Lee commented on FLINK-20665: -- Assigned to u [~ZhuShang]. I marked this as a Blocker. We should fix this before 1.12.1. > BTW,I find that if some small files are compacted,when restore, they will be >compacted again. If it has been completely compacted, it should not be compacted again, unless it is not completed. > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee reassigned FLINK-20665: Assignee: zhuxiaoshang > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Assignee: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > 
org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20664) Support setting service account for TaskManager pod
[ https://issues.apache.org/jira/browse/FLINK-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251489#comment-17251489 ] Yang Wang commented on FLINK-20664: --- Since the user code executes on the TaskManager, I am afraid some users do not want to give TaskManager pods a service account with too many permissions. Currently, only ConfigMap permissions are needed. Why would we need to create two bindings? I think {{kubernetes.taskmanager.service-account}} could be set to the same value as {{kubernetes.jobmanager.service-account}} if users want. Two config options are more flexible. Right? > Support setting service account for TaskManager pod > --- > > Key: FLINK-20664 > URL: https://issues.apache.org/jira/browse/FLINK-20664 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Critical > Fix For: 1.12.1 > > > Currently, we only set the service account for JobManager. The TaskManager is > using the default service account. Before the KubernetesHAService is > introduced, it works because the TaskManager does not need to access the K8s > resource(e.g. ConfigMap) directly. But now the TaskManager needs to watch > ConfigMap and retrieve leader address. So if the default service account does > not have enough permission, users could not specify a valid service account > for TaskManager. > > We should introduce a new config option for TaskManager service account. > {{kubernetes.jobmanager.service-account}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
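A minimal sketch of the flexibility argued for above, assuming the new TaskManager-side option proposed in this ticket; users who do not need the separation could simply point both options at the same account:

{code:java}
import org.apache.flink.configuration.Configuration;

public final class ServiceAccountConfigExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Existing option for the JobManager pod.
        conf.setString("kubernetes.jobmanager.service-account", "flink-service-account");
        // Option proposed in this ticket for the TaskManager pods; it can reuse the
        // same service account when separate permissions are not required.
        conf.setString("kubernetes.taskmanager.service-account", "flink-service-account");
    }
}
{code}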
[jira] [Comment Edited] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251486#comment-17251486 ] zhuxiaoshang edited comment on FLINK-20665 at 12/18/20, 3:55 AM: - [~lzljs3620320],I'd like to do this job. BTW,I find that if some small files are compacted,when restore, they will be compacted again. How can we ensure data consistency. was (Author: zhushang): [~lzljs3620320],I'd like to do this job. BTW,I find that if some small files are compacted,when restore, they will be compacted again. > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] shuiqiangchen commented on a change in pull request #14401: [FLINK-20621][python] Refactor the TypeInformation implementation in Python DataStream API.
shuiqiangchen commented on a change in pull request #14401: URL: https://github.com/apache/flink/pull/14401#discussion_r545557068 ## File path: flink-python/pyflink/common/typeinfo.py ## @@ -407,12 +406,12 @@ def to_internal_type(self, obj): raise ValueError("Unexpected tuple %r with RowTypeInfo" % obj) else: if isinstance(obj, dict): -return tuple(obj.get(n) for n in self._j_typeinfo.getFieldNames()) +return tuple(obj.get(n) for n in self.field_names) Review comment: Here we should apply obj.get(field_name): it returns None when field_name is not present in obj, whereas obj[field_name] would raise a KeyError. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251486#comment-17251486 ] zhuxiaoshang commented on FLINK-20665: -- [~lzljs3620320],I'd like to do this job. BTW,I find that if some small files are compacted,when restore, they will be compacted again. > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > 
org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20644) table api use UDF(ScalarFunction ) if eval return Unit
[ https://issues.apache.org/jira/browse/FLINK-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20644: Priority: Major (was: Critical) > table api use UDF(ScalarFunction ) if eval return Unit > --- > > Key: FLINK-20644 > URL: https://issues.apache.org/jira/browse/FLINK-20644 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.11.1 > Environment: groupId:org.apache.flink > artifactId:flink-table-api-scala-bridge_2.11 > version:1.11.1 >Reporter: shiyu >Priority: Major > Attachments: image-2020-12-17-16-04-15-131.png, > image-2020-12-17-16-07-39-827.png > > > flink-table-api-scala-bridge_2.11 > !image-2020-12-17-16-07-39-827.png! > !image-2020-12-17-16-04-15-131.png! > console: > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Failed > to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Defaulting to > no-operation (NOP) logger implementationSLF4J: See > http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.ERROR > StatusLogger Log4j2 could not find a logging implementation. Please add > log4j-core to the classpath. Using SimpleLogger to log to the console.../* 1 > *//* 2 */ public class StreamExecCalc$13 extends > org.apache.flink.table.runtime.operators.AbstractProcessStreamOperator/* 3 */ > implements > org.apache.flink.streaming.api.operators.OneInputStreamOperator \{/* 4 *//* 5 > */ private final Object[] references;/* 6 */ private transient > org.apache.flink.table.runtime.typeutils.StringDataSerializer > typeSerializer$6;/* 7 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.StringConverter > converter$9;/* 8 */ private transient > cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58;/* > 9 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.GenericConverter > converter$12;/* 10 */ org.apache.flink.table.data.BoxedWrapperRowData > out = new org.apache.flink.table.data.BoxedWrapperRowData(3);/* 11 */ > private final org.apache.flink.streaming.runtime.streamrecord.StreamRecord > outElement = new > org.apache.flink.streaming.runtime.streamrecord.StreamRecord(null);/* 12 *//* > 13 */ public StreamExecCalc$13(/* 14 */ Object[] > references,/* 15 */ > org.apache.flink.streaming.runtime.tasks.StreamTask task,/* 16 */ > org.apache.flink.streaming.api.graph.StreamConfig config,/* 17 */ > org.apache.flink.streaming.api.operators.Output output,/* 18 */ > org.apache.flink.streaming.runtime.tasks.ProcessingTimeService > processingTimeService) throws Exception {/* 19 */ this.references = > references;/* 20 */ typeSerializer$6 = > (((org.apache.flink.table.runtime.typeutils.StringDataSerializer) > references[0]));/* 21 */ converter$9 = > (((org.apache.flink.table.data.util.DataFormatConverters.StringConverter) > references[1]));/* 22 */ > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58 > = (((cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode) > references[2]));/* 23 */ converter$12 = > (((org.apache.flink.table.data.util.DataFormatConverters.GenericConverter) > references[3]));/* 24 */ this.setup(task, config, output);/* 25 */ > if (this instanceof > org.apache.flink.streaming.api.operators.AbstractStreamOperator) {/* 26 */ > ((org.apache.flink.streaming.api.operators.AbstractStreamOperator) > this)/* 27 */ > .setProcessingTimeService(processingTimeService);/* 28 */ }/* 29 */ > 
}/* 30 *//* 31 */ @Override/* 32 */ public void open() > throws Exception \{/* 33 */ super.open();/* 34 */ /* 35 */ > > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58.open(new > org.apache.flink.table.functions.FunctionContext(getRuntimeContext()));/* 36 > */ /* 37 */ }/* 38 *//* 39 */ @Override/* 40 */ > public void > processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord > element) throws Exception \{/* 41 */ > org.apache.flink.table.data.RowData in1 = > (org.apache.flink.table.data.RowData) element.getValue();/* 42 */ /* > 43 */ org.apache.flink.table.data.binary.BinaryStringData field$5;/* > 44 */ boolean isNull$5;/* 45 */ > org.apache.flink.table.data.binary.BinaryStringData field$7;/* 46 */ > org.apache.flink.table.data.TimestampDat
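The underlying problem in this ticket is that a ScalarFunction whose eval returns Unit gives the planner no result type to generate code for. A minimal sketch of a well-formed scalar function, illustrative only and not the reporter's class:

{code:java}
import org.apache.flink.table.functions.ScalarFunction;

// eval must produce a value; its return type becomes the function's result type.
public class HashCode extends ScalarFunction {
    public int eval(String s) {
        return s.hashCode();
    }
}
{code}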
[jira] [Updated] (FLINK-20644) table api use UDF(ScalarFunction ) if eval return Unit
[ https://issues.apache.org/jira/browse/FLINK-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-20644: Labels: starter (was: ) > table api use UDF(ScalarFunction ) if eval return Unit > --- > > Key: FLINK-20644 > URL: https://issues.apache.org/jira/browse/FLINK-20644 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.11.1 > Environment: groupId:org.apache.flink > artifactId:flink-table-api-scala-bridge_2.11 > version:1.11.1 >Reporter: shiyu >Priority: Major > Labels: starter > Attachments: image-2020-12-17-16-04-15-131.png, > image-2020-12-17-16-07-39-827.png > > > flink-table-api-scala-bridge_2.11 > !image-2020-12-17-16-07-39-827.png! > !image-2020-12-17-16-04-15-131.png! > console: > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Failed > to load class "org.slf4j.impl.StaticLoggerBinder".SLF4J: Defaulting to > no-operation (NOP) logger implementationSLF4J: See > http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.ERROR > StatusLogger Log4j2 could not find a logging implementation. Please add > log4j-core to the classpath. Using SimpleLogger to log to the console.../* 1 > *//* 2 */ public class StreamExecCalc$13 extends > org.apache.flink.table.runtime.operators.AbstractProcessStreamOperator/* 3 */ > implements > org.apache.flink.streaming.api.operators.OneInputStreamOperator \{/* 4 *//* 5 > */ private final Object[] references;/* 6 */ private transient > org.apache.flink.table.runtime.typeutils.StringDataSerializer > typeSerializer$6;/* 7 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.StringConverter > converter$9;/* 8 */ private transient > cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58;/* > 9 */ private transient > org.apache.flink.table.data.util.DataFormatConverters.GenericConverter > converter$12;/* 10 */ org.apache.flink.table.data.BoxedWrapperRowData > out = new org.apache.flink.table.data.BoxedWrapperRowData(3);/* 11 */ > private final org.apache.flink.streaming.runtime.streamrecord.StreamRecord > outElement = new > org.apache.flink.streaming.runtime.streamrecord.StreamRecord(null);/* 12 *//* > 13 */ public StreamExecCalc$13(/* 14 */ Object[] > references,/* 15 */ > org.apache.flink.streaming.runtime.tasks.StreamTask task,/* 16 */ > org.apache.flink.streaming.api.graph.StreamConfig config,/* 17 */ > org.apache.flink.streaming.api.operators.Output output,/* 18 */ > org.apache.flink.streaming.runtime.tasks.ProcessingTimeService > processingTimeService) throws Exception {/* 19 */ this.references = > references;/* 20 */ typeSerializer$6 = > (((org.apache.flink.table.runtime.typeutils.StringDataSerializer) > references[0]));/* 21 */ converter$9 = > (((org.apache.flink.table.data.util.DataFormatConverters.StringConverter) > references[1]));/* 22 */ > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58 > = (((cn.bicon.tableapitest.udf.ScalarFunctionTest$HashCode) > references[2]));/* 23 */ converter$12 = > (((org.apache.flink.table.data.util.DataFormatConverters.GenericConverter) > references[3]));/* 24 */ this.setup(task, config, output);/* 25 */ > if (this instanceof > org.apache.flink.streaming.api.operators.AbstractStreamOperator) {/* 26 */ > ((org.apache.flink.streaming.api.operators.AbstractStreamOperator) > this)/* 27 */ > .setProcessingTimeService(processingTimeService);/* 28 */ 
}/* 29 */ > }/* 30 *//* 31 */ @Override/* 32 */ public void open() > throws Exception \{/* 33 */ super.open();/* 34 */ /* 35 */ > > function_cn$bicon$tableapitest$udf$ScalarFunctionTest$HashCode$8999e79cc91b971a8777461fb7698c58.open(new > org.apache.flink.table.functions.FunctionContext(getRuntimeContext()));/* 36 > */ /* 37 */ }/* 38 *//* 39 */ @Override/* 40 */ > public void > processElement(org.apache.flink.streaming.runtime.streamrecord.StreamRecord > element) throws Exception \{/* 41 */ > org.apache.flink.table.data.RowData in1 = > (org.apache.flink.table.data.RowData) element.getValue();/* 42 */ /* > 43 */ org.apache.flink.table.data.binary.BinaryStringData field$5;/* > 44 */ boolean isNull$5;/* 45 */ > org.apache.flink.table.data.binary.BinaryStringData field$7;/* 46 */ > org.apache.flink.
[jira] [Updated] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-20665: - Fix Version/s: 1.12.1 > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Major > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > 
org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-20665: - Priority: Blocker (was: Major) > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > 
org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251484#comment-17251484 ] Jingsong Lee commented on FLINK-20665: -- [~ZhuShang] Do you want to fix this? > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Blocker > Fix For: 1.12.1 > > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > 
org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251483#comment-17251483 ] Jingsong Lee commented on FLINK-20665: -- Hi [~ZhuShang], yes, we should keep old temp file for big files. * Simple solution: We can remove the optimization of HDFS. Just copy the bytes to new file. * Better solution: I don't know if HDFS supports adding soft links to avoid copying data. I think we can choose simple solution to fix this problem in 1.12.1. WDYT, [~ZhuShang] > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Major > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
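For illustration, the "simple solution" described above (copy the bytes into the compacted target instead of relying on an HDFS rename) could look roughly like the sketch below. The class and method names are made up for this example and this is not the actual CompactCoordinator/CompactOperator change; only the file-system calls are standard Flink APIs. The key point is that the source file stays in place, so a checkpoint that still references the .uncompacted-* path can be restored.

{code:java}
import org.apache.flink.core.fs.FSDataInputStream;
import org.apache.flink.core.fs.FSDataOutputStream;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;
import org.apache.flink.util.IOUtils;

import java.io.IOException;

/** Sketch of the "copy instead of rename" idea; names are illustrative, not Flink source. */
public final class CopyInsteadOfRenameSketch {

    /** Rewrites src into target byte by byte; src is left in place for later cleanup. */
    public static void compactByCopy(Path src, Path target) throws IOException {
        FileSystem fs = src.getFileSystem();
        try (FSDataInputStream in = fs.open(src);
                FSDataOutputStream out = fs.create(target, FileSystem.WriteMode.OVERWRITE)) {
            // Buffered copy; the streams are closed by try-with-resources, hence close=false.
            IOUtils.copyBytes(in, out, 4096, false);
        }
    }

    private CopyInsteadOfRenameSketch() {}
}
{code}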
[GitHub] [flink] flinkbot commented on pull request #14417: [FLINK-20668][table-planner-blink] Introduce translateToExecNode method for FlinkPhysicalRel
flinkbot commented on pull request #14417: URL: https://github.com/apache/flink/pull/14417#issuecomment-747849155 ## CI report: * 67c73bdacc77bab17d60526487c13fe6693e4904 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #14416: [hotfix][k8s] Fix k8s ha service doc.
flinkbot edited a comment on pull request #14416: URL: https://github.com/apache/flink/pull/14416#issuecomment-747844139 ## CI report: * baee37d34012ad05bf5414ab9b6a81cfe19f881a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10998) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-20669) Add the jzlib LICENSE file in flink-python module
Huang Xingbo created FLINK-20669: Summary: Add the jzlib LICENSE file in flink-python module Key: FLINK-20669 URL: https://issues.apache.org/jira/browse/FLINK-20669 Project: Flink Issue Type: Bug Components: API / Python Affects Versions: 1.12.0, 1.13.0 Reporter: Huang Xingbo Fix For: 1.13.0, 1.12.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20665) FileNotFoundException when restore from latest Checkpoint
[ https://issues.apache.org/jira/browse/FLINK-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251476#comment-17251476 ] zhuxiaoshang commented on FLINK-20665: -- When the file is big, it will be renamed directly. So when restoring from the latest checkpoint, the file is lost. Maybe that is the root cause. WDYT [~jark], [~lzljs3620320] > FileNotFoundException when restore from latest Checkpoint > - > > Key: FLINK-20665 > URL: https://issues.apache.org/jira/browse/FLINK-20665 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: zhuxiaoshang >Priority: Major > > reproduce steps: > 1.a kafka to hdfs job,open `auto-compaction` > 2.when the job have done a successful checkpoint then cancel the job. > 3.restore from the latest checkpoint. > 4.after the first checkpoint has done ,the exception will appear > {code:java} > 2020-12-18 10:40:58java.io.UncheckedIOException: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:160) > at > org.apache.flink.table.runtime.util.BinPacking.pack(BinPacking.java:41)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$2(CompactCoordinator.java:169) > at java.util.HashMap.forEach(HashMap.java:1289)at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.coordinate(CompactCoordinator.java:166) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.commitUpToCheckpoint(CompactCoordinator.java:147) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.processElement(CompactCoordinator.java:137) > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:193) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:179) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:152) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)at > java.lang.Thread.run(Thread.java:748)Caused by: > java.io.FileNotFoundException: File does not exist: > hdfs:///day=2020-12-18/hour=10/.uncompacted-part-84db54f8-eda9-4e01-8e85-672144041642-0-0 > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441) > at > org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434) > at > org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getFileStatus(HadoopFileSystem.java:85) > at > 
org.apache.flink.core.fs.SafetyNetWrapperFileSystem.getFileStatus(SafetyNetWrapperFileSystem.java:64) > at > org.apache.flink.table.filesystem.stream.compact.CompactCoordinator.lambda$coordinate$1(CompactCoordinator.java:158) > ... 17 more > {code} > DDL > {code:java} > CREATE TABLE cpc_bd_recall_log_hdfs (log_timestamp BIGINT,ip STRING, > `raw` STRING,`day` STRING, `hour` STRING) PARTITIONED BY (`day` , > `hour`) WITH ('connector'='filesystem','path'='hdfs://xxx', > 'format'='parquet','parquet.compression'='SNAPPY', > 'sink.partition-commit.policy.kind' = 'success-file','auto-compaction' = > 'true','compaction.file-size' = '128MB'); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
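To make the described failure mode concrete, the fast path referred to in the comment behaves roughly like the sketch below (hypothetical names, not the actual Flink source): a single large input file is renamed into place rather than rewritten, so the .uncompacted-* path that the previous checkpoint still references disappears.

{code:java}
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

import java.io.IOException;
import java.util.List;

/** Hypothetical illustration of the rename fast path; not the actual Flink source. */
public final class RenameFastPathIllustration {

    static void compact(FileSystem fs, List<Path> inputs, Path target, long targetFileSize)
            throws IOException {
        if (inputs.size() == 1 && fs.getFileStatus(inputs.get(0)).getLen() >= targetFileSize) {
            // Fast path: the single big ".uncompacted-*" file is moved, not copied, so it no
            // longer exists if the job is restored from a checkpoint that still references it.
            fs.rename(inputs.get(0), target);
        } else {
            // Normal path: the inputs are rewritten into the target and kept until cleanup.
            rewriteAll(fs, inputs, target);
        }
    }

    private static void rewriteAll(FileSystem fs, List<Path> inputs, Path target)
            throws IOException {
        // Omitted: format-specific rewrite of all input files into the target file.
    }

    private RenameFastPathIllustration() {}
}
{code}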
[jira] [Assigned] (FLINK-20648) Unable to restore job from savepoint when using Kubernetes based HA services
[ https://issues.apache.org/jira/browse/FLINK-20648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-20648: Assignee: Yang Wang > Unable to restore job from savepoint when using Kubernetes based HA services > > > Key: FLINK-20648 > URL: https://issues.apache.org/jira/browse/FLINK-20648 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: David Morávek >Assignee: Yang Wang >Priority: Critical > Fix For: 1.12.1 > > > When restoring job from savepoint, we always end up with following error: > {code} > Caused by: org.apache.flink.runtime.client.JobInitializationException: Could > not instantiate JobManager. > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:463) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764) > ~[?:?] > ... 3 more > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Stopped > retrying the operation because the error is not retryable. > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > ~[?:?] > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2063) ~[?:?] > at > org.apache.flink.kubernetes.highavailability.KubernetesStateHandleStore.addAndLock(KubernetesStateHandleStore.java:150) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStore.addCheckpoint(DefaultCompletedCheckpointStore.java:211) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1479) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.tryRestoreExecutionGraphFromSavepoint(SchedulerBase.java:325) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:266) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.(SchedulerBase.java:238) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.DefaultScheduler.(DefaultScheduler.java:134) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:108) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:323) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:310) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:96) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:41) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.(JobManagerRunnerImpl.java:141) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:80) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > 
org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:450) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764) > ~[?:?] > ... 3 more > Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: > Stopped retrying the operation because the error is not retryable. > at > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperation$1(FutureUtils.java:166) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) > ~[?:?] > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) > ~[?:?] > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) > ~[?:?] > ... 3 more > Caused by: java.util.concurrent.CompletionException: > org.apache.flink.kubernetes.kubeclient.resources.KubernetesException: Cannot > retry checkAndUpdateConfigMap with configMap > pipelines-runner-fulltext-6e99e672
[jira] [Assigned] (FLINK-20664) Support setting service account for TaskManager pod
[ https://issues.apache.org/jira/browse/FLINK-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song reassigned FLINK-20664: Assignee: Yang Wang > Support setting service account for TaskManager pod > --- > > Key: FLINK-20664 > URL: https://issues.apache.org/jira/browse/FLINK-20664 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Critical > Fix For: 1.12.1 > > > Currently, we only set the service account for JobManager. The TaskManager is > using the default service account. Before the KubernetesHAService is > introduced, it works because the TaskManager does not need to access the K8s > resource(e.g. ConfigMap) directly. But now the TaskManager needs to watch > ConfigMap and retrieve leader address. So if the default service account does > not have enough permission, users could not specify a valid service account > for TaskManager. > > We should introduce a new config option for TaskManager service account. > {{kubernetes.jobmanager.service-account}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20664) Support setting service account for TaskManager pod
[ https://issues.apache.org/jira/browse/FLINK-20664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251475#comment-17251475 ] Boris Lublinsky commented on FLINK-20664: - Do you mean: kubernetes.taskmanager.service-account ? I think a better solution is to use the same service account as job manager. In this case one role and one binding is sufficient. If you make them different, then a user will have to create two bindings > Support setting service account for TaskManager pod > --- > > Key: FLINK-20664 > URL: https://issues.apache.org/jira/browse/FLINK-20664 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: Yang Wang >Priority: Critical > Fix For: 1.12.1 > > > Currently, we only set the service account for JobManager. The TaskManager is > using the default service account. Before the KubernetesHAService is > introduced, it works because the TaskManager does not need to access the K8s > resource(e.g. ConfigMap) directly. But now the TaskManager needs to watch > ConfigMap and retrieve leader address. So if the default service account does > not have enough permission, users could not specify a valid service account > for TaskManager. > > We should introduce a new config option for TaskManager service account. > {{kubernetes.jobmanager.service-account}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
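Assuming the option is added under the name mentioned in the comment above (the final key may differ once the ticket is implemented), a rough sketch of pointing both JobManager and TaskManager at the same service account, which keeps the single Role/RoleBinding setup described there, could look like this:

{code:java}
import org.apache.flink.configuration.Configuration;

/** Sketch only: the taskmanager key below is the one proposed in this ticket and may change. */
public class ServiceAccountConfigSketch {

    public static Configuration flinkConfiguration() {
        Configuration conf = new Configuration();
        // Existing option (Flink 1.12): service account used by the JobManager pod.
        conf.setString("kubernetes.jobmanager.service-account", "flink-service-account");
        // Proposed option: service account for TaskManager pods. Reusing the same account
        // keeps a single Role/RoleBinding, as suggested in the comment above.
        conf.setString("kubernetes.taskmanager.service-account", "flink-service-account");
        return conf;
    }
}
{code}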
[jira] [Updated] (FLINK-20648) Unable to restore job from savepoint when using Kubernetes based HA services
[ https://issues.apache.org/jira/browse/FLINK-20648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song updated FLINK-20648: - Fix Version/s: 1.12.1 > Unable to restore job from savepoint when using Kubernetes based HA services > > > Key: FLINK-20648 > URL: https://issues.apache.org/jira/browse/FLINK-20648 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.12.0 >Reporter: David Morávek >Priority: Critical > Fix For: 1.12.1 > > > When restoring job from savepoint, we always end up with following error: > {code} > Caused by: org.apache.flink.runtime.client.JobInitializationException: Could > not instantiate JobManager. > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:463) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764) > ~[?:?] > ... 3 more > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Stopped > retrying the operation because the error is not retryable. > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > ~[?:?] > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2063) ~[?:?] > at > org.apache.flink.kubernetes.highavailability.KubernetesStateHandleStore.addAndLock(KubernetesStateHandleStore.java:150) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStore.addCheckpoint(DefaultCompletedCheckpointStore.java:211) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1479) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.tryRestoreExecutionGraphFromSavepoint(SchedulerBase.java:325) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:266) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.SchedulerBase.(SchedulerBase.java:238) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.DefaultScheduler.(DefaultScheduler.java:134) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:108) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:323) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:310) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:96) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:41) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl.(JobManagerRunnerImpl.java:141) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.dispatcher.DefaultJobManagerRunnerFactory.createJobManagerRunner(DefaultJobManagerRunnerFactory.java:80) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:450) > 
~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1764) > ~[?:?] > ... 3 more > Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: > Stopped retrying the operation because the error is not retryable. > at > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperation$1(FutureUtils.java:166) > ~[flink-dist_2.11-1.12.0.jar:1.12.0] > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) > ~[?:?] > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) > ~[?:?] > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) > ~[?:?] > ... 3 more > Caused by: java.util.concurrent.CompletionException: > org.apache.flink.kubernetes.kubeclient.resources.KubernetesException: Cannot > retry checkAndUpdateConfigMap with configMap > pipelines-runner-fulltext-6e99e672-4af29f0768624632839835717898b08d-jobm
[jira] [Commented] (FLINK-20667) Configurable batch size for vectorized reader
[ https://issues.apache.org/jira/browse/FLINK-20667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251474#comment-17251474 ] Leonard Xu commented on FLINK-20667: The question was also reported on [Stack Overflow|https://stackoverflow.com/questions/65342630/flink-sql-read-hive-table-throw-java-lang-arrayindexoutofboundsexception-1024/65351402#65351402] > Configurable batch size for vectorized reader > - > > Key: FLINK-20667 > URL: https://issues.apache.org/jira/browse/FLINK-20667 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Rui Li >Priority: Major > > This can be useful to work around issues like ORC-598 and ORC-672 -- This message was sent by Atlassian Jira (v8.3.4#803005)
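A configurable batch size would presumably be exposed as an ordinary ConfigOption; the key name and default value in the sketch below are purely illustrative and not decided by this ticket:

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

/** Illustrative only: the key name and default are hypothetical, not decided by FLINK-20667. */
public class VectorizedReaderOptionsSketch {

    public static final ConfigOption<Integer> READ_BATCH_SIZE =
            ConfigOptions.key("vectorized-read.batch-size") // hypothetical key
                    .intType()
                    .defaultValue(2048) // a typical vectorized batch size, shown as an example
                    .withDescription(
                            "Maximum number of rows per batch produced by the vectorized "
                                    + "ORC/Parquet readers; lowering it can help work around "
                                    + "issues such as ORC-598 and ORC-672.");
}
{code}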