[GitHub] [flink] kgrimsby commented on pull request #20685: [FLINK-28429][python] Optimize PyFlink tests
kgrimsby commented on PR #20685: URL: https://github.com/apache/flink/pull/20685#issuecomment-1257563464 Hi, doesn't this commit introduce a bug? I've tried to set up from the flink release-1.16 branch and found that the change in `pyflink/fn_execution/flink_fn_execution_pb2.py` at `DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'...')` will never yield a descriptor, since `AddSerializedFile` in protobuf <3.18 has no return statement:

```python
def AddSerializedFile(self, serialized_file_desc_proto):
    """Adds the FileDescriptorProto and its types to this pool.

    Args:
      serialized_file_desc_proto (bytes): A bytes string, serialization of the
        :class:`FileDescriptorProto` to add.
    """
    # pylint: disable=g-import-not-at-top
    from google.protobuf import descriptor_pb2
    file_desc_proto = descriptor_pb2.FileDescriptorProto.FromString(
        serialized_file_desc_proto)
    self.Add(file_desc_proto)
```

Am I missing something here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI
[ https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609324#comment-17609324 ] Matthias Pohl commented on FLINK-29315: --- Great. Thanks for solving it. [~wangyang0918] I'm wondering what triggered that behavior? Did an automatic update of the kernel to {{3.10.0-1160.62.1.el7.x86_64}} happen recently? > HDFSTest#testBlobServerRecovery fails on CI > --- > > Key: FLINK-29315 > URL: https://issues.apache.org/jira/browse/FLINK-29315 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / FileSystem, Tests >Affects Versions: 1.16.0, 1.15.2 >Reporter: Chesnay Schepler >Priority: Blocker > Labels: pull-request-available > Fix For: 1.16.0, 1.17.0 > > > The test started failing 2 days ago on different branches. I suspect > something's wrong with the CI infrastructure. > {code:java} > Sep 15 09:11:22 [ERROR] Failures: > Sep 15 09:11:22 [ERROR] HDFSTest.testBlobServerRecovery Multiple Failures > (2 failures) > Sep 15 09:11:22 java.lang.AssertionError: Test failed Error while > running command to get file permissions : java.io.IOException: Cannot run > program "ls": error=1, Operation not permitted > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.runCommand(Shell.java:913) > Sep 15 09:11:22 at org.apache.hadoop.util.Shell.run(Shell.java:869) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1264) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1246) > Sep 15 09:11:22 at > org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:485) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444) > Sep 15 09:11:22 at > org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93) > Sep 15 09:11:22 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Sep 15 09:11:22 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Sep 15 09:11:22 at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) > Sep 15 09:11:22 ... 67 more > Sep 15 09:11:22 > Sep 15 09:11:22 java.lang.NullPointerException: > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #384: [FLINK-29159] Harden initial deployment logic
morhidi commented on PR #384: URL: https://github.com/apache/flink-kubernetes-operator/pull/384#issuecomment-1257554675 +1 Thanks, Gyula. I have the feeling, though, that in the long term we should rely on a more consistent state-machine model instead of introducing new states/substates in utility methods. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-29410) Add checks to guarantee the non-deprecated options not conflicting with standard YAML
Yun Tang created FLINK-29410: Summary: Add checks to guarantee the non-deprecated options not conflicting with standard YAML Key: FLINK-29410 URL: https://issues.apache.org/jira/browse/FLINK-29410 Project: Flink Issue Type: Sub-task Components: Runtime / Configuration, Tests Reporter: Yun Tang To support the standard YAML parser, we added suffixes to all necessary options in FLINK-29372. However, this cannot guarantee that newly added options will still obey this rule, so we should add test checks to enforce it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
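A minimal sketch of the kind of guard test this asks for, assuming SnakeYAML as the reference parser; the sample entries and the failure policy below are illustrative, not the actual Flink test:

{code:java}
import org.yaml.snakeyaml.Yaml;

import java.util.Map;

// Sketch: feed "key: value" entries through a standards-compliant YAML parser
// and fail when an option key or value does not survive as a single map entry.
public class YamlCompatibilityCheck {
    public static void main(String[] args) {
        Yaml yaml = new Yaml();
        String[] sampleEntries = {
            "taskmanager.numberOfTaskSlots: 4",
            "pipeline.jars: ['file:///tmp/a.jar']"
        };
        for (String entry : sampleEntries) {
            Map<String, Object> parsed = yaml.load(entry);
            if (parsed == null || parsed.size() != 1) {
                throw new AssertionError("Option entry is not standard YAML: " + entry);
            }
        }
    }
}
{code}

In the real check, the entries would be generated from Flink's ConfigOption metadata rather than hard-coded.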
[jira] [Closed] (FLINK-28131) FLIP-168: Speculative Execution for Batch Job
[ https://issues.apache.org/jira/browse/FLINK-28131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28131. --- Release Note: Speculative execution (FLIP-168) is introduced in Flink 1.16 to mitigate batch-job slowness caused by problematic nodes. A problematic node may have hardware problems, transient I/O contention, or high CPU load. These problems can make the hosted tasks run much slower than tasks on other nodes and affect the overall execution time of a batch job. When speculative execution is enabled, Flink keeps detecting slow tasks. Once slow tasks are detected, the nodes they are located on are identified as problematic and get blocked via the blocklist mechanism (FLIP-224). The scheduler creates new attempts for the slow tasks and deploys them to nodes that are not blocked, while the existing attempts keep running. Each new attempt processes the same input data and produces the same data as the original attempt. The first attempt to finish is admitted as the only finished attempt of the task, and the remaining attempts of the task are canceled. Most existing sources work with speculative execution (FLIP-245); only a source that uses SourceEvent must implement SupportsHandleExecutionAttemptSourceEvent to support it. Sinks do not support speculative execution yet, so speculative execution will not happen on sinks at the moment. The Web UI & REST API are also improved (FLIP-249) to display multiple concurrent attempts of tasks and blocked task managers. Resolution: Done > FLIP-168: Speculative Execution for Batch Job > - > > Key: FLINK-28131 > URL: https://issues.apache.org/jira/browse/FLINK-28131 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Speculative execution is helpful to mitigate slow tasks caused by > problematic nodes. The basic idea is to start mirror tasks on other nodes > when a slow task is detected. The mirror task processes the same input data > and produces the same data as the original task. > More details can be found in > [FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job]. > > This is the umbrella ticket to track all the changes of this feature. -- This message was sent by Atlassian Jira (v8.20.10#820010)
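A minimal sketch of how a batch job could opt in, assuming the configuration keys documented for Flink 1.16; the keys and values below are illustrative and should be verified against the docs for your version:

{code:java}
import org.apache.flink.configuration.Configuration;

public class SpeculativeExecutionConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Speculative execution is built on the adaptive batch scheduler.
        conf.setString("jobmanager.scheduler", "AdaptiveBatch");
        conf.setString("jobmanager.adaptive-batch-scheduler.speculative.enabled", "true");
        // Upper bound on concurrent attempts per task, original attempt included.
        conf.setString(
                "jobmanager.adaptive-batch-scheduler.speculative.max-concurrent-executions", "2");
        // How long a detected slow node stays on the blocklist (FLIP-224).
        conf.setString(
                "jobmanager.adaptive-batch-scheduler.speculative.block-slow-node-duration", "1 min");
        System.out.println(conf);
    }
}
{code}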
[jira] [Reopened] (FLINK-28131) FLIP-168: Speculative Execution for Batch Job
[ https://issues.apache.org/jira/browse/FLINK-28131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reopened FLINK-28131: - > FLIP-168: Speculative Execution for Batch Job > - > > Key: FLINK-28131 > URL: https://issues.apache.org/jira/browse/FLINK-28131 > Project: Flink > Issue Type: New Feature > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Speculative execution is helpful to mitigate slow tasks caused by > problematic nodes. The basic idea is to start mirror tasks on other nodes > when a slow task is detected. The mirror task processes the same input data > and produces the same data as the original task. > More details can be found in > [FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job]. > > This is the umbrella ticket to track all the changes of this feature. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] fsk119 commented on pull request #20891: [FLINK-29274][hive] Fix ObjectStore leak when different users has dif…
fsk119 commented on PR #20891: URL: https://github.com/apache/flink/pull/20891#issuecomment-1257528078 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] pvary commented on a diff in pull request #382: [FLINK-29392] Trigger error when session job is lost without HA
pvary commented on code in PR #382: URL: https://github.com/apache/flink-kubernetes-operator/pull/382#discussion_r979582956 ## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/observer/JobStatusObserver.java: ## @@ -72,22 +81,92 @@ public boolean observe(
         }
         if (!clusterJobStatuses.isEmpty()) {
+            // There are jobs on the cluster, we filter the ones for this resource
             Optional<JobStatusMessage> targetJobStatusMessage = filterTargetJob(jobStatus, clusterJobStatuses);
+
             if (targetJobStatusMessage.isEmpty()) {
-                jobStatus.setState(org.apache.flink.api.common.JobStatus.RECONCILING.name());
+                LOG.warn("No matching jobs found on the cluster");
+                ifRunningMoveToReconciling(jobStatus, previousJobStatus);
+                // We could list the jobs but cannot find the one for this resource
+                if (resource instanceof FlinkDeployment) {
+                    // This should never happen for application clusters, there is something wrong
+                    setUnknownJobError((FlinkDeployment) resource);
Review Comment: Could this be normal if we concurrently removed a job as well with a complex configuration change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29402) Add USE_DIRECT_READ configuration parameter for RocksDB
[ https://issues.apache.org/jira/browse/FLINK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609298#comment-17609298 ] Donatien commented on FLINK-29402: -- Indeed, with the RocksDB API it is easy to add this new option (two new options, with {{use_direct_io_for_flush_and_compaction}}). I am not quite familiar with the process of adding a new option, e.g. adding it to the docs, localization, ..., but if you give me some resources about the guidelines I'll be happy to create a PR. I will edit my ticket later to add some Grafana examples of the impact of DirectIO on performance. > Add USE_DIRECT_READ configuration parameter for RocksDB > --- > > Key: FLINK-29402 > URL: https://issues.apache.org/jira/browse/FLINK-29402 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.15.2 >Reporter: Donatien >Priority: Not a Priority > Labels: Enhancement, rocksdb > Fix For: 1.15.2 > > Original Estimate: 1h > Remaining Estimate: 1h > > RocksDB allows the use of DirectIO for read operations to bypass the Linux > Page Cache. To understand the impact of the Linux Page Cache on performance, one > can run a heavy workload on a single-tasked Task Manager with a container > memory limit identical to the TM process memory. Running this same workload > on a TM with no container memory limit will result in better performance, but > with host memory use exceeding the TM requirement. > The Linux Page Cache is of course useful but can give false results when > benchmarking the Managed Memory used by RocksDB. DirectIO is typically > enabled for benchmarks on working-set estimation [Zwaenepoel et > al.|https://arxiv.org/abs/1702.04323]. > I propose to add a configuration key allowing users to enable the use of > DirectIO for reads via the RocksDB API. This configuration would be > disabled by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
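A minimal sketch of what such a switch could map to internally, assuming it is wired through Flink's existing {{RocksDBOptionsFactory}} extension point and the RocksDB JNI setters that already exist today (an illustration, not the ticket's actual patch):

{code:java}
import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

import java.util.Collection;

// Sketch: an options factory that enables DirectIO for reads (and optionally
// for flush/compaction), which users can already plug in while a first-class
// configuration key is pending.
public class DirectIoOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(
            DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions
                .setUseDirectReads(true) // bypass the Linux page cache for reads
                .setUseDirectIoForFlushAndCompaction(true);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(
            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions;
    }
}
{code}

Such a factory can be registered via the {{state.backend.rocksdb.options-factory}} option; the proposed key would reduce this to a one-line config change.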
[jira] [Assigned] (FLINK-29020) Add document for CTAS feature
[ https://issues.apache.org/jira/browse/FLINK-29020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu reassigned FLINK-29020: --- Assignee: tartarus > Add document for CTAS feature > - > > Key: FLINK-29020 > URL: https://issues.apache.org/jira/browse/FLINK-29020 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.16.0 >Reporter: dalongliu >Assignee: tartarus >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29020) Add document for CTAS feature
[ https://issues.apache.org/jira/browse/FLINK-29020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609297#comment-17609297 ] Jark Wu commented on FLINK-29020: - Fixed in - master: 8ce056c59439c1a3cedd6b32c0a98a14febc7ffb > Add document for CTAS feature > - > > Key: FLINK-29020 > URL: https://issues.apache.org/jira/browse/FLINK-29020 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.16.0 >Reporter: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] wuchong merged pull request #20653: [FLINK-29020][docs] add document for CTAS feature
wuchong merged PR #20653: URL: https://github.com/apache/flink/pull/20653 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wangyang0918 closed pull request #20879: Debug CI
wangyang0918 closed pull request #20879: Debug CI URL: https://github.com/apache/flink/pull/20879 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29409) Add Transformer and Estimator for VarianceThresholdSelector
[ https://issues.apache.org/jira/browse/FLINK-29409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-29409: --- Labels: pull-request-available (was: ) > Add Transformer and Estimator for VarianceThresholdSelector > > > Key: FLINK-29409 > URL: https://issues.apache.org/jira/browse/FLINK-29409 > Project: Flink > Issue Type: New Feature > Components: Library / Machine Learning >Reporter: Jiang Xin >Priority: Major > Labels: pull-request-available > > Add Transformer and Estimator for VarianceThresholdSelector -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-ml] jiangxin369 opened a new pull request, #158: [FLINK-29409] Add Transformer and Estimator for VarianceThresholdSelector
jiangxin369 opened a new pull request, #158: URL: https://github.com/apache/flink-ml/pull/158 ## What is the purpose of the change Adding Transformer and Estimator for VarianceThresholdSelector ## Brief change log - Added Java/Python source/test/example for the Transformer and Estimator of VarianceThresholdSelector in Flink ML. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with @Public(Evolving): (no) - Does this pull request introduce a new feature? (yes) - If yes, how is the feature documented? (Java doc) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-29409) Add Transformer and Estimator for VarianceThresholdSelector
Jiang Xin created FLINK-29409: - Summary: Add Transformer and Estimator for VarianceThresholdSelector Key: FLINK-29409 URL: https://issues.apache.org/jira/browse/FLINK-29409 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Reporter: Jiang Xin Add Transformer and Estimator for VarianceThresholdSelector -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29272) Document DataStream API (DataStream to Table) for table store
[ https://issues.apache.org/jira/browse/FLINK-29272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29272: - Fix Version/s: (was: table-store-0.2.1) > Document DataStream API (DataStream to Table) for table store > - > > Key: FLINK-29272 > URL: https://issues.apache.org/jira/browse/FLINK-29272 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.3.0 > > > We can have documentation to describe how to convert from DataStream to Table > to write to TableStore. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] xintongsong closed pull request #20857: [hotfix] Fix the problem that BatchShuffleItCase not subject to configuration.
xintongsong closed pull request #20857: [hotfix] Fix the problem that BatchShuffleItCase not subject to configuration. URL: https://github.com/apache/flink/pull/20857 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29406) Expose Finish Method For TableFunction
[ https://issues.apache.org/jira/browse/FLINK-29406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lincoln lee updated FLINK-29406: Affects Version/s: 1.14.5 > Expose Finish Method For TableFunction > -- > > Key: FLINK-29406 > URL: https://issues.apache.org/jira/browse/FLINK-29406 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Affects Versions: 1.14.5, 1.16.0, 1.15.2 >Reporter: lincoln lee >Priority: Major > Fix For: 1.17.0 > > > FLIP-260: Expose Finish Method For TableFunction > https://cwiki.apache.org/confluence/display/FLINK/FLIP-260%3A+Expose+Finish+Method+For+TableFunction -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI
[ https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huang Xingbo resolved FLINK-29315. -- Resolution: Fixed > HDFSTest#testBlobServerRecovery fails on CI > --- > > Key: FLINK-29315 > URL: https://issues.apache.org/jira/browse/FLINK-29315 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / FileSystem, Tests >Affects Versions: 1.16.0, 1.15.2 >Reporter: Chesnay Schepler >Priority: Blocker > Labels: pull-request-available > Fix For: 1.16.0, 1.17.0 > > > The test started failing 2 days ago on different branches. I suspect > something's wrong with the CI infrastructure. > {code:java} > Sep 15 09:11:22 [ERROR] Failures: > Sep 15 09:11:22 [ERROR] HDFSTest.testBlobServerRecovery Multiple Failures > (2 failures) > Sep 15 09:11:22 java.lang.AssertionError: Test failed Error while > running command to get file permissions : java.io.IOException: Cannot run > program "ls": error=1, Operation not permitted > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.runCommand(Shell.java:913) > Sep 15 09:11:22 at org.apache.hadoop.util.Shell.run(Shell.java:869) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1264) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1246) > Sep 15 09:11:22 at > org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:485) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444) > Sep 15 09:11:22 at > org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93) > Sep 15 09:11:22 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Sep 15 09:11:22 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Sep 15 09:11:22 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) > Sep 15 09:11:22 ... 
67 more > Sep 15 09:11:22 > Sep 15 09:11:22 java.lang.NullPointerException: > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI
[ https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609282#comment-17609282 ] Huang Xingbo commented on FLINK-29315: -- After upgrading the kernel version, this test has passed successfully in the last two days of CI runs. Thanks to [~wangyang0918], [~chesnay] and [~mapohl] for their help in solving this tricky problem. > HDFSTest#testBlobServerRecovery fails on CI > --- > > Key: FLINK-29315 > URL: https://issues.apache.org/jira/browse/FLINK-29315 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / FileSystem, Tests >Affects Versions: 1.16.0, 1.15.2 >Reporter: Chesnay Schepler >Priority: Blocker > Labels: pull-request-available > Fix For: 1.16.0, 1.17.0 > > > The test started failing 2 days ago on different branches. I suspect > something's wrong with the CI infrastructure. > {code:java} > Sep 15 09:11:22 [ERROR] Failures: > Sep 15 09:11:22 [ERROR] HDFSTest.testBlobServerRecovery Multiple Failures > (2 failures) > Sep 15 09:11:22 java.lang.AssertionError: Test failed Error while > running command to get file permissions : java.io.IOException: Cannot run > program "ls": error=1, Operation not permitted > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.runCommand(Shell.java:913) > Sep 15 09:11:22 at org.apache.hadoop.util.Shell.run(Shell.java:869) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1264) > Sep 15 09:11:22 at > org.apache.hadoop.util.Shell.execCommand(Shell.java:1246) > Sep 15 09:11:22 at > org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697) > Sep 15 09:11:22 at > org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141) > Sep 15 09:11:22 at > org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:485) > Sep 15 09:11:22 at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444) > Sep 15 09:11:22 at > org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93) > Sep 15 09:11:22 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Sep 15 09:11:22 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Sep 15 09:11:22 at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Sep 15 09:11:22 at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) > Sep 15 09:11:22 ... 67 more > Sep 15 09:11:22 > Sep 15 09:11:22 java.lang.NullPointerException: > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #20898: [FLINK-29349][table-runtime] Use state ttl instead of timer to clean up state in proctime unbounded over aggregate
flinkbot commented on PR #20898: URL: https://github.com/apache/flink/pull/20898#issuecomment-1257419037 ## CI report: * 2ec2c1645096c01744e1bd7c19fe6535063ee687 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] zezaeoh commented on a diff in pull request #377: [FLINK-28979] Add owner reference to flink deployment object
zezaeoh commented on code in PR #377: URL: https://github.com/apache/flink-kubernetes-operator/pull/377#discussion_r979518665 ## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/AbstractFlinkResourceReconciler.java: ## @@ -426,4 +431,17 @@ protected boolean flinkVersionChanged(SPEC oldSpec, SPEC newSpec) {
         }
         return false;
     }
+
+    private void setOwnerReference(CR owner, Configuration deployConfig) {
+        final Map<String, String> ownerReference =
+                Map.of(
+                        "apiVersion", owner.getApiVersion(),
+                        "kind", owner.getKind(),
+                        "name", owner.getMetadata().getName(),
+                        "uid", owner.getMetadata().getUid(),
+                        "blockOwnerDeletion", "false",
+                        "controller", "false");
+        deployConfig.set(
+                KubernetesConfigOptions.JOB_MANAGER_OWNER_REFERENCE, List.of(ownerReference));
+    }
Review Comment: @gyfora I tried it, but the problem is with the apiVersion and kind. Is there a good way to add the apiVersion and kind of CustomResources, if not in the `AbstractFlinkResourceReconciler`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29349) Use state ttl instead of timer to clean up state in proctime unbounded over aggregate
[ https://issues.apache.org/jira/browse/FLINK-29349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-29349: --- Labels: pull-request-available (was: ) > Use state ttl instead of timer to clean up state in proctime unbounded over > aggregate > - > > Key: FLINK-29349 > URL: https://issues.apache.org/jira/browse/FLINK-29349 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime >Affects Versions: 1.16.0, 1.15.2 >Reporter: lincoln lee >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > Currently we rely on timer-based state cleaning in proctime over > aggregate; this can be optimized to use state TTL, which is more efficient -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] lincoln-lil opened a new pull request, #20898: [FLINK-29349][table-runtime] Use state ttl instead of timer to clean up state in proctime unbounded over aggregate
lincoln-lil opened a new pull request, #20898: URL: https://github.com/apache/flink/pull/20898 ## What is the purpose of the change Currently we rely on timer-based state cleaning in proctime unbounded over aggregate; this can be optimized to use state TTL, which is more efficient. ## Brief change log Use TTL instead of a timer in ProcTimeUnboundedPrecedingFunction (see the sketch after this message). ## Verifying this change ProcTimeUnboundedPrecedingFunctionTest ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with @Public(Evolving): (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
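A rough sketch of the pattern, not the PR's actual code: TTL is configured on the state descriptor so the state backend expires entries lazily instead of registering one cleanup timer per key. The retention value below is illustrative.

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.api.common.typeinfo.Types;

public class StateTtlSketch {
    public static void main(String[] args) {
        // Illustrative retention; in the real operator this would come from
        // the configured idle-state retention time.
        StateTtlConfig ttlConfig =
                StateTtlConfig.newBuilder(Time.minutes(10))
                        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                        .build();

        ValueStateDescriptor<Long> accState = new ValueStateDescriptor<>("accState", Types.LONG);
        // The state backend now cleans up expired entries itself; no per-key
        // processing-time timer is required.
        accState.enableTimeToLive(ttlConfig);
        System.out.println(accState);
    }
}
```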
[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #383: [FLINK-29384] Bump snakeyaml version to 1.32
wangyang0918 commented on code in PR #383: URL: https://github.com/apache/flink-kubernetes-operator/pull/383#discussion_r979525831 ## flink-kubernetes-operator/pom.xml: ## @@ -47,6 +47,20 @@ under the License.
             <groupId>io.fabric8</groupId>
             <artifactId>kubernetes-client</artifactId>
             <version>${fabric8.version}</version>
+            <exclusions>
+                <exclusion>
+                    <groupId>org.yaml</groupId>
+                    <artifactId>snakeyaml</artifactId>
+                </exclusion>
+            </exclusions>
Review Comment: Could you please share why we include the dependency explicitly and do not use `dependencyManagement`? ## flink-kubernetes-shaded/pom.xml: ## @@ -88,6 +103,7 @@
                 <exclude>META-INF/DEPENDENCIES</exclude>
                 <exclude>META-INF/LICENSE</exclude>
                 <exclude>META-INF/MANIFEST.MF</exclude>
+                <exclude>org/apache/flink/kubernetes/shaded/org/yaml/snakeyaml/**</exclude>
Review Comment: I am not sure whether it is possible to update the bundled `NOTICE` file to also remove the `org.yaml:snakeyaml:1.27` entry. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-29402) Add USE_DIRECT_READ configuration parameter for RocksDB
[ https://issues.apache.org/jira/browse/FLINK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609277#comment-17609277 ] Yanfei Lei edited comment on FLINK-29402 at 9/26/22 3:05 AM: - This is a very interesting proposal; I think this is not hard to implement in Flink. From the [wiki|https://github.com/facebook/rocksdb/wiki/Direct-IO] there are two options to control DirectIO: {{use_direct_reads}} and {{use_direct_io_for_flush_and_compaction}}, and these two options are supported by the current {{frocksdb-jni (6.20.3)}}. BTW, do you have quantitative benchmark results for DirectIO *ON* vs DirectIO *OFF*? was (Author: yanfei lei): This is a very interesting proposal; I think this is not hard to implement in Flink. From the [wiki|https://github.com/facebook/rocksdb/wiki/Direct-IO] there are two options to control DirectIO: {{use_direct_reads}} and {{use_direct_io_for_flush_and_compaction}}, and these two options are supported by the current {{frocksdb-jni (6.20.3)}}. BTW, do you have quantitative benchmark results for DirectIO *ON* vs DirectIO *OFF*? > Add USE_DIRECT_READ configuration parameter for RocksDB > --- > > Key: FLINK-29402 > URL: https://issues.apache.org/jira/browse/FLINK-29402 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.15.2 >Reporter: Donatien >Priority: Not a Priority > Labels: Enhancement, rocksdb > Fix For: 1.15.2 > > Original Estimate: 1h > Remaining Estimate: 1h > > RocksDB allows the use of DirectIO for read operations to bypass the Linux > Page Cache. To understand the impact of the Linux Page Cache on performance, one > can run a heavy workload on a single-tasked Task Manager with a container > memory limit identical to the TM process memory. Running this same workload > on a TM with no container memory limit will result in better performance, but > with host memory use exceeding the TM requirement. > The Linux Page Cache is of course useful but can give false results when > benchmarking the Managed Memory used by RocksDB. DirectIO is typically > enabled for benchmarks on working-set estimation [Zwaenepoel et > al.|https://arxiv.org/abs/1702.04323]. > I propose to add a configuration key allowing users to enable the use of > DirectIO for reads via the RocksDB API. This configuration would be > disabled by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29402) Add USE_DIRECT_READ configuration parameter for RocksDB
[ https://issues.apache.org/jira/browse/FLINK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609277#comment-17609277 ] Yanfei Lei commented on FLINK-29402: This is a very interesting proposal; I think this is not hard to implement in Flink. From the [wiki|https://github.com/facebook/rocksdb/wiki/Direct-IO] there are two options to control DirectIO: {{use_direct_reads}} and {{use_direct_io_for_flush_and_compaction}}, and these two options are supported by the current {{frocksdb-jni (6.20.3)}}. BTW, do you have quantitative benchmark results for DirectIO *ON* vs DirectIO *OFF*? > Add USE_DIRECT_READ configuration parameter for RocksDB > --- > > Key: FLINK-29402 > URL: https://issues.apache.org/jira/browse/FLINK-29402 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.15.2 >Reporter: Donatien >Priority: Not a Priority > Labels: Enhancement, rocksdb > Fix For: 1.15.2 > > Original Estimate: 1h > Remaining Estimate: 1h > > RocksDB allows the use of DirectIO for read operations to bypass the Linux > Page Cache. To understand the impact of the Linux Page Cache on performance, one > can run a heavy workload on a single-tasked Task Manager with a container > memory limit identical to the TM process memory. Running this same workload > on a TM with no container memory limit will result in better performance, but > with host memory use exceeding the TM requirement. > The Linux Page Cache is of course useful but can give false results when > benchmarking the Managed Memory used by RocksDB. DirectIO is typically > enabled for benchmarks on working-set estimation [Zwaenepoel et > al.|https://arxiv.org/abs/1702.04323]. > I propose to add a configuration key allowing users to enable the use of > DirectIO for reads via the RocksDB API. This configuration would be > disabled by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29408) HiveCatalogITCase failed with NPE
Huang Xingbo created FLINK-29408: Summary: HiveCatalogITCase failed with NPE Key: FLINK-29408 URL: https://issues.apache.org/jira/browse/FLINK-29408 Project: Flink Issue Type: Bug Components: Connectors / Hive Affects Versions: 1.17.0 Reporter: Huang Xingbo {code:java} 2022-09-25T03:41:07.4212129Z Sep 25 03:41:07 [ERROR] org.apache.flink.table.catalog.hive.HiveCatalogUdfITCase.testFlinkUdf Time elapsed: 0.098 s <<< ERROR! 2022-09-25T03:41:07.4212662Z Sep 25 03:41:07 java.lang.NullPointerException 2022-09-25T03:41:07.4213189Z Sep 25 03:41:07at org.apache.flink.table.catalog.hive.HiveCatalogUdfITCase.testFlinkUdf(HiveCatalogUdfITCase.java:109) 2022-09-25T03:41:07.4213753Z Sep 25 03:41:07at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2022-09-25T03:41:07.4224643Z Sep 25 03:41:07at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 2022-09-25T03:41:07.4225311Z Sep 25 03:41:07at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2022-09-25T03:41:07.4225879Z Sep 25 03:41:07at java.lang.reflect.Method.invoke(Method.java:498) 2022-09-25T03:41:07.4226405Z Sep 25 03:41:07at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) 2022-09-25T03:41:07.4227201Z Sep 25 03:41:07at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) 2022-09-25T03:41:07.4227807Z Sep 25 03:41:07at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) 2022-09-25T03:41:07.4228394Z Sep 25 03:41:07at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) 2022-09-25T03:41:07.4228966Z Sep 25 03:41:07at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 2022-09-25T03:41:07.4229514Z Sep 25 03:41:07at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) 2022-09-25T03:41:07.4230066Z Sep 25 03:41:07at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) 2022-09-25T03:41:07.4230587Z Sep 25 03:41:07at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) 2022-09-25T03:41:07.4231258Z Sep 25 03:41:07at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 2022-09-25T03:41:07.4231823Z Sep 25 03:41:07at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) 2022-09-25T03:41:07.4232384Z Sep 25 03:41:07at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 2022-09-25T03:41:07.4232930Z Sep 25 03:41:07at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) 2022-09-25T03:41:07.4233511Z Sep 25 03:41:07at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) 2022-09-25T03:41:07.4234039Z Sep 25 03:41:07at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 2022-09-25T03:41:07.4234546Z Sep 25 03:41:07at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 2022-09-25T03:41:07.4235057Z Sep 25 03:41:07at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 2022-09-25T03:41:07.4235573Z Sep 25 03:41:07at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 2022-09-25T03:41:07.4236087Z Sep 25 03:41:07at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 2022-09-25T03:41:07.4236635Z Sep 25 03:41:07at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 2022-09-25T03:41:07.4237314Z Sep 25 03:41:07at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 2022-09-25T03:41:07.4238211Z Sep 25 
03:41:07at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) 2022-09-25T03:41:07.4238775Z Sep 25 03:41:07at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) 2022-09-25T03:41:07.4239277Z Sep 25 03:41:07at org.junit.rules.RunRules.evaluate(RunRules.java:20) 2022-09-25T03:41:07.4239769Z Sep 25 03:41:07at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 2022-09-25T03:41:07.4240265Z Sep 25 03:41:07at org.junit.runners.ParentRunner.run(ParentRunner.java:413) 2022-09-25T03:41:07.4240731Z Sep 25 03:41:07at org.junit.runner.JUnitCore.run(JUnitCore.java:137) 2022-09-25T03:41:07.4241196Z Sep 25 03:41:07at org.junit.runner.JUnitCore.run(JUnitCore.java:115) 2022-09-25T03:41:07.4241715Z Sep 25 03:41:07at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) 2022-09-25T03:41:07.4242316Z Sep 25 03:41:07at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) 2022-09-25T03:41:07.4242904Z Sep 25 03:41:07at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) 2022-09-25T03:41:07.4243528Z S
[jira] [Commented] (FLINK-29387) IntervalJoinITCase.testIntervalJoinSideOutputRightLateData failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-29387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609276#comment-17609276 ] Huang Xingbo commented on FLINK-29387: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41316&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=8267 > IntervalJoinITCase.testIntervalJoinSideOutputRightLateData failed with > AssertionError > - > > Key: FLINK-29387 > URL: https://issues.apache.org/jira/browse/FLINK-29387 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.17.0 >Reporter: Huang Xingbo >Priority: Critical > Labels: test-stability > > {code:java} > 2022-09-22T04:40:21.9296331Z Sep 22 04:40:21 [ERROR] > org.apache.flink.test.streaming.runtime.IntervalJoinITCase.testIntervalJoinSideOutputRightLateData > Time elapsed: 2.46 s <<< FAILURE! > 2022-09-22T04:40:21.9297487Z Sep 22 04:40:21 java.lang.AssertionError: > expected:<[(key,2)]> but was:<[]> > 2022-09-22T04:40:21.9298208Z Sep 22 04:40:21 at > org.junit.Assert.fail(Assert.java:89) > 2022-09-22T04:40:21.9298927Z Sep 22 04:40:21 at > org.junit.Assert.failNotEquals(Assert.java:835) > 2022-09-22T04:40:21.9299655Z Sep 22 04:40:21 at > org.junit.Assert.assertEquals(Assert.java:120) > 2022-09-22T04:40:21.9300403Z Sep 22 04:40:21 at > org.junit.Assert.assertEquals(Assert.java:146) > 2022-09-22T04:40:21.9301538Z Sep 22 04:40:21 at > org.apache.flink.test.streaming.runtime.IntervalJoinITCase.expectInAnyOrder(IntervalJoinITCase.java:521) > 2022-09-22T04:40:21.9302578Z Sep 22 04:40:21 at > org.apache.flink.test.streaming.runtime.IntervalJoinITCase.testIntervalJoinSideOutputRightLateData(IntervalJoinITCase.java:280) > 2022-09-22T04:40:21.9303641Z Sep 22 04:40:21 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2022-09-22T04:40:21.9304472Z Sep 22 04:40:21 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2022-09-22T04:40:21.9305371Z Sep 22 04:40:21 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-09-22T04:40:21.9306195Z Sep 22 04:40:21 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-09-22T04:40:21.9307011Z Sep 22 04:40:21 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 2022-09-22T04:40:21.9308077Z Sep 22 04:40:21 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2022-09-22T04:40:21.9308968Z Sep 22 04:40:21 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 2022-09-22T04:40:21.9309849Z Sep 22 04:40:21 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2022-09-22T04:40:21.9310704Z Sep 22 04:40:21 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2022-09-22T04:40:21.9311533Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-22T04:40:21.9312386Z Sep 22 04:40:21 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 2022-09-22T04:40:21.9313231Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 2022-09-22T04:40:21.9314985Z Sep 22 04:40:21 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 2022-09-22T04:40:21.9315857Z Sep 22 04:40:21 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 
2022-09-22T04:40:21.9316633Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 2022-09-22T04:40:21.9317450Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 2022-09-22T04:40:21.9318209Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 2022-09-22T04:40:21.9318949Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 2022-09-22T04:40:21.9319680Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 2022-09-22T04:40:21.9320401Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-09-22T04:40:21.9321130Z Sep 22 04:40:21 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > 2022-09-22T04:40:21.9321822Z Sep 22 04:40:21 at > org.junit.runner.JUnitCore.run(JUnitCore.java:137) > 2022-09-22T04:40:21.9322498Z Sep 22 04:40:21 at > org.junit.runner.JUnitCore.run(JUnitCore.java:115) > 2022-09-22T04:40:21.9323248Z Sep 22 04:40:21 at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > 2022
[GitHub] [flink] xintongsong commented on pull request #20857: [hotfix] Fix the problem that BatchShuffleItCase not subject to configuration.
xintongsong commented on PR #20857: URL: https://github.com/apache/flink/pull/20857#issuecomment-1257406264 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-28898) ChangelogRecoverySwitchStateBackendITCase.testSwitchFromEnablingToDisablingWithRescalingOut failed
[ https://issues.apache.org/jira/browse/FLINK-28898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609275#comment-17609275 ] Huang Xingbo commented on FLINK-28898: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41318&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798 > ChangelogRecoverySwitchStateBackendITCase.testSwitchFromEnablingToDisablingWithRescalingOut > failed > -- > > Key: FLINK-28898 > URL: https://issues.apache.org/jira/browse/FLINK-28898 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing, Runtime / State Backends >Affects Versions: 1.16.0 >Reporter: Xingbo Huang >Assignee: Hangxiang Yu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-10T02:48:19.5711924Z Aug 10 02:48:19 [ERROR] > ChangelogRecoverySwitchStateBackendITCase.testSwitchFromEnablingToDisablingWithRescalingOut > Time elapsed: 6.064 s <<< ERROR! > 2022-08-10T02:48:19.5712815Z Aug 10 02:48:19 > org.apache.flink.runtime.JobException: Recovery is suppressed by > FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=0, > backoffTimeMS=0) > 2022-08-10T02:48:19.5714530Z Aug 10 02:48:19 at > org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:139) > 2022-08-10T02:48:19.5716211Z Aug 10 02:48:19 at > org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:83) > 2022-08-10T02:48:19.5717627Z Aug 10 02:48:19 at > org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:256) > 2022-08-10T02:48:19.5718885Z Aug 10 02:48:19 at > org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:247) > 2022-08-10T02:48:19.5720430Z Aug 10 02:48:19 at > org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:240) > 2022-08-10T02:48:19.5721733Z Aug 10 02:48:19 at > org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:738) > 2022-08-10T02:48:19.5722680Z Aug 10 02:48:19 at > org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:715) > 2022-08-10T02:48:19.5723612Z Aug 10 02:48:19 at > org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:78) > 2022-08-10T02:48:19.5724389Z Aug 10 02:48:19 at > org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:477) > 2022-08-10T02:48:19.5725046Z Aug 10 02:48:19 at > sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > 2022-08-10T02:48:19.5725708Z Aug 10 02:48:19 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-08-10T02:48:19.5726374Z Aug 10 02:48:19 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-08-10T02:48:19.5727065Z Aug 10 02:48:19 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309) > 2022-08-10T02:48:19.5727932Z Aug 10 02:48:19 at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83) > 2022-08-10T02:48:19.5729087Z Aug 10 02:48:19 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307) > 2022-08-10T02:48:19.5730134Z Aug 10 02:48:19 at > 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222) > 2022-08-10T02:48:19.5731536Z Aug 10 02:48:19 at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84) > 2022-08-10T02:48:19.5732549Z Aug 10 02:48:19 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168) > 2022-08-10T02:48:19.5735018Z Aug 10 02:48:19 at > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) > 2022-08-10T02:48:19.5735821Z Aug 10 02:48:19 at > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) > 2022-08-10T02:48:19.5736465Z Aug 10 02:48:19 at > scala.PartialFunction.applyOrElse(PartialFunction.scala:123) > 2022-08-10T02:48:19.5737234Z Aug 10 02:48:19 at > scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) > 2022-08-10T02:48:19.5737895Z Aug 10 02:48:19 at > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) > 2022-08-10T02:48:19.5738574Z Aug 10 02:48:19 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > 2022-08-10T02:48:19.5739276Z Aug 10 02:48:19 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) > 2
[jira] [Created] (FLINK-29407) HiveCatalogITCase.testCsvTableViaSQL failed with FileNotFoundException
Huang Xingbo created FLINK-29407: Summary: HiveCatalogITCase.testCsvTableViaSQL failed with FileNotFoundException Key: FLINK-29407 URL: https://issues.apache.org/jira/browse/FLINK-29407 Project: Flink Issue Type: Bug Components: Connectors / Hive Affects Versions: 1.16.0 Reporter: Huang Xingbo
{code:java}
Sep 25 04:37:47 [ERROR] Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 65.785 s <<< FAILURE! - in org.apache.flink.table.catalog.hive.HiveCatalogITCase
Sep 25 04:37:47 [ERROR] org.apache.flink.table.catalog.hive.HiveCatalogITCase.testCsvTableViaSQL  Time elapsed: 5.584 s <<< ERROR!
Sep 25 04:37:47 java.lang.RuntimeException: Failed to fetch next result
Sep 25 04:37:47 	at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109)
Sep 25 04:37:47 	at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
Sep 25 04:37:47 	at org.apache.flink.table.planner.connectors.CollectDynamicSink$CloseableRowIteratorWrapper.hasNext(CollectDynamicSink.java:222)
Sep 25 04:37:47 	at java.util.Iterator.forEachRemaining(Iterator.java:115)
Sep 25 04:37:47 	at org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:115)
Sep 25 04:37:47 	at org.apache.flink.table.catalog.hive.HiveCatalogITCase.testCsvTableViaSQL(HiveCatalogITCase.java:140)
Sep 25 04:37:47 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Sep 25 04:37:47 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Sep 25 04:37:47 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Sep 25 04:37:47 	at java.lang.reflect.Method.invoke(Method.java:498)
Sep 25 04:37:47 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
Sep 25 04:37:47 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Sep 25 04:37:47 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
Sep 25 04:37:47 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Sep 25 04:37:47 	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
Sep 25 04:37:47 	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
Sep 25 04:37:47 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
Sep 25 04:37:47 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
Sep 25 04:37:47 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
Sep 25 04:37:47 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
Sep 25 04:37:47 	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
Sep 25 04:37:47 	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
Sep 25 04:37:47 	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
Sep 25 04:37:47 	at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:
[GitHub] [flink-kubernetes-operator] zezaeoh commented on a diff in pull request #377: [FLINK-28979] Add owner reference to flink deployment object
zezaeoh commented on code in PR #377: URL: https://github.com/apache/flink-kubernetes-operator/pull/377#discussion_r979518665

## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/AbstractFlinkResourceReconciler.java:

```
@@ -426,4 +431,17 @@ protected boolean flinkVersionChanged(SPEC oldSpec, SPEC newSpec) {
         }
         return false;
     }
+
+    private void setOwnerReference(CR owner, Configuration deployConfig) {
+        final Map<String, String> ownerReference =
+                Map.of(
+                        "apiVersion", owner.getApiVersion(),
+                        "kind", owner.getKind(),
+                        "name", owner.getMetadata().getName(),
+                        "uid", owner.getMetadata().getUid(),
+                        "blockOwnerDeletion", "false",
+                        "controller", "false");
+        deployConfig.set(
+                KubernetesConfigOptions.JOB_MANAGER_OWNER_REFERENCE, List.of(ownerReference));
+    }
```

Review Comment: @gyfora I tried it, but the problem is with the apiVersion and kind. Is there a good way to get the apiVersion and kind of a CustomResource, if not in the `AbstractFlinkResourceReconciler`?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
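One way to answer the question above, derived from fabric8's `HasMetadata` contract rather than from this PR, is to fall back to the class-level annotation metadata when the fields are not populated on the instance. The helper below is a sketch under that assumption, not the operator's actual code:

```java
import java.util.Map;

import io.fabric8.kubernetes.api.model.HasMetadata;

public final class OwnerReferenceSketch {

    private OwnerReferenceSketch() {}

    // Sketch: fabric8's static HasMetadata helpers read the @Group/@Version
    // annotations on the custom resource class, so apiVersion/kind can be
    // derived even when the instance fields are null.
    public static Map<String, String> ownerReferenceOf(HasMetadata owner) {
        String apiVersion =
                owner.getApiVersion() != null
                        ? owner.getApiVersion()
                        : HasMetadata.getApiVersion(owner.getClass());
        String kind =
                owner.getKind() != null ? owner.getKind() : HasMetadata.getKind(owner.getClass());
        return Map.of(
                "apiVersion", apiVersion,
                "kind", kind,
                "name", owner.getMetadata().getName(),
                "uid", owner.getMetadata().getUid(),
                "blockOwnerDeletion", "false",
                "controller", "false");
    }
}
```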
[jira] [Commented] (FLINK-29405) InputFormatCacheLoaderTest is unstable
[ https://issues.apache.org/jira/browse/FLINK-29405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609272#comment-17609272 ] Huang Xingbo commented on FLINK-29405: -- [~smiralex][~renqs] Could you help take a look? Thx. > InputFormatCacheLoaderTest is unstable > -- > > Key: FLINK-29405 > URL: https://issues.apache.org/jira/browse/FLINK-29405 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.16.0, 1.17.0 >Reporter: Chesnay Schepler >Priority: Critical > > #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably > when run in a loop. > {code} > java.lang.AssertionError: > Expecting AtomicInteger(0) to have value: > 0 > but did not. > at > org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29405) InputFormatCacheLoaderTest is unstable
[ https://issues.apache.org/jira/browse/FLINK-29405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huang Xingbo updated FLINK-29405: - Priority: Critical (was: Major) > InputFormatCacheLoaderTest is unstable > -- > > Key: FLINK-29405 > URL: https://issues.apache.org/jira/browse/FLINK-29405 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.16.0, 1.17.0 >Reporter: Chesnay Schepler >Priority: Critical > > #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably > when run in a loop. > {code} > java.lang.AssertionError: > Expecting AtomicInteger(0) to have value: > 0 > but did not. > at > org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29405) InputFormatCacheLoaderTest is unstable
[ https://issues.apache.org/jira/browse/FLINK-29405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huang Xingbo updated FLINK-29405: - Affects Version/s: 1.16.0 > InputFormatCacheLoaderTest is unstable > -- > > Key: FLINK-29405 > URL: https://issues.apache.org/jira/browse/FLINK-29405 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.16.0, 1.17.0 >Reporter: Chesnay Schepler >Priority: Major > > #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably > when run in a loop. > {code} > java.lang.AssertionError: > Expecting AtomicInteger(0) to have value: > 0 > but did not. > at > org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29405) InputFormatCacheLoaderTest is unstable
[ https://issues.apache.org/jira/browse/FLINK-29405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609271#comment-17609271 ] Huang Xingbo commented on FLINK-29405: -- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41328&view=logs&j=0c940707-2659-5648-cbe6-a1ad63045f0a&t=075c2716-8010-5565-fe08-3c4bb45824a4 1.16 instance > InputFormatCacheLoaderTest is unstable > -- > > Key: FLINK-29405 > URL: https://issues.apache.org/jira/browse/FLINK-29405 > Project: Flink > Issue Type: Technical Debt > Components: Table SQL / Runtime >Affects Versions: 1.17.0 >Reporter: Chesnay Schepler >Priority: Major > > #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably > when run in a loop. > {code} > java.lang.AssertionError: > Expecting AtomicInteger(0) to have value: > 0 > but did not. > at > org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-29373) DataStream to table not support BigDecimalTypeInfo
[ https://issues.apache.org/jira/browse/FLINK-29373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee reassigned FLINK-29373: Assignee: hk__lrzy
> DataStream to table not support BigDecimalTypeInfo
> --
>
> Key: FLINK-29373
> URL: https://issues.apache.org/jira/browse/FLINK-29373
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.17.0
> Reporter: hk__lrzy
> Assignee: hk__lrzy
> Priority: Major
> Attachments: image-2022-09-21-15-12-11-082.png, image-2022-09-22-18-08-44-385.png
>
> When we try to transform a DataStream to a table, *TypeInfoDataTypeConverter* will try to convert *TypeInformation* to {*}DataType{*}, but if the DataStream's produced type contains {*}BigDecimalTypeInfo{*}, *TypeInfoDataTypeConverter* will finally convert it to a {*}RawDataType{*}. Then, when we want to transform the table back to a DataStream, an exception happens that reports the data types do not match. The Blink planner also has this exception.
> !image-2022-09-22-18-08-44-385.png!
>
> {code:java}
> Query schema: [f0: RAW('java.math.BigDecimal', '...')]
> Sink schema: [f0: RAW('java.math.BigDecimal', ?)]{code}
> How to reproduce:
> {code:java}
> // code placeholder
> StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
> EnvironmentSettings.Builder envBuilder = EnvironmentSettings.newInstance().inStreamingMode();
> StreamTableEnvironment streamTableEnvironment = StreamTableEnvironment.create(executionEnvironment, envBuilder.build());
> FromElementsFunction<BigDecimal> fromElementsFunction = new FromElementsFunction<>(new BigDecimal(1.11D));
> DataStreamSource<BigDecimal> dataStreamSource = executionEnvironment.addSource(fromElementsFunction, new BigDecimalTypeInfo(10, 8));
> streamTableEnvironment.createTemporaryView("tmp", dataStreamSource);
> Table table = streamTableEnvironment.sqlQuery("select * from tmp");
> streamTableEnvironment.toRetractStream(table, table.getSchema().toRowType());{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29373) DataStream to table not support BigDecimalTypeInfo
[ https://issues.apache.org/jira/browse/FLINK-29373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609270#comment-17609270 ] Jingsong Lee commented on FLINK-29373: -- [~hk__lrzy] Thanks, assigned to you. CC [~jark] to review~
> DataStream to table not support BigDecimalTypeInfo
> --
>
> Key: FLINK-29373
> URL: https://issues.apache.org/jira/browse/FLINK-29373
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.17.0
> Reporter: hk__lrzy
> Assignee: hk__lrzy
> Priority: Major
> Attachments: image-2022-09-21-15-12-11-082.png, image-2022-09-22-18-08-44-385.png
>
> When we try to transform a DataStream to a table, *TypeInfoDataTypeConverter* will try to convert *TypeInformation* to {*}DataType{*}, but if the DataStream's produced type contains {*}BigDecimalTypeInfo{*}, *TypeInfoDataTypeConverter* will finally convert it to a {*}RawDataType{*}. Then, when we want to transform the table back to a DataStream, an exception happens that reports the data types do not match. The Blink planner also has this exception.
> !image-2022-09-22-18-08-44-385.png!
>
> {code:java}
> Query schema: [f0: RAW('java.math.BigDecimal', '...')]
> Sink schema: [f0: RAW('java.math.BigDecimal', ?)]{code}
> How to reproduce:
> {code:java}
> // code placeholder
> StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
> EnvironmentSettings.Builder envBuilder = EnvironmentSettings.newInstance().inStreamingMode();
> StreamTableEnvironment streamTableEnvironment = StreamTableEnvironment.create(executionEnvironment, envBuilder.build());
> FromElementsFunction<BigDecimal> fromElementsFunction = new FromElementsFunction<>(new BigDecimal(1.11D));
> DataStreamSource<BigDecimal> dataStreamSource = executionEnvironment.addSource(fromElementsFunction, new BigDecimalTypeInfo(10, 8));
> streamTableEnvironment.createTemporaryView("tmp", dataStreamSource);
> Table table = streamTableEnvironment.sqlQuery("select * from tmp");
> streamTableEnvironment.toRetractStream(table, table.getSchema().toRowType());{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
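Until the converter is fixed, one workaround (a sketch that sidesteps {{BigDecimalTypeInfo}} at the boundary rather than fixing {{TypeInfoDataTypeConverter}} itself) is to hand Flink the basic {{Types.BIG_DEC}} information and narrow the precision with a SQL CAST:
{code:java}
import java.math.BigDecimal;

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class DecimalRoundTripWorkaround {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Use the basic BigDecimal type info so the table boundary sees a
        // DECIMAL type instead of a RAW type.
        DataStream<BigDecimal> source =
                env.fromElements(new BigDecimal("1.11")).returns(Types.BIG_DEC);

        tEnv.createTemporaryView("tmp", source);

        // Narrow to the desired precision/scale in SQL instead of relying on
        // BigDecimalTypeInfo(10, 8) at the DataStream boundary.
        Table table = tEnv.sqlQuery("SELECT CAST(f0 AS DECIMAL(10, 8)) AS f0 FROM tmp");
        tEnv.toRetractStream(table, Row.class).print();

        env.execute();
    }
}
{code}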
[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #298: [FLINK-29291] Change DataFileWriter into a factory to create writers
JingsongLi commented on code in PR #298: URL: https://github.com/apache/flink-table-store/pull/298#discussion_r979506509

## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/MergeTreeWriter.java:

```
@@ -138,34 +137,45 @@ public void flushMemory() throws Exception {
         // stop writing, wait for compaction finished
         trySyncLatestCompaction(true);
     }
+
+    // write changelog file
     List<String> extraFiles = new ArrayList<>();
     if (changelogProducer == ChangelogProducer.INPUT) {
-        extraFiles.add(
-                dataFileWriter
-                        .writeLevel0Changelog(
-                                CloseableIterator.adapterForIterator(memTable.rawIterator()))
-                        .getName());
+        SingleFileWriter writer = writerFactory.createExtraFileWriter();
+        writer.write(memTable.rawIterator());
+        writer.close();
+        extraFiles.add(writer.path().getName());
     }
-    boolean success = false;
+
+    // write lsm level 0 file
     try {
         Iterator iterator = memTable.mergeIterator(keyComparator, mergeFunction);
-        success =
-                dataFileWriter
-                        .writeLevel0(CloseableIterator.adapterForIterator(iterator))
-                        .map(
-                                file -> {
-                                    DataFileMeta fileMeta = file.copy(extraFiles);
-                                    newFiles.add(fileMeta);
-                                    compactManager.addNewFile(fileMeta);
-                                    return true;
-                                })
-                        .orElse(false);
-    } finally {
-        if (!success) {
-            extraFiles.forEach(dataFileWriter::delete);
+        KeyValueDataFileWriter writer = writerFactory.createLevel0Writer();
+        writer.write(iterator);
+        writer.close();
+
+        // In theory, this fileMeta should contain statistics from both the lsm file and the
+        // extra file. However, for level 0 files, as we do not drop DELETE records, keys
+        // appearing in one file will also appear in the other, so we just need to use
+        // statistics from one of them.
+        //
+        // For the value count merge function, it is possible that the changelog first adds
+        // one record and then removes it; after merging, this record will not appear in the
+        // lsm file. This is OK because we can also skip this changelog.
+        DataFileMeta fileMeta = writer.result();
+        if (fileMeta != null) {
```

Review Comment: If `fileMeta == null`, shouldn't `extraFiles` be deleted?

## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/io/KeyValueFileWriterFactory.java:

```
@@ -149,16 +96,35 @@ private KeyValueDataFileWriter createDataFileWriter(int level) {
             level);
     }

-    public void delete(DataFileMeta file) {
-        delete(file.fileName());
+    public SingleFileWriter createExtraFileWriter() {
```

Review Comment: `createChangelogFileWriter`? We may have more kinds of extra files in the future.

## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/io/KeyValueFileWriterFactory.java:

```
@@ -52,20 +44,19 @@ public class DataFileWriter {
     private final DataFilePathFactory pathFactory;
     private final long suggestedFileSize;

-    private DataFileWriter(
+    private KeyValueFileWriterFactory(
             long schemaId,
             RowType keyType,
             RowType valueType,
             BulkWriter.Factory writerFactory,
-            @Nullable FileStatsExtractor fileStatsExtractor,
+            FileStatsExtractor fileStatsExtractor,
```

Review Comment: Why was `@Nullable` removed?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
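To make the first question concrete, the missing cleanup would look roughly like the following (a sketch against the names visible in the diff; `deleteFile` is a stand-in for whatever delete helper the factory ends up exposing):

```java
// Sketch: ensure changelog ("extra") files are removed when the level-0
// write fails or yields no file; otherwise they would be orphaned.
boolean success = false;
try {
    KeyValueDataFileWriter writer = writerFactory.createLevel0Writer();
    writer.write(iterator);
    writer.close();

    DataFileMeta fileMeta = writer.result();
    if (fileMeta != null) {
        DataFileMeta withExtras = fileMeta.copy(extraFiles);
        newFiles.add(withExtras);
        compactManager.addNewFile(withExtras);
        success = true;
    }
} finally {
    if (!success) {
        extraFiles.forEach(writerFactory::deleteFile); // hypothetical delete helper
    }
}
```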
[jira] [Updated] (FLINK-29406) Expose Finish Method For TableFunction
[ https://issues.apache.org/jira/browse/FLINK-29406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lincoln lee updated FLINK-29406: Component/s: Table SQL / API > Expose Finish Method For TableFunction > -- > > Key: FLINK-29406 > URL: https://issues.apache.org/jira/browse/FLINK-29406 > Project: Flink > Issue Type: Improvement > Components: Table SQL / API >Affects Versions: 1.16.0, 1.15.2 >Reporter: lincoln lee >Priority: Major > Fix For: 1.17.0 > > > FLIP-260: Expose Finish Method For TableFunction > https://cwiki.apache.org/confluence/display/FLINK/FLIP-260%3A+Expose+Finish+Method+For+TableFunction -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] luoyuxia commented on pull request #20883: [WIP] validate for https://github.com/apache/flink/pull/20882
luoyuxia commented on PR #20883: URL: https://github.com/apache/flink/pull/20883#issuecomment-1257357271 The failure should be related to FLINK-29315 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] tweise commented on a diff in pull request #385: [FLINK-29109] Avoid checkpoint path conflicts (Flink < 1.16)
tweise commented on code in PR #385: URL: https://github.com/apache/flink-kubernetes-operator/pull/385#discussion_r979497847 ## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/crd/status/FlinkDeploymentStatus.java: ## @@ -53,4 +53,7 @@ public class FlinkDeploymentStatus extends CommonStatus { /** Information about the TaskManagers for the scale subresource. */ private TaskManagerInfo taskManager; + +/** The jobId, if it was set by the operator. */ +private String overrideJobId; Review Comment: Reusing jobStatus.jobId is not perfect as we are using the same field for two different purposes. But it works pretty well and is better than adding another field to status for something that is going to be phased out when >=1.16 becomes the minimum version. PTAL -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-29406) Expose Finish Method For TableFunction
lincoln lee created FLINK-29406: --- Summary: Expose Finish Method For TableFunction Key: FLINK-29406 URL: https://issues.apache.org/jira/browse/FLINK-29406 Project: Flink Issue Type: Improvement Affects Versions: 1.15.2, 1.16.0 Reporter: lincoln lee Fix For: 1.17.0 FLIP-260: Expose Finish Method For TableFunction https://cwiki.apache.org/confluence/display/FLINK/FLIP-260%3A+Expose+Finish+Method+For+TableFunction -- This message was sent by Atlassian Jira (v8.20.10#820010)
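To illustrate the motivation, the sketch below assumes the finish() hook proposed by the FLIP; TableFunction does not expose it in current releases:
{code:java}
import org.apache.flink.table.functions.FunctionContext;
import org.apache.flink.table.functions.TableFunction;

// Sketch only: finish() is the *proposed* FLIP-260 lifecycle hook. Without
// it, a buffering table function has no chance to flush when input ends.
public class BufferingTableFunction extends TableFunction<String> {

    private transient StringBuilder buffer;

    @Override
    public void open(FunctionContext context) throws Exception {
        buffer = new StringBuilder();
    }

    public void eval(String value) {
        buffer.append(value); // buffer instead of collecting eagerly
    }

    public void finish() throws Exception {
        // Emit what is left when the input ends; today such buffered rows
        // would be silently dropped.
        collect(buffer.toString());
    }
}
{code}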
[jira] [Resolved] (FLINK-29389) Update documentation of JDBC and HBase lookup table for new caching options
[ https://issues.apache.org/jira/browse/FLINK-29389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qingsheng Ren resolved FLINK-29389. --- Assignee: Qingsheng Ren Resolution: Fixed > Update documentation of JDBC and HBase lookup table for new caching options > --- > > Key: FLINK-29389 > URL: https://issues.apache.org/jira/browse/FLINK-29389 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase, Connectors / JDBC, Documentation >Affects Versions: 1.16.0 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Update documentation of JDBC and HBase lookup table for new caching options -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29389) Update documentation of JDBC and HBase lookup table for new caching options
[ https://issues.apache.org/jira/browse/FLINK-29389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609265#comment-17609265 ] Qingsheng Ren edited comment on FLINK-29389 at 9/26/22 1:25 AM: 1.16: 9699ca7ac188d24a3a8d33fc6749b08c10ca85c7 master: 3fa7d03ddad576e99a05ff558e2ee536872a34d2 was (Author: renqs): 1.16: 9699ca7ac188d24a3a8d33fc6749b08c10ca85c7 > Update documentation of JDBC and HBase lookup table for new caching options > --- > > Key: FLINK-29389 > URL: https://issues.apache.org/jira/browse/FLINK-29389 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase, Connectors / JDBC, Documentation >Affects Versions: 1.16.0 >Reporter: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Update documentation of JDBC and HBase lookup table for new caching options -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29389) Update documentation of JDBC and HBase lookup table for new caching options
[ https://issues.apache.org/jira/browse/FLINK-29389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609265#comment-17609265 ] Qingsheng Ren commented on FLINK-29389: --- 1.16: 9699ca7ac188d24a3a8d33fc6749b08c10ca85c7 > Update documentation of JDBC and HBase lookup table for new caching options > --- > > Key: FLINK-29389 > URL: https://issues.apache.org/jira/browse/FLINK-29389 > Project: Flink > Issue Type: Sub-task > Components: Connectors / HBase, Connectors / JDBC, Documentation >Affects Versions: 1.16.0 >Reporter: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Update documentation of JDBC and HBase lookup table for new caching options -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] PatrickRen merged pull request #20892: [BP-1.16][FLINK-29389][docs] Update documentation of JDBC and HBase lookup table for new caching options
PatrickRen merged PR #20892: URL: https://github.com/apache/flink/pull/20892 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] PatrickRen commented on pull request #20892: [BP-1.16][FLINK-29389][docs] Update documentation of JDBC and HBase lookup table for new caching options
PatrickRen commented on PR #20892: URL: https://github.com/apache/flink/pull/20892#issuecomment-1257352406 Merging as this is a doc-only PR. Doc validation has passed on CI. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] snuyanzin commented on a diff in pull request #20850: [FLINK-20873][Table SQl/API] Update to calcite 1.27
snuyanzin commented on code in PR #20850: URL: https://github.com/apache/flink/pull/20850#discussion_r979468073 ## flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/InputFormatCodeGenerator.scala: ## @@ -80,6 +80,7 @@ object InputFormatCodeGenerator { @Override public Object nextRecord(Object reuse) { + | ${ctx.reuseLocalVariableCode()} Review Comment: Probably as a result of https://issues.apache.org/jira/browse/CALCITE-4383 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29365) Millisecond behind latest jumps after Flink 1.15.2 upgrade
[ https://issues.apache.org/jira/browse/FLINK-29365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609254#comment-17609254 ] Hong Liang Teoh commented on FLINK-29365: - Hi [~wilsonwu], Thanks for reporting this. Indeed, we don't expect millisBehindLatest to jump after a version upgrade. I've tried replicating this issue by upgrading my Kinesis consumer application from 1.14.4 to 1.15.2, but was unable to reproduce it. These are the test scenarios I tried:
* tested both Polling and EFO
* tested consuming streams without resharding as well as streams with multiple reshardings (a mix of shards with records and without records)
* tested scaling the job up from parallelism 2 to parallelism 5 at the same time as the upgrade
* started the consumer from both TRIM_HORIZON and AT_TIMESTAMP (this shouldn't matter when restoring from a snapshot anyway)

In all these cases, there was no spike in client-side millisBehindLatest / server-side IteratorAgeMilliseconds. However, in your case, it seems you were able to reliably replicate the spike in MillisBehindLatest. It is likely that there is some difference in our setup / Kinesis stream that causes this issue to happen in your case. To help debug further, would you be able to provide the following, to help root cause the issue?
* Logs from the taskmanager during the 1.15.2 -> 1.15.2 redeploy (no spike) and the 1.14.4 -> 1.15.2 upgrade (with spike). We can compare and contrast the restored state for each shard.
* Kinesis streams have [Enhanced metrics|https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-cloudwatch.html#kinesis-metrics-shard]. If you enable them, we can see IteratorAgeMilliseconds for each shard, and thus which shardId is seeing the spike in MillisBehindLatest.
* If possible, it would be nice to have the result of the describe-stream call (i.e. {*}aws kinesis describe-stream --stream-name <stream-name>{*}), as this will help us determine the shard parent-child relations.
> Millisecond behind latest jumps after Flink 1.15.2 upgrade
> --
>
> Key: FLINK-29365
> URL: https://issues.apache.org/jira/browse/FLINK-29365
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kinesis
> Affects Versions: 1.15.2
> Environment: Redeployment from 1.14.4 to 1.15.2
> Reporter: Wilson Wu
> Priority: Major
> Attachments: Screen Shot 2022-09-19 at 2.50.56 PM.png
>
> (First time filing a ticket in the Flink community, please let me know if there are any guidelines I need to follow.)
> I noticed a very strange behavior with a recent version bump from Flink 1.14.4 to 1.15.2. My project consumes around 30K records per second from a sharded Kinesis stream, and during a version upgrade it follows the best practice of first triggering a savepoint from the running job, starting the new job from the savepoint, and then removing the old job. So far so good, and the above logic has been tested multiple times without any issue on 1.14.4. Usually, after a version upgrade, our job will have a few minutes of delay in millisecond behind latest, but it catches up quickly (within 30 mins). Our savepoint is around one hundred MB, and our job DAG will become 90-100% busy with some backpressure when we redeploy, but after 10-20 minutes it goes back to normal.
> Then the strange thing happened: when I tried to redeploy with the 1.15.2 upgrade from a running 1.14.4 job, I could see that a savepoint had been created and the new job was running, and all the metrics looked fine, except that [millisecond behind latest|https://flink.apache.org/news/2019/02/25/monitoring-best-practices.html] suddenly jumped to 10 hours!! It took days for my application to catch up with the latest records in the Kinesis stream. I don't understand why it jumps from 0 seconds to 10+ hours when we restart the new job. The only main change I introduced with the version bump was changing [failOnError|https://nightlies.apache.org/flink/flink-docs-release-1.15/api/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSink.html] from true to false, but I don't think this is the root cause.
> I tried to redeploy the new 1.15.2 job by changing our parallelism; redeploying a job from 1.15.2 does not introduce a big delay, so I assume the issue above only happens when we bump the version from 1.14.4 to 1.15.2 (note the attached screenshot)? I did try the bump twice and saw the same 10hrs+ jump in delay, and we do not have any timezone-related changes. Please let me know if this can be filed as a bug, as I do not have a running project with all the Kinesis setup available that can reproduce the issue.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29131) Kubernetes operator webhook can use hostPort
[ https://issues.apache.org/jira/browse/FLINK-29131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609248#comment-17609248 ] Dylan Meissner edited comment on FLINK-29131 at 9/25/22 8:54 PM: - The Helm chart changes dramatically to do this work. I'm not even clear if it could cleanly upgrade. Testing many upgrade scenarios is intimidating. When I first brought up this idea in Slack we agreed _if you have time opening a JIRA ticket and a minimal PR to address it, we would be happy to review and merge this_ Do we still envision a small change? was (Author: dylanmei): The Helm chart changes dramatically to do this work. I'm not even clear if it could cleanly upgrade. Testing may upgrade scenarios is intimidating. When I first brought up this idea in Slack we agreed _if you have time opening a JIRA ticket and a minimal PR to address it, we would be happy to review and merge this_ Do we still envision a small change? > Kubernetes operator webhook can use hostPort > > > Key: FLINK-29131 > URL: https://issues.apache.org/jira/browse/FLINK-29131 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.1.0 >Reporter: Dylan Meissner >Assignee: Dylan Meissner >Priority: Minor > > When running Flink operator on EKS cluster with Calico networking the > control-plane (managed by AWS) cannot reach the webhook. Requests to create > Flink resources fail with {_}Address is not allowed{_}. > When the webhook listens on hostPort the requests to create Flink resources > are successful. However, a pod security policy is generally required to allow > webhook to listen on such ports. > To support this scenario with the Helm chart make changes so that we can > * Specify a hostPort value for the webhook > * Name the port that the webhook listens on > * Use the named port in the webhook service > * Add a "use" pod security policy verb to cluster role -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29131) Kubernetes operator webhook can use hostPort
[ https://issues.apache.org/jira/browse/FLINK-29131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609248#comment-17609248 ] Dylan Meissner commented on FLINK-29131: The Helm chart changes dramatically to do this work. I'm not even clear if it could cleanly upgrade. Testing many upgrade scenarios is intimidating. When I first brought up this idea in Slack we agreed _if you have time opening a JIRA ticket and a minimal PR to address it, we would be happy to review and merge this_ Do we still envision a small change?
> Kubernetes operator webhook can use hostPort
> 
>
> Key: FLINK-29131
> URL: https://issues.apache.org/jira/browse/FLINK-29131
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.1.0
> Reporter: Dylan Meissner
> Assignee: Dylan Meissner
> Priority: Minor
>
> When running the Flink operator on an EKS cluster with Calico networking, the control plane (managed by AWS) cannot reach the webhook. Requests to create Flink resources fail with {_}Address is not allowed{_}.
> When the webhook listens on a hostPort, the requests to create Flink resources are successful. However, a pod security policy is generally required to allow the webhook to listen on such ports.
> To support this scenario with the Helm chart, make changes so that we can:
> * Specify a hostPort value for the webhook
> * Name the port that the webhook listens on
> * Use the named port in the webhook service
> * Add a "use" pod security policy verb to the cluster role
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #385: [FLINK-29109] Avoid checkpoint path conflicts (Flink < 1.16)
gyfora commented on code in PR #385: URL: https://github.com/apache/flink-kubernetes-operator/pull/385#discussion_r979441596 ## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/crd/status/FlinkDeploymentStatus.java: ## @@ -53,4 +53,7 @@ public class FlinkDeploymentStatus extends CommonStatus { /** Information about the TaskManagers for the scale subresource. */ private TaskManagerInfo taskManager; + +/** The jobId, if it was set by the operator. */ +private String overrideJobId; Review Comment: All the current logic uses the jobStatus.jobId for savepointing, cancelling, observing etc, it would avoid a lot of complications :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #385: [FLINK-29109] Avoid checkpoint path conflicts (Flink < 1.16)
gyfora commented on code in PR #385: URL: https://github.com/apache/flink-kubernetes-operator/pull/385#discussion_r979441460 ## flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/crd/status/FlinkDeploymentStatus.java: ## @@ -53,4 +53,7 @@ public class FlinkDeploymentStatus extends CommonStatus { /** Information about the TaskManagers for the scale subresource. */ private TaskManagerInfo taskManager; + +/** The jobId, if it was set by the operator. */ +private String overrideJobId; Review Comment: I think we should simply set it into the jobStatus.jobId and avoid introducing an extra field for it. As long as the job is running we should assume that the jobStatus reflects the correct jobid -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29109) Checkpoint path conflict with stateless upgrade mode
[ https://issues.apache.org/jira/browse/FLINK-29109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-29109: --- Labels: pull-request-available (was: )
> Checkpoint path conflict with stateless upgrade mode
> 
>
> Key: FLINK-29109
> URL: https://issues.apache.org/jira/browse/FLINK-29109
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.1.0
> Reporter: Thomas Weise
> Assignee: Thomas Weise
> Priority: Major
> Labels: pull-request-available
>
> A stateful job with stateless upgrade mode (yes, there are such use cases) fails with a checkpoint path conflict due to the constant jobId and FLINK-19358 (applies to Flink < 1.16.x). Since with stateless upgrade mode the checkpoint id resets on restart, the job is going to write to previously used locations and fail. The workaround is to rotate the jobId on every redeploy when the upgrade mode is stateless. While this can be worked around externally, it is best done in the operator itself, because reconciliation resolves when a restart is actually required, while rotating the jobId externally may trigger unnecessary restarts.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-kubernetes-operator] tweise opened a new pull request, #385: [FLINK-29109] Avoid checkpoint path conflicts (Flink < 1.16)
tweise opened a new pull request, #385: URL: https://github.com/apache/flink-kubernetes-operator/pull/385

## What is the purpose of the change

Automatically set the jobId for Flink versions < 1.16 to avoid conflicts in the checkpoint path (and other jobId-based effects).

TODO:
* Unit test
* This breaks stateful upgrade if the CRD is not updated
* We may want a better way to accommodate this in the status in the future rather than adding individual fields. Should we add a map that can be used for various purposes?

## Brief change log

* Set a jobId when no jobId is defined by the user and the Flink version is < 1.16.
* Rotate the jobId when the upgrade mode is stateless.

## Verifying this change

TODO: This change added tests and can be verified as follows:

Manual verification:
* No jobId set in the Flink config
* Set upgrade mode to stateless. Trigger CR changes and check that every redeploy uses a new ID.
* Set upgrade mode to last_state. Trigger CR changes and check that the job ID is retained across redeploys.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (yes / no)
- The public API, i.e., is any changes to the `CustomResourceDescriptors`: (**yes** / no)
- Core observer or reconciler logic that is regularly executed: (yes / no)

A field is added to the status, and the CRD needs to be updated for this change to work. If the CRD is not updated, the last-state upgrade mode breaks.

## Documentation

- Does this pull request introduce a new feature? (yes / no)
- If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
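The rotation rule can be pictured with a short sketch; the option key and method shape are assumptions for illustration, not the PR's actual code:

```java
import org.apache.flink.api.common.JobID;
import org.apache.flink.configuration.Configuration;

public final class JobIdAssigner {

    private JobIdAssigner() {}

    // Sketch: rotate the pinned job ID for stateless upgrades on Flink < 1.16
    // so every redeploy writes to fresh checkpoint paths; keep it otherwise.
    public static void assignJobId(
            Configuration deployConfig, boolean statelessUpgrade, String previousJobId) {
        String jobId =
                (statelessUpgrade || previousJobId == null)
                        ? new JobID().toHexString()
                        : previousJobId;
        // "$internal.pipeline.job-id" is assumed to be the option Flink uses to
        // pin a job ID; verify against the target Flink version.
        deployConfig.setString("$internal.pipeline.job-id", jobId);
    }
}
```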
[GitHub] [flink] flinkbot commented on pull request #20897: [hotfix][docs] Fix typo in try-flink/datastream.md
flinkbot commented on PR #20897: URL: https://github.com/apache/flink/pull/20897#issuecomment-1257241665 ## CI report: * 6c41532ee60e7e4c9471b2a2216aeb25fe9aa568 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] saikikun opened a new pull request, #20897: [hotfix][docs] Fix typo in try-flink/datastream.md
saikikun opened a new pull request, #20897: URL: https://github.com/apache/flink/pull/20897 Fix typo in try-flink/datastream.md -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-29405) InputFormatCacheLoaderTest is unstable
Chesnay Schepler created FLINK-29405: Summary: InputFormatCacheLoaderTest is unstable Key: FLINK-29405 URL: https://issues.apache.org/jira/browse/FLINK-29405 Project: Flink Issue Type: Technical Debt Components: Table SQL / Runtime Affects Versions: 1.17.0 Reporter: Chesnay Schepler #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably when run in a loop. {code} java.lang.AssertionError: Expecting AtomicInteger(0) to have value: 0 but did not. at org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
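For reference, "run in a loop" typically means a repeated-execution harness along these lines (a generic JUnit 5 + AssertJ sketch, not the Flink test itself):
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

import org.junit.jupiter.api.RepeatedTest;

import static org.assertj.core.api.Assertions.assertThat;

class RepeatedRunSketch {

    private final AtomicInteger pendingReloads = new AtomicInteger();

    @RepeatedTest(500) // surface the race by re-running the body many times
    void reloadLeavesNoPendingWork() throws Exception {
        Thread reload = new Thread(pendingReloads::incrementAndGet);
        reload.start();
        reload.join();
        pendingReloads.decrementAndGet();
        assertThat(pendingReloads).hasValue(0);
    }
}
{code}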
[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #384: [FLINK-29159] Harden initial deployment logic
gyfora commented on PR #384: URL: https://github.com/apache/flink-kubernetes-operator/pull/384#issuecomment-1257210648 cc @morhidi -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29159) Revisit/harden initial deployment logic
[ https://issues.apache.org/jira/browse/FLINK-29159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-29159: --- Labels: pull-request-available (was: ) > Revisit/harden initial deployment logic > --- > > Key: FLINK-29159 > URL: https://issues.apache.org/jira/browse/FLINK-29159 > Project: Flink > Issue Type: Technical Debt > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.1.0 >Reporter: Thomas Weise >Assignee: Gyula Fora >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.2.0 > > > Found isFirstDeployment logic not working as expected for a deployment that > had never successfully deployed (image pull error). We are probably also > lacking test coverage for the initialSavepointPath field. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #384: [FLINK-29159] Harden initial deployment logic
gyfora opened a new pull request, #384: URL: https://github.com/apache/flink-kubernetes-operator/pull/384 ## What is the purpose of the change Fixes a small flaw in the initialDeployment logic that made it useless after the job was actually submitted. This also simplifies code in other places. ## Brief change log - *Rename ReconStatus#isFirstDeployment to isBeforeFirstDeployment to better reflect logic* - *Fix reconciliation metadata isFirstDeployment computation* - *Add some tests* ## Verifying this change Partly covered by existing tests + new tests added for the fixed logic ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changes to the `CustomResourceDescriptors`: no - Core observer or reconciler logic that is regularly executed: yes ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
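The intended semantics can be reduced to one predicate; the accessor below is an assumption for illustration, not the PR's implementation:

```java
// Sketch: a resource stays "before first deployment" until some spec has been
// deployed successfully, so a first attempt that never became healthy (e.g.
// an image pull error) still counts as the first deployment.
static boolean isBeforeFirstDeployment(FlinkDeploymentStatus status) {
    return status.getReconciliationStatus().getLastStableSpec() == null;
}
```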
[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #383: [FLINK-29384] Bump snakeyaml version to 1.32
gyfora commented on code in PR #383: URL: https://github.com/apache/flink-kubernetes-operator/pull/383#discussion_r979406834 ## flink-kubernetes-shaded/pom.xml: ## @@ -88,6 +103,7 @@ under the License. META-INF/DEPENDENCIES META-INF/LICENSE META-INF/MANIFEST.MF + org/apache/flink/kubernetes/shaded/org/yaml/snakeyaml/** Review Comment: Wouldn't this line be the only change we need for `flink-kubernetes-shaded`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-29315) HDFSTest#testBlobServerRecovery fails on CI
[ https://issues.apache.org/jira/browse/FLINK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609156#comment-17609156 ] Yang Wang commented on FLINK-29315: --- After more debugging with strace, I found that it might be related to the kernel version. Then I upgraded the kernel version on all the CI machines from {{3.10.0-1160.62.1.el7.x86_64}} to {{3.10.0-1160.76.1.el7.x86_64}}. It seems that the CI can pass now. https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41323&view=results
> HDFSTest#testBlobServerRecovery fails on CI
> ---
>
> Key: FLINK-29315
> URL: https://issues.apache.org/jira/browse/FLINK-29315
> Project: Flink
> Issue Type: Technical Debt
> Components: Connectors / FileSystem, Tests
> Affects Versions: 1.16.0, 1.15.2
> Reporter: Chesnay Schepler
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.16.0, 1.17.0
>
> The test started failing 2 days ago on different branches. I suspect something's wrong with the CI infrastructure.
> {code:java}
> Sep 15 09:11:22 [ERROR] Failures:
> Sep 15 09:11:22 [ERROR] HDFSTest.testBlobServerRecovery Multiple Failures (2 failures)
> Sep 15 09:11:22 java.lang.AssertionError: Test failed Error while running command to get file permissions : java.io.IOException: Cannot run program "ls": error=1, Operation not permitted
> Sep 15 09:11:22 	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
> Sep 15 09:11:22 	at org.apache.hadoop.util.Shell.runCommand(Shell.java:913)
> Sep 15 09:11:22 	at org.apache.hadoop.util.Shell.run(Shell.java:869)
> Sep 15 09:11:22 	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
> Sep 15 09:11:22 	at org.apache.hadoop.util.Shell.execCommand(Shell.java:1264)
> Sep 15 09:11:22 	at org.apache.hadoop.util.Shell.execCommand(Shell.java:1246)
> Sep 15 09:11:22 	at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1089)
> Sep 15 09:11:22 	at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:697)
> Sep 15 09:11:22 	at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:672)
> Sep 15 09:11:22 	at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:233)
> Sep 15 09:11:22 	at org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:141)
> Sep 15 09:11:22 	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2580)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2622)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2604)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2497)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1501)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:851)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:485)
> Sep 15 09:11:22 	at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
> Sep 15 09:11:22 	at org.apache.flink.hdfstests.HDFSTest.createHDFS(HDFSTest.java:93)
> Sep 15 09:11:22 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Sep 15 09:11:22 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Sep 15 09:11:22 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Sep 15 09:11:22 	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
> Sep 15 09:11:22 	... 67 more
> Sep 15 09:11:22
> Sep 15 09:11:22 java.lang.NullPointerException:
> {code}
[GitHub] [flink] flinkbot commented on pull request #20896: Update Event Time Temporal Join .md
flinkbot commented on PR #20896: URL: https://github.com/apache/flink/pull/20896#issuecomment-1257183323 ## CI report: * 61ad4277694a7a12e8befb414e8a089d7df85fdb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] nicai008 opened a new pull request, #20896: Update Event Time Temporal Join .md
nicai008 opened a new pull request, #20896: URL: https://github.com/apache/flink/pull/20896 In the Event Time Temporal Join example, both tables in the join query have a currency field, so the unqualified `currency` in the query should be changed to `orders.currency`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #20895: Update datastream_api.md
flinkbot commented on PR #20895: URL: https://github.com/apache/flink/pull/20895#issuecomment-1257155574 ## CI report: * 83d8ee9181512a6466e2eaddf59d113ec26c3cdd UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zs19940317 opened a new pull request, #20895: Update datastream_api.md
zs19940317 opened a new pull request, #20895: URL: https://github.com/apache/flink/pull/20895 Translate a sentence into Chinese that appears to have been missed when the rest of the page was translated. The sentence is `In production, commonly used sinks include the StreamingFileSink, various databases, and several pub-sub systems`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org