[flink-kubernetes-operator] branch main updated: [docs][autoscaler] Autoscaler docs and default config improvement

2023-08-17 Thread gyfora
This is an automated email from the ASF dual-hosted git repository.

gyfora pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git


The following commit(s) were added to refs/heads/main by this push:
 new ccf3639c [docs][autoscaler] Autoscaler docs and default config improvement
ccf3639c is described below

commit ccf3639c87400e3bbd94f64ad13dc5d40fd55faf
Author: Gyula Fora 
AuthorDate: Thu Aug 17 09:55:59 2023 +0200

[docs][autoscaler] Autoscaler docs and default config improvement
---
 docs/content/docs/custom-resource/autoscaler.md|  93 +
 .../generated/auto_scaler_configuration.html   |   2 +-
 .../static/img/custom-resource/autoscaler_fig1.png | Bin 0 -> 168727 bytes
 .../static/img/custom-resource/autoscaler_fig2.png | Bin 0 -> 181157 bytes
 .../static/img/custom-resource/autoscaler_fig3.png | Bin 0 -> 178610 bytes
 .../autoscaler/config/AutoScalerOptions.java   |   2 +-
 6 files changed, 78 insertions(+), 19 deletions(-)

diff --git a/docs/content/docs/custom-resource/autoscaler.md 
b/docs/content/docs/custom-resource/autoscaler.md
index 201ab3d8..4140122a 100644
--- a/docs/content/docs/custom-resource/autoscaler.md
+++ b/docs/content/docs/custom-resource/autoscaler.md
@@ -26,7 +26,7 @@ under the License.
 
 # Autoscaler
 
-The operator provides a job autoscaler functionality that collects various 
metrics from running Flink jobs and automatically scales individual job 
vertexes (chained operator groups) to eliminate backpressure and satisfy the 
utilization and catch-up duration target set by the user.
+The operator provides a job autoscaler functionality that collects various 
metrics from running Flink jobs and automatically scales individual job 
vertexes (chained operator groups) to eliminate backpressure and satisfy the 
utilization target set by the user.
 By adjusting parallelism on a job vertex level (in contrast to job 
parallelism) we can efficiently autoscale complex and heterogeneous streaming 
applications.
 
 Key benefits to the user:
@@ -35,26 +35,78 @@ Key benefits to the user:
  - Automatic adaptation to changing load patterns
  - Detailed utilization metrics for performance debugging
 
-Job requirements:
- - The autoscaler currently only works with the latest [Flink 
1.17](https://hub.docker.com/_/flink) or after backporting the following fixes 
to your 1.15/1.16 Flink image
-   - Job vertex parallelism overrides (must have)
- - [Add option to override job vertex parallelisms during job 
submission](https://github.com/apache/flink/commit/23ce2281a0bb4047c64def9af7ddd5f19d88e2a9)
- - [Change ForwardPartitioner to RebalancePartitioner on parallelism 
changes](https://github.com/apache/flink/pull/21443) (consists of 5 commits)
- - [Fix logic for determining downstream subtasks for partitioner 
replacement](https://github.com/apache/flink/commit/fb482fe39844efda33a4c05858903f5b64e158a3)
-   - [Support timespan for busyTime 
metrics](https://github.com/apache/flink/commit/a7fdab8b23cddf568fa32ee7eb804d7c3eb23a35)
 (good to have)
- - Source scaling only supports modern sources which
-   - use the new [Source 
API](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
 that exposes the busy time metric (must have, most common connectors already 
do)
-   - expose the [standardized connector 
metrics](https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics)
 for accessing backlog information (good to have, extra capacity will be added 
for catching up with backlog)
+## Overview
 
-In the current state the autoscaler works best with Kafka sources, as they 
expose all the standardized metrics. It also comes with some additional 
benefits when using Kafka such as automatically detecting and limiting source 
max parallelism to the number of Kafka partitions.
+The autoscaler relies on the metrics exposed by the Flink metric system for 
the individual tasks. The metrics are queried directly from the Flink job.
 
-{{< hint info >}}
-The autoscaler also supports a passive/metrics-only mode where it only 
collects and evaluates scaling related performance metrics but does not trigger 
any job upgrades.
-This can be used to gain confidence in the module without any impact on the 
running applications.
+Collected metrics:
+ - Backlog information at each source
+ - Incoming data rate at the sources (e.g. records/sec written into the Kafka 
topic)
+ - Number of records processed per second in each job vertex
+ - Busy time per second of each job vertex (current utilization)
 
-To disable scaling actions, set: 
`kubernetes.operator.job.autoscaler.scaling.enabled: "false"`
+{{< hint info >}}
+Please note that we are not using any container memory / CPU utilization 
metrics directly here. High utilization will be reflected in the processing 
rate and busy time metrics of the individual job vertexes.
 {{< /hint >}}
 
+The algorithm 
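
As an aside for readers of this digest (not part of the commit): the per-vertex metrics listed in the hunk above can also be inspected by hand while a job runs. The snippet below is a rough illustration only; the REST endpoint and metric names are assumptions to verify against the Flink REST API documentation for the Flink version in use, and the job/vertex ids are placeholders.

    # Illustrative only: query aggregated busy time and input rate for one job vertex.
    FLINK_REST="http://localhost:8081"   # assumed JobManager REST address
    JOB_ID="<job-id>"                    # placeholder
    VERTEX_ID="<vertex-id>"              # placeholder
    curl -s "${FLINK_REST}/jobs/${JOB_ID}/vertices/${VERTEX_ID}/subtasks/metrics?get=busyTimeMsPerSecond,numRecordsInPerSecond"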

[flink-kubernetes-operator] branch release-1.6 updated (575ea323 -> edc49eae)

2023-08-17 Thread gyfora
This is an automated email from the ASF dual-hosted git repository.

gyfora pushed a change to branch release-1.6
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git


 from 575ea323 [FLINK-32774] Improve checking for already upgraded deployments
 new 251bde53 [FLINK-32868][docs] Document the need to backport FLINK-30213 for using autoscaler with older version Flinks
 new edc49eae [docs][autoscaler] Autoscaler docs and default config improvement

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/content/docs/custom-resource/autoscaler.md|  90 +
 .../static/img/custom-resource/autoscaler_fig1.png | Bin 0 -> 168727 bytes
 .../static/img/custom-resource/autoscaler_fig2.png | Bin 0 -> 181157 bytes
 .../static/img/custom-resource/autoscaler_fig3.png | Bin 0 -> 178610 bytes
 4 files changed, 76 insertions(+), 14 deletions(-)
 create mode 100644 docs/static/img/custom-resource/autoscaler_fig1.png
 create mode 100644 docs/static/img/custom-resource/autoscaler_fig2.png
 create mode 100644 docs/static/img/custom-resource/autoscaler_fig3.png



[flink-kubernetes-operator] 02/02: [docs][autoscaler] Autoscaler docs and default config improvement

2023-08-17 Thread gyfora
This is an automated email from the ASF dual-hosted git repository.

gyfora pushed a commit to branch release-1.6
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git

commit edc49eae69150f78ee4d49501718896aa218c40d
Author: Gyula Fora 
AuthorDate: Thu Aug 17 10:15:22 2023 +0200

[docs][autoscaler] Autoscaler docs and default config improvement
---
 docs/content/docs/custom-resource/autoscaler.md|  93 +
 .../static/img/custom-resource/autoscaler_fig1.png | Bin 0 -> 168727 bytes
 .../static/img/custom-resource/autoscaler_fig2.png | Bin 0 -> 181157 bytes
 .../static/img/custom-resource/autoscaler_fig3.png | Bin 0 -> 178610 bytes
 4 files changed, 76 insertions(+), 17 deletions(-)

diff --git a/docs/content/docs/custom-resource/autoscaler.md 
b/docs/content/docs/custom-resource/autoscaler.md
index 201ab3d8..4140122a 100644
--- a/docs/content/docs/custom-resource/autoscaler.md
+++ b/docs/content/docs/custom-resource/autoscaler.md
@@ -26,7 +26,7 @@ under the License.
 
 # Autoscaler
 
-The operator provides a job autoscaler functionality that collects various 
metrics from running Flink jobs and automatically scales individual job 
vertexes (chained operator groups) to eliminate backpressure and satisfy the 
utilization and catch-up duration target set by the user.
+The operator provides a job autoscaler functionality that collects various 
metrics from running Flink jobs and automatically scales individual job 
vertexes (chained operator groups) to eliminate backpressure and satisfy the 
utilization target set by the user.
 By adjusting parallelism on a job vertex level (in contrast to job 
parallelism) we can efficiently autoscale complex and heterogeneous streaming 
applications.
 
 Key benefits to the user:
@@ -35,26 +35,78 @@ Key benefits to the user:
  - Automatic adaptation to changing load patterns
  - Detailed utilization metrics for performance debugging
 
-Job requirements:
- - The autoscaler currently only works with the latest [Flink 
1.17](https://hub.docker.com/_/flink) or after backporting the following fixes 
to your 1.15/1.16 Flink image
-   - Job vertex parallelism overrides (must have)
- - [Add option to override job vertex parallelisms during job 
submission](https://github.com/apache/flink/commit/23ce2281a0bb4047c64def9af7ddd5f19d88e2a9)
- - [Change ForwardPartitioner to RebalancePartitioner on parallelism 
changes](https://github.com/apache/flink/pull/21443) (consists of 5 commits)
- - [Fix logic for determining downstream subtasks for partitioner 
replacement](https://github.com/apache/flink/commit/fb482fe39844efda33a4c05858903f5b64e158a3)
-   - [Support timespan for busyTime 
metrics](https://github.com/apache/flink/commit/a7fdab8b23cddf568fa32ee7eb804d7c3eb23a35)
 (good to have)
- - Source scaling only supports modern sources which
-   - use the new [Source 
API](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
 that exposes the busy time metric (must have, most common connectors already 
do)
-   - expose the [standardized connector 
metrics](https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics)
 for accessing backlog information (good to have, extra capacity will be added 
for catching up with backlog)
+## Overview
 
-In the current state the autoscaler works best with Kafka sources, as they 
expose all the standardized metrics. It also comes with some additional 
benefits when using Kafka such as automatically detecting and limiting source 
max parallelism to the number of Kafka partitions.
+The autoscaler relies on the metrics exposed by the Flink metric system for 
the individual tasks. The metrics are queried directly from the Flink job.
 
-{{< hint info >}}
-The autoscaler also supports a passive/metrics-only mode where it only 
collects and evaluates scaling related performance metrics but does not trigger 
any job upgrades.
-This can be used to gain confidence in the module without any impact on the 
running applications.
+Collected metrics:
+ - Backlog information at each source
+ - Incoming data rate at the sources (e.g. records/sec written into the Kafka 
topic)
+ - Number of records processed per second in each job vertex
+ - Busy time per second of each job vertex (current utilization)
 
-To disable scaling actions, set: 
`kubernetes.operator.job.autoscaler.scaling.enabled: "false"`
+{{< hint info >}}
+Please note that we are not using any container memory / CPU utilization 
metrics directly here. High utilization will be reflected in the processing 
rate and busy time metrics of the individual job vertexes.
 {{< /hint >}}
 
+The algorithm starts from the sources and recursively computes the required 
processing capacity (target data rate) for each operator in the pipeline. At 
the source vertices, target data rate is equal to incoming data rate (from the 
Kafka topic).
+
+For downstream operators we compute the target data rate as 
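
As an illustrative aside, not a quote of the documentation page being added here: one way to write the recursion sketched above is

    targetRate(v) = \sum_{u \in inputs(v)} outputRate(u \to v)

with targetRate(source) taken as the observed incoming data rate (e.g. from the Kafka topic), and outputRate(u -> v) estimated from targetRate(u) and the observed ratio of records emitted on the edge u -> v to records processed by u. The exact formula is defined in the autoscaler documentation added by this commit.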

[flink-kubernetes-operator] 01/02: [FLINK-32868][docs] Document the need to backport FLINK-30213 for using autoscaler with older version Flinks

2023-08-17 Thread gyfora
This is an automated email from the ASF dual-hosted git repository.

gyfora pushed a commit to branch release-1.6
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git

commit 251bde53a022e1cdd2216002665d939e5993a736
Author: Zhanghao 
AuthorDate: Tue Aug 15 13:43:16 2023 +0800

[FLINK-32868][docs] Document the need to backport FLINK-30213 for using 
autoscaler with older version Flinks
---
 docs/content/docs/custom-resource/autoscaler.md | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/content/docs/custom-resource/autoscaler.md 
b/docs/content/docs/custom-resource/autoscaler.md
index 748f9dc2..201ab3d8 100644
--- a/docs/content/docs/custom-resource/autoscaler.md
+++ b/docs/content/docs/custom-resource/autoscaler.md
@@ -37,7 +37,10 @@ Key benefits to the user:
 
 Job requirements:
  - The autoscaler currently only works with the latest [Flink 
1.17](https://hub.docker.com/_/flink) or after backporting the following fixes 
to your 1.15/1.16 Flink image
-   - [Job vertex parallelism 
overrides](https://github.com/apache/flink/commit/23ce2281a0bb4047c64def9af7ddd5f19d88e2a9)
 (must have)
+   - Job vertex parallelism overrides (must have)
+ - [Add option to override job vertex parallelisms during job 
submission](https://github.com/apache/flink/commit/23ce2281a0bb4047c64def9af7ddd5f19d88e2a9)
+ - [Change ForwardPartitioner to RebalancePartitioner on parallelism 
changes](https://github.com/apache/flink/pull/21443) (consists of 5 commits)
+ - [Fix logic for determining downstream subtasks for partitioner 
replacement](https://github.com/apache/flink/commit/fb482fe39844efda33a4c05858903f5b64e158a3)
- [Support timespan for busyTime 
metrics](https://github.com/apache/flink/commit/a7fdab8b23cddf568fa32ee7eb804d7c3eb23a35)
 (good to have)
  - Source scaling only supports modern sources which
- use the new [Source 
API](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
 that exposes the busy time metric (must have, most common connectors already 
do)



[flink] branch master updated: [hotfix][runtime] Remove the unnecessary private put method in FileSystemBlobStore to prevent the createBasePathIfNeeded is called repeatedly

2023-08-17 Thread fanrui
This is an automated email from the ASF dual-hosted git repository.

fanrui pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/master by this push:
 new 1d1fe536022 [hotfix][runtime] Remove the unnecessary private put method in FileSystemBlobStore to prevent the createBasePathIfNeeded is called repeatedly
1d1fe536022 is described below

commit 1d1fe5360221f603829e15475f9a40e93b940a8d
Author: 1996fanrui <1996fan...@gmail.com>
AuthorDate: Thu Aug 17 16:17:01 2023 +0800

[hotfix][runtime] Remove the unnecessary private put method in 
FileSystemBlobStore to prevent the createBasePathIfNeeded is called repeatedly
---
 .../org/apache/flink/runtime/blob/FileSystemBlobStore.java | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git 
a/flink-runtime/src/main/java/org/apache/flink/runtime/blob/FileSystemBlobStore.java
 
b/flink-runtime/src/main/java/org/apache/flink/runtime/blob/FileSystemBlobStore.java
index 22fb7064360..141f270e2f8 100644
--- 
a/flink-runtime/src/main/java/org/apache/flink/runtime/blob/FileSystemBlobStore.java
+++ 
b/flink-runtime/src/main/java/org/apache/flink/runtime/blob/FileSystemBlobStore.java
@@ -77,15 +77,11 @@ public class FileSystemBlobStore implements 
BlobStoreService {
 @Override
 public boolean put(File localFile, JobID jobId, BlobKey blobKey) throws 
IOException {
 createBasePathIfNeeded();
-return put(localFile, BlobUtils.getStorageLocationPath(basePath, 
jobId, blobKey));
-}
-
-private boolean put(File fromFile, String toBlobPath) throws IOException {
-createBasePathIfNeeded();
+String toBlobPath = BlobUtils.getStorageLocationPath(basePath, jobId, 
blobKey);
 try (FSDataOutputStream os =
 fileSystem.create(new Path(toBlobPath), 
FileSystem.WriteMode.OVERWRITE)) {
-LOG.debug("Copying from {} to {}.", fromFile, toBlobPath);
-Files.copy(fromFile, os);
+LOG.debug("Copying from {} to {}.", localFile, toBlobPath);
+Files.copy(localFile, os);
 
 os.sync();
 }



[flink] 02/15: [FLINK-32834] Clean/overwrite pre-existing tmp files

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 5f719e748a0450d385182aaa8312df114b38e30a
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 10:06:54 2023 +0200

[FLINK-32834] Clean/overwrite pre-existing tmp files

- support repeated usages of the scripts
---
 tools/ci/compile.sh   | 1 +
 tools/ci/verify_scala_suffixes.sh | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 4c38931085b..92b8d331755 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -32,6 +32,7 @@ MVN_CLEAN_COMPILE_OUT="/tmp/clean_compile.out"
 # Deploy into this directory, to run license checks on all jars staged for 
deployment.
 # This helps us ensure that ALL artifacts we deploy to maven central adhere to 
our license conditions.
 MVN_VALIDATION_DIR="/tmp/flink-validation-deployment"
+rm -rf ${MVN_VALIDATION_DIR}
 
 # source required ci scripts
 source "${CI_DIR}/stage.sh"
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index 756a7503bee..2af7619a51a 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -50,7 +50,7 @@ cd "$FLINK_ROOT" || exit
 
 dependency_plugin_output=${CI_DIR}/dep.txt
 
-run_mvn dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} 
>> "${dependency_plugin_output}"
+run_mvn dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} 
> "${dependency_plugin_output}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then



[flink] 06/15: [FLINK-32834] Decouple low-level scripts from maven-utlls

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit e638115336b2c88c16fbc421dd2ae931e39d1df4
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 11:02:59 2023 +0200

[FLINK-32834] Decouple low-level scripts from maven-utlls

- ease usage by reducing number of parameters
- avoid CI-exclusive maven-utils
- use mvnw by default
---
 tools/ci/compile.sh | 6 +++---
 tools/ci/license_check.sh   | 7 +++
 tools/ci/verify_bundled_optional.sh | 7 +++
 tools/ci/verify_scala_suffixes.sh   | 8 +++-
 4 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 11e1e22e9e6..271501033e4 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -97,11 +97,11 @@ fi
 
 echo " Checking bundled dependencies marked as optional 
"
 
-${CI_DIR}/verify_bundled_optional.sh $MVN_CLEAN_COMPILE_OUT "$CI_DIR" || exit 
$?
+MVN=run_mvn ${CI_DIR}/verify_bundled_optional.sh $MVN_CLEAN_COMPILE_OUT || 
exit $?
 
 echo " Checking scala suffixes "
 
-${CI_DIR}/verify_scala_suffixes.sh "$CI_DIR" || exit $?
+MVN=run_mvn ${CI_DIR}/verify_scala_suffixes.sh || exit $?
 
 echo " Checking shaded dependencies "
 
@@ -117,7 +117,7 @@ echo " Run license check "
 find $MVN_VALIDATION_DIR
 # We use a different Scala version with Java 17
 if [[ ${PROFILE} != *"jdk17"* ]]; then
-  ${CI_DIR}/license_check.sh $MVN_CLEAN_COMPILE_OUT $CI_DIR 
$MVN_VALIDATION_DIR || exit $?
+  MVN=run_mvn ${CI_DIR}/license_check.sh $MVN_CLEAN_COMPILE_OUT 
$MVN_VALIDATION_DIR || exit $?
 fi
 
 exit $EXIT_CODE
diff --git a/tools/ci/license_check.sh b/tools/ci/license_check.sh
index 3b3e02603c0..4f2aebbde00 100755
--- a/tools/ci/license_check.sh
+++ b/tools/ci/license_check.sh
@@ -18,12 +18,11 @@
 

 
 MVN_CLEAN_COMPILE_OUT=$1
-CI_DIR=$2
-FLINK_DEPLOYED_ROOT=$3
+FLINK_DEPLOYED_ROOT=$2
 
-source "${CI_DIR}/maven-utils.sh"
+MVN=${MVN:-./mvnw}
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args="$MVN_CLEAN_COMPILE_OUT $(pwd) $FLINK_DEPLOYED_ROOT"
+$MVN -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args="$MVN_CLEAN_COMPILE_OUT $(pwd) $FLINK_DEPLOYED_ROOT"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index bcacc705f43..34fff60ca2d 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -19,17 +19,16 @@
 
 ## Checks that all bundled dependencies are marked as optional in the poms
 MVN_CLEAN_COMPILE_OUT=$1
-CI_DIR=$2
 
-source "${CI_DIR}/maven-utils.sh"
+MVN=${MVN:-./mvnw}
 
 dependency_plugin_output=/tmp/optional_dep.txt
 
-run_mvn dependency:tree -B > "${dependency_plugin_output}"
+$MVN dependency:tree -B > "${dependency_plugin_output}"
 
 cat "${dependency_plugin_output}"
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.optional.ShadeOptionalChecker 
-Dexec.args="${MVN_CLEAN_COMPILE_OUT} ${dependency_plugin_output}"
+$MVN -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.optional.ShadeOptionalChecker 
-Dexec.args="${MVN_CLEAN_COMPILE_OUT} ${dependency_plugin_output}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index 37581b472b3..4c5cc389eb3 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -37,17 +37,15 @@
 #
 # The script uses 'mvn dependency:tree -Dincludes=org.scala-lang' to list Scala
 # dependent modules.
-CI_DIR=$1
+MVN=${MVN:-./mvnw}
 
 echo "--- Flink Scala Dependency Analyzer ---"
 echo "Analyzing modules for Scala dependencies using 'mvn dependency:tree'."
 echo "If you haven't built the project, please do so first by running \"mvn 
clean install -DskipTests\""
 
-source "${CI_DIR}/maven-utils.sh"
-
 dependency_plugin_output=/tmp/dep.txt
 
-run_mvn dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} 
> "${dependency_plugin_output}"
+$MVN dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} > 
"${dependency_plugin_output}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
@@ -58,7 +56,7 @@ if [ $EXIT_CODE != 0 ]; then
 exit 1
 fi
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker 
-Dexec.args="${dependency_plugin_output} $(pwd)"
+$MVN -pl tools/ci/flink-ci-tools exec:java exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker 
-Dexec.args="${dependency_p

[flink] branch master updated (1d1fe536022 -> 5bf5003f5c7)

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


 from 1d1fe536022 [hotfix][runtime] Remove the unnecessary private put method in FileSystemBlobStore to prevent the createBasePathIfNeeded is called repeatedly
 new 5d163dd39d6 [FLINK-32834] Run all compile scripts from root directory
 new 5f719e748a0 [FLINK-32834] Clean/overwrite pre-existing tmp files
 new d08356a1afe [FLINK-32834] Write all tmp files to /tmp/
 new 78b5ddb11df [FLINK-32834] Rework parameter quoting
 new 4bcbc7caed9 [FLINK-32834] Remove FLINK_ROOT parameter
 new e638115336b [FLINK-32834] Decouple low-level scripts from maven-utlls
 new 003eae50c36 [FLINK-32834] Streamline CI_DIR detection
 new d3059389aea [FLINK-32834] Decouple compile.sh from maven-utils.sh
 new 109c7ff261b [FLINK-32834] Forward any additional args to maven
 new b38cd662bcf [FLINK-32834] Add documentation to the scripts
 new ed31bca4c9e [FLINK-32834] Add usage information and -h option to 
low-level scripts
 new 6663b398244 [FLINK-32834] Fail early if dependency plugin fails
 new 3e5eca702d2 [FLINK-32834] Forward actual Maven error code instead of 1
 new 6faf3680d4d [FLINK-32834] Force parallelism of 1
 new 5bf5003f5c7 [FLINK-32834] Use descriptive output file names

The 15 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 tools/azure-pipelines/e2e-template.yml |  2 +-
 tools/azure-pipelines/jobs-template.yml|  2 +-
 tools/ci/compile.sh| 54 ++
 .../dev/run_pip_test.sh => tools/ci/compile_ci.sh  | 11 -
 tools/ci/license_check.sh  | 36 ---
 tools/ci/maven-utils.sh|  6 +--
 tools/ci/shade.sh  | 30 ++--
 tools/ci/verify_bundled_optional.sh| 48 +++
 tools/ci/verify_scala_suffixes.sh  | 48 +++
 9 files changed, 161 insertions(+), 76 deletions(-)
 copy flink-python/dev/run_pip_test.sh => tools/ci/compile_ci.sh (79%)



[flink] 08/15: [FLINK-32834] Decouple compile.sh from maven-utils.sh

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit d3059389aea1e1d65ad4c3a7af88ddf7e21e80c3
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 11:13:54 2023 +0200

[FLINK-32834] Decouple compile.sh from maven-utils.sh

- run directly against maven by default
- avoid CI-exclusive maven-utils
---
 tools/azure-pipelines/e2e-template.yml  |  2 +-
 tools/azure-pipelines/jobs-template.yml |  2 +-
 tools/ci/compile.sh | 12 ++--
 tools/ci/compile_ci.sh  | 29 +
 4 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/tools/azure-pipelines/e2e-template.yml 
b/tools/azure-pipelines/e2e-template.yml
index 9fe0b7b8a05..6ec29f20897 100644
--- a/tools/azure-pipelines/e2e-template.yml
+++ b/tools/azure-pipelines/e2e-template.yml
@@ -115,7 +115,7 @@ jobs:
 sudo apt install ./libssl1.0.0_*.deb
   displayName: Prepare E2E run
   condition: not(eq(variables['SKIP'], '1'))
-- script: ${{parameters.environment}} PROFILE="$PROFILE -Dfast 
-Pskip-webui-build" ./tools/ci/compile.sh
+- script: ${{parameters.environment}} PROFILE="$PROFILE -Dfast 
-Pskip-webui-build" ./tools/ci/compile_ci.sh
   displayName: Build Flink
   condition: not(eq(variables['SKIP'], '1'))
 - script: ${{parameters.environment}} FLINK_DIR=`pwd`/build-target 
./tools/azure-pipelines/uploading_watchdog.sh 
flink-end-to-end-tests/run-nightly-tests.sh ${{parameters.group}}
diff --git a/tools/azure-pipelines/jobs-template.yml 
b/tools/azure-pipelines/jobs-template.yml
index b8bbbf87b28..908f91f7e91 100644
--- a/tools/azure-pipelines/jobs-template.yml
+++ b/tools/azure-pipelines/jobs-template.yml
@@ -68,7 +68,7 @@ jobs:
 displayName: "Set JDK"
   # Compile
   - script: |
-  ${{parameters.environment}} ./tools/ci/compile.sh || exit $?
+  ${{parameters.environment}} ./tools/ci/compile_ci.sh || exit $?
   ./tools/azure-pipelines/create_build_artifact.sh
 displayName: Compile
 
diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 8d1e6fbfce0..5fc899dd00b 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -23,6 +23,7 @@
 
 CI_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
 MVN_CLEAN_COMPILE_OUT="/tmp/clean_compile.out"
+MVN=${MVN:-./mvnw}
 
 # Deploy into this directory, to run license checks on all jars staged for 
deployment.
 # This helps us ensure that ALL artifacts we deploy to maven central adhere to 
our license conditions.
@@ -32,10 +33,9 @@ rm -rf ${MVN_VALIDATION_DIR}
 # source required ci scripts
 source "${CI_DIR}/stage.sh"
 source "${CI_DIR}/shade.sh"
-source "${CI_DIR}/maven-utils.sh"
 
 echo "Maven version:"
-run_mvn -version
+$MVN -version
 
 echo 
"=="
 echo "Compiling Flink"
@@ -43,7 +43,7 @@ echo 
"==
 
 EXIT_CODE=0
 
-run_mvn clean deploy 
-DaltDeploymentRepository=validation_repository::default::file:$MVN_VALIDATION_DIR
 -Dflink.convergence.phase=install -Pcheck-convergence \
+$MVN clean deploy 
-DaltDeploymentRepository=validation_repository::default::file:$MVN_VALIDATION_DIR
 -Dflink.convergence.phase=install -Pcheck-convergence \
 -Dmaven.javadoc.skip=true -U -DskipTests | tee $MVN_CLEAN_COMPILE_OUT
 
 EXIT_CODE=${PIPESTATUS[0]}
@@ -92,11 +92,11 @@ fi
 
 echo " Checking bundled dependencies marked as optional 
"
 
-MVN=run_mvn ${CI_DIR}/verify_bundled_optional.sh $MVN_CLEAN_COMPILE_OUT || 
exit $?
+MVN=$MVN ${CI_DIR}/verify_bundled_optional.sh $MVN_CLEAN_COMPILE_OUT || exit $?
 
 echo " Checking scala suffixes "
 
-MVN=run_mvn ${CI_DIR}/verify_scala_suffixes.sh || exit $?
+MVN=$MVN ${CI_DIR}/verify_scala_suffixes.sh || exit $?
 
 echo " Checking shaded dependencies "
 
@@ -112,7 +112,7 @@ echo " Run license check "
 find $MVN_VALIDATION_DIR
 # We use a different Scala version with Java 17
 if [[ ${PROFILE} != *"jdk17"* ]]; then
-  MVN=run_mvn ${CI_DIR}/license_check.sh $MVN_CLEAN_COMPILE_OUT 
$MVN_VALIDATION_DIR || exit $?
+  MVN=$MVN ${CI_DIR}/license_check.sh $MVN_CLEAN_COMPILE_OUT 
$MVN_VALIDATION_DIR || exit $?
 fi
 
 exit $EXIT_CODE
diff --git a/tools/ci/compile_ci.sh b/tools/ci/compile_ci.sh
new file mode 100755
index 000..d9b3f68a49a
--- /dev/null
+++ b/tools/ci/compile_ci.sh
@@ -0,0 +1,29 @@
+#!/usr/bin/env bash
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Ver
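
As a side note on the MVN=run_mvn invocations seen in compile.sh above: run_mvn is a bash function that maven-utils.sh exports with 'export -f', so passing its name through the MVN environment variable lets a child script call it as if it were a binary. A minimal self-contained sketch of that mechanism, using a stand-in wrapper function:

    #!/usr/bin/env bash
    run_mvn() { echo "(wrapper) mvn $*"; }   # stand-in for the CI run_mvn wrapper
    export -f run_mvn                        # exported functions are visible to child bash processes
    # The child expands $MVN to "run_mvn" and resolves it to the exported function;
    # with MVN unset, the scripts fall back to ./mvnw via MVN=${MVN:-./mvnw}.
    MVN=run_mvn bash -c '$MVN -version'      # prints: (wrapper) mvn -version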

[flink] 04/15: [FLINK-32834] Rework parameter quoting

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 78b5ddb11dfd2a3a00b58079fe9ee29a80555988
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 10:41:06 2023 +0200

[FLINK-32834] Rework parameter quoting

The current quoting scheme wasn't compatible if run_mvn points directly to 
maven.
This both blocks a follow-up where we skip maven-utils entirely in local 
runs, and makes it easier to copy&paste commands to/from the command-line.
It was also not really intuitive. At all.

This comes with a slight maintenance cost to maven-utils.sh#run_mvn, but 
given that we'll likely throw this out in 1.19 anyway (because there are better 
ways to implement this on later maven versions), we can bear that cost.
---
 tools/ci/license_check.sh   | 2 +-
 tools/ci/maven-utils.sh | 6 ++
 tools/ci/verify_bundled_optional.sh | 2 +-
 tools/ci/verify_scala_suffixes.sh   | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/tools/ci/license_check.sh b/tools/ci/license_check.sh
index 9bbcadab96a..7ba98c88eae 100755
--- a/tools/ci/license_check.sh
+++ b/tools/ci/license_check.sh
@@ -24,7 +24,7 @@ FLINK_DEPLOYED_ROOT=$4
 
 source "${CI_DIR}/maven-utils.sh"
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args=\"$MVN_CLEAN_COMPILE_OUT $FLINK_ROOT $FLINK_DEPLOYED_ROOT\"
+run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args="$MVN_CLEAN_COMPILE_OUT $FLINK_ROOT $FLINK_DEPLOYED_ROOT"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/maven-utils.sh b/tools/ci/maven-utils.sh
index c42402ff3d9..15a92575feb 100755
--- a/tools/ci/maven-utils.sh
+++ b/tools/ci/maven-utils.sh
@@ -21,12 +21,10 @@ function run_mvn {
MVN_CMD="${M2_HOME}/bin/mvn"
fi
 
-   ARGS=$@
-   INVOCATION="$MVN_CMD $MVN_GLOBAL_OPTIONS $ARGS"
if [[ "$MVN_RUN_VERBOSE" != "false" ]]; then
-   echo "Invoking mvn with '$INVOCATION'"
+   echo "Invoking mvn with '$MVN_GLOBAL_OPTIONS ${@}'"
fi
-   eval $INVOCATION
+   $MVN_CMD $MVN_GLOBAL_OPTIONS "${@}"
 }
 export -f run_mvn
 
diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index a570ae92afb..d2c34e638db 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -32,7 +32,7 @@ run_mvn dependency:tree -B > "${dependency_plugin_output}"
 
 cat "${dependency_plugin_output}"
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.optional.ShadeOptionalChecker 
-Dexec.args=\""${MVN_CLEAN_COMPILE_OUT}" "${dependency_plugin_output}"\"
+run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.optional.ShadeOptionalChecker 
-Dexec.args="${MVN_CLEAN_COMPILE_OUT} ${dependency_plugin_output}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index 22653d2dafb..b827a1c19f5 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -61,7 +61,7 @@ if [ $EXIT_CODE != 0 ]; then
 exit 1
 fi
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker 
-Dexec.args=\""${dependency_plugin_output}" "${FLINK_ROOT}"\"
+run_mvn -pl tools/ci/flink-ci-tools exec:java exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker 
-Dexec.args="${dependency_plugin_output} ${FLINK_ROOT}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE == 0 ]; then
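
As a side note, not from the commit: the difference between the old eval-based quoting and plain "${@}" forwarding can be reproduced with a small self-contained sketch (printf stands in for Maven here):

    #!/usr/bin/env bash
    run_eval()   { eval "printf '[%s] ' $*"; echo; }   # old style: eval re-splits arguments containing spaces
    run_direct() { printf '[%s] ' "${@}"; echo; }      # new style: every argument is forwarded verbatim

    run_eval   -Dexec.args="out.txt /tmp/deploy"   # prints: [-Dexec.args=out.txt] [/tmp/deploy]
    run_direct -Dexec.args="out.txt /tmp/deploy"   # prints: [-Dexec.args=out.txt /tmp/deploy]

This is why -Dexec.args previously had to be wrapped in escaped quotes, and why a logged command could not simply be copied back into a terminal.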



[flink] 12/15: [FLINK-32834] Fail early if dependency plugin fails

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 6663b3982444a3ee3c67e68289b24b31aac15c02
Author: Chesnay Schepler 
AuthorDate: Wed Aug 16 13:02:41 2023 +0200

[FLINK-32834] Fail early if dependency plugin fails
---
 tools/ci/verify_bundled_optional.sh | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index 0d3dbdf41ba..890b1a5acf5 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -49,6 +49,15 @@ MVN=${MVN:-./mvnw}
 dependency_plugin_output=/tmp/optional_dep.txt
 
 $MVN dependency:tree -B > "${dependency_plugin_output}"
+EXIT_CODE=$?
+
+if [ $EXIT_CODE != 0 ]; then
+cat ${dependency_plugin_output}
+echo 
"=="
+echo "Optional Check failed. The dependency tree could not be determined. 
See previous output for details."
+echo 
"=="
+exit 1
+fi
 
 cat "${dependency_plugin_output}"
 



[flink] 14/15: [FLINK-32834] Force parallelism of 1

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 6faf3680d4d558f7f5577f9997ebaea1d9bc7b77
Author: Chesnay Schepler 
AuthorDate: Thu Aug 17 11:32:27 2023 +0200

[FLINK-32834] Force parallelism of 1

The maven output parsers rely on certain order of messages which can be 
broken by multi-threaded builds.
---
 tools/ci/compile.sh | 3 ++-
 tools/ci/verify_bundled_optional.sh | 3 ++-
 tools/ci/verify_scala_suffixes.sh   | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index ee4ba13d940..0603e7b7e48 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -58,8 +58,9 @@ echo 
"==
 
 EXIT_CODE=0
 
+# run with -T1 because our maven output parsers don't support multi-threaded 
builds
 $MVN clean deploy 
-DaltDeploymentRepository=validation_repository::default::file:$MVN_VALIDATION_DIR
 -Dflink.convergence.phase=install -Pcheck-convergence \
--Dmaven.javadoc.skip=true -U -DskipTests "${@}" | tee 
$MVN_CLEAN_COMPILE_OUT
+-Dmaven.javadoc.skip=true -U -DskipTests "${@}" -T1 | tee 
$MVN_CLEAN_COMPILE_OUT
 
 EXIT_CODE=${PIPESTATUS[0]}
 
diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index 276d95eb634..e0f5a22255d 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -48,7 +48,8 @@ MVN=${MVN:-./mvnw}
 
 dependency_plugin_output=/tmp/optional_dep.txt
 
-$MVN dependency:tree -B > "${dependency_plugin_output}"
+# run with -T1 because our maven output parsers don't support multi-threaded 
builds
+$MVN dependency:tree -B -T1 > "${dependency_plugin_output}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index 9747066b4c8..f6aae040731 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -63,7 +63,8 @@ echo "If you haven't built the project, please do so first by 
running \"mvn clea
 
 dependency_plugin_output=/tmp/dep.txt
 
-$MVN dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} > 
"${dependency_plugin_output}"
+# run with -T1 because our maven output parsers don't support multi-threaded 
builds
+$MVN dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} 
-T1 > "${dependency_plugin_output}"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then



[flink] 10/15: [FLINK-32834] Add documentation to the scripts

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit b38cd662bcfe66dd467bfc88c4ec313f444c48fd
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 12:36:55 2023 +0200

[FLINK-32834] Add documentation to the scripts
---
 tools/ci/compile.sh| 17 -
 tools/ci/compile_ci.sh |  2 +-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index bc349e3d53a..ee4ba13d940 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -18,7 +18,22 @@
 

 
 #
-# This file contains tooling for compiling Flink
+# This script compiles Flink and runs all QA checks apart from tests.
+#
+# This script should not contain any CI-specific logic; put these into 
compile_ci.sh instead.
+#
+# Usage: [MVN=/path/to/maven] tools/ci/compile.sh [additional maven args]
+# - Use the MVN environment variable to point the script to another maven 
installation.
+# - Any script argument is forwarded to the Flink maven build. Use it to 
skip/modify parts of the build process.
+#
+# Tips:
+# - '-Pskip-webui-build' skips the WebUI build.
+# - '-Dfast' skips Maven QA checks.
+# - '-Dmaven.clean.skip' skips recompilation of classes.
+# Example: tools/ci/compile.sh -Dmaven.clean.skip -Dfast -Pskip-webui-build, 
use -Dmaven.clean.skip to avoid recompiling classes.
+#
+# Warnings:
+# - Skipping modules via '-pl [!]<module>' is not recommended because checks may assume/require a full build.
 #
 
 CI_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
diff --git a/tools/ci/compile_ci.sh b/tools/ci/compile_ci.sh
index d9b3f68a49a..d88cce6848d 100755
--- a/tools/ci/compile_ci.sh
+++ b/tools/ci/compile_ci.sh
@@ -18,7 +18,7 @@
 

 
 #
-# This file contains tooling for compiling Flink
+# This script is the CI entrypoint for compiling Flink and running QA checks 
that don't require tests.
 #
 
 CI_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)



[flink] 05/15: [FLINK-32834] Remove FLINK_ROOT parameter

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 4bcbc7caed977b0c2c53c5b549b7feed4d430643
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 10:55:39 2023 +0200

[FLINK-32834] Remove FLINK_ROOT parameter

Ease direct manual usage by reducing the number of parameters.
---
 tools/ci/compile.sh | 6 +++---
 tools/ci/license_check.sh   | 5 ++---
 tools/ci/verify_bundled_optional.sh | 3 ---
 tools/ci/verify_scala_suffixes.sh   | 5 +
 4 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 49743999707..11e1e22e9e6 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -97,11 +97,11 @@ fi
 
 echo " Checking bundled dependencies marked as optional 
"
 
-${CI_DIR}/verify_bundled_optional.sh $MVN_CLEAN_COMPILE_OUT "$CI_DIR" "$(pwd)" 
|| exit $?
+${CI_DIR}/verify_bundled_optional.sh $MVN_CLEAN_COMPILE_OUT "$CI_DIR" || exit 
$?
 
 echo " Checking scala suffixes "
 
-${CI_DIR}/verify_scala_suffixes.sh "$CI_DIR" "$(pwd)" || exit $?
+${CI_DIR}/verify_scala_suffixes.sh "$CI_DIR" || exit $?
 
 echo " Checking shaded dependencies "
 
@@ -117,7 +117,7 @@ echo " Run license check "
 find $MVN_VALIDATION_DIR
 # We use a different Scala version with Java 17
 if [[ ${PROFILE} != *"jdk17"* ]]; then
-  ${CI_DIR}/license_check.sh $MVN_CLEAN_COMPILE_OUT $CI_DIR $(pwd) 
$MVN_VALIDATION_DIR || exit $?
+  ${CI_DIR}/license_check.sh $MVN_CLEAN_COMPILE_OUT $CI_DIR 
$MVN_VALIDATION_DIR || exit $?
 fi
 
 exit $EXIT_CODE
diff --git a/tools/ci/license_check.sh b/tools/ci/license_check.sh
index 7ba98c88eae..3b3e02603c0 100755
--- a/tools/ci/license_check.sh
+++ b/tools/ci/license_check.sh
@@ -19,12 +19,11 @@
 
 MVN_CLEAN_COMPILE_OUT=$1
 CI_DIR=$2
-FLINK_ROOT=$3
-FLINK_DEPLOYED_ROOT=$4
+FLINK_DEPLOYED_ROOT=$3
 
 source "${CI_DIR}/maven-utils.sh"
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args="$MVN_CLEAN_COMPILE_OUT $FLINK_ROOT $FLINK_DEPLOYED_ROOT"
+run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args="$MVN_CLEAN_COMPILE_OUT $(pwd) $FLINK_DEPLOYED_ROOT"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index d2c34e638db..bcacc705f43 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -20,12 +20,9 @@
 ## Checks that all bundled dependencies are marked as optional in the poms
 MVN_CLEAN_COMPILE_OUT=$1
 CI_DIR=$2
-FLINK_ROOT=$3
 
 source "${CI_DIR}/maven-utils.sh"
 
-cd "$FLINK_ROOT" || exit
-
 dependency_plugin_output=/tmp/optional_dep.txt
 
 run_mvn dependency:tree -B > "${dependency_plugin_output}"
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index b827a1c19f5..37581b472b3 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -38,7 +38,6 @@
 # The script uses 'mvn dependency:tree -Dincludes=org.scala-lang' to list Scala
 # dependent modules.
 CI_DIR=$1
-FLINK_ROOT=$2
 
 echo "--- Flink Scala Dependency Analyzer ---"
 echo "Analyzing modules for Scala dependencies using 'mvn dependency:tree'."
@@ -46,8 +45,6 @@ echo "If you haven't built the project, please do so first by 
running \"mvn clea
 
 source "${CI_DIR}/maven-utils.sh"
 
-cd "$FLINK_ROOT" || exit
-
 dependency_plugin_output=/tmp/dep.txt
 
 run_mvn dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} 
> "${dependency_plugin_output}"
@@ -61,7 +58,7 @@ if [ $EXIT_CODE != 0 ]; then
 exit 1
 fi
 
-run_mvn -pl tools/ci/flink-ci-tools exec:java exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker 
-Dexec.args="${dependency_plugin_output} ${FLINK_ROOT}"
+run_mvn -pl tools/ci/flink-ci-tools exec:java exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker 
-Dexec.args="${dependency_plugin_output} $(pwd)"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE == 0 ]; then



[flink] 11/15: [FLINK-32834] Add usage information and -h option to low-level scripts

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit ed31bca4c9efb89ad7b0a2242d51360ab5188e35
Author: Chesnay Schepler 
AuthorDate: Tue Aug 15 17:57:29 2023 +0200

[FLINK-32834] Add usage information and -h option to low-level scripts
---
 tools/ci/license_check.sh   | 26 ++
 tools/ci/verify_bundled_optional.sh | 24 
 tools/ci/verify_scala_suffixes.sh   | 18 ++
 3 files changed, 68 insertions(+)

diff --git a/tools/ci/license_check.sh b/tools/ci/license_check.sh
index 4f2aebbde00..f31a6665061 100755
--- a/tools/ci/license_check.sh
+++ b/tools/ci/license_check.sh
@@ -17,6 +17,32 @@
 # limitations under the License.
 

 
+usage() {
+  echo "Usage: $0  "
+  echo " A file containing the output 
of the Maven build."
+  echo "  A directory containing a 
Maven repository into which the Flink artifacts were deployed."
+  echo ""
+  echo "Example preparation:"
+  echo "mvnw clean deploy 
-DaltDeploymentRepository=validation_repository::default::file:
 > "
+  echo ""
+  echo "The environment variable MVN is used to specify the Maven binaries; 
defaults to 'mvnw'."
+  echo "See further details in the JavaDoc of LicenseChecker."
+}
+
+while getopts 'h' o; do
+  case "${o}" in
+h)
+  usage
+  exit 0
+  ;;
+  esac
+done
+
+if [[ "$#" != "2" ]]; then
+  usage
+  exit 1
+fi
+
 MVN_CLEAN_COMPILE_OUT=$1
 FLINK_DEPLOYED_ROOT=$2
 
diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index 34fff60ca2d..0d3dbdf41ba 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -17,6 +17,30 @@
 # limitations under the License.
 #
 
+usage() {
+  echo "Usage: $0 "
+  echo " A file containing the output 
of the Maven build."
+  echo ""
+  echo "mvnw clean package > "
+  echo ""
+  echo "The environment variable MVN is used to specify the Maven binaries; 
defaults to 'mvnw'."
+  echo "See further details in the JavaDoc of ShadeOptionalChecker."
+}
+
+while getopts 'h' o; do
+  case "${o}" in
+h)
+  usage
+  exit 0
+  ;;
+  esac
+done
+
+if [[ "$#" != "1" ]]; then
+  usage
+  exit 1
+fi
+
 ## Checks that all bundled dependencies are marked as optional in the poms
 MVN_CLEAN_COMPILE_OUT=$1
 
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index 4c5cc389eb3..45fca80a842 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -37,6 +37,24 @@
 #
 # The script uses 'mvn dependency:tree -Dincludes=org.scala-lang' to list Scala
 # dependent modules.
+
+
+usage() {
+  echo "Usage: $0"
+  echo ""
+  echo "The environment variable MVN is used to specify the Maven binaries; 
defaults to 'mvnw'."
+  echo "See further details in the JavaDoc of ScalaSuffixChecker."
+}
+
+while getopts 'h' o; do
+  case "${o}" in
+h)
+  usage
+  exit 0
+  ;;
+  esac
+done
+
 MVN=${MVN:-./mvnw}
 
 echo "--- Flink Scala Dependency Analyzer ---"



[flink] 03/15: [FLINK-32834] Write all tmp files to /tmp/

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit d08356a1afe9a86b06c3978c04b0ef467d1685a6
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 10:13:04 2023 +0200

[FLINK-32834] Write all tmp files to /tmp/

- prevent files from being picked up by git/rat
- don't use /target because mvn clean would interfere
- tmp dirs under tools/ci would be neat, but we lack a central place to 
create it
---
 tools/ci/compile.sh | 14 +-
 tools/ci/shade.sh   | 30 --
 tools/ci/verify_bundled_optional.sh |  2 +-
 tools/ci/verify_scala_suffixes.sh   |  2 +-
 4 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 92b8d331755..49743999707 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -70,24 +70,28 @@ fi
 
 echo " Checking Javadocs "
 
+javadoc_output=/tmp/javadoc.out
+
 # use the same invocation as .github/workflows/docs.sh
-run_mvn javadoc:aggregate -DadditionalJOption='-Xdoclint:none' \
+$MVN javadoc:aggregate -DadditionalJOption='-Xdoclint:none' \
   -Dmaven.javadoc.failOnError=false -Dcheckstyle.skip=true 
-Denforcer.skip=true -Dspotless.skip=true -Drat.skip=true \
-  -Dheader=someTestHeader > javadoc.out
+  -Dheader=someTestHeader > ${javadoc_output}
 EXIT_CODE=$?
 if [ $EXIT_CODE != 0 ] ; then
   echo "ERROR in Javadocs. Printing full output:"
-  cat javadoc.out ; rm javadoc.out
+  cat ${javadoc_output}
   exit $EXIT_CODE
 fi
 
 echo " Checking Scaladocs "
 
-run_mvn scala:doc -Dcheckstyle.skip=true -Denforcer.skip=true 
-Dspotless.skip=true -pl flink-scala 2> scaladoc.out
+scaladoc_output=/tmp/scaladoc.out
+
+$MVN scala:doc -Dcheckstyle.skip=true -Denforcer.skip=true 
-Dspotless.skip=true -pl flink-scala 2> ${scaladoc_output}
 EXIT_CODE=$?
 if [ $EXIT_CODE != 0 ] ; then
   echo "ERROR in Scaladocs. Printing full output:"
-  cat scaladoc.out ; rm scaladoc.out
+  cat ${scaladoc_output}
   exit $EXIT_CODE
 fi
 
diff --git a/tools/ci/shade.sh b/tools/ci/shade.sh
index 70e99251280..51ea11e7540 100755
--- a/tools/ci/shade.sh
+++ b/tools/ci/shade.sh
@@ -17,10 +17,12 @@
 # limitations under the License.
 

 
+jarContents=/tmp/allClasses
+
 # Check the final fat jar for illegal or missing artifacts
 check_shaded_artifacts() {
-   jar tf build-target/lib/flink-dist*.jar > allClasses
-   ASM=`cat allClasses | grep '^org/objectweb/asm/' | wc -l`
+   jar tf build-target/lib/flink-dist*.jar > ${jarContents}
+   ASM=`cat ${jarContents} | grep '^org/objectweb/asm/' | wc -l`
if [ "$ASM" != "0" ]; then
echo 
"=="
echo "Detected '$ASM' unshaded asm dependencies in fat jar"
@@ -28,7 +30,7 @@ check_shaded_artifacts() {
return 1
fi
 
-   GUAVA=`cat allClasses | grep '^com/google/common' | wc -l`
+   GUAVA=`cat ${jarContents} | grep '^com/google/common' | wc -l`
if [ "$GUAVA" != "0" ]; then
echo 
"=="
echo "Detected '$GUAVA' guava dependencies in fat jar"
@@ -36,7 +38,7 @@ check_shaded_artifacts() {
return 1
fi
 
-   CODEHAUS_JACKSON=`cat allClasses | grep '^org/codehaus/jackson' | wc -l`
+   CODEHAUS_JACKSON=`cat ${jarContents} | grep '^org/codehaus/jackson' | 
wc -l`
if [ "$CODEHAUS_JACKSON" != "0" ]; then
echo 
"=="
echo "Detected '$CODEHAUS_JACKSON' unshaded 
org.codehaus.jackson classes in fat jar"
@@ -44,7 +46,7 @@ check_shaded_artifacts() {
return 1
fi
 
-   FASTERXML_JACKSON=`cat allClasses | grep '^com/fasterxml/jackson' | wc 
-l`
+   FASTERXML_JACKSON=`cat ${jarContents} | grep '^com/fasterxml/jackson' | 
wc -l`
if [ "$FASTERXML_JACKSON" != "0" ]; then
echo 
"=="
echo "Detected '$FASTERXML_JACKSON' unshaded 
com.fasterxml.jackson classes in fat jar"
@@ -52,7 +54,7 @@ check_shaded_artifacts() {
return 1
fi
 
-   SNAPPY=`cat allClasses | grep '^org/xerial/snappy' | wc -l`
+   SNAPPY=`cat ${jarContents} | grep '^org/xerial/snappy' | wc -l`
if [ "$SNAPPY" = "0" ]; then
echo 
"=="
echo "Missing snappy dependencies in fat jar"
@@ -60,7 +62,7 @@ check_shaded_artifacts() {
return 1
fi

[flink] 09/15: [FLINK-32834] Forward any additional args to maven

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 109c7ff261be834825e1c78c2c776c2ecb314a8a
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 11:32:43 2023 +0200

[FLINK-32834] Forward any additional args to maven

For example: 'tools/ci/compile.sh -Dfast'
---
 tools/ci/compile.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 5fc899dd00b..bc349e3d53a 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -44,7 +44,7 @@ echo 
"==
 EXIT_CODE=0
 
 $MVN clean deploy 
-DaltDeploymentRepository=validation_repository::default::file:$MVN_VALIDATION_DIR
 -Dflink.convergence.phase=install -Pcheck-convergence \
--Dmaven.javadoc.skip=true -U -DskipTests | tee $MVN_CLEAN_COMPILE_OUT
+-Dmaven.javadoc.skip=true -U -DskipTests "${@}" | tee 
$MVN_CLEAN_COMPILE_OUT
 
 EXIT_CODE=${PIPESTATUS[0]}
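
As a side note on the EXIT_CODE=${PIPESTATUS[0]} line kept above: because the Maven output is piped through tee, plain $? reports tee's exit status and would hide a failed build, while PIPESTATUS[0] preserves the status of the first command in the pipeline. A minimal sketch:

    false | tee /tmp/pipe_demo.out   # tee exits 0, so plain $? would report success
    EXIT_CODE=${PIPESTATUS[0]}       # exit status of 'false', the first pipeline element
    echo "EXIT_CODE=${EXIT_CODE}"    # prints: EXIT_CODE=1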
 



[flink] 15/15: [FLINK-32834] Use descriptive output file names

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 5bf5003f5c7baf19b0164f78558e495d8bb62b04
Author: Chesnay Schepler 
AuthorDate: Thu Aug 17 11:41:33 2023 +0200

[FLINK-32834] Use descriptive output file names
---
 tools/ci/verify_bundled_optional.sh | 2 +-
 tools/ci/verify_scala_suffixes.sh   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index e0f5a22255d..9b926bd559b 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -46,7 +46,7 @@ MVN_CLEAN_COMPILE_OUT=$1
 
 MVN=${MVN:-./mvnw}
 
-dependency_plugin_output=/tmp/optional_dep.txt
+dependency_plugin_output=/tmp/dependency_tree_optional.txt
 
 # run with -T1 because our maven output parsers don't support multi-threaded 
builds
 $MVN dependency:tree -B -T1 > "${dependency_plugin_output}"
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index f6aae040731..bf7dce5b9e1 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -61,7 +61,7 @@ echo "--- Flink Scala Dependency Analyzer ---"
 echo "Analyzing modules for Scala dependencies using 'mvn dependency:tree'."
 echo "If you haven't built the project, please do so first by running \"mvn 
clean install -DskipTests\""
 
-dependency_plugin_output=/tmp/dep.txt
+dependency_plugin_output=/tmp/dependency_tree_scala.txt
 
 # run with -T1 because our maven output parsers don't support multi-threaded 
builds
 $MVN dependency:tree -Dincludes=org.scala-lang,:*_2.1*:: ${MAVEN_ARGUMENTS} 
-T1 > "${dependency_plugin_output}"



[flink] 07/15: [FLINK-32834] Streamline CI_DIR detection

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 003eae50c360001cfdbd38c34013f7ed3704bf7a
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 11:11:47 2023 +0200

[FLINK-32834] Streamline CI_DIR detection

Copied from the Flink connector release scripts
---
 tools/ci/compile.sh | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/tools/ci/compile.sh b/tools/ci/compile.sh
index 271501033e4..8d1e6fbfce0 100755
--- a/tools/ci/compile.sh
+++ b/tools/ci/compile.sh
@@ -21,12 +21,7 @@
 # This file contains tooling for compiling Flink
 #
 
-HERE="`dirname \"$0\"`" # relative
-HERE="`( cd \"$HERE\" && pwd )`"# absolutized and normalized
-if [ -z "$HERE" ] ; then
-exit 1  # fail
-fi
-CI_DIR="$HERE/../ci"
+CI_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
 MVN_CLEAN_COMPILE_OUT="/tmp/clean_compile.out"
 
 # Deploy into this directory, to run license checks on all jars staged for 
deployment.



[flink] 01/15: [FLINK-32834] Run all compile scripts from root directory

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 5d163dd39d6179c3618dcada86d42b2b332569f3
Author: Chesnay Schepler 
AuthorDate: Fri Aug 11 10:01:06 2023 +0200

[FLINK-32834] Run all compile scripts from root directory

- make it easier to work with relative paths
  - specifically, useful to have the scripts rely on mvnw by default (follow-up)
---
 tools/ci/license_check.sh   | 4 +---
 tools/ci/verify_bundled_optional.sh | 4 +---
 tools/ci/verify_scala_suffixes.sh   | 4 +---
 3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/tools/ci/license_check.sh b/tools/ci/license_check.sh
index 79c96416113..9bbcadab96a 100755
--- a/tools/ci/license_check.sh
+++ b/tools/ci/license_check.sh
@@ -24,9 +24,7 @@ FLINK_DEPLOYED_ROOT=$4
 
 source "${CI_DIR}/maven-utils.sh"
 
-cd $CI_DIR/flink-ci-tools/
-
-run_mvn exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args=\"$MVN_CLEAN_COMPILE_OUT $FLINK_ROOT $FLINK_DEPLOYED_ROOT\"
+run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.licensecheck.LicenseChecker 
-Dexec.args=\"$MVN_CLEAN_COMPILE_OUT $FLINK_ROOT $FLINK_DEPLOYED_ROOT\"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_bundled_optional.sh 
b/tools/ci/verify_bundled_optional.sh
index db43b320249..40e761ed3e6 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -32,9 +32,7 @@ run_mvn dependency:tree -B > "${dependency_plugin_output}"
 
 cat "${dependency_plugin_output}"
 
-cd "${CI_DIR}/flink-ci-tools/" || exit
-
-run_mvn exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.optional.ShadeOptionalChecker 
-Dexec.args=\""${MVN_CLEAN_COMPILE_OUT}" "${dependency_plugin_output}"\"
+run_mvn -pl tools/ci/flink-ci-tools exec:java 
-Dexec.mainClass=org.apache.flink.tools.ci.optional.ShadeOptionalChecker 
-Dexec.args=\""${MVN_CLEAN_COMPILE_OUT}" "${dependency_plugin_output}"\"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE != 0 ]; then
diff --git a/tools/ci/verify_scala_suffixes.sh 
b/tools/ci/verify_scala_suffixes.sh
index 53b9edaf08e..756a7503bee 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -61,9 +61,7 @@ if [ $EXIT_CODE != 0 ]; then
 exit 1
 fi
 
-cd "${CI_DIR}/flink-ci-tools/" || exit
-
-run_mvn exec:java -Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker -Dexec.args=\""${dependency_plugin_output}" "${FLINK_ROOT}"\"
+run_mvn -pl tools/ci/flink-ci-tools exec:java exec:java -Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker -Dexec.args=\""${dependency_plugin_output}" "${FLINK_ROOT}"\"
 EXIT_CODE=$?
 
 if [ $EXIT_CODE == 0 ]; then



[flink] 13/15: [FLINK-32834] Forward actual Maven error code instead of 1

2023-08-17 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 3e5eca702d264d87964022c6cebe7dc95fbd8f9b
Author: Chesnay Schepler 
AuthorDate: Thu Aug 17 11:29:11 2023 +0200

[FLINK-32834] Forward actual Maven error code instead of 1
---
 tools/ci/verify_bundled_optional.sh |  4 ++--
 tools/ci/verify_scala_suffixes.sh   | 14 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/tools/ci/verify_bundled_optional.sh b/tools/ci/verify_bundled_optional.sh
index 890b1a5acf5..276d95eb634 100755
--- a/tools/ci/verify_bundled_optional.sh
+++ b/tools/ci/verify_bundled_optional.sh
@@ -56,7 +56,7 @@ if [ $EXIT_CODE != 0 ]; then
 echo "=="
 echo "Optional Check failed. The dependency tree could not be determined. See previous output for details."
 echo "=="
-exit 1
+exit $EXIT_CODE
 fi
 
 cat "${dependency_plugin_output}"
@@ -68,7 +68,7 @@ if [ $EXIT_CODE != 0 ]; then
 echo "=="
 echo "Optional Check failed. See previous output for details."
 echo "=="
-exit 1
+exit $EXIT_CODE
 fi
 
 exit 0
diff --git a/tools/ci/verify_scala_suffixes.sh b/tools/ci/verify_scala_suffixes.sh
index 45fca80a842..9747066b4c8 100755
--- a/tools/ci/verify_scala_suffixes.sh
+++ b/tools/ci/verify_scala_suffixes.sh
@@ -71,18 +71,18 @@ if [ $EXIT_CODE != 0 ]; then
 echo "=="
 echo "Suffix Check failed. The dependency tree could not be determined. See previous output for details."
 echo "=="
-exit 1
+exit $EXIT_CODE
 fi
 
 $MVN -pl tools/ci/flink-ci-tools exec:java exec:java -Dexec.mainClass=org.apache.flink.tools.ci.suffixcheck.ScalaSuffixChecker -Dexec.args="${dependency_plugin_output} $(pwd)"
 EXIT_CODE=$?
 
-if [ $EXIT_CODE == 0 ]; then
-exit 0
+if [ $EXIT_CODE != 0 ]; then
+echo "=="
+echo "Suffix Check failed. See previous output for details."
+echo "=="
+exit $EXIT_CODE
 fi
 
-echo "=="
-echo "Suffix Check failed. See previous output for details."
-echo "=="
-exit 1
+exit 0
 



[flink] branch master updated (5bf5003f5c7 -> 18b4759cc34)

2023-08-17 Thread yuxia
This is an automated email from the ASF dual-hosted git repository.

yuxia pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


from 5bf5003f5c7 [FLINK-32834] Use descriptive output file names
 new 13cb456eb30 [Flink-32356][doc] Add doc for procedure
 new 18b4759cc34 [Flink-32356][doc] Add doc for call statement This closes #23130

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/content.zh/docs/dev/table/procedures.md | 541 ++
 docs/content.zh/docs/dev/table/sql/call.md   | 118 ++
 docs/content/docs/dev/table/procedures.md| 544 +++
 docs/content/docs/dev/table/sql/call.md  | 118 ++
 4 files changed, 1321 insertions(+)
 create mode 100644 docs/content.zh/docs/dev/table/procedures.md
 create mode 100644 docs/content.zh/docs/dev/table/sql/call.md
 create mode 100644 docs/content/docs/dev/table/procedures.md
 create mode 100644 docs/content/docs/dev/table/sql/call.md



[flink] 01/02: [Flink-32356][doc] Add doc for procedure

2023-08-17 Thread yuxia
This is an automated email from the ASF dual-hosted git repository.

yuxia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 13cb456eb30dc50dbd47885b7471cd02cfe56cdf
Author: luoyuxia 
AuthorDate: Wed Aug 2 21:01:57 2023 +0800

[Flink-32356][doc] Add doc for procedure
---
 docs/content.zh/docs/dev/table/procedures.md | 541 ++
 docs/content/docs/dev/table/procedures.md| 544 +++
 2 files changed, 1085 insertions(+)

diff --git a/docs/content.zh/docs/dev/table/procedures.md b/docs/content.zh/docs/dev/table/procedures.md
new file mode 100644
index 000..cb70d759ebc
--- /dev/null
+++ b/docs/content.zh/docs/dev/table/procedures.md
@@ -0,0 +1,541 @@
+---
+title: "Stored Procedures"
+is_beta: true
+weight: 50
+type: docs
+aliases:
+  - /zh/dev/table/procedures.html
+---
+
+
+# Stored Procedures
+
+
+Flink allows users to call stored procedures from the Table API and SQL to perform specific tasks, such as data processing or other administrative tasks. A procedure can run a Flink job with the `StreamExecutionEnvironment` it is given, which makes stored procedures more powerful and flexible.
+
+## Implementation Guide
+
+To call a procedure, it must be available through a catalog. To make a catalog provide a procedure, you first implement the procedure and then return it from `Catalog.getProcedure(ObjectPath procedurePath)`.
+The following steps show how to implement a procedure and make it available through a catalog; a minimal catalog sketch also follows the examples below.
+
+### Procedure Class
+
+An implementation class must implement the interface `org.apache.flink.table.procedures.Procedure`.
+
+The class must be declared `public`, not `abstract`, and should be globally accessible. Non-static inner classes and anonymous classes are not allowed.
+
+### Call Methods
+
+The `Procedure` interface does not declare any methods; instead, the implementation class must provide methods named `call` that contain the actual logic of the procedure. A `call` method must be declared `public` and take a well-defined set of arguments.
+
+Please note:
+
+* The first parameter of a `call` method should always be a `ProcedureContext`, which provides the method `getExecutionEnvironment()` for obtaining the current `StreamExecutionEnvironment`; a Flink job can be run with that `StreamExecutionEnvironment`;
+* The return type of a `call` method must always be an array type, e.g. `int[]`, `String[]`, etc.;
+
+See the Javadoc of the class `org.apache.flink.table.procedures.Procedure` for more details.
+
+Regular JVM method-call semantics apply. Therefore, it is possible to:
+- implement overloaded methods, e.g. `call(ProcedureContext, Integer)` and `call(ProcedureContext, LocalDateTime)`;
+- use var-args, e.g. `call(ProcedureContext, Integer...)`;
+- use object inheritance, e.g. `call(ProcedureContext, Object)` accepts both `LocalDateTime` and `Integer`;
+- and combinations of the above, e.g. `call(ProcedureContext, Object...)` accepts arguments of any type.
+
+If you want to implement a procedure in Scala, please add `scala.annotation.varargs` in the case of var-args. In addition, it is recommended to use boxed primitives (e.g. `java.lang.Integer` instead of `Int`) to support `NULL`.
+
+The following code snippet shows an example of an overloaded procedure:
+
+{{< tabs "7c5a5392-30d7-11ee-be56-0242ac120002" >}}
+{{< tab "Java" >}}
+
+```java
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.procedure.ProcedureContext;
+import org.apache.flink.table.procedures.Procedure;
+
+// a stored procedure with multiple overloaded call methods
+public class GenerateSequenceProcedure implements Procedure {
+
+  public long[] call(ProcedureContext context, int n) {
+    return generate(context.getExecutionEnvironment(), n);
+  }
+
+  public long[] call(ProcedureContext context, String n) {
+    return generate(context.getExecutionEnvironment(), Integer.parseInt(n));
+  }
+
+  private long[] generate(StreamExecutionEnvironment env, int n) throws Exception {
+    long[] sequenceN = new long[n];
+    int i = 0;
+    try (CloseableIterator<Long> result = env.fromSequence(0, n - 1).executeAndCollect()) {
+      while (result.hasNext()) {
+        sequenceN[i++] = result.next();
+      }
+    }
+    return sequenceN;
+  }
+}
+
+```
+{{< /tab >}}
+{{< tab "Scala" >}}
+```scala
+import org.apache.flink.table.procedure.ProcedureContext
+import org.apache.flink.table.procedures.Procedure
+import scala.annotation.varargs
+
+// a stored procedure with multiple overloaded call methods
+class GenerateSequenceProcedure extends Procedure {
+
+  def call(context: ProcedureContext, a: Integer, b: Integer): Array[Integer] = {
+Array(a + b)
+  }
+
+  def call(context: ProcedureContext, a: String, b: String): Array[Integer] = {
+Array(Integer.valueOf(a) + Integer.valueOf(b))
+  }
+
+  @varargs // like Java var-args
+  def call(context: ProcedureContext, d: Double*): Array[Integer] = {
+Array(d.sum.toInt)
+  }
+}
+
+```
+{{< /tab >}}
+{{< /tabs >}}
+
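As noted in the implementation guide above, a procedure only becomes callable once a catalog returns it from `Catalog.getProcedure(ObjectPath)`. The following is a minimal, hedged sketch of that wiring, reusing the `GenerateSequenceProcedure` from the Java tab; extending `GenericInMemoryCatalog`, the hard-coded `generate_n` name, and the exact exception constructor are assumptions for illustration, not the committed example.

```java
import org.apache.flink.table.catalog.GenericInMemoryCatalog;
import org.apache.flink.table.catalog.ObjectPath;
import org.apache.flink.table.catalog.exceptions.ProcedureNotExistException;
import org.apache.flink.table.procedures.Procedure;

// Illustrative only: expose a single procedure under the name `generate_n`.
// A real catalog would look the procedure up in its own metadata.
public class ProcedureProvidingCatalog extends GenericInMemoryCatalog {

    public ProcedureProvidingCatalog(String name, String defaultDatabase) {
        super(name, defaultDatabase);
    }

    @Override
    public Procedure getProcedure(ObjectPath procedurePath) throws ProcedureNotExistException {
        if ("generate_n".equals(procedurePath.getObjectName())) {
            return new GenerateSequenceProcedure();
        }
        throw new ProcedureNotExistException(getName(), procedurePath);
    }
}
```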
+### Type Inference
+The Table API (similar to the SQL standard) is a strongly typed API. Therefore, both the parameters and the return type of a procedure must be mapped to a [data type]({{< ref "docs/dev/table/types" >}}).
+
+From a logical perspective, the planner needs information about the data type, precision, and scale; from a JVM perspective, the planner needs information about how internal data structures are represented as JVM objects when calling a procedure.
+
+The term _type inference_ summarizes the logic that validates input values and derives the data types of the parameters and the result.
+
+Flink procedures implement automatic type inference extraction, which derives data types from the procedure's class and its `call` methods via reflection. If this implicit reflective extraction approach is not successful, the extraction can be supported by annotating the affected parameters, classes, or methods with `@DataTypeHint` and `@ProcedureHint`; examples of how to annotate a procedure are shown below.
+
+Note that although a `call` method must return an array type `T[]`, a `@DataTypeHint` on the return type actually annotates the element type of that array, i.e. `T`, as sketched below.
+
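A minimal sketch of that point, assuming the `@DataTypeHint`, `ProcedureContext`, and `Procedure` APIs shown elsewhere in this doc; the class and the `DECIMAL(12, 3)` hint are illustrative only, not part of the committed example. The hint describes the element type of the returned array, while the method still returns an array:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.procedure.ProcedureContext;
import org.apache.flink.table.procedures.Procedure;

public class RoundProcedure implements Procedure {

  // the hint annotates the element type of the returned array (DECIMAL(12, 3)), not the array type itself
  public @DataTypeHint("DECIMAL(12, 3)") BigDecimal[] call(ProcedureContext context, double value) {
    return new BigDecimal[] {BigDecimal.valueOf(value).setScale(3, RoundingMode.HALF_UP)};
  }
}
```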
+#### Automatic Type Inference
+
+Automatic type inference inspects the procedure's class and its `call` methods to derive data types for the parameters and the result of the procedure. The `@DataTypeHint` and `@ProcedureHint` annotations support automatic type inference.
+
+For a full list of classes that can be implicitly mapped to a data type, see the [data type extraction section]({{< ref "docs/dev/table/types" >}}#data-type-extraction).
+
+**`@DataTypeHint`**
+
+In many scenarios, it is required to support the automatic extraction _inline_ for the parameters and return types of a procedure.
+
+The following example shows how to use `@DataTypeHint`; see the documentation of the annotation class for more details.
+
+{{< tabs "81b297da-30d9-11ee-be56-0242ac120002" >}}
+{{< tab "Java" >}}
+```java
+import org.apache.flink.table.annotation.DataTypeHint;
+import org.apache.flink.table.annotation.InputGroup;

[flink] 02/02: [Flink-32356][doc] Add doc for call statement This closes #23130

2023-08-17 Thread yuxia
This is an automated email from the ASF dual-hosted git repository.

yuxia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 18b4759cc3405e784205f6df55dd7a4f0ee2d3b7
Author: luoyuxia 
AuthorDate: Wed Aug 2 21:02:06 2023 +0800

[Flink-32356][doc] Add doc for call statement
This closes #23130
---
 docs/content.zh/docs/dev/table/sql/call.md | 118 +
 docs/content/docs/dev/table/sql/call.md| 118 +
 2 files changed, 236 insertions(+)

diff --git a/docs/content.zh/docs/dev/table/sql/call.md b/docs/content.zh/docs/dev/table/sql/call.md
new file mode 100644
index 000..3899b80b4d5
--- /dev/null
+++ b/docs/content.zh/docs/dev/table/sql/call.md
@@ -0,0 +1,118 @@
+---
+title: "CALL Statements"
+weight: 19
+type: docs
+aliases:
+- /zh/dev/table/sql/call.html
+---
+
+
+# Call Statements
+
+`Call` statements are used to call stored procedures, which are usually provided to perform data manipulation or administrative tasks.
+
+Attention Currently, a `Call` statement requires the called procedure to exist in the corresponding catalog, so please make sure the procedure exists in the catalog before calling it;
+otherwise an exception will be thrown. You may need to check the documentation of the catalog to find out which procedures are available. To implement a procedure, please refer to [Stored Procedures]({{< ref "docs/dev/table/procedures" >}}).
+
+## Run a CALL statement
+
+{{< tabs "call statement" >}}
+
+{{< tab "Java" >}}
+
+CALL statements can be executed with the `executeSql()` method of the `TableEnvironment`. The `executeSql()` method calls the procedure immediately and returns a `TableResult` object from which the result of the procedure call can be retrieved.
+
+The following examples show how to execute a CALL statement in `TableEnvironment`.
+
+{{< /tab >}}
+{{< tab "Scala" >}}
+
+CALL statements can be executed with the `executeSql()` method of the `TableEnvironment`. The `executeSql()` method calls the procedure immediately and returns a `TableResult` object from which the result of the procedure call can be retrieved.
+
+The following examples show how to execute a CALL statement in `TableEnvironment`.
+{{< /tab >}}
+{{< tab "Python" >}}
+
+CALL statements can be executed with the `execute_sql()` method of the `TableEnvironment`. The `execute_sql()` method calls the procedure immediately and returns a `TableResult` object from which the result of the procedure call can be retrieved.
+
+The following examples show how to execute a CALL statement in `TableEnvironment`.
+
+{{< /tab >}}
+{{< tab "SQL CLI" >}}
+
+CALL statements can be executed in the [SQL CLI]({{< ref "docs/dev/table/sqlClient" >}}).
+
+The following examples show how to execute a CALL statement in the SQL CLI.
+
+{{< /tab >}}
+{{< /tabs >}}
+
+{{< tabs "da4af028-303d-11ee-be56-0242ac120002" >}}
+
+{{< tab "Java" >}}
+```java
+TableEnvironment tEnv = TableEnvironment.create(...);
+
+// assuming the procedure `generate_n` already exists in the `system` database of the current catalog
+tEnv.executeSql("CALL `system`.generate_n(4)").print();
+```
+{{< /tab >}}
+{{< tab "Scala" >}}
+```scala
+val tEnv = TableEnvironment.create(...)
+
+// assuming the procedure `generate_n` already exists in the `system` database of the current catalog
+tEnv.executeSql("CALL `system`.generate_n(4)").print()
+```
+{{< /tab >}}
+{{< tab "Python" >}}
+```python
+table_env = TableEnvironment.create(...)
+
+# assuming the procedure `generate_n` already exists in the `system` database of the current catalog
+table_env.execute_sql("CALL `system`.generate_n(4)").print()
+```
+{{< /tab >}}
+{{< tab "SQL CLI" >}}
+```sql
+-- assuming the procedure `generate_n` already exists in the `system` database of the current catalog
+Flink SQL> CALL `system`.generate_n(4);
+++
+| result |
+++
+|  0 |
+|  1 |
+|  2 |
+|  3 |
+++
+4 rows in set
+!ok
+```
+{{< /tab >}}
+{{< /tabs >}}
+
+{{< top >}}
+
+## Syntax
+
+```sql
+CALL [catalog_name.][database_name.]procedure_name ([ expression [, 
expression]* ] )
+```
+
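To connect the syntax above with the earlier Java tab, here is a hedged sketch of a fully qualified call with several argument expressions. The procedure `copy_files(src, dst)`, the catalog `my_catalog`, and its `system` database are purely illustrative names assumed for this example; they are not shipped with Flink.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CallStatementExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // catalog, database, and both argument expressions are spelled out explicitly
        tEnv.executeSql("CALL my_catalog.`system`.copy_files('/tmp/in', '/tmp/out')").print();
    }
}
```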
diff --git a/docs/content/docs/dev/table/sql/call.md b/docs/content/docs/dev/table/sql/call.md
new file mode 100644
index 000..ca54b7b207f
--- /dev/null
+++ b/docs/content/docs/dev/table/sql/call.md
@@ -0,0 +1,118 @@
+---
+title: "CALL Statements"
+weight: 19
+type: docs
+aliases:
+- /dev/table/sql/call.html
+---
+
+
+# Call Statements
+
+`Call` statements are used to call a stored procedure, which is usually provided to perform data manipulation or administrative tasks.
+
+Attention Currently, `Call` statements require the called procedure to exist in the corresponding catalog, so please make sure the procedure exists in the catalog before calling it.
+If it doesn't exist, an exception will be thrown. You may need to refer to the documentation of the catalog to see the available procedures. To implement a procedure, please refer to [Procedure]({{< ref "docs/dev/table/procedures" >}}).
+
+## Run a CALL statement
+
+{{< tabs "call statement" >}}
+
+{{< tab "Java" >}}
+
+CALL statements can be executed with the `executeSql()` method of the `TableEnvironment`. The `executeSql()` method calls the procedure immediately and returns a `TableResult` instance from which the result of the procedure call can be retrieved.
+
+The following examples show how to execute a CALL statement in `TableEnvironment`.
+
+{{< /tab >}}
+{{< tab "Scala" >}}
+
+CALL statements can be executed with the `executeSql()` method of the `TableEnvironment`. The `executeSql()` method calls the procedure immediately and returns a `TableResult` instance from which the result of the procedure call can be retrieved.
+
+The following examples show how to execute a single CALL statement in `TableEnvironment`.
+{{< /tab >}}
+{{< tab "Python" >}}
+
+CALL statements can be executed with the `execute_sql()` method of the `TableEnvironment`. The `execute_sql()` method calls the procedure immediately and returns a `TableResult` instance from which the result of the procedure call can be retrieved.

[flink] branch master updated (18b4759cc34 -> eab68f31193)

2023-08-17 Thread renqs
This is an automated email from the ASF dual-hosted git repository.

renqs pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


from 18b4759cc34 [Flink-32356][doc] Add doc for call statement This closes #23130
 add eab68f31193 [FLINK-31268][metrics] Change not to initialize operator coordinator metric group lazily (#22048)

No new revisions were added by this update.

Summary of changes:
 .../source/lib/util/IteratorSourceEnumerator.java  |  1 +
 .../executiongraph/DefaultExecutionGraph.java  | 19 +---
 .../DefaultExecutionGraphBuilder.java  |  6 ++-
 .../runtime/executiongraph/ExecutionGraph.java | 16 +--
 .../runtime/executiongraph/ExecutionJobVertex.java | 20 ++--
 .../SpeculativeExecutionJobVertex.java | 12 -
 .../coordination/OperatorCoordinatorHolder.java| 53 +-
 .../scheduler/DefaultExecutionGraphFactory.java|  3 +-
 .../DefaultOperatorCoordinatorHandler.java | 14 ++
 .../scheduler/OperatorCoordinatorHandler.java  |  8 +---
 .../flink/runtime/scheduler/SchedulerBase.java |  3 +-
 .../scheduler/adaptive/CreatingExecutionGraph.java |  2 +-
 .../adaptivebatch/AdaptiveBatchScheduler.java  | 11 +++--
 .../DefaultExecutionGraphConstructionTest.java | 38 +---
 .../executiongraph/EdgeManagerBuildUtilTest.java   | 22 +++--
 .../executiongraph/ExecutionJobVertexTest.java |  4 +-
 .../executiongraph/PointwisePatternTest.java   |  4 +-
 .../TestingDefaultExecutionGraphBuilder.java   |  4 +-
 .../executiongraph/VertexSlotSharingTest.java  |  5 +-
 .../OperatorCoordinatorHolderTest.java |  8 ++--
 .../DefaultOperatorCoordinatorHandlerTest.java | 10 ++--
 .../runtime/scheduler/SchedulerTestingUtils.java   |  5 +-
 .../SsgNetworkMemoryCalculationUtilsTest.java  | 20 ++--
 .../adapter/DefaultExecutionTopologyTest.java  | 19 +---
 .../runtime/scheduler/adaptive/ExecutingTest.java  |  4 +-
 .../adaptive/StateTrackingMockExecutionGraph.java  |  8 +++-
 .../TestingOperatorCoordinatorHandler.java |  8 +---
 .../topology/BuildExecutionGraphBenchmark.java |  4 +-
 .../partitioner/RescalePartitionerTest.java|  5 +-
 29 files changed, 204 insertions(+), 132 deletions(-)



[flink] branch master updated: [FLINK-32653][docs] Add doc for catalog store (#23110)

2023-08-17 Thread zjureel
This is an automated email from the ASF dual-hosted git repository.

zjureel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/master by this push:
 new 5628c709787 [FLINK-32653][docs] Add doc for catalog store (#23110)
5628c709787 is described below

commit 5628c7097875be4bd56fc7805dbdd727d92bdac7
Author: Feng Jin 
AuthorDate: Fri Aug 18 10:34:20 2023 +0800

[FLINK-32653][docs] Add doc for catalog store (#23110)

* [FLINK-32653][docs] Add doc for catalog store
---
 docs/content.zh/docs/dev/table/catalogs.md | 183 
 docs/content/docs/dev/table/catalogs.md| 188 +
 2 files changed, 371 insertions(+)

diff --git a/docs/content.zh/docs/dev/table/catalogs.md b/docs/content.zh/docs/dev/table/catalogs.md
index 8efee5d0e83..72928e0c374 100644
--- a/docs/content.zh/docs/dev/table/catalogs.md
+++ b/docs/content.zh/docs/dev/table/catalogs.md
@@ -803,3 +803,186 @@ the gateway, or you can also use `SET` to specify the listener for ddl, for example:
 Flink SQL> SET 'table.catalog-modification.listeners' = 'your_factory';
 Flink SQL> CREATE TABLE test_table(...);
 ```
+
+## Catalog Store
+
+The catalog store is used to persist catalog configuration. Once a catalog store is configured, the catalogs created in a session are persisted to
+the external system backing the catalog store, so even if the session is rebuilt, the previously created catalogs can be retrieved from the catalog store again.
+
+### Configuring a Catalog Store
+A catalog store can be configured in different ways: one option is the Table API, another is YAML configuration.
+
+Register a catalog store in the Table API using a catalog store instance.
+```java
+// Initialize a catalog Store
+CatalogStore catalogStore = new FileCatalogStore("file://path/to/catalog/store/");
+
+// set up the catalog store
+final EnvironmentSettings settings =
+EnvironmentSettings.newInstance().inBatchMode()
+.withCatalogStore(catalogStore)
+.build();
+
+final TableEnvironment tableEnv = TableEnvironment.create(settings);
+```
+
+Register a catalog store in the Table API using configuration.
+```java
+// set up configuration
+Configuration configuration = new Configuration();
+configuration.set("table.catalog-store.kind", "file");
+configuration.set("table.catalog-store.file.path", 
"file://path/to/catalog/store/");
+// set up the configuration.
+final EnvironmentSettings settings =
+EnvironmentSettings.newInstance().inBatchMode()
+.withConfiguration(configuration)
+.build();
+
+final TableEnvironment tableEnv = TableEnvironment.create(settings);
+```
+
+For the SQL Gateway, it is recommended to configure the catalog store in the `flink-conf.yaml` file, so that all sessions can automatically use the catalogs that have already been created.
+The configuration format is shown below; in general, you need to configure the kind of the catalog store as well as any other options the catalog store requires.
+```yaml
+table.catalog-store.kind: file
+table.catalog-store.file.path: /path/to/catalog/store/
+```
+
+### Catalog Store Types
+Flink ships with two built-in catalog stores, GenericInMemoryCatalogStore and FileCatalogStore. Users can also implement a custom catalog store.
+
+#### GenericInMemoryCatalogStore
+GenericInMemoryCatalogStore is an in-memory implementation of the catalog store. All catalog configurations are only available within the lifecycle of the session,
+and the catalog configurations saved in the store are cleaned up automatically once the session is rebuilt.
+
+| Option | Description |
+|--------|-------------|
+| kind   | The kind of catalog store to use; must be 'generic_in_memory' here. |
+#### FileCatalogStore
+FileCatalogStore saves the user's catalog configuration to files. To use FileCatalogStore, the directory where the catalog
+configurations should be saved must be specified; each catalog maps to its own file, whose name corresponds one-to-one to the catalog name.
+
+The following is an example directory layout when `FileCatalogStore` is used to save `catalog` configurations (an end-to-end usage sketch follows the options table below):
+```shell
+- /path/to/save/the/catalog/
+  - catalog1.yaml
+  - catalog2.yaml
+  - catalog3.yaml
+```
+
+| Option | Description |
+|--------|-------------|
+| kind   | The kind of catalog store to use; must be 'file' here. |
+| path   | The path under which the catalog store saves its files; must be a valid directory, and currently only local directories are supported. |
+
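Putting the FileCatalogStore options together, here is a minimal end-to-end sketch. The store directory, the import paths for `CatalogStore`/`FileCatalogStore`, and the use of a `generic_in_memory` catalog are assumptions made purely for illustration; the key point is that the catalog created in the session is written to the store directory as described above.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.CatalogStore;
import org.apache.flink.table.catalog.FileCatalogStore;

public class FileCatalogStoreExample {
    public static void main(String[] args) {
        // catalogs created in this session are persisted as <catalog-name>.yaml files under this directory
        CatalogStore catalogStore = new FileCatalogStore("file://path/to/catalog/store/");

        TableEnvironment tEnv =
                TableEnvironment.create(
                        EnvironmentSettings.newInstance()
                                .inBatchMode()
                                .withCatalogStore(catalogStore)
                                .build());

        // the catalog definition survives session restarts because it is written to the store
        tEnv.executeSql("CREATE CATALOG my_catalog WITH ('type' = 'generic_in_memory')");
    }
}
```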
+#### Custom Catalog Store
+The catalog store is extensible: a custom catalog store can be implemented by implementing the catalog store interface. If the catalog store should be usable from the SQL CLI
+or the SQL Gateway, it also needs a corresponding CatalogStoreFactory implementation.
+
+```java
+public class CustomCatalogStoreFactory implements CatalogStoreFactory {
+
+public static final String IDENTIFIER = "custom-kind";
+
+// Used to connect external storage systems
+private CustomClient client;
+
+@Override
+public CatalogStore createCatalogStore() {
+return new CustomCatalogStore();
+}
+
+@Override
+public void open(Context context) throws CatalogException {
+// initialize the resources, such as http client
+client = initClient(context);
+}
+
+@Override
+public void close() throws CatalogException {
+// release the resources
+}
+
+@Override
+public String factoryIdentifier() {
+// table store kind identifier
+return IDENTIFIER;
+}
+
+@Override
+public Set<ConfigOption<?>> requiredOptions() {
+// define the required options
+Set<ConfigOption<?>> options = new HashSet<>();
+options.add(OPTION_1);
+options.add(OPTION_2);
+
+return options;
+}
+
+@Override
+public Set<ConfigOption<?>> optionalOptions(

[flink] branch master updated (5628c709787 -> c8cbb0e2dec)

2023-08-17 Thread huweihua
This is an automated email from the ASF dual-hosted git repository.

huweihua pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


from 5628c709787 [FLINK-32653][docs] Add doc for catalog store (#23110)
 add c8cbb0e2dec [FLINK-32879][flink-runtime] Ignore slotmanager.max-total-resource.cpu/memory in standalone mode (#23231)

No new revisions were added by this update.

Summary of changes:
 .../StandaloneResourceManagerFactory.java  | 28 ++
 .../TaskManagerDisconnectOnShutdownITCase.java |  2 +-
 2 files changed, 24 insertions(+), 6 deletions(-)



[flink] branch master updated: [FLINK-32876][runtime] Prevent ExecutionTimeBasedSlowTaskDetector from identifying tasks in CREATED state as slow tasks.

2023-08-17 Thread zhuzh
This is an automated email from the ASF dual-hosted git repository.

zhuzh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/master by this push:
 new 4bce9c09ca4 [FLINK-32876][runtime] Prevent ExecutionTimeBasedSlowTaskDetector from identifying tasks in CREATED state as slow tasks.
4bce9c09ca4 is described below

commit 4bce9c09ca4fdf3ad8bf95ba5cf4ca361acea156
Author: JunRuiLee 
AuthorDate: Fri Aug 18 10:42:25 2023 +0800

[FLINK-32876][runtime] Prevent ExecutionTimeBasedSlowTaskDetector from identifying tasks in CREATED state as slow tasks.

This closes #23222.
---
 .../ExecutionTimeBasedSlowTaskDetector.java | 12 +++-
 .../ExecutionTimeBasedSlowTaskDetectorTest.java | 21 +
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java b/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java
index 34cb5b47b36..f6d08548a0a 100644
--- a/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java
+++ b/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java
@@ -214,7 +214,17 @@ public class ExecutionTimeBasedSlowTaskDetector implements SlowTaskDetector {
 ExecutionTimeWithInputBytes baseline,
 long currentTimeMillis) {
 return executions.stream()
-.filter(e -> !e.getState().isTerminal() && e.getState() != ExecutionState.CANCELING)
+.filter(
+// We will filter out tasks that are in the CREATED state, as we do not
+// allow speculative execution for them because they have not been
+// scheduled.
+// However, for tasks that are already in the SCHEDULED state, we allow
+// speculative execution to provide the capability of parallel execution
+// running.
+e ->
+!e.getState().isTerminal()
+&& e.getState() != ExecutionState.CANCELING
+&& e.getState() != ExecutionState.CREATED)
 .filter(
 e -> {
 ExecutionTimeWithInputBytes timeWithBytes =
diff --git a/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java b/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java
index b11f86c80d4..1714d79edbf 100644
--- a/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java
+++ b/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java
@@ -76,6 +76,27 @@ class ExecutionTimeBasedSlowTaskDetectorTest {
 assertThat(slowTasks).hasSize(parallelism);
 }
 
+@Test
+void testAllTasksInCreatedAndNoSlowTasks() throws Exception {
+final int parallelism = 3;
+final JobVertex jobVertex = createNoOpVertex(parallelism);
+final JobGraph jobGraph = JobGraphTestUtils.streamingJobGraph(jobVertex);
+
+// all tasks are in the CREATED state, which is not classified as slow tasks.
+final ExecutionGraph executionGraph =
+SchedulerTestingUtils.createScheduler(
+jobGraph,
+ComponentMainThreadExecutorServiceAdapter.forMainThread(),
+EXECUTOR_RESOURCE.getExecutor())
+.getExecutionGraph();
+
+final ExecutionTimeBasedSlowTaskDetector slowTaskDetector = createSlowTaskDetector(0, 1, 0);
+final Map<ExecutionVertexID, Collection<ExecutionAttemptID>> slowTasks =
+slowTaskDetector.findSlowTasks(executionGraph);
+
+assertThat(slowTasks.size()).isZero();
+}
+
 @Test
 void testFinishedTaskNotExceedRatio() throws Exception {
 final int parallelism = 3;



[flink] branch release-1.17 updated: [FLINK-32876][runtime] Prevent ExecutionTimeBasedSlowTaskDetector from identifying tasks in CREATED state as slow tasks.

2023-08-17 Thread zhuzh
This is an automated email from the ASF dual-hosted git repository.

zhuzh pushed a commit to branch release-1.17
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/release-1.17 by this push:
 new 38b9c280128 [FLINK-32876][runtime] Prevent ExecutionTimeBasedSlowTaskDetector from identifying tasks in CREATED state as slow tasks.
38b9c280128 is described below

commit 38b9c280128981b3e809df1f963bdaf8c0491804
Author: JunRuiLee 
AuthorDate: Fri Aug 18 10:42:25 2023 +0800

[FLINK-32876][runtime] Prevent ExecutionTimeBasedSlowTaskDetector from identifying tasks in CREATED state as slow tasks.

This closes #23222.
---
 .../ExecutionTimeBasedSlowTaskDetector.java | 12 +++-
 .../ExecutionTimeBasedSlowTaskDetectorTest.java | 21 +
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java b/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java
index 34cb5b47b36..f6d08548a0a 100644
--- a/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java
+++ b/flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetector.java
@@ -214,7 +214,17 @@ public class ExecutionTimeBasedSlowTaskDetector implements SlowTaskDetector {
 ExecutionTimeWithInputBytes baseline,
 long currentTimeMillis) {
 return executions.stream()
-.filter(e -> !e.getState().isTerminal() && e.getState() != ExecutionState.CANCELING)
+.filter(
+// We will filter out tasks that are in the CREATED state, as we do not
+// allow speculative execution for them because they have not been
+// scheduled.
+// However, for tasks that are already in the SCHEDULED state, we allow
+// speculative execution to provide the capability of parallel execution
+// running.
+e ->
+!e.getState().isTerminal()
+&& e.getState() != ExecutionState.CANCELING
+&& e.getState() != ExecutionState.CREATED)
 .filter(
 e -> {
 ExecutionTimeWithInputBytes timeWithBytes =
diff --git a/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java b/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java
index b11f86c80d4..1714d79edbf 100644
--- a/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java
+++ b/flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/slowtaskdetector/ExecutionTimeBasedSlowTaskDetectorTest.java
@@ -76,6 +76,27 @@ class ExecutionTimeBasedSlowTaskDetectorTest {
 assertThat(slowTasks).hasSize(parallelism);
 }
 
+@Test
+void testAllTasksInCreatedAndNoSlowTasks() throws Exception {
+final int parallelism = 3;
+final JobVertex jobVertex = createNoOpVertex(parallelism);
+final JobGraph jobGraph = JobGraphTestUtils.streamingJobGraph(jobVertex);
+
+// all tasks are in the CREATED state, which is not classified as slow tasks.
+final ExecutionGraph executionGraph =
+SchedulerTestingUtils.createScheduler(
+jobGraph,
+ComponentMainThreadExecutorServiceAdapter.forMainThread(),
+EXECUTOR_RESOURCE.getExecutor())
+.getExecutionGraph();
+
+final ExecutionTimeBasedSlowTaskDetector slowTaskDetector = createSlowTaskDetector(0, 1, 0);
+final Map<ExecutionVertexID, Collection<ExecutionAttemptID>> slowTasks =
+slowTaskDetector.findSlowTasks(executionGraph);
+
+assertThat(slowTasks.size()).isZero();
+}
+
 @Test
 void testFinishedTaskNotExceedRatio() throws Exception {
 final int parallelism = 3;