azagrebin commented on a change in pull request #6907: [FLINK-10357][tests] 
Improve StreamingFileSink E2E test stability.
URL: https://github.com/apache/flink/pull/6907#discussion_r229220979
 
 

 ##########
 File path: flink-end-to-end-tests/test-scripts/test_streaming_file_sink.sh
 ##########
 @@ -45,47 +40,55 @@ function wait_for_restart {
 }
 
 ###################################
-# Wait a specific number of successful checkpoints
-# to have happened
+# Get all lines in part files and sort them numerically.
 #
 # Globals:
-#   None
+#   OUTPUT_PATH
 # Arguments:
-#   $1: the job id
-#   $2: the number of expected successful checkpoints
-#   $3: timeout in seconds
+#   None
 # Returns:
 #   None
 ###################################
-function wait_for_number_of_checkpoints {
-    local job_id=$1
-    local expected_num_checkpoints=$2
-    local timeout=$3
-    local count=0
-
-    echo "Starting to wait for completion of ${expected_num_checkpoints} 
checkpoints"
-    while (($(get_completed_number_of_checkpoints ${job_id}) < 
${expected_num_checkpoints})); do
+function get_complete_result {
+    find "${OUTPUT_PATH}" -type f \( -iname "part-*" \) -exec cat {} + | sort 
-g
+}
 
-        if [[ ${count} -gt ${timeout} ]]; then
-            echo "A timeout occurred waiting for successful checkpoints"
+###################################
+# Waits until a number of values have been written within a timeout.
+# If the timeout expires, exit with return code 1.
+#
+# Globals:
+#   None
+# Arguments:
+#   $1: the number of expected values
+#   $2: timeout in seconds
+# Returns:
+#   None
+###################################
+function wait_for_complete_result {
+    local expected_number_of_values=$1
+    local polling_timeout=$2
+    local polling_interval=1
+    local seconds_elapsed=0
+
+    local number_of_values=$(get_complete_result | tail -1)
+    local previous_number_of_values=-1
+
+    while [[ ${number_of_values} -lt ${expected_number_of_values} ]]; do
+        if [[ ${seconds_elapsed} -ge ${polling_timeout} ]]; then
 
 Review comment:
   Ok, I see your point. My concern was that although here it does not really 
matter because `get_complete_result` is very quick but in general counting 
number of lines can take longer, e.g. in coming test with s3 as output (#6957). 
Then timeout will be prolonged. I believe, we can discuss it in #6957.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to