[jira] [Commented] (FLINK-13663) SQL Client end-to-end test for modern Kafka failed on Travis

Andrey Zagrebin (JIRA) Mon, 12 Aug 2019 04:54:43 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-13663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905110#comment-16905110
 ]


Andrey Zagrebin commented on FLINK-13663:
-----------------------------------------

From what I see in the test code (bash function setup_kafka_dist in 
kafka_common.sh), we already have a hotfix introduced by [~1u0] which hardens 
the test by retrying the failing curl download. The hotfix adds curl options 
that activate curl's built-in retry algorithm (up to 5 retries, with an initial 
back-off of 1 second that doubles after each retry; the total retry time is 
limited to 1 minute).
{code:bash}
KAFKA_URL="https://archive.apache.org/dist/kafka/$KAFKA_VERSION/kafka_2.11-$KAFKA_VERSION.tgz";
echo "Downloading Kafka from $KAFKA_URL"
curl "$KAFKA_URL" --retry 5 --retry-max-time 60 > $TEST_DATA_DIR/kafka.tgz
{code}
 
{noformat}
--retry-max-time <seconds>
    The retry timer is reset before the first transfer attempt. Retries will be
    done as usual (see --retry) as long as the timer hasn't reached this given
    limit. Notice that if the timer hasn't reached the limit, the request will
    be made and while performing, it may take longer than this given time
    period. To limit a single request's maximum time, use -m, --max-time. Set
    this option to zero to not timeout retries.

--retry <num>
    If a transient error is returned when curl tries to perform a transfer, it
    will retry this number of times before giving up. Setting the number to 0
    makes curl do no retries (which is the default). Transient error means
    either: a timeout, an FTP 4xx response code or an HTTP 5xx response code.

    When curl is about to retry a transfer, it will first wait one second and
    then for all forthcoming retries it will double the waiting time until it
    reaches 10 minutes which then will be the delay between the rest of the
    retries. By using --retry-delay you disable this exponential backoff
    algorithm. See also --retry-max-time to limit the total time allowed for
    retries.
{noformat}
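To make the numbers concrete, here is a small sketch that prints the wait schedule implied by the man page text above for --retry 5. It assumes the documented behaviour (1 second before the first retry, doubling afterwards, capped at 10 minutes per wait) and does not call curl; the function name backoff_schedule is made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the back-off schedule described in the curl man page
# excerpt (1 s, then doubling, capped at 600 s per wait). It does not run curl.
# backoff_schedule is a hypothetical helper name, not part of the Flink scripts.
backoff_schedule() {
    local retries=$1 delay=1 total=0 i
    for ((i = 1; i <= retries; i++)); do
        total=$((total + delay))
        echo "retry $i: wait ${delay}s (cumulative ${total}s)"
        delay=$((delay * 2))
        # The man page says the wait stops growing once it reaches 10 minutes.
        if ((delay > 600)); then
            delay=600
        fi
    done
}

backoff_schedule 5
```

Under that assumption the five waits add up to 1 + 2 + 4 + 8 + 16 = 31 seconds. If the failed attempts themselves return quickly, the retries could be used up well before the 60 second limit, so raising --retry-max-time may only help when --retry is raised as well.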
We could either further increase the total curl retry time from 1 minute to, 
e.g., 5 or 10 minutes, or try the same approach with the custom retry logic 
proposed in FLINK-13599. However, I would not expect our own retry to be much 
better than curl's built-in retry mechanism.
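
For comparison, a custom retry wrapper could look roughly like the sketch below. This is not the function from FLINK-13599; retry_with_backoff, its parameters and the commented-out example invocation are assumptions made up for illustration. The one thing such a wrapper does differently is retry on any non-zero exit code, whereas curl's --retry only covers the transient errors listed in the man page excerpt.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT the retry logic from FLINK-13599.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
# Runs the command until it succeeds, doubling the wait after every failure.
retry_with_backoff() {
    local max_attempts=$1 delay=$2 attempt
    shift 2
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            return 0
        fi
        if ((attempt < max_attempts)); then
            echo "Attempt $attempt/$max_attempts failed, retrying in ${delay}s" >&2
            sleep "$delay"
            delay=$((delay * 2))
        fi
    done
    echo "Giving up after $max_attempts attempts: $*" >&2
    return 1
}

# Example invocation (commented out so the sketch has no network side effects;
# the attempt count and per-request limit are illustrative values only):
# retry_with_backoff 6 1 curl --fail --max-time 120 "$KAFKA_URL" -o "$TEST_DATA_DIR/kafka.tgz"
```

In the example, --fail makes curl return a non-zero exit code on HTTP errors so the wrapper notices them, and --max-time bounds each single request, which --retry-max-time does not do.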

 

> SQL Client end-to-end test for modern Kafka failed on Travis
> ------------------------------------------------------------
>
>                 Key: FLINK-13663
>                 URL: https://issues.apache.org/jira/browse/FLINK-13663
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka, Table SQL / Client, Tests
>    Affects Versions: 1.9.0, 1.10.0
>            Reporter: Till Rohrmann
>            Priority: Critical
>              Labels: test-stability
>
> The {{SQL Client end-to-end test for modern Kafka}} failed on Travis because 
> it could not download 
> {{https://archive.apache.org/dist/kafka/0.11.0.2/kafka_2.11-0.11.0.2.tgz}}.
> Maybe we could add retry logic similar to that of the Kinesis end-to-end 
> test (FLINK-13599).
> https://api.travis-ci.org/v3/job/569262834/log.txt
> https://api.travis-ci.org/v3/job/569262828/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
