[ https://issues.apache.org/jira/browse/FLINK-13663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905110#comment-16905110 ]
Andrey Zagrebin commented on FLINK-13663: ----------------------------------------- >From what I see in the test code (bash function setup_kafka_dist in >kafka_common.sh), we already have a hotfix introduced by [~1u0] which hardens >the test by increasing the timeout of the failing curl download. The hotfix >adds curl options which activate an embedded retry algorithm of curl command >(5 retries with 1 sec init back-off time then doubling, total retry time is up >to 1 min). {code:java} KAFKA_URL="https://archive.apache.org/dist/kafka/$KAFKA_VERSION/kafka_2.11-$KAFKA_VERSION.tgz" echo "Downloading Kafka from $KAFKA_URL" curl "$KAFKA_URL" --retry 5 --retry-max-time 60 > $TEST_DATA_DIR/kafka.tgz{code} {code:java} --retry-max-time <seconds> The retry timer is reset before the first transfer attempt. Retries will be done as usual (see --retry) as long as the timer hasn't reached this given limit. Notice that if the timer hasn't reached the limit, the request will be made and while performing, it may take longer than this given time period. To limit a single request's maximum time, use -m, --max-time. Set this option to zero to not timeout retries. ------------------------------------------------------------------------------------------------------------ --retry <num> If a transient error is returned when curl tries to perform a transfer, it will retry this number of times before giving up. Setting the number to 0 makes curl do no retries (which is the default). Transient error means either: a timeout, an FTP 4xx response code or an HTTP 5xx response code. When curl is about to retry a transfer, it will first wait one second and then for all forthcoming retries it will double the waiting time until it reaches 10 minutes which then will be the delay between the rest of the retries. By using --retry-delay you disable this exponential backoff algorithm. See also --retry-max-time to limit the total time allowed for retries. {code} We could either further increase the total curl timeout from 1 min to e.g. 5 or 10 or try the same approach with our custom retry logic proposed in FLINK-13599 but I would not expect that our retry will be much better than the embedded curl retry mechanism. > SQL Client end-to-end test for modern Kafka failed on Travis > ------------------------------------------------------------ > > Key: FLINK-13663 > URL: https://issues.apache.org/jira/browse/FLINK-13663 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka, Table SQL / Client, Tests > Affects Versions: 1.9.0, 1.10.0 > Reporter: Till Rohrmann > Priority: Critical > Labels: test-stability > > The {{SQL Client end-to-end test for modern Kafka}} failed on Travis because > it could not download > {{https://archive.apache.org/dist/kafka/0.11.0.2/kafka_2.11-0.11.0.2.tgz}}. > Maybe we could add a similar retry logic as with the Kinesis end-to-end test > FLINK-13599. > https://api.travis-ci.org/v3/job/569262834/log.txt > https://api.travis-ci.org/v3/job/569262828/log.txt -- This message was sent by Atlassian JIRA (v7.6.14#76016)