Ben Kietzman created ARROW-8432:
-----------------------------------

             Summary: [Python][CI] Failure to download Hadoop
                 Key: ARROW-8432
                 URL: https://issues.apache.org/jira/browse/ARROW-8432
             Project: Apache Arrow
          Issue Type: Bug
          Components: Continuous Integration, Python
    Affects Versions: 0.16.0
            Reporter: Ben Kietzman
            Assignee: Ben Kietzman
             Fix For: 0.17.0

https://circleci.com/gh/ursa-labs/crossbow/11128?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

The failure in the build above is caused by a failed HTTP request at
https://github.com/apache/arrow/blob/master/ci/docker/conda-python-hdfs.dockerfile#L36

We should probably not rely on https://www.apache.org/dyn/mirrors/mirrors.cgi to get tarballs. Currently three places do so:

{code}
ci/docker/conda-python-hdfs.dockerfile
36:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" | tar -xzf - -C /opt

ci/docker/linux-apt-docs.dockerfile
57:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=maven/maven-3/${maven}/binaries/apache-maven-${maven}-bin.tar.gz" | tar -xzf - -C /opt

python/manylinux1/scripts/build_thrift.sh
22:    "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=${THRIFT_DOWNLOAD_PATH}" \
{code}

These should be factored out into a reusable script for downloading Apache tarballs. The script should contain hard-coded Apache mirrors and retry when connections fail.
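
A minimal sketch of what such a reusable script could look like, not the actual fix. The names `download_apache_tarball`, `fetch_url`, `APACHE_MIRRORS` and `RETRY_DELAY` are hypothetical, and the three mirror roots are illustrative picks rather than a list taken from this issue. It tries each hard-coded mirror in turn and repeats the whole list a few times before giving up:

```shell
#!/bin/sh
# Sketch of a reusable Apache tarball downloader (all names here are hypothetical).

# Hard-coded mirror roots, tried in order (illustrative picks, an assumption).
APACHE_MIRRORS="https://downloads.apache.org https://dlcdn.apache.org https://archive.apache.org/dist"

# Write the contents of URL $1 to stdout. Kept separate so it can be swapped out.
fetch_url() {
  wget -q -O - "$1"
}

# download_apache_tarball <path-under-mirror-root> <target-dir> [attempts]
# Tries every mirror in turn and repeats the whole list up to <attempts> times.
download_apache_tarball() {
  dl_path="$1"
  dl_target="$2"
  dl_attempts="${3:-3}"
  dl_tmp="$(mktemp)" || return 1
  mkdir -p "${dl_target}"

  dl_attempt=1
  while [ "${dl_attempt}" -le "${dl_attempts}" ]; do
    for dl_mirror in ${APACHE_MIRRORS}; do
      dl_url="${dl_mirror}/${dl_path}"
      # Download to a temporary file first, so tar only runs when the fetch succeeded.
      if fetch_url "${dl_url}" > "${dl_tmp}" && tar -xzf "${dl_tmp}" -C "${dl_target}"; then
        rm -f "${dl_tmp}"
        return 0
      fi
      echo "download failed: ${dl_url} (attempt ${dl_attempt}/${dl_attempts})" >&2
    done
    dl_attempt=$((dl_attempt + 1))
    # Pause before the next round, unless that was the last one.
    if [ "${dl_attempt}" -le "${dl_attempts}" ]; then
      sleep "${RETRY_DELAY:-5}"
    fi
  done

  rm -f "${dl_tmp}"
  return 1
}
```

With this in place, the Hadoop line in conda-python-hdfs.dockerfile could source the script and call something like `download_apache_tarball "hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" /opt`, reusing the same path that is currently passed as `filename=` to mirrors.cgi. Downloading to a temporary file instead of piping straight into `tar` is a deliberate choice in this sketch: in a pipeline the exit status of `wget` would be lost unless `pipefail` is set.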