Ben Kietzman created ARROW-8432:
-----------------------------------

             Summary: [Python][CI] Failure to download Hadoop
                 Key: ARROW-8432
                 URL: https://issues.apache.org/jira/browse/ARROW-8432
             Project: Apache Arrow
          Issue Type: Bug
          Components: Continuous Integration, Python
    Affects Versions: 0.16.0
            Reporter: Ben Kietzman
            Assignee: Ben Kietzman
             Fix For: 0.17.0


https://circleci.com/gh/ursa-labs/crossbow/11128?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

This is caused by an HTTP request failure at
https://github.com/apache/arrow/blob/master/ci/docker/conda-python-hdfs.dockerfile#L36

We should probably not rely on https://www.apache.org/dyn/mirrors/mirrors.cgi
to get tarballs. Currently there are three places that do:

{code}
ci/docker/conda-python-hdfs.dockerfile
36:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" | tar -xzf - -C /opt

ci/docker/linux-apt-docs.dockerfile
57:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=maven/maven-3/${maven}/binaries/apache-maven-${maven}-bin.tar.gz" | tar -xzf - -C /opt

python/manylinux1/scripts/build_thrift.sh
22:  "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=${THRIFT_DOWNLOAD_PATH}" \
{code}

Factor these out into a reusable script for downloading Apache tarballs. It
should contain hard-coded Apache mirrors and retry when connections fail.
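
A minimal sketch of what such a script might look like, assuming a POSIX-style shell with wget available. The function names (download_apache_tarball, fetch_url), the RETRY_DELAY variable, and the two base URLs are illustrative assumptions, not an existing Arrow script or an agreed mirror list:

```shell
#!/bin/sh
# Sketch of a reusable Apache tarball downloader (hypothetical; not part of Arrow).

# Base URLs tried in order. These hosts are illustrative assumptions;
# substitute whichever mirrors the project decides to hard-code.
APACHE_MIRRORS="${APACHE_MIRRORS:-https://downloads.apache.org https://archive.apache.org/dist}"

# Fetch one URL ($1) into a file ($2). Kept separate so it can be swapped or stubbed.
fetch_url() {
  wget -q -O "$2" "$1"
}

# Usage: download_apache_tarball <path-under-mirror> <destination-file> [attempts-per-mirror]
download_apache_tarball() {
  local path="$1"
  local dest="$2"
  local max_attempts="${3:-3}"
  local mirror attempt

  # Intentionally unquoted: split the mirror list on whitespace.
  for mirror in ${APACHE_MIRRORS}; do
    attempt=1
    while [ "${attempt}" -le "${max_attempts}" ]; do
      if fetch_url "${mirror}/${path}" "${dest}"; then
        return 0
      fi
      echo "attempt ${attempt} failed for ${mirror}/${path}" >&2
      attempt=$((attempt + 1))
      # Pause before retrying; defaults to 2 seconds.
      sleep "${RETRY_DELAY:-2}"
    done
  done

  echo "could not download ${path} from any mirror" >&2
  return 1
}
```

A Dockerfile step could then source this file and run, for example, download_apache_tarball "hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" /tmp/hadoop.tar.gz followed by tar -xzf /tmp/hadoop.tar.gz -C /opt. Downloading to a file instead of piping straight into tar means a failed attempt cannot hand tar a partial archive before the retry.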



