[ https://issues.apache.org/jira/browse/SPARK-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153463#comment-14153463 ]
Apache Spark commented on SPARK-3745:
-------------------------------------

User 'shaneknapp' has created a pull request for this issue:
https://github.com/apache/spark/pull/2596

> curl on maven search repo apache rat url returns search status, not jar file
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-3745
>                 URL: https://issues.apache.org/jira/browse/SPARK-3745
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>         Environment: centos 6.5
>            Reporter: shane knapp
>              Labels: build-failure, easyfix, test
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> in spark/dev/check-license, there are four attempts to download the apache
> rat jar from maven:
> {noformat}
> URL1="http://search.maven.org/remotecontent?filepath=org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar"
> URL2="http://repo1.maven.org/maven2/org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar"
> *snip*
> if hash curl 2>/dev/null; then
>   (curl --silent ${URL1} > "$JAR_DL" || curl --silent ${URL2} > "$JAR_DL") && mv "$JAR_DL" "$JAR"
> elif hash wget 2>/dev/null; then
>   (wget --quiet ${URL1} -O "$JAR_DL" || wget --quiet ${URL2} -O "$JAR_DL") && mv "$JAR_DL" "$JAR"
> {noformat}
> the first attempt is on the search repo via curl, which returns a "YEP! WE
> FOUND IT!" html blob instead of the jar:
> {noformat}
> [root@test01 sknapp]# curl --silent http://search.maven.org/remotecontent?filepath=org/apache/rat/apache-rat/0.10/apache-rat-0.10.jar > test.part
> [root@test01 sknapp]# cat test.part
> <html>
> <head><title>302 Found</title></head>
> <body bgcolor="white">
> <center><h1>302 Found</h1></center>
> <hr><center>nginx/0.8.55</center>
> </body>
> </html>
> {noformat}
> this download fails EVERY time the test is run. i've run curl on the
> 2nd url, which points at the repo itself, and it downloads successfully.
> wget does the correct thing for both URLs.
>
> there is also no error checking on the downloaded file, short of file
> existence.
>
> potential fixes, in no particular order:
> 1) run unzip -tq ${JAR} and check for a 0 exit status to ensure it's a
>    valid compressed archive
> 2) run wget before curl
> 3) only run curl on the 2nd URL (pointing directly to the repo)
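
The fixes listed above could be combined roughly as in the sketch below. It assumes bash and an installed unzip; the function names is_valid_jar and download_jar are hypothetical and are not taken from spark/dev/check-license or from the linked pull request. curl does not follow HTTP redirects unless it is given --location, which would explain the "302 Found" page that ends up in the download file, so the sketch adds that flag and also tests every download before moving it into place.

```shell
#!/usr/bin/env bash
# Sketch only: names and flow are illustrative assumptions, not the actual
# contents of spark/dev/check-license.

# Succeeds only if the file exists, is non-empty, and passes unzip's
# archive integrity test (fix 1 in the list above).
is_valid_jar() {
  local jar="$1"
  [ -s "$jar" ] && unzip -tq "$jar" >/dev/null 2>&1
}

# Usage: download_jar DEST URL [URL...]
# Tries each URL in order and keeps the first download that is a valid archive.
download_jar() {
  local dest="$1"; shift
  local part="${dest}.part"
  local url
  for url in "$@"; do
    rm -f "$part"
    if hash curl 2>/dev/null; then
      # --location follows redirects; --fail makes HTTP errors exit non-zero.
      curl --silent --fail --location "$url" -o "$part" || continue
    elif hash wget 2>/dev/null; then
      wget --quiet "$url" -O "$part" || continue
    else
      echo "neither curl nor wget is installed" >&2
      return 1
    fi
    # Reject HTML error pages and truncated files before touching DEST.
    if is_valid_jar "$part"; then
      mv "$part" "$dest"
      return 0
    fi
  done
  rm -f "$part"
  return 1
}
```

A call such as download_jar "$JAR" "$URL1" "$URL2" would then leave $JAR in place only if one of the downloads passes the archive test, so an HTML blob like the one shown above is never mistaken for the jar.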