[ https://issues.apache.org/jira/browse/SPARK-18509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter B. Pearman updated SPARK-18509:
-------------------------------------
    Environment: AWS EC2, AWS Linux, OS X 10.12.x (local)  (was: AWS EC2, Amazon Linux, OS X 10.12.x)

    Description:

When I run the spark-ec2 script from a local spark-1.6.3 installation, the error "ERROR: Unknown Spark version" is generated:

    Initializing spark
    --2016-11-18 22:33:06--  http://s3.amazonaws.com/spark-related-packages/spark-1.6.3-bin-hadoop1.tgz
    Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.1.3
    Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.1.3|:80... connected.
    HTTP request sent, awaiting response... 404 Not Found
    2016-11-18 22:33:06 ERROR 404: Not Found.
    ERROR: Unknown Spark version
    spark/init.sh: line 137: return: -1: invalid option
    return: usage: return [n]
    Unpacking Spark
    tar (child): spark-*.tgz: Cannot open: No such file or directory
    tar (child): Error is not recoverable: exiting now
    tar: Child returned status 2
    tar: Error is not recoverable: exiting now
    rm: cannot remove `spark-*.tgz': No such file or directory
    mv: missing destination file operand after `spark'
    Try `mv --help' for more information.
    [timing] spark init:  00h 00m 00s

I think this happens when init.sh executes these lines:

    if [[ "$HADOOP_MAJOR_VERSION" == "1" ]]; then
      wget http://s3.amazonaws.com/spark-related-packages/spark-$SPARK_VERSION-bin-hadoop1.tgz
    elif [[ "$HADOOP_MAJOR_VERSION" == "2" ]]; then
      wget http://s3.amazonaws.com/spark-related-packages/spark-$SPARK_VERSION-bin-cdh4.tgz
    else
      wget http://s3.amazonaws.com/spark-related-packages/spark-$SPARK_VERSION-bin-hadoop2.4.tgz
    fi

    if [ $? != 0 ]; then
      echo "ERROR: Unknown Spark version"
      return -1
    fi

spark-1.6.3-bin-hadoop1.tgz does not exist at <http://s3.amazonaws.com/spark-related-packages/>, and spark-2.0.1-bin-hadoop1.tgz also does not exist at that location. So with these versions, if [ "$HADOOP_MAJOR_VERSION" == "1" ] in init.sh evaluates to true, Spark installation on the EC2 cluster will fail.

Related (perhaps a different bug?): I have installed spark-1.6.3-bin-hadoop2.6.tgz, but if the error is generated by init.sh, it appears that HADOOP_MAJOR_VERSION == 1 evaluates to true; otherwise a different package would be requested from <http://s3.amazonaws.com/spark-related-packages/>. I am not experienced enough to verify this. My installed Hadoop version should be 2.6. Please tell me if this should be a separate bug report.
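The log shows two compounding problems: the requested .tgz variant is missing from the bucket (the 404), and the error path itself fails because bash's return builtin only accepts a status of 0-255, so on the reporter's bash `return -1` raises "return: -1: invalid option" and the script falls through to the tar/rm/mv errors. A minimal sketch of a fixed download step follows; the function names are illustrative, not taken from init.sh:

```shell
#!/usr/bin/env bash
# Hypothetical rework of the download logic in spark/init.sh.
# spark_package_name and download_spark are illustrative names.

# Map Spark version + Hadoop major version to the package filename
# that init.sh would request, mirroring the original if/elif/else.
spark_package_name() {
  local spark_version="$1" hadoop_major="$2"
  case "$hadoop_major" in
    1) echo "spark-${spark_version}-bin-hadoop1.tgz" ;;
    2) echo "spark-${spark_version}-bin-cdh4.tgz" ;;
    *) echo "spark-${spark_version}-bin-hadoop2.4.tgz" ;;
  esac
}

download_spark() {
  local base="http://s3.amazonaws.com/spark-related-packages"
  local pkg
  pkg="$(spark_package_name "$SPARK_VERSION" "$HADOOP_MAJOR_VERSION")"

  # Probe before downloading: a missing variant such as
  # spark-1.6.3-bin-hadoop1.tgz is exactly what produced the 404 above,
  # and --spider reports it without writing a partial file.
  if ! wget -q --spider "${base}/${pkg}"; then
    echo "ERROR: ${pkg} not found at ${base}" >&2
    return 1   # bash 'return' takes 0-255; 'return -1' is what raised
               # "return: -1: invalid option" in the log
  fi

  wget "${base}/${pkg}"
}
```

With the probe in place, a missing package produces one accurate error message and a valid nonzero status, instead of the misleading "Unknown Spark version" followed by the cascade of tar, rm, and mv failures on a file that was never downloaded.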
> spark-ec2 init.sh requests .tgz files not available at
> http://s3.amazonaws.com/spark-related-packages
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18509
>                 URL: https://issues.apache.org/jira/browse/SPARK-18509
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 1.6.3, 2.0.1
>         Environment: AWS EC2, AWS Linux, OS X 10.12.x (local)
>            Reporter: Peter B. Pearman
>              Labels: beginner
>   Original Estimate: 3h
>  Remaining Estimate: 3h
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)