[ https://issues.apache.org/jira/browse/SPARK-18509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter B. Pearman updated SPARK-18509:
-------------------------------------
    Environment: AWS EC2, AWS Linux, OS X 10.12.x (local)  (was: AWS EC2, Amazon Linux, OS X 10.12.x)

    Description:

When I run the spark-ec2 script from a local spark-1.6.3 installation, the error "ERROR: Unknown Spark version" is generated:

    Initializing spark
    --2016-11-18 22:33:06--  http://s3.amazonaws.com/spark-related-packages/spark-1.6.3-bin-hadoop1.tgz
    Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.1.3
    Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.1.3|:80... connected.
    HTTP request sent, awaiting response... 404 Not Found
    2016-11-18 22:33:06 ERROR 404: Not Found.
    ERROR: Unknown Spark version
    spark/init.sh: line 137: return: -1: invalid option
    return: usage: return [n]
    Unpacking Spark
    tar (child): spark-*.tgz: Cannot open: No such file or directory
    tar (child): Error is not recoverable: exiting now
    tar: Child returned status 2
    tar: Error is not recoverable: exiting now
    rm: cannot remove `spark-*.tgz': No such file or directory
    mv: missing destination file operand after `spark'
    Try `mv --help' for more information.
    [timing] spark init:  00h 00m 00s

I think this happens when init.sh executes these lines:

    if [[ "$HADOOP_MAJOR_VERSION" == "1" ]]; then
      wget http://s3.amazonaws.com/spark-related-packages/spark-$SPARK_VERSION-bin-hadoop1.tgz
    elif [[ "$HADOOP_MAJOR_VERSION" == "2" ]]; then
      wget http://s3.amazonaws.com/spark-related-packages/spark-$SPARK_VERSION-bin-cdh4.tgz
    else
      wget http://s3.amazonaws.com/spark-related-packages/spark-$SPARK_VERSION-bin-hadoop2.4.tgz
    fi

    if [ $? != 0 ]; then
      echo "ERROR: Unknown Spark version"
      return -1
    fi

spark-1.6.3-bin-hadoop1.tgz does not exist at <http://s3.amazonaws.com/spark-related-packages/>, and spark-2.0.1-bin-hadoop1.tgz also does not exist at that location. So with these versions, if [ "$HADOOP_MAJOR_VERSION" == "1" ] in init.sh evaluates to true, Spark installation on the EC2 cluster will fail.

Related (perhaps a different bug?): I have installed spark-1.6.3-bin-hadoop2.6.tgz, but if the error is generated by init.sh, it appears that HADOOP_MAJOR_VERSION == 1 evaluates to true; otherwise a different package would be requested from <http://s3.amazonaws.com/spark-related-packages/>. I am not experienced enough to verify this. My installed Hadoop version should be 2.6. Please tell me if this should be a separate bug report.
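The log shows two compounding problems: the requested .tgz variant is missing from the bucket (the 404), and the error path itself fails because bash's return builtin only accepts a status of 0-255, so on the reporter's bash `return -1` raises "return: -1: invalid option" and the script falls through to the tar/rm/mv errors. A minimal sketch of a fixed download step follows; the function names are illustrative, not taken from init.sh:

```shell
#!/usr/bin/env bash
# Hypothetical rework of the download logic in spark/init.sh.
# spark_package_name and download_spark are illustrative names.

# Map Spark version + Hadoop major version to the package filename
# that init.sh would request, mirroring the original if/elif/else.
spark_package_name() {
  local spark_version="$1" hadoop_major="$2"
  case "$hadoop_major" in
    1) echo "spark-${spark_version}-bin-hadoop1.tgz" ;;
    2) echo "spark-${spark_version}-bin-cdh4.tgz" ;;
    *) echo "spark-${spark_version}-bin-hadoop2.4.tgz" ;;
  esac
}

download_spark() {
  local base="http://s3.amazonaws.com/spark-related-packages"
  local pkg
  pkg="$(spark_package_name "$SPARK_VERSION" "$HADOOP_MAJOR_VERSION")"

  # Probe before downloading: a missing variant such as
  # spark-1.6.3-bin-hadoop1.tgz is exactly what produced the 404 above,
  # and --spider reports it without writing a partial file.
  if ! wget -q --spider "${base}/${pkg}"; then
    echo "ERROR: ${pkg} not found at ${base}" >&2
    return 1   # bash 'return' takes 0-255; 'return -1' is what raised
               # "return: -1: invalid option" in the log
  fi

  wget "${base}/${pkg}"
}
```

With the probe in place, a missing package produces one accurate error message and a valid nonzero status, instead of the misleading "Unknown Spark version" followed by the cascade of tar, rm, and mv failures on a file that was never downloaded.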
> spark-ec2 init.sh requests .tgz files not available at
> http://s3.amazonaws.com/spark-related-packages
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18509
>                 URL: https://issues.apache.org/jira/browse/SPARK-18509
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 1.6.3, 2.0.1
>         Environment: AWS EC2, AWS Linux, OS X 10.12.x (local)
>            Reporter: Peter B. Pearman
>              Labels: beginner
>   Original Estimate: 3h
>  Remaining Estimate: 3h
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)