Hi,
I was wondering if anyone can assist here.
I am trying to create a Spark cluster on AWS using the scripts located in
the spark-1.6.1/ec2 directory.

When the spark_ec2.py script tries to rsync directories over to the AWS
master node, it fails with this stack trace:

DEBUG:spark ecd logger:Issuing command..:['rsync', '-rv', '-e', 'ssh -o
StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i
ec2AccessKey.pem', 'c:/tmp-spark/',
u'r...@ec2-54-218-75-130.us-west-2.compute.amazonaws.com:/']
Traceback (most recent call last):
  File "./spark_ec2.py", line 1545, in <module>
    main()
  File "./spark_ec2.py", line 1536, in main
    real_main()
  File "./spark_ec2.py", line 1371, in real_main
    setup_cluster(conn, master_nodes, slave_nodes, opts, True)
  File "./spark_ec2.py", line 849, in setup_cluster
    modules=modules
  File "./spark_ec2.py", line 1133, in deploy_files
    subprocess.check_call(command)
  File "C:\Python27\lib\subprocess.py", line 535, in check_call
    retcode = call(*popenargs, **kwargs)
  File "C:\Python27\lib\subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "C:\Python27\lib\subprocess.py", line 710, in __init__
    errread, errwrite)
  File "C:\Python27\lib\subprocess.py", line 958, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
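
For what it's worth, that WindowsError [Error 2] is what Python 2.7's
subprocess raises on Windows when the executable itself (here, rsync)
cannot be found on PATH; it fails before the command's arguments are even
considered. A minimal way to reproduce the same error on a box without
rsync installed:

    import subprocess

    # On a Windows machine with no rsync.exe on PATH this raises:
    #   WindowsError: [Error 2] The system cannot find the file specified
    subprocess.check_call(['rsync', '--version'])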

Here's my take on what's happening:
1. The spark_ec2.py script downloads some scripts from a git repo into a
temporary directory (from what I can see, the temp directory contains only
one file, root\spark-ec2\ec2-variables.sh).
2. The spark_ec2.py script then tries to copy the downloaded files over to
AWS (see the sketch below).
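
Roughly, as far as I can tell, step 2 boils down to something like the
following (a simplified sketch, not the actual spark_ec2.py code; the
names and arguments are abbreviated from the debug output above):

    import subprocess

    def deploy_files(local_dir, master_host, identity_file):
        # Build the same kind of command shown in the debug log.
        # Note: 'rsync' and 'ssh' here are *external* binaries that
        # must exist on the local machine's PATH.
        command = [
            'rsync', '-rv',
            '-e', ('ssh -o StrictHostKeyChecking=no '
                   '-o UserKnownHostsFile=/dev/null '
                   '-i ' + identity_file),
            local_dir + '/',
            'root@' + master_host + ':/',
        ]
        subprocess.check_call(command)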


The error happens at roughly line 1130 of spark_ec2.py, when it invokes
the rsync command shown in the debug output above via:

subprocess.check_call(command)
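
If a missing rsync binary is indeed the problem, a defensive check before
that call would make the failure obvious. A sketch for Python 2.7
(distutils.spawn.find_executable just walks PATH and returns None on a
miss):

    from distutils.spawn import find_executable

    # None here means rsync is nowhere on PATH, which is exactly the
    # condition that makes check_call die with WindowsError [Error 2].
    if find_executable('rsync') is None:
        raise RuntimeError('rsync not found on PATH; on Windows it is '
                           'usually provided by Cygwin or similar')

On Windows, rsync typically comes from Cygwin (along with an ssh that
understands the -i option), so running the script from a Cygwin shell may
be the simplest fix.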

What am I missing? Perhaps an rsync executable on the Windows machine?
As for the state of my cluster: as far as I can see, the instances launch
but Spark is never installed.


Could anyone help?
Thanks,
Marco
