Yes, the script needs to be present on all the executor nodes. You can ship it with spark-submit (e.g. --files script.sh), which places a copy in each executor's working directory, and then refer to it in rdd.pipe (e.g. "./script.sh").
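
Roughly like this (a minimal sketch, not tested here; the class name, jar and paths are placeholders, and it assumes a YARN-style deployment where --files localizes the file into each executor's working directory):

  // Submit with the script shipped alongside the application. --files also
  // accepts an hdfs:// URI, so the copy you already have in HDFS can be used:
  //   spark-submit --master yarn \
  //     --files hdfs:///path/to/script.sh \
  //     --class com.example.PipeJob pipe-job.jar

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder.appName("PipeJob").getOrCreate()
  val sc = spark.sparkContext

  val rdd = sc.parallelize(Seq("line1", "line2", "line3"))

  // The shipped script sits in the executor's working directory, so a
  // relative path works; going through sh avoids relying on the execute bit.
  val piped = rdd.pipe("sh ./script.sh")
  piped.collect().foreach(println)

Nothing has to be installed on the nodes themselves; Spark distributes the file for the lifetime of the application.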
- Arun

On Thu, 17 Jan 2019 at 14:18, Mkal <diomf...@hotmail.com> wrote:

> Hi, I'm trying to run an external script on Spark using rdd.pipe() and
> although it runs successfully on standalone, it throws an error on cluster.
> The error comes from the executors and it's: "Cannot run program
> "path/to/program": error=2, No such file or directory".
>
> Does the external script need to be available on all nodes in the cluster
> when using rdd.pipe()?
>
> What if I don't have permission to install anything on the nodes of the
> cluster? Is there any other way to make the script available to the worker
> nodes?
>
> (The external script is loaded in HDFS and is passed to the driver class
> through args)
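
Since your copy of the script already lives in HDFS: if you'd rather resolve that path at runtime than on the spark-submit command line, SparkContext.addFile should work too (again a sketch, reusing sc and rdd from the snippet above; the HDFS path is a placeholder, and it assumes files added this way end up in the executors' working directory just like --files):

  // Distribute the script from HDFS at runtime; nothing needs to be
  // pre-installed on the worker nodes.
  sc.addFile("hdfs:///user/someuser/scripts/script.sh")

  // As with --files, the fetched copy can be invoked with a relative path.
  val piped = rdd.pipe("sh ./script.sh")
  piped.collect().foreach(println)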