Hello,

When running with Spark 0.9.0 against Mesos, I can't use saveAsNewAPIHadoopFile to save to a relative path (i.e. a path on the local filesystem, relative to the master process's current working directory). I'm writing Parquet output, and what I see is that no .parquet files end up in that directory, and I get an error about the footer not being written (presumably because none of the data files were written).
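For concreteness, this is roughly the shape of the call. The record class, the path, and the Parquet write-support setup are placeholders for my actual job; assume sc is the SparkContext and rdd is an RDD of (Void, MyRecord) pairs:

    import org.apache.hadoop.conf.Configuration
    import parquet.hadoop.ParquetOutputFormat

    val conf = new Configuration(sc.hadoopConfiguration)
    // (Parquet write-support configuration elided)
    rdd.saveAsNewAPIHadoopFile(
      "output/records.parquet",            // relative path -- fails on Mesos
      classOf[Void],
      classOf[MyRecord],
      classOf[ParquetOutputFormat[MyRecord]],
      conf)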
Relative paths work when running Spark with master set to local or local[10], and absolute paths on the local filesystem work when running on Mesos. Both relative and absolute paths also work perfectly fine for reading from the master's filesystem with newAPIHadoopFile.

I think the issue is that the workers resolve the relative path against whatever *their* current working directory happens to be, which, since Mesos launches them, isn't necessarily the same as the master process's. Because the worker nodes have the filesystem I'm writing to mounted at the same location as the master, an absolute path puts the data in the same place from both the workers and the master (see the P.S. below for the workaround I'm using in the meantime). I think Spark should handle converting relative paths on the master into absolute paths that the workers can use regardless of what their working directories happen to be.

-Adam
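P.S. In case it helps anyone else hitting this: resolving the path to an absolute one on the driver side before calling save works around the problem for me, along these lines (same placeholders as above):

    import java.io.File

    // Resolve the relative path against the driver's working directory up front,
    // so every worker writes to the same absolute location.
    val outputPath = new File("output/records.parquet").getAbsolutePath
    rdd.saveAsNewAPIHadoopFile(outputPath, classOf[Void], classOf[MyRecord],
      classOf[ParquetOutputFormat[MyRecord]], conf)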