I've put the csv on the worker node since the job runs on the worker. I didn't put the csv on the master because, as I understand it, the master doesn't run jobs.
If I put the csv on the zeppelin node at the same path as on the worker, the job reads the csv and writes a _SUCCESS file locally on the zeppelin node. The job also runs on the worker, but it never completes the commit: the output stays under a _temporary directory on the worker.

worker - ls -laRt /data/02.csv/
02.csv/:
total 0
drwxr-xr-x. 3 root root 24 Apr 28 09:55 .
drwxr-xr-x. 3 root root 15 Apr 28 09:55 _temporary
drwxr-xr-x. 3 root root 64 Apr 28 09:55 ..

02.csv/_temporary:
total 0
drwxr-xr-x. 5 root root 106 Apr 28 09:56 0
drwxr-xr-x. 3 root root 15 Apr 28 09:55 .
drwxr-xr-x. 3 root root 24 Apr 28 09:55 ..

02.csv/_temporary/0:
total 0
drwxr-xr-x. 5 root root 106 Apr 28 09:56 .
drwxr-xr-x. 2 root root   6 Apr 28 09:56 _temporary
drwxr-xr-x. 2 root root 129 Apr 28 09:56 task_20170428095632_0005_m_000000
drwxr-xr-x. 2 root root 129 Apr 28 09:55 task_20170428095516_0002_m_000000
drwxr-xr-x. 3 root root  15 Apr 28 09:55 ..

02.csv/_temporary/0/_temporary:
total 0
drwxr-xr-x. 2 root root   6 Apr 28 09:56 .
drwxr-xr-x. 5 root root 106 Apr 28 09:56 ..

02.csv/_temporary/0/task_20170428095632_0005_m_000000:
total 52
drwxr-xr-x. 5 root root   106 Apr 28 09:56 ..
-rw-r--r--. 1 root root   376 Apr 28 09:56 .part-00000-e39ebc76-5343-407e-b42e-c33e69b8fd1a.csv.crc
-rw-r--r--. 1 root root 46605 Apr 28 09:56 part-00000-e39ebc76-5343-407e-b42e-c33e69b8fd1a.csv
drwxr-xr-x. 2 root root   129 Apr 28 09:56 .

02.csv/_temporary/0/task_20170428095516_0002_m_000000:
total 52
drwxr-xr-x. 5 root root   106 Apr 28 09:56 ..
-rw-r--r--. 1 root root   376 Apr 28 09:55 .part-00000-c2ac5299-26f6-4b23-a74b-b3dc96464271.csv.crc
-rw-r--r--. 1 root root 46605 Apr 28 09:55 part-00000-c2ac5299-26f6-4b23-a74b-b3dc96464271.csv

zeppelin - ls -laRt 02.csv/
02.csv/:
total 12
drwxr-sr-x 2 root 10000700 4096 Apr 28 09:56 .
-rw-r--r-- 1 root 10000700    8 Apr 28 09:56 ._SUCCESS.crc
-rw-r--r-- 1 root 10000700    0 Apr 28 09:56 _SUCCESS
drwxrwsr-x 5 root 10000700 4096 Apr 28 09:56 ..
On Wed, May 10, 2017 at 2:06 PM, Meethu Mathew <meethu.mat...@flytxt.com> wrote:

> Try putting the csv in the same path on all the nodes, or in a mount point
> path which is accessible by all the nodes.
>
> Regards,
> Meethu Mathew
>
> On Wed, May 10, 2017 at 3:36 PM, Sofiane Cherchalli <sofian...@gmail.com> wrote:
>
>> Yes, I already tested with spark-shell and pyspark, with the same result.
>>
>> Can't I use the Linux filesystem to read the CSV, such as
>> file:///data/file.csv? My understanding is that the job is sent to and
>> interpreted on the worker, isn't it?
>>
>> Thanks.
>>
>> On Tue, May 9, 2017 at 8:23 PM, Jongyoul Lee <jongy...@gmail.com> wrote:
>>
>>> Could you test if it works with spark-shell?
>>>
>>> On Sun, May 7, 2017 at 5:22 PM, Sofiane Cherchalli <sofian...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a standalone cluster, one master and one worker, running on
>>>> separate nodes. Zeppelin is running on a separate node too, in client mode.
>>>>
>>>> When I run a notebook that reads a CSV file located on the worker
>>>> node with the Spark-CSV package, Zeppelin tries to read the CSV locally and
>>>> fails because the CSV is on the worker node and not on the Zeppelin node.
>>>>
>>>> Is this the expected behavior?
>>>>
>>>> Thanks.
>>>
>>> --
>>> 이종열, Jongyoul Lee, 李宗烈
>>> http://madeng.net