Forget the second point: I found the answer myself in the PipedRDD source code :)
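
For readers of the archive, here is a minimal plain-Python sketch of the behaviour PipedRDD implements: the external command is started once per partition, every record of that partition is written to its stdin one line at a time, and its stdout lines become the output records. This is an emulation, not Spark code. The names pipe_partition, pipe_partitions and CHILD are hypothetical, and the child is a small Python one-liner standing in for the matlab engine executable.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for the external program: it tags every input line
# with its own process id so that process reuse is visible in the output.
CHILD = (
    "import os, sys\n"
    "for line in sys.stdin:\n"
    "    print(os.getpid(), line.strip().upper())\n"
)


def pipe_partition(records, command, env=None):
    """Start `command` once and stream all records of one partition through it."""
    # Extra variables are merged into the inherited environment.
    merged_env = {**os.environ, **(env or {})}
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env=merged_env,
        text=True,
    )
    # One record per line on stdin; communicate() avoids pipe deadlocks.
    stdin_text = "".join(f"{record}\n" for record in records)
    stdout_text, _ = proc.communicate(stdin_text)
    if proc.returncode != 0:
        raise RuntimeError(f"command exited with status {proc.returncode}")
    return stdout_text.splitlines()


def pipe_partitions(partitions, command, env=None):
    """Apply pipe_partition to each partition: one process launch per partition."""
    return [pipe_partition(part, command, env) for part in partitions]


if __name__ == "__main__":
    partitions = [["a", "b", "c"], ["d", "e"]]
    for lines in pipe_partitions(partitions, [sys.executable, "-c", CHILD]):
        print(lines)
```

Within each partition every output line carries the same process id, so an expensive start-up cost is paid once per partition and not once per record.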

On Wed, Aug 27, 2014 at 1:36 PM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:

> Thank you Matei.
>
> I found a solution using pipe() and the matlab engine (an executable that
> can call matlab behind the scenes and uses stdin and stdout to
> communicate). I just need to fix two other issues:
>
> - How can I handle my dependencies? My matlab script needs other matlab
>   files that must be present on each worker's matlab path. So I need a way
>   to push them to each worker and tell matlab where to find them with
>   "addpath". I know how to call "addpath", but I don't know what the path
>   should be.
>
> - Does the pipe() operator work at the partition level, so that the
>   external process is launched once per partition rather than once per
>   record? Initializing my external process is expensive, so it is not good
>   to start it several times.
>
> On Mon, Aug 25, 2014 at 9:03 PM, Matei Zaharia <matei.zaha...@gmail.com>
> wrote:
>
>> Have you tried the pipe() operator? It should work if you can launch your
>> script from the command line. Just watch out for any environment variables
>> it needs (you can pass them to pipe() as an optional argument if there are
>> some).
>>
>> On August 25, 2014 at 12:41:29 AM, Jaonary Rabarisoa (jaon...@gmail.com)
>> wrote:
>>
>> Hi all,
>>
>> Has anyone tried to pipe an RDD into a matlab script? I'm trying to do
>> something similar, so I would appreciate any hints.
>>
>> Best regards,
>>
>> Jao