Re: spark and matlab

2014-08-27 Thread Jaonary Rabarisoa
Thank you Matei.

 I found a solution using pipe and a MATLAB engine (an executable that can
call MATLAB behind the scenes and uses stdin and stdout to communicate). I
just need to fix two other issues:
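To make the stdin/stdout contract concrete, here is a minimal, hypothetical sketch of such a wrapper engine in Python: it reads one record per line from stdin and writes one result line to stdout, which is exactly the shape of process that RDD.pipe() can launch. The upper-casing `process` function is a placeholder standing in for the real MATLAB call.

```python
import sys

def process(line: str) -> str:
    # Placeholder for the real MATLAB computation: in the actual wrapper,
    # this is where the record would be handed to the MATLAB engine.
    return line.strip().upper()

def main() -> None:
    # One record per line in, one result per line out -- the framing
    # that Spark's pipe() operator expects from the child process.
    for line in sys.stdin:
        sys.stdout.write(process(line) + "\n")

if __name__ == "__main__":
    main()
```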

- how can I handle my dependencies? My MATLAB script needs other MATLAB
files that must be present on each worker's MATLAB path, so I need a way
to push them to each worker and tell MATLAB where to find them with
addpath. I know how to call addpath, but I don't know what the path
should be.
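One common pattern (an assumption on my part, not something confirmed in this thread) is to ship the .m files with SparkContext.addFile on the driver and resolve their worker-local location with SparkFiles on each executor; the resolved directories can then be turned into a MATLAB addpath prelude. A small, hypothetical helper for building that prelude:

```python
def addpath_prelude(dirs):
    # Given the worker-local directories where the dependency .m files
    # landed (e.g. the result of SparkFiles.getRootDirectory() in PySpark,
    # or the equivalent in Scala), emit the MATLAB statement that puts
    # them on the path before the script runs. MATLAB's addpath accepts
    # several directories in one call, separated by commas.
    quoted = ", ".join("'{}'".format(d) for d in dirs)
    return "addpath({});".format(quoted)
```

The wrapper executable would prepend this statement to whatever it sends to the MATLAB engine, so the worker-side path never needs to be known in advance.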

- does the pipe() operator work at the partition level, so that the
external process is launched once per partition rather than once per
record? Initializing my external process is costly, so it is not good to
start it several times.
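If I understand Spark's PipedRDD correctly, pipe() already follows the once-per-partition model: one child process is spawned per partition, and all of that partition's records stream through its stdin/stdout. A stand-alone sketch of that model (no Spark dependency; a small Python child stands in for the expensive-to-start MATLAB engine):

```python
import subprocess
import sys

def pipe_partition(records):
    # Spawn ONE child process for the whole partition, as pipe() does,
    # so the startup cost is paid once per partition, not once per record.
    child = [
        sys.executable, "-c",
        "import sys\n"
        "for l in sys.stdin: sys.stdout.write(l.upper())",
    ]
    result = subprocess.run(
        child,
        input="".join(r + "\n" for r in records),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```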



On Mon, Aug 25, 2014 at 9:03 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:

 Have you tried the pipe() operator? It should work if you can launch your
 script from the command line. Just watch out for any environment variables
 needed (you can pass them to pipe() as an optional argument if there are
 some).
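For clarity on what passing environment variables to pipe() amounts to: the child process inherits the driver-supplied variables on top of its normal environment. A stand-alone Python illustration of that handoff (ENGINE_HOME is a made-up variable name for the example):

```python
import os
import subprocess
import sys

def run_with_env(extra_env):
    # Merge the extra variables over the inherited environment, which is
    # what pipe(command, env) arranges for the external process.
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['ENGINE_HOME'])"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```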

 On August 25, 2014 at 12:41:29 AM, Jaonary Rabarisoa (jaon...@gmail.com)
 wrote:

 Hi all,

 Has anyone tried to pipe an RDD into a MATLAB script? I'm trying to do
 something similar, so any hints would be appreciated.

 Best regards,

 Jao

