Forget the second point: I found the answer myself in the PipedRDD source
code :)
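
For anyone reading this later, here is a rough sketch of the behavior I mean,
in plain Python rather than Spark. It is not the PipedRDD code itself, only an
illustration of the idea: the external command is started once per partition,
every element of that partition is written to its stdin as one line, and each
line of its stdout becomes one output element. The names pipe_partition and
UPPER are made up for this example, and the tiny upper-casing command stands in
for the MATLAB engine executable.

```python
import subprocess
import sys


def pipe_partition(command, elements):
    """Start `command` once, feed it every element as one stdin line,
    and return its stdout split into lines."""
    proc = subprocess.run(
        command,
        input="".join(f"{e}\n" for e in elements),
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.splitlines()


# Hypothetical stand-in for the expensive external program:
# a child Python process that upper-cases each line it reads.
UPPER = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())",
]

# Two partitions, so the command is launched twice in total,
# not once per element.
partitions = [["a", "b", "c"], ["d", "e"]]
results = [pipe_partition(UPPER, part) for part in partitions]
print(results)
```

So the startup cost of the external process is paid once per partition, which
means fewer, larger partitions give fewer launches.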


On Wed, Aug 27, 2014 at 1:36 PM, Jaonary Rabarisoa <jaon...@gmail.com>
wrote:

> Thank you Matei.
>
>  I found a solution using pipe() and the MATLAB engine (an executable that
> calls MATLAB behind the scenes and communicates over stdin and stdout). I
> just need to fix two other issues:
>
> - How can I handle my dependencies? My MATLAB script needs other MATLAB
> files that must be present on each worker's MATLAB path. So I need a way
> to push them to each worker and tell MATLAB where to find them with
> "addpath". I know how to call "addpath", but I don't know what the path
> should be.
>
> - Does the pipe() operator work at the partition level, so that the
> external process is launched once per partition rather than once per
> element? Initializing my external process is expensive, so I don't want
> to launch it several times.
>
>
>
> On Mon, Aug 25, 2014 at 9:03 PM, Matei Zaharia <matei.zaha...@gmail.com>
> wrote:
>
>> Have you tried the pipe() operator? It should work if you can launch your
>> script from the command line. Just watch out for any environment variables
>> needed (you can pass them to pipe() as an optional argument if there are
>> some).
>>
>> On August 25, 2014 at 12:41:29 AM, Jaonary Rabarisoa (jaon...@gmail.com)
>> wrote:
>>
>>  Hi all,
>>
>> Has anyone tried to pipe an RDD into a MATLAB script? I'm trying to do
>> something similar and would appreciate any hints.
>>
>> Best regards,
>>
>> Jao
>>
>>
>
