Re: Running a task over a single input

2015-01-28 Thread Sean Owen
On Wed, Jan 28, 2015 at 1:44 PM, Matan Safriel wrote:
> So I assume I can safely run a function F of mine within the spark driver
> program, without dispatching it to the cluster (?), thereby sticking to one
> piece of code for both a real cluster run over big data, and for small
> on-demand runs

Re: Running a task over a single input

2015-01-28 Thread Matan Safriel
Thanks! So I assume I can safely run a function *F* of mine within the Spark driver program, without dispatching it to the cluster (?), thereby sticking to one piece of code for *both* a real cluster run over big data and for small on-demand runs for a single input (now and then), both scenarios

Re: Running a task over a single input

2015-01-28 Thread Sean Owen
Processing one object isn't a distributed operation, and doesn't really involve Spark. Just invoke your function on your object in the driver; there's no magic at all to that. You can make an RDD of one object and invoke a distributed Spark operation on it, but assuming you mean you have it on the
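The point above can be sketched in a few lines. This is a minimal illustration, not code from the thread: `process` is a hypothetical stand-in for the user's function *F*, and the distributed path is shown only as a comment because it assumes a live `SparkContext` named `sc`.

```python
def process(record):
    """Hypothetical user function F; works on one record at a time."""
    return record.upper()

# Small on-demand run: no Spark involved, just call the function
# directly in the driver program.
result = process("hello")

# Cluster run over big data (sketch, assuming a SparkContext `sc`):
#   results = sc.parallelize(records).map(process).collect()
# The same `process` function is reused, so one piece of code serves
# both the single-input case and the distributed case.
```

Wrapping a single object in an RDD (e.g. `sc.parallelize([record])`) and running the same `map` is possible, but as noted above it adds scheduling overhead with no benefit for one input.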

Running a task over a single input

2015-01-28 Thread Matan Safriel
Hi, How would I run a given function in Spark over a single input object? Would I first add the input to the file system, then somehow invoke the Spark function on just that input? Or should I rather twist the Spark Streaming API for it? Assume I'd like to run a piece of computation that normall