Re: Appending to an hdfs file

2015-01-29 Thread Matan Safriel
e.
>
> On Wed, Jan 28, 2015 at 10:39 PM, Matan Safriel wrote:
>
> > Hi,
> >
> > Is it possible to append to an existing (hdfs) file, through some Spark action?
> > Should there be any reason not to use a hadoop append api within a Spark job?
> >
> > Thanks,
> > Matan

Appending to an hdfs file

2015-01-28 Thread Matan Safriel
Hi, Is it possible to append to an existing (hdfs) file, through some Spark action? Should there be any reason not to use a hadoop append api within a Spark job? Thanks, Matan

Re: Running a task over a single input

2015-01-28 Thread Matan Safriel
newb support. Thanks, Matan
On Wed, Jan 28, 2015 at 12:19 PM, Sean Owen wrote:
> Processing one object isn't a distributed operation, and doesn't really involve Spark. Just invoke your function on your object in the driver; there's no magic at all to that.
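A minimal sketch of the point in Sean Owen's reply, assuming the per-record logic lives in an ordinary function. The function name `word_count` and the sample records are hypothetical, and a plain Python list stands in for the large dataset so the sketch runs without a cluster; with PySpark the batch path would be `rdd.map(word_count)` on an RDD of strings.

```python
# Hypothetical per-record function; the name and logic are illustrative only.
def word_count(record: str) -> int:
    """Count the whitespace-separated tokens in one record."""
    return len(record.split())


# Batch path: a plain list stands in for the distributed dataset here.
# With PySpark this step would be rdd.map(word_count) on an RDD of strings.
dataset = ["a b c", "d e"]
batch_results = list(map(word_count, dataset))

# Single-datum path: call the same function directly in the driver process.
# No Spark job is involved for one object.
single_result = word_count("one new datum")
```

Keeping the logic in a plain function means the same code serves both the large-dataset run and the single new datum.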

Running a task over a single input

2015-01-28 Thread Matan Safriel
that normally runs over a large dataset, over just one newly added datum. I'm a bit hesitant about adapting my code to Spark without knowing the limits of this scenario. Many thanks! Matan

Full per node replication level (architecture question)

2015-01-24 Thread Matan Safriel
's workflow across such cluster... Thanks! Matan

Re: Spark using non-HDFS data on a distributed file system cluster

2014-10-24 Thread matan
l? or is it bound specifically to the hdfs api of hadoop, for performing a local data pull on the storage cluster machines? Thanks, Matan
On Fri, Oct 24, 2014 at 4:19 AM, Marcelo Vanzin [via Apache Spark User List] wrote:
> Your assessment is mostly correct. I think the only thing I'd re

Spark using HDFS data [newb]

2014-10-23 Thread matan
some of the data from HDFS, or, does it all rely on Spark being installed on each "hdfs server" and just using the hdfs file chunks of that server locally, without transporting any input hdfs data at all? Many thanks! Matan