Can't you just load the data from HBase first, and then call sc.parallelize on your dataset?
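
Roughly what I have in mind, as a minimal PySpark sketch rather than a tested implementation. The helper names (load_rows_from_hbase, aggregate_with_spark, run_locally) are made up for illustration, and the HBase read is a stub returning hard-coded rows, since I don't know which client or table you use. The numSlices value of 1 mirrors the sc.parallelize(data,1) call in your mail.

```python
def load_rows_from_hbase():
    """Hypothetical stand-in for the HBase read.

    Assumption: the real code would use an HBase client on the driver and
    return a small list of (key, value) pairs that fits in driver memory.
    """
    return [("a", 1), ("b", 2), ("a", 3)]


def aggregate_with_spark(sc, rows):
    """Turn the driver-side list into a single-partition RDD and sum per key."""
    rdd = sc.parallelize(rows, 1)  # one slice, as in sc.parallelize(data, 1)
    summed = rdd.reduceByKey(lambda x, y: x + y)
    # collect() brings the (small) aggregated result back to the driver.
    return dict(summed.collect())


def run_locally():
    """Example wiring; needs PySpark installed, so it is not called on import."""
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "hbase-small-aggregate")
    try:
        return aggregate_with_spark(sc, load_rows_from_hbase())
    finally:
        sc.stop()
```

With PySpark available, calling run_locally() runs the whole thing against a local master. Swap the stub for your actual HBase read.
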
-Andy

-------
Regards,
Andy (Nam) Dang

On Wed, Sep 30, 2015 at 12:52 PM, Nicolae Marasoiu <nicolae.maras...@adswizz.com> wrote:

> Hi,
>
> When calling sc.parallelize(data,1), is there a preference where to put
> the data? I see 2 possibilities: sending it to a worker node, or keeping it
> on the driver program.
>
> I would prefer to keep the data local to the driver. The use case is when
> I need just to load a bit of data from HBase, and then compute over it e.g.
> aggregate, using Spark.
>
> Thanks,
>
> Nicu