RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Cinquegrana, Piero
…from formula to binary, exploding the size of the object being passed. https://github.com/amplab-extras/SparkR-pkg/blob/master/pkg/R/serialize.R From: Felix Cheung [mailto:felixcheun...@hotmail.com] Sent: Thursday, August 25, 2016 2:35 PM To: Cinquegrana, Piero <piero.cinquegr...@neustar.biz> …
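The suspected mechanism can be illustrated with base R's serialize() (plain R, not SparkR's serializer): a formula silently captures its enclosing environment, so any large objects defined alongside it are dragged into the serialized payload, and re-pointing the environment shrinks it back down.

    # Illustration in base R (not SparkR internals): a formula records its
    # enclosing environment, so serializing the formula also serializes
    # everything that happens to live there.
    make_formula <- function() {
      big <- rnorm(1e7)   # ~80 MB vector sharing the formula's environment
      y ~ x               # the formula captures this local environment
    }

    f <- make_formula()
    length(serialize(f, NULL))     # huge: 'big' travels with the formula

    environment(f) <- globalenv()  # detach the heavy environment
    length(serialize(f, NULL))     # small: just the formula itself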

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-24 Thread Cinquegrana, Piero
…Pseudo code:

    scoreModel <- function(parameters) {
      library(score)
      score(dat, parameters)
    }
    dat <<- read.csv('file.csv')
    modelScores <- spark.lapply(parameterList, scoreModel)

From: Cinquegrana, Piero [mailto:piero.cinquegr...@neustar.biz] Sent: Tuesday, …
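In that pseudo code, dat is a free variable inside scoreModel, so SparkR captures the driver-side data frame into the serialized closure. A sketch of one way to keep the closure small (assuming the CSV is reachable from every executor; score() and file.csv are taken from the pseudo code above, not verified here): read the data inside the worker function instead of referencing a driver-side global.

    library(SparkR)
    sparkR.session()

    # Hypothetical parameter grid standing in for 'parameterList'.
    parameterList <- list(0.1, 0.5, 1.0)

    scoreModel <- function(parameters) {
      library(score)                # the user's scoring package
      dat <- read.csv("file.csv")   # read on the executor; the path must
                                    # be accessible from every worker node
      score(dat, parameters)
    }

    # Only the small function is serialized; no large object rides along.
    modelScores <- spark.lapply(parameterList, scoreModel)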

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-23 Thread Cinquegrana, Piero
…Cinquegrana, Piero <piero.cinquegr...@neustar.biz>; user@spark.apache.org Subject: Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") How big is the output from score()? Also could you elaborate on what you want to broadcast? On Mon, Aug 22, 2016 at 11:58 AM -0700, …

spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Cinquegrana, Piero
Hello, I am using the new R API in SparkR, spark.lapply (Spark 2.0). I am defining a complex function to be run across executors, and I have to send the entire dataset, but there is no way (that I could find) to broadcast a variable in SparkR. I am thus reading the dataset in each executor …
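A minimal sketch of the pattern being described (names and sizes hypothetical): because spark.lapply ships the function's closure to the executors, any large driver-side object the function references is serialized along with it, and a large enough payload is where an error like the writeBin(batch, con, endian = "big") failure can surface.

    library(SparkR)
    sparkR.session()

    dat <- data.frame(x = rnorm(1e7))   # large driver-side dataset (hypothetical)

    scoreModel <- function(p) {
      # 'dat' is a free variable: SparkR captures it into the closure,
      # so the whole data frame is serialized and sent with the function.
      sum(dat$x) * p
    }

    # Each task receives the serialized closure, data frame included;
    # very large closures are where writeBin on the driver can fail.
    modelScores <- spark.lapply(list(1, 2, 3), scoreModel)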