RE: spark.lapply

2018-09-27 Thread Junior Alvarez
Around 500KB each time I call the function (~150 times).

From: Felix Cheung
Sent: 26 September 2018 14:57
To: Junior Alvarez; user@spark.apache.org
Subject: Re: spark.lapply

It looks like the native R process is terminated by a buffer overflow. Do you know how much data is involved?

Re: spark.lapply

2018-09-26 Thread Felix Cheung
It looks like the native R process is terminated by a buffer overflow. Do you know how much data is involved?

From: Junior Alvarez
Sent: Wednesday, September 26, 2018 7:33 AM
To: user@spark.apache.org
Subject: spark.lapply

Hi! I'm using spark.lapply...

spark.lapply

2018-09-26 Thread Junior Alvarez
Hi! I'm using spark.lapply() in SparkR on a Mesos service and I get the following crash randomly (the spark.lapply() function is called around 150 times; sometimes it crashes after 16 calls, other times after 25 calls, and so on... it is completely random, even though the data used in the actual call...
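
For reference, a minimal sketch of the calling pattern described above. The session settings, payload size, and worker function are placeholders, not the poster's actual code:

    library(SparkR)
    sparkR.session()   # on the poster's setup this would point at the Mesos master

    # Placeholder workload: spark.lapply() is invoked ~150 times,
    # each call shipping roughly 500KB of input to the executors.
    run_once <- function(x) sum(x)   # stand-in for the real function
    for (i in seq_len(150)) {
      inputs <- lapply(1:8, function(j) rnorm(8000))   # ~500KB total, made up
      results <- spark.lapply(inputs, run_once)
    }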

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Felix Cheung
Please note that SparkR code is now at https://github.com/apache/spark/tree/master/R

From: Cinquegrana, Piero <piero.cinquegr...@neustar.biz>
Sent: Thursday, August 25, 2016 6:08 AM
Subject: RE: spark.lapply in S...

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Cinquegrana, Piero
I realized that the error

    Error in writeBin(batch, con, endian = "big")

was due to an object within the 'parameters' list which was an R formula. When the spark.lapply method calls the parallelize method, it splits the list and calls the SparkR:::writeRaw method, which tries to convert f...
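
The message is truncated here, so the exact fix isn't shown. One common workaround (an assumption on my part, not confirmed by the thread) is to carry the formula as a character string, which serializes cleanly, and rebuild it with as.formula() on the executor:

    library(SparkR)
    sparkR.session()

    # Hypothetical parameter list: the formula travels as a plain string.
    parameterList <- lapply(1:4, function(i) list(f = "y ~ x", id = i))

    scoreModel <- function(p) {
      fml <- as.formula(p$f)                           # rebuilt on the executor
      dat <- data.frame(x = rnorm(50), y = rnorm(50))  # stand-in data
      unname(coef(lm(fml, data = dat))["x"])
    }

    modelScores <- spark.lapply(parameterList, scoreModel)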

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Felix Cheung
    dat <- data.frame(fread("file.csv"))
      score(dat, parameters)
    }
    parameterList <- lapply(1:100, function(i) getParameters(i))
    modelScores <- spark.lapply(parameterList, scoreModel)

Could you provide more information on your actual code?

From: ...

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-24 Thread Cinquegrana, Piero
... Pseudo code:

    scoreModel <- function(parameters){
      library(score)
      score(dat, parameters)
    }
    dat <<- read.csv('file.csv')
    modelScores <- spark.lapply(parameterList, scoreModel)

From: Cinquegrana, Piero <piero.cinquegr...@neustar.biz>
Sent: Tues...
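
Worth noting (my reading of SparkR's closure handling, so treat it as hedged): spark.lapply serializes the worker function together with the free variables it references, and a global like `dat` above gets captured and shipped with every task; that serialized payload is what flows through writeBin(). A tiny self-contained demonstration of the capture, with made-up data:

    library(SparkR)
    sparkR.session()

    dat <- data.frame(x = rnorm(10))   # lives in the driver's global env

    # `dat` is a free variable of f, so SparkR serializes it along
    # with f and sends both to the executors.
    f <- function(i) nrow(dat) + i
    spark.lapply(1:3, f)               # returns list(11, 12, 13)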

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-23 Thread Cinquegrana, Piero
To: Cinquegrana, Piero <piero.cinquegr...@neustar.biz>; user@spark.apache.org
Subject: Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

How big is the output from score()? Also, could you elaborate on what you want to broadcast?

On Mon, Aug 22, 2016 at 11:58 AM -0700, "Cinquegrana, Piero" ...

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Felix Cheung
How big is the output from score()? Also, could you elaborate on what you want to broadcast?

On Mon, Aug 22, 2016 at 11:58 AM -0700, "Cinquegrana, Piero" <piero.cinquegr...@neustar.biz> wrote:

Hello, I am using the new R API...

spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Cinquegrana, Piero
Hello, I am using the new R API in SparkR, spark.lapply (Spark 2.0). I am defining a complex function to be run across executors, and I have to send the entire dataset, but there is no way (that I could find) to broadcast the variable in SparkR. I am thus reading the dataset in each executor...
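
The message is cut off here; the pattern it describes (reading the data inside the applied function because SparkR exposes no public broadcast call) would look roughly like this, with the path and scoring logic as placeholders:

    library(SparkR)
    sparkR.session()

    scoreModel <- function(parameters) {
      # Each executor reads its own copy; "file.csv" must be reachable
      # from every worker (shared filesystem, distributed store, etc.).
      dat <- read.csv("file.csv")
      sum(dat[[1]]) * parameters        # stand-in for the real score()
    }

    parameterList <- as.list(1:10)      # hypothetical parameters
    modelScores <- spark.lapply(parameterList, scoreModel)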