RE: as.Date can't be applied to Spark data frame in SparkR

2016-09-19 Thread xingye
Update: the job does finish, but it takes a long time on 10M rows of data. Is there a better solution?
From: xing_ma...@hotmail.com
To: user@spark.apache.org
Subject: as.Date can't be applied to Spark data frame in SparkR
Date: Tue, 20 Sep 2016 10:22:17 +0800
Hi all, I've noticed that as.Date can't

as.Date can't be applied to Spark data frame in SparkR

2016-09-19 Thread xingye
Hi all, I've noticed that as.Date can't be applied to a Spark data frame. I've created the following UDF and used dapply to change an integer column "aa" to a date with origin 1960-01-01.
change_date <- function(df) {
  df <- as.POSIXlt(as.Date(df$aa, origin = "1960-01-01", tz = "UTC"))
}
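One possible alternative to the dapply approach is to let Spark SQL do the date arithmetic, so that no rows have to pass through R worker processes. This is a minimal sketch and not an answer taken from the thread. It assumes Spark 2.0 or later with a session available, and an integer column aa holding day offsets from 1960-01-01. The names df, with_date, local_result and aa_date are hypothetical.

```r
library(SparkR)
sparkR.session()  # assumption: a local Spark 2.x installation is available

# Hypothetical sample data: "aa" holds days since 1960-01-01.
df <- createDataFrame(data.frame(aa = c(0L, 366L, 20000L)))

# date_add() is evaluated by Spark SQL itself, so no R UDF is involved.
with_date <- selectExpr(
  df,
  "aa",
  "date_add(CAST('1960-01-01' AS DATE), aa) AS aa_date"
)

# Bring the small sample back to the driver to inspect it.
local_result <- collect(with_date)
print(local_result)
```

Whether this is actually faster on a 10M-row data set would need to be measured. The design idea is only that a built-in SQL expression avoids serializing each partition to and from R.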

RE: questions about using dapply

2016-09-12 Thread xingye
ply(x[, paste("c", 1:2, sep = "")], MARGIN=2, FUN = function(y){ y %in% c(61, 99)})})
# dapplyCollect does not require the schema parameter
From: xingye <tracy.up...@gmail.com>
Sent: Friday, September 9, 2016 10:35 AM
Subject: qu

questions about using dapply

2016-09-09 Thread xingye
I have a question about using a UDF in SparkR. I'm converting some R code into SparkR.
• The original R code is: cols_in <- apply(df[, paste("cr_cd", 1:12, sep = "")], MARGIN = 2, FUN = "%in%", c(61, 99))
• If I use dapply and pass the original apply call as the function for dapply, cols_in

SparkR error: reference is ambiguous.

2016-09-09 Thread xingye
I'm not sure whether this is the right distribution list for asking questions. If not, can someone point me to a list where I can find help? I keep getting a "reference is ambiguous" error when implementing some SparkR code. 1. When I tried to assign values to a column using the

Assign values to existing column in SparkR

2016-09-09 Thread xingye
I have some questions about assigning values to a Spark data frame. I want to assign values to an existing column of a Spark data frame, but if I assign the value directly, I get the following error:
df$c_mon <- 0
Error: class(value) == "Column" || is.null(value) is not TRUE
Is there a way to solve this?
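The error message says the right-hand side must be a Column, so one way to assign a constant is to wrap it in lit(). This is a minimal sketch, assuming Spark 2.0 or later with a session available. The sample data frame and the column id are hypothetical, and only c_mon comes from the question.

```r
library(SparkR)
sparkR.session()  # assumption: a local Spark 2.x installation is available

# Hypothetical data frame with an existing c_mon column.
df <- createDataFrame(data.frame(id = 1:3, c_mon = c(5, 6, 7)))

# lit() turns the R constant into a Column, which `$<-` accepts.
df$c_mon <- lit(0)

# Equivalent form using withColumn(); it replaces the column of the same name.
df2 <- withColumn(df, "c_mon", lit(0))

local_df <- collect(df)
print(local_df)
```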