Hello,

As per the documentation, lapply works on single records and lapplyPartition works on a whole partition. However, the format of the output does not change.
When I use lapplyPartition, the data is converted to a vertical format. Here is my code:

    library(SparkR)
    sc <- sparkR.init("local")
    lines <- textFile(sc, "/sparkdev/datafiles/covariance.txt")

    totals <- lapply(lines, function(lines) {
      sumx <- 0
      sumy <- 0
      totaln <- 0
      for (i in 1:length(lines)) {
        dataxy <- unlist(strsplit(lines[i], ","))
        sumx <- sumx + as.numeric(dataxy[1])
        sumy <- sumy + as.numeric(dataxy[2])
      }
      ## list(as.numeric(sumx), as.numeric(sumy), as.numeric(sumxy), as.numeric(totaln))
      ## list does same as below
      c(sumx, sumy)
    })

    output <- collect(totals)
    for (element in output) {
      cat(as.character(element[1]), as.character(element[2]), "\n")
    }

I am expecting the output to be

    55, 55

However, it is giving

    55,NA
    55,NA

Where am I going wrong?

Thanks,
Pranay
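For reference, the per-record versus per-partition distinction described above can be sketched in plain R, without Spark. This is only an illustration of the two calling conventions as I understand them from the documentation. The `partition` list and its three sample lines are made up for the example and are not the contents of covariance.txt, and `sum_partition` is a hypothetical helper name.

```r
# Plain R sketch, no Spark required.
# `partition` is a hypothetical stand-in for one partition of a text file:
# a list whose elements are individual "x,y" lines.
partition <- list("1,10", "2,9", "3,8")

# Per-record style: the function is called once per line and sees ONE line.
per_record <- lapply(partition, function(line) {
  as.numeric(unlist(strsplit(line, ",")))
})

# Per-partition style: the function is called once and sees ALL lines.
sum_partition <- function(lines) {
  sumx <- 0
  sumy <- 0
  for (line in lines) {
    xy <- as.numeric(unlist(strsplit(line, ",")))
    sumx <- sumx + xy[1]
    sumy <- sumy + xy[2]
  }
  c(sumx, sumy)
}
per_partition <- sum_partition(partition)

length(per_record)  # 3: one result per input line
per_partition       # 6 27: one pair of sums for the whole partition
```

In the per-record case the loop inside the function would only ever run once, since each call receives a single line. The running sums only make sense in the per-partition case.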