Vipul Modi created ZEPPELIN-2199:
------------------------------------
Summary: spark.lapply not working but SparkR:::lapply works.
Key: ZEPPELIN-2199
URL: https://issues.apache.org/jira/browse/ZEPPELIN-2199
Project: Zeppelin
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Vipul Modi
To run lapply in distributed mode via Zeppelin, one has to call SparkR:::lapply
instead of spark.lapply.
According to the Spark documentation, spark.lapply should work.
Steps to reproduce:
Build Zeppelin with the R profiles enabled:
mvn clean install -DskipTests -Drat.skip=true -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11
Failing paragraph:
%spark.r
families <- c("gaussian", "poisson")
df <- createDataFrame(iris)
train <- function(family) {
  model <- glm(Sepal.Length ~ Sepal.Width + Species, iris, family = family)
  summary(model)
}
model.summaries <- spark.lapply(families, train)
Passing paragraph:
%spark.r
families <- c("gaussian", "poisson")
df <- createDataFrame(iris)
train <- function(family) {
  model <- glm(Sepal.Length ~ Sepal.Width + Species, iris, family = family)
  summary(model)
}
model.summaries <- SparkR:::lapply(families, train)