[ https://issues.apache.org/jira/browse/SPARK-29777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hossein Falaki updated SPARK-29777:
-----------------------------------
    Description: 
The following code block reproduces the issue:
{code}
df <- createDataFrame(data.frame(x=1))
f1 <- function(x) x + 1
f2 <- function(x) f1(x) + 2
dapplyCollect(df, function(x) { f1(x); f2(x) })
{code}
We get the following error message:
{code}
org.apache.spark.SparkException: R computation failed with
Error in f1(x) : could not find function "f1"
Calls: compute -> computeFunc -> f2
{code}
Compare that to this code block, which succeeds:
{code}
dapplyCollect(df, function(x) { f2(x) })
{code}

  was:
The following code block reproduces the issue:
{code:java}
library(SparkR)
sparkR.session()

spark_df <- createDataFrame(na.omit(airquality))

cody_local2 <- function(param2) {
  10 + param2
}

cody_local1 <- function(param1) {
  cody_local2(param1)
}

result <- cody_local2(5)

calc_df <- dapplyCollect(spark_df, function(x) {
  cody_local2(20)
  cody_local1(5)
})

print(result)
{code}
We get the following error message:
{code:java}
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 12.0 failed 4 times, most recent failure: Lost task 0.3 in stage 12.0 (TID 27, 10.0.174.239, executor 0): org.apache.spark.SparkException: R computation failed with
Error in cody_local2(param1) : could not find function "cody_local2"
Calls: compute -> computeFunc -> cody_local1
{code}
Compare that to this code block that succeeds:
{code:java}
calc_df <- dapplyCollect(spark_df, function(x) {
  cody_local2(20)
  #cody_local1(5)
})
{code}


> SparkR::cleanClosure aggressively removes a function required by user function
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-29777
>                 URL: https://issues.apache.org/jira/browse/SPARK-29777
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.4.4
>            Reporter: Hossein Falaki
>            Priority: Major
>
> The following code block reproduces the issue:
> {code}
> df <- createDataFrame(data.frame(x=1))
> f1 <- function(x) x + 1
> f2 <- function(x) f1(x) + 2
> dapplyCollect(df, function(x) { f1(x); f2(x) })
> {code}
> We get the following error message:
> {code}
> org.apache.spark.SparkException: R computation failed with
> Error in f1(x) : could not find function "f1"
> Calls: compute -> computeFunc -> f2
> {code}
> Compare that to this code block, which succeeds:
> {code}
> dapplyCollect(df, function(x) { f2(x) })
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
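
[Editor's note, not part of the original report:] The failure pattern above suggests that cleanClosure captures only functions called directly by the UDF, not ones reached transitively through another captured helper. A workaround sketch, assuming affected SparkR versions (e.g. 2.4.4) and that the helpers are cheap to redefine, is to define them inside the function passed to dapplyCollect so they are serialized with the UDF's own environment:

{code}
library(SparkR)
sparkR.session()

df <- createDataFrame(data.frame(x = 1))

# Workaround sketch: define f1 and f2 inside the UDF body, so
# cleanClosure does not have to resolve f1 transitively from the
# enclosing environment when capturing f2.
result <- dapplyCollect(df, function(x) {
  f1 <- function(x) x + 1
  f2 <- function(x) f1(x) + 2
  f2(x)
})
{code}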