[ https://issues.apache.org/jira/browse/SPARK-33795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson moved ARROW-10916 to SPARK-33795:
-------------------------------------------------

      Component/s:     (was: R)
                   SparkR
              Key: SPARK-33795  (was: ARROW-10916)
Affects Version/s:     (was: 2.0.0)
                   3.0.0
         Workflow: no-reopen-closed  (was: jira)
          Project: Spark  (was: Apache Arrow)

> gapply fails execution with rbind error
> ---------------------------------------
>
>                 Key: SPARK-33795
>                 URL: https://issues.apache.org/jira/browse/SPARK-33795
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 3.0.0
>         Environment: Databricks runtime 7.3 LTS ML
>            Reporter: MvR
>            Priority: Major
>         Attachments: Rerror.log
>
>
> Executing the following code on Databricks Runtime 7.3 LTS ML fails with an
> rbind error, whereas it runs successfully when Arrow is not enabled in the
> Spark session. The full error message is attached.
>
> ```
> library(dplyr)
> library(SparkR)
> SparkR::sparkR.session(sparkConfig =
>   list(spark.sql.execution.arrow.sparkr.enabled = "true"))
> mtcars %>%
>   SparkR::as.DataFrame() %>%
>   SparkR::gapply(x = .,
>                  cols = c("cyl", "vs"),
>                  func = function(key, data) {
>                    dt <- data[, c("mpg", "qsec")]
>                    res <- apply(dt, 2, mean)
>                    df <- data.frame(firstGroupKey = key[1],
>                                     secondGroupKey = key[2],
>                                     mean_mpg = res[1],
>                                     mean_cyl = res[2])
>                    return(df)
>                  },
>                  schema = structType(structField("cyl", "double"),
>                                      structField("vs", "double"),
>                                      structField("mpg_mean", "double"),
>                                      structField("qsec_mean", "double"))
>   ) %>%
>   display()
> ```