[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

felixcheung Fri, 09 Nov 2018 00:51:48 -0800

Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22954#discussion_r232173367
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
               x
             }
           }
    +      data[] <- lapply(data, cleanCols)
     
    -      # drop factors and wrap lists
    -      data <- setNames(lapply(data, cleanCols), NULL)
    +      args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
    +      if (arrowEnabled) {
    +        shouldUseArrow <- tryCatch({
    +          stopifnot(length(data) > 0)
    +          dataHead <- head(data, 1)
    +          # Currently Arrow optimization does not support POSIXct and raw for now.
    +          # Also, it does not support explicit float type set by users. It leads to
    +          # incorrect conversion. We will fall back to the path without Arrow optimization.
    +          if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
    --- End diff --
    
    Can you check this? I think `is` / `is.x` sometimes doesn't do the right thing
    when we take `head(df, 1)` and one of the fields is `NA`.


