Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22954#discussion_r233292436
  
    --- Diff: R/pkg/R/SQLContext.R ---
    @@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
               x
             }
           }
    +      data[] <- lapply(data, cleanCols)
     
    -      # drop factors and wrap lists
    -      data <- setNames(lapply(data, cleanCols), NULL)
    +      args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
    +      if (arrowEnabled) {
    +        shouldUseArrow <- tryCatch({
    +          stopifnot(length(data) > 0)
    +          dataHead <- head(data, 1)
    +          # Currently, Arrow optimization does not support POSIXct and raw types.
    +          # Also, it does not support explicit float types set by users, which lead to
    +          # incorrect conversion. We will fall back to the path without Arrow optimization.
    +          if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
    +            stop("Arrow optimization with R DataFrame does not support POSIXct type yet.")
    +          }
    +          if (any(sapply(dataHead, is.raw))) {
    +            stop("Arrow optimization with R DataFrame does not support raw type yet.")
    +          }
    +          if (inherits(schema, "structType")) {
    +            if (any(sapply(schema$fields(), function(x) x$dataType.toString() == "FloatType"))) {
    +              stop("Arrow optimization with R DataFrame does not support FloatType type yet.")
    --- End diff --
    
    I think it's a bug because it always produces a corrupt value when I try to use `number` as an explicit float type.
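
    For reference, a minimal sketch of the kind of call this check guards against (hypothetical data and schema; assumes SparkR is attached and a session is running). An explicit FloatType in the user-provided schema is the case the diff detects before falling back to the non-Arrow conversion path:

    ```r
    library(SparkR)
    sparkR.session()

    # A plain R data.frame with doubles (R has no native float type).
    localDf <- data.frame(x = c(1.5, 2.5, 3.5))

    # An explicit FloatType in the user-provided schema; with Arrow optimization
    # enabled this combination reportedly produces corrupt values, so the check in
    # the diff falls back to the non-Arrow path instead.
    floatSchema <- structType(structField("x", "float"))

    df <- createDataFrame(localDf, schema = floatSchema)
    head(df)
    ```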

