[GitHub] spark pull request: [SPARK-12104][SPARKR] collect() does not handl...

falaki Wed, 02 Dec 2015 21:34:58 -0800

Github user falaki commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10118#discussion_r46514167
  
    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -822,21 +822,21 @@ setMethod("collect",
                     # Get a column of complex type returns a list.
                     # Get a cell from a column of complex type returns a list instead of a vector.
                     col <- listCols[[colIndex]]
    -                colName <- dtypes[[colIndex]][[1]]
                     if (length(col) <= 0) {
    -                  df[[colName]] <- col
    +                  df[[colIndex]] <- col
                     } else {
                       colType <- dtypes[[colIndex]][[2]]
                       # Note that "binary" columns behave like complex types.
                       if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
                         vec <- do.call(c, col)
                         stopifnot(class(vec) != "list")
    -                    df[[colName]] <- vec
    +                    df[[colIndex]] <- vec
                       } else {
    -                    df[[colName]] <- col
    +                    df[[colIndex]] <- col
                       }
                     }
                   }
    +              names(df) <- names(x)
    --- End diff --
    
    This differs slightly from 1.5. With this change, the local data.frame gets
    exactly the same column names as the DataFrame, duplicates included. In
    Spark 1.5, subsequent instances of the same name have numbers appended. I am
    not sure which is better; in fact, I slightly prefer your suggested
    behavior. But in case others want to chime in: cc @shivaram
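
To make the naming trade-off concrete, here is a minimal sketch in base R (not SparkR). It assumes two collected columns that share the name "id"; `listCols` and `colNames` are hypothetical stand-ins for the values in the diff, and `make.unique()` is used only to illustrate number-suffixed names, not as a claim about how Spark 1.5 produced them.

```r
# Hypothetical stand-ins for the collected columns and their duplicate names.
listCols <- list(c(1, 2, 3), c("a", "b", "c"))
colNames <- c("id", "id")

# Assigning by name (the removed lines): the second "id" overwrites the first.
byName <- data.frame(row.names = 1:3)
for (i in seq_along(listCols)) {
  byName[[colNames[i]]] <- listCols[[i]]
}
ncol(byName)    # 1: one column was lost

# Assigning by index, then setting the names once (the added lines):
# both columns survive and keep the exact, duplicated names.
byIndex <- data.frame(row.names = 1:3)
for (i in seq_along(listCols)) {
  byIndex[[i]] <- listCols[[i]]
}
names(byIndex) <- colNames
names(byIndex)  # "id" "id"

# Illustration only: base R's way of appending numbers to repeated names.
make.unique(colNames)  # "id" "id.1"
```

With exact duplicate names, `byIndex$id` or `byIndex[["id"]]` reaches only the first matching column, so the second has to be accessed by position, for example `byIndex[[2]]`.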



