[ https://issues.apache.org/jira/browse/SPARK-10894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034386#comment-15034386 ]
Weiqiang Zhuang commented on SPARK-10894:
-----------------------------------------

Yeah, as I mentioned in my previous comment, this depends on SPARK-7499 to free the $ operator.

> Add 'drop' support for DataFrame's subset function
> --------------------------------------------------
>
>                 Key: SPARK-10894
>                 URL: https://issues.apache.org/jira/browse/SPARK-10894
>             Project: Spark
>          Issue Type: Improvement
>          Components: SparkR
>            Reporter: Weiqiang Zhuang
>
> A SparkR DataFrame can be subset to get one or more columns of the dataset. The
> current '[' implementation does not support 'drop' when asked for just one
> column. This is not consistent with the R syntax:
> x[i, j, ... , drop = TRUE]
> # in R, when drop is FALSE, the result remains a data.frame
> > class(iris[, "Sepal.Width", drop=F])
> [1] "data.frame"
> # when drop is TRUE (the default), the result is dropped to a vector
> > class(iris[, "Sepal.Width", drop=T])
> [1] "numeric"
> > class(iris[,"Sepal.Width"])
> [1] "numeric"
> > df <- createDataFrame(sqlContext, iris)
> # in SparkR, the 'drop' argument has no effect
> > class(df[,"Sepal_Width", drop=F])
> [1] "DataFrame"
> attr(,"package")
> [1] "SparkR"
> # should have been dropped to a Column class instead
> > class(df[,"Sepal_Width", drop=T])
> [1] "DataFrame"
> attr(,"package")
> [1] "SparkR"
> > class(df[,"Sepal_Width"])
> [1] "DataFrame"
> attr(,"package")
> [1] "SparkR"
> We should add the 'drop' support.
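
For reference, the semantics requested in the issue can be sketched in plain R without a Spark session. This is only an illustration under stated assumptions: `select_cols` is a hypothetical helper, not a SparkR function, and it operates on a base data.frame. In SparkR the "dropped" result would presumably be a Column rather than a numeric vector, as the issue description suggests.

```r
# Hypothetical helper (not part of SparkR): mimics the proposed 'drop'
# semantics on a base R data.frame.
# With drop = TRUE and exactly one selected column, return the column itself;
# in every other case keep the result as a data.frame.
select_cols <- function(df, cols, drop = TRUE) {
  stopifnot(is.data.frame(df), all(cols %in% names(df)))
  if (drop && length(cols) == 1L) {
    df[[cols]]                 # single column, dropped to a vector
  } else {
    df[, cols, drop = FALSE]   # stays a data.frame
  }
}

one_kept    <- select_cols(iris, "Sepal.Width", drop = FALSE)
one_dropped <- select_cols(iris, "Sepal.Width")
two_cols    <- select_cols(iris, c("Sepal.Width", "Sepal.Length"))
```

Here `one_kept` and `two_cols` stay data.frames, while `one_dropped` is a plain numeric vector, which mirrors how base R's `[` treats `drop` for a single column.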