[ https://issues.apache.org/jira/browse/SPARK-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493223#comment-14493223 ]
Antonio Piccolboni commented on SPARK-6820:
-------------------------------------------

For the distinction between NAs and NULLs in R, see http://www.r-bloggers.com/r-na-vs-null/

This seems a fairly dangerous move, but I don't have a good alternative to suggest. This is a valid data frame:

    dd <- structure(list(c.1..2..NA. = c(1, 2, NA), V2 = list(1, 2, NULL)),
                    .Names = c("c.1..2..NA.", "V2"), row.names = c(NA, -3L),
                    class = "data.frame")
    dd[3,1] == dd[3,2][[1]]

How often real code relies on list columns that can contain NULLs, I am not sure.

> Convert NAs to null type in SparkR DataFrames
> ---------------------------------------------
>
>                 Key: SPARK-6820
>                 URL: https://issues.apache.org/jira/browse/SPARK-6820
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR, SQL
>            Reporter: Shivaram Venkataraman
>
> While converting an RDD or a local R data frame to a SparkR DataFrame we need to handle missing values (NAs).
> We should convert NAs to Spark SQL's null type to handle the conversion correctly.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
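[Editor's note: as a footnote to the comment above, a minimal base-R sketch of the NA-vs-NULL distinction it relies on; this is illustrative only and is not part of the original thread.]

    # NA is a missing value occupying a slot inside a vector;
    # NULL is the absence of a value entirely.
    v <- c(1, 2, NA)
    length(v)        # 3 -- the NA still occupies a position
    is.na(v[3])      # TRUE

    # NULL cannot sit in an atomic vector, but it can be a list element,
    # which is why list columns in data frames can hold NULLs:
    l <- list(1, 2, NULL)
    length(l)        # 3
    is.null(l[[3]])  # TRUE

    # Comparing the two does not yield TRUE or FALSE at all:
    NA == NULL       # logical(0)

This is why collapsing both NA and NULL onto a single Spark SQL null loses a distinction that valid R code can observe, as in the dd example above.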