[ https://issues.apache.org/jira/browse/SPARK-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943784#comment-14943784 ]
Narine Kokhlikyan commented on SPARK-9318:
------------------------------------------

Hi all, [~shivaram], [~falaki],

I am working on the new signature for merge and have noticed that join in general has serious issues.

I took one of the examples from R base:::merge - https://stat.ethz.ch/R-manual/R-devel/library/base/html/merge.html

    x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
    y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)

I want to join the corresponding SparkR DataFrames, xdf and ydf:

    res <- join(xdf,ydf)

res has the following structure:

    DataFrame[k1:double, k2:double, data:int, k1:double, k2:double, data:int]

but when I do head(res) I get the following:

      k1 k2 data
    1 NA NA    1
    2  2 NA    2
    3 NA  3    3
    4  4  4    4
    5  5  5    5
    6 NA NA    1

This is not what I was expecting. The structure (six columns) is inconsistent with the data I see with head (three columns).

I tried aliasing the columns that have the same names in both data frames:

    ydfsel <- select(ydf, alias(ydf$k1,"k1.y"), alias(ydf$k2,"k2.y"), alias(ydf$data,"data.y"))
    xdfsel <- select(xdf, alias(xdf$k1,"k1.x"), alias(xdf$k2,"k2.x"), alias(xdf$data,"data.x"))

This actually works, and the following also works:

    join(xdfsel, ydfsel)

but the following fails:

    join(xdfsel,ydfsel,xdfsel$k1.x==ydfsel$k1.y)

Does this mean that I cannot refer to an aliased column? Do you know what the issue is here?

Thanks,
Narine

> Add `merge` as synonym for join
> -------------------------------
>
>                 Key: SPARK-9318
>                 URL: https://issues.apache.org/jira/browse/SPARK-9318
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SparkR
>            Reporter: Shivaram Venkataraman
>            Assignee: Hossein Falaki
>             Fix For: 1.5.0