[ 
https://issues.apache.org/jira/browse/SPARK-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ram Kandasamy resolved SPARK-11427.
-----------------------------------
    Resolution: Duplicate

> DataFrame's intersect method does not work, returns 1
> -----------------------------------------------------
>
>                 Key: SPARK-11427
>                 URL: https://issues.apache.org/jira/browse/SPARK-11427
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Ram Kandasamy
>
> Hello,
> I was working with DataFrames and found that counting the result of
> intersect() always seems to return 1. The RDD's intersection() method does
> work properly.
> Consider this example:
> scala> val firstFile = sqlContext.read.parquet("/Users/ramkandasamy/sparkData/2015-07-25/*").select("id").distinct
> firstFile: org.apache.spark.sql.DataFrame = [id: string]
> scala> firstFile.count
> res4: Long = 1072046
> scala> firstFile.intersect(firstFile).count
> res5: Long = 1
> scala> firstFile.rdd.intersection(firstFile.rdd).count
> res6: Long = 1072046
> I have tried various cases, and for some reason the count of the DataFrame's
> intersect result is always 1.


