[ 
https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240804#comment-15240804
 ] 

Sean Owen commented on SPARK-12741:
-----------------------------------

I mean it's a little better to write your test case here than link to an 
external post.
Also, try running the latest build of Spark, since sometimes something has 
already been fixed, and that much would be good to know.

> DataFrame count method return wrong size.
> -----------------------------------------
>
>                 Key: SPARK-12741
>                 URL: https://issues.apache.org/jira/browse/SPARK-12741
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Sasi
>
> Hi,
> I'm updating my report.
> I'm working with Spark 1.5.2, (used to be 1.5.0), I have a DataFrame and I 
> have 2 method, one for collect data and other for count.
> method doQuery looks like:
> {code}
> dataFrame.collect()
> {code}
> method doQueryCount looks like:
> {code}
> dataFrame.count()
> {code}
> I have few scenarios with few results:
> 1) Non data exists on my NoSQLDatabase results: count 0 and collect() 0
> 2) 3 rows exists results: count 0 and collect 3.
> 3) 5 rows exists results: count 2 and collect 5. 
> I tried to change the count code to the below code, but got the same results 
> as I mentioned above.
> {code}
> dataFrame.sql("select count(*) from tbl").count/collect[0]
> {code}
> Thanks,
> Sasi



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to