[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034641#comment-15034641
]
Maciej Bryński commented on SPARK-12030:
Will the fix be included in 1.6.0?
> Incorrect results
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034615#comment-15034615
]
Davies Liu commented on SPARK-12030:
I also figured out the root cause last night, that's an
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034652#comment-15034652
]
Yin Huai commented on SPARK-12030:
--
Yes, it will be in 1.6.0.
> Incorrect results when aggregate joined
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034936#comment-15034936
]
Yin Huai commented on SPARK-12030:
--
I also merged the patch to branch 1.5. Please note that, in 1.5, we
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035164#comment-15035164
]
Xiao Li commented on SPARK-12030:
-
I did verify the fix using my test cases. It works!
I posted a
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034141#comment-15034141
]
Apache Spark commented on SPARK-12030:
--
User 'nongli' has created a pull request for this issue:
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033356#comment-15033356
]
Xiao Li commented on SPARK-12030:
-
[~nongli] Thank you very much!
Your finding sounds reasonable. I
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033321#comment-15033321
]
Nong Li commented on SPARK-12030:
-
I think I tracked it down. The bug is from this PR which exposed
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032887#comment-15032887
]
Xiao Li commented on SPARK-12030:
-
[SPARK-7542][SQL] Support off-heap index/sort buffer
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032939#comment-15032939
]
Xiao Li commented on SPARK-12030:
-
Let me post a simple case that can trigger the data corruption. The
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032724#comment-15032724
]
Xiao Li commented on SPARK-12030:
-
I believe I already found which PRs introduced the regression.
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032857#comment-15032857
]
Davies Liu commented on SPARK-12030:
[~smilegator] Could you post the related PRs here? So we can
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032902#comment-15032902
]
Xiao Li commented on SPARK-12030:
-
I already excluded Exchange and Partitioning. It should be caused by
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15032927#comment-15032927
]
Yin Huai commented on SPARK-12030:
--
[~smilegator] Can you post the case that triggers the problem? Also,
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031206#comment-15031206
]
Xiao Li commented on SPARK-12030:
-
I can reproduce a similar issue in a Sort. I think the impact could
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030883#comment-15030883
]
Maciej Bryński commented on SPARK-12030:
[~smilegator]
Problem is not only with distinct but with
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030887#comment-15030887
]
Maciej Bryński commented on SPARK-12030:
[~smilegator]
I tested 1.5.2 (binaries from spark page)
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031060#comment-15031060
]
Xiao Li commented on SPARK-12030:
-
[~maver1ck] Yeah, the problem was introduced in 1.6.0. So far, I think
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030430#comment-15030430
]
Xiao Li commented on SPARK-12030:
-
What is the data type of id1?
> Incorrect results when aggregate
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030618#comment-15030618
]
Xiao Li commented on SPARK-12030:
-
If you cache `joined`, can you see the same issue?
> Incorrect
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030612#comment-15030612
]
Maciej Bryński commented on SPARK-12030:
id1, id2 and fk1 are integers.
> Incorrect results when
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030615#comment-15030615
]
Xiu(Joe) Guo commented on SPARK-12030:
--
I tried your scenario with some TPCDS table last night,
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030632#comment-15030632
]
Maciej Bryński commented on SPARK-12030:
When I cache `joined`, the result of distinct(id) is always
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030652#comment-15030652
]
Xiao Li commented on SPARK-12030:
-
Thank you! [~maver1ck]
It would be great if we could know whether this is
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030633#comment-15030633
]
Maciej Bryński commented on SPARK-12030:
And spark-defaults.conf:
{code}
spark.master
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030643#comment-15030643
]
Maciej Bryński commented on SPARK-12030:
I tried the following things:
- disable kryoserializer
-
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030638#comment-15030638
]
Xiao Li commented on SPARK-12030:
-
Trying to reproduce it using your parquet files. Thanks!
> Incorrect
[
https://issues.apache.org/jira/browse/SPARK-12030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030694#comment-15030694
]
Xiao Li commented on SPARK-12030:
-
I can reproduce it now. Will take a look at it and try to fix it.