[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-06-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051013#comment-16051013 ] Reynold Xin commented on SPARK-13333: But this ticket has nothing to do with SQL?

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-06-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050776#comment-16050776 ] Xiao Li commented on SPARK-13333: This function is still missing in the SQL interface. We can achieve

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-06-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050730#comment-16050730 ] Reynold Xin commented on SPARK-13333: What's left in this ticket? Didn't we fix it already? If it is

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-06-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048477#comment-16048477 ] Joseph K. Bradley commented on SPARK-13333: Thanks for explaining! I just rediscovered this

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-03-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939347#comment-15939347 ] Xiao Li commented on SPARK-13333: [~josephkb] In the SQL specification, the set operations are merging

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-03-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939323#comment-15939323 ] Joseph K. Bradley commented on SPARK-13333: [~smilegator] I wouldn't call that result "right."

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170231#comment-15170231 ] Xiao Li commented on SPARK-13333: [~josephkb] The result is right. unionAll does not consider the column

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170146#comment-15170146 ] Joseph K. Bradley commented on SPARK-13333: [~rxin] Is this the same issue as the following?

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155229#comment-15155229 ] Xiao Li commented on SPARK-13333: Thanks!

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155047#comment-15155047 ] Reynold Xin commented on SPARK-13333: OK, I think this is going to be really difficult to fix right

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151746#comment-15151746 ] Xiao Li commented on SPARK-13333: Yeah, you are right. This part is an issue. That is why I did not

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-17 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151720#comment-15151720 ] Liang-Chi Hsieh commented on SPARK-13333: Yes. I agree that when a user provides a specific seed

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151265#comment-15151265 ] Xiao Li commented on SPARK-13333: Another example is MS SQL Server Rand()

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150550#comment-15150550 ] Xiao Li commented on SPARK-13333: For example, in the Java documentation,

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-17 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150541#comment-15150541 ] Xiao Li commented on SPARK-13333: When users provide a specific seed number, they always expect a

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-17 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150072#comment-15150072 ] Liang-Chi Hsieh commented on SPARK-13333: Isn't it weird? Suppose each partition should have

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150053#comment-15150053 ] Xiao Li commented on SPARK-13333: Yeah, the same series of random numbers.

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150050#comment-15150050 ] Liang-Chi Hsieh commented on SPARK-13333: But when you set deterministic to true, each of your data

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150042#comment-15150042 ] Xiao Li commented on SPARK-13333: Yeah. I realized it when fixing this problem. Thus, in the PR, I just

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150008#comment-15150008 ] Liang-Chi Hsieh commented on SPARK-13333: If you don't attach a partition id, wouldn't each of your

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149850#comment-15149850 ] Apache Spark commented on SPARK-13333: User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149507#comment-15149507 ] Xiao Li commented on SPARK-13333: Will try to submit a PR tonight. When users specify a seed, I am

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149155#comment-15149155 ] Xiao Li commented on SPARK-13333: [~josephkb] I found the root cause. :) In the genCode of Randn and

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149053#comment-15149053 ] Joseph K. Bradley commented on SPARK-13333: I now have a much more complex example which does

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148212#comment-15148212 ] Xiao Li commented on SPARK-13333: Tried join, intersect and except in 2.0. Works fine! For example,

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148202#comment-15148202 ] Xiao Li commented on SPARK-13333: Interesting. This query has specified the seed. Thus, it should return

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148187#comment-15148187 ] Xiao Li commented on SPARK-13333: The current solution also has a performance penalty. That has been

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148186#comment-15148186 ] Xiao Li commented on SPARK-13333: You will get the right result if you cache the first DF {code} //

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148182#comment-15148182 ] Xiao Li commented on SPARK-13333: This is a known issue. The same issue exists in CTE with

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148163#comment-15148163 ] Xiao Li commented on SPARK-13333: Glad to work on this issue. Let me try it. Will keep you posted.

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2016-02-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148135#comment-15148135 ] Joseph K. Bradley commented on SPARK-13333: I haven't tested with 1.5 yet, but I assume it