[jira] [Commented] (SPARK-27224) Spark to_json parses UTC timestamp incorrectly

2019-03-26 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801802#comment-16801802 ] Jurriaan Pruis commented on SPARK-27224: [~hyukjin.kwon] is this the same as

[jira] [Commented] (SPARK-17914) Spark SQL casting to TimestampType with nanosecond results in incorrect timestamp

2019-03-26 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801795#comment-16801795 ] Jurriaan Pruis commented on SPARK-17914: I'm also seeing this issue where the millisecond part

[jira] [Comment Edited] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397223#comment-15397223 ] Jurriaan Pruis edited comment on SPARK-16753 at 7/28/16 8:02 AM: - [~rxin]

[jira] [Comment Edited] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397223#comment-15397223 ] Jurriaan Pruis edited comment on SPARK-16753 at 7/28/16 8:02 AM: - [~rxin]

[jira] [Updated] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-16753: --- Attachment: screenshot-1.png > Spark SQL doesn't handle skewed dataset joins properly >

[jira] [Commented] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397223#comment-15397223 ] Jurriaan Pruis commented on SPARK-16753: [~rxin] I've set the following options: {code}

[jira] [Commented] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-27 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396209#comment-15396209 ] Jurriaan Pruis commented on SPARK-16753: It's not only more memory, they also take up a lot more

[jira] [Commented] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-27 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396078#comment-15396078 ] Jurriaan Pruis commented on SPARK-16753: This looks like a skew problem to me since the tasks

[jira] [Commented] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-27 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396053#comment-15396053 ] Jurriaan Pruis commented on SPARK-16753: [~rxin] Do you know something about this? I've seen a

[jira] [Created] (SPARK-16753) Spark SQL doesn't handle skewed dataset joins properly

2016-07-27 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-16753: -- Summary: Spark SQL doesn't handle skewed dataset joins properly Key: SPARK-16753 URL: https://issues.apache.org/jira/browse/SPARK-16753 Project: Spark

[jira] [Commented] (SPARK-16252) Full Outer join with literal column results in incorrect result

2016-06-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353471#comment-15353471 ] Jurriaan Pruis commented on SPARK-16252: Awesome! It works, indeed! > Full Outer join with

[jira] [Updated] (SPARK-16252) Full Outer join with literal column results in incorrect result

2016-06-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-16252: --- Description: {code} >>> from pyspark.sql.functions import lit, coalesce >>> data1 = [[1,2],

[jira] [Created] (SPARK-16252) Full Outer join with literal column results in incorrect result

2016-06-28 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-16252: -- Summary: Full Outer join with literal column results in incorrect result Key: SPARK-16252 URL: https://issues.apache.org/jira/browse/SPARK-16252 Project: Spark

[jira] [Commented] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-22 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343775#comment-15343775 ] Jurriaan Pruis commented on SPARK-15326: [~hvanhovell] unfortunately that doesn't work. The

[jira] [Commented] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-06-21 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341703#comment-15341703 ] Jurriaan Pruis commented on SPARK-15393: That's interesting because my example worked just fine

[jira] [Commented] (SPARK-15654) Reading gzipped files results in duplicate rows

2016-05-31 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307425#comment-15307425 ] Jurriaan Pruis commented on SPARK-15654: You need to override maxSplitBytes, not

[jira] [Commented] (SPARK-15654) Reading gzipped files results in duplicate rows

2016-05-30 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306866#comment-15306866 ] Jurriaan Pruis commented on SPARK-15654: Sorry, not sure about other formats. So this is due to

[jira] [Commented] (SPARK-15654) Reading gzipped files results in duplicate rows

2016-05-30 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306731#comment-15306731 ] Jurriaan Pruis commented on SPARK-15654: cc [~davies] [~marmbrus] I saw you guys worked on code

[jira] [Created] (SPARK-15654) Reading gzipped files results in duplicate rows

2016-05-30 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15654: -- Summary: Reading gzipped files results in duplicate rows Key: SPARK-15654 URL: https://issues.apache.org/jira/browse/SPARK-15654 Project: Spark Issue

[jira] [Commented] (SPARK-13638) Support for saving with a quote mode

2016-05-28 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305263#comment-15305263 ] Jurriaan Pruis commented on SPARK-13638: [~rxin] Sure! > Support for saving with a quote mode >

[jira] [Commented] (SPARK-13638) Support for saving with a quote mode

2016-05-25 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300899#comment-15300899 ] Jurriaan Pruis commented on SPARK-13638: [~rxin] I think having quoteAll on by default is a bit

[jira] [Updated] (SPARK-15493) Allow setting the quoteEscapingEnabled flag when writing CSV

2016-05-24 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15493: --- Description: See

[jira] [Created] (SPARK-15493) Allow setting the quoteEscapingEnabled flag when writing CSV

2016-05-23 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15493: -- Summary: Allow setting the quoteEscapingEnabled flag when writing CSV Key: SPARK-15493 URL: https://issues.apache.org/jira/browse/SPARK-15493 Project: Spark

[jira] [Commented] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-22 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295477#comment-15295477 ] Jurriaan Pruis commented on SPARK-15393: [~hyukjin.kwon] I reproduced it again using pyspark

[jira] [Commented] (SPARK-14343) Dataframe operations on a partitioned dataset (using partition discovery) return invalid results

2016-05-22 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295471#comment-15295471 ] Jurriaan Pruis commented on SPARK-14343: [~davies] This is still an issue on Spark 2.0 (You only

[jira] [Commented] (SPARK-15415) Marking partitions for broadcast broken

2016-05-20 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292828#comment-15292828 ] Jurriaan Pruis commented on SPARK-15415: [~rxin] I could try to work on that. The reason I ran

[jira] [Created] (SPARK-15415) Marking partitions for broadcast broken

2016-05-19 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15415: -- Summary: Marking partitions for broadcast broken Key: SPARK-15415 URL: https://issues.apache.org/jira/browse/SPARK-15415 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-18 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15393: --- Description: Writing empty dataframes is broken on latest master. It omits the metadata and

[jira] [Updated] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-18 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15393: --- Description: Writing empty dataframes is broken on latest master. It omits the metadata and

[jira] [Updated] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-18 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15393: --- Summary: Writing empty Dataframes doesn't save any _metadata files (was: Writing empty

[jira] [Commented] (SPARK-15393) Writing empty Dataframes broken

2016-05-18 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289797#comment-15289797 ] Jurriaan Pruis commented on SPARK-15393: Ping [~hyukjin.kwon] > Writing empty Dataframes broken

[jira] [Created] (SPARK-15393) Writing empty Dataframes broken

2016-05-18 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15393: -- Summary: Writing empty Dataframes broken Key: SPARK-15393 URL: https://issues.apache.org/jira/browse/SPARK-15393 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-14959) Problem Reading partitioned ORC or Parquet files

2016-05-17 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286236#comment-15286236 ] Jurriaan Pruis commented on SPARK-14959: As you can see in the description writing is also broken

[jira] [Commented] (SPARK-15327) Catalyst code generation fails with complex data structure

2016-05-15 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283819#comment-15283819 ] Jurriaan Pruis commented on SPARK-15327: Did a quick look for the cause of this problem and it

[jira] [Updated] (SPARK-15327) Catalyst code generation fails with complex data structure

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15327: --- Attachment: full_exception.txt See attached file for the full exception / generated code. >

[jira] [Created] (SPARK-15327) Catalyst code generation fails with complex data structure

2016-05-14 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15327: -- Summary: Catalyst code generation fails with complex data structure Key: SPARK-15327 URL: https://issues.apache.org/jira/browse/SPARK-15327 Project: Spark

[jira] [Updated] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Summary: Doing multiple unions on a Dataframe will result in a very inefficient query plan

[jira] [Updated] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Attachment: Query Plan.pdf Also added a PDF of the Query Plan as shown in the web interface.

[jira] [Updated] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Description: While working with a very skewed dataset I noticed that repeated unions on a

[jira] [Issue Comment Deleted] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Comment: was deleted (was: The example code) > Doing multiple union on a Dataframe will

[jira] [Issue Comment Deleted] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Comment: was deleted (was: The extended query plan generated by the example) > Doing

[jira] [Updated] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Attachment: skewed_join_plan.txt The extended query plan generated by the example > Doing

[jira] [Updated] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Description: While working with a very skewed dataset I noticed that repeated unions on a

[jira] [Updated] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15326: --- Attachment: skewed_join.py The example code > Doing multiple union on a Dataframe will

[jira] [Created] (SPARK-15326) Doing multiple union on a Dataframe will result in a very inefficient query plan

2016-05-14 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15326: -- Summary: Doing multiple union on a Dataframe will result in a very inefficient query plan Key: SPARK-15326 URL: https://issues.apache.org/jira/browse/SPARK-15326

[jira] [Updated] (SPARK-15323) read with format=text is broken for partitioned tables in Spark 2.0

2016-05-14 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15323: --- Description: {code} sqlContext.read.format("text").load("...") {code} Is broken for

[jira] [Created] (SPARK-15323) read with format=text is broken for partitioned tables in Spark 2.0

2016-05-14 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15323: -- Summary: read with format=text is broken for partitioned tables in Spark 2.0 Key: SPARK-15323 URL: https://issues.apache.org/jira/browse/SPARK-15323 Project:

[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-05-13 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283044#comment-15283044 ] Jurriaan Pruis commented on SPARK-14463: Actually, this functionality is broken (explicitly

[jira] [Commented] (SPARK-14959) Problem Reading partitioned ORC or Parquet files

2016-05-11 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280895#comment-15280895 ] Jurriaan Pruis commented on SPARK-14959: I have the same issue reading a partitioned parquet

[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-05-04 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15271207#comment-15271207 ] Jurriaan Pruis commented on SPARK-14463: Any idea if

[jira] [Updated] (SPARK-15127) Column names are handled incorrectly when they originate from a single Dataframe

2016-05-04 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-15127: --- Description: I think I found a bug in the way columns are handled in (py)Spark h3. How to

[jira] [Created] (SPARK-15127) Column names are handled incorrectly when they originate from a single Dataframe

2016-05-04 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-15127: -- Summary: Column names are handled incorrectly when they originate from a single Dataframe Key: SPARK-15127 URL: https://issues.apache.org/jira/browse/SPARK-15127

[jira] [Updated] (SPARK-14343) Dataframe operations on a partitioned dataset (using partition discovery) return invalid results

2016-04-27 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-14343: --- Environment: Mac OS X 10.11.4 / Ubuntu 16.04 LTS (was: Mac OS X 10.11.4) > Dataframe

[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-18 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246211#comment-15246211 ] Jurriaan Pruis commented on SPARK-14463: Why? I guess this can be quite useful, at least while

[jira] [Commented] (SPARK-14343) Dataframe operations on a partitioned dataset (using partition discovery) return invalid results

2016-04-17 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244669#comment-15244669 ] Jurriaan Pruis commented on SPARK-14343: On the spark 2.0.0 nightly build it doesn't work at all:

[jira] [Updated] (SPARK-14343) Dataframe operations on a partitioned dataset (using partition discovery) return invalid results

2016-04-17 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-14343: --- Affects Version/s: 2.0.0 > Dataframe operations on a partitioned dataset (using partition

[jira] [Updated] (SPARK-14343) Dataframe operations on a partitioned dataset (using partition discovery) return invalid results

2016-04-04 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jurriaan Pruis updated SPARK-14343: --- Description: When reading a dataset using {{sqlContext.read.text()}} queries on the

[jira] [Created] (SPARK-14343) Dataframe operations on a partitioned dataset (using partition discovery) return invalid results

2016-04-02 Thread Jurriaan Pruis (JIRA)
Jurriaan Pruis created SPARK-14343: -- Summary: Dataframe operations on a partitioned dataset (using partition discovery) return invalid results Key: SPARK-14343 URL: