[jira] [Assigned] (SPARK-2967) Several SQL unit test failed when sort-based shuffle is enabled

Michael Armbrust (JIRA) Tue, 12 Aug 2014 00:47:34 -0700

     [ 
https://issues.apache.org/jira/browse/SPARK-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Michael Armbrust reassigned SPARK-2967:
---------------------------------------

    Assignee: Michael Armbrust

> Several SQL unit test failed when sort-based shuffle is enabled
> ---------------------------------------------------------------
>
>                 Key: SPARK-2967
>                 URL: https://issues.apache.org/jira/browse/SPARK-2967
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Saisai Shao
>            Assignee: Michael Armbrust
>            Priority: Critical
>
> Several SQLQuerySuite unit tests fail when sort-based shuffle is enabled. 
> The SQL tests appear to use GenericMutableRow, whose mutability causes every 
> entry in ExternalSorter's internal buffer to end up referring to the same 
> object. It seems rows should be copied before being fed into ExternalSorter.
> The errors are shown below; there are many failures, so only some are included:
> {noformat}
>  SQLQuerySuite:
>  - SPARK-2041 column name equals tablename
>  - SPARK-2407 Added Parser of SQL SUBSTR()
>  - index into array
>  - left semi greater than predicate
>  - index into array of arrays
>  - agg *** FAILED ***
>    Results do not match for query:
>    Aggregate ['a], ['a,SUM('b) AS c1#38]
>     UnresolvedRelation None, testData2, None
>    
>    == Analyzed Plan ==
>    Aggregate [a#4], [a#4,SUM(CAST(b#5, LongType)) AS c1#38L]
>     SparkLogicalPlan (ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215)
>    
>    == Physical Plan ==
>    Aggregate false, [a#4], [a#4,SUM(PartialSum#40L) AS c1#38L]
>     Exchange (HashPartitioning [a#4], 200)
>      Aggregate true, [a#4], [a#4,SUM(CAST(b#5, LongType)) AS PartialSum#40L]
>       ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215
>    
>    == Results ==
>    !== Correct Answer - 3 ==   == Spark Answer - 3 ==
>    !Vector(1, 3)               [1,3]
>    !Vector(2, 3)               [1,3]
>    !Vector(3, 3)               [1,3] (QueryTest.scala:53)
>  - aggregates with nulls
>  - select *
>  - simple select
>  - sorting *** FAILED ***
>    Results do not match for query:
>    Sort ['a ASC,'b ASC]
>     Project [*]
>      UnresolvedRelation None, testData2, None
>    
>    == Analyzed Plan ==
>    Sort [a#4 ASC,b#5 ASC]
>     Project [a#4,b#5]
>      SparkLogicalPlan (ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215)
>    
>    == Physical Plan ==
>    Sort [a#4 ASC,b#5 ASC], true
>     Exchange (RangePartitioning [a#4 ASC,b#5 ASC], 200)
>      ExistingRdd [a#4,b#5], MapPartitionsRDD[7] at mapPartitions at basicOperators.scala:215
>    
>    == Results ==
>    !== Correct Answer - 6 ==   == Spark Answer - 6 ==
>    !Vector(1, 1)               [3,2]
>    !Vector(1, 2)               [3,2]
>    !Vector(2, 1)               [3,2]
>    !Vector(2, 2)               [3,2]
>    !Vector(3, 1)               [3,2]
>    !Vector(3, 2)               [3,2] (QueryTest.scala:53)
>  - limit
>  - average
>  - average overflow *** FAILED ***
>    Results do not match for query:
>    Aggregate ['b], [AVG('a) AS c0#90,'b]
>     UnresolvedRelation None, largeAndSmallInts, None
>    
>    == Analyzed Plan ==
>    Aggregate [b#3], [AVG(CAST(a#2, LongType)) AS c0#90,b#3]
>     SparkLogicalPlan (ExistingRdd [a#2,b#3], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:215)
>    
>    == Physical Plan ==
>    Aggregate false, [b#3], [(CAST(SUM(PartialSum#93L), DoubleType) / CAST(SUM(PartialCount#94L), DoubleType)) AS c0#90,b#3]
>     Exchange (HashPartitioning [b#3], 200)
>      Aggregate true, [b#3], [b#3,COUNT(CAST(a#2, LongType)) AS PartialCount#94L,SUM(CAST(a#2, LongType)) AS PartialSum#93L]
>       ExistingRdd [a#2,b#3], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:215
>    
>    == Results ==
>    !== Correct Answer - 2 ==   == Spark Answer - 2 ==
>    !Vector(2.0, 2)             [2.147483645E9,1]
>    !Vector(2.147483645E9, 1)   [2.147483645E9,1] (QueryTest.scala:53)
> {noformat}
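
The aliasing described in the report can be illustrated outside Spark. The sketch below is a minimal Python stand-in, not Spark code: MutableRow, produce_rows, and the sample data are hypothetical names chosen for illustration, and a plain list only loosely plays the role of ExternalSorter's internal buffer. It assumes the producer reuses a single row object for every record, as the report suggests.

```python
class MutableRow:
    """Hypothetical stand-in for a reusable, mutable row object."""

    def __init__(self, size):
        self.values = [None] * size

    def update(self, *values):
        # Overwrite the existing storage in place instead of allocating a new row.
        self.values[:] = values

    def copy(self):
        # Return an independent row holding a snapshot of the current values.
        row = MutableRow(len(self.values))
        row.values = list(self.values)
        return row


def produce_rows(data):
    """Yield the SAME MutableRow instance for every record (assumed producer behavior)."""
    row = MutableRow(2)
    for a, b in data:
        row.update(a, b)
        yield row


data = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]

# Buffering references without copying: every slot aliases one object,
# so all entries reflect whatever was written last.
aliased = [row for row in produce_rows(data)]

# Buffering a copy of each row: every slot keeps its own values.
copied = [row.copy() for row in produce_rows(data)]

aliased_values = [tuple(r.values) for r in aliased]
copied_values = [tuple(r.values) for r in copied]

print(aliased_values)
print(copied_values)
```

Without the copy, all six buffered entries show the last record written, which matches the shape of the repeated [3,2] rows in the sorting failure above; copying each row before buffering preserves the distinct values.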



--
This message was sent by Atlassian JIRA
(v6.2#6252)
