[ 
https://issues.apache.org/jira/browse/SPARK-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Roberts updated SPARK-13552:
---------------------------------
    Attachment:     (was: DefectBadMinValueLong.jpg)

> Incorrect data for Long.minValue in SQLQuerySuite on IBM Java
> -------------------------------------------------------------
>
>                 Key: SPARK-13552
>                 URL: https://issues.apache.org/jira/browse/SPARK-13552
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: IBM Java only, all platforms
>            Reporter: Adam Roberts
>            Priority: Minor
>         Attachments: DefectBadMinValueLongResized.jpg
>
>
> The Long.minValue test fails on IBM Java 8. With this slightly simplified 
> test case we get the following incorrect answer:
> {code:scala}
> val tester = sql(s"SELECT ${Long.MinValue} FROM testData")
> {code}
> The result is
> _-9,223,372,041,149,743,104_ instead of _-9,223,372,036,854,775,808_. The two 
> values differ by exactly 4,294,967,296 (2 to the power 32), so their binary 
> magnitudes differ in a single bit.
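The one-bit claim can be checked with a few lines of plain Scala. This is only an arithmetic sketch, not part of the failing test: the two literals are copied from the report, and the object and value names are made up for illustration.

```scala
// Standalone arithmetic check; only the two literal values come from the report.
object MinValueDelta {
  // The answer the test expects.
  val expected: BigInt = BigInt(Long.MinValue)
  // The answer reported on IBM Java.
  val observed: BigInt = BigInt("-9223372041149743104")

  // Numeric distance between the two values.
  val delta: BigInt = expected - observed
  // Bits that differ between the two magnitudes.
  val differingBits: BigInt = expected.abs ^ observed.abs

  def main(args: Array[String]): Unit = {
    println(s"delta              = $delta")
    println(s"delta is 1 << 32   = ${delta == (BigInt(1) << 32)}")
    println(s"differing bits     = ${differingBits.bitCount}")
    println(s"bit position       = ${differingBits.lowestSetBit}")
    println(s"observed fits Long = ${observed.isValidLong}")
  }
}
```

Note that the observed value is smaller than Long.MinValue, so it cannot be held in a Long at all. It can only exist as an arbitrary-precision value, which is consistent with the decimal(19,0) type shown in the analyzed plan.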
> Here's the full test output:
> {code}
> Results do not match for query:
> == Parsed Logical Plan ==
> 'GlobalLimit 1
> +- 'LocalLimit 1
>    +- 'Sort ['key ASC], true
>       +- 'Project [unresolvedalias(-9223372036854775808, None)]
>          +- 'UnresolvedRelation `testData`, None
> == Analyzed Logical Plan ==
> (-9223372036854775808): decimal(19,0)
> GlobalLimit 1
> +- LocalLimit 1
>    +- Project [(-9223372036854775808)#4391]
>       +- Sort [key#101 ASC], true
>          +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>             +- SubqueryAlias testData
>                +- LogicalRDD [key#101,value#102], MapPartitionsRDD[3] at beforeAll at BeforeAndAfterAll.scala:187
> == Optimized Logical Plan ==
> GlobalLimit 1
> +- LocalLimit 1
>    +- Project [(-9223372036854775808)#4391]
>       +- Sort [key#101 ASC], true
>          +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>             +- LogicalRDD [key#101,value#102], MapPartitionsRDD[3] at beforeAll at BeforeAndAfterAll.scala:187
> == Physical Plan ==
> TakeOrderedAndProject(limit=1, orderBy=[key#101 ASC], output=[(-9223372036854775808)#4391])
> +- WholeStageCodegen
>    :  +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>    :     +- INPUT
>    +- Scan ExistingRDD[key#101,value#102]
> == Results ==
> == Results ==
> !== Correct Answer - 1 ==   == Spark Answer - 1 ==
> ![-9223372036854775808]     [-9223372041149743104]
> {code}
> Debugging in IntelliJ shows that the query appears to be parsed correctly, and 
> we eventually have a schema with the correct data in the struct field. However, 
> the BigInteger inside the BigDecimal is incorrect once we have a 
> GenericRowWithSchema.
> I've identified that the problem started when SPARK-12575 was implemented and 
> suspect the following paragraph is important:
> "Hive and the SQL Parser treat decimal literals differently. Hive will turn 
> any decimal into a Double whereas the SQL Parser would convert a 
> non-scientific decimal into a BigDecimal, and would turn a scientific decimal 
> into a Double. We follow Hive's behavior here. The new parser supports a big 
> decimal literal, for instance: 81923801.42BD, which can be used when a big 
> decimal is needed."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

