[jira] [Commented] (SPARK-13552) Incorrect data for Long.minValue in SQLQuerySuite on IBM Java

Pete Robbins (JIRA) Mon, 02 May 2016 23:58:12 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-13552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268232#comment-15268232
 ]


Pete Robbins commented on SPARK-13552:
--------------------------------------

[~aroberts] This JIRA can be closed, as it is not a Spark issue.

> Incorrect data for Long.minValue in SQLQuerySuite on IBM Java
> -------------------------------------------------------------
>
>                 Key: SPARK-13552
>                 URL: https://issues.apache.org/jira/browse/SPARK-13552
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: IBM Java only, all platforms
>            Reporter: Adam Roberts
>            Priority: Minor
>         Attachments: DefectBadMinValueLongResized.jpg
>
>
> The Long.minValue test fails on IBM Java 8. With a slightly simplified test 
> case, we get the following incorrect answer:
> {code:scala}
> val tester = sql(s"SELECT ${Long.MinValue} FROM testData")
> {code}
> The result is
> _-9,223,372,041,149,743,104_ instead of _-9,223,372,036,854,775,808_. The two 
> values differ by exactly 2^32, i.e. a single bit in the binary representation 
> of their magnitudes.
> Here's the full test output:
> {code}
> Results do not match for query:
> == Parsed Logical Plan ==
> 'GlobalLimit 1
> +- 'LocalLimit 1
>    +- 'Sort ['key ASC], true
>       +- 'Project [unresolvedalias(-9223372036854775808, None)]
>          +- 'UnresolvedRelation `testData`, None
> == Analyzed Logical Plan ==
> (-9223372036854775808): decimal(19,0)
> GlobalLimit 1
> +- LocalLimit 1
>    +- Project [(-9223372036854775808)#4391]
>       +- Sort [key#101 ASC], true
>          +- Project [-9223372036854775808 AS 
> (-9223372036854775808)#4391,key#101]
>             +- SubqueryAlias testData
>                +- LogicalRDD [key#101,value#102], MapPartitionsRDD[3] at 
> beforeAll at BeforeAndAfterAll.scala:187
> == Optimized Logical Plan ==
> GlobalLimit 1
> +- LocalLimit 1
>    +- Project [(-9223372036854775808)#4391]
>       +- Sort [key#101 ASC], true
>          +- Project [-9223372036854775808 AS 
> (-9223372036854775808)#4391,key#101]
>             +- LogicalRDD [key#101,value#102], MapPartitionsRDD[3] at 
> beforeAll at BeforeAndAfterAll.scala:187
> == Physical Plan ==
> TakeOrderedAndProject(limit=1, orderBy=[key#101 ASC], 
> output=[(-9223372036854775808)#4391])
> +- WholeStageCodegen
>    :  +- Project [-9223372036854775808 AS (-9223372036854775808)#4391,key#101]
>    :     +- INPUT
>    +- Scan ExistingRDD[key#101,value#102]
> == Results ==
> == Results ==
> !== Correct Answer - 1 ==   == Spark Answer - 1 ==
> ![-9223372036854775808]     [-9223372041149743104]
> {code}
> Debugging in IntelliJ shows that the query appears to be parsed correctly, 
> and we eventually have a schema with the correct data in the struct field. 
> However, the BigDecimal's BigInteger is incorrect by the time we have a 
> GenericRowWithSchema.
> I've identified that the problem started when SPARK-12575 was implemented and 
> suspect the following paragraph is important:
> "Hive and the SQL Parser treat decimal literals differently. Hive will turn 
> any decimal into a Double whereas the SQL Parser would convert a 
> non-scientific decimal into a BigDecimal, and would turn a scientific decimal 
> into a Double. We follow Hive's behavior here. The new parser supports a big 
> decimal literal, for instance: 81923801.42BD, which can be used when a big 
> decimal is needed."
> Done, both "value" and "row" return the correct result for both Java 
> implementations: -9223372036854775808
> FWIW, the first place where the incorrect row values can be seen is the 
> {{withCallback[T]}} method in DataFrame.scala; the specific line of code is
> {code}
> val result = action(df)
> {code}
> Stepping into this call doesn't clearly show how the resulting rows are 
> produced, though. It could be that I'm debugging the wrong thread in 
> IntelliJ: the first time I see a value for "result", it is already too late 
> and the value is incorrect.
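
The one-bit observation in the description above can be checked independently of Spark. The following is a minimal Java sketch, not taken from the Spark code base: the class name MinValueBitDiff and its method names are hypothetical, and the two constants are copied from the "Correct Answer" and "Spark Answer" lines of the quoted test output.

```java
import java.math.BigInteger;

public class MinValueBitDiff {
    // Expected value (Long.MIN_VALUE) and the value reported on IBM Java,
    // both copied verbatim from the quoted test output.
    static final BigInteger EXPECTED = new BigInteger("-9223372036854775808");
    static final BigInteger ACTUAL = new BigInteger("-9223372041149743104");

    // Arithmetic difference between the expected and the reported value.
    static BigInteger difference() {
        return EXPECTED.subtract(ACTUAL);
    }

    // Number of bit positions in which the two magnitudes differ.
    static int differingMagnitudeBits() {
        return EXPECTED.abs().xor(ACTUAL.abs()).bitCount();
    }

    // Zero-based index of the lowest differing bit in the magnitudes.
    static int differingBitIndex() {
        return EXPECTED.abs().xor(ACTUAL.abs()).getLowestSetBit();
    }

    // A signed 64-bit long can hold a value only if bitLength() <= 63.
    static boolean actualFitsInLong() {
        return ACTUAL.bitLength() <= 63;
    }

    public static void main(String[] args) {
        System.out.println("difference        = " + difference());             // 4294967296
        System.out.println("differing bits    = " + differingMagnitudeBits()); // 1
        System.out.println("differing bit idx = " + differingBitIndex());      // 32
        System.out.println("fits in long      = " + actualFitsInLong());       // false
    }
}
```

The reported value lies below Long.MIN_VALUE, so it cannot have been held in a Long. That is consistent with the decimal(19,0) type shown in the analyzed plan and with the description's remark that the BigDecimal's BigInteger is what ends up incorrect.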



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
