Repository: spark
Updated Branches:
  refs/heads/branch-2.2 aaeca8bdd -> adaa3f7e0


[SPARK-20341][SQL] Support BigInt's value that does not fit in long value range

## What changes were proposed in this pull request?

This PR avoids an exception when a `scala.math.BigInt` holds a value that does 
not fit into the long value range (e.g. `Long.MAX_VALUE + 1`). Running the 
following code with the current Spark throws the exception shown below.

The fix detects the overflow by catching `ArithmeticException` and keeps the 
value as a `BigDecimal` in that case.
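
The shape of the fix can be sketched standalone with plain `java.math` types. 
The helper name `toLongOrDecimal` is illustrative only and is not part of the 
patch; the actual change lives in `Decimal.set`, shown in the diff below.

```scala
import java.math.{BigInteger, BigDecimal => JBigDecimal}

// Fast path: store the value as a primitive Long when it fits.
// BigInteger.longValueExact throws ArithmeticException on overflow,
// and only then do we pay for a BigDecimal allocation.
def toLongOrDecimal(v: BigInteger): Either[Long, JBigDecimal] =
  try {
    Left(v.longValueExact())
  } catch {
    case _: ArithmeticException => Right(new JBigDecimal(v))
  }

val fits      = toLongOrDecimal(BigInteger.valueOf(Long.MaxValue))
val overflows = toLongOrDecimal(new BigInteger("10000000000000000002"))
```

The try-first structure mirrors the patch: the common in-range case stays 
allocation-free, and the exception path is taken only for genuinely oversized 
values.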

Sample program:
```
case class BigIntWrapper(value: scala.math.BigInt)
spark.createDataset(BigIntWrapper(scala.math.BigInt("10000000000000000002")) :: Nil).show
```
Exception:
```
Error while encoding: java.lang.ArithmeticException: BigInteger out of long range
staticinvoke(class org.apache.spark.sql.types.Decimal$, DecimalType(38,0), apply, assertnotnull(assertnotnull(input[0, org.apache.spark.sql.BigIntWrapper, true])).value, true) AS value#0
java.lang.RuntimeException: Error while encoding: java.lang.ArithmeticException: BigInteger out of long range
staticinvoke(class org.apache.spark.sql.types.Decimal$, DecimalType(38,0), apply, assertnotnull(assertnotnull(input[0, org.apache.spark.sql.BigIntWrapper, true])).value, true) AS value#0
        at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:290)
        at org.apache.spark.sql.SparkSession$$anonfun$2.apply(SparkSession.scala:454)
        at org.apache.spark.sql.SparkSession$$anonfun$2.apply(SparkSession.scala:454)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.immutable.List.map(List.scala:285)
        at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:454)
        at org.apache.spark.sql.Agg$$anonfun$18.apply$mcV$sp(MySuite.scala:192)
        at org.apache.spark.sql.Agg$$anonfun$18.apply(MySuite.scala:192)
        at org.apache.spark.sql.Agg$$anonfun$18.apply(MySuite.scala:192)
        at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
        at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
        at org.scalatest.Transformer.apply(Transformer.scala:22)
        at org.scalatest.Transformer.apply(Transformer.scala:20)
        at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
        at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
        at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
        at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
        at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
        at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
        at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
...
Caused by: java.lang.ArithmeticException: BigInteger out of long range
        at java.math.BigInteger.longValueExact(BigInteger.java:4531)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:140)
        at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:434)
        at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
        at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:287)
        ... 59 more
```
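
The root cause in the `Caused by` frame above is simply the contract of 
`BigInteger.longValueExact`, which can be checked in isolation. This snippet is 
illustrative only and is not part of the patch:

```scala
import java.math.BigInteger

val tooBig = BigInteger.valueOf(Long.MaxValue).add(BigInteger.ONE) // 2^63

// longValue() truncates silently (here it wraps around to Long.MinValue),
// while longValueExact() refuses and throws ArithmeticException --
// the exact call that Decimal.set hits at Decimal.scala:140.
val wrapped = tooBig.longValue()
val threw =
  try { tooBig.longValueExact(); false }
  catch { case _: ArithmeticException => true }
```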

## How was this patch tested?

Added a new test case to `DecimalSuite`.

Author: Kazuaki Ishizaki <ishiz...@jp.ibm.com>

Closes #17684 from kiszk/SPARK-20341.

(cherry picked from commit a750a595976791cb8a77063f690ea8f82ea75a8f)
Signed-off-by: Wenchen Fan <wenc...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/adaa3f7e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/adaa3f7e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/adaa3f7e

Branch: refs/heads/branch-2.2
Commit: adaa3f7e027338522e8a71ea40b3237d5889a30d
Parents: aaeca8b
Author: Kazuaki Ishizaki <ishiz...@jp.ibm.com>
Authored: Fri Apr 21 22:25:35 2017 +0800
Committer: Wenchen Fan <wenc...@databricks.com>
Committed: Fri Apr 21 22:26:05 2017 +0800

----------------------------------------------------------------------
 .../org/apache/spark/sql/types/Decimal.scala    | 20 ++++++++++++++------
 .../apache/spark/sql/types/DecimalSuite.scala   |  6 ++++++
 2 files changed, 20 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/adaa3f7e/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala
index e8f6884..80916ee 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala
@@ -132,14 +132,22 @@ final class Decimal extends Ordered[Decimal] with Serializable {
   }
 
   /**
-   * Set this Decimal to the given BigInteger value. Will have precision 38 and scale 0.
+   * If the value is not in the range of long, convert it to BigDecimal and
+   * the precision and scale are based on the converted value.
+   *
+   * This code avoids BigDecimal object allocation as possible to improve runtime efficiency
    */
   def set(bigintval: BigInteger): Decimal = {
-    this.decimalVal = null
-    this.longVal = bigintval.longValueExact()
-    this._precision = DecimalType.MAX_PRECISION
-    this._scale = 0
-    this
+    try {
+      this.decimalVal = null
+      this.longVal = bigintval.longValueExact()
+      this._precision = DecimalType.MAX_PRECISION
+      this._scale = 0
+      this
+    } catch {
+      case _: ArithmeticException =>
+        set(BigDecimal(bigintval))
+    }
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/adaa3f7e/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
index 714883a..93c231e 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
@@ -212,4 +212,10 @@ class DecimalSuite extends SparkFunSuite with PrivateMethodTester {
       }
     }
   }
+
+  test("SPARK-20341: support BigInt's value does not fit in long value range") {
+    val bigInt = scala.math.BigInt("9223372036854775808")
+    val decimal = Decimal.apply(bigInt)
+    assert(decimal.toJavaBigDecimal.unscaledValue.toString === "9223372036854775808")
+  }
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
