spark git commit: [SPARK-20341][SQL] Support BigInt's value that does not fit in long value range

2017-04-21 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master c9e6035e1 -> a750a5959


[SPARK-20341][SQL] Support BigInt's value that does not fit in long value range

## What changes were proposed in this pull request?

This PR avoids an exception when a `scala.math.BigInt` holds a value that does 
not fit into the long value range (e.g. `Long.MAX_VALUE + 1`). Running the 
following code with current Spark throws the exception shown below.

This PR keeps the value as a `BigDecimal` when such an overflow case is 
detected by catching `ArithmeticException`.

Sample program:
```
case class BigIntWrapper(value: scala.math.BigInt)
spark.createDataset(BigIntWrapper(scala.math.BigInt("1002"))::Nil).show
```
Exception:
```
Error while encoding: java.lang.ArithmeticException: BigInteger out of long range
staticinvoke(class org.apache.spark.sql.types.Decimal$, DecimalType(38,0), apply, assertnotnull(assertnotnull(input[0, org.apache.spark.sql.BigIntWrapper, true])).value, true) AS value#0
java.lang.RuntimeException: Error while encoding: java.lang.ArithmeticException: BigInteger out of long range
staticinvoke(class org.apache.spark.sql.types.Decimal$, DecimalType(38,0), apply, assertnotnull(assertnotnull(input[0, org.apache.spark.sql.BigIntWrapper, true])).value, true) AS value#0
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:290)
at org.apache.spark.sql.SparkSession$$anonfun$2.apply(SparkSession.scala:454)
at org.apache.spark.sql.SparkSession$$anonfun$2.apply(SparkSession.scala:454)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:454)
at org.apache.spark.sql.Agg$$anonfun$18.apply$mcV$sp(MySuite.scala:192)
at org.apache.spark.sql.Agg$$anonfun$18.apply(MySuite.scala:192)
at org.apache.spark.sql.Agg$$anonfun$18.apply(MySuite.scala:192)
at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
...
Caused by: java.lang.ArithmeticException: BigInteger out of long range
at java.math.BigInteger.longValueExact(BigInteger.java:4531)
at org.apache.spark.sql.types.Decimal.set(Decimal.scala:140)
at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:434)
at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:287)
... 59 more
```
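The fallback described above can be sketched as follows: attempt the exact 
long conversion first, and keep the full-precision `BigDecimal` representation 
only when `ArithmeticException` signals overflow. This is a minimal standalone 
illustration of the pattern, with a hypothetical helper name; it is not the 
actual `Decimal.set` code.

```scala
import java.math.{BigDecimal => JavaBigDecimal, BigInteger}

// Hypothetical helper illustrating the overflow-fallback pattern:
// Left(long) for values that fit in a Long, Right(BigDecimal) otherwise.
def fromBigInteger(v: BigInteger): Either[Long, JavaBigDecimal] =
  try {
    // Fast path: longValueExact throws ArithmeticException on overflow
    Left(v.longValueExact())
  } catch {
    case _: ArithmeticException =>
      // Slow path: keep full precision in a BigDecimal
      Right(new JavaBigDecimal(v))
  }
```

With this pattern, `Long.MAX_VALUE + 1` takes the `BigDecimal` branch instead 
of propagating the exception through the encoder.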

## How was this patch tested?

Added new test cases to `DecimalSuite`.

Author: Kazuaki Ishizaki 

Closes #17684 from kiszk/SPARK-20341.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a750a595
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a750a595
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a750a595

Branch: refs/heads/master
Commit: a750a595976791cb8a77063f690ea8f82ea75a8f
Parents: c9e6035
Author: Kazuaki Ishizaki 
Authored: Fri Apr 21 22:25:35 2017 +0800
Committer: Wenchen Fan 
Committed: Fri Apr 21 22:25:35 2017 +0800

--
 .../org/apache/spark/sql/types/Decimal.scala| 20 ++--
 .../apache/spark/sql/types/DecimalSuite.scala   |  6 ++
 2 files changed, 20 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a750a595/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala
--
diff --git a/sql/

spark git commit: [SPARK-20341][SQL] Support BigInt's value that does not fit in long value range

2017-04-21 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 aaeca8bdd -> adaa3f7e0



(cherry picked from commit a750a595976791cb8a77063f690ea8f82ea75a8f)
Signed-off-by: Wenchen Fan 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/adaa3f7e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/adaa3f7e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/adaa3f7e

Branch: refs/heads/branch-2.2
Commit: adaa3f7e027338522e8a71ea40b3237d5889a30d
Parents: aaeca8b
Author: Kazuaki Ishizaki 
Authored: Fri Apr 21 22:25:35 2017 +0800
Committer: Wenchen Fan 
Committed: Fri Apr 21 22:26:05 2017 +0800

--
 .../org/apache/spark/sql/types/Decimal.scala| 20 ++--
 .../apache/spark/sql/types/DecimalSuite.scala   |  6 ++
 2 files changed, 20 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/adaa3f7e/sql/catalyst/src/main/scala/org/apache/spark/sql/typ