[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22470
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22470
  
cc @mgaido91 @hvanhovell  @gatorsmile 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22470
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3251/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22470
  
**[Test build #96271 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96271/testReport)**
 for PR 22470 at commit 
[`3c05636`](https://github.com/apache/spark/commit/3c05636b879b678cfcf72dfb515ea691317e470d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22470
  
@cloud-fan Could you please check CSVinferSchema::tryParseDecimal() ? There 
is a condition to check negative scale.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22470
  
**[Test build #96271 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96271/testReport)**
 for PR 22470 at commit 
[`3c05636`](https://github.com/apache/spark/commit/3c05636b879b678cfcf72dfb515ea691317e470d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22470
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22470
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96271/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/22470
  
@cloud-fan I think the main problem about this (and it is the reason why I 
haven't proposed it) is that the range of operations supported would be 
smaller, so we may forbid operations which now can happen. For instance, the 
following code has been always working on Spark: `lit(BigDecimal(1e36)) * 
lit(BigDecimal(1))`. Indeed now this would become a `decimal(6, -36)`. With 
your change, this is going to be a `decimal(42, 0)` which is out of the range 
of supported values (ie. an overflow would occur).

I am not sure if any user has something like this, but it is possible and I 
think we cannot exclude it. We may, though, restrict again the condition when 
it happens, ie. in case we are just parsing from a literal we can avoid 
returning a negative scale. But the other fix would be needed anyway in this 
case, as we could still have to deal with negative scales, so IMO this would be 
quite useless.

I'd agree though about forbidding negative scale in 3.0.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22470
  
@mgaido91 @cloud-fan On the other hand .. some use cases may work better 
:-) , for example
Before 
```
scala> spark.sql("create table dec as select (1e36 * 1) as col1")
org.apache.spark.SparkException: Cannot recognize hive type string: 
decimal(3,-36)
  at 
org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$getSparkSQLDataType(HiveClientImpl.scala:883)
  at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$org$apache$spark$sql$hive$client$HiveClientImpl$$verifyColumnDataType$1.apply(HiveClientImpl.scala:905)

with this pr
```
scala> spark.sql("create table dec as select (1e36 * 1) as col1")
18/09/19 14:29:29 WARN HiveMetaStore: Location: 
file:/user/hive/warehouse/dec specified for non-external table:dec
18/09/19 14:29:30 WARN ObjectStore: Failed to get database global_temp, 
returning NoSuchObjectException
res0: org.apache.spark.sql.DataFrame = []
scala> spark.sql("describe table dec").show
++-+---+
|col_name|data_type|comment|
++-+---+
|col1|decimal(38,0)|   null|
++-+---+
```
Perhaps we may have issues writing out data frames containing decimal with 
negative scale to file based datasources as well. I have not verified though. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/22470
  
@dilipbiswal yes, I definitely think that in general we should get to 
forbid negative scales. I only thought that this should be done in 3.0 rather 
than now. And for now the safest option to me was to update properly the only 
rule which was not working correctly in that case. I also agree in avoiding 
negative scales in places where they were not possible before 2.3, ie. when 
parsing from literals. But as there may be other cases where this happens, this 
solves only part of the problem.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22470
  
@mgaido91 makes sense. Actually @cloud-fan had asked me to write some test 
cases for decimal values with -ve scale in another PR. While i was playing 
around, i found this issue. It seemed to me that the whole -ve scale thingy is 
not thought through properly. Thats the reason i wanted to share my findings 
here while you guys are discussing this :-).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22470: [SPARK-25454][SQL] should not generate negative scale as...

2018-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22470
  
@mgaido91 you are right, this still has behavior changes if the intermedia 
result exceed the max precision. Since most of the storages don't support 
negative scale(hive, parquet, etc.), I think we should eventually forbid it 
too. Let's move the discussion to #22450


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org