[ https://issues.apache.org/jira/browse/SPARK-29854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081694#comment-17081694 ]

Sathyaprakash Govindasamy commented on SPARK-29854:
---------------------------------------------------

In Spark 3.0, you can change this behaviour with the configuration 
*spark.sql.ansi.enabled* ( _When `spark.sql.ansi.enabled` is set to `true`, 
Spark SQL follows the standard in basic behaviours (e.g., arithmetic 
operations, type conversion, SQL functions and SQL parsing)_ ).
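For reference, the flag can be flipped per session from any SQL client (a minimal illustration, not specific to this issue):
{code}
SET spark.sql.ansi.enabled=true;
{code}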

If spark.sql.ansi.enabled=false, which is the default, Spark does not throw 
arithmetic errors such as an overflow exception. For the given SQL query, it 
essentially executes the statement below to cast the input value to int. 
Since the returned value is negative, and passing a negative length to 
lpad/rpad produces an empty string, the query in this issue returns an empty 
string.
{code:scala}
scala> BigDecimal("500000000000000000000000").longValue.toInt
res7: Int = -796917760{code}
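The same silent wrap-around can be reproduced outside Spark. The sketch below (plain Java, class name is mine) performs the equivalent decimal → long → int narrowing that the non-ANSI cast applies:

```java
import java.math.BigDecimal;

public class NarrowingDemo {
    public static void main(String[] args) {
        // Non-ANSI cast behaviour: Decimal -> long -> int, wrapping
        // silently on overflow instead of throwing.
        BigDecimal len = new BigDecimal("500000000000000000000000");
        int narrowed = (int) len.longValue();
        System.out.println(narrowed); // -796917760, matching the Scala result above
        // lpad/rpad then receive this negative length and return an empty string.
    }
}
```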
If you set spark.sql.ansi.enabled=true, it throws the error below.
{code:java}
java.lang.ArithmeticException: Casting 500000000000000000000000 to int causes overflow
  at org.apache.spark.sql.types.Decimal.overflowException(Decimal.scala:254)
  at org.apache.spark.sql.types.Decimal.roundToInt(Decimal.scala:317)
  at org.apache.spark.sql.types.DecimalExactNumeric$.toInt(numerics.scala:183)
  at org.apache.spark.sql.types.DecimalExactNumeric$.toInt(numerics.scala:182)
  at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToInt$13(Cast.scala:518)
  at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToInt$13$adapted(Cast.scala:518)
  at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:808)
  at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:461)
  at org.apache.spark.sql.catalyst.expressions.TernaryExpression.eval(Expression.scala:686)
  at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:457)
  at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:52){code}

> lpad and rpad built in function not throw Exception for invalid len value
> -------------------------------------------------------------------------
>
>                 Key: SPARK-29854
>                 URL: https://issues.apache.org/jira/browse/SPARK-29854
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: ABHISHEK KUMAR GUPTA
>            Priority: Minor
>
> Spark returns an empty string:
> {code}
> 0: jdbc:hive2://10.18.19.208:23040/default> SELECT lpad('hihhhhhhhhhhhhhhhhhhhhhhh', 500000000000000000000000, '????????????');
> +----------------------------------------------------+
> | lpad(hihhhhhhhhhhhhhhhhhhhhhhh, CAST(500000000000000000000000 AS INT), ????????????) |
> +----------------------------------------------------+
> +----------------------------------------------------+
> Hive:
> SELECT lpad('hihhhhhhhhhhhhhhhhhhhhhhh', 500000000000000000000000, '????????????');
> Error: Error while compiling statement: FAILED: SemanticException [Error 10016]: Line 1:67 Argument type mismatch ''????????????'': lpad only takes INT/SHORT/BYTE types as 2-ths argument, got DECIMAL (state=42000,code=10016)
> PostgreSQL:
> function lpad(unknown, numeric, unknown) does not exist
>  
> Expected output:
> Spark should also throw an exception, like Hive does.
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
