[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user wangyum closed the pull request at: https://github.com/apache/spark/pull/22419 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r224318028 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,27 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(child: Expression, scale: Expression) --- End diff -- In that case, its ok to handle the string as date. How about only accepting float, double, and decimal for number truncation? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r222554828 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,27 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(child: Expression, scale: Expression) --- End diff -- For example: `trunc(StringType)`, We don't know we should trunc to number or trunc to date. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r222539741 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,27 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(child: Expression, scale: Expression) --- End diff -- Hi @wangyum, can you still extend `trunc`? If not, what was the major reason you decided to separate these? Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r220973430 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,27 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(child: Expression, scale: Expression) --- End diff -- I am still preferring to extend `trunc`. Not straightforward to know the difference between `truncate` and `trunc` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r219858062 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2214,6 +2214,26 @@ object functions { */ def radians(columnName: String): Column = radians(Column(columnName)) + /** + * Returns the value of the column `e` truncated to 0 places. + * + * @group math_funcs + * @since 2.4.0 + */ + def truncate(e: Column): Column = truncate(e, 0) + + /** + * Returns the value of column `e` truncated to the unit specified by the scale. + * If scale is omitted, then the value of column `e` is truncated to 0 places. + * Scale can be negative to truncate (make zero) scale digits left of the decimal point. + * + * @group math_funcs + * @since 2.4.0 --- End diff -- ditto --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r219858027 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2214,6 +2214,26 @@ object functions { */ def radians(columnName: String): Column = radians(Column(columnName)) + /** + * Returns the value of the column `e` truncated to 0 places. + * + * @group math_funcs + * @since 2.4.0 --- End diff -- `@since 2.5.0` now. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r219660644 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala --- @@ -413,6 +413,7 @@ object Decimal { val ROUND_HALF_EVEN = BigDecimal.RoundingMode.HALF_EVEN val ROUND_CEILING = BigDecimal.RoundingMode.CEILING val ROUND_FLOOR = BigDecimal.RoundingMode.FLOOR + val ROUND_DOWN = BigDecimal.RoundingMode.DOWN --- End diff -- I added it and seem do not need change `longVal`. This test case can passed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r219655441 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala --- @@ -413,6 +413,7 @@ object Decimal { val ROUND_HALF_EVEN = BigDecimal.RoundingMode.HALF_EVEN val ROUND_CEILING = BigDecimal.RoundingMode.CEILING val ROUND_FLOOR = BigDecimal.RoundingMode.FLOOR + val ROUND_DOWN = BigDecimal.RoundingMode.DOWN --- End diff -- I guess we need to modify `Decimal.changePrecision()` as well? Could you add it to [DecimalSuite.scala#L207](https://github.com/apache/spark/blob/479b31fa046e8402f4f93cdbad5fe93ef1ea570f/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala#L207) to check? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r219651627 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala --- @@ -413,6 +413,7 @@ object Decimal { val ROUND_HALF_EVEN = BigDecimal.RoundingMode.HALF_EVEN val ROUND_CEILING = BigDecimal.RoundingMode.CEILING val ROUND_FLOOR = BigDecimal.RoundingMode.FLOOR + val ROUND_DOWN = BigDecimal.RoundingMode.DOWN --- End diff -- Need this change, otherwise: ``` Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 138: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 41, Column 138: A method named "ROUND_DOWN" is not declared in any enclosing class nor any supertype, nor through a static import ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218749189 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,80 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(number: Expression, scale: Expression) + extends BinaryExpression with ImplicitCastInputTypes { + + def this(number: Expression) = this(number, Literal(0)) + + override def left: Expression = number + override def right: Expression = scale + + override def inputTypes: Seq[AbstractDataType] = +Seq(TypeCollection(DoubleType, FloatType, DecimalType), IntegerType) + + override def checkInputDataTypes(): TypeCheckResult = { +super.checkInputDataTypes() match { + case TypeCheckSuccess => +if (scale.foldable) { --- End diff -- Same to `RoundBase`. only support foldable: https://github.com/apache/spark/blob/c7156943a2a32ba57e67aa6d8fa7035a09847e07/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala#L1076 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218535402 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2214,6 +2214,24 @@ object functions { */ def radians(columnName: String): Column = radians(Column(columnName)) + /** + * Returns number truncated to the unit specified by the scale. + * + * For example, `truncate(1234567891.1234567891, 4)` returns 1234567891.1234 + * + * @param number The number to be truncated + * @param scale: A scale used to truncate number + * + * @return The number truncated to scale decimal places. + * If scale is omitted, then number is truncated to 0 places. + * scale can be negative to truncate (make zero) scale digits left of the decimal point. + * @group math_funcs + * @since 2.4.0 + */ + def truncate(number: Column, scale: Int): Column = withExpr { +Truncate(number.expr, Literal(scale)) + } --- End diff -- We need `def truncate(number: Column)` to support omitting scale? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218534147 --- Diff: sql/core/src/test/resources/sql-tests/inputs/operators.sql --- @@ -96,3 +96,9 @@ select positive('-1.11'), positive(-1.11), negative('-1.11'), negative(-1.11); -- pmod select pmod(-7, 2), pmod(0, 2), pmod(7, 0), pmod(7, null), pmod(null, 2), pmod(null, null); select pmod(cast(3.13 as decimal), cast(0 as decimal)), pmod(cast(2 as smallint), cast(0 as smallint)); + +-- truncate +select truncate(1234567891.1234567891, -4), truncate(1234567891.1234567891, 0), truncate(1234567891.1234567891, 4); +select truncate(cast(1234567891.1234567891 as decimal), -4), truncate(cast(1234567891.1234567891 as decimal), 0), truncate(cast(1234567891.1234567891 as decimal), 4); +select truncate(cast(1234567891.1234567891 as long), -4), truncate(cast(1234567891.1234567891 as long), 0), truncate(cast(1234567891.1234567891 as long), 4); +select truncate(cast(1234567891.1234567891 as long), 9.03) --- End diff -- Could you add a test omitting scale? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218524063 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,65 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(number: Expression, scale: Expression) + extends BinaryExpression with ImplicitCastInputTypes { + + override def left: Expression = number + override def right: Expression = scale + + override def inputTypes: Seq[AbstractDataType] = +Seq(TypeCollection(DoubleType, DecimalType), IntegerType) --- End diff -- Don't we need to support `FloatType`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218525420 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathExpressionsSuite.scala --- @@ -644,4 +644,31 @@ class MathExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { checkEvaluation(BRound(-0.35, 1), -0.4) checkEvaluation(BRound(-35, -1), -40) } + + test("Truncate number") { +def testTruncate(input: Double, fmt: Int, expected: Double): Unit = { + checkEvaluation(Truncate(Literal.create(input, DoubleType), +Literal.create(fmt, IntegerType)), +expected) + checkEvaluation(Truncate(Literal.create(input, DoubleType), +NonFoldableLiteral.create(fmt, IntegerType)), +expected) +} + +testTruncate(1234567891.1234567891, 4, 1234567891.1234) +testTruncate(1234567891.1234567891, -4, 123456) +testTruncate(1234567891.1234567891, 0, 1234567891) +testTruncate(0.123, -1, 0) +testTruncate(0.123, 0, 0) + +checkEvaluation(Truncate(Literal.create(1D, DoubleType), + NonFoldableLiteral.create(null, IntegerType)), + null) --- End diff -- Why only `NonFoldableLiteral`? What if `Literal.create(null, IntegerType)` for scale? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218531021 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala --- @@ -1245,3 +1245,65 @@ case class BRound(child: Expression, scale: Expression) with Serializable with ImplicitCastInputTypes { def this(child: Expression) = this(child, Literal(0)) } + +/** + * The number truncated to scale decimal places. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(number, scale) - Returns number truncated to scale decimal places. " + +"If scale is omitted, then number is truncated to 0 places. " + +"scale can be negative to truncate (make zero) scale digits left of the decimal point.", + examples = """ +Examples: + > SELECT _FUNC_(1234567891.1234567891, 4); + 1234567891.1234 + > SELECT _FUNC_(1234567891.1234567891, -4); + 123456 + > SELECT _FUNC_(1234567891.1234567891); + 1234567891 + """) +// scalastyle:on line.size.limit +case class Truncate(number: Expression, scale: Expression) + extends BinaryExpression with ImplicitCastInputTypes { + + override def left: Expression = number + override def right: Expression = scale + + override def inputTypes: Seq[AbstractDataType] = +Seq(TypeCollection(DoubleType, DecimalType), IntegerType) + + override def dataType: DataType = left.dataType + + private lazy val foldableTruncScale: Int = scale.eval().asInstanceOf[Int] + + protected override def nullSafeEval(input1: Any, input2: Any): Any = { +val truncScale = if (scale.foldable) { + foldableTruncScale +} else { + scale.eval().asInstanceOf[Int] +} +number.dataType match { + case DoubleType => MathUtils.trunc(input1.asInstanceOf[Double], truncScale) + case DecimalType.Fixed(_, _) => +MathUtils.trunc(input1.asInstanceOf[Decimal].toJavaBigDecimal, truncScale) --- End diff -- I guess we have to return `Decimal` instead of `java.math.BigDecimal`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(numb...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22419#discussion_r218525567 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathExpressionsSuite.scala --- @@ -644,4 +644,31 @@ class MathExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { checkEvaluation(BRound(-0.35, 1), -0.4) checkEvaluation(BRound(-35, -1), -40) } + + test("Truncate number") { +def testTruncate(input: Double, fmt: Int, expected: Double): Unit = { + checkEvaluation(Truncate(Literal.create(input, DoubleType), +Literal.create(fmt, IntegerType)), +expected) + checkEvaluation(Truncate(Literal.create(input, DoubleType), +NonFoldableLiteral.create(fmt, IntegerType)), +expected) +} + +testTruncate(1234567891.1234567891, 4, 1234567891.1234) +testTruncate(1234567891.1234567891, -4, 123456) +testTruncate(1234567891.1234567891, 0, 1234567891) +testTruncate(0.123, -1, 0) +testTruncate(0.123, 0, 0) + +checkEvaluation(Truncate(Literal.create(1D, DoubleType), + NonFoldableLiteral.create(null, IntegerType)), + null) +checkEvaluation(Truncate(Literal.create(null, DoubleType), + NonFoldableLiteral.create(1, IntegerType)), + null) +checkEvaluation(Truncate(Literal.create(null, DoubleType), + NonFoldableLiteral.create(null, IntegerType)), + null) --- End diff -- Could you add tests for `DecimalType`, and `FloatType` if we need to support? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org