Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20023#discussion_r158242030

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala ---
@@ -136,10 +137,54 @@ object DecimalType extends AbstractDataType {
     case DoubleType => DoubleDecimal
   }

+  private[sql] def forLiteral(literal: Literal): DecimalType = literal.value match {
+    case v: Short => fromBigDecimal(BigDecimal(v))
+    case v: Int => fromBigDecimal(BigDecimal(v))
+    case v: Long => fromBigDecimal(BigDecimal(v))
+    case _ => forType(literal.dataType)
+  }
+
+  private[sql] def fromBigDecimal(d: BigDecimal): DecimalType = {
+    DecimalType(Math.max(d.precision, d.scale), d.scale)
+  }
+
   private[sql] def bounded(precision: Int, scale: Int): DecimalType = {
     DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
   }

+  // scalastyle:off line.size.limit
+  /**
+   * Decimal implementation is based on Hive's one, which is itself inspired to SQLServer's one.
+   * In particular, when a result precision is greater than {@link #MAX_PRECISION}, the
+   * corresponding scale is reduced to prevent the integral part of a result from being truncated.
+   *
+   * For further reference, please see
+   * https://blogs.msdn.microsoft.com/sqlprogrammability/2006/03/29/multiplication-and-division-with-numerics/.
+   *
+   * @param precision
+   * @param scale
+   * @return
+   */
+  // scalastyle:on line.size.limit
+  private[sql] def adjustPrecisionScale(precision: Int, scale: Int): DecimalType = {
+    // Assumptions:
+    //   precision >= scale
+    //   scale >= 0
+    if (precision <= MAX_PRECISION) {
+      // Adjustment only needed when we exceed max precision
+      DecimalType(precision, scale)
+    } else {
+      // Precision/scale exceed maximum precision. Result must be adjusted to MAX_PRECISION.
+      val intDigits = precision - scale
+      // If original scale is less than MINIMUM_ADJUSTED_SCALE, use original scale value;
+      // otherwise preserve at least MINIMUM_ADJUSTED_SCALE fractional digits
+      val minScaleValue = Math.min(scale, MINIMUM_ADJUSTED_SCALE)
--- End diff --

Yes, sorry, my answer was very poor; I will rephrase. `scale` contains the scale we need in order to represent the values without any precision loss. What we are doing here is saying that the lower bound for the scale is either the scale needed to correctly represent the value or `MINIMUM_ADJUSTED_SCALE`, whichever is smaller. Then, on the line below, we state that the scale we will actually use is the max between the fractional digits left over after the integral part and this `minScaleValue`. In other words, even when the number of digits needed on the left of the dot would force us to a scale lower than `MINIMUM_ADJUSTED_SCALE`, we enforce that we maintain at least `MINIMUM_ADJUSTED_SCALE`. We can't let the scale drop below this threshold, even though doing so would be needed to guarantee that we don't lose digits on the left of the dot. Please refer also to the blog post I linked in the comment above for further (hopefully better) explanation.
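To make the adjustment concrete, here is a minimal standalone sketch of the logic discussed above. The constants mirror Spark's `DecimalType` (`MAX_PRECISION = 38`, `MINIMUM_ADJUSTED_SCALE = 6`); returning a `(precision, scale)` tuple instead of a `DecimalType` is just for illustration:

```scala
object AdjustPrecisionScaleSketch {
  // Constants as defined in Spark's DecimalType
  val MAX_PRECISION = 38
  val MINIMUM_ADJUSTED_SCALE = 6

  /** Sketch of adjustPrecisionScale; returns the adjusted (precision, scale). */
  def adjustPrecisionScale(precision: Int, scale: Int): (Int, Int) = {
    require(precision >= scale && scale >= 0)
    if (precision <= MAX_PRECISION) {
      // No adjustment needed: the result fits within the maximum precision
      (precision, scale)
    } else {
      // Digits needed on the left of the dot
      val intDigits = precision - scale
      // Lower bound for the scale: the scale needed to represent the value
      // or MINIMUM_ADJUSTED_SCALE, whichever is smaller
      val minScaleValue = math.min(scale, MINIMUM_ADJUSTED_SCALE)
      // Preserve the integral digits where possible, but never let the scale
      // drop below minScaleValue
      val adjustedScale = math.max(MAX_PRECISION - intDigits, minScaleValue)
      (MAX_PRECISION, adjustedScale)
    }
  }

  def main(args: Array[String]): Unit = {
    // A result needing (49, 12) has 37 integral digits; keeping them all would
    // leave scale 1, but MINIMUM_ADJUSTED_SCALE wins, so we get (38, 6)
    println(adjustPrecisionScale(49, 12))
    // Only 5 integral digits needed, so the scale shrinks just enough: (38, 33)
    println(adjustPrecisionScale(45, 40))
  }
}
```

Note that in the first case the scale floor of 6 means only 32 integral digits survive, which is exactly the trade-off described above: we accept possible loss on the left of the dot rather than let the scale fall below the threshold.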