[ https://issues.apache.org/jira/browse/PIG-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329483#comment-16329483 ]
Michael Howard commented on PIG-5328: ------------------------------------- Context/background: BigDecimal is difficult to work with. I did not see references in the pig documentation to how BigDecimal issues of precision, scale, and RoundingMode are handled for the pig 'bigdecimal' data type. So I specifically went looking at the Divide code to see how the pig implementation addressed these issues. I observed that the divide operation is performed with a min of 6 digits of scale to the right of the decimal point and RoundingMode.HALF_UP. (Arguably, RoundingMode.HALF_EVEN would have been the preferred choice.) I stumbled on the equalsZero() code and got somewhat nervous when I saw that it used BigDecimal.equals(). The behavior of BigDecimal.equals() is pretty fundamental to the understanding and proper usage of the BigDecimal API. The fact that the equalsZero() code was written this way is a strong indication that the implementor had very little exposure/experience with the BigDecimal package. I am not familiar with the pig code, but I will try to help. > expressionOperator Divide.equalsZero(DataType.BIGDECIMAL) is invalid > -------------------------------------------------------------------- > > Key: PIG-5328 > URL: https://issues.apache.org/jira/browse/PIG-5328 > Project: Pig > Issue Type: Bug > Components: impl > Affects Versions: 0.16.0, 0.17.0 > Environment: pig source HEAD as of Jan 2018 ... probably goes all the > way back to initial implementation of BigDecimal support > Reporter: Michael Howard > Priority: Major > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > Divide.equalsZero(DataType.BIGDECIMAL) is flawed in that it uses an invalid > test for == ZERO in the case of BigDecimal. > > ./physicalLayer/expressionOperators/Divide.java tests the divisor for zero in > order to avoid DivideByZero. > The test is performed using a method equalsZero(...) > Divide.equalsZero() is given 'protected' access, but I could not find other > references ... should be 'private' > equalsZero() implementation dispatches on dataType to type-specific > predicates ... the BigDecimal implementation is incorrect > The method BigDecimal.equals(other) is intended to be used for object > equality, not numerical equality. (Their justification is that equals() is > used in hash-table lookups in java Collections.) BigDecimal numbers are not > normalized and scale is an important attribute. Scale is included in > BigDecimal.equals(). The values "0" and "0.00" have different scales and are > not considered "equals()" > Comparisons for numeric equality need to be done using compareTo() > In the special case of comparing to zero, BigDecimal.signum() is the best. > The current code is > {{ case DataType.BIGDECIMAL:}} > {{ return BigDecimal.ZERO.equals((BigDecimal) a);}} > needs to be changed to > {{ case DataType.BIGDECIMAL:}} > {{ return ((BigDecimal) a).signum() == 0;}} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)