Dave Oshinsky created DRILL-4834:
------------------------------------

             Summary: decimal implementation is vulnerable to overflow errors, 
and extremely complex
                 Key: DRILL-4834
                 URL: https://issues.apache.org/jira/browse/DRILL-4834
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Data Types
    Affects Versions: 1.6.0
         Environment: Drill 1.7 on any platform
            Reporter: Dave Oshinsky
             Fix For: Future


While working on a fix for DRILL-4704, logic was added to CastIntDecimal.java 
template to handle the situation where a precision is not supplied (i.e., the 
supplied precision is zero) for an integer value that is to be casted to a 
decimal.  The Drill decimal implementation uses a limited selection of fixed 
decimal precision data types (the total number of decimal digits, i.e., 
Decimal9, 18, 28, 38) to represent decimal values.  If the destination 
precision is too small to represent the input integer that is being casted, 
there is no clean way to deal with the overflow error properly.

While using fixed decimal precisions as is being done currently can lead to 
more efficient use of memory, it often will actually lead to less efficient use 
of memory (when the fixed precision is specified significantly larger than is 
actually needed to represent the numbers), and it results in a tremendous 
mushrooming of the complexity of the code.  For each fixed precision (and there 
are only a limited set of selections, 9, 18, 28, 38, which itself leads to 
memory inefficiency), there is a separate set of code generated from templates. 
 For each pairwise combination of decimal or non-decimal numeric types, there 
are multiple places in the code where conversions must be handled, or 
conditions must be included to handle the difference in precision between the 
two types.  A one-size-fits-all approach (using a variable width vector to 
represent any decimal precision) would usually be more memory-efficient (since 
precisions are often over-specified), and would greatly simplify the code.

Also see the DRILL-4184 issue, which is related.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to