Hi everyone, I would like to work on this JIRA ticket: https://issues.apache.org/jira/browse/ARROW-9404 ([C++] Add support for Decimal16, Decimal32 and Decimal64)
This will be my first contribution to Arrow, so I would like to ask for advice on which approach to take. As far as I know, Arrow currently supports only Decimal128, with its basic and primary implementations located in `cpp/src/arrow/util/basic_decimal.h` and `.../util/decimal.h`. In the current implementation, the 128-bit decimal is represented by two 64-bit integers. I see several possible approaches:

1. Turn the current BasicDecimal128 class into a template class BasicDecimal<bit_width>, where the types of the `low` and `high` members depend on the `bit_width` template parameter, and rewrite the method implementations to depend on `bit_width` as well. The resulting classes would each operate on a pair of `bit_width / 2`-bit integers. The disadvantage of this approach is that even when the decimal fits in a single integer, it would still be split into two variables, with the unnecessary extra logic of handling both. On the other hand, all Decimals would be instances of one template class and therefore consistent with each other.

2. Implement new template classes BasicDecimal<bit_width> and Decimal<bit_width> that work only for bit_width <= 64 (where the decimal fits in a single `int##bit_width##_t` variable), and reimplement all the Decimal methods in these new classes. This approach makes it ambiguous what a Decimal is, because Decimal64 and Decimal128 would technically be completely different classes, which could introduce inconsistencies between them.

3. If we have a way to determine the maximum integer bit width, we can apply the following approach.
Define a template class BasicDecimal<bit_width> whose value is represented not by individual integer members but by an array of integers:

```
// Pseudo-code, may be incorrect
template <int width>
class BasicDecimal {
  // Word type: the largest machine integer no wider than `width`.
  using int_type = IntBitWidthTypes<min(width, MAX_INT_WIDTH)>::type;
  // Word count: at least one, more for widths beyond MAX_INT_WIDTH.
  int_type values[max(width / MAX_INT_WIDTH, 1)];
  // All of these can be computed at compile time.
  ...
};

using BasicDecimal128 = BasicDecimal<128>;
using BasicDecimal64 = BasicDecimal<64>;
...
```

As a result, Decimal128 would hold a uint64_t array of 2 elements, Decimal64 a uint64_t array of 1 element, Decimal32 a uint32_t array of 1 element, and so on. This also allows us to define decimals of arbitrary bit width; for example, Decimal256 would be represented as an array of 4 uint64_t elements. The downside of this approach is its complexity: the whole BasicDecimal and Decimal classes would need to be rewritten.

Which of these approaches would be the correct one?

P.S. I've just noticed that I'm not able to assign JIRA tickets to myself. How can I do this?
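To make approach 3 concrete, here is a minimal self-contained C++ sketch of the idea. All names here (BasicDecimalSketch, DecimalWord, low_word) are illustrative assumptions for this thread, not the actual Arrow API:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical sketch: pick the word type and word count at compile time,
// then store the decimal value as a fixed-size array of words.

// Smallest fixed-width unsigned integer that covers `width` bits, capped
// at 64 bits (the assumed maximum machine word width).
template <int width>
using DecimalWord = std::conditional_t<
    (width >= 64), uint64_t,
    std::conditional_t<(width >= 32), uint32_t,
                       std::conditional_t<(width >= 16), uint16_t, uint8_t>>>;

template <int width>
class BasicDecimalSketch {
 public:
  using word_type = DecimalWord<width>;
  // One word suffices up to 64 bits; wider decimals use width / 64 words.
  static constexpr size_t kNumWords = (width <= 64) ? 1 : width / 64;

  constexpr BasicDecimalSketch() : words_{} {}

  // Construct from a small value stored in the least significant word.
  constexpr explicit BasicDecimalSketch(word_type low) : words_{} {
    words_[0] = low;
  }

  constexpr word_type low_word() const { return words_[0]; }

 private:
  std::array<word_type, kNumWords> words_;
};

using BasicDecimal32 = BasicDecimalSketch<32>;
using BasicDecimal64 = BasicDecimalSketch<64>;
using BasicDecimal128 = BasicDecimalSketch<128>;
```

With this layout, BasicDecimal128 holds two uint64_t words, BasicDecimal64 one uint64_t word, and BasicDecimal32 one uint32_t word, so no extra storage or two-variable handling is paid for the narrow widths, while arithmetic can still be written once as loops over the word array.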