It looks like the "scale must be less than precision" rule comes from
Hive requirements[1] (although while searching, this is called into
question elsewhere in Hive[2]). From the design document, the
requirement was specifically to avoid a variable (per-row) scale:

> For instance, applications (particularly native applications) such as SAS
> which need to pre-allocate memory require fixed types to do so efficiently.

I believe that if we were, for example, to write a file with a
negative scale using Avro 1.10, a reader on an older version
_should_ just fall back to bytes, which seems fair enough. I would
consider it a bug if the reader simply failed on an "out-of-bounds"
scale!
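
To illustrate the fallback I have in mind, here is a rough, untested
sketch against the Java API; the bytes/decimal schema and the
precision/scale values are made up for the example:

    import org.apache.avro.LogicalType;
    import org.apache.avro.LogicalTypes;
    import org.apache.avro.Schema;

    public class NegativeScaleFallback {
      public static void main(String[] args) {
        // A schema a newer writer might produce if the spec allowed a negative scale.
        String json = "{\"type\": \"bytes\", \"logicalType\": \"decimal\","
            + " \"precision\": 5, \"scale\": -2}";
        Schema schema = new Schema.Parser().parse(json);

        // A reader that considers the scale out of bounds should drop the
        // logical type and treat the data as plain bytes, not fail.
        LogicalType logicalType = LogicalTypes.fromSchemaIgnoreInvalid(schema);
        System.out.println(logicalType);      // null -> no decimal conversion applied
        System.out.println(schema.getType()); // BYTES, the underlying type
      }
    }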

Any thoughts on what Hive (as an example) would require if we were to
relax this constraint in the spec?
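
For reference, java.math.BigDecimal (which Chris mentions below)
already treats scale as independent of precision; a quick sanity
check of what a negative scale means there:

    import java.math.BigDecimal;
    import java.math.BigInteger;

    public class NegativeScaleExample {
      public static void main(String[] args) {
        // value = unscaledValue * 10^(-scale), so a negative scale just means
        // trailing zeros before an implicit decimal point.
        BigDecimal d = new BigDecimal(BigInteger.valueOf(7), -3);
        System.out.println(d);             // 7E+3 (i.e. 7000)
        System.out.println(d.precision()); // 1
        System.out.println(d.scale());     // -3
      }
    }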

Ryan

[1]: https://issues.apache.org/jira/browse/HIVE-3976
[2]: https://github.com/apache/hive/blob/94dca16e4eb3caf7dcaa43ae92807e5750e1ff04/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java#L54

On Mon, Mar 2, 2020 at 9:53 PM Zoltan Farkas
<zolyfar...@yahoo.com.invalid> wrote:
>
> +dev: adding the dev mailing list, maybe somebody there can explain the
> reasoning.
>
> When comparing SQL Server with Oracle and Postgres:
>
> https://docs.microsoft.com/en-us/sql/t-sql/data-types/decimal-and-numeric-transact-sql?view=sql-server-ver15
>
> https://docs.oracle.com/cd/A84870_01/doc/server.816/a76965/c10datyp.htm#743
> https://www.postgresql.org/docs/9.1/datatype-numeric.html
>
>
> Oracle allows for a negative scale; SQL Server and Postgres don't.
> My biggest issue with the current decimal spec is that it does not encode the
> scale per value (it uses the scale defined in the schema), so it cannot
> accommodate an Oracle NUMBER or Postgres NUMERIC without scale coercion.
>
> There are other differences (like NaN, …).
>
> But there is no reason why a decimal2 logical type could not be created to
> address the above…
>
> Or, even better, promote decimal to a first-class type:
> https://issues.apache.org/jira/browse/AVRO-2164
>
>
> —Z
>
> > On Mar 2, 2020, at 2:34 PM, Christopher Egerton <chr...@confluent.io> wrote:
> >
> > Hi all,
> >
> > I've been trying to do some research on the logical decimal type and why 
> > the scale of a decimal type must be between zero and the precision of the 
> > type, inclusive. The ticket https://issues.apache.org/jira/browse/AVRO-1402
> > has a lot of discussion around the design of the type, but I haven't been
> > able to find any rationale for the limitations on the scale of the type.
> >
> > These don't appear to align with existing conventions for precision and 
> > scale in the context of SQL numeric types, the JDBC API, and the Java 
> > standard library's BigDecimal class. In these contexts, the precision must 
> > be a positive number, but the scale can be any value--positive 
> > (representing the number of digits of precision that are available after 
> > the decimal point), negative (representing the number of trailing zeroes at 
> > the end of the number before an implicit decimal point), or zero. It is not 
> > bounded by the precision of the type.
> >
> > The definitions for scale and precision appear to align across these 
> > contexts, including the Avro spec, so I'm curious as to why the Avro 
> > spec--seemingly an anomaly--is the only one to declare these limitations on 
> > what the scale of a decimal type can be.
> >
> > Does anyone know why these exist, and if not, would it be okay to file a 
> > ticket to remove them from the spec and begin work on it?
> >
> > Cheers,
> >
> > Chris
>
