Hi Marco,

Thanks for sending it! The problem is clearly explained in this email, but I would not treat it as a SPIP. It proposes a fix for a very tricky bug, and SPIP is usually for new features. Others, please correct me if I'm wrong.
Thanks,
Wenchen

On Fri, Sep 21, 2018 at 5:47 PM Marco Gaido <marcogaid...@gmail.com> wrote:

> Hi all,
>
> I am writing this e-mail in order to discuss the issue reported in
> SPARK-25454; following Wenchen's suggestion, I prepared a design doc
> for it.
>
> The problem we are facing is that our rules for decimal operations are
> taken from Hive and MS SQL Server, which explicitly don't support
> decimals with negative scales, so the rules we currently have were
> never meant to deal with negative scales. The issue is that Spark,
> instead, doesn't forbid negative scales and there are indeed cases in
> which we produce them (e.g. a SQL constant like 1e8 is turned into a
> decimal(1, -8)).
>
> Having negative scales most likely was never really intended.
> Unfortunately, getting rid of them would be a breaking change, as many
> operations that currently work fine would no longer be allowed and
> would overflow (e.g. select 1e36 * 10000). As such, this is something
> I'd definitely agree with doing, but I think we can only target it for
> 3.0.
>
> What we can start doing now, instead, is updating our rules so that
> they also handle negative decimal scales properly. From my
> investigation, it turns out that the only operation which has problems
> with them is Divide.
>
> Here you can find the design doc with all the details:
> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit?usp=sharing.
> The document is also linked in SPARK-25454. There is also already a PR
> with the change: https://github.com/apache/spark/pull/22450.
>
> Looking forward to hearing your feedback.
> Thanks,
> Marco
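
[Editor's note] To make the literal-typing and overflow points above concrete, here is a minimal spark-shell sketch. It assumes the standard `spark` SparkSession of a 2.x-era shell, and the types in the comments are the ones this thread describes, so treat them as illustrative rather than authoritative:

    // An exponent literal is parsed as a decimal: one digit of precision, scale -8.
    spark.sql("SELECT 1e8").schema.head.dataType
    // expected under the rules discussed in this thread: decimal(1,-8)

    // This works today: decimal(1,-36) * decimal(4,0) needs only 1 + 4 + 1 = 6
    // digits of precision, with scale -36. Forbidding negative scales would force
    // 1e36 into decimal(37,0), and 37 + 4 + 1 = 42 digits overflows the 38-digit cap.
    spark.sql("SELECT 1e36 * 10000").show()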
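[Editor's note] The Divide issue can be seen in the Hive-style result-type rule that Spark's DecimalPrecision applies to e1: decimal(p1, s1) / e2: decimal(p2, s2). Below is a standalone Scala sketch of that arithmetic; the helper function is mine, written only to make the formula visible:

    // Result type of decimal(p1,s1) / decimal(p2,s2) under the Hive-derived rule:
    //   scale     = max(6, s1 + p2 + 1)
    //   precision = p1 - s1 + s2 + scale
    def divideResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
      val scale = math.max(6, s1 + p2 + 1)
      val precision = p1 - s1 + s2 + scale
      (precision, scale)
    }

    divideResultType(10, 2, 10, 2)  // (23, 13): reasonable when scales are >= 0
    divideResultType(1, -8, 1, -8)  // (7, 6): the formula still yields a type, but
                                    // it was derived assuming non-negative scales,
                                    // which is why Divide is the one operation that
                                    // needs the special handling the design doc
                                    // proposes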