Hi Wenchen,

Thank you for the clarification. I agree that this is more a bug fix than an improvement; I apologize for the error. Please consider this a design doc instead.
Thanks,
Marco

On Fri, Sep 21, 2018 at 12:04 Wenchen Fan <cloud0...@gmail.com> wrote:

> Hi Marco,
>
> Thanks for sending it! The problem is clearly explained in this email, but
> I would not treat it as a SPIP. It proposes a fix for a very tricky bug,
> and SPIP is usually for new features. Others please correct me if I am
> wrong.
>
> Thanks,
> Wenchen
>
> On Fri, Sep 21, 2018 at 5:47 PM Marco Gaido <marcogaid...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am writing this e-mail to discuss the issue reported in SPARK-25454;
>> following Wenchen's suggestion, I have prepared a design doc for it.
>>
>> The problem we are facing is that our rules for decimal operations are
>> taken from Hive and MS SQL Server, which explicitly do not support
>> decimals with negative scales. So our current rules are not meant to
>> deal with negative scales. The issue is that Spark, instead, does not
>> forbid negative scales, and indeed there are cases in which we produce
>> them (e.g. a SQL constant like 1e8 is turned into a decimal(1, -8)).
>>
>> Having negative scales most likely was never intended. Unfortunately,
>> getting rid of them would be a breaking change, as many operations that
>> work fine today would no longer be allowed and would overflow (e.g.
>> select 1e36 * 10000). As such, this is something I'd definitely agree
>> on doing, but I think we can target it only for 3.0.
>>
>> What we can start doing now, instead, is updating our rules so that
>> they also handle the case when decimal scales are negative. From my
>> investigation, it turns out that the only operation which has problems
>> with them is Divide.
>>
>> Here you can find the design doc with all the details:
>> https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit?usp=sharing.
>> The document is also linked in SPARK-25454. There is also already a PR
>> with the change: https://github.com/apache/spark/pull/22450.
>>
>> Looking forward to hearing your feedback.
>> Thanks,
>> Marco
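[Editor's note: Spark's Decimal type is backed by java.math.BigDecimal, so the two negative-scale behaviours discussed in the thread can be sketched outside Spark with plain BigDecimal. This is an illustrative sketch, not Spark's actual literal-parsing code (which lives in its analyzer); the class name NegativeScaleDemo is invented, and the maximum decimal precision of 38 matches Spark's documented DecimalType limit.]

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class NegativeScaleDemo {
    public static void main(String[] args) {
        // A constant like 1e8 parses to unscaled value 1 with scale -8,
        // i.e. decimal(1, -8) in precision/scale terms.
        BigDecimal c = new BigDecimal("1e8");
        System.out.println(c.precision() + ", " + c.scale());  // prints "1, -8"

        // While negative scales are allowed, 1e36 * 10000 stays compact:
        // unscaled value 10000 with scale -36, precision 5.
        BigDecimal product = new BigDecimal("1e36")
                .multiply(new BigDecimal("10000"));
        System.out.println(product.precision() + ", " + product.scale());  // prints "5, -36"

        // Forcing a non-negative scale expands the same value (1e40) to
        // 41 significant digits, beyond Spark's maximum precision of 38:
        // this is why banning negative scales would make such queries overflow.
        BigDecimal rescaled = product.setScale(0, RoundingMode.UNNECESSARY);
        System.out.println(rescaled.precision());  // prints "41"
    }
}
```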