On Thu, Sep 03, 2009 at 10:24:17AM -0400, Tom Lane wrote:
> We have seen several previous reports of regression test failures
> due to division by zero causing SIGFPE, even though the code should
> never reach the division command:
> 
> http://archives.postgresql.org/pgsql-bugs/2006-11/msg00180.php
> http://archives.postgresql.org/pgsql-bugs/2007-11/msg00032.php
> http://archives.postgresql.org/pgsql-bugs/2008-05/msg00148.php
> http://archives.postgresql.org/pgsql-general/2009-05/msg00774.php
> 
> It's always been on non-mainstream architectures so it was hard to
> investigate.  But I have finally been able to reproduce this:
> https://bugzilla.redhat.com/show_bug.cgi?id=520916
> 
> While s390x is still not quite mainstream, at least I can get
> access to one ;-).

Do you also have access to z/OS with Unix System Services?  IBM's
compiler, c89, is amazingly strict, and should help us flush out bugs. :)

> What turns out to be the case is that
> "simple" test cases like
>       if (y == 0)
>               single_function_call(...);
>       z = x / y;
> do not show the problem; you need something pretty complex in the
> if-command.  Like, say, an ereport() construct.  So that's why the gcc
> boys haven't already had visits from mobs of villagers about this.
> 
> I hope that the bug will get fixed in due course, but even if they
> respond pretty quickly it will be years before the problem disappears
> from every copy of gcc in the field.  So I'm thinking that it would
> behoove us to install a workaround, now that we've characterized the
> problem sufficiently.  What I am thinking is that in the three
> functions known to exhibit the bug (int24div, int28div, int48div)
> we should do something like this:
> 
> 
>       if (arg2 == 0)
> +     {
>               ereport(ERROR,
>                               (errcode(ERRCODE_DIVISION_BY_ZERO),
>                                errmsg("division by zero")));
> +             /* ensure compiler realizes we don't reach the division */
> +             PG_RETURN_NULL();
> +     }
>       /* No overflow is possible */
>       PG_RETURN_INT64((int64) arg1 / arg2);
> 
> Thoughts?

How big would this change be?  How would people know to use that
construct everywhere it's appropriate?

Cheers,
David.
-- 
David Fetter <da...@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to