I'm reviving this discussion (which last saw traffic in Jan 2011) since I just 
recently encountered the same issue and Sergey Oboguev pointed me back to it.

I had noticed that this problem happens with gcc builds on MANY combinations 
of platforms and gcc versions (OSX, Linux, MinGW, Cygwin, etc.).  It is not 
merely a bug in one particularly obscure version of the compiler, but 
appears to be a much more generic issue.

Before Sergey pointed me back to this discussion, which had already narrowed 
things down to the particular macros and their use, I came at the problem from 
the 50,000-foot view of what the compiler might be doing when -O2 was enabled.  
As it turns out, -O2 turns on a significant set of separate optimizations, 
each of which can be individually enabled or disabled.  I narrowed the issue 
down to one specific optimization: compiling with "-O2" demonstrates the 
problem, and "-O2 -fno-strict-overflow" avoids it.  I've adjusted my makefile 
to avoid the problem, but there is some meat behind the reason we're seeing 
the problem at all.

As others have identified, the issue lies in vax_cpu1.c\op_ldpctx():
         {
         int32 newpc, newpsl, pcbpa, t;
         [...]
         t = ReadLP (pcbpa + 88);
         ML_PXBR_TEST (t + 0x800000);                           /* validate P1BR */

The expression "t + 0x800000" is overflowing.  Whatever is done in the 
ML_PXBR_TEST macro with casts and parentheses doesn't matter; the overflow has 
already happened, since the int32 expression (t + 0x800000) is where the 
overflow occurs.  The C language defines signed integer overflow behavior as 
undefined.  So what the compiler does with -O2 and without it is consistent 
with the language definition (anything could happen).
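For illustration, here is a minimal, hypothetical example (not SimH code) of 
the kind of transformation the strict-overflow assumption licenses: because 
signed overflow is undefined, gcc may assume "t + 0x800000" never wraps and 
fold a signed comparison to a constant, while the same comparison written in 
unsigned arithmetic must be computed honestly.

```c
#include <stdint.h>

/* Hypothetical demo, not SimH code.  With signed arithmetic, the
 * compiler may assume "t + 0x800000" never overflows, so under
 * -O2 -fstrict-overflow gcc is permitted to fold this whole test
 * to "return 1".  The addition is UB when t > INT32_MAX - 0x800000. */
int overflows_signed(int32_t t) {
    return t + 0x800000 > t;
}

/* The same test in unsigned arithmetic is fully defined: unsigned
 * addition wraps modulo 2**32, so the comparison is evaluated for
 * every input and correctly reports the wraparound case. */
int overflows_unsigned(uint32_t t) {
    return t + 0x800000u > t;
}
```

The unsigned version returns 0 exactly when the addition wraps, which is the 
behavior the simulator's bit tests implicitly depend on.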

Changing the declaration of t (and the other variables) to uint32 instead of 
int32 produces the results we desire without regard to the setting of -O2.
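As a sketch of why the uint32 declaration fixes things: with t unsigned, the 
offset addition has defined modulo-2**32 semantics, and the bit tests behave 
as intended.  The helper below is hypothetical (SimH uses a macro that raises 
RSVD_OPND_FAULT); it follows the PxBR formulation from Bob's quoted message 
below, with bit 31 required set and bits 30, 1, and 0 required clear.

```c
#include <stdint.h>

/* Hypothetical stand-in for ML_PXBR_TEST: returns nonzero when the
 * value would raise a reserved operand fault.  Because r is uint32_t,
 * any arithmetic feeding it (e.g. t + 0x800000 with t unsigned) wraps
 * with defined behavior instead of invoking signed-overflow UB. */
static int pxbr_test(uint32_t r) {
    return ((r & 0x80000000u) == 0) ||   /* bit 31 must be set   */
           ((r & 0x40000003u) != 0);     /* bits 30,1,0 must be clear */
}
```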

This particular case is now solved, BUT there are many other int32 variables in 
use throughout the vax & vax780 simulators which might have similar issues 
(and those cases may not announce themselves so conveniently by generating 
reserved operand faults).  There are relatively few places where actual signed 
32-bit values are interesting or necessary, but signed variable use is quite 
common.  Potential issues like this one will exist wherever arithmetic 
operations (vs. bit-oriented AND and OR) are performed; I suspect those places 
are very few, but I have not reviewed them all.
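One defensive pattern for such an audit is to do the address arithmetic in 
unsigned space, where wraparound is defined, and only treat the result as 
signed where a signed value is genuinely needed.  A minimal hypothetical 
sketch (the function name is illustrative, not SimH code):

```c
#include <stdint.h>

/* Hypothetical audit pattern: cast to unsigned BEFORE the addition so
 * the offset arithmetic wraps modulo 2**32 with defined behavior for
 * every possible t, including values near INT32_MAX. */
static uint32_t offset_base(int32_t t) {
    return (uint32_t)t + 0x800000u;
}
```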

What to do....

More details about this optimization are provided here 
(http://www.airs.com/blog/archives/120) by the guy who implemented the 
-fno-strict-overflow feature.

One might think that compiling with '-Wstrict-overflow' would warn us about all 
the places this optimization might be messing with computed results.  However, 
it doesn't.  Compiling with '-O2 -Wstrict-overflow' against the unmodified 
vax_cpu1.c (i.e., the original op_ldpctx()) does not flag anything in 
op_ldpctx().  Oh well....

I think that avoiding this gcc optimization completely (with 
-fno-strict-overflow) is the best approach until all the potential code paths 
have been reviewed.


- Mark
 
> On 1/3/2011 1:45 PM, Mark Pizzolato - Info Comm wrote:
> > There is a minor typo in Bob's SBR test (missing one zero).
> >
> >> #define ML_SBR_TEST(r) if (((uint32)(r))&  0xC000003u) != 0)
> > RSVD_OPND_FAULT
> > Should be:
> >> #define ML_SBR_TEST(r) if (((uint32)(r))&  0xC0000003u) != 0)
> > RSVD_OPND_FAULT
> >
> > - Mark
> >
> > On Monday, January 03, 2011 at 9:47 AM, Bob Supnik said:
> >> Well, the 780 microcode is complicated and patched to fare-thee-well,
> >> but it uses absolutely common code for the tests on P0BR and P1BR.
> > [The
> >> PDFs are on line at Bitsaver.]  Based on the comments, the actual
> > tests are:
> >> 1) Offset P1BR by 0x800000 (2**23).
> >> 2) Test PxBR<1:0>  = 0.
> >> 3) Test PxBR<31>  = 1.
> >> 4) Test PxBR<30>  = 0.  [this is missing in SimH]
> >>
> >> For SBR, it's:
> >>
> >> 1) Test SBR<1:0>  = 0.
> >> 2) Test SBR<31:30>  = 0. [this is missing in SimH]
> >>
> >> So while SimH may not be SRM conformant, it follows the 780 microcode
> >> more or less faithfully; in particular, by using a common macro for
> >> testing P0BR and P1BR.  Remember, SimH is a hardware simulator, not an
> >> architectural simulator.
> >>
> >> So I would venture that the "right" formulation of these tests is:
> >>
> >> #define ML_SBR_TEST(r)  if ((r)&  0xC000003) RSVD_OPND_FAULT
> >> #define ML_PXBR_TEST(r) if ((((r)&  0x80000000) == 0) || \
> >>                               ((r)&  0x40000003)) RSVD_OPND_FAULT
> >>
> >> Of course, you can throw in whatever casts, etc, you want to add to
> > make
> >> the code 64b proof; and also the != 0 that good C coding seems to
> > imply
> >> these days:
> >>
> >> #define ML_SBR_TEST(r) if (((uint32)(r))&  0xC000003u) != 0)
> >> RSVD_OPND_FAULT
> >> #define ML_PXBR_TEST(r) if (((((uint32)(r))&  0x80000000u) == 0) || \
> >>                               (((uint32)(r))&  0x40000003u) != 0))
> >> RSVD_OPND_FAULT
> >>
> >> The ANSI C standard says that hex constants are unsigned by default,
> > so
> >> I really don't think all the u's are needed.
> >>
> >> Remember, the problem is unique to one version of the C compiler, with
> >> CentOS 5.5.  It works everywhere else.
> >>
> >> /Bob
> >> _______________________________________________
> >> Simh mailing list
> >> [email protected]
> >> http://mailman.trailing-edge.com/mailman/listinfo/simh
> >
