On Sun, 2005-10-30 at 23:40 -0800, Mark Mitchell wrote:
> In reviewing the PR list, I saw several (maybe 5?) PRs about problems
> with -Wuninitialized.

Joys...
> All of these problems related to getting confused in the optimizers in
> various ways; sometimes we think things are uninitialized when they
> obviously aren't; other times we don't warn when we obviously should,
> etc.  The set of places we warn is moving around from 3.4 to 4.0 to
> 4.1, which annoys people using -Werror, especially if we have new
> false positives.

It's probably worth noting that jump threading actually improves our
ability to avoid false positives -- it was one of the things I was
rather surprised to find when I started looking at throttling it back.
Code idioms commonly used in GCC will trigger a multitude of false
positives if we don't aggressively thread jumps through loop headers.

Overall, I would expect that we're doing quite a bit better than before
at avoiding false positives and at distinguishing between "may be used
uninitialized" and "is definitely used uninitialized".  Where I suspect
we fall down more often is in failing to issue warnings because the
uninitialized variable was optimized away, or because its uses were
turned into uses of other variables (possibly temporaries) by copy
propagation.

> I still think we should take the goals of consistency across releases,
> across architectures, and across optimization levels seriously.  For
> -Wuninitialized, we should err on the side of warning too often.  If
> possible, we should try to sketch the rules we're using so that users
> understand them.

I might disagree about warning too often -- one of the big complaints
in the past was the number of false positives.  However, I would
absolutely agree that if we can't prove the variable is initialized,
then we ought to warn.

> I think users who set this flag generally want warnings about
> variables that look uninitialized, but happen not to be used, or
> happen to be initialized only in tricky ways; they're often using
> these warnings to make their code robust not just today, but in the
> face of future modification.
> I'd expect that most users want a warning about:
>
>   int x;
>   if (f())
>     x = 3;
>   return x;
>
> even if f() happens to always return true on this system.  (I don't
> know that we don't warn in this case; it's just meant as an example
> of a case where doing too much optimization might cause us to miss a
> warning people want.)

Certainly if we can't prove f always returns a nonzero value, then a
warning should be issued.  If we do prove that f always returns a
nonzero value, then I think it becomes unclear whether we should warn.

> Certainly, this code from PR 22456:
>
>   int i;
>   while (i) ++i;
>
> should trigger a warning.  The fact that i is then never used, and
> thus the loop is removed, is a fine optimization, but that shouldn't
> affect whether or not a warning is issued.

It certainly should trigger a warning.  I'm kind of surprised it
doesn't already, given that we try to issue some warnings before we
start optimizing the code.

> Bugs like:
>
>   void f(int j) {
>     int i;     // Should be initialized.
>     while (j) i += j--;
>     return j;  // Should be i, not j.
>   }
>
> are not that rare; the fact that this function can be optimized to
> "return 0" should not preclude the warning!

This should, IMHO, also trigger a warning, and again I'm surprised the
early uninitialized-warning code doesn't catch it.

> Maybe it would help to move the checks closer to the front of the
> optimizer chain.  We'd have less accuracy, but, probably, more
> consistency.

I doubt you're going to get the consistency you're looking for if you
move the warnings strictly to the early part of the optimization path
-- at least not without an undue number of false positives.

Here's the comment before the code in question:

  /* Emit warnings for uninitialized variables.
     This is done in two passes.

     The first pass notices real uses of SSA names with default
     definitions.  Such uses are unconditionally uninitialized, and we
     can be certain that such a use is a mistake.
     This pass is run before most optimizations, so that we catch as
     many as we can.

     The second pass follows PHI nodes to find uses that are
     potentially uninitialized.  In this case we can't necessarily
     prove that the use is really uninitialized.  This pass is run
     after most optimizations, so that we thread as many jumps as
     possible, and delete as much dead code as possible, in order to
     reduce false positives.  We also look again for plain
     uninitialized variables, since optimization may have changed
     conditionally uninitialized to unconditionally uninitialized.  */

  /* Emit a warning for T, an SSA_NAME, being uninitialized.  The
     exact warning text is in MSGID and LOCUS may contain a location
     or be null.  */

Ah, that explains both of the latter cases.  There's going to be a PHI
node at the top of each loop, and the use inside the loop will refer
to the PHI result rather than the default definition.  I'm not sure
how best to deal with this.

jeff
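For reference, a hand-written sketch (not an actual GCC dump) of the
SSA form for the PR 22456 loop shows where the first pass loses track:
the read of `i` inside the loop is of the PHI result `i_1`, not of the
default definition `i_0(D)`, so the early pass never sees a direct use
of the uninitialized name.

```
  int i;      ;; i_0(D) is the default (uninitialized) definition

<bb loop_header>:
  # i_1 = PHI <i_0(D) (entry), i_2 (latch)>
  if (i_1 != 0) goto <bb loop_body>; else goto <bb exit>;

<bb loop_body>:
  i_2 = i_1 + 1;
  goto <bb loop_header>;
```

Only the second pass, which follows PHI arguments, can discover that
`i_1` may flow from `i_0(D)`.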