On Wed, Oct 23, 2013 at 8:35 AM, Anna Zaks <[email protected]> wrote: > Daniel, > > I think Richard was asking about false positives/true positives ratio and > I do not see any false positives in the list below. It looks like all the > reports you found are valid. Did you run this check on large C++ codebases? > > As far as I can tell there are two outstanding issues: > - Addressing these questions from Richard: > > sizeof(expression) is a common idiom in SFINAE contexts. Is that covered > here? > > Richard, can you provide examples? >
There's one here: http://en.wikipedia.org/wiki/Substitution_failure_is_not_an_error ... but that's not the case I was thinking of; it's not likely that this pattern will be used with sizeof(binary operator). The case I meant is more like the one here: http://stackoverflow.com/questions/12654067/what-is-expression-sfinae ... where 'sizeof' is just used to hide an arbitrary expression inside a function type. Here's another example using a binary operator inside a sizeof expression: http://stackoverflow.com/questions/2127693/sfinae-sizeof-detect-if-expression-compiles IIRC, sizeof(expression) was the *only* way to do this kind of thing in C++98. > sizeof(sizeof(int)) is a "traditional" way to get sizeof(size_t). Why > should we warn on that? > > (I personally think that if this check goes into the analyzer, it's fine > to warn in the second case.) > Sure. I think it's even fine for a compiler warning, since sizeof(size_t) is a clearer way to get the same result. - Where this check should go (compiler or the analyzer)? > I think that if you can address the questions above, this check could go > into the compiler. > > Cheers, > Anna. > > On Oct 14, 2013, at 12:43 PM, Daniel Marjamäki < > [email protected]> wrote: > > > Hello! > > As I understood it.. you wanted to know what the hit ratio is. I have > investigated the hit ratio on a number of packages in debian. Below is the > negative and positives I've found. I provide the package URLs so you can > look for yourself.. > > I will continue investigating and let you know if I see any false > positives... > > > NEGATIVE (macro) > package: > http://ftp.us.debian.org/debian/pool/main/a/agedu/agedu_9723.orig.tar.gz > > #define sresize(array, number, type) \ > ( (void)sizeof((array)-(type *)0), \ > (type *) srealloc ((array), (number) * sizeof (type)) ) > > We don't warn about this since it's in a macro. However it is interesting: > sizeof((array)-(type *)0) > It seems to be written this way by intention. To get sizeof(ptrdiff_t)? > Maybe sizeof(ptr-ptr) should be ok. > > > > POSITIVE > package: > http://ftp.us.debian.org/debian/pool/main/a/argus-client_3.0.0.orig.tar.gz > > file: argus-clients-3.0.0/radump/print-ldp.c > line: 607 > print_unknown_data(tptr+sizeof(sizeof(struct > ldp_msg_header)),"\n\t ", > msg_len); > > file: argus-clients-3.0.0/radump/print-lmp.c > line: 878 > print_unknown_data(tptr+sizeof(sizeof(struct > lmp_object_header)),"\n\t ", > lmp_obj_len-sizeof(struct > lmp_object_header)); > > Not sure if these are intended to be sizeof(size_t) or sizeof(struct > lmp_object_header) but I guess the latter. > > > > POSITIVE > package: > http://ftp.us.debian.org/debian/pool/main/b/babel/babel_1.4.0.dfsg.orig.tar.gz > file: babel-1.4.0.dfsg/regression/output/libC/synch_RegOut_Impl.c > > 332: if ((res < 0) || (res > > sizeof(s_result_strings)/sizeof(sizeof(char *)))) { > > This "sizeof(sizeof(char *))" should be "sizeof(char*)" since > s_result_strings is a array of char * pointers > > > > POSITIVE > package: > http://ftp.us.debian.org/debian/pool/main/b/bird/bird_1.3.7.orig.tar.gz > file: bird-1.3.7/lib/mempool.c > > 253: > return ALLOC_OVERHEAD + sizeof(struct linpool) + > cnt * (ALLOC_OVERHEAD + sizeof(sizeof(struct lp_chunk))) + > m->total + m->total_large; > > Not sure if this is intended to be sizeof(size_t) or sizeof(struct > lp_chunk) but I guess the latter. > > > POSITIVE: > package: > http://ftp.us.debian.org/debian/pool/main/b/blender/blender_2.49.2~dfsg.orig.tar.gz > file: blender-2.49b.orig/source/blender/blenloader/intern/readfile.c > 6987: if(0==strncmp(ima->id.name+2, "Viewer Node", sizeof(ima->id.name+2))) > { > 6991: if(0==strncmp(ima->id.name+2, "Render Result", sizeof(ima-> > id.name+2))) { > > These are bad, ima->id.name is an array, it should probably be changed to: > if(0==strncmp(ima->id.name+2, "Viewer Node", sizeof(ima->id.name) - > 2)) { > > > Best regards, > Daniel Marjamäki > > > .................................................................................................................. > Daniel Marjamäki Senior Engineer > Evidente ES East AB Warfvinges väg 34 SE-112 51 Stockholm Sweden > > > Mobile: +46 (0)709 12 42 62 > E-mail: Daniel.Marjamaki <[email protected]> > @evidente.se <[email protected]> > > > www.evidente.se > ------------------------------ > *Från:* [email protected] [[email protected]] > för Anna Zaks [[email protected]] > *Skickat:* den 8 oktober 2013 18:18 > *Till:* Ted Kremenek > *Cc:* Anders Rönnholm; Richard Smith; [email protected] > *Ämne:* Re: [PATCH] [StaticAnalyzer] New checker Sizeof on expression > > > On Oct 7, 2013, at 10:05 PM, Ted Kremenek <[email protected]> wrote: > > On Oct 7, 2013, at 6:05 PM, Jordan Rose <[email protected]> wrote: > > > On Oct 7, 2013, at 13:58, Richard Smith <[email protected]> wrote: > > On Mon, Oct 7, 2013 at 9:55 AM, Jordan Rose <[email protected]> wrote: > >> I'm fine with this staying in the analyzer for now, unless David, >> Richard, or Eli feel it should be a warning right away. >> > > Do we have evidence that we want this? Does it catch bugs? If so, what do > they look like? It seems like this would trigger on legitimate code; how > does a user suppress the warning in that case, and does that suppression > make their code clearer? > > What is the false/true positive ratio for bug finding here? > > sizeof(expression) is a common idiom in SFINAE contexts. Is that covered > here? > > sizeof(sizeof(int)) is a "traditional" way to get sizeof(size_t). Why > should we warn on that? > > And more as a general question than something specific to this patch: Is > there a region in the space of false positive ratios where we think a > syntactic warning should go into the static analyzer? If so, why? And what > is that region? I would have thought that the static analyzer, like the > clang warnings, would be aimed at (eventually) having a false positive > ratio of near zero. If so, then should we ever put a warning in the static > analyzer if it doesn't require the static analyzer's technology (or have a > high runtime cost)? > > > On this last (and bringing in Ted and Anna): > > I think the main difference between compiler warnings and syntactic > analyzer checks is that we try very hard to turn new compiler warnings on > by default. A second-order effect of this is that we generally avoid style > warnings. The analyzer can be a bit looser about this, though: because > people *know* the analyzer is stricter and more in-depth, I think they > might also accept that a particular check doesn't fit their project. > > On the other hand, we still haven't gotten around to designing a proper > bug tracking and/or manual suppression system, so that's one advantage of > compiler warnings. And as you say, checks without a high runtime cost don't > really have a *technical* reason to be in the analyzer. > > > Richard’s point is correct that we want the static analyzer to also have a > high signal-to-noise ratio. Otherwise it is a useless tool. I’m also not > a fan of having the analyzer having a bunch of “junk” checkers that aren’t > on by default, but if a checker, when it is enabled, is HIGHLY useful to > some set of people (e.g., security experts who are more tolerant of false > positive rates if they want to do an aggressive code audit) I think the > analyzer is a reasonable place to put them, given the current warning > policy in Clang where we want the warnings there to be generally useful to > everybody. > > To put it in more context, in the beginning the guiding principal of what > goes in the static analyzer was: > > (a) The warning is very expensive to compute. > > OR > > (b) The warning is very domain-specific. For example, an API such as > CFNumberCreate() on OS X has some interesting API invariants, but we > generally should not be hacking those API-specific warnings into the > compiler. Exceptions exist of course, e.g., printf checking, but often the > are grounded when such APIs are fairly standard (e.g., in the C standard > itself) or the checking is based on some annotation like > __attribute__((format)) where the compiler doesn’t know anything about a > specific function itself, just the annotation. > > Style warnings can sometimes fall into (a) (in which case not putting them > in the compiler makes obvious sense), but one could argue that they are > more the flavor of (b) then a traditional compiler warning. As I mentioned > earlier, the static analyzer can be home to some highly specialized > checkers that may not be generally useful for everybody but when enabled > are very useful to certain people. Style warnings often fit in this > category of warnings. > > A related problem is that we don’t have an ontology for style warnings in > clang (the compiler). Should they really be lumped into the same group of > all other compiler warnings? What about their behavior with -Werror? > -Weverything? Style warnings really are about personal style, and users > can be highly polarized about them. For example, it would be highly useful > for Clang to have a -W80-columns or -Wspaces-instead-of-tabs (not a serious > proposal). Both would be cheap to compute, would obviously benefit LLVM > developers, but they wouldn’t work for everybody, or even the majority of > software developers. Where should they go? The compiler? The static > analyzer? Both have discoverability and usability concerns in either > location. I’d argue that these two warnings would be better suited in the > compiler because (when the user wants them) they’d get run all the time, > but having these warnings counteracts the general guiding principal that > compiler warnings should generally be useful to most people, or clear > categories of people (e.g, library authors), but not specific groups of > people (e.g., LLVM developers). > > This is not a thought out proposal, but I would not be opposed to a new > category of warnings, say -Wstyle, and have all style warnings under > -Wstyle, perhaps named “-Wstyle-80-columns” or > “-Wstyle-spaces-instead-of-tabs”. None of these warnings would be on by > default. If we want these kind of warnings in the compiler, I think these > kind of warnings are worth calling out as being something “different” that > people shouldn’t get hung up about if they don’t think the warning applies > to them. Also, having them in a special category of them own allows > institutions to set broad warning polices such as “-Werror > -Wno-error=style” where they want regular warnings to be treated as error, > but not style warnings, etc. > > > I agree with Ted. Currently, there is no good way of adding style and > highly specialized warnings to clang; given the behavior of Werror and > Weverything. For example, we encourage users to learn about new generic > warnings by turning on Weverything, so the specialized warnings should not > be included there. > > >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
