Daniel, I think Richard was asking about false positives/true positives ratio and I do not see any false positives in the list below. It looks like all the reports you found are valid. Did you run this check on large C++ codebases?
As far as I can tell there are two outstanding issues: - Addressing these questions from Richard: >>>> sizeof(expression) is a common idiom in SFINAE contexts. Is that covered >>>> here? Richard, can you provide examples? >>>> >>>> sizeof(sizeof(int)) is a "traditional" way to get sizeof(size_t). Why >>>> should we warn on that? (I personally think that if this check goes into the analyzer, it's fine to warn in the second case.) - Where this check should go (compiler or the analyzer)? I think that if you can address the questions above, this check could go into the compiler. Cheers, Anna. On Oct 14, 2013, at 12:43 PM, Daniel Marjamäki <[email protected]> wrote: > > Hello! > > As I understood it.. you wanted to know what the hit ratio is. I have > investigated the hit ratio on a number of packages in debian. Below is the > negative and positives I've found. I provide the package URLs so you can look > for yourself.. > > I will continue investigating and let you know if I see any false positives... > > > NEGATIVE (macro) > package: > http://ftp.us.debian.org/debian/pool/main/a/agedu/agedu_9723.orig.tar.gz > > #define sresize(array, number, type) \ > ( (void)sizeof((array)-(type *)0), \ > (type *) srealloc ((array), (number) * sizeof (type)) ) > > We don't warn about this since it's in a macro. However it is interesting: > sizeof((array)-(type *)0) > It seems to be written this way by intention. To get sizeof(ptrdiff_t)? Maybe > sizeof(ptr-ptr) should be ok. > > > > POSITIVE > package: > http://ftp.us.debian.org/debian/pool/main/a/argus-client_3.0.0.orig.tar.gz > > file: argus-clients-3.0.0/radump/print-ldp.c > line: 607 > print_unknown_data(tptr+sizeof(sizeof(struct > ldp_msg_header)),"\n\t ", > msg_len); > > file: argus-clients-3.0.0/radump/print-lmp.c > line: 878 > print_unknown_data(tptr+sizeof(sizeof(struct > lmp_object_header)),"\n\t ", > lmp_obj_len-sizeof(struct lmp_object_header)); > > Not sure if these are intended to be sizeof(size_t) or sizeof(struct > lmp_object_header) but I guess the latter. > > > > POSITIVE > package: > http://ftp.us.debian.org/debian/pool/main/b/babel/babel_1.4.0.dfsg.orig.tar.gz > file: babel-1.4.0.dfsg/regression/output/libC/synch_RegOut_Impl.c > > 332: if ((res < 0) || (res > sizeof(s_result_strings)/sizeof(sizeof(char > *)))) { > > This "sizeof(sizeof(char *))" should be "sizeof(char*)" since > s_result_strings is a array of char * pointers > > > > POSITIVE > package: > http://ftp.us.debian.org/debian/pool/main/b/bird/bird_1.3.7.orig.tar.gz > file: bird-1.3.7/lib/mempool.c > > 253: > return ALLOC_OVERHEAD + sizeof(struct linpool) + > cnt * (ALLOC_OVERHEAD + sizeof(sizeof(struct lp_chunk))) + > m->total + m->total_large; > > Not sure if this is intended to be sizeof(size_t) or sizeof(struct lp_chunk) > but I guess the latter. > > > POSITIVE: > package: > http://ftp.us.debian.org/debian/pool/main/b/blender/blender_2.49.2~dfsg.orig.tar.gz > file: blender-2.49b.orig/source/blender/blenloader/intern/readfile.c > 6987: if(0==strncmp(ima->id.name+2, "Viewer Node", > sizeof(ima->id.name+2))) { > 6991: if(0==strncmp(ima->id.name+2, "Render Result", > sizeof(ima->id.name+2))) { > > These are bad, ima->id.name is an array, it should probably be changed to: > if(0==strncmp(ima->id.name+2, "Viewer Node", sizeof(ima->id.name) - 2)) { > > > Best regards, > Daniel Marjamäki > > .................................................................................................................. > Daniel Marjamäki Senior Engineer > Evidente ES East AB Warfvinges väg 34 SE-112 51 Stockholm Sweden > > Mobile: +46 (0)709 12 42 62 > E-mail: [email protected] > > www.evidente.se > Från: [email protected] [[email protected]] för > Anna Zaks [[email protected]] > Skickat: den 8 oktober 2013 18:18 > Till: Ted Kremenek > Cc: Anders Rönnholm; Richard Smith; [email protected] > Ämne: Re: [PATCH] [StaticAnalyzer] New checker Sizeof on expression > > > On Oct 7, 2013, at 10:05 PM, Ted Kremenek <[email protected]> wrote: > >> On Oct 7, 2013, at 6:05 PM, Jordan Rose <[email protected]> wrote: >> >>> >>> On Oct 7, 2013, at 13:58, Richard Smith <[email protected]> wrote: >>> >>>> On Mon, Oct 7, 2013 at 9:55 AM, Jordan Rose <[email protected]> wrote: >>>> I'm fine with this staying in the analyzer for now, unless David, Richard, >>>> or Eli feel it should be a warning right away. >>>> >>>> Do we have evidence that we want this? Does it catch bugs? If so, what do >>>> they look like? It seems like this would trigger on legitimate code; how >>>> does a user suppress the warning in that case, and does that suppression >>>> make their code clearer? >>>> >>>> What is the false/true positive ratio for bug finding here? >>>> >>>> sizeof(expression) is a common idiom in SFINAE contexts. Is that covered >>>> here? >>>> >>>> sizeof(sizeof(int)) is a "traditional" way to get sizeof(size_t). Why >>>> should we warn on that? >>>> >>>> And more as a general question than something specific to this patch: Is >>>> there a region in the space of false positive ratios where we think a >>>> syntactic warning should go into the static analyzer? If so, why? And what >>>> is that region? I would have thought that the static analyzer, like the >>>> clang warnings, would be aimed at (eventually) having a false positive >>>> ratio of near zero. If so, then should we ever put a warning in the static >>>> analyzer if it doesn't require the static analyzer's technology (or have a >>>> high runtime cost)? >>> >>> On this last (and bringing in Ted and Anna): >>> >>> I think the main difference between compiler warnings and syntactic >>> analyzer checks is that we try very hard to turn new compiler warnings on >>> by default. A second-order effect of this is that we generally avoid style >>> warnings. The analyzer can be a bit looser about this, though: because >>> people know the analyzer is stricter and more in-depth, I think they might >>> also accept that a particular check doesn't fit their project. >>> >>> On the other hand, we still haven't gotten around to designing a proper bug >>> tracking and/or manual suppression system, so that's one advantage of >>> compiler warnings. And as you say, checks without a high runtime cost don't >>> really have a technical reason to be in the analyzer. >> >> Richard’s point is correct that we want the static analyzer to also have a >> high signal-to-noise ratio. Otherwise it is a useless tool. I’m also not a >> fan of having the analyzer having a bunch of “junk” checkers that aren’t on >> by default, but if a checker, when it is enabled, is HIGHLY useful to some >> set of people (e.g., security experts who are more tolerant of false >> positive rates if they want to do an aggressive code audit) I think the >> analyzer is a reasonable place to put them, given the current warning policy >> in Clang where we want the warnings there to be generally useful to >> everybody. >> >> To put it in more context, in the beginning the guiding principal of what >> goes in the static analyzer was: >> >> (a) The warning is very expensive to compute. >> >> OR >> >> (b) The warning is very domain-specific. For example, an API such as >> CFNumberCreate() on OS X has some interesting API invariants, but we >> generally should not be hacking those API-specific warnings into the >> compiler. Exceptions exist of course, e.g., printf checking, but often the >> are grounded when such APIs are fairly standard (e.g., in the C standard >> itself) or the checking is based on some annotation like >> __attribute__((format)) where the compiler doesn’t know anything about a >> specific function itself, just the annotation. >> >> Style warnings can sometimes fall into (a) (in which case not putting them >> in the compiler makes obvious sense), but one could argue that they are more >> the flavor of (b) then a traditional compiler warning. As I mentioned >> earlier, the static analyzer can be home to some highly specialized checkers >> that may not be generally useful for everybody but when enabled are very >> useful to certain people. Style warnings often fit in this category of >> warnings. >> >> A related problem is that we don’t have an ontology for style warnings in >> clang (the compiler). Should they really be lumped into the same group of >> all other compiler warnings? What about their behavior with -Werror? >> -Weverything? Style warnings really are about personal style, and users can >> be highly polarized about them. For example, it would be highly useful for >> Clang to have a -W80-columns or -Wspaces-instead-of-tabs (not a serious >> proposal). Both would be cheap to compute, would obviously benefit LLVM >> developers, but they wouldn’t work for everybody, or even the majority of >> software developers. Where should they go? The compiler? The static >> analyzer? Both have discoverability and usability concerns in either >> location. I’d argue that these two warnings would be better suited in the >> compiler because (when the user wants them) they’d get run all the time, but >> having these warnings counteracts the general guiding principal that >> compiler warnings should generally be useful to most people, or clear >> categories of people (e.g, library authors), but not specific groups of >> people (e.g., LLVM developers). >> >> This is not a thought out proposal, but I would not be opposed to a new >> category of warnings, say -Wstyle, and have all style warnings under >> -Wstyle, perhaps named “-Wstyle-80-columns” or >> “-Wstyle-spaces-instead-of-tabs”. None of these warnings would be on by >> default. If we want these kind of warnings in the compiler, I think these >> kind of warnings are worth calling out as being something “different” that >> people shouldn’t get hung up about if they don’t think the warning applies >> to them. Also, having them in a special category of them own allows >> institutions to set broad warning polices such as “-Werror -Wno-error=style” >> where they want regular warnings to be treated as error, but not style >> warnings, etc. > > > I agree with Ted. Currently, there is no good way of adding style and highly > specialized warnings to clang; given the behavior of Werror and Weverything. > For example, we encourage users to learn about new generic warnings by > turning on Weverything, so the specialized warnings should not be included > there.
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
