Daniel,

I think Richard was asking about false positives/true positives ratio and I do 
not see any false positives in the list below. It looks like all the reports 
you found are valid. Did you run this check on large C++ codebases?

As far as I can tell there are two outstanding issues:
 - Addressing these questions from Richard:
>>>> sizeof(expression) is a common idiom in SFINAE contexts. Is that covered 
>>>> here?
Richard, can you provide examples?
>>>> 
>>>> sizeof(sizeof(int)) is a "traditional" way to get sizeof(size_t). Why 
>>>> should we warn on that?

(I personally think that if this check goes into the analyzer, it's fine to 
warn in the second case.)

 - Where this check should go (compiler or the analyzer)?
I think that if you can address the questions above, this check could go into 
the compiler.

Cheers,
Anna.

On Oct 14, 2013, at 12:43 PM, Daniel Marjamäki <[email protected]> 
wrote:

> 
> Hello!
> 
> As I understood it.. you wanted to know what the hit ratio is. I have 
> investigated the hit ratio on a number of packages in debian. Below is the 
> negative and positives I've found. I provide the package URLs so you can look 
> for yourself..
> 
> I will continue investigating and let you know if I see any false positives...
> 
> 
> NEGATIVE (macro)
> package: 
> http://ftp.us.debian.org/debian/pool/main/a/agedu/agedu_9723.orig.tar.gz
> 
> #define sresize(array, number, type) \
>     ( (void)sizeof((array)-(type *)0), \
>       (type *) srealloc ((array), (number) * sizeof (type)) )
> 
> We don't warn about this since it's in a macro. However it is interesting:
>     sizeof((array)-(type *)0)
> It seems to be written this way by intention. To get sizeof(ptrdiff_t)? Maybe 
> sizeof(ptr-ptr) should be ok.
> 
> 
> 
> POSITIVE
> package: 
> http://ftp.us.debian.org/debian/pool/main/a/argus-client_3.0.0.orig.tar.gz
> 
> file: argus-clients-3.0.0/radump/print-ldp.c
> line: 607
>             print_unknown_data(tptr+sizeof(sizeof(struct 
> ldp_msg_header)),"\n\t  ",
>                                msg_len);
> 
> file: argus-clients-3.0.0/radump/print-lmp.c
> line: 878
>             print_unknown_data(tptr+sizeof(sizeof(struct 
> lmp_object_header)),"\n\t    ",
>                                lmp_obj_len-sizeof(struct lmp_object_header));
> 
> Not sure if these are intended to be sizeof(size_t) or sizeof(struct 
> lmp_object_header) but I guess the latter.
> 
> 
> 
> POSITIVE
> package: 
> http://ftp.us.debian.org/debian/pool/main/b/babel/babel_1.4.0.dfsg.orig.tar.gz
> file: babel-1.4.0.dfsg/regression/output/libC/synch_RegOut_Impl.c
> 
> 332:    if ((res < 0) || (res > sizeof(s_result_strings)/sizeof(sizeof(char 
> *)))) {
> 
> This "sizeof(sizeof(char *))" should be "sizeof(char*)" since 
> s_result_strings is a array of char * pointers
> 
> 
> 
> POSITIVE
> package: 
> http://ftp.us.debian.org/debian/pool/main/b/bird/bird_1.3.7.orig.tar.gz
> file: bird-1.3.7/lib/mempool.c
> 
> 253:
>   return ALLOC_OVERHEAD + sizeof(struct linpool) +
>     cnt * (ALLOC_OVERHEAD + sizeof(sizeof(struct lp_chunk))) +
>     m->total + m->total_large;
> 
> Not sure if this is intended to be sizeof(size_t) or sizeof(struct lp_chunk) 
> but I guess the latter.
> 
> 
> POSITIVE:
> package: 
> http://ftp.us.debian.org/debian/pool/main/b/blender/blender_2.49.2~dfsg.orig.tar.gz
> file: blender-2.49b.orig/source/blender/blenloader/intern/readfile.c
> 6987:    if(0==strncmp(ima->id.name+2, "Viewer Node", 
> sizeof(ima->id.name+2))) {
> 6991:    if(0==strncmp(ima->id.name+2, "Render Result", 
> sizeof(ima->id.name+2))) {
> 
> These are bad, ima->id.name is an array, it should probably be changed to:
>     if(0==strncmp(ima->id.name+2, "Viewer Node", sizeof(ima->id.name) - 2)) {
> 
> 
> Best regards,
> Daniel Marjamäki
> 
> ..................................................................................................................
> Daniel Marjamäki Senior Engineer
> Evidente ES East AB  Warfvinges väg 34  SE-112 51 Stockholm  Sweden
>  
> Mobile:                 +46 (0)709 12 42 62
> E-mail:                 [email protected]                      
>  
> www.evidente.se
> Från: [email protected] [[email protected]] för 
> Anna Zaks [[email protected]]
> Skickat: den 8 oktober 2013 18:18
> Till: Ted Kremenek
> Cc: Anders Rönnholm; Richard Smith; [email protected]
> Ämne: Re: [PATCH] [StaticAnalyzer] New checker Sizeof on expression
> 
> 
> On Oct 7, 2013, at 10:05 PM, Ted Kremenek <[email protected]> wrote:
> 
>> On Oct 7, 2013, at 6:05 PM, Jordan Rose <[email protected]> wrote:
>> 
>>> 
>>> On Oct 7, 2013, at 13:58, Richard Smith <[email protected]> wrote:
>>> 
>>>> On Mon, Oct 7, 2013 at 9:55 AM, Jordan Rose <[email protected]> wrote:
>>>> I'm fine with this staying in the analyzer for now, unless David, Richard, 
>>>> or Eli feel it should be a warning right away.
>>>> 
>>>> Do we have evidence that we want this? Does it catch bugs? If so, what do 
>>>> they look like? It seems like this would trigger on legitimate code; how 
>>>> does a user suppress the warning in that case, and does that suppression 
>>>> make their code clearer?
>>>> 
>>>> What is the false/true positive ratio for bug finding here?
>>>> 
>>>> sizeof(expression) is a common idiom in SFINAE contexts. Is that covered 
>>>> here?
>>>> 
>>>> sizeof(sizeof(int)) is a "traditional" way to get sizeof(size_t). Why 
>>>> should we warn on that?
>>>> 
>>>> And more as a general question than something specific to this patch: Is 
>>>> there a region in the space of false positive ratios where we think a 
>>>> syntactic warning should go into the static analyzer? If so, why? And what 
>>>> is that region? I would have thought that the static analyzer, like the 
>>>> clang warnings, would be aimed at (eventually) having a false positive 
>>>> ratio of near zero. If so, then should we ever put a warning in the static 
>>>> analyzer if it doesn't require the static analyzer's technology (or have a 
>>>> high runtime cost)?
>>> 
>>> On this last (and bringing in Ted and Anna):
>>> 
>>> I think the main difference between compiler warnings and syntactic 
>>> analyzer checks is that we try very hard to turn new compiler warnings on 
>>> by default. A second-order effect of this is that we generally avoid style 
>>> warnings. The analyzer can be a bit looser about this, though: because 
>>> people know the analyzer is stricter and more in-depth, I think they might 
>>> also accept that a particular check doesn't fit their project.
>>> 
>>> On the other hand, we still haven't gotten around to designing a proper bug 
>>> tracking and/or manual suppression system, so that's one advantage of 
>>> compiler warnings. And as you say, checks without a high runtime cost don't 
>>> really have a technical reason to be in the analyzer.
>> 
>> Richard’s point is correct that we want the static analyzer to also have a 
>> high signal-to-noise ratio.  Otherwise it is a useless tool.  I’m also not a 
>> fan of having the analyzer having a bunch of “junk” checkers that aren’t on 
>> by default, but if a checker, when it is enabled, is HIGHLY useful to some 
>> set of people (e.g., security experts who are more tolerant of false 
>> positive rates if they want to do an aggressive code audit)  I think the 
>> analyzer is a reasonable place to put them, given the current warning policy 
>> in Clang where we want the warnings there to be generally useful to 
>> everybody.
>> 
>> To put it in more context, in the beginning the guiding principal of what 
>> goes in the static analyzer was:
>> 
>> (a) The warning is very expensive to compute.
>> 
>>  OR
>> 
>> (b) The warning is very domain-specific.  For example, an API such as 
>> CFNumberCreate() on OS X has some interesting API invariants, but we 
>> generally should not be hacking those API-specific warnings into the 
>> compiler.  Exceptions exist of course, e.g., printf checking, but often the 
>> are grounded when such APIs are fairly standard (e.g., in the C standard 
>> itself) or the checking is based on some annotation like 
>> __attribute__((format)) where the compiler doesn’t know anything about a 
>> specific function itself, just the annotation.
>> 
>> Style warnings can sometimes fall into (a) (in which case not putting them 
>> in the compiler makes obvious sense), but one could argue that they are more 
>> the flavor of (b) then a traditional compiler warning.  As I mentioned 
>> earlier, the static analyzer can be home to some highly specialized checkers 
>> that may not be generally useful for everybody but when enabled are very 
>> useful to certain people.  Style warnings often fit in this category of 
>> warnings.
>> 
>> A related problem is that we don’t have an ontology for style warnings in 
>> clang (the compiler).  Should they really be lumped into the same group of 
>> all other compiler warnings?  What about their behavior with -Werror?  
>> -Weverything?  Style warnings really are about personal style, and users can 
>> be highly polarized about them.  For example, it would be highly useful for 
>> Clang to have a -W80-columns or -Wspaces-instead-of-tabs (not a serious 
>> proposal).  Both would be cheap to compute, would obviously benefit LLVM 
>> developers, but they wouldn’t work for everybody, or even the majority of 
>> software developers.  Where should they go?  The compiler?  The static 
>> analyzer?  Both have discoverability and usability concerns in either 
>> location.  I’d argue that these two warnings would be better suited in the 
>> compiler because (when the user wants them) they’d get run all the time, but 
>> having these warnings counteracts the general guiding principal that 
>> compiler warnings should generally be useful to most people, or clear 
>> categories of people (e.g, library authors), but not specific groups of 
>> people (e.g., LLVM developers).
>> 
>> This is not a thought out proposal, but I would not be opposed to a new 
>> category of warnings, say -Wstyle, and have all style warnings under 
>> -Wstyle, perhaps named “-Wstyle-80-columns” or 
>> “-Wstyle-spaces-instead-of-tabs”.  None of these warnings would be on by 
>> default.  If we want these kind of warnings in the compiler, I think these 
>> kind of warnings are worth calling out as being something “different” that 
>> people shouldn’t get hung up about if they don’t think the warning applies 
>> to them.  Also, having them in a special category of them own allows 
>> institutions to set broad warning polices such as “-Werror -Wno-error=style” 
>> where they want regular warnings to be treated as error, but not style 
>> warnings, etc.
> 
> 
> I agree with Ted. Currently, there is no good way of adding style and highly 
> specialized warnings to clang; given the behavior of Werror and Weverything. 
> For example, we encourage users to learn about new generic warnings by 
> turning on Weverything, so the specialized warnings should not be included 
> there. 

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to