[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-09 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus created this revision.
Szelethus added reviewers: NoQ, george.karpenkov, xazax.hun, MTC.
Herald added subscribers: cfe-commits, jfb, mikhail.ramalho, a.sidorin, 
rnkovacs, szepet, whisperity.

Added some extra tasks to the open projects. These are the ideas of @NoQ and 
@george.karpenkov, I just converted them to HTML.


Repository:
  rC Clang

https://reviews.llvm.org/D53024

Files:
  www/analyzer/open_projects.html

Index: www/analyzer/open_projects.html
===
--- www/analyzer/open_projects.html
+++ www/analyzer/open_projects.html
@@ -25,6 +25,86 @@
   
   Core Analyzer Infrastructure
   
+Implement a dataflow flamework.
+
+ (Difficulty: Hard) 
+
+
+Handle aggregate construction.
+Aggregates are object that can be brace-initialized without calling a
+constructor (i.e., no https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html
+CXXConstructExpr in the AST), but potentially calling
+constructors for their fields and (since C++17) base classes - and these
+constructors of sub-objects need to know what object (field in the
+aggregate) they are constructing. Moreover, if the aggregate contains
+references, lifetime extension needs to be modeled. Aggregates can be
+nested, so https://clang.llvm.org/doxygen/classclang_1_1ConstructionContext.html
+ConstructionContext can potentially cover an unlimited amount of
+statements. One can start untangling this problem by trying to replace the
+hacky https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html
+ParentMap lookup in https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430
+CXXConstructExpr::CK_NonVirtualBase branch of
+ExprEngine::VisitCXXConstructExpr() with some actual
+support for the feature.
+ (Difficulty: Medium) 
+
+
+Fix CFG for GNU "binary conditional" operator ?:.
+CFG for GNU "binary conditional" operator ?: is broken in
+C++. Its condition-and-LHS need to only be evaluated once.
+(Difficulty: Easy)
+
+
+Handle unions.
+Currently in the analyzer, the value of a union is always regarded as
+unknown. There has been some discussion about this on the http://lists.llvm.org/pipermail/cfe-dev/2017-March/052864.html
+mailing list already, but it is still an untouched area.
+ (Difficulty: Medium) 
+
+
+Enhance the modeling of the standard library.
+There is a huge amount of checker work for teaching the Static Analyzer
+about the C++ standard library. It is very easy to explain to the static
+analyzer that calling .length() on an empty std::string
+ will yield 0, and vice versa, but supporting all of them is a huge
+amount of work. One good thing to start with here would be to notice that
+inlining methods of C++ "containers" is currently outright forbidden in
+order to suppress a lot of false alarms due to weird assume()s
+made within inlined methods. There’s a hypothesis that these suppressions
+should have been instead implemented as bug report visitors, which would
+still suppress false positives, but will not prevent us from inlining the 
+ethods, and therefore will not cause other false positives. Verifying this
+hypothesis would be a wonderful accomplishment. Previous understanding of
+the "inlined defensive checks" problem is a pre-requisite for this project.
+(Difficulty: Medium)
+
+
+Reimplement the representation for various symbolic values.
+https://clang.llvm.org/doxygen/classclang_1_1ento_1_1nonloc_1_1LocAsInteger.html
+LocAsInteger is annoying, but alternatives are vague. Casts into
+the opposite direction - integers to pointers - are completely unsupported.
+Pointer-to-pointer casts are a mess; modeling them with https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ElementRegion.html
+ElementRegion  is a disaster and we are suffering a lot from this
+hack, but coming up with a significantly better solution is very hard, as
+there are a lot of corner-cases to cover, and it’s hard to maintain balance
+between richness of our representation of symbolic values and our ability to
+understand when the two different values in fact represent the same thing.
+(Difficulty: Hard)
+
+
+ Provide better alternatives to inlining.
+Sometimes instead of inlining, a much simpler behavior would be more
+efficient. For instance, if the function is pure, then a single bit of
+information “this function is pure” would already be much better than
+conservative evaluation, and sometimes good enough to make inlining not
+worth the effort. Gathering such snippets of information - “partial
+summaries" - automatically, from the more simple to the more complex
+summaries, and re-using them later, probably across translation units, might
+improve our analysis quite a lot, while 

[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-09 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus added a comment.

Mind you, there are some ideas I didn't add to the list -- I just don't know 
how to put them into words nicely, but I'm on it.

Also, a lot of these are outdated, but I joined the project relatively recently, 
so I'm not sure what the state of each of them is.





___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-09 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Whoa thanks! Will have a closer look again.




Comment at: www/analyzer/open_projects.html:33
+
+Handle aggregate construction.
+Aggregates are object that can be brace-initialized without calling a

I'll try to list other constructor kinds that i have in mind.



Comment at: www/analyzer/open_projects.html:76
+still suppress false positives, but will not prevent us from inlining the 
+ethods, and therefore will not cause other false positives. Verifying this
+hypothesis would be a wonderful accomplishment. Previous understanding of

Typo: methods.



Comment at: www/analyzer/open_projects.html:86-87
+the opposite direction - integers to pointers - are completely unsupported.
+Pointer-to-pointer casts are a mess; modeling them with https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ElementRegion.html
+ElementRegion  is a disaster and we are suffering a lot from this
+hack, but coming up with a significantly better solution is very hard, as

I'll try to be more, emm, positive on the website :]



Comment at: www/analyzer/open_projects.html:234
+  Because LLVM doesn't have branches, unfinished checkers first land in
+  alpha, and are only moved out once they are production-ready. Howeever, over
+  the years many checkers got stuck in alpha, and their developtment have

Typo: However.



Comment at: www/analyzer/open_projects.html:235
+  alpha, and are only moved out once they are production-ready. Howeever, over
+  the years many checkers got stuck in alpha, and their developtment have
+  stalled.

Typo: Development.



Comment at: www/analyzer/open_projects.html:262
+same time, until the return value of pthread_mutex_destroy was checked by a
+branch in the code).
+(Difficulty: Easy)

Something is unfinished here.




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-09 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus added a comment.

Thanks!

I admit that the difficulty was mostly chosen at random, so that could be 
brought closer to the actual difficulty of the project.




Comment at: www/analyzer/open_projects.html:86-87
+the opposite direction - integers to pointers - are completely unsupported.
+Pointer-to-pointer casts are a mess; modeling them with https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ElementRegion.html
+ElementRegion  is a disaster and we are suffering a lot from this
+hack, but coming up with a significantly better solution is very hard, as

NoQ wrote:
> I'll try to be more, emm, positive on the website :]
Oh, right, sorry, I tried to "positivitize" most of these, but apparently 
missed this one O:)




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-10 Thread Henry Wong via Phabricator via cfe-commits
MTC added subscribers: teemperor, baloghadamsoftware, blitz.opensource.
MTC added a comment.
Herald added a subscriber: donat.nagy.

In https://reviews.llvm.org/D53024#1258976, @Szelethus wrote:

> Also, a lot of items on this list are outdated, but I joined the project 
> relatively recently, so I'm not sure what the state of each of them is.


AFAIK, the items below are outdated.

- Enhance CFG to model C++ temporaries properly (This problem has basically 
been fixed by @NoQ.)
- Enhance CFG to model C++ new more precisely (This problem has basically been 
fixed by @NoQ.)
- Implement iterators invalidation checker (IIRC, @baloghadamsoftware has 
solved this, see `IteratorChecker.cpp`.)
- Write checkers which catch Copy and Paste errors (IIRC, @teemperor has solved 
this, see `CloneChecker.cpp`.)
- Enhance CFG to model C++ delete more precisely (@blitz.opensource's focus is 
no longer on the Clang Static Analyzer, so we should not keep him as `current 
contact`.)

There are also items whose current state I'm not sure about, such as:

- Explicitly model standard library functions with BodyFarm. (This item is 
marked as **ongoing**, but it doesn't look very active nowadays.)

If I'm wrong, @NoQ and @george.karpenkov, please correct me. In addition, the 
`2018 Bay Area LLVM Developers' Meeting` may bring some new open projects :), 
see http://llvm.org/devmtg/2018-10/talk-abstracts.html#bof6.

Finally, there are some punctuation problems; yes, I viewed this page in a 
browser :).




Comment at: www/analyzer/open_projects.html:98
+efficient. For instance, if the function is pure, then a single bit of
+information “this function is pure” would already be much better than
+conservative evaluation, and sometimes good enough to make inlining not

`“this function is pure”` -> `"this function is pure"`



Comment at: www/analyzer/open_projects.html:100
+conservative evaluation, and sometimes good enough to make inlining not
+worth the effort. Gathering such snippets of information - “partial
+summaries" - automatically, from the more simple to the more complex

`“partial` -> `"partial`



Comment at: www/analyzer/open_projects.html:259
+One of the more annoying parts in this is handling state splits for error
+return values. A “Schrödinger state” technique that was first implemented in
+the PthreadLockChecker (where a mutex was destroyed and not destroyed at the

Also here? `“Schrödinger state”` -> `"Schrödinger state"`



Comment at: www/analyzer/open_projects.html:267
+Many alpha checks can be turned into opt-in lint-like checks
+Path-sensitive lint checks are interesting and they can’t be implemented
+in clang-tidy and there’s clearly an interest in them, but we here aren’t

`can‘t` -> `can't`



Comment at: www/analyzer/open_projects.html:268
+Path-sensitive lint checks are interesting and they can’t be implemented
+in clang-tidy and there’s clearly an interest in them, but we here aren’t
+having enough maintenance power to respond to bugs and false positives. If

`there‘s` -> `there's`
`aren‘t` -> `aren't`




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-11 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus updated this revision to Diff 169193.
Szelethus added a comment.

- Fixed typos
- Fixed stupid characters (Thank you so much @MTC for going the extra mile and 
opening this patch in a browser!)
- Removed outdated entries as mentioned by @MTC


https://reviews.llvm.org/D53024

Files:
  www/analyzer/open_projects.html

Index: www/analyzer/open_projects.html
===
--- www/analyzer/open_projects.html
+++ www/analyzer/open_projects.html
@@ -25,13 +25,94 @@
   
   Core Analyzer Infrastructure
   
+Implement a dataflow flamework.
+
+ (Difficulty: Hard) 
+
+
+Handle aggregate construction.
+Aggregates are object that can be brace-initialized without calling a
+constructor (i.e., no https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html
+CXXConstructExpr in the AST), but potentially calling
+constructors for their fields and (since C++17) base classes - and these
+constructors of sub-objects need to know what object (field in the
+aggregate) they are constructing. Moreover, if the aggregate contains
+references, lifetime extension needs to be modeled. Aggregates can be
+nested, so https://clang.llvm.org/doxygen/classclang_1_1ConstructionContext.html
+ConstructionContext can potentially cover an unlimited amount of
+statements. One can start untangling this problem by trying to replace the
+hacky https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html
+ParentMap lookup in https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430
+CXXConstructExpr::CK_NonVirtualBase branch of
+ExprEngine::VisitCXXConstructExpr() with some actual
+support for the feature.
+ (Difficulty: Medium) 
+
+
+Fix CFG for GNU "binary conditional" operator ?:.
+CFG for GNU "binary conditional" operator ?: is broken in
+C++. Its condition-and-LHS need to only be evaluated once.
+(Difficulty: Easy)
+
+
+Handle unions.
+Currently in the analyzer, the value of a union is always regarded as
+unknown. There has been some discussion about this on the http://lists.llvm.org/pipermail/cfe-dev/2017-March/052864.html
+mailing list already, but it is still an untouched area.
+ (Difficulty: Medium) 
+
+
+Enhance the modeling of the standard library.
+There is a huge amount of checker work for teaching the Static Analyzer
+about the C++ standard library. It is very easy to explain to the static
+analyzer that calling .length() on an empty std::string
+ will yield 0, and vice versa, but supporting all of them is a huge
+amount of work. One good thing to start with here would be to notice that
+inlining methods of C++ "containers" is currently outright forbidden in
+order to suppress a lot of false alarms due to weird assume()s
+made within inlined methods. There's a hypothesis that these suppressions
+should have been instead implemented as bug report visitors, which would
+still suppress false positives, but will not prevent us from inlining the 
+methods, and therefore will not cause other false positives. Verifying this
+hypothesis would be a wonderful accomplishment. Previous understanding of
+the "inlined defensive checks" problem is a pre-requisite for this project.
+(Difficulty: Medium)
+
+
+Reimplement the representation for various symbolic values.
+https://clang.llvm.org/doxygen/classclang_1_1ento_1_1nonloc_1_1LocAsInteger.html
+LocAsInteger is annoying, but alternatives are vague. Casts into
+the opposite direction - integers to pointers - are completely unsupported.
+Pointer-to-pointer casts are a mess; modeling them with https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ElementRegion.html
+ElementRegion  is a impractical, and we are suffering a lot from
+this hack, but coming up with a significantly better solution is very hard,
+as there are a lot of corner-cases to cover, and it's hard to maintain
+balance between richness of our representation of symbolic values and our
+ability to understand when the two different values in fact represent the
+same thing.
+(Difficulty: Hard)
+
+
+ Provide better alternatives to inlining.
+Sometimes instead of inlining, a much simpler behavior would be more
+efficient. For instance, if the function is pure, then a single bit of
+information "this function is pure" would already be much better than
+conservative evaluation, and sometimes good enough to make inlining not
+worth the effort. Gathering such snippets of information - "partial
+summaries" - automatically, from the more simple to the more complex
+summaries, and re-using them later, probably across translation units, might
+improve our analysis quite a lot, while being something that can be worked
+on incrementally and doesn't require checkers to rea

[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-11 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

> In addition 2018 Bay Area LLVM Developers' Meetings may bring some new open 
> projects :)

Actually, let's commit this before the conference, even if it's not perfect, so 
that people who suddenly get inspired to work on the Static Analyzer already 
have an updated list :)




Comment at: www/analyzer/open_projects.html:29
+Implement a dataflow flamework.
+
+ (Difficulty: Hard) 

@george.karpenkov your turn here.



Comment at: www/analyzer/open_projects.html:33
+
+Handle aggregate construction.
+Aggregates are object that can be brace-initialized without calling a

Actually, maybe let's turn this into a single project with sub-bullets and list 
all problems instead of just one.

Ok if i just dump my thoughts with markdown? >_<

Something like:
* Handle more ways of constructing C++ objects by identifying 
`ConstructionContext`s in the `CFG` and using them to identify and track the 
object until the permanent storage for the object is evaluated.
  * Constructors of fields and (since C++17) base classes of aggregates. When 
aggregates are brace-initialized, their own trivial constructor is not 
called, but the constructors for sub-objects are. It would be necessary to see 
beyond the `InitListExpr` in order to find out which field of which aggregate 
is actually constructed. Because brace initializers can be nested, 
`ConstructionContext` for this constructor would potentially contain an 
unbounded number of intermediate `InitListExpr`s. Additionally, if the field 
is of reference type, lifetime extension needs to be modeled.
  * Constructors within `new[]`. Once an array of objects is allocated via 
`operator new[]`, constructors for all elements of the array are called. 
Arguably, we should evaluate at least a few of them, as if it was some sort of 
loop. The same applies to destructors within `delete[]`. `ConstructionContext` 
is already available here, so it is likely that there's no CFG work required, 
though indicating the presence of a loopy control flow in the CFG might be 
helpful.
  * Constructors that can be elided due to Named Return Value Optimization 
(NRVO). If a local variable within a function is of object type and is returned 
by value on all return statements, then the compiler is allowed (but not 
required, even in C++17) to immediately store this variable at the address 
into which the function would put the return value, thereby skipping 
("eliding") the copy/move constructor call (even if it has user-visible side 
effects). Variables eligible for NRVO can be easily identified in the AST via 
`VarDecl::isNRVOVariable()`, so no CFG work is necessary here, but the Static 
Analyzer needs to realize that the respective `VarRegion` should be entirely 
replaced with the return region of the function for the whole duration of the 
respective stack frame.
  * Constructors of virtual method arguments. Constructors of function 
arguments are already supported, but Static Analyzer has problems finding the 
correct parameter variable declaration and the correct stack frame for the 
object under construction if it doesn't know how exactly a virtual call is 
devirtualized. It might help to simplify the identity of the parameter region 
to exclude the `Decl` of the callee and its parameters from its identity.
  * Constructors of default arguments. In C++, functions can have default values 
for parameters that are re-computed at run-time every time the function is 
called without specifying the argument explicitly. There isn't much difference 
between default arguments and normal arguments when it comes to 
`ConstructionContext`-related problems. But the actual problem here is that the 
expression that initializes the parameter is not part of the current stack 
frame, but instead lives in the middle of nowhere within the parameter 
declaration. This means that if two calls of the same function with defaulted 
arguments are present within the same full-expression, we may accidentally 
assign two different values (corresponding to two different default argument 
objects) to the same expression in the Environment, which is incorrect. One of 
the possible solutions is to define an additional `LocationContext` to 
discriminate between different default argument evaluations, similarly to how 
`StackFrameContext` allows discriminating between values of the same expression 
on different layers of recursion.
  * Constructors that perform lambda captures. If a lambda captures a variable 
of object type by value, the object needs to be copied into the lambda, which 
implies calling a copy-constructor. A new kind of `ConstructionContext` would 
need to be defined in order to identify the memory region occupied by the 
lambda object (most likely a `MaterializeTemporaryExpr` that surrounds the 
`LambdaExpr`) and the sub-region that corresponds to the implicit field of the 
lambda object that contains the capt
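
To make three of the bullets above concrete, here is a minimal sketch of the 
C++ behavior the analyzer would have to model. It only demonstrates language 
semantics; it does not use any Clang API. `Tracked`, `Inner`, `Outer`, 
`g_ctor_calls` and the helper functions are hypothetical names invented for 
this illustration.

```cpp
#include <cassert>

// Hypothetical instrumentation, not part of Clang: counts how many times
// the default constructor of Tracked has run.
static int g_ctor_calls = 0;

struct Tracked {
    int id;
    Tracked() : id(g_ctor_calls++) {}
};

// Aggregates: no user-declared constructors, only public data members.
struct Inner { Tracked a; Tracked b; };
struct Outer { Inner inner; Tracked c; };

// Nested brace-initialization: neither Outer nor Inner has a constructor to
// call, yet Tracked() runs once for each of the three Tracked sub-objects.
int ctorCallsForNestedAggregate() {
    const int before = g_ctor_calls;
    Outer o{{Tracked(), Tracked()}, Tracked()};
    (void)o;
    return g_ctor_calls - before;
}

// operator new[] allocates storage, then one constructor runs per element.
int ctorCallsForArrayNew(unsigned n) {
    const int before = g_ctor_calls;
    Tracked *arr = new Tracked[n];
    delete[] arr;
    return g_ctor_calls - before;
}

// The default argument expression is re-evaluated on every call that omits
// the argument, so each call receives its own Tracked object.
int idOf(Tracked t = Tracked()) { return t.id; }

bool defaultArgumentsAreDistinctObjects() {
    // Two calls in one full-expression: two separate default-argument objects.
    return idOf() != idOf();
}
```

In the nested case, the constructor call for `inner.a` sits behind two 
brace-initializer levels, which is the situation the first bullet describes 
for `ConstructionContext`.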

[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-11 Thread George Karpenkov via Phabricator via cfe-commits
george.karpenkov added a comment.

@Szelethus I take it this is mostly formed from @NoQ's email? Language could use 
polishing in quite a few places; could I just commandeer this revision and try 
to fix it?




Comment at: www/analyzer/open_projects.html:29
+Implement a dataflow flamework.
+
+ (Difficulty: Hard) 

NoQ wrote:
> @george.karpenkov your turn here.
Let's just skip it.
I suspect the explanation would not benefit here, because a large body of 
assumed knowledge would be required to take on the project.



Comment at: www/analyzer/open_projects.html:314
+bound the value of the resulting expression. Bonus points for handling masks
+followed by shifts, e.g. ($sym & 0b1100) >> 2.
 (Difficulty: Easy)

This is handled by Z3 invalidation; I'm not sure it's worth spending much 
effort here in the long run.
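
For reference, a small sketch of the bound in question, written as plain C++ 
rather than analyzer code. `maskThenShift` and `maxMaskThenShift` are 
hypothetical helpers; the exhaustive loop merely confirms that 
`($sym & 0b1100) >> 2` always lies in the range [0, 3], which is the kind of 
range a constraint manager could infer without a solver.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the symbolic expression ($sym & 0b1100) >> 2.
constexpr std::uint32_t maskThenShift(std::uint32_t sym) {
    return (sym & 0b1100u) >> 2;
}

// Exhaustively checks every 16-bit input and returns the largest result seen.
std::uint32_t maxMaskThenShift() {
    std::uint32_t maxSeen = 0;
    for (std::uint32_t sym = 0; sym <= 0xFFFFu; ++sym) {
        const std::uint32_t r = maskThenShift(sym);
        if (r > maxSeen)
            maxSeen = r;
    }
    return maxSeen;
}
```

The mask limits the value to at most `0b1100`, and the shift by 2 then maps 
that upper bound to 3.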




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-12 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus added a comment.

In https://reviews.llvm.org/D53024#1262976, @george.karpenkov wrote:

> @Szelethus I take it this is mostly formed from @NoQ's email? Language could 
> use polishing in quite a few places; could I just commandeer this revision 
> and try to fix it?


Yes it is. Though the other item you mentioned should be on this list -- I just 
simply forgot to put it there before updating the diff O:). Feel free to 
commandeer it, this patch is barely my work.




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread George Karpenkov via Phabricator via cfe-commits
george.karpenkov added a comment.

I have tried to clean up the list.




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread George Karpenkov via Phabricator via cfe-commits
george.karpenkov updated this revision to Diff 169758.

https://reviews.llvm.org/D53024

Files:
  clang/www/analyzer/open_projects.html

Index: clang/www/analyzer/open_projects.html
===
--- clang/www/analyzer/open_projects.html
+++ clang/www/analyzer/open_projects.html
@@ -22,162 +22,187 @@
 to the http://lists.llvm.org/mailman/listinfo/cfe-dev>cfe-dev
 mailing list to notify other members of the community.
 
-  
-  Core Analyzer Infrastructure
-  
-Explicitly model standard library functions with BodyFarm.
-http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html BodyFarm 
-allows the analyzer to explicitly model functions whose definitions are 
-not available during analysis. Modeling more of the widely used functions 
-(such as the members of std::string) will improve precision of the
-analysis. 
-(Difficulty: Easy, ongoing)
-
+
+  Release checkers from "alpha"
+New checkers which were contributed to the analyzer,
+but have not passed a rigorous evaluation process,
+are committed as "alpha checkers" (from "alpha version"),
+and are not enabled by default.
+
+The development of many such checkers has stalled over the years.
+Current "alpha" checkers need a cleanup:
+checkers which have been there for a long time should either
+be improved up to a point where they can be enabled by default,
+or removed, if such an improvement is not possible.
+Most notably, these checkers could be "graduated" out of alpha
+if a consistent effort is applied:
+
+
+  alpha.security.ArrayBound and
+  alpha.security.ArrayBoundV2
+  Array bounds checking is a desired feature,
+  but having an acceptable rate of false positives might not be possible
+  without a proper
+  https://en.wikipedia.org/wiki/Widening_(computer_science) loop widening support.
+  Additionally, it might be more promising to perform index checking based on
+  https://en.wikipedia.org/wiki/Taint_checking tainted index values.
+  (Difficulty: Medium)
+  
+
+  alpha.cplusplus.MisusedMovedObject
+The checker emits a warning on objects which were used after
+https://en.cppreference.com/w/cpp/utility/move move.
+Currently it has an overly high false positive rate due to classes
+which have a well-defined semantics for use-after-move.
+This property does not hold for STL objects, but is often the case
+for custom containers.
+  (Difficulty: Medium)
+  
+
+  alpha.unix.StreamChecker
+A SimpleStreamChecker has been presented in the Building a Checker in 24 
+Hours talk 
+(http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf slides
+https://youtu.be/kdxlsP5QVPw video).
+
+This alpha checker is an attempt to write a production grade stream checker.
+However, it was found to have an unacceptably high false positive rate.
+One of the found problems was that eagerly splitting the state
+based on whether the system call may fail leads to too many reports.
+A delayed split where the implication is stored in the state
+(similarly to nullability implications in TrustNonnullChecker)
+may produce much better results.
+(Difficulty: Medium)
+  
+
+  
 
-Handle floating-point values.
-Currently, the analyzer treats all floating-point values as unknown.
-However, we already have most of the infrastructure we need to handle
-floats: RangeConstraintManager. This would involve adding a new SVal kind
-for constant floats, generalizing the constraint manager to handle floats
-and integers equally, and auditing existing code to make sure it doesn't
-make untoward assumptions.
- (Difficulty: Medium)
-
-
-Implement generalized loop execution modeling.
-Currently, the analyzer simply unrolls each loop N times. This 
-means that it will not execute any code after the loop if the loop is 
-guaranteed to execute more than N times. This results in lost 
-basic block coverage. We could continue exploring the path if we could 
-model a generic i-th iteration of a loop.
- (Difficulty: Hard)
+  Improved C++ support
+  
+Handle aggregate construction.
+  https://en.cppreference.com/w/cpp/language/aggregate_initialization Aggregates
+  are objects that can be brace-initialized without calling a
+  constructor (that is, https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html
+  CXXConstructExpr does not occur in the AST),
+  but potentially calling
+  constructors for their fields and base classes
+  These
+  constructors of sub-objects need to know what object they are constructing.
+  Moreover, if the aggregate contains
+  references, lifetime extension needs to be properly modeled.
+
+  One can start un
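
The `alpha.cplusplus.MisusedMovedObject` entry in the diff above mentions 
classes with well-defined use-after-move semantics. A minimal sketch of such a 
class follows, assuming a custom container that documents its moved-from 
state; `IntBuffer` and `sizeAfterMove` are hypothetical and do not come from 
the patch or from Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical custom container (not from Clang or the patch) that documents
// its moved-from state: a moved-from IntBuffer is guaranteed to be empty.
class IntBuffer {
    std::vector<int> data_;

public:
    IntBuffer() = default;
    explicit IntBuffer(std::vector<int> values) : data_(std::move(values)) {}

    IntBuffer(IntBuffer &&other) noexcept : data_(std::move(other.data_)) {
        other.data_.clear();  // enforce the documented moved-from state
    }

    std::size_t size() const { return data_.size(); }
};

// Reads `source` after it was moved from. This is well-defined for IntBuffer,
// yet it is the pattern a use-after-move checker would flag.
std::size_t sizeAfterMove() {
    IntBuffer source(std::vector<int>{1, 2, 3});
    IntBuffer sink(std::move(source));
    return source.size() + sink.size();  // 0 + 3
}
```

A warning on `source.size()` here would be a false positive, which is the 
source of noise the entry describes.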

[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus accepted this revision.
Szelethus added a comment.
This revision is now accepted and ready to land.

Thanks, this looks great!




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread George Karpenkov via Phabricator via cfe-commits
george.karpenkov added a comment.

@Szelethus thanks! BTW if you really want to invest in maintaining the website,
I think it's totally worth it to change all contents to markdown,
and then have a script to generate HTML from that.
Committers would be expected to manually run that script.
This would also solve our problem with the disappearing header.




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus added a comment.

I dislike web development, but that would indeed be invaluable. I'll take a 
look.




[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/www/analyzer/open_projects.html:27-32
+New checkers which were contributed to the analyzer,
+but have not passed a rigorous evaluation process,
+are committed as "alpha checkers" (from "alpha version"),
+and are not enabled by default.
+
+The development of many such checkers has stalled over the years.

This is extremely important to get right. Alpha doesn't mean "i did an 
experiment, let me dump my code so that it wasn't lost, maybe others will pick 
it up and turn it into a useful checker if they figure out how". Alpha doesn't 
mean "a checker that power-users can use at their own risk when they want to 
find more bugs". Alpha doesn't mean "i think this checker is great but 
maintainers think it's bad so they keep me in alpha but i'm happy because i can 
write in my resume that i'm an llvm contributor". All of these are super 
popular misconceptions.

Alpha means "i'm working on it". That's it.

Let's re-phrase to something like: "When a new checker is being developed 
incrementally, it is committed into clang and is put into the hidden "alpha" 
package (from "alpha version"). Ideally, once all desired functionality of the 
checker is implemented, the checker should be moved out of the alpha package 
and become enabled by default or recommended as an opt-in, but development of 
many alpha checkers has stalled over the years."



Comment at: clang/www/analyzer/open_projects.html:80
 
-Handle floating-point values.
-Currently, the analyzer treats all floating-point values as unknown.
-However, we already have most of the infrastructure we need to handle
-floats: RangeConstraintManager. This would involve adding a new SVal kind
-for constant floats, generalizing the constraint manager to handle floats
-and integers equally, and auditing existing code to make sure it doesn't
-make untoward assumptions.
- (Difficulty: Medium)
-
-
-Implement generalized loop execution modeling.
-Currently, the analyzer simply unrolls each loop N times. This 
-means that it will not execute any code after the loop if the loop is 
-guaranteed to execute more than N times. This results in lost 
-basic block coverage. We could continue exploring the path if we could 
-model a generic i-th iteration of a loop.
- (Difficulty: Hard)
+  Improved C++ support
+  

I guess let's add the rest of the constructors from my message above.



Comment at: clang/www/analyzer/open_projects.html:88-89
+  but potentially calling
+  constructors for their fields and base classes
+  These
+  constructors of sub-objects need to know what object they are 
constructing.

Something's wrong here.
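
For context, here is a minimal sketch of the construct the quoted text is talking about. The type names (`Field`, `Agg`) and the helper `fieldValue()` are hypothetical and do not come from the patch; the sketch only illustrates that brace-initializing an aggregate calls no constructor of the aggregate itself, while still running constructors for its sub-objects and lifetime-extending a temporary bound to a reference member. Base classes of aggregates (C++17) are left out to keep it short.

```cpp
#include <cassert>
#include <string>

// Hypothetical sub-object type with a user-provided constructor.
struct Field {
  int Value;
  Field(int V) : Value(V) {}
};

// An aggregate: no user-provided constructors, all members public.
struct Agg {
  Field F;              // constructed via Field::Field(int)
  const std::string &R; // reference member bound at initialization
};

int fieldValue() {
  // No constructor of Agg is called here, but Field(int) runs for a.F,
  // and the std::string temporary bound to a.R lives as long as 'a'.
  Agg a{Field(7), std::string("tmp")};
  assert(a.R == "tmp");
  return a.F.Value;
}
```

Calling `fieldValue()` returns the value stored in the field sub-object. The constructor call for `a.F` is the kind of sub-object construction that, per the text under review, needs to know which object it is constructing.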


https://reviews.llvm.org/D53024



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-15 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

The current text looks great.


https://reviews.llvm.org/D53024





[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-16 Thread Umann Kristóf via Phabricator via cfe-commits
Szelethus added a comment.

Why d




Comment at: clang/www/analyzer/open_projects.html:153
+  problem still remains open.
+
+   (Difficulty: Hard)

Did you mean to have this newline here? Difficulty seems to have a weird 
placement when viewed in a browser.


https://reviews.llvm.org/D53024





[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-16 Thread George Karpenkov via Phabricator via cfe-commits
george.karpenkov added inline comments.
Herald added a subscriber: dkrupp.



Comment at: clang/www/analyzer/open_projects.html:27-32
+New checkers which were contributed to the analyzer,
+but have not passed a rigorous evaluation process,
+are committed as "alpha checkers" (from "alpha version"),
+and are not enabled by default.
+
+The development of many such checkers has stalled over the years.

NoQ wrote:
> This is extremely important to get right. Alpha doesn't mean "i did an 
> experiment, let me dump my code so that it wasn't lost, maybe others will 
> pick it up and turn it into a useful checker if they figure out how". Alpha 
> doesn't mean "a checker that power-users can use at their own risk when they 
> want to find more bugs". Alpha doesn't mean "i think this checker is great 
> but maintainers think it's bad so they keep me in alpha but i'm happy because 
> i can write in my resume that i'm an llvm contributor". All of these are 
> super popular misconceptions.
> 
> Alpha means "i'm working on it". That's it.
> 
> Let's re-phrase to something like: "When a new checker is being developed 
> incrementally, it is committed into clang and is put into the hidden "alpha" 
> package (from "alpha version"). Ideally, once all desired functionality of 
> the checker is implemented, the checker should be moved out of the alpha 
> package and either become enabled by default or be recommended as opt-in, 
> but development of many alpha checkers has stalled over the years."
Let's ignore (3) as a red herring, but I'm not sure I see the difference 
between (1), (2) and (4). When someone works actively on a checker, but then 
stops, it immediately transfers from state (4) to state (2) and optionally (1)



Comment at: clang/www/analyzer/open_projects.html:80
 
-Handle floating-point values.
-Currently, the analyzer treats all floating-point values as unknown.
-However, we already have most of the infrastructure we need to handle
-floats: RangeConstraintManager. This would involve adding a new SVal kind
-for constant floats, generalizing the constraint manager to handle floats
-and integers equally, and auditing existing code to make sure it doesn't
-make untoward assumptions.
- (Difficulty: Medium)
-
-
-Implement generalized loop execution modeling.
-Currently, the analyzer simply unrolls each loop N times. This 
-means that it will not execute any code after the loop if the loop is 
-guaranteed to execute more than N times. This results in lost 
-basic block coverage. We could continue exploring the path if we could 
-model a generic i-th iteration of a loop.
- (Difficulty: Hard)
+  Improved C++ support
+  

NoQ wrote:
> I guess let's add the rest of the constructors from my message above.
What message above? The original email?


https://reviews.llvm.org/D53024





[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-16 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/www/analyzer/open_projects.html:27-32
+New checkers which were contributed to the analyzer,
+but have not passed a rigorous evaluation process,
+are committed as "alpha checkers" (from "alpha version"),
+and are not enabled by default.
+
+The development of many such checkers has stalled over the years.

george.karpenkov wrote:
> NoQ wrote:
> > This is extremely important to get right. Alpha doesn't mean "i did an 
> > experiment, let me dump my code so that it wasn't lost, maybe others will 
> > pick it up and turn it into a useful checker if they figure out how". Alpha 
> > doesn't mean "a checker that power-users can use at their own risk when 
> > they want to find more bugs". Alpha doesn't mean "i think this checker is 
> > great but maintainers think it's bad so they keep me in alpha but i'm happy 
> > because i can write in my resume that i'm an llvm contributor". All of 
> > these are super popular misconceptions.
> > 
> > Alpha means "i'm working on it". That's it.
> > 
> > Let's re-phrase to something like: "When a new checker is being developed 
> > incrementally, it is committed into clang and is put into the hidden 
> > "alpha" package (from "alpha version"). Ideally, once all desired 
> > functionality of the checker is implemented, the checker should be moved 
> > out of the alpha package and either become enabled by default or be 
> > recommended as opt-in, but development of many alpha checkers has stalled 
> > over the years."
> Let's ignore (3) as a red herring, but I'm not sure I see the difference 
> between (1), (2) and (4). When someone works actively on a checker, but then 
> stops, it immediately transfers from state (4) to state (2) and optionally (1)
This is mostly about the attitude with which people should put stuff into the 
`alpha` package. Yeah, if someone stops for inevitable reasons, then we're 
inevitably left with unmaintained experimental code in the repo; I guess 
that's why this section exists. But contributors shouldn't plan for this from 
the start. A lot of contributors literally believe that alpha is, by design, a 
stockpile of incomplete work and weird experiments, and that it'll make 
everybody happy if they add more incomplete work and weird experiments to it. 
I myself misunderstood this for a long time, and I want to address this 
misconception.


https://reviews.llvm.org/D53024





[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-16 Thread George Karpenkov via Phabricator via cfe-commits
george.karpenkov updated this revision to Diff 169920.

https://reviews.llvm.org/D53024

Files:
  clang/www/analyzer/open_projects.html

Index: clang/www/analyzer/open_projects.html
===
--- clang/www/analyzer/open_projects.html
+++ clang/www/analyzer/open_projects.html
@@ -22,162 +22,219 @@
 to the http://lists.llvm.org/mailman/listinfo/cfe-dev>cfe-dev
 mailing list to notify other members of the community.
 
-  
-  Core Analyzer Infrastructure
+
+  Release checkers from "alpha"
+New checkers which were contributed to the analyzer,
+but have not passed a rigorous evaluation process,
+are committed as "alpha checkers" (from "alpha version"),
+and are not enabled by default.
+
+Ideally, only the checkers which are actively being worked on should be in
+"alpha",
+but over the years the development of many of those has stalled.
+Such checkers need a cleanup:
+checkers which have been there for a long time should either
+be improved up to a point where they can be enabled by default,
+or removed, if such an improvement is not possible.
+Most notably, these checkers could be "graduated" out of alpha
+if a consistent effort is applied:
+
+
+  alpha.security.ArrayBound and
+  alpha.security.ArrayBoundV2
+  Array bounds checking is a desired feature,
+  but having an acceptable rate of false positives might not be possible
+  without a proper
+  https://en.wikipedia.org/wiki/Widening_(computer_science)">loop widening support.
+  Additionally, it might be more promising to perform index checking based on
+  https://en.wikipedia.org/wiki/Taint_checking";>tainted index values.
+  (Difficulty: Medium)
+  
+
+  alpha.cplusplus.MisusedMovedObject
+The checker emits a warning on objects which were used after
+https://en.cppreference.com/w/cpp/utility/move";>move.
+Currently it has an overly high false positive rate due to classes
+which have a well-defined semantics for use-after-move.
+This property does not hold for STL objects, but is often the case
+for custom containers.
+  (Difficulty: Medium)
+  
+
+  alpha.unix.StreamChecker
+A SimpleStreamChecker has been presented in the Building a Checker in 24 
+Hours talk 
+(http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf";>slides
+https://youtu.be/kdxlsP5QVPw";>video).
+
+This alpha checker is an attempt to write a production grade stream checker.
+However, it was found to have an unacceptably high false positive rate.
+One of the found problems was that eagerly splitting the state
+based on whether the system call may fail leads to too many reports.
+A delayed split where the implication is stored in the state
+(similarly to nullability implications in TrustNonnullChecker)
+may produce much better results.
+(Difficulty: Medium)
+  
+
+  
+
+  Improved C++ support
   
-Explicitly model standard library functions with BodyFarm.
-http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html";>BodyFarm 
-allows the analyzer to explicitly model functions whose definitions are 
-not available during analysis. Modeling more of the widely used functions 
-(such as the members of std::string) will improve precision of the
-analysis. 
-(Difficulty: Easy, ongoing)
+Handle aggregate construction.
+  https://en.cppreference.com/w/cpp/language/aggregate_initialization";>Aggregates
+  are objects that can be brace-initialized without calling a
+  constructor (that is, https://clang.llvm.org/doxygen/classclang_1_1CXXConstructExpr.html";>
+  CXXConstructExpr does not occur in the AST),
+  but potentially calling
+  constructors for their fields and base classes
+  These
+  constructors of sub-objects need to know what object they are constructing.
+  Moreover, if the aggregate contains
+  references, lifetime extension needs to be properly modeled.
+
+  One can start untangling this problem by trying to replace the
+  current ad-hoc https://clang.llvm.org/doxygen/classclang_1_1ParentMap.html";>
+  ParentMap lookup in https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#l00430";>
+  CXXConstructExpr::CK_NonVirtualBase branch of
+  ExprEngine::VisitCXXConstructExpr()
+  with proper support for the feature.
+   (Difficulty: Medium) 
 
 
-Handle floating-point values.
-Currently, the analyzer treats all floating-point values as unknown.
-However, we already have most of the infrastructure we need to handle
-floats: RangeConstraintManager. This would involve adding a new SVal kind
-for constant floats, generalizing the constraint manager to handle floats
-and integers equally, and auditing existing code to make sure it doesn't
-   

[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-16 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ accepted this revision.
NoQ added a comment.

Ok, let's see if this actually works :)


https://reviews.llvm.org/D53024





[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-16 Thread George Karpenkov via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rC344663: [analyzer] [www] Updated a list of open projects 
(authored by george.karpenkov, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D53024?vs=169920&id=169933#toc

Repository:
  rC Clang

https://reviews.llvm.org/D53024

Files:
  www/analyzer/open_projects.html

Index: www/analyzer/open_projects.html
===
--- www/analyzer/open_projects.html
+++ www/analyzer/open_projects.html
@@ -22,162 +22,219 @@
 to the http://lists.llvm.org/mailman/listinfo/cfe-dev>cfe-dev
 mailing list to notify other members of the community.
 
-  
-  Core Analyzer Infrastructure
-  
-Explicitly model standard library functions with BodyFarm.
-http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html";>BodyFarm 
-allows the analyzer to explicitly model functions whose definitions are 
-not available during analysis. Modeling more of the widely used functions 
-(such as the members of std::string) will improve precision of the
-analysis. 
-(Difficulty: Easy, ongoing)
-
-
-Handle floating-point values.
-Currently, the analyzer treats all floating-point values as unknown.
-However, we already have most of the infrastructure we need to handle
-floats: RangeConstraintManager. This would involve adding a new SVal kind
-for constant floats, generalizing the constraint manager to handle floats
-and integers equally, and auditing existing code to make sure it doesn't
-make untoward assumptions.
- (Difficulty: Medium)
-
-
-Implement generalized loop execution modeling.
-Currently, the analyzer simply unrolls each loop N times. This 
-means that it will not execute any code after the loop if the loop is 
-guaranteed to execute more than N times. This results in lost 
-basic block coverage. We could continue exploring the path if we could 
-model a generic i-th iteration of a loop.
- (Difficulty: Hard)
-
-
-Enhance CFG to model C++ temporaries properly.
-There is an existing implementation of this, but it's not complete and
-is disabled in the analyzer.
-(Difficulty: Medium; current contact: Alex McCarthy)
-
-Enhance CFG to model exception-handling properly.
-Currently exceptions are treated as "black holes", and exception-handling
-control structures are poorly modeled (to be conservative). This could be
-much improved for both C++ and Objective-C exceptions.
-(Difficulty: Medium)
-
-Enhance CFG to model C++ new more precisely.
-The current representation of new does not provide an easy
-way for the analyzer to model the call to a memory allocation function
-(operator new), then initialize the result with a constructor
-call. The problem is discussed at length in
-http://llvm.org/bugs/show_bug.cgi?id=12014";>PR12014.
-(Difficulty: Easy; current contact: Karthik Bhat)
-
-Enhance CFG to model C++ delete more precisely.
-Similarly, the representation of delete does not include
-the call to the destructor, followed by the call to the deallocation
-function (operator delete). One particular issue 
-(noreturn destructors) is discussed in
-http://llvm.org/bugs/show_bug.cgi?id=15599";>PR15599
-(Difficulty: Easy; current contact: Karthik Bhat)
-
-Implement a BitwiseConstraintManager to handle http://llvm.org/bugs/show_bug.cgi?id=3098";>PR3098.
-Constraints on the bits of an integer are not easily representable as
-ranges. A bitwise constraint manager would model constraints such as "bit 32
-is known to be 1". This would help code that made use of bitmasks.
-(Difficulty: Medium)
-
-
-Track type info through casts more precisely.
-The DynamicTypePropagation checker is in charge of inferring a region's
-dynamic type based on what operations the code is performing. Casts are a
-rich source of type information that the analyzer currently ignores. They
-are tricky to get right, but might have very useful consequences.
-(Difficulty: Medium)
-
-Design and implement alpha-renaming.
-Implement unifying two symbolic values along a path after they are 
-determined to be equal via comparison. This would allow us to reduce the 
-number of false positives and would be a building step to more advanced 
-analyses, such as summary-based interprocedural and cross-translation-unit 
-analysis. 
-(Difficulty: Hard)
-
-  
+
+  Release checkers from "alpha"
+New checkers which were contributed to the analyzer,
+but have not passed a rigorous evaluation process,
+are committed as "alpha checkers" (from "alpha version"),
+and are not enabled by default.
+
+Ideally, only the checkers which are actively being worked on should be in
+"alpha",
+but over the years the development of many 

[PATCH] D53024: [analyzer][www] Add more open projects

2018-10-17 Thread Daniel Krupp via Phabricator via cfe-commits
dkrupp added inline comments.



Comment at: www/analyzer/open_projects.html:198
+  or using a dataflow framework.
+  (Difficulty: Hard)
+

It is probably worth mentioning here that there is already a macro language 
for describing summaries of standard library functions in 
StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp. 

This macro language could be refactored into a separate header file so that it 
could be used in other checkers too. It could also be extended for C++. 

Another useful addition would be to let users describe these summaries at 
run time (in YAML files, for example) so that they can model their own 
proprietary library functions. 

Then, as a next step, we could introduce a flow-sensitive analysis to generate 
such summaries automatically. That is indeed a hard problem; the others above 
should not be too difficult.


Repository:
  rC Clang

https://reviews.llvm.org/D53024


