[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-04-27 Thread Henry Wong via Phabricator via cfe-commits
MTC added a comment.

ping^2


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-04-27 Thread Aleksei Sidorin via Phabricator via cfe-commits
a.sidorin added a comment.

Hi Henry. I had a quick look at the patch, here are some remarks.




Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2092
+  // 'invalidateRegions()', which will remove the  pair
+  // in CStringLength map. So calls 'InvalidateBuffer()' after getting
+  // old string length and before setting the new string length.

"So we call"?



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2100
+std::tie(StateNullChar, StateNonNullChar) =
+assumeZero(C, State, CharVal, Ctx.UnsignedCharTy);
+

I think we can use the argument type here (which is int).
One more question. Could we use `SVal::isZeroConstant()` here instead? Do we 
need the path-sensitivity for the value or we can just check if the value is a 
constant zero?



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2103
+if (StateNullChar && !StateNonNullChar) {
+  // If the value of the second arguement of 'memset()' is zero, set the
+  // string length of destination buffer to 0 directly.

"argument"



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2127
+ /*IsSourceBuffer*/ false, Size);
+  }
+

Could you please factor this chunk to a function? It makes the method too big.



Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:148
+  ProgramStateRef NewState = makeWithStore(NewStore);
+  return Mgr.getOwningEngine()
+ ? Mgr.getOwningEngine()->processRegionChange(NewState, R, LCtx)

Do clients of `overwriteRegion` really need to call checkers?



Comment at: lib/StaticAnalyzer/Core/RegionStore.cpp:446
+  // default binding.
+  StoreRef OverwriteRegion(Store store, const MemRegion *R, SVal V) override {
+RegionBindingsRef B = getRegionBindings(store);

LLVM naming conventions say that function names should start with small letter.



Comment at: lib/StaticAnalyzer/Core/RegionStore.cpp:1718
 
-llvm_unreachable("Unknown default value");
+return val;
   }

Could you please explain what is the reason of this change? Do we get new value 
kinds?



Comment at: lib/StaticAnalyzer/Core/Store.cpp:72
 
+StoreRef StoreManager::OverwriteRegion(Store store, const MemRegion *R, SVal 
V) {
+  return StoreRef(store, *this); 

Could we make this method pure virtual?


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-02 Thread Henry Wong via Phabricator via cfe-commits
MTC added a comment.

Sorry for the long delay, I have just finished my holiday.

Thanks a lot for the review, I will fix the typo in next update.




Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2100
+std::tie(StateNullChar, StateNonNullChar) =
+assumeZero(C, State, CharVal, Ctx.UnsignedCharTy);
+

a.sidorin wrote:
> I think we can use the argument type here (which is int).
> One more question. Could we use `SVal::isZeroConstant()` here instead? Do we 
> need the path-sensitivity for the value or we can just check if the value is 
> a constant zero?
Yea, it's should be int.

`SVal::isZeroConstant()` is enough here, thanks!

One question that has nothing to do with the patch, I would like to know if 
there is a standard to determine whether there is a need for state splitting.




Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2127
+ /*IsSourceBuffer*/ false, Size);
+  }
+

a.sidorin wrote:
> Could you please factor this chunk to a function? It makes the method too big.
will do.



Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:148
+  ProgramStateRef NewState = makeWithStore(NewStore);
+  return Mgr.getOwningEngine()
+ ? Mgr.getOwningEngine()->processRegionChange(NewState, R, LCtx)

a.sidorin wrote:
> Do clients of `overwriteRegion` really need to call checkers?
It is possible that my understanding of the engine is not enough, I think we 
need to call `processRangeChange()` for memory change. `memset()` will 
overwrite the memory region, so I think there is a need to call this API.

What do you think?



Comment at: lib/StaticAnalyzer/Core/RegionStore.cpp:1718
 
-llvm_unreachable("Unknown default value");
+return val;
   }

a.sidorin wrote:
> Could you please explain what is the reason of this change? Do we get new 
> value kinds?
`memset()` can bind anything to the region with `default binding`, if there is 
no such change, it will trigger `llvm_unreachable()`. 

I don't think much about it, just put `return val;` here. Any suggestions?



Comment at: lib/StaticAnalyzer/Core/Store.cpp:72
 
+StoreRef StoreManager::OverwriteRegion(Store store, const MemRegion *R, SVal 
V) {
+  return StoreRef(store, *this); 

a.sidorin wrote:
> Could we make this method pure virtual?
To be honest, I implemented this function according to `BindDefault()`. And I 
also think pure virtual is better, thanks.


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-03 Thread Henry Wong via Phabricator via cfe-commits
MTC updated this revision to Diff 145019.
MTC added a comment.

- fix typos
- code refactoring, add auxiliary method `memsetAux()`
- according to a.sidorin's suggestions, remove the useless state splitting.
- make `StoreManager::overwriteRegion()` pure virtual


Repository:
  rC Clang

https://reviews.llvm.org/D44934

Files:
  include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
  include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
  lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  lib/StaticAnalyzer/Core/ProgramState.cpp
  lib/StaticAnalyzer/Core/RegionStore.cpp
  test/Analysis/bstring.cpp
  test/Analysis/null-deref-ps-region.c
  test/Analysis/string.c

Index: test/Analysis/string.c
===
--- test/Analysis/string.c
+++ test/Analysis/string.c
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
 
 //===--===
 // Declarations
@@ -1159,6 +1160,248 @@
   clang_analyzer_eval(str[1] == 'b'); // expected-warning{{UNKNOWN}}
 }
 
+//===--===
+// memset()
+//===--===
+
+void *memset( void *dest, int ch, size_t count );
+
+void* malloc(size_t size);
+void free(void*);
+
+void memset1_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', 2);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+}
+
+void memset2_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', strlen(str) + 1);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(str[2] == 0);  // expected-warning{{TRUE}}
+}
+
+void memset3_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str + 1, '\0', 8);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+
+void memset4_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  //void *str = malloc(10 * sizeof(char));
+  memset(str, '\0', 10);
+  clang_analyzer_eval(str[1] == 0);  // expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  free(str);
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset5_char_malloc_overflow_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str, '\0', 12);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+#endif
+
+void memset6_char_array_nonnull() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 2);
+  clang_analyzer_eval(str[0] == 'a');// expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{UNKNOWN}}
+}
+
+void memset7_char_array_nonnull() {
+  char str[5] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 5);
+  clang_analyzer_eval(str[0] == '0');// expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) >= 5); // expected-warning{{TRUE}}
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset8_char_array_

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-04 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Hmm, ok, it seems that i've just changed the API in 
https://reviews.llvm.org/D46368, and i should have thought about this use case. 
Well, at least i have some background understanding of these problems now. 
Sorry for not keeping eye on this problem.

In https://reviews.llvm.org/D44934#1051002, @NoQ wrote:

> Why do you need separate code for null and non-null character? The function's 
> semantics doesn't seem to care.


I guess i can answer myself here:

  int32_t x;
  memset(&x, 1, sizeof(int32_t));
  clang_analyzer_eval(x == 0x1010101); // should be TRUE

I really doubt that we support this case.

So, yeah, zero character is indeed special.

And since zero character is special, i guess we could use the new 
`bindDefaultZero()` API, and perform a simple invalidation in the non-zero 
character case. @MTC, would this be enough for your use case? Or is it really 
important for you to support the non-zero character case? Cause my example is 
fairly artificial, and probably i'm worrying too much about it. If nobody 
really codes that way, i'm fine with supporting the non-zero character case via 
a simple default binding. In this case we should merge `bindDefaultZero()` with 
your `overwriteRegion()` (i'd prefer your function name, but please keep the 
empty class check).




Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:148
+  ProgramStateRef NewState = makeWithStore(NewStore);
+  return Mgr.getOwningEngine()
+ ? Mgr.getOwningEngine()->processRegionChange(NewState, R, LCtx)

MTC wrote:
> a.sidorin wrote:
> > Do clients of `overwriteRegion` really need to call checkers?
> It is possible that my understanding of the engine is not enough, I think we 
> need to call `processRangeChange()` for memory change. `memset()` will 
> overwrite the memory region, so I think there is a need to call this API.
> 
> What do you think?
Well, we need to make sure we don't notify checkers recursively. In `bindLoc()` 
we have the `notifyChanges` parameter, i guess we need to have something 
similar here as well, if we really need to notify checkers.

Technically, yeah, we need to notify checkers. Like, if you memset() a string 
to `'\0'`, string checker needs to know that its length has also become 0. 
Wait, we already are in the string checker, and we're already doing that. I 
guess it's not that important at the moment, so i'd rather delay it until we 
need it, and see if we have shared checker states available earlier and if they 
help.

Also `getOwningEngine()` is never null.



Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:118
   const LocationContext *LCtx,
   bool notifyChanges) const {
   ProgramStateManager &Mgr = getStateManager();

Note: `notifyChanges` declared here :)


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-04 Thread Henry Wong via Phabricator via cfe-commits
MTC added a comment.

In https://reviews.llvm.org/D44934#1088771, @NoQ wrote:

> Hmm, ok, it seems that i've just changed the API in 
> https://reviews.llvm.org/D46368, and i should have thought about this use 
> case. Well, at least i have some background understanding of these problems 
> now. Sorry for not keeping eye on this problem.
>
> In https://reviews.llvm.org/D44934#1051002, @NoQ wrote:
>
> > Why do you need separate code for null and non-null character? The 
> > function's semantics doesn't seem to care.
>
>
> I guess i can answer myself here:
>
>   int32_t x;
>   memset(&x, 1, sizeof(int32_t));
>   clang_analyzer_eval(x == 0x1010101); // should be TRUE
>
>
> I really doubt that we support this case.
>
> So, yeah, zero character is indeed special.


Thank you, Artem! I did not consider this common situation.  This patch does 
not really support this situation, in this patch the value of `x` will be 1, 
it's not correct!

> And since zero character is special, i guess we could use the new 
> `bindDefaultZero()` API, and perform a simple invalidation in the non-zero 
> character case.

Agree with you. The core problem here is that there is no perfect way to `bind 
default` for non-zero value, not the string length stuff. So **invalidate the 
destination buffer** but still **set string length** for non-zero character is 
OK. Right?

> @MTC, would this be enough for your use case? Or is it really important for 
> you to support the non-zero character case? Cause my example is fairly 
> artificial, and probably i'm worrying too much about it. If nobody really 
> codes that way, i'm fine with supporting the non-zero character case via a 
> simple default binding. In this case we should merge `bindDefaultZero()` with 
> your `overwriteRegion()` (i'd prefer your function name, but please keep the 
> empty class check).

Based on my limited programming experience, I think `memset` with non-zero 
character is common. However given that we can't handle non-zero character 
binding problems very well, we should now support only zero character. After 
all, IMHO,  correctness of semantic is also important  for the analyzer.

I will update this patch to only handle zero character case. In the future, as 
I am more familiar with the `RegionStore`, I will solve the problem of non-zero 
character binding.


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-05 Thread Henry Wong via Phabricator via cfe-commits
MTC updated this revision to Diff 145361.
MTC added a comment.

- Since there is no perfect way to handle the default binding of non-zero 
character, remove the default binding of non-zero character. Use 
`bindDefaulrZero()` instead of `overwriteRegion()` to bind the zero character.
- Reuse `assume()` instead of `isZeroConstant()` to determine whether it is 
zero character. The purpose of this is to be able to set the string length 
**when dealing with non-zero symbol character**.


Repository:
  rC Clang

https://reviews.llvm.org/D44934

Files:
  lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  test/Analysis/bstring.cpp
  test/Analysis/null-deref-ps-region.c
  test/Analysis/string.c

Index: test/Analysis/string.c
===
--- test/Analysis/string.c
+++ test/Analysis/string.c
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
 
 //===--===
 // Declarations
@@ -1159,6 +1160,186 @@
   clang_analyzer_eval(str[1] == 'b'); // expected-warning{{UNKNOWN}}
 }
 
+//===--===
+// memset()
+//===--===
+
+void *memset(void *dest, int ch, size_t count);
+
+void *malloc(size_t size);
+void free(void *);
+
+void memset1_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', 2);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+}
+
+void memset2_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', strlen(str) + 1);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(str[2] == 0);  // expected-warning{{TRUE}}
+}
+
+void memset3_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str + 1, '\0', 8);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+
+void memset4_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  //void *str = malloc(10 * sizeof(char));
+  memset(str, '\0', 10);
+  clang_analyzer_eval(str[1] == 0);  // expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  free(str);
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset5_char_malloc_overflow_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str, '\0', 12);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+#endif
+
+void memset6_char_array_nonnull() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 2);
+  clang_analyzer_eval(str[0] == 'a');// expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{UNKNOWN}}
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset8_char_array_nonnull() {
+  char str[5] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 10);
+  clang_analyzer_eval(str[0] != '0'); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) >= 10); // expected-warning{{TRUE}}
+  clang

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-10 Thread Henry Wong via Phabricator via cfe-commits
MTC added a comment.

ping.


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-14 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Thank you! Looks good overall.




Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1013-1014
+
+  // Get the offset and the base region to figure out whether the offset of
+  // buffer is 0.
+  RegionOffset Offset = MR->getAsOffset();

Please say something here (or above) about why do we want our offset to be 0:
> We're about to model memset by producing a "default binding" in the Store. 
> Our current implementation - RegionStore - doesn't support default bindings 
> that don't cover the whole base region.



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1037
+
+if (StateWholeReg && !StateNotWholeReg && CharVal.isZeroConstant()) {
+  // If the 'memset()' acts on the whole region of destination buffer and

I think we should use `StateNonNullChar` (that's computed below) instead of 
`CharVal.isZeroConstant()` because the Environment doesn't provide a guarantee 
that all symbols within it are collapsed to constants where applicable.



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1055
+std::tie(StateNullChar, StateNonNullChar) =
+assumeZero(C, State, CharVal, Ctx.UnsignedCharTy);
+

I think this should use `IntTy` here. Because that's the type of the `memset`'s 
argument, and that's what `assumeZero()` expects.



Comment at: test/Analysis/string.c:1412
+  clang_analyzer_eval(strlen(str) >= 10); // expected-warning{{TRUE}}
+  // FIXME: This shoule be TRUE.
+  clang_analyzer_eval(str[1] == '0'); // expected-warning{{UNKNOWN}}

Typo: `should` :)


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-15 Thread Henry Wong via Phabricator via cfe-commits
MTC added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1037
+
+if (StateWholeReg && !StateNotWholeReg && CharVal.isZeroConstant()) {
+  // If the 'memset()' acts on the whole region of destination buffer and

NoQ wrote:
> I think we should use `StateNonNullChar` (that's computed below) instead of 
> `CharVal.isZeroConstant()` because the Environment doesn't provide a 
> guarantee that all symbols within it are collapsed to constants where 
> applicable.
Yea, thanks! 



Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1055
+std::tie(StateNullChar, StateNonNullChar) =
+assumeZero(C, State, CharVal, Ctx.UnsignedCharTy);
+

NoQ wrote:
> I think this should use `IntTy` here. Because that's the type of the 
> `memset`'s argument, and that's what `assumeZero()` expects.
I confirmed again that `memset()` will convert the value to `unsigned char` 
first, see http://en.cppreference.com/w/c/string/byte/memset. 

In the next update, I will `evalCast(value, UnsignedCharTy, IntTy)` first, 
therefore, I will continue to use `UnsignedCharTy`.



Comment at: test/Analysis/string.c:1412
+  clang_analyzer_eval(strlen(str) >= 10); // expected-warning{{TRUE}}
+  // FIXME: This shoule be TRUE.
+  clang_analyzer_eval(str[1] == '0'); // expected-warning{{UNKNOWN}}

NoQ wrote:
> Typo: `should` :)
thanks, will do!


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-15 Thread Henry Wong via Phabricator via cfe-commits
MTC updated this revision to Diff 146816.
MTC added a comment.

- According to NoQ's suggestion, use `assumeZero()` instead of 
`isZeroConstant()` to determine whether the value is 0.
- Add test `memset26_upper_UCHAR_MAX()` and `memset27_symbol()`
- Since `void *memset( void *dest, int ch, size_t count );` will converts the 
value `ch` to `unsigned char`, we call `evalCast()` accordingly.


Repository:
  rC Clang

https://reviews.llvm.org/D44934

Files:
  lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  test/Analysis/bstring.cpp
  test/Analysis/null-deref-ps-region.c
  test/Analysis/string.c

Index: test/Analysis/string.c
===
--- test/Analysis/string.c
+++ test/Analysis/string.c
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
 
 //===--===
 // Declarations
@@ -1159,6 +1160,206 @@
   clang_analyzer_eval(str[1] == 'b'); // expected-warning{{UNKNOWN}}
 }
 
+//===--===
+// memset()
+//===--===
+
+void *memset(void *dest, int ch, size_t count);
+
+void *malloc(size_t size);
+void free(void *);
+
+void memset1_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', 2);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+}
+
+void memset2_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', strlen(str) + 1);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(str[2] == 0);  // expected-warning{{TRUE}}
+}
+
+void memset3_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str + 1, '\0', 8);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+
+void memset4_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  //void *str = malloc(10 * sizeof(char));
+  memset(str, '\0', 10);
+  clang_analyzer_eval(str[1] == 0);  // expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  free(str);
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset5_char_malloc_overflow_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str, '\0', 12);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+#endif
+
+void memset6_char_array_nonnull() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 2);
+  clang_analyzer_eval(str[0] == 'a');// expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{UNKNOWN}}
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset8_char_array_nonnull() {
+  char str[5] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 10);
+  clang_analyzer_eval(str[0] != '0'); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) >= 10); // expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) < 10);  // expected-warning{{FALSE}}
+}
+#endif
+
+struct POD_memse

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-15 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ accepted this revision.
NoQ added a comment.
This revision is now accepted and ready to land.

Thanks! This looks great now.




Comment at: lib/StaticAnalyzer/Checkers/CStringChecker.cpp:1055
+std::tie(StateNullChar, StateNonNullChar) =
+assumeZero(C, State, CharVal, Ctx.UnsignedCharTy);
+

MTC wrote:
> NoQ wrote:
> > I think this should use `IntTy` here. Because that's the type of the 
> > `memset`'s argument, and that's what `assumeZero()` expects.
> I confirmed again that `memset()` will convert the value to `unsigned char` 
> first, see http://en.cppreference.com/w/c/string/byte/memset. 
> 
> In the next update, I will `evalCast(value, UnsignedCharTy, IntTy)` first, 
> therefore, I will continue to use `UnsignedCharTy`.
Aha, yup, it does convert to `unsigned char`, but `assumeZero()` doesn't. The 
new code looks correct.


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-05-16 Thread Phabricator via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL332463: [analyzer] Improve the modeling of memset(). 
(authored by henrywong, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D44934?vs=146816&id=147066#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D44934

Files:
  cfe/trunk/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  cfe/trunk/test/Analysis/bstring.cpp
  cfe/trunk/test/Analysis/null-deref-ps-region.c
  cfe/trunk/test/Analysis/string.c

Index: cfe/trunk/test/Analysis/bstring.cpp
===
--- cfe/trunk/test/Analysis/bstring.cpp
+++ cfe/trunk/test/Analysis/bstring.cpp
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,unix.Malloc,debug.ExprInspection -analyzer-store=region -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,unix.Malloc,debug.ExprInspection -analyzer-store=region -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -verify %s
 
 #include "Inputs/system-header-simulator-cxx.h"
 #include "Inputs/system-header-simulator-for-malloc.h"
@@ -77,3 +78,118 @@
   unsigned *f;
 };
 }
+
+void *memset(void *dest, int ch, std::size_t count);
+namespace memset_non_pod {
+class Base {
+public:
+  int b_mem;
+  Base() : b_mem(1) {}
+};
+
+class Derived : public Base {
+public:
+  int d_mem;
+  Derived() : d_mem(2) {}
+};
+
+void memset1_inheritance() {
+  Derived d;
+  memset(&d, 0, sizeof(Derived));
+  clang_analyzer_eval(d.b_mem == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(d.d_mem == 0); // expected-warning{{TRUE}}
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset2_inheritance_field() {
+  Derived d;
+  memset(&d.d_mem, 0, sizeof(Derived));
+  clang_analyzer_eval(d.b_mem == 0); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(d.d_mem == 0); // expected-warning{{UNKNOWN}}
+}
+
+void memset3_inheritance_field() {
+  Derived d;
+  memset(&d.b_mem, 0, sizeof(Derived));
+  clang_analyzer_eval(d.b_mem == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(d.d_mem == 0); // expected-warning{{TRUE}}
+}
+#endif
+
+void memset4_array_nonpod_object() {
+  Derived array[10];
+  clang_analyzer_eval(array[1].b_mem == 1); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(array[1].d_mem == 2); // expected-warning{{UNKNOWN}}
+  memset(&array[1], 0, sizeof(Derived));
+  clang_analyzer_eval(array[1].b_mem == 0); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(array[1].d_mem == 0); // expected-warning{{UNKNOWN}}
+}
+
+void memset5_array_nonpod_object() {
+  Derived array[10];
+  clang_analyzer_eval(array[1].b_mem == 1); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(array[1].d_mem == 2); // expected-warning{{UNKNOWN}}
+  memset(array, 0, sizeof(array));
+  clang_analyzer_eval(array[1].b_mem == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(array[1].d_mem == 0); // expected-warning{{TRUE}}
+}
+
+void memset6_new_array_nonpod_object() {
+  Derived *array = new Derived[10];
+  clang_analyzer_eval(array[2].b_mem == 1); // expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(array[2].d_mem == 2); // expected-warning{{UNKNOWN}}
+  memset(array, 0, 10 * sizeof(Derived));
+  clang_analyzer_eval(array[2].b_mem == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(array[2].d_mem == 0); // expected-warning{{TRUE}}
+  delete[] array;
+}
+
+void memset7_placement_new() {
+  Derived *d = new Derived();
+  clang_analyzer_eval(d->b_mem == 1); // expected-warning{{TRUE}}
+  clang_analyzer_eval(d->d_mem == 2); // expected-warning{{TRUE}}
+
+  memset(d, 0, sizeof(Derived));
+  clang_analyzer_eval(d->b_mem == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(d->d_mem == 0); // expected-warning{{TRUE}}
+
+  Derived *d1 = new (d) Deriv

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-03-27 Thread Henry Wong via Phabricator via cfe-commits
MTC created this revision.
MTC added reviewers: dcoughlin, NoQ, xazax.hun, a.sidorin.
Herald added subscribers: cfe-commits, rnkovacs, szepet.
Herald added a reviewer: george.karpenkov.

This patch originates from https://reviews.llvm.org/D31868. There are two key 
points in this 
patch:

- Add `OverwriteRegion()`, this method used to model `memset()` or something 
like that.
- Improve the modeling of `memset`.

For `OverwriteRegion()`, is basically invalidate region and bind default. But I 
think this 
method requires more in-depth thinking and more extensive testing.

For `evalMemset()`, this patch only considers the case where the buffer's 
offset is zero. And
if the whole region is `memset`ed, bind a default value. According to the value 
for 
overwriting, decide how to update the string length.

For `void *memset(void *dest, int ch, size_t count)`:

1). offset is 0, `ch` is `'\0'` and `count` < dest-buffer's length. 
Invalidate the buffer and set the string length to 0.

2). offset is 0, `ch` is `'\0'` and `count` == dest-buffer's length.
Bind `\0` to the buffer with default binding and set the string length to 0.

3). offset is 0, `ch` is not `'\0'` and `count` < dest-buffer's length.
Invalidate the buffer and set the string length >= `count`.

4). offset is 0, `ch` is not `'\0'` and `count` == dest-buffer's length.
Bind `ch` to the buffer and set the string length >= `count`.

I have tested this patch on `sqlite`, but there's no difference int the 
warnings.

Thanks in advance for the review!


Repository:
  rC Clang

https://reviews.llvm.org/D44934

Files:
  include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
  include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
  lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  lib/StaticAnalyzer/Core/ProgramState.cpp
  lib/StaticAnalyzer/Core/RegionStore.cpp
  lib/StaticAnalyzer/Core/Store.cpp
  test/Analysis/bstring.cpp
  test/Analysis/null-deref-ps-region.c
  test/Analysis/string.c

Index: test/Analysis/string.c
===
--- test/Analysis/string.c
+++ test/Analysis/string.c
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
 
 //===--===
 // Declarations
@@ -1159,6 +1160,248 @@
   clang_analyzer_eval(str[1] == 'b'); // expected-warning{{UNKNOWN}}
 }
 
+//===--===
+// memset()
+//===--===
+
+void *memset( void *dest, int ch, size_t count );
+
+void* malloc(size_t size);
+void free(void*);
+
+void memset1_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', 2);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+}
+
+void memset2_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', strlen(str) + 1);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(str[2] == 0);  // expected-warning{{TRUE}}
+}
+
+void memset3_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str + 1, '\0', 8);
+  clang

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-03-28 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Why do you need separate code for null and non-null character? The function's 
semantics doesn't seem to care.

I'd rather consider the case of non-concrete character separately. Because 
wiping a region with a symbol is not something we currently support; a symbolic 
default binding over a region means a different thing and it'd be equivalent to 
invalidation, so for non-concrete character we don't have a better choice than 
to invalidate. For concrete non-zero character, on the contrary, a default 
binding would work just fine.

Could you explain why didn't a straightforward `bindLoc` over a base region 
wasn't doing the thing you wanted? I.e., why is new Store API function 
necessary?


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-03-28 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

> In addition, `memset` can bind anything to the region, so 
> `getBindingForDerivedDefaultValue()`'s logic needs some adjustment. **The 
> solution in this patch is certainly not correct.**

Yeah, i guess you meant here the thing that i was saying about concrete 
bindings. If we only limit ourselves to concrete bindings, it should work well.


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-03-30 Thread Henry Wong via Phabricator via cfe-commits
MTC added a comment.

Thanks for your review, NoQ!

> Why do you need separate code for null and non-null character? The function's 
> semantics doesn't seem to care.

Thank you for your advice, I will remove the duplicate codeļ¼

> I'd rather consider the case of non-concrete character separately. Because 
> wiping a region with a symbol is not something we currently support; a 
> symbolic default binding over a region means a different thing and it'd be 
> equivalent to invalidation, so for non-concrete character we don't have a 
> better choice than to invalidate. For concrete non-zero character, on the 
> contrary, a default binding would work just fine.

Thank you for your reminding, I overlooked this point. However for non-concrete 
character, the symbol value, if we just invalidate the region, the constraint 
information of the non-concrete character will be lost. Do we need to consider 
this?

> Could you explain why didn't a straightforward `bindLoc` over a base region 
> wasn't doing the thing you wanted? I.e., why is new Store API function 
> necessary?

I am also very resistant to adding new APIs. One is that I am not very familiar 
with RegionStore. One is that adding a new API is always the last option.

- The semantics of `memset()` need default binding, and only `bindDefault()` 
can guarantee that. However `bindDefault()` is just used to initialize the 
region and can only be called once on the same region.
- Only when the region is `TypedValueRegion` and the value type is `ArrayType` 
or `StructType`, `bindLoc()` can add a default binding. So if we want to 
continue using `default binding`, `bindLoc()` is not the correct choice.

And if I use `bindLoc()` instead of `overwriteRegion()`, there will be some 
crashes occured because `bindLoc()` broke some assumptions, e.g. `bindArray()` 
believes that the value for binding should be `CompoundVal`, `LazyCompoundVal` 
or `UnknownVal`.


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-03-30 Thread Henry Wong via Phabricator via cfe-commits
MTC updated this revision to Diff 140405.
MTC added a comment.

According to @NoQ's suggestion, remove the duplicated code.


Repository:
  rC Clang

https://reviews.llvm.org/D44934

Files:
  include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
  include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
  lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  lib/StaticAnalyzer/Core/ProgramState.cpp
  lib/StaticAnalyzer/Core/RegionStore.cpp
  lib/StaticAnalyzer/Core/Store.cpp
  test/Analysis/bstring.cpp
  test/Analysis/null-deref-ps-region.c
  test/Analysis/string.c

Index: test/Analysis/string.c
===
--- test/Analysis/string.c
+++ test/Analysis/string.c
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
 
 //===--===
 // Declarations
@@ -1159,6 +1160,248 @@
   clang_analyzer_eval(str[1] == 'b'); // expected-warning{{UNKNOWN}}
 }
 
+//===--===
+// memset()
+//===--===
+
+void *memset( void *dest, int ch, size_t count );
+
+void* malloc(size_t size);
+void free(void*);
+
+void memset1_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', 2);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+}
+
+void memset2_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', strlen(str) + 1);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(str[2] == 0);  // expected-warning{{TRUE}}
+}
+
+void memset3_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str + 1, '\0', 8);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+
+void memset4_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  //void *str = malloc(10 * sizeof(char));
+  memset(str, '\0', 10);
+  clang_analyzer_eval(str[1] == 0);  // expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  free(str);
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset5_char_malloc_overflow_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str, '\0', 12);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+#endif
+
+void memset6_char_array_nonnull() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 2);
+  clang_analyzer_eval(str[0] == 'a');// expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{UNKNOWN}}
+}
+
+void memset7_char_array_nonnull() {
+  char str[5] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 5);
+  clang_analyzer_eval(str[0] == '0');// expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) >= 5); // expected-warning{{TRUE}}
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset8_char_array_nonnull() {
+	char str[5] = "abcd";
+	clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRU

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-04-02 Thread Henry Wong via Phabricator via cfe-commits
MTC updated this revision to Diff 140629.
MTC added a comment.

> Thank you for your reminding, I overlooked this point. However for 
> non-concrete character, the symbol value, if we just invalidate the region, 
> the constraint information of the non-concrete character will be lost. Do we 
> need to consider this?

Sorry for my hasty question, analyzer does not support to bind a symbol with a 
default binding.

The update of this diff is as follows.

- If the char value is not concrete, just invalidate the region.
- Add a test about symbolic char value.
- A little code refactoring.


Repository:
  rC Clang

https://reviews.llvm.org/D44934

Files:
  include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
  include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
  lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  lib/StaticAnalyzer/Core/ProgramState.cpp
  lib/StaticAnalyzer/Core/RegionStore.cpp
  lib/StaticAnalyzer/Core/Store.cpp
  test/Analysis/bstring.cpp
  test/Analysis/null-deref-ps-region.c
  test/Analysis/string.c

Index: test/Analysis/string.c
===
--- test/Analysis/string.c
+++ test/Analysis/string.c
@@ -1,7 +1,8 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=alpha.security.taint,core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
+// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -analyzer-store=region -Wno-null-dereference -verify %s
 
 //===--===
 // Declarations
@@ -1159,6 +1160,248 @@
   clang_analyzer_eval(str[1] == 'b'); // expected-warning{{UNKNOWN}}
 }
 
+//===--===
+// memset()
+//===--===
+
+void *memset( void *dest, int ch, size_t count );
+
+void* malloc(size_t size);
+void free(void*);
+
+void memset1_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', 2);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+}
+
+void memset2_char_array_null() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '\0', strlen(str) + 1);
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(str[2] == 0);  // expected-warning{{TRUE}}
+}
+
+void memset3_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str + 1, '\0', 8);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+
+void memset4_char_malloc_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  //void *str = malloc(10 * sizeof(char));
+  memset(str, '\0', 10);
+  clang_analyzer_eval(str[1] == 0);  // expected-warning{{TRUE}}
+  clang_analyzer_eval(strlen(str) == 0); // expected-warning{{TRUE}}
+  free(str);
+}
+
+#ifdef SUPPRESS_OUT_OF_BOUND
+void memset5_char_malloc_overflow_null() {
+  char *str = (char *)malloc(10 * sizeof(char));
+  memset(str, '\0', 12);
+  clang_analyzer_eval(str[1] == 0); // expected-warning{{UNKNOWN}}
+  free(str);
+}
+#endif
+
+void memset6_char_array_nonnull() {
+  char str[] = "abcd";
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{TRUE}}
+  memset(str, '0', 2);
+  clang_analyzer_eval(str[0] == 'a');// expected-warning{{UNKNOWN}}
+  clang_analyzer_eval(strlen(str) == 4); // expected-warning{{UNKNO

[PATCH] D44934: [analyzer] Improve the modeling of `memset()`.

2018-04-02 Thread Henry Wong via Phabricator via cfe-commits
MTC added a comment.

Kindly ping!


Repository:
  rC Clang

https://reviews.llvm.org/D44934



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits