[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-13 Thread Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1d7e58cfade1: [analyzer] Fix use of length in CStringChecker 
(authored by einvbri vince.a.bridg...@ericsson.com).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

Files:
  clang/docs/analyzer/checkers.rst
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1652,3 +1652,18 @@
   __builtin___memset_chk(, 0, sizeof(x), __builtin_object_size(, 0));
   clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
 }
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  // FIXME: string literal modeling does not account for embedded NULLs.
+  //This case should not elicit a warning, but does.
+  //See discussion at https://reviews.llvm.org/D129269
+  strcpy(x, "12\0"); // expected-warning{{String copy function overflows the 
destination buffer}}
+}
+#else
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  strcpy(x, "12\0");
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,7 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+return svalBuilder.makeIntVal(strLit->getLength(), sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:
Index: clang/docs/analyzer/checkers.rst
===
--- clang/docs/analyzer/checkers.rst
+++ clang/docs/analyzer/checkers.rst
@@ -2726,6 +2726,9 @@
 ""
 Check for out-of-bounds access in string functions; applies to:`` strncopy, 
strncat``.
 
+This check also applies to string literals, except there is a known bug in that
+the analyzer cannot detect embedded NULL characters.
+
 .. code-block:: c
 
  void test() {


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1652,3 +1652,18 @@
   __builtin___memset_chk(, 0, sizeof(x), __builtin_object_size(, 0));
   clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
 }
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  // FIXME: string literal modeling does not account for embedded NULLs.
+  //This case should not elicit a warning, but does.
+  //See discussion at https://reviews.llvm.org/D129269
+  strcpy(x, "12\0"); // expected-warning{{String copy function overflows the destination buffer}}
+}
+#else
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  strcpy(x, "12\0");
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,7 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+return svalBuilder.makeIntVal(strLit->getLength(), sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:
Index: clang/docs/analyzer/checkers.rst
===
--- clang/docs/analyzer/checkers.rst
+++ clang/docs/analyzer/checkers.rst
@@ -2726,6 +2726,9 @@
 ""
 Check for out-of-bounds access in string functions; applies to:`` strncopy, strncat``.
 
+This check also applies to string literals, except there is a known bug in that
+the analyzer cannot detect embedded NULL characters.
+
 .. code-block:: c
 
  void test() {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-13 Thread Gabor Marton via Phabricator via cfe-commits
martong accepted this revision.
martong added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-11 Thread Vince Bridgers via Phabricator via cfe-commits
vabridgers updated this revision to Diff 443658.
vabridgers edited the summary of this revision.
vabridgers added a comment.

update per comments from @martong


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

Files:
  clang/docs/analyzer/checkers.rst
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1652,3 +1652,18 @@
   __builtin___memset_chk(, 0, sizeof(x), __builtin_object_size(, 0));
   clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
 }
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  // FIXME: string literal modeling does not account for embedded NULLs.
+  //This case should not elicit a warning, but does.
+  //See discussion at https://reviews.llvm.org/D129269
+  strcpy(x, "12\0"); // expected-warning{{String copy function overflows the 
destination buffer}}
+}
+#else
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  strcpy(x, "12\0");
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,7 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+return svalBuilder.makeIntVal(strLit->getLength(), sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:
Index: clang/docs/analyzer/checkers.rst
===
--- clang/docs/analyzer/checkers.rst
+++ clang/docs/analyzer/checkers.rst
@@ -2726,6 +2726,9 @@
 ""
 Check for out-of-bounds access in string functions; applies to:`` strncopy, 
strncat``.
 
+This check also applies to string literals, except there is a known bug in that
+the analyzer cannot detect embedded NULL characters.
+
 .. code-block:: c
 
  void test() {


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1652,3 +1652,18 @@
   __builtin___memset_chk(, 0, sizeof(x), __builtin_object_size(, 0));
   clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
 }
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  // FIXME: string literal modeling does not account for embedded NULLs.
+  //This case should not elicit a warning, but does.
+  //See discussion at https://reviews.llvm.org/D129269
+  strcpy(x, "12\0"); // expected-warning{{String copy function overflows the destination buffer}}
+}
+#else
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  strcpy(x, "12\0");
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,7 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+return svalBuilder.makeIntVal(strLit->getLength(), sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:
Index: clang/docs/analyzer/checkers.rst
===
--- clang/docs/analyzer/checkers.rst
+++ clang/docs/analyzer/checkers.rst
@@ -2726,6 +2726,9 @@
 ""
 Check for out-of-bounds access in string functions; applies to:`` strncopy, strncat``.
 
+This check also applies to string literals, except there is a known bug in that
+the analyzer cannot detect embedded NULL characters.
+
 .. code-block:: c
 
  void test() {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-11 Thread Vince Bridgers via Phabricator via cfe-commits
vabridgers added a comment.

Thanks @martong , I understand. I'll make the changes and resubmit. Best!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-11 Thread Gabor Marton via Phabricator via cfe-commits
martong added a comment.

Thanks Vince for the patch! I think the implementation is correct. But I am 
worried that it might do sub-optimal computations on the string literals. I 
mean, getting the length would require O(N) steps each time, and that could be 
problematic with long literals, especially in projects that have many literals.

So, I'd rather keep your first implementation with `getLength` with the new 
test case which fails (and the failure is documented). I think, this is what 
@steakhal was also suggesting.

Then, if your time allows, in a second patch, we could have the loop based 
implementation for catching embedded null terminators. However, that change 
should have it's own measurement and maybe even it's own command line option to 
enable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-07 Thread Vince Bridgers via Phabricator via cfe-commits
vabridgers updated this revision to Diff 443101.
vabridgers added a comment.

a proposal to handle embedded null case caught by @steakhal


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

Files:
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1652,3 +1652,8 @@
   __builtin___memset_chk(, 0, sizeof(x), __builtin_object_size(, 0));
   clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
 }
+
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  strcpy(x, "12\0");
+}
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,15 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+unsigned countx = 0;
+// get the number of string literal characters by the target's "code unit"
+// size, checking for an embedded literal of 0 up to the string literal's
+// length.
+for (countx = 0;
+ countx < strLit->getLength() && (strLit->getCodeUnit(countx) != 0);
+ countx++)
+  ;
+return svalBuilder.makeIntVal(countx, sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1652,3 +1652,8 @@
   __builtin___memset_chk(, 0, sizeof(x), __builtin_object_size(, 0));
   clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
 }
+
+void strcpy_no_overflow_2(char *y) {
+  char x[3];
+  strcpy(x, "12\0");
+}
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,15 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+unsigned countx = 0;
+// get the number of string literal characters by the target's "code unit"
+// size, checking for an embedded literal of 0 up to the string literal's
+// length.
+for (countx = 0;
+ countx < strLit->getLength() && (strLit->getCodeUnit(countx) != 0);
+ countx++)
+  ;
+return svalBuilder.makeIntVal(countx, sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-07 Thread Vince Bridgers via Phabricator via cfe-commits
vabridgers added a comment.

Thanks Balazs, you mean something like this correct?

void strcpy_no_overflow_2(char *y) {

  char x[3];
  strcpy(x, "12\0"); // this produces a warning, but should not. 

}


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-07 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Actually, there is one more bug there. If the string has an embedded 
null-terminator; a wrong length will be returned.
Please add tests to demonstrate the wrong behavior. Try different target 
triples for having irregular char sizes and pin that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129269/new/

https://reviews.llvm.org/D129269

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129269: [analyzer] Fix use of length in CStringChecker

2022-07-07 Thread Vince Bridgers via Phabricator via cfe-commits
vabridgers created this revision.
vabridgers added reviewers: martong, steakhal.
Herald added subscribers: manas, ASDenysPetrov, dkrupp, donat.nagy, Szelethus, 
mikhail.ramalho, a.sidorin, rnkovacs, szepet, baloghadamsoftware, xazax.hun.
Herald added a reviewer: NoQ.
Herald added a project: All.
vabridgers requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

CStringChecker is using getByteLength to get the length of a string
literal. For targets where a "char" is 8-bits, getByteLength() and
getLength() will be equal for a C string, but for targets where a "char"
is 16-bits getByteLength() returns the size in octets.

This is verified in our downstream target, but we have no way to add a
test case for this case since there is no target supporting 16-bit
"char" upstream. Since this cannot have a test case, I'm asserted this
change is "correct by construction", and visually inspected to be
correct by way of the following example where this was found.

The case that shows this fails using a target with 16-bit chars is here.
getByteLength() for the string literal returns 4, which fails when
checked against "char x[4]". With the change, the string literal is
evaluated to a size of 2 which is a correct number of "char"'s for a
16-bit target.

void strcpy_no_overflow_2(char *y) {

  char x[4];
  strcpy(x, "12"); // with getByteLength(), returns 4 using 16-bit chars

}


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D129269

Files:
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp


Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,7 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+return svalBuilder.makeIntVal(strLit->getLength(), sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:


Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -848,7 +848,7 @@
 SValBuilder  = C.getSValBuilder();
 QualType sizeTy = svalBuilder.getContext().getSizeType();
 const StringLiteral *strLit = cast(MR)->getStringLiteral();
-return svalBuilder.makeIntVal(strLit->getByteLength(), sizeTy);
+return svalBuilder.makeIntVal(strLit->getLength(), sizeTy);
   }
   case MemRegion::SymbolicRegionKind:
   case MemRegion::AllocaRegionKind:
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits