whisperity requested changes to this revision.
whisperity added a comment.

Earlier (IIRC in March) we did an internal test of the check and the following 
results were obtained. The results are kinda //weird//, to say the least. 
Numbers are //after// CodeChecker has done a serverside uniqueing (most likely 
things such as //"if **`X`** in `H.h` is reported in the compilation/analysis 
of `A.cpp` and `B.cpp` then store **`X`** only once"//) of the bugs.

- memcache-d: 43
- tmux: 4
- curl: 283
- twin: 161
- vim: 2 206
- OpenSSL: **23 724**
- sqlite: 752
- ffmpeg: **87 121**
- postgres: **116 635**
- tinyxml2: 6
- libwebm: 34
- xerces: **48 732**
- bitcoin: 4 429
- protobuf: 1 405
- codechecker-ldlogger: 0
- ACID: 4 831
- Qt: **34 973**
- Contour v2: **14 246**
- LLVM: 6 018

No crashes (of the checker) were observed. Most of the large outlier numbers 
are attributable to heavy mathematics (ffmpeg, Qt), crypto (OpenSSL, 
PostgreSQL), or character encoding (Xerces, Postgres, Contour) operations in 
the project.

Unfortunately, this is a very significant amount of false positives to the 
point where the consumption of results is slowed down not on the mental load 
side, but also the technical side (such as triggering server hangs in 
CodeChecker and whatnot...). However, after putting some thoughts into it and 
trying to hand-clusterise some of the reports I've found a few significant 
opportunities where this check should be improved.

**Note:** These improvements might be worthwhile to implement as a //follow-up 
patch// which is merged together with this current review. **As it stands right 
now, this check is way too noisy (on maths-based projects) to be merged!**

1. Enums
--------

One is when the literal is used to initialise an enum. So far I don't know 
whether simply just ignoring enums is good, or would it be worth to somehow 
emit a "unified" warning for the entire enum that follows the pattern as shown 
below.
F28149214: image.png <https://reviews.llvm.org/F28149214>

2. `(u)?int([\d]+)_t`
---------------------

The other are cases when the user explicitly initialises a **fixed-width** (or 
*constrained-width*) variable, such as `int64_t` or `uint16_t` with a literal. 
Right now, the implementation hard matches only the `IntegerLiteral` instances, 
i.e., lines such as `static constexpr std::uint64_t X = 0x07F000000` where the 
width and characteristics of the literal might just as well *exactly* and 
*properly* match the variable it is assigned to (granted, we only match and 
silence `Decl`s for this, and lose expressivity on expressions), and do not 
report then.

3. Large arrays of magic literals
---------------------------------

While this is not the silencing of a "valid" use case where the check, as of 
the current implementation, it is too verbose for, but I believe we could 
reasonably cull the noisiness of the check by simply saying that if there are 
array initialisers or //brace-init-lists// or whatever that contain multiple 
non-portable literals, then **either** we simply suck it up and do not warn for 
**any** of the elements... or for the sake of collegiality give **one** warning 
for the user. The former option would be the most silencing of all, while the 
latter would allow the user to do a single `// NOLINT` for the array instead of 
having to deal with `/* NOLINTBEGIN */ ... /* NOLINTEND */`.

Several chunks of noisy warnings come out of places like SHA-256, FFT, and 
various other "very mathematical" contexts where a large constant array is 
created with very specific proven values, all of which expressed and reported 
as "non-portable literals". It is no use to do 500000 warnings for a single 
array. (This is perhaps after all the same(-ish) family as the "enum" problem.)



================
Comment at: 
clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp:87
+  } else {
+    return { StrippedLiteral, 0, 0, 10 };
+  }
----------------
(Unrealistic nit: This will consume hypothetical additional literals such as 
`0!ZZZbb=` as if they were decimal literals.)


================
Comment at: 
clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp:153-154
+         "non-portable integer literal: integer literal with leading zeroes");
+  // Matches only the most significant bit,
+  // eg. unsigned value 0x80000000.
+  } else if (IsMSBBitUsed) {
----------------
(Style nit: These lines could be joined and still fit 80-col, I think.)


================
Comment at: 
clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.h:16
+
+/// Finds integer literals that are being used in a non-portable manner.
+///
----------------
I am still not entirely convinced whether this one-line summary makes too much 
sense to a user who does not know the Standard out of heart... As we hunt for 
bit widths more or less perhaps we could weave this notion into the single 
sentence summary saying that integers with "non-portable representations 
because of bit width"... Will need to give this some thought.


================
Comment at: 
clang-tools-extra/test/clang-tidy/checkers/portability/non-portable-integer-constant.cpp:1
+// RUN: %check_clang_tidy %s -std=c++17-or-later 
portability-non-portable-integer-constant %t
+
----------------
PiotrZSL wrote:
> no need to limit this check to c++17 or later.
> when it's targeting all versions of C and C++
Binary literals require C++17 and some version of newer C standard (think 
C11?). However, technically it could work to use `c++11-or-later` and then use 
the `__cplusplus` macro to only compile the binary literal test when 
appropriately C++17.


================
Comment at: 
clang-tools-extra/test/clang-tidy/checkers/portability/non-portable-integer-constant.cpp:72-74
+
+
+
----------------
(Style nit: Did we lose a test here, or is this just an accidental contiguous 
area of newlines?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146712/new/

https://reviews.llvm.org/D146712

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to