[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-04 Thread Balazs Benics via cfe-commits

https://github.com/steakhal closed 
https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-04 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/71039

>From 8f16d3000a91df33d416dd09381175ddeb7e5ed3 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Sat, 4 Nov 2023 15:25:42 +0100
Subject: [PATCH] [analyzer][NFC] Rework SVal kind representation

The goal of this patch is to refine how the `SVal` base and sub-kinds
are represented by forming one unified enum describing the possible SVals.
This means that the `unsigned SVal::Kind` and the attached bit-packing
semantics would be replaced by a single unified enum.
This is more conventional and leads to a better debugging experience by default.
This eases the need of using debug pretty-printers, or the use of
runtime functions doing the printing for us like we do today by calling
`Val.dump()` whenever we inspect the values.

Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated
the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`.
The rest of the upper bits represented the sub-kind, where the value
represented the index among only the `Loc`s or `NonLoc`s, effectively
attaching 2 meanings of the upper bits depending on the base-kind.
We don't need to pack these bits, as we have plenty even if we would use
just a plan-old `unsigned char`.

Consequently, in this patch, I propose to lay out all the (non-abstract)
`SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`,
`END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar
how we do with the `MemRegions`.

Note that in the unified `SVal::Kind` enum, to differentiate
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with
`Loc` and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the
`nonloc::Kind` enum items with `inline constexpr` global constants to
mimic the original behavior - and offer nicer spelling to these enum
values.

Some `SVal` constructors were not marked explicit, which I now mark as
such to follow best practices, and marked others as `/*implicit*/` to
clarify the intent.
During refactoring, I also found at least one function not marked
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.

The `TypeRetrievingVisitor` visitor had some accidental dead code,
namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`.

Previously, the `SValVisitor` expected visit handlers of
`VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where
I felt that envoding `NonLoc` and `Loc` in the name is not necessary as
the type of the parameter would select the right overload anyways, so I
simplified the naming of those visit functions.

The rest of the diff is a lot of times just formatting, because
`getKind()` by nature, frequently appears in switches, which means that
the whole switch gets automatically reformatted. I could probably undo
the formatting, but I didn't want to deviate from the rule unless
explicitly requested.
---
 .../StaticAnalyzer/Checkers/SValExplainer.h   |  10 +-
 .../Core/PathSensitive/SValVisitor.h  |  55 ++---
 .../Core/PathSensitive/SVals.def  |  38 ++-
 .../StaticAnalyzer/Core/PathSensitive/SVals.h | 226 ++
 clang/lib/StaticAnalyzer/Core/SValBuilder.cpp |  28 +--
 clang/lib/StaticAnalyzer/Core/SVals.cpp   |  87 ---
 .../Core/SimpleConstraintManager.cpp  |   4 +-
 .../StaticAnalyzer/Core/SimpleSValBuilder.cpp |  73 +++---
 clang/lib/StaticAnalyzer/Core/Store.cpp   |   2 +-
 9 files changed, 219 insertions(+), 304 deletions(-)

diff --git a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h 
b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
index 3ae28c1dba3eb5a..43a70f596a4da28 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
+++ b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
@@ -65,7 +65,7 @@ class SValExplainer : public FullSValVisitor {
 return "undefined value";
   }
 
-  std::string VisitLocMemRegionVal(loc::MemRegionVal V) {
+  std::string VisitMemRegionVal(loc::MemRegionVal V) {
 const MemRegion *R = V.getRegion();
 // Avoid the weird "pointer to pointee of ...".
 if (auto SR = dyn_cast(R)) {
@@ -76,7 +76,7 @@ class SValExplainer : public FullSValVisitor {
 return "pointer to " + Visit(R);
   }
 
-  std::string VisitLocConcreteInt(loc::ConcreteInt V) {
+  std::string VisitConcreteInt(loc::ConcreteInt V) {
 const llvm::APSInt  = V.getValue();
 std::string Str;
 llvm::raw_string_ostream OS(Str);
@@ -84,11 +84,11 @@ class SValExplainer : public FullSValVisitor {
 return Str;
   }
 
-  std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) {
+  std::string VisitSymbolVal(nonloc::SymbolVal V) {
 return Visit(V.getSymbol());
   }
 
-  std::string VisitNonLocConcreteInt(nonloc::ConcreteInt V) {
+  std::string VisitConcreteInt(nonloc::ConcreteInt V) {
 const llvm::APSInt  = V.getValue();
 std::string Str;
 llvm::raw_string_ostream OS(Str);
@@ -97,7 +97,7 

[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-03 Thread Gábor Horváth via cfe-commits

https://github.com/Xazax-hun approved this pull request.


https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-03 Thread Vlad Serebrennikov via cfe-commits

Endilll wrote:

> What other ways are to make a const void * debugger-friendly - other than 
> llvm::PointerUnion?

I'm not aware of any that wouldn't require writing a custom formatter. 
Ultimately it boils to how can I extract type information. For 
`llvm::PointerUnion` I'm matching discriminator against template argument list 
after jumping through some hoops to get discriminator as a number. If this 
patch allows me to extract type information based on prefix or suffix of `Kind` 
enumerator, then it's probably as easy as you can get it without a radical 
change. But read on, there's more to this.

> I've considered using plain-old unions, but I found that approach pretty 
> ugly, and I'm not even sure if it would help debugging experience.

Given I extracted type information from `Kind`, now I need a handle to type to 
cast the pointer to.

In order to declare a union, you have to specify type of every member. It's 
possible to extract type from such declaration, saving debugger the time to run 
global search for a type by name, which in my experience is about two orders of 
magnitudes more expensive operation than anything else formatter usually does 
(anecdotal evidence of 100 ms per search). So in order to keep debugger 
responsive with formatters enabled, I'm avoiding global searches as much as I 
can.

For debuggers having union with all possible types listed is strictly an 
improvement over status quo, but I get why this is ugly.

https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-03 Thread Balazs Benics via cfe-commits

steakhal wrote:

> But I have to point out that this patch doesn't address the fact that `const 
> void* Data` is not friendly to debuggers, especially with type information 
> encoded in another member. So even with this patch applied, someone would 
> still have to write (and maintain) a custom formatter on debugger side to 
> display `Data` correctly.
> 
> If you refactor `const void* Data` to be a `llvm::PointerUnion`, then it will 
> be picked up automatically (`PointerUnion` is a popular type, so I've already 
> written a formatter for it.) Together with removing `BaseBits` from `Kind` 
> and making the latter non-packed, `SVal` will have a debugger-friendly layout.

What other ways are to make a `const void *` debugger-friendly - other than 
`llvm::PointerUnion`?
I've considered using plain-old unions, but I found that approach pretty ugly, 
and I'm not even sure if it would help debugging experience.

https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-03 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/71039

>From 3bc43ab005aa76a43644d4d93286215b490cc8fa Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Thu, 2 Nov 2023 10:21:03 +0100
Subject: [PATCH 1/2] [analyzer][NFC] Rework SVal kind representation

The goal of this patch is to refine how the `SVal` base and sub-kinds
are represented by forming one unified enum describing the possible SVals.
This means that the `unsigned SVal::Kind` and the attached bit-packing
semantics would be replaced by a single unified enum.
This is more conventional and leads to a better debugging experience by default.
This eases the need of using debug pretty-printers, or the use of
runtime functions doing the printing for us like we do today by calling
`Val.dump()` whenever we inspect the values.

Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated
the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`.
The rest of the upper bits represented the sub-kind, where the value
represented the index among only the `Loc`s or `NonLoc`s, effectively
attaching 2 meanings of the upper bits depending on the base-kind.
We don't need to pack these bits, as we have plenty even if we would use
just a plan-old `unsigned char`.

Consequently, in this patch, I propose to lay out all the (non-abstract)
`SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`,
`END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar
how we do with the `MemRegions`.

Note that in the unified `SVal::Kind` enum, to differentiate
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with
`Loc` and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the
`nonloc::Kind` enum items with `inline constexpr` global constants to
mimic the original behavior - and offer nicer spelling to these enum
values.

Some `SVal` constructors were not marked explicit, which I now mark as
such to follow best practices, and marked others as `/*implicit*/` to
clarify the intent.
During refactoring, I also found at least one function not marked
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.

The `TypeRetrievingVisitor` visitor had some accidental dead code,
namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`.

Previously, the `SValVisitor` expected visit handlers of
`VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where
I felt that envoding `NonLoc` and `Loc` in the name is not necessary as
the type of the parameter would select the right overload anyways, so I
simplified the naming of those visit functions.

The rest of the diff is a lot of times just formatting, because
`getKind()` by nature, frequently appears in switches, which means that
the whole switch gets automatically reformatted. I could probably undo
the formatting, but I didn't want to deviate from the rule unless
explicitly requested.
---
 .../StaticAnalyzer/Checkers/SValExplainer.h   |  10 +-
 .../Core/PathSensitive/SValVisitor.h  |  57 ++---
 .../Core/PathSensitive/SVals.def  |  38 ++-
 .../StaticAnalyzer/Core/PathSensitive/SVals.h | 225 ++
 clang/lib/StaticAnalyzer/Core/SValBuilder.cpp |  28 +--
 clang/lib/StaticAnalyzer/Core/SVals.cpp   |  87 ---
 .../Core/SimpleConstraintManager.cpp  |   4 +-
 .../StaticAnalyzer/Core/SimpleSValBuilder.cpp |  73 +++---
 clang/lib/StaticAnalyzer/Core/Store.cpp   |   2 +-
 9 files changed, 220 insertions(+), 304 deletions(-)

diff --git a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h 
b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
index 3ae28c1dba3eb5a..43a70f596a4da28 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
+++ b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
@@ -65,7 +65,7 @@ class SValExplainer : public FullSValVisitor {
 return "undefined value";
   }
 
-  std::string VisitLocMemRegionVal(loc::MemRegionVal V) {
+  std::string VisitMemRegionVal(loc::MemRegionVal V) {
 const MemRegion *R = V.getRegion();
 // Avoid the weird "pointer to pointee of ...".
 if (auto SR = dyn_cast(R)) {
@@ -76,7 +76,7 @@ class SValExplainer : public FullSValVisitor {
 return "pointer to " + Visit(R);
   }
 
-  std::string VisitLocConcreteInt(loc::ConcreteInt V) {
+  std::string VisitConcreteInt(loc::ConcreteInt V) {
 const llvm::APSInt  = V.getValue();
 std::string Str;
 llvm::raw_string_ostream OS(Str);
@@ -84,11 +84,11 @@ class SValExplainer : public FullSValVisitor {
 return Str;
   }
 
-  std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) {
+  std::string VisitSymbolVal(nonloc::SymbolVal V) {
 return Visit(V.getSymbol());
   }
 
-  std::string VisitNonLocConcreteInt(nonloc::ConcreteInt V) {
+  std::string VisitConcreteInt(nonloc::ConcreteInt V) {
 const llvm::APSInt  = V.getValue();
 std::string Str;
 llvm::raw_string_ostream OS(Str);
@@ -97,7 

[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Vlad Serebrennikov via cfe-commits

https://github.com/Endilll approved this pull request.


https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Vlad Serebrennikov via cfe-commits

Endilll wrote:

> I've considered this but I found the number of alternatives too large to make 
> it feasible. Consider that we have 11 possible SValKinds, which would require 
> 4 bits to encode. Requiring all Data pointers to be aligned as such seems 
> rough - although not impossible.

Sorry, I missed the part that you can have 11 different types there. 
`alignas(16)` might not be unreasonable, but it depends and requires 
consideration.

> I'd say it's one baby step in that direction, but not the end of the journey. 
> How about looking at this like that?

No objections to this point of view. I just wanted to make sure debugger 
struggles are brought up in the discussion and understood.

https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Balazs Benics via cfe-commits

steakhal wrote:

> But I have to point out that this patch doesn't address the fact that `const 
> void* Data` is not friendly to debuggers, especially with type information 
> encoded in another member. So even with this patch applied, someone would 
> still have to write (and maintain) a custom formatter on debugger side to 
> display `Data` correctly.

That's correct. I'll think about how to resolve that as well.

> If you refactor `const void* Data` to be a `llvm::PointerUnion`, then it will 
> be picked up automatically (`PointerUnion` is a popular type, so I've already 
> written a formatter for it.) Together with removing `BaseBits` from `Kind` 
> and making the latter non-packed, `SVal` will have a debugger-friendly layout.

I've considered this but I found the number of alternatives too large to make 
it feasible. Consider that we have 11 possible SValKinds, which would require 4 
bits to encode. Requiring all `Data` pointers to be aligned as such seems rough 
- although not impossible.

> That said, debugger-friendliness is probably not among top priorities, so I'm 
> not going to block this patch, if this is the direction you and reviewers 
> want to take.

I'd say it's one baby step in that direction, but not the end of the journey. 
How about looking at this like that?


https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Vlad Serebrennikov via cfe-commits


@@ -105,38 +86,25 @@ class SVal {
 return llvm::dyn_cast(*this);
   }
 
-  unsigned getRawKind() const { return Kind; }
-  BaseKind getBaseKind() const { return (BaseKind) (Kind & BaseMask); }
-  unsigned getSubKind() const { return Kind >> BaseBits; }
+  SValKind getKind() const { return Kind; }
 
   // This method is required for using SVal in a FoldingSetNode.  It
   // extracts a unique signature for this SVal object.
   void Profile(llvm::FoldingSetNodeID ) const {
-ID.AddInteger((unsigned) getRawKind());
+ID.AddInteger((unsigned)getKind());

Endilll wrote:

I think recently introduced `llvm::to_underlying()` is a better fit here.

https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Vlad Serebrennikov via cfe-commits

https://github.com/Endilll commented:

Converting packed `unsigned Kind` into non-packed `SValKind Kind` is definitely 
going to help debuggers to display the value correctly.

But I have to point out that this patch doesn't address the fact that `const 
void* Data` is not friendly to debuggers, especially with type information 
encoded in another member. So even with this patch applied, someone would still 
have to write (and maintain) a custom formatter on debugger side to display 
`Data` correctly.

If you refactor `const void* Data` to be a `llvm::PointerUnion`, then it will 
be picked up automatically (`PointerUnion` is a popular type, so I've already 
written a formatter for it.) Together with removing `BaseBits` from `Kind` and 
making the latter non-packed, `SVal` will have a debugger-friendly layout.

That said, debugger-friendliness is probably not among top priorities, so I'm 
not going to block this patch, if this is the direction you and reviewers want 
to take.

https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Vlad Serebrennikov via cfe-commits

https://github.com/Endilll edited 
https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Balazs Benics via cfe-commits

steakhal wrote:

This PR relates to #69835 
([comment](https://github.com/llvm/llvm-project/issues/69835#issuecomment-1775533393)).

https://github.com/llvm/llvm-project/pull/71039
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balazs Benics (steakhal)


Changes

The goal of this patch is to refine how the `SVal` base and sub-kinds are 
represented by forming one unified enum describing the possible SVals. This 
means that the `unsigned SVal::Kind` and the attached bit-packing semantics 
would be replaced by a single unified enum. This is more conventional and leads 
to a better debugging experience by default. This eases the need of using debug 
pretty-printers, or the use of runtime functions doing the printing for us like 
we do today by calling `Val.dump()` whenever we inspect the values.

Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the 
following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest 
of the upper bits represented the sub-kind, where the value represented the 
index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings of 
the upper bits depending on the base-kind. We don't need to pack these bits, as 
we have plenty even if we would use just a plan-old `unsigned char`.

Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` 
kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, 
`BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar how we do with 
the `MemRegions`.

Note that in the unified `SVal::Kind` enum, to differentiate 
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` 
and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the `nonloc::Kind` 
enum items with `inline constexpr` global constants to mimic the original 
behavior - and offer nicer spelling to these enum values.

Some `SVal` constructors were not marked explicit, which I now mark as such to 
follow best practices, and marked others as `/*implicit*/` to clarify the 
intent.
During refactoring, I also found at least one function not marked 
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.

The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: 
`VisitNonLocConcreteInt` and `VisitLocConcreteInt`.

Previously, the `SValVisitor` expected visit handlers of 
`VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where I felt 
that envoding `NonLoc` and `Loc` in the name is not necessary as the type of 
the parameter would select the right overload anyways, so I simplified the 
naming of those visit functions.

The rest of the diff is a lot of times just formatting, because `getKind()` by 
nature, frequently appears in switches, which means that the whole switch gets 
automatically reformatted. I could probably undo the formatting, but I didn't 
want to deviate from the rule unless explicitly requested.

---

Patch is 40.42 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/71039.diff


9 Files Affected:

- (modified) clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h (+5-5) 
- (modified) 
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValVisitor.h (+25-32) 
- (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.def 
(+18-20) 
- (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h 
(+77-148) 
- (modified) clang/lib/StaticAnalyzer/Core/SValBuilder.cpp (+14-14) 
- (modified) clang/lib/StaticAnalyzer/Core/SVals.cpp (+42-45) 
- (modified) clang/lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp (+2-2) 
- (modified) clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp (+36-37) 
- (modified) clang/lib/StaticAnalyzer/Core/Store.cpp (+1-1) 


``diff
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h 
b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
index 3ae28c1dba3eb5a..43a70f596a4da28 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
+++ b/clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
@@ -65,7 +65,7 @@ class SValExplainer : public FullSValVisitor {
 return "undefined value";
   }
 
-  std::string VisitLocMemRegionVal(loc::MemRegionVal V) {
+  std::string VisitMemRegionVal(loc::MemRegionVal V) {
 const MemRegion *R = V.getRegion();
 // Avoid the weird "pointer to pointee of ...".
 if (auto SR = dyn_cast(R)) {
@@ -76,7 +76,7 @@ class SValExplainer : public FullSValVisitor {
 return "pointer to " + Visit(R);
   }
 
-  std::string VisitLocConcreteInt(loc::ConcreteInt V) {
+  std::string VisitConcreteInt(loc::ConcreteInt V) {
 const llvm::APSInt  = V.getValue();
 std::string Str;
 llvm::raw_string_ostream OS(Str);
@@ -84,11 +84,11 @@ class SValExplainer : public FullSValVisitor {
 return Str;
   }
 
-  std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) {
+  std::string VisitSymbolVal(nonloc::SymbolVal V) {
 return Visit(V.getSymbol());
   }
 
-  std::string VisitNonLocConcreteInt(nonloc::ConcreteInt V) {
+  std::string VisitConcreteInt(nonloc::ConcreteInt V) 

[clang] [analyzer][NFC] Rework SVal kind representation (PR #71039)

2023-11-02 Thread Balazs Benics via cfe-commits

https://github.com/steakhal created 
https://github.com/llvm/llvm-project/pull/71039

The goal of this patch is to refine how the `SVal` base and sub-kinds are 
represented by forming one unified enum describing the possible SVals. This 
means that the `unsigned SVal::Kind` and the attached bit-packing semantics 
would be replaced by a single unified enum. This is more conventional and leads 
to a better debugging experience by default. This eases the need of using debug 
pretty-printers, or the use of runtime functions doing the printing for us like 
we do today by calling `Val.dump()` whenever we inspect the values.

Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated the 
following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`. The rest 
of the upper bits represented the sub-kind, where the value represented the 
index among only the `Loc`s or `NonLoc`s, effectively attaching 2 meanings of 
the upper bits depending on the base-kind. We don't need to pack these bits, as 
we have plenty even if we would use just a plan-old `unsigned char`.

Consequently, in this patch, I propose to lay out all the (non-abstract) `SVal` 
kinds into a single enum, along with some metadata (`BEGIN_Loc`, `END_Loc`, 
`BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar how we do with 
the `MemRegions`.

Note that in the unified `SVal::Kind` enum, to differentiate 
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with `Loc` 
and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the `nonloc::Kind` 
enum items with `inline constexpr` global constants to mimic the original 
behavior - and offer nicer spelling to these enum values.

Some `SVal` constructors were not marked explicit, which I now mark as such to 
follow best practices, and marked others as `/*implicit*/` to clarify the 
intent.
During refactoring, I also found at least one function not marked 
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.

The `TypeRetrievingVisitor` visitor had some accidental dead code, namely: 
`VisitNonLocConcreteInt` and `VisitLocConcreteInt`.

Previously, the `SValVisitor` expected visit handlers of 
`VisitNonLocX(nonloc::X)` and `VisitLocX(loc::X)`, where I felt 
that envoding `NonLoc` and `Loc` in the name is not necessary as the type of 
the parameter would select the right overload anyways, so I simplified the 
naming of those visit functions.

The rest of the diff is a lot of times just formatting, because `getKind()` by 
nature, frequently appears in switches, which means that the whole switch gets 
automatically reformatted. I could probably undo the formatting, but I didn't 
want to deviate from the rule unless explicitly requested.

>From 3bc43ab005aa76a43644d4d93286215b490cc8fa Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Thu, 2 Nov 2023 10:21:03 +0100
Subject: [PATCH] [analyzer][NFC] Rework SVal kind representation

The goal of this patch is to refine how the `SVal` base and sub-kinds
are represented by forming one unified enum describing the possible SVals.
This means that the `unsigned SVal::Kind` and the attached bit-packing
semantics would be replaced by a single unified enum.
This is more conventional and leads to a better debugging experience by default.
This eases the need of using debug pretty-printers, or the use of
runtime functions doing the printing for us like we do today by calling
`Val.dump()` whenever we inspect the values.

Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated
the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`.
The rest of the upper bits represented the sub-kind, where the value
represented the index among only the `Loc`s or `NonLoc`s, effectively
attaching 2 meanings of the upper bits depending on the base-kind.
We don't need to pack these bits, as we have plenty even if we would use
just a plan-old `unsigned char`.

Consequently, in this patch, I propose to lay out all the (non-abstract)
`SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`,
`END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar
how we do with the `MemRegions`.

Note that in the unified `SVal::Kind` enum, to differentiate
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with
`Loc` and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the
`nonloc::Kind` enum items with `inline constexpr` global constants to
mimic the original behavior - and offer nicer spelling to these enum
values.

Some `SVal` constructors were not marked explicit, which I now mark as
such to follow best practices, and marked others as `/*implicit*/` to
clarify the intent.
During refactoring, I also found at least one function not marked
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.

The `TypeRetrievingVisitor` visitor had some accidental dead code,
namely: `VisitNonLocConcreteInt` and