[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-10 Thread Chuanqi Xu via cfe-commits

ChuanqiXu9 wrote:

@jansvoboda11 ping~ I hope we can land the series of the patches in 2 weeks to 
give more baking times for them

https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-12 Thread Jan Svoboda via cfe-commits

https://github.com/jansvoboda11 approved this pull request.

LGTM, but I think it would be better to split out the `LazyIdentifierInfoPtr` 
change into a separate commit.

https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-17 Thread Chuanqi Xu via cfe-commits

ChuanqiXu9 wrote:

Thanks. Will do when landing.

https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-03 Thread Chuanqi Xu via cfe-commits

https://github.com/ChuanqiXu9 edited 
https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-03 Thread Jan Svoboda via cfe-commits


@@ -3896,7 +3903,7 @@ void ASTWriter::WriteIdentifierTable(Preprocessor &PP,
 
   // Write out identifiers if either the ID is local or the identifier has
   // changed since it was loaded.
-  if (ID >= FirstIdentID || !Chain || !II->isFromAST() ||
+  if (isLocalIdentifierID(ID) || !Chain || !II->isFromAST() ||

jansvoboda11 wrote:

Ah, got it. I was looking at this line in `ASTWriter.h`:

```c++
  /// The first ID number we can use for our own identifiers.
  serialization::IdentID FirstIdentID = serialization::NUM_PREDEF_IDENT_IDS;
```

and didn't realize it's being re-initialized in 
`ASTWriter::ReaderInitialized()`.

Doesn't that make the `!II->isFromAST()` check redundant then?

https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-03 Thread Jan Svoboda via cfe-commits


@@ -918,7 +918,7 @@ ASTSelectorLookupTrait::ReadKey(const unsigned char* d, 
unsigned) {
   SelectorTable &SelTable = Reader.getContext().Selectors;
   unsigned N = endian::readNext(d);
   const IdentifierInfo *FirstII = Reader.getLocalIdentifier(
-  F, endian::readNext(d));
+  F, endian::readNext(d));

jansvoboda11 wrote:

You're right, it would fail when looking for a `getSwappedBytes()` overload.

https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-04 Thread Chuanqi Xu via cfe-commits


@@ -3896,7 +3903,7 @@ void ASTWriter::WriteIdentifierTable(Preprocessor &PP,
 
   // Write out identifiers if either the ID is local or the identifier has
   // changed since it was loaded.
-  if (ID >= FirstIdentID || !Chain || !II->isFromAST() ||
+  if (isLocalIdentifierID(ID) || !Chain || !II->isFromAST() ||

ChuanqiXu9 wrote:

Nice catch! The  `!II->isFromAST()` check looks redundant indeed. I've made a 
NFC patch to address this:  
https://github.com/llvm/llvm-project/commit/799ae77993fa5d7b0638f10b3895090f8748de92

https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-04 Thread Chuanqi Xu via cfe-commits

https://github.com/ChuanqiXu9 updated 
https://github.com/llvm/llvm-project/pull/92085

>From e8f756ec7f8ea7e5bf18cc122a965fb2f258fd15 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Tue, 14 May 2024 15:33:12 +0800
Subject: [PATCH] [Serialization] No transitive identifier change

---
 .../clang/Lex/ExternalPreprocessorSource.h|  54 -
 clang/include/clang/Lex/HeaderSearch.h|  12 +-
 .../include/clang/Serialization/ASTBitCodes.h |   2 +-
 clang/include/clang/Serialization/ASTReader.h |  19 ++-
 .../include/clang/Serialization/ModuleFile.h  |   3 -
 clang/lib/Lex/HeaderSearch.cpp|  33 +++---
 clang/lib/Serialization/ASTReader.cpp |  98 
 clang/lib/Serialization/ASTWriter.cpp |  63 ++
 clang/lib/Serialization/GlobalModuleIndex.cpp |   3 +-
 clang/lib/Serialization/ModuleFile.cpp|   1 -
 .../no-transitive-identifier-change.cppm  | 110 ++
 11 files changed, 286 insertions(+), 112 deletions(-)
 create mode 100644 clang/test/Modules/no-transitive-identifier-change.cppm

diff --git a/clang/include/clang/Lex/ExternalPreprocessorSource.h 
b/clang/include/clang/Lex/ExternalPreprocessorSource.h
index 6775841860373..48429948dbffe 100644
--- a/clang/include/clang/Lex/ExternalPreprocessorSource.h
+++ b/clang/include/clang/Lex/ExternalPreprocessorSource.h
@@ -36,12 +36,64 @@ class ExternalPreprocessorSource {
   /// Return the identifier associated with the given ID number.
   ///
   /// The ID 0 is associated with the NULL identifier.
-  virtual IdentifierInfo *GetIdentifier(unsigned ID) = 0;
+  virtual IdentifierInfo *GetIdentifier(uint64_t ID) = 0;
 
   /// Map a module ID to a module.
   virtual Module *getModule(unsigned ModuleID) = 0;
 };
 
+// Either a pointer to an IdentifierInfo of the controlling macro or the ID
+// number of the controlling macro.
+class LazyIdentifierInfoPtr {
+  // If the low bit is clear, a pointer to the IdentifierInfo. If the low
+  // bit is set, the upper 63 bits are the ID number.
+  mutable uint64_t Ptr = 0;
+
+public:
+  LazyIdentifierInfoPtr() = default;
+
+  explicit LazyIdentifierInfoPtr(const IdentifierInfo *Ptr)
+  : Ptr(reinterpret_cast(Ptr)) {}
+
+  explicit LazyIdentifierInfoPtr(uint64_t ID) : Ptr((ID << 1) | 0x01) {
+assert((ID << 1 >> 1) == ID && "ID must require < 63 bits");
+if (ID == 0)
+  Ptr = 0;
+  }
+
+  LazyIdentifierInfoPtr &operator=(const IdentifierInfo *Ptr) {
+this->Ptr = reinterpret_cast(Ptr);
+return *this;
+  }
+
+  LazyIdentifierInfoPtr &operator=(uint64_t ID) {
+assert((ID << 1 >> 1) == ID && "IDs must require < 63 bits");
+if (ID == 0)
+  Ptr = 0;
+else
+  Ptr = (ID << 1) | 0x01;
+
+return *this;
+  }
+
+  /// Whether this pointer is non-NULL.
+  ///
+  /// This operation does not require the AST node to be deserialized.
+  bool isValid() const { return Ptr != 0; }
+
+  /// Whether this pointer is currently stored as ID.
+  bool isID() const { return Ptr & 0x01; }
+
+  IdentifierInfo *getPtr() const {
+assert(!isID());
+return reinterpret_cast(Ptr);
+  }
+
+  uint64_t getID() const {
+assert(isID());
+return Ptr >> 1;
+  }
+};
 }
 
 #endif
diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index 5ac634d4e..65700b8f9dc11 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -16,6 +16,7 @@
 #include "clang/Basic/SourceLocation.h"
 #include "clang/Basic/SourceManager.h"
 #include "clang/Lex/DirectoryLookup.h"
+#include "clang/Lex/ExternalPreprocessorSource.h"
 #include "clang/Lex/HeaderMap.h"
 #include "clang/Lex/ModuleMap.h"
 #include "llvm/ADT/ArrayRef.h"
@@ -119,13 +120,6 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned IsValid : 1;
 
-  /// The ID number of the controlling macro.
-  ///
-  /// This ID number will be non-zero when there is a controlling
-  /// macro whose IdentifierInfo may not yet have been loaded from
-  /// external storage.
-  unsigned ControllingMacroID = 0;
-
   /// If this file has a \#ifndef XXX (or equivalent) guard that
   /// protects the entire contents of the file, this is the identifier
   /// for the macro that controls whether or not it has any effect.
@@ -134,7 +128,7 @@ struct HeaderFileInfo {
   /// the controlling macro of this header, since
   /// getControllingMacro() is able to load a controlling macro from
   /// external storage.
-  const IdentifierInfo *ControllingMacro = nullptr;
+  LazyIdentifierInfoPtr LazyControllingMacro;
 
   /// If this header came from a framework include, this is the name
   /// of the framework.
@@ -580,7 +574,7 @@ class HeaderSearch {
   /// no-op \#includes.
   void SetFileControllingMacro(FileEntryRef File,
const IdentifierInfo *ControllingMacro) {
-getFileInfo(File).ControllingMacro = ControllingMacro;
+getFileInfo(File).LazyControllingMacro = ControllingMacro;
   }
 

[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-04 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff 799ae77993fa5d7b0638f10b3895090f8748de92 
e8f756ec7f8ea7e5bf18cc122a965fb2f258fd15 -- 
clang/test/Modules/no-transitive-identifier-change.cppm 
clang/include/clang/Lex/ExternalPreprocessorSource.h 
clang/include/clang/Lex/HeaderSearch.h 
clang/include/clang/Serialization/ASTBitCodes.h 
clang/include/clang/Serialization/ASTReader.h 
clang/include/clang/Serialization/ModuleFile.h clang/lib/Lex/HeaderSearch.cpp 
clang/lib/Serialization/ASTReader.cpp clang/lib/Serialization/ASTWriter.cpp 
clang/lib/Serialization/GlobalModuleIndex.cpp 
clang/lib/Serialization/ModuleFile.cpp
``





View the diff from clang-format here.


``diff
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index 29a7fcb2cb..fb41a65608 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -3436,8 +3436,8 @@ llvm::Error ASTReader::ReadASTBlock(ModuleFile &F,
   F.BaseIdentifierID = getTotalNumIdentifiers();
 
   if (F.LocalNumIdentifiers > 0)
-IdentifiersLoaded.resize(IdentifiersLoaded.size()
- + F.LocalNumIdentifiers);
+IdentifiersLoaded.resize(IdentifiersLoaded.size() +
+ F.LocalNumIdentifiers);
   break;
 }
 

``




https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-19 Thread Chuanqi Xu via cfe-commits

https://github.com/ChuanqiXu9 updated 
https://github.com/llvm/llvm-project/pull/92085

>From a2f883278bafb311eb290015a9cf04affeb8d760 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Tue, 14 May 2024 15:33:12 +0800
Subject: [PATCH] [Serialization] No transitive identifier change

---
 .../clang/Lex/ExternalPreprocessorSource.h|   2 +-
 .../include/clang/Serialization/ASTBitCodes.h |   2 +-
 clang/include/clang/Serialization/ASTReader.h |  19 ++-
 .../include/clang/Serialization/ModuleFile.h  |   3 -
 clang/lib/Serialization/ASTReader.cpp |  96 ---
 clang/lib/Serialization/ASTWriter.cpp |  56 ++---
 clang/lib/Serialization/GlobalModuleIndex.cpp |   3 +-
 clang/lib/Serialization/ModuleFile.cpp|   1 -
 .../no-transitive-identifier-change.cppm  | 110 ++
 9 files changed, 210 insertions(+), 82 deletions(-)
 create mode 100644 clang/test/Modules/no-transitive-identifier-change.cppm

diff --git a/clang/include/clang/Lex/ExternalPreprocessorSource.h 
b/clang/include/clang/Lex/ExternalPreprocessorSource.h
index 6672ada2ee732..48429948dbffe 100644
--- a/clang/include/clang/Lex/ExternalPreprocessorSource.h
+++ b/clang/include/clang/Lex/ExternalPreprocessorSource.h
@@ -36,7 +36,7 @@ class ExternalPreprocessorSource {
   /// Return the identifier associated with the given ID number.
   ///
   /// The ID 0 is associated with the NULL identifier.
-  virtual IdentifierInfo *GetIdentifier(unsigned ID) = 0;
+  virtual IdentifierInfo *GetIdentifier(uint64_t ID) = 0;
 
   /// Map a module ID to a module.
   virtual Module *getModule(unsigned ModuleID) = 0;
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h 
b/clang/include/clang/Serialization/ASTBitCodes.h
index 9f0f900a02914..570b088532414 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -59,7 +59,7 @@ const unsigned VERSION_MINOR = 1;
 ///
 /// The ID numbers of identifiers are consecutive (in order of discovery)
 /// and start at 1. 0 is reserved for NULL.
-using IdentifierID = uint32_t;
+using IdentifierID = uint64_t;
 
 /// The number of predefined identifier IDs.
 const unsigned int NUM_PREDEF_IDENT_IDS = 1;
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index 480f852e3bf07..0e95d82928459 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -661,14 +661,6 @@ class ASTReader
   /// been loaded.
   std::vector IdentifiersLoaded;
 
-  using GlobalIdentifierMapType =
-  ContinuousRangeMap;
-
-  /// Mapping from global identifier IDs to the module in which the
-  /// identifier resides along with the offset that should be added to the
-  /// global identifier ID to produce a local ID.
-  GlobalIdentifierMapType GlobalIdentifierMap;
-
   /// A vector containing macros that have already been
   /// loaded.
   ///
@@ -1547,6 +1539,11 @@ class ASTReader
   /// Translate a \param GlobalDeclID to the index of DeclsLoaded array.
   unsigned translateGlobalDeclIDToIndex(GlobalDeclID ID) const;
 
+  /// Translate an \param IdentifierID ID to the index of IdentifiersLoaded
+  /// array and the corresponding module file.
+  std::pair
+  translateIdentifierIDToIndex(serialization::IdentifierID ID) const;
+
 public:
   /// Load the AST file and validate its contents against the given
   /// Preprocessor.
@@ -2131,7 +2128,7 @@ class ASTReader
   /// Load a selector from disk, registering its ID if it exists.
   void LoadSelector(Selector Sel);
 
-  void SetIdentifierInfo(unsigned ID, IdentifierInfo *II);
+  void SetIdentifierInfo(serialization::IdentifierID ID, IdentifierInfo *II);
   void SetGloballyVisibleDecls(IdentifierInfo *II,
const SmallVectorImpl &DeclIDs,
SmallVectorImpl *Decls = nullptr);
@@ -2158,10 +2155,10 @@ class ASTReader
 return DecodeIdentifierInfo(ID);
   }
 
-  IdentifierInfo *getLocalIdentifier(ModuleFile &M, unsigned LocalID);
+  IdentifierInfo *getLocalIdentifier(ModuleFile &M, uint64_t LocalID);
 
   serialization::IdentifierID getGlobalIdentifierID(ModuleFile &M,
-unsigned LocalID);
+uint64_t LocalID);
 
   void resolvePendingMacro(IdentifierInfo *II, const PendingMacroInfo &PMInfo);
 
diff --git a/clang/include/clang/Serialization/ModuleFile.h 
b/clang/include/clang/Serialization/ModuleFile.h
index 56193d44dd6f3..3787f4eeb8a8b 100644
--- a/clang/include/clang/Serialization/ModuleFile.h
+++ b/clang/include/clang/Serialization/ModuleFile.h
@@ -310,9 +310,6 @@ class ModuleFile {
   /// Base identifier ID for identifiers local to this module.
   serialization::IdentifierID BaseIdentifierID = 0;
 
-  /// Remapping table for identifier IDs in this module.
-  ContinuousRangeMap IdentifierRemap;
-
   /// Actual data for the on-disk hash

[clang] [Serialization] No transitive identifier change (PR #92085)

2024-06-19 Thread Chuanqi Xu via cfe-commits

https://github.com/ChuanqiXu9 closed 
https://github.com/llvm/llvm-project/pull/92085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits