ChuanqiXu9 wrote:

> > Yes, I explained this in Some low level details section. The size of source
> > location won't be affected. Since the size of source location is unsigned
> > (practically, it is 32 bits in most platforms). And we use uint64_t as a
> > unit in the serializer. So there are 32 bit not used completely. The plan
> > is to store the module file index in the higher 32 bits and it shouldn't be
> > a safe problem. Maybe the original wording is not so clear. I've updated it.
>
> Thank you, using 64 bits in the serialization format makes sense! This also
> means that whenever Clang is configured with 64 bit `SourceLocation`, we
> should be using 96 bits for serialization: 32 bits for the module file index
> and 64 bits for the offset itself, correct?

If Clang is configured with 64 bit `SourceLocation`, we can't use 96 bits for serialization. We can use at most 64 bits for a slot. In that case, we can only assume that the offset of a source location **in its own module** (not the global offset!) is not larger than 2^32. I hope that assumption is never violated in practice.

> > The only trade-off I saw about this change is that it may increase the size
> > of **on-disk** .pcm files due to we use VBR6 format to decrease the size of
> > small numbers. But on the one side, we still need to pay for more spaces if
> > we want to use `{local-module-index, offset-within-module} pair` (Thanks
> > for the good name suggestion). On the other hand, from the experiment, it
> > shows the overhead is acceptable.
>
> Sorry, I don't quite understand. Are you saying you did or did not try to
> encode this as two separate 32bit values?

I **tried** to encode this as two separate 32-bit values, but that breaks too much code, since a lot of places assume that we can encode a source location as a single uint64_t.

What I mean is that the VBR6 format (https://llvm.org/docs/BitCodeFormat.html#variable-width-integer) saves space for small integers in **on-disk** .pcm files (the in-memory representation stays the same). For example, VBR6 needs only 6 bits, `000001`, to store the 64-bit unsigned value `1` on disk. So even though I don't use extra slots to store the module file index, the .pcm files still grow, because a value that carries a module file index in its higher 32 bits is no longer a small number.

https://github.com/llvm/llvm-project/pull/86912
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
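
To make the size argument above concrete, here is a minimal sketch of the packing scheme and of how many on-disk bits a VBR6 value takes. It is an illustration only, not the code from the patch: `packLocation`, `moduleFileIndexOf`, `offsetOf` and `vbrBits` are hypothetical names, and the layout (module file index in the higher 32 bits, offset within the module in the lower 32 bits) is assumed from the description in this thread. The bit count follows the VBR rule in the LLVM bitcode documentation, where each 6-bit chunk carries 5 payload bits plus a continuation bit.

```cpp
#include <cassert>
#include <cstdint>

// NOTE: every name in this block is hypothetical; none of it is a Clang API.

// Pack a {local-module-index, offset-within-module} pair into one 64-bit
// slot: module file index in the higher 32 bits, offset in the lower 32 bits.
inline std::uint64_t packLocation(std::uint32_t moduleFileIndex,
                                  std::uint64_t offsetInModule) {
  // Assumption from the thread: the offset within its own module fits in 32 bits.
  assert(offsetInModule <= UINT32_MAX && "offset within module must fit in 32 bits");
  return (static_cast<std::uint64_t>(moduleFileIndex) << 32) | offsetInModule;
}

// Recover the module file index from the higher 32 bits.
inline std::uint32_t moduleFileIndexOf(std::uint64_t packed) {
  return static_cast<std::uint32_t>(packed >> 32);
}

// Recover the offset within the module from the lower 32 bits.
inline std::uint32_t offsetOf(std::uint64_t packed) {
  return static_cast<std::uint32_t>(packed & 0xFFFFFFFFu);
}

// Number of bits a VBR-N encoding of `value` occupies on disk. Each chunk is
// `chunkWidth` bits wide: (chunkWidth - 1) payload bits plus one continuation
// bit. At least one chunk is always emitted, even for zero.
inline unsigned vbrBits(std::uint64_t value, unsigned chunkWidth = 6) {
  const unsigned payloadBits = chunkWidth - 1;
  unsigned chunks = 1;
  value >>= payloadBits;
  while (value != 0) {
    ++chunks;
    value >>= payloadBits;
  }
  return chunks * chunkWidth;
}
```

Under these assumptions, `vbrBits(1)` is 6, while `vbrBits(packLocation(1, 1))` is 42: once the higher 32 bits hold a non-zero module file index, the value needs at least seven 6-bit chunks no matter how small the offset is. That is the on-disk growth described above, even though the value still occupies a single 64-bit slot.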