ChuanqiXu9 wrote:

> > Yes, I explained this in the "Some low level details" section. The size of 
> > a source location won't be affected, since its underlying type is unsigned 
> > (in practice, 32 bits on most platforms) and we use uint64_t as the unit 
> > in the serializer. So there are 32 bits that are not fully used. The plan 
> > is to store the module file index in the higher 32 bits, and it shouldn't 
> > be a safety problem. Maybe the original wording was not so clear. I've updated it.
> 
> Thank you, using 64 bits in the serialization format makes sense! This also 
> means that whenever Clang is configured with 64 bit `SourceLocation`, we 
> should be using 96 bits for serialization: 32 bits for the module file index 
> and 64 bits for the offset itself, correct?

If Clang is configured with 64-bit `SourceLocation`, we can't use 96 bits for 
serialization; we can use at most 64 bits per slot. In that case, we can only 
assume that the offset of a source location **within its own module** (not the 
global offset!) is not larger than 2^32. I hope that assumption holds in practice.
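
To illustrate the layout discussed above (module file index in the higher 32 
bits, module-local offset in the lower 32 bits of a single uint64_t slot), 
here is a minimal sketch. The names `EncodedLoc`, `packLoc`, and `unpackLoc` 
are hypothetical and are not the functions used in the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; not Clang's actual serialization API.
struct EncodedLoc {
  std::uint32_t ModuleFileIndex; // local index of the owning module file
  std::uint32_t Offset;          // offset within that module file
};

// Pack the pair into one 64-bit slot: index in the higher 32 bits,
// offset in the lower 32 bits.
inline std::uint64_t packLoc(EncodedLoc L) {
  return (static_cast<std::uint64_t>(L.ModuleFileIndex) << 32) | L.Offset;
}

// Recover the pair from a 64-bit slot.
inline EncodedLoc unpackLoc(std::uint64_t Raw) {
  return {static_cast<std::uint32_t>(Raw >> 32),
          static_cast<std::uint32_t>(Raw & 0xFFFFFFFFu)};
}
```

With this layout, an offset within a module that needs more than 32 bits 
cannot be represented, which is the limitation described above.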

> 
> > The only trade-off I saw with this change is that it may increase the size 
> > of **on-disk** .pcm files, because we use the VBR6 format to shrink small 
> > numbers. But on the one hand, we would still need to pay for more space if 
> > we wanted to use a `{local-module-index, offset-within-module}` pair (thanks 
> > for the good name suggestion). On the other hand, the experiment shows 
> > that the overhead is acceptable.
> 
> Sorry, I don't quite understand. Are you saying you did or did not try to 
> encode this as two separate 32bit values?

I **tried** to encode this as two separate 32-bit values, but that would break 
too much code, since a lot of places assume that a source location can be 
encoded as a single uint64_t.

What I mean is that the VBR6 format 
(https://llvm.org/docs/BitCodeFormat.html#variable-width-integer) saves space 
for small integers in **on-disk** .pcm files (the in-memory representation is 
unaffected). For example, VBR6 can store the 64-bit unsigned value `1` in only 
6 bits, as `000001`. Once the module file index occupies the higher 32 bits, 
the encoded values are no longer small, so the .pcm files grow even though I 
don't use extra slots to store the module file index.
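
As a rough sketch of that size effect: in VBR6, each 6-bit chunk carries 5 
payload bits plus one continuation bit. The helper below (the name `vbr6Bits` 
is hypothetical, not an LLVM API) counts the bits an encoding would take:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of on-disk bits a VBR6 encoding spends on Value.
// Each 6-bit chunk holds 5 payload bits; the high bit marks "more chunks follow".
inline unsigned vbr6Bits(std::uint64_t Value) {
  unsigned Chunks = 1;        // even zero takes one chunk
  while ((Value >>= 5) != 0)  // consume 5 payload bits per chunk
    ++Chunks;
  return Chunks * 6;
}
```

Under this model, the value `1` takes 6 bits, while the same offset with a 
module file index of 1 in the higher 32 bits, `(1ULL << 32) | 1`, takes 7 
chunks, i.e. 42 bits.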

https://github.com/llvm/llvm-project/pull/86912
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
