Re: [protobuf] zigzag encode and zigzag decode invoking undefined behavior

'Josh Haberman' via Protocol Buffers Fri, 28 Apr 2017 17:26:45 -0700

If you have a change that makes protobuf code not rely on undefined 
behavior, and also doesn't sacrifice performance, then we should merge it.


Feel free to send a GitHub PR. But make sure the code is C++03 (protobuf 
doesn't use C++11 yet).

On Tuesday, April 25, 2017 at 1:36:31 PM UTC-7, Wolfgang Brehm wrote:
>
> I know, but the proposed solution is still worth considering as it gives 
> you the best of both worlds with just a little compile time overhead.
> It yields the same result on 2s complement machines as the current 
> function but also on other machines because it is well defined.
> This means that
> clang++ -fsanitize=undefined
> will not have anything to complain either.
> TL;DR: You loose nothing but time and gain better defined code, but 
> basically everything stays the same.
>
> Am Dienstag, 25. April 2017 20:17:33 UTC+2 schrieb Feng Xiao:
>>
>> Right now protobuf implementation assumes 2s complement and won't work on 
>> any other environments. That's undefined behavior according to C++ standard 
>> but in practice it's very unlikely to be a problem.
>>
>> On Tue, Apr 25, 2017 at 5:42 AM, Wolfgang Brehm <wolfgan...@gmail.com> 
>> wrote:
>>
>>> right shifting negative integers is tecnically undefined, this means 
>>> that the implementation used for encoding integers:
>>> (n<<1)^(n>>(digits-1))
>>> is tecnically undefined according to:
>>>
>>>> The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits 
>>>> are zero-filled. If E1 has an unsigned type, the value of the result is E1 
>>>> × 2 E2, reduced modulo one more than the maximum value representable in 
>>>> the 
>>>> result type. Otherwise, if E1 has a signed type and non-negative value, 
>>>> and 
>>>> E1 × 2 E2 is representable in the corresponding unsigned type of the 
>>>> result 
>>>> type, then that value, converted to the result type, is the resulting 
>>>> value; otherwise, the behavior is undefined.
>>>
>>>
>>> and the implementation for decoding:
>>> (n>>1)^(-(n&1))
>>> depends on the 2s complement as well and is probably tecnically 
>>> undefined (but I cant find a quote from the standard).
>>>
>>> This is why I propose the following solution:
>>>
>>> template<typename S,class U = typename std::make_unsigned<S>::type>>
>>> U constexpr zigzag_encode(const S &n){
>>>   return (U(n)<<1)^(n<0?~U(0):U(0));
>>> }
>>>
>>> template<typename U,class S = typename std::make_signed<U>::type>>
>>> S constexpr zigzag_decode(const U &n){
>>>   return (n&1)?-1-S(n>>1):(n>>1);
>>> }
>>>
>>> This does not look as cool, but modern compilers (llvm and gcc were 
>>> tested) will compile to bascially the same instructions, gcc will introduce 
>>> 1 additional instruction for decode.
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to protobuf+u...@googlegroups.com.
>>> To post to this group, send email to prot...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/protobuf.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to protobuf+unsubscr...@googlegroups.com.
To post to this group, send email to protobuf@googlegroups.com.
Visit this group at https://groups.google.com/group/protobuf.
For more options, visit https://groups.google.com/d/optout.

Re: [protobuf] zigzag encode and zigzag decode invoking undefined behavior

Reply via email to