Re: [swift-dev] Making the sign of NaNs unspecified to enable enum layout optimization

John McCall via swift-dev Mon, 24 Oct 2016 13:38:05 -0700

> On Oct 24, 2016, at 1:23 PM, Joe Groff <[email protected]> wrote:
>> On Oct 24, 2016, at 12:58 PM, John McCall <[email protected]> wrote:
>> 
>>> On Oct 24, 2016, at 12:30 PM, Stephen Canon <[email protected]> wrote:
>>>> On Oct 24, 2016, at 2:55 PM, John McCall via swift-dev 
>>>> <[email protected]> wrote:
>>>> 
>>>>> On Oct 24, 2016, at 8:49 AM, Joe Groff via swift-dev 
>>>>> <[email protected]> wrote:
>>>>>> On Oct 22, 2016, at 10:39 AM, Chris Lattner <[email protected]> wrote:
>>>>>> 
>>>>>>> On Oct 20, 2016, at 2:59 PM, Joe Groff via swift-dev 
>>>>>>> <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> copysign( ) is a reason to not pick the first option.  I’m not very 
>>>>>>>> worried about it, but it is a reason.  I see no problem with the 
>>>>>>>> second option.
>>>>>>> 
>>>>>>> As we discussed in person this morning, de-canonicalizing b11 might be 
>>>>>>> a better compromise to minimize the potential impact of layout 
>>>>>>> optimizations. That would leave the implementation with 2^51 NaN 
>>>>>>> representations (50 significand bits, plus the sign bit) in Double to 
>>>>>>> play with, which ought to be enough for anyone™. I liked the idea of 
>>>>>>> using the sign bit originally since testing for NaNs and sign bits is 
>>>>>>> something that can be easily done using common FPU instructions without 
>>>>>>> crossing domains, but as you noted, it sounds like comparison and 
>>>>>>> branching operations tend to do that anyway, so masking and branching 
>>>>>>> using integer operations shouldn't be too much of a burden. Jordan's 
>>>>>>> question of to what degree we consider different NaN encodings to be 
>>>>>>> distinct semantic values is still an interesting one, but if we take 
>>>>>>> only the b11 NaN payloads away, that should minimize the degree to 
>>>>>>> which the implementation needs to be considered as a constraint in 
>>>>>>> having that discussion.
>>>>>> 
>>>>>> To your original email, I agree this is an important problem to tackle, 
>>>>>> and that we should handle the inhabitant masking when the FP value is 
>>>>>> converted to optional.
>>>>>> 
>>>>>> That said, I don’t understand the above.  With the “b11” representation, 
>>>>>> how is a "Double?" tested for “.None”? One advantage of using the 
>>>>>> sign bit is that “is negative” comparisons are very cheap on RISC 
>>>>>> systems, because you don’t have to materialize a large/weird immediate.
>>>>> 
>>>>> That's why I liked using the sign bit originally too. Steve noted that, 
>>>>> since any operation on an Optional is probably going to involve testing 
>>>>> and branching before revealing the underlying float value, and float 
>>>>> comparisons and branches tend to unavoidably burn a couple cycles 
>>>>> engaging the integer ALU, there's unlikely to be much benefit on ARM or 
>>>>> Intel avoiding integer masking operations. (More strictly RISCy 
>>>>> architectures like Power would be more negatively impacted, perhaps.) On 
>>>>> ARM64 at least, the bitmask for a b11 NaN is still representable as an 
>>>>> immediate, since it involves a single contiguous run of 1 bits.
>>>> 
>>>> There isn't any efficient way of just testing the sign bit of a value 
>>>> using FP instructions that I can see.  You could maybe take advantage of 
>>>> the vector registers overlapping the FP registers and use integer vector 
>>>> operations, but it would take a lot of code and have false-dependency 
>>>> problems.  So in both representations, the most efficient test sequence 
>>>> seems to be (1) get value in integer register (2) compare against some 
>>>> specific integer value.  And in that case, in both representations it 
>>>> seems to me that the obvious extra-inhabitant sequence is 0xFFFFFFFF, 
>>>> 0xFFFFFFFE, …
>>> 
>>> The test for detecting the reserved encoding is essentially identical 
>>> either way (pseudo-assembly):
>>> 
>>>     detectNegativeNaN:
>>>             ADD encoding, encoding, 0x0010000000000000
>>>             JC nil
>>> 
>>>     detectLeading11NaN:
>>>             ADD encoding, encoding, 0x0004000000000000
>>>             JO nil
>> 
>> Sure, that's basically just a different way of spelling the comparison.  For 
>> the most part, though, Swift will not need to perform this operation; it'll 
>> be checking for a specific value.  I don't see any reason to say that e.g. 
>> .none can be encoded by an arbitrary reserved NaN rather than a specific one.
> 
> When we know there's exactly one no-payload case, as with .none in Optional, 
> we do have the option of testing for an arbitrary extra inhabitant if it 
> happens to be cheaper/smaller code, since having any extra inhabitant 
> representation other than the first would be UB anyway.


Sure.
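
As a rough illustration of the two options above, this Swift sketch puts an exact match against one specific reserved bit pattern next to a range test written in the add-and-check-carry style of the pseudo-assembly. The names `firstInhabitant`, `isExactlyNone`, and `hasSignAndFullExponent` are hypothetical, and picking all-ones as the first inhabitant is an assumption for the example, not a statement of what the compiler does.

```swift
// Assumption: the first extra inhabitant is the all-ones bit pattern.
let firstInhabitant: UInt64 = 0xFFFF_FFFF_FFFF_FFFF

// Exact test: compare against one specific reserved encoding.
func isExactlyNone(_ bits: UInt64) -> Bool {
    return bits == firstInhabitant
}

// Range test: adding 2^52 carries out of 64 bits exactly when the sign bit
// and all eleven exponent bits are set. This also accepts -infinity
// (0xFFF0_0000_0000_0000), so a real check would have to exclude it.
func hasSignAndFullExponent(_ bits: UInt64) -> Bool {
    return bits.addingReportingOverflow(0x0010_0000_0000_0000).overflow
}

print(isExactlyNone(firstInhabitant))
print(hasSignAndFullExponent(0xFFFF_FFFF_FFFF_FFFE))
print(hasSignAndFullExponent(Double.nan.bitPattern))
```

If only one no-payload case exists, either test distinguishes it from every valid value that was masked on the way in; the exact comparison is the one that still works when there are several no-payload cases.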

> In these cases, either the mask or first inhabitant should fit in an ARM64 
> bitmask immediate, and are a 64-bit movabs on Intel either way, so it's 
> probably not worthwhile.

Well, if we always set the sign bit on our extra inhabitants, we end up with an 
all-ones prefix, which lets the extra inhabitants typically be small-magnitude 
negative integers, right?  Or am I missing something important?
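
A small Swift sketch of that observation, assuming (hypothetically) that extra inhabitants count down from the all-ones bit pattern: each candidate is a NaN with the sign bit set, and the same 64 bits read as a signed integer are -1, -2, -3. The `candidates` array is made up for illustration.

```swift
// Assumption: extra inhabitants count down from the all-ones bit pattern.
let candidates: [UInt64] = [
    0xFFFF_FFFF_FFFF_FFFF,
    0xFFFF_FFFF_FFFF_FFFE,
    0xFFFF_FFFF_FFFF_FFFD,
]

for bits in candidates {
    let value = Double(bitPattern: bits)
    // Reinterpret the same 64 bits as a signed integer.
    let signed = Int64(bitPattern: bits)
    print(value.isNaN, value.sign == .minus, signed)
}
```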

>> Anyway, we're agreed that both representations require doing integer 
>> comparisons on the value, not FP comparisons, and so operations on Float? 
>> will generally require moving the value between register banks if we do 
>> this.  It's not as pure a win as we might hope.  Still probably worthwhile, 
>> though.
> 
> Right. Since there's no perf benefit to using the sign bit, using b11 
> payloads has the least potential of interfering with users trying to use 
> specific NaN encodings for their own purposes.

I agree.
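
For reference, a minimal Swift sketch of the b11 test, reading "b11" as the two leading significand bits both being set alongside an all-ones exponent. The names `leading11Mask`, `isLeading11NaN`, and `reservedCount` are hypothetical. The mask is a single contiguous run of 13 one bits (bits 50 through 62), which leaves 51 free bits including the sign.

```swift
// All eleven exponent bits plus the top two significand bits.
let leading11Mask: UInt64 = 0x7FFC_0000_0000_0000

// True when the bit pattern falls in the reserved "leading 11" NaN space,
// regardless of the sign bit or the remaining 50 significand bits.
func isLeading11NaN(_ bits: UInt64) -> Bool {
    return (bits & leading11Mask) == leading11Mask
}

// Number of reserved encodings: 2 to the power of the bits not fixed by the mask.
let reservedCount = UInt64(1) << (64 - leading11Mask.nonzeroBitCount)

print(isLeading11NaN(Double.nan.bitPattern))          // default quiet NaN
print(isLeading11NaN(Double.signalingNaN.bitPattern)) // default signaling NaN
print(isLeading11NaN(0x7FFC_0000_0000_0001))
print(reservedCount == 1 << 51)
```

Under this reading, the default quiet NaN (leading bits 10) and the default signaling NaN (leading bits 01) both stay outside the reserved space.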

John.

_______________________________________________
swift-dev mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-dev