Re: [rust-dev] Integer overflow, round -2147483648

Gábor Lehel Sun, 22 Jun 2014 04:59:58 -0700

On Sat, Jun 21, 2014 at 3:31 AM, Jerry Morrison <jhm...@gmail.com> wrote:


>
> On Fri, Jun 20, 2014 at 5:36 PM, Gábor Lehel <glaebho...@gmail.com> wrote:
>
>>
>>
>>
>> On Sat, Jun 21, 2014 at 1:37 AM, Jerry Morrison <jhm...@gmail.com> wrote:
>>
>>>
>>> On Fri, Jun 20, 2014 at 2:07 PM, Gábor Lehel <glaebho...@gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jun 19, 2014 at 9:05 AM, Jerry Morrison <jhm...@gmail.com>
>>>> wrote:
>>>>
>>>>> Nice analysis!
>>>>>
>>>>> Over what scope should programmers pick between Gábor's 3 categories?
>>>>>
>>>>> The "wraparound is desired" category should only occur in narrow parts
>>>>> of code, like computing a hash. That suits a wraparound-operator better
>>>>> than a wraparound-type, and certainly better than a compiler switch. And 
>>>>> it
>>>>> doesn't make sense for a type like 'int' that doesn't have a fixed size.
>>>>>
>>>>
>>>> I thought hash algorithms were precisely the kind of case where you
>>>> might opt for types which were clearly defined as wrapping. Why do you
>>>> think using different operators instead would be preferred?
>>>>
>>>
>>> Considering a hashing or CRC example, the code reads a bunch of
>>> non-wraparound values, mashes them together using wraparound arithmetic,
>>> then uses the result in a way that does not mean to wrap around at the
>>> integer size.
>>>
>>> It's workable to convert inputs to wraparound types and use
>>> wraparound accumulators, then convert the result to a non-wraparound type.
>>> But using wraparound operators seems simpler, more visible, and less
>>> error-prone. E.g. it'd be a mistake if the hash function returned a
>>> wraparound type, which gets assigned with type inference, and so downstream
>>> operations wrap around.
>>>
>>
>> Yes, the signature of the hash function shouldn't necessarily expose the
>> implementation's use of wraparound types... though it's not completely
>> obvious to me. What kind of downstream operations would it make sense to
>> perform on a hash value anyway? Anything besides further hashing?
>>
>> I'm only minimally knowledgeable about hashing algorithms, but I would've
>> thought that casting the inputs to wraparound types at the outset and then
>> casting the result back at the end would be *less* error prone than making
>> sure to use the wraparound version for every operation in the function. Is
>> that wrong? Are there any operations within the body of the hash function
>> where overflow should be caught?
>>
>> And if we'd be going with separate operators instead of separate types,
>> hash functions are a niche enough use case that, in themselves, I don't
>> think they *warrant* having distinct symbolic operators for the wraparound
>> operations; they could just use named methods instead.
>>
>> Hashing is the one that always comes up, but are there any other
>> instances where wraparound is the desired semantics?
>>
>
> Here's an example hash function from *Effective Java
> <http://www.amazon.com/Effective-Java-2nd-Joshua-Bloch-ebook/dp/B000WJOUPA/ref=sr_1_1?ie=UTF8&qid=1403312109&sr=8-1&keywords=joshua+bloch+effective+java>*
>  (page
> 48) following its recipe for writing hash functions by combining the
> object's significant fields:
>
> @Override public int hashCode() {
>
> int result = 17;
>
> result = 31 * result + areaCode;
>
> result = 31 * result + prefix;
>
> result = 31 * result + lineNumber;
>
> return result;
>
> }
>
>
> So using Swift's wraparound operators in Java looks like:
>
> @Override public int hashCode() {
>
> int result = 17;
>
> result = 31 &* result &+ areaCode;
>
> result = 31 &* result &+ prefix;
>
> result = 31 &* result &+ lineNumber;
>
> return result;
>
> }
>
>
> Alternatively, with a wraparound integer type wint (note that int is
> defined to be 32 bits in Java):
>
> @Override public int hashCode() {
>
> wint result = 17;
>
> result = (wint) 31 * result + (wint) areaCode;
>
> result = (wint) 31 * result + (wint) prefix;
>
> result = (wint) 31 * result + (wint) lineNumber;
>
> return (int) result;
>
> }
>
>
> In this example, it's easier to get the first one right than the second
> one.
>

Thanks. I think the first `(wint)` cast in the middle three lines might be
avoidable in Rust. And once you've written `int hashCode()` and `wint
result`, the typechecker should complain about any casts that you
accidentally forget or get wrong. Given these, the winner is no longer so
clear to me. But yeah, the operator-based version isn't as obviously worse
as I had assumed.


>
> The prototypical use for a hash code is to index into a hash table modulo
> the table's current size. It can also be used for debugging, e.g. Java's
> default toString() method uses the object's class name and hash, returning
> something like "PhoneNumber@163b91".
>
> Another example of wraparound math is computing a checksum like CRC32.
> The checksum value is typically sent over a wire or stored in a storage
> medium to cross-check data integrity at the receiving end. After computing
> the checksum, you only want to pass it around and compare it.
>
> The only other example that comes to mind is emulating the arithmetic
> operations of a target CPU or other hardware.
>
> In other cases of bounded numbers, like ARGB color components, one wants
> to deal with overflow, not silently wraparound.
>
> Implementing BigInt can use wraparound math if it can also get the carry
> bit.
>
> Yes, these cases are so few that named operators may suffice. That's a bit
> less convenient but linguistically simpler than Swift's 5 wraparound
> arithmetic operators.
>
>
>
>>>
>>>>
>>>>>
>>>>> The "wraparound is undesired but performance is critical" category
>>>>> occurs in the most performance critical bits of code [I'm doubting that 
>>>>> all
>>>>> parts of all Rust programs are performance critical], and programmers need
>>>>> to make the trade-off over that limited scope. Maybe via operators or
>>>>> types, but not by a compiler switch over whole compilation units.
>>>>>
>>>>> That leaves "wraparound is undesired and performance is not critical"
>>>>> category for everything else. The choice between checked vs. unbounded
>>>>> sounds like typing.
>>>>>
>>>>> BigUint is weird: It can underflow but not overflow. When you use its
>>>>> value in a more bounded way you'll need to bounds-check it then, whether 
>>>>> it
>>>>> can go negative or not. Wouldn't it be easier to discard it than squeeze 
>>>>> it
>>>>> into the wraparound or checked models?
>>>>>
>>>>
>>>> Making the unbounded integer types implement the Checking/Wrapping
>>>> traits is more for completeness than anything else, I'm not sure whether it
>>>> has practical value.
>>>>
>>>>  A BigUint/Natural type is not as important as BigInt/Integer, but it
>>>> can be nice to have. Haskell only has Integer in the Prelude, but an
>>>> external package provides Natural, and there've been proposals to mainline
>>>> it. It's useful for function inputs where only nonnegative values make
>>>> sense. You could write asserts manually, but you might as well factor them
>>>> out. And types are documentation.
>>>>
>>>> The Haskell implementation of Natural is just a newtype over Integer
>>>> with added checks, and the same thing might make sense for Rust.
>>>>
>>>
>>> I see. Good points.
>>>
>>>
>>>>
>>>> On Wed, Jun 18, 2014 at 11:21 AM, Brian Anderson <bander...@mozilla.com
>>>> > wrote:
>>>>
>>>>>
>>>>> On 06/18/2014 10:08 AM, Gábor Lehel wrote:
>>>>>
>>>>>>
>>>>>> # Checked math
>>>>>>
>>>>>> For (2), the standard library offers checked math in the
>>>>>> `CheckedAdd`, `CheckedMul` etc. traits, as well as integer types of
>>>>>> unbounded size: `BigInt` and `BigUint`. This is good, but it's not 
>>>>>> enough.
>>>>>> The acid test has to be whether for non-performance-critical code, people
>>>>>> are actually *using* checked math. If they're not, then we've failed.
>>>>>>
>>>>>> `CheckedAdd` and co. are important to have for flexibility, but
>>>>>> they're far too unwieldy for general use. People aren't going to write
>>>>>> `.checked_add(2).unwrap()` when they can write `+ 2`. A more adequate
>>>>>> design might be something like this:
>>>>>>
>>>>>>  * Have traits for all the arithmetic operations for both checking on
>>>>>> overflow and for wrapping around on overflow, e.g. `CheckedAdd` (as now),
>>>>>> `WrappingAdd`, `CheckedSub`, `WrappingSub`, and so on.
>>>>>>
>>>>>>  * Offer convenience methods for the Checked traits which perform
>>>>>> `unwrap()` automatically.
>>>>>>
>>>>>>  * Have separate sets of integer types which check for overflow and
>>>>>> which wrap around on overflow. Whatever they're called: `CheckedU8`,
>>>>>> `checked::u8`, `uc8`, ...
>>>>>>
>>>>>>  * Both sets of types implement all of the Checked* and Wrapping*
>>>>>> traits. You can use explicit calls to get either behavior with either 
>>>>>> types.
>>>>>>
>>>>>>  * The checked types use the Checked traits to implement the operator
>>>>>> overloads (`Add`, Mul`, etc.), while the wrapping types use the Wrapping
>>>>>> traits to implement them. In other words, the difference between the 
>>>>>> types
>>>>>> is (probably only) in the behavior of the operators.
>>>>>>
>>>>>>  * `BigInt` also implements all of the Wrapping and Checked traits:
>>>>>> because overflow never happens, it can claim to do anything if it "does
>>>>>> happen". `BigUint` implements all of them except for the Wrapping traits
>>>>>> which may underflow, such as `WrappingSub`, because it has nowhere to 
>>>>>> wrap
>>>>>> around to.
>>>>>>
>>>>>> Another option would be to have just one set of types but two sets of
>>>>>> operators, like Swift does. I think that would work as well, or even
>>>>>> better, but we've been avoiding having any operators which aren't 
>>>>>> familiar
>>>>>> from C.
>>>>>>
>>>>>
>>>>> The general flavor of this proposal w/r/t checked arithmetic sounds
>>>>> pretty reasonable to me, and we can probably make progress on this now. I
>>>>> particularly think that having checked types that use operator overloading
>>>>> is important for ergonomics.
>>>>>  _______________________________________________
>>>>> Rust-dev mailing list
>>>>> Rust-dev@mozilla.org
>>>>> https://mail.mozilla.org/listinfo/rust-dev
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>    Jerry
>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>    Jerry
>>>
>>
>>
>
>
> --
>    Jerry
>

_______________________________________________
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev

Re: [rust-dev] Integer overflow, round -2147483648

Reply via email to