Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 22:56:50 UTC, Ola Fosheim Grøstad wrote: I personally would take the monotonic optimizations and rather have a separate bit-fiddling type that provides a clean builtin swiss-army-knife toolset that gives close to direct access to the whole arsenal that the CPU instruction set provides (carry, ROL/ROR, bitcounts etc). I don't think there's such a clear separation that can be expressed in a type; it's more a matter of coding practice than of type. You can't change coding practice by introducing a new type.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thursday, 27 November 2014 at 08:31:24 UTC, Kagamin wrote: I don't think there's such a clear separation that can be expressed in a type; it's more a matter of coding practice than of type. You can't change coding practice by introducing a new type. You need to separate and define the old types as well as introduce a clean way to do low-level manipulation. How to do the latter is not as clear, but… …regular types should be constrained to convey the intent of the programmer. The intent is conveyed to the compiler and to readers of the source code. So the type definition should be strict on whether the intent is to convey monotonic qualities or circular/modular qualities. The C practice of casting from void* to char* to float to uint to int in order to do bit manipulation leads to badly structured code. Intrinsics also lead to less readable code. There's got to be a better solution to keep bit hacks separate from regular code. Maybe a register type that maps onto SIMD registers…
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Kagamin: You can't change coding practice by introducing a new type. We can try to change coding practice by introducing new types :-) Bye, bearophile
Re: 'int' is enough for 'length' to migrate code from x86 to x64
When I migrated the DFL code from x86 to 64-bit, I modified drawing.d and found that 'offset', 'index', point(x,y) and rect(x,y) all have to stay consistent with the type of 'length'. So I did not change them to size_t; I only converted length to int with cast(int)length, which made it easy to migrate the DFL code to 64-bit. OK, DFL now works on 64-bit.
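The conversion described above might look like the following minimal sketch. It is not taken from DFL; the helper name `lengthAsInt` is hypothetical. The point it illustrates is that `cast(int)` truncates silently when a length exceeds `int.max`, whereas `std.conv.to!int` throws a `ConvOverflowException` in that case.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

/// Hypothetical helper (not part of DFL): converts an array length to int,
/// throwing ConvOverflowException if the length does not fit.
int lengthAsInt(T)(const(T)[] arr)
{
    return arr.length.to!int;
}

void main()
{
    int[] points = [10, 20, 30];

    // Unchecked conversion: fine for small arrays, but it would truncate
    // silently for a length above int.max.
    int unchecked = cast(int) points.length;
    assert(unchecked == 3);

    // Checked conversion of the same length.
    assert(lengthAsInt(points) == 3);

    // A size_t value one past int.max stands in for an oversized length.
    size_t huge = cast(size_t) int.max + 1;
    assertThrown!ConvOverflowException(huge.to!int);
}
```

Whether the unchecked cast is acceptable depends on whether lengths above `int.max` can occur in the code being migrated.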
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 21:34:19 UTC, Walter Bright wrote: On 11/24/2014 2:20 AM, Don wrote: I believe I do understand the problem. As a practical matter, overflow checks are not going to be added for performance reasons. The performance overhead would be practically zero. All we would need to do, is restrict array slices such that the length cannot exceed ssize_t.max. This can only happen in the case where the element type has a size of 1, and only in the case of slicing a pointer, concatenation, and memory allocation. (length1 + length2) / 2 That's not an issue with length, that's an issue with doing a calculation with an insufficient bit width. Unsigned doesn't actually help, it's still wrong. For unsigned values, if length1 = length2 = 0x8000_, that gives an answer of 0. In exchange, 99% of uses of unsigned would disappear from D code, and with it, a whole category of bugs. You're not proposing changing size_t, so I believe this statement is incorrect. From the D code that I've seen, almost all uses of size_t come directly from the use of .length. But I concede (see below) that many of them come from .sizeof. Also, in principle, uint-uint can generate a runtime check for underflow (i.e. the carry flag). No it cannot. The compiler does not have enough information to know if the value is intended to be positive integer, or an unsigned. That information is lost from the type system. Eg from C, wrapping of an unsigned type is not an error. It is perfectly defined behaviour. With signed types, it's undefined behaviour. I know it's not an error. It can be defined to be an error, and the compiler can insert a runtime check. (I'm not proposing this, just saying it can be done.) But it can't do that, without turning unsigned into a different type. You'd be turning unsigned into a 'non-negative' which is a completely different type. This is my whole point. unsigned has no sign, you just get the raw bit pattern with no interpretation. 
This can mean several things, for example:
1. extended_non_negative, where you are using it for the positive range 0.. +0x_. Then, overflow and underflow are errors.
2. A value where the highest bit is always 0. This can be safely used as int or uint.
3. Modulo 2^^32 arithmetic, where wrapping is intended.
4. Part of extended precision arithmetic, where you want the carry flag.
5. Just a raw bit pattern.
6. The high bit can be a sign bit. This is a signed type, cast to uint. If the sign bit ever flips because of a carry, that's an error.
The type system doesn't specify a meaning for the bit pattern. We've got a special type for case 6, but not for the others. The problem with unsigned is that it can mean so many things; it is as if it were a union of these possibilities. So it's not strictly typed -- you need to be careful, which requires some element of faith-based programming. And signed-unsigned mismatch is really where you are implicitly assuming that the unsigned value is case 2 or 6. But, if it is one of the other cases, you get nonsense. But those signed-unsigned mismatch errors only catch some of the possible cases where you may forget which interpretation you are using, and act as if it were another one.
To make this clear: I am not proposing that size_t should be changed. I am proposing that .length return a signed type, which for array slices is guaranteed never to be negative. There'll be mass confusion if .length is not the same type as .sizeof Ah, that is a good point. .sizeof is another source of unsigned. Again, quite unnecessarily: can a single type ever actually use up half of the memory space? (It was possible in the 8 and 16 bit days, but it's hard to imagine today.) Even sillier, it is nearly always known at compile time! But still, .sizeof is low-level in a way that .length is not.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 07:39:44 UTC, Don wrote: No, that is not overflow. That is a carry. Overflow is when the sign bit changes. I think this discussion will be less confusing if we clear up the terminology. An overflow condition happens when the representation cannot hold the magnitude of the intended type. In floating point that is +Inf and -Inf. An underflow condition happens when the representation cannot represent the precision of small numbers. In floating point that is +0, -0 and denormal numbers, detected or undetected. Carry is an extra bit that can be considered part of the computation for a concrete machine code instruction that provides carry. E.g. 32 bits + 32 bits = (32+1) bits. If the intended type is true Reals and the representation is integer then we get: 0u - 1u = overflow, 1u / 2u = underflow. Carry can be taken as an overflow condition, but it is not proper overflow if you interpret it as part of the result; that depends on the machine language instruction and the use of it. For a regular ADD/SUB instruction with carry the ALU covers two intended types (signed/unsigned) and uses the control register flags in a way which lets the programmer make the interpretation. Some SIMD instructions do not provide control register flags and are therefore true modular arithmetic that does not overflow by definition, but if you use them for representing a non-modular intended type then you get undetected overflow… Overflow is in relation to an interpretation: the intended type versus the internal representation and the concrete machine language instruction.
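As a rough illustration of the carry/overflow distinction above, druntime's `core.checkedint` exposes the condition through a flag while still returning the wrapped result. This is only a sketch of how the two cases can be observed from D code; the variable names are chosen for the example.

```d
import core.checkedint : adds, addu, subu;

void main()
{
    bool flag;

    // Unsigned subtraction 0u - 1u: the flag reports the borrow (carry),
    // and the returned value is the wrapped bit pattern.
    flag = false;
    uint diff = subu(0u, 1u, flag);
    assert(flag);
    assert(diff == uint.max);

    // Unsigned addition with a carry out of the top bit.
    flag = false;
    uint sum = addu(uint.max, 1u, flag);
    assert(flag);
    assert(sum == 0);

    // Signed addition: the flag reports that the sign bit changed
    // although the mathematical result is positive.
    flag = false;
    adds(int.max, 1, flag);
    assert(flag);

    // An in-range operation leaves the flag untouched.
    flag = false;
    assert(addu(2u, 3u, flag) == 5);
    assert(!flag);
}
```

The flag is sticky: the functions only ever set it to true, so it has to be cleared by the caller between independent checks.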
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 21:34:19 UTC, Walter Bright wrote: In exchange, 99% of uses of unsigned would disappear from D code, and with it, a whole category of bugs. You're not proposing changing size_t, so I believe this statement is incorrect. The idea is to make unsigned types opt-in, a deliberate choice of individual programmers, not forced by the language. Positive signed integers convert to unsigned integers perfectly without losing information, so mixing types will work perfectly for those who request it.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 15:56:44 UTC, Andrei Alexandrescu wrote: On 11/24/14 4:54 AM, Don wrote: In D, 1u - 2u > 0u. This is defined behaviour, not an overflow. I think I get what you mean, but overflow is also defined behavior (in D at least). -- Andrei Aargh! You're right. That's new, and dreadful. It didn't used to be. The offending commit is alexrp 2012-05-15 15:37:24 which only provides an unsigned example. Why are we defining behaviour that is always a bug? Java makes it defined, but it has to because it doesn't have unsigned types. I think the intention probably was to improve on the C situation, where there is undefined behaviour that really should be defined. But do we really want to preclude ever having overflow checking for integers?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Don: Aargh! You're right. That's new, and dreadful. It didn't used to be. The offending commit is alexrp 2012-05-15 15:37:24 which only provides an unsigned example. Why are we defining behaviour that is always a bug? Java makes it defined, but it has to because it doesn't have unsigned types. I think the intention probably was to improve on the C situation, where there is undefined behaviour that really should be defined. But do we really want to preclude ever having overflow checking for integers? +1 Bye, bearophile
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 11:43:01 UTC, Don wrote: Why are we defining behaviour that is always a bug? Java makes it defined, but it has to because it doesn't have unsigned types. I think the intention probably was to improve on the C situation, where there is undefined behaviour that really should be defined. Mostly to prevent optimizations based on a no-overflow assumption. But do we really want to preclude ever having overflow checking for integers? Overflow checking doesn't contradict overflow being defined. The latter simply reflects how hardware works, nothing else. And hardware works that way because that's a fast implementation of arithmetic for the general case.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 13:52:32 UTC, Kagamin wrote: Overflow checking doesn't contradict overflow being defined. The latter simply reflects how hardware works, nothing else. And hardware works that way because that's a fast implementation of arithmetic for the general case. So you are basically saying that D does not provide modular arithmetic, but allows you to continue with the incorrect result of an overflow as a modulo representation? Because you have to choose: you cannot have both modular arithmetic and overflow at the same time for the same operator. Overflow happens because you have monotonic semantics for addition, not modular semantics. Btw, http://dlang.org/expression needs a clean-up; the term underflow is not used correctly.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 14:30:36 UTC, Ola Fosheim Grøstad wrote: So you are basically saying that D does not provide modular arithmetic, but allows you to continue with the incorrect result of an overflow as a modulo representation? Correctness is an emergent property - when behavior matches expectation, so overflow has variable correctness in various parts of the code.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 15:42:13 UTC, Kagamin wrote: Correctness is an emergent property - when behavior matches expectation, so overflow has variable correctness in various parts of the code. I assume you are basically restating Walter's view that matching C++ is more important than getting it right, because some people might expect C++ behaviour. Yet Ada chose a different path and is considered a better language with respect to correctness. I think it is important to get the definitions consistent and sound so they are easy to reason about, both for users and implementors. So one should choose whether the type is primarily monotonic, with incorrect values truncated into modulo N, or if the type is primarily modular. If addition is defined to be primarily monotonic it means you can optimize if (x < x+1)… into if (true)…. If it is defined to be primarily modular, then you cannot.
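A small sketch of why the two readings differ, assuming D's defined wrapping for `int` as discussed earlier in the thread. The helper name `lessThanSuccessor` is made up for the illustration.

```d
/// Hypothetical helper used only for this illustration.
bool lessThanSuccessor(int x)
{
    return x < x + 1;
}

void main()
{
    assert(lessThanSuccessor(0));
    assert(lessThanSuccessor(-1));

    // With wrapping (modular) semantics, int.max + 1 is int.min, so the
    // comparison is false for this one input. A compiler therefore cannot
    // replace the test with `true` unless the type is defined as monotonic.
    assert(!lessThanSuccessor(int.max));
}
```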
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 15:52:22 UTC, Ola Fosheim Grøstad wrote: I assume you are basically restating Walter's view that matching C++ is more important than getting it right, because some people might expect C++ behaviour. Yet Ada chose a different path and is considered a better language with respect to correctness. The C++ legacy is huge, especially in culture. That said, the true issue is in beliefs (which probably stem from the 16-bit era). I can't judge Ada, having no experience with it, though the examples of Java and .NET show how marginal the importance of unsigned types is. I think it is important to get the definitions consistent and sound so they are easy to reason about, both for users and implementors. So one should choose whether the type is primarily monotonic, with incorrect values truncated into modulo N, or if the type is primarily modular. In this light the examples by Marco Leise become interesting: he tries to evade wrapping even for unsigned types, so, yes, types are primarily monotonic and optimized for small values. If addition is defined to be primarily monotonic it means you can optimize if (x < x+1)… into if (true)…. If it is defined to be primarily modular, then you cannot. Such optimizations have a bad reputation. If they were more conservative and didn't propagate back in code flow, the situation would probably be better. Also, isn't (x < x+1) a suspicious expression? Is it a good idea to mess with it?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Tuesday, 25 November 2014 at 18:24:29 UTC, Kagamin wrote: The C++ legacy is huge, especially in culture. That said, the true issue is in beliefs (which probably stem from the 16-bit era). I can't judge Ada, having no experience with it, though the examples of Java and .NET show how marginal the importance of unsigned types is. Unsigned bytes are important, and I personally tend to make just about everything unsigned when dealing with C-like languages because that makes me aware of the pitfalls and I avoid the signedness issue. The downside is that it takes extra work to get the evaluation order right and you have to take extra care to make sure loops terminate correctly by being very conscious about +-1 issues when terminating around zero. But I don't really think the C++ legacy is a good reason to keep implicit coercion no matter what programming style one has. Coercion is generally something I try to avoid, even explicitly, so why would I want the compiler to do it with no warning? Such optimizations have a bad reputation. If they were more conservative and didn't propagate back in code flow, the situation would probably be better. Also, isn't (x < x+1) a suspicious expression? Is it a good idea to mess with it? It is just an example; it could be the result of substituting aliased values. Anyway, I think it is important to not only define what happens if you add 1 to 0x, but also define whether that result is considered in correspondence with the type. If it isn't a correct value for the type, then the programmer must not assume that optimizations will heed the resulting incorrect value. The only acceptable alternative is to have the language specification explicitly define the type as modular and overflow-free. If not you end up with weak typing…?
I personally would take the monotonic optimizations and rather have a separate bit-fiddling type that provides a clean builtin swiss-army-knife toolset that gives close to direct access to the whole arsenal that the CPU instruction set provides (carry, ROL/ROR, bitcounts etc).
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 20:17:12 UTC, Walter Bright wrote: On 11/21/2014 7:36 AM, Don wrote: On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote: 0 crossing bugs tend to show up much sooner, and often immediately. You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type. Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1. But typeof (uint - uint ) == uint. The bit pattern is identical to the other case. But the type is wrong. It is for this reason that uint is not appropriate as a model for positive integers. Having warnings about mixing int and uint operations in relational operators is a bit misleading, because mixing signed and unsigned is not usually the real problem. Instead, those warnings are a symptom of a type system mistake. You are quite right in saying that with a signed length, overflows can still occur. But, those are in principle detectable. The compiler could add runtime overflow checks for them, for example. But the situation for unsigned is not fixable, because it is a problem with the type system. By making .length unsigned, we are telling people that if .length is used in a subtraction expression, the type will be wrong. It is the incorrect use of the type system that is the underlying problem. I believe I do understand the problem. As a practical matter, overflow checks are not going to be added for performance reasons. The performance overhead would be practically zero. All we would need to do, is restrict array slices such that the length cannot exceed ssize_t.max.
This can only happen in the case where the element type has a size of 1, and only in the case of slicing a pointer, concatenation, and memory allocation. Making this restriction would have been unreasonable in the 8 and 16 bit days, but D doesn't support those. For 32 bits, this is an extreme corner case. For 64 bit, this condition never happens at all. In exchange, 99% of uses of unsigned would disappear from D code, and with it, a whole category of bugs. Also, in principle, uint-uint can generate a runtime check for underflow (i.e. the carry flag). No it cannot. The compiler does not have enough information to know if the value is intended to be a positive integer, or an unsigned. That information is lost from the type system. E.g. from C, wrapping of an unsigned type is not an error. It is perfectly defined behaviour. With signed types, it's undefined behaviour. To make this clear: I am not proposing that size_t should be changed. I am proposing that .length return a signed type, which for array slices is guaranteed never to be negative.
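The typing point made in this message (typeof(uint - uint) == uint) can be checked directly. The following is a minimal sketch; the variable names are chosen only for the example.

```d
void main()
{
    // The result type of uint - uint is uint, not int.
    static assert(is(typeof(uint.init - uint.init) == uint));

    // Array lengths are size_t, so length arithmetic is unsigned as well.
    int[] empty;
    static assert(is(typeof(empty.length) == size_t));

    uint one = 1, two = 2;
    assert(one - two == uint.max); // wraps instead of giving -1
    assert(one - two > 0);         // the "negative" result compares as positive

    // A common pitfall: length - 1 on an empty array is size_t.max, not -1.
    assert(empty.length - 1 == size_t.max);

    // Widening to a signed type first gives the arithmetic result.
    assert(cast(long) one - cast(long) two == -1);
}
```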
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 17:23:51 UTC, Marco Leise wrote: Am Thu, 20 Nov 2014 08:18:23 + schrieb Don x...@nospam.com: It's particularly challenging in D because of the widespread use of 'auto': auto x = foo(); auto y = bar(); auto z = baz(); if (x - y > z) { ... } This might be a bug, if one of these functions returns an unsigned type. Good luck finding that. Note that if all functions return unsigned, there isn't even any signed-unsigned mismatch. With those function names I cannot write code. ℕ x = length(); ℕ y = index(); ℕ z = requiredRange(); if (x - y > z) { ... } Ah, now we're getting somewhere. Yes the code is obviously correct. You need to be aware of the value ranges of your variables and write subtractions in a way that the result can only be >= 0. If you realize that you cannot guarantee that for some case, you just found a logic bug. An invalid program state that you need to assert/if-else/throw. Yup. And that is not captured in the type system. I don't get why so many APIs return ints. Must be to support Java or something where proper unsigned types aren't available. D and C do not have suitable types either. unsigned != ℕ. In D, 1u - 2u > 0u. This is defined behaviour, not an overflow.
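A minimal sketch of the `auto` scenario described above. The functions `foo`, `bar` and `baz` are hypothetical stand-ins with made-up return values; the only assumption is that they return `size_t`, as a `.length`-based API would.

```d
/// Hypothetical stand-ins for the functions in the quoted example.
size_t foo() { return 3; }
size_t bar() { return 5; }
size_t baz() { return 1; }

void main()
{
    auto x = foo();
    auto y = bar();
    auto z = baz();

    // x - y wraps to a huge value, so the test passes even though
    // 3 - 5 is less than 1 in ordinary arithmetic.
    assert(x - y > z);

    // Making the precondition explicit exposes the invalid state instead.
    bool inRange = x >= y && x - y > z;
    assert(!inRange);
}
```

Nothing in the declarations at the call site reveals that the subtraction is unsigned, which is the difficulty the message points out.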
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 08:46:20 UTC, Walter Bright wrote: On 11/21/2014 12:10 AM, bearophile wrote: Walter Bright: All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. I'm not pretending anything. I am asking which of the two solutions leads to the fewest problems/bugs in practical programming. So far I've seen the unsigned solution and I've seen it's highly bug-prone. I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFF crossing bugs is making the bug count less. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html Changing signed to unsigned in that example does NOT fix the bug. It just means it fails with length = 2^^31 instead of length = 2^^30. uint a = 0x8000_u; uint b = 0x8000_0002u; assert( (a + b) /2 == 0); But actually I don't understand that article. The arrays are int, not char. Since length fits into 32 bits, the largest possible value is 2^^32-1. Therefore, for an int array, with 4 byte elements, the largest possible value is 2^^30-1. So I think the article is wrong. I don't think there is a bug in the code.
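The midpoint calculation under discussion can be sketched as follows. The helper names `naiveMid` and `safeMid` and the operand values are chosen for this illustration and are not taken from the linked article.

```d
/// Naive midpoint: the intermediate sum can wrap before the division.
uint naiveMid(uint lo, uint hi) { return (lo + hi) / 2; }

/// Midpoint without an intermediate sum; assumes lo <= hi.
uint safeMid(uint lo, uint hi) { return lo + (hi - lo) / 2; }

void main()
{
    // Unsigned: the sum of two values of 2^^31 wraps to 0.
    uint lo = 0x8000_0000u, hi = 0x8000_0000u;
    assert(naiveMid(lo, hi) == 0);
    assert(safeMid(lo, hi) == 0x8000_0000u);

    // Signed: the same calculation already fails at 2^^30, because the
    // sum 2^^31 does not fit in an int and wraps to a negative value.
    int a = 0x4000_0000, b = 0x4000_0000;
    assert((a + b) / 2 < 0);
    assert(a + (b - a) / 2 == 0x4000_0000);
}
```

Switching the operands to unsigned only moves the failing input from 2^^30 to 2^^31; rewriting the expression is what removes the wrap.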
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/24/14 2:20 AM, Don wrote: I am proposing that .length return a signed type, which for array slices is guaranteed never to be negative. Assuming you do make the case that this change is an improvement, do you believe it's worth the breakage it would create? -- Andrei
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/24/14 4:54 AM, Don wrote: In D, 1u - 2u > 0u. This is defined behaviour, not an overflow. I think I get what you mean, but overflow is also defined behavior (in D at least). -- Andrei
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Mon, 24 Nov 2014 12:54:58 + Don via Digitalmars-d digitalmars-d@puremagic.com wrote: In D, 1u - 2u > 0u. This is defined behaviour, not an overflow. p.s. sorry, of course this is not an overflow. this is underflow.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Mon, 24 Nov 2014 12:54:58 + Don via Digitalmars-d digitalmars-d@puremagic.com wrote: In D, 1u - 2u > 0u. This is defined behaviour, not an overflow. this *is* overflow. D just has overflow result defined.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via Digitalmars-d wrote: this *is* overflow. D just has overflow result defined. So it basically is and isn't modular arithmetic at the same time? I think Ada got this right by providing the ability to specify the modulo value, so you can define: type Weekday is mod 7; type Byte is mod 256; A solid solution is to provide the «As if Infinitely Ranged Integer Model», where the compiler figures out how large integers are needed for the computation and then does overflow detection when you truncate for storage: http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019
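For comparison, a type with an explicit modulus in the spirit of the Ada declarations above could be sketched in D as a library struct. This is a hypothetical illustration only: `Mod` is not an existing Phobos type, and it covers just addition and subtraction.

```d
/// Hypothetical modular integer type, loosely modelled on Ada's `mod N`.
struct Mod(uint N) if (N > 0)
{
    private uint value;

    this(uint v) { value = v % N; }

    /// Addition and subtraction wrap modulo N by definition,
    /// so there is no overflow condition for this type.
    Mod opBinary(string op)(Mod rhs) const if (op == "+" || op == "-")
    {
        static if (op == "+")
            return Mod(cast(uint)((cast(ulong) value + rhs.value) % N));
        else
            return Mod(cast(uint)((cast(ulong) value + N - rhs.value) % N));
    }

    uint get() const { return value; }
}

alias Weekday = Mod!7;
alias Byte = Mod!256;

void main()
{
    assert((Weekday(6) + Weekday(1)).get == 0);
    assert((Weekday(0) - Weekday(1)).get == 6);
    assert((Byte(255) + Byte(1)).get == 0);
}
```

Because the modulus is part of the type, wrapping here is the intended result rather than an error, which is the distinction the message draws.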
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 16:45:35 UTC, Ola Fosheim Grøstad wrote: On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via Digitalmars-d wrote: this *is* overflow. D just has overflow result defined. So it basically is and isn't modular arithmetic at the same time? Overflow is part of modular arithmetic. However, there is no signed and unsigned modular arithmetic, or, more precisely, they are the same. Computer words just aren't a good representation of integers. You can either use modular arithmetic, which follows the common arithmetic laws for addition and multiplication (commutativity, associativity, etc.; even most non-zero numbers have a multiplicative inverse), but breaks the common ordering laws (a >= 0 && b >= 0 implies a+b >= 0). Or you can use some other order-preserving arithmetic (e.g. saturating to min/max values), but that breaks the arithmetic laws. I think Ada got this right by providing the ability to specify the modulo value, so you can define: type Weekday is mod 7; type Byte is mod 256; A solid solution is to provide the «As if Infinitely Ranged Integer Model», where the compiler figures out how large integers are needed for the computation and then does overflow detection when you truncate for storage: http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019 You could just as well use a library like GMP.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 17:12:31 UTC, Matthias Bentrup wrote: Overflow is part of modular arithmetic. However, there is no signed and unsigned modular arithmetic, or, more precisely, they are the same. Would you say that a phase that goes from 0…2pi overflows? Do polar coordinates overflow once every turn? I'd say overflow/underflow means that the result is wrong. (Carry is not overflow per se.) Or you can use some other order-preserving arithmetic (e.g. saturating to min/max values), but that breaks the arithmetic laws. I don't think it breaks them, but I think a system language would be better off by having explicit operators for alternative edge-case handling on a bit-fiddling type. E.g.: a + b as regular addition, a (+) b as modulo arithmetic addition, a [+] b as clamped (saturating) addition. The bad behaviour of C-like languages is the implicit coercion to/from a bit-fiddling type. The bit-fiddling should be contained in expressions where the programmer, by choosing the type, says "I am gonna do tricky bit hacks here". Just casting to uint does not convey that message in a clear manner. A solid solution is to provide the «As if Infinitely Ranged Integer Model», where the compiler figures out how large integers are needed for the computation and then does overflow detection when you truncate for storage: http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019 You could just as well use a library like GMP. I think the point of having compiler support is to retain most optimizations. The compiler selects the most efficient representation based on the needed headroom and makes sure that overflow is recorded so that you can eventually respond to it. If you couple AIR with constrained integer types, which Pascal and Ada have, then it can be very efficient in many cases.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 17:55:06 UTC, Ola Fosheim Grøstad wrote: I think the point with having compiler support is to retain most optimizations. The compiler select the most efficient representation based on the needed headroom and makes sure that overflow is recorded so that you can eventually respond to it. It is also worth noting that Intel CPUs have 3 new instructions for working with large integers: MULX and ADCX/ADOX. http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/ia-large-integer-arithmetic-paper.html So there is no reason to not go for it IMO.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 17:55:06 UTC, Ola Fosheim Grøstad wrote: On Monday, 24 November 2014 at 17:12:31 UTC, Matthias Bentrup wrote: Overflow is part of modular arithmetic. However, there is no signed and unsigned modular arithmetic, or, more precisely, they are the same. Would you say that a phase that goes from 0…2pi overflows? Do polar coordinates overflow once every turn? No, sin and cos are periodic functions, but that doesn't mean their arguments are modular. sin 4pi is well defined by e.g. the Taylor expansion of sin without any modular arithmetic at all. I'd say overflow/underflow means that the result is wrong. (Carry is not overflow per se.) There is no right or wrong in Mathematics, only true and false. The result of modular addition with overflow is not wrong, it is just different from the result of integer addition. Or you can use some other order-preserving arithmetic (e.g. saturating to min/max values), but that breaks the arithmetic laws. I don't think it breaks them, but I think a system language would be better off by having explicit operators for alternative edge-case handling on a bit-fiddling type. E.g.: a + b as regular addition, a (+) b as modulo arithmetic addition, a [+] b as clamped (saturating) addition. The bad behaviour of C-like languages is the implicit coercion to/from a bit-fiddling type. The bit-fiddling should be contained in expressions where the programmer, by choosing the type, says "I am gonna do tricky bit hacks here". Just casting to uint does not convey that message in a clear manner. Agreed, though I don't like the explosion of new operators. I'd prefer the C# syntax like check(expression), wrap(expression), saturate(expression).
A solid solution is to provide the «As if Infinitely Ranged Integer Model», where the compiler figures out how large integers are needed for the computation and then does overflow detection when you truncate for storage: http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019 You could just as well use a library like GMP. I think the point of having compiler support is to retain most optimizations. The compiler selects the most efficient representation based on the needed headroom and makes sure that overflow is recorded so that you can eventually respond to it. If you couple AIR with constrained integer types, which Pascal and Ada have, then it can be very efficient in many cases. And can fail spectacularly in others. The compiler always has to prepare for the worst case, i.e. the largest integer size possible, while in practice you may need that only for a few extreme cases.
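The three behaviours named in this message (check, wrap, saturate) can be approximated today with plain functions. The sketch below covers only `int` addition; the names `wrapAdd`, `saturateAdd` and `checkAdd` are hypothetical, and `core.checkedint.adds` is used only to detect the overflow.

```d
import core.checkedint : adds;
import std.exception : assertThrown;

/// Wrapping addition: D's built-in + on int already wraps.
int wrapAdd(int a, int b) { return a + b; }

/// Saturating addition: clamps to int.max or int.min on overflow.
int saturateAdd(int a, int b)
{
    bool overflow;
    int r = adds(a, b, overflow);
    if (!overflow)
        return r;
    // Overflow in a signed add only happens when both operands have the
    // same sign, so the sign of b gives the direction.
    return b > 0 ? int.max : int.min;
}

/// Checked addition: throws instead of returning a wrapped value.
int checkAdd(int a, int b)
{
    bool overflow;
    int r = adds(a, b, overflow);
    if (overflow)
        throw new Exception("integer overflow in checkAdd");
    return r;
}

void main()
{
    assert(wrapAdd(int.max, 1) == int.min);
    assert(saturateAdd(int.max, 1) == int.max);
    assert(saturateAdd(int.min, -1) == int.min);
    assert(checkAdd(1, 2) == 3);
    assertThrown(checkAdd(int.max, 1));
}
```

Functions like these make the chosen edge-case behaviour visible at each call site, at the cost of not composing as naturally as an expression-level construct would.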
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 19:06:35 UTC, Matthias Bentrup wrote: There is no right or wrong in Mathematics, only true and false. The result of modular addition with overflow is not wrong, it is just different from the result of integer addition. I think we are talking past each other. In my view the term overflow has nothing to do with mathematics; overflow is a signal from the ALU that the computation is incorrect, i.e. not in accordance with the intended type. Agreed, though I don't like the explosion of new operators. I'd prefer the C# syntax like check(expression), wrap(expression), saturate(expression). Yep, that is another way to do it. What is preferable probably varies from case to case. And can fail spectacularly in others. The compiler always has to prepare for the worst case, i.e. the largest integer size possible, while in practice you may need that only for a few extreme cases. In some loops it can probably get tricky to get it right without help from the programmer. I believe some languages allow you to annotate loops with an upper boundary to help the semantic analysis, but you could also add more frequent overflow checks on request?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/24/2014 2:20 AM, Don wrote: I believe I do understand the problem. As a practical matter, overflow checks are not going to be added for performance reasons. The performance overhead would be practically zero. All we would need to do, is restrict array slices such that the length cannot exceed ssize_t.max. This can only happen in the case where the element type has a size of 1, and only in the case of slicing a pointer, concatenation, and memory allocation. (length1 + length2) / 2 In exchange, 99% of uses of unsigned would disappear from D code, and with it, a whole category of bugs. You're not proposing changing size_t, so I believe this statement is incorrect. Also, in principle, uint-uint can generate a runtime check for underflow (i.e. the carry flag). No it cannot. The compiler does not have enough information to know if the value is intended to be positive integer, or an unsigned. That information is lost from the type system. Eg from C, wrapping of an unsigned type is not an error. It is perfectly defined behaviour. With signed types, it's undefined behaviour. I know it's not an error. It can be defined to be an error, and the compiler can insert a runtime check. (I'm not proposing this, just saying it can be done.) To make this clear: I am not proposing that size_t should be changed. I am proposing that for .length returns a signed type, that for array slices is guaranteed to never be negative. There'll be mass confusion if .length is not the same type as .sizeof
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 19:06:35 UTC, Matthias Bentrup wrote: Agreed, though I don't like the explosion of new operators. I'd prefer the C# syntax like check(expression), wrap(expression), saturate(expression). You maybe like this:
---small test 1--
import std.stdio;

template subuint(T1, T2)
{
    auto subuint(T1 x, T2 y, ref bool overflow)
    {
        if (is(T1 == uint) && is(T2 == uint))
        {
            if (x < y) { return cast(int)(x - y); }
            else { return x - y; }
        }
        else if (is(T1 == uint) && is(T2 == int))
        {
            writeln("enter here1");
            if (x < y) { writeln("enter here2"); return cast(int)(x - y); }
            else { writeln("enter here3"); return x - y; }
        }
        else if (is(T1 == int) && is(T2 == uint))
        {
            if (x < y) { return cast(int)(x - y); }
            else { return x - y; }
        }
        else if (is(T1 == int) && is(T2 == int))
        {
            return x - y;
        }
    }
}

unittest
{
    bool overflow;
    assert(subuint(3, 2, overflow) == 1); assert(!overflow);
    assert(subuint(3, 4, overflow) == -1); assert(!overflow);
    assert(subuint(uint.max, 1, overflow) == uint.max - 1);
    writeln("typeid = ", typeid(subuint(uint.max, 1, overflow)));
    assert(!overflow);
    assert(subuint(1, 1, overflow) == uint.min); assert(!overflow);
    assert(subuint(0, 1, overflow) == -1); assert(!overflow);
    assert(subuint(uint.max - 1, uint.max, overflow) == -1); assert(!overflow);
    assert(subuint(0, 0, overflow) == 0); assert(!overflow);
    assert(subuint(3, -2, overflow) == 5); assert(!overflow);
    assert(subuint(uint.max, -1, overflow) == uint.max + 1); assert(!overflow);
    assert(subuint(1, -1, overflow) == 2); assert(!overflow);
    assert(subuint(0, -1, overflow) == 1); assert(!overflow);
    assert(subuint(uint.max - 1, int.max, overflow) == int.max); assert(!overflow);
    assert(subuint(0, 0, overflow) == 0); assert(!overflow);
    assert(subuint(-2, 1, overflow) == -3); assert(!overflow);
}

void main()
{
    uint a = 3;
    int b = 4;
    int c = 2;
    writeln("c -a =", c - a);
    writeln("a -b =", a - b);
    writeln();
    bool overflow;
    writeln("typeid = ", typeid(subuint(a, b, overflow)), ", a-b=", subuint(a, b, overflow));
    writeln("ok");
}
---here is a simple ,but it's error--
import std.stdio;

template subuint(T1, T2)
{
    auto subuint(T1 x, T2 y, ref bool overflow)
    {
        if (is(T1 == int) && is(T2 == int))
        {
            return x - y;
        }
        else if ((is(T1 == uint) && is(T2 == int)) | (is(T1 == uint) && is(T2 == uint)) | (is(T1 == int) && is(T2 == uint)))
        {
            if (x < y) { return cast(int)(x - y); }
            else { return x - y; }
        }
    }
}

void main()
{
    uint a = 3;
    int b = 4;
    int c = 2;
    writeln("c -a =", c - a);
    writeln("a -b =", a - b);
    writeln();
    bool overflow;
    writeln("typeid = ", typeid(subuint(a, b, overflow)), ", a-b=", subuint(a, b, overflow));
    writeln("ok");
}
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via Digitalmars-d wrote: On Mon, 24 Nov 2014 12:54:58 + Don via Digitalmars-d digitalmars-d@puremagic.com wrote: In D, 1u - 2u > 0u. This is defined behaviour, not an overflow. this *is* overflow. D just has overflow result defined. No, that is not overflow. That is a carry. Overflow is when the sign bit changes.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 16:12:19 UTC, Don wrote: It is not uint.max. It is natint.max. And yes, that's an overflow condition. Exactly the same as when you do int.max + int.max. This depends on how you look at it. From a formal perspective assume zero as the base, then a predecessor function P and a successor function S. Then you have 0u - 1u + 2u == SSP0 Then you do a normalization where you cancel out successor and predecessor pairs and you get the result S0 == 1u. On the other hand if you end up with P0 the result should be bottom (error). In binary representation you need to collect the carry over N terms, so you need an extra accumulator which you can get by extending the precision by ~ log2(N) bits. Then do a masking of the most significant bits to check for over/underflow. Advanced for a compiler, but possible. The type that I think would be useful, would be a number in the range 0..int.max. It has no risk of underflow. Yep, from a correctness perspective length should be an integer with a >= 0 constraint. Ada also acknowledges this by having unsigned integers being 31 bits like you suggest. And now that most CPUs go 64 bit then a 63 bit integer would be the right choice for array length. unsigned types are not a subset of mathematical integers. They do not just have a restricted range. They have different semantics. The question of what happens when a range is exceeded, is a different question. There is really no difference between signed and unsigned in principle since you only have an offset, but in practical programming 64 bits signed and 63 bits unsigned is enough for most situations with the advantage that you have the same bit representation with only one interpretation. What the semantics are depends on how you define the operators, right? So you can have both modular arithmetic and non-modular in the same type by providing more operators. This is after all how the hardware does it.
Contrary to what is claimed by others in this thread the general hardware ALU does not default to modular arithmetic, it preserves resolution: 32bit + 32bit == 33bit result 32bit * 32bit == 64bit result Modular arithmetic is an artifact of the language, not the hardware.
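The resolution-preserving view can be illustrated in D by widening before the operation, and the carry bit can be observed with `core.checkedint.addu`. A sketch only; the function names below are made up for illustration, and whether a given CPU exposes the wide result directly is not something this code demonstrates.

```d
import core.checkedint : addu;

/// 32 bit + 32 bit, kept at full resolution (needs 33 bits, stored in 64).
ulong wideAdd(uint a, uint b) { return cast(ulong) a + b; }

/// 32 bit * 32 bit, kept at full resolution (needs up to 64 bits).
ulong wideMul(uint a, uint b) { return cast(ulong) a * b; }

/// What the language-level uint addition gives: the low 32 bits only.
uint wrappedAdd(uint a, uint b) { return a + b; }

/// Reports the bit that the wrapped addition throws away.
bool addCarries(uint a, uint b)
{
    bool carry;
    const sum = addu(a, b, carry); // sum holds the wrapped value, carry the lost bit
    return carry;
}
```

For `uint.max + 1` the wide sum is `0x1_0000_0000`, the wrapped sum is `0`, and the carry flag is set.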
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 21:44:25 UTC, Marco Leise wrote: Am Wed, 19 Nov 2014 18:20:24 + schrieb Marc Schütz schue...@gmx.net: I'd say length being unsigned is fine. The real mistake is that the difference between two unsigned values isn't signed, which would be the most correct behaviour. Now take my position where I explicitly write code relying on the fact that `bigger - smaller` yields correct results. uint bigger = uint.max; uint smaller = 2; if (bigger > smaller) { auto added = bigger - smaller; // Now 'added' is an int with the value -3 ! } else { auto removed = smaller - bigger; } In fact checking which value is larger is the only way to handle the full result range of subtracting two machine integers which is ~2 times larger than what the original type can handle: T.min - T.max .. T.max - T.min This is one reason why I'd like to just keep working with the original unsigned type, but split the range around the positive/negative pivot with an if-else. Implicit conversion of unsigned subtractions to signed values would make the above code unnecessarily hard. Yes, that's true. However, I doubt that this is a common case. I'd say that when two values are to be subtracted (signed or unsigned), and there's no knowledge about which one is larger, it's more useful to get a signed difference. This should be correct in most cases, because I believe it is more likely that the two values are close to each other. It only becomes a problem when they're on opposite sides of the value range. Unfortunately, no matter how you turn it, there will always be corner cases that a) will be wrong and b) the compiler will allow silently. So the question becomes one of preferences between usefulness for common use cases, ease of detection of errors, and compatibility.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Sat, 22 Nov 2014 03:09:59 + deadalnix via Digitalmars-d digitalmars-d@puremagic.com wrote: On Friday, 21 November 2014 at 09:47:32 UTC, Stefan Koch wrote: On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote: I thought everyone hated foreach_reverse! I dislike foreach_reverse; 1. it's a keyword with an underscore in it; 2. complicates implementation of foreach and parsing. 3. key_word with under_score These are compiler implementation issue and all solvable. People don't give a shit about how the compiler work and rightly so. The language is made to fit need of the user, not the needs of the implementer. `foreach (auto n; ...)` anyone? and `foreach (; ...)`? nope. cosmetic changes aren't needed. this is clearly made for implementer. luckily, it's not me who will try explain to newcomers why they has new variable declaration in `foreach` which looks like variable reusing, why they must invent new variable name for each nested `foreach` and so on. but please, don't tell me about solvable -- all this solvable only in the sense make your own fork and fix it. ah, and support your fork. and don't forget that your code cannot be used with vanilla compiler anymore. ok for me, but for others?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Saturday, 22 November 2014 at 11:12:06 UTC, Marc Schütz wrote: I'd say that when two values are to be subtracted (signed or unsigned), and there's no knowledge about which one is larger, it's more useful to get a signed difference. This should be correct in most cases, because I believe it is more likely that the two values are close to each other. It only becomes a problem when they're an opposite sides of the value range. Not being able to decrement unsigned types would be a disaster. Think about unsigned integers as an enumeration. You should be able to both take the predecessor and successor of the value. This is also in line with how you formalize natural numbers in math: 0 == zero 1 == successor(zero) 2 == successor(successor(zero)) This is basically a unary representation of natural numbers and it allows both addition and subtraction. Unsigned int should be considered a binary representation of the same capped at max value. Bearophile has given a sensible solution a long time ago, make type coercion explicit and add a weaker coercion operator. That operator should prevent senseless type coercion, but allow system-level-coercion over signedness. Problem fixed.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 20/11/2014 08:02, Walter Bright wrote: On 11/19/2014 5:03 PM, H. S. Teoh via Digitalmars-d wrote: If this kind of unsafe mixing wasn't allowed, or required explict casts (to signify yes I know what I'm doing and I'm prepared to face the consequences), I suspect that bearophile would be much happier about this issue. ;-) Explicit casts are worse than the problem - they can easily cause bugs. I recently explained to you that explicit casts are easily avoided using `import std.conv: signed, unsigned;`. D compilers badly need a way to detect bug-prone sign mixing. It is no exaggeration to say D is worse than C compilers in this regard. Usually we discuss how to compete with modern languages; here we are not even keeping up with C. It's disappointing this issue was pre-approved last year, but now neither you nor even Andrei seem particularly cognizant of the need to resolve it. If you belittle the problem, you discourage others from trying to solve it.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Am Fri, 21 Nov 2014 17:50:11 -0800 schrieb Andrei Alexandrescu seewebsiteforem...@erdani.org: I agree, though foreach (i; length.iota.retro) is no slouch either! -- Andrei Yes, no, well, it feels like too much science for a loop with a decrementing index instead of an incrementing, no matter how few parenthesis are used. It is not the place where I would want to introduce functional programming to someone who never saw D code before. That said, I'd also be uncertain if compilers transparently convert this to the equivalent of a reverse loop. -- Marco
Re: 'int' is enough for 'length' to migrate code from x86 to x64
FrankLike wrote in message news:musbvhmuuhhvetovx...@forum.dlang.org... If you compile the dfl Library to 64 bit,you will find error: core.sys.windows.windows.WaitForMultipleObjects(uint nCount,void** lpHandles,) is not callable using argument types(ulong,void**,...) the 'WaitForMultipleObjects' Function is in dmd2/src/druntime/src/core/sys/windows/windows.d the argument of first is dfl's value ,it comes from a 'length' ,it's type is size_t,now it is 'ulong' on 64 bit. So druntime must keep the same as phobos for size_t. Or keep the same to int with WindowsAPI to modify the size_t to int ? I suggest using WaitForMultipleObjects(to!uint(xxx.length), ...) as it will both convert and check for overflow IIRC. I'm just happy D gives you an error here instead of silently truncating the value.
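A small sketch of the suggested conversion: `std.conv.to!uint` narrows the value and throws `ConvOverflowException` when it does not fit. The parameter is typed `ulong` here so the behaviour is the same on 32-bit and 64-bit builds; `checkedCount` and `countFits` are hypothetical names, and the actual Windows API call is left out.

```d
import std.conv : to, ConvOverflowException;

/// Hypothetical helper: narrow a 64-bit length to the uint an API expects.
uint checkedCount(ulong length)
{
    return to!uint(length); // throws ConvOverflowException if length > uint.max
}

/// Hypothetical helper: true when the length survives the narrowing.
bool countFits(ulong length)
{
    try
    {
        checkedCount(length);
        return true;
    }
    catch (ConvOverflowException)
    {
        return false;
    }
}
```

Compared with `cast(uint) length`, the failure is reported instead of the value being silently truncated.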
Re: 'int' is enough for 'length' to migrate code from x86 to x64
H. S. Teoh via Digitalmars-d wrote in message news:mailman.2156.1416499421.9932.digitalmar...@puremagic.com... By that logic, using an int to represent an integer is also using the incorrect type, because a signed type is *also* subject to modulo 2^^n arithmetic -- just a different form of it where the most negative value wraps around to the most positive values. Fixed-width integers in computing are NOT the same thing as unrestricted integers in mathematics. No matter how you try to rationalize it, as long as you use hardware fix-width integers, you're dealing with modulo arithmetic in one form or another. Pretending you're not, is the real source of said subtle bugs. While what you've said is true, the typical range of values stored in an integral type is much more likely to cause unsigned wrapping than signed overflow. So to get the desired 'integer-like' behaviour from D's integral types, you need to care about magnitude for signed types, or both magnitude and ordering for unsigned types. eg 'a < b' becoming 'a - b < 0' is valid for integers, and small ints, but not valid for small uints unless a >= b. You will always have to care about the imperfect representation of mathematical integers, but with unsigned types you have an extra rule that is much more likely to affect typical code.
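The rewrite of `a < b` into `a - b < 0` can be checked directly. In this sketch (function names are made up for illustration) the signed version agrees with the direct comparison for small values, while the unsigned version can never be true because a `uint` difference is never negative.

```d
/// The comparison as written.
bool lessDirect(uint a, uint b) { return a < b; }

/// The rewritten comparison on small signed values: still correct.
bool lessViaSignedDifference(int a, int b) { return a - b < 0; }

/// The rewritten comparison on unsigned values: a - b wraps instead of
/// going negative, so this returns false for every input.
bool lessViaUnsignedDifference(uint a, uint b) { return a - b < 0; }
```

With `a = 2` and `b = 3`, the first two functions return `true` and the third returns `false`.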
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Walter Bright: All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. I'm not pretending anything. I am asking in practical programming which of the two solutions leads to the fewest problems/bugs. So far I've seen the unsigned solution and I've seen it's highly bug-prone. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? Bye, bearophile
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Walter Bright wrote in message news:m4mggi$e1h$1...@digitalmars.com... BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. 0 crossing bugs tend to show up much sooner, and often immediately. I don't think I have ever written a D program where an array had more than 2^^31 elements. And I'm sure I've never had it where 2^^31-1 wasn't enough and yet 2^^32-1 was. Zero, on the other hand, is usually quite near the typical array lengths and differences in lengths.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Andrei Alexandrescu wrote in message news:m4l711$1t39$1...@digitalmars.com... The most difficult pattern that comes to mind is the long arrow operator seen in backward iteration:
void fun(int[] a)
{
    for (auto i = a.length; i-- > 0; )
    {
        // use i
    }
}
Over the years most of my unsigned-related bugs have been from screwing up various loop conditions. Thankfully D solves this perfectly with:
void fun(int[] a)
{
    foreach_reverse (i; 0 .. a.length)
    {
    }
}
So I never have to write those again.
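Both forms of backward iteration visit the same indices, which a short sketch can confirm. The two collecting functions below are hypothetical helpers written only for this comparison; they gather the indices instead of using them so the order can be inspected.

```d
/// Backward iteration with the "long arrow": test i > 0, then decrement.
size_t[] indicesViaLongArrow(const int[] a)
{
    size_t[] seen;
    for (auto i = a.length; i-- > 0; )
        seen ~= i; // i is already decremented here, so it runs length-1 .. 0
    return seen;
}

/// The same iteration with foreach_reverse over an index range.
size_t[] indicesViaForeachReverse(const int[] a)
{
    size_t[] seen;
    foreach_reverse (i; 0 .. a.length)
        seen ~= i;
    return seen;
}
```

For an empty array neither loop body runs, which is the case the hand-written unsigned condition is easiest to get wrong.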
Re: 'int' is enough for 'length' to migrate code from x86 to x64
bearophile wrote in message news:lkcltlokangpzzdzz...@forum.dlang.org... From my experience in coding in D they are far more unlikely than sign-related bugs of array lengths. Here's a simple program to calculate the relative size of two files, that will not work correctly with unsigned lengths.
module sizediff;
import std.file;
import std.stdio;

void main(string[] args)
{
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
}
The two ways this can fail (that I want to highlight) are: 1. If either file is too large to fit in a size_t the result will (probably) be wrong 2. If file2 is bigger than file1 the result will be wrong If length was signed, problem 2 would not exist, and problem 1 would be more likely to occur. I think it's clear that signed lengths would work for more possible realistic inputs. While this is just an example, a similar pattern occurs in real code whenever array/range lengths are subtracted.
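Problem 2 can be reproduced without touching the file system. The sketch below uses hypothetical helper names and assumes the true difference fits in a `long`; it contrasts the plain `size_t` subtraction with one that compares first and then produces a signed result.

```d
/// What `l1 - l2` does on two lengths: wraps when l2 is larger.
size_t naiveDifference(size_t l1, size_t l2)
{
    return l1 - l2;
}

/// Compare first, subtract the smaller from the larger, then apply the sign.
/// Assumes the magnitude of the difference fits in a long.
long signedDifference(size_t l1, size_t l2)
{
    return l1 >= l2 ? cast(long)(l1 - l2) : -cast(long)(l2 - l1);
}
```

For lengths 3 and 5 the naive version yields `size_t.max - 1`, while the signed version yields `-2`.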
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/2014 12:10 AM, Daniel Murphy wrote: Walter Bright wrote in message news:m4mggi$e1h$1...@digitalmars.com... BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. 0 crossing bugs tend to show up much sooner, and often immediately. I don't think I have ever written a D program where an array had more than 2^^31 elements. And I'm sure I've never had it where 2^^31-1 wasn't enough and yet 2^^32-1 was. There turned out to be such a bug in one of the examples in Programming Pearls that remained undetected for many years: http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html Zero, on the other hand, is usually quite near the typical array lengths and differences in lengths. That's true, that's why they are detected sooner, when it is less costly to fix them.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/2014 12:31 AM, Daniel Murphy wrote: Here's a simple program to calculate the relative size of two files, that will not work correctly with unsigned lengths. module sizediff; import std.file; import std.stdio; void main(string[] args) { assert(args.length == 3, "Usage: sizediff file1 file2"); auto l1 = args[1].read().length; auto l2 = args[2].read().length; writeln("Difference: ", l1 - l2); } The two ways this can fail (that I want to highlight) are: 1. If either file is too large to fit in a size_t the result will (probably) be wrong Presumably read() will throw if the size is larger than it can handle. If it doesn't, this code is not buggy, but read() is.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/2014 12:10 AM, bearophile wrote: Walter Bright: All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. I'm not pretending anything. I am asking in practical programming which of the two solutions leads to the fewest problems/bugs. So far I've seen the unsigned solution and I've seen it's highly bug-prone. I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFF crossing bugs is making the bug count less. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Walter Bright wrote in message news:m4mu0q$sc5$1...@digitalmars.com... Zero, on the other hand, is usually quite near the typical array lengths and differences in lengths. That's true, that's why they are detected sooner, when it is less costly to fix them. It would be even less costly if they weren't possible.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Daniel Murphy: void fun(int[] a) { foreach_reverse (i; 0 .. a.length) { } } Better (it's a workaround for a D design flaw that we're unwilling to fix): foreach_reverse (immutable i; 0 .. a.length) Bye, bearophile
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Walter Bright wrote in message news:m4mua1$shh$1...@digitalmars.com... Presumably read() will throw if the size is larger than it can handle. If it doesn't, this code is not buggy, but read() is. You're right, but that's really not the point.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 08:54:40 UTC, Daniel Murphy wrote: Walter Bright wrote in message news:m4mu0q$sc5$1...@digitalmars.com... Zero, on the other hand, is usually quite near the typical array lengths and differences in lengths. That's true, that's why they are detected sooner, when it is less costly to fix them. It would be even less costly if they weren't possible. C# has the checked and unchecked operators (http://msdn.microsoft.com/en-us/library/khy08726.aspx), which allow the programmer to specify if overflows should wrap or fail within an arithmetic expression. That could be a useful addition to D. However, a language that doesn't have unsigned integers and modular arithmetic is IMHO not a system language, because that is how most hardware works internally.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Here's a simple program to calculate the relative size of two files, that will not work correctly with unsigned lengths. module sizediff; import std.file; import std.stdio; void main(string[] args) { assert(args.length == 3, "Usage: sizediff file1 file2"); auto l1 = args[1].read().length; auto l2 = args[2].read().length; writeln("Difference: ", l1 - l2); } This will be ok: writeln("Difference: ", (l1 > l2) ? (l1 - l2) : (l2 - l1)); If the type of 'length' is not 'size_t' but 'int' or 'long', it will be ok like this: import std.math; writeln("Difference: ", abs(l1 - l2)); For the mathematical difference between unsigned values, the size comparison should be done first, on the right side of the equals sign. If this work is done in druntime, D will be a real system language.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote: BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. 0 crossing bugs tend to show up much sooner, and often immediately. Wrong. Unsigned integers can hold bigger values, so it takes more to make them overflow, hence the bug is harder to detect. http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html Specifically, it fails if the sum of low and high is greater than the maximum positive int value So it fails sooner for signed integers than for unsigned integers.
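The bug in the linked article is the midpoint computation `(low + high) / 2`. A minimal sketch of it in D follows (function names are made up for illustration; D defines signed overflow as wrapping, so the result is simply a wrong value rather than undefined behaviour).

```d
/// The midpoint as in the article: the sum can exceed int.max and wrap.
int midNaive(int low, int high)
{
    return (low + high) / 2;
}

/// The usual fix: never form the sum, only the (non-negative) distance.
int midSafe(int low, int high)
{
    return low + (high - low) / 2;
}

/// The same naive formula on uint: the sum has one more bit of headroom,
/// so it still works for every pair of values that are valid ints.
uint midNaiveUnsigned(uint low, uint high)
{
    return (low + high) / 2;
}
```

With `low = int.max - 1` and `high = int.max`, `midNaive` wraps to a negative number, while `midSafe` and `midNaiveUnsigned` both give `int.max - 1`. The unsigned version only starts failing once the sum exceeds `uint.max`.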
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thursday, 20 November 2014 at 21:27:11 UTC, Walter Bright wrote: If that is changed to a signed type, then you'll have a same-only-different set of subtle bugs If people use signed length with unsigned integers, the length will implicitly convert to unsigned and behave like now, no difference. plus you'll break the intuition about these things from everyone who has used C/C++ a lot. C/C++ programmers disagree: http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/ Why do you think they can't handle signed integers?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
For the mathematical difference between unsigned values, the size comparison should be done first, on the right side of the equals sign, such as: l3 = (l1 > l2) ? (l1 - l2) : (l2 - l1); If this work is done in druntime, small bugs will be rare. D will be a real system language. Frank
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thursday, 20 November 2014 at 16:03:41 UTC, H. S. Teoh via Digitalmars-d wrote: By that logic, using an int to represent an integer is also using the incorrect type, because a signed type is *also* subject to modulo 2^^n arithmetic -- just a different form of it where the most negative value wraps around to the most positive values. The type is chosen at design time so that it's unlikely to overflow for the particular scenario. Why would you want the count of objects to reset at some point when counting objects? Wrapping of unsigned integers has valid usage for e.g. hash functions, but there they are used as bit arrays, not proper numbers, and arithmetic operators are used for bit shuffling, not for computing some numbers.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/2014 12:16 AM, Daniel Murphy wrote: Over the years most of my unsigned-related bugs have been from screwing up various loop conditions. Thankfully D solves this perfectly with: void fun(int[] a) { foreach_reverse (i; 0 .. a.length) { } } So I never have to write those again. I thought everyone hated foreach_reverse! But, yeah, foreach and ranges+algorithms have virtually eliminated a large category of looping bugs.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/2014 1:01 AM, Matthias Bentrup wrote: C# has the checked and unchecked operators (http://msdn.microsoft.com/en-us/library/khy08726.aspx), which allow the programmer to specify if overflows should wrap or fail within an arithmetic expression. That could be a useful addition to D. D already has them: https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d
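For readers who have not used the module, here is a minimal sketch of `core.checkedint`. The functions `subu` and `adds` are from druntime; the two wrapper names are hypothetical. Each druntime function returns the wrapped result and sets the `ref bool` flag on overflow (the flag is sticky, so it is never reset to false by a later call).

```d
import core.checkedint : adds, subu;

/// True when x - y would go below zero for unsigned operands.
bool subtractionUnderflows(uint x, uint y)
{
    bool overflow;
    const diff = subu(x, y, overflow); // diff is the wrapped value
    return overflow;
}

/// True when x + y leaves the range of int.
bool additionOverflows(int x, int y)
{
    bool overflow;
    const sum = adds(x, y, overflow);
    return overflow;
}
```

Unlike the C# `checked` keyword this is a library call per operation, so it reports overflow through the flag rather than by throwing.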
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thursday, 20 November 2014 at 16:34:12 UTC, flamencofantasy wrote: My experience is totally the opposite of his. I have been using unsigned for lengths, widths, heights for the past 15 years in C, C++, C# and more recently in D with great success. I don't pretend to be any kind of authority though. C# doesn't encourage usage of unsigned types and warns that they are not CLS-compliant. You're going against established practices there. And signed types for numbers works wonders in C# without any notable problem and makes reasoning about code easier as you don't have to manually check for unsigned conversion bugs everywhere. The article you point to is totally flawed and kinda wasteful in terms of having to read it; the very first code snippet is obviously buggy. That's the whole point: mixing signed with unsigned is bug-prone. Worse, it's inevitable if you force unsigned types everywhere.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote: I thought everyone hated foreach_reverse! I dislike foreach_reverse; 1. it's a keyword with an underscore in it; 2. complicates implementation of foreach and parsing. 3. key_word with under_score
Re: 'int' is enough for 'length' to migrate code from x86 to x64
bearophile wrote in message news:rqyuiioyrrjgggctf...@forum.dlang.org... Better (it's a workaround for a D design flaw that we're unwilling to fix): foreach_reverse (immutable i; 0 .. a.length) I know you feel that way, but I'd rather face the non-existent risk of accidentally mutating the induction variable than write immutable every time.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote: I thought everyone hated foreach_reverse! Not me. It's ugly but it gets the job done. All I have to do is add '_reverse' and it just works! Stefan Koch wrote in message news:mmvuvkdfnvwezyvtc...@forum.dlang.org... I dislike foreach_reverse; 1. it's a keyword with an underscore in it; So what. 2. complicates implementation of foreach and parsing. The additional complexity is trivial. 3. key_word with under_score Don't care.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Frank Like wrote in message news:zhejapfebcvxnzrez...@forum.dlang.org... If this work is done in druntime,D will be a real system language. Sure, this is obviously the fundamental thing holding D back from being a _real_ system language.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Walter Bright: I thought everyone hated foreach_reverse! I love it! Bye, bearophile
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Daniel Murphy: foreach_reverse (immutable i; 0 .. a.length) I know you feel that way, but I'd rather face the non-existent risk of accidentally mutating the induction variable than write immutable every time. It's not non-existent :-) (And the right default for a modern language is to have immutable on default and mutable on request. If D doesn't have this quality, better to add immutable every damn time). Bye, bearophile
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 09:43:04 UTC, Kagamin wrote: On Thursday, 20 November 2014 at 16:34:12 UTC, flamencofantasy C# doesn't encourage usage of unsigned types and warns that they are not CLS-compliant. You're going against established practices there. And signed types for numbers works wonders in C# without any notable problem and makes reasoning about code easier as you don't have to manually check for unsigned conversion bugs everywhere. That's the whole point: mixing signed with unsigned is bug-prone. Worse, it's inevitable if you force unsigned types everywhere. Right. Druntime should have a checksize_t.d Frank
Re: 'int' is enough for 'length' to migrate code from x86 to x64
Druntime's checkedint.d should be modified: uint subu(uint x, uint y, ref bool overflow) { if (x < y) return y - x; else return x - y; } uint subu(ulong x, ulong y, ref bool overflow) { if (x < y) return y - x; else return x - y; } Frank
Re: 'int' is enough for 'length' to migrate code from x86 to x64
D already has them: https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d Druntime's checkedint.d should be modified: uint subu(uint x, uint y, ref bool overflow) { if (x < y) return y - x; else return x - y; } ulong subu(ulong x, ulong y, ref bool overflow) { if (x < y) return y - x; else return x - y; } Frank
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 09:43:04 UTC, Kagamin wrote: C# doesn't encourage usage of unsigned types and warns that they are not CLS-compliant. You're going against established practices there. And signed types for numbers works wonders in C# without any notable problem and makes reasoning about code easier as you don't have to manually check for unsigned conversion bugs everywhere. I don't want to be CLS compliant! I make very heavy use of unsafe code, stackalloc and interop to worry about CLS compliance. Actually one of the major reasons I am looking at D for production code is so that I don't have to mix and match Assembly, C/C++ with C#. I want the best of all worlds in one language/runtime :). Anyways, I believe the discussion is about using unsigned for array lengths, not unsigned in general. At this point most people seem to express an opinion - including me, and I certainly hope D stays as it is when it comes to length of an array. I am not convinced in the slightest that signed is the way to go.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thu, 20 Nov 2014 15:40:39 + Araq via Digitalmars-d digitalmars-d@puremagic.com wrote: Here are some more opinions: http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/ trying to illustrate something with obviously wrong code is very funny. the whole article then reduces to "hey, i'm writing bad code, and i can teach you to do the same!" won't buy it.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 08:10:55 + bearophile via Digitalmars-d digitalmars-d@puremagic.com wrote: BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? any code which does something like `if (a-b < 0)` is broken. it will work in most cases, but it is broken. you MUST check the values before subtracting. and if you must do the checks anyway, what is the reason for making length signed?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 09:23:01 + Kagamin via Digitalmars-d digitalmars-d@puremagic.com wrote: C/C++ programmers disagree: http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/ Why do you think they can't handle signed integers? being C programmer i disagree that author of the article is C programmer.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thu, 20 Nov 2014 13:28:37 -0800 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote: What *could* be improved, is the prevention of obvious mistakes in *mixing* signed and unsigned types. Right now, D allows code like the following with no warning: uint x; int y; auto z = x - y; BTW, this one is the same in essence as an actual bug that I fixed in druntime earlier this year, so downplaying it as a mistake people make 'cos they confound computer math with math math is fallacious. What about: uint x; auto z = x - 1; ? here z must be `long`. and for `ulong` compiler must emit error.
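To make the two snippets above concrete, here is a minimal sketch of what D infers for them today, next to the widened result the reply asks for. The helper name `widenedDiff` is made up for illustration and is not part of druntime or Phobos.

```d
// Mixed uint/int arithmetic is carried out in uint, so `auto` infers uint.
static assert(is(typeof(uint.init - int.init) == uint));
static assert(is(typeof(uint.init - 1) == uint));

// Hypothetical helper: widen to long before subtracting, so every
// uint/int pair yields the mathematically expected result.
long widenedDiff(uint x, int y)
{
    return cast(long) x - y;
}

unittest
{
    uint x = 0;
    int y = 1;
    auto z = x - y;                  // z is a uint
    assert(z == uint.max);           // wrapped around instead of giving -1
    assert(widenedDiff(x, y) == -1); // the value a `long` result would hold
}
```

Compiling with `dmd -unittest -main` runs the checks. Widening works for uint because long can hold every uint value; there is no wider built-in type to do the same for ulong, which is presumably why the reply wants an error in that case.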
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14, 5:45 AM, Walter Bright wrote: On 11/21/2014 12:10 AM, bearophile wrote: Walter Bright: All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. I'm not pretending anything. I am asking in practical programming which of the two solutions leads to fewer problems/bugs. So far I've seen the unsigned solution and I've seen it's highly bug-prone. I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFF crossing bugs is making the bug count less. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements) How often does that happen in practice?
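The linked article is about the midpoint computation in binary search. A minimal sketch of that bug and the usual fix, written with signed `int` indices as in the article; the function names `naiveMid` and `safeMid` are illustrative only.

```d
// Classic midpoint: the sum low + high can exceed int.max and wrap
// to a negative value once both indices are around 2^30 or larger.
int naiveMid(int low, int high)
{
    return (low + high) / 2;
}

// Usual fix: take the distance first. For 0 <= low <= high the
// intermediate values never leave the int range.
int safeMid(int low, int high)
{
    return low + (high - low) / 2;
}

unittest
{
    enum int big = 1 << 30;                    // about a billion
    assert(naiveMid(big, big + 2) < 0);        // wrapped: unusable as an index
    assert(safeMid(big, big + 2) == big + 1);  // correct midpoint
}
```

This is the "0x7FFF crossing" kind of bug discussed above: the code is fine for small arrays and only fails once the indices get large.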
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 19:31:23 +1100 Daniel Murphy via Digitalmars-d digitalmars-d@puremagic.com wrote: bearophile wrote in message news:lkcltlokangpzzdzz...@forum.dlang.org... From my experience in coding in D they are far more unlikely than sign-related bugs of array lengths. Here's a simple program to calculate the relative size of two files, that will not work correctly with unsigned lengths. module sizediff; import std.file; import std.stdio; void main(string[] args) { assert(args.length == 3, "Usage: sizediff file1 file2"); auto l1 = args[1].read().length; auto l2 = args[2].read().length; writeln("Difference: ", l1 - l2); } The two ways this can fail (that I want to highlight) are: 1. If either file is too large to fit in a size_t the result will (probably) be wrong 2. If file2 is bigger than file1 the result will be wrong If length was signed, problem 2 would not exist, and problem 1 would be more likely to occur. I think it's clear that signed lengths would work for more possible realistic inputs. no, the problem 2 just becomes hidden. while the given code works most of the time, it is still broken.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
no, the problem 2 just becomes hidden. while the given code works most of the time, it is still broken. You cannot reliably handle stack overflow or out-of-memory conditions in C, so failing in extreme edge cases is true of every piece of software. "Broken" is not a black-and-white thing. "Works most of the time" surely is much more useful than "doesn't work". Otherwise you would throw away your phone the first time you get a busy signal.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 11:17:06 -0300 Ary Borenszweig via Digitalmars-d digitalmars-d@puremagic.com wrote: This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements) How often does that happen in practice? once in almost ten years is too often, as for me. i think that the answer must be never. either no bug, or the code is broken. and one of the worst code is the code that works most of the time, but still broken.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 14:37:39 + Araq via Digitalmars-d digitalmars-d@puremagic.com wrote: broken is not a black-white thing. Works most of the time surely is much more useful than doesn't work. Otherwise you would throw away your phone the first time you get a busy signal. "works most of the time" is the worst thing: the bug can stay hidden for decades and then suddenly blow up straight in your face, making you wonder what happened to good code. i will choose the code which "doesn't work" over the "works most of the time" one: the first has a clearly visible problem, and the latter has a carefully hidden problem. i prefer visible problems. btw, your phone example is totally wrong, 'cause "busy" is a well-defined state. i for sure will throw the phone away if the phone accepts only *some* incoming calls and silently ignores some others (without me explicitly telling it to do so, of course). that's like code that "works most of the time". but not at the time when they are phoning you to tell you that your house is on fire.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 13:59:08 UTC, ketmar via Digitalmars-d wrote: any code which does something like `if (a-b < 0)` is broken. it Modify it: https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d Modify method: subu(uint ...) or subu(ulong ...) if (x < y) return y - x; else return x - y; Then it will not be broken.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 14:55:45 + FrankLike via Digitalmars-d digitalmars-d@puremagic.com wrote: On Friday, 21 November 2014 at 13:59:08 UTC, ketmar via Digitalmars-d wrote: any code which does something like `if (a-b < 0)` is broken. it Modify it: https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d Modify method: subu(uint ...) or subu(ulong ...) if (x < y) return y - x; else return x - y; Then it will not be broken. and it will not do the same thing anymore either. it's not a fix at all.
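A small sketch of why the two behaviours are not interchangeable. `absDiff` mirrors the modification proposed above, and `checkedDiff` is a thin wrapper around the existing `core.checkedint.subu`; both wrapper names are made up for this illustration.

```d
import core.checkedint : subu;

// The proposed change: always return the magnitude of the difference.
// The caller can no longer tell that x was smaller than y.
uint absDiff(uint x, uint y)
{
    return x < y ? y - x : x - y;
}

// What subu does today: wrapping subtraction plus a flag that is set
// when the true result would have been negative.
uint checkedDiff(uint x, uint y, out bool wrapped)
{
    return subu(x, y, wrapped);
}

unittest
{
    bool wrapped;
    assert(absDiff(2, 4) == 2);                          // looks like a valid answer
    assert(checkedDiff(2, 4, wrapped) == uint.max - 1);  // wrapped result
    assert(wrapped);                                     // and the caller is told
    assert(checkedDiff(4, 2, wrapped) == 2 && !wrapped);
}
```

With `absDiff` the sign information is silently dropped, which is the sense in which it "will not do the same thing anymore".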
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 15:13:22 UTC, ketmar via Digitalmars-d wrote: and it will not do the same anymore too. it's not a fix at all. But it is part of the bug. Sure, bugs that come from mixing signed and unsigned values should be fixed.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote: On 11/20/2014 7:11 PM, Walter Bright wrote: On 11/20/2014 3:25 PM, bearophile wrote: Walter Bright: If that is changed to a signed type, then you'll have a same-only-different set of subtle bugs, This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation. All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. 0 crossing bugs tend to show up much sooner, and often immediately. You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type. Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1. But typeof (uint - uint ) == uint. The bit pattern is identical to the other case. But the type is wrong. It is for this reason that uint is not appropriate as a model for positive integers. Having warnings about mixing int and uint operations in relational operators is a bit misleading, because mixing signed and unsigned is not usually the real problem. Instead, those warnings are a symptom of a type system mistake. You are quite right in saying that with a signed length, overflows can still occur. But, those are in principle detectable. The compiler could add runtime overflow checks for them, for example. But the situation for unsigned is not fixable, because it is a problem with the type system.
By making .length unsigned, we are telling people that if .length is used in a subtraction expression, the type will be wrong. It is the incorrect use of the type system that is the underlying problem.
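The typing claim in this post can be checked directly. A minimal sketch; the variable names are arbitrary.

```d
// Subtraction of two uints stays uint; subtraction of two ints stays int.
static assert(is(typeof(uint.init - uint.init) == uint));
static assert(is(typeof(int.init - int.init) == int));

unittest
{
    uint a = 1, b = 2;
    assert(a - b == uint.max);        // the bit pattern of -1, typed as uint
    assert(cast(int)(a - b) == -1);   // same bits, reinterpreted as int

    int c = 1, d = 2;
    assert(c - d == -1);

    // The same thing happens with .length in a subtraction expression.
    int[] arr = [1, 2, 3];
    static assert(is(typeof(arr.length) == size_t));
    assert(arr.length - 4 > 0);       // wraps to a huge value, so this is true
}
```

The last assertion is the case the post warns about: a comparison that looks like "are there more than four elements?" succeeds for a three-element array.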
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, Nov 21, 2014 at 03:36:01PM +, Don via Digitalmars-d wrote: [...] Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! Wrong. (uint.max - 0) == uint.max, which is of type uint. If you interpret it as int, you get a negative number, which is wrong. So your proposal breaks uint in even worse ways, in that now subtracting a smaller number from a larger number may overflow, whereas it wouldn't before. So that fixes nothing, you're just shifting the problem somewhere else. T -- Too many people have open minds but closed eyes.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 15:36:02 UTC, Don wrote: On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote: On 11/20/2014 7:11 PM, Walter Bright wrote: On 11/20/2014 3:25 PM, bearophile wrote: Walter Bright: If that is changed to a signed type, then you'll have a same-only-different set of subtle bugs, This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation. All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. 0 crossing bugs tend to show up much sooner, and often immediately. You're missing the point here. The problem is that people are using 'uint' as if it were a positive integer type. Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! This would of course overflow if the result is too big to fit in an int. But the type would be correct. 1 - 2 == -1. So if i is a natint the expression i-- would change the type of variable i on the fly to int ?
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14 12:56 AM, Daniel Murphy wrote: Walter Bright wrote in message news:m4mua1$shh$1...@digitalmars.com... Presumably read() will throw if the size is larger than it can handle. If it doesn't, this code is not buggy, but read() is. You're right, but that's really not the point. What is your point? (Honest question.) Are you proposing that we make all array lengths signed? -- Andrei
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 15:50:05 UTC, H. S. Teoh via Digitalmars-d wrote: On Fri, Nov 21, 2014 at 03:36:01PM +, Don via Digitalmars-d wrote: [...] Suppose D had a type 'natint', which could hold natural numbers in the range 0..uint.max. Sounds like 'uint', right? People make the mistake of thinking that is what uint is. But it is not. How would natint behave, in the type system? typeof (natint - natint) == int NOT natint !!! Wrong. (uint.max - 0) == uint.max, which is of type uint. It is not uint.max. It is natint.max. And yes, that's an overflow condition. Exactly the same as when you do int.max + int.max. If you interpret it as int, you get a negative number, which is wrong. So your proposal breaks uint in even worse ways, in that now subtracting a smaller number from a larger number may overflow, whereas it wouldn't before. So that fixes nothing, you're just shifting the problem somewhere else. T This is not a proposal I am just illustrating the difference between what people *think* uint does, vs what it actually does. The type that I think would be useful, would be a number in the range 0..int.max. It has no risk of underflow. To put it another way: natural numbers are a subset of mathematical integers. (the range 0..infinity) signed types are a subset of mathematical integers (the range -int.max .. int.max). unsigned types are not a subset of mathematical integers. They do not just have a restricted range. They have different semantics. The question of what happens when a range is exceeded, is a different question.
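As a rough illustration of the type described here (a number in the range 0..int.max whose difference is an ordinary int), a library-level sketch follows. `NatInt` and its members are invented for this example; nothing like it exists in druntime or Phobos, and the post itself stresses that this is not a proposal.

```d
// Hypothetical wrapper: holds only values in 0 .. int.max.
struct NatInt
{
    private int value;

    this(int v)
    {
        assert(v >= 0, "NatInt cannot hold a negative value");
        value = v;
    }

    int get() const
    {
        return value;
    }

    // natint - natint yields int. Both operands are in 0 .. int.max,
    // so the difference always fits in an int and cannot wrap.
    int opBinary(string op)(NatInt rhs) const if (op == "-")
    {
        return value - rhs.value;
    }
}

unittest
{
    assert(NatInt(1) - NatInt(2) == -1);
    assert(NatInt(int.max) - NatInt(0) == int.max);
    assert(NatInt(0) - NatInt(int.max) == -int.max);
}
```

Addition is left out on purpose: it can exceed int.max, which is the separate "what happens when a range is exceeded" question mentioned above.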
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14 6:03 AM, ketmar via Digitalmars-d wrote: On Thu, 20 Nov 2014 13:28:37 -0800 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote: What *could* be improved, is the prevention of obvious mistakes in *mixing* signed and unsigned types. Right now, D allows code like the following with no warning: uint x; int y; auto z = x - y; BTW, this one is the same in essence as an actual bug that I fixed in druntime earlier this year, so downplaying it as a mistake people make 'cos they confound computer math with math math is fallacious. What about: uint x; auto z = x - 1; ? here z must be `long`. and for `ulong` compiler must emit error. Would you agree that that would break a substantial amount of correct D code? -- Andrei
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thursday, 20 November 2014 at 20:17:15 UTC, deadalnix wrote: On Thursday, 20 November 2014 at 15:55:21 UTC, H. S. Teoh via Digitalmars-d wrote: Using unsigned types for array length doesn't necessarily lead to subtle bugs, if the language was stricter about mixing signed and unsigned values. Yes, I think that this is the real issue. Thirded. Array lengths are always non-negative integers. This is axiomatic. But the subtraction thing keeps coming up in this thread; what to do? There's probably something fundamentally wrong with this and I'll probably be called an idiot by both sides, but my gut feeling is that if expressions with subtraction simply returned a signed type by default, much of the problem would disappear. It doesn't catch everything and stuff like: uint x = 2; uint y = 4; uint z = x - y; ...is still going to overflow, but maybe you know what you're doing? More importantly, changing it to auto z = x - y; actually works as expected for the majority of cases. (I'm actually on the fence re: pass/warn/error on mixing, but I _will_ note C's promotion rules have bitten me in the ass a few times and I have no particular love for them.) -Wyatt PS: I can't even believe how this thread has blown up, considering how it started.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14 6:17 AM, Ary Borenszweig wrote: On 11/21/14, 5:45 AM, Walter Bright wrote: On 11/21/2014 12:10 AM, bearophile wrote: Walter Bright: All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. I'm not pretending anything. I am asking in practical programming which of the two solutions leads to fewer problems/bugs. So far I've seen the unsigned solution and I've seen it's highly bug-prone. I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFF crossing bugs is making the bug count less. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements) How often does that happen in practice? Every time you read a DVD image :o). I should say that in my doctoral work it was often the case I'd have very large arrays. Andrei
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thursday, 20 November 2014 at 08:14:41 UTC, Walter Bright wrote: clip For example, in America we drive on the right. In Australia, they drive on the left. When I visit Australia, I know this, but when stepping out into the road I instinctively check my left for cars, step into the road, and my foot gets run over by a car coming from the right. I've had to be very careful as a pedestrian there, as my intuition would get me killed. Don't mess with systems programmers' intuitions. It'll cause more problems than it solves. I live in Quebec and my intuition always tells me to look both ways - because you never know :o)
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 08:31:13 -0800 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: Would you agree that that would break a substantial amount of correct D code? -- Andrei i don't think that code with possible int wrapping and `auto` is correct, so the answer is no. bad code must be made bad.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Friday, 21 November 2014 at 16:48:35 UTC, CraigDillabaugh wrote: I live in Quebec and my intuition always tells me to look both ways - because you never know :o) While doing my driver's training years ago, my instructor half-jokingly warned us never to jaywalk in Quebec unless we have a death wish and want to hear all about chalices and tabernacles.
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Wed, 19 Nov 2014 10:22:49 + Dominikus Dittes Scherkl dominikus.sche...@continental-corporation.com wrote: On Wednesday, 19 November 2014 at 09:06:16 UTC, Marco Leise wrote: Clearly size_t (which I tend to alias with ℕ in my code for brevity and coolness) No, this is far from the implied infinite set. A much better candidate for ℕ is BigUInt (and ℤ for BigInt) How far exactly is it from infinity? And how much closer is BigInt? I wanted a fast ℕ within the constraints of the machine. ;) -- Marco
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, Nov 21, 2014 at 08:31:13AM -0800, Andrei Alexandrescu via Digitalmars-d wrote: On 11/21/14 6:03 AM, ketmar via Digitalmars-d wrote: On Thu, 20 Nov 2014 13:28:37 -0800 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote: What *could* be improved, is the prevention of obvious mistakes in *mixing* signed and unsigned types. Right now, D allows code like the following with no warning: uint x; int y; auto z = x - y; BTW, this one is the same in essence as an actual bug that I fixed in druntime earlier this year, so downplaying it as a mistake people make 'cos they confound computer math with math math is fallacious. What about: uint x; auto z = x - 1; ? here z must be `long`. and for `ulong` compiler must emit error. What if x==uint.max? Would you agree that that would break a substantial amount of correct D code? -- Andrei Yeah I don't think it's a good idea for subtraction to yield a different type from its operands. Non-closure of operators (i.e., results are of a different type than operands) leads to a lot of frustration because you keep ending up with the wrong type, and inevitably people will just throw in random casts everywhere just to make things work. T -- We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thu, 20 Nov 2014 08:18:23 + Don x...@nospam.com wrote: It's particularly challenging in D because of the widespread use of 'auto': auto x = foo(); auto y = bar(); auto z = baz(); if (x - y > z) { ... } This might be a bug, if one of these functions returns an unsigned type. Good luck finding that. Note that if all functions return unsigned, there isn't even any signed-unsigned mismatch. With those function names I cannot write code. ℕ x = length(); ℕ y = index(); ℕ z = requiredRange(); if (x - y > z) { ... } Ah, now we're getting somewhere. Yes the code is obviously correct. You need to be aware of the value ranges of your variables and write subtractions in a way that the result can only be >= 0. If you realize that you cannot guarantee that for some case, you just found a logic bug. An invalid program state that you need to assert/if-else/throw. I don't get why so many APIs return ints. Must be to support Java or something where proper unsigned types aren't available. -- Marco
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14, 11:29 AM, ketmar via Digitalmars-d wrote: On Fri, 21 Nov 2014 19:31:23 +1100 Daniel Murphy via Digitalmars-d digitalmars-d@puremagic.com wrote: bearophile wrote in message news:lkcltlokangpzzdzz...@forum.dlang.org... From my experience in coding in D they are far more unlikely than sign-related bugs of array lengths. Here's a simple program to calculate the relative size of two files, that will not work correctly with unsigned lengths. module sizediff; import std.file; import std.stdio; void main(string[] args) { assert(args.length == 3, "Usage: sizediff file1 file2"); auto l1 = args[1].read().length; auto l2 = args[2].read().length; writeln("Difference: ", l1 - l2); } The two ways this can fail (that I want to highlight) are: 1. If either file is too large to fit in a size_t the result will (probably) be wrong 2. If file2 is bigger than file1 the result will be wrong If length was signed, problem 2 would not exist, and problem 1 would be more likely to occur. I think it's clear that signed lengths would work for more possible realistic inputs. no, the problem 2 just becomes hidden. while the given code works most of the time, it is still broken. So how would you solve problem 2?
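One possible answer to the question above, offered only as a sketch and not as anything a participant in the thread posted: compare the two unsigned lengths first, subtract the smaller from the larger, and attach the sign afterwards. The helper name `signedDiff` is made up; `std.conv.to` is the real Phobos conversion, which throws a `ConvOverflowException` if the magnitude does not fit in a `long`.

```d
import std.conv : to;

// Signed difference of two unsigned lengths without relying on wraparound.
// The subtraction is always larger minus smaller, so it cannot wrap;
// the conversion to long is checked by std.conv.to.
long signedDiff(size_t a, size_t b)
{
    return a >= b ? to!long(a - b) : -to!long(b - a);
}

unittest
{
    assert(signedDiff(5, 3) == 2);
    assert(signedDiff(3, 5) == -2);   // file2 bigger than file1: negative, not huge
    assert(signedDiff(7, 7) == 0);
}
```

In the sizediff program this would replace `l1 - l2` with `signedDiff(l1, l2)`. Problem 1 (a file too large for `size_t`) is unaffected by this change.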
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14, 11:47 AM, ketmar via Digitalmars-d wrote: On Fri, 21 Nov 2014 11:17:06 -0300 Ary Borenszweig via Digitalmars-d digitalmars-d@puremagic.com wrote: This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements) How often does that happen in practice? once in almost ten years is too often, as for me. i think that the answer must be never. either no bug, or the code is broken. and one of the worst code is the code that works most of the time, but still broken. You see, if you don't use a BigNum for everything then you will always have hidden bugs, be it with int, uint or whatever. The thing is that with int, bugs are much less frequent than with uint. So I don't know why you'd rather have uint than int...
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On 11/21/14, 1:32 PM, Andrei Alexandrescu wrote: On 11/21/14 6:17 AM, Ary Borenszweig wrote: On 11/21/14, 5:45 AM, Walter Bright wrote: On 11/21/2014 12:10 AM, bearophile wrote: Walter Bright: All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. I'm not pretending anything. I am asking in practical programming which of the two solutions leads to fewer problems/bugs. So far I've seen the unsigned solution and I've seen it's highly bug-prone. I'm suggesting that having a bug and detecting the bug are two different things. The 0-crossing bug is easier to detect, but that doesn't mean that shifting the problem to 0x7FFF crossing bugs is making the bug count less. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. Is this true? Do you have some examples of buggy code? http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html This bug can manifest itself for arrays whose length (in elements) is 2^30 or greater (roughly a billion elements) How often does that happen in practice? Every time you read a DVD image :o). I should say that in my doctoral work it was often the case I'd have very large arrays. Oh, sorry, I totally forgot that when you open a DVD with VLC it reads the whole thing to memory. /sarcasm
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 16:32:20 + Wyatt wyatt@gmail.com wrote: Array lengths are always non-negative integers. This is axiomatic. But the subtraction thing keeps coming up in this thread; what to do? There's probably something fundamentally wrong with this and I'll probably be called an idiot by both sides, but my gut feeling is that if expressions with subtraction simply returned a signed type by default, much of the problem would disappear. [...] As I said above, I always order my unsigned variables by magnitude and uint.max - uint.min should result in uint.max and not -1. In code dealing with lengths or offsets there is typically some base that is less than the position or an index that is less than the length. The expression `base - position` is just wrong. If position is in fact below base, then you are guaranteed to end up with an if-else later on. So why not place it up front: if (position >= base) { auto offset = position - base; } else { … } [...] -Wyatt PS: I can't even believe how this thread has blown up, considering how it started. Exactly my thought, but suddenly I couldn't stop myself from posting. -- Marco
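The "check up front" pattern from this post, written out as a small sketch. The function names `tryOffset` and `exceedsBy` are invented for illustration; the second one applies the same idea to the `x - y > z` comparison discussed earlier in the thread.

```d
// Compute position - base only when it cannot wrap.
// Returns false, leaving offset at 0, when position lies below base.
bool tryOffset(size_t position, size_t base, out size_t offset)
{
    if (position >= base)
    {
        offset = position - base;
        return true;
    }
    return false;
}

// "Is x more than z beyond y?" phrased so the subtraction is only
// evaluated when x >= y.
bool exceedsBy(size_t x, size_t y, size_t z)
{
    return x >= y && x - y > z;
}

unittest
{
    size_t off;
    assert(tryOffset(10, 4, off) && off == 6);
    assert(!tryOffset(4, 10, off) && off == 0);  // the else branch, made explicit
    assert(exceedsBy(10, 4, 5));
    assert(!exceedsBy(4, 10, 5));                // naive 4 - 10 > 5 would be true
}
```

The design choice is that the "position below base" case becomes an explicit branch for the caller instead of a wrapped value that looks like a valid offset.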
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Thu, 20 Nov 2014 20:53:31 -0800 Walter Bright newshou...@digitalmars.com wrote: On 11/20/2014 7:11 PM, Walter Bright wrote: On 11/20/2014 3:25 PM, bearophile wrote: Walter Bright: If that is changed to a signed type, then you'll have a same-only-different set of subtle bugs, This is possible. Can you show some of the bugs, we can discuss them, and see if they are actually worse than the current situation. All you're doing is trading 0 crossing for 0x7FFF crossing issues, and pretending the problems have gone away. BTW, granted the 0x7FFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested robust code. 0 crossing bugs tend to show up much sooner, and often immediately. +1000. This is also the reason we have a special float .init in D. There is no plethora of bugs to show, because they are under the radar. Signed types are only more convenient in the scripting language sense, like using double for everything and array indexing in JavaScript. -- Marco
Re: 'int' is enough for 'length' to migrate code from x86 to x64
On Fri, 21 Nov 2014 14:38:26 -0300 Ary Borenszweig via Digitalmars-d digitalmars-d@puremagic.com wrote: You see, if you don't use a BigNum for everything than you will always have hidden bugs, be it with int, uint or whatever. why do you believe that i'm not aware of overflows and not checking for them? i've been used to thinking about overflows and doing overflow checking in production code since my Z80 days. and i don't believe that an infrequent bug is better than a frequent bug. both are equally bad.