Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-27 Thread Kagamin via Digitalmars-d
On Tuesday, 25 November 2014 at 22:56:50 UTC, Ola Fosheim Grøstad 
wrote:
I personally would take the monotonic optimizations and rather 
have a separate bit-fiddling type that provides a clean builtin 
swiss-army-knife toolset that gives close to direct access to 
the whole arsenal that the CPU instruction set provides (carry, 
ROL/ROR, bitcounts etc).


I don't think there's such a clear separation that can be expressed 
in a type; it's more a matter of coding practice than of type. 
You can't change coding practice by introducing a new type.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-27 Thread via Digitalmars-d

On Thursday, 27 November 2014 at 08:31:24 UTC, Kagamin wrote:
I don't think there's such a clear separation that can be 
expressed in a type; it's more a matter of coding practice than 
of type. You can't change coding practice by introducing a 
new type.


You need to separate and define the old types as well as 
introducing a clean way to do low level manipulation. How to do 
the latter is not as clear, but…


…regular types should be constrained to convey the intent of the 
programmer. The intent is conveyed to the compiler and to readers 
of the source-code. So the type definition should be strict on 
whether the intent is to convey monotonic qualities or 
circular/modular qualities.


The C-practice of casting from void* to char* to float to uint to 
int in order to do bit manipulation leads to badly structured 
code. Intrinsics also lead to less readable code. There's got to 
be a better solution to keep bit hacks separate from regular 
code. Maybe a register type that maps onto SIMD registers…


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-27 Thread bearophile via Digitalmars-d

Kagamin:


You can't change coding practice by introducing a new type.


We can try to change coding practice by introducing new types :-)

Bye,
bearophile


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-26 Thread Frank Like via Digitalmars-d
When I migrated the dfl code from x86 to 64-bit, I modified 
drawing.d and found that 'offset', 'index', point(x,y) and 
rect(x,y) all have to keep the same type as 'length'. So I did 
not change them to size_t; I only converted the length with 
cast(int). That makes it easy to migrate the dfl code to 64-bit.

OK, dfl now works on 64-bit.
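
A minimal D sketch of the migration approach described above. The helper name intLength is hypothetical and not part of dfl. A plain cast(int) silently keeps only the low 32 bits of a 64-bit length, while std.conv.to!int throws if the value does not fit, which may be the safer choice when porting.

```d
import std.conv : to;

// Hypothetical helper (not part of dfl): narrow a size_t length to int,
// throwing instead of silently truncating on 64-bit targets.
int intLength(T)(const T[] arr)
{
    return arr.length.to!int; // throws ConvOverflowException if length > int.max
}

unittest
{
    auto items = [10, 20, 30];
    int viaCast = cast(int) items.length; // unchecked: keeps only the low 32 bits
    assert(viaCast == 3);
    assert(intLength(items) == 3);
}
```

For GUI coordinates and similar small values both forms give the same result; they differ only when a length exceeds int.max.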



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread Don via Digitalmars-d

On Monday, 24 November 2014 at 21:34:19 UTC, Walter Bright wrote:

On 11/24/2014 2:20 AM, Don wrote:
I believe I do understand the problem. As a practical matter, 
overflow checks are not going to be added for performance reasons.


The performance overhead would be practically zero. All we 
would need to do, is
restrict array slices such that the length cannot exceed 
ssize_t.max.


This can only happen in the case where the element type has a 
size of 1, and
only in the case of slicing a pointer, concatenation, and 
memory allocation.


(length1 + length2) / 2


That's not an issue with length, that's an issue with doing a 
calculation with an insufficient bit width. Unsigned doesn't 
actually help, it's still wrong.


For unsigned values, if length1 = length2 = 0x8000_0000, that 
gives an answer of 0.
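
The wrap can be reproduced in a few lines of D. This is only a sketch: naiveMid and safeMid are hypothetical names, and safeMid is one common rearrangement that avoids forming the full sum, not something proposed in the thread.

```d
// Naive midpoint: the sum is reduced modulo 2^^32 before the division.
uint naiveMid(uint lo, uint hi)
{
    return (lo + hi) / 2;
}

// Hypothetical alternative: never forms the full sum, so it cannot wrap
// as long as lo <= hi.
uint safeMid(uint lo, uint hi)
{
    assert(lo <= hi);
    return lo + (hi - lo) / 2;
}

unittest
{
    assert(naiveMid(0x8000_0000, 0x8000_0000) == 0);          // wrapped
    assert(safeMid(0x8000_0000, 0x8000_0000) == 0x8000_0000); // intended value
}
```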



In exchange, 99% of uses of unsigned would disappear from D 
code, and with it, a whole category of bugs.


You're not proposing changing size_t, so I believe this 
statement is incorrect.


From the D code that I've seen, almost all uses of size_t come 
directly from the use of .length. But I concede (see below) that 
many of them come from .sizeof.


Also, in principle, uint-uint can generate a runtime check 
for underflow (i.e. the carry flag).


No it cannot. The compiler does not have enough information to 
know if the value is intended to be positive integer, or an 
unsigned. That information is lost from the type system.

Eg from C, wrapping of an unsigned type is not an error. It is 
perfectly defined behaviour. With signed types, it's undefined 
behaviour.


I know it's not an error. It can be defined to be an error, and 
the compiler can insert a runtime check. (I'm not proposing 
this, just saying it can be done.)


But it can't do that, without turning unsigned into a different 
type.
You'd be turning unsigned into a 'non-negative' which is a 
completely different type. This is my whole point.


unsigned has no sign, you just get the raw bit pattern with no 
interpretation.

This can mean several things, for example:
1. extended_non_negative is where you are using it for the 
positive range 0.. +0xFFFF_FFFF.
   Then, overflow and underflow are errors.
2. a value where the highest bit is always 0. This can be safely 
used as int or uint.
3. Or, it can be modulo 2^^32 arithmetic, where wrapping is 
intended.
4. It can be part of extended precision arithmetic, where you 
want the carry flag.
5. It can be just a raw bit pattern.
6. The high bit can be a sign bit. This is a signed type, cast to 
uint.
   If the sign bit ever flips because of a carry, that's an error.

The type system doesn't specify a meaning for the bit pattern. 
We've got a special type for case 6, but not for the others.


The problem with unsigned is that it can mean so many things, 
as if it were a union of these possibilities. So it's not 
strictly typed -- you need to be careful, requiring some element 
of faith-based programming.


And signed-unsigned mismatch is really where you are implicitly 
assuming that the unsigned value is case 2 or 6.  But, if it is 
one of the other cases, you get nonsense.


But those signed unsigned mismatch errors only catch some of 
the possible cases where you may forget which interpretation you 
are using, and act as if it were another one.



To make this clear: I am not proposing that size_t should be 
changed.
I am proposing that .length returns a signed type that, for 
array slices, is guaranteed to never be negative.


There'll be mass confusion if .length is not the same type as 
.sizeof


Ah, that is a good point. .sizeof is another source of unsigned.
Again, quite unnecessarily: can a single type ever actually use 
up half of the memory space? (It was possible in the 8 and 16 bit 
days, but it's hard to imagine today). Even sillier, it is nearly 
always known at compile time!


But still, .sizeof is low-level in a way that .length is not.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread via Digitalmars-d

On Tuesday, 25 November 2014 at 07:39:44 UTC, Don wrote:
No, that is not overflow. That is a carry. Overflow is when the 
sign bit changes.


I think this discussion will be less confusing after clearing up 
the terminology.


An overflow condition happens when the representation cannot hold 
the magnitude of the intended type. In floating point that is 
+Inf and -Inf.


An underflow condition happens when the representation cannot 
represent the precision of small numbers. In floating point that 
is +0, -0 and denormal numbers, detected or undetected.


Carry is an extra bit that can be considered part of the 
computation for a concrete machine code instruction that provides 
carry. Eg 32bits + 32bits = (32+1) bits.


If the intended type is true Reals and the representation is 
integer then we get:


0u - 1u = overflow
1u / 2u = underflow

Carry can be taken as an overflow condition, but it is not proper 
overflow if you interpret it as a part of the result that depends 
on the machine language instruction and use of it. For a regular 
ADD/SUB instruction with carry the ALU covers two intended types 
(signed/unsigned) and uses the control register flags in a way 
which lets the programmer make the interpretation.


Some SIMD instructions do not provide control register flags 
and are therefore true modular arithmetic that does not overflow 
by definition, but if you use them for representing a non-modular 
intended type then you get undetected overflow…


Overflow is in relation to an interpretation: the intended type 
versus the internal representation and the concrete machine 
language instruction.
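
A small D sketch of the distinction, assuming druntime's core.checkedint is available. The wrapper names carries and overflowsSigned are hypothetical. The same 32-bit addition is judged once under the unsigned reading and once under the signed reading, and the two verdicts differ.

```d
import core.checkedint : adds, addu;

// True if a + b does not fit in 32 unsigned bits (a carry out of bit 31).
bool carries(uint a, uint b)
{
    bool flag;
    cast(void) addu(a, b, flag);
    return flag;
}

// True if a + b does not fit in a 32-bit signed int.
bool overflowsSigned(int a, int b)
{
    bool flag;
    cast(void) adds(a, b, flag);
    return flag;
}

unittest
{
    // Same bit patterns, two readings: 0xFFFF_FFFF + 1 carries as unsigned,
    // but as signed it is -1 + 1 == 0, which is exact.
    assert(carries(0xFFFF_FFFF, 1));
    assert(!overflowsSigned(-1, 1));

    // The reverse: 0x7FFF_FFFF + 1 is fine as unsigned, overflows as signed.
    assert(!carries(0x7FFF_FFFF, 1));
    assert(overflowsSigned(int.max, 1));
}
```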


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread Kagamin via Digitalmars-d

On Monday, 24 November 2014 at 21:34:19 UTC, Walter Bright wrote:
In exchange, 99% of uses of unsigned would disappear from D 
code, and with it, a whole category of bugs.


You're not proposing changing size_t, so I believe this 
statement is incorrect.


The idea is to make unsigned types opt-in, a deliberate choice of 
individual programmers, not forced by the language. Positive 
signed integers convert to unsigned integers perfectly without 
losing information, so mixing types will work perfectly for those 
who request it.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread Don via Digitalmars-d
On Monday, 24 November 2014 at 15:56:44 UTC, Andrei Alexandrescu 
wrote:

On 11/24/14 4:54 AM, Don wrote:
In D,  1u - 2u > 0u. This is defined behaviour, not an 
overflow.


I think I get what you mean, but overflow is also defined 
behavior (in D at least). -- Andrei



Aargh! You're right. That's new, and dreadful. It didn't used to 
be.

The offending commit is

alexrp  2012-05-15 15:37:24

which only provides an unsigned example.

Why are we defining behaviour that is always a bug? Java makes it 
defined, but it has to because it doesn't have unsigned types.
I think the intention probably was to improve on the C situation, 
where there is undefined behaviour that really should be defined.


But do we really want to preclude ever having overflow checking 
for integers?




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread bearophile via Digitalmars-d

Don:

Aargh! You're right. That's new, and dreadful. It didn't used 
to be.

The offending commit is

alexrp  2012-05-15 15:37:24

which only provides an unsigned example.

Why are we defining behaviour that is always a bug? Java makes it 
defined, but it has to because it doesn't have unsigned types.
I think the intention probably was to improve on the C 
situation, where there is undefined behaviour that really 
should be defined.


But do we really want to preclude ever having overflow checking 
for integers?


+1

Bye,
bearophile


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread Kagamin via Digitalmars-d

On Tuesday, 25 November 2014 at 11:43:01 UTC, Don wrote:
Why are we defining behaviour that is always a bug? Java makes it 
defined, but it has to because it doesn't have unsigned types.
I think the intention probably was to improve on the C 
situation, where there is undefined behaviour that really 
should be defined.


Mostly to prevent optimizations based on no-overflow assumption.

But do we really want to preclude ever having overflow checking 
for integers?


Overflow checking doesn't contradict overflow being defined. 
The latter simply reflects how hardware works, nothing else. And 
hardware works that way, because that's a fast implementation of 
arithmetic for general case.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread via Digitalmars-d

On Tuesday, 25 November 2014 at 13:52:32 UTC, Kagamin wrote:
Overflow checking doesn't contradict overflow being defined. 
The latter simply reflects how hardware works, nothing else. 
And hardware works that way, because that's a fast 
implementation of arithmetic for general case.


So you are basically saying that D does not provide modular 
arithmetic, but allows you to continue with the incorrect result 
of an overflow as a modulo representation?


Because you have to choose, you cannot both have modular 
arithmetic and overflow at the same time for the same operator. 
Overflow happens because you have monotonic semantics for 
addition, not modular semantics.


Btw,  http://dlang.org/expression needs a clean up, the term 
underflow is not used correctly.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread Kagamin via Digitalmars-d
On Tuesday, 25 November 2014 at 14:30:36 UTC, Ola Fosheim Grøstad 
wrote:
So you are basically saying that D does not provide modular 
arithmetic, but allows you to continue with the incorrect 
result of an overflow as a modulo representation?


Correctness is an emergent property - when behavior matches 
expectation, so overflow has variable correctness in various 
parts of the code.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread via Digitalmars-d

On Tuesday, 25 November 2014 at 15:42:13 UTC, Kagamin wrote:
Correctness is an emergent property - when behavior matches 
expectation, so overflow has variable correctness in various 
parts of the code.


I assume you are basically restating Walter's view that 
matching C++ is more important than getting it right, because 
some people might expect C++ behaviour. Yet Ada chose a different 
path and is considered a better language with respect to 
correctness.


I think it is important to get the definitions consistent and 
sound so they are easy to reason about, both for users and 
implementors. So one should choose whether the type is primarily 
monotonic, with incorrect values truncated into modulo N, or if 
the type is primarily modular.


If addition is defined to be primarily monotonic it means you can 
optimize if (x < x+1)… into if (true)…. If it is defined to be 
primarily modular, then you cannot.
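
A short D sketch of why the rewrite is invalid under modular semantics. The function name lessThanSuccessor is hypothetical. With D's wrapping int arithmetic the comparison is false for exactly one input, so folding it to true would change behaviour.

```d
// Under D's wrapping int arithmetic, x < x + 1 is not a tautology.
bool lessThanSuccessor(int x)
{
    return x < x + 1;
}

unittest
{
    assert(lessThanSuccessor(0));
    assert(lessThanSuccessor(-1));
    assert(!lessThanSuccessor(int.max)); // int.max + 1 wraps to int.min
}
```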


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread Kagamin via Digitalmars-d
On Tuesday, 25 November 2014 at 15:52:22 UTC, Ola Fosheim Grøstad 
wrote:
I assume you are basically restating Walter's view that 
matching C++ is more important than getting it right, because 
some people might expect C++ behaviour. Yet Ada chose a 
different path and is considered a better language with respect 
to correctness.


The C++ legacy is huge, especially in culture. That said, the true 
issue is in beliefs (which probably stem from the 16-bit era). I 
can't judge Ada, having no experience with it, though the examples 
of Java and .NET show how marginal the importance of unsigned types is.


I think it is important to get the definitions consistent and 
sound so they are easy to reason about, both for users and 
implementors. So one should choose whether the type is 
primarily monotonic, with incorrect values truncated into 
modulo N, or if the type is primarily modular.


In this light the examples by Marco Leise become interesting: he 
tries to evade wrapping even for unsigned types, so yes, types 
are primarily monotonic and optimized for small values.


If addition is defined to be primarily monotonic it means you 
can optimize if (x < x+1)… into if (true)…. If it is defined 
to be primarily modular, then you cannot.


Such optimizations have a bad reputation. If they were more 
conservative and didn't propagate back in code flow, the 
situation would probably be better. Also, isn't (x < x+1) a 
suspicious expression? Is it a good idea to mess with it?


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-25 Thread via Digitalmars-d

On Tuesday, 25 November 2014 at 18:24:29 UTC, Kagamin wrote:
The C++ legacy is huge, especially in culture. That said, the true 
issue is in beliefs (which probably stem from the 16-bit era). 
I can't judge Ada, having no experience with it, though the examples 
of Java and .NET show how marginal the importance of unsigned types is.


Unsigned bytes are important, and I personally tend to make just 
about everything unsigned when dealing with C-like languages 
because that makes me aware of the pitfalls and I avoid the 
signedness issue.


The downside is that it takes extra work to get the evaluation 
order right and you have to take extra care to make sure loops 
terminate correctly by being very conscious about +-1 issues when 
terminating around zero.


But I don't really think C++ legacy is a good reason to keep 
implicit coercion no matter what programming style one has. 
Coercion is generally something I try to avoid, even explicitly, 
so why would I want the compiler to do it with no warning?


Such optimizations have a bad reputation. If they were more 
conservative and didn't propagate back in code flow, the 
situation would probably be better. Also, isn't (x < x+1) a 
suspicious expression? Is it a good idea to mess with it?


It is just an example, it could be the result of substituting 
aliased values.


Anyway, I think it is important to not only define what happens 
if you add 1 to 0xFFFFFFFF, but also define whether that result 
is considered in correspondence with the type. If it isn't a 
correct value for the type, then the programmer cannot assume 
that optimizations will heed the resulting incorrect value. The 
only acceptable alternative is to have the language specification 
explicitly define the type as modular and overflow free. If not 
you end up with weak typing…?


I personally would take the monotonic optimizations and rather 
have a separate bit-fiddling type that provides a clean builtin 
swiss-army-knife toolset that gives close to direct access to the 
whole arsenal that the CPU instruction set provides (carry, 
ROL/ROR, bitcounts etc).


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Don via Digitalmars-d

On Friday, 21 November 2014 at 20:17:12 UTC, Walter Bright wrote:

On 11/21/2014 7:36 AM, Don wrote:
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright 
wrote:
0 crossing bugs tend to show up much sooner, and often 
immediately.



You're missing the point here. The problem is that people are 
using 'uint' as if it were a positive integer type.

Suppose  D had a type 'natint', which could hold natural 
numbers in the range 0..uint.max.  Sounds like 'uint', right? 
People make the mistake of thinking that is what uint is. But it 
is not.

How would natint behave, in the type system?

typeof (natint - natint)  ==  int NOT natint  !!!

This would of course overflow if the result is too big to fit 
in an int. But the type would be correct.  1 - 2 == -1.

But

typeof (uint - uint ) == uint.

The bit pattern is identical to the other case. But the type 
is wrong.


It is for this reason that uint is not appropriate as a model 
for positive integers. Having warnings about mixing int and uint 
operations in relational operators is a bit misleading, because 
mixing signed and unsigned is not usually the real problem. 
Instead, those warnings are a symptom of a type system mistake.


You are quite right in saying that with a signed length, 
overflows can still occur. But, those are in principle detectable. 
The compiler could add runtime overflow checks for them, for 
example. But the situation for unsigned is not fixable, because it 
is a problem with the type system.


By making .length unsigned, we are telling people that if 
.length is used in a subtraction expression, the type will be wrong.

It is the incorrect use of the type system that is the 
underlying problem.
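
The typing rule under discussion can be checked directly in D. In the sketch below natDiff is a hypothetical function, not an existing or proposed library API, and it returns long rather than the int mentioned above so that every difference of two uint values is representable.

```d
// The built-in rule: the difference of two uints is itself a uint.
static assert(is(typeof(uint.init - uint.init) == uint));

// Hypothetical sketch of the subtraction a 'natint' would need: the result
// is signed. long is used here so every difference of two uints fits.
long natDiff(uint a, uint b)
{
    return cast(long) a - cast(long) b;
}

unittest
{
    uint one = 1, two = 2;
    assert(one - two == uint.max);   // same bit pattern as -1, read as uint
    assert(natDiff(one, two) == -1); // the value a natural-number type would give
}
```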


I believe I do understand the problem. As a practical matter, 
overflow checks are not going to be added for performance 
reasons.


The performance overhead would be practically zero. All we would 
need to do, is restrict array slices such that the length cannot 
exceed ssize_t.max.


This can only happen in the case where the element type has a 
size of 1, and only in the case of slicing a pointer, 
concatenation, and memory allocation.


Making this restriction would have been unreasonable in the 8 and 
16 bit days, but D doesn't support those.  For 32 bits, this is 
an extreme corner case. For 64 bit, this condition never happens 
at all.
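
A rough D sketch of what a signed length could look like in user code today. The helper slength is hypothetical. ptrdiff_t is D's signed counterpart of size_t (what this thread calls ssize_t), and the assert stands in for the restriction that a slice never exceeds ptrdiff_t.max elements.

```d
// Hypothetical helper: a signed view of .length.
ptrdiff_t slength(T)(const T[] arr)
{
    assert(arr.length <= ptrdiff_t.max, "slice too long for a signed length");
    return cast(ptrdiff_t) arr.length;
}

unittest
{
    int[] a = [1, 2, 3];
    int[] b = [1, 2, 3, 4, 5];
    assert(a.length - b.length > 0);       // unsigned: wraps to a huge value
    assert(slength(a) - slength(b) == -2); // signed: the expected difference
}
```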


In exchange, 99% of uses of unsigned would disappear from D code, 
and with it, a whole category of bugs.



Also, in principle, uint-uint can generate a runtime check for 
underflow (i.e. the carry flag).


No it cannot. The compiler does not have enough information to 
know if the value is intended to be positive integer, or an 
unsigned. That information is lost from the type system.


Eg from C, wrapping of an unsigned type is not an error. It is 
perfectly defined behaviour. With signed types, it's undefined 
behaviour.



To make this clear: I am not proposing that size_t should be 
changed.
I am proposing that .length returns a signed type that, for 
array slices, is guaranteed to never be negative.







Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Don via Digitalmars-d

On Friday, 21 November 2014 at 17:23:51 UTC, Marco Leise wrote:

On Thu, 20 Nov 2014 08:18:23 +0000,
Don x...@nospam.com wrote:

It's particularly challenging in D because of the widespread 
use of 'auto':


auto x = foo();
auto y = bar();
auto z = baz();

if (x - y > z) { ... }


This might be a bug, if one of these functions returns an 
unsigned type.  Good luck finding that. Note that if all 
functions return unsigned, there isn't even any 
signed-unsigned mismatch.


With those function names I cannot write code.

ℕ x = length();
ℕ y = index();
ℕ z = requiredRange();

if (x - y > z) { ... }

Ah, now we're getting somewhere. Yes the code is obviously
correct. You need to be aware of the value ranges of your
variables and write subtractions in a way that the result can
only be >= 0. If you realize that you cannot guarantee that
for some case, you just found a logic bug. An invalid program
state that you need to assert/if-else/throw.


Yup. And that is not captured in the type system.
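
A small D illustration of the range reasoning described above. The function names are hypothetical stand-ins for the length/index/requiredRange example. The naive form compiles without any signed-unsigned warning yet gives the wrong answer when index exceeds length; the second form makes the precondition explicit.

```d
// All three values are unsigned, so there is no signed-unsigned mismatch.
bool hasRoomNaive(size_t length, size_t index, size_t required)
{
    return length - index > required; // wraps when index > length
}

// Rearranged so that the subtraction is only evaluated when it cannot wrap.
bool hasRoom(size_t length, size_t index, size_t required)
{
    return index <= length && length - index > required;
}

unittest
{
    assert(hasRoomNaive(10, 2, 3) && hasRoom(10, 2, 3));
    assert(hasRoomNaive(2, 10, 3)); // wrong: 2 - 10 wrapped to a huge value
    assert(!hasRoom(2, 10, 3));
}
```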



I don't get why so many APIs return ints. Must be to support
Java or something where proper unsigned types aren't available.


 D and C do not have suitable types either.

unsigned !=  ℕ.

In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Don via Digitalmars-d

On Friday, 21 November 2014 at 08:46:20 UTC, Walter Bright wrote:

On 11/21/2014 12:10 AM, bearophile wrote:

Walter Bright:

All you're doing is trading 0 crossing for 0x7FFFFFFF 
crossing issues, and pretending the problems have gone away.


I'm not pretending anything. I am asking which of the two 
solutions leads to fewer problems/bugs in practical programming. 
So far I've seen the unsigned solution and I've seen it's highly 
bug-prone.


I'm suggesting that having a bug and detecting the bug are two 
different things. The 0-crossing bug is easier to detect, but 
that doesn't mean that shifting the problem to 0x7FFFFFFF 
crossing bugs is making the bug count less.



BTW, granted the 0x7FFFFFFF problems exhibit the bugs less 
often, but paradoxically this can make the bug worse, because 
then it only gets found much, much later in supposedly tested & 
robust code.


Is this true? Do you have some examples of buggy code?


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html


Changing signed to unsigned in that example does NOT fix the bug.
It just means it fails with length = 2^^31 instead of length = 
2^^30.


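uint a = 0x8000_0000u;
uint b = 0x8000_0002u;
// The sum wraps modulo 2^^32 to 2, so the computed midpoint is 1
// rather than the true midpoint 0x8000_0001.
assert( (a + b) /2 == 1);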

But actually I don't understand that article.
The arrays are int, not char. Since length fits into 32 bits, the 
largest possible value is 2^^32-1. Therefore, for an int array, 
with 4 byte elements, the largest possible value is 2^^30-1.


So I think the article is wrong. I don't think there is a bug in 
the code.







Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Andrei Alexandrescu via Digitalmars-d

On 11/24/14 2:20 AM, Don wrote:

I am proposing that .length returns a signed type that, for array
slices, is guaranteed to never be negative.


Assuming you do make the case this change is an improvement, do you 
believe it's worth the breakage it would create? -- Andrei




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Andrei Alexandrescu via Digitalmars-d

On 11/24/14 4:54 AM, Don wrote:

In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.


I think I get what you mean, but overflow is also defined behavior (in D 
at least). -- Andrei


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread ketmar via Digitalmars-d
On Mon, 24 Nov 2014 12:54:58 +0000
Don via Digitalmars-d digitalmars-d@puremagic.com wrote:

 In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.
p.s. sorry, of course this is not an overflow. this is underflow.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread ketmar via Digitalmars-d
On Mon, 24 Nov 2014 12:54:58 +0000
Don via Digitalmars-d digitalmars-d@puremagic.com wrote:

 In D,  1u - 2u > 0u. This is defined behaviour, not an overflow.
this *is* overflow. D just has overflow result defined.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread via Digitalmars-d
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via 
Digitalmars-d wrote:

this *is* overflow. D just has overflow result defined.


So it basically is and isn't modular arithmetic at the same time? 
I think Ada got this right by providing the ability to specify 
the modulo value, so you can define:


type Weekday is mod 7;
type Byte is mod 256;

A solid solution is to provide «As if Infinitely Ranged 
Integer Model» where the compiler figures out how large integers 
are needed for computation and then does overflow detection when 
you truncate for storage:


http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019
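
The following D sketch is not the AIR model itself, only a hand-written approximation of the idea for two expressions: widen the intermediate so it cannot overflow, then check the range once when the result is narrowed for storage. The function names average and scaled are hypothetical; std.conv.to performs the narrowing check.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

// The intermediate sum is held in a long, where it cannot overflow.
int average(int a, int b)
{
    long wide = cast(long) a + cast(long) b;
    return (wide / 2).to!int; // range check happens only at the narrowing
}

// The product of two ints always fits in a long.
int scaled(int a, int factor)
{
    long wide = cast(long) a * factor;
    return wide.to!int; // throws ConvOverflowException if out of int range
}

unittest
{
    assert(average(int.max, int.max) == int.max);
    assertThrown!ConvOverflowException(scaled(int.max, 2));
}
```

A compiler implementing the model would pick the widths itself; here the choice of long is made by hand and only works because the inputs are 32 bits wide.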


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Matthias Bentrup via Digitalmars-d

On Monday, 24 November 2014 at 16:45:35 UTC, Ola Fosheim Grøstad
wrote:
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via 
Digitalmars-d wrote:

this *is* overflow. D just has overflow result defined.


So it basically is and isn't modular arithmetic at the same 
time?


Overflow is part of modular arithmetic. However, there is no
signed and unsigned modular arithmetic, or, more precisely, they
are the same.

Computer words just aren't a good representation of integers.

You can either use modular arithmetic, which follows the common
arithmetic laws for addition and multiplication (commutativity,
associativity, etc., even most non-zero numbers have a
multiplicative inverse), but breaks the common ordering laws (a >=
0 && b >= 0 implies a+b >= 0).

Or you can use some other order preserving arithmetic (e.g.
saturating to min/max values), but that breaks the arithmetic
laws.
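
Both halves of that trade-off can be observed with D's built-in 32-bit types. This is a sketch with hypothetical helper names. Note that modulo 2^^32 it is specifically the odd numbers that have a multiplicative inverse; the example uses 3.

```d
// Multiplication modulo 2^^32, which is what uint * uint already does.
uint mulMod32(uint a, uint b)
{
    return a * b;
}

// Wrapping signed addition, which is what int + int already does in D.
int addWrap(int a, int b)
{
    return a + b;
}

unittest
{
    // Arithmetic laws hold: 3 * 0xAAAA_AAAB == 2^^33 + 1, which is 1 mod 2^^32.
    assert(mulMod32(3, 0xAAAA_AAAB) == 1);

    // Ordering laws do not: two non-negative ints can sum to a negative one.
    assert(addWrap(int.max, 1) < 0);
}
```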

I think Ada got this right by providing the ability to specify 
the modulo value, so you can define:


type Weekday is mod 7;
type Byte is mod 256;

A solid solution is to provide «As if Infinitely 
Ranged Integer Model» where the compiler figures out how large 
integers are needed for computation and then does overflow 
detection when you truncate for storage:


http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019


You could just as well use a library like GMP.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread via Digitalmars-d
On Monday, 24 November 2014 at 17:12:31 UTC, Matthias Bentrup 
wrote:

Overflow is part of modular arithmetic. However, there is no
signed and unsigned modular arithmetic, or, more precisely, they
are the same.


Would you say that a phase that goes from 0…2pi overflows? Do 
polar coordinates overflow once every turn?


I'd say overflow/underflow means that the result is wrong. (Carry 
is not overflow per se).



Or you can use some other order preserving arithmetic (e.g.
saturating to min/max values), but that breaks the arithmetic
laws.


I don't think it breaks them, but I think a system language would 
be better off by having explicit operators for alternative 
edge-case handling on a bit-fiddling type. E.g.:


a + b as regular addition
a (+) b as modulo arithmetic addition
a [+] b as clamped (saturating) addition

The bad behaviour of C-like languages is the implicit coercion 
to/from a bit-fiddling type. The bit-fiddling should be contained 
in an expression where the programmer, by choosing the type, says 
"I am gonna do tricky bit hacks here". Just casting to uint does 
not convey that message in a clear manner.


A solid solution is to provide «As if Infinitely 
Ranged Integer Model» where the compiler figures out how large 
integers are needed for computation and then does overflow 
detection when you truncate for storage:


http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019


You could just as well use a library like GMP.


I think the point with having compiler support is to retain most 
optimizations. The compiler selects the most efficient 
representation based on the needed headroom and makes sure that 
overflow is recorded so that you can eventually respond to it.


If you couple AIR with constrained integer types, which Pascal 
and Ada have, then it can be very efficient in many cases.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread via Digitalmars-d
On Monday, 24 November 2014 at 17:55:06 UTC, Ola Fosheim Grøstad 
wrote:
I think the point with having compiler support is to retain 
most optimizations. The compiler selects the most efficient 
representation based on the needed headroom and makes sure that 
overflow is recorded so that you can eventually respond to it.


It is also worth noting that Intel CPUs have 3 new instructions 
for working with large integers:


MULX and ADCX/ADOX.

http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/ia-large-integer-arithmetic-paper.html

So there is no reason to not go for it IMO.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Matthias Bentrup via Digitalmars-d

On Monday, 24 November 2014 at 17:55:06 UTC, Ola Fosheim Grøstad
wrote:
On Monday, 24 November 2014 at 17:12:31 UTC, Matthias Bentrup 
wrote:

Overflow is part of modular arithmetic. However, there is no
signed and unsigned modular arithmetic, or, more precisely, 
they are the same.


Would you say that a phase that goes from 0…2pi overflows? Do 
polar coordinates overflow once every turn?




No, sin and cos are periodic functions, but that doesn't mean
their arguments are modular. sin 4pi is well defined by e.g. the
taylor expansion of sin without any modular arithmetic at all.

I'd say overflow/underflow means that the result is wrong. 
(Carry is not overflow per se).




There is no right or wrong in Mathematics, only true and false.
The result of modular addition with overflow is not wrong, it is
just different than the result of integer addition.


Or you can use some other order preserving arithmetic (e.g.
saturating to min/max values), but that breaks the arithmetic
laws.


I don't think it breaks them, but I think a system language 
would be better off by having explicit operators for 
alternative edge-case handling on a bit-fiddling type. E.g.:


a + b as regular addition
a (+) b as modulo arithmetic addition
a [+] b as clamped (saturating) addition

The bad behaviour of C-like languages is the implicit coercion 
to/from a bit-fiddling type. The bit-fiddling should be 
contained in an expression where the programmer, by choosing the 
type, says "I am gonna do tricky bit hacks here". Just casting 
to uint does not convey that message in a clear manner.




Agreed, though I don't like the explosion of new operators. I'd
prefer the C# syntax like check(expression), wrap(expression),
saturate(expression).
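
The three behaviours named above can be sketched as ordinary D functions for a single addition. The names wrapAdd, saturateAdd and checkAdd are hypothetical, not existing D built-ins or Phobos functions, and they handle only int addition rather than whole expressions; overflow detection relies on druntime's core.checkedint.adds.

```d
import core.checkedint : adds;

// Modular addition: D's ordinary + on int already wraps.
int wrapAdd(int a, int b)
{
    return a + b;
}

// Clamped addition: pins the result to int.max or int.min on overflow.
int saturateAdd(int a, int b)
{
    bool overflow;
    int r = adds(a, b, overflow);
    if (!overflow) return r;
    return b > 0 ? int.max : int.min; // overflow direction follows the sign of b
}

// Checked addition: reports overflow instead of returning a wrong value.
int checkAdd(int a, int b)
{
    bool overflow;
    int r = adds(a, b, overflow);
    if (overflow) throw new Exception("integer overflow in checkAdd");
    return r;
}

unittest
{
    assert(wrapAdd(int.max, 1) == int.min);
    assert(saturateAdd(int.max, 1) == int.max);
    assert(saturateAdd(int.min, -1) == int.min);
    assert(checkAdd(1, 2) == 3);
}
```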

A solid solution is to provide «As if Infinitely 
Ranged Integer Model» where the compiler figures out how 
large integers are needed for computation and then does 
overflow detection when you truncate for storage:


http://resources.sei.cmu.edu/library/asset-view.cfm?assetid=9019


You could just as well use a library like GMP.


I think the point with having compiler support is to retain
most optimizations. The compiler selects the most efficient
representation based on the needed headroom and makes sure that
overflow is recorded so that you can eventually respond to it.
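
A hand-written approximation of that idea, as a sketch only: the intermediate result is kept in a wider type and the range check happens once, when the value is truncated for storage. The helper names storeAsInt and average are assumptions for this example; a real AIR implementation would pick the headroom automatically.

```d
import std.conv : to, ConvOverflowException;
import std.stdio : writeln;

/// Truncate for storage; std.conv.to throws ConvOverflowException
/// when the value is outside the range of int.
int storeAsInt(long wide)
{
    return to!int(wide);
}

/// The sum of two ints may exceed int.max, but it always fits in a long,
/// so no check is needed until the result is stored.
int average(int a, int b)
{
    immutable long sum = cast(long) a + b;
    return storeAsInt(sum / 2);
}

void main()
{
    writeln(average(int.max, int.max)); // 2147483647

    try
    {
        storeAsInt(cast(long) int.max + 1);
    }
    catch (ConvOverflowException e)
    {
        writeln("overflow detected at truncation");
    }
}
```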


If you couple AIR with constrained integer types, which Pascal
and Ada have, then it can be very efficient in many cases.


And can fail spectacularly in others. The compiler always has to
prepare for the worst case, i.e. the largest integer size
possible, while in practice you may need that only for a few
extreme cases.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread via Digitalmars-d
On Monday, 24 November 2014 at 19:06:35 UTC, Matthias Bentrup 
wrote:

There is no right or wrong in Mathematics, only true and false.
The result of modular addition with overflow is not wrong, it is
just different than the result of integer addition.


I think we are talking past each other. In my view the term 
overflow has nothing to do with mathematics, overflow is a 
signal from the ALU that the computation is incorrect e.g. not in 
accordance with the intended type.



Agreed, though I don't like the explosion of new operators. I'd
prefer the C# syntax like check(expression), wrap(expression),
saturate(expression).


Yep, that is another way to do it. What is preferable probably 
varies from case to case.



And can fail spectacularly in others. The compiler always has to
prepare for the worst case, i.e. the largest integer size
possible, while in practice you may need that only for a few
extreme cases.


In some loops it probably can get tricky to get it right without 
help from the programmer. I believe some languages allow you to 
annotate loops with an upper boundary to help the semantic 
analysis, but you could also add more frequent overflow checks on 
request?


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Walter Bright via Digitalmars-d

On 11/24/2014 2:20 AM, Don wrote:

I believe I do understand the problem. As a practical matter, overflow checks
are not going to be added for performance reasons.


The performance overhead would be practically zero. All we would need to do, is
restrict array slices such that the length cannot exceed ssize_t.max.

This can only happen in the case where the element type has a size of 1, and
only in the case of slicing a pointer, concatenation, and memory allocation.


(length1 + length2) / 2



In exchange, 99% of uses of unsigned would disappear from D code, and with it, a
whole category of bugs.


You're not proposing changing size_t, so I believe this statement is incorrect.



Also, in principle, uint-uint can generate a runtime check for underflow (i.e.
the carry flag).


No it cannot. The compiler does not have enough information to know if the value
is intended to be positive integer, or an unsigned. That information is lost
from the type system.

Eg from C, wrapping of an unsigned type is not an error. It is perfectly defined
behaviour. With signed types, it's undefined behaviour.


I know it's not an error. It can be defined to be an error, and the compiler can 
insert a runtime check. (I'm not proposing this, just saying it can be done.)
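
To make "define it to be an error and insert a runtime check" concrete, here is a minimal sketch of what such a check amounts to for uint - uint. checkedSub is a hypothetical helper written for this example, not something the compiler or druntime provides.

```d
import std.stdio : writeln;

// Hypothetical helper: spells out the check a compiler could insert
// if wrapping of uint - uint were defined to be an error.
uint checkedSub(uint a, uint b)
{
    if (a < b) // a borrow would occur, i.e. the true result is negative
        throw new Exception("uint subtraction underflow");
    return a - b;
}

void main()
{
    writeln(checkedSub(5u, 3u)); // 2
    try
    {
        checkedSub(3u, 5u);
    }
    catch (Exception e)
    {
        writeln("caught: ", e.msg); // caught: uint subtraction underflow
    }
}
```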




To make this clear: I am not proposing that size_t should be changed.
I am proposing that .length returns a signed type, which for array slices is
guaranteed to never be negative.


There'll be mass confusion if .length is not the same type as .sizeof



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread FrankLike via Digitalmars-d
On Monday, 24 November 2014 at 19:06:35 UTC, Matthias Bentrup 
wrote:

Agreed, though I don't like the explosion of new operators. I'd
prefer the C# syntax like check(expression), wrap(expression),
saturate(expression).


You may like this:
---small test 1--
import std.stdio;

template subuint(T1, T2)
{
    auto subuint(T1 x, T2 y, ref bool overflow)
    {
        if (is(T1 == uint) && is(T2 == uint))
        {
            if (x < y)
            {
                return cast(int)(x - y);
            }
            else
            {
                return x - y;
            }
        }
        else if (is(T1 == uint) && is(T2 == int))
        {
            writeln("enter here1");
            if (x < y)
            {
                writeln("enter here2");
                return cast(int)(x - y);
            }
            else
            {
                writeln("enter here3");
                return x - y;
            }
        }
        else if (is(T1 == int) && is(T2 == uint))
        {
            if (x < y)
            {
                return cast(int)(x - y);
            }
            else
            {
                return x - y;
            }
        }
        else if (is(T1 == int) && is(T2 == int))
        {
            return x - y;
        }
    }
}

unittest
{
    bool overflow;
    assert(subuint(3, 2, overflow) == 1);
    assert(!overflow);
    assert(subuint(3, 4, overflow) == -1);
    assert(!overflow);
    assert(subuint(uint.max, 1, overflow) == uint.max - 1);
    writeln("typeid = ", typeid(subuint(uint.max, 1, overflow)));
    assert(!overflow);
    assert(subuint(1, 1, overflow) == uint.min);
    assert(!overflow);
    assert(subuint(0, 1, overflow) == -1);
    assert(!overflow);
    assert(subuint(uint.max - 1, uint.max, overflow) == -1);
    assert(!overflow);
    assert(subuint(0, 0, overflow) == 0);
    assert(!overflow);

    assert(subuint(3, -2, overflow) == 5);
    assert(!overflow);
    assert(subuint(uint.max, -1, overflow) == uint.max + 1);
    assert(!overflow);
    assert(subuint(1, -1, overflow) == 2);
    assert(!overflow);
    assert(subuint(0, -1, overflow) == 1);
    assert(!overflow);
    assert(subuint(uint.max - 1, int.max, overflow) == int.max);
    assert(!overflow);
    assert(subuint(0, 0, overflow) == 0);
    assert(!overflow);
    assert(subuint(-2, 1, overflow) == -3);
    assert(!overflow);
}


void main()
{
    uint a = 3;
    int b = 4;
    int c = 2;
    writeln("c -a =", c - a);
    writeln("a -b =", a - b);
    writeln();
    bool overflow;
    writeln("typeid = ", typeid(subuint(a, b, overflow)),
            ", a-b=", subuint(a, b, overflow));

    writeln("ok");
}

---here is a simpler one, but it gives an error--

import std.stdio;

template subuint(T1, T2)
{
    auto subuint(T1 x, T2 y, ref bool overflow)
    {
        if (is(T1 == int) && is(T2 == int))
        {
            return x - y;
        }
        else if ((is(T1 == uint) && is(T2 == int)) | (is(T1 == uint) &&
                 is(T2 == uint)) | (is(T1 == int) && is(T2 == uint)))
        {
            if (x < y)
            {
                return cast(int)(x - y);
            }
            else
            {
                return x - y;
            }
        }
    }
}


void main()
{
    uint a = 3;
    int b = 4;
    int c = 2;
    writeln("c -a =", c - a);
    writeln("a -b =", a - b);
    writeln();
    bool overflow;
    writeln("typeid = ", typeid(subuint(a, b, overflow)),
            ", a-b=", subuint(a, b, overflow));

    writeln("ok");
}



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-24 Thread Don via Digitalmars-d
On Monday, 24 November 2014 at 16:00:53 UTC, ketmar via 
Digitalmars-d wrote:

On Mon, 24 Nov 2014 12:54:58 +
Don via Digitalmars-d digitalmars-d@puremagic.com wrote:

In D, 1u - 2u > 0u. This is defined behaviour, not an
overflow.

this *is* overflow. D just has overflow result defined.


No, that is not overflow. That is a carry. Overflow is when the 
sign bit changes.
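
One way to make the distinction concrete is to compute both conditions by hand for a 32-bit subtraction. This is only a sketch in the hardware-flag sense of the words; SubFlags and subFlags are names invented for this example.

```d
import std.stdio : writeln;

/// Flags for the 32-bit subtraction a - b, computed by hand.
struct SubFlags
{
    bool borrow;   // the unsigned result wrapped (the carry case)
    bool overflow; // the signed result wrapped (the sign bit is wrong)
}

SubFlags subFlags(uint a, uint b)
{
    immutable uint r = a - b; // wraps modulo 2^^32, which is defined in D
    SubFlags f;
    f.borrow = a < b;
    // Signed overflow: the operands have different signs and the sign of
    // the result differs from the sign of a.
    f.overflow = (((a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}

void main()
{
    // 1 - 2: borrow, but no signed overflow (-1 is representable).
    writeln(subFlags(1u, 2u));
    // int.min - 1: no borrow, but the signed result is not representable.
    writeln(subFlags(cast(uint) int.min, 1u));
}
```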


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-23 Thread via Digitalmars-d

On Friday, 21 November 2014 at 16:12:19 UTC, Don wrote:
It is not uint.max. It is natint.max. And yes, that's an 
overflow condition.


Exactly the same as when you do int.max + int.max.


This depends on how you look at it. From a formal perspective 
assume zero as the base, then a predecessor function P and a 
successor function S.


Then you have 0u  - 1u + 2u == SSP0

Then you do a normalization where you cancel out successor and 
predecessor pairs and you get the result S0 == 1u. On the other 
hand if you end up with P0 the result should be bottom (error).


In binary representation you need to collect the carry over N 
terms, so you need an extra accumulator which you can get by 
extending the precision by ~ log2(N) bits. Then do a masking of 
the most significant bits to check for over/underflow.


Advanced for a compiler, but possible.

The type that I think would be useful, would be a number in the 
range 0..int.max.

It has no risk of underflow.


Yep, from a correctness perspective length should be an integer with
a >= 0 constraint. Ada also acknowledges this by having unsigned
integers being 31 bits like you suggest. And now that most CPUs
go 64 bit then a 63 bit integer would be the right choice for
array length.



unsigned types are not a subset of mathematical integers.

They do not just have a restricted range. They have different 
semantics.



The question of what happens when a range is exceeded, is a 
different question.


There is really no difference between signed and unsigned in 
principle since you only have an offset, but in practical 
programming 64 bits signed and 63 bits unsigned is enough for 
most situations with the advantage that you have the same bit 
representation with only one interpretation.


What the semantics are depends on how you define the operators,
right? So you can have both modular arithmetic and non-modular in 
the same type by providing more operators. This is after all how 
the hardware does it.


Contrary to what is claimed by others in this thread the general 
hardware ALU does not default to modular arithmetic, it preserves 
resolution:


32bit + 32bit == 33bit result
32bit * 32bit == 64bit result

Modular arithmetic is an artifact of the language, not the 
hardware.
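
For what it is worth, the difference between the language-level result and the full-width result can be shown in D by widening one operand before the operation. This is a sketch; fullSum and fullProduct are hypothetical helper names.

```d
import std.stdio : writeln;

/// 32bit + 32bit with the full (up to 33-bit) result kept.
ulong fullSum(uint a, uint b)
{
    return cast(ulong) a + b;
}

/// 32bit * 32bit with the full (up to 64-bit) result kept.
ulong fullProduct(uint a, uint b)
{
    return cast(ulong) a * b;
}

void main()
{
    uint a = uint.max;
    uint b = uint.max;

    // uint op uint is evaluated as uint in D, so the high bits are dropped.
    uint wrappedSum = a + b;
    uint wrappedProduct = a * b;
    writeln(wrappedSum);     // 4294967294
    writeln(wrappedProduct); // 1

    writeln(fullSum(a, b));     // 8589934590
    writeln(fullProduct(a, b)); // 18446744065119617025
}
```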


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-22 Thread via Digitalmars-d

On Friday, 21 November 2014 at 21:44:25 UTC, Marco Leise wrote:

On Wed, 19 Nov 2014 18:20:24 +,
Marc Schütz schue...@gmx.net wrote:

I'd say length being unsigned is fine. The real mistake is 
that the difference between two unsigned values isn't signed, 
which would be the most correct behaviour.


Now take my position where I explicitly write code relying
on the fact that `bigger - smaller` yields correct results.

uint bigger = uint.max;
uint smaller = 2;
if (bigger > smaller)
{
    auto added = bigger - smaller;
    // Now 'added' is an int with the value -3 !
}
else
{
    auto removed = smaller - bigger;
}

In fact checking which value is larger is the only way to
handle the full result range of subtracting two machine
integers which is ~2 times larger than what the original type
can handle:

T.min - T.max .. T.max - T.min

This is one reason why I'd like to just keep working with
the original unsigned type, but split the range around the
positive/negative pivot with an if-else.

Implicit conversion of unsigned subtractions to signed values
would make the above code unnecessarily hard.


Yes, that's true. However, I doubt that this is a common case. 
I'd say that when two values are to be subtracted (signed or 
unsigned), and there's no knowledge about which one is larger, 
it's more useful to get a signed difference. This should be 
correct in most cases, because I believe it is more likely that 
the two values are close to each other. It only becomes a problem
when they're on opposite sides of the value range.


Unfortunately, no matter how you turn it, there will always be 
corner cases that a) will be wrong and b) the compiler will allow 
silently. So the question becomes one of preferences between 
usefulness for common use cases, ease of detection of errors, and 
compatibility.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-22 Thread ketmar via Digitalmars-d
On Sat, 22 Nov 2014 03:09:59 +
deadalnix via Digitalmars-d digitalmars-d@puremagic.com wrote:

On Friday, 21 November 2014 at 09:47:32 UTC, Stefan Koch wrote:
On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright
wrote:
I thought everyone hated foreach_reverse!

I dislike foreach_reverse;
1. it's a keyword with an underscore in it;
2. complicates implementation of foreach and parsing.
3. key_word with under_score

These are compiler implementation issues and all solvable. People
don't give a shit about how the compiler works and rightly so. The
language is made to fit the needs of the user, not the needs of the
implementer.

`foreach (auto n; ...)` anyone? and `foreach (; ...)`? nope. cosmetic
changes aren't needed. this is clearly made for the implementer.

luckily, it's not me who will try to explain to newcomers why they have a
new variable declaration in `foreach` which looks like variable reusing,
why they must invent a new variable name for each nested `foreach` and so
on.

but please, don't tell me about "solvable" -- all this is "solvable" only
in the sense of "make your own fork and fix it". ah, and support your fork.
and don't forget that your code cannot be used with the vanilla compiler
anymore. ok for me, but for others?




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-22 Thread via Digitalmars-d

On Saturday, 22 November 2014 at 11:12:06 UTC, Marc Schütz wrote:
I'd say that when two values are to be subtracted (signed or 
unsigned), and there's no knowledge about which one is larger, 
it's more useful to get a signed difference. This should be 
correct in most cases, because I believe it is more likely that 
the two values are close to each other. It only becomes a
problem when they're on opposite sides of the value range.


Not being able to decrement unsigned types would be a disaster. 
Think about unsigned integers as an enumeration. You should be 
able to both take the predecessor and successor of the value.


This is also in line with how you formalize natural numbers in 
math:


0 == zero
1 == successor(zero)
2 == successor(successor(zero))

This is basically a unary representation of natural numbers and 
it allows both addition and subtraction. Unsigned int should be 
considered a binary representation of the same capped at max 
value.


Bearophile has given a sensible solution a long time ago, make 
type coercion explicit and add a weaker coercion operator. That 
operator should prevent senseless type coercion, but allow 
system-level-coercion over signedness. Problem fixed.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-22 Thread Nick Treleaven via Digitalmars-d

On 20/11/2014 08:02, Walter Bright wrote:

On 11/19/2014 5:03 PM, H. S. Teoh via Digitalmars-d wrote:

If this kind of unsafe mixing wasn't allowed, or required explicit casts
(to signify "yes I know what I'm doing and I'm prepared to face the
consequences"), I suspect that bearophile would be much happier about
this issue. ;-)


Explicit casts are worse than the problem - they can easily cause bugs.


I recently explained to you that explicit casts are easily avoided using 
`import std.conv: signed, unsigned;`.
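
A short sketch of how those two functions read in practice. The helper lengthMinus is hypothetical; signed and unsigned are the std.conv functions mentioned above, which convert to the signed or unsigned type of the same width without a cast.

```d
import std.conv : signed, unsigned;
import std.stdio : writeln;

/// Hypothetical helper: a length minus a signed offset, as a signed value.
ptrdiff_t lengthMinus(const int[] a, int offset)
{
    // a.length is size_t; signed() yields the signed type of the same width.
    return signed(a.length) - offset;
}

void main()
{
    int[] a = [1, 2, 3];
    int offset = 5;

    writeln(lengthMinus(a, offset)); // -2

    // unsigned() goes the other way and documents the intent.
    auto u = unsigned(offset);
    static assert(is(typeof(u) == uint));
    writeln(a.length < u); // true: both sides are unsigned
}
```

Unlike a cast, these cannot accidentally change the width or convert to an unrelated type.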


D compilers badly need a way to detect bug-prone sign mixing. It is no 
exaggeration to say D is worse than C compilers in this regard. Usually 
we discuss how to compete with modern languages; here we are not even 
keeping up with C.


It's disappointing this issue was pre-approved last year, but now 
neither you nor even Andrei seem particularly cognizant of the need to 
resolve it. If you belittle the problem, you discourage others from 
trying to solve it.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-22 Thread Marco Leise via Digitalmars-d
On Fri, 21 Nov 2014 17:50:11 -0800,
Andrei Alexandrescu seewebsiteforem...@erdani.org wrote:

I agree, though foreach (i; length.iota.retro) is no slouch either! --
Andrei

Yes, no, well, it feels like too much science for a loop with
a decrementing index instead of an incrementing, no matter how
few parentheses are used. It is not the place where I would
want to introduce functional programming to someone who never
saw D code before.
That said, I'd also be uncertain if compilers transparently
convert this to the equivalent of a reverse loop.

-- 
Marco
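
For comparison, the two forms under discussion side by side, as a sketch. The helper names viaRetro and viaForeachReverse are made up for this example; whether a compiler turns the range version into an equally tight loop is not something this code shows.

```d
import std.array : array;
import std.range : iota, retro;
import std.stdio : writeln;

/// Indices n-1 .. 0 via the range-based form.
size_t[] viaRetro(size_t n)
{
    return iota(n).retro.array;
}

/// Indices n-1 .. 0 via the built-in statement.
size_t[] viaForeachReverse(size_t n)
{
    size_t[] result;
    foreach_reverse (i; 0 .. n)
        result ~= i;
    return result;
}

void main()
{
    writeln(viaRetro(3));          // [2, 1, 0]
    writeln(viaForeachReverse(3)); // [2, 1, 0]
}
```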



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

FrankLike  wrote in message news:musbvhmuuhhvetovx...@forum.dlang.org...


If you compile the dfl library for 64 bit, you will get this error:

core.sys.windows.windows.WaitForMultipleObjects(uint
nCount,void** lpHandles,) is not callable using argument
types(ulong,void**,...)

The 'WaitForMultipleObjects' function is in
dmd2/src/druntime/src/core/sys/windows/windows.d

The first argument is dfl's value; it comes from a 'length',
whose type is size_t, which is now 'ulong' on 64 bit.

So druntime must stay consistent with phobos for size_t.
Or should size_t be changed to int, to stay consistent with the Windows API?


I suggest using WaitForMultipleObjects(to!uint(xxx.length), ...) as it will 
both convert and check for overflow IIRC.  I'm just happy D gives you an 
error here instead of silently truncating the value. 
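
A minimal sketch of that conversion, kept independent of the Windows API so it runs anywhere. toCount is a hypothetical wrapper; std.conv.to is the real function and throws ConvOverflowException when the value does not fit in the target type.

```d
import std.conv : to, ConvOverflowException;
import std.stdio : writeln;

/// Hypothetical wrapper: narrow a length to the uint that a 32-bit
/// count parameter expects, failing loudly instead of truncating.
uint toCount(size_t length)
{
    return to!uint(length);
}

void main()
{
    void*[] handles = new void*[4];
    writeln(toCount(handles.length)); // 4

    // Only a 64-bit size_t can hold a value above uint.max.
    static if (size_t.sizeof > uint.sizeof)
    {
        try
        {
            toCount(cast(size_t) uint.max + 1);
        }
        catch (ConvOverflowException e)
        {
            writeln("rejected: ", e.msg);
        }
    }
}
```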



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d
H. S. Teoh via Digitalmars-d  wrote in message 
news:mailman.2156.1416499421.9932.digitalmar...@puremagic.com...



By that logic, using an int to represent an integer is also using the
incorrect type, because a signed type is *also* subject to modulo 2^^n
arithmetic -- just a different form of it where the most negative value
wraps around to the most positive values.  Fixed-width integers in
computing are NOT the same thing as unrestricted integers in
mathematics. No matter how you try to rationalize it, as long as you use
hardware fixed-width integers, you're dealing with modulo arithmetic in
one form or another. Pretending you're not, is the real source of said
subtle bugs.


While what you've said is true, the typical range of values stored in an 
integral type is much more likely to cause unsigned wrapping than signed 
overflow.  So to get the desired 'integer-like' behaviour from D's integral 
types, you need to care about magnitude for signed types, or both magnitude 
and ordering for unsigned types.


eg 'a < b' becoming 'a - b < 0' is valid for integers, and small ints, but
not valid for small uints unless a > b.  You will always have to care about
the imperfect representation of mathematical integers, but with unsigned
types you have an extra rule that is much more likely to affect typical
code.
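
The rewrite in question can be tried directly. A small sketch, where lessViaDifference is a hypothetical template that evaluates the difference in the operand type itself:

```d
import std.stdio : writeln;

/// Decides "a < b" by testing the sign of a - b, computed in type T.
bool lessViaDifference(T)(T a, T b)
{
    immutable T diff = cast(T)(a - b);
    return diff < 0;
}

void main()
{
    writeln(lessViaDifference!int(2, 3));  // true, same as 2 < 3
    writeln(lessViaDifference!uint(2, 3)); // false: 2u - 3u wraps to uint.max
    writeln(lessViaDifference!uint(3, 2)); // false, same as 3 < 2
}
```

For small values the int version agrees with the plain comparison in both directions, while the uint version can never report true.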



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread bearophile via Digitalmars-d

Walter Bright:

All you're doing is trading 0 crossing for 0x7FFF crossing 
issues, and pretending the problems have gone away.


I'm not pretending anything. I am asking which of the two solutions
leads to the fewest problems/bugs in practical programming. So far
I've seen the unsigned solution and I've seen it's highly
bug-prone.



BTW, granted the 0x7FFF problems exhibit the bugs less
often, but paradoxically this can make the bug worse, because
then it only gets found much, much later in supposedly tested &
robust code.


Is this true? Do you have some examples of buggy code?

Bye,
bearophile


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

Walter Bright  wrote in message news:m4mggi$e1h$1...@digitalmars.com...

BTW, granted the 0x7FFF problems exhibit the bugs less often, but
paradoxically this can make the bug worse, because then it only gets found
much, much later in supposedly tested & robust code.

0 crossing bugs tend to show up much sooner, and often immediately.


I don't think I have ever written a D program where an array had more than
2^^31 elements.  And I'm sure I've never had it where 2^^31-1 wasn't enough
and yet 2^^32-1 was.


Zero, on the other hand, is usually quite near the typical array lengths and 
differences in lengths. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d
Andrei Alexandrescu  wrote in message 
news:m4l711$1t39$1...@digitalmars.com...


The most difficult pattern that comes to mind is the long arrow operator 
seen in backward iteration:


void fun(int[] a)
{
    for (auto i = a.length; i --> 0; )
    {
        // use i
    }
}


Over the years most of my unsigned-related bugs have been from screwing up 
various loop conditions.  Thankfully D solves this perfectly with:


void fun(int[] a)
{
    foreach_reverse (i; 0 .. a.length)
    {
    }
}

So I never have to write those again. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

bearophile  wrote in message news:lkcltlokangpzzdzz...@forum.dlang.org...

From my experience in coding in D they are far more unlikely than 
sign-related bugs of array lengths.


Here's a simple program to calculate the relative size of two files, that 
will not work correctly with unsigned lengths.


module sizediff;

import std.file;
import std.stdio;

void main(string[] args)
{
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
}

The two ways this can fail (that I want to highlight) are:
1. If either file is too large to fit in a size_t the result will (probably)
   be wrong
2. If file2 is bigger than file1 the result will be wrong

If length was signed, problem 2 would not exist, and problem 1 would be more 
likely to occur.  I think it's clear that signed lengths would work for more 
possible realistic inputs.


While this is just an example, a similar pattern occurs in real code 
whenever array/range lengths are subtracted. 
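
For illustration, here is one way to get a signed difference out of the existing unsigned lengths, as a sketch only. lengthDiff is a hypothetical helper, and it assumes both lengths are well below long.max so the cast cannot change the value.

```d
import std.stdio : writeln;

/// Hypothetical helper: signed difference l1 - l2 of two lengths.
/// Assumes the magnitude of the difference fits in a long.
long lengthDiff(size_t l1, size_t l2)
{
    // Compare first, subtract second, so the unsigned subtraction never wraps.
    return l1 >= l2 ? cast(long)(l1 - l2) : -cast(long)(l2 - l1);
}

void main()
{
    size_t l1 = 3;
    size_t l2 = 5;
    writeln(l1 - l2 == size_t.max - 1); // true: the plain subtraction wraps
    writeln(lengthDiff(l1, l2));        // -2
}
```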



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Walter Bright via Digitalmars-d

On 11/21/2014 12:10 AM, Daniel Murphy wrote:

Walter Bright  wrote in message news:m4mggi$e1h$1...@digitalmars.com...


BTW, granted the 0x7FFF problems exhibit the bugs less often, but
paradoxically this can make the bug worse, because then it only gets found
much, much later in supposedly tested & robust code.

0 crossing bugs tend to show up much sooner, and often immediately.


I don't think I have ever written a D program where an array had more than 2^^31
elements.  And I'm sure I've never had it where 2^^31-1 wasn't enough and yet
2^^32-1 was.


There turned out to be such a bug in one of the examples in Programming Pearls 
that remained undetected for many years:


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
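
The bug described in that article is the midpoint computation in binary search. A minimal D sketch of the failing expression and the usual fix follows; the helper names midNaive and midSafe are invented here, and note that in D the signed wrap is defined behaviour rather than undefined as in C.

```d
import std.stdio : writeln;

/// The classic midpoint: low + high can exceed int.max and wrap.
int midNaive(int low, int high)
{
    return (low + high) / 2;
}

/// The usual fix: high - low cannot overflow when 0 <= low <= high.
int midSafe(int low, int high)
{
    return low + (high - low) / 2;
}

/// Returns the index of key in the sorted array a, or -1 if it is absent.
int binarySearch(const int[] a, int key)
{
    int low = 0;
    int high = cast(int) a.length - 1;
    while (low <= high)
    {
        immutable int mid = midSafe(low, high);
        if (a[mid] < key) low = mid + 1;
        else if (a[mid] > key) high = mid - 1;
        else return mid;
    }
    return -1;
}

void main()
{
    writeln(midNaive(1_500_000_000, 2_000_000_000)); // -397483648
    writeln(midSafe(1_500_000_000, 2_000_000_000));  // 1750000000
    writeln(binarySearch([1, 3, 5, 7, 9], 7));       // 3
}
```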



Zero, on the other hand, is usually quite near the typical array lengths and
differences in lengths.


That's true, that's why they are detected sooner, when it is less costly to fix 
them.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Walter Bright via Digitalmars-d

On 11/21/2014 12:31 AM, Daniel Murphy wrote:

Here's a simple program to calculate the relative size of two files, that will
not work correctly with unsigned lengths.

module sizediff;

import std.file;
import std.stdio;

void main(string[] args)
{
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
}

The two ways this can fail (that I want to highlight) are:
1. If either file is too large to fit in a size_t the result will (probably) be
wrong


Presumably read() will throw if the size is larger than it can handle. If it 
doesn't, this code is not buggy, but read() is.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Walter Bright via Digitalmars-d

On 11/21/2014 12:10 AM, bearophile wrote:

Walter Bright:


All you're doing is trading 0 crossing for 0x7FFF crossing issues, and
pretending the problems have gone away.


I'm not pretending anything. I am asking which of the two solutions leads
to the fewest problems/bugs in practical programming. So far I've seen the
unsigned solution and I've seen it's highly bug-prone.


I'm suggesting that having a bug and detecting the bug are two different things. 
The 0-crossing bug is easier to detect, but that doesn't mean that shifting the 
problem to 0x7FFF crossing bugs is making the bug count less.




BTW, granted the 0x7FFF problems exhibit the bugs less often, but
paradoxically this can make the bug worse, because then it only gets found
much, much later in supposedly tested & robust code.


Is this true? Do you have some examples of buggy code?


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

Walter Bright  wrote in message news:m4mu0q$sc5$1...@digitalmars.com...

Zero, on the other hand, is usually quite near the typical array lengths
and differences in lengths.

That's true, that's why they are detected sooner, when it is less costly 
to fix them.


It would be even less costly if they weren't possible. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread bearophile via Digitalmars-d

Daniel Murphy:


void fun(int[] a)
{
    foreach_reverse (i; 0 .. a.length)
    {
    }
}


Better (it's a workaround for a D design flaw that we're 
unwilling to fix):


foreach_reverse (immutable i; 0 .. a.length)

Bye,
bearophile


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

Walter Bright  wrote in message news:m4mua1$shh$1...@digitalmars.com...

Presumably read() will throw if the size is larger than it can handle. If 
it doesn't, this code is not buggy, but read() is.


You're right, but that's really not the point. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Matthias Bentrup via Digitalmars-d

On Friday, 21 November 2014 at 08:54:40 UTC, Daniel Murphy wrote:
Walter Bright  wrote in message 
news:m4mu0q$sc5$1...@digitalmars.com...


Zero, on the other hand, is usually quite near the typical
array lengths and differences in lengths.

That's true, that's why they are detected sooner, when it is 
less costly to fix them.


It would be even less costly if they weren't possible.


C# has the checked and unchecked operators
(http://msdn.microsoft.com/en-us/library/khy08726.aspx), which
allow the programmer to specify if overflows should wrap or fail
within an arithmetic expression. That could be a useful addition
to D.


However, a language that doesn't have unsigned integers and 
modular arithmetic is IMHO not a system language, because that is 
how most hardware works internally.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Frank Like via Digitalmars-d
Here's a simple program to calculate the relative size of two 
files, that will not work correctly with unsigned lengths.


module sizediff;

import std.file;
import std.stdio;

void main(string[] args)
{
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
}


This will be ok:

writeln("Difference: ", (l1 > l2) ? (l1 - l2) : (l2 - l1));

If the type of 'length' is not 'size_t' but 'int' or 'long', it
will be ok like this:

import std.math;
writeln("Difference: ", abs(l1 - l2));

For a mathematical difference between unsigned values, the size
comparison should be done first, on the right-hand side of the
assignment.

If this work is done in druntime, D will be a real system language.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Kagamin via Digitalmars-d

On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote:
BTW, granted the 0x7FFF problems exhibit the bugs less
often, but paradoxically this can make the bug worse, because
then it only gets found much, much later in supposedly tested &
robust code.


0 crossing bugs tend to show up much sooner, and often 
immediately.


Wrong. Unsigned integers can hold bigger values, so it takes more
to make them overflow, hence the bug is harder to detect.



http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
"Specifically, it fails if the sum of low and high is greater
than the maximum positive int value"


So it fails sooner for signed integers than for unsigned integers.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Kagamin via Digitalmars-d
On Thursday, 20 November 2014 at 21:27:11 UTC, Walter Bright 
wrote:
If that is changed to a signed type, then you'll have a 
same-only-different set of subtle bugs


If people use signed length with unsigned integers, the length
will implicitly convert to unsigned and behave like now, no
difference.


plus you'll break the intuition about these things from 
everyone who has used C/C++ a lot.


C/C++ programmers disagree: 
http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/

Why do you think they can't handle signed integers?


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Frank Like via Digitalmars-d

For a mathematical difference between unsigned values, the size
comparison should be done first, on the right-hand side of the
assignment,

such as:  l3 = (l1 > l2) ? (l1 - l2) : (l2 - l1);

If this work is done in druntime, such small bugs will be rare. D will
be a real system language.


Frank


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Kagamin via Digitalmars-d
On Thursday, 20 November 2014 at 16:03:41 UTC, H. S. Teoh via 
Digitalmars-d wrote:
By that logic, using an int to represent an integer is also
using the incorrect type, because a signed type is *also* subject
to modulo 2^^n arithmetic -- just a different form of it where
the most negative value wraps around to the most positive values.


The type is chosen at design time so that it's unlikely to 
overflow for the particular scenario. Why would you want the 
count of objects to reset at some point when counting objects? 
Wrapping of unsigned integers has valid usage for e.g. hash 
functions, but there they are used as bit arrays, not proper 
numbers, and arithmetic operators are used for bit shuffling, not 
for computing some numbers.
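
A typical example of that kind of use is the 32-bit FNV-1a hash, sketched below. The multiplication is expected to wrap modulo 2^^32; the uint here is a bag of bits rather than a count of anything.

```d
import std.stdio : writefln;

/// 32-bit FNV-1a hash. Wrapping of the multiplication is intentional.
uint fnv1a(const(ubyte)[] data)
{
    uint hash = 2_166_136_261u; // FNV offset basis (32-bit)
    foreach (b; data)
    {
        hash ^= b;
        hash *= 16_777_619u;    // FNV prime; wraps modulo 2^^32 by design
    }
    return hash;
}

void main()
{
    writefln("%08x", fnv1a(cast(const(ubyte)[]) "a")); // e40c292c
}
```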


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Walter Bright via Digitalmars-d

On 11/21/2014 12:16 AM, Daniel Murphy wrote:

Over the years most of my unsigned-related bugs have been from screwing up
various loop conditions.  Thankfully D solves this perfectly with:

void fun(int[] a)
{
    foreach_reverse (i; 0 .. a.length)
    {
    }
}

So I never have to write those again.


I thought everyone hated foreach_reverse!

But, yeah, foreach and ranges+algorithms have virtually eliminated a large 
category of looping bugs.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Walter Bright via Digitalmars-d

On 11/21/2014 1:01 AM, Matthias Bentrup wrote:

C# has the checked and unchecked operators
(http://msdn.microsoft.com/en-us/library/khy08726.aspx), which allow the
programmer to specify if overflows should wrap or fail within an arithmetic
expression. That could be a useful addition to D.


D already has them:

https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d
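
A short usage sketch of that module. adds, subu and mulu are functions from core.checkedint that take a ref bool flag; the flag is sticky, so it is reset by hand between independent checks. The wrappers addOverflows and subWraps are hypothetical names for this example.

```d
import core.checkedint : adds, mulu, subu;
import std.stdio : writeln;

/// True if a + b does not fit in an int.
bool addOverflows(int a, int b)
{
    bool overflow = false;
    immutable int r = adds(a, b, overflow);
    return overflow;
}

/// True if a - b does not fit in a uint (a borrow occurred).
bool subWraps(uint a, uint b)
{
    bool overflow = false;
    immutable uint r = subu(a, b, overflow);
    return overflow;
}

void main()
{
    writeln(addOverflows(int.max, 1)); // true
    writeln(subWraps(1u, 2u));         // true
    writeln(subWraps(2u, 1u));         // false

    bool overflow = false;
    immutable uint p = mulu(3u, 4u, overflow);
    writeln(p, " ", overflow);         // 12 false
}
```

These are library functions applied per operation rather than a checked/unchecked block around a whole expression, so the two approaches are not quite the same thing.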



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Kagamin via Digitalmars-d
On Thursday, 20 November 2014 at 16:34:12 UTC, flamencofantasy 
wrote:
My experience is totally the opposite of his. I have been using 
unsigned for lengths, widths, heights for the past 15 years in 
C, C++, C# and more recently in D with great success. I don't 
pretend to be any kind of authority though.


C# doesn't encourage usage of unsigned types and warns that they 
are not CLS-compliant. You're going against established practices 
there. And signed types for numbers works wonders in C# without 
any notable problem and makes reasoning about code easier as you 
don't have to manually check for unsigned conversion bugs 
everywhere.


The article you point to is totally flawed and kinda wasteful 
in terms of having to read it; the very first code snippet is 
obviously buggy.


That's the whole point: mixing signed with unsigned is bug-prone. 
Worse, it's inevitable if you force unsigned types everywhere.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Stefan Koch via Digitalmars-d

On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote:

I thought everyone hated foreach_reverse!


I dislike foreach_reverse;
1. it's a keyword with an underscore in it;
2. complicates implementation of foreach and parsing.
3. key_word with under_score


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

bearophile  wrote in message news:rqyuiioyrrjgggctf...@forum.dlang.org...

Better (it's a workaround for a D design flaw that we're unwilling to 
fix):


foreach_reverse (immutable i; 0 .. a.length)



I know you feel that way, but I'd rather face the non-existent risk of 
accidentally mutating the induction variable than write immutable every 
time. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

On Friday, 21 November 2014 at 09:37:50 UTC, Walter Bright wrote:
 I thought everyone hated foreach_reverse!


Not me.  It's ugly but it gets the job done.  All I have to do is add 
'_reverse' and it just works!


Stefan Koch  wrote in message news:mmvuvkdfnvwezyvtc...@forum.dlang.org...

I dislike foreach_reverse;
1. it's a keyword with an underscore in it;


So what.


2. complicates implementation of foreach and parsing.


The additional complexity is trivial.


3. key_word with under_score


Don't care. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Daniel Murphy via Digitalmars-d

Frank Like  wrote in message news:zhejapfebcvxnzrez...@forum.dlang.org...


If this work is done in druntime, D will be a real system language.


Sure, this is obviously the fundamental thing holding D back from being a 
_real_ system language. 



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread bearophile via Digitalmars-d

Walter Bright:


I thought everyone hated foreach_reverse!


I love it!

Bye,
bearophile


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread bearophile via Digitalmars-d

Daniel Murphy:


foreach_reverse (immutable i; 0 .. a.length)



I know you feel that way, but I'd rather face the non-existent 
risk of accidentally mutating the induction variable than write 
immutable every time.


It's not non-existent :-) (And the right default for a modern 
language is to have immutable on default and mutable on request. 
If D doesn't have this quality, better to add immutable every 
damn time).


Bye,
bearophile


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread FrankLike via Digitalmars-d

On Friday, 21 November 2014 at 09:43:04 UTC, Kagamin wrote:

On Thursday, 20 November 2014 at 16:34:12 UTC, flamencofantasy


C# doesn't encourage usage of unsigned types and warns that 
they are not CLS-compliant. You're going against established 
practices there. And signed types for numbers works wonders in 
C# without any notable problem and makes reasoning about code 
easier as you don't have to manually check for unsigned 
conversion bugs everywhere.




That's the whole point: mixing signed with unsigned is 
bug-prone. Worse, it's inevitable if you force unsigned types 
everywhere.


Right.

Druntime should have a checksize_t.d


Frank


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread FrankLike via Digitalmars-d


Druntime's checkedint.d should be modified:

uint subu(uint x, uint y, ref bool overflow)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}

uint subu(ulong x, ulong y, ref bool overflow)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}


Frank


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread FrankLike via Digitalmars-d

D already has them:

https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d


Druntime's checkedint.d should be modified:

uint subu(uint x, uint y, ref bool overflow)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}

ulong subu(ulong x, ulong y, ref bool overflow)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}


Frank


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread flamencofantasy via Digitalmars-d

On Friday, 21 November 2014 at 09:43:04 UTC, Kagamin wrote:

C# doesn't encourage usage of unsigned types and warns that 
they are not CLS-compliant. You're going against established 
practices there. And signed types for numbers works wonders in 
C# without any notable problem and makes reasoning about code 
easier as you don't have to manually check for unsigned 
conversion bugs everywhere.




I don't want to be CLS compliant! I make very heavy use of unsafe 
code, stackalloc and interop to worry about CLS compliance. 
Actually one of the major reasons I am looking at D for 
production code is so that I don't have to mix and match 
Assembly, C/C++ with C#. I want the best of all worlds in one 
language/runtime :).


Anyways, I believe the discussion is about using unsigned for 
array lengths, not unsigned in general. At this point most people 
seem to express an opinion - including me, and I certainly hope D 
stays as it is when it comes to length of an array. I am not 
convinced in the slightest that signed is the way to go.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Thu, 20 Nov 2014 15:40:39 +
Araq via Digitalmars-d digitalmars-d@puremagic.com wrote:

 Here are some more opinions:
 http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
trying to illustrate something with obviously wrong code is very funny.
the whole article then reduces to "hey, i'm writing bad code, and i can
teach you to do the same!"

won't buy it.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 08:10:55 +
bearophile via Digitalmars-d digitalmars-d@puremagic.com wrote:

  BTW, granted the 0x7FFF problems exhibit the bugs less 
  often, but paradoxically this can make the bug worse, because 
  then it only gets found much, much later in supposedly tested &
  robust code.
 
 Is this true? Do you have some examples of buggy code?
any code which does something like `if (a-b < 0)` is broken. it will
work in most cases, but it is broken. you MUST check values before
subtracting. and if you must do checks anyway, what is the reason for
making length signed?




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 09:23:01 +
Kagamin via Digitalmars-d digitalmars-d@puremagic.com wrote:

 C/C++ programmers disagree: 
 http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/
 Why do you think they can't handle signed integers?
being a C programmer, i disagree that the author of the article is a C
programmer.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Thu, 20 Nov 2014 13:28:37 -0800
Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote:

 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
  What *could* be improved, is the prevention of obvious mistakes in
  *mixing* signed and unsigned types. Right now, D allows code like the
  following with no warning:
 
  uint x;
  int y;
  auto z = x - y;
 
  BTW, this one is the same in essence as an actual bug that I fixed in
  druntime earlier this year, so downplaying it as a mistake people make
  'cos they confound computer math with math math is fallacious.
 
 What about:
 
  uint x;
  auto z = x - 1;
 
 ?
 
here z must be `long`. and for `ulong` compiler must emit error.
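For reference, a short sketch of what the current rules actually produce for the two quoted snippets. The function names are invented for illustration; the explicit widening is one possible workaround, not the language change being argued for here:

```d
// What `auto z = x - y` evaluates to today: the int operand is converted
// to uint, so the result type is uint and the value wraps.
uint mixedDiff(uint x, int y)
{
    auto z = x - y;
    static assert(is(typeof(z) == uint));
    return z;
}

// Widening one operand by hand gives the mathematically expected value.
long widenedDiff(uint x, int y)
{
    return cast(long) x - y;
}

void main()
{
    assert(mixedDiff(0, 1) == uint.max);   // 0 - 1 wrapped around
    assert(widenedDiff(0, 1) == -1);

    uint x = 0;
    auto w = x - 1;                         // the literal 1 is an int, same rule
    static assert(is(typeof(w) == uint));
    assert(w == uint.max);
}
```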




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Ary Borenszweig via Digitalmars-d

On 11/21/14, 5:45 AM, Walter Bright wrote:

On 11/21/2014 12:10 AM, bearophile wrote:

Walter Bright:


All you're doing is trading 0 crossing for 0x7FFF crossing
issues, and
pretending the problems have gone away.


I'm not pretending anything. I am asking in practical programming what
of the
two solutions leads to fewer problems/bugs. So far I've seen the unsigned
solution and I've seen it's highly bug-prone.


I'm suggesting that having a bug and detecting the bug are two different
things. The 0-crossing bug is easier to detect, but that doesn't mean
that shifting the problem to 0x7FFF crossing bugs is making the bug
count less.



BTW, granted the 0x7FFF problems exhibit the bugs less often, but
paradoxically this can make the bug worse, because then it only gets
found
much, much later in supposedly tested & robust code.


Is this true? Do you have some examples of buggy code?


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html



This bug can manifest itself for arrays whose length (in elements) is 
2^30 or greater (roughly a billion elements)


How often does that happen in practice?
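The bug in the linked article is the midpoint computation of a binary search over signed int indices. A sketch of it in D, where signed overflow wraps; naiveMid and safeMid are illustrative names only:

```d
// Midpoint as written in the buggy binary search: low + high can exceed int.max.
int naiveMid(int low, int high)
{
    return (low + high) / 2;
}

// Rearranged so that no intermediate value is larger than high.
int safeMid(int low, int high)
{
    return low + (high - low) / 2;
}

void main()
{
    int low = 0x4000_0000;   // 2^30, the threshold mentioned in the quote
    int high = 0x4000_0002;
    assert(naiveMid(low, high) < 0);             // the sum wrapped past int.max
    assert(safeMid(low, high) == 0x4000_0001);
}
```

With small arrays both functions agree, which is why the problem stayed unnoticed for so long.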


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 19:31:23 +1100
Daniel Murphy via Digitalmars-d digitalmars-d@puremagic.com wrote:

 bearophile  wrote in message news:lkcltlokangpzzdzz...@forum.dlang.org...
 
  From my experience in coding in D they are far more unlikely than 
  sign-related bugs of array lengths.
 
 Here's a simple program to calculate the relative size of two files, that 
 will not work correctly with unsigned lengths.
 
 module sizediff;
 
 import std.file;
 import std.stdio;
 
 void main(string[] args)
 {
     assert(args.length == 3, "Usage: sizediff file1 file2");
     auto l1 = args[1].read().length;
     auto l2 = args[2].read().length;
     writeln("Difference: ", l1 - l2);
 }
 
 The two ways this can fail (that I want to highlight) are:
 1. If either file is too large to fit in a size_t the result will (probably) 
 be wrong
 2. If file2 is bigger than file1 the result will be wrong
 
 If length was signed, problem 2 would not exist, and problem 1 would be more 
 likely to occur.  I think it's clear that signed lengths would work for more 
 possible realistic inputs.
no, the problem 2 just becomes hidden. while the given code works most
of the time, it is still broken.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Araq via Digitalmars-d
no, the problem 2 just becomes hidden. while the given code 
works most

of the time, it is still broken.


You cannot handle stack overflow or out-of-memory conditions
reliably in C, so "fails in extreme edge cases" is true for every
piece of software.

"Broken" is not a black-and-white thing. "Works most of the time"
surely is much more useful than "doesn't work". Otherwise you
would throw away your phone the first time you get a busy signal.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 11:17:06 -0300
Ary Borenszweig via Digitalmars-d digitalmars-d@puremagic.com wrote:

 This bug can manifest itself for arrays whose length (in elements) is 
 2^30 or greater (roughly a billion elements)
 
 How often does that happen in practice?
"once in almost ten years" is too often, as for me. i think that the
answer must be "never". either no bug, or the code is broken. and one
of the worst kinds of code is code that works most of the time, but is
still broken.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 14:37:39 +
Araq via Digitalmars-d digitalmars-d@puremagic.com wrote:

 "Broken" is not a black-and-white thing. "Works most of the time"
 surely is much more useful than "doesn't work". Otherwise you
 would throw away your phone the first time you get a busy signal.
"works most of the time" is the worst thing: the bug can be hidden for
decades and then suddenly blow up straight into your face, making you
wonder what happened to good code.

i will choose the code which "doesn't work" over the "works most of the
time" one: the first has a clearly visible problem, and the latter has
a carefully hidden problem. i prefer visible problems.

btw, your phone example is totally wrong, 'cause "busy" is a
well-defined state. i for sure will throw the phone away if the phone
accepts only *some* incoming calls and silently ignores some others
(without me explicitly telling it to do so, of course). that's like
code that "works most of the time", but not at the time when they are
phoning you to tell you that your house is on fire.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread FrankLike via Digitalmars-d
On Friday, 21 November 2014 at 13:59:08 UTC, ketmar via 
Digitalmars-d wrote:



any code which does something like `if (a-b < 0)` is broken. it


Modify it: 
https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d


Modify method: subu(uint ...) or subu(ulong ...)
if (x < y)
    return y - x;
else
    return x - y;

It will not be broken.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 14:55:45 +
FrankLike via Digitalmars-d digitalmars-d@puremagic.com wrote:

 On Friday, 21 November 2014 at 13:59:08 UTC, ketmar via 
 Digitalmars-d wrote:
 
  any code which does something like `if (a-b < 0)` is broken. it
 
 Modify it: 
 https://github.com/D-Programming-Language/druntime/blob/master/src/core/checkedint.d
 
 Modify method: subu(uint ...) or subu(ulong ...)
 if (x < y)
     return y - x;
 else
     return x - y;
 
 It will not be broken.
and it will not do the same anymore too. it's not a fix at all.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread FrankLike via Digitalmars-d
On Friday, 21 November 2014 at 15:13:22 UTC, ketmar via 
Digitalmars-d wrote:




and it will not do the same anymore too. it's not a fix at all.


But it is one part of the bugs.
Sure, the bug that comes from mixing signed and unsigned values should
be fixed.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Don via Digitalmars-d

On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright wrote:

On 11/20/2014 7:11 PM, Walter Bright wrote:

On 11/20/2014 3:25 PM, bearophile wrote:

Walter Bright:

If that is changed to a signed type, then you'll have a 
same-only-different

set of subtle bugs,


This is possible. Can you show some of the bugs, we can 
discuss them, and see if

they are actually worse than the current situation.


All you're doing is trading 0 crossing for 0x7FFF crossing 
issues, and

pretending the problems have gone away.


BTW, granted the 0x7FFF problems exhibit the bugs less 
often, but paradoxically this can make the bug worse, because 
then it only gets found much, much later in supposedly tested &
robust code.


0 crossing bugs tend to show up much sooner, and often 
immediately.



You're missing the point here. The problem is that people are 
using 'uint' as if it were a positive integer type.


Suppose  D had a type 'natint', which could hold natural numbers 
in the range 0..uint.max.  Sounds like 'uint', right? People make 
the mistake of thinking that is what uint is. But it is not.


How would natint behave, in the type system?

typeof (natint - natint)  ==  int NOT natint  !!!

This would of course overflow if the result is too big to fit in 
an int. But the type would be correct.  1 - 2 == -1.


But

typeof (uint - uint ) == uint.

The bit pattern is identical to the other case. But the type is 
wrong.


It is for this reason that uint is not appropriate as a model for 
positive integers. Having warnings about mixing int and uint 
operations in relational operators is a bit misleading, because 
mixing signed and unsigned is not usually the real problem. 
Instead, those warnings are a symptom of a type system mistake.


You are quite right in saying that with a signed length, 
overflows can still occur. But, those are in principle 
detectable. The compiler could add runtime overflow checks for 
them, for example. But the situation for unsigned is not fixable, 
because it is a problem with the type system.



By making .length unsigned, we are telling people that if .length 
is

used in a subtraction expression, the type will be wrong.

It is the incorrect use of the type system that is the underlying 
problem.
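A small sketch of the point about .length in a subtraction expression. Both function names are made up for illustration:

```d
// Asks "is a longer than b?" the tempting way: subtract, then compare with zero.
// Both lengths are size_t, so the difference is size_t and can never be negative.
bool longerBySubtraction(const int[] a, const int[] b)
{
    static assert(is(typeof(a.length - b.length) == size_t));
    return a.length - b.length > 0;
}

// The same question asked without subtracting.
bool longerByComparison(const int[] a, const int[] b)
{
    return a.length > b.length;
}

void main()
{
    int[] shortArr = [1];
    int[] longArr = [1, 2];
    assert(!longerByComparison(shortArr, longArr));
    assert(longerBySubtraction(shortArr, longArr));  // wrong answer: 1 - 2 wrapped
}
```

Note that there is no signed/unsigned mixing anywhere in this example, so a mixing warning would not catch it.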






Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread H. S. Teoh via Digitalmars-d
On Fri, Nov 21, 2014 at 03:36:01PM +, Don via Digitalmars-d wrote:
[...]
 Suppose  D had a type 'natint', which could hold natural numbers in
 the range 0..uint.max.  Sounds like 'uint', right? People make the
 mistake of thinking that is what uint is. But it is not.
 
 How would natint behave, in the type system?
 
 typeof (natint - natint)  ==  int NOT natint  !!!

Wrong. (uint.max - 0) == uint.max, which is of type uint. If you
interpret it as int, you get a negative number, which is wrong. So your
proposal breaks uint in even worse ways, in that now subtracting a
smaller number from a larger number may overflow, whereas it wouldn't
before. So that fixes nothing, you're just shifting the problem
somewhere else.


T

-- 
Too many people have open minds but closed eyes.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Matthias Bentrup via Digitalmars-d

On Friday, 21 November 2014 at 15:36:02 UTC, Don wrote:
On Friday, 21 November 2014 at 04:53:38 UTC, Walter Bright 
wrote:

On 11/20/2014 7:11 PM, Walter Bright wrote:

On 11/20/2014 3:25 PM, bearophile wrote:

Walter Bright:

If that is changed to a signed type, then you'll have a 
same-only-different

set of subtle bugs,


This is possible. Can you show some of the bugs, we can 
discuss them, and see if

they are actually worse than the current situation.


All you're doing is trading 0 crossing for 0x7FFF 
crossing issues, and

pretending the problems have gone away.


BTW, granted the 0x7FFF problems exhibit the bugs less 
often, but paradoxically this can make the bug worse, because 
then it only gets found much, much later in supposedly tested &
robust code.


0 crossing bugs tend to show up much sooner, and often 
immediately.



You're missing the point here. The problem is that people are 
using 'uint' as if it were a positive integer type.


Suppose  D had a type 'natint', which could hold natural 
numbers in the range 0..uint.max.  Sounds like 'uint', right? 
People make the mistake of thinking that is what uint is. But 
it is not.


How would natint behave, in the type system?

typeof (natint - natint)  ==  int NOT natint  !!!

This would of course overflow if the result is too big to fit 
in an int. But the type would be correct.  1 - 2 == -1.




So if i is a natint the expression i-- would change the type of 
variable i on the fly to int ?


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Andrei Alexandrescu via Digitalmars-d

On 11/21/14 12:56 AM, Daniel Murphy wrote:

Walter Bright  wrote in message news:m4mua1$shh$1...@digitalmars.com...


Presumably read() will throw if the size is larger than it can handle.
If it doesn't, this code is not buggy, but read() is.


You're right, but that's really not the point.


What is your point? (Honest question.) Are you proposing that we make 
all array lengths signed? -- Andrei


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Don via Digitalmars-d
On Friday, 21 November 2014 at 15:50:05 UTC, H. S. Teoh via 
Digitalmars-d wrote:
On Fri, Nov 21, 2014 at 03:36:01PM +, Don via Digitalmars-d 
wrote:

[...]
Suppose  D had a type 'natint', which could hold natural 
numbers in
the range 0..uint.max.  Sounds like 'uint', right? People make 
the

mistake of thinking that is what uint is. But it is not.

How would natint behave, in the type system?

typeof (natint - natint)  ==  int NOT natint  !!!


Wrong. (uint.max - 0) == uint.max, which is of type uint.



It is not uint.max. It is natint.max. And yes, that's an overflow 
condition.


Exactly the same as when you do int.max + int.max.


If you
interpret it as int, you get a negative number, which is wrong. 
So your
proposal breaks uint in even worse ways, in that now 
subtracting a
smaller number from a larger number may overflow, whereas it 
wouldn't

before. So that fixes nothing, you're just shifting the problem
somewhere else.


T


This is not a proposal; I am just illustrating the difference 
between what people *think* uint does, vs what it actually does.


The type that I think would be useful, would be a number in the 
range 0..int.max.

It has no risk of underflow.

To put it another way:

natural numbers are a subset of mathematical integers.
  (the range 0..infinity)

signed types are a subset of mathematical integers
  (the range -int.max .. int.max).

unsigned types are not a subset of mathematical integers.

They do not just have a restricted range. They have different 
semantics.



The question of what happens when a range is exceeded, is a 
different question.





Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Andrei Alexandrescu via Digitalmars-d

On 11/21/14 6:03 AM, ketmar via Digitalmars-d wrote:

On Thu, 20 Nov 2014 13:28:37 -0800
Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote:


On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:

What *could* be improved, is the prevention of obvious mistakes in
*mixing* signed and unsigned types. Right now, D allows code like the
following with no warning:

uint x;
int y;
auto z = x - y;

BTW, this one is the same in essence as an actual bug that I fixed in
druntime earlier this year, so downplaying it as a mistake people make
'cos they confound computer math with math math is fallacious.


What about:

  uint x;
  auto z = x - 1;

?


here z must be `long`. and for `ulong` compiler must emit error.


Would you agree that that would break a substantial amount of correct D 
code? -- Andrei




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Wyatt via Digitalmars-d

On Thursday, 20 November 2014 at 20:17:15 UTC, deadalnix wrote:

On Thursday, 20 November 2014 at 15:55:21 UTC, H. S. Teoh via
Digitalmars-d wrote:
Using unsigned types for array length doesn't necessarily lead 
to subtle
bugs, if the language was stricter about mixing signed and 
unsigned

values.



Yes, I think that this is the real issue.


Thirded.

Array lengths are always non-negative integers.  This is 
axiomatic.  But the subtraction thing keeps coming up in this 
thread; what to do?


There's probably something fundamentally wrong with this and I'll 
probably be called an idiot by both sides, but my gut feeling 
is that if expressions with subtraction simply returned a signed 
type by default, much of the problem would disappear.  It doesn't 
catch everything and stuff like:


uint x = 2;
uint y = 4;
uint z = x - y;

...is still going to overflow, but maybe you know what you're 
doing? More importantly, changing it to auto z = x - y; actually 
works as expected for the majority of cases.  (I'm actually on 
the fence re: pass/warn/error on mixing, but I _will_ note C's 
promotion rules have bitten me in the ass a few times and I have 
no particular love for them.)


-Wyatt

PS: I can't even believe how this thread has blown up, 
considering how it started.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Andrei Alexandrescu via Digitalmars-d

On 11/21/14 6:17 AM, Ary Borenszweig wrote:

On 11/21/14, 5:45 AM, Walter Bright wrote:

On 11/21/2014 12:10 AM, bearophile wrote:

Walter Bright:


All you're doing is trading 0 crossing for 0x7FFF crossing
issues, and
pretending the problems have gone away.


I'm not pretending anything. I am asking in practical programming what
of the
two solutions leads to fewer problems/bugs. So far I've seen the unsigned
solution and I've seen it's highly bug-prone.


I'm suggesting that having a bug and detecting the bug are two different
things. The 0-crossing bug is easier to detect, but that doesn't mean
that shifting the problem to 0x7FFF crossing bugs is making the bug
count less.



BTW, granted the 0x7FFF problems exhibit the bugs less often, but
paradoxically this can make the bug worse, because then it only gets
found
much, much later in supposedly tested & robust code.


Is this true? Do you have some examples of buggy code?


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html




This bug can manifest itself for arrays whose length (in elements) is
2^30 or greater (roughly a billion elements)

How often does that happen in practice?


Every time you read a DVD image :o). I should say that in my doctoral 
work it was often the case I'd have very large arrays.


Andrei



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread CraigDillabaugh via Digitalmars-d
On Thursday, 20 November 2014 at 08:14:41 UTC, Walter Bright 
wrote:

clip


For example, in America we drive on the right. In Australia, 
they drive on the left. When I visit Australia, I know this, 
but when stepping out into the road I instinctively check my 
left for cars, step into the road, and my foot gets run over by 
a car coming from the right. I've had to be very careful as a 
pedestrian there, as my intuition would get me killed.


Don't mess with systems programmers' intuitions. It'll cause 
more problems than it solves.


I live in Quebec and my intuition always tells me to look both 
ways - because you never know :o)




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 08:31:13 -0800
Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com
wrote:

 Would you agree that that would break a substantial amount of correct D 
 code? -- Andrei
i don't think that code with possible int wrapping and `auto` is
correct, so the answer is no. bad code must be made bad.




Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Meta via Digitalmars-d
On Friday, 21 November 2014 at 16:48:35 UTC, CraigDillabaugh 
wrote:
I live in Quebec and my intuition always tells me to look both 
ways - because you never know :o)


While doing my driver's training years ago, my instructor 
half-jokingly warned us never to jaywalk in Quebec unless we have 
a death wish and want to hear all about chalices and tabernacles.


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Marco Leise via Digitalmars-d
Am Wed, 19 Nov 2014 10:22:49 +
schrieb Dominikus Dittes Scherkl
dominikus.sche...@continental-corporation.com:

 On Wednesday, 19 November 2014 at 09:06:16 UTC, Marco Leise wrote:
  Clearly size_t (which I tend to alias with ℕ in my code for
  brevity and coolness)
 No, this is far from the implied infinite set.
 A much better candidate for ℕ is BigUInt (and ℤ for BigInt)

How far exactly is it from infinity? And how much closer is
BigInt? I wanted a fast ℕ within the constraints of the
machine. ;)

-- 
Marco



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread H. S. Teoh via Digitalmars-d
On Fri, Nov 21, 2014 at 08:31:13AM -0800, Andrei Alexandrescu via Digitalmars-d 
wrote:
 On 11/21/14 6:03 AM, ketmar via Digitalmars-d wrote:
 On Thu, 20 Nov 2014 13:28:37 -0800
 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote:
 
 On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
 What *could* be improved, is the prevention of obvious mistakes in
 *mixing* signed and unsigned types. Right now, D allows code like
 the following with no warning:
 
uint x;
int y;
auto z = x - y;
 
 BTW, this one is the same in essence as an actual bug that I fixed
 in druntime earlier this year, so downplaying it as a mistake
 people make 'cos they confound computer math with math math is
 fallacious.
 
 What about:
 
   uint x;
   auto z = x - 1;
 
 ?
 
 here z must be `long`. and for `ulong` compiler must emit error.

What if x==uint.max?


 Would you agree that that would break a substantial amount of correct
 D code? -- Andrei

Yeah I don't think it's a good idea for subtraction to yield a different
type from its operands. Non-closure of operators (i.e., results are of a
different type than operands) leads to a lot of frustration because you
keep ending up with the wrong type, and inevitably people will just
throw in random casts everywhere just to make things work.


T

-- 
We are in class, we are supposed to be learning, we have a teacher... Is it too 
much that I expect him to teach me??? -- RL


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Marco Leise via Digitalmars-d
Am Thu, 20 Nov 2014 08:18:23 +
schrieb Don x...@nospam.com:

 It's particularly challenging in D because of the widespread use 
 of 'auto':
 
 auto x = foo();
 auto y = bar();
 auto z = baz();
 
 if (x - y > z) { ... }
 
 
 This might be a bug, if one of these functions returns an 
 unsigned type.  Good luck finding that. Note that if all 
 functions return unsigned, there isn't even any signed-unsigned 
 mismatch.

With those function names I cannot write code.

ℕ x = length();
ℕ y = index();
ℕ z = requiredRange();

if (x - y > z) { ... }

Ah, now we're getting somewhere. Yes the code is obviously
correct. You need to be aware of the value ranges of your
variables and write subtractions in a way that the result can
only be >= 0. If you realize that you cannot guarantee that
for some case, you just found a logic bug. An invalid program
state that you need to assert/if-else/throw.

I don't get why so many APIs return ints. Must be to support
Java or something where proper unsigned types aren't available.

-- 
Marco



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Ary Borenszweig via Digitalmars-d

On 11/21/14, 11:29 AM, ketmar via Digitalmars-d wrote:

On Fri, 21 Nov 2014 19:31:23 +1100
Daniel Murphy via Digitalmars-d digitalmars-d@puremagic.com wrote:


bearophile  wrote in message news:lkcltlokangpzzdzz...@forum.dlang.org...


 From my experience in coding in D they are far more unlikely than
sign-related bugs of array lengths.


Here's a simple program to calculate the relative size of two files, that
will not work correctly with unsigned lengths.

module sizediff;

import std.file;
import std.stdio;

void main(string[] args)
{
    assert(args.length == 3, "Usage: sizediff file1 file2");
    auto l1 = args[1].read().length;
    auto l2 = args[2].read().length;
    writeln("Difference: ", l1 - l2);
}

The two ways this can fail (that I want to highlight) are:
1. If either file is too large to fit in a size_t the result will (probably)
be wrong
2. If file2 is bigger than file1 the result will be wrong

If length was signed, problem 2 would not exist, and problem 1 would be more
likely to occur.  I think it's clear that signed lengths would work for more
possible realistic inputs.

no, the problem 2 just becomes hidden. while the given code works most
of the time, it is still broken.


So how would you solve problem 2?
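One possible shape of an answer, offered only as a sketch and not as anything proposed in the thread: compare the two lengths first, so the unsigned subtraction never wraps, and only then attach the sign. The name sizeDiff is invented, and the sketch assumes the magnitude of the difference fits in a long:

```d
import std.stdio : writeln;

// Signed difference of two unsigned sizes. The smaller value is always
// subtracted from the larger one, so the size_t arithmetic cannot wrap.
long sizeDiff(size_t l1, size_t l2)
{
    return l1 >= l2 ? cast(long)(l1 - l2) : -cast(long)(l2 - l1);
}

void main()
{
    writeln("Difference: ", sizeDiff(3, 5));
}
```

This keeps .length unsigned and moves the check into the one place where a negative result is meaningful.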



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Ary Borenszweig via Digitalmars-d

On 11/21/14, 11:47 AM, ketmar via Digitalmars-d wrote:

On Fri, 21 Nov 2014 11:17:06 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d@puremagic.com> wrote:


This bug can manifest itself for arrays whose length (in elements) is
2^30 or greater (roughly a billion elements)

How often does that happen in practice?

once in almost ten years is too often for me. i think that the
answer must be never. either there is no bug, or the code is broken.
and one of the worst kinds of code is code that works most of the
time but is still broken.



You see, if you don't use a BigNum for everything then you will always 
have hidden bugs, be it with int, uint or whatever. The thing is that 
with int bugs are much less frequent than with uint. So I don't know why 
you'd rather have uint than int...


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Ary Borenszweig via Digitalmars-d

On 11/21/14, 1:32 PM, Andrei Alexandrescu wrote:

On 11/21/14 6:17 AM, Ary Borenszweig wrote:

On 11/21/14, 5:45 AM, Walter Bright wrote:

On 11/21/2014 12:10 AM, bearophile wrote:

Walter Bright:


All you're doing is trading 0 crossing for 0x7FFF crossing
issues, and
pretending the problems have gone away.


I'm not pretending anything. I am asking which of the two solutions
leads to fewer problems/bugs in practical programming. So far I've
seen the unsigned solution and I've seen it's highly bug-prone.


I'm suggesting that having a bug and detecting the bug are two different
things. The 0-crossing bug is easier to detect, but that doesn't mean
that shifting the problem to 0x7FFF crossing bugs is making the bug
count less.



BTW, granted the 0x7FFF problems exhibit the bugs less often, but
paradoxically this can make the bug worse, because then it only gets
found much, much later in supposedly tested & robust code.


Is this true? Do you have some examples of buggy code?


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
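
For readers who do not follow the link: the bug described there is the midpoint calculation in binary search. Below is a small sketch of it in D, not code from the linked post. The function names `naiveMid` and `safeMid` are made up for illustration.

```d
/// Midpoint as written in the classic binary search; `low + high` can
/// exceed int.max and wrap to a negative value.
int naiveMid(int low, int high)
{
    return (low + high) / 2;
}

/// Overflow-free form: `high - low` cannot overflow when 0 <= low <= high.
int safeMid(int low, int high)
{
    return low + (high - low) / 2;
}

void main()
{
    int low = 1 << 30;        // 2^30
    int high = int.max - 1;   // a valid index into a very large array

    assert(naiveMid(low, high) < 0);   // wrapped: would index out of bounds
    assert(safeMid(low, high) >= low && safeMid(low, high) <= high);
}
```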





This bug can manifest itself for arrays whose length (in elements) is
2^30 or greater (roughly a billion elements)

How often does that happen in practice?


Every time you read a DVD image :o). I should say that in my doctoral
work it was often the case I'd have very large arrays.


Oh, sorry, I totally forgot that when you open a DVD with VLC it reads 
the whole thing to memory.


/sarcasm


Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Marco Leise via Digitalmars-d
On Fri, 21 Nov 2014 16:32:20 +
Wyatt wyatt@gmail.com wrote:

 Array lengths are always non-negative integers.  This is 
 axiomatic.  But the subtraction thing keeps coming up in this 
 thread; what to do?
 
 There's probably something fundamentally wrong with this and I'll 
 probably be called an idiot by both sides, but my gut feeling 
 is that if expressions with subtraction simply returned a signed 
 type by default, much of the problem would disappear. [...]

As I said above, I always order my unsigned variables by
magnitude and uint.max - uint.min should result in uint.max
and not -1. In code dealing with lengths or offsets there is
typically some base that is less than the position or an
index that is less than the length.
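
To make the `uint.max - uint.min` point concrete, here is a small sketch. The helpers `signedDiff` and `wideDiff` are hypothetical. The first assumes a "subtraction yields a signed type" rule that keeps the result 32 bits wide, the second assumes the result is widened to `long`.

```d
/// Hypothetical: signed result of the same width as the operands.
int signedDiff(uint a, uint b) { return cast(int)(a - b); }

/// Hypothetical: signed result widened to 64 bits before subtracting.
long wideDiff(uint a, uint b) { return cast(long) a - cast(long) b; }

void main()
{
    // Unsigned operands ordered by magnitude can use the full range.
    assert(uint.max - uint.min == uint.max);

    // A same-width signed result turns that into -1.
    assert(signedDiff(uint.max, uint.min) == -1);

    // Only a wider signed type keeps both sign and magnitude.
    assert(wideDiff(uint.max, uint.min) == uint.max);
    assert(wideDiff(uint.min, uint.max) == -4_294_967_295L);
}
```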

The expression `base - position` is just wrong. If position is
in fact below base, you are guaranteed to end up with an
if-else later on. So why not place it up front:

if (position >= base)
{
    auto offset = position - base;
}
else
{
    …
}

 [...]
 
 -Wyatt
 
 PS: I can't even believe how this thread has blown up, 
 considering how it started.

Exactly my thought, but suddenly I couldn't stop myself from
posting.

-- 
Marco



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread Marco Leise via Digitalmars-d
On Thu, 20 Nov 2014 20:53:31 -0800
Walter Bright newshou...@digitalmars.com wrote:

 On 11/20/2014 7:11 PM, Walter Bright wrote:
  On 11/20/2014 3:25 PM, bearophile wrote:
  Walter Bright:
 
  If that is changed to a signed type, then you'll have a 
  same-only-different
  set of subtle bugs,
 
  This is possible. Can you show some of the bugs, we can discuss them, and 
  see if
  they are actually worse than the current situation.
 
  All you're doing is trading 0 crossing for 0x7FFF crossing issues, and
  pretending the problems have gone away.
 
 BTW, granted the 0x7FFF problems exhibit the bugs less often, but 
 paradoxically this can make the bug worse, because then it only gets found 
 much, much later in supposedly tested & robust code.
 
 0 crossing bugs tend to show up much sooner, and often immediately.

 +1000. This is also the reason we have a special float .init in D.
There is no plethora of bugs to show, because they are under
the radar. Signed types are only more convenient in the
scripting language sense, like using double for everything and
array indexing in JavaScript.
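
A short sketch of the float .init behaviour referred to above. The function `total` is a made-up example whose accumulator is deliberately left uninitialized.

```d
import std.math : isNaN;

/// Hypothetical example: the accumulator is deliberately left
/// uninitialized, so it starts as float.init (NaN), not 0.
float total(const float[] values)
{
    float acc;
    foreach (v; values)
        acc += v;
    return acc;
}

void main()
{
    int i;            // int.init is 0, so a forgotten init goes unnoticed
    assert(i == 0);

    // NaN propagates, so the forgotten initialization poisons the result
    // instead of silently producing a plausible-looking number.
    assert(total([1.0f, 2.0f]).isNaN);
}
```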

-- 
Marco



Re: 'int' is enough for 'length' to migrate code from x86 to x64

2014-11-21 Thread ketmar via Digitalmars-d
On Fri, 21 Nov 2014 14:38:26 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

 You see, if you don't use a BigNum for everything then you will always 
 have hidden bugs, be it with int, uint or whatever.
why do you believe that i'm not aware of overflows and am not checking
for them? i'm used to thinking about overflows and doing overflow checking
in production code since my Z80 days. and i don't believe that an
infrequent bug is better than a frequent bug. both are equally bad.
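
For reference, a sketch of what explicit overflow checking can look like in D, assuming the `core.checkedint` module from druntime is available. The wrapper `checkedSub` is hypothetical and not from this thread.

```d
import core.checkedint : adds, subu;

/// Hypothetical wrapper: stores a - b in `result` and returns false
/// if the unsigned subtraction would have crossed zero.
bool checkedSub(uint a, uint b, out uint result)
{
    bool overflow = false;
    result = subu(a, b, overflow);
    return !overflow;
}

void main()
{
    uint r;
    assert(checkedSub(5, 3, r) && r == 2);
    assert(!checkedSub(3, 5, r));   // would cross zero

    bool overflow = false;
    const s = adds(int.max, 1, overflow);
    assert(overflow);               // would cross int.max
}
```

The same check catches both the 0 crossing and the signed-maximum crossing, which is the point being made: neither kind of bug is acceptable just because it is rare.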



