Re: Signed word lengths and indexes
Hello Don, Simen kjaeraas wrote: BCS wrote: I still haven't seen anyone address how typeof(a>>>b) == typeof(a) breaks C code when a>>>b isn't legal C to begin with. It doesn't, of course. However, it is desirable to have similar rules for similar operations, like >> and >>>. Which is why I said that it doesn't seem possible to make >>> work without making a special-case rule for it. At least for me, I find the current situation more surprising than the alternative. For that matter, if >>> worked differently than >>, I think I would have (the first time I ran across it) thought the >> case was the odd one.
Re: Signed word lengths and indexes
Simen kjaeraas wrote: BCS wrote: I still haven't seen anyone address how typeof(a>>>b) == typeof(a) breaks C code when a>>>b isn't legal C to begin with. It doesn't, of course. However, it is desirable to have similar rules for similar operations, like >> and >>>. Which is why I said that it doesn't seem possible to make >>> work without making a special-case rule for it.
Re: Signed word lengths and indexes
BCS wrote: I still haven't seen anyone address how typeof(a>>>b) == typeof(a) breaks C code when a>>>b isn't legal C to begin with. It doesn't, of course. However, it is desirable to have similar rules for similar operations, like >> and >>>. -- Simen
Re: Signed word lengths and indexes
Hello Don, It's C's cavalier approach to implicit conversions that wrecks generic code. And it makes such a pig's breakfast of it that >>> doesn't quite work. I still haven't seen anyone address how typeof(a>>>b) == typeof(a) breaks C code when a>>>b isn't legal C to begin with. (Note, I'm not saying do the same with >> or << because I see why that can't be done.)
Re: Signed word lengths and indexes
Adam Ruppe: > Huh, weird. Doesn't make too much of a difference in practice though, > since it only changes the malloc line slightly. Probably it can be fixed, but you have to be careful, because the padding isn't constant, it can change in size according to the CPU word size and the types of the data that come before TailArray :-) Bye, bearophile
Re: Signed word lengths and indexes
On 6/18/10, bearophile wrote: > By the way, this program shows your code is not a replacement of the > operator overloading of the variable length struct itself I was talking > about, because D structs can't have length zero (plus 3 bytes of padding, > here): Huh, weird. Doesn't make too much of a difference in practice though, since it only changes the malloc line slightly. In C, before the array[0] was allowed (actually, I'm not completely sure it is allowed even now in the standard. C99 added something, but I don't recall if it is the same thing), people would use array[1]. Since it is at the tail of the struct, and you're using pointer magic to the raw memory anyway, it doesn't make much of a difference.
Re: Signed word lengths and indexes
Adam Ruppe: > I don't think a destructor can free the mem > of its own object. I see and I'd like to know! :-) By the way, this program shows your code is not a replacement of the operator overloading of the variable length struct itself I was talking about, because D structs can't have length zero (plus 3 bytes of padding, here): import std.stdio: writeln, write; struct TailArray(T) { T opIndex(size_t idx) { T* tmp = cast(T*)(&this) + idx; return *tmp; } T opIndexAssign(T value, size_t idx) { T* tmp = cast(T*)(&this) + idx; *tmp = value; return value; } } struct MyString1 { size_t size; TailArray!char data; // not the same as char data[0]; in C } struct MyString2 { size_t size; char[0] data; } void main() { writeln(MyString1.sizeof); // 8 writeln(MyString2.sizeof); // 4 } Bye, bearophile
Re: Signed word lengths and indexes
On 6/18/10, bearophile wrote: >> static void destroy(MyString* s) { >> free(s); >> } > > Why destroy instead of ~this() ? It allocates and deallocates the memory rather than initializing and uninitializing the object. I don't think a destructor can free the memory of its own object. If I used gc.malloc or stack allocation, the destroy method shouldn't be necessary at all, since the memory is handled automatically there. Though, the main reason I did it this way is that I was just writing in a C style rather than a D style, so it was kinda automatic. Still, I'm pretty sure what I'm saying here about the constructor/destructor not being able to actually allocate or free the memory of the object itself is true too.
Re: Signed word lengths and indexes
Walter Bright wrote: Don wrote: Walter Bright wrote: Andrei Alexandrescu wrote: Note that your argument is predicated on using signed types instead of unsigned types in the first place, and tacitly assumes the issue is frequent enough to *add a new operator*. Yet unsigned shifts correlate naturally with unsigned numbers. So what exactly is it that is valuable in >>> that makes its presence in the language justifiable? Generally the irritation I feel whenever I right shift and have to go back through and either check the type or just cast it to unsigned to be sure there is no latent bug. But x >>> 1 doesn't work for shorts and bytes. I know. That's ill thought out. Then please rule it out of the language. Andrei
Re: Signed word lengths and indexes
Adam Ruppe: > D need be no uglier than C. Here's my implementation: That's cute, thank you :-) > static void destroy(MyString* s) { > free(s); > } Why destroy instead of ~this() ? Bye, bearophile
Re: Signed word lengths and indexes
Don wrote: Walter Bright wrote: Andrei Alexandrescu wrote: Note that your argument is predicated on using signed types instead of unsigned types in the first place, and tacitly assumes the issue is frequent enough to *add a new operator*. Yet unsigned shifts correlate naturally with unsigned numbers. So what exactly is it that is valuable in >>> that makes its presence in the language justifiable? Generally the irritation I feel whenever I right shift and have to go back through and either check the type or just cast it to unsigned to be sure there is no latent bug. But x >>> 1 doesn't work for shorts and bytes. I know. That's ill thought out. For example, the optlink asm code does quite a lot of unsigned right shifts. I have to be very careful about the typing to ensure a matching unsigned shift, since I have little idea what the range of values the variable can have. I've read the OMF spec, and I know it includes shorts and bytes. So I really don't think >>> solves even this use case. I can send you the source if you like.
Re: Signed word lengths and indexes
On 6/18/10, bearophile wrote: > As I have said, you have to use operator overloading of the struct and some > near-ugly code that uses the offsetof. I don't like this a lot. D need be no uglier than C. Here's my implementation: /* @very_unsafe */ struct TailArray(T) { T opIndex(size_t idx) { T* tmp = cast(T*) (&this) + idx; return *tmp; } T opIndexAssign(T value, size_t idx) { T* tmp = cast(T*) (&this) + idx; *tmp = value; return value; } } // And this demonstrates how to use it: import std.contracts; import std.c.stdlib; struct MyString { size_t size; TailArray!(char) data; // same as char data[0]; in C // to show how to construct it static MyString* make(size_t size) { MyString* item = cast(MyString*) malloc(MyString.sizeof + size); enforce(item !is null); item.size = size; return item; } static void destroy(MyString* s) { free(s); } } import std.stdio; void main() { MyString* str = MyString.make(5); scope(exit) MyString.destroy(str); // assigning works same as C str.data[0] = 'H'; str.data[1] = 'e'; str.data[2] = 'l'; str.data[3] = 'l'; str.data[4] = 'o'; // And so does getting for(int a = 0; a < str.size; a++) writef("%s", str.data[a]); writefln(""); }
Re: Signed word lengths and indexes
Michel Fortin: > Bypassing bound checks is as easy as appending ".ptr": > > staticArray.ptr[10]; // no bound check > > Make an alias to the static array's ptr property if you prefer not to > have to write .ptr all the time. If you try to compile this: import std.c.stdlib: malloc; struct Foo { int x; int[0] a; } void main() { enum N = 20; Foo* f = cast(Foo*)malloc(Foo.sizeof + N * typeof(Foo.a[0]).sizeof); f.a.ptr[10] = 5; } You receive: prog.d(9): Error: null dereference in function _Dmain As I have said, you have to use operator overloading of the struct and some near-ugly code that uses the offsetof. I don't like this a lot. Bye, bearophile
Re: Signed word lengths and indexes
On 2010-06-18 08:11:00 -0400, bearophile said: 4.13 Arrays of Length Zero: they are available in D, but you get an array bounds error if you try to use them to create variable-length structs. So to use them you have to overload the opIndex and opIndexAssign of the struct... Bypassing bound checks is as easy as appending ".ptr": staticArray.ptr[10]; // no bound check Make an alias to the static array's ptr property if you prefer not to have to write .ptr all the time. -- Michel Fortin michel.for...@michelf.com http://michelf.com/
Re: Signed word lengths and indexes
Sorry for the slow answer. Reading all this stuff and trying to understand some of it takes me time. Walter Bright: >The Arduino is an 8 bit machine. D is designed for 32 bit and up machines. Full C++ won't even work on a 16 bit machine, either.< So D isn't a "better C" because you can't use it in a *large* number of situations (for every 32 bit CPU built today, they probably build ten 8/16 bit CPUs) where C is used. >If you're a kernel dev, the language features should not be a problem for you.< From what I have seen, C++ has a ton of features that are negative for kernel development. So a language that misses them in the first place is surely better, because it's simpler to use, and its compiler is smaller and simpler to debug. About two years ago I read about an unfocused (and dead) proposal to write a C compiler just to compile the Linux kernel, to avoid depending on GCC. >BTW, you listed nested functions as disqualifying a language from being a kernel dev language, yet gcc supports nested functions as an extension.< Nested functions are useful for my D code; I like them and I use them. But in D (unless they are static!) they create an extra pointer. From what I have read, such silent creation of extra data structures is bad if you are writing a kernel. So probably a kernel dev can accept only static nested functions. For a kernel dev the default of non-static is bad, because if he/she forgets to add the "static" attribute then it's probably a bug. This is why I have listed D nested functions as a negative point for a kernel dev. Regarding GCC having nested functions (GCC implements them with a trampoline), I presume kernel devs don't use this GCC extension. GCC is designed for many purposes and surely some of its features are not designed for kernel-writing purposes. >As I pointed out, D implements the bulk of those extensions as a standard part of D.< I am studying this still. See below. 
>They are useful in some circumstances, but are hardly necessary.< For a low-level programmer they can be positively useful, while several other D features are useless or actively negative. I have seen a 15-20% performance increase using computed gotos in a finite state machine I have written (that processes strings). Recently CPython has introduced them with a 15-20% performance improvement: http://bugs.python.org/issue4753 -- > It's interesting that D already has most of the gcc extensions: > http://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_4.html Some more items from that page: 4.5 Constructing Function Calls: this syntax and semantics seem dirty, and I don't fully understand how to use this stuff. In D I miss a good apply() and a good general memoize. Memoize is a quick and easy way to cache computations and to turn recursive functions into efficient dynamic programming algorithms. --- 4.13 Arrays of Length Zero: they are available in D, but you get an array bounds error if you try to use them to create variable-length structs. So to use them you have to overload the opIndex and opIndexAssign of the struct... --- 4.14 Arrays of Variable Length (allocated on the stack): this is missing in D. Using alloca is a workaround. --- 4.21 Case Ranges: D has this, but I am not sure D syntax is better. --- 4.22 Cast to a Union Type: this is missing in D; it can be done anyway by adding a static opCall to the union for each of its fields: union Foo { int i; double d; static Foo opCall(int ii) { Foo f; f.i = ii; return f; } static Foo opCall(double dd) { Foo f; f.d = dd; return f; } } void main() { Foo f1 = Foo(10); Foo f2 = Foo(10.5); } --- 4.23 Declaring Attributes of Functions noreturn: missing in D. 
But I am not sure how useful this is; the page says: >it helps avoid spurious warnings of uninitialized variables.< format (archetype, string-index, first-to-check) and format_arg (string-index): they are missing in D, and they can be useful for people that want to use std.c.stdio.printf. no_instrument_function: missing in D. It can be useful to not profile a function. section ("section-name"): missing in D. no_check_memory_usage: I don't understand this. --- 4.29 Specifying Attributes of Variables aligned (alignment): I think D doesn't allow specifying an alignment for fixed-sized arrays. nocommon: I don't understand this. --- 4.30 Specifying Attributes of Types transparent_union: D misses this, but I don't know how useful it is. --- 4.34 Variables in Specified Registers: missing in D. 4.34.1, 4.34.2 Recently an LLVM back-end for Haskell has shown that LLVM can allocate registers better than pinning variables to specified registers (so they are in specified registers only outside functions, which frees registers inside the functions and increases performance a bit). --- 4.37 Function Names
Re: Signed word lengths and indexes
Walter Bright wrote: Andrei Alexandrescu wrote: Note that your argument is predicated on using signed types instead of unsigned types in the first place, and tacitly assumes the issue is frequent enough to *add a new operator*. Yet unsigned shifts correlate naturally with unsigned numbers. So what exactly is it that is valuable in >>> that makes its presence in the language justifiable? Generally the irritation I feel whenever I right shift and have to go back through and either check the type or just cast it to unsigned to be sure there is no latent bug. But x >>> 1 doesn't work for shorts and bytes. For example, the optlink asm code does quite a lot of unsigned right shifts. I have to be very careful about the typing to ensure a matching unsigned shift, since I have little idea what the range of values the variable can have. I've read the OMF spec, and I know it includes shorts and bytes. So I really don't think >>> solves even this use case.
Re: Signed word lengths and indexes
Walter Bright wrote: Andrei Alexandrescu wrote: Note that your argument is predicated on using signed types instead of unsigned types in the first place, and tacitly assumes the issue is frequent enough to *add a new operator*. Yet unsigned shifts correlate naturally with unsigned numbers. So what exactly is it that is valuable in >>> that makes its presence in the language justifiable? Generally the irritation I feel whenever I right shift and have to go back through and either check the type or just cast it to unsigned to be sure there is no latent bug. For example, the optlink asm code does quite a lot of unsigned right shifts. I have to be very careful about the typing to ensure a matching unsigned shift, since I have little idea what the range of values the variable can have. I'm sure all linker asm writers will be happy about that feature :o}. Andrei
Re: Signed word lengths and indexes
Hello Don, BCS wrote: Hello Don, Surprise! c == -1. No kidding! Because 1 is an int, b gets promoted to int before the shift happens. Why would it ever need to be promoted? Unless all (most?) CPUs have only size_t shifts, all three shifts should never promote the LHS. It shouldn't NEED to. But C defined that >> and << operate that way. At least for >>>, can we break that? C doesn't even *have* a >>> operator.
Re: Signed word lengths and indexes
Andrei Alexandrescu wrote: Note that your argument is predicated on using signed types instead of unsigned types in the first place, and tacitly assumes the issue is frequent enough to *add a new operator*. Yet unsigned shifts correlate naturally with unsigned numbers. So what exactly is it that is valuable in >>> that makes its presence in the language justifiable? Generally the irritation I feel whenever I right shift and have to go back through and either check the type or just cast it to unsigned to be sure there is no latent bug. For example, the optlink asm code does quite a lot of unsigned right shifts. I have to be very careful about the typing to ensure a matching unsigned shift, since I have little idea what the range of values the variable can have.
Re: Signed word lengths and indexes
Walter Bright wrote: Andrei Alexandrescu wrote: Walter Bright wrote: Steven Schveighoffer wrote: On Thu, 17 Jun 2010 15:24:52 -0400, Don wrote: A question I have though is, Java has >>>. Does Java have these problems too? Java doesn't have unsigned values, so it's necessary to use regular int's as bitmasks, hence the extra operator. The reason D has >>> is to cause an unsigned right shift to be generated without needing to resort to casts as one has to in C. The problem with such casts is they wreck generic code. No. http://www.digitalmars.com/d/2.0/phobos/std_traits.html#Unsigned void fun(T)(T num) if (isIntegral!T) { auto x = cast(Unsigned!T) num; ... } It's not a perfect replacement, as in if T is a custom integer type, you have to extend the template to support it. Let me think when I wanted an unsigned shift against an arbitrarily-sized integer. Um... never? Furthermore, now your BigInt custom type also has to support a cast to unsigned just so it can right shift. BigInt is a superficial argument. Unless you're willing to flesh it out much better, it can be safely dropped. Also, T may not be readily identifiable, so you'd have to write: cast(Unsigned!(typeof(expr))) expr; It's not like shift occurs often enough to make that an issue. Note that your argument is predicated on using signed types instead of unsigned types in the first place, and tacitly assumes the issue is frequent enough to *add a new operator*. Yet unsigned shifts correlate naturally with unsigned numbers. So what exactly is it that is valuable in >>> that makes its presence in the language justifiable? Andrei
Re: Signed word lengths and indexes
Andrei Alexandrescu wrote: Walter Bright wrote: Steven Schveighoffer wrote: On Thu, 17 Jun 2010 15:24:52 -0400, Don wrote: A question I have though is, Java has >>>. Does Java have these problems too? Java doesn't have unsigned values, so it's necessary to use regular int's as bitmasks, hence the extra operator. The reason D has >>> is to cause an unsigned right shift to be generated without needing to resort to casts as one has to in C. The problem with such casts is they wreck generic code. No. http://www.digitalmars.com/d/2.0/phobos/std_traits.html#Unsigned void fun(T)(T num) if (isIntegral!T) { auto x = cast(Unsigned!T) num; ... } It's not a perfect replacement, as in if T is a custom integer type, you have to extend the template to support it. Furthermore, now your BigInt custom type also has to support a cast to unsigned just so it can right shift. Also, T may not be readily identifiable, so you'd have to write: cast(Unsigned!(typeof(expr))) expr;
Re: Signed word lengths and indexes
On Jun 17, 10 23:50, Don wrote: KennyTM~ wrote: On Jun 17, 10 21:04, Don wrote: KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. That's arguable. But (byte & int -> int) is meaningful because (&) is somewhat "symmetric" compared to (>>>). See below. It's what C does that matters. Actually, what happens to x>>>y if y is negative? x.d(6): Error: shift by -1 is outside the range 0..32 If y is a variable, it actually performs x >>> (y&31); So it actually makes no sense for it to cast everything to int. The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? There's already an extra rule for >>>. ubyte a = 1; writeln(typeof(a >>> a).stringof); // prints "int". Similarly, (^^), (==), etc do not obey this "rule". The logical operators aren't relevant. They all return bool. ^^ obeys the rule: typeof(a^^b) is typeof(a*b), in all cases. IMO, for ShiftExpression ((>>), (<<), (>>>)) the return type should be typeof(lhs). I agree that would be better, but it would be a silent change from the C behaviour. So it's not possible. Too bad.
Re: Signed word lengths and indexes
Andrei Alexandrescu: > Just like with non-null references, Walter has > framed the matter in a way that makes convincing extremely difficult. > That would be great if he were right. I know this is off-topic in this thread. I remember the long thread about this. Making all D references non-null by default requires a significant change in both the language and the way objects are used in D, so I can understand that Walter has refused this idea; maybe he is right. But something more moderate can be done: keep references nullable by default, and invent a symbol (like @) that can be added as a suffix to a class reference type or pointer type to denote that it is non-null (and the type system can enforce it at the calling point too, etc.; it's part of the function signature or variable type, so it's more than just syntax sugar for a null test inside the function!). I believe this reduced idea can be enough to avoid many null-derived bugs. It's different from the situation with Java's checked exceptions, it's less viral: if you write a 100-line D program, or a long C-style D program, you are probably free to never use this feature. void foo(int*@ ptr, Bar@ b) {...} void main() { int*@ p = ensureNonull(cast(int*)malloc(int.sizeof)); Bar@ b = ensureNonull(new Bar()); foo(p, b); } Something (badly named) like ensureNonull() changes the input type into a non-null type and performs a run-time non-null test :-) Surely this idea has some holes, but they can probably be fixed. Bye, bearophile
Re: Signed word lengths and indexes
Walter Bright wrote: Steven Schveighoffer wrote: On Thu, 17 Jun 2010 15:24:52 -0400, Don wrote: A question I have though is, Java has >>>. Does Java have these problems too? Java doesn't have unsigned values, so it's necessary to use regular int's as bitmasks, hence the extra operator. The reason D has >>> is to cause an unsigned right shift to be generated without needing to resort to casts as one has to in C. Unfortunately it doesn't work. You still can't do an unsigned right shift of a signed byte by 1, without resorting to a cast. The problem with such casts is they wreck generic code. It's C's cavalier approach to implicit conversions that wrecks generic code. And it makes such a pig's breakfast of it that >>> doesn't quite work.
Re: Signed word lengths and indexes
Walter Bright wrote: Steven Schveighoffer wrote: On Thu, 17 Jun 2010 15:24:52 -0400, Don wrote: A question I have though is, Java has >>>. Does Java have these problems too? Java doesn't have unsigned values, so it's necessary to use regular int's as bitmasks, hence the extra operator. The reason D has >>> is to cause an unsigned right shift to be generated without needing to resort to casts as one has to in C. The problem with such casts is they wreck generic code. No. http://www.digitalmars.com/d/2.0/phobos/std_traits.html#Unsigned void fun(T)(T num) if (isIntegral!T) { auto x = cast(Unsigned!T) num; ... } Andrei
Re: Signed word lengths and indexes
Steven Schveighoffer wrote: On Thu, 17 Jun 2010 15:24:52 -0400, Don wrote: A question I have though is, Java has >>>. Does Java have these problems too? Java doesn't have unsigned values, so it's necessary to use regular int's as bitmasks, hence the extra operator. The reason D has >>> is to cause an unsigned right shift to be generated without needing to resort to casts as one has to in C. The problem with such casts is they wreck generic code.
Re: Signed word lengths and indexes
Don wrote: Andrei Alexandrescu wrote: Don wrote: KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. Actually, what happens to x>>>y if y is negative? The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? Wait a minute. D should never allow an implicit narrowing conversion. It doesn't for other cases, so isn't this a simple bug? It'll make it illegal, but it won't make it usable. I think the effect of full range propagation will be that >>> will become illegal for anything other than int and long, unless it is provably identical to >>. Unless you do the hideous b >>> cast(typeof(b))1; I think every D style guide will include the recommendation, "never use >>>". Three times. Three times I tried to convince Walter to remove that crap from D - one for each '>'. Last time was as the manuscript was going out the door and I was willing to take the flak from the copyeditors for the changes in pagination. Just like with non-null references, Walter has framed the matter in a way that makes convincing extremely difficult. That would be great if he were right. A question I have though is, Java has >>>. Does Java have these problems too? 
Java is much more conservative with implicit conversions, so they wouldn't allow the assignment without a cast. Beyond that, yes, the issues are the same. Andrei
Re: Signed word lengths and indexes
On Thu, 17 Jun 2010 15:24:52 -0400, Don wrote: A question I have though is, Java has >>>. Does Java have these problems too? Java doesn't have unsigned values, so it's necessary to use regular int's as bitmasks, hence the extra operator. -Steve
Re: Signed word lengths and indexes
Andrei Alexandrescu wrote: Don wrote: KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. Actually, what happens to x>>>y if y is negative? The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? Wait a minute. D should never allow an implicit narrowing conversion. It doesn't for other cases, so isn't this a simple bug? It'll make it illegal, but it won't make it usable. I think the effect of full range propagation will be that >>> will become illegal for anything other than int and long, unless it is provably identical to >>. Unless you do the hideous b >>> cast(typeof(b))1; I think every D style guide will include the recommendation, "never use >>>". A question I have though is, Java has >>>. Does Java have these problems too?
Re: Signed word lengths and indexes
Kagamin wrote: > Jérôme M. Berger Wrote:
>> #include <stdio.h>
>> #include <assert.h>
>>
>> int main (int argc, char** argv) {
>>     char* data = argv[0]; /* Just to get a valid pointer */
>>     unsigned int offset = 3;
>>
>>     printf ("Original: %p\n", data);
>>
>>     data += offset;
>>     printf ("+3 : %p\n", data);
>>
>>     data += -offset;
>>     printf ("-3 : %p\n", data);
>>
>>     assert (data == argv[0]); /* Works on 32-bit systems, fails on 64-bit */
>>
>>     return 0;
>> }
> Yo, dude!
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=97545
Yes, I know. I was pointing out to Walter a real-life example of code that works on 32-bit systems but not on 64-bit systems because of signedness issues. That was in answer to Walter saying: "I thought most 64 bit C compilers were specifically designed to avoid this problem." Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: Signed word lengths and indexes
On Thu, 17 Jun 2010 06:41:33 -0400, Kagamin wrote: > > Justin Spahr-Summers Wrote: > > > > 1. Ironically the issue is not in file offset's signedness. You still hit > > > the bug with ulong offset. > > > > How so? Subtracting a size_t from a ulong offset will only cause > > problems if the size_t value is larger than the offset. If that's the > > case, then the issue remains even with a signed offset. > > May be, you didn't see the testcase. > ulong a; > ubyte[] b; > a+=-b.length; // go a little backwards I did see that, but that's erroneous code. Maybe the compiler could warn about unary minus on an unsigned type, but I find such problems rare as long as everyone working on the code understands signedness. > or > > seek(-b.length, SEEK_CUR, file); I wouldn't call it a failure of unsigned types that this causes problems. Like I suggested above, the situation could possibly be alleviated if the compiler just warned about unary minus no-ops. Like a couple others pointed out, this is just a lack of understanding of unsigned types and modular arithmetic. I'd say that any programmer should have such an understanding, regardless if their programming language of choice supports unsigned types or not. > > > 2. Signed offset is two times safer than unsigned as you can detect > > > underflow bug (and, maybe, overflow). > > > > The solution with unsigned values is to make sure that they won't > > underflow *before* performing the arithmetic - and that's really the > > proper solution anyways. > > If you rely on client code to be correct, you get security issue. And client > doesn't necessarily use your language or your compiler. Or he can turn off > overflow checks for performance. Or he can use the same unsigned variable for > both signed and unsigned offsets, so checks for underflow become useless. What kind of client are we talking about? 
If you're referring to contract programming, then it's the client's own fault if they fiddle around with the code and end up breaking it or violating its conventions. > > > With unsigned offset you get exception if the filesystem doesn't > > > support sparse files, so the linux will keep silence. > > > > I'm not sure what this means. Can you explain? > > This means that you have subtle bug. > > > > 3. Signed offset is consistent/type-safe in the case of the seek function > > > as it doesn't arbitrarily mutate between signed and unsigned. > > > > My point was about signed values being used to represent zero-based > > indices. Obviously there are applications for a signed offset *from the > > current position*. It's seeking to a signed offset *from the start of > > the file* that's unsafe. > > To catch this in the case of signed offset you need only one check. In the > case of unsigned offsets you have to watch underflows in the entire > application code even if it's not related to file seeks - just in order to > fix issue that can be fixed separately. Signed offsets can (truly) underflow as well. I don't see how the issue is any different. > > > > 4. Choosing unsigned for file offset is not dictated by safety, but by > > > stupidity: "hey, I lose my bit!" > > > > You referred to 32-bit systems, correct? I'm sure there are 32-bit > > systems out there that need to be able to access files larger than two > > gigabytes. > > I'm talking about 64-bit file offsets which are 64-bit on 32-bit systems too. In D's provided interface, this is true, but fseek() from C uses C's long data type, which is *not* 64-bit on 32-bit systems, and this is (I assume) what std.stdio uses under-the-hood, making it doubly unsafe. > As to file size limitations there's no difference between signed and > unsigned lengths. File sizes have no tendency to stick to the 4 gig value. If > you need to handle files larger than 2 gigs, you also need to handle > files larger than 4 gigs. Of course. 
But why restrict oneself to half the available space unnecessarily? > > > I AM an optimization zealot, but unsigned offsets are plain dead > > > freaking stupid. > > > > It's not an optimization. Unsigned values logically correspond to > > disk and memory locations. > > They don't. Memory locations are a *subset* of size_t values range. > That's why you have bound checks. And the problem is usage of these > locations: memory bus doesn't perform computations on the addresses, > application does - it adds, subtracts, mixes signeds with unsigneds, > has various type system holes or kludges, library design issues, used > good practices etc. In other words, it gets a little bit complex than > just locations. Bounds checking does alleviate the issue somewhat, I'll grant you that. But as far as address computation, even if your application does none, the operating system still will in order to map logical addresses, which start at 0, to physical addresses, which also start at 0. And the memory bus absolutely requires unsigned values even if it needs to perform no computations.
Re: Signed word lengths and indexes
Don wrote: KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. Actually, what happens to x>>>y if y is negative? The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? Wait a minute. D should never allow an implicit narrowing conversion. It doesn't for other cases, so isn't this a simple bug? Andrei
Re: Signed word lengths and indexes
Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I agree. But even within the current language, value range propagation (VRP) should disallow this case without a problem. There's been a long discussion about computing the bounds of a & b and a || b given the bounds of a and b. The current VRP code for those operations is broken, and I suspect the VRP code for a >>> b is broken too. Andrei
Re: Signed word lengths and indexes
KennyTM~ wrote: On Jun 17, 10 21:04, Don wrote: KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. That's arguable. But (byte & int -> int) is meaningful because (&) is some what "symmetric" compared to (>>>). See below. It's what C does that matters. Actually, what happens to x>>>y if y is negative? x.d(6): Error: shift by -1 is outside the range 0..32 If y is a variable, it actually performs x >>> (y&31); So it actually makes no sense for it to cast everything to int. The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? There's already an extra rule for >>>. ubyte a = 1; writeln(typeof(a >>> a).stringof); // prints "int". Similarly, (^^), (==), etc do not obey this "rule". The logical operators aren't relevant. They all return bool. ^^ obeys the rule: typeof(a^^b) is typeof(a*b), in all cases. IMO, for ShiftExpression ((>>), (<<), (>>>)) the return type should be typeof(lhs). I agree that would be better, but it would be a silent change from the C behaviour. So it's not possible.
Re: Signed word lengths and indexes
BCS wrote: Hello Don, Surprise! c == -1. No kidding! Because 1 is an int, b gets promoted to int before the shift happens. Why would it ever need to be promoted? Unless all (most?) CPUs have only size_t shifts, all three shifts should never promote the LHS. It shouldn't NEED to. But C defined that >> and << operate that way.
Re: Signed word lengths and indexes
On Jun 17, 10 21:04, Don wrote: KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. That's arguable. But (byte & int -> int) is meaningful because (&) is some what "symmetric" compared to (>>>). What does (&) do? (a & b) <=> foreach (bit x, y; zip(a, b)) yield bit (x == y ? 1 : 0); What does (>>>) do? (a >>> b) <=> repeat b times { logical right shift (a); } return a; Algorithmically, (&) needs to iterate over all bits of "a" and "b", but for (>>>) the range of "b" is irrelevant to the result of "a >>> b". Actually, what happens to x>>>y if y is negative? x.d(6): Error: shift by -1 is outside the range 0..32 The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? There's already an extra rule for >>>. ubyte a = 1; writeln(typeof(a >>> a).stringof); // prints "int". Similarly, (^^), (==), etc do not obey this "rule". IMO, for ShiftExpression ((>>), (<<), (>>>)) the return type should be typeof(lhs). More interesting case is byte c = -1 >>> 1;
Re: Signed word lengths and indexes
Hello Don, The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? However it's not that way for the ternary op, so there is a (somewhat related) precedent. Even considering RHS<0, I would NEVER /expect/ a shift to have any type other than typeof(LHS). -- ... <
Re: Signed word lengths and indexes
Hello Don, Surprise! c == -1. No kidding! Because 1 is an int, b gets promoted to int before the shift happens. Why would it ever need to be promoted? Unless all (most?) CPUs have only size_t shifts, all three shifts should never promote the LHS. -- ... <
Re: Signed word lengths and indexes
KennyTM~ wrote: On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. The range of typeof(x & y) can never exceed the range of typeof(x), no matter what value y is. Yet (byte & int) is promoted to int. Actually, what happens to x>>>y if y is negative? The current rule is: x OP y means cast(CommonType!(x,y))x OP cast(CommonType!(x,y))y for any binary operation OP. How can we fix >>> without adding an extra rule? More interesting case is byte c = -1 >>> 1;
Re: Signed word lengths and indexes
On Jun 17, 10 18:59, Don wrote: Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. I disagree. The flaw is whether x should be promoted to CommonType!(typeof(x), int), given that the range of typeof(x >>> y) should never exceed the range of typeof(x), no matter what value y is. More interesting case is byte c = -1 >>> 1;
Re: Signed word lengths and indexes
Kagamin wrote: Don Wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) :) Well, there was issue. Wasn't it fixed? No. It's a design flaw, not a bug. I think it could only be fixed by disallowing that code, or creating a special rule to make that code do what you expect. A better solution would be to drop >>>. More interesting case is byte c = -1 >>> 1;
Re: Signed word lengths and indexes
Justin Spahr-Summers wrote: On Thu, 17 Jun 2010 10:00:24 +0200, Don wrote: (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! ) 127, right? I know at least RISC processors tend to have instructions for both a logical and algebraic right shift. In that context, it makes sense for a systems programming language. Surprise! c == -1. Because 1 is an int, b gets promoted to int before the shift happens. Then the result is 0x7FFF_FFFF which then gets converted to byte, leaving 0xFF == -1.
Re: Signed word lengths and indexes
Don Wrote: > (D has introduced ANOTHER instance of this with the ridiculous >>> > operator. > byte b = -1; > byte c = b >>> 1; > Guess what c is! > ) :) Well, there was issue. Wasn't it fixed? More interesting case is byte c = -1 >>> 1;
Re: Signed word lengths and indexes
Justin Spahr-Summers Wrote: > > 1. Ironically the issue is not in file offset's signedness. You still hit > > the bug with ulong offset. > > How so? Subtracting a size_t from a ulong offset will only cause > problems if the size_t value is larger than the offset. If that's the > case, then the issue remains even with a signed offset. May be, you didn't see the testcase. ulong a; ubyte[] b; a+=-b.length; // go a little backwards or seek(-b.length, SEEK_CUR, file); > > 2. Signed offset is two times safer than unsigned as you can detect > > underflow bug (and, maybe, overflow). > > The solution with unsigned values is to make sure that they won't > underflow *before* performing the arithmetic - and that's really the > proper solution anyways. If you rely on client code to be correct, you get security issue. And client doesn't necessarily use your language or your compiler. Or he can turn off overflow checks for performance. Or he can use the same unsigned variable for both signed and unsigned offsets, so checks for underflow become useless. > > With unsigned offset you get exception if the filesystem doesn't > > support sparse files, so the linux will keep silence. > > I'm not sure what this means. Can you explain? This means that you have subtle bug. > > 3. Signed offset is consistent/type-safe in the case of the seek function > > as it doesn't arbitrarily mutate between signed and unsigned. > > My point was about signed values being used to represent zero-based > indices. Obviously there are applications for a signed offset *from the > current position*. It's seeking to a signed offset *from the start of > the file* that's unsafe. To catch this is the case of signed offset you need only one check. In the case of unsigned offsets you have to watch underflows in the entire application code even if it's not related to file seeks - just in order to fix issue that can be fixed separately. > > 4. 
Choosing unsigned for file offset is not dictated by safety, but by > > stupidity: "hey, I lose my bit!" > > You referred to 32-bit systems, correct? I'm sure there are 32-bit > systems out there that need to be able to access files larger than two > gigabytes. I'm talking about 64-bit file offsets which are 64-bit on 32-bit systems too. As to file size limitations there's no difference between signed and unsigned lengths. File sizes have no tendency to stick to 4 gig value. If you need to handle files larger than 2 gigs, you also need to handle files larger than 4 gigs. > > I AM an optimization zealot, but unsigned offsets are plain dead > freaking stupid. > > It's not an optimization. Unsigned values logically correspond to disk > and memory locations. They don't. Memory locations are a *subset* of size_t values range. That's why you have bound checks. And the problem is usage of these locations: memory bus doesn't perform computations on the addresses, application does - it adds, subtracts, mixes signeds with unsigneds, has various type system holes or kludges, library design issues, used good practices etc. In other words, it gets a little bit more complex than just locations.
Re: Signed word lengths and indexes
On Thu, 17 Jun 2010 10:00:24 +0200, Don wrote: > (D has introduced ANOTHER instance of this with the ridiculous >>> > operator. > byte b = -1; > byte c = b >>> 1; > Guess what c is! > ) > 127, right? I know at least RISC processors tend to have instructions for both a logical and algebraic right shift. In that context, it makes sense for a systems programming language.
Re: Signed word lengths and indexes
Jérôme M. Berger wrote: Walter Bright wrote: Jérôme M. Berger wrote: Jérôme M. Berger wrote: Walter Bright wrote: Jérôme M. Berger wrote: Now, we have code that works fine on 32-bit platforms (x86 and arm) but segfaults on x86_64. Simply adding an (int) cast in front of the image dimensions in a couple of places fixes the issue (tested with various versions of gcc on linux and windows). Easy. offset should be a size_t, not an unsigned. And what about image width and height? Sure, in hindsight they could probably be made into size_t too. Much easier and safer to make them into signed ints instead, since we don't manipulate images bigger than 2_147_483_648 on a side anyway... Which is more or less bearophile's point: unless you're *really* sure that you know what you're doing, use signed ints even if negative numbers make no sense in a particular context. I agree. Actually the great evil in C is that implicit casts from signed<->unsigned AND sign extension are both permitted in a single expression. I hope that when the integer range checking is fully implemented in D, such two-way implicit casts will be forbidden. (D has introduced ANOTHER instance of this with the ridiculous >>> operator. byte b = -1; byte c = b >>> 1; Guess what c is! )
Re: Signed word lengths and indexes
On Thu, 17 Jun 2010 03:27:59 -0400, Kagamin wrote: > > Justin Spahr-Summers Wrote: > > > This sounds more like an issue with file offsets being longs, > > ironically. Using longs to represent zero-based locations in a file is > > extremely unsafe. Such usages should really be restricted to short-range > > offsets from the current file position, and fpos_t used for everything > > else (which is assumably available in std.c.stdio). > > 1. Ironically the issue is not in file offset's signedness. You still hit the > bug with ulong offset. How so? Subtracting a size_t from a ulong offset will only cause problems if the size_t value is larger than the offset. If that's the case, then the issue remains even with a signed offset. > 2. Signed offset is two times safer than unsigned as you can detect > underflow bug (and, maybe, overflow). The solution with unsigned values is to make sure that they won't underflow *before* performing the arithmetic - and that's really the proper solution anyways. > With unsigned offset you get exception if the filesystem doesn't > support sparse files, so the linux will keep silence. I'm not sure what this means. Can you explain? > 3. Signed offset is consistent/type-safe in the case of the seek function as > it doesn't arbitrarily mutate between signed and unsigned. My point was about signed values being used to represent zero-based indices. Obviously there are applications for a signed offset *from the current position*. It's seeking to a signed offset *from the start of the file* that's unsafe. > 4. Choosing unsigned for file offset is not dictated by safety, but by > stupidity: "hey, I lose my bit!" You referred to 32-bit systems, correct? I'm sure there are 32-bit systems out there that need to be able to access files larger than two gigabytes. > I AM an optimization zealot, but unsigned offsets are plain dead > freaking stupid. It's not an optimization. Unsigned values logically correspond to disk and memory locations.
Re: Signed word lengths and indexes
Justin Spahr-Summers Wrote: > This sounds more like an issue with file offsets being longs, > ironically. Using longs to represent zero-based locations in a file is > extremely unsafe. Such usages should really be restricted to short-range > offsets from the current file position, and fpos_t used for everything > else (which is assumably available in std.c.stdio). 1. Ironically the issue is not in file offset's signedness. You still hit the bug with ulong offset. 2. Signed offset is two times safer than unsigned as you can detect underflow bug (and, maybe, overflow). With unsigned offset you get exception if the filesystem doesn't support sparse files, so the linux will keep silence. 3. Signed offset is consistent/type-safe in the case of the seek function as it doesn't arbitrarily mutate between signed and unsigned. 4. Choosing unsigned for file offset is not dictated by safety, but by stupidity: "hey, I lose my bit!" I AM an optimization zealot, but unsigned offsets are plain dead freaking stupid.
Re: Signed word lengths and indexes
Walter Bright Wrote: > 2. For an operating system kernel's memory management logic, it still would > make > sense to represent the address space as a flat range from 0..n, not one > that's > split in the middle, half of which is accessed with negative offsets. D is > supposed to support OS development. > You said it yourself: the compiler can be modified for kernel development. This makes kernel examples (not even considering their validity) not very valuable.
Re: Signed word lengths and indexes
On Thu, 17 Jun 2010 02:46:13 -0400, Kagamin wrote: > > Walter Bright Wrote: > > > Easy. offset should be a size_t, not an unsigned. > > I've hit the bug using size_t at the right side of a+=-b (array length). It's > just a long was at the left side (file offset). Such code should actually > work in 64bit system and it fails in 32bit. MS compiler reports such > portability issues with a warning, I believe. This sounds more like an issue with file offsets being longs, ironically. Using longs to represent zero-based locations in a file is extremely unsafe. Such usages should really be restricted to short-range offsets from the current file position, and fpos_t used for everything else (which is assumably available in std.c.stdio).
Re: Signed word lengths and indexes
Walter Bright Wrote: > Easy. offset should be a size_t, not an unsigned. I've hit the bug using size_t at the right side of a+=-b (array length). It's just a long was at the left side (file offset). Such code should actually work in 64bit system and it fails in 32bit. MS compiler reports such portability issues with a warning, I believe.
Re: Signed word lengths and indexes
Jérôme M. Berger Wrote:
> #include <stdio.h>
> #include <assert.h>
>
> int main (int argc, char** argv) {
>     char *data = argv[0]; /* Just to get a valid pointer */
>     unsigned int offset = 3;
>
>     printf ("Original: %p\n", data);
>
>     data += offset;
>     printf ("+3 : %p\n", data);
>
>     data += -offset;
>     printf ("-3 : %p\n", data);
>
>     assert (data == argv[0]); /* Works on 32-bit systems, fails on 64-bit */
>
>     return 0;
> }
Yo, dude! http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=97545
Re: Signed word lengths and indexes
On 06/16/2010 04:48 PM, Walter Bright wrote: Walter Bright wrote: The people who do understand transitive const and need no convincing are the functional programming crowd. What's interesting are the languages which claim to offer FP features, as that's the latest bandwagon, but totally miss transitive const. I wish to add that I've not heard any proposal or discussion of adding transitive const to C++0x, Java, or C#. I know the Javari paper was mentioned here by Bruno. Also, the idea of deep immutability just isn't rocket science and has occurred to many people, and is why many people have started looking into Haskell given the new focus on concurrency. However, you're right in that as far as I know D is the only language to take the ball and run with it.
Re: Signed word lengths and indexes
Walter Bright wrote: > Jérôme M. Berger wrote: >> Jérôme M. Berger wrote: >>> Walter Bright wrote: Jérôme M. Berger wrote: > Actually, that problem already occurs in C. I've had problems when > porting code from x86 to x86_64 because some unsigned operations > don't behave the same way on both... How so? I thought most 64 bit C compilers were specifically designed to avoid this problem. >>> I can't isolate it to a minimal test case, but at my job, we make >>> an image processing library. Since negative image dimensions don't >>> make sense, we decided to define width and height as "unsigned int". >>> Now, we have code that works fine on 32-bit platforms (x86 and arm) >>> but segfaults on x86_64. Simply adding an (int) cast in front of the >>> image dimensions in a couple of places fixes the issue (tested with >>> various versions of gcc on linux and windows). >>> >> Gotcha! See the attached test case. I will post the explanation for >> the issue as a reply to give everyone a chance to try and spot the >> error... > > Easy. offset should be a size_t, not an unsigned. And what about image width and height? Sure, in hindsight they could probably be made into size_t too. Much easier and safer to make them into signed ints instead, since we don't manipulate images bigger than 2_147_483_648 on a side anyway... Which is more or less bearophile's point: unless you're *really* sure that you know what you're doing, use signed ints even if negative numbers make no sense in a particular context. Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: Signed word lengths and indexes
Walter Bright wrote: The people who do understand transitive const and need no convincing are the functional programming crowd. What's interesting are the languages which claim to offer FP features, as that's the latest bandwagon, but totally miss transitive const. I wish to add that I've not heard any proposal or discussion of adding transitive const to C++0x, Java, or C#.
Re: Signed word lengths and indexes
Andrei Alexandrescu wrote: > Jérôme M. Berger wrote: >> Jérôme M. Berger wrote: >>> Walter Bright wrote: Jérôme M. Berger wrote: > Actually, that problem already occurs in C. I've had problems when > porting code from x86 to x86_64 because some unsigned operations > don't behave the same way on both... How so? I thought most 64 bit C compilers were specifically designed to avoid this problem. >>> I can't isolate it to a minimal test case, but at my job, we make >>> an image processing library. Since negative image dimensions don't >>> make sense, we decided to define width and height as "unsigned int". >>> Now, we have code that works fine on 32-bit platforms (x86 and arm) >>> but segfaults on x86_64. Simply adding an (int) cast in front of the >>> image dimensions in a couple of places fixes the issue (tested with >>> various versions of gcc on linux and windows). >>> >> Gotcha! See the attached test case. I will post the explanation for >> the issue as a reply to give everyone a chance to try and spot the >> error... >> >> Jerome >> > > Whoa! That's indeed unfortunate. Allow me some more whoring for TDPL: > > == > \indexes{surprising behavior!of unary \lstinline{-}}% > One surprising behavior of unary minus is that, when applied to an > unsigned value, it still yields an unsigned value (according to the > rules in~\S~\vref{sec:typing-of-ops}). For example,\sbs @-55u@ is\sbs > @4_294_967_241@, which is\sbs \ccbox{uint.max - 55 + 1}. > > \indexes{unsigned type, natural number, two's complement, overflow}% > The fact that unsigned types are not really natural numbers is a fact > of life. In\sbs\dee and many other languages, two's complement > arithmetic with its simple overflow rules is an inescapable reality > that cannot be abstracted away. One way to think \mbox{of} @-val@ for > any integral value @val@ is to consider it a short form \mbox{of}$\,$ > \cc{\~val + 1}; in other words, flip every bit in @val@ and then add > @1@ to the result.
This manipulation does not raise particular > questions about the signedness of @val@. > == > > (This heavily adorned text also shows what sausage making looks like...) > In the original code, the problem didn't come from an unary minus. The rhs expression was quite a bit more complicated than that (not counting the fact that it was hidden in a preprocessor macro...). Note moreover that the problem doesn't come from the unary minus since the code works as expected on 32-bit platforms... Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: Signed word lengths and indexes
Jeff Nowakowski wrote: On 06/16/2010 12:33 PM, Walter Bright wrote: Jeff Nowakowski wrote: On 06/15/2010 05:43 PM, Walter Bright wrote: One example of this is transitive immutability. Nobody asked for it. I find this hard to believe. I seem to recall that you were personally against const for a very long time. Did none of the people advocating for const suggest a deep const? Should I dig through the archives? Andrei explained transitivity to me and convinced me of its utility. Ok, but lots of people have been talking about const correctness for years (including yourself), stemming from the known C++ problems, and I don't see how "transitive immutability" (a deep const) is a new idea that nobody asked for. The only thing new here is that you guys came up with an implementation for D, and lots of people were glad to have it, even if many were also against it. I've talked with C++ experts for years about const. Not one of them ever mentioned transitivity, let alone asked for it or thought it was a desirable property. After we designed transitive const for D, I presented it to several C++ experts. My first job was to explain what transitive meant - none of them were familiar with the idea. Next, it took a lot of convincing of them that this was a good idea. They all insisted that a const pointer to mutable data was terribly important. While it is true that C++ people have talked about const-correctness since const was introduced to C++, it does not at all imply any concept or understanding of transitivity. Transitivity is an orthogonal idea. The people who do understand transitive const and need no convincing are the functional programming crowd. What's interesting are the languages which claim to offer FP features, as that's the latest bandwagon, but totally miss transitive const.
Re: Signed word lengths and indexes
Jérôme M. Berger wrote: Jérôme M. Berger wrote: Walter Bright wrote: Jérôme M. Berger wrote: Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both... How so? I thought most 64 bit C compilers were specifically designed to avoid this problem. I can't isolate it to a minimal test case, but at my job, we make an image processing library. Since negative image dimensions don't make sense, we decided to define width and height as "unsigned int". Now, we have code that works fine on 32-bit platforms (x86 and arm) but segfaults on x86_64. Simply adding an (int) cast in front of the image dimensions in a couple of places fixes the issue (tested with various versions of gcc on linux and windows). Gotcha! See the attached test case. I will post the explanation for the issue as a reply to give everyone a chance to try and spot the error... Easy. offset should be a size_t, not an unsigned.
Re: Signed word lengths and indexes
Jérôme M. Berger wrote: Jérôme M. Berger wrote: Walter Bright wrote: Jérôme M. Berger wrote: Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both... How so? I thought most 64 bit C compilers were specifically designed to avoid this problem. I can't isolate it to a minimal test case, but at my job, we make an image processing library. Since negative image dimensions don't make sense, we decided to define width and height as "unsigned int". Now, we have code that works fine on 32-bit platforms (x86 and arm) but segfaults on x86_64. Simply adding an (int) cast in front of the image dimensions in a couple of places fixes the issue (tested with various versions of gcc on linux and windows). Gotcha! See the attached test case. I will post the explanation for the issue as a reply to give everyone a chance to try and spot the error... Jerome Whoa! That's indeed unfortunate. Allow me some more whoring for TDPL: == \indexes{surprising behavior!of unary \lstinline{-}}% One surprising behavior of unary minus is that, when applied to an unsigned value, it still yields an unsigned value (according to the rules in~\S~\vref{sec:typing-of-ops}). For example,\sbs @-55u@ is\sbs @4_294_967_241@, which is\sbs \ccbox{uint.max - 55 + 1}. \indexes{unsigned type, natural number, two's complement, overflow}% The fact that unsigned types are not really natural numbers is a fact of life. In\sbs\dee and many other languages, two's complement arithmetic with its simple overflow rules is an inescapable reality that cannot be abstracted away. One way to think \mbox{of} @-val@ for any integral value @val@ is to consider it a short form \mbox{of}$\,$ \cc{\~val + 1}; in other words, flip every bit in @val@ and then add @1@ to the result. This manipulation does not raise particular questions about the signedness of @val@. 
== (This heavily adorned text also shows what sausage making looks like...) Andrei
Re: Signed word lengths and indexes
Jérôme M. Berger wrote: > Jérôme M. Berger wrote: >> Walter Bright wrote: >>> Jérôme M. Berger wrote: Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both... >>> How so? I thought most 64 bit C compilers were specifically designed to >>> avoid this problem. >> I can't isolate it to a minimal test case, but at my job, we make >> an image processing library. Since negative image dimensions don't >> make sense, we decided to define width and height as "unsigned int". >> Now, we have code that works fine on 32-bit platforms (x86 and arm) >> but segfaults on x86_64. Simply adding an (int) cast in front of the >> image dimensions in a couple of places fixes the issue (tested with >> various versions of gcc on linux and windows). >> > Gotcha! See the attached test case. I will post the explanation for > the issue as a reply to give everyone a chance to try and spot the > error... > The problem comes from the fact that an unsigned int is 32 bits, even on 64 bits architecture, so what happens is: - Some operation between signed and unsigned ints gives a negative result. Because of the automatic type conversion rules, this is converted to an unsigned 32-bit int; - The result is added to a pointer. On 32-bit systems, the operation simply wraps around and works. On 64-bit systems, the result is extended to 64 bits by adding zeroes (since it is unsigned) and the resulting pointer is wrong. That's reasonably easy to spot in this simple example. It's a lot more difficult on real world code. We had the problem because we were moving a pointer through the image data. As soon as the movement depended on the image dimensions (say: move left by 1/4 the width), then the program crashed. Every other kind of move worked just fine... Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: Signed word lengths and indexes
Jérôme M. Berger wrote:
> Walter Bright wrote:
>> Jérôme M. Berger wrote:
>>> Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both...
>> How so? I thought most 64 bit C compilers were specifically designed to avoid this problem.
> I can't isolate it to a minimal test case, but at my job, we make an image processing library. Since negative image dimensions don't make sense, we decided to define width and height as "unsigned int". Now, we have code that works fine on 32-bit platforms (x86 and arm) but segfaults on x86_64. Simply adding an (int) cast in front of the image dimensions in a couple of places fixes the issue (tested with various versions of gcc on linux and windows).

Gotcha! See the attached test case. I will post the explanation for the issue as a reply to give everyone a chance to try and spot the error...

Jerome
--
mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr

#include <stdio.h>
#include <assert.h>

int main (int argc, char** argv)
{
    char* data = argv[0]; /* Just to get a valid pointer */
    unsigned int offset = 3;

    printf ("Original: %p\n", data);
    data += offset;
    printf ("+3      : %p\n", data);
    data += -offset;
    printf ("-3      : %p\n", data);

    assert (data == argv[0]); /* Works on 32-bit systems, fails on 64-bit */
    return 0;
}
Re: Signed word lengths and indexes
On 06/16/2010 12:33 PM, Walter Bright wrote: Jeff Nowakowski wrote: On 06/15/2010 05:43 PM, Walter Bright wrote: One example of this is transitive immutability. Nobody asked for it. I find this hard to believe. I seem to recall that you were personally against const for a very long time. Did none of the people advocating for const suggest a deep const? Should I dig through the archives? Andrei explained transitivity to me and convinced me of its utility. Ok, but lots of people have been talking about const correctness for years (including yourself), stemming from the known C++ problems, and I don't see how "transitive immutability" (a deep const) is a new idea that nobody asked for. The only thing new here is that you guys came up with an implementation for D, and lots of people were glad to have it, even if many were also against it.
Re: Signed word lengths and indexes
Walter Bright wrote:
> Jérôme M. Berger wrote:
>> Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both...
> How so? I thought most 64 bit C compilers were specifically designed to avoid this problem.

I can't isolate it to a minimal test case, but at my job, we make an image processing library. Since negative image dimensions don't make sense, we decided to define width and height as "unsigned int". Now, we have code that works fine on 32-bit platforms (x86 and arm) but segfaults on x86_64. Simply adding an (int) cast in front of the image dimensions in a couple of places fixes the issue (tested with various versions of gcc on linux and windows).

Jerome
--
mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: Signed word lengths and indexes
== Quote from bearophile (bearophileh...@lycos.com)'s article
> Walter Bright:
> > If we go back in the thread, the argument for the signed size_t argument was for
> > 64 bit address spaces.
> I was asking for signed word lengths and indexes on both 32 and 64 bit systems. Sorry for not being more clear from the start.
> >With 32 bit address spaces, objects larger than 31 bits are needed.<
> I don't fully understand what you mean. On 32 bit systems I can accept arrays and lists and collections (or a call to malloc) with only 2_147_483_648 items/bytes.
> On a 32 bit Windows with 3 GB RAM (and Windows itself set to use 3 GB) DMD allows me to allocate only a part of those. In practice you can't allocate more than 2 GB in a single block.
> On a 32 bit system I can desire arrays with something like 3_000_000_000 items only when the array items are single bytes (ubyte, char, byte, bool), and such situations are not so common (and probably 32bit Windows will not allow me to do it).
> (I am still writing a comment to another answer of yours, I am not so fast, please be patient :-) )
> Bye,
> bearophile

That's because Win32 reserves the upper half of the address space for kernel address space. If you use the 3GB switch, then you get 3GB for your program and 1GB for the kernel, but only if the program is large address space aware. If you use Win64, you get 4 GB of address space for your 32-bit programs, but again only if they're large address space aware. Programs need to be explicitly made large address space aware because some legacy programs assumed it would never be possible to have more than 2GB of address space, and thus used the most significant bit of pointers "creatively" or used ints for things that size_t should be used for; such programs would break in unpredictable ways if they could suddenly see more than 2GB of address space. You can make a program large address space aware by using editbin (http://msdn.microsoft.com/en-us/library/d25ddyfc%28v=VS.80%29.aspx).
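For reference, flipping that flag with editbin looks like this (a command-line fragment for Windows only, assuming the Visual Studio tools are on PATH; myprogram.exe is a placeholder name):

```shell
editbin /LARGEADDRESSAWARE myprogram.exe
```

Linkers can also set the bit at build time, but editbin is the usual way to retrofit it onto an existing executable.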
Re: Signed word lengths and indexes
Walter Bright: > If we go back in the thread, the argument for the signed size_t argument was > for > 64 bit address spaces. I was asking for signed word lengths and indexes on both 32 and 64 bit systems. Sorry for not being more clear from the start. >With 32 bit address spaces, objects larger than 31 bits are needed.< I don't fully understand what you mean. On 32 bit systems I can accept arrays and lists and collections (or a call to malloc) with only 2_147_483_648 items/bytes. On a 32 bit Windows with 3 GB RAM (and Windows itself set to use 3 GB) DMD allows me to allocate only a part of those. In practice you can't allocate more than 2 GB in a single block. On a 32 bit system I can desire arrays with something like 3_000_000_000 items only when the array items are single bytes (ubyte, char, byte, bool), and such situations are not so common (and probably 32bit Windows will not allow me to do it). (I am still writing a comment to another answer of yours, I am not so fast, please be patient :-) ) Bye, bearophile
Re: Signed word lengths and indexes
bearophile wrote: Walter Bright: Changing the sign of size_t from unsigned to signed when going from 32 to 64 bits will cause a difference in behavior.< I have proposed to use a "signed word" on both 32 and 64 bits systems. So where's the difference in behaviour? If we go back in the thread, the argument for the signed size_t argument was for 64 bit address spaces. With 32 bit address spaces, objects larger than 31 bits are needed.
Re: Signed word lengths and indexes
Jeff Nowakowski wrote: On 06/15/2010 05:43 PM, Walter Bright wrote: One example of this is transitive immutability. Nobody asked for it. I find this hard to believe. I seem to recall that you were personally against const for a very long time. Did none of the people advocating for const suggest a deep const? Should I dig through the archives? Andrei explained transitivity to me and convinced me of its utility.
Re: Signed word lengths and indexes
On 06/15/2010 05:43 PM, Walter Bright wrote: One example of this is transitive immutability. Nobody asked for it. I find this hard to believe. I seem to recall that you were personally against const for a very long time. Did none of the people advocating for const suggest a deep const? Should I dig through the archives?
Re: Signed word lengths and indexes
Walter Bright: >Changing the sign of size_t from unsigned to signed when going from 32 to 64 >bits will cause a difference in behavior.< I have proposed to use a "signed word" on both 32 and 64 bits systems. So where's the difference in behaviour? >A memory manager sees the address space as 0..N, not -N/2..0..N/2< If D arrays use signed words as indexes on 32 bit systems then only half of the original length can be used. The numbers in 0..N/2 are a subset of half of the unsigned number range. Bye, bearophile
Re: Signed word lengths and indexes
Walter Bright wrote: Stephan wrote: On 15.06.2010 19:41, Walter Bright wrote: 3. The compiler could be easily modified to add a switch that prevents such features from being used. This is no different from the customizations done to C compilers for kernel dev. Why not make such a change in a future release of the official version ? It's pretty low on the priority list, because the absence of such a switch would not prevent you from using D as a better C compiler. I would move it up in the priority if there was a serious project that needed it, as opposed to being a convenient excuse to not use D. One reason that dmd comes with source is so that people can try out things like this.
Re: Signed word lengths and indexes
Stephan wrote: On 15.06.2010 19:41, Walter Bright wrote: 3. The compiler could be easily modified to add a switch that prevents such features from being used. This is no different from the customizations done to C compilers for kernel dev. Why not make such a change in a future release of the official version ? It's pretty low on the priority list, because the absence of such a switch would not prevent you from using D as a better C compiler.
Re: Signed word lengths and indexes
On 15.06.2010 19:41, Walter Bright wrote: 3. The compiler could be easily modified to add a switch that prevents such features from being used. This is no different from the customizations done to C compilers for kernel dev. Why not make such a change in a future release of the official version ?
Re: Signed word lengths and indexes
bearophile wrote:
> This thread was not about linux or Linus or operating systems, it was about my proposal of changing indexes and lengths in D to signed words. So let's go back to the true purpose of this thread!
> Walter Bright:
>> 1. D source code is supposed to be portable between 32 and 64 bit systems. This would fail miserably if the sign of things silently change in the process.
> I don't understand this, please explain better. If I use a signed word on both 32 and 64 bit systems to represent indexes and lengths what bad things can this cause?

Changing the sign of size_t from unsigned to signed when going from 32 to 64 bits will cause a difference in behavior.

>> 2. For an operating system kernel's memory management logic, it still would make sense to represent the address space as a flat range from 0..n, not one that's split in the middle, half of which is accessed with negative offsets. D is supposed to support OS development.
> I don't understand how this is related to lengths and indexes, for example array ones.

A memory manager sees the address space as 0..N, not -N/2..0..N/2
Re: Signed word lengths and indexes
On 6/15/10, Walter Bright wrote: >> 4.3 Labels as Values: that's computed gotos, they can be useful if you >> write >> an interpreter or you implement some kind of state machine. > > They are useful in some circumstances, but are hardly necessary. Can't you accomplish the same thing with some minor sprinkling of inline assembly anyway?
Re: Signed word lengths and indexes
bearophile wrote: Walter Bright: I'd rephrase that as D supports many different styles. One of those styles is as a "better C".< D can replace many but not all usages of C; think about programming an Arduino (http://en.wikipedia.org/wiki/Arduino ) with a dmd compiler of today. The Arduino is an 8 bit machine. D is designed for 32 bit and up machines. Full C++ won't even work on a 16 bit machine, either. I agree on those points. Those features would not be used when using D as a "better C".< A problem is that some of those D features can worsen a kernel code. So for example you have to review code to avoid operator overloading usage :-) There is lot of D compiler complexity useless for that kind of code. A simpler compiler means less bugs and less D manual to read. If you're a kernel dev, the language features should not be a problem for you. BTW, you listed nested functions as disqualifying a language from being a kernel dev language, yet gcc supports nested functions as an extension. The answer is that C++ doesn't offer much over C that does not involve those trouble causing features. D, on the other hand, offers substantial and valuable features not available in C or C++ that can be highly useful for kernel dev. Read on.< I don't know if D offers enough of what a kernel developer needs. It offers more than what C does, so it must be enough since C is enough. Since it has more than C does, and C is used for kernel dev, then it must be enough.< Kernel C code uses several GCC extensions to the C language. As I pointed out, D implements the bulk of those extensions as a standard part of D. With all due respect to Linus, in 30 years of professionally writing software, I've found that if you solely base improvements on what customers ask for, all you have are incremental improvements. No quantum leaps, no paradigm shifts, no game changers.< You are right in general, but I don't know how much you are right regarding Linus. 
Linus desires some higher level features but maybe he doesn't exactly know what he desires :-) Linus may very well be an expert on various languages and their tradeoffs, but maybe not. As far as languages go, he may only be an expert on C. All I know for sure is he is an expert on C and kernel development, and a gifted manager. 4.3 Labels as Values: that's computed gotos, they can be useful if you write an interpreter or you implement some kind of state machine. They are useful in some circumstances, but are hardly necessary.
Re: Signed word lengths and indexes
Walter Bright: >I'd rephrase that as D supports many different styles. One of those styles is >as a "better C".< D can replace many but not all usages of C; think about programming an Arduino (http://en.wikipedia.org/wiki/Arduino ) with a dmd compiler of today. >I agree on those points. Those features would not be used when using D as a >"better C".< A problem is that some of those D features can worsen a kernel code. So for example you have to review code to avoid operator overloading usage :-) There is lot of D compiler complexity useless for that kind of code. A simpler compiler means less bugs and less D manual to read. >The answer is that C++ doesn't offer much over C that does not involve those >trouble causing features. D, on the other hand, offers substantial and >valuable features not available in C or C++ that can be highly useful for >kernel dev. Read on.< I don't know if D offers enough of what a kernel developer needs. >A non-standard feature means the language is inadequate.< I agree, standard C is not perfect for that purpose. >There is nothing at all preventing non-standard features from being added to D >for specific tasks. There is no reason to believe it is harder to do that for >D than to C.< I agree. (But note that here we are talking just about low level features. Linus has said that such features are important but he desires other things absent in C). >As for standard features D has that make it more suitable for low level >programming than C is:< I agree. >Since it has more than C does, and C is used for kernel dev, then it must be >enough.< Kernel C code uses several GCC extensions to the C language. And Linus says he desires higher level features absent from C, C++ and absent from those GCC extensions. 
>I'll await your reply there.< I appreciate your trust, but don't expect me to be able to teach you things about C and the kind of code needed to write a kernel, you have way more experience than me :-) >With all due respect to Linus, in 30 years of professionally writing software, >I've found that if you solely base improvements on what customers ask for, all >you have are incremental improvements. No quantum leaps, no paradigm shifts, >no game changers.< You are right in general, but I don't know how much you are right regarding Linus. Linus desires some higher level features but maybe he doesn't exactly know what he desires :-) I don't know if Linus has ever asked for some of the features of the Sing# language (http://en.wikipedia.org/wiki/Sing_Sharp ), needed to write the experimental Singularity OS. About Spec#: >The Spec# language is a superset of the programming language C# extending C# >by nonnull types, method contracts, object invariants and an ownership type >system [and Spec# also has built-in message passing for concurrency with a >syntax to specify message invariants]. The behavior of a Spec# program is >checked at runtime and statically verified by Boogie, the Spec# static program >verifier [2]. Boogie generates logical verification conditions from a Spec# >program. Internally, it uses an automatic theorem prover [7] that analyzes the >verification conditions to prove the correctness of the program or find errors >in it. One of the main innovations of Boogie is a systematic way (a >methodology) for specifying and verifying invariants. 
The Spec# Programming System handles callbacks and aggregate objects, and it supports both object [4] and static [3] class invariants.<

In Spec#, besides the "assert" there is also "assume", it seems similar to this C++ feature: http://msdn.microsoft.com/en-us/library/1b3fsfxw%28VS.80%29.aspx But Spec# "assume" seems used mostly for contract programming, for example to state that some condition is true before some method call that has that thing as a precondition. I have not fully understood the purpose of this, but I think it can be useful for performance (because contracts are enforced in "release mode" too, so the compiler has to try to remove some of them to improve code performance).

In Spec# nonnull types are specified by adding "!" after their type:

T! t = new T(); // OK
t = null;       // not allowed

Even if D can't turn all its class references to nonnull by default, a syntax to specify references and pointers that can't be null can be added. The bang symbol can't be used in D for that purpose, it has enough purposes already.

Spec# defines three types of purity:
- [Pure] Method does not change the existing objects (but it may create and update new objects).
- [Confined] Method is pure and reads only this and objects owned by this.
- [StateIndependent] Method does not read the heap at all.

Add one of the three attributes above to a method to declare it as a pure method. Any called method in a contract has to be pure. Spec# "static class invariants" test the consistency of static fields. http://research.microsoft.com/en-us/projects/specs
Re: Signed word lengths and indexes
This thread was not about linux or Linus or operating systems, it was about my proposal of changing indexes and lengths in D to signed words. So let's go back to the true purpose of this thread!

Walter Bright:
> 1. D source code is supposed to be portable between 32 and 64 bit systems. This
> would fail miserably if the sign of things silently change in the process.

I don't understand this, please explain better. If I use a signed word on both 32 and 64 bit systems to represent indexes and lengths, what bad things can this cause?

> 2. For an operating system kernel's memory management logic, it still would make
> sense to represent the address space as a flat range from 0..n, not one that's
> split in the middle, half of which is accessed with negative offsets. D is
> supposed to support OS development.

I don't understand how this is related to lengths and indexes, for example array ones.

Bye,
bearophile
Re: Signed word lengths and indexes
Simen kjaeraas wrote: Walter Bright wrote: bearophile wrote: Don: Indeed, only a subset of D is useful for low-level development.< A problem is that some of those D features (that are often useful in application code) are actively negative for that kind of development. But D has more close-to-the-metal features than C does.< I don't know if those extra D features are enough. Since it has more than C does, and C is used for kernel dev, then it must be enough. I believe the point of Linus (and probably bearophile) was not that C++ lacked features, but rather it lets programmers confuse one another by having features that are not as straight-forward as C. D also has these. To some extent, yes. My point was that C++ doesn't have a whole lot beyond that to offer, while D does. One example of this is transitive immutability. Nobody asked for it. A lot of people question the need for it. I happen to believe that it offers a quantum improvement in the ability of a programmer to manage the complexity of a large program, which is why I (and Andrei) have invested so much effort in it, and are willing to endure flak over it. The payoff won't be clear for years, but I think it'll be large. I still have problems understanding how someone could come up with the idea of non-transitive const. I remember the reaction when I read about it being such a great thing on this newsgroup, and going "wtf? Why on earth would it not be transitive? That would be useless!" (yes, I was not a very experienced programmer). I don't think the non-transitive const is very useful either, and I think that C++ demonstrates that.
Re: Signed word lengths and indexes
Hello Steven,

> On Tue, 15 Jun 2010 16:07:26 -0400, BCS wrote:
>> Hello Steven,
>>> Why is it offensive if I expect a code reviewer to take overflow into consideration when reviewing code
>> That's /not/ offensive. For one thing, only very few people will ever need to be involved in that. The reason I wouldn't let it pass code review has zero to do with me not understanding it (I do understand for one thing) but has 100% with anyone who ever needs to touch the code needing to understand it. That is an open set (and that is why I find it marginally offensive). The cost of putting something in your code that is harder (note I'm not saying "hard") to understand goes up the more successful the code is and is effectively unbounded.
> So I have to worry about substandard coders trying to understand my code? If anything, they ask a question, and it is explained to them.

If *any* user *ever* has to ask a question about how code that does something as simple as looping over an array backwards works, the author has failed. If even a handful of users take long enough to understand it that they even notice they're thinking about it, the author didn't do a good job. I guess I can restate my opinion as: I'm (slightly) offended that you are asking me to think about something that trivial. Would you rather I spend any time thinking about that or would you rather I spend it thinking about the rest of your code?

> In other words, the code looks strange, but is not hiding anything. Code that looks correct but contains a subtle sign bug is worse.

Looks correct & is correct > looks wrong & is wrong > looks wrong and isn't > looks right and isn't. You might talk me into switching the middle two, but they are darn close.

> It's not some sort of snobbery, I just expect reviewers to be competent.

I expect that too. I also expect people reading my code (for review or what-not) to have better things to do with their time than figure out clever code.

> I guess I'd say that's a prejudice against learning new code tricks because not everybody knows them. It sounds foolish to me.

I have no problem with code tricks. I have problems with complex code where simple, less interesting code does just as well. I guess we aren't likely to agree on this so I'll just say; may you maintain interesting code.
--
... <
Re: Signed word lengths and indexes
Walter Bright wrote: bearophile wrote: Don: Indeed, only a subset of D is useful for low-level development.< A problem is that some of those D features (that are often useful in application code) are actively negative for that kind of development. But D has more close-to-the-metal features than C does.< I don't know if those extra D features are enough. Since it has more than C does, and C is used for kernel dev, then it must be enough. I believe the point of Linus (and probably bearophile) was not that C++ lacked features, but rather it lets programmers confuse one another by having features that are not as straight-forward as C. D also has these. One example of this is transitive immutability. Nobody asked for it. A lot of people question the need for it. I happen to believe that it offers a quantum improvement in the ability of a programmer to manage the complexity of a large program, which is why I (and Andrei) have invested so much effort in it, and are willing to endure flak over it. The payoff won't be clear for years, but I think it'll be large. I still have problems understanding how someone could come up with the idea of non-transitive const. I remember the reaction when I read about it being such a great thing on this newsgroup, and going "wtf? Why on earth would it not be transitive? That would be useless!" (yes, I was not a very experienced programmer). -- Simen
Re: Signed word lengths and indexes
bearophile wrote: Don: Indeed, only a subset of D is useful for low-level development.< A problem is that some of those D features (that are often useful in application code) are actively negative for that kind of development. But D has more close-to-the-metal features than C does.< I don't know if those extra D features are enough. Since it has more than C does, and C is used for kernel dev, then it must be enough. And the C dialect used for example by Linux is not standard C, it uses many other tricks. I think D doesn't have some of them (I will try to answer this better to a Walter's post). I'll await your reply there. So I agree that describing the data is important, but at the same time, the things that really need the most description are how the data hangs together, what the consistency requirements are, what the locking rules are (and not for a single data object either), etc etc. And my suspicion is that you can't easily really describe those to a compiler. So you end up having to write that code yourself regardless. And hey, maybe it's because I do just low-level programming that I think so. As mentioned, most of the code I work with really deeply cares about the kinds of things that most software projects probably never even think about: stack depth, memory access ordering, fine-grained locking, and direct hardware access.< D gives few more ways to give complex semantics to the compiler, but probably other better languages need to be invented for this. I think it is possible to invent such languages, but maybe they will be hard to use (maybe as Coq http://en.wikipedia.org/wiki/Coq ), so they will be niche languages. Such niche can be so small that maybe the work to invent and implement and keep updated and debugged such language is not worth it. With all due respect to Linus, in 30 years of professionally writing software, I've found that if you solely base improvements on what customers ask for, all you have are incremental improvements. 
No quantum leaps, no paradigm shifts, no game changers. To get those, you have to look quite a bit beyond what the customer asks for. It also requires understanding that if a customer asks for feature X, it really means he is having problem Y, and there may be a far better solution to X than Y. One example of this is transitive immutability. Nobody asked for it. A lot of people question the need for it. I happen to believe that it offers a quantum improvement in the ability of a programmer to manage the complexity of a large program, which is why I (and Andrei) have invested so much effort in it, and are willing to endure flak over it. The payoff won't be clear for years, but I think it'll be large. Scope guard statements are another example. So are shared types.
Re: Signed word lengths and indexes
bearophile wrote: Justin Johansson: To my interpretation this means that at sometimes trying to be clever is actually stupid. A great programmer writes code as simple as possible (but not simpler). I've never met a single programmer or engineer who didn't believe and recite that platitude, and this includes every programmer and engineer who would find very complicated ways to do simple things. I've also never met a programming language advocate that didn't believe their language fulfilled that maxim. To me, it just goes to show that anyone can create a complex solution, but it takes a genius to produce a simple one.
Re: Signed word lengths and indexes
On Tue, 15 Jun 2010 16:07:26 -0400, BCS wrote: Hello Steven, Why? If you can't understand/spot overflow/underflow problems, then why should I cater to you? It's like lowering academic testing standards for school children so they can pass on to the next grade. The way peoples brains are wired, the first thought people will have about that code is wrong. If that can be avoided, why not avoid it? Because the alternatives are uglier, and it's not as easy to see subtle sign problems with them. The code we are discussing has no such subtle problems since all arithmetic/comparison is done with unsigned values. Why is it offensive if I expect a code reviewer to take overflow into consideration when reviewing code That's /not/ offensive. For one thing, only very few people will ever need to be involved in that. The reason I wouldn't let it pass code review has zero to do with me not understanding it (I do understand for one thing) but has 100% with anyone who ever needs to touch the code needing to understand it. That is an open set (and that is why I find it marginally offensive). The cost of putting something in your code that is harder (note I'm not saying "hard") to understand goes up the more successful the code is and is effectively unbounded. So I have to worry about substandard coders trying to understand my code? If anything, they ask a question, and it is explained to them. There is no trickery or deception or obfuscation. I'd expect a coder who understands bitwise operations to understand this code no problem. I would not, on the other hand, expect a reasonably knowledgeable coder to see subtle sign errors due to comparing/subtracting signed and unsigned integers. Those are much trickier to see, even for experienced coders. In other words, the code looks strange, but is not hiding anything. Code that looks correct but contains a subtle sign bug is worse. It's not some sort of snobbery, I just expect reviewers to be competent. I expect that to. 
I also expect people reading my code (for review or what-not) to have better things to do with their time than figure out clever code. I guess I'd say that's a prejudice against learning new code tricks because not everybody knows them. It sounds foolish to me. -Steve
Re: Signed word lengths and indexes
Jérôme M. Berger wrote: Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both... How so? I thought most 64 bit C compilers were specifically designed to avoid this problem.
Re: Signed word lengths and indexes
Hello Steven, On Tue, 15 Jun 2010 11:47:34 -0400, BCS wrote: Hello Steven, This is easily solved - put in a comment. I frequently put comments in my code because I know I'm going to forget why I did something. All else being equal, code that *requiters* comments to understand is inferior to code that doesn't. Code should *always* have comments. I hate reading code that doesn't have comments, it allows you to understand what the person is thinking. I agree. It should have comments. But if stripping them out would render the code unmaintainable, that indicates to me that it's likely the code is to complex. It's a sliding scale, the more difference the comments make, the more of an issue it is. And again, this is an "all else being equal" case; given two option and nothing else to chose between them, I'll pick the one that needs fewer comments. Reading code assuming integer wrapping never occurs is a big mistake. You should learn to assume wrapping is always possible. You should learn to write code where I and everyone else doesn't /need/ to assume it is possible. Why? If you can't understand/spot overflow/underflow problems, then why should I cater to you? It's like lowering academic testing standards for school children so they can pass on to the next grade. The way peoples brains are wired, the first thought people will have about that code is wrong. If that can be avoided, why not avoid it? (personably, I find it marginally offensive/greedy when someone's first proposal as to how to fix a problem if for the rest of the world to change and the second option is for the person to change.) Why is it offensive if I expect a code reviewer to take overflow into consideration when reviewing code That's /not/ offensive. For one thing, only very few people will ever need to be involved in that. 
The reason I wouldn't let it pass code review has zero to do with me not understanding it (I do understand it, for one thing) and 100% to do with anyone who ever needs to touch the code needing to understand it. That is an open set (and that is why I find it marginally offensive). The cost of putting something in your code that is harder (note I'm not saying "hard") to understand goes up the more successful the code is, and is effectively unbounded. It's not some sort of snobbery, I just expect reviewers to be competent. I expect that too. I also expect people reading my code (for review or whatnot) to have better things to do with their time than figure out clever code. -- ... <
Re: Signed word lengths and indexes
Don: >Indeed, only a subset of D is useful for low-level development.< A problem is that some of those D features (that are often useful in application code) are actively negative for that kind of development. >But D has more close-to-the-metal features than C does.< I don't know if those extra D features are enough. And the C dialect used, for example, by Linux is not standard C; it uses many other tricks. I think D lacks some of them (I will try to answer this better in a reply to one of Walter's posts). A recent nice post by Linus, linked here by Walter, has partially answered a question I have asked here: what language features a kernel developer can enjoy that both C and C++ lack. That answer has shown that close-to-the-metal features are useful, but they are not enough. I presume it's not even easy to express what those more important things are. Linus writes: >So I agree that describing the data is important, but at the same time, the >things that really need the most description are how the data hangs together, >what the consistency requirements are, what the locking rules are (and not for >a single data object either), etc etc. And my suspicion is that you can't >easily really describe those to a compiler. So you end up having to write that >code yourself regardless. And hey, maybe it's because I do just low-level >programming that I think so. As mentioned, most of the code I work with really >deeply cares about the kinds of things that most software projects probably >never even think about: stack depth, memory access ordering, fine-grained >locking, and direct hardware access.< D gives a few more ways to give complex semantics to the compiler, but probably other, better languages need to be invented for this. I think it is possible to invent such languages, but maybe they will be hard to use (maybe as hard as Coq http://en.wikipedia.org/wiki/Coq ), so they will be niche languages. 
Such a niche can be so small that maybe the work to invent, implement, and keep such a language updated and debugged is not worth it. Bye, bearophile
Re: Signed word lengths and indexes
Justin Johansson: > To my interpretation this means that sometimes trying to be clever is > actually stupid. A great programmer writes code as simple as possible (but not simpler). Code that doesn't need comments to be understood is often better than code that needs comments to be understood. Bye, bearophile
Re: Signed word lengths and indexes
bearophile wrote: D also lacks a good number of nonstandard C features that are present in the "C" compiled by GCC; such low-level features and compilation flags can be quite useful if you write a kernel. Even LDC has a few such features. It's interesting that D already has most of the gcc extensions: http://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_4.html as standard features, rather than extensions. Being part of the standard language makes D more suitable for kernel dev than standard C is.
Re: Signed word lengths and indexes
Walter Bright wrote: I think you are giving zero weight to the D features that assist kernel programming. What bothers me about this discussion is consider D with features 1 2 3 4, and language X with features 1 2 5. X is determined to be better than D because X has feature 5, but since X does not have features 3 and 4, therefore 3 and 4 are irrelevant. For example, the more I use scope guard statements, the more of a game changer I believe they are in eliminating the usual rat's nest of goto's one finds in C code.
Re: Signed word lengths and indexes
Walter Bright wrote: > bearophile wrote: >> We are going to 64 bit systems where 63 bits can be enough for >> lengths. If >> arrays of 4 billion items are seen as important on 32 bit systems too, >> then >> use a long :-) 2) I don't like D to silently gulp down expressions >> that mix >> signed and unsigned integers and spit out wrong results when the integers >> were negative. > > That idea has a lot of merit for 64 bit systems. But there are two > problems with it: > > 1. D source code is supposed to be portable between 32 and 64 bit > systems. This would fail miserably if the sign of things silently change > in the process. > Actually, that problem already occurs in C. I've had problems when porting code from x86 to x86_64 because some unsigned operations don't behave the same way on both... Jerome -- mailto:jeber...@free.fr http://jeberger.free.fr Jabber: jeber...@jabber.fr
Re: Signed word lengths and indexes
Walter Bright wrote: bearophile wrote: Walter Bright: But I can say that D is already not the best language to develop non-toy operating systems. Why? I have not written an OS yet, so I can't be sure. But from what I have read and seen, D seems designed for different purposes, mostly as a high-performance low-level application language that currently is programmed in a style that doesn't assume a very efficient GC. I'd rephrase that as: D supports many different styles. One of those styles is as a "better C". D has many features that are useless or negative if you want to write code close to the metal, as in a kernel: classes, virtual functions, the garbage collector, operator overloading, interfaces, exceptions and try-catch-finally blocks, closures, references, delegates, nested functions and structs, array concatenation, built-in associative arrays, monitors, automatic destructors. When you write code close to the metal you want to know exactly what your code is doing, so all the automatic or higher-level things become useless or worse; they keep you from seeing what the hardware is actually doing. I agree on those points. Those features would not be used when using D as a "better C". So, you could ask, why not use C++ as a "better C" and eschew the C++ features that cause trouble for kernel dev? The answer is that C++ doesn't offer much over C that does not involve those trouble-causing features. D, on the other hand, offers substantial and valuable features not available in C or C++ that can be highly useful for kernel dev. Read on. On the other hand, the current D language (like C and C++) lacks other hard-to-implement features that would allow the kernel programmer to give more semantics to the code. So such semantics have to be expressed through normal coding. Future languages may improve on this, but it will be hard work. The ATS language tries to improve a bit on this, but it's far from being good and its syntax is awful. 
I think you are giving zero weight to the D features that assist kernel programming. D also lacks a good number of nonstandard C features that are present in the "C" compiled by GCC; such low-level features and compilation flags can be quite useful if you write a kernel. Even LDC has a few such features. A non-standard feature means the language is inadequate. There is nothing at all preventing non-standard features from being added to D for specific tasks. There is no reason to believe it is harder to do that for D than for C. As for standard features D has that make it more suitable for low level programming than C is: 1. inline assembler as a standard feature 2. const/immutable qualifiers 3. identification of shared data with the shared type constructor 4. enforced function purity 5. guaranteed basic type sizes 6. arrays that actually work 6.5. arrays that actually work and don't need garbage collection 7. scope guard (yes, even without exception handling) Andrei
Re: Signed word lengths and indexes
Walter Bright wrote: Alex Makhotin wrote: Walter Bright wrote: Andrei and I went down that alley for a while. It's not practical. A link to the discussion or examples of the impractical explicit casts would be helpful to me to try to understand such a decision. I don't have one; the message database of this n.g. is enormous. You can try the search box here: http://www.digitalmars.com/d/archives/digitalmars/D/index.html The discussions about polysemous types should be relevant. We tried to fix things quite valiantly. Currently I believe that improving value range propagation is the best way to go. Andrei
Re: Signed word lengths and indexes
Steven Schveighoffer wrote: On Tue, 15 Jun 2010 11:28:43 -0400, BCS wrote: Hello Steven, On Tue, 15 Jun 2010 08:49:56 -0400, Pelle wrote: On 06/15/2010 02:10 PM, Steven Schveighoffer wrote: On Tue, 15 Jun 2010 07:30:52 -0400, bearophile wrote: Steven Schveighoffer: i is unsigned, and therefore can never be less than 0. It's actually a clever way to do it that I've never thought of. Clever code is bad. It must be minimized. In some rare situations it becomes useful, but its usage must be seen as a failure of the programmer, who was unable to write not-clever code that does the same things. Clever code is bad? What are you smoking? In my opinion, clever code that is clear and concise should always be favored over code that is unnecessarily verbose. Clever code is bad because you have to think a couple of times more every time you see it. This is a temporary problem. Once you get used to any particular coding trick, you understand it better. People cutting you off on the road is a temporary problem; once you tell everyone off, they will understand better. Your statement might have merit if the "you" in it were the specific "you" rather than the universal "you". In fact, I meant the specific you. Once a person gets used to any particular coding trick, that person will understand it better when the trick is encountered again. This is a basic principle of learning. Also, it looks wrong. Why? i is unsigned, therefore >= 0, and must be < length. That seems reasonable and correct to me. It looks wrong because i only gets smaller. People are hardwired to think about continuous number systems, not modulo number systems (explain that 0 - 1 = -1 to a 6-year-old: easy; explain that 0 - 1 = 2^32-1 to them: good luck). Yes, we can be trained to use such a system, but most people still won't think that way reflexively. It's really easy to explain. Use an odometer as an example. And we don't have to be specific in this case; you can substitute 'some very large number' for '2^32 - 1'. 
Besides, why does a 6-year old have to understand a for loop? D doesn't cater to people who can't grasp the modulo arithmetic concept. I think that this discussion is becoming pointless. Let's just accept that we don't have to review code for one another, and we like it that way :) -Steve I would say, if you have trouble understanding that trick, you should NOT be using unsigned arithmetic EVER. And I agree that most people have trouble with it.
Re: Signed word lengths and indexes
bearophile wrote: Walter Bright: But I can say that D is already not the best language to develop non-toy operating systems. Why? I have not written an OS yet, so I can't be sure. But from what I have read and seen, D seems designed for different purposes, mostly as a high-performance low-level application language that currently is programmed in a style that doesn't assume a very efficient GC. I'd rephrase that as: D supports many different styles. One of those styles is as a "better C". D has many features that are useless or negative if you want to write code close to the metal, as in a kernel: classes, virtual functions, the garbage collector, operator overloading, interfaces, exceptions and try-catch-finally blocks, closures, references, delegates, nested functions and structs, array concatenation, built-in associative arrays, monitors, automatic destructors. When you write code close to the metal you want to know exactly what your code is doing, so all the automatic or higher-level things become useless or worse; they keep you from seeing what the hardware is actually doing. I agree on those points. Those features would not be used when using D as a "better C". So, you could ask, why not use C++ as a "better C" and eschew the C++ features that cause trouble for kernel dev? The answer is that C++ doesn't offer much over C that does not involve those trouble-causing features. D, on the other hand, offers substantial and valuable features not available in C or C++ that can be highly useful for kernel dev. Read on. On the other hand, the current D language (like C and C++) lacks other hard-to-implement features that would allow the kernel programmer to give more semantics to the code. So such semantics have to be expressed through normal coding. Future languages may improve on this, but it will be hard work. The ATS language tries to improve a bit on this, but it's far from being good and its syntax is awful. 
I think you are giving zero weight to the D features that assist kernel programming. D also lacks a good number of nonstandard C features that are present in the "C" compiled by GCC; such low-level features and compilation flags can be quite useful if you write a kernel. Even LDC has a few such features. A non-standard feature means the language is inadequate. There is nothing at all preventing non-standard features from being added to D for specific tasks. There is no reason to believe it is harder to do that for D than for C. As for standard features D has that make it more suitable for low level programming than C is: 1. inline assembler as a standard feature 2. const/immutable qualifiers 3. identification of shared data with the shared type constructor 4. enforced function purity 5. guaranteed basic type sizes 6. arrays that actually work 7. scope guard (yes, even without exception handling) BTW, you might ask "how do I know my D code doesn't have exception handling or GC calls in it?" There are several ways: 1. Remove the support for it from the library. Then, attempts to use such features will cause the link step to fail. Kernel C programmers use a custom library anyway; no reason why D kernel dev cannot. 2. Compiling code with "nothrow" will check that exceptions are not generated. 3. The compiler could be easily modified to add a switch that prevents such features from being used. This is no different from the customizations done to C compilers for kernel dev.
Re: Signed word lengths and indexes
Alex Makhotin wrote: Walter Bright wrote: Andrei and I went down that alley for a while. It's not practical. A link to the discussion or examples of the impractical explicit casts would be helpful to me to try to understand such a decision. I don't have one; the message database of this n.g. is enormous. You can try the search box here: http://www.digitalmars.com/d/archives/digitalmars/D/index.html
Re: Signed word lengths and indexes
BCS wrote: It looks wrong because i only gets smaller. People are hardwired to think about continuous number systems, not modulo number systems (explain that 0 - 1 = -1 to a 6-year-old: easy; explain that 0 - 1 = 2^32-1 to them: good luck). Yes, we can be trained to use such a system, but most people still won't think that way reflexively. Hardwired? Hardly. However, continuous number systems are ubiquitous; modulo systems are not. As for teaching a 6-year-old, give him a wheel with the numbers 0-9 written on each of the ten spokes, and ask him what number you get by going backward one step from 0. -- Simen
Re: Signed word lengths and indexes
bearophile wrote: When you write code close to the metal you want to know exactly what your code is doing, so all the automatic or higher-level things become useless or worse; they keep you from seeing what the hardware is actually doing. Right. That's why I respect Linus's point of view on that matter. And his latest comments on it look well motivated to me. -- Alex Makhotin, the founder of BITPROX, http://bitprox.com
Re: Signed word lengths and indexes
Walter Bright wrote: div0 wrote: I do think that allowing un-casted assignments between signed and unsigned is a problem, though; that's where most of the bugs I've come across crop up. I think D should simply disallow implicit mixing of signedness. Andrei and I went down that alley for a while. It's not practical. A link to the discussion or examples of the impractical explicit casts would be helpful to me to try to understand such a decision. -- Alex Makhotin, the founder of BITPROX, http://bitprox.com
Re: Signed word lengths and indexes
On Tue, 15 Jun 2010 11:47:34 -0400, BCS wrote: Hello Steven, This is easily solved - put in a comment. I frequently put comments in my code because I know I'm going to forget why I did something. All else being equal, code that *requires* comments to be understood is inferior to code that doesn't. Code should *always* have comments. I hate reading code that doesn't have comments; comments let you understand what the person was thinking. That being said, I don't think this construct requires comments; maybe a note like 'uses underflow' or something to let the reader know the writer was aware of the issue and did it on purpose, but a comment is not essential to understanding the code. *That* being said, I don't expect to use this construct often. Typically one iterates forwards through an array, and foreach is much better suited for iteration anyways. Reading code assuming integer wrapping never occurs is a big mistake. You should learn to assume wrapping is always possible. You should learn to write code where I and everyone else don't /need/ to assume it is possible. Why? If you can't understand/spot overflow/underflow problems, then why should I cater to you? It's like lowering academic testing standards for school children so they can pass on to the next grade. (Personally, I find it marginally offensive/greedy when someone's first proposal for how to fix a problem is for the rest of the world to change and the second option is for the person themselves to change.) Why is it offensive if I expect a code reviewer to take overflow into consideration when reviewing code? It's not some sort of snobbery, I just expect reviewers to be competent. -Steve
Re: Signed word lengths and indexes
Hello Steven, On Tue, 15 Jun 2010 11:28:43 -0400, BCS wrote: Hello Steven, On Tue, 15 Jun 2010 08:49:56 -0400, Pelle wrote: Clever code is bad because you have to think a couple of times more every time you see it. This is a temporary problem. Once you get used to any particular coding trick, you understand it better. People cutting you off on the road is a temporary problem; once you tell everyone off, they will understand better. Your statement might have merit if the "you" in it were the specific "you" rather than the universal "you". In fact, I meant the specific you. Once a person gets used to any particular coding trick, that person will understand it better when the trick is encountered again. This is a basic principle of learning. Yes, once Pelle (sorry to pick on you) gets used to any particular coding trick, Pelle will understand it better when the trick is encountered again. But what about everyone else? If Pelle were the only one who was going to read your code, that would be fine. But unless you can, right now, list by name everyone who will ever read your code (and if you can, just go buy a lottery ticket and retire), then anything but the universal "you" makes the statement irrelevant. Also, it looks wrong. Why? i is unsigned, therefore >= 0, and must be < length. That seems reasonable and correct to me. It looks wrong because i only gets smaller. People are hardwired to think about continuous number systems, not modulo number systems (explain that 0 - 1 = -1 to a 6-year-old: easy; explain that 0 - 1 = 2^32-1 to them: good luck). Yes, we can be trained to use such a system, but most people still won't think that way reflexively. It's really easy to explain. Use an odometer as an example. And we don't have to be specific in this case; you can substitute 'some very large number' for '2^32 - 1'. Most 6-year-olds will need to have an odometer explained to them first. Besides, why does a 6-year-old have to understand a for loop? 
D doesn't cater to people who can't grasp the modulo arithmetic concept. I wasn't talking about for loops, but about the semantics of int vs. uint near zero. If a 6-year-old can understand something, I won't have to think about it to work with it, and I can use the time and cycles I gain for something else. -- ... <