Re: Nightly builds
On 02.01.2012 14:15, simendsjo wrote:
> Is it possible for the autotester to allow downloading the latest build
> that passed testing?

Better to post this in the D newsgroup instead of D.learn.
Using "in" with associative arrays and then indexing them (efficiency)
Hello everyone,

I would like to know whether

    if (symbol in symbols)
        return symbols[symbol];

is any less efficient than

    auto tmp = symbol in symbols;
    if (tmp !is null)
        return *tmp;

Without optimisation, it looks like the first example searches for `symbol' twice.

Thanks,
Matej
Re: Using "in" with associative arrays and then indexing them (efficiency)
On Tuesday, January 03, 2012 11:52:13 Matej Nanut wrote:
> I would like to know whether
>
>     if (symbol in symbols)
>         return symbols[symbol];
>
> is any less efficient than
>
>     auto tmp = symbol in symbols;
>     if (tmp !is null)
>         return *tmp;
>
> Without optimisation, it looks like the first example
> searches for `symbol' twice.

Of course it does. in does a search and returns a pointer to the element in the AA (or null if it isn't there). The subscript operator also does a search, returning the element if it's there and blowing up if it's not (RangeError IIRC without -release and who-knows-what with -release). So, if you use in and then the subscript operator, of course it's going to search twice. Part of the point of using in is to not have to do a double lookup (like you would be doing if AAs had a contains function and you called that prior to using the subscript operator).

The correct way to do it is the second way, though you should be able to reduce it to

    if(auto tmp = symbol in symbols)
        return *tmp;

- Jonathan M Davis
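A minimal sketch of the two lookup styles compared above, assuming a hypothetical int[string] symbol table. The names lookupTwice and lookupOnce and the fallback parameter are illustrative only and do not come from the thread:

```d
import std.stdio;

// Double lookup: `in` searches once, then the index expression searches again.
int lookupTwice(int[string] symbols, string symbol, int fallback)
{
    if (symbol in symbols)
        return symbols[symbol];
    return fallback;
}

// Single lookup: `in` yields a pointer to the stored value, or null if absent.
// The pointer `p` is scoped to the if statement.
int lookupOnce(int[string] symbols, string symbol, int fallback)
{
    if (auto p = symbol in symbols)
        return *p;
    return fallback;
}

void main()
{
    int[string] symbols = ["x": 1, "y": 2];
    writeln(lookupTwice(symbols, "x", -1));
    writeln(lookupOnce(symbols, "missing", -1));
}
```

Both functions return the same results; the difference is only in how many times the table may be searched when the key is present.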
Re: Using "in" with associative arrays and then indexing them (efficiency)
On 01/03/2012 12:07 PM, Jonathan M Davis wrote:
> The correct way to do it is the second way, though you should be able to
> reduce it to
>
>     if(auto tmp = symbol in symbols)
>         return *tmp;

I think this is the single most ugly thing in the language. IIRC ldc will generate identical code for both code snippets.
Re: Using "in" with associative arrays and then indexing them (efficiency)
On Tuesday, January 03, 2012 12:13:45 Timon Gehr wrote:
> On 01/03/2012 12:07 PM, Jonathan M Davis wrote:
> > The correct way to do it is the second way, though you should be able to
> > reduce it to
> >
> >     if(auto tmp = symbol in symbols)
> >         return *tmp;
>
> I think this is the single most ugly thing in the language. IIRC ldc
> will generate identical code for both code snippets.

What, declaring variables in if statements? It's fantastic IMHO. It allows you to restrict the scope of the variable to the if statement's scope and still use it in the if's condition.

And yes, as far as the assembly goes, the generated code is identical. But the scoping for the variable is most definitely different - it won't exist past the if statement if it's declared in the if's condition - and it saves you a line of code. The reduced scope is the more important of the two though IMHO, as nice as saving a line of code is.

- Jonathan M Davis
Re: Using "in" with associative arrays and then indexing them (efficiency)
On 01/03/2012 12:22 PM, Jonathan M Davis wrote:
> What, declaring variables in if statements? It's fantastic IMHO.
>
> And yes, as far as the assembly goes, the generated code is identical.

No, I love declaring variables in if statements and would like it to be extended to while statements as well.

What I meant is the fact that something called 'in' returns a pointer. And the two code snippets I was referring to were the two in Matej's post.
Re: Using "in" with associative arrays and then indexing them (efficiency)
On Tuesday, January 03, 2012 12:27:08 Timon Gehr wrote:
> No, I love declaring variables in if statements and would like it to be
> extended to while statements as well. What I meant is the fact that
> something called 'in' returns a pointer. And the two code snippets I was
> referring to were the two in Matej's post.

Those two code snippets can't possibly result in the same code without the compiler assuming that it can safely optimize the first one into the second. Certainly, with a user-defined type, that would be impossible. With the AA, since it's essentially built-in, it may decide that it can make that assumption, but it could definitely result in different behavior if you were dealing with shared or the like, and there's nothing requiring the compiler to make such an optimization.

- Jonathan M Davis
Enumerating structs?
Hello!

I have some structs

    struct A { int a; }
    struct B { int b, c; }

and I'd like to be able to enumerate them (preferably as integers) based on their names. I've no idea how this would look, but here is some pseudo code that would use this feature:

    // pseudo
    int type = stream.read!int();
    switch(type){
        case A.enumof:
            auto data = stream.read!A();
            // ...
            break;
        case B.enumof:
            // ...
            break;
        // ...
    }

The idea here is to enable stuff like static ifs etc. and to enforce the connection between the struct and the enumeration.

Right now I have separate enums, like so:

    enum {A_ENUM = 1, B_ENUM = 2};
    // ...
    case A_ENUM:
        auto data = stream.read!A();
        // ...
        break;
    // ...

But this means the idea that struct A has the enumeration "1" is only by convention. So when I, for instance, rename struct A to "C", all code still compiles. It would be cool if it didn't, somehow. With this small example it's of course not a problem, but for larger, more complex code it might be.

A naive idea I had was to let each struct have an enum:

    struct A{
        enum TYPE_ENUM = 1;
        int a;
    }

That would be refactor-friendly and be a strong connection, but then there's no guarantee two structs don't have the same enum, of course.

Another idea was to maybe use a mixin to somehow construct the enum declaration:

    mixin enumByType!(A,B);

That could generate code like:

    enum {A_ENUM = 1, ...}

and then couple it with a

    // ...
    case typeEnumFor!A():
        // ...
        break;
    // ...

but now it's starting to maybe feel a bit overkill? Is there an easier/correct/other way?

- I usually end up feeling like this a lot with D, I just realized. It's like, yes, with mixins I can more or less do anything, but where does one stop? You know what I mean? I like it though. Mixin-paralysis : )

/HF
opCast!bool
I guess this is as designed, but I'll ask anyway.

http://dlang.org/operatoroverloading.html#Cast says an expression is rewritten to opCast "whenever a bool result is expected".

This is true for

    if(e) somethingElse

and

    e && somethingElse

but not for other parts:

    assert(cast(bool)e == true); // explicit cast works
    assert(e == true); // Error: incompatible types for ((s) == (false)): 'S' and 'bool'
    is(typeof(e) : bool); // false
Re: Enumerating structs?
On Tue, 03 Jan 2012 16:35:29 +0100, Heywood Floyd wrote:
> I have some structs
>
>     struct A { int a; }
>     struct B { int b, c; }
>
> and I'd like to be able to enumerate them (preferrably as integers)
> based on their names.
>
> - I usually end up feeling like this a lot with D, I just realized. It's
> like, yes, with mixins I can more or less do anything, but where does
> one stop? You know what I mean? I like it though. Mixin-paralysis :)

Yeah, D feels like that to me too, sometimes.

Anyways, for your question - would using the struct name be good enough? They're easy to get hold of and usable in switch statements.

If not, how about this:

    import std.typetuple;

    struct TypeEnum( T... ) {
        static pure nothrow @property int value( U )( ) {
            static assert ( staticIndexOf!( U, T ) != -1 );
            return staticIndexOf!( U, T );
        }
    }

    struct A {}
    struct B {}

    void main( ) {
        alias TypeEnum!(A, B) types;
        assert( types.value!A == 0 );
        assert( types.value!B == 1 );
    }
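A possible variation on the same idea, sketched for use in case labels. It assumes a current compiler where std.typetuple has been replaced by std.meta; the names Registered, typeId and describe are hypothetical and not from the thread:

```d
import std.meta : AliasSeq, staticIndexOf;
import std.stdio;

struct A { int a; }
struct B { int b, c; }

// Hypothetical registry: a type's position in this list is its integer tag.
alias Registered = AliasSeq!(A, B);

// Compile-time tag for T. Using an unregistered (or renamed) type
// fails to compile instead of silently matching nothing.
template typeId(T)
{
    static assert(staticIndexOf!(T, Registered) != -1,
                  T.stringof ~ " is not registered");
    enum int typeId = cast(int) staticIndexOf!(T, Registered);
}

// The tags are enum constants, so they can be used as case labels.
string describe(int tag)
{
    switch (tag)
    {
        case typeId!A: return "A";
        case typeId!B: return "B";
        default:       return "unknown";
    }
}

void main()
{
    writeln(describe(typeId!B));
}
```

Because each tag is derived from a single list, two structs cannot end up with the same number, and renaming a struct breaks every case label that still uses the old name.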
Re: Using "in" with associative arrays and then indexing them (efficiency)
On 01/03/2012 04:07 AM, Jonathan M Davis wrote:
> The correct way to do it is the second way, though you should be able to
> reduce it to
>
>     if(auto tmp = symbol in symbols)
>         return *tmp;

+1

Very slick :)
Re: Using "in" with associative arrays and then indexing them (efficiency)
On 3 January 2012 17:58, Kai Meyer wrote:
> On 01/03/2012 04:07 AM, Jonathan M Davis wrote:
>>     if(auto tmp = symbol in symbols)
>>         return *tmp;
>>
>> - Jonathan M Davis
>
> +1
>
> Very slick :)

Yup, I'm going with this one. Thanks!
AA char[] as key
It seems T[char[]] is rewritten as T[const(char)[]], and does not accept char[] as a key even though mutable data should automatically convert to const (right..?)

Shouldn't T[char[]] be disallowed, and have to be written as T[immutable(char)[]] instead of a silent rewrite?

    alias long[char[]] AA;
    // key automatically changed to const(char)[]
    static assert(is(AA == long[const(char)[]]));
    AA aa;
    aa["a"] = 10;
    // error - have to use immutable keys
    aa["b".dup] = 11;
Re: AA char[] as key
simendsjo:
> Shouldn't T[char[]] be disallowed, and have to be written as
> T[immutable(char)[]] instead of a silent rewrite?

Of course. I have a Bugzilla issue on this.

Bye,
bearophile
Re: AA char[] as key
On Tue, Jan 3, 2012 at 1:25 PM, simendsjo wrote:
> seems T[char[]] is rewritten as T[const(char)[]], and does not accept char[]
> as key even if mutable data should automatically convert to const (right..?)
>
> Shouldn't T[char[]] be disallowed, and have to be written as
> T[immutable(char)[]] instead of a silent rewrite?
>
>     alias long[char[]] AA;
>     // key automatically changed to const(char)[]
>     static assert(is(AA == long[const(char)[]]));
>     AA aa;
>     aa["a"] = 10;
>     // error - have to use immutable keys
>     aa["b".dup] = 11;

This is by design. The problem is things like this:

    char[] key = "somekey";
    long[char[]] aa;
    aa[key] = 5;
    key[2] = 'b';

If this were allowed, the associative array would reach an invalid state where the hash it stored for the key is no longer correct.

It does seem like T[char[]] should be disallowed and the requirement for immutable keys should be literally enforced, and it's possible that was intended but immutable/const weren't as complete when this problem was last visited. I'll see if I can dig up an old discussion about this.
Re: AA char[] as key
On Tue, Jan 3, 2012 at 1:41 PM, Andrew Wiley wrote:
> By design, the problem is things like this:
>
>     char[] key = "somekey";

Sorry, should be:

    char[] key = "somekey".dup;
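One way to sidestep the problem described above is to declare the AA with string keys and store an immutable copy of any mutable buffer. This is a minimal sketch, not something proposed in the thread; the helper name put is hypothetical:

```d
import std.stdio;

// Hypothetical helper: stores `value` under an immutable copy of `key`,
// so later writes to the caller's buffer cannot touch the stored key.
void put(ref long[string] aa, const(char)[] key, long value)
{
    aa[key.idup] = value;
}

void main()
{
    long[string] aa;
    char[] key = "somekey".dup;

    put(aa, key, 5);
    key[2] = 'b';               // changes only the local buffer

    assert("somekey" in aa);    // the stored key is unchanged
    assert("sobekey" !in aa);   // the mutated spelling was never inserted
    writeln(aa);
}
```

The cost is one allocation per inserted key, which is the price of guaranteeing that the hash stored in the table stays valid.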
Re: AA char[] as key
On 03.01.2012 20:41, Andrew Wiley wrote:
> By design, the problem is things like this:
>
>     char[] key = "somekey";
>     long[char[]] aa;
>     aa[key] = 5;
>     key[2] = 'b';
>
> If this were allowed, the associative array would reach an invalid
> state where the hash it stored for the key is no longer correct.
>
> It does seem like T[char[]] should be disallowed and the requirement
> for immutable keys should be literally enforced, and it's possible
> that was intended but immutable/const weren't as complete when this
> problem was last visited. I'll see if I can dig up an old discussion
> about this.

It is disallowed, but it's enforced when setting a key rather than when constructing the type.
So `aa[key] = 5` above fails as key is char[] rather than string.
Re: AA char[] as key
On Tue, Jan 3, 2012 at 1:50 PM, simendsjo wrote:
> It is disallowed, but it's enforced when setting a key rather than when
> constructing the type.
> So `aa[key] = 5` above fails as key is char[] rather than string.

Yes, while that's correct, it doesn't make much sense (as you pointed out). Rewriting long[char[]] to long[const(char)[]] isn't particularly useful when you can't use char[] as a key. If the compiler is basically going to disallow using the AA as anything but a long[string], it should really disallow declaring anything with a mutable key type. Disallowing mutable keys at that assignment site but allowing them in the type is confusing.
Re: Using "in" with associative arrays and then indexing them (efficiency)
On 01/03/2012 02:52 AM, Matej Nanut wrote:
> I would like to know whether
>
>     if (symbol in symbols)
>         return symbols[symbol];
>
> is any less efficient than
>
>     auto tmp = symbol in symbols;
>     if (tmp !is null)
>         return *tmp;
>
> Without optimisation, it looks like the first example
> searches for `symbol' twice.

Although the symbol is looked up twice, the cost may be negligible. Being hash tables, AAs have constant time lookup on average. Algorithmically, looking up twice has the same complexity as looking up once in a hash table. When we assume that the looked-up object is going to be used in a non-trivial operation, then it doesn't matter.

Having said that, I would use the second version too :D perhaps shorter as

    if (tmp) {
        // use *tmp
    }

Ali
Re: AA char[] as key
Andrew Wiley:
> If the compiler is basically going to disallow using the AA as
> anything but a long[string], it should really disallow declaring
> anything with a mutable key type. Disallowing mutable keys at that
> assignment site but allowing them in the type is confusing.

Two related bug reports:
http://d.puremagic.com/issues/show_bug.cgi?id=4475
http://d.puremagic.com/issues/show_bug.cgi?id=6253

Bye,
bearophile
Re: opCast!bool
On Tuesday, January 03, 2012 17:41:12 simendsjo wrote:
> I guess this is as designed, but I'll ask anyway.
>
> http://dlang.org/operatoroverloading.html#Cast says an expression is
> rewritten to opCast "whenever a bool result is expected".
>
> This is true for
>     if(e) somethingElse
> and
>     e && somethingElse
>
> , but not for other parts.
>     assert(cast(bool)e == true); // explicit cast works
>     assert(e == true); // Error: incompatible types for ((s) == (false)):
>     'S' and 'bool'
>
>     is(typeof(e) : bool); // false

Yeah. It's the same for built-in types. Take arrays and pointers for example. They don't implicitly convert to bool, but when you use them in a condition, they implicitly convert to bool (true if they're non-null, false if they're null). If you want implicit conversion in general, then you need to use alias this.

- Jonathan M Davis
Re: opCast!bool
On 01/04/2012 12:31 AM, Jonathan M Davis wrote:
> Yeah. It's the same for built-in types. Take arrays and pointers for
> example. They don't implicitly convert to bool, but when you use them
> in a condition, they implicitly convert to bool (true if they're
> non-null, false if they're null). If you want implicit conversion in
> general, then you need to use alias this.

The conversion is explicit. if(x) is rewritten to if(cast(bool)x) and e && somethingElse is rewritten to cast(bool)e && cast(bool)somethingElse.
Re: AA char[] as key
On Tue, Jan 3, 2012 at 4:20 PM, bearophile wrote:
> Two related bug reports:
> http://d.puremagic.com/issues/show_bug.cgi?id=4475

"Improving the compiler 'in' associative array can return just a bool"
Whether this is a good idea or not is a moot point. Changing this would break too much code (basically all code that uses AAs significantly).

> http://d.puremagic.com/issues/show_bug.cgi?id=6253

"Refuse definition too of impossible associative arrays"
This is what we're discussing. Some consequences of actually changing this:
- This breaks D1 compatibility of AAs across the board because immutable simply didn't exist then
- Significant D2 code breakage as well

We might see if Walter is willing to add this as a warning and/or deprecation to see whether it's actually feasible to disallow it completely.
Re: opCast!bool
On 01/03/2012 05:41 PM, simendsjo wrote:
> This is true for
>     if(e) somethingElse
> and
>     e && somethingElse
>
> , but not for other parts.
>     assert(cast(bool)e == true); // explicit cast works
>     assert(e == true); // Error: incompatible types for ((s) == (false)):
>     'S' and 'bool'

There is no 'bool result expected': the relation of the two operands in == is symmetric. You could just as well say that the result of 'true' is expected to be of type typeof(e).

>     is(typeof(e) : bool); // false

This tests whether or not typeof(e) implicitly converts to bool, which can be false even if an explicit cast would succeed.
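The distinction can be sketched with two small types. The struct names S and Flag and the helper inCondition are hypothetical, chosen only to illustrate the behaviour described in this thread:

```d
// S defines only opCast!bool: usable in conditions and explicit casts.
struct S
{
    int value;
    bool opCast(T : bool)() const { return value != 0; }
}

// Flag uses alias this: it converts to bool implicitly everywhere.
struct Flag
{
    bool on;
    alias on this;
}

// if(s) is rewritten to if(cast(bool)s), so this compiles for S.
bool inCondition(S s)
{
    if (s)
        return true;
    return false;
}

static assert(!is(S : bool));                     // no implicit conversion
static assert(is(Flag : bool));                   // alias this provides one
static assert(!__traits(compiles, S(1) == true)); // == does not apply opCast

void main()
{
    assert(inCondition(S(1)));
    assert(!inCondition(S(0)));
    assert(cast(bool) S(2));   // explicit cast calls opCast!bool

    bool b = Flag(true);       // implicit conversion via alias this
    assert(b);
}
```

If a type should compare directly against bool, alias this (or an opEquals overload) is needed; opCast!bool alone only covers conditions and explicit casts.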
Re: opCast!bool
On Wednesday, January 04, 2012 00:35:20 Timon Gehr wrote:
> On 01/04/2012 12:31 AM, Jonathan M Davis wrote:
> > Yeah. It's the same for built-in types. Take arrays and pointers for
> > example. They don't implicitly convert to bool, but when you use them
> > in a condition, they implicitly convert to bool (true if they're
> > non-null, false if they're null). If you want implicit conversion in
> > general, then you need to use alias this.
>
> The conversion is explicit. if(x) is rewritten to if(cast(bool)x) and e
> && somethingElse is rewritten to cast(bool)e && cast(bool)somethingElse.

It's implicit as far as the programmer is concerned. You do if(x) without caring that x isn't a bool. That rewrite just explains why it works implicitly in that case but not in general.

- Jonathan M Davis
Re: AA char[] as key
Andrew Wiley:
> Some consequences of actually changing this:
> - This breaks D1 compatibility of AAs across the board because
> immutable simply didn't exist then

D1 compatibility will stop being a problem in some time :-)

> - Significant D2 code breakage as well

It's essentially wrong code, because you are declaring something you can't actually use, so "breaking it" is an improvement.

Bye,
bearophile