Re: default '==' on structs
On Wed, 02 Feb 2011 17:35:50 +0100, spir wrote:

> On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:
>> On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:
>>> Hello,
>>>
>>> What are the default semantics for '==' on structs?
>>>
>>> I ask this because I was forced to write opEquals on a struct to get expected behaviour. This struct is basically:
>>>
>>>     struct Lexeme {
>>>         string tag;
>>>         string slice;
>>>         Ordinal index;
>>>     }
>>>
>>> Equal Lexeme's compare unequal using default '=='. When I add:
>>>
>>>     const bool opEquals (ref const(Lexeme) l) {
>>>         return (
>>>                this.tag   == l.tag
>>>             && this.slice == l.slice
>>>             && this.index == l.index
>>>         );
>>>     }
>>>
>>> then all works fine. What do I miss?
>>
>> I think the compiler does a bitwise comparison in this case, meaning that it compares the arrays' pointers instead of their data. Related bug report:
>>
>> http://d.puremagic.com/issues/show_bug.cgi?id=3433
>>
>> -Lars
>
> Thank you, Lars.
> In fact, I do not really understand what you mean. But it helped me think further :-)
> Two points:
>
> * The issue reported is about '==' on structs not using member opEquals when defined, instead performing bitwise comparison. This is not my case: Lexeme members are plain strings and a uint. They should just be compared as is. Bitwise comparison should just work fine.
> Also, this issue is marked solved for dmd 2.037 (I use 2.051).

Yeah, but I would say it isn't really fixed. It seems that the final decision was that members which define opEquals() are compared using opEquals(), while all other members are compared bitwise. But built-in dynamic arrays can also be compared in two ways, using '==' (equality) or 'is' (identity, i.e. bitwise equality). Struct members which are dynamic arrays should, IMO, be compared using '==', but apparently they are not.

> * The following works as expected:
>
>     struct Floats {float f1, f2;}
>     struct Strings {string s1, s2;}
>     struct Lexeme {
>         string tag;
>         string slice;
>         uint index;
>     }
>
>     unittest {
>         assert ( Floats(1.1,2.2) == Floats(1.1,2.2) );
>         assert ( Strings("a","b") == Strings("a","b") );
>         assert ( Lexeme("a","b",1) == Lexeme("a","b",1) );
>     }
>
> This shows, if I'm right:
> 1. Array (string) members are compared by value, not by ref/pointer.
> 2. Comparing Lexeme's works in this test case.

Nope, it doesn't show that, because you are assigning literals to your strings, and DMD is smart enough to detect duplicate literals.

    string s1 = "foo";
    string s2 = "foo";
    assert (s1.ptr == s2.ptr);

That is actually pretty cool, by the way. :)

Here's an example to demonstrate my point:

    import std.stdio;

    struct T { string s; }

    void main(string[] args)
    {
        auto s1 = args[1];
        auto s2 = args[2];
        auto t1 = T(s1);
        auto t2 = T(s2);

        if (s1 == s2) writeln("Arrays are equal");
        else writeln("Arrays are different");

        if (t1 == t2) writeln("Structs are equal");
        else writeln("Structs are different");
    }

If run with the arguments "foo bar" it prints:

    Arrays are different
    Structs are different

If run with the arguments "foo foo" it prints:

    Arrays are equal
    Structs are different

-Lars
Re: default '==' on structs
On 02/03/2011 09:09 AM, Lars T. Kyllingstad wrote:

> [...]
> If run with the arguments "foo bar" it prints:
>
>     Arrays are different
>     Structs are different
>
> If run with the arguments "foo foo" it prints:
>
>     Arrays are equal
>     Structs are different
>
> -Lars

Thank you again, Lars: I was wrong and you are right. The key point is interned string literals, which interacted with my issue.

Denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
On Wed, 02 Feb 2011 11:35:50 -0500, spir <denis.s...@gmail.com> wrote:

> On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:
>> I think the compiler does a bitwise comparison in this case, meaning that it compares the arrays' pointers instead of their data. Related bug report:
>
> Thank you, Lars.
> In fact, I do not really understand what you mean. But it helped me think further :-)

I couldn't tell from your posts whether you understand the issue.

A bitwise comparison compares ONLY the bits in the struct, NOT what the struct points to. Comparing two arrays compares the data they point to.

So what is happening is essentially this: the struct default comparison checks that both strings are equal in the identity sense, i.e. they both point to the exact same data with the exact same length.

If you analyze a string array, it looks like this (switch to mono-spaced font now :) :

    +-----------------------+
    | int length            |
    | immutable(char) *ptr -|---> "hello world"
    +-----------------------+

The pointer points to the data; the data is not contained within the array head. The bitwise comparison only compares the head (what's in the box).

Apologies if you already understood this, but I wanted to be sure that you got it.

-Steve
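The head-versus-data distinction Steve describes can be checked directly on plain arrays, without any struct involved. The following is only a minimal sketch: the helper names sameHead and sameData are made up for illustration, and the two strings are built with .idup from a mutable buffer so that they cannot share storage the way literals may.

```d
import std.stdio : writeln;

/// Identity: same pointer and same length (what a bitwise compare sees).
bool sameHead(const(char)[] x, const(char)[] y)
{
    return x.ptr is y.ptr && x.length == y.length;
}

/// Equality: same characters, wherever they are stored.
bool sameData(const(char)[] x, const(char)[] y)
{
    return x == y;
}

void main()
{
    // Two separate run-time copies of the same text.
    char[] buffer = "hello world".dup;
    string a = buffer.idup;
    string b = buffer.idup;

    assert(sameData(a, b));              // contents match
    assert(!sameHead(a, b));             // heads differ: different pointers
    assert((a is b) == sameHead(a, b));  // 'is' on arrays compares the heads

    // Copying the head gives a second reference to the same data.
    string c = a;
    assert(sameHead(a, c));

    writeln("same data: ", sameData(a, b), ", same head: ", sameHead(a, b));
}
```

On arrays themselves, '==' and 'is' keep these two meanings regardless of compiler version; what the thread questions is which of the two the default struct '==' applies to array members.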
Re: default '==' on structs
On Thu, 03 Feb 2011 12:52:28 -0500, spir <denis.s...@gmail.com> wrote:

> Side-questions: is it written somewhere dmd interns string literals? If yes, where? Is this supposed to be part of D's spec or an implementation aspect of dmd?

String literals are immutable, which means the compiler is free to re-use them wherever it wants without repercussions (you can't change immutable data). It's not documented, but it fits within the requirements.

One thing that *is* documented is that string literals always have an implicit 0 character appended to the end of them, to allow easy interaction with C.

-Steve
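The documented terminator Steve mentions can be observed with a short sketch like the one below. The helper cLength is a made-up name that simply wraps C's strlen; note that the guarantee covers literals only, so strings built at run time would normally go through std.string.toStringz before being handed to C.

```d
import core.stdc.string : strlen;

/// Length as C sees it: counts characters up to the first 0 byte.
size_t cLength(const(char)* p)
{
    return strlen(p);
}

void main()
{
    string lit = "hello";

    // The terminator is not part of the D string's length ...
    assert(lit.length == 5);

    // ... but it sits directly after the last character of a literal.
    assert(lit.ptr[5] == '\0');

    // So the literal's pointer can be passed to a C function as is.
    assert(cLength(lit.ptr) == 5);
}
```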
Re: default '==' on structs
On 02/03/2011 07:00 PM, Steven Schveighoffer wrote:

> On Thu, 03 Feb 2011 12:52:28 -0500, spir <denis.s...@gmail.com> wrote:
>> Side-questions: is it written somewhere dmd interns string literals? If yes, where? Is this supposed to be part of D's spec or an implementation aspect of dmd?
>
> String literals are immutable, which means the compiler is free to re-use them wherever it wants without repercussions (you can't change immutable data). It's not documented, but it fits within the requirements.
>
> One thing that *is* documented is that string literals always have an implicit 0 character appended to the end of them, to allow easy interaction with C.

Right, thank you again, Steve.

An additional issue, then, is that this makes struct '==' behave inconsistently for literal vs non-literal string members (and for literal strings vs all other arrays, in fact).

Denis
--
vita es estrany
spir.wikidot.com
default '==' on structs
Hello,

What are the default semantics for '==' on structs?

I ask this because I was forced to write opEquals on a struct to get expected behaviour. This struct is basically:

    struct Lexeme {
        string tag;
        string slice;
        Ordinal index;
    }

Equal Lexeme's compare unequal using default '=='. When I add:

    const bool opEquals (ref const(Lexeme) l) {
        return (
               this.tag   == l.tag
            && this.slice == l.slice
            && this.index == l.index
        );
    }

then all works fine. What do I miss?

Denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:

> Hello,
>
> What are the default semantics for '==' on structs?
>
> I ask this because I was forced to write opEquals on a struct to get expected behaviour. This struct is basically:
>
>     struct Lexeme {
>         string tag;
>         string slice;
>         Ordinal index;
>     }
>
> Equal Lexeme's compare unequal using default '=='. When I add:
>
>     const bool opEquals (ref const(Lexeme) l) {
>         return (
>                this.tag   == l.tag
>             && this.slice == l.slice
>             && this.index == l.index
>         );
>     }
>
> then all works fine. What do I miss?

I think the compiler does a bitwise comparison in this case, meaning that it compares the arrays' pointers instead of their data. Related bug report:

http://d.puremagic.com/issues/show_bug.cgi?id=3433

-Lars
Re: default '==' on structs
On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:

> On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:
>> Hello,
>>
>> What are the default semantics for '==' on structs?
>>
>> I ask this because I was forced to write opEquals on a struct to get expected behaviour. This struct is basically:
>>
>>     struct Lexeme {
>>         string tag;
>>         string slice;
>>         Ordinal index;
>>     }
>>
>> Equal Lexeme's compare unequal using default '=='. When I add:
>>
>>     const bool opEquals (ref const(Lexeme) l) {
>>         return (
>>                this.tag   == l.tag
>>             && this.slice == l.slice
>>             && this.index == l.index
>>         );
>>     }
>>
>> then all works fine. What do I miss?
>
> I think the compiler does a bitwise comparison in this case, meaning that it compares the arrays' pointers instead of their data. Related bug report:
>
> http://d.puremagic.com/issues/show_bug.cgi?id=3433
>
> -Lars

Thank you, Lars.
In fact, I do not really understand what you mean. But it helped me think further :-)
Two points:

* The issue reported is about '==' on structs not using member opEquals when defined, instead performing bitwise comparison. This is not my case: Lexeme members are plain strings and a uint. They should just be compared as is. Bitwise comparison should just work fine.
Also, this issue is marked solved for dmd 2.037 (I use 2.051).

* The following works as expected:

    struct Floats {float f1, f2;}
    struct Strings {string s1, s2;}
    struct Lexeme {
        string tag;
        string slice;
        uint index;
    }

    unittest {
        assert ( Floats(1.1,2.2) == Floats(1.1,2.2) );
        assert ( Strings("a","b") == Strings("a","b") );
        assert ( Lexeme("a","b",1) == Lexeme("a","b",1) );
    }

This shows, if I'm right:
1. Array (string) members are compared by value, not by ref/pointer.
2. Comparing Lexeme's works in this test case.

* Why does my app then need opEquals, just to compare member per member (see code above)?

The issue happens in a unittest. Lexemes are generated by a typical use of the module's features, then assert() compares them to the expected result:

    assert ( lexeme == Lexeme(expected_data) );

I'll try to reduce the issue to isolate the key point.

Thank you for your help,
denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
What is Ordinal defined as? If it's a uint, I get the expected results:

    alias uint Ordinal;

    struct Lexeme {
        string tag;
        string slice;
        Ordinal index;
    }

    void main()
    {
        auto lex1 = Lexeme("a","b",1);
        auto lex2 = Lexeme("a","b",1);
        assert(lex1 == lex2);
        assert(lex1 == Lexeme("a","b",1));
    }

Can't say much more without knowing what your app does though.
Re: default '==' on structs
spir:

> * The issue reported is about '==' on structs not using member opEquals when defined, instead performing bitwise comparison. This is not my case: Lexeme members are plain strings and an uint. They should just be compared as is. Bitwise comparison should just work fine.
> Also, this issue is marked solved for dmd 2.037 (I use 2.051).

Lars is right, the == among structs is broken still:

    struct Foo { string s; }

    void main() {
        string s1 = "he";
        string s2 = "llo";
        string s3 = "hel";
        string s4 = "lo";
        auto f1 = Foo(s1 ~ s2);
        auto f2 = Foo(s3 ~ s4);
        assert((s1 ~ s2) == (s3 ~ s4));
        assert(f1 == f2);
    }

Bye,
bearophile
Re: default '==' on structs
On 02/02/2011 05:49 PM, Andrej Mitrovic wrote:

> What is Ordinal defined as? If it's a uint, I get the expected results:
>
>     alias uint Ordinal;
>
>     struct Lexeme {
>         string tag;
>         string slice;
>         Ordinal index;
>     }
>
>     void main()
>     {
>         auto lex1 = Lexeme("a","b",1);
>         auto lex2 = Lexeme("a","b",1);
>         assert(lex1 == lex2);
>         assert(lex1 == Lexeme("a","b",1));
>     }
>
> Can't say much more without knowing what your app does though.

Actually, it's size_t. But I also have everything working fine in a test case exactly similar to yours (see other post). Dunno yet why I need to add an opEquals just comparing members individually for my unittests to pass.

I take the opportunity to say a few words about the module, in case (1) it helps debugging, (2) some people are interested in it.

The module is a lexing toolkit. It allows creating a lexer from a language's morphology, then using it to scan source. Example for simple arithmetics:

    Morphology morphology = [
        [ "SPACING" ,     `[\ \t]*` ],
        [ "OPEN_GROUP" ,  `(` ],
        [ "CLOSE_GROUP" , `)` ],
        [ "operator" ,    `[+*-/]` ],
        [ "symbol" ,      `[a-zA-Z][a-zA-Z0-9]*` ],
        [ "number" ,      `[+-]?[0-9]+(\.[0-9]+)?` ],
    ];
    auto lexer = new Lexer(morphology);
    auto lexemes = lexer.lexemes(source);

As you see, each lexeme kind is defined by a string tag and a regex format. The output is an array of lexemes holding the matched slice, wrapped in a class LexemeStream. This class mainly provides a match method:

    Lexeme* match (tag)

Match returns a pointer to the current lexeme if it is of the right kind, else null (same principle as D's builtin 'in' operator). So, one can either ignore the lexeme if all that is needed is testing the match (case of punctuation), or use the lexeme's slice (case of values).

The issue I get happens when checking that a result stream of lexemes is as expected: '==' failed. I then checked its first/last lexemes only: ditto. Thus, I started to wonder about the default semantics of '==' for structs, so I wrote my own opEquals ==> pass for individual lexemes, pass for whole lexeme streams. Why? dunno.

Denis
--
vita es estrany
spir.wikidot.com
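The match method described above, which returns a pointer or null in the manner of the 'in' operator, might look roughly like the sketch below. This is not the module's actual code: LexemeStream is modelled here as a struct rather than the class mentioned in the post, the cursor field is invented, and advancing past a matched lexeme is an assumption the post does not confirm.

```d
struct Lexeme
{
    string tag;
    string slice;
    size_t index;
}

// Hypothetical stand-in for the LexemeStream class mentioned in the post.
struct LexemeStream
{
    Lexeme[] lexemes;
    size_t cursor;   // position of the current lexeme (assumed)

    /// Pointer to the current lexeme if its tag matches, else null.
    /// Moving the cursor forward on success is an assumption.
    Lexeme* match(string tag)
    {
        if (cursor < lexemes.length && lexemes[cursor].tag == tag)
            return &lexemes[cursor++];
        return null;
    }
}

void main()
{
    auto stream = LexemeStream([Lexeme("number", "42", 0),
                                Lexeme("operator", "+", 2)]);

    // Wrong kind: nothing is consumed.
    assert(stream.match("operator") is null);

    // Right kind: the pointer gives access to the matched slice.
    if (auto lex = stream.match("number"))
        assert(lex.slice == "42");
    else
        assert(false);

    // Punctuation-style use: only the success of the match matters.
    assert(stream.match("operator") !is null);
    assert(stream.match("number") is null);
}
```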
Re: default '==' on structs
On 02/02/2011 07:05 PM, bearophile wrote:

> spir:
>> * The issue reported is about '==' on structs not using member opEquals when defined, instead performing bitwise comparison. This is not my case: Lexeme members are plain strings and an uint. They should just be compared as is. Bitwise comparison should just work fine.
>> Also, this issue is marked solved for dmd 2.037 (I use 2.051).
>
> Lars is right, the == among structs is broken still:
>
>     struct Foo { string s; }
>
>     void main() {
>         string s1 = "he";
>         string s2 = "llo";
>         string s3 = "hel";
>         string s4 = "lo";
>         auto f1 = Foo(s1 ~ s2);
>         auto f2 = Foo(s3 ~ s4);
>         assert((s1 ~ s2) == (s3 ~ s4));
>         assert(f1 == f2);
>     }

Thank you, this helps much.

I don't get the details yet, but think some similar issue is playing a role in my case. String members of the compared Lexeme structs are not concatenated, but one of them is sliced from the scanned source. If I dup'ed instead of slicing, this would create brand new strings; thus '==' performing bitwise comp should run fine, don't you think? I'll try in a short while.

Do you know more about why/how the above fails?

Denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
On 02/02/2011 07:09 PM, bearophile wrote:

>> Lars is right, the == among structs is broken still:
>
> If necessary please open a new bug report, this is an important bug.

Right, I'll do it when (hopefully) I understand more about the details of why/how '==' fails in my case.

Denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
On 02/02/2011 07:09 PM, bearophile wrote:

>> Lars is right, the == among structs is broken still:
>
> If necessary please open a new bug report, this is an important bug.
>
> Bye,
> bearophile

Right, reduced the bug cases I found to:

    struct S {string s;}

    unittest {
        // concat
        string s1 = "he"; string s2 = "llo";
        string s3 = "hel"; string s4 = "lo";
        assert ( S(s1 ~ s2) != S(s3 ~ s4) );
        // slice
        string s = "hello";
        assert ( S(s[1..$-1]) != S("ell") );
    }

Same for array members (indeed):

    struct A {int[] a;}

    unittest {
        // concat
        int[] a1 = [1,2]; int[] a2 = [3];
        int[] a3 = [1]; int[] a4 = [2,3];
        assert ( A(a1 ~ a2) != A(a3 ~ a4) );
        // slice
        int[] a = [1,2,3];
        assert ( A(a[1..$-1]) != A([2]) );
    }

But this is not very relevant, because plain array /members/ (unlike strings) seem to be compared by ref (exactly: by array struct):

    unittest {
        // string
        string s1 = "hello"; string s2 = "hello";
        assert ( S(s1) == S(s2) );
        // array (note '!=' assert)
        int[] a1 = [1,2,3]; int[] a2 = [1,2,3];
        assert ( A(a1) != A(a2) );
    }

I am thinking of opening a new bug report in a short while, with reference to issue #3433 (http://d.puremagic.com/issues/show_bug.cgi?id=3433), which was (unduly?) marked as fixed for dmd 2.037. In the meanwhile, if anyone knows about related cases of the bug, or has more info, please tell.

On the other hand, the example of arrays makes me doubt what the correct / desirable semantics are.
1. Indeed, I think string members should be compared by value.
2. But arrays are not, so should strings be compared by ref as well, if only to avoid inconsistency?
3. But then, why the already existing difference between strings & arrays?
4. Or should arrays be compared by value like strings?
5. But strings are not /really/ compared by value as of now...

The current behaviour is weird. I don't know how it can even happen.

Denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
spir:

> Do you know more about why/how the above fails?

It's simple. A string (or array) is a 2-words long struct that contains a pointer to the data and a size_t length. Default struct equality just compares the bits of those two fields. In the above example I have created f1 and f2 using two strings that have the same contents and lengths, but the pointers are different, because they are generated at run-time (normally the compiler uses a pool of shared string literals), so the equality fails. I have asked Walter to fix this problem with strings and arrays probably three years ago or more, it's not a new problem :-)

Bye,
bearophile
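bearophile's explanation can be reproduced without relying on what any particular compiler does for the default '==': the sketch below spells out the bitwise comparison by hand with C's memcmp. The helper name bitwiseEqual is invented for illustration, and Foo is the struct from the earlier example.

```d
import core.stdc.string : memcmp;

struct Foo { string s; }

/// Compare the raw bits of two Foo values: pointer and length only,
/// never the characters the pointer refers to.
bool bitwiseEqual(ref const Foo a, ref const Foo b)
{
    return memcmp(&a, &b, Foo.sizeof) == 0;
}

void main()
{
    string s1 = "he", s2 = "llo", s3 = "hel", s4 = "lo";

    // Both concatenations happen at run time, in separate allocations.
    auto f1 = Foo(s1 ~ s2);
    auto f2 = Foo(s3 ~ s4);

    assert(f1.s == f2.s);           // same contents ...
    assert(!bitwiseEqual(f1, f2));  // ... but different pointers

    // A copy carries the same pointer and length, so its bits match.
    auto f3 = f1;
    assert(bitwiseEqual(f1, f3));
}
```

Whether `f1 == f2` itself passes depends on the compiler version; the thread reports that it fails with dmd 2.051.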
Re: default '==' on structs
On 02/02/2011 07:41 PM, spir wrote:

> On 02/02/2011 07:05 PM, bearophile wrote:
>> spir:
>>> * The issue reported is about '==' on structs not using member opEquals when defined, instead performing bitwise comparison. This is not my case: Lexeme members are plain strings and an uint. They should just be compared as is. Bitwise comparison should just work fine.
>>> Also, this issue is marked solved for dmd 2.037 (I use 2.051).
>>
>> Lars is right, the == among structs is broken still:
>>
>>     struct Foo { string s; }
>>
>>     void main() {
>>         string s1 = "he";
>>         string s2 = "llo";
>>         string s3 = "hel";
>>         string s4 = "lo";
>>         auto f1 = Foo(s1 ~ s2);
>>         auto f2 = Foo(s3 ~ s4);
>>         assert((s1 ~ s2) == (s3 ~ s4));
>>         assert(f1 == f2);
>>     }
>
> Thank you, this helps much.
>
> I don't get the details yet, but think some similar issue is playing a role in my case. String members of the compared Lexeme structs are not concatenated, but one of them is sliced from the scanned source. If I dup'ed instead of slicing, this would create brand new strings; thus '==' performing bitwise comp should run fine, don't you think? I'll try in a short while.

No! idup does not help, still need opEquals. See also this example case:

    struct S {string s;}

    unittest {
        // concat
        string s1 = "he"; string s2 = "llo";
        string s3 = "hel"; string s4 = "lo";
        assert ( S(s1 ~ s2) != S(s3 ~ s4) );
        // slice
        string s = "hello";
        assert ( S(s[1..$-1]) != S("ell") );
        // idup'ed
        assert ( S(s[1..$-1].idup) != S("ell") );
        s2 = s[1..$-1].idup;
        assert ( S(s2) != S("ell") );
    }

Denis
--
vita es estrany
spir.wikidot.com
Re: default '==' on structs
On 02/02/2011 08:20 PM, bearophile wrote:

> spir:
>> Do you know more about why/how the above fails?
>
> It's simple. A string (or array) is a 2-words long struct that contains a pointer to the data and a size_t length. Default struct equality just compares the bits of those two fields. In the above example I have created f1 and f2 using two strings that have the same contents and lengths, but the pointers are different, because they are generated at run-time (normally the compiler uses a pool of shared string literals), so the equality fails. I have asked Walter to fix this problem with strings and arrays probably three years ago or more, it's not a new problem :-)

All right, you mean string literals are interned? That would explain why the case below works...

    struct S {string s;}

    unittest {
        // plainly equal members
        string s01 = "hello"; string s02 = "hello";
        assert ( S(s01) == S(s02) );
    }

... because s01 & s02 are actually the same, unique, piece of data in memory (thus pointers are equal indeed)?

I'm ok to write another bug report as you asked. But since you've asked for this already, and there is bug #3433 on a very similar topic supposedly closed as well, I fear it's useless, don't you? And if we fix string, then the case of regular arrays becomes inconsistent.

The core issue about clear semantics, I guess, is that the case above works *due to* an implementation detail. The rest is just annoying (need to write opEquals to get expected semantics in 99% cases, probably), but /not/ inconsistent.

Denis
--
vita es estrany
spir.wikidot.com
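For compilers that behave as described in this thread, writing the same field-by-field opEquals in every struct gets repetitive. One possible way around that, offered only as a sketch, is a small mixin template that compares each member with '=='. The name MemberwiseEquals is invented here and is not part of the standard library; Lexeme reuses the fields from the original post, with size_t for the index.

```d
/// Hypothetical helper: gives a struct an opEquals that compares
/// every field with '==', so array members are compared by content.
mixin template MemberwiseEquals()
{
    bool opEquals(const typeof(this) rhs) const
    {
        // foreach over tupleof is unrolled per field at compile time.
        foreach (i, field; this.tupleof)
            if (field != rhs.tupleof[i])
                return false;
        return true;
    }
}

struct Lexeme
{
    string tag;
    string slice;
    size_t index;

    mixin MemberwiseEquals;
}

void main()
{
    string s1 = "he", s2 = "llo", s3 = "hel", s4 = "lo";

    // Same contents, built at run time in separate allocations.
    auto a = Lexeme("word", s1 ~ s2, 1);
    auto b = Lexeme("word", s3 ~ s4, 1);
    assert(a.slice.ptr !is b.slice.ptr);
    assert(a == b);

    // Any differing field makes the values unequal.
    assert(a != Lexeme("word", s1 ~ s2, 2));
}
```

Taking the right-hand side by value keeps the sketch simple and lets it accept temporaries such as `Lexeme(...)` literals; for large structs a by-reference overload may be preferable.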
Re: default '==' on structs
spir:

> And if we fix string, then the case of regular arrays becomes inconsistent.

The bug report is about arrays too, of course. I will write this bug report.

Bye,
bearophile
Re: default '==' on structs
> The bug report is about arrays too, of course. I will write this bug report.

http://d.puremagic.com/issues/show_bug.cgi?id=5519

Bye,
bearophile