Re: default '==' on structs

2011-02-03 Thread Lars T. Kyllingstad
On Wed, 02 Feb 2011 17:35:50 +0100, spir wrote:

 On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:
 On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:

 Hello,

 What are the default semantics for '==' on structs?

 I ask this because I was forced to write opEquals on a struct to get
 expected behaviour. This struct is basically:

 struct Lexeme {
   string tag;
   string slice;
   Ordinal index;
 }

 Equal Lexeme's compare unequal using default '=='. When I add:

   const bool opEquals (ref const(Lexeme) l) {
       return (
              this.tag   == l.tag
           && this.slice == l.slice
           && this.index == l.index
       );
   }

 then all works fine. What do I miss?

 I think the compiler does a bitwise comparison in this case, meaning
 that it compares the arrays' pointers instead of their data.  Related
 bug report:

http://d.puremagic.com/issues/show_bug.cgi?id=3433

 -Lars
 
 Thank you, Lars.
 In fact, I do not really understand what you mean. But it helped me
 think further :-)
 Two points:
 
 * The issue reported is about '==' on structs not using member opEquals
 when defined, instead performing bitwise comparison. This is not my
 case: Lexeme members are plain strings and an uint. They should just be
 compared as is. Bitwise comparison should just work fine. Also, this
 issue is marked solved for dmd 2.037 (I use 2.051).

Yeah, but I would say it isn't really fixed.  It seems that the final 
decision was that members which define opEquals() are compared using 
opEquals(), while all other members are compared bitwisely.  But built-in 
dynamic arrays can also be compared in two ways, using '==' (equality) or 
'is' (identity, i.e. bitwise equality).  Struct members which are dynamic 
arrays should, IMO, be compared using '==', but apparently they are not.
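
As a minimal sketch of the two comparisons mentioned above (not code from the thread; the variable names are made up for illustration), '==' on dynamic arrays compares contents, while 'is' compares only the pointer/length pair:

```d
void main()
{
    string a = "hello";
    string b = a.idup;   // idup allocates a fresh copy of the characters

    assert(a == b);               // equality: same length and same contents
    assert(a !is b);              // identity: the slices refer to different memory
    assert(a.ptr !is b.ptr);      // the pointers differ...
    assert(a.length == b.length); // ...even though the lengths agree
}
```

A struct whose members are compared bitwise effectively gets the 'is' result for each array member.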


 * The following works as expected:
 
 struct Floats {float f1, f2;}
 struct Strings {string s1, s2;}
 struct Lexeme {
  string tag;
  string slice;
  uint index;
 }
 
 unittest {
  assert ( Floats(1.1,2.2)  == Floats(1.1,2.2) );
  assert ( Strings("a","b") == Strings("a","b") );
  assert ( Lexeme("a","b",1) == Lexeme("a","b",1) );
 }
 
 This shows, if I'm right:
 1. Array (string) members are compared by value, not by ref/pointer.
 2. Comparing Lexeme's works in this test case.

Nope, it doesn't show that, because you are assigning literals to your 
strings, and DMD is smart enough to detect duplicate literals.

string s1 = "foo";
string s2 = "foo";
assert (s1.ptr == s2.ptr);

That is actually pretty cool, by the way. :)

Here's an example to demonstrate my point:

import std.stdio;

struct T { string s; }

void main(string[] args)
{
auto s1 = args[1];
auto s2 = args[2];
auto t1 = T(s1);
auto t2 = T(s2);

if (s1 == s2) writeln("Arrays are equal");
else writeln("Arrays are different");

if (t1 == t2) writeln("Structs are equal");
else writeln("Structs are different");
}

If run with the arguments foo bar it prints:

Arrays are different
Structs are different

If run with the arguments foo foo it prints:

Arrays are equal
Structs are different

-Lars


Re: default '==' on structs

2011-02-03 Thread spir

On 02/03/2011 09:09 AM, Lars T. Kyllingstad wrote:

On Wed, 02 Feb 2011 17:35:50 +0100, spir wrote:


On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:

On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:


Hello,

What are the default semantics for '==' on structs?

I ask this because I was forced to write opEquals on a struct to get
expected behaviour. This struct is basically:

struct Lexeme {
   string tag;
   string slice;
   Ordinal index;
}

Equal Lexeme's compare unequal using default '=='. When I add:

   const bool opEquals (ref const(Lexeme) l) {
       return (
              this.tag   == l.tag
           && this.slice == l.slice
           && this.index == l.index
       );
   }

then all works fine. What do I miss?


I think the compiler does a bitwise comparison in this case, meaning
that it compares the arrays' pointers instead of their data.  Related
bug report:

http://d.puremagic.com/issues/show_bug.cgi?id=3433

-Lars


Thank you, Lars.
In fact, I do not really understand what you mean. But it helped me
think further :-)
Two points:

* The issue reported is about '==' on structs not using member opEquals
when defined, instead performing bitwise comparison. This is not my
case: Lexeme members are plain strings and an uint. They should just be
compared as is. Bitwise comparison should just work fine. Also, this
issue is marked solved for dmd 2.037 (I use 2.051).


Yeah, but I would say it isn't really fixed.  It seems that the final
decision was that members which define opEquals() are compared using
opEquals(), while all other members are compared bitwisely.  But built-in
dynamic arrays can also be compared in two ways, using '==' (equality) or
'is' (identity, i.e. bitwise equality).  Struct members which are dynamic
arrays should, IMO, be compared using '==', but apparently they are not.



* The following works as expected:

struct Floats {float f1, f2;}
struct Strings {string s1, s2;}
struct Lexeme {
  string tag;
  string slice;
  uint index;
}

unittest {
  assert ( Floats(1.1,2.2)  == Floats(1.1,2.2) );
  assert ( Strings("a","b") == Strings("a","b") );
  assert ( Lexeme("a","b",1) == Lexeme("a","b",1) );
}

This shows, if I'm right:
1. Array (string) members are compared by value, not by ref/pointer.
2. Comparing Lexeme's works in this test case.


Nope, it doesn't show that, because you are assigning literals to your
strings, and DMD is smart enough to detect duplicate literals.

 string s1 = "foo";
 string s2 = "foo";
 assert (s1.ptr == s2.ptr);

That is actually pretty cool, by the way. :)

Here's an example to demonstrate my point:

 import std.stdio;

 struct T { string s; }

 void main(string[] args)
 {
 auto s1 = args[1];
 auto s2 = args[2];
 auto t1 = T(s1);
 auto t2 = T(s2);

 if (s1 == s2) writeln("Arrays are equal");
 else writeln("Arrays are different");

 if (t1 == t2) writeln("Structs are equal");
 else writeln("Structs are different");
 }

If run with the arguments foo bar it prints:

 Arrays are different
 Structs are different

If run with the arguments foo foo it prints:

 Arrays are equal
 Structs are different

-Lars


Thank you again, Lars: I was wrong and you are right. The key point is interned 
string literals, that interacted with my issue.


Denis
--
_
vita es estrany
spir.wikidot.com



Re: default '==' on structs

2011-02-03 Thread Steven Schveighoffer

On Wed, 02 Feb 2011 11:35:50 -0500, spir denis.s...@gmail.com wrote:


On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:


I think the compiler does a bitwise comparison in this case, meaning that
it compares the arrays' pointers instead of their data.  Related bug
report:


Thank you, Lars.
In fact, I do not really understand what you mean. But it helped me  
think further :-)


I couldn't get from all your posts that you understand the issue.  A  
bitwise comparison compares ONLY the bits in the struct, NOT what the  
struct points to.


Comparing two arrays compares the data they point to.  So what is  
happening is essentially, the struct default comparison is comparing that  
both strings are equal in the identity sense, i.e. they both point to the  
exact same data with the exact same length.


If you analyze a string array, it looks like this (switch to mono-spaced  
font now :) :



+----------------------+
|int length            |
|immutable(char) *ptr -|--> "hello world"
+----------------------+

The pointer points to the data, it is not contained within the array  
head.  The bitwise comparison only compares the head (what's in the box).
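
The head/data split described above can be checked directly. The following is a small sketch under the assumption of a current D compiler (the variable names are illustrative); note that in practice the length field is a size_t rather than an int:

```d
void main()
{
    // A slice is just a length and a pointer, so it occupies two machine words.
    static assert(string.sizeof == 2 * size_t.sizeof);

    string s = "hello world";
    string t = s.idup;          // same characters, stored elsewhere

    // The heads differ (different pointer), although the data is equal.
    assert(s.length == t.length);
    assert(s.ptr !is t.ptr);
    assert(s == t);     // compares the pointed-to characters
    assert(s !is t);    // compares only the heads
}
```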


Apologies if you already understood this, but I wanted to be sure that you  
got it.


-Steve


Re: default '==' on structs

2011-02-03 Thread Steven Schveighoffer

On Thu, 03 Feb 2011 12:52:28 -0500, spir denis.s...@gmail.com wrote:



Side-questions: is it written somewhere dmd interns string literals? If  
yes, where? Is this supposed to be part of D's spec or an implementation  
aspect of dmd?


String literals are immutable, which means the compiler is free to re-use  
them wherever it wants without repercussions (you can't change immutable  
data).


It's not documented, but it fits within the requirements.

One thing that *is* documented is that string literals always have an  
implicit 0 character appended to the end of them, to allow easy  
interaction with C.
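
A short sketch of what that implicit terminator allows (illustrative only; it applies to literals, not to arbitrary runtime strings, for which std.string.toStringz is the usual route):

```d
import core.stdc.string : strlen;

void main()
{
    string s = "hello";

    // The terminator sits just past the end and is not counted in .length.
    assert(s.length == 5);
    assert(s.ptr[s.length] == '\0');

    // A literal converts implicitly to const(char)*, so it can go straight to C.
    assert(strlen("hello") == 5);
}
```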


-Steve


Re: default '==' on structs

2011-02-03 Thread spir

On 02/03/2011 07:00 PM, Steven Schveighoffer wrote:

On Thu, 03 Feb 2011 12:52:28 -0500, spir denis.s...@gmail.com wrote:



Side-questions: is it written somewhere dmd interns string literals? If yes,
where? Is this supposed to be part of D's spec or an implementation aspect of
dmd?


String literals are immutable, which means the compiler is free to re-use them
wherever it wants without repercussions (you can't change immutable data).

It's not documented, but it fits within the requirements.

One thing that *is* documented is that string literals always have an implicit
0 character appended to the end of them, to allow easy interaction with C.



Right, thank you again, Steve.
An additional issue, then, is that this makes struct '==' behave inconsistently 
for literal vs non-literal string members (and for literal strings vs all 
other arrays, in fact).


Denis



default '==' on structs

2011-02-02 Thread spir

Hello,

What are the default semantics for '==' on structs?

I ask this because I was forced to write opEquals on a struct to get expected 
behaviour. This struct is basically:


struct Lexeme {
string tag;
string slice;
Ordinal index;
}

Equal Lexeme's compare unequal using default '=='. When I add:

const bool opEquals (ref const(Lexeme) l) {
    return (
           this.tag   == l.tag
        && this.slice == l.slice
        && this.index == l.index
    );
}

then all works fine. What do I miss?
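
For reference, here is a self-contained sketch of the workaround described in this post. It is an illustration, not the module's actual code: the alias for Ordinal is an assumption (a later message in the thread says it is size_t), the sample tag and source text are made up, and the parameter is taken by value rather than by ref so that temporaries can be compared too.

```d
alias Ordinal = size_t;   // assumption, based on a later message in the thread

struct Lexeme
{
    string tag;
    string slice;
    Ordinal index;

    // Compare member by member, so strings are compared by content.
    bool opEquals(const Lexeme other) const
    {
        return tag == other.tag
            && slice == other.slice
            && index == other.index;
    }
}

void main()
{
    string source = "let x";

    auto scanned  = Lexeme("symbol", source[4 .. 5], 4); // slice of the source
    auto expected = Lexeme("symbol", "x".idup, 4);       // separately allocated

    // The two slices live at different addresses...
    assert(scanned.slice.ptr !is expected.slice.ptr);
    // ...but the explicit opEquals still reports the lexemes as equal.
    assert(scanned == expected);
}
```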

Denis



Re: default '==' on structs

2011-02-02 Thread Lars T. Kyllingstad
On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:

 Hello,
 
 What are the default semantics for '==' on structs?
 
 I ask this because I was forced to write opEquals on a struct to get
 expected behaviour. This struct is basically:
 
 struct Lexeme {
  string tag;
  string slice;
  Ordinal index;
 }
 
 Equal Lexeme's compare unequal using default '=='. When I add:
 
  const bool opEquals (ref const(Lexeme) l) {
      return (
             this.tag   == l.tag
          && this.slice == l.slice
          && this.index == l.index
      );
  }
 
 then all works fine. What do I miss?

I think the compiler does a bitwise comparison in this case, meaning that 
it compares the arrays' pointers instead of their data.  Related bug 
report:

  http://d.puremagic.com/issues/show_bug.cgi?id=3433

-Lars


Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 04:20 PM, Lars T. Kyllingstad wrote:

On Wed, 02 Feb 2011 15:55:53 +0100, spir wrote:


Hello,

What are the default semantics for '==' on structs?

I ask this because I was forced to write opEquals on a struct to get
expected behaviour. This struct is basically:

struct Lexeme {
  string tag;
  string slice;
  Ordinal index;
}

Equal Lexeme's compare unequal using default '=='. When I add:

  const bool opEquals (ref const(Lexeme) l) {
      return (
             this.tag   == l.tag
          && this.slice == l.slice
          && this.index == l.index
      );
  }

then all works fine. What do I miss?


I think the compiler does a bitwise comparison in this case, meaning that
it compares the arrays' pointers instead of their data.  Related bug
report:

   http://d.puremagic.com/issues/show_bug.cgi?id=3433

-Lars


Thank you, Lars.
In fact, I do not really understand what you mean. But it helped me think 
further :-)

Two points:

* The issue reported is about '==' on structs not using member opEquals when 
defined, instead performing bitwise comparison. This is not my case: Lexeme 
members are plain strings and an uint. They should just be compared as is. 
Bitwise comparison should just work fine.

Also, this issue is marked solved for dmd 2.037 (I use 2.051).

* The following works as expected:

struct Floats {float f1, f2;}
struct Strings {string s1, s2;}
struct Lexeme {
string tag;
string slice;
uint index;
}

unittest {
assert ( Floats(1.1,2.2)  == Floats(1.1,2.2) );
assert ( Strings("a","b") == Strings("a","b") );
assert ( Lexeme("a","b",1) == Lexeme("a","b",1) );
}

This shows, if I'm right:
1. Array (string) members are compared by value, not by ref/pointer.
2. Comparing Lexeme's works in this test case.

* Why does my app then need opEquals, just to compare member per member (see 
code above)?
The issue happens in a unittest. Lexemes are generated by a typical use of the 
module's features, then assert() compares them to expected result:

assert ( lexeme == Lexeme(expected_data) );
I'll try to reduce the issue to isolate the key point.

Thank you for your help,
denis



Re: default '==' on structs

2011-02-02 Thread Andrej Mitrovic
What is Ordinal defined as? If it's a uint, I get the expected results:

alias uint Ordinal;

struct Lexeme {
   string tag;
   string slice;
   Ordinal index;
}

void main()
{
auto lex1 = Lexeme("a","b",1);
auto lex2 = Lexeme("a","b",1);

assert(lex1 == lex2);
assert(lex1 == Lexeme("a","b",1));
}

Can't say much more without knowing what your app does though.


Re: default '==' on structs

2011-02-02 Thread bearophile
spir:

 * The issue reported is about '==' on structs not using member opEquals when 
 defined, instead performing bitwise comparison. This is not my case: Lexeme 
 members are plain strings and an uint. They should just be compared as is. 
 Bitwise comparison should just work fine.
 Also, this issue is marked solved for dmd 2.037 (I use 2.051).

Lars is right, the == among structs is broken still:

struct Foo { string s; }
void main() {
string s1 = "he";
string s2 = "llo";
string s3 = "hel";
string s4 = "lo";
auto f1 = Foo(s1 ~ s2);
auto f2 = Foo(s3 ~ s4);
assert((s1 ~ s2) == (s3 ~ s4));
assert(f1 == f2);
}

Bye,
bearophile


Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 05:49 PM, Andrej Mitrovic wrote:

What is Ordinal defined as? If it's a uint, I get the expected results:

alias uint Ordinal;

struct Lexeme {
string tag;
string slice;
Ordinal index;
}

void main()
{
 auto lex1 = Lexeme("a","b",1);
 auto lex2 = Lexeme("a","b",1);

 assert(lex1 == lex2);
 assert(lex1 == Lexeme("a","b",1));
}

Can't say much more without knowing what your app does though.


Actually, it's size_t. But I also have everything working fine in a test case 
exactly similar to yours (see other post). Dunno yet why I need to add an 
opEquals just comparing members individually for my unittests to pass.


I take the opportunity to say a few words about the module, in case (1) it helps 
debugging or (2) some people are interested in it.
The module is a lexing toolkit. It allows creating a lexer from a language's 
morphology, then using it to scan source. Example for simple arithmetic:


Morphology morphology = [
[ "SPACING" ,     `[\ \t]*` ],
[ "OPEN_GROUP" ,  `(` ],
[ "CLOSE_GROUP" , `)` ],
[ "operator" ,    `[+*-/]` ],
[ "symbol" ,      `[a-zA-Z][a-zA-Z0-9]*` ],
[ "number" ,      `[+-]?[0-9]+(\.[0-9]+)?` ],
];
auto lexer = new Lexer(morphology);
auto lexemes = lexer.lexemes(source);

As you see, each lexeme kind is defined by a string tag and a regex format.
The output is an array of lexemes holding the matched slice, wrapped in a class 
LexemeStream. This class mainly provides a match method:

Lexeme* match (tag)
Match returns a pointer to the current lexeme if it is of the right kind, else 
null (same principle as D's builtin 'in' operator). So, one can either ignore 
the lexeme if all that is needed is testing the match (case of punctuation), or 
use the lexeme's slice (case of values).
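
For readers unfamiliar with the 'in' operator that match is modelled on, here is a minimal sketch using a plain associative array (the table and its keys are made up for illustration; this is not the toolkit's API):

```d
void main()
{
    int[string] counts = ["number": 3];

    // 'in' yields a pointer to the stored value, or null if the key is absent.
    if (auto p = "number" in counts)
    {
        assert(*p == 3);
        *p += 1;            // the pointer refers to the value inside the table
    }

    assert(("symbol" in counts) is null);
    assert(counts["number"] == 4);
}
```

The same pattern lets a caller test for a match and use the result in one step.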


The issue I get happens when checking that a result stream of lexemes is as 
expected: '==' failed. I then checked its first/last lexemes only: ditto. Thus, 
I started to wonder about the default semantics of '==' for structs, and 
wrote my own opEquals ==> pass for individual lexemes, pass for whole lexeme 
streams. Why? Dunno.


Denis



Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 07:05 PM, bearophile wrote:

spir:


* The issue reported is about '==' on structs not using member opEquals when
defined, instead performing bitwise comparison. This is not my case: Lexeme
members are plain strings and an uint. They should just be compared as is.
Bitwise comparison should just work fine.
Also, this issue is marked solved for dmd 2.037 (I use 2.051).


Lars is right, the == among structs is broken still:

struct Foo { string s; }
void main() {
 string s1 = "he";
 string s2 = "llo";
 string s3 = "hel";
 string s4 = "lo";
 auto f1 = Foo(s1 ~ s2);
 auto f2 = Foo(s3 ~ s4);
 assert((s1 ~ s2) == (s3 ~ s4));
 assert(f1 == f2);
}


Thank you, this helps much. I don't get the details yet, but think some similar 
issue is playing a role in my case. String members of the compared Lexeme 
structs are not concatenated, but one of them is sliced from the scanned source.
If I dup'ed instead of slicing, this would create brand new strings; thus '==' 
performing bitwise comp should run fine, don't you think? I'll try in a short 
while.


Do you know more about why/how the above fails?

Denis



Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 07:09 PM, bearophile wrote:

Lars is right, the == among structs is broken still:


If necessary please open a new bug report, this is an important bug.


Right, I'll do it when (hopefully) I understand more about the details of 
why/how '==' fails in my case.


Denis



Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 07:09 PM, bearophile wrote:

Lars is right, the == among structs is broken still:


If necessary please open a new bug report, this is an important bug.

Bye,
bearophile


Right, reduced the bug cases I found to:

struct S {string s;}
unittest {
// concat
string s1 = "he"; string s2 = "llo";
string s3 = "hel"; string s4 = "lo";
assert ( S(s1 ~ s2) != S(s3 ~ s4) );
// slice
string s = "hello";
assert ( S(s[1..$-1]) != S("ell") );
}

Same for array members (indeed):

struct A {int[] a;}
unittest {
// concat
int[] a1 = [1,2]; int[] a2 = [3];
int[] a3 = [1]; int[] a4 = [2,3];
assert ( A(a1 ~ a2) != A(a3 ~ a4) );
// slice
int[] a = [1,2,3];
assert ( A(a[1..$-1]) != A([2]) );
}

But this is not very relevant, because plain arrays /members/ (unlike strings) 
seem to be compared by ref (exactly by array struct):


unittest {
// string
string s1 = "hello"; string s2 = "hello";
assert ( S(s1) == S(s2) );
// array (note '!=' assert)
int[] a1 = [1,2,3]; int[] a2 = [1,2,3];
assert ( A(a1) != A(a2) );
}

I am thinking of opening a new bug report in a short while, with reference to issue 
#3433 (http://d.puremagic.com/issues/show_bug.cgi?id=3433) which was (unduly?) 
marked as fixed for dmd 2.037. In the meanwhile, if anyone knows about related 
bug cases, or has more info, please tell.


On the other hand, the example of arrays makes me doubt what the correct / 
desirable semantics are.

1. Indeed, I think string members should be compared by value.
2. But arrays are not, so should strings be compared by ref as well, if only to 
avoid inconsistency?
3. But then, why the already existing difference between strings & arrays?
4. Or should arrays be compared by value like strings?
5. But strings are not /really/ compared by value as of now...

The current behaviour is weird. I don't see how it can even happen.

Denis



Re: default '==' on structs

2011-02-02 Thread bearophile
spir:

 Do you know more about why/how the above fails?

It's simple. A string (or array) is a 2-words long struct that contains a 
pointer to the data and a size_t length. Default struct equality just compares 
the bits of those two fields. In the above example I have created f1 and f2 
using two strings that have the same contents and lengths, but the pointers are 
different, because they are generated at run-time (normally the compiler uses a 
pool of shared string literals), so the equality fails.
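
To make "compares the bits" concrete, here is a hedged sketch that performs the raw byte comparison explicitly with memcmp, so it does not depend on how any particular compiler version implements the default '=='. The struct mirrors the Foo example above; the use of idup to force a second allocation is an addition for illustration.

```d
import core.stdc.string : memcmp;

struct Foo { string s; }

void main()
{
    string s1 = "he";
    string s2 = "llo";

    auto f1 = Foo(s1 ~ s2);          // built at run time
    auto f2 = Foo((s1 ~ s2).idup);   // same contents, separate allocation

    // Contents and lengths agree...
    assert(f1.s == f2.s);
    assert(f1.s.length == f2.s.length);

    // ...but the raw bytes of the two structs differ, because the
    // pointer stored in each string head is different.
    assert(memcmp(&f1, &f2, Foo.sizeof) != 0);
}
```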

I have asked Walter to fix this problem with strings and arrays probably three 
years ago or more, it's not a new problem :-)

Bye,
bearophile


Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 07:41 PM, spir wrote:

On 02/02/2011 07:05 PM, bearophile wrote:

spir:


* The issue reported is about '==' on structs not using member opEquals when
defined, instead performing bitwise comparison. This is not my case: Lexeme
members are plain strings and an uint. They should just be compared as is.
Bitwise comparison should just work fine.
Also, this issue is marked solved for dmd 2.037 (I use 2.051).


Lars is right, the == among structs is broken still:

struct Foo { string s; }
void main() {
string s1 = "he";
string s2 = "llo";
string s3 = "hel";
string s4 = "lo";
auto f1 = Foo(s1 ~ s2);
auto f2 = Foo(s3 ~ s4);
assert((s1 ~ s2) == (s3 ~ s4));
assert(f1 == f2);
}


Thank you, this helps much. I don't get the details yet, but think some similar
issue is playing a role in my case. String members of the compared Lexeme
structs are not concatenated, but one of them is sliced from the scanned source.
If I dup'ed instead of slicing, this would create brand new strings; thus '=='
performing bitwise comp should run fine, don't you think? I'll try in a short
while.


No! idup does not help, still need opEquals. See also this example case:

struct S {string s;}
unittest {
// concat
string s1 = "he"; string s2 = "llo";
string s3 = "hel"; string s4 = "lo";
assert ( S(s1 ~ s2) != S(s3 ~ s4) );
// slice
string s = "hello";
assert ( S(s[1..$-1]) != S("ell") );
// idup'ed
assert ( S(s[1..$-1].idup) != S("ell") );
s2 = s[1..$-1].idup;
assert ( S(s2) != S("ell") );
}

Denis



Re: default '==' on structs

2011-02-02 Thread spir

On 02/02/2011 08:20 PM, bearophile wrote:

spir:


Do you know more about why/how the above fails?


It's simple. A string (or array) is a 2-words long struct that contains a 
pointer to the data and a size_t length. Default struct equality just compares 
the bits of those two fields. In the above example I have created f1 and f2 
using two strings that have the same contents and lengths, but the pointers are 
different, because they are generated at run-time (normally the compiler uses a 
pool of shared string literals), so the equality fails.

I have asked Walter to fix this problem with strings and arrays probably three 
years ago or more, it's not a new problem :-)


All right, you mean string literals are interned? Explaining why the case below 
works...


struct S {string s;}
unittest {
// plainly equal members
string s01 = "hello"; string s02 = "hello";
assert ( S(s01) == S(s02) );
}

... because s01 & s02 are actually the same, unique, piece of data in memory 
(thus pointers are equal indeed)?


I'm ok to write another bug report as you asked. But since you've asked for 
this already, and there is bug#3433 on a very similar topic supposedly closed 
as well, I fear it's useless, don't you?

And if we fix string, then the case of regular arrays becomes inconsistent.
The core issue about clear semantics, I guess, is that the case above works 
*due to* an implementation detail. The rest is just annoying (need to write 
opEquals to get expected semantics in 99% cases, probably), but /not/ inconsistent.


Denis



Re: default '==' on structs

2011-02-02 Thread bearophile
spir:

 And if we fix string, then the case of regular arrays becomes inconsistent.

The bug report is about arrays too, of course. I will write this bug report.

Bye,
bearophile


Re: default '==' on structs

2011-02-02 Thread bearophile
 The bug report is about arrays too, of course. I will write this bug report.

http://d.puremagic.com/issues/show_bug.cgi?id=5519

Bye,
bearophile