Re: null Vs [] return arrays

2011-04-07 Thread Regan Heath
On Tue, 05 Apr 2011 18:46:06 +0100, Steven Schveighoffer  
schvei...@yahoo.com wrote:
On Tue, 05 Apr 2011 13:24:49 -0400, Regan Heath re...@netmail.co.nz  
wrote:


On Fri, 01 Apr 2011 18:23:28 +0100, Steven Schveighoffer  
schvei...@yahoo.com wrote:


assert("" !is null); // works on D.  Try it.


Yes, but that's because this is a string literal.  It's not useful  
where you're getting your input from somewhere else.. like in the other  
2 use cases I mentioned.


But that isn't the same as [].  Basically, if you have an existing  
array, and you want to create a non-null empty array out of it, a slice  
of [0..0] always works.
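
A minimal sketch of the [0..0] point, assuming the existing array has a non-null pointer to begin with (slicing a null array would still give a null result). The helper name emptyViewOf is made up for illustration and does not come from the thread:

```d
// Hypothetical helper (not from the thread): returns an empty slice
// that still points at the start of the caller's array.
int[] emptyViewOf(int[] source)
{
    return source[0 .. 0];
}

void main()
{
    int[] data = [1, 2, 3];
    int[] empty = emptyViewOf(data);
    int[] none = null;

    assert(empty.length == 0);
    assert(empty.ptr is data.ptr); // shares the original storage
    assert(empty !is null);        // identity check: not the null array
    assert(none is null);
    assert(empty == none);         // == compares contents only, so both are "equal"
}
```

Note that == cannot tell the two apart; only an identity check (is / !is) or a look at .ptr can.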


I know you mention it, but I want to draw attention to the original  
problem, that [] returns a null array.  Other cases where you are not
using [] or "" are a separate issue.


All the cases you have brought up involve strings, for which there is a  
non-null array returned for "".  I still have not yet seen a compelling
use case for making [] return non-null.


Ahh.. I see, I really should have renamed the thread title.  I'm not, and  
never was, arguing for [] (specifically) returning non-null.  Sorry.




Quoting from your message previously (with added comment):


case 4:
    return cast(char[])"".dup;
case 5:
    return cast(char[])""[0..0]; // note lack of .dup
}


Drat, not sure what happened there.  My source had the 'dup' when I went  
back to it.  Sorry.




Re: null Vs [] return arrays

2011-04-07 Thread Kagamin
bearophile Wrote:

 Regan Heath:
 
  conceptually it's nice to be able to express (exists but is empty) and  
  (does not exist).
 
 You may want to express that, but for the implementation of the language 
 those two situations are the same, because in the [] literal the ptr is null. 
 So I think it's better for the programmer to not differentiate the two 
 situations, because they are not different. If the programmer tells them 
 apart, he/she is doing something bad in D, creating a false illusion.

It's bad, when the language is driven by the implementation of a reference 
compiler by the copyright holder. This way compiler bugs and tricks become a 
language standard. See the story of VP7 codec.


Re: null Vs [] return arrays

2011-04-01 Thread Regan Heath
On Mon, 28 Mar 2011 17:54:29 +0100, bearophile bearophileh...@lycps.com  
wrote:

Steven Schveighoffer:


So essentially, you are getting the same thing, but using [] is slower.


It seems I was right then, thank you and Kagamin for the answers.


This may be slightly OT but I just wanted to raise the point that  
conceptually it's nice to be able to express (exists but is empty) and  
(does not exist).  Pointers/references have null as a (does not exist)  
value and this is incredibly useful.  Try doing the same thing with  
'int' .. it requires you either use int* or pass an additional boolean to  
indicate existence.. yuck.


I'd suggest if someone types '[]' they mean (exists but is empty) and if  
they type 'null' they mean (does not exist) and they may be relying on the  
.ptr value to differentiate these cases, which is useful.  If you're not  
interested in the difference, and you need performance, you simply use  
'null'.  Everybody is happy. :)


R


Re: null Vs [] return arrays

2011-04-01 Thread bearophile
Regan Heath:

 conceptually it's nice to be able to express (exists but is empty) and  
 (does not exist).

You may want to express that, but for the implementation of the language those 
two situations are the same, because in the [] literal the ptr is null. So I 
think it's better for the programmer to not differentiate the two situations, 
because they are not different. If the programmer tells them apart, he/she is 
doing something bad in D, creating a false illusion.

Bye,
bearophile


Re: null Vs [] return arrays

2011-04-01 Thread Torarin
2011/4/1 Regan Heath re...@netmail.co.nz:
 On Mon, 28 Mar 2011 17:54:29 +0100, bearophile bearophileh...@lycps.com
 wrote:

 Steven Schveighoffer:

 So essentially, you are getting the same thing, but using [] is slower.

 It seems I was right then, thank you and Kagamin for the answers.

 This may be slightly OT but I just wanted to raise the point that
 conceptually it's nice to be able to express (exists but is empty) and (does
 not exist).  Pointers/references have null as a (does not exist) value and
 this is incredibly useful.  Try doing the same thing with 'int' .. it
 requires you either use int* or pass an additional boolean to indicate
 existence.. yuck.

 I'd suggest if someone types '[]' they mean (exists but is empty) and if
 they type 'null' they mean (does not exist) and they may be relying on the
 .ptr value to differentiate these cases, which is useful.  If you're not
 interested in the difference, and you need performance, you simply use
 'null'.  Everybody is happy. :)

 R


For associative arrays it certainly would be nice to be able to do
something like
string[string] options = [:];
so that functions can manipulate an empty aa without using ref.
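
A rough sketch of the problem described here, assuming current D associative array behaviour: a null AA passed by value is only allocated inside the callee, so the caller never sees the insertion. The function names (setVerbose, makeEmpty) are invented for illustration, and the insert-then-remove trick is just one possible way to get an allocated but empty AA:

```d
// Hypothetical function: inserts into an AA it received by value.
void setVerbose(string[string] opts)
{
    opts["verbose"] = "yes";
}

// Hypothetical workaround: force allocation by inserting and then
// removing a placeholder key, leaving an empty but non-null AA.
string[string] makeEmpty()
{
    string[string] aa = ["placeholder": ""];
    aa.remove("placeholder");
    return aa;
}

void main()
{
    string[string] unallocated;       // null AA
    setVerbose(unallocated);
    assert(unallocated.length == 0);  // the insertion was lost

    auto allocated = makeEmpty();
    assert(allocated.length == 0);
    setVerbose(allocated);
    assert(allocated["verbose"] == "yes"); // visible without ref
}
```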

Torarin


Re: null Vs [] return arrays

2011-04-01 Thread spir

On 04/01/2011 12:38 PM, Regan Heath wrote:

On Mon, 28 Mar 2011 17:54:29 +0100, bearophile bearophileh...@lycps.com wrote:

Steven Schveighoffer:


So essentially, you are getting the same thing, but using [] is slower.


It seems I was right then, thank you and Kagamin for the answers.


This may be slightly OT but I just wanted to raise the point that conceptually
it's nice to be able to express (exists but is empty) and (does not exist).
Pointers/references have null as a (does not exist) value and this is
incredibly useful. Try doing the same thing with 'int' .. it requires you
either use int* or pass an additional boolean to indicate existence.. yuck.

I'd suggest if someone types '[]' they mean (exists but is empty) and if they
type 'null' they mean (does not exist) and they may be relying on the .ptr
value to differentiate these cases, which is useful. If you're not interested
in the difference, and you need performance, you simply use 'null'. Everybody
is happy. :)


That's the way I understand this distinction. Unfortunately, D does not really 
allow this, since it semantically treats both the same (e.g. one can put a new 
element into a null array).


Denis
--
_
vita es estrany
spir.wikidot.com



Re: null Vs [] return arrays

2011-04-01 Thread Steven Schveighoffer
On Fri, 01 Apr 2011 06:38:56 -0400, Regan Heath re...@netmail.co.nz  
wrote:


On Mon, 28 Mar 2011 17:54:29 +0100, bearophile  
bearophileh...@lycps.com wrote:

Steven Schveighoffer:


So essentially, you are getting the same thing, but using [] is slower.


It seems I was right then, thank you and Kagamin for the answers.


This may be slightly OT but I just wanted to raise the point that  
conceptually it's nice to be able to express (exists but is empty) and  
(does not exist).  Pointers/references have null as a (does not exist)  
value and this is incredibly useful.  Try doing the same thing with  
'int' .. it requires you either use int* or pass an additional boolean  
to indicate existence.. yuck.


I'd suggest if someone types '[]' they mean (exists but is empty) and if  
they type 'null' they mean (does not exist) and they may be relying on  
the .ptr value to differentiate these cases, which is useful.  If you're  
not interested in the difference, and you need performance, you simply  
use 'null'.  Everybody is happy. :)


The distinction is useful if you have something to reference (e.g. an  
empty slice that points at the end of a pre-existing non-empty array).   
But [] is a new array, no point in allocating memory just so the pointer  
can be non-null.  Can you come up with a use case to show why you'd want  
such a thing?


Your plan would mean that [] is a memory allocation.  I'd rather not have  
the runtime do the lower performing thing unless there is a good reason.


As an alternative, you could use (cast(T *)null)[1..1] if you really  
needed it (this also would be higher performing, BTW since the runtime  
array literal function would not be called).


-Steve


Re: null Vs [] return arrays

2011-04-01 Thread Regan Heath
On Fri, 01 Apr 2011 13:38:45 +0100, Steven Schveighoffer  
schvei...@yahoo.com wrote:


On Fri, 01 Apr 2011 06:38:56 -0400, Regan Heath re...@netmail.co.nz  
wrote:


On Mon, 28 Mar 2011 17:54:29 +0100, bearophile  
bearophileh...@lycps.com wrote:

Steven Schveighoffer:

So essentially, you are getting the same thing, but using [] is  
slower.


It seems I was right then, thank you and Kagamin for the answers.


This may be slightly OT but I just wanted to raise the point that  
conceptually it's nice to be able to express (exists but is empty) and  
(does not exist).  Pointers/references have null as a (does not exist)  
value and this is incredibly useful.  Try doing the same thing with  
'int' .. it requires you either use int* or pass an additional boolean  
to indicate existence.. yuck.


I'd suggest if someone types '[]' they mean (exists but is empty) and  
if they type 'null' they mean (does not exist) and they may be relying  
on the .ptr value to differentiate these cases, which is useful.  If  
you're not interested in the difference, and you need performance, you  
simply use 'null'.  Everybody is happy. :)


The distinction is useful if you have something to reference (e.g. an  
empty slice that points at the end of a pre-existing non-empty array).   
But [] is a new array, no point in allocating memory just so the pointer  
can be non-null.  Can you come up with a use case to show why you'd want  
such a thing?


Ok.  Recently I wrote (in C) a function proxy interface.  I had to execute  
a set of functions from one thread, and wanted to 'call' them from  
potentially many.  So, I set up the thread, added events, and a queue, etc  
and I wrote a proxy function for 'calling' them from the many threads  
which looks like...


void proxy(int func, ...) {}

So, it accepts a variable list of args, places them in a structure, places
that in the queue, and waits on an event for the proxy thread to execute
the command and return the result.  Let's say the function I am executing
is a database lookup, let's say I have a database field which is a string,
let's say it can be NULL (database definition allows NULLs).  Now, let's say
I want to do these lookups:

1. lookup all objects where the field is NULL
2. lookup all objects where the field is "reganwashere"
3. lookup all objects where the field is "" (empty/non-null)

#1 and #2 are simple enough.  I call proxy like..
  proxy(LOOKUP, NULL);
  proxy(LOOKUP, "reganwashere");

and in the actual lookup function, invoked by proxy, I call:
  pFieldValue = va_arg(pArgs, char*);

and I get NULL, and "reganwashere".

In C, case #3 would also be easy, I would call proxy(LOOKUP, "") and in
the actual lookup function pFieldValue would be "" (not NULL).


But, in D it seems I cannot do this.  In D I would have to pass an  
additional boolean parameter, or add another level of indirection i.e.  
pass a string[]*.  The same problem exists in C if I want to pass an 'int'  
or any primitive type, I have to pass it as int*, use a boolean, or invent  
a 'special' value which means essentially NULL/not-set/ignored.


There are plenty of other use cases, essentially anywhere where you have  
something that can exist in one of 3 states:

  1. NULL       (not set)
  2. ""         (set, to blank)
  3. "anything" (set, to anything)

Like.. parsing input from a web page, where a field can:
  1. not be present on the page      (NULL)
  2. be present, but left blank      ("")
  3. be present, contains "anything" ("anything")

This one came up a lot when I worked with web software, we had to be able
to detect whether the user was trying to set something to a blank string,
and in some cases we wanted that to remove the setting entirely (null and
"" being identical, ok) or actually set it to a blank string (null and ""
being identical, not ok).


Or... saving settings to a file from user input, where the user selects a  
setting from a menu, then enters the value and could:

  1. not select setting A, therefore save no value    (NULL)
  2. select the setting A, enter blank string         ("")
  3. select the setting A, enter the value "anything" ("anything")

Granted (and this was the response 2 years back when this topic came up) I  
can work around the deficiency by using a map/hash/dictionary where I  
insert key/value pairs, then I can ask it if the key exists.  But, this is  
essentially another level of indirection like an int* or string[]* and is  
more heavy weight than I might want/need.


Ultimately, and people may disagree here, I don't have a problem with  
pointers, and this is a really 'nice' feature of using pointers, and it  
seems D's arrays don't share it, which bothers me.


Your plan would mean that [] is a memory allocation.  I'd rather not  
have the runtime do the lower performing thing unless there is a good  
reason.


I'm not too bothered what syntax gets used, provided it was something that  
you don't accidentally use when you do not want it, and wasn't too horrible  
to use as 

Re: null Vs [] return arrays

2011-04-01 Thread Steven Schveighoffer
On Fri, 01 Apr 2011 11:52:47 -0400, Regan Heath re...@netmail.co.nz  
wrote:


On Fri, 01 Apr 2011 13:38:45 +0100, Steven Schveighoffer  
schvei...@yahoo.com wrote:


On Fri, 01 Apr 2011 06:38:56 -0400, Regan Heath re...@netmail.co.nz  
wrote:


On Mon, 28 Mar 2011 17:54:29 +0100, bearophile  
bearophileh...@lycps.com wrote:

Steven Schveighoffer:

So essentially, you are getting the same thing, but using [] is  
slower.


It seems I was right then, thank you and Kagamin for the answers.


This may be slightly OT but I just wanted to raise the point that  
conceptually it's nice to be able to express (exists but is empty) and  
(does not exist).  Pointers/references have null as a (does not exist)  
value and this is incredibly useful.  Try doing the same thing with  
'int' .. it requires you either use int* or pass an additional boolean  
to indicate existence.. yuck.


I'd suggest if someone types '[]' they mean (exists but is empty) and  
if they type 'null' they mean (does not exist) and they may be relying  
on the .ptr value to differentiate these cases, which is useful.  If  
you're not interested in the difference, and you need performance, you  
simply use 'null'.  Everybody is happy. :)


The distinction is useful if you have something to reference (e.g. an  
empty slice that points at the end of a pre-existing non-empty array).   
But [] is a new array, no point in allocating memory just so the  
pointer can be non-null.  Can you come up with a use case to show why  
you'd want such a thing?


Ok.  Recently I wrote (in C) a function proxy interface.  I had to  
execute a set of functions from one thread, and wanted to 'call' them  
from potentially many.  So, I set up the thread, added events, and a  
queue, etc and I wrote a proxy function for 'calling' them from the many  
threads which looks like...


void proxy(int func, ...) {}

So, it accepts a variable list of args, places them in a structure,
places that in the queue, and waits on an event for the proxy thread to
execute the command and return the result.  Let's say the function I am
executing is a database lookup, let's say I have a database field which
is a string, let's say it can be NULL (database definition allows
NULLs).  Now, let's say I want to do these lookups:

1. lookup all objects where the field is NULL
2. lookup all objects where the field is "reganwashere"
3. lookup all objects where the field is "" (empty/non-null)

#1 and #2 are simple enough.  I call proxy like..
   proxy(LOOKUP, NULL);
   proxy(LOOKUP, "reganwashere");

and in the actual lookup function, invoked by proxy, I call:
   pFieldValue = va_arg(pArgs, char*);

and I get NULL, and "reganwashere".

In C, case #3 would also be easy, I would call proxy(LOOKUP, "") and in
the actual lookup function pFieldValue would be "" (not NULL).


But, in D it seems I cannot do this.  In D I would have to pass an  
additional boolean parameter, or add another level of indirection i.e.  
pass a string[]*.  The same problem exists in C if I want to pass an  
'int' or any primitive type, I have to pass it as int*, use a boolean,  
or invent a 'special' value which means essentially NULL/not-set/ignored.


assert("" !is null); // works on D.  Try it.
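
As a sketch of how the three states from the database example could be told apart with that identity check: the function name describe and the returned labels are made up here, and the approach only holds as long as the empty string really carries a non-null pointer (a literal does; an empty string that comes out of other operations may not).

```d
// Hypothetical classifier for the three states discussed above.
string describe(string field)
{
    if (field is null)
        return "not set";            // null: no value at all
    if (field.length == 0)
        return "set, to blank";      // non-null pointer, zero length
    return "set, to " ~ field;       // ordinary value
}

void main()
{
    assert(describe(null) == "not set");
    assert(describe("") == "set, to blank");
    assert(describe("reganwashere") == "set, to reganwashere");
}
```

Because == treats null and "" as equal, the check has to use is, and any code in between that rebuilds the string can quietly collapse the blank case into the null one.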

Your plan would mean that [] is a memory allocation.  I'd rather not  
have the runtime do the lower performing thing unless there is a good  
reason.


I'm not too bothered what syntax gets used, provided it was something  
that you don't accidentally use when you do not want it, and wasn't too  
horrible to use as I don't see this as being a very uncommon occurrence  
(which would warrant/allow ugliness of syntax).  [] seems logical, as  
does new T[], both are not null so the programmer was obviously  
trying to do something other than pass null.


It's one thing to want an array with a non-null pointer, but it's another  
thing entirely to want an array with a non-null pointer which points to a  
valid heap address.


In my opinion, [] means empty array.  I don't care what the pointer is, as  
long as the array is empty.  The implementation can put whatever value it  
wants for the pointer.  If it wants to put null, that is fine.  null means  
I want a null pointer.


If I had it my way, all array literals would be immutable, and the  
pointers would point to ROM (even empty ones).  We should not be  
constructing array literals at runtime.  But my opinion is still that you  
should not count on the pointer being anything because it's not specified  
what it is.


As an alternative, you could use (cast(T *)null)[1..1] if you really  
needed it (this also would be higher performing, BTW since the runtime  
array literal function would not be called).


That seems to work, but it's hideous syntax for something that is not  
that uncommon IMO.


My opinion is that it is uncommon, but it can be abstracted:

template emptyArray(T)
{
   enum emptyArray = (cast(T*)0)[1..1];
}

rename as desired.

To remind myself what D does, and try and find 

Re: null Vs [] return arrays

2011-03-28 Thread Kagamin
bearophile Wrote:

 Kagamin:
 
  [] is not null, it's an array of 0 elements, what is done exactly.
  edx points to the allocated array.
 
 I don't understand what you say. I think the caller of foo() and bar() 
 receive the same thing, two empty registers. I think that cast(int[])null and 
 cast(int[])[] are the same thing for D.

That's a mistake.

Well, if there's no difference for you, you can use either of them. What's the 
problem?


Re: null Vs [] return arrays

2011-03-28 Thread Steven Schveighoffer
On Sun, 27 Mar 2011 09:37:47 -0400, bearophile bearophileh...@lycos.com  
wrote:



I have compiled this little D2 program:


int[] foo() {
    return [];
}
int[] bar() {
    return null;
}
void main() {}



Using DMD 2.052,  dmd -O -release -inline test2.d

This is the asm of the two functions:

_D5test23fooFZAi  comdat
L0:     push  EAX
        mov   EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
        push  0
        push  EAX
        call  near ptr __d_arrayliteralT
        mov   EDX,EAX
        add   ESP,8
        pop   ECX
        xor   EAX,EAX
        ret

_D5test23barFZAi  comdat
        xor   EAX,EAX
        xor   EDX,EDX
        ret

Is this expected and desired? Isn't it better to compile the foo() as  
bar()?


Probably.  The runtime that allocates an array looks like this (irrelevant  
parts collapsed):



extern (C) void* _d_arrayliteralT(TypeInfo ti, size_t length, ...)
{
    auto sizeelem = ti.next.tsize();    // array element size
    void* result;

    ...
    if (length == 0 || sizeelem == 0)
        result = null;
    else
    {
        ...
    }
    return result;
}

So essentially, you are getting the same thing, but using [] is slower.
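
A small sketch for checking this from the D side rather than from the disassembly. It assumes nothing beyond the two functions from the original post; the pointer of [] is printed rather than asserted, because what the implementation puts there is not something to rely on:

```d
import std.stdio : writeln;

int[] foo() { return []; }
int[] bar() { return null; }

void main()
{
    // Both results are empty and compare equal by contents.
    assert(foo().length == 0);
    assert(bar().length == 0);
    assert(foo() == bar());

    // null is guaranteed to have a null pointer.
    assert(bar().ptr is null);

    // What [] carries is up to the implementation; print it to see.
    writeln("foo().ptr is null: ", foo().ptr is null);
}
```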

-Steve


Re: null Vs [] return arrays

2011-03-28 Thread bearophile
Steven Schveighoffer:

 So essentially, you are getting the same thing, but using [] is slower.

It seems I was right then, thank you and Kagamin for the answers.

Bye,
bearophile


null Vs [] return arrays

2011-03-27 Thread bearophile
I have compiled this little D2 program:


int[] foo() {
    return [];
}
int[] bar() {
    return null;
}
void main() {}



Using DMD 2.052,  dmd -O -release -inline test2.d

This is the asm of the two functions:

_D5test23fooFZAi  comdat
L0:     push  EAX
        mov   EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
        push  0
        push  EAX
        call  near ptr __d_arrayliteralT
        mov   EDX,EAX
        add   ESP,8
        pop   ECX
        xor   EAX,EAX
        ret

_D5test23barFZAi  comdat
        xor   EAX,EAX
        xor   EDX,EDX
        ret

Is this expected and desired? Isn't it better to compile the foo() as bar()?

Bye,
bearophile


Re: null Vs [] return arrays

2011-03-27 Thread Kagamin
bearophile Wrote:

 I have compiled this little D2 program:
 
 
 int[] foo() {
     return [];
 }
 int[] bar() {
     return null;
 }
 void main() {}
 
 
 
 Using DMD 2.052,  dmd -O -release -inline test2.d
 
 This is the asm of the two functions:
 
 _D5test23fooFZAi  comdat
 L0:     push  EAX
         mov   EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push  0
         push  EAX
         call  near ptr __d_arrayliteralT
         mov   EDX,EAX
         add   ESP,8
         pop   ECX
         xor   EAX,EAX
         ret
 
 _D5test23barFZAi  comdat
         xor   EAX,EAX
         xor   EDX,EDX
         ret
 
 Is this expected and desired? Isn't it better to compile the foo() as bar()?
 
 Bye,
 bearophile

[] is not null, it's an array of 0 elements, what is done exactly.
edx points to the allocated array.


Re: null Vs [] return arrays

2011-03-27 Thread bearophile
Kagamin:

 [] is not null, it's an array of 0 elements, what is done exactly.
 edx points to the allocated array.

I don't understand what you say. I think the caller of foo() and bar() receive 
the same thing, two empty registers. I think that cast(int[])null and 
cast(int[])[] are the same thing for D.

void main() {
    assert(cast(int[])null == cast(int[])null);
    auto a1 = cast(int[])null;
    a1 ~= 1;
    auto a2 = 1 ~ cast(int[])null;
}

Bye,
bearophile


Re: null Vs [] return arrays

2011-03-27 Thread Jonathan M Davis
On 2011-03-27 11:42, bearophile wrote:
 Kagamin:
  [] is not null, it's an array of 0 elements, what is done exactly.
  edx points to the allocated array.
 
 I don't understand what you say. I think the caller of foo() and bar()
 receive the same thing, two empty registers. I think that cast(int[])null
 and cast(int[])[] are the same thing for D.
 
 void main() {
     assert(cast(int[])null == cast(int[])null);
     auto a1 = cast(int[])null;
     a1 ~= 1;
     auto a2 = 1 ~ cast(int[])null;
 }

What I would _expect_ the difference between a null array and an empty one to 
be would be that the null one's ptr property would be null, whereas the empty 
one wouldn't be. But dmd treats them pretty much the same. empty returns true 
for both. You can append to both. The null one would be a guaranteed memory 
reallocation when you append to it whereas the empty one may not be, but their 
behavior is almost identical.
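
A short sketch of that near-identical behaviour, assuming a present-day Phobos where empty can be imported from std.range. The helper appendOne is invented for illustration; here the non-null empty array is made by slicing an existing one, so the only visible difference is the ptr property:

```d
import std.range : empty;

// Hypothetical helper: appends one element to whatever it is given.
int[] appendOne(int[] arr)
{
    arr ~= 1;
    return arr;
}

void main()
{
    int[] backing = [10, 20, 30];
    int[] nullArr = null;
    int[] emptyArr = backing[0 .. 0];

    // empty is true for both.
    assert(nullArr.empty && emptyArr.empty);

    // Only the pointer tells them apart.
    assert(nullArr.ptr is null);
    assert(emptyArr.ptr !is null);

    // Appending works on both and gives the same contents.
    assert(appendOne(nullArr) == [1]);
    assert(appendOne(emptyArr) == [1]);

    // The array that was sliced is not overwritten by the append.
    assert(backing == [10, 20, 30]);
}
```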

How that affects the generated assembly code, I don't know. Particularly if 
you're compiling with -inline and -O, the compiler can likely make 
assumptions about null that it can't make about [], since it probably treats 
[] more generally without worrying about the fact that it happens to be empty 
as far as optimizations go - that and there _is_ a semantic difference between 
null and [] if you're messing with the ptr property, so Walter may think that 
it's best for null to not be turned into the same thing as [] automatically.

- Jonathan M Davis


Re: null Vs [] return arrays

2011-03-27 Thread bearophile
Jonathan M Davis:

 the compiler can likely make 
 assumptions about null that it can't make about [], since it probably treats 
 [] more generally without worrying about the fact that it happens to be empty 
 as far as optimizations go - that and there _is_ a semantic difference 
 between 
 null and [] if you're messing with the ptr property, so Walter may think that 
 it's best for null to not be turned into the same thing as [] automatically.

Thank you for your answer. I have added a low-priority enhancement request.

Bye,
bearophile