zero-terminated strings, string literals, etc

2012-10-02 Thread Regan Heath
Recent discussions of the zero-terminated string problems and the
inconsistency of string literals have me, again, wondering why D doesn't
have a 'type' to represent C's zero-terminated strings.  It seems to me
that having such a type, and typing C functions with it, would solve a
lot of problems.


The compiler could/would then error if people attempted to pass a D string  
without converting it correctly.


The compiler would create literals with or without \0 as required by the  
'type' being assigned, parameter passed, etc.


The conversion function from a D string to a C string would return the new  
type.
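
For illustration, here is a minimal sketch of the situation as it stands
without such a type. It uses the existing Phobos function
std.string.toStringz, which returns a plain immutable(char)*. The helper
name cLength is hypothetical and exists only for this example.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Length as a C function sees it: bytes up to the first '\0'.
/// (cLength is a hypothetical helper, used only for illustration.)
size_t cLength(string s)
{
    // toStringz returns a pointer to zero-terminated data,
    // copying the string if no terminator follows it already.
    return strlen(s.toStringz);
}

void main()
{
    string whole = "hello world";
    string part = whole[0 .. 5];   // "hello", with no '\0' after the 'o'

    // Passing part.ptr straight to strlen also compiles, but C would
    // read past the end of the slice. The conversion avoids that.
    assert(cLength(part) == 5);
}
```

Nothing in the type system distinguishes part.ptr from the result of
toStringz here, which is the gap a dedicated type would close.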


A %sz format specifier could be added to writef which would be able to  
type check the argument.


Since the only way to get a variable of the new type would be from a
literal, a conversion or a C function call, we could be sure it was in
fact \0 terminated(*), and so..


An implicit conversion between a C string and a D string (slice using  
strlen) would be possible, and safe.  (Though, not at zero runtime cost)
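
The "slice using strlen" conversion can be sketched as below. This is
only an illustration written as an explicit function: fromC is a
hypothetical name, and the implicit version described above would need
language or library support that is not shown here.

```d
import core.stdc.string : strlen;

/// View a zero-terminated C string as a D slice without copying.
/// (fromC is a hypothetical name, used only for illustration.)
const(char)[] fromC(const(char)* p)
{
    // strlen walks the string once to find the terminator: this is
    // the runtime cost, linear in the length of the string.
    return p is null ? null : p[0 .. strlen(p)];
}

void main()
{
    // String literals are zero-terminated and convert to const(char)*.
    const(char)* c = "hello";

    const(char)[] d = fromC(c);
    assert(d == "hello");
    assert(d.ptr is c);   // same memory; only the length was computed
}
```

The slice shares memory with the C string, so it is only valid for as
long as the C side keeps that memory alive.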


Existing (correct) code would continue to compile, by this I mean:
 - passing literals
 - calling a conversion function for each D string argument

But code which passes D string variables to C functions without conversion  
would start to fail to compile, so the change will 'break' existing code.


There would be several solutions in these cases:

1) add a call to a conversion function.  Introducing a conversion cost  
which was not previously present.


2) re-type the variable as a C string.  If it's not used for anything else
then this is more correct.  If it's passed to other code then, because a
C string will implicitly convert (with a slice/strlen) to a D string, this
substitution will work in most cases.  However, the act of conversion will
incur a cost (but it can/should be one-off if the result is assigned/kept).


I am probably missing something obvious, or I have forgotten one of the  
array/slice complexities which makes this a nightmare.


Thoughts?

Regan

(*) Ignoring buggy/broken C functions returning non-zero terminated  
strings.. as we will crash on these no matter what in any case.




Re: zero-terminated strings, string literals, etc

2012-10-02 Thread David

On 02.10.2012 16:55, Regan Heath wrote:

> Recent discussions of the zero-terminated string problems and the
> inconsistency of string literals have me, again, wondering why D doesn't
> have a 'type' to represent C's zero-terminated strings.  It seems to me
> that having such a type, and typing C functions with it, would solve a
> lot of problems.


You basically already have a type that is only used for 0-terminated
strings: char*. In D you normally use string, and if you want to
represent binary data you use ubyte[]. I've never used char* except for
interfacing with C. I would prefer a library solution, some kind of
struct which is implicitly convertible to char* (0-terminated) and also
to string (not 0-terminated), though I'm not sure how to implement that.
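
One possible shape for such a struct is sketched below, purely as an
illustration. ZString and its members are hypothetical names. The
compiler accepts only one alias this per type, so this sketch makes the
pointer conversion implicit and exposes the D view through an explicit
property instead.

```d
import core.stdc.string : strlen;

/// Hypothetical wrapper that owns a string ending in '\0'.
struct ZString
{
    private string data;   // ends with '\0' once constructed

    this(string s)
    {
        data = s ~ '\0';   // copies s and appends the terminator
    }

    /// Pointer view for C functions (null if default-initialised).
    @property immutable(char)* ptr() const
    {
        return data.ptr;
    }

    /// D view, with the terminator sliced off.
    @property string str() const
    {
        return data.length ? data[0 .. $ - 1] : null;
    }

    // Only one alias this is accepted per type, so only the pointer
    // conversion is implicit; the D view goes through .str.
    alias ptr this;
}

void main()
{
    auto z = ZString("hello");
    assert(strlen(z) == 5);     // implicit conversion to a char pointer
    assert(z.str == "hello");   // explicit D view, no terminator
}
```

The cost of this design is one copy at construction; after that both
views are available without further allocation.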