
Re: Idea: Introduce zero-terminated string specifier

2012-10-04 Thread Regan Heath

On Thu, 04 Oct 2012 01:05:14 +0100, Steven Schveighoffer  
 wrote:


On Wed, 03 Oct 2012 08:37:14 -0400, Regan Heath   
wrote:


On Tue, 02 Oct 2012 21:44:11 +0100, Steven Schveighoffer  
 wrote:
In fact, a better solution would be to define a C string type (other  
than char *), and just pretend those system calls return that.  Then  
support that C string type in writef.


-Steve


:D
http://comments.gmane.org/gmane.comp.lang.d.general/97793



Almost what I was thinking.

:)

Though, at that point, I don't think we need a special specifier for  
writef.  %s works.


True.

However, looking at the vast reach of these changes, I wonder if it's  
worth it.  That's a lot of prototypes to C functions that have to  
change, and a large compiler change (treating string literals as CString  
instead of char *), just so C strings print out with writef.


That's not the only motivation.  The change brings more type safety in  
general and should help to catch bugs, like for example the common one  
made by people just starting out with D (from a C/C++ background).



Not to mention code that will certainly break...


Some code will definitely stop compiling, but it's debatable as to whether  
this code is not already "broken" to some degree.. it's likely not as  
safe/robust as it could be.


R


Re: Idea: Introduce zero-terminated string specifier

2012-10-04 Thread Paulo Pinto


On Tuesday, 2 October 2012 at 13:07:46 UTC, deadalnix wrote:

On 01/10/2012 22:33, Vladimir Panteleev wrote:

On Monday, 1 October 2012 at 12:12:52 UTC, deadalnix wrote:

On 01/10/2012 13:29, Vladimir Panteleev wrote:

On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:

How does to!string know that the string is 0 terminated ?


By convention (it doesn't).


It is unsafe as hell oO


Forcing the programmer to put strlen calls everywhere in his code is not
any safer.


I make the library safer. If the programmer manipulates unsafe 
constructs (like C strings), it is up to the programmer to ensure 
safety, not the lib.


Trusting the programmer is what brought upon us the wrath of 
security exploits via buffer overflows.


--
Paulo

Re: Idea: Introduce zero-terminated string specifier

2012-10-03 Thread Steven Schveighoffer

On Wed, 03 Oct 2012 08:37:14 -0400, Regan Heath   
wrote:


On Tue, 02 Oct 2012 21:44:11 +0100, Steven Schveighoffer  
 wrote:
In fact, a better solution would be to define a C string type (other  
than char *), and just pretend those system calls return that.  Then  
support that C string type in writef.


-Steve


:D
http://comments.gmane.org/gmane.comp.lang.d.general/97793



Almost what I was thinking.

:)

Though, at that point, I don't think we need a special specifier for  
writef.  %s works.


However, looking at the vast reach of these changes, I wonder if it's  
worth it.  That's a lot of prototypes to C functions that have to change,  
and a large compiler change (treating string literals as CString instead  
of char *), just so C strings print out with writef.  Not to mention code  
that will certainly break...


-Steve

Re: Idea: Introduce zero-terminated string specifier

2012-10-03 Thread Regan Heath

On Tue, 02 Oct 2012 21:44:11 +0100, Steven Schveighoffer  
 wrote:
In fact, a better solution would be to define a C string type (other  
than char *), and just pretend those system calls return that.  Then  
support that C string type in writef.


-Steve


:D
http://comments.gmane.org/gmane.comp.lang.d.general/97793


Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Jonathan M Davis

On Wednesday, October 03, 2012 07:35:23 Jakob Ovrum wrote:
> > I suppose we could just use @trusted
> > and call it a day.
> 
> No, that would be abusing @trusted. The function would no longer
> be safe, *because it contains possibly unsafe code*. @trusted is
> for safe functions that the compiler cannot prove safe.

Yeah. You basically _never_ just mark @trusted and call it a day. You only 
mark something @trusted if you've verified that _everything_ that that function 
does which is @system is done in a way that's ultimately @safe. In particular, 
marking much of anything which is templated as @trusted is almost always just 
plain wrong.
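
As a rough illustration of that rule (a sketch only; untilZero is a hypothetical helper, not something from Phobos or from this thread), a function can reasonably be marked @trusted when its @system parts are bounded by its own arguments for every possible input:

```d
import core.stdc.string : memchr;

// Hypothetical helper: returns the part of buf before the first '\0',
// or all of buf if it contains no terminator.
// @trusted is defensible here because memchr is told to read at most
// buf.length bytes, so no caller can make it run off the end.
const(char)[] untilZero(const(char)[] buf) @trusted
{
    if (buf.length == 0)
        return buf;
    auto hit = cast(const(char)*) memchr(buf.ptr, 0, buf.length);
    return hit is null ? buf : buf[0 .. hit - buf.ptr];
}

void main() @safe
{
    assert(untilZero("abc\0def") == "abc");
    assert(untilZero("abc") == "abc");
    assert(untilZero("") == "");
}
```

By contrast, a function that takes a bare char* and calls strlen on it cannot honestly be @trusted, because whether it is memory-safe depends entirely on what the caller passes in.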

- Jonathan M Davis

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Jakob Ovrum


On Wednesday, 3 October 2012 at 05:04:01 UTC, H. S. Teoh wrote:
Yes that's what I mean. If the format string is known at compile-time
and known to involve only @safe code, then this would work. Something
like this might work if CTFE is used to parse the format string
piecemeal (i.e., translate something like writefln("%d %s",x,y) into
write!int(x); write!string(" "); write!string(y)). The safe instances of
write!T(...) will be marked @safe.


It doesn't matter if the argument is known at compile-time or 
not, because there's no way to know that without receiving the 
format string as a template parameter, in which case it must 
*always* be known at compile-time (runtime format string would 
not be supported), and then the syntax is no longer writefln("%d 
%s", x, y). Obviously, such a change is not acceptable.



I suppose we could just use @trusted
and call it a day.


No, that would be abusing @trusted. The function would no longer 
be safe, *because it contains possibly unsafe code*. @trusted is 
for safe functions that the compiler cannot prove safe.

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread H. S. Teoh

On Tue, Oct 02, 2012 at 07:50:09PM -0700, Jonathan M Davis wrote:
> On Tuesday, October 02, 2012 18:21:30 H. S. Teoh wrote:
> > On Wed, Oct 03, 2012 at 03:07:14AM +0200, Andrej Mitrovic wrote:
> > > On 10/3/12, Jakob Ovrum  wrote:
> > > > writefln cannot be @safe if it has to support an unsafe format
> > > > specifier. It's "hidden" because it affects every call to
> > > > writefln, even if it doesn't use the unsafe format specifier.
> > 
> > [...]
> > 
> > Hmm, this seems to impose unnecessary limitations on @safe. I guess
> > the current language doesn't allow for a "conditionally-safe" tag
> > where something can be implicitly marked @safe if it's provable at
> > compile-time that it's safe?
> 
> The format string is a runtime argument, so nothing can be proven
> about it at compile time.
> 
> If you want any kind of @safe inference, you need to use a template.
> If writefln took the format string as a template argument and
> generated different code (which was @safe or not depending on what it
> did) based on what was in the format string, then inference could
> take place, but otherwise no.
[...]

Yes that's what I mean. If the format string is known at compile-time
and known to involve only @safe code, then this would work. Something
like this might work if CTFE is used to parse the format string
piecemeal (i.e., translate something like writefln("%d %s",x,y) into
write!int(x); write!string(" "); write!string(y)). The safe instances of
write!T(...) will be marked @safe. But it does seem like a lot of work
just so we can use @safe, though. I suppose we could just use @trusted
and call it a day.


T

-- 
Claiming that your operating system is the best in the world because
more people use it is like saying McDonalds makes the best food in the
world. -- Carl B. Constantine

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Jonathan M Davis

On Tuesday, October 02, 2012 18:21:30 H. S. Teoh wrote:
> On Wed, Oct 03, 2012 at 03:07:14AM +0200, Andrej Mitrovic wrote:
> > On 10/3/12, Jakob Ovrum  wrote:
> > > writefln cannot be @safe if it has to support an unsafe format
> > > specifier. It's "hidden" because it affects every call to writefln,
> > > even if it doesn't use the unsafe format specifier.
> 
> [...]
> 
> Hmm, this seems to impose unnecessary limitations on @safe. I guess the
> current language doesn't allow for a "conditionally-safe" tag where
> something can be implicitly marked @safe if it's provable at
> compile-time that it's safe?

The format string is a runtime argument, so nothing can be proven about it at 
compile time.

If you want any kind of @safe inference, you need to use a template. If 
writefln took the format string as a template argument and generated different 
code (which was @safe or not depending on what it did) based on what was in 
the format string, then inference could take place, but otherwise no.
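
The per-instantiation inference described above can be seen in a toy template. This is only a sketch: len is a hypothetical function made up for the illustration, and it is not how writefln is implemented. Each instantiation gets its own inferred attributes, depending on which branch is compiled in:

```d
// Hypothetical helper, used only to show per-instantiation attribute inference.
size_t len(T)(T value)
{
    static if (is(T : const(char)*))
    {
        // Pointer indexing is not allowed in @safe code,
        // so this instantiation is inferred @system.
        size_t n = 0;
        while (value[n] != 0)
            ++n;
        return n;
    }
    else
    {
        // Only the array's own length is used here: inferred @safe.
        return value.length;
    }
}

void main() @safe
{
    // Compiles: len!string was inferred @safe.
    assert(len("abc") == 3);

    // The pointer instantiation is callable from @system code...
    static assert(__traits(compiles, () @system {
        const(char)* p;
        return len(p);
    }));

    // ...but not from @safe code.
    static assert(!__traits(compiles, () @safe {
        const(char)* p;
        return len(p);
    }));
}
```

A runtime format string gives the compiler nothing comparable to branch on, which is why a single non-template writefln body has to carry the least safe attribute of anything it might do.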

- Jonathan M Davis

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread H. S. Teoh

On Wed, Oct 03, 2012 at 03:07:14AM +0200, Andrej Mitrovic wrote:
> On 10/3/12, Jakob Ovrum  wrote:
> > writefln cannot be @safe if it has to support an unsafe format
> > specifier. It's "hidden" because it affects every call to writefln,
> > even if it doesn't use the unsafe format specifier.
[...]

Hmm, this seems to impose unnecessary limitations on @safe. I guess the
current language doesn't allow for a "conditionally-safe" tag where
something can be implicitly marked @safe if it's provable at
compile-time that it's safe?

T

-- 
Elegant or ugly code as well as fine or rude sentences have something in 
common: they don't depend on the language. -- Luca De Vitis

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Andrej Mitrovic

On 10/3/12, Jakob Ovrum  wrote:
> writefln cannot be @safe if it has to support an unsafe format
> specifier. It's "hidden" because it affects every call to
> writefln, even if it doesn't use the unsafe format specifier.

Ah damn I completely forgot about @safe. I tend to avoid recent features..

OK then I think my arguments are moot. Nevertheless I can always
define a helper function for my own purposes I guess.

Sorry Walter for not taking @safe into account. :)

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Jakob Ovrum


On Tuesday, 2 October 2012 at 21:30:35 UTC, Andrej Mitrovic wrote:

On 10/2/12, Walter Bright  wrote:

On 9/30/2012 11:31 AM, deadalnix wrote:
If you know that a string is 0 terminated, you can easily create a slice
from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

Since %zs is inherently unsafe, it
hides such unsafety in a commonly used library function, which will
infect everything else that transitively calls writefln with unsafety.


This makes %zs an unacceptable feature.


How does it hide anything if you have to explicitly mark the format
specifier as %zs? It would be documented, just like it's documented
that passing pointers to garbage-collected memory to the C side is
inherently unsafe.


writefln cannot be @safe if it has to support an unsafe format 
specifier. It's "hidden" because it affects every call to 
writefln, even if it doesn't use the unsafe format specifier.

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Andrej Mitrovic

On 10/2/12, Walter Bright  wrote:
> On 9/30/2012 11:31 AM, deadalnix wrote:
>> If you know that a string is 0 terminated, you can easily create a slice
>> from it as follows:
>>
>> char* myZeroTerminatedString;
>> char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
> Since %zs is inherently unsafe, it
> hides such unsafety in a commonly used library function, which will
> infect everything else that transitively calls writefln with unsafety.
>
> This makes %zs an unacceptable feature.

How does it hide anything if you have to explicitly mark the format
specifier as %zs? It would be documented, just like it's documented
that passing pointers to garbage-collected memory to the C side is
inherently unsafe.

> deadalnix's example shows that adding a new format specifier %zs adds
> little value.

It adds convenience, which is an important trait in this day and age.
If that's not a concern, why is printf a symbol you can get your hands
on as soon as you import std.stdio? And if safety is a concern why is
printf used in Phobos at all? I count 427 lines of printf calls in
Phobos and 843 lines in Druntime (druntime might have a good excuse
since it shouldn't import Phobos functions). Many of these calls in
Phobos are not simple D string literal printf calls either.

Btw, some weeks ago when dstep was announced you were jumping for joy
and were instantly proposing language changes to add better support
for wrapping C. But asking for better library support is somehow
controversial. I don't understand the double-standard.

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Steven Schveighoffer

On Tue, 02 Oct 2012 15:35:47 -0400, David Nadlinger   
wrote:



On Tuesday, 2 October 2012 at 19:34:31 UTC, David Nadlinger wrote:

Well, make it to!char(char*) then! ;)


Oh dear, this doesn't get better: Of course, I've meant to write  
»to!(char[])(char*)«.


Right.  I agree, this should not allocate (I think someone said it does,  
but it's probably not necessary to).


But still, what looks better?

auto x = SomeSystemCallThatReturnsACString();

writefln("%s", to!(char[])(x));
writefln("%s", zstr(x));

I want something easy to type, and not too difficult to visually parse.

In fact, a better solution would be to define a C string type (other than  
char *), and just pretend those system calls return that.  Then support  
that C string type in writef.


-Steve

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread David Nadlinger


On Tuesday, 2 October 2012 at 19:34:31 UTC, David Nadlinger wrote:

Well, make it to!char(char*) then! ;)


Oh dear, this doesn't get better: Of course, I've meant to write 
»to!(char[])(char*)«.


David

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread David Nadlinger

On Tuesday, 2 October 2012 at 19:31:33 UTC, Steven Schveighoffer 
wrote:
On Tue, 02 Oct 2012 15:17:42 -0400, David Nadlinger 
 wrote:


On Tuesday, 2 October 2012 at 02:22:33 UTC, Steven 
Schveighoffer wrote:

@system char[] zstr(char *s) { return s[0..strlen(s)]; }

[…]

Does it make sense for Phobos to provide such a shortcut in 
an obscure header somewhere?  Like std.cstring?  Or should we 
just say "roll your own if you need it"?


I didn't look it up, so I could be making quite a fool of 
myself right now, but doesn't to!string(char*) provide exactly 
that?


string is immutable.  Must allocate.

You fool :)


Well, make it to!char(char*) then! ;)

David

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Steven Schveighoffer

On Tue, 02 Oct 2012 15:17:42 -0400, David Nadlinger   
wrote:



On Tuesday, 2 October 2012 at 02:22:33 UTC, Steven Schveighoffer wrote:

@system char[] zstr(char *s) { return s[0..strlen(s)]; }

[…]

Does it make sense for Phobos to provide such a shortcut in an obscure  
header somewhere?  Like std.cstring?  Or should we just say "roll your  
own if you need it"?


I didn't look it up, so I could be making quite a fool of myself right  
now, but doesn't to!string(char*) provide exactly that?


string is immutable.  Must allocate.

You fool :)  just kidding, honest mistake.

-Steve

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread David Nadlinger

On Tuesday, 2 October 2012 at 02:22:33 UTC, Steven Schveighoffer 
wrote:

@system char[] zstr(char *s) { return s[0..strlen(s)]; }

[…]

Does it make sense for Phobos to provide such a shortcut in an 
obscure header somewhere?  Like std.cstring?  Or should we just 
say "roll your own if you need it"?


I didn't look it up, so I could be making quite a fool of myself 
right now, but doesn't to!string(char*) provide exactly that?


David

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Steven Schveighoffer

On Tue, 02 Oct 2012 04:09:43 -0400, Walter Bright  
 wrote:



On 10/1/2012 7:22 PM, Steven Schveighoffer wrote:

Does it make sense for Phobos to provide such a shortcut in an obscure
header somewhere? Like std.cstring? Or should we just say "roll your own
if you need it"?


As a matter of principle, I really don't like gobs of Phobos functions  
that are literally one liners. Phobos should not become a mile wide but  
inch deep library of trivia. It should consist of non-trivial, useful,  
and relatively deep functions.


This, arguably, is one of the most important aspects of C to support.   
There are lots of C functions which provide C strings.  Yes, we don't want  
to promote using C strings, but to have one point of conversion so you  
*can* use safe strings is a good thing.  In other words, the sooner you  
convert your zero-terminated strings to char slices, the better off you  
are.  And if we label it system code, it can't be misused in @safe code.


Why support zero-terminated strings as literals if it wasn't important?   
You could argue that things like system calls which return zero-terminated  
strings are as safe to use as string literals which you know have zero  
terminated values.


The only other alternative is to wrap those C functions with D ones that  
convert to char[].  I don't find this any more appealing.
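
For comparison, the wrapping alternative might look roughly like this. It is a sketch under assumptions: env is a hypothetical wrapper name, and getenv is used only because it is a familiar C function that returns a zero-terminated string.

```d
import core.stdc.stdlib : getenv;
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper: converts the C string once, at the boundary.
// Returns null if the variable is not set. The slice aliases memory owned
// by the C runtime, so callers should copy it (.idup) if they keep it.
const(char)[] env(string name) @system
{
    const(char)* p = getenv(name.toStringz);
    return p is null ? null : p[0 .. strlen(p)];
}

void main()
{
    import std.stdio : writefln;

    auto path = env("PATH");
    if (path !is null)
        writefln("%s", path);

    assert(env("ZSTR_EXAMPLE_SURELY_UNSET_VARIABLE") is null);
}
```

The cost of this approach is one such wrapper per C function, which is the overhead the paragraph above objects to.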


-Steve

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread deadalnix


On 02/10/2012 03:13, Walter Bright wrote:

On 9/30/2012 11:31 AM, deadalnix wrote:

If you know that a string is 0 terminated, you can easily create a slice
from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.



Of course, using strlen() is always going to be unsafe. But having %zs
is equally unsafe for the same reason.

deadalnix's example shows that adding a new format specifier %zs adds
little value, but it gets much worse. Since %zs is inherently unsafe, it
hides such unsafety in a commonly used library function, which will
infect everything else that transitively calls writefln with unsafety.

This makes %zs an unacceptable feature.


Exactly my point.

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread deadalnix


On 01/10/2012 22:33, Vladimir Panteleev wrote:

On Monday, 1 October 2012 at 12:12:52 UTC, deadalnix wrote:

On 01/10/2012 13:29, Vladimir Panteleev wrote:

On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:

How does to!string know that the string is 0 terminated ?


By convention (it doesn't).


It is unsafe as hell oO


Forcing the programmer to put strlen calls everywhere in his code is not
any safer.


I make the library safer. If the programmer manipulates unsafe constructs 
(like C strings), it is up to the programmer to ensure safety, not the lib.

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Andrei Alexandrescu


On 10/2/12 4:09 AM, Walter Bright wrote:

On 10/1/2012 7:22 PM, Steven Schveighoffer wrote:

Does it make sense for Phobos to provide such a shortcut in an obscure
header somewhere? Like std.cstring? Or should we just say "roll your own
if you need it"?


As a matter of principle, I really don't like gobs of Phobos functions
that are literally one liners. Phobos should not become a mile wide but
inch deep library of trivia. It should consist of non-trivial, useful,
and relatively deep functions.


Well there are some possible reasons. Clearly useful functionality 
that's nontrivial deserves being abstracted in a function. On the other 
hand, even a short function is valuable if frequent enough and deserving 
of a name. We have e.g. s.strip even though it's equivalent to 
s.stripLeft.stripRight.


Andrei

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Piotr Szturmaj


Andrej Mitrovic wrote:

On 10/1/12, Piotr Szturmaj  wrote:

For example C binding writers may change:

extern(C) char* getstr();

to

extern(C) cstring getstr();


I don't think you can reliably do that because of semantics w.r.t.
passing parameters on the stack vs in registers based on whether a
type is a pointer or not. I've had this sort of bug when wrapping C++
where the C++ compiler was passing a parameter in one way but the D
compiler expected the parameters to be passed, simply because I tried
to be clever and fake a return type. See:
http://forum.dlang.org/thread/mailman.1547.1346632732.31962.d@puremagic.com#post-mailman.1557.1346690320.31962.d.gnu:40puremagic.com


I think that align(1) structs that wrap a single value should be treated 
as its type. After all they have the same size and representation. I 
don't know how this works now, though.

Re: Idea: Introduce zero-terminated string specifier

2012-10-02 Thread Walter Bright


On 10/1/2012 7:22 PM, Steven Schveighoffer wrote:

However, we can't require an import to use a bizarre
specifier, and you can't link un@safe code to a specifier, so the zstr
concept is far superior in requiring the user to know what he is doing,
and having the compiler enforce that.


Yup.



Does it make sense for Phobos to provide such a shortcut in an obscure
header somewhere? Like std.cstring? Or should we just say "roll your own
if you need it"?


As a matter of principle, I really don't like gobs of Phobos functions 
that are literally one liners. Phobos should not become a mile wide but 
inch deep library of trivia. It should consist of non-trivial, useful, 
and relatively deep functions.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Steven Schveighoffer

On Mon, 01 Oct 2012 21:13:47 -0400, Walter Bright  
 wrote:



On 9/30/2012 11:31 AM, deadalnix wrote:

If you know that a string is 0 terminated, you can easily create a slice
from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.



Of course, using strlen() is always going to be unsafe. But having %zs  
is equally unsafe for the same reason.


deadalnix's example shows that adding a new format specifier %zs adds  
little value, but it gets much worse. Since %zs is inherently unsafe, it  
hides such unsafety in a commonly used library function, which will  
infect everything else that transitively calls writefln with unsafety.


This makes %zs an unacceptable feature.


What about %s just working with zero-terminated strings?

I was going to argue this point, but I just thought of a very very good  
counter-case for this.


string x = "abc".idup; // no zero-terminator!

writefln("%s", x.ptr);

What we don't want is for writefln to try and interpret the pointer as a C  
string.  Not only is it bad, but even the code seems to suggest "Hey, this  
should print a pointer!"


The large underlying issue here is that C considers char * to be a  
zero-terminated string, and D considers it to be a pointer.
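
A small sketch of that difference, assuming a current Phobos (std.string.fromStringz is a library function that exists today; it is not something proposed in this message). A string literal is used here so that the terminator really is present:

```d
import std.format : format;
import std.string : fromStringz;

void main()
{
    string x = "abc";               // a literal, so a '\0' follows the data
    immutable(char)* p = x.ptr;

    // %s formats the pointer value itself, not the characters behind it.
    assert(format("%s", p) != "abc");

    // Reading it as a C string takes an explicit, @system conversion.
    assert(fromStringz(p) == "abc");
}
```

With the "abc".idup case from the example above, the explicit conversion would be the unsafe step, since nothing guarantees a terminator after the copied data.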


This means any code which uses C calls heavily will have to awkwardly  
dance between both worlds.  I think there is some value in providing  
something that is *not* common to do the above work (convert char * to  
char[]).


Hm...

@system char[] zstr(char *s) { return s[0..strlen(s)]; }

provides:

writefln("%s", zstr(s));

vs.

writefln("%zs", s);

Arguably, nobody uses %zs, so even though writefln is common, the  
specifier is not.  However, we can't require an import to use a bizarre  
specifier, and you can't link un@safe code to a specifier, so the zstr  
concept is far superior in requiring the user to know what he is doing,  
and having the compiler enforce that.
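
Written out as a complete program, the zstr idea could look like the following sketch. The helper body is the one given above; the surrounding main and the checks are assumptions added for illustration.

```d
import core.stdc.string : strlen;
import std.stdio : writefln;

// The helper proposed above, written out. It is @system because it is only
// correct if s is non-null and zero-terminated, which the caller must ensure.
char[] zstr(char* s) @system
{
    return s[0 .. strlen(s)];
}

void main()
{
    char[] buf = "hello\0".dup;     // mutable and explicitly terminated
    char[] view = zstr(buf.ptr);    // a slice over buf, no allocation

    assert(view == "hello");
    assert(view.ptr is buf.ptr);
    writefln("%s", view);

    // @safe code cannot call the helper directly.
    static assert(!__traits(compiles, () @safe { char* p; return zstr(p); }));
}
```

The unsafe step is visible at the call site and rejected in @safe code, while writefln itself never has to know about C strings.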


Does it make sense for Phobos to provide such a shortcut in an obscure  
header somewhere?  Like std.cstring?  Or should we just say "roll your own  
if you need it"?


-Steve

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Walter Bright


On 9/30/2012 11:31 AM, deadalnix wrote:

If you know that a string is 0 terminated, you can easily create a slice
from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.



Of course, using strlen() is always going to be unsafe. But having %zs 
is equally unsafe for the same reason.


deadalnix's example shows that adding a new format specifier %zs adds 
little value, but it gets much worse. Since %zs is inherently unsafe, it 
hides such unsafety in a commonly used library function, which will 
infect everything else that transitively calls writefln with unsafety.


This makes %zs an unacceptable feature.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Vladimir Panteleev


On Monday, 1 October 2012 at 12:12:52 UTC, deadalnix wrote:

On 01/10/2012 13:29, Vladimir Panteleev wrote:

On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:

How does to!string know that the string is 0 terminated ?


By convention (it doesn't).


It is unsafe as hell oO


Forcing the programmer to put strlen calls everywhere in his code 
is not any safer.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Andrej Mitrovic

On 10/1/12, Andrej Mitrovic  wrote:
> but the D
> compiler expected the parameters to be passed

missing "in another way" there.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Andrej Mitrovic

On 10/1/12, Piotr Szturmaj  wrote:
> For example C binding writers may change:
>
> extern(C) char* getstr();
>
> to
>
> extern(C) cstring getstr();

I don't think you can reliably do that because of semantics w.r.t.
passing parameters on the stack vs in registers based on whether a
type is a pointer or not. I've had this sort of bug when wrapping C++
where the C++ compiler was passing a parameter in one way but the D
compiler expected the parameters to be passed, simply because I tried
to be clever and fake a return type. See:
http://forum.dlang.org/thread/mailman.1547.1346632732.31962.d@puremagic.com#post-mailman.1557.1346690320.31962.d.gnu:40puremagic.com

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Piotr Szturmaj

Johannes Pfau wrote:

struct CString(T)
  if (isSomeChar!T)
{
  T* str;
}

@property
auto cstring(S : T*, T)(S str)
  if (isSomeChar!T)
{
  return CString!T(str);
}

string test = "abc";
immutable(char)* p = test.ptr;

writefln("%s", p.cstring); // prints "abc"

Here the char pointer type is "annotated" as null terminated string
and writefln can use this information.

If CString implemented a toString method (probably the variant taking a
sink delegate), this would already work.

I reworked this example to form a forward range:

http://dpaste.dzfl.pl/7ab1eeec

The major advantage over "%zs" is that it could be used anywhere, not 
only with writef().

For example C binding writers may change:

extern(C) char* getstr();

to

extern(C) cstring getstr();

so the string may be immediately used with writef();

> I'm not sure about performance
> though: Isn't writing out bigger buffers a lot faster than writing
> single chars? You could print every char individually, but wouldn't a
> p[0 .. strlen(p)] usually be faster?

I think it internally prints single characters anyway. At least it must 
test each character if it's not zero valued. strlen() does that.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Steven Schveighoffer

On Mon, 01 Oct 2012 05:54:30 -0400, Jonathan M Davis   
wrote:



I'm not completely against the idea of %zs, but I confess that I have to
wonder what someone is doing if they really need to print zero-terminated
strings all that often in D for anything other than quick debugging (in which
case to!string works just fine)


to!string necessarily allocates, I think that is not a small problem.
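
A minimal sketch of the difference between the two conversions, using only std.conv.to and a manual slice. The buffer contents are made up for the example:

```d
import core.stdc.string : strlen;
import std.conv : to;

void main()
{
    char[] buf = "abc\0".dup;
    char* p = buf.ptr;

    string copy = to!string(p);         // new immutable copy of the characters
    char[] view = p[0 .. strlen(p)];    // slice over the original memory

    buf[0] = 'X';
    assert(copy == "abc");   // the copy is unaffected
    assert(view == "Xbc");   // the slice sees the change
    assert(copy.ptr !is view.ptr);
}
```

Both conversions scan for the terminator, so neither is safer than the other; the copy only adds an allocation.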

I think %s should treat char * as if it is zero-terminated.

Invariably, you will have two approaches to this problem:

1. writefln("%s", mycstring); => 0xptrlocation
2. hm.., I guess I'll just use to!string => vulnerable to  
non-zero-terminated strings!


or

2. hm.., to!string will allocate, I guess I'll just use writefln("%s",  
mycstring[0..strlen(mycstring)]); => vulnerable to non-zero-terminated  
strings!


So how is forcing the user to use one of these methods any safer?  I don't  
see any casts in there...



, since only stuff directly interacting with C
code will even care. And if it's really that big a deal, and you're constantly
interacting with C code like that, you can always use the appropriate C
function - printf - and then it's a non-issue.


Nobody should ever *ever* use printf, unless you are debugging druntime.

It's not a non-issue.  printf has no type checking whatsoever.  Using it  
means 1) non-typechecked code (i.e., accidentally pass an int instead of a  
string, or forget to pass an arg for a specifier, and you've crashed your  
code), and 2) you have locked yourself into using C's streams (something I  
hope to remedy in the future).


Besides, it doesn't *gain* you anything over having writef(ln) just  
support char *.


Bottom line -- if to!string(arg) is supported, writefln("%s", arg) should  
be supported, and do the same thing.


-Steve

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread deadalnix


On 01/10/2012 13:29, Vladimir Panteleev wrote:

On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:

On 30/09/2012 21:58, Vladimir Panteleev wrote:

On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:

If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.


That's what to!string already does.


How does to!string know that the string is 0 terminated ?


By convention (it doesn't).


It is unsafe as hell oO

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Johannes Pfau

On Mon, 01 Oct 2012 13:22:46 +0200, Piotr Szturmaj wrote:

> Paulo Pinto wrote:
> > On Monday, 1 October 2012 at 09:42:08 UTC, Piotr Szturmaj wrote:
> >> Jakob Ovrum wrote:
> >>> On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:
>  Adam D. Ruppe wrote:
> > On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne
> > Petersen wrote:
> > Also this reminds me of the utter uselessness of the current
> > behavior of
> > "%s" and a pointer - it prints the address.
> 
>  Why not specialize current "%s" for character pointer types so
>  it will print null terminated strings? It's always possible to
>  cast to void* to print an address.
> >>>
> >>> It's not safe to assume that pointers to characters are generally
> >>> null terminated.
> >>
> >> Yes, but programmer should know what he's passing anyway.
> >
> > The thinking "the programmer should" only works in one man teams.
> >
> > As soon as you start having teams with disparate programming
> > knowledge among team members, you can forget everything about "the
> > programmer should".
> 
> I experienced such a team at my previous work and I know what you mean.
> My original thought was based on telling writef that I want to print a 
> null-terminated string rather than an address. to!string will surely
> work, but it implies double iteration, one in to!string to calculate
> length (seeking for 0 char) and one in writef (printing). With long
> strings this is suboptimal. What about something like this:
> 
> struct CString(T)
>  if (isSomeChar!T)
> {
>  T* str;
> }
> 
> @property
> auto cstring(S : T*, T)(S str)
>  if (isSomeChar!T)
> {
>  return CString!T(str);
> }
> 
> string test = "abc";
> immutable(char)* p = test.ptr;
> 
> writefln("%s", p.cstring); // prints "abc"
> 
> Here the char pointer type is "annotated" as null terminated string
> and writefln can use this information.

If CString implemented a toString method (probably the variant taking a
sink delegate), this would already work. I'm not sure about performance
though: Isn't writing out bigger buffers a lot faster than writing
single chars? You could print every char individually, but wouldn't a
p[0 .. strlen(p)] usually be faster?
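
A minimal sketch of the toString variant suggested above, assuming a simplified, char-only CString (the struct and the cstring helper are hypothetical names from this thread, not Phobos types). It measures the string once with strlen and hands the whole slice to the sink in a single call:

```d
import core.stdc.string : strlen;
import std.format : format;
import std.stdio : writefln;

// Hypothetical wrapper (char only): marks a pointer as zero-terminated.
struct CString
{
    const(char)* str;

    // Sink-based toString, picked up by std.format for "%s".
    void toString(scope void delegate(const(char)[]) sink) const
    {
        if (str is null)
            return;
        // One strlen pass, then the whole slice goes to the sink at once.
        sink(str[0 .. strlen(str)]);
    }
}

// Hypothetical convenience function so that p.cstring works via UFCS.
CString cstring(const(char)* p)
{
    return CString(p);
}

void main()
{
    immutable(char)* p = "abc".ptr; // D string literals are zero-terminated
    writefln("%s", p.cstring);      // prints "abc"
    assert(format("[%s]", p.cstring) == "[abc]");
}
```

The safety concern raised in the thread still applies: the wrapper only documents the caller's claim that the pointer is zero-terminated, it cannot verify it.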

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Piotr Szturmaj


Jonathan M Davis wrote:

On Monday, October 01, 2012 11:18:16 Piotr Szturmaj wrote:

Adam D. Ruppe wrote:

On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:

While the idea is reasonable, the problem then becomes that if you
accidentally pass a non-zero-terminated char* to %zs, all hell breaks
loose just like with printf.


That's the same risk with to!string(), yes? We aren't really losing
anything by adding it.

Also this reminds me of the utter uselessness of the current behavior of
"%s" and a pointer - it prints the address.


Why not specialize current "%s" for character pointer types so it will
print null terminated strings? It's always possible to cast to void* to
print an address.


Honestly? One of Phobos' best features is the fact that %s works for
_everything_. Specializing it for _anything_ would be horrible. It would also
break a _ton_ of code. Who even uses %d, %f, etc. if they don't need to use
format specifiers? It's just way simpler to always use %s.


OK, I think you're right.


I'm not completely against the idea of %zs, but I confess that I have to
wonder what someone is doing if they really need to print zero-terminated
strings all that often in D for anything other than quick debugging (in which
case to!string works just fine), since only stuff directly interacting with C
code will even care. And if it's really that big a deal, and you're constantly
interacting with C code like that, you can always use the appropriate C
function - printf - and then it's a non-issue.


Imagine you're serializing a large amount of text, some of which comes
from a C library (as null-terminated char*), and you're using format()
with %s specifiers. Handling C strings directly would simply be faster
because it avoids the double iteration.
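
The single-pass handling described above could look roughly like the following sketch. putCString is a hypothetical helper (not a Phobos function) and a std.array appender stands in for the output sink. Whether this beats one strlen followed by one bulk write is an open question that would need measuring:

```d
import std.array : appender;
import std.stdio : writeln;

// Hypothetical helper, not a Phobos function: copies a zero-terminated
// string into any sink with a put(char) method, reading each char once.
void putCString(Sink)(ref Sink sink, const(char)* p)
{
    if (p is null)
        return;
    for (; *p != '\0'; ++p)
        sink.put(*p);
}

void main()
{
    auto buf = appender!string();
    buf.put("Name: ");
    putCString(buf, "abc".ptr); // D string literals are zero-terminated
    writeln(buf.data);          // Name: abc
}
```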

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Vladimir Panteleev


On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:

On 30/09/2012 21:58, Vladimir Panteleev wrote:

On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.


That's what to!string already does.


How does to!string know that the string is 0 terminated ?


By convention (it doesn't).

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Piotr Szturmaj


Paulo Pinto wrote:

On Monday, 1 October 2012 at 09:42:08 UTC, Piotr Szturmaj wrote:

Jakob Ovrum wrote:

On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:

Adam D. Ruppe wrote:

On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen
wrote:
Also this reminds me of the utter uselessness of the current
behavior of
"%s" and a pointer - it prints the address.


Why not specialize current "%s" for character pointer types so it will
print null terminated strings? It's always possible to cast to void*
to print an address.


It's not safe to assume that pointers to characters are generally null
terminated.


Yes, but programmer should know what he's passing anyway.


The thinking "the programmer should" only works in one man teams.

As soon as you start having teams with disparate programming knowledge
among team members, you can forget everything about "the programmer
should".


I experienced such a team at my previous job and I know what you mean. My
original idea was to tell writef that I want to print a null-terminated
string rather than an address. to!string will surely work, but it implies
double iteration: one in to!string to calculate the length (seeking the
0 char) and one in writef (printing). With long strings this is
suboptimal. What about something like this:


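import std.stdio : writefln;
import std.traits : isSomeChar;

struct CString(T)
    if (isSomeChar!T)
{
    T* str;
}

@property
auto cstring(S : T*, T)(S str)
    if (isSomeChar!T)
{
    return CString!T(str);
}

void main()
{
    string test = "abc";
    immutable(char)* p = test.ptr;

    // intended to print "abc" once writefln understands CString
    writefln("%s", p.cstring);
}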

Here the char pointer type is "annotated" as null terminated string and 
writefln can use this information.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread deadalnix


On 30/09/2012 21:58, Vladimir Panteleev wrote:

On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:

If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.


That's what to!string already does.


How does to!string know that the string is 0 terminated ?

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Paulo Pinto


On Monday, 1 October 2012 at 09:42:08 UTC, Piotr Szturmaj wrote:

Jakob Ovrum wrote:
On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:

Adam D. Ruppe wrote:

On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen
wrote:
Also this reminds me of the utter uselessness of the current
behavior of "%s" and a pointer - it prints the address.


Why not specialize current "%s" for character pointer types so it will
print null terminated strings? It's always possible to cast to void*
to print an address.


It's not safe to assume that pointers to characters are generally null
terminated.


Yes, but programmer should know what he's passing anyway.


The thinking "the programmer should" only works in one man teams.

As soon as you start having teams with disparate programming 
knowledge among team members, you can forget everything about 
"the programmer should".


--
Paulo

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Jonathan M Davis

On Monday, October 01, 2012 11:18:16 Piotr Szturmaj wrote:
> Adam D. Ruppe wrote:
> > On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
> >> While the idea is reasonable, the problem then becomes that if you
> >> accidentally pass a non-zero-terminated char* to %zs, all hell breaks
> >> loose just like with printf.
> > 
> > That's the same risk with to!string(), yes? We aren't really losing
> > anything by adding it.
> > 
> > Also this reminds me of the utter uselessness of the current behavior of
> > "%s" and a pointer - it prints the address.
> 
> Why not specialize current "%s" for character pointer types so it will
> print null terminated strings? It's always possible to cast to void* to
> print an address.

Honestly? One of Phobos' best features is the fact that %s works for 
_everything_. Specializing it for _anything_ would be horrible. It would also 
break a _ton_ of code. Who even uses %d, %f, etc. if they don't need to use 
format specifiers? It's just way simpler to always use %s.

I'm not completely against the idea of %zs, but I confess that I have to 
wonder what someone is doing if they really need to print zero-terminated 
strings all that often in D for anything other than quick debugging (in which 
case to!string works just fine), since only stuff directly interacting with C 
code will even care. And if it's really that big a deal, and you're constantly 
interacting with C code like that, you can always use the appropriate C 
function - printf - and then it's a non-issue.

- Jonathan M Davis

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Piotr Szturmaj


Jakob Ovrum wrote:

On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:

Adam D. Ruppe wrote:

On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen
wrote:
Also this reminds me of the utter uselessness of the current behavior of
"%s" and a pointer - it prints the address.


Why not specialize current "%s" for character pointer types so it will
print null terminated strings? It's always possible to cast to void*
to print an address.


It's not safe to assume that pointers to characters are generally null
terminated.


Yes, but programmer should know what he's passing anyway.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Jakob Ovrum


On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:

Adam D. Ruppe wrote:
On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen
wrote:
Also this reminds me of the utter uselessness of the current
behavior of "%s" and a pointer - it prints the address.


Why not specialize current "%s" for character pointer types so 
it will print null terminated strings? It's always possible to 
cast to void* to print an address.


It's not safe to assume that pointers to characters are generally 
null terminated.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Piotr Szturmaj


Adam D. Ruppe wrote:

On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:

While the idea is reasonable, the problem then becomes that if you
accidentally pass a non-zero-terminated char* to %zs, all hell breaks
loose just like with printf.


That's the same risk with to!string(), yes? We aren't really losing
anything by adding it.

Also this reminds me of the utter uselessness of the current behavior of
"%s" and a pointer - it prints the address.


Why not specialize current "%s" for character pointer types so it will 
print null terminated strings? It's always possible to cast to void* to 
print an address.

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Rob T


On Monday, 1 October 2012 at 06:58:41 UTC, Paulo Pinto wrote:
You should wrap those APIs anyway, so as not to pollute D code with
lower-level APIs.


I have to agree, especially when it applies to pointers.

We should not forget that one of the objectives of D is to make 
coding "safe" by getting rid of the need to use pointers and 
other unsafe features. It encourages safe practice by making safe 
practice much easier to do than using unsafe practice. It however 
allows unsafe practice where necessary, but the programmer has to 
intentionally do something extra to make that happen.


I think the suggestion of introducing a null-terminated string
specifier fundamentally goes against the objectives of D, and if
introduced will ultimately degrade the quality of the language.


--rt
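
As an illustration of the wrapping approach argued for above, here is a minimal sketch in which the C-level pointer handling stays inside one small function and callers only ever see D strings. environmentValue is a hypothetical name chosen for this example (Phobos already offers std.process.environment for real use), and fromStringz was added to Phobos some time after this thread:

```d
import core.stdc.stdlib : getenv;
import std.string : fromStringz, toStringz;

// Hypothetical wrapper: the char* never escapes this function.
// Returns null when the variable is not set.
string environmentValue(string name)
{
    const(char)* raw = getenv(name.toStringz);
    // Copy the C string into GC-owned immutable memory before returning.
    return raw is null ? null : fromStringz(raw).idup;
}

void main()
{
    import std.stdio : writeln;

    auto home = environmentValue("HOME");
    writeln(home is null ? "(not set)" : home);
}
```

The copy made by idup is deliberate here: the memory returned by getenv belongs to the C runtime, so the wrapper does not hand out a slice into it.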

Re: Idea: Introduce zero-terminated string specifier

2012-10-01 Thread Paulo Pinto

On Sunday, 30 September 2012 at 20:27:16 UTC, Andrej Mitrovic 
wrote:

On 9/30/12, deadalnix  wrote:
If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.



What does that have to do with writef()? You can call to!string, but
that's beside the point. The point was getting rid of this verbosity
when using C APIs.


You should wrap those APIs anyway, so as not to pollute D code with
lower-level APIs.


As such I don't find the verbosity, as you put it, that much of 
an issue.


Then again, I favor the Pascal family of languages for systems 
programming.


--
Paulo

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread Muhtar

On Sunday, 30 September 2012 at 19:58:16 UTC, Vladimir Panteleev 
wrote:

On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.


That's what to!string already does.


I agree with you...

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread Andrej Mitrovic

On 9/30/12, deadalnix  wrote:
> If you know that a string is 0 terminated, you can easily create a slice
> from it as follows:
>
> char* myZeroTerminatedString;
> char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>
> It is clean and avoids modifying the stdlib in an unsafe way.
>

What does that have to do with writef()? You can call to!string, but
that's beside the point. The point was getting rid of this verbosity
when using C APIs.

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread Vladimir Panteleev


On Sunday, 30 September 2012 at 18:58:11 UTC, Paulo Pinto wrote:

+1

We don't need to preserve C's design errors regarding strings 
and vectors.


The problem is that, unsurprisingly, most C APIs (not just libc, 
but also most C libraries and OS APIs) use zero-terminated 
strings. The philosophy of ignoring the existence of C strings 
throughout all of D makes working with such APIs needlessly 
verbose (and sometimes annoying, as D code will compile and 
produce unexpected results).

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread Vladimir Panteleev

On Saturday, 29 September 2012 at 02:07:38 UTC, Andrej Mitrovic 
wrote:
I've noticed I'm having to do a lot of to!string calls when I want to
call the versatile writef() function. So I was thinking, why not
introduce a special zero-terminated string specifier which would both
alleviate the need to call to!string and would probably save on
needless memory allocation. If all we want to do is print something,
why waste time duplicating a string?


I just checked and std.conv.to always allocates a copy, even when 
constness doesn't require it. It should not reallocate when 
constness doesn't change, or is a safe conversion (e.g. immutable 
-> const).


A discussion on a related topic (formatting of C strings results 
in unexpected behavior) is here: 
http://d.puremagic.com/issues/show_bug.cgi?id=8384
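
The allocation difference can be checked with a small sketch. It contrasts to!string with std.string.fromStringz, a Phobos function added some time after this thread, which returns a slice over the original memory instead of a copy. The pointer comparisons below are only an illustration of that difference:

```d
import std.conv : to;
import std.string : fromStringz;

void main()
{
    const(char)* p = "abc".ptr; // D string literals are zero-terminated

    string copy = to!string(p);          // allocates a new string
    const(char)[] view = fromStringz(p); // slices the existing memory

    assert(copy == "abc" && view == "abc");
    assert(view.ptr is p);  // same memory, no copy made
    assert(copy.ptr !is p); // freshly allocated copy
}
```

The slice is only valid for as long as the C-side memory stays alive and unmodified, which is the price of avoiding the copy.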

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread Vladimir Panteleev


On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.


That's what to!string already does.

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread Paulo Pinto


On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.


+1

We don't need to preserve C's design errors regarding strings and 
vectors.


--
Paulo

Re: Idea: Introduce zero-terminated string specifier

2012-09-30 Thread deadalnix

If you know that a string is 0 terminated, you can easily create a
slice from it as follows:

char* myZeroTerminatedString;
char[] slice = myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoids modifying the stdlib in an unsafe way.
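
The slicing idiom above can be packaged as a small helper. A minimal sketch, assuming a hypothetical function name sliceOf; it is only correct when the pointer really is zero-terminated and the memory outlives the slice:

```d
import core.stdc.string : strlen;
import std.stdio : writefln;

// Hypothetical helper: view a zero-terminated C string as a D slice
// without copying. inout preserves the caller's constness.
inout(char)[] sliceOf(inout(char)* p)
{
    return p is null ? null : p[0 .. strlen(p)];
}

void main()
{
    const(char)* name = "abc".ptr; // D string literals are zero-terminated
    writefln("%s", sliceOf(name)); // prints "abc"
}
```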

Re: Idea: Introduce zero-terminated string specifier

2012-09-28 Thread Adam D. Ruppe

On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne 
Petersen wrote:
While the idea is reasonable, the problem then becomes that if
you accidentally pass a non-zero-terminated char* to %zs, all
hell breaks loose just like with printf.


That's the same risk with to!string(), yes? We aren't really 
losing anything by adding it.


Also this reminds me of the utter uselessness of the current 
behavior of "%s" and a pointer - it prints the address.


I think this should be simply disallowed. If you want that, you 
can use %x, and if you want it printed, that's where the new %z 
comes in.
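
For reference, a small sketch of the behavior under discussion, assuming a current Phobos. "%s" applied to a char pointer formats the address, so the text has to be requested explicitly, here through std.string.fromStringz, which was added to Phobos some time after this thread. The exact address text varies from run to run, so it is not shown:

```d
import std.format : format;
import std.string : fromStringz;

void main()
{
    const(char)* p = "abc".ptr; // D string literals are zero-terminated

    string asPointer = format("%s", p);           // the address, not the text
    string asText = format("%s", fromStringz(p)); // the characters

    assert(asText == "abc");
    assert(asPointer != "abc");
}
```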

Re: Idea: Introduce zero-terminated string specifier

2012-09-28 Thread Alex Rønne Petersen


On 29-09-2012 04:08, Andrej Mitrovic wrote:

I've noticed I'm having to do a lot of to!string calls when I want to
call the versatile writef() function. So I was thinking, why not
introduce a special zero-terminated string specifier which would both
alleviate the need to call to!string and would probably save on
needless memory allocation. If all we want to do is print something,
why waste time duplicating a string?

Let's say we call the new specifier %zs (we can debate for the actual name):

extern(C) const(char)* GetName();  // e.g. some C API functions
extern(C) const(char)* GetLastName();

Before:
writefln("Name %s, Last Name %s", to!string(GetName()),
to!string(GetLastName()));

After:
writefln("Name %zs, Last Name %zs", GetName(), GetLastName());

Of course in this simple case you could just use printf(), but
remember that writef() is much more versatile and allows you to
specify %s to match any type. It would be great to match printf's
original meaning of %s with another specifier.



While the idea is reasonable, the problem then becomes that if you
accidentally pass a non-zero-terminated char* to %zs, all hell breaks
loose just like with printf.


--
Alex Rønne Petersen
a...@lycus.org
http://lycus.org
