Re: toStringz or not toStringz

Regan Heath Thu, 14 Jul 2011 08:05:45 -0700

On Thu, 14 Jul 2011 12:30:24 +0100, Steven Schveighoffer <schvei...@yahoo.com> wrote:

On Thu, 14 Jul 2011 05:53:47 -0400, Regan Heath <re...@netmail.co.nz> wrote:
On Wed, 13 Jul 2011 19:31:42 +0100, Steven Schveighoffer <schvei...@yahoo.com> wrote:
On Wed, 13 Jul 2011 13:32:56 -0400, Regan Heath <re...@netmail.co.nz> wrote:
On Wed, 13 Jul 2011 17:00:39 +0100, Steven Schveighoffer <schvei...@yahoo.com> wrote:
How does your proposal know that a char * is part of a heap-allocated array? If you are assuming the only case where char* is passed will be arr.ptr, then that doesn't cut it. What if the compiler doesn't know where the char * came from?
See your Q and my A above ("char * foo" example).
The inherent problem of zero-terminated strings is that you don't know how long it is until you search for a zero. If it's not properly terminated, then you are screwed. That problem cannot be "solved", even with compiler help -- you can get situations where there is no more information other than the pointer.
Really? But can't we obtain the GC lock and look them up, as mentioned above? And isn't this exactly what toStringz will do when the programmer first of all curses because it has crashed, and then adds an explicit toStringz call?
Who said the char * points into GC memory? It could point at stack memory, or static data in ROM.
Ok. What would toStringz do in this case? .. because that's what I'm proposing we do here.
Nothing, you don't call toStringz on a char *, you call it on a string. The point is, for those who have already guaranteed a char * has a 0 in it, they should not have to have the compiler injecting useless code for a simple function call.
A really really good example is if you use a char * you got from a C function to call another C function.

Good points all. So, the idea should be limited to cases where D's char[] and string are passed to extern "C" functions expecting char*, and should not affect cases where D's char* is passed directly. Sounds good.
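
To make the two cases concrete, here is a minimal sketch using today's explicit call. The variable names are illustrative only and do not come from the thread.

```d
import core.stdc.stdio : puts;
import core.stdc.stdlib : getenv;
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // Case 1: a char* obtained from one C function is handed to another.
    // C already terminated it, so no conversion is needed (or wanted).
    const(char)* path = getenv("PATH");
    if (path !is null)
        puts(path);

    // Case 2: a D string is a pointer plus a length, with no guarantee of a
    // terminator. This slice is followed by ' ', not '\0'.
    string s = "hello world"[0 .. 5];

    // Today the conversion must be written explicitly.
    assert(strlen(toStringz(s)) == 5);
}
```

Under the restriction above, only the second case would get the implicit call; the first would compile exactly as it does now.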

The goal here is to pick some low hanging fruit, the general case mentioned earlier, and make it work as a new D programmer would expect. In that case there is no technical difficulty implementing it (toStringz already exists), there is no extra cost (you already have to call toStringz), and the only disagreement seems to be whether it should be implicit or explicit.
There is an extra cost where you wouldn't have to call toStringz currently.

The point I've tried to make all along is that this is a rare situation, and not the general case. In the general case you're going to need to call toStringz. Especially if you restrict this idea to D's char[] and string and not D's char* as mentioned above.
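
For reference, a sketch of the kind of call that needs no toStringz today, which is where an injected call would be pure overhead. The buffer name is made up for the example.

```d
import core.stdc.string : strlen;

void main()
{
    // String literals are zero-terminated and convert implicitly to
    // const(char)*, so this call needs no conversion at all.
    assert(strlen("hello") == 5);

    // A buffer the programmer has terminated by hand; passing .ptr is a
    // plain char* call and falls outside the proposed restriction.
    char[6] buf = "hello\0";
    assert(strlen(buf.ptr) == 5);
}
```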

In this particular case I cannot see any harm in making it implicit. Yes, there are some edge cases, but they either already exist (as shown by the explicit toStringz example I gave where the passed char[] remained unchanged, and your example passing buffer[]), or they may be detectable by the compiler, or they are rare - in which case requiring some manual intervention is not too much to ask.
So, on balance I reckon the implicit call would be "better" for more people more of the time, and at no extra cost. It seems like a win/win to me. Yes, there are edge cases, yes there are wrinkles to iron out, no it's not a "general/covers everything perfectly" kind of idea - which I agree we'd all prefer, but it makes D look slicker, and removes one more stumbling block for new D programmers.
We also have to weigh this against two things:


Assuming the above mentioned restriction (char[] and string, not char*)...

1. How will existing code (that already calls toStringz) be affected?


Not at all.

2. This is *not* a trivial compiler change. So all other options should be considered, there's a *lot* of C calls that exist from D today that could possibly be affected.


It will affect none of these.

If C strings were their own type (and not conflated with "buffer pointer"), and verifying a C string was valid without segfaulting and in O(1) time, I'd agree that a compiler change would be warranted. There's just too many cases (note, these aren't the majority, but they are enough) where the injected calls will be either performance drags or unnecessary.

I disagree about the number of cases being too many, but this is a gut feeling and I have no evidence to support it.

I think with the restriction I mentioned above the situation changes, however, as all those edge cases are unaffected, old code is unaffected, and only new code will allow char[] and string to be passed as extern "C" char* parameters.
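
As a rough library-level approximation of what the implicit call would amount to, the sketch below wraps a C function by hand. strlenOf is a hypothetical helper invented for this illustration; it is not an existing Phobos function and not how the compiler change would actually be implemented.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper: accepts char[] or string and performs the
// toStringz call on the caller's behalf before entering C.
size_t strlenOf(const(char)[] s)
{
    return strlen(toStringz(s));
}

void main()
{
    string name = "hello world"[0 .. 5];   // slice with no terminator of its own
    char[] buf = "abc".dup;                // mutable copy, also unterminated

    assert(strlenOf(name) == 5);
    assert(strlenOf(buf) == 3);

    // Existing code that passes a char* keeps calling strlen directly.
    assert(strlen("abc") == 3);
}
```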


--
Using Opera's revolutionary email client: http://www.opera.com/mail/
