On Thu, 14 Jul 2011 12:30:24 +0100, Steven Schveighoffer <schvei...@yahoo.com> wrote:
On Thu, 14 Jul 2011 05:53:47 -0400, Regan Heath <re...@netmail.co.nz> wrote:

On Wed, 13 Jul 2011 19:31:42 +0100, Steven Schveighoffer <schvei...@yahoo.com> wrote:

On Wed, 13 Jul 2011 13:32:56 -0400, Regan Heath <re...@netmail.co.nz> wrote:

On Wed, 13 Jul 2011 17:00:39 +0100, Steven Schveighoffer <schvei...@yahoo.com> wrote:

How does your proposal know that a char * is part of a heap-allocated array? If you are assuming the only case where char * is passed will be arr.ptr, then that doesn't cut it. What if the compiler doesn't know where the char * came from?

See your Q and my A above ("char * foo" example).

The inherent problem of zero-terminated strings is that you don't know how long it is until you search for a zero. If it's not properly terminated, then you are screwed. That problem cannot be "solved", even with compiler help -- you can get situations where there is no more information other than the pointer.

Really? But can't we obtain the GC lock and look them up, as mentioned above? And isn't this exactly what toStringz will do when the programmer first of all curses because it has crashed, and then adds an explicit toStringz call?

Who said the char * points into GC memory? It could point at stack memory, or static data in ROM.

Ok. What would toStringz do in this case? .. because that's what I'm proposing we do here.

Nothing, you don't call toStringz on a char *, you call it on a string. The point is, for those who have already guaranteed a char * has a 0 in it, they should not have to have the compiler injecting useless code for a simple function call.

A really really good example is if you use a char * you got from a C function to call another C function.
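
To make that case concrete, here is a minimal sketch. The helper name envLength is made up for illustration; getenv and strlen are the usual C functions as declared in core.stdc. The char* that comes back from C is already zero-terminated, so it can go straight into the next C call with no copy and no toStringz.

```d
import core.stdc.stdlib : getenv;
import core.stdc.string : strlen;

// Hypothetical helper: length of an environment variable's value,
// or 0 if the variable is not set.
size_t envLength(const(char)* name)
{
    // getenv returns a zero-terminated char* owned by the C runtime,
    // or null when the variable does not exist.
    const(char)* value = getenv(name);

    // The pointer is handed directly to another C function.
    // Injecting a toStringz-style copy here would be pure overhead.
    return value is null ? 0 : strlen(value);
}

void main()
{
    import std.stdio : writeln;

    // A D string literal converts implicitly to const(char)*.
    writeln(envLength("PATH"));
}
```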

Good points all. So, the idea should be limited to cases where D's char[] and string are passed to extern "C" functions expecting char*, and should not affect cases where D's char* is passed directly. Sounds good.
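
A rough sketch of the restricted case, assuming nothing beyond what std.string already provides (the wrapper name cLength is made up for illustration). The explicit toStringz call below is what the programmer has to write today when passing a char[] or string to a C function taking char*; the proposal would have the compiler insert the equivalent call implicitly.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper: ask C's strlen for the length of a D string or slice.
size_t cLength(const(char)[] s)
{
    // toStringz produces a zero-terminated copy, so strlen stops in the right place.
    return strlen(s.toStringz);
}

void main()
{
    char[] buf = "hello world".dup;
    char[] slice = buf[0 .. 5];

    // Passing slice.ptr directly would be wrong: there is no zero after the 'o',
    // so strlen would keep reading into " world" and possibly beyond the buffer.
    assert(cLength(slice) == 5);

    string s = "hello";
    assert(cLength(s) == 5);
}
```

A D char* passed directly is not touched by any of this, which is the point of the restriction.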

The goal here is to pick some low hanging fruit, the general case mentioned earlier, and make it work as a new D programmer would expect. In that case there is no technical difficulty implementing it (toStringz already exists), there is no extra cost (you already have to call toStringz), and the only disagreement seems to be whether it should be implicit or explicit.

There is an extra cost where you wouldn't have to call toStringz currently.

The point I've tried to make all along is that this is a rare situation, and not the general case. In the general case you're going to need to call toStringz. Especially if you restrict this idea to D's char[] and string and not D's char* as mentioned above.

In this particular case I cannot see any harm in making it implicit. Yes, there are some edge cases, but they either already exist (as shown by the explicit toStringz example I gave where the passed char[] remained unchanged, and your example passing buffer[]), or they may be detectable by the compiler, or they are rare - in which case requiring some manual intervention is not too much to ask.

So, on balance I reckon the implicit call would be "better" for more people more of the time, and at no extra cost. It seems like a win/win to me. Yes, there are edge cases, yes there are wrinkles to iron out, no it's not a "general/covers everything perfectly" kind of idea - which I agree we'd all prefer, but it makes D look slicker, and removes one more stumbling block for new D programmers.

We also have to weigh this against two things:

Assuming the above mentioned restriction (char[] and string, not char*)...

1. How will existing code (that already calls toStringz) be affected?

Not at all.

2. This is *not* a trivial compiler change, so all other options should be considered. There are a *lot* of C calls that exist from D today that could possibly be affected.

It will affect none of these.

If C strings were their own type (and not conflated with "buffer pointer"), and a C string could be verified as valid in O(1) time without segfaulting, I'd agree that a compiler change would be warranted. There are just too many cases (note, these aren't the majority, but they are enough) where the injected calls will be either performance drags or unnecessary.

I disagree about the number of cases being too many, but this is a gut feeling and I have no evidence to support it.

I think with the restriction I mentioned above the situation changes however, as all those edge cases are unaffected, old code is unaffected and only new code will allow char[] and string to be passed as extern "C" char* parameters.

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
