Steven Schveighoffer wrote:
On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
Steven Schveighoffer wrote:
On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
Steven Schveighoffer wrote:
On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
I think the best option for toString is to take an output range
and write to it. (The sink is a simplified range.)
Bad idea...
A range only makes sense as a struct, not an interface/object.
I'll tell you why: performance.
You are right. If range interfaces accommodate block transfers, this
problem may be addressed. I agree that one virtual call per
character output would be overkill. (I seem to recall it's one of
the reasons why C++'s iostreams are so inefficient.)
IIRC, I don't think C++ iostreams use polymorphism
Oh yes they do. (Did you even google?) Virtual multiple inheritance,
the works.
http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/
From my C++ book, it appears to only use virtual inheritance. I don't
know enough about virtual inheritance to know how that changes function
calls.
As far as virtual functions go, only the destructor is virtual, so there
is no issue there.
You're right, but there is an issue because, as far as I can recall, these
functions' implementations do end up calling a virtual function per char;
that might be streambuf::overflow. I'm not keen on investigating this any
further, but I'd be grateful if you shared any related knowledge. At the
end of the day, there seems to be violent agreement that we don't want
one virtual call per character or one delegate call per character.
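void put(in char[] str)
{
    // a dchar loop variable makes foreach decode the UTF-8 input
    foreach (dchar dc; str)
    {
        // forward each code point as a one-element dchar[] slice
        put((&dc)[0..1]);
    }
}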
Note that you probably want to build a buffer of dchars instead of
putting one at a time, but you get the idea.
I don't get the idea. I'm seeing one virtual call per character.
You missed the note. I didn't implement it, but you could easily
implement a stack-allocated buffer to cache the conversions, passing
multiple converted code-points at once. But I don't think it's even
worth discussing per my other points.
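A minimal sketch of the buffered variant described in the note, assuming a hypothetical DcharSink interface whose put takes a block of code points. The names DcharSink, CountingSink, and putBuffered, and the buffer size of 64, are illustrative assumptions and not part of any library. The counter only serves to show how many virtual calls are made.

```d
// Hypothetical sink interface (an assumption for this sketch).
interface DcharSink
{
    void put(in dchar[] chunk);
}

// Test double that records how often the virtual put is invoked.
class CountingSink : DcharSink
{
    size_t calls;
    dchar[] data;

    void put(in dchar[] chunk)
    {
        ++calls;
        data ~= chunk;   // copy, since the caller reuses its buffer
    }
}

// Decode UTF-8 into a stack buffer and hand it over in blocks,
// so the sink sees one call per block instead of one per character.
void putBuffered(DcharSink sink, in char[] str)
{
    dchar[64] buf;       // stack-allocated cache of decoded code points
    size_t used = 0;

    foreach (dchar dc; str)   // foreach decodes UTF-8 to code points
    {
        buf[used++] = dc;
        if (used == buf.length)
        {
            sink.put(buf[0 .. used]);
            used = 0;
        }
    }
    if (used > 0)             // flush the partially filled tail
        sink.put(buf[0 .. used]);
}

unittest
{
    auto sink = new CountingSink;
    auto text = new char[](100);
    text[] = 'a';
    putBuffered(sink, text);
    assert(sink.calls == 2);  // 64 code points, then the remaining 36
}
```

Compiled with -unittest, the check shows 100 characters costing two calls rather than 100.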
That being said, one other point that makes all this moot is --
toString is for debugging, not for general purpose. We don't need to
support everything that is possible. You should be able to say "hey,
toString only accepts char[], deal." Of course, you could substitute
wchar[] or dchar[], but I think by far char[] is the most common (and
is the default type for string literals).
I was hoping we could elevate the usefulness of toString a bit.
Whatever kind of data the output stream gets, it's going to convert it
to the format it wants anyway (for stdout, I think that would be
UTF-8). The only benefit is if you have data stored in a different width
that you want to output. Calling a conversion function in that case is,
I think, reasonable enough, and saves the output stream from having to
convert/deal with it.
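As a rough illustration of "calling a conversion function" on the caller's side, here is a sketch that transcodes wchar data with std.utf.toUTF8 before handing it to a char[]-only sink. The helper name emit is hypothetical; only toUTF8 is an existing Phobos function.

```d
import std.utf : toUTF8;

// Hypothetical helper: the sink accepts UTF-8 only, so the caller
// converts UTF-16 data up front and passes it on in a single call.
void emit(scope void delegate(in char[]) sink, const(wchar)[] wide)
{
    sink(toUTF8(wide));
}

unittest
{
    char[] got;
    emit((in char[] s) { got ~= s; }, "h\u00e9llo"w);
    assert(got == "h\u00e9llo");
}
```

The conversion allocates a temporary string, which is the price of keeping the sink interface down to a single character width.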
In other words, I don't think it's going to be that common a case where
you need anything other than UTF-8 output, so the cost of creating an
interface, making virtual calls, disallowing simple delegate passing,
etc. is not worth the convenience *just in case* you have data stored
as wchar[] you want to output.
I'm not sure.
http://www.gnu.org/s/libc/manual/html_node/Streams-and-I18N.html#Streams-and-I18N
GNU defines means to set and detect a UTF-16 console, which dmd observes
(grep std/ for fwide). But then I'm not sure how many are using that
kind of stuff.
That's not to say there is no reason to have a TextOutputStream
object. Such a thing is perfectly usable for a toString which takes
a char[] delegate sink, just pass &put. In fact, there could be a
default toString function in Object that does just that:
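class Object
{
    ...
    // sink-based overload: implementations write their text via put
    void toString(void delegate(in char[] buf) put, string fmt) const
    {}
    // stream overload simply forwards to the sink-based version
    void toString(TextOutputStream tos, string fmt) const
    { toString(&tos.put, fmt); }
}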
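To make the forwarding concrete, here is a small self-contained sketch of the same pattern on a user-defined class. Every name in it (Sink, TextOutputStream, Point, writeTo) is hypothetical and chosen for illustration; the fmt parameter is left out for brevity, and the method is called writeTo rather than toString to keep it clearly separate from Object's own method.

```d
import std.conv : to;

// Hypothetical sink type: a delegate that accepts a block of UTF-8 text.
alias Sink = void delegate(in char[] buf);

// Hypothetical stream object; here it just collects what it is given.
class TextOutputStream
{
    char[] collected;

    void put(in char[] buf)
    {
        collected ~= buf;
    }
}

class Point
{
    int x, y;

    this(int x, int y) { this.x = x; this.y = y; }

    // Delegate-based overload: the object pushes text into the sink.
    void writeTo(scope Sink put) const
    {
        put("(");
        put(x.to!string);
        put(", ");
        put(y.to!string);
        put(")");
    }

    // Stream overload: forwards to the delegate version via &tos.put.
    void writeTo(TextOutputStream tos) const
    {
        writeTo(&tos.put);
    }
}

unittest
{
    auto tos = new TextOutputStream;
    auto p = new Point(3, 4);
    p.writeTo(tos);
    assert(tos.collected == "(3, 4)");
}
```

The delegate overload does the work, so a caller with only a function or lambda needs no stream object, while a stream still plugs in with one line.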
I'd agree with the delegate idea if we established that UTF-8 is
favored compared to all other formats.
D seems to favor UTF-8 -- it is the default type for string literals. I
don't think I've ever used dchar, and I usually only use wchar to talk
to Win32 functions when required.
The question I'd ask is -- how common are the cases where versions other
than char[] would be more convenient?
I don't know. I think Asian-language users might give a salient answer.
Andrei