Steven Schveighoffer wrote:
On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
Steven Schveighoffer wrote:
On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
I think the best option for toString is to take an output range and
write to it. (The sink is a simplified range.)
Bad idea...
A range only makes sense as a struct, not an interface/object. I'll
tell you why: performance.
You are right. If range interfaces accommodate block transfers, this
problem may be addressed. I agree that one virtual call per character
output would be overkill. (I seem to recall it's one of the reasons
why C++'s iostreams are so inefficient.)
IIRC, I don't think C++ iostreams use polymorphism
Oh yes they do. (Did you even google?) Virtual multiple inheritance, the
works.
http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/
, and I don't think
they use the "one char at a time" method.
Well they do offer one char at a time and also a block transfer.
http://msdn.microsoft.com/en-us/library/760t8w1z%28VS.80%29.aspx
I'm not sure how the heck but they still manage to call one virtual
method per char, otherwise they'd be plenty fast, which they aren't. I
seem to recall write() has a default implementation that calls put() in
a loop or something. It's not a topic that I want to study closely.
iostreams suck; why spend time learning the quirks of a broken design?
Ranges are special in two respects:
1. They are foreachable. I think everyone agrees that calling 2
interface functions per loop iteration is much lower performing than
using opApply, which calls one delegate function per loop. My
recommendation -- use opApply when dealing with polymorphism. I
don't think there's a way around this.
2. They are useful for passing to std.algorithm. But std.algorithm
is template-interfaced. No need for using interfaces because the
correct instantiation will be chosen.
If you are intending to add a streaming module that uses ranges,
would it not be templated for the range type as std.algorithm is? If
not, the next logical choice is a delegate, which requires no vtable
lookup. Using an interface is just asking for a performance penalty
for not much gain.
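To make point 1 concrete, here is a minimal sketch of iterating through an interface with opApply. CharSource, ArraySource and countChars are made-up names for illustration only, not anything in Phobos. The foreach costs one virtual call to opApply, and after that one delegate call per element.

```d
// Hypothetical names throughout; this is not an existing library API.
interface CharSource
{
    // The compiler passes the foreach body in as dg.
    int opApply(scope int delegate(ref dchar) dg);
}

final class ArraySource : CharSource
{
    private dchar[] data;

    this(dchar[] data) { this.data = data; }

    int opApply(scope int delegate(ref dchar) dg)
    {
        foreach (ref c; data)
        {
            // A nonzero result means the loop body wants to stop early.
            if (auto r = dg(c))
                return r;
        }
        return 0;
    }
}

// Works on any CharSource without being a template.
size_t countChars(CharSource src)
{
    size_t n;
    foreach (dchar c; src)
        ++n;
    return n;
}
```

A range-style interface would instead need separate virtual calls for the emptiness test, the element access and the advance on every iteration.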
I think the cost of calling through the delegate is roughly the same
as a virtual call.
Not exactly. I think you are right that struct member calls are faster
than delegates, but only slightly. The difference being that a struct
member call does not need to load the function address from the stack,
it can hard-code the address directly.
However, virtual calls have to be lower performing because you are doing
two indirections, one to the class vtable, then one to the function
address itself. Plus those two locations are most likely located on the
heap, not the stack, and so may not be in the cache.
I think the only way to figure this out is to measure. For one thing I disagree
with the comment about the cache - a vtable is quite likely to be warm
after a couple of calls.
I know one thing - Walter's old format function used delegates and it
was unusably slow.
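In the spirit of measuring, a rough sketch of how the two call styles could be timed against each other. ISink, CountingSink and compareCalls are hypothetical names; the only library call assumed is benchmark from std.datetime.stopwatch. The numbers depend heavily on compiler, flags and hardware, so treat this as a starting point, not as evidence for either side.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;

// Hypothetical one-character sink, used only for this measurement.
interface ISink
{
    void put(char c);
}

final class CountingSink : ISink
{
    size_t count;
    void put(char c) { ++count; }
}

// Returns the total time for `iterations` calls made through the
// interface (index 0) and through a delegate (index 1).
Duration[2] compareCalls(uint iterations)
{
    auto obj = new CountingSink;
    ISink viaInterface = obj;                   // dispatched through the vtable
    void delegate(char) viaDelegate = &obj.put; // context pointer + function pointer

    void callInterface() { viaInterface.put('x'); }
    void callDelegate() { viaDelegate('x'); }

    return benchmark!(callInterface, callDelegate)(iterations);
}
```

Something like `auto times = compareCalls(10_000_000);` followed by printing both durations would give a first impression; an optimizing build may inline or devirtualize parts of this, so the generated code is worth a look before drawing conclusions.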
x.toString(outputRange, format)
and
x.toString(&outputRange.sink, format)
is pretty darn minimal, and if outputRange is an interface or
object, this saves a virtual call per buffer write. Plus the second
form is more universal, you can pass any delegate, and not have to
use a range type to wrap a delegate.
Don't fall into the "OOP newbie" trap -- where just because you've
found a new concept that is amazing, you want to use it for
everything. I say this because I've seen in the past where someone
discovers the power of OOP and then wants to use it for everything,
when in some cases, it's overkill. Just look at some Java "classes"...
There is no need to worry that I'll fall into at least that particular
OOP newbie trap.
What I think we should do is define a text output interface that
allows writing individual characters of all widths and also arrays of
all widths. That would be a universal means for text output.
interface TextOutputStream {
void put(dchar); // also accommodates char and wchar
void put(in char[]);
void put(in wchar[]);
void put(in dchar[]);
}
The toString method (re-baptized as toStream) would take such an
interface. Better ideas are always welcome. Perhaps I'm falling into
another OOP newbie trap! (Seriously!)
This still fits within a single function, which takes one of the 3
widths (pick one, they can all be translated to each other):
void put(in char[] str)
{
foreach(dchar dc; str)
{
put((&dc)[0..1]);
}
}
Note that you probably want to build a buffer of dchars instead of
putting one at a time, but you get the idea.
I don't get the idea. I'm seeing one virtual call per character.
Also, putting a single character is probably pretty uncommon, but can be
handled in a similar fashion.
I'm not sure how uncommon outputting a single character really is, but it
may be good to discourage it so as not to foster slow code.
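For reference, the buffered variant hinted at earlier ("build a buffer of dchars") might look roughly like this. DcharSink, putUtf8 and CollectingSink are hypothetical names, and the buffer size of 64 is an arbitrary assumption. The sink sees one virtual call per filled buffer, not one per character.

```d
// Hypothetical sink: its only virtual method takes a whole block.
interface DcharSink
{
    void put(in dchar[] block);
}

// Decode UTF-8 into a fixed stack buffer and flush it block by block.
void putUtf8(DcharSink sink, in char[] str)
{
    dchar[64] buf; // arbitrary size, chosen for illustration
    size_t n;
    foreach (dchar dc; str) // foreach decodes UTF-8 into code points
    {
        buf[n++] = dc;
        if (n == buf.length)
        {
            sink.put(buf[0 .. n]);
            n = 0;
        }
    }
    if (n > 0)
        sink.put(buf[0 .. n]); // flush the remainder
}

// Small sink that records what it receives and how often it is called.
final class CollectingSink : DcharSink
{
    dchar[] text;
    size_t calls;

    void put(in dchar[] block)
    {
        text ~= block.dup;
        ++calls;
    }
}
```

With this, a short string costs a single call to put, and a 100-character string costs two.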
That being said, one other point that makes all this moot is -- toString
is for debugging, not for general purpose. We don't need to support
everything that is possible. You should be able to say "hey, toString
only accepts char[], deal." Of course, you could substitute wchar[] or
dchar[], but I think by far char[] is the most common (and is the
default type for string literals).
I was hoping we could elevate the usefulness of toString a bit.
That's not to say there is no reason to have a TextOutputStream object.
Such a thing is perfectly usable for a toString which takes a char[]
delegate sink, just pass &put. In fact, there could be a default
toString function in Object that does just that:
class Object
{
...
void toString(void delegate(in char[] buf) put, string fmt) const
{}
void toString(TextOutputStream tos, string fmt) const
{ toString(&tos.put, fmt); }
}
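A self-contained sketch of the same arrangement outside of Object, to show that the two overloads compose. Everything here is an assumption made for illustration: Point and StringStream are made-up types, the method is called toStream (the name floated earlier) so it does not collide with the existing Object.toString, and the interface is cut down to the char[] overload only.

```d
import std.conv : to;

// Hypothetical sink type: one delegate taking a block of UTF-8.
alias Sink = void delegate(in char[] buf);

// Cut-down version of the proposed interface, char[] only.
interface TextOutputStream
{
    void put(in char[] buf);
}

class Point
{
    int x, y;

    this(int x, int y) { this.x = x; this.y = y; }

    // Primary form: write the pieces straight to the sink.
    // fmt is accepted but ignored in this sketch.
    void toStream(scope Sink put, string fmt) const
    {
        put("(");
        put(x.to!string);
        put(", ");
        put(y.to!string);
        put(")");
    }

    // Convenience form: forward to the delegate overload.
    void toStream(TextOutputStream tos, string fmt) const
    {
        toStream(&tos.put, fmt);
    }
}

// Trivial stream that appends everything to an array.
final class StringStream : TextOutputStream
{
    char[] data;
    void put(in char[] buf) { data ~= buf; }
}
```

Either `p.toStream(stream, "")` or `p.toStream((in char[] s) { acc ~= s; }, "")` ends up in the same code path, so a class author only has to write the delegate version.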
I'd agree with the delegate idea if we established that UTF-8 is favored
compared to all other formats.
Andrei