Re: Semantics of toString

Andrei Alexandrescu Thu, 12 Nov 2009 08:50:36 -0800

Steven Schveighoffer wrote:

On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:
Steven Schveighoffer wrote:
On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:
I think the best option for toString is to take an output range and write to it. (The sink is a simplified range.)
 Bad idea...
A range only makes sense as a struct, not an interface/object. I'll tell you why: performance.
You are right. If range interfaces accommodate block transfers, this problem may be addressed. I agree that one virtual call per character output would be overkill. (I seem to recall it's one of the reasons why C++'s iostreams are so inefficient.)
IIRC, I don't think C++ iostreams use polymorphism

Oh yes they do. (Did you even google?) Virtual multiple inheritance, the works.


http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/

, and I don't think they use the "one char at a time" method.


Well they do offer one char at a time and also a block transfer.

http://msdn.microsoft.com/en-us/library/760t8w1z%28VS.80%29.aspx

I'm not sure how the heck but they still manage to call one virtual method per char, otherwise they'd be plenty fast, which they aren't. I seem to recall write() has a default implementation that calls put() in a loop or something. It's not a topic that I want to study closely. iostreams suck, why spend time on learning the quirks of a broken design?

Ranges are special in two respects:
1. They are foreachable. I think everyone agrees that calling 2 interface functions per loop iteration is much lower performing than using opApply, which calls one delegate function per loop. My recommendation -- use opApply when dealing with polymorphism. I don't think there's a way around this.
2. They are useful for passing to std.algorithm. But std.algorithm is template-interfaced. No need for using interfaces because the correct instantiation will be chosen.
If you are intending to add a streaming module that uses ranges, would it not be templated for the range type as std.algorithm is? If not, the next logical choice is a delegate, which requires no vtable lookup. Using an interface is just asking for a performance penalty for not much gain.
I think the cost of calling through the delegate is roughly the same as a virtual call.
Not exactly. I think you are right that struct member calls are faster than delegates, but only slightly. The difference being that a struct member call does not need to load the function address from the stack, it can hard-code the address directly.
However, virtual calls have to be lower performing because you are doing two indirections, one to the class vtable, then one to the function address itself. Plus those two locations are most likely located on the heap, not the stack, and so may not be in the cache.

I think the only way to figure is to measure. For one thing I disagree with the comment about the cache - a vtable is quite likely to be warm after a couple of calls.
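A rough way to run that measurement is sketched below. It is not from the thread: the Sink interface, CountingSink, StructSink and the drive* helpers are hypothetical names, and std.datetime.stopwatch comes from current Phobos, which postdates this discussion. An optimizer may inline or devirtualize some of these calls, so the timings are only indicative.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical sinks for illustration: each one only counts the bytes it receives.
interface Sink
{
    void put(const(char)[] s);
}

final class CountingSink : Sink
{
    size_t total;
    void put(const(char)[] s) { total += s.length; }
}

struct StructSink
{
    size_t total;
    void put(const(char)[] s) { total += s.length; }
}

// Call through an interface reference (virtual dispatch).
void driveInterface(Sink sink, size_t n)
{
    foreach (i; 0 .. n)
        sink.put("x");
}

// Call through a delegate (context pointer plus function pointer).
void driveDelegate(scope void delegate(const(char)[]) sink, size_t n)
{
    foreach (i; 0 .. n)
        sink("x");
}

// Call a struct member directly (statically bound).
void driveStruct(ref StructSink sink, size_t n)
{
    foreach (i; 0 .. n)
        sink.put("x");
}

// Time a single run of the given piece of work.
Duration timeIt(scope void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);
    work();
    return sw.peek();
}

void main()
{
    enum size_t n = 10_000_000;
    auto obj = new CountingSink;
    StructSink st;
    size_t viaDelegate;
    void count(const(char)[] s) { viaDelegate += s.length; }

    auto tInterface = timeIt(() { driveInterface(obj, n); });
    auto tDelegate = timeIt(() { driveDelegate(&count, n); });
    auto tStruct = timeIt(() { driveStruct(st, n); });

    // All three paths must have done the same amount of work.
    assert(obj.total == n && viaDelegate == n && st.total == n);
    writefln("interface: %s", tInterface);
    writefln("delegate:  %s", tDelegate);
    writefln("struct:    %s", tStruct);
}
```

Results will vary with compiler, flags and hardware, so it is worth running each variant several times before drawing conclusions.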

I know one thing - Walter's old format function used delegates and it was unusably slow.

x.toString(outputRange, format)
 and
 x.toString(&outputRange.sink, format)
is pretty darn minimal, and if outputRange is an interface or object, this saves a virtual call per buffer write. Plus the second form is more universal, you can pass any delegate, and not have to use a range type to wrap a delegate.
Don't fall into the "OOP newbie" trap -- where just because you've found a new concept that is amazing, you want to use it for everything. I say this because I've seen in the past where someone discovers the power of OOP and then wants to use it for everything, when in some cases, it's overkill. Just look at some Java "classes"...
There is no need to worry that I'll fall into at least that particular OOP newbie trap.
What I think we should do is define a text output interface that allows writing individual characters of all widths and also arrays of all widths. That would be a universal means for text output.
interface TextOutputStream {
     void put(dchar); // also accommodates char and wchar
     void put(in char[]);
     void put(in wchar[]);
     void put(in dchar[]);
}
The toString method (re-baptized as toStream) would take such an interface. Better ideas are always welcome. Perhaps I'm falling into another OOP newbie trap! (Seriously!)
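As a sketch of what an implementation of that interface could look like, the class below collects everything as UTF-8 in memory. Utf8Buffer is a hypothetical name, not something proposed in the thread, and the wide-array overloads simply re-encode one code point at a time for brevity.

```d
import std.utf : encode;

// The interface as proposed above.
interface TextOutputStream
{
    void put(dchar c);      // also accommodates char and wchar
    void put(in char[] s);
    void put(in wchar[] s);
    void put(in dchar[] s);
}

// Hypothetical implementation: stores all output as UTF-8.
final class Utf8Buffer : TextOutputStream
{
    char[] data;

    void put(dchar c)
    {
        char[4] tmp;
        immutable len = encode(tmp, c); // UTF-8 encode one code point
        data ~= tmp[0 .. len];
    }

    // UTF-8 input needs no transcoding: a single block append.
    void put(in char[] s) { data ~= s; }

    // UTF-16 and UTF-32 input is decoded and re-encoded per code point.
    void put(in wchar[] s) { foreach (dchar c; s) put(c); }
    void put(in dchar[] s) { foreach (dchar c; s) put(c); }
}

void main()
{
    auto buf = new Utf8Buffer;
    string narrow = "abc";
    wstring wide = "de"w;
    dchar accented = '\u00E9'; // two bytes in UTF-8

    buf.put(narrow);
    buf.put(wide);
    buf.put(accented);

    assert(buf.data.length == 7);
    assert(buf.data[0 .. 5] == "abcde");
}
```

Only the char[] overload is a true block transfer here; a caller holding a TextOutputStream reference still pays one virtual call per put, whatever the array width.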
This still fits within a single function, which takes one of the 3 widths (pick one, they can all be translated to each other):
void put(in char[] str)
{
  foreach(dchar dc; str)
  {
     put((&dc)[0..1]);
  }
}
Note that you probably want to build a buffer of dchars instead of putting one at a time, but you get the idea.


I don't get the idea. I'm seeing one virtual call per character.
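To make the difference concrete, here is a sketch contrasting the per-character loop quoted above with the buffered variant it alludes to. DcharSink, putPerChar and putBuffered are hypothetical names, the 64-element buffer size is arbitrary, and the call counter stands in for the cost of the virtual calls.

```d
// Hypothetical sink: put is virtual (class methods are virtual by default)
// and counts how often it is invoked.
class DcharSink
{
    size_t calls;
    dchar[] data;

    void put(in dchar[] s)
    {
        ++calls;
        data ~= s;
    }
}

// The quoted approach: one put per decoded code point.
void putPerChar(DcharSink sink, in char[] str)
{
    foreach (dchar dc; str)
        sink.put((&dc)[0 .. 1]);
}

// Buffered variant: decode into a stack buffer and flush it in blocks.
void putBuffered(DcharSink sink, in char[] str)
{
    dchar[64] buf;
    size_t used;
    foreach (dchar dc; str)
    {
        buf[used++] = dc;
        if (used == buf.length)
        {
            sink.put(buf[0 .. used]);
            used = 0;
        }
    }
    if (used > 0)
        sink.put(buf[0 .. used]); // flush the remainder
}

void main()
{
    string text = "hello, world";

    auto perChar = new DcharSink;
    putPerChar(perChar, text);

    auto buffered = new DcharSink;
    putBuffered(buffered, text);

    assert(perChar.data == buffered.data);        // same output
    assert(perChar.calls == perChar.data.length); // one call per code point
    assert(buffered.calls == 1);                  // text fits in one block
}
```

With the buffer, the number of calls drops from one per code point to roughly one per 64 code points, at the cost of a small fixed stack allocation.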

Also, putting a single character is probably pretty uncommon, but can be handled in a similar fashion.

I'm not sure about the uncommonality of outputting one character, but it may be good to discourage it just to not foster slow code.

That being said, one other point that makes all this moot is -- toString is for debugging, not for general purpose. We don't need to support everything that is possible. You should be able to say "hey, toString only accepts char[], deal." Of course, you could substitute wchar[] or dchar[], but I think by far char[] is the most common (and is the default type for string literals).


I was hoping we could elevate the usefulness of toString a bit.

That's not to say there is no reason to have a TextOutputStream object. Such a thing is perfectly usable for a toString which takes a char[] delegate sink, just pass &put. In fact, there could be a default toString function in Object that does just that:
class Object
{
   ...
   void toString(void delegate(in char[] buf) put, string fmt) const
   {}
   void toString(TextOutputStream tos, string fmt) const
   { toString(&tos.put, fmt); }
}

I'd agree with the delegate idea if we established that UTF-8 is favored compared to all other formats.
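For reference, a minimal sketch of the delegate-sink form on a user-defined class, assuming UTF-8 is the favored encoding. Point, Sink and render are hypothetical names, and the fmt argument is accepted but ignored here.

```d
import std.conv : to;

// Hypothetical alias for a UTF-8 sink delegate.
alias Sink = void delegate(in char[] buf);

class Point
{
    int x, y;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Sink-based toString: writes pieces to the sink instead of
    // building and returning one string. fmt is unused in this sketch.
    void toString(scope Sink sink, string fmt) const
    {
        sink("(");
        sink(x.to!string);
        sink(", ");
        sink(y.to!string);
        sink(")");
    }
}

// Any delegate works as a sink; this one appends to a local buffer.
string render(const Point p)
{
    char[] buf;
    p.toString((in char[] s) { buf ~= s; }, "");
    return buf.idup;
}

void main()
{
    assert(render(new Point(3, 4)) == "(3, 4)");
}
```

The same toString could be handed a delegate that writes to a file or a socket, which is the sense in which the delegate form is more universal than a concrete range or stream type.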



Andrei
