On 12/28/10 11:39 AM, Michel Fortin wrote:
On 2010-12-28 02:02:29 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> said:

I've put together over the past days an embryonic streaming interface.
It separates transport from formatting, input from output, and
buffered from unbuffered operation.

http://erdani.com/d/phobos/std_stream2.html

There are a number of questions interspersed. It would be great to
start a discussion using that design as a baseline. Please voice any
related thoughts - thanks!

One of my concerns is the number of virtual calls required in actual
usage, because virtual calls prevent inlining. I know it's necessary to
have virtual calls in the formatter to serialize objects (which requires
double dispatch), but in your design the underlying transport layer too
wants to be called virtually. How many virtual calls will be necessary
to serialize an array of 10 objects, each having 10 fields? Let's see:

10 calls to Formatter.put(Object)
+ 10 calls to Object.toString(Formatter)
+ 10 objects * 10 calls per object to Formatter.put(<some field type>)
+ 10 objects * 10 calls per object to UnbufferedOutputTransport.write(in ubyte[])

Total: 220 virtual calls, for 10 objects with 10 fields each. Most of
the functions called virtually here are pretty trivial and would
normally be inlined if the context allowed it. Assuming those fields are
4-byte integers and are stored as is in the stream, the result will be
between 400 and 500 bytes long once we add the object's class name. We
end up having almost 1 virtual call for every two bytes of emitted data;
is this overhead really acceptable? How much inlining does it prevent?
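
For reference, the tally above can be reproduced in a few lines of D. This is only a sketch of the arithmetic; the helper names (virtualCalls, payloadBytes) are made up for illustration, and the 4-byte field size is the assumption stated in the text.

```d
// Hypothetical helper: adds up the four terms of the breakdown above.
size_t virtualCalls(size_t objects, size_t fieldsPerObject)
{
    size_t calls = 0;
    calls += objects;                    // Formatter.put(Object)
    calls += objects;                    // Object.toString(Formatter)
    calls += objects * fieldsPerObject;  // Formatter.put(<some field type>)
    calls += objects * fieldsPerObject;  // UnbufferedOutputTransport.write
    return calls;
}

// Hypothetical helper: raw field payload only; class names add to this.
size_t payloadBytes(size_t objects, size_t fieldsPerObject, size_t bytesPerField)
{
    return objects * fieldsPerObject * bytesPerField;
}

unittest
{
    assert(virtualCalls(10, 10) == 220);
    assert(payloadBytes(10, 10, 4) == 400);
}
```

With 400 to 500 bytes of output for 220 calls, that works out to roughly 1.8 to 2.3 bytes per virtual call.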

That overhead may well be quite large.

My second concern is that your approach to Formatter is too rigid. For
instance, what if an object needs to write different fields depending on
the output format, or write them in a different order? It'll have to
check at runtime which kind of formatter it got (through casts
probably). Or what if I have a formatter that wants to expose an XML
tree instead of bytes? It'll need a totally different interface that
deals with XML elements, attributes, and character data, not bytes.
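
A minimal sketch of the runtime check described above, assuming a made-up formatter hierarchy (Formatter, TextFormatter, CsvFormatter and Point are all hypothetical and are not the std_stream2 API). It only illustrates how an object would have to probe the formatter's dynamic type with a cast:

```d
import std.conv : to;

// Hypothetical formatter hierarchy, for illustration only.
interface Formatter
{
    void put(string text);
}

class TextFormatter : Formatter
{
    string output;
    void put(string text) { output ~= text; }
}

class CsvFormatter : Formatter
{
    string output;
    void put(string text) { output ~= text ~ ","; }
}

class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }

    void writeTo(Formatter f)
    {
        // A class cast yields null when the dynamic type does not match.
        if (cast(CsvFormatter) f !is null)
        {
            // In this made-up example the CSV layout wants y before x.
            f.put(y.to!string);
            f.put(x.to!string);
        }
        else
        {
            f.put("(" ~ x.to!string ~ ", " ~ y.to!string ~ ")");
        }
    }
}

unittest
{
    auto csv = new CsvFormatter;
    new Point(1, 2).writeTo(csv);
    assert(csv.output == "2,1,");
}
```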

I think that's a very rare situation. When you pick a certain formatter, you commit to a certain representation, period. It's poor design to have the object object (sic) to that representation.

To some extent representation can be tweaked via format specifiers, which are a language spoken by both the formatter and the formatted.
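
As a small illustration of that point, here is how format specifiers change the representation of one value using the existing std.format module. This is a sketch with std.format as it exists in Phobos, not with the proposed streaming interface, and the function name renderings is made up:

```d
import std.format : format;

// The same value, three representations chosen by the specifier alone.
string[] renderings(int value)
{
    return [
        format("%s", value),    // default representation
        format("%x", value),    // lowercase hexadecimal
        format("%08b", value),  // binary, zero-padded to 8 digits
    ];
}

unittest
{
    assert(renderings(255) == ["255", "ff", "11111111"]);
}
```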

So because of all this virtual dispatch and all this rigidity, I think
Formatter needs to be rethought a little. My preference obviously goes
to statically-typed formatters.

It's heartwarming to see so much interest in static polymorphism. Only a couple of years ago I would've had trouble convincing people of that; now I need to preach the advantages of dynamic polymorphism.

But what I'd like to see is something
like this:

interface Serializable(F) {
  void writeTo(F formatter);
}

Let me make sure I understand correctly. So when I define a class I commit to its possible representations? Doesn't seem good design to me. What if I later come up with a new Formatter? I'd need to change my entire class hierarchy too.

Any object can implement a serialization for a given formatter by
implementing the interface above parametrized with the formatter type.

If only one formatter were allowed, that would be even worse. But you can allow several:

class Widget : Serializable!Json, Serializable!Binary {
  ...
}

Sorry, I think this is poor design.

(Struct types could have a similar writeTo function too, they just don't
need to implement an interface.) The formatter type can expose the
interface it wants and use or not use virtual functions. It could be an
XML writer interface (something with openElement, writeCharacterData,
closeElement, etc.), it could be a JSON interface, or it could even be your
Formatter as proposed; we just wouldn't be limited by it.
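
To make the idea concrete, here is one way the pattern could look. This is a sketch under assumptions: JsonWriter and XmlWriter are hypothetical formatter types invented for this example (only the method names openElement, writeCharacterData and closeElement come from the text, and their signatures are guesses), and they are final classes so that writeTo can keep the by-value parameter from the interface above.

```d
import std.conv : to;

// The interface from the message, parameterized on the formatter type.
interface Serializable(F)
{
    void writeTo(F formatter);
}

// Hypothetical formatters exposing unrelated, non-virtual interfaces.
final class JsonWriter
{
    string output;
    void field(string name, int value)
    {
        if (output.length) output ~= ",";
        output ~= `"` ~ name ~ `":` ~ value.to!string;
    }
}

final class XmlWriter
{
    string output;
    void openElement(string name) { output ~= "<" ~ name ~ ">"; }
    void writeCharacterData(string text) { output ~= text; }
    void closeElement(string name) { output ~= "</" ~ name ~ ">"; }
}

// One overload of writeTo per supported formatter.
class Widget : Serializable!JsonWriter, Serializable!XmlWriter
{
    int width = 3;

    void writeTo(JsonWriter w) { w.field("width", width); }

    void writeTo(XmlWriter w)
    {
        w.openElement("width");
        w.writeCharacterData(width.to!string);
        w.closeElement("width");
    }
}

unittest
{
    auto json = new JsonWriter;
    new Widget().writeTo(json);
    assert(json.output == `"width":3`);
}
```

Note that supporting a third formatter means adding another interface and overload to Widget, which is exactly the coupling questioned in the reply.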

So basically, I'm not proposing you dump Formatter, just that you make
it part of a reusable pattern for
formatting/serializing/unformatting/unserializing things using other
things than your Formatter interface.

I may be misunderstanding, but to me it seems that this design brings more problems than it solves.

As for the transport layer, I don't mind it much if it's an interface.
Unlike Formatter, nothing prevents you from creating a 'final' class and
using it directly when you can to avoid virtual dispatch. This doesn't
work so well for Formatter however because it requires double dispatch
when it encounters a class, which washes away all static information.
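
A sketch of the 'final' class point, assuming a made-up OutputTransport interface with a write(in ubyte[]) method loosely modeled on the call listed earlier. MemoryTransport, sendDynamic and sendStatic are hypothetical names, and whether a given compiler actually binds or inlines the direct call is implementation-dependent:

```d
// Hypothetical transport interface, loosely modeled on the proposal.
interface OutputTransport
{
    size_t write(in ubyte[] data);
}

// 'final' means no subclass can override write.
final class MemoryTransport : OutputTransport
{
    ubyte[] buffer;

    size_t write(in ubyte[] data)
    {
        buffer ~= data;  // copies the bytes into the buffer
        return data.length;
    }
}

// Call through the interface: dispatched virtually.
size_t sendDynamic(OutputTransport t, in ubyte[] data)
{
    return t.write(data);
}

// Call through the final class: the target is known statically,
// so the compiler is free to bind it directly and inline it.
size_t sendStatic(MemoryTransport t, in ubyte[] data)
{
    return t.write(data);
}

unittest
{
    auto t = new MemoryTransport;
    ubyte[] data = [1, 2, 3];
    assert(sendDynamic(t, data) == 3);
    assert(sendStatic(t, data) == 3);
    assert(t.buffer.length == 6);
}
```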

I agree that Transport is fine with the dynamic interface.


Andrei
