Am 03.08.2014 17:14, schrieb Andrei Alexandrescu:
On 8/3/14, 2:38 AM, Sönke Ludwig wrote:
A few thoughts based on my experience with vibe.data.json:

1. No decoding of strings appears to mean that "Value" also always
contains encoded strings. This seems the be a leaky and also error prone
leaky abstraction. For the token stream, performance should be top
priority, so it's okay to not decode there, but "Value" is a high level
abstraction of a JSON value, so it should really hide all implementation
details of the storage format.

Nonono. I think there's a confusion. The input strings are not UTF
decoded for the simple need there's no need (all tokenization decisions
are taken on the basis of ASCII characters/code units). The
backslash-prefixed characters are indeed decoded.

An optimization I didn't implement yet is to use slices of the input
wherever possible (when the input is string, immutable(byte)[], or
immutable(ubyte)[]). That will reduce allocations considerably.

Ah okay, *phew* ;) But in that case I'd actually think about leaving off the backslash decoding in the low level parser, so that slices could be used for immutable inputs in all cases - maybe with a name of "rawString" for the stored data and an additional "string" property that decodes on the fly. This may come in handy when the first comparative benchmarks together with rapidjson and the like are done.

2. Algebraic is a good choice for its generic handling of operations on
the contained types (which isn't exposed here, though). However, a
tagged union type in my experience has quite some advantages for
usability. Since adding a type tag possibly affects the interface in a
non-backwards compatible way, this should be evaluated early on.

There's a public opCast(Payload) that gives the end user access to the
Payload inside a Value. I forgot to add documentation to it.

I see. Suppose that opDispatch would be dropped, would anything speak against "alias this"ing _payload to avoid the need for the manually defined operators?

What advantages are to a tagged union? (FWIW: to me Algebraic and
Variant are also tagged unions, just that the tags are not 0, 1, ..., n.
That can be easily fixed for Algebraic by defining operations to access
the index of the currently-stored type.)

The two major points are probably that it's possible to use "final switch" on the type tag if it's an enum, and the type id can be easily stored in both integer and string form (which is not as conveniently possible with a TypeInfo).

(...)

The way I see it, good work on tagged unions must be either integrated
within std.variant (either by modifying Variant/Algebraic or by adding
new types to it). I am very strongly opposed to adding a tagged union
type only for JSON purposes, which I'd consider essentially a usability
bug in std.variant, the opposite of dogfooding, etc.

Definitely agree there.

An enum based tagged union design also currently has the unfortunate property that the order of enum values and that of the accepted types must be defined consistently, or bad things will happen. Supporting UDAs on enum values would be a possible direction to fix this:

        enum JsonType {
                @variantType!string string,
                @variantType!(JsonValue[]) array,
                @variantType!(JsonValue[string]) object
        }
        alias JsonValue = TaggedUnion!JsonType;

But then there are obviously still issues with cyclic type references. So, anyway, this is something that still requires some thought. It could also be designed in a way that is backwards compatible with a pure "Algebraic", so it shouldn't be a blocker for the current design.

Reply via email to