Am 03.08.2014 17:14, schrieb Andrei Alexandrescu:
On 8/3/14, 2:38 AM, Sönke Ludwig wrote:
A few thoughts based on my experience with vibe.data.json:
1. No decoding of strings appears to mean that "Value" also always
contains encoded strings. This seems the be a leaky and also error prone
leaky abstraction. For the token stream, performance should be top
priority, so it's okay to not decode there, but "Value" is a high level
abstraction of a JSON value, so it should really hide all implementation
details of the storage format.
Nonono. I think there's a confusion. The input strings are not UTF
decoded for the simple need there's no need (all tokenization decisions
are taken on the basis of ASCII characters/code units). The
backslash-prefixed characters are indeed decoded.
An optimization I didn't implement yet is to use slices of the input
wherever possible (when the input is string, immutable(byte)[], or
immutable(ubyte)[]). That will reduce allocations considerably.
Ah okay, *phew* ;) But in that case I'd actually think about leaving off
the backslash decoding in the low level parser, so that slices could be
used for immutable inputs in all cases - maybe with a name of
"rawString" for the stored data and an additional "string" property that
decodes on the fly. This may come in handy when the first comparative
benchmarks together with rapidjson and the like are done.
2. Algebraic is a good choice for its generic handling of operations on
the contained types (which isn't exposed here, though). However, a
tagged union type in my experience has quite some advantages for
usability. Since adding a type tag possibly affects the interface in a
non-backwards compatible way, this should be evaluated early on.
There's a public opCast(Payload) that gives the end user access to the
Payload inside a Value. I forgot to add documentation to it.
I see. Suppose that opDispatch would be dropped, would anything speak
against "alias this"ing _payload to avoid the need for the manually
defined operators?
What advantages are to a tagged union? (FWIW: to me Algebraic and
Variant are also tagged unions, just that the tags are not 0, 1, ..., n.
That can be easily fixed for Algebraic by defining operations to access
the index of the currently-stored type.)
The two major points are probably that it's possible to use "final
switch" on the type tag if it's an enum, and the type id can be easily
stored in both integer and string form (which is not as conveniently
possible with a TypeInfo).
(...)
The way I see it, good work on tagged unions must be either integrated
within std.variant (either by modifying Variant/Algebraic or by adding
new types to it). I am very strongly opposed to adding a tagged union
type only for JSON purposes, which I'd consider essentially a usability
bug in std.variant, the opposite of dogfooding, etc.
Definitely agree there.
An enum based tagged union design also currently has the unfortunate
property that the order of enum values and that of the accepted types
must be defined consistently, or bad things will happen. Supporting UDAs
on enum values would be a possible direction to fix this:
enum JsonType {
@variantType!string string,
@variantType!(JsonValue[]) array,
@variantType!(JsonValue[string]) object
}
alias JsonValue = TaggedUnion!JsonType;
But then there are obviously still issues with cyclic type references.
So, anyway, this is something that still requires some thought. It could
also be designed in a way that is backwards compatible with a pure
"Algebraic", so it shouldn't be a blocker for the current design.