Re: std.data.json formal review

deadalnix via Digitalmars-d Tue, 11 Aug 2015 15:25:40 -0700

On Tuesday, 11 August 2015 at 21:06:24 UTC, Sönke Ludwig wrote:

See http://s-ludwig.github.io/std_data_json/stdx/data/json/value/JSONValue.payload.html
The question whether each field is "really" needed obviously depends on the application. However, the biggest type is BigInt that, from a quick look, contains a dynamic array + a bool field, so it's not as compact as it could be, but also not really large. There is also an additional Location field that may sometimes be important for good error messages and the like and sometimes may be totally unneeded.

Urg. Looks like BigInt should steal a bit somewhere instead of having a bool like this. That is not really your lib's fault, but that's quite a heavy cost.

Consider this: if the struct fits into 2 registers, it will be passed around as such rather than in memory. That is a significant difference, for BigInt itself and, by proxy, for the JSON library.

Putting the BigInt thing aside, it seems like the biggest field in there is an array of JSONValues or a string. For the string, you can artificially shorten the length by 3 bits and store a tag in them. That still allows absurdly large strings. For the JSONValue case, the alignment of the pointer is such that you can steal 3 bits from there. Or, as for the string, the length can be used.

It seems very realizable to me to have the JSONValue struct fit into 2 registers, granted the tag fits in 3 bits (8 different types).
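
To make the idea concrete, here is a rough sketch of the length-stealing variant for the string case only. All names (Kind, PackedValue, lenAndTag) are made up for illustration and are not the std.data.json layout. I used the length rather than the pointer so the pointer stays intact.

```d
// Sketch only: Kind and PackedValue are hypothetical names,
// not anything from std.data.json.
enum Kind : ubyte { null_, boolean, integer, floating, text, array, object_, bigInt }

struct PackedValue
{
    enum tagBits = 3;                          // 8 kinds fit in 3 bits
    enum size_t tagMask = (1 << tagBits) - 1;

    private size_t lenAndTag;    // (length << tagBits) | kind
    private const(void)* ptr;    // payload pointer, left untouched

    this(string s)
    {
        // The length loses its top 3 bits, which still leaves huge strings.
        assert(s.length <= (size_t.max >> tagBits), "string too long to tag");
        lenAndTag = (s.length << tagBits) | Kind.text;
        ptr = s.ptr;
    }

    Kind kind() const { return cast(Kind)(lenAndTag & tagMask); }

    string text() const
    {
        assert(kind == Kind.text);
        return (cast(immutable(char)*) ptr)[0 .. lenAndTag >> tagBits];
    }
}

// Two machine words in total, tag included.
static assert(PackedValue.sizeof == 2 * size_t.sizeof);

void main()
{
    auto v = PackedValue("hello");
    assert(v.kind == Kind.text);
    assert(v.text == "hello");
}
```

Whether such a struct really travels in registers depends on the target ABI, so take the static assert as a size check and nothing more.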


I can help with that if you want to.

However, my goal when implementing this has never been to make the DOM representation as efficient as possible. The simple reason is that a DOM representation is inherently inefficient when compared to operating on the structure using either the pull parser or using a deserializer that directly converts into a static D type. IMO these should be advertised instead of trying to milk a dead cow (in terms of performance).


Indeed. Still, JSON nodes should be as lightweight as possible.

2/ As far as I can see, the elements are discriminated using typeid. An enum is preferable as the compiler would know values ahead of time and optimize based on this. It also allows use of things like final switch.
Using a tagged union like structure is definitely what I'd like to have, too. However, the main goal was to build the DOM type upon a generic algebraic type instead of using a home-brew tagged union. The reason is that it automatically makes different DOM types with a similar structure interoperable (JSON/BSON/TOML/...).

That is a great point that I hadn't considered. I'd go the other way around about it: providing a compatible typeid-based struct from the enum-tagged one for compatibility. It can even be alias this so the transition is transparent.

The transformation is not bijective, so it'd be great to get the most restrictive form (the enum) and fall back on the least restrictive one (alias this) when wanted.
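
Roughly what I have in mind, as a sketch under assumptions: Kind, Node, Generic, describe and holdsText are all hypothetical names, the node only covers a few kinds, and Phobos' Algebraic stands in for whatever generic algebraic type ends up being used.

```d
import std.variant : Algebraic;

// Sketch only: every name here is hypothetical.
enum Kind { null_, boolean, number, text }

// The typeid-discriminated form, for interop with other DOM types.
alias Generic = Algebraic!(bool, double, string);

struct Node
{
    Kind kind;
    union { bool b; double d; string s; }

    this(bool v)   { kind = Kind.boolean; b = v; }
    this(double v) { kind = Kind.number;  d = v; }
    this(string v) { kind = Kind.text;    s = v; }

    // Fallback to the less restrictive form; null becomes an empty Generic.
    Generic generic()
    {
        final switch (kind)
        {
            case Kind.null_:   return Generic.init;
            case Kind.boolean: return Generic(b);
            case Kind.number:  return Generic(d);
            case Kind.text:    return Generic(s);
        }
    }
    alias generic this;
}

// Enum path: the compiler checks that every Kind is handled.
string describe(Node n)
{
    final switch (n.kind)
    {
        case Kind.null_:   return "null";
        case Kind.boolean: return "bool";
        case Kind.number:  return "number";
        case Kind.text:    return "text";
    }
}

// typeid path: code written against the generic type keeps working.
bool holdsText(Generic g) { return g.type == typeid(string); }

void main()
{
    auto n = Node("hi");
    assert(describe(n) == "text");
    assert(holdsText(n));          // converted through alias this

    auto t = Node(true);
    assert(!holdsText(t));
}
```

Code that wants exhaustiveness checks switches on kind, and code that only knows the generic type gets a converted copy without the caller doing anything.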

Now Phobos unfortunately only has Algebraic, which not only doesn't have a type enum, but is currently also really bad at keeping static type information when forwarding function calls or operators. The only options were basically to resort to Algebraic for now, but have something that works, or to first implement an alternative algebraic type and get it accepted into Phobos, which would delay the whole process nearly indefinitely.

That's fine. Done is better than perfect. Still, API changes tend to be problematic, so we need to nail that part at least, and an enum with a fallback on a typeid-based solution seems like the best option.

Or do you perhaps mean the JSON -> deserialize -> manipulate -> serialize -> JSON approach? That definitely is not a "loser strategy"*, but yes, it is limited to applications where you have a partially fixed schema. However, arguably most applications fall into that category.


Yes.
