Link to the proxy, which I forgot to include: https://gist.github.com/bowbahdoe/eb29d172351162408eab5e4ee9d84fec
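For anyone who wants to experiment without reading the gist: the "push parser into a pull parser" inversion discussed later in the thread can also be sketched with a virtual thread (Java 21) and a bounded queue instead of a method-handle proxy. Everything below (the `JsonToken` shape, `Listener`, the stub `parse`) is an invented stand-in for illustration, not the gist's actual API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PushToPull {
    // Hypothetical token model; a real parser would emit many more kinds.
    sealed interface JsonToken {
        record BeginObject() implements JsonToken {}
        record Field(String name) implements JsonToken {}
        record EndObject() implements JsonToken {}
        record EndOfInput() implements JsonToken {}
    }

    // The push-style callback interface being inverted.
    interface Listener {
        void onToken(JsonToken token);
    }

    // Stub push parser standing in for a real one; it just emits the
    // token sequence it would produce for a document like {"id": ...}.
    static void parse(Listener listener) {
        listener.onToken(new JsonToken.BeginObject());
        listener.onToken(new JsonToken.Field("id"));
        listener.onToken(new JsonToken.EndObject());
        listener.onToken(new JsonToken.EndOfInput());
    }

    // Run the push parse on its own virtual thread; the bounded queue
    // suspends the pusher until the puller catches up.
    static BlockingQueue<JsonToken> tokens() {
        BlockingQueue<JsonToken> queue = new ArrayBlockingQueue<>(16);
        Thread.startVirtualThread(() -> parse(token -> {
            try {
                queue.put(token);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        return queue;
    }

    // Convenience for pulling: take() with the checked exception wrapped.
    static JsonToken next(BlockingQueue<JsonToken> queue) {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Pulling is then just calling `next(queue)` in a loop until `EndOfInput`; the linked gist performs the same inversion without a second thread.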
On Tue, Feb 28, 2023 at 12:16 PM Ethan McCue <et...@mccue.dev> wrote: > As an update to my character arc, I documented and wrote up an explanation > for the prototype library I was working on.[1] > > And I've gotten a good deal of feedback on reddit[2] and in private. > > I think it's relevant to the conversation here in the sense of > > - There are more of rzwitserloot's objections to read on the general > concept of JSON as a built-in.[3] > - There are a lot of well-reasoned objections to the manner in which I am > interpreting a JSON tree, as well > as objections to the usage of a tree as the core. JEP 198's current > writeup (which I know is subject to a rewrite/retraction) > presumes that an immutable tree would be the core data structure. > - The peanut gallery might be interested in a "base" to implement whatever > their take on an API should be. > > For that last category, I have a method-handle proxy written up for those > who want to try the "push parser into a pull parser" > transformation I alluded to in my first email of this thread. > > [1]: https://mccue.dev/pages/2-26-23-json > [2]: > https://www.reddit.com/r/java/comments/11cyoh1/please_try_my_json_library/ > [3]: Including one that reddit took down, but can be seen through reveddit > https://www.reveddit.com/y/rzwitserloot/?after=t1_jacpsj6&limit=1&sort=new&show=t1_jaa3x0q&removal_status=all > > On Fri, Dec 16, 2022 at 6:23 PM Ethan McCue <et...@mccue.dev> wrote: > >> Sidenote about "Project Galahad" - I know Graal uses JSON for a few >> things including a reflection-config.json. Food for thought. >> >> > the java.util.log experiment shows that trying to ‘core-librarize’ >> needs that the community at large already fulfills with third party deps >> isn’t a good move, >> >> I, personally, do not have much historical context for java.util.log. >> What feels distinct about providing a JSON API is that >> logging is an implicitly global thing.
If a JSON API doesn't fill all >> ecosystem niches, multiple can be used alongside >> each other. >> >> > The root issue with JSON is that you just can’t tell how to interpret >> any given JSON token >> >> The point where this could be an issue is numbers. Once something is >> identified as a number we can >> >> 1. Parse it immediately, using a long and falling back to a BigInteger. >> For decimals it's harder to know >> whether to use a double or BigDecimal internally. In the library I've >> been copy-pasting from to build >> a prototype, that last one is an explicit option and it defaults to >> doubles for the whole parse. >> 2. Store the string and parse it upon request. We can still model it as a >> Json.Number, but the >> work of interpreting is deferred. >> >> But in general, making a tree of JSON values doesn't particularly affect >> our ability to interpret it >> in a certain way. That interpretation is just positional. That's just as >> true when making assertions >> in the form of class structure and field types as it is when making >> assertions in the form of code.[1] >> >> record Thing(Instant a) {} >> >> // vs. >> >> Decoder.field(json, "a", a -> Instant.ofEpochSecond(Decoder.long_(a))) >> >> If anything, using a named type as a lookup key for a deserialization >> function is the less obvious >> way to do this. >> >> > I’m not sure how to square this circle >> > I don’t like the idea of shipping a non-data-binding JSON API in the >> core libs. >> >> I think the way to cube this rhombus is to find ways to like the idea of >> a non-data-binding JSON API. ¯\_(ツ)_/¯ >> >> My personal journey with that is reaching its terminus here, I think. >> >> Look on the bright side though - there are legit upsides to explicit tree >> plucking!
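The two number strategies listed in this message can be made concrete. This is a self-contained sketch for illustration only; `JsonNumber` and its method names are invented, not the prototype's actual code:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Option 2 from the message: keep the raw literal and defer interpretation.
// Option 1's long-with-BigInteger-fallback shows up in asIntegral().
final class JsonNumber {
    private final String literal; // raw token text, e.g. "42" or "1.5e10"

    JsonNumber(String literal) {
        this.literal = literal;
    }

    // Parse as a long, widening to BigInteger only on overflow.
    Object asIntegral() {
        try {
            return Long.parseLong(literal);
        } catch (NumberFormatException e) {
            return new BigInteger(literal);
        }
    }

    // The caller decides between double and BigDecimal at the use site,
    // instead of the parser deciding once for the whole document.
    double asDouble() {
        return Double.parseDouble(literal);
    }

    BigDecimal asBigDecimal() {
        return new BigDecimal(literal);
    }
}
```

Deferring like this keeps the tree faithful to the document; the cost is re-parsing the literal on each access unless the result is cached.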
>> >> Yeah, the friction per field is slightly higher, but the relative >> friction of custom types, or multiple construction methods for a >> particular type, or maintaining compatibility with >> legacy representations, or even just handling a top-level list of things >> - it's much lower. >> >> And all that complexity - that an instant is made by looking for a long >> or that it is parsed from a string in a >> particular format - it lives in Java code you can see, touch, feel and >> taste. >> >> I know "nobody does this"[2] but it's not that bad, actually. >> >> [1]: I do apologize for the code sketches consistently being "what I >> think an interaction with a tree API should look like." >> That is what I have been thinking about for a while so it's hard to >> resist. >> [2]: https://youtu.be/dOgfWXw9VrI?t=1225 >> >> On Thu, Dec 15, 2022 at 6:34 PM Ethan McCue <et...@mccue.dev> wrote: >> >>> > are pure JSON parsers really the go-to for most people? >>> >>> Depends on what you mean by JSON parsers and it depends on what you mean >>> by people. >>> >>> To the best of my knowledge, both Python and JavaScript do not include >>> streaming, databinding, or path navigation capabilities in their JSON >>> parsers. >>> >>> >>> On Thu, Dec 15, 2022 at 6:26 PM Ethan McCue <et...@mccue.dev> wrote: >>> >>>> > The 95%+ use case for working with JSON for your average Java coder >>>> is best done with data binding. >>>> >>>> To be brave yet controversial: I'm not sure this is necessarily true. >>>> >>>> I will elaborate and respond to the other points after a hot cocoa, but >>>> the last point is part of why I think that tree-crawling needs _something_ >>>> better as an API to fit the bill.
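As a concrete version of the earlier point about an Instant being built either from a long or from a string in a particular format: a self-contained stand-in could look like the following. The tiny `Json` model here is invented for the example, not the prototype's real one:

```java
import java.time.Instant;

public class InstantDecoding {
    // Minimal stand-in for the hypothetical Json model discussed in the thread.
    sealed interface Json {
        record Num(long value) implements Json {}
        record Str(String value) implements Json {}
    }

    // Accept either epoch seconds or an ISO-8601 string; the "complexity"
    // lives in plain Java code you can read, not in framework configuration.
    static Instant instant(Json json) {
        return switch (json) {
            case Json.Num n -> Instant.ofEpochSecond(n.value());
            case Json.Str s -> Instant.parse(s.value());
        };
    }
}
```

The sealed hierarchy makes the switch exhaustive, so adding a new `Json` variant forces every such decoder to be revisited at compile time.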
>>>> >>>> With my sketch that set of requirements would be represented as >>>> >>>> record Thing( >>>> List<Long> xs >>>> ) { >>>> static Thing fromJson(Json json) { >>>> var defaultList = List.of(0L); >>>> return new Thing(Decoder.optionalNullableField( >>>> json, >>>> "xs", >>>> Decoder.oneOf( >>>> Decoder.array(Decoder.oneOf( >>>> x -> Long.parseLong(Decoder.string(x)), >>>> Decoder::long_ >>>> )), >>>> Decoder.null_(defaultList), >>>> x -> List.of(Decoder.long_(x)) >>>> ), >>>> defaultList >>>> )); >>>> } >>>> } >>>> >>>> Which isn't amazing at first glance, but also >>>> >>>> {} >>>> {"xs": null} >>>> {"xs": 5} >>>> {"xs": [5]} {"xs": ["5"]} >>>> {"xs": [1, "2", "3"]} >>>> >>>> these are some wildly varied structures. You could make a solid >>>> argument that something which silently treats these all the same is >>>> a bad API for all the reasons you would consider it a good one. >>>> >>>> On Thu, Dec 15, 2022 at 6:18 PM Johannes Lichtenberger < >>>> lichtenberger.johan...@gmail.com> wrote: >>>> >>>>> I'll have to read the whole thing, but are pure JSON parsers really >>>>> the go-to for most people? I'm a big advocate of also providing something >>>>> similar to XPath/XQuery and that's IMHO JSONiq (90% XQuery). I might be >>>>> biased, of course, as I'm working on Brackit[1] in my spare time (which is >>>>> also a query compiler and intended to be used with proven optimizations by >>>>> document stores / JSON stores), but it can also be used as an in-memory query >>>>> engine. >>>>> >>>>> kind regards >>>>> Johannes >>>>> >>>>> [1] https://github.com/sirixdb/brackit >>>>> >>>>> On Thu, Dec 15, 2022 at 11:03 PM Reinier Zwitserloot < >>>>> rein...@zwitserloot.com> wrote: >>>>> >>>>>> A recent Advent-of-Code puzzle also made me double-check the support >>>>>> of JSON in the Java core libs and it is indeed a curious situation that >>>>>> the >>>>>> Java core libs don’t cater to it particularly well.
>>>>>> >>>>>> However, I’m not seeing an easy way forward to try to close this hole >>>>>> in the core library offerings. >>>>>> >>>>>> If you need to stream huge swaths of JSON, generally there’s a clear >>>>>> unit size that you can just databind. Something like: >>>>>> >>>>>> String jsonStr = """ { "version": 5, "data": [ >>>>>> -- 1 million relatively small records in this list -- >>>>>> ] } """; >>>>>> >>>>>> >>>>>> The usual swath of JSON parsers tend to support this (giving you a >>>>>> stream of Java instances created by databinding those small records one >>>>>> by >>>>>> one), or if not, the best move forward is presumably to file a pull >>>>>> request >>>>>> with those projects; the java.util.log experiment shows that trying to >>>>>> ‘core-librarize’ needs that the community at large already fulfills with >>>>>> third party deps >>>>>> isn’t a good move, especially if the core library >>>>>> variant >>>>>> tries to oversimplify to avoid the trap of being too opinionated (which >>>>>> core libs shouldn’t be). In other words, the need for ’stream this JSON >>>>>> for >>>>>> me’ style APIs is even more exotic than Ethan is suggesting. >>>>>> >>>>>> I see a fundamental problem here: >>>>>> >>>>>> >>>>>> - The 95%+ use case for working with JSON for your average Java >>>>>> coder is best done with data binding. >>>>>> - core libs doesn’t want to provide it, partly because it’s got a >>>>>> large design space, partly because the field’s already covered by >>>>>> GSON and >>>>>> Jackson-json; java.util.log proves this doesn’t work. At least, I >>>>>> gather >>>>>> that’s what Ethan thinks and I agree with this assessment. >>>>>> - A language that claims to be “batteries included” that doesn’t >>>>>> ship with a JSON parser in this era is dubious, to say the least. >>>>>> >>>>>> >>>>>> I’m not sure how to square this circle.
Hence it feels like core-libs >>>>>> needs to hold some more fundamental debates first: >>>>>> >>>>>> >>>>>> - Maybe it’s time to state in a more or less official decree that >>>>>> well-established, large design space jobs will remain the purview of >>>>>> dependencies no matter how popular the job gets, unless being part of the >>>>>> core-libs adds something more fundamental the third party deps cannot >>>>>> bring >>>>>> to the table (such as language integration), or the community >>>>>> standardizes >>>>>> on a single library (JSR310’s story, more or less). JSON parsing would >>>>>> qualify as ‘well-established’ (GSON and Jackson) and ‘large design >>>>>> space’ >>>>>> as Ethan pointed out. >>>>>> - Given that 99% of Java projects, even really simple ones, start >>>>>> with maven/gradle and a list of deps, is that really a problem? >>>>>> >>>>>> >>>>>> I’m honestly not sure what the right answer is. On one hand, the npm >>>>>> ecosystem seems to be doing very well even though their ‘batteries >>>>>> included’ situation is an utter shambles. Then again, the notion that >>>>>> your >>>>>> average nodejs project includes 10x+ more dependencies than other >>>>>> languages >>>>>> is likely a significant part of the security clown fiesta going on over >>>>>> there as far as 3rd party deps are concerned, so by no means should Java >>>>>> just blindly emulate their solutions. >>>>>> >>>>>> I don’t like the idea of shipping a non-data-binding JSON API in the >>>>>> core libs. The root issue with JSON is that you just can’t tell how to >>>>>> interpret any given JSON token, because that’s not how JSON is used in >>>>>> practice. What does 5 mean? Could be that I’m to take that as an int, >>>>>> or as a double, or perhaps even as a j.t.Instant (epoch-millis), and >>>>>> defaulting behaviour (similar to j.u.Map’s .getOrDefault) is *very* >>>>>> convenient to parse most JSON out there in the real world (omitting k/v >>>>>> pairs whose value is still on default is very common).
That’s what makes >>>>>> those databind libraries so enticing: Instead of trying to pattern match >>>>>> my >>>>>> way into this behaviour: >>>>>> >>>>>> >>>>>> - If the element isn’t there at all or null, give me a >>>>>> list-of-longs with a single 0 in it. >>>>>> - If the element is a number, make me a list-of-longs with 1 >>>>>> value in it, that is that number, as a long. >>>>>> - If the element is a string, parse it into a long, then get me a >>>>>> list with this one long value (because IEEE double rules mean >>>>>> sometimes you >>>>>> have to put these things in string form or they get mangled by >>>>>> javascript-eval >>>>>> style parsers). >>>>>> >>>>>> >>>>>> And yet the above is quite common, and can easily be done by a >>>>>> databinder, which sees you want a List<Long> for a field whose >>>>>> default value is List.of(0L), and, armed with that knowledge, can >>>>>> transit the JSON into Java in that way. >>>>>> >>>>>> You don’t *need* databinding to cater to this idea: You could for >>>>>> example have a jsonNode.asLong(123) method that would parse a string >>>>>> if need be, even. But this has nothing to do with pattern matching >>>>>> either. >>>>>> >>>>>> --Reinier Zwitserloot >>>>>> >>>>>> >>>>>> On 15 Dec 2022 at 21:30:17, Ethan McCue <et...@mccue.dev> wrote: >>>>>> >>>>>>> I'm writing this to drive some forward motion and to nerd-snipe >>>>>>> those who know better than I do into putting their thoughts into words. >>>>>>> >>>>>>> There are three ways to process JSON:[1] >>>>>>> - Streaming (Push or Pull) >>>>>>> - Traversing a Tree (Realized or Lazy) >>>>>>> - Declarative Databind (N ways) >>>>>>> >>>>>>> Of these, JEP-198 explicitly ruled out providing "JAXB style type >>>>>>> safe data binding."
>>>>>>> >>>>>>> No justification is given, but if I had to insert my own: mapping >>>>>>> the Json model to/from the Java/JVM object model is a cursed combo of >>>>>>> - Huge possible design space >>>>>>> - Unpalatably large surface for backwards compatibility >>>>>>> - Serialization! Boo![2] >>>>>>> >>>>>>> So for an artifact like the JDK, it probably doesn't make sense to >>>>>>> include. That tracks. >>>>>>> It won't make everyone happy, people like databind APIs, but it >>>>>>> tracks. >>>>>>> >>>>>>> So for the "read flow" these are the things to figure out. >>>>>>> >>>>>>> | Should Provide? | Intended User(s) | >>>>>>> ----------------+-----------------+------------------+ >>>>>>> Streaming Push | | | >>>>>>> ----------------+-----------------+------------------+ >>>>>>> Streaming Pull | | | >>>>>>> ----------------+-----------------+------------------+ >>>>>>> Realized Tree | | | >>>>>>> ----------------+-----------------+------------------+ >>>>>>> Lazy Tree | | | >>>>>>> ----------------+-----------------+------------------+ >>>>>>> >>>>>>> At which point, we should talk about what "meets needs of Java >>>>>>> developers using JSON" implies. >>>>>>> >>>>>>> JSON is ubiquitous. Most kinds of software us schmucks write could >>>>>>> have a reason to interact with it. >>>>>>> The full set of "user personas" therefore aren't practical for me to >>>>>>> talk about.[3] >>>>>>> >>>>>>> JSON documents, however, are not so varied. >>>>>>> >>>>>>> - There are small ones (1-10kb) >>>>>>> - There are medium ones (10-1000kb) >>>>>>> - There are big ones (1000kb-???) >>>>>>> >>>>>>> - There are shallow ones >>>>>>> - There are deep ones >>>>>>> >>>>>>> So that feels like an easier direction to talk about it from. >>>>>>> >>>>>>> >>>>>>> This repo[4] has some convenient toy examples of how some of those >>>>>>> APIs look in libraries >>>>>>> in the ecosystem. Specifically the Streaming Pull and Realized Tree >>>>>>> models. 
User r = new User(); >>>>>>> while (true) { >>>>>>> JsonToken token = reader.peek(); >>>>>>> switch (token) { >>>>>>> case BEGIN_OBJECT: >>>>>>> reader.beginObject(); >>>>>>> break; >>>>>>> case END_OBJECT: >>>>>>> reader.endObject(); >>>>>>> return r; >>>>>>> case NAME: >>>>>>> String fieldname = reader.nextName(); >>>>>>> switch (fieldname) { >>>>>>> case "id": >>>>>>> r.setId(reader.nextString()); >>>>>>> break; >>>>>>> case "index": >>>>>>> r.setIndex(reader.nextInt()); >>>>>>> break; >>>>>>> ... >>>>>>> case "friends": >>>>>>> r.setFriends(new ArrayList<>()); >>>>>>> Friend f = null; >>>>>>> boolean carryOn = true; >>>>>>> while (carryOn) { >>>>>>> token = reader.peek(); >>>>>>> switch (token) { >>>>>>> case BEGIN_ARRAY: >>>>>>> reader.beginArray(); >>>>>>> break; >>>>>>> case END_ARRAY: >>>>>>> reader.endArray(); >>>>>>> carryOn = false; >>>>>>> break; >>>>>>> case BEGIN_OBJECT: >>>>>>> reader.beginObject(); >>>>>>> f = new Friend(); >>>>>>> break; >>>>>>> case END_OBJECT: >>>>>>> reader.endObject(); >>>>>>> r.getFriends().add(f); >>>>>>> break; >>>>>>> case NAME: >>>>>>> String fn = >>>>>>> reader.nextName(); >>>>>>> switch (fn) { >>>>>>> case "id": >>>>>>> >>>>>>> f.setId(reader.nextString()); >>>>>>> break; >>>>>>> case "name": >>>>>>> >>>>>>> f.setName(reader.nextString()); >>>>>>> break; >>>>>>> } >>>>>>> break; >>>>>>> } >>>>>>> } >>>>>>> break; >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> I think it's not hard to argue that the streaming APIs are brutalist. >>>>>>> The above is Gson, but Jackson, Moshi, etc. >>>>>>> seem at least morally equivalent. >>>>>>> >>>>>>> It's hard to write, hard to write *correctly*, and there is a >>>>>>> curious propensity towards pairing it >>>>>>> with anemic, mutable models. >>>>>>> >>>>>>> That being said, it handles big documents and deep documents really >>>>>>> well.
It also performs >>>>>>> pretty darn well and is good enough as a "fallback" when the >>>>>>> intended user experience >>>>>>> is through something like databind. >>>>>>> >>>>>>> So what could we do meaningfully better with the language we have >>>>>>> today/will have tomorrow? >>>>>>> >>>>>>> - Sealed interfaces + Pattern matching could give a nicer model for >>>>>>> tokens >>>>>>> >>>>>>> sealed interface JsonToken { >>>>>>> record Field(String name) implements JsonToken {} >>>>>>> record BeginArray() implements JsonToken {} >>>>>>> record EndArray() implements JsonToken {} >>>>>>> record BeginObject() implements JsonToken {} >>>>>>> record EndObject() implements JsonToken {} >>>>>>> // ... >>>>>>> } >>>>>>> >>>>>>> // ... >>>>>>> >>>>>>> User r = new User(); >>>>>>> while (true) { >>>>>>> JsonToken token = reader.peek(); >>>>>>> switch (token) { >>>>>>> case BeginObject __: >>>>>>> reader.beginObject(); >>>>>>> break; >>>>>>> case EndObject __: >>>>>>> reader.endObject(); >>>>>>> return r; >>>>>>> case Field("id"): >>>>>>> r.setId(reader.nextString()); >>>>>>> break; >>>>>>> case Field("index"): >>>>>>> r.setIndex(reader.nextInt()); >>>>>>> break; >>>>>>> >>>>>>> // ... >>>>>>> >>>>>>> case Field("friends"): >>>>>>> r.setFriends(new ArrayList<>()); >>>>>>> Friend f = null; >>>>>>> boolean carryOn = true; >>>>>>> while (carryOn) { >>>>>>> token = reader.peek(); >>>>>>> switch (token) { >>>>>>> // ... >>>>>>> >>>>>>> - Value classes can make it all more efficient >>>>>>> >>>>>>> sealed interface JsonToken { >>>>>>> value record Field(String name) implements JsonToken {} >>>>>>> value record BeginArray() implements JsonToken {} >>>>>>> value record EndArray() implements JsonToken {} >>>>>>> value record BeginObject() implements JsonToken {} >>>>>>> value record EndObject() implements JsonToken {} >>>>>>> // ...
>>>>>>> } >>>>>>> >>>>>>> - (Fun One) We can transform a simpler-to-write push parser into a >>>>>>> pull parser with Coroutines >>>>>>> >>>>>>> This is just a toy we could play with while making something in >>>>>>> the JDK. I'm pretty sure >>>>>>> we could make a parser which feeds into something like >>>>>>> >>>>>>> interface Listener { >>>>>>> void onObjectStart(); >>>>>>> void onObjectEnd(); >>>>>>> void onArrayStart(); >>>>>>> void onArrayEnd(); >>>>>>> void onField(String name); >>>>>>> // ... >>>>>>> } >>>>>>> >>>>>>> and invert a loop like >>>>>>> >>>>>>> while (true) { >>>>>>> char c = next(); >>>>>>> switch (c) { >>>>>>> case '{': >>>>>>> listener.onObjectStart(); >>>>>>> // ... >>>>>>> // ... >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> by putting a Coroutine.yield in the callback. >>>>>>> >>>>>>> That might be a meaningful simplification in code structure, I >>>>>>> don't know enough to say. >>>>>>> >>>>>>> But, I think there are some hard questions like >>>>>>> >>>>>>> - Is the intent[5] to make the backing parser for ecosystem databind >>>>>>> APIs? >>>>>>> - Is the intent that users who want to handle big/deep documents >>>>>>> fall back to this? >>>>>>> - Are those new language features / conveniences enough to offset >>>>>>> the cost of committing to a new API? >>>>>>> - To whom exactly does a low-level API provide value? >>>>>>> - What benefit is standardization in the JDK? >>>>>>> >>>>>>> and just generally - who would be the consumer(s) of this? >>>>>>> >>>>>>> The other kind of API still on the table is a Tree. There are two >>>>>>> ways to handle this: >>>>>>> >>>>>>> 1. Load it into `Object`. Use a bunch of instanceof checks/casts to >>>>>>> confirm what it actually is.
>>>>>>> >>>>>>> Object v; >>>>>>> User u = new User(); >>>>>>> >>>>>>> if ((v = jso.get("id")) != null) { >>>>>>> u.setId((String) v); >>>>>>> } >>>>>>> if ((v = jso.get("index")) != null) { >>>>>>> u.setIndex(((Long) v).intValue()); >>>>>>> } >>>>>>> if ((v = jso.get("guid")) != null) { >>>>>>> u.setGuid((String) v); >>>>>>> } >>>>>>> if ((v = jso.get("isActive")) != null) { >>>>>>> u.setIsActive(((Boolean) v)); >>>>>>> } >>>>>>> if ((v = jso.get("balance")) != null) { >>>>>>> u.setBalance((String) v); >>>>>>> } >>>>>>> // ... >>>>>>> if ((v = jso.get("latitude")) != null) { >>>>>>> u.setLatitude(v instanceof BigDecimal ? ((BigDecimal) >>>>>>> v).doubleValue() : (Double) v); >>>>>>> } >>>>>>> if ((v = jso.get("longitude")) != null) { >>>>>>> u.setLongitude(v instanceof BigDecimal ? ((BigDecimal) >>>>>>> v).doubleValue() : (Double) v); >>>>>>> } >>>>>>> if ((v = jso.get("greeting")) != null) { >>>>>>> u.setGreeting((String) v); >>>>>>> } >>>>>>> if ((v = jso.get("favoriteFruit")) != null) { >>>>>>> u.setFavoriteFruit((String) v); >>>>>>> } >>>>>>> if ((v = jso.get("tags")) != null) { >>>>>>> List<Object> jsonarr = (List<Object>) v; >>>>>>> u.setTags(new ArrayList<>()); >>>>>>> for (Object vi : jsonarr) { >>>>>>> u.getTags().add((String) vi); >>>>>>> } >>>>>>> } >>>>>>> if ((v = jso.get("friends")) != null) { >>>>>>> List<Object> jsonarr = (List<Object>) v; >>>>>>> u.setFriends(new ArrayList<>()); >>>>>>> for (Object vi : jsonarr) { >>>>>>> Map<String, Object> jso0 = (Map<String, Object>) vi; >>>>>>> Friend f = new Friend(); >>>>>>> f.setId((String) jso0.get("id")); >>>>>>> f.setName((String) jso0.get("name")); >>>>>>> u.getFriends().add(f); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> 2. 
Have an explicit model for Json, and helper methods that do said >>>>>>> casts[6] >>>>>>> >>>>>>> >>>>>>> this.setSiteSetting(readFromJson(jsonObject.getJsonObject("site"))); >>>>>>> JsonArray groups = jsonObject.getJsonArray("group"); >>>>>>> if(groups != null) >>>>>>> { >>>>>>> int len = groups.size(); >>>>>>> for(int i=0; i<len; i++) >>>>>>> { >>>>>>> JsonObject grp = groups.getJsonObject(i); >>>>>>> SNMPSetting grpSetting = readFromJson(grp); >>>>>>> String grpName = grp.getString("dbgroup", null); >>>>>>> if(grpName != null && grpSetting != null) >>>>>>> this.groupSettings.put(grpName, grpSetting); >>>>>>> } >>>>>>> } >>>>>>> JsonArray hosts = jsonObject.getJsonArray("host"); >>>>>>> if(hosts != null) >>>>>>> { >>>>>>> int len = hosts.size(); >>>>>>> for(int i=0; i<len; i++) >>>>>>> { >>>>>>> JsonObject host = hosts.getJsonObject(i); >>>>>>> SNMPSetting hostSetting = readFromJson(host); >>>>>>> String hostName = host.getString("dbhost", null); >>>>>>> if(hostName != null && hostSetting != null) >>>>>>> this.hostSettings.put(hostName, hostSetting); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> I think what has become easier to represent in the language nowadays >>>>>>> is that explicit model for Json. >>>>>>> It's the 101 lesson of sealed interfaces.[7] It feels nice and clean. >>>>>>> >>>>>>> sealed interface Json { >>>>>>> final class Null implements Json {} >>>>>>> final class True implements Json {} >>>>>>> final class False implements Json {} >>>>>>> final class Array implements Json {} >>>>>>> final class Object implements Json {} >>>>>>> final class String implements Json {} >>>>>>> final class Number implements Json {} >>>>>>> } >>>>>>> >>>>>>> And the cast-and-check approach is now more viable on account of >>>>>>> pattern matching.
>>>>>>> if (jso.get("id") instanceof String v) { >>>>>>> u.setId(v); >>>>>>> } >>>>>>> if (jso.get("index") instanceof Long v) { >>>>>>> u.setIndex(v.intValue()); >>>>>>> } >>>>>>> if (jso.get("guid") instanceof String v) { >>>>>>> u.setGuid(v); >>>>>>> } >>>>>>> >>>>>>> // or >>>>>>> >>>>>>> if (jso.get("id") instanceof String id && >>>>>>> jso.get("index") instanceof Long index && >>>>>>> jso.get("guid") instanceof String guid) { >>>>>>> return new User(id, index, guid, ...); // look ma, no >>>>>>> setters! >>>>>>> } >>>>>>> >>>>>>> >>>>>>> And on the horizon, again, are value types. >>>>>>> >>>>>>> But there are problems with this approach beyond the performance >>>>>>> implications of loading into >>>>>>> a tree. >>>>>>> >>>>>>> For one, all the code samples above have different behaviors around >>>>>>> null keys and missing keys >>>>>>> that are not obvious at first glance. >>>>>>> >>>>>>> This won't accept any null or missing fields >>>>>>> >>>>>>> if (jso.get("id") instanceof String id && >>>>>>> jso.get("index") instanceof Long index && >>>>>>> jso.get("guid") instanceof String guid) { >>>>>>> return new User(id, index, guid, ...); >>>>>>> } >>>>>>> >>>>>>> This will accept individual null or missing fields, but also will >>>>>>> silently ignore >>>>>>> fields with incorrect types >>>>>>> >>>>>>> if (jso.get("id") instanceof String v) { >>>>>>> u.setId(v); >>>>>>> } >>>>>>> if (jso.get("index") instanceof Long v) { >>>>>>> u.setIndex(v.intValue()); >>>>>>> } >>>>>>> if (jso.get("guid") instanceof String v) { >>>>>>> u.setGuid(v); >>>>>>> } >>>>>>> >>>>>>> And, compared to databind where there is information about the >>>>>>> expected structure of the document >>>>>>> and it's the job of the framework to assert that, I posit that the >>>>>>> errors that would be encountered >>>>>>> when writing code against this would be more like >>>>>>> >>>>>>> "something wrong with user" >>>>>>> >>>>>>> than >>>>>>> >>>>>>> "problem at users[5].name, expected
string or null. got 5" >>>>>>> >>>>>>> Which feels unideal. >>>>>>> >>>>>>> >>>>>>> One approach I find promising is something close to what Elm does >>>>>>> with its decoders[8]. Not just combining assertion >>>>>>> and binding like what pattern matching with records allows, but >>>>>>> including a scheme for bubbling/nesting errors. >>>>>>> >>>>>>> static String string(Json json) throws JsonDecodingException { >>>>>>> if (!(json instanceof Json.String jsonString)) { >>>>>>> throw JsonDecodingException.of( >>>>>>> "expected a string", >>>>>>> json >>>>>>> ); >>>>>>> } else { >>>>>>> return jsonString.value(); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> static <T> T field(Json json, String fieldName, Decoder<? >>>>>>> extends T> valueDecoder) throws JsonDecodingException { >>>>>>> var jsonObject = object(json); >>>>>>> var value = jsonObject.get(fieldName); >>>>>>> if (value == null) { >>>>>>> throw JsonDecodingException.atField( >>>>>>> fieldName, >>>>>>> JsonDecodingException.of( >>>>>>> "no value for field", >>>>>>> json >>>>>>> ) >>>>>>> ); >>>>>>> } >>>>>>> else { >>>>>>> try { >>>>>>> return valueDecoder.decode(value); >>>>>>> } catch (JsonDecodingException e) { >>>>>>> throw JsonDecodingException.atField( >>>>>>> fieldName, >>>>>>> e >>>>>>> ); >>>>>>> } catch (Exception e) { >>>>>>> throw JsonDecodingException.atField(fieldName, >>>>>>> JsonDecodingException.of(e, value)); >>>>>>> } >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> Which I think has some benefits over the ways I've seen of working >>>>>>> with trees. >>>>>>> >>>>>>> >>>>>>> >>>>>>> - It is declarative enough that folks who prefer databind might be >>>>>>> happy enough. >>>>>>> >>>>>>> static User fromJson(Json json) { >>>>>>> return new User( >>>>>>> Decoder.field(json, "id", Decoder::string), >>>>>>> Decoder.field(json, "index", Decoder::long_), >>>>>>> Decoder.field(json, "guid", Decoder::string) >>>>>>> ); >>>>>>> } >>>>>>> >>>>>>> // ...
>>>>>>> List<User> users = Decoders.array(json, User::fromJson); >>>>>>> >>>>>>> - Handling null and optional fields could be less easily conflated >>>>>>> >>>>>>> Decoder.field(json, "id", Decoder::string); >>>>>>> >>>>>>> Decoder.nullableField(json, "id", Decoder::string); >>>>>>> >>>>>>> Decoder.optionalField(json, "id", Decoder::string); >>>>>>> >>>>>>> Decoder.optionalNullableField(json, "id", Decoder::string); >>>>>>> >>>>>>> >>>>>>> - It composes well with user-defined classes >>>>>>> >>>>>>> record Guid(String value) { >>>>>>> Guid { >>>>>>> // some assertions on the structure of value >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> Decoder.field(json, "guid", guid -> new >>>>>>> Guid(Decoder.string(guid))); >>>>>>> >>>>>>> // or even >>>>>>> >>>>>>> record Guid(String value) { >>>>>>> Guid { >>>>>>> // some assertions on the structure of value >>>>>>> } >>>>>>> >>>>>>> static Guid fromJson(Json json) { >>>>>>> return new Guid(Decoder.string(json)); >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> Decoder.field(json, "guid", Guid::fromJson); >>>>>>> >>>>>>> >>>>>>> - When something goes wrong, the API can handle the fiddliness of >>>>>>> capturing information for feedback. >>>>>>> >>>>>>> In the code I've sketched out it's just what field/index things >>>>>>> went wrong at. Potentially >>>>>>> capturing metadata like row/col numbers of the source would be >>>>>>> sensible too. >>>>>>> >>>>>>> It's just not reasonable to expect devs to do extra work to get >>>>>>> that and it's really nice to give it. >>>>>>> >>>>>>> There are also some downsides like >>>>>>> >>>>>>> - I do not know how compatible it would be with lazy trees. >>>>>>> >>>>>>> Lazy trees being the only way that a tree API could handle big >>>>>>> or deep documents. >>>>>>> The general concept as applied in libraries like json-tree[9] >>>>>>> is to navigate without >>>>>>> doing any work, and that clashes with wanting to instanceof >>>>>>> check the info at the >>>>>>> current path.
>>>>>>> >>>>>>> - It *almost* gives enough information to be a general schema >>>>>>> approach >>>>>>> >>>>>>> If one field fails, the model throws an exception >>>>>>> immediately. If an API should >>>>>>> return "errors": [...], that is inconvenient to construct. >>>>>>> >>>>>>> - None of the existing popular libraries are doing this >>>>>>> >>>>>>> The only mechanics that are strictly required to give this sort >>>>>>> of API are lambdas. Those have >>>>>>> been out for a decade. Yes, sealed interfaces make the data >>>>>>> model prettier but in concept you >>>>>>> can build the same thing on top of anything. >>>>>>> >>>>>>> I could argue that this is because of "cultural momentum" of >>>>>>> databind or some other reason, >>>>>>> but the fact remains that it isn't a proven-out approach. >>>>>>> >>>>>>> Writing JSON libraries is a todo list[10]. There are a lot of >>>>>>> bad ideas and this might be one of them. >>>>>>> >>>>>>> - Performance impact of so many instanceof checks >>>>>>> >>>>>>> I've gotten a 4.2% slowdown compared to the "regular" tree code >>>>>>> without the repeated casts. >>>>>>> >>>>>>> But that was with a parser that is 5x slower than Jackson's >>>>>>> (using the same benchmark project as for the snippets). >>>>>>> I think there could be reason to believe that the JIT does well >>>>>>> enough with repeated instanceof >>>>>>> checks to consider it. >>>>>>> >>>>>>> >>>>>>> My current thinking is that - despite not solving for large or deep >>>>>>> documents - starting with a really "dumb" realized tree API >>>>>>> might be the right place to start for the read side of a potential >>>>>>> incubator module. >>>>>>> >>>>>>> But regardless - this feels like a good time to start more concrete >>>>>>> conversations. I feel I should cap this email since I've reached the
I fell I should cap this email since I've reached the >>>>>>> point >>>>>>> of decoherence and haven't even mentioned the write side of things >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> [1]: >>>>>>> http://www.cowtowncoder.com/blog/archives/2009/01/entry_131.html >>>>>>> [2]: https://security.snyk.io/vuln/maven?search=jackson-databind >>>>>>> [3]: I only know like 8 people >>>>>>> [4]: >>>>>>> https://github.com/fabienrenaud/java-json-benchmark/blob/master/src/main/java/com/github/fabienrenaud/jjb/stream/UsersStreamDeserializer.java >>>>>>> [5]: When I say "intent", I do so knowing full well no one has been >>>>>>> actively thinking of this for an entire Game of Thrones >>>>>>> [6]: >>>>>>> https://github.com/yahoo/mysql_perf_analyzer/blob/master/myperf/src/main/java/com/yahoo/dba/perf/myperf/common/SNMPSettings.java >>>>>>> [7]: https://www.infoq.com/articles/data-oriented-programming-java/ >>>>>>> [8]: >>>>>>> https://package.elm-lang.org/packages/elm/json/latest/Json-Decode >>>>>>> [9]: https://github.com/jbee/json-tree >>>>>>> [10]: https://stackoverflow.com/a/14442630/2948173 >>>>>>> [11]: In 30 days JEP-198 it will be recognizably PI days old for the >>>>>>> 2nd time in its history. >>>>>>> [12]: To me, the fact that is still an open JEP is more a social >>>>>>> convenience than anything. I could just as easily writing this exact >>>>>>> same >>>>>>> email about TOML. >>>>>>> >>>>>>