--> The json library was written based on experience from this work:
http://www.calldei.com/pubs/Balisage2011/index.html
http://www.balisage.net/Proceedings/vol7/html/Lee01/BalisageVol7-Lee01.html
However promising, that was a small personal 'research project', not a product -- certainly not efficient. It does illuminate some of the subtler issues and suggest possible strategies (see below), but also some (unexpected) limitations of them.

---

>>> wrt ML and schema -- and why the ML json library takes a different approach than the above -- or the following:

>>> "if you look at the sc:* functions you can parse to get to schema. And
>>> then using a few functions to build out the structure you need create a
>>> function that does the transformation for you."

I did investigate this approach, but it was not feasible in the context of the use cases where json:transform-xxx is targeted. It may well be in individual cases, but it doesn't pan out so well for a general-purpose library.

Two major issues:

Schemas for BOTH Sides

In general, to do a schema-based transformation you need a schema for *both sides*. If you don't care about deterministic transformations or bi-directional transformations, you can make do with less. Given a full schema on both sides up front -- that helps reduce the problem, but doesn't solve it.

Consider a simpler example:

--> Produce a transform for XML into XHTML. Pick one valid transformation. Only one. Pick one that makes everyone happy. --- #profit

JSON<>XML is harder. If you look at the 'data mapping' field, even the best implementations (say, Jackson) have the same problem. It's fairly easy to map JSON to some generic data object, say a "Node" tree. Easy. Or to transform from a specific class/object into JSON -- just write every field as a fully annotated generic JSON object. That's easy. It's also really ugly and generally undesirable. The 'basic' strategy does this for JSON->XML, and the "full" strategy does this for XML->JSON. High fidelity, bi-directional, configuration-free, schema-agnostic. Ugly as sin. More than a good theory -- make people happy.
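To make the 'fully annotated generic JSON object' strategy concrete, here is a minimal hypothetical sketch in Python (NOT the MarkLogic library's actual encoding -- the key names here are invented for illustration). Every element name, attribute, and text node is written out explicitly: lossless, configuration-free, schema-agnostic, and exactly as ugly as described.

```python
# Hypothetical sketch of a "fully annotated" lossless XML->JSON mapping,
# in the spirit of the "full" strategy described above. The key names
# ("name", "attributes", "text", "children") are illustrative, not the
# actual json:transform-xxx output format.
import json
import xml.etree.ElementTree as ET

def annotate(elem):
    """Map an XML element to a fully annotated generic JSON object."""
    return {
        "name": elem.tag,                       # element name, spelled out
        "attributes": dict(elem.attrib),        # every attribute, spelled out
        "text": elem.text or "",                # text content, spelled out
        "children": [annotate(c) for c in elem] # recurse: order preserved
    }

doc = ET.fromstring('<book id="1"><title>XQuery</title></book>')
print(json.dumps(annotate(doc)))
```

Round-tripping such output back to the original XML is trivial, which is the whole appeal -- but no one wants to read or write JSON that looks like this, which is the whole problem.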
Even if you supply both sides -- a schema for source and target, such as you can get directly in many programming languages via reflection on class declarations -- one would think the problem is solved. But it's more like the XML<>XHTML problem: you can determine what is valid, but not what is *desired*. Valid is easy. Desirable -- not so easy to define or achieve.

---------

Back to schemas: XML + JSON.

In ML we have neither, really. We don't have JSON Schema currently, and while we do have XML Schema, it's used for a very specific purpose. The 'sc:*' functions mentioned expose the results of schema validation, not the schema itself. That's a subtle problem that makes them not as useful in this case as one might think. The sc:* functions operate on *instances* of XML data, post schema validation. You can't use them to query a schema document in the sense desired. They are a reflection API into instances of validated documents, expressed as schema attributes/axes on the node. They are not a reflection API into the schema document itself.

The json:transform-xxx library does use schema information for atomic types. When converting from XML, if the type of an atomic value is known, that influences the JSON output. E.g. an xs:string becomes a JSON string, an xs:int -> JSON number, xs:boolean -> JSON boolean, and an empty 'xs:nullable' element -> JSON null. Even that has problems -- most JSON parsers use 64-bit doubles to represent Number, which means any integer wider than 53 bits gets corrupted. ML data types use 64-bit unsigned integers extensively, so strings are used for large numbers. You can see that in the management and metering API endpoints. xs:date, xs:dateTime, xs:duration -- the majority of XSD primitives have no standard JSON representation.

Beyond atomics, it's just too different. XML has named values/nodes; JSON has unnamed values/nodes but named vectors/vertices.
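The Number-precision point above is easy to demonstrate. A minimal Python sketch, using float as a stand-in for any JSON parser that stores Number as an IEEE 754 double (integers are exact only up to 2**53; a 64-bit identifier overflows that, which is why serializing large ids as strings is the safe choice):

```python
# IEEE 754 doubles have a 53-bit significand: integers are exactly
# representable only up to 2**53. A 64-bit id silently loses its low bits.
big_id = 2**63 - 1            # e.g. an unsigned 64-bit identifier
as_double = float(big_id)     # what a doubles-only JSON parser would keep

print(int(as_double) == big_id)    # False -- the low bits are gone
print(float(2**53) == 2**53)       # True -- still exact at the boundary
print(int(float(2**53 + 1)))       # 9007199254740992 -- collapses to 2**53
```

Passing the id through as a JSON string ("9223372036854775807") sidesteps the problem entirely, at the cost of the consumer needing to know the field is numeric.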
If you look at the native JSON implementation in V8, there is a novel approach to how native JSON objects are mapped into XDM and XPath.

Suppose you have a full schema on both sides, and 'obvious' mappings of structure and naming. There is still a challenge, even in theory: produce a single transform that is valid, lossless, *and* universally desirable. The last part -- that's the fun part. Figure out how to determine what is 'desirable' -- universally. Then implement it simply. Produce what is desired, not what is asked for. Make people happy.

Very interested in how to achieve that better.

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Vidal
Sent: Thursday, April 13, 2017 3:38 PM
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] json:config for XML schema

Well, the good news is if you have a schema you already know the definition of the structure you need to convert. The general issue is dealing with "mixed" content, linking @ref elements to their ultimate definition, and things like xs:sequence vs. xs:choice. The good news is MarkLogic has a library that can execute against the schema and provide you a means to create your own custom code to convert to JSON etc. If you look at the sc:* functions you can parse to get to the schema. And then, using a few functions to build out the structure you need, create a function that does the transformation for you. I have some various code bits I can share if you need help. If you give me some time (say tomorrow) I can probably write the code to generate the JSON for you. Ping me directly if you need any help.

Regards,
Gary Vidal
_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general