Wow, really appreciate you taking the time to explain how it works.

Now just need to modify my PR to get this one-line change into the SQL
parser so that it accepts sub-expressions in JSON queries
(the other changes in that PR are invalid and should be dropped)

https://github.com/apache/calcite/pull/2730/files#diff-e873041549333502af52ece8a1b34301ae5a059ff4719e9bddbaef48929e7047

One thing that seems odd though -- looking at
"SqlJsonValueExpressionOperator" the definition seems to imply "FORMAT
JSON":
https://github.com/apache/calcite/blob/170035fd97df1afdd0c0a499f63e0ca1606f7837/core/src/main/java/org/apache/calcite/sql/fun/SqlJsonValueExpressionOperator.java#L33

  public SqlJsonValueExpressionOperator() {
    super("FORMAT JSON", SqlKind.JSON_VALUE_EXPRESSION, 28,
        ReturnTypes.explicit(SqlTypeName.ANY)
            .andThen(SqlTypeTransforms.TO_NULLABLE),
        null, OperandTypes.CHARACTER);
  }

Could it be possible that manually adding it FORMAT JSON makes it apply
twice, or I wonder why it's not doing this by default?



On Mon, Mar 21, 2022 at 8:06 PM Benchao Li <libenc...@apache.org> wrote:

> Gavin,
>
> "FORMAT JSON" is an operator, it converts the json string into an internal
> `JsonValueContext`[1],
> and the `JsonValueContext` holds the deserialized json object which will be
> further used by
> the following json functions.
>
> "FORMAT JSON" is just one choice of "JSON output clause", which is stated
> in the standard as:
> <JSON output clause> ::=
>   RETURNING <data type>
>       [ FORMAT <JSON representation> ]
> <JSON representation> ::=
>     JSON [ ENCODING { UTF8 | UTF16 | UTF32 } ]
>   | <implementation-defined JSON representation option>
>
> Other options could be "FORMAT AVRO", "FORMAT BSON" (which is not
> implemented yet in calcite).
>
> "JSON output clause" only controls the serialization format for json
> function's output,
> and "JSON FORMAT" should be the default. Hence the behavior without "FORMAT
> JSON"
> should be corrected IMHO.
>
> [1]
>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/runtime/JsonFunctions.java#L101
>
> Gavin Ray <ray.gavi...@gmail.com> 于2022年3月22日周二 03:48写道:
>
> > Benchao,
> >
> > I owe you a beer, thank you!!
> > Using FORMAT JSON fixes everything -- it just has to be specified on
> every
> > JSON function.
> >
> > See images below if curious:
> >
> > Without FORMAT JSON:
> > https://i.imgur.com/ekgFXqe.png
> >
> > With FORMAT JSON:
> > https://i.imgur.com/8lHIqUZ.png
> >
> > This is fantastic news =D
> > Now I am just curious how the "FORMAT JSON" modifier works.
> >
> > On Sun, Mar 20, 2022 at 10:42 PM Gavin Ray <ray.gavi...@gmail.com>
> wrote:
> >
> > > Hot damn, you're right -- that doesn't look like it's double-escaped!
> > > I will try to this out ASAP and post an update here, thank you for the
> > tip
> > > =)
> > >
> > >
> > > On Sun, Mar 20, 2022 at 8:51 PM Benchao Li <libenc...@apache.org>
> wrote:
> > >
> > >> Hi Gavin,
> > >>
> > >> Have you ever tried 'FORMAT JSON' like this[1], the tests show that
> the
> > >> result of 'format json' is what you want.
> > >>
> > >> [1]
> > >>
> > >>
> >
> https://github.com/apache/calcite/blob/master/testkit/src/main/java/org/apache/calcite/test/SqlOperatorTest.java#L4381
> > >>
> > >>
> > >> Gavin Ray <ray.gavi...@gmail.com> 于2022年3月19日周六 05:19写道:
> > >>
> > >> > That sounds very reasonable to me
> > >> >
> > >> > I don't know the Calcite codebase as well as other folks -- and
> > >> certainly
> > >> > not nearly as well as you
> > >> > Where would be the place to put such a thing/the overall approach?
> > >> >
> > >> > On Fri, Mar 18, 2022 at 1:53 PM Julian Hyde <jhyde.apa...@gmail.com
> >
> > >> > wrote:
> > >> >
> > >> > > I think you’re proposing making the JSON_ functions smarter at
> > >> runtime.
> > >> > My
> > >> > > general philosophy is to have the smarts at prepare time and make
> > the
> > >> > > runtime operators dumb. I think that philosophy can be applied
> here.
> > >> Some
> > >> > > extra logic would kick in when preparing a query that has JSON_
> > >> > functions,
> > >> > > and that logic would fold together multiple nested JSON_
> functions.
> > >> > Perhaps
> > >> > > we need to add a new (internal) JSON_ function that can do all of
> > the
> > >> > steps.
> > >> > >
> > >> > > Julian
> > >> > >
> > >> > >
> > >> > > > On Mar 18, 2022, at 8:50 AM, Gavin Ray <ray.gavi...@gmail.com>
> > >> wrote:
> > >> > > >
> > >> > > > Sorry to beat a dead horse here, but I'm one of those weirdos
> that
> > >> > gets a
> > >> > > > lot of use out of Calcite's JSON operators.
> > >> > > > Calcite's JSON implementation is broken for queries that have
> more
> > >> than
> > >> > > one
> > >> > > > depth of JSON object/array calls.
> > >> > > >
> > >> > > > The reason is because the operator calls "jsonize()", which
> parses
> > >> the
> > >> > > > (presumably) JVM object as a JSON string
> > >> > > > This works for something like Map<String, String>, but if you
> have
> > >> > > > Map<String, Map<String, String>>, what happens is this:
> > >> > > >
> > >> > > > JSON_OBJECT(
> > >> > > >  foo: 1,
> > >> > > >  bar: JSON_OBJECT(
> > >> > > >     qux: 2
> > >> > > >  )
> > >> > > > )
> > >> > > >
> > >> > > > The parse happens inside-out, so first we get the innermost
> object
> > >> > > parsed,
> > >> > > > which gives:
> > >> > > >
> > >> > > > { "qux": 2 }
> > >> > > >
> > >> > > > But -- as a string! This is important!
> > >> > > > Now, when we parse the next object, we get this:
> > >> > > >
> > >> > > > { "foo": 1, "bar": \"{ \"qux\": 2 }\" }
> > >> > > >
> > >> > > > This is because the object with "qux" isn't an object, but a
> > string
> > >> > value
> > >> > > > Which to be valid JSON, needs to have its quotes and braces
> > escaped
> > >> > > >
> > >> > > > Definitely not what you want, and the value isn't usable =(
> > >> > > >
> > >> > > > Since there is no state/context/stack (that I can tell) when the
> > >> parse
> > >> > > > function is called,
> > >> > > > how might it be possible to write something to the effect of:
> > >> > > >
> > >> > > > "Analyze the query, and if the number of JSON operations is
> > greater
> > >> > than
> > >> > > > one,
> > >> > > > only call 'jsonize()' on the outer-most parse/object."
> > >> > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/calcite/blob/8d21c3f2f0b75d788e70bbeea9746695f2fde552/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1886-L1891
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/core/src/main/java/org/apache/calcite/runtime/JsonFunctions.java#L93-L95
> > >> > >
> > >> > >
> > >> >
> > >>
> > >>
> > >> --
> > >>
> > >> Best,
> > >> Benchao Li
> > >>
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>

Reply via email to