I left some comments under CALCITE-4989 with respect to FORMAT JSON etc. If
it really solves the problem can someone update the JIRA accordingly?

Best,
Stamatis

[1] https://issues.apache.org/jira/browse/CALCITE-4989

On Tue, Mar 22, 2022 at 3:56 AM Gavin Ray <ray.gavi...@gmail.com> wrote:

> Wow, really appreciate you taking the time to explain how it works.
>
> Now just need to modify my PR to get this one-line change into the SQL
> parser so that it accepts sub-expressions in JSON queries
> (the other changes in that PR are invalid and should be dropped)
>
>
> https://github.com/apache/calcite/pull/2730/files#diff-e873041549333502af52ece8a1b34301ae5a059ff4719e9bddbaef48929e7047
>
> One thing that seems odd though -- looking at
> "SqlJsonValueExpressionOperator" the definition seems to imply "FORMAT
> JSON":
>
> https://github.com/apache/calcite/blob/170035fd97df1afdd0c0a499f63e0ca1606f7837/core/src/main/java/org/apache/calcite/sql/fun/SqlJsonValueExpressionOperator.java#L33
>
>   public SqlJsonValueExpressionOperator() {
>     super("FORMAT JSON", SqlKind.JSON_VALUE_EXPRESSION, 28,
>         ReturnTypes.explicit(SqlTypeName.ANY)
>             .andThen(SqlTypeTransforms.TO_NULLABLE),
>         null, OperandTypes.CHARACTER);
>   }
>
> Could it be possible that manually adding it FORMAT JSON makes it apply
> twice, or I wonder why it's not doing this by default?
>
>
>
> On Mon, Mar 21, 2022 at 8:06 PM Benchao Li <libenc...@apache.org> wrote:
>
> > Gavin,
> >
> > "FORMAT JSON" is an operator, it converts the json string into an
> internal
> > `JsonValueContext`[1],
> > and the `JsonValueContext` holds the deserialized json object which will
> be
> > further used by
> > the following json functions.
> >
> > "FORMAT JSON" is just one choice of "JSON output clause", which is stated
> > in the standard as:
> > <JSON output clause> ::=
> >   RETURNING <data type>
> >       [ FORMAT <JSON representation> ]
> > <JSON representation> ::=
> >     JSON [ ENCODING { UTF8 | UTF16 | UTF32 } ]
> >   | <implementation-defined JSON representation option>
> >
> > Other options could be "FORMAT AVRO", "FORMAT BSON" (which is not
> > implemented yet in calcite).
> >
> > "JSON output clause" only controls the serialization format for json
> > function's output,
> > and "JSON FORMAT" should be the default. Hence the behavior without
> "FORMAT
> > JSON"
> > should be corrected IMHO.
> >
> > [1]
> >
> >
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/runtime/JsonFunctions.java#L101
> >
> > Gavin Ray <ray.gavi...@gmail.com> 于2022年3月22日周二 03:48写道:
> >
> > > Benchao,
> > >
> > > I owe you a beer, thank you!!
> > > Using FORMAT JSON fixes everything -- it just has to be specified on
> > every
> > > JSON function.
> > >
> > > See images below if curious:
> > >
> > > Without FORMAT JSON:
> > > https://i.imgur.com/ekgFXqe.png
> > >
> > > With FORMAT JSON:
> > > https://i.imgur.com/8lHIqUZ.png
> > >
> > > This is fantastic news =D
> > > Now I am just curious how the "FORMAT JSON" modifier works.
> > >
> > > On Sun, Mar 20, 2022 at 10:42 PM Gavin Ray <ray.gavi...@gmail.com>
> > wrote:
> > >
> > > > Hot damn, you're right -- that doesn't look like it's double-escaped!
> > > > I will try to this out ASAP and post an update here, thank you for
> the
> > > tip
> > > > =)
> > > >
> > > >
> > > > On Sun, Mar 20, 2022 at 8:51 PM Benchao Li <libenc...@apache.org>
> > wrote:
> > > >
> > > >> Hi Gavin,
> > > >>
> > > >> Have you ever tried 'FORMAT JSON' like this[1], the tests show that
> > the
> > > >> result of 'format json' is what you want.
> > > >>
> > > >> [1]
> > > >>
> > > >>
> > >
> >
> https://github.com/apache/calcite/blob/master/testkit/src/main/java/org/apache/calcite/test/SqlOperatorTest.java#L4381
> > > >>
> > > >>
> > > >> Gavin Ray <ray.gavi...@gmail.com> 于2022年3月19日周六 05:19写道:
> > > >>
> > > >> > That sounds very reasonable to me
> > > >> >
> > > >> > I don't know the Calcite codebase as well as other folks -- and
> > > >> certainly
> > > >> > not nearly as well as you
> > > >> > Where would be the place to put such a thing/the overall approach?
> > > >> >
> > > >> > On Fri, Mar 18, 2022 at 1:53 PM Julian Hyde <
> jhyde.apa...@gmail.com
> > >
> > > >> > wrote:
> > > >> >
> > > >> > > I think you’re proposing making the JSON_ functions smarter at
> > > >> runtime.
> > > >> > My
> > > >> > > general philosophy is to have the smarts at prepare time and
> make
> > > the
> > > >> > > runtime operators dumb. I think that philosophy can be applied
> > here.
> > > >> Some
> > > >> > > extra logic would kick in when preparing a query that has JSON_
> > > >> > functions,
> > > >> > > and that logic would fold together multiple nested JSON_
> > functions.
> > > >> > Perhaps
> > > >> > > we need to add a new (internal) JSON_ function that can do all
> of
> > > the
> > > >> > steps.
> > > >> > >
> > > >> > > Julian
> > > >> > >
> > > >> > >
> > > >> > > > On Mar 18, 2022, at 8:50 AM, Gavin Ray <ray.gavi...@gmail.com
> >
> > > >> wrote:
> > > >> > > >
> > > >> > > > Sorry to beat a dead horse here, but I'm one of those weirdos
> > that
> > > >> > gets a
> > > >> > > > lot of use out of Calcite's JSON operators.
> > > >> > > > Calcite's JSON implementation is broken for queries that have
> > more
> > > >> than
> > > >> > > one
> > > >> > > > depth of JSON object/array calls.
> > > >> > > >
> > > >> > > > The reason is because the operator calls "jsonize()", which
> > parses
> > > >> the
> > > >> > > > (presumably) JVM object as a JSON string
> > > >> > > > This works for something like Map<String, String>, but if you
> > have
> > > >> > > > Map<String, Map<String, String>>, what happens is this:
> > > >> > > >
> > > >> > > > JSON_OBJECT(
> > > >> > > >  foo: 1,
> > > >> > > >  bar: JSON_OBJECT(
> > > >> > > >     qux: 2
> > > >> > > >  )
> > > >> > > > )
> > > >> > > >
> > > >> > > > The parse happens inside-out, so first we get the innermost
> > object
> > > >> > > parsed,
> > > >> > > > which gives:
> > > >> > > >
> > > >> > > > { "qux": 2 }
> > > >> > > >
> > > >> > > > But -- as a string! This is important!
> > > >> > > > Now, when we parse the next object, we get this:
> > > >> > > >
> > > >> > > > { "foo": 1, "bar": \"{ \"qux\": 2 }\" }
> > > >> > > >
> > > >> > > > This is because the object with "qux" isn't an object, but a
> > > string
> > > >> > value
> > > >> > > > Which to be valid JSON, needs to have its quotes and braces
> > > escaped
> > > >> > > >
> > > >> > > > Definitely not what you want, and the value isn't usable =(
> > > >> > > >
> > > >> > > > Since there is no state/context/stack (that I can tell) when
> the
> > > >> parse
> > > >> > > > function is called,
> > > >> > > > how might it be possible to write something to the effect of:
> > > >> > > >
> > > >> > > > "Analyze the query, and if the number of JSON operations is
> > > greater
> > > >> > than
> > > >> > > > one,
> > > >> > > > only call 'jsonize()' on the outer-most parse/object."
> > > >> > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://github.com/apache/calcite/blob/8d21c3f2f0b75d788e70bbeea9746695f2fde552/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1886-L1891
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/core/src/main/java/org/apache/calcite/runtime/JsonFunctions.java#L93-L95
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> Best,
> > > >> Benchao Li
> > > >>
> > > >
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>

Reply via email to