Hi Rui,

I'm glad that the fix was useful.

Thanks,
Igor


On Thu, Dec 12, 2019 at 8:16 PM Rui Wang <amaliu...@apache.org> wrote:

> Absolutely. Thanks Igor for the contribution! :)
>
>
> -Rui
>
> On Wed, Dec 11, 2019 at 10:54 PM Stamatis Zampetakis <zabe...@gmail.com>
> wrote:
>
> > So basically thanks to Igor :)
> >
> > On Wed, Dec 11, 2019 at 9:56 PM Rui Wang <amaliu...@apache.org> wrote:
> >
> > > Thanks for Stamatis's suggestion. Indeed, a recent effort in [1]
> > > enhanced the support for reconstructing ROW in the top SELECT, which
> > > is supposed to solve the problem.
> > >
> > >
> > >
> > > [1]: https://jira.apache.org/jira/browse/CALCITE-3138
> > >
> > > On Mon, Dec 9, 2019 at 3:21 PM Rui Wang <amaliu...@apache.org> wrote:
> > >
> > > > Hello,
> > > >
> > > > Sorry for the long delay on this thread. Recently I heard requests
> > > > again about how to deal with STRUCT without flattening it in BeamSQL.
> > > > Also, I realized Flink has already disabled it in their codebase [1].
> > > > I tried removing STRUCT flattening and ran the unit tests of Calcite
> > > > core to see how many tests break: it was 25, which wasn't that bad.
> > > > So I would like to pick up this effort again.
> > > >
> > > > Before I do, I just want to ask whether the Calcite community
> > > > supports this effort (or thinks it is a good idea).
> > > >
> > > > My current execution plan is the following:
> > > > 1. Add a new flag to FrameworkConfig to specify whether to flatten
> > > > STRUCT. By default, it is yes.
> > > > 2. With the struct flattener disabled, add more tests for STRUCT
> > > > support in general. For example, test STRUCT support in projections,
> > > > join conditions, filtering, etc. If something breaks, try to fix it.
> > > > 3. Check the 25 failed tests above and see why they fail when the
> > > > struct flattener is gone. Duplicate those failed tests, with the
> > > > necessary fixes to make sure they pass without STRUCT flattening.
> > > >
> > > >
> > > > [1]:
> > > > https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/calcite/FlinkPlannerImpl.scala#L166
> > > >
> > > >
> > > > -Rui
> > > >
> > > > On Wed, Sep 5, 2018 at 11:59 AM Julian Hyde <jh...@apache.org>
> wrote:
> > > >
> > > >> It might not be minor, but it’s worth a try. At optimization time
> > > >> we treat all fields as fields, regardless of whether they have
> > > >> complex types (maps, arrays, multisets, records), so there should
> > > >> not be too many problems. The flattening was mainly for the benefit
> > > >> of the runtime.
> > > >>
> > > >>
> > > >> > On Sep 5, 2018, at 11:32 AM, Rui Wang <ruw...@google.com.INVALID>
> > > >> > wrote:
> > > >> >
> > > >> > Thanks for your helpful response! It seems like disabling the
> > > >> > flattening will at least affect some rules in optimization. It
> > > >> > might not be a minor change.
> > > >> >
> > > >> >
> > > >> > -Rui
> > > >> >
> > > >> > On Wed, Sep 5, 2018 at 4:54 AM Stamatis Zampetakis <zabe...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> >> Hi Rui,
> > > >> >>
> > > >> >> Disabling flattening in some cases seems reasonable.
> > > >> >>
> > > >> >> If I am not mistaken, even in the existing code it is not used
> > > >> >> all the time, so it makes sense to make it configurable.
> > > >> >> For example, Calcite prepared statements (CalcitePrepareImpl) use
> > > >> >> the flattener only for DDL operations that create materialized
> > > >> >> views (and this is because this code at some point passes through
> > > >> >> the PlannerImpl).
> > > >> >> On the other hand, any query that uses the Planner will also pass
> > > >> >> through the flattener.
> > > >> >>
> > > >> >> Disabling the flattener does not mean that all rules will work
> > > >> >> without problems. The Javadoc of the RelStructuredTypeFlattener at
> > > >> >> some point says "This approach has the benefit that real optimizer
> > > >> >> and codegen rules never have to deal with structured types.".
> > > >> >> Because of this, it is very likely that some rules were written on
> > > >> >> the assumption that there are no structured types.
> > > >> >>
> > > >> >> Best,
> > > >> >> Stamatis
> > > >> >>
> > > >> >>
> > > >> >> On Wed, Sep 5, 2018 at 9:48 AM, Julian Hyde <jh...@apache.org>
> > > >> >> wrote:
> > > >> >>
> > > >> >>> Flattening was introduced mainly because the original engine
> > > >> >>> used flat column-oriented storage. Now we have several ways of
> > > >> >>> executing, including generating Java code.
> > > >> >>>
> > > >> >>> Adding a mode to disable flattening might make sense.
> > > >> >>> On Tue, Sep 4, 2018 at 12:52 PM Rui Wang <ruw...@google.com.invalid>
> > > >> >>> wrote:
> > > >> >>>>
> > > >> >>>> Hi Community,
> > > >> >>>>
> > > >> >>>> While trying to support the Row type in Apache Beam SQL on top
> > > >> >>>> of Calcite, I realized that the Row flattening logic loses the
> > > >> >>>> Row's structure information after projections. There is a use
> > > >> >>>> case where users want to mix the Beam programming model with
> > > >> >>>> Beam SQL to process a dataset. The following is an example of
> > > >> >>>> the use case:
> > > >> >>>>
> > > >> >>>> dataset.apply(something user defined)
> > > >> >>>>            .apply(SELECT ...)
> > > >> >>>>            .apply(something user defined)
> > > >> >>>>
> > > >> >>>> As you can see, after the SQL statement is applied, the data
> > > >> >>>> structure should be preserved for further processing.
> > > >> >>>>
> > > >> >>>> The most straightforward way to me is to make Struct flattening
> > > >> >>>> optional, so I could choose to disable it and the Row structure
> > > >> >>>> is preserved. Can I ask if it is feasible to make it happen?
> > > >> >>>> What could happen if Calcite just doesn't flatten Struct in the
> > > >> >>>> flattener? (I tried to disable it but had exceptions in the
> > > >> >>>> optimizer. I wasn't sure if that was some minor thing to fix, or
> > > >> >>>> if Struct flattening was a design choice so the impact of the
> > > >> >>>> change would be huge.)
> > > >> >>>>
> > > >> >>>> Additionally, if there is a way to keep information that I can
> > > >> >>>> use to reconstruct the Row after projections, that might be OK
> > > >> >>>> as well. Does this idea exist in Calcite? If it does not exist,
> > > >> >>>> how does this idea compare with disabling Struct flattening?
> > > >> >>>>
> > > >> >>>> Thanks,
> > > >> >>>> Rui
> > > >> >>>
> > > >> >>
> > > >>
> > > >>
> > >
> >
>
