Absolutely. Thanks Igor for the contribution! :)
-Rui

On Wed, Dec 11, 2019 at 10:54 PM Stamatis Zampetakis <zabe...@gmail.com> wrote:

> So basically thanks to Igor :)
>
> On Wed, Dec 11, 2019 at 9:56 PM Rui Wang <amaliu...@apache.org> wrote:
>
> > Thanks for Stamatis's suggestion. Indeed, a recent effort in [1] enhanced
> > the support that reconstructs ROW in the top SELECT, which is supposed to
> > solve the problem.
> >
> > [1]: https://jira.apache.org/jira/browse/CALCITE-3138
> >
> > On Mon, Dec 9, 2019 at 3:21 PM Rui Wang <amaliu...@apache.org> wrote:
> >
> > > Hello,
> > >
> > > Sorry for the long delay on this thread. Recently I heard about requests
> > > on how to deal with STRUCT without flattening it again in BeamSQL. Also I
> > > realized Flink has already disabled it in their codebase[1]. I did try to
> > > remove STRUCT flattening and run the unit tests of Calcite core to see how
> > > many tests break: it was 25, which wasn't that bad. So I would like to
> > > pick up this effort again.
> > >
> > > Before I do it, I just want to ask if the Calcite community supports this
> > > effort (or thinks it is a good idea)?
> > >
> > > My current execution plan is the following:
> > > 1. Add a new flag to FrameworkConfig to specify whether to flatten STRUCT.
> > > By default, it is yes.
> > > 2. When disabling the struct flattener, add more tests to test STRUCT
> > > support in general. For example, test STRUCT support on projection, join
> > > condition, filtering, etc. If something breaks, try to fix it.
> > > 3. Check the 25 failed tests above and see why they fail when the struct
> > > flattener is gone. Duplicate those failed tests with the necessary fixes
> > > to make sure they pass without STRUCT flattening.
> > >
> > > [1]:
> > > https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/calcite/FlinkPlannerImpl.scala#L166
> > >
> > > -Rui
> > >
> > > On Wed, Sep 5, 2018 at 11:59 AM Julian Hyde <jh...@apache.org> wrote:
> > >
> > >> It might not be minor, but it’s worth a try. At optimization time we
> > >> treat all fields as fields, regardless of whether they have complex
> > >> types (maps, arrays, multisets, records) so there should not be too
> > >> many problems. The flattening was mainly for the benefit of the runtime.
> > >>
> > >> > On Sep 5, 2018, at 11:32 AM, Rui Wang <ruw...@google.com.INVALID> wrote:
> > >> >
> > >> > Thanks for your helpful response! It seems like disabling the
> > >> > flattening will at least affect some rules in optimization. It might
> > >> > not be a minor change.
> > >> >
> > >> > -Rui
> > >> >
> > >> > On Wed, Sep 5, 2018 at 4:54 AM Stamatis Zampetakis <zabe...@gmail.com> wrote:
> > >> >
> > >> >> Hi Rui,
> > >> >>
> > >> >> Disabling flattening in some cases seems reasonable.
> > >> >>
> > >> >> If I am not mistaken, even in the existing code it is not used all
> > >> >> the time so it makes sense to become configurable.
> > >> >> For example, Calcite prepared statements (CalcitePrepareImpl) are
> > >> >> using the flattener only for DDL operations that create materialized
> > >> >> views (and this is because this code at some point passes from the
> > >> >> PlannerImpl).
> > >> >> On the other hand, any query that is using the Planner will also
> > >> >> pass from the flattener.
> > >> >>
> > >> >> Disabling the flattener does not mean that all rules will work
> > >> >> without problems.
> > >> >> The Javadoc of the RelStructuredTypeFlattener at some point says
> > >> >> "This approach has the benefit that real optimizer and codegen rules
> > >> >> never have to deal with structured types.". Due to this, it is very
> > >> >> likely that some rules were written on the assumption that there are
> > >> >> no structured types.
> > >> >>
> > >> >> Best,
> > >> >> Stamatis
> > >> >>
> > >> >>
> > >> >> On Wed, Sep 5, 2018 at 9:48 AM, Julian Hyde <jh...@apache.org> wrote:
> > >> >>
> > >> >>> Flattening was introduced mainly because the original engine used
> > >> >>> flat column-oriented storage. Now we have several ways of executing,
> > >> >>> including generating Java code.
> > >> >>>
> > >> >>> Adding a mode to disable flattening might make sense.
> > >> >>> On Tue, Sep 4, 2018 at 12:52 PM Rui Wang <ruw...@google.com.invalid> wrote:
> > >> >>>>
> > >> >>>> Hi Community,
> > >> >>>>
> > >> >>>> While trying to support the Row type in Apache Beam SQL on top of
> > >> >>>> Calcite, I realized that the Row flattening logic loses the
> > >> >>>> structure information of a Row after projections. There is a use
> > >> >>>> case where users want to mix the Beam programming model with Beam
> > >> >>>> SQL to process a dataset. The following is an example of the use
> > >> >>>> case:
> > >> >>>>
> > >> >>>> dataset.apply(something user defined)
> > >> >>>>     .apply(SELECT ...)
> > >> >>>>     .apply(something user defined)
> > >> >>>>
> > >> >>>> As you can see, after the SQL statement is applied, the data
> > >> >>>> structure should be preserved for further processing.
> > >> >>>>
> > >> >>>> The most straightforward way to me is to make Struct flattening
> > >> >>>> optional so I could choose to disable it and the Row structure is
> > >> >>>> preserved. Can I ask if it is feasible to make it happen?
> > >> >>>> What could happen if Calcite just doesn't flatten Struct in the
> > >> >>>> flattener? (I tried to disable it but got exceptions in the
> > >> >>>> optimizer. I wasn't sure whether that was some minor thing to fix,
> > >> >>>> or whether Struct flattening was a design choice and the impact of
> > >> >>>> the change would be huge.)
> > >> >>>>
> > >> >>>> Additionally, if there is a way to keep the information that I can
> > >> >>>> use to reconstruct the Row after projections, it might be ok as
> > >> >>>> well. Does this idea exist in Calcite? If it does not exist, how
> > >> >>>> does this idea compare with disabling Struct flattening?
> > >> >>>>
> > >> >>>> Thanks,
> > >> >>>> Rui
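
For readers new to the thread, the effect being discussed can be sketched in plain Java. This is only a toy model and not Calcite code: the class FlattenSketch, the method columns, the flattenStructs flag and the "$" name separator are all hypothetical, chosen for illustration. It shows how flattening replaces a nested struct field with its leaf fields (so the Row boundary is no longer visible in the output), and how a flag could keep the struct intact.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of struct flattening. Hypothetical names; this is not Calcite's API. */
final class FlattenSketch {

    /**
     * Returns the output column names for a row type.
     * A row type is modelled as an ordered map from field name to either a
     * scalar type name (String) or a nested row type (Map).
     */
    public static List<String> columns(Map<String, Object> rowType, boolean flattenStructs) {
        List<String> out = new ArrayList<>();
        collect("", rowType, flattenStructs, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void collect(String prefix, Map<String, Object> type,
                                boolean flatten, List<String> out) {
        for (Map.Entry<String, Object> field : type.entrySet()) {
            String name = prefix + field.getKey();
            if (flatten && field.getValue() instanceof Map) {
                // Nested struct: replace it with its leaf fields.
                // The "$" separator is an arbitrary choice for this sketch.
                collect(name + "$", (Map<String, Object>) field.getValue(), true, out);
            } else {
                // Scalar field, or a struct kept whole because flattening is off.
                out.add(name);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "VARCHAR");
        address.put("zip", "INTEGER");

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", "INTEGER");
        row.put("address", address);

        System.out.println(columns(row, true));   // prints [id, address$city, address$zip]
        System.out.println(columns(row, false));  // prints [id, address]
    }
}
```

With the flag on, a downstream consumer sees three unrelated columns and has to rebuild the address Row itself; with the flag off, the nested field survives as one column, which is the behaviour the original question asks for.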