Hi Venkata,

> Also I made minor changes to the *NestedFieldReferenceExpression, *instead
> of *fieldIndexArray* we can just do away with *fieldNames *array that
> includes fieldName at every level for the nested field.


I don't think keeping only the field names array would work. At the end of
the day, the contract between Flink SQL and the connectors is based on the
indexes, not the names. Technically speaking, the connectors only emit a
bunch of RowData which is based on positions. The field names are added by
the SQL framework via the DDL for those RowData. In this sense, the
connectors may not be aware of the field names in Flink DDL at all. The
common language between Flink SQL and source is just positions. This is
also why ProjectionPushDown would work by only relying on the indexes, not
the field names. So I think the field index array is a must have here in
the NestedFieldReferenceExpression.

Thanks,

Jiangjie (Becket) Qin

On Fri, Sep 1, 2023 at 8:12 AM Venkatakrishnan Sowrirajan <vsowr...@asu.edu>
wrote:

> Gentle ping on the vote for FLIP-356: Support Nested fields filter pushdown
> <https://www.mail-archive.com/dev@flink.apache.org/msg69289.html>.
>
> Regards
> Venkata krishnan
>
>
> On Tue, Aug 29, 2023 at 9:18 PM Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu>
> wrote:
>
> > Sure, will reference this discussion to resume where we started as part
> of
> > the flip to refactor SupportsProjectionPushDown.
> >
> > On Tue, Aug 29, 2023, 7:22 PM Jark Wu <imj...@gmail.com> wrote:
> >
> >> I'm fine with this. `ReferenceExpression` and
> `SupportsProjectionPushDown`
> >> can be another FLIP. However, could you summarize the design of this
> part
> >> in the future part of the FLIP? This can be easier to get started with
> in
> >> the future.
> >>
> >>
> >> Best,
> >> Jark
> >>
> >>
> >> On Wed, 30 Aug 2023 at 02:45, Venkatakrishnan Sowrirajan <
> >> vsowr...@asu.edu>
> >> wrote:
> >>
> >> > Thanks Jark. Sounds good.
> >> >
> >> > One more thing, earlier in my summary I mentioned,
> >> >
> >> > Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*)
> >> > > abstract class which will be extended by both
> >> *FieldReferenceExpression*
> >> > >  and *NestedFieldReferenceExpression* (to be introduced as part of
> >> this
> >> > > FLIP)
> >> >
> >> > This can be punted for now and can be handled as part of refactoring
> >> > SupportsProjectionPushDown.
> >> >
> >> > Also I made minor changes to the *NestedFieldReferenceExpression,
> >> *instead
> >> > of *fieldIndexArray* we can just do away with *fieldNames *array that
> >> > includes fieldName at every level for the nested field.
> >> >
> >> > Updated the FLIP-357
> >> > <
> >> >
> >>
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!YAk6kV4CYvUSPfpoUDQRs6VlbmJXVX8KOKqFxKbNDkUWKzShvwpkLRGkAV1tgV3EqClNrjGS-Ij86Q$
> >> > >
> >> > wiki as well.
> >> >
> >> > Regards
> >> > Venkata krishnan
> >> >
> >> >
> >> > On Tue, Aug 29, 2023 at 5:21 AM Jark Wu <imj...@gmail.com> wrote:
> >> >
> >> > > Hi Venkata,
> >> > >
> >> > > Your summary looks good to me. +1 to start a vote.
> >> > >
> >> > > I think we don't need "inputIndex" in
> NestedFieldReferenceExpression.
> >> > > Actually, I think it is also not needed in FieldReferenceExpression,
> >> > > and we should try to remove it (another topic). The RexInputRef in
> >> > Calcite
> >> > > also doesn't require an inputIndex because the field index should
> >> > represent
> >> > > index of the field in the underlying row type. Field references
> >> shouldn't
> >> > > be
> >> > >  aware of the number of inputs.
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > >
> >> > > On Tue, 29 Aug 2023 at 02:24, Venkatakrishnan Sowrirajan <
> >> > vsowr...@asu.edu
> >> > > >
> >> > > wrote:
> >> > >
> >> > > > Hi Jinsong,
> >> > > >
> >> > > > Thanks for your comments.
> >> > > >
> >> > > > What is inputIndex in NestedFieldReferenceExpression?
> >> > > >
> >> > > >
> >> > > > I haven't looked at it before. Do you mean, given that it is now
> >> only
> >> > > used
> >> > > > to push filters it won't be subsequently used in further
> >> > > > planning/optimization and therefore it is not required at this
> time?
> >> > > >
> >> > > > So if NestedFieldReferenceExpression doesn't need inputIndex, is
> >> there
> >> > > > > a need to introduce a base class `ReferenceExpression`?
> >> > > >
> >> > > > For SupportsFilterPushDown itself, *ReferenceExpression* base
> class
> >> is
> >> > > not
> >> > > > needed. But there were discussions around cleaning up and
> >> standardizing
> >> > > the
> >> > > > API for Supports*PushDown. SupportsProjectionPushDown currently
> >> pushes
> >> > > the
> >> > > > projects as a 2-d array, instead it would be better to use the
> >> standard
> >> > > API
> >> > > > which seems to be the *ResolvedExpression*. For
> >> > > SupportsProjectionPushDown
> >> > > > either FieldReferenceExpression (top level fields) or
> >> > > > NestedFieldReferenceExpression (nested fields) is enough, in order
> >> to
> >> > > > provide a single API that handles both top level and nested
> fields,
> >> > > > ReferenceExpression will be introduced as a base class.
> >> > > >
> >> > > > Eventually, *SupportsProjectionPushDown#applyProjections* would
> >> evolve
> >> > as
> >> > > > applyProjection(List<ReferenceExpression> projectedFields) and
> >> nested
> >> > > > fields would be pushed only if *supportsNestedProjections* returns
> >> > true.
> >> > > >
> >> > > > Regards
> >> > > > Venkata krishnan
> >> > > >
> >> > > >
> >> > > > On Sun, Aug 27, 2023 at 11:12 PM Jingsong Li <
> >> jingsongl...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > So if NestedFieldReferenceExpression doesn't need inputIndex, is
> >> > there
> >> > > > > a need to introduce a base class `ReferenceExpression`?
> >> > > > >
> >> > > > > Best,
> >> > > > > Jingsong
> >> > > > >
> >> > > > > On Mon, Aug 28, 2023 at 2:09 PM Jingsong Li <
> >> jingsongl...@gmail.com>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > Hi thanks all for your discussion.
> >> > > > > >
> >> > > > > > What is inputIndex in NestedFieldReferenceExpression?
> >> > > > > >
> >> > > > > > I know inputIndex has special usage in
> FieldReferenceExpression,
> >> > but
> >> > > > > > it is only for Join operators, and it is only for SQL
> >> optimization.
> >> > > It
> >> > > > > > looks like there is no requirement for Nested.
> >> > > > > >
> >> > > > > > Best,
> >> > > > > > Jingsong
> >> > > > > >
> >> > > > > > On Mon, Aug 28, 2023 at 1:13 PM Venkatakrishnan Sowrirajan
> >> > > > > > <vsowr...@asu.edu> wrote:
> >> > > > > > >
> >> > > > > > > Thanks for all the feedback and discussion everyone. Looks
> >> like
> >> > we
> >> > > > have
> >> > > > > > > reached a consensus here.
> >> > > > > > >
> >> > > > > > > Just to summarize:
> >> > > > > > >
> >> > > > > > > 1. Introduce a new *ReferenceExpression* (or
> >> > > > *BaseReferenceExpression*)
> >> > > > > > > abstract class which will be extended by both
> >> > > > > *FieldReferenceExpression*
> >> > > > > > > and *NestedFieldReferenceExpression* (to be introduced as
> >> part of
> >> > > > this
> >> > > > > FLIP)
> >> > > > > > > 2. No need of *supportsNestedFilters *check as the current
> >> > > > > > > *SupportsFilterPushDown* should already ignore unknown
> >> > expressions
> >> > > (
> >> > > > > > > *NestedFieldReferenceExpression* for example) and return
> them
> >> as
> >> > > > > > > *remainingFilters.
> >> > > > > > > *Maybe this should be clarified explicitly in the Javadoc of
> >> > > > > > > *SupportsFilterPushDown.
> >> > > > > > > *I will file a separate JIRA to fix the documentation.
> >> > > > > > > 3. Refactor *SupportsProjectionPushDown* to use
> >> > > *ReferenceExpression
> >> > > > > *instead
> >> > > > > > > of existing 2-d arrays to consolidate and be consistent with
> >> > other
> >> > > > > > > Supports*PushDown APIs - *outside the scope of this FLIP*
> >> > > > > > > 4. Similarly *SupportsAggregatePushDown* should also be
> >> evolved
> >> > > > > whenever
> >> > > > > > > nested fields support is added to use the
> >> *ReferenceExpression -
> >> > > > > **outside
> >> > > > > > > the scope of this FLIP*
> >> > > > > > >
> >> > > > > > > Does this sound good? Please let me know if I have missed
> >> > anything
> >> > > > > here. If
> >> > > > > > > there are no concerns, I will start a vote tomorrow. I will
> >> also
> >> > > get
> >> > > > > the
> >> > > > > > > FLIP-356 wiki updated. Thanks everyone once again!
> >> > > > > > >
> >> > > > > > > Regards
> >> > > > > > > Venkata krishnan
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Thu, Aug 24, 2023 at 8:19 PM Becket Qin <
> >> becket....@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi Jark,
> >> > > > > > > >
> >> > > > > > > > How about having a separate
> NestedFieldReferenceExpression,
> >> and
> >> > > > > > > > > abstracting a common base class "ReferenceExpression"
> for
> >> > > > > > > > > NestedFieldReferenceExpression and
> >> FieldReferenceExpression?
> >> > > This
> >> > > > > makes
> >> > > > > > > > > unifying expressions in
> >> > > > > > > > >
> >> > > > >
> >> > "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> >> > > > > > > > > ...)"
> >> > > > > > > > > possible.
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > I'd be fine with this. It at least provides a consistent
> API
> >> > > style
> >> > > > /
> >> > > > > > > > formality.
> >> > > > > > > >
> >> > > > > > > >  Re: Yunhong,
> >> > > > > > > >
> >> > > > > > > > 3. Finally, I think we need to look at the costs and
> >> benefits
> >> > of
> >> > > > > unifying
> >> > > > > > > > > the SupportsFilterPushDown and
> SupportsProjectionPushDown
> >> (or
> >> > > > > others)
> >> > > > > > > > from
> >> > > > > > > > > the perspective of interface implementers. A stable API
> >> can
> >> > > > reduce
> >> > > > > user
> >> > > > > > > > > development and change costs, if the current API can
> fully
> >> > meet
> >> > > > the
> >> > > > > > > > > functional requirements at the framework level, I
> personal
> >> > > > suggest
> >> > > > > > > > reducing
> >> > > > > > > > > the impact on connector developers.
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > > > I agree that the cost and benefit should be measured. And
> >> the
> >> > > > > measurement
> >> > > > > > > > should be in the long term instead of short term. That is
> >> why
> >> > we
> >> > > > > always
> >> > > > > > > > need to align on the ideal end state first.
> >> > > > > > > > Meeting functionality requirements is the bare minimum bar
> >> for
> >> > an
> >> > > > > API.
> >> > > > > > > > Simplicity, intuitiveness, robustness and evolvability are
> >> also
> >> > > > > important.
> >> > > > > > > > In addition, for projects with many APIs, such as Flink, a
> >> > > > > consistent API
> >> > > > > > > > style is also critical for the user adoption as well as
> bug
> >> > > > > avoidance. It
> >> > > > > > > > is very helpful for the community to agree on some API
> >> design
> >> > > > > conventions /
> >> > > > > > > > principles.
> >> > > > > > > > For example, in this particular case, via our discussion,
> >> > > hopefully
> >> > > > > we sort
> >> > > > > > > > of established the following API design conventions /
> >> > principles
> >> > > > for
> >> > > > > all
> >> > > > > > > > the Supports*PushDown interfaces.
> >> > > > > > > >
> >> > > > > > > > 1. By default, expressions should be used if applicable
> >> instead
> >> > > of
> >> > > > > other
> >> > > > > > > > representations.
> >> > > > > > > > 2. In general, the pushdown method should not assume all
> the
> >> > > > > pushdowns will
> >> > > > > > > > succeed. So the applyX() method should return a boolean or
> >> > > List<X>,
> >> > > > > to
> >> > > > > > > > handle the cases that some of the pushdowns cannot be
> >> fulfilled
> >> > > by
> >> > > > > the
> >> > > > > > > > implementation.
> >> > > > > > > >
> >> > > > > > > > Establishing such conventions and principles demands
> careful
> >> > > > > thinking for
> >> > > > > > > > the aspects I mentioned earlier in addition to the API
> >> > > > > functionalities.
> >> > > > > > > > This helps lower the bar of understanding, reduces the
> >> chance
> >> > of
> >> > > > > having
> >> > > > > > > > loose ends in the API, and will benefit all the
> >> participants in
> >> > > the
> >> > > > > project
> >> > > > > > > > over time. I think this is the right way to achieve real
> API
> >> > > > > stability.
> >> > > > > > > > Otherwise, we may end up chasing our tails to find ways
> not
> >> to
> >> > > > > change the
> >> > > > > > > > existing non-ideal APIs.
> >> > > > > > > >
> >> > > > > > > > Thanks,
> >> > > > > > > >
> >> > > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > >
> >> > > > > > > > On Fri, Aug 25, 2023 at 9:33 AM yh z <
> >> zhengyunhon...@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi, Venkat,
> >> > > > > > > > >
> >> > > > > > > > > Thanks for the FLIP, it sounds good to support nested
> >> fields
> >> > > > filter
> >> > > > > > > > > pushdown. Based on the design of flip and the above
> >> options,
> >> > I
> >> > > > > would like
> >> > > > > > > > > to make a few suggestions:
> >> > > > > > > > >
> >> > > > > > > > > 1.  At present, introducing
> NestedFieldReferenceExpression
> >> > > looks
> >> > > > > like a
> >> > > > > > > > > better solution, which can fully meet our requirements
> >> while
> >> > > > > reducing
> >> > > > > > > > > modifications to base class FieldReferenceExpression. In
> >> the
> >> > > long
> >> > > > > run, I
> >> > > > > > > > > tend to abstract a basic class for
> >> > > NestedFieldReferenceExpression
> >> > > > > and
> >> > > > > > > > > FieldReferenceExpression as u suggested.
> >> > > > > > > > >
> >> > > > > > > > > 2. Personally, I don't recommend introducing
> >> > > > > *supportsNestedFilters() in
> >> > > > > > > > > supportsFilterPushdown. We just need to better declare
> the
> >> > > return
> >> > > > > value
> >> > > > > > > > of
> >> > > > > > > > > the method *applyFilters.
> >> > > > > > > > >
> >> > > > > > > > > 3. Finally, I think we need to look at the costs and
> >> benefits
> >> > > of
> >> > > > > unifying
> >> > > > > > > > > the SupportsFilterPushDown and
> SupportsProjectionPushDown
> >> (or
> >> > > > > others)
> >> > > > > > > > from
> >> > > > > > > > > the perspective of interface implementers. A stable API
> >> can
> >> > > > reduce
> >> > > > > user
> >> > > > > > > > > development and change costs, if the current API can
> fully
> >> > meet
> >> > > > the
> >> > > > > > > > > functional requirements at the framework level, I
> personal
> >> > > > suggest
> >> > > > > > > > reducing
> >> > > > > > > > > the impact on connector developers.
> >> > > > > > > > >
> >> > > > > > > > > Regards,
> >> > > > > > > > > Yunhong Zheng (Swuferhong)
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > Venkatakrishnan Sowrirajan <vsowr...@asu.edu>
> >> 于2023年8月25日周五
> >> > > > > 01:25写道:
> >> > > > > > > > >
> >> > > > > > > > > > To keep it backwards compatible, introduce another API
> >> > > > > *applyAggregates
> >> > > > > > > > > > *with
> >> > > > > > > > > > *List<ReferenceExpression> *when nested field support
> is
> >> > > added
> >> > > > > and
> >> > > > > > > > > > deprecate the current API. This will by default throw
> an
> >> > > > > exception. In
> >> > > > > > > > > > flink planner, *applyAggregates *with nested fields
> and
> >> if
> >> > it
> >> > > > > throws
> >> > > > > > > > > > exception then *applyAggregates* without nested
> fields.
> >> > > > > > > > > >
> >> > > > > > > > > > Regards
> >> > > > > > > > > > Venkata krishnan
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > On Thu, Aug 24, 2023 at 10:13 AM Venkatakrishnan
> >> > Sowrirajan <
> >> > > > > > > > > > vsowr...@asu.edu> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Jark,
> >> > > > > > > > > > >
> >> > > > > > > > > > > How about having a separate
> >> > NestedFieldReferenceExpression,
> >> > > > and
> >> > > > > > > > > > >> abstracting a common base class
> "ReferenceExpression"
> >> > for
> >> > > > > > > > > > >> NestedFieldReferenceExpression and
> >> > > FieldReferenceExpression?
> >> > > > > This
> >> > > > > > > > > makes
> >> > > > > > > > > > >> unifying expressions in
> >> > > > > > > > > > >>
> >> > > > > > > >
> >> > > > >
> >> > "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> >> > > > > > > > > > >> ...)"
> >> > > > > > > > > > >> possible.
> >> > > > > > > > > > >
> >> > > > > > > > > > > This should be fine for *SupportsProjectionPushDown*
> >> and
> >> > > > > > > > > > > *SupportsFilterPushDown*. One concern in the case of
> >> > > > > > > > > > > *SupportsAggregatePushDown* with nested fields
> support
> >> > (to
> >> > > be
> >> > > > > added
> >> > > > > > > > in
> >> > > > > > > > > > > the future), with this proposal, the API will become
> >> > > > backwards
> >> > > > > > > > > > incompatible
> >> > > > > > > > > > > as the *args *for the aggregate function is
> >> > > > > > > > > > *List<FieldReferenceExpression>
> >> > > > > > > > > > > *that needs to change to
> *List<ReferenceExpression>*.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Regards
> >> > > > > > > > > > > Venkata krishnan
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Thu, Aug 24, 2023 at 1:18 AM Jark Wu <
> >> > imj...@gmail.com>
> >> > > > > wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > >> Hi Becket,
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> I think it is the second case, that a
> >> > > > > FieldReferenceExpression is
> >> > > > > > > > > > >> constructed
> >> > > > > > > > > > >> by the framework and passed to the connector
> >> (interfaces
> >> > > > > listed by
> >> > > > > > > > > > >> Venkata[1]
> >> > > > > > > > > > >> and Catalog#listPartitionsByFilter). Besides,
> >> > > understanding
> >> > > > > the
> >> > > > > > > > nested
> >> > > > > > > > > > >> field
> >> > > > > > > > > > >> is optional for users/connectors (just treat it as
> an
> >> > > > unknown
> >> > > > > > > > > expression
> >> > > > > > > > > > >> if
> >> > > > > > > > > > >> the
> >> > > > > > > > > > >> connector doesn't want to support it).
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> If we extend FieldReferenceExpression, in the case
> of
> >> > > "where
> >> > > > > > > > > col.nested
> >> > > > > > > > > > >
> >> > > > > > > > > > >> 10",
> >> > > > > > > > > > >> for the connectors already supported filter/delete
> >> > > pushdown,
> >> > > > > they
> >> > > > > > > > may
> >> > > > > > > > > > >> wrongly
> >> > > > > > > > > > >> pushdown "col > 10" instead of "nested > 10"
> because
> >> > they
> >> > > > > still
> >> > > > > > > > treat
> >> > > > > > > > > > >> FieldReferenceExpression as a top-level column.
> This
> >> > > problem
> >> > > > > can be
> >> > > > > > > > > > >> resolved
> >> > > > > > > > > > >> by introducing an additional
> >> "supportedNestedPushdown"
> >> > for
> >> > > > > each
> >> > > > > > > > > > interface,
> >> > > > > > > > > > >> but that method is not elegant and is hard to
> remove
> >> in
> >> > > the
> >> > > > > future,
> >> > > > > > > > > and
> >> > > > > > > > > > >> this could
> >> > > > > > > > > > >> be avoided if we have a separate
> >> > > > > NestedFieldReferenceExpression.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> If we want to extend FieldReferenceExpression, we
> >> have
> >> > to
> >> > > > add
> >> > > > > > > > > > protections
> >> > > > > > > > > > >> for every related API in one shot. Besides,
> >> > > > > FieldReferenceExpression
> >> > > > > > > > > is
> >> > > > > > > > > > a
> >> > > > > > > > > > >> fundamental class in the planner, we have to go
> >> through
> >> > > all
> >> > > > > the code
> >> > > > > > > > > > that
> >> > > > > > > > > > >> is using it to make sure it properly handling it if
> >> it
> >> > is
> >> > > a
> >> > > > > nested
> >> > > > > > > > > field
> >> > > > > > > > > > >> which
> >> > > > > > > > > > >> is a big effort for the community.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> If we were designing this API on day 1, I fully
> >> support
> >> > > > > merging them
> >> > > > > > > > > in
> >> > > > > > > > > > a
> >> > > > > > > > > > >> FieldReferenceExpression. But in this case, I'm
> >> thinking
> >> > > > > about how
> >> > > > > > > > to
> >> > > > > > > > > > >> provide
> >> > > > > > > > > > >> users with a smooth migration path, and allow the
> >> > > community
> >> > > > to
> >> > > > > > > > > gradually
> >> > > > > > > > > > >> put efforts into evolving the API, and not block
> the
> >> > > "Nested
> >> > > > > Fields
> >> > > > > > > > > > Filter
> >> > > > > > > > > > >> Pushdown"
> >> > > > > > > > > > >> requirement.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> How about having a separate
> >> > > NestedFieldReferenceExpression,
> >> > > > > and
> >> > > > > > > > > > >> abstracting a common base class
> "ReferenceExpression"
> >> > for
> >> > > > > > > > > > >> NestedFieldReferenceExpression and
> >> > > FieldReferenceExpression?
> >> > > > > This
> >> > > > > > > > > makes
> >> > > > > > > > > > >> unifying expressions in
> >> > > > > > > > > > >>
> >> > > > > > > >
> >> > > > >
> >> > "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> >> > > > > > > > > > >> ...)"
> >> > > > > > > > > > >> possible.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> Best,
> >> > > > > > > > > > >> Jark
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> On Thu, 24 Aug 2023 at 07:00, Venkatakrishnan
> >> > Sowrirajan <
> >> > > > > > > > > > >> vsowr...@asu.edu>
> >> > > > > > > > > > >> wrote:
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> > Becket and Jark,
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> >  Deprecate all the other
> >> > > > > > > > > > >> > > methods except tryApplyFilters() and
> >> > > > > tryApplyProjections().
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > For *SupportsProjectionPushDown*, we still need a
> >> > > > > > > > > > >> > *supportsNestedProjections* API on the table
> >> source as
> >> > > > some
> >> > > > > of the
> >> > > > > > > > > > table
> >> > > > > > > > > > >> > sources might not be able to handle nested fields
> >> and
> >> > > > > therefore
> >> > > > > > > > the
> >> > > > > > > > > > >> Flink
> >> > > > > > > > > > >> > planner should not push down the nested
> >> projections or
> >> > > > else
> >> > > > > the
> >> > > > > > > > > > >> > *applyProjection
> >> > > > > > > > > > >> > *API has to be appropriately changed to return
> >> > > > > > > > > > >> > *unconvertibleProjections *similar
> >> > > > > > > > > > >> > to *SupportsFilterPushDown*.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > Or we have to introduce two different
> >> > applyProjections()
> >> > > > > > > > > > >> > > methods for FieldReferenceExpression /
> >> > > > > > > > > > NestedFieldReferenceExpression
> >> > > > > > > > > > >> > > respectively.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > Agree this is not preferred. Given that
> >> > > > > *supportNestedProjections
> >> > > > > > > > > > >> *cannot
> >> > > > > > > > > > >> > be deprecated/removed based on the current API
> >> form,
> >> > > > > extending
> >> > > > > > > > > > >> > *FieldReferenceExpression* to support nested
> fields
> >> > > should
> >> > > > > be
> >> > > > > > > > okay.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > Another alternative could be to change
> >> > *applyProjections
> >> > > > > *to take
> >> > > > > > > > > > >> > List<ResolvedExpression> and on the connector
> side
> >> > they
> >> > > > > choose to
> >> > > > > > > > > > handle
> >> > > > > > > > > > >> > *FieldReferenceExpression* and
> >> > > > > *NestedFieldReferenceExpression *as
> >> > > > > > > > > > >> > applicable and return the remainingProjections.
> In
> >> the
> >> > > > case
> >> > > > > of
> >> > > > > > > > > nested
> >> > > > > > > > > > >> field
> >> > > > > > > > > > >> > projections not supported, it should return them
> >> back
> >> > > but
> >> > > > > only
> >> > > > > > > > > > >> projecting
> >> > > > > > > > > > >> > the top level fields. IMO, this is also *not
> >> > preferred*.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > *SupportsAggregatePushDown*
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > *AggregateExpression *currently takes in a list
> of
> >> > > > > > > > > > >> > *FieldReferenceExpression* as args for the
> >> aggregate
> >> > > > > function, if
> >> > > > > > > > in
> >> > > > > > > > > > >> future
> >> > > > > > > > > > >> > *SupportsAggregatePushDown* adds support for
> >> aggregate
> >> > > > > pushdown on
> >> > > > > > > > > > >> nested
> >> > > > > > > > > > >> > fields then the AggregateExpression API also has
> to
> >> > > change
> >> > > > > if a
> >> > > > > > > > new
> >> > > > > > > > > > >> > NestedFieldReferenceExpression is introduced for
> >> > nested
> >> > > > > fields.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > If we add a
> >> > > > > > > > > > >> > > flag for each new filter,
> >> > > > > > > > > > >> > > the interface will be filled with lots of flags
> >> > (e.g.,
> >> > > > > > > > > > >> supportsBetween,
> >> > > > > > > > > > >> > > supportsIN)
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > In an ideal situation, I completely agree with
> you.
> >> > But
> >> > > in
> >> > > > > the
> >> > > > > > > > > current
> >> > > > > > > > > > >> > state, *supportsNestedFilters* can act as a
> bridge
> >> to
> >> > > > reach
> >> > > > > the
> >> > > > > > > > > > eventual
> >> > > > > > > > > > >> > desired state which is to have a clean and
> >> consistent
> >> > > set
> >> > > > > of APIs
> >> > > > > > > > > > >> > throughout all Supports*PushDown.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > Also shared some thoughts on the end state API
> >> > > > > > > > > > >> > <
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >>
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://urldefense.com/v3/__https://docs.google.com/document/d/1stLRPKOcxlEv8eHblkrOh0Zf5PLM-h76WMhEINHOyPY/edit?usp=sharing__;!!IKRxdwAv5BmarQ!ZZ2nS1PYlXLnEGFcikS3NsYG7tMaV3wU_z7FmvihNwQBmoLZk2WmcpuRWszK0FFmsInh9A6cndkJrQ$
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > with extension to the *FieldReferenceExpression*
> to
> >> > > > support
> >> > > > > nested
> >> > > > > > > > > > >> fields.
> >> > > > > > > > > > >> > Please take a look.
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > Regards
> >> > > > > > > > > > >> > Venkata krishnan
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > On Tue, Aug 22, 2023 at 5:02 PM Becket Qin <
> >> > > > > becket....@gmail.com>
> >> > > > > > > > > > >> wrote:
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >> > > Hi Jark,
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > Regarding the migration path, it would be
> useful
> >> to
> >> > > > > scrutinize
> >> > > > > > > > the
> >> > > > > > > > > > use
> >> > > > > > > > > > >> > case
> >> > > > > > > > > > >> > > of FiledReferenceExpression and
> >> ResolvedExpressions.
> >> > > > > There are
> >> > > > > > > > two
> >> > > > > > > > > > >> kinds
> >> > > > > > > > > > >> > of
> >> > > > > > > > > > >> > > use cases:
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > 1. A ResolvedExpression is constructed by the
> >> user
> >> > or
> >> > > > > connector
> >> > > > > > > > /
> >> > > > > > > > > > >> plugin
> >> > > > > > > > > > >> > > developers.
> >> > > > > > > > > > >> > > 2. A ResolvedExpression is constructed by the
> >> > > framework
> >> > > > > and
> >> > > > > > > > passed
> >> > > > > > > > > > to
> >> > > > > > > > > > >> > user
> >> > > > > > > > > > >> > > or connector / plugin developers.
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > For the first case, both of the approaches
> >> provide
> >> > the
> >> > > > > same
> >> > > > > > > > > > migration
> >> > > > > > > > > > >> > > experience.
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > For the second case, generally speaking,
> >> introducing
> >> > > > > > > > > > >> > > NestedFieldReferenceExpression and extending
> >> > > > > > > > > > FieldReferenceExpression
> >> > > > > > > > > > >> > would
> >> > > > > > > > > > >> > > have the same impact for backwards
> compatibility.
> >> > > > > > > > > > >> SupportsFilterPushDown
> >> > > > > > > > > > >> > is
> >> > > > > > > > > > >> > > a special case here because understanding the
> >> filter
> >> > > > > expressions
> >> > > > > > > > > is
> >> > > > > > > > > > >> > > optional for the source implementation. In
> other
> >> use
> >> > > > > cases, if
> >> > > > > > > > > > >> > > understanding the reference to a nested field
> is
> >> a
> >> > > must
> >> > > > > have,
> >> > > > > > > > the
> >> > > > > > > > > > user
> >> > > > > > > > > > >> > code
> >> > > > > > > > > > >> > > has to be changed, regardless of which approach
> >> we
> >> > > take
> >> > > > to
> >> > > > > > > > support
> >> > > > > > > > > > >> nested
> >> > > > > > > > > > >> > > fields.
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > Therefore, I think we have to check each public
> >> API
> >> > > > where
> >> > > > > the
> >> > > > > > > > > nested
> >> > > > > > > > > > >> > field
> >> > > > > > > > > > >> > > reference is exposed. If we have many public
> APIs
> >> > > where
> >> > > > > > > > > > understanding
> >> > > > > > > > > > >> > > nested fields is optional for the user  /
> plugin
> >> /
> >> > > > > connector
> >> > > > > > > > > > >> developers,
> >> > > > > > > > > > >> > > having a separate
> NestedFieldReferenceExpression
> >> > would
> >> > > > > have a
> >> > > > > > > > more
> >> > > > > > > > > > >> smooth
> >> > > > > > > > > > >> > > migration. Otherwise, there seems to be no
> >> > difference
> >> > > > > between
> >> > > > > > > > the
> >> > > > > > > > > > two
> >> > > > > > > > > > >> > > approaches.
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > Migration path aside, the main reason I prefer
> >> > > extending
> >> > > > > > > > > > >> > > FieldReferenceExpression over a new
> >> > > > > > > > NestedFieldReferenceExpression
> >> > > > > > > > > > is
> >> > > > > > > > > > >> > > because this makes the
> SupportsProjectionPushDown
> >> > > > > interface
> >> > > > > > > > > simpler.
> >> > > > > > > > > > >> > > Otherwise, we have to treat it as a special
> case
> >> > that
> >> > > > > does not
> >> > > > > > > > > match
> >> > > > > > > > > > >> the
> >> > > > > > > > > > >> > > overall API style. Or we have to introduce two
> >> > > different
> >> > > > > > > > > > >> > applyProjections()
> >> > > > > > > > > > >> > > methods for FieldReferenceExpression /
> >> > > > > > > > > > NestedFieldReferenceExpression
> >> > > > > > > > > > >> > > respectively. This issue further extends to
> >> > > > > implementation in
> >> > > > > > > > > > >> addition to
> >> > > > > > > > > > >> > > public API. A single FieldReferenceExpression
> >> might
> >> > > help
> >> > > > > > > > simplify
> >> > > > > > > > > > the
> >> > > > > > > > > > >> > > implementation code a little bit. For example,
> >> in a
> >> > > > > recursive
> >> > > > > > > > > > >> processing
> >> > > > > > > > > > >> > of
> >> > > > > > > > > > >> > > a row with nested rows, we may not need to
> switch
> >> > > > between
> >> > > > > > > > > > >> > > FieldReferenceExpression and
> >> > > > > NestedFieldReferenceExpression
> >> > > > > > > > > > depending
> >> > > > > > > > > > >> on
> >> > > > > > > > > > >> > > whether the record being processed is a top
> level
> >> > > record
> >> > > > > or
> >> > > > > > > > nested
> >> > > > > > > > > > >> > record.
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > Thanks,
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > Jiangjie (Becket) Qin
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > On Tue, Aug 22, 2023 at 11:43 PM Jark Wu <
> >> > > > > imj...@gmail.com>
> >> > > > > > > > > wrote:
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > > Hi Becket,
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > > > I totally agree we should try to have a
> >> consistent
> >> > > API
> >> > > > > for a
> >> > > > > > > > > final
> >> > > > > > > > > > >> > state.
> >> > > > > > > > > > >> > > > The only concern I have mentioned is the
> >> "smooth"
> >> > > > > migration
> >> > > > > > > > > path.
> >> > > > > > > > > > >> > > > The FiledReferenceExpression is widely used
> in
> >> > many
> >> > > > > public
> >> > > > > > > > APIs,
> >> > > > > > > > > > >> > > > not only in the SupportsFilterPushDown. Yes,
> we
> >> > can
> >> > > > > change
> >> > > > > > > > every
> >> > > > > > > > > > >> > > > methods in 2-steps, but is it good to change
> >> API
> >> > > back
> >> > > > > and
> >> > > > > > > > forth
> >> > > > > > > > > > for
> >> > > > > > > > > > >> > this?
> >> > > > > > > > > > >> > > > Personally, I'm fine with a separate
> >> > > > > > > > > > NestedFieldReferenceExpression
> >> > > > > > > > > > >> > > class.
> >> > > > > > > > > > >> > > > TBH, I prefer the separated way because it
> >> makes
> >> > the
> >> > > > > reference
> >> > > > > > > > > > >> > expression
> >> > > > > > > > > > >> > > > more clear and concise.
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > > > Best,
> >> > > > > > > > > > >> > > > Jark
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > > > On Tue, 22 Aug 2023 at 16:53, Becket Qin <
> >> > > > > > > > becket....@gmail.com>
> >> > > > > > > > > > >> wrote:
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > > > > Thanks for the reply, Jark.
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > I think it will be helpful to understand
> the
> >> > final
> >> > > > > state we
> >> > > > > > > > > want
> >> > > > > > > > > > >> to
> >> > > > > > > > > > >> > > > > eventually achieve first, then we can
> discuss
> >> > the
> >> > > > > steps
> >> > > > > > > > > towards
> >> > > > > > > > > > >> that
> >> > > > > > > > > > >> > > > final
> >> > > > > > > > > > >> > > > > state.
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > It looks like there are two proposed end
> >> states
> >> > > now:
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > 1. Have a separate
> >> > NestedFieldReferenceExpression
> >> > > > > class;
> >> > > > > > > > keep
> >> > > > > > > > > > >> > > > > SupportsFilterPushDown and
> >> > > > SupportsProjectionPushDown
> >> > > > > the
> >> > > > > > > > > same.
> >> > > > > > > > > > >> It is
> >> > > > > > > > > > >> > > > just
> >> > > > > > > > > > >> > > > > a one step change.
> >> > > > > > > > > > >> > > > >    - Regarding the
> >> > supportsNestedFilterPushDown()
> >> > > > > method, if
> >> > > > > > > > > our
> >> > > > > > > > > > >> > > contract
> >> > > > > > > > > > >> > > > > with the connector developer today is "The
> >> > > > > implementation
> >> > > > > > > > > should
> >> > > > > > > > > > >> > ignore
> >> > > > > > > > > > >> > > > > unrecognized expressions by putting them
> into
> >> > the
> >> > > > > remaining
> >> > > > > > > > > > >> filters,
> >> > > > > > > > > > >> > > > > instead of throwing exceptions". Then there
> >> is
> >> > no
> >> > > > > need for
> >> > > > > > > > > this
> >> > > > > > > > > > >> > > method. I
> >> > > > > > > > > > >> > > > > am not sure about the current contract. We
> >> > should
> >> > > > > probably
> >> > > > > > > > > make
> >> > > > > > > > > > it
> >> > > > > > > > > > >> > > clear
> >> > > > > > > > > > >> > > > in
> >> > > > > > > > > > >> > > > > the interface Java doc.
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > 2. Extend the existing
> >> FiledReferenceExpression
> >> > > > class
> >> > > > > to
> >> > > > > > > > > support
> >> > > > > > > > > > >> > nested
> >> > > > > > > > > > >> > > > > fields; SupportsFilterPushDown only has one
> >> > method
> >> > > > of
> >> > > > > > > > > > >> > > > > applyFilters(List<ResolvedExpression>);
> >> > > > > > > > > > SupportsProjectionPushDown
> >> > > > > > > > > > >> > only
> >> > > > > > > > > > >> > > > has
> >> > > > > > > > > > >> > > > > one method of
> >> > > > > > > > applyProjections(List<FieldReferenceExpression>,
> >> > > > > > > > > > >> > > DataType).
> >> > > > > > > > > > >> > > > > It could just be two steps if we are not
> too
> >> > > > obsessed
> >> > > > > with
> >> > > > > > > > the
> >> > > > > > > > > > >> exact
> >> > > > > > > > > > >> > > > names
> >> > > > > > > > > > >> > > > > of "applyFilters" and "applyProjections".
> >> More
> >> > > > > specifically,
> >> > > > > > > > > it
> >> > > > > > > > > > >> takes
> >> > > > > > > > > > >> > > two
> >> > > > > > > > > > >> > > > > steps to achieve this final state:
> >> > > > > > > > > > >> > > > >     a. introduce a new method
> >> > > > > > > > > > >> > tryApplyFilters(List<ResolvedExpression>)
> >> > > > > > > > > > >> > > > to
> >> > > > > > > > > > >> > > > > SupportsFilterPushDown, which may have
> >> > > > > > > > > FiledReferenceExpression
> >> > > > > > > > > > >> with
> >> > > > > > > > > > >> > > > nested
> >> > > > > > > > > > >> > > > > fields. The default implementation throws
> an
> >> > > > > exception. The
> >> > > > > > > > > > >> runtime
> >> > > > > > > > > > >> > > will
> >> > > > > > > > > > >> > > > > first call tryApplyFilters() with nested
> >> fields.
> >> > > In
> >> > > > > case of
> >> > > > > > > > > > >> > exception,
> >> > > > > > > > > > >> > > it
> >> > > > > > > > > > >> > > > > calls the existing applyFilters() without
> >> > > including
> >> > > > > the
> >> > > > > > > > nested
> >> > > > > > > > > > >> > filters.
> >> > > > > > > > > > >> > > > > Similarly, in SupportsProjectionPushDown,
> >> > > introduce
> >> > > > a
> >> > > > > > > > > > >> > > > >
> >> tryApplyProjections<List<NestedFieldReference>
> >> > > > method
> >> > > > > > > > > returning
> >> > > > > > > > > > a
> >> > > > > > > > > > >> > > Result.
> >> > > > > > > > > > >> > > > > The Result also contains the accepted and
> >> > > > unapplicable
> >> > > > > > > > > > >> projections.
> >> > > > > > > > > > >> > The
> >> > > > > > > > > > >> > > > > default implementation also throws an
> >> exception.
> >> > > > > Deprecate
> >> > > > > > > > all
> >> > > > > > > > > > the
> >> > > > > > > > > > >> > > other
> >> > > > > > > > > > >> > > > > methods except tryApplyFilters() and
> >> > > > > tryApplyProjections().
> >> > > > > > > > > > >> > > > >     b. remove the deprecated methods in the
> >> next
> >> > > > major
> >> > > > > > > > version
> >> > > > > > > > > > >> bump.
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > Now the question is putting the migration
> >> steps
> >> > > > > aside, which
> >> > > > > > > > > end
> >> > > > > > > > > > >> > state
> >> > > > > > > > > > >> > > do
> >> > > > > > > > > > >> > > > > we prefer? While the first end state is
> >> > acceptable
> >> > > > > for me,
> >> > > > > > > > > > >> > personally,
> >> > > > > > > > > > >> > > I
> >> > > > > > > > > > >> > > > > prefer the latter if we are designing from
> >> > > scratch.
> >> > > > > It is
> >> > > > > > > > > clean,
> >> > > > > > > > > > >> > > > consistent
> >> > > > > > > > > > >> > > > > and intuitive. Given the size of Flink,
> >> keeping
> >> > > APIs
> >> > > > > in the
> >> > > > > > > > > same
> >> > > > > > > > > > >> > style
> >> > > > > > > > > > >> > > > over
> >> > > > > > > > > > >> > > > > time is important. The migration is also
> not
> >> > that
> >> > > > > > > > complicated.
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > Thanks,
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > Jiangjie (Becket) Qin
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > On Tue, Aug 22, 2023 at 2:23 PM Jark Wu <
> >> > > > > imj...@gmail.com>
> >> > > > > > > > > > wrote:
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > > > > Hi Venkat,
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > Thanks for the proposal.
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > I have some minor comments about the
> FLIP.
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > 1. I think we don't need to
> >> > > > > > > > > > >> > > > > > add
> >> > > SupportsFilterPushDown#supportsNestedFilters()
> >> > > > > method,
> >> > > > > > > > > > >> > > > > > because connectors can skip nested
> filters
> >> by
> >> > > > > putting them
> >> > > > > > > > > in
> >> > > > > > > > > > >> > > > > > Result#remainingFilters().
> >> > > > > > > > > > >> > > > > > And this is backward-compatible because
> >> > unknown
> >> > > > > > > > expressions
> >> > > > > > > > > > were
> >> > > > > > > > > > >> > > added
> >> > > > > > > > > > >> > > > to
> >> > > > > > > > > > >> > > > > > the remaining filters.
> >> > > > > > > > > > >> > > > > > Planner should push predicate expressions
> >> as
> >> > > more
> >> > > > as
> >> > > > > > > > > possible.
> >> > > > > > > > > > >> If
> >> > > > > > > > > > >> > we
> >> > > > > > > > > > >> > > > add
> >> > > > > > > > > > >> > > > > a
> >> > > > > > > > > > >> > > > > > flag for each new filter,
> >> > > > > > > > > > >> > > > > > the interface will be filled with lots of
> >> > flags
> >> > > > > (e.g.,
> >> > > > > > > > > > >> > > supportsBetween,
> >> > > > > > > > > > >> > > > > > supportsIN).
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > 2.
> >> > > NestedFieldReferenceExpression#nestedFieldName
> >> > > > > should
> >> > > > > > > > be
> >> > > > > > > > > an
> >> > > > > > > > > > >> > array
> >> > > > > > > > > > >> > > of
> >> > > > > > > > > > >> > > > > > field names?
> >> > > > > > > > > > >> > > > > > Each string represents a field name part
> of
> >> > the
> >> > > > > field
> >> > > > > > > > path.
> >> > > > > > > > > > Just
> >> > > > > > > > > > >> > keep
> >> > > > > > > > > > >> > > > > > aligning with `nestedFieldIndexArray`.
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > 3. My concern about making
> >> > > > FieldReferenceExpression
> >> > > > > > > > support
> >> > > > > > > > > > >> nested
> >> > > > > > > > > > >> > > > fields
> >> > > > > > > > > > >> > > > > > is the compatibility.
> >> > > > > > > > > > >> > > > > > It is a public API and users/connectors
> are
> >> > > > already
> >> > > > > using
> >> > > > > > > > > it.
> >> > > > > > > > > > >> > People
> >> > > > > > > > > > >> > > > > > assumed it is a top-level column
> >> > > > > > > > > > >> > > > > > reference, and applied logic on it. But
> >> that's
> >> > > not
> >> > > > > true
> >> > > > > > > > now
> >> > > > > > > > > > and
> >> > > > > > > > > > >> > this
> >> > > > > > > > > > >> > > > may
> >> > > > > > > > > > >> > > > > > lead to unexpected errors.
> >> > > > > > > > > > >> > > > > > Having a separate
> >> > NestedFieldReferenceExpression
> >> > > > > sounds
> >> > > > > > > > > safer
> >> > > > > > > > > > to
> >> > > > > > > > > > >> > me.
> >> > > > > > > > > > >> > > > > Mixing
> >> > > > > > > > > > >> > > > > > them in a class may
> >> > > > > > > > > > >> > > > > >  confuse users what's the meaning of
> >> > > > getFieldName()
> >> > > > > and
> >> > > > > > > > > > >> > > > getFieldIndex().
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > Regarding using
> >> NestedFieldReferenceExpression
> >> > > in
> >> > > > > > > > > > >> > > > > > SupportsProjectionPushDown, do you
> >> > > > > > > > > > >> > > > > > have any concerns @Timo Walther <
> >> > > > twal...@apache.org>
> >> > > > > ?
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > Best,
> >> > > > > > > > > > >> > > > > > Jark
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > On Tue, 22 Aug 2023 at 05:55,
> >> Venkatakrishnan
> >> > > > > Sowrirajan <
> >> > > > > > > > > > >> > > > > vsowr...@asu.edu
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > wrote:
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > > > > Sounds like a great suggestion, Becket.
> >> +1.
> >> > > > Agree
> >> > > > > with
> >> > > > > > > > > > >> cleaning
> >> > > > > > > > > > >> > up
> >> > > > > > > > > > >> > > > the
> >> > > > > > > > > > >> > > > > > APIs
> >> > > > > > > > > > >> > > > > > > and making it consistent in all the
> >> pushdown
> >> > > > APIs.
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > Your suggested approach seems fine to
> me,
> >> > > unless
> >> > > > > anyone
> >> > > > > > > > > else
> >> > > > > > > > > > >> has
> >> > > > > > > > > > >> > > any
> >> > > > > > > > > > >> > > > > > other
> >> > > > > > > > > > >> > > > > > > concerns. Just have couple of
> clarifying
> >> > > > > questions:
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > 1. Do you think we should standardize
> the
> >> > APIs
> >> > > > > across
> >> > > > > > > > all
> >> > > > > > > > > > the
> >> > > > > > > > > > >> > > > pushdown
> >> > > > > > > > > > >> > > > > > > supports like
> SupportsPartitionPushdown,
> >> > > > > > > > > > >> SupportsDynamicFiltering
> >> > > > > > > > > > >> > > etc
> >> > > > > > > > > > >> > > > > in
> >> > > > > > > > > > >> > > > > > > the end state?
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > The current proposal works if we do not
> >> want
> >> > > to
> >> > > > > migrate
> >> > > > > > > > > > >> > > > > > > > SupportsFilterPushdown to also use
> >> > > > > > > > > > >> > NestedFieldReferenceExpression
> >> > > > > > > > > > >> > > > in
> >> > > > > > > > > > >> > > > > > the
> >> > > > > > > > > > >> > > > > > > > long term.
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > Did you mean *FieldReferenceExpression*
> >> > > instead
> >> > > > of
> >> > > > > > > > > > >> > > > > > > *NestedFieldReferenceExpression*?
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > 2. Extend the FieldReferenceExpression
> to
> >> > > > support
> >> > > > > nested
> >> > > > > > > > > > >> fields.
> >> > > > > > > > > > >> > > > > > > >     - Change the index field type
> from
> >> int
> >> > > to
> >> > > > > int[].
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > >     - Add a new method int[]
> >> > > > getFieldIndexArray().
> >> > > > > > > > > > >> > > > > > > >     - Deprecate the int
> getFieldIndex()
> >> > > > method,
> >> > > > > the
> >> > > > > > > > code
> >> > > > > > > > > > >> will
> >> > > > > > > > > > >> > be
> >> > > > > > > > > > >> > > > > > removed
> >> > > > > > > > > > >> > > > > > > in
> >> > > > > > > > > > >> > > > > > > > the next major version bump.
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > I assume getFieldIndex would return
> >> > > > > fieldIndexArray[0],
> >> > > > > > > > > > right?
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > Thanks
> >> > > > > > > > > > >> > > > > > > Venkat
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > On Fri, Aug 18, 2023 at 4:47 PM Becket
> >> Qin <
> >> > > > > > > > > > >> becket....@gmail.com
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > > > wrote:
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > > Thanks for the proposal, Venkata.
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > The current proposal works if we do
> not
> >> > want
> >> > > > to
> >> > > > > > > > migrate
> >> > > > > > > > > > >> > > > > > > > SupportsFilterPushdown to also use
> >> > > > > > > > > > >> > NestedFieldReferenceExpression
> >> > > > > > > > > > >> > > > in
> >> > > > > > > > > > >> > > > > > the
> >> > > > > > > > > > >> > > > > > > > long term.
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > Did you mean *FieldReferenceExpression*
> >> > > instead
> >> > > > of
> >> > > > > > > > > > >> > > > > > > *NestedFieldReferenceExpression*?
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > Otherwise, the alternative solution
> >> > briefly
> >> > > > > mentioned
> >> > > > > > > > in
> >> > > > > > > > > > the
> >> > > > > > > > > > >> > > > rejected
> >> > > > > > > > > > >> > > > > > > > alternatives would be the following:
> >> > > > > > > > > > >> > > > > > > > Phase 1:
> >> > > > > > > > > > >> > > > > > > > 1. Introduce a
> supportsNestedFilters()
> >> > > method
> >> > > > > to the
> >> > > > > > > > > > >> > > > > > > SupportsFilterPushdown
> >> > > > > > > > > > >> > > > > > > > interface. (same as current
> proposal).
> >> > > > > > > > > > >> > > > > > > > 2. Extend the
> FieldReferenceExpression
> >> to
> >> > > > > support
> >> > > > > > > > nested
> >> > > > > > > > > > >> > fields.
> >> > > > > > > > > > >> > > > > > > >     - Change the index field type
> from
> >> int
> >> > > to
> >> > > > > int[].
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > >     - Add a new method int[]
> >> > > > getFieldIndexArray().
> >> > > > > > > > > > >> > > > > > > >     - Deprecate the int
> getFieldIndex()
> >> > > > method,
> >> > > > > the
> >> > > > > > > > code
> >> > > > > > > > > > >> will
> >> > > > > > > > > > >> > be
> >> > > > > > > > > > >> > > > > > removed
> >> > > > > > > > > > >> > > > > > > in
> >> > > > > > > > > > >> > > > > > > > the next major version bump.
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > > > 3. In the SupportsProjectionPushDown
> >> > interface
> >> > > > > > > > > > >> > > > > > > >     - add a new method
> >> > > > > > > > > > >> > > > >
> >> applyProjection(List<FieldReferenceExpression>,
> >> > > > > > > > > > >> > > > > > > > DataType), with default
> implementation
> >> > > > invoking
> >> > > > > > > > > > >> > > > > > applyProjection(int[][],
> >> > > > > > > > > > >> > > > > > > > DataType)
> >> > > > > > > > > > >> > > > > > > >     - deprecate the current
> >> > > > > applyProjection(int[][],
> >> > > > > > > > > > >> DataType)
> >> > > > > > > > > > >> > > > method
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > Phase 2 (in the next major version
> >> bump)
> >> > > > > > > > > > >> > > > > > > > 1. remove the deprecated methods.
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > Phase 3 (optional)
> >> > > > > > > > > > >> > > > > > > > 1. deprecate and remove the
> >> > > > > supportsNestedFilters() /
> >> > > > > > > > > > >> > > > > > > > supportsNestedProjection() methods
> from
> >> > the
> >> > > > > > > > > > >> > > SupportsFilterPushDown
> >> > > > > > > > > > >> > > > /
> >> > > > > > > > > > >> > > > > > > > SupportsProjectionPushDown
> interfaces.
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > Personally I prefer this alternative.
> >> It
> >> > > takes
> >> > > > > longer
> >> > > > > > > > to
> >> > > > > > > > > > >> finish
> >> > > > > > > > > > >> > > the
> >> > > > > > > > > > >> > > > > > work,
> >> > > > > > > > > > >> > > > > > > > but the API eventually becomes clean
> >> and
> >> > > > > consistent.
> >> > > > > > > > > But I
> >> > > > > > > > > > >> can
> >> > > > > > > > > > >> > > live
> >> > > > > > > > > > >> > > > > > with
> >> > > > > > > > > > >> > > > > > > > the current proposal.
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > Thanks,
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > On Sat, Aug 19, 2023 at 12:09 AM
> >> > > > Venkatakrishnan
> >> > > > > > > > > > Sowrirajan
> >> > > > > > > > > > >> <
> >> > > > > > > > > > >> > > > > > > > vsowr...@asu.edu> wrote:
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > Gentle ping for reviews/feedback.
> >> > > > > > > > > > >> > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > On Tue, Aug 15, 2023, 5:37 PM
> >> > > > Venkatakrishnan
> >> > > > > > > > > > Sowrirajan <
> >> > > > > > > > > > >> > > > > > > > vsowr...@asu.edu
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > wrote:
> >> > > > > > > > > > >> > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > > Hi All,
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > > I am opening this thread to
> discuss
> >> > > > > FLIP-356:
> >> > > > > > > > > Support
> >> > > > > > > > > > >> > Nested
> >> > > > > > > > > > >> > > > > Fields
> >> > > > > > > > > > >> > > > > > > > > > Filter Pushdown. The FLIP can be
> >> found
> >> > > at
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > >
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >>
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!clxXJwshKpn559SAkQiieqgGe0ZduXCzUKCmYLtFIbQLmrmEEgdmuEIM8ZM1M3O_uGqOploU4ailqGpukAg$
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > > This FLIP adds support for
> pushing
> >> > down
> >> > > > > nested
> >> > > > > > > > > fields
> >> > > > > > > > > > >> > filters
> >> > > > > > > > > > >> > > > to
> >> > > > > > > > > > >> > > > > > the
> >> > > > > > > > > > >> > > > > > > > > > underlying TableSource. In our
> data
> >> > > lake,
> >> > > > > we find
> >> > > > > > > > a
> >> > > > > > > > > > lot
> >> > > > > > > > > > >> of
> >> > > > > > > > > > >> > > > > datasets
> >> > > > > > > > > > >> > > > > > > > have
> >> > > > > > > > > > >> > > > > > > > > > nested fields and also user
> queries
> >> > with
> >> > > > > filters
> >> > > > > > > > > > >> defined on
> >> > > > > > > > > > >> > > the
> >> > > > > > > > > > >> > > > > > > nested
> >> > > > > > > > > > >> > > > > > > > > > fields. This would drastically
> >> improve
> >> > > the
> >> > > > > > > > > performance
> >> > > > > > > > > > >> for
> >> > > > > > > > > > >> > > > those
> >> > > > > > > > > > >> > > > > > sets
> >> > > > > > > > > > >> > > > > > > > of
> >> > > > > > > > > > >> > > > > > > > > > queries.
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > > Appreciate any comments or
> feedback
> >> > you
> >> > > > may
> >> > > > > have
> >> > > > > > > > on
> >> > > > > > > > > > this
> >> > > > > > > > > > >> > > > > proposal.
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > > > Regards
> >> > > > > > > > > > >> > > > > > > > > > Venkata krishnan
> >> > > > > > > > > > >> > > > > > > > > >
> >> > > > > > > > > > >> > > > > > > > >
> >> > > > > > > > > > >> > > > > > > >
> >> > > > > > > > > > >> > > > > > >
> >> > > > > > > > > > >> > > > > >
> >> > > > > > > > > > >> > > > >
> >> > > > > > > > > > >> > > >
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> >
> >> > > > > > > > > > >>
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to