Re: FLINK-20767 - Support for nested fields filter push down

2023-08-18 Thread Venkatakrishnan Sowrirajan
Gentle ping

On Wed, Aug 16, 2023, 11:56 AM Venkatakrishnan Sowrirajan 
wrote:

> Forgot to share the link -
> https://lists.apache.org/thread/686bhgwrrb4xmbfzlk60szwxos4z64t7 in the
> last email.
>
> Regards
> Venkata krishnan
>
>
> On Wed, Aug 16, 2023 at 11:55 AM Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu> wrote:
>
>> Btw, this is the FLIP proposal discussion thread. Please share your
>> thoughts. Thanks.
>>
>> Regards
>> Venkata krishnan
>>
>>
>> On Sun, Aug 13, 2023 at 6:35 AM liu ron  wrote:
>>
>>> Hi, Venkata krishnan
>>>
>>> Thanks for driving this work, look forward to your FLIP.
>>>
>>> Best,
>>> Ron
>>>
>>> Venkatakrishnan Sowrirajan  于2023年8月13日周日 14:34写道:
>>>
>>> > Thanks Yunhong. That's correct. I am able to make it work locally.
>>> > Currently, in the process of writing a FLIP for the necessary changes
>>> to
>>> > the SupportsFilterPushDown API to support nested fields filter push
>>> down.
>>> >
>>> > Regards
>>> > Venkata krishnan
>>> >
>>> >
>>> > On Mon, Aug 7, 2023 at 8:28 PM yh z  wrote:
>>> >
>>> > > Hi Venkatakrishnan,
>>> > > Sorry for the late reply. I have looked at the code and feel like you
>>> > need
>>> > > to modify the logic of the
>>> > > ExpressionConverter.visit(FieldReferenceExpression expression)
>>> method to
>>> > > support nested types,
>>> > > which are not currently supported in currently code.
>>> > >
>>> > > Regards,
>>> > > Yunhong Zheng (Swuferhong)
>>> > >
>>> > > Venkatakrishnan Sowrirajan  于2023年8月7日周一 13:30写道:
>>> > >
>>> > > > (Sorry, I pressed send too early)
>>> > > >
>>> > > > Thanks for the help @zhengyunhon...@gmail.com.
>>> > > >
>>> > > > Agree on not changing the API as much as possible as well as wrt
>>> > > > simplifying Projection pushdown with nested fields eventually as
>>> well.
>>> > > >
>>> > > > In terms of the code itself, currently I am trying to leverage the
>>> > > > FieldReferenceExpression to also handle nested fields for filter
>>> push
>>> > > down.
>>> > > > But where I am currently struggling to make progress is, once the
>>> > filters
>>> > > > are pushed to the table source itself, in
>>> > > >
>>> >
>>> PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
>>> > > > there is a conversion from List>> > > > FieldReferenceExpression) to the List itself.
>>> > > >
>>> > > > If you have some pointers for that, please let me know. Thanks.
>>> > > >
>>> > > > Regards
>>> > > > Venkata krishnan
>>> > > >
>>> > > >
>>> > > > On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
>>> > > > vsowr...@asu.edu>
>>> > > > wrote:
>>> > > >
>>> > > > > Thanks @zhengyunhon...@gmail.com
>>> > > > > Regards
>>> > > > > Venkata krishnan
>>> > > > >
>>> > > > >
>>> > > > > On Sun, Aug 6, 2023 at 6:16 PM yh z 
>>> > wrote:
>>> > > > >
>>> > > > >> Hi, Venkatakrishnan,
>>> > > > >> I think this is a very useful feature. I have been focusing on
>>> the
>>> > > > >> development of the flink-table-planner module recently, so if
>>> you
>>> > need
>>> > > > >> some
>>> > > > >> help, I can assist you in completing the development of some
>>> > sub-tasks
>>> > > > or
>>> > > > >> code review.
>>> > > > >>
>>> > > > >> Returning to the design itself, I think it's necessary to modify
>>> > > > >> FieldReferenceExpression or re-implement a
>>> > > > NestedFieldReferenceExpression.
>>> > > > >> As for modifying the interface of SupportsProjectionPushDown, I
>>> > think
>>> > > we
>>> > > > >> need to make some trade-offs. As a connector developer, the
>>> > stability
>>> > > of
>>> > > > >> the interface is very important. If there are no unresolved
>>> bugs, I
>>> > > > >> personally do not recommend modifying the interface. However,
>>> when I
>>> > > > first
>>> > > > >> read the code of SupportsProjectionPushDown, the design of
>>> int[][]
>>> > was
>>> > > > >> very
>>> > > > >> confusing for me, and it took me a long time to understand it by
>>> > > running
>>> > > > >> specify UT tests. Therefore, in terms of the design of this
>>> > interface
>>> > > > and
>>> > > > >> the consistency between different interfaces, there is indeed
>>> room
>>> > for
>>> > > > >> improvement it.
>>> > > > >>
>>> > > > >> Thanks,
>>> > > > >> Yunhong Zheng (Swuferhong)
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> Becket Qin  于2023年8月3日周四 07:44写道:
>>> > > > >>
>>> > > > >> > Hi Jark,
>>> > > > >> >
>>> > > > >> > If the FieldReferenceExpression contains an int[] to support a
>>> > > nested
>>> > > > >> field
>>> > > > >> > reference, List (or
>>> > > > >> FieldReferenceExpression[])
>>> > > > >> > and int[][] are actually equivalent. If we are designing this
>>> from
>>> > > > >> scratch,
>>> > > > >> > personally I prefer using List for
>>> > > > >> consistency,
>>> > > > >> > i.e. always resolving everything to expressions for users.
>>> > > Projection
>>> > > > >> is a
>>> > > > >> > simpler case, but should not be a special case. This avoids
>>> doing
>>> > > the
>>> > > > >> same
>>> > > > >> > thing in different 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-16 Thread Venkatakrishnan Sowrirajan
Forgot to share the link -
https://lists.apache.org/thread/686bhgwrrb4xmbfzlk60szwxos4z64t7 in the
last email.

Regards
Venkata krishnan


On Wed, Aug 16, 2023 at 11:55 AM Venkatakrishnan Sowrirajan <
vsowr...@asu.edu> wrote:

> Btw, this is the FLIP proposal discussion thread. Please share your
> thoughts. Thanks.
>
> Regards
> Venkata krishnan
>
>
> On Sun, Aug 13, 2023 at 6:35 AM liu ron  wrote:
>
>> Hi, Venkata krishnan
>>
>> Thanks for driving this work, look forward to your FLIP.
>>
>> Best,
>> Ron
>>
>> Venkatakrishnan Sowrirajan  于2023年8月13日周日 14:34写道:
>>
>> > Thanks Yunhong. That's correct. I am able to make it work locally.
>> > Currently, in the process of writing a FLIP for the necessary changes to
>> > the SupportsFilterPushDown API to support nested fields filter push
>> down.
>> >
>> > Regards
>> > Venkata krishnan
>> >
>> >
>> > On Mon, Aug 7, 2023 at 8:28 PM yh z  wrote:
>> >
>> > > Hi Venkatakrishnan,
>> > > Sorry for the late reply. I have looked at the code and feel like you
>> > need
>> > > to modify the logic of the
>> > > ExpressionConverter.visit(FieldReferenceExpression expression) method
>> to
>> > > support nested types,
>> > > which are not currently supported in currently code.
>> > >
>> > > Regards,
>> > > Yunhong Zheng (Swuferhong)
>> > >
>> > > Venkatakrishnan Sowrirajan  于2023年8月7日周一 13:30写道:
>> > >
>> > > > (Sorry, I pressed send too early)
>> > > >
>> > > > Thanks for the help @zhengyunhon...@gmail.com.
>> > > >
>> > > > Agree on not changing the API as much as possible as well as wrt
>> > > > simplifying Projection pushdown with nested fields eventually as
>> well.
>> > > >
>> > > > In terms of the code itself, currently I am trying to leverage the
>> > > > FieldReferenceExpression to also handle nested fields for filter
>> push
>> > > down.
>> > > > But where I am currently struggling to make progress is, once the
>> > filters
>> > > > are pushed to the table source itself, in
>> > > >
>> > PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
>> > > > there is a conversion from List> > > > FieldReferenceExpression) to the List itself.
>> > > >
>> > > > If you have some pointers for that, please let me know. Thanks.
>> > > >
>> > > > Regards
>> > > > Venkata krishnan
>> > > >
>> > > >
>> > > > On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
>> > > > vsowr...@asu.edu>
>> > > > wrote:
>> > > >
>> > > > > Thanks @zhengyunhon...@gmail.com
>> > > > > Regards
>> > > > > Venkata krishnan
>> > > > >
>> > > > >
>> > > > > On Sun, Aug 6, 2023 at 6:16 PM yh z 
>> > wrote:
>> > > > >
>> > > > >> Hi, Venkatakrishnan,
>> > > > >> I think this is a very useful feature. I have been focusing on
>> the
>> > > > >> development of the flink-table-planner module recently, so if you
>> > need
>> > > > >> some
>> > > > >> help, I can assist you in completing the development of some
>> > sub-tasks
>> > > > or
>> > > > >> code review.
>> > > > >>
>> > > > >> Returning to the design itself, I think it's necessary to modify
>> > > > >> FieldReferenceExpression or re-implement a
>> > > > NestedFieldReferenceExpression.
>> > > > >> As for modifying the interface of SupportsProjectionPushDown, I
>> > think
>> > > we
>> > > > >> need to make some trade-offs. As a connector developer, the
>> > stability
>> > > of
>> > > > >> the interface is very important. If there are no unresolved
>> bugs, I
>> > > > >> personally do not recommend modifying the interface. However,
>> when I
>> > > > first
>> > > > >> read the code of SupportsProjectionPushDown, the design of
>> int[][]
>> > was
>> > > > >> very
>> > > > >> confusing for me, and it took me a long time to understand it by
>> > > running
>> > > > >> specify UT tests. Therefore, in terms of the design of this
>> > interface
>> > > > and
>> > > > >> the consistency between different interfaces, there is indeed
>> room
>> > for
>> > > > >> improvement it.
>> > > > >>
>> > > > >> Thanks,
>> > > > >> Yunhong Zheng (Swuferhong)
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> Becket Qin  于2023年8月3日周四 07:44写道:
>> > > > >>
>> > > > >> > Hi Jark,
>> > > > >> >
>> > > > >> > If the FieldReferenceExpression contains an int[] to support a
>> > > nested
>> > > > >> field
>> > > > >> > reference, List (or
>> > > > >> FieldReferenceExpression[])
>> > > > >> > and int[][] are actually equivalent. If we are designing this
>> from
>> > > > >> scratch,
>> > > > >> > personally I prefer using List for
>> > > > >> consistency,
>> > > > >> > i.e. always resolving everything to expressions for users.
>> > > Projection
>> > > > >> is a
>> > > > >> > simpler case, but should not be a special case. This avoids
>> doing
>> > > the
>> > > > >> same
>> > > > >> > thing in different ways which is also a confusion to the
>> users. To
>> > > me,
>> > > > >> the
>> > > > >> > int[][] format would become kind of a technical debt after we
>> > extend
>> > > > the
>> > > > >> > FieldReferenceExpression. Although we don't have to address it
>> 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-16 Thread Venkatakrishnan Sowrirajan
Btw, this is the FLIP proposal discussion thread. Please share your
thoughts. Thanks.

Regards
Venkata krishnan


On Sun, Aug 13, 2023 at 6:35 AM liu ron  wrote:

> Hi, Venkata krishnan
>
> Thanks for driving this work, look forward to your FLIP.
>
> Best,
> Ron
>
> Venkatakrishnan Sowrirajan  于2023年8月13日周日 14:34写道:
>
> > Thanks Yunhong. That's correct. I am able to make it work locally.
> > Currently, in the process of writing a FLIP for the necessary changes to
> > the SupportsFilterPushDown API to support nested fields filter push down.
> >
> > Regards
> > Venkata krishnan
> >
> >
> > On Mon, Aug 7, 2023 at 8:28 PM yh z  wrote:
> >
> > > Hi Venkatakrishnan,
> > > Sorry for the late reply. I have looked at the code and feel like you
> > need
> > > to modify the logic of the
> > > ExpressionConverter.visit(FieldReferenceExpression expression) method
> to
> > > support nested types,
> > > which are not currently supported in currently code.
> > >
> > > Regards,
> > > Yunhong Zheng (Swuferhong)
> > >
> > > Venkatakrishnan Sowrirajan  于2023年8月7日周一 13:30写道:
> > >
> > > > (Sorry, I pressed send too early)
> > > >
> > > > Thanks for the help @zhengyunhon...@gmail.com.
> > > >
> > > > Agree on not changing the API as much as possible as well as wrt
> > > > simplifying Projection pushdown with nested fields eventually as
> well.
> > > >
> > > > In terms of the code itself, currently I am trying to leverage the
> > > > FieldReferenceExpression to also handle nested fields for filter push
> > > down.
> > > > But where I am currently struggling to make progress is, once the
> > filters
> > > > are pushed to the table source itself, in
> > > >
> > PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
> > > > there is a conversion from List > > > FieldReferenceExpression) to the List itself.
> > > >
> > > > If you have some pointers for that, please let me know. Thanks.
> > > >
> > > > Regards
> > > > Venkata krishnan
> > > >
> > > >
> > > > On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
> > > > vsowr...@asu.edu>
> > > > wrote:
> > > >
> > > > > Thanks @zhengyunhon...@gmail.com
> > > > > Regards
> > > > > Venkata krishnan
> > > > >
> > > > >
> > > > > On Sun, Aug 6, 2023 at 6:16 PM yh z 
> > wrote:
> > > > >
> > > > >> Hi, Venkatakrishnan,
> > > > >> I think this is a very useful feature. I have been focusing on the
> > > > >> development of the flink-table-planner module recently, so if you
> > need
> > > > >> some
> > > > >> help, I can assist you in completing the development of some
> > sub-tasks
> > > > or
> > > > >> code review.
> > > > >>
> > > > >> Returning to the design itself, I think it's necessary to modify
> > > > >> FieldReferenceExpression or re-implement a
> > > > NestedFieldReferenceExpression.
> > > > >> As for modifying the interface of SupportsProjectionPushDown, I
> > think
> > > we
> > > > >> need to make some trade-offs. As a connector developer, the
> > stability
> > > of
> > > > >> the interface is very important. If there are no unresolved bugs,
> I
> > > > >> personally do not recommend modifying the interface. However,
> when I
> > > > first
> > > > >> read the code of SupportsProjectionPushDown, the design of int[][]
> > was
> > > > >> very
> > > > >> confusing for me, and it took me a long time to understand it by
> > > running
> > > > >> specify UT tests. Therefore, in terms of the design of this
> > interface
> > > > and
> > > > >> the consistency between different interfaces, there is indeed room
> > for
> > > > >> improvement it.
> > > > >>
> > > > >> Thanks,
> > > > >> Yunhong Zheng (Swuferhong)
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> Becket Qin  于2023年8月3日周四 07:44写道:
> > > > >>
> > > > >> > Hi Jark,
> > > > >> >
> > > > >> > If the FieldReferenceExpression contains an int[] to support a
> > > nested
> > > > >> field
> > > > >> > reference, List (or
> > > > >> FieldReferenceExpression[])
> > > > >> > and int[][] are actually equivalent. If we are designing this
> from
> > > > >> scratch,
> > > > >> > personally I prefer using List for
> > > > >> consistency,
> > > > >> > i.e. always resolving everything to expressions for users.
> > > Projection
> > > > >> is a
> > > > >> > simpler case, but should not be a special case. This avoids
> doing
> > > the
> > > > >> same
> > > > >> > thing in different ways which is also a confusion to the users.
> To
> > > me,
> > > > >> the
> > > > >> > int[][] format would become kind of a technical debt after we
> > extend
> > > > the
> > > > >> > FieldReferenceExpression. Although we don't have to address it
> > right
> > > > >> away
> > > > >> > in the same FLIP, this kind of debt accumulates over time and
> > makes
> > > > the
> > > > >> > project harder to learn and maintain. So, personally I prefer to
> > > > address
> > > > >> > these technical debts as soon as possible.
> > > > >> >
> > > > >> > Thanks,
> > > > >> >
> > > > >> > Jiangjie (Becket) Qin
> > > > >> >
> > > > >> > On Wed, Aug 2, 2023 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-13 Thread liu ron
Hi, Venkata krishnan

Thanks for driving this work, look forward to your FLIP.

Best,
Ron

Venkatakrishnan Sowrirajan  于2023年8月13日周日 14:34写道:

> Thanks Yunhong. That's correct. I am able to make it work locally.
> Currently, in the process of writing a FLIP for the necessary changes to
> the SupportsFilterPushDown API to support nested fields filter push down.
>
> Regards
> Venkata krishnan
>
>
> On Mon, Aug 7, 2023 at 8:28 PM yh z  wrote:
>
> > Hi Venkatakrishnan,
> > Sorry for the late reply. I have looked at the code and feel like you
> need
> > to modify the logic of the
> > ExpressionConverter.visit(FieldReferenceExpression expression) method to
> > support nested types,
> > which are not currently supported in currently code.
> >
> > Regards,
> > Yunhong Zheng (Swuferhong)
> >
> > Venkatakrishnan Sowrirajan  于2023年8月7日周一 13:30写道:
> >
> > > (Sorry, I pressed send too early)
> > >
> > > Thanks for the help @zhengyunhon...@gmail.com.
> > >
> > > Agree on not changing the API as much as possible as well as wrt
> > > simplifying Projection pushdown with nested fields eventually as well.
> > >
> > > In terms of the code itself, currently I am trying to leverage the
> > > FieldReferenceExpression to also handle nested fields for filter push
> > down.
> > > But where I am currently struggling to make progress is, once the
> filters
> > > are pushed to the table source itself, in
> > >
> PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
> > > there is a conversion from List > > FieldReferenceExpression) to the List itself.
> > >
> > > If you have some pointers for that, please let me know. Thanks.
> > >
> > > Regards
> > > Venkata krishnan
> > >
> > >
> > > On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
> > > vsowr...@asu.edu>
> > > wrote:
> > >
> > > > Thanks @zhengyunhon...@gmail.com
> > > > Regards
> > > > Venkata krishnan
> > > >
> > > >
> > > > On Sun, Aug 6, 2023 at 6:16 PM yh z 
> wrote:
> > > >
> > > >> Hi, Venkatakrishnan,
> > > >> I think this is a very useful feature. I have been focusing on the
> > > >> development of the flink-table-planner module recently, so if you
> need
> > > >> some
> > > >> help, I can assist you in completing the development of some
> sub-tasks
> > > or
> > > >> code review.
> > > >>
> > > >> Returning to the design itself, I think it's necessary to modify
> > > >> FieldReferenceExpression or re-implement a
> > > NestedFieldReferenceExpression.
> > > >> As for modifying the interface of SupportsProjectionPushDown, I
> think
> > we
> > > >> need to make some trade-offs. As a connector developer, the
> stability
> > of
> > > >> the interface is very important. If there are no unresolved bugs, I
> > > >> personally do not recommend modifying the interface. However, when I
> > > first
> > > >> read the code of SupportsProjectionPushDown, the design of int[][]
> was
> > > >> very
> > > >> confusing for me, and it took me a long time to understand it by
> > running
> > > >> specify UT tests. Therefore, in terms of the design of this
> interface
> > > and
> > > >> the consistency between different interfaces, there is indeed room
> for
> > > >> improvement it.
> > > >>
> > > >> Thanks,
> > > >> Yunhong Zheng (Swuferhong)
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> Becket Qin  于2023年8月3日周四 07:44写道:
> > > >>
> > > >> > Hi Jark,
> > > >> >
> > > >> > If the FieldReferenceExpression contains an int[] to support a
> > nested
> > > >> field
> > > >> > reference, List (or
> > > >> FieldReferenceExpression[])
> > > >> > and int[][] are actually equivalent. If we are designing this from
> > > >> scratch,
> > > >> > personally I prefer using List for
> > > >> consistency,
> > > >> > i.e. always resolving everything to expressions for users.
> > Projection
> > > >> is a
> > > >> > simpler case, but should not be a special case. This avoids doing
> > the
> > > >> same
> > > >> > thing in different ways which is also a confusion to the users. To
> > me,
> > > >> the
> > > >> > int[][] format would become kind of a technical debt after we
> extend
> > > the
> > > >> > FieldReferenceExpression. Although we don't have to address it
> right
> > > >> away
> > > >> > in the same FLIP, this kind of debt accumulates over time and
> makes
> > > the
> > > >> > project harder to learn and maintain. So, personally I prefer to
> > > address
> > > >> > these technical debts as soon as possible.
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Jiangjie (Becket) Qin
> > > >> >
> > > >> > On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > > I agree with Becket that we may need to extend
> > > >> FieldReferenceExpression
> > > >> > to
> > > >> > > support nested field access (or maybe a new
> > > >> > > NestedFieldReferenceExpression).
> > > >> > > But I have some concerns about evolving the
> > > >> > > SupportsProjectionPushDown.applyProjection.
> > > >> > > A projection is much simpler than Filter Expression which only

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-13 Thread Venkatakrishnan Sowrirajan
Thanks Yunhong. That's correct. I am able to make it work locally.
Currently, in the process of writing a FLIP for the necessary changes to
the SupportsFilterPushDown API to support nested fields filter push down.

Regards
Venkata krishnan


On Mon, Aug 7, 2023 at 8:28 PM yh z  wrote:

> Hi Venkatakrishnan,
> Sorry for the late reply. I have looked at the code and feel like you need
> to modify the logic of the
> ExpressionConverter.visit(FieldReferenceExpression expression) method to
> support nested types,
> which are not currently supported in currently code.
>
> Regards,
> Yunhong Zheng (Swuferhong)
>
> Venkatakrishnan Sowrirajan  于2023年8月7日周一 13:30写道:
>
> > (Sorry, I pressed send too early)
> >
> > Thanks for the help @zhengyunhon...@gmail.com.
> >
> > Agree on not changing the API as much as possible as well as wrt
> > simplifying Projection pushdown with nested fields eventually as well.
> >
> > In terms of the code itself, currently I am trying to leverage the
> > FieldReferenceExpression to also handle nested fields for filter push
> down.
> > But where I am currently struggling to make progress is, once the filters
> > are pushed to the table source itself, in
> > PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
> > there is a conversion from List > FieldReferenceExpression) to the List itself.
> >
> > If you have some pointers for that, please let me know. Thanks.
> >
> > Regards
> > Venkata krishnan
> >
> >
> > On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu>
> > wrote:
> >
> > > Thanks @zhengyunhon...@gmail.com
> > > Regards
> > > Venkata krishnan
> > >
> > >
> > > On Sun, Aug 6, 2023 at 6:16 PM yh z  wrote:
> > >
> > >> Hi, Venkatakrishnan,
> > >> I think this is a very useful feature. I have been focusing on the
> > >> development of the flink-table-planner module recently, so if you need
> > >> some
> > >> help, I can assist you in completing the development of some sub-tasks
> > or
> > >> code review.
> > >>
> > >> Returning to the design itself, I think it's necessary to modify
> > >> FieldReferenceExpression or re-implement a
> > NestedFieldReferenceExpression.
> > >> As for modifying the interface of SupportsProjectionPushDown, I think
> we
> > >> need to make some trade-offs. As a connector developer, the stability
> of
> > >> the interface is very important. If there are no unresolved bugs, I
> > >> personally do not recommend modifying the interface. However, when I
> > first
> > >> read the code of SupportsProjectionPushDown, the design of int[][] was
> > >> very
> > >> confusing for me, and it took me a long time to understand it by
> running
> > >> specify UT tests. Therefore, in terms of the design of this interface
> > and
> > >> the consistency between different interfaces, there is indeed room for
> > >> improvement it.
> > >>
> > >> Thanks,
> > >> Yunhong Zheng (Swuferhong)
> > >>
> > >>
> > >>
> > >>
> > >> Becket Qin  于2023年8月3日周四 07:44写道:
> > >>
> > >> > Hi Jark,
> > >> >
> > >> > If the FieldReferenceExpression contains an int[] to support a
> nested
> > >> field
> > >> > reference, List (or
> > >> FieldReferenceExpression[])
> > >> > and int[][] are actually equivalent. If we are designing this from
> > >> scratch,
> > >> > personally I prefer using List for
> > >> consistency,
> > >> > i.e. always resolving everything to expressions for users.
> Projection
> > >> is a
> > >> > simpler case, but should not be a special case. This avoids doing
> the
> > >> same
> > >> > thing in different ways which is also a confusion to the users. To
> me,
> > >> the
> > >> > int[][] format would become kind of a technical debt after we extend
> > the
> > >> > FieldReferenceExpression. Although we don't have to address it right
> > >> away
> > >> > in the same FLIP, this kind of debt accumulates over time and makes
> > the
> > >> > project harder to learn and maintain. So, personally I prefer to
> > address
> > >> > these technical debts as soon as possible.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jiangjie (Becket) Qin
> > >> >
> > >> > On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > I agree with Becket that we may need to extend
> > >> FieldReferenceExpression
> > >> > to
> > >> > > support nested field access (or maybe a new
> > >> > > NestedFieldReferenceExpression).
> > >> > > But I have some concerns about evolving the
> > >> > > SupportsProjectionPushDown.applyProjection.
> > >> > > A projection is much simpler than Filter Expression which only
> needs
> > >> to
> > >> > > represent the field indexes.
> > >> > > If we evolve `applyProjection` to accept
> > >> `List
> > >> > > projectedFields`,
> > >> > > users have to convert the `List` back to
> > >> > int[][]
> > >> > > which is an overhead for users.
> > >> > > Field indexes (int[][]) is required to project schemas with the
> > >> > > utility org.apache.flink.table.connector.Projection.
> > >> > >
> > >> > >
> > >> > > 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-07 Thread yh z
Hi Venkatakrishnan,
Sorry for the late reply. I have looked at the code and feel like you need
to modify the logic of the
ExpressionConverter.visit(FieldReferenceExpression expression) method to
support nested types,
which are not currently supported in currently code.

Regards,
Yunhong Zheng (Swuferhong)

Venkatakrishnan Sowrirajan  于2023年8月7日周一 13:30写道:

> (Sorry, I pressed send too early)
>
> Thanks for the help @zhengyunhon...@gmail.com.
>
> Agree on not changing the API as much as possible as well as wrt
> simplifying Projection pushdown with nested fields eventually as well.
>
> In terms of the code itself, currently I am trying to leverage the
> FieldReferenceExpression to also handle nested fields for filter push down.
> But where I am currently struggling to make progress is, once the filters
> are pushed to the table source itself, in
> PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
> there is a conversion from List FieldReferenceExpression) to the List itself.
>
> If you have some pointers for that, please let me know. Thanks.
>
> Regards
> Venkata krishnan
>
>
> On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu>
> wrote:
>
> > Thanks @zhengyunhon...@gmail.com
> > Regards
> > Venkata krishnan
> >
> >
> > On Sun, Aug 6, 2023 at 6:16 PM yh z  wrote:
> >
> >> Hi, Venkatakrishnan,
> >> I think this is a very useful feature. I have been focusing on the
> >> development of the flink-table-planner module recently, so if you need
> >> some
> >> help, I can assist you in completing the development of some sub-tasks
> or
> >> code review.
> >>
> >> Returning to the design itself, I think it's necessary to modify
> >> FieldReferenceExpression or re-implement a
> NestedFieldReferenceExpression.
> >> As for modifying the interface of SupportsProjectionPushDown, I think we
> >> need to make some trade-offs. As a connector developer, the stability of
> >> the interface is very important. If there are no unresolved bugs, I
> >> personally do not recommend modifying the interface. However, when I
> first
> >> read the code of SupportsProjectionPushDown, the design of int[][] was
> >> very
> >> confusing for me, and it took me a long time to understand it by running
> >> specify UT tests. Therefore, in terms of the design of this interface
> and
> >> the consistency between different interfaces, there is indeed room for
> >> improvement it.
> >>
> >> Thanks,
> >> Yunhong Zheng (Swuferhong)
> >>
> >>
> >>
> >>
> >> Becket Qin  于2023年8月3日周四 07:44写道:
> >>
> >> > Hi Jark,
> >> >
> >> > If the FieldReferenceExpression contains an int[] to support a nested
> >> field
> >> > reference, List (or
> >> FieldReferenceExpression[])
> >> > and int[][] are actually equivalent. If we are designing this from
> >> scratch,
> >> > personally I prefer using List for
> >> consistency,
> >> > i.e. always resolving everything to expressions for users. Projection
> >> is a
> >> > simpler case, but should not be a special case. This avoids doing the
> >> same
> >> > thing in different ways which is also a confusion to the users. To me,
> >> the
> >> > int[][] format would become kind of a technical debt after we extend
> the
> >> > FieldReferenceExpression. Although we don't have to address it right
> >> away
> >> > in the same FLIP, this kind of debt accumulates over time and makes
> the
> >> > project harder to learn and maintain. So, personally I prefer to
> address
> >> > these technical debts as soon as possible.
> >> >
> >> > Thanks,
> >> >
> >> > Jiangjie (Becket) Qin
> >> >
> >> > On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > I agree with Becket that we may need to extend
> >> FieldReferenceExpression
> >> > to
> >> > > support nested field access (or maybe a new
> >> > > NestedFieldReferenceExpression).
> >> > > But I have some concerns about evolving the
> >> > > SupportsProjectionPushDown.applyProjection.
> >> > > A projection is much simpler than Filter Expression which only needs
> >> to
> >> > > represent the field indexes.
> >> > > If we evolve `applyProjection` to accept
> >> `List
> >> > > projectedFields`,
> >> > > users have to convert the `List` back to
> >> > int[][]
> >> > > which is an overhead for users.
> >> > > Field indexes (int[][]) is required to project schemas with the
> >> > > utility org.apache.flink.table.connector.Projection.
> >> > >
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > >
> >> > >
> >> > > On Wed, 2 Aug 2023 at 07:40, Venkatakrishnan Sowrirajan <
> >> > vsowr...@asu.edu>
> >> > > wrote:
> >> > >
> >> > > > Thanks Becket for the suggestion. That makes sense. Let me try it
> >> out
> >> > and
> >> > > > get back to you.
> >> > > >
> >> > > > Regards
> >> > > > Venkata krishnan
> >> > > >
> >> > > >
> >> > > > On Tue, Aug 1, 2023 at 9:04 AM Becket Qin 
> >> > wrote:
> >> > > >
> >> > > > > This is a very useful feature in practice.
> >> > > > >
> >> > > > > It looks to me that the key issue 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-06 Thread Venkatakrishnan Sowrirajan
(Sorry, I pressed send too early)

Thanks for the help @zhengyunhon...@gmail.com.

Agree on not changing the API as much as possible as well as wrt
simplifying Projection pushdown with nested fields eventually as well.

In terms of the code itself, currently I am trying to leverage the
FieldReferenceExpression to also handle nested fields for filter push down.
But where I am currently struggling to make progress is, once the filters
are pushed to the table source itself, in
PushFilterIntoSourceScanRuleBase#resolveFiltersAndCreateTableSourceTable
there is a conversion from List itself.

If you have some pointers for that, please let me know. Thanks.

Regards
Venkata krishnan


On Sun, Aug 6, 2023 at 10:23 PM Venkatakrishnan Sowrirajan 
wrote:

> Thanks @zhengyunhon...@gmail.com
> Regards
> Venkata krishnan
>
>
> On Sun, Aug 6, 2023 at 6:16 PM yh z  wrote:
>
>> Hi, Venkatakrishnan,
>> I think this is a very useful feature. I have been focusing on the
>> development of the flink-table-planner module recently, so if you need
>> some
>> help, I can assist you in completing the development of some sub-tasks or
>> code review.
>>
>> Returning to the design itself, I think it's necessary to modify
>> FieldReferenceExpression or re-implement a NestedFieldReferenceExpression.
>> As for modifying the interface of SupportsProjectionPushDown, I think we
>> need to make some trade-offs. As a connector developer, the stability of
>> the interface is very important. If there are no unresolved bugs, I
>> personally do not recommend modifying the interface. However, when I first
>> read the code of SupportsProjectionPushDown, the design of int[][] was
>> very
>> confusing for me, and it took me a long time to understand it by running
>> specify UT tests. Therefore, in terms of the design of this interface and
>> the consistency between different interfaces, there is indeed room for
>> improvement it.
>>
>> Thanks,
>> Yunhong Zheng (Swuferhong)
>>
>>
>>
>>
>> Becket Qin  于2023年8月3日周四 07:44写道:
>>
>> > Hi Jark,
>> >
>> > If the FieldReferenceExpression contains an int[] to support a nested
>> field
>> > reference, List (or
>> FieldReferenceExpression[])
>> > and int[][] are actually equivalent. If we are designing this from
>> scratch,
>> > personally I prefer using List for
>> consistency,
>> > i.e. always resolving everything to expressions for users. Projection
>> is a
>> > simpler case, but should not be a special case. This avoids doing the
>> same
>> > thing in different ways which is also a confusion to the users. To me,
>> the
>> > int[][] format would become kind of a technical debt after we extend the
>> > FieldReferenceExpression. Although we don't have to address it right
>> away
>> > in the same FLIP, this kind of debt accumulates over time and makes the
>> > project harder to learn and maintain. So, personally I prefer to address
>> > these technical debts as soon as possible.
>> >
>> > Thanks,
>> >
>> > Jiangjie (Becket) Qin
>> >
>> > On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:
>> >
>> > > Hi,
>> > >
>> > > I agree with Becket that we may need to extend
>> FieldReferenceExpression
>> > to
>> > > support nested field access (or maybe a new
>> > > NestedFieldReferenceExpression).
>> > > But I have some concerns about evolving the
>> > > SupportsProjectionPushDown.applyProjection.
>> > > A projection is much simpler than Filter Expression which only needs
>> to
>> > > represent the field indexes.
>> > > If we evolve `applyProjection` to accept
>> `List
>> > > projectedFields`,
>> > > users have to convert the `List` back to
>> > int[][]
>> > > which is an overhead for users.
>> > > Field indexes (int[][]) is required to project schemas with the
>> > > utility org.apache.flink.table.connector.Projection.
>> > >
>> > >
>> > > Best,
>> > > Jark
>> > >
>> > >
>> > >
>> > > On Wed, 2 Aug 2023 at 07:40, Venkatakrishnan Sowrirajan <
>> > vsowr...@asu.edu>
>> > > wrote:
>> > >
>> > > > Thanks Becket for the suggestion. That makes sense. Let me try it
>> out
>> > and
>> > > > get back to you.
>> > > >
>> > > > Regards
>> > > > Venkata krishnan
>> > > >
>> > > >
>> > > > On Tue, Aug 1, 2023 at 9:04 AM Becket Qin 
>> > wrote:
>> > > >
>> > > > > This is a very useful feature in practice.
>> > > > >
>> > > > > It looks to me that the key issue here is that Flink
>> > ResolvedExpression
>> > > > > does not have necessary abstraction for nested field access. So
>> the
>> > > > Calcite
>> > > > > RexFieldAccess does not have a counterpart in the
>> ResolvedExpression.
>> > > The
>> > > > > FieldReferenceExpression only supports direct access to the
>> fields,
>> > not
>> > > > > nested access.
>> > > > >
>> > > > > Theoretically speaking, this nested field reference is also
>> required
>> > by
>> > > > > projection pushdown. However, we addressed that by using an
>> int[][]
>> > in
>> > > > the
>> > > > > SupportsProjectionPushDown interface. Maybe we can do the
>> following:
>> > > > >
>> > > > > 1. Extend the 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-06 Thread Venkatakrishnan Sowrirajan
Thanks @zhengyunhon...@gmail.com
Regards
Venkata krishnan


On Sun, Aug 6, 2023 at 6:16 PM yh z  wrote:

> Hi, Venkatakrishnan,
> I think this is a very useful feature. I have been focusing on the
> development of the flink-table-planner module recently, so if you need some
> help, I can assist you in completing the development of some sub-tasks or
> code review.
>
> Returning to the design itself, I think it's necessary to modify
> FieldReferenceExpression or re-implement a NestedFieldReferenceExpression.
> As for modifying the interface of SupportsProjectionPushDown, I think we
> need to make some trade-offs. As a connector developer, the stability of
> the interface is very important. If there are no unresolved bugs, I
> personally do not recommend modifying the interface. However, when I first
> read the code of SupportsProjectionPushDown, the design of int[][] was very
> confusing for me, and it took me a long time to understand it by running
> specify UT tests. Therefore, in terms of the design of this interface and
> the consistency between different interfaces, there is indeed room for
> improvement it.
>
> Thanks,
> Yunhong Zheng (Swuferhong)
>
>
>
>
> Becket Qin  于2023年8月3日周四 07:44写道:
>
> > Hi Jark,
> >
> > If the FieldReferenceExpression contains an int[] to support a nested
> field
> > reference, List (or FieldReferenceExpression[])
> > and int[][] are actually equivalent. If we are designing this from
> scratch,
> > personally I prefer using List for consistency,
> > i.e. always resolving everything to expressions for users. Projection is
> a
> > simpler case, but should not be a special case. This avoids doing the
> same
> > thing in different ways which is also a confusion to the users. To me,
> the
> > int[][] format would become kind of a technical debt after we extend the
> > FieldReferenceExpression. Although we don't have to address it right away
> > in the same FLIP, this kind of debt accumulates over time and makes the
> > project harder to learn and maintain. So, personally I prefer to address
> > these technical debts as soon as possible.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:
> >
> > > Hi,
> > >
> > > I agree with Becket that we may need to extend FieldReferenceExpression
> > to
> > > support nested field access (or maybe a new
> > > NestedFieldReferenceExpression).
> > > But I have some concerns about evolving the
> > > SupportsProjectionPushDown.applyProjection.
> > > A projection is much simpler than Filter Expression which only needs to
> > > represent the field indexes.
> > > If we evolve `applyProjection` to accept
> `List
> > > projectedFields`,
> > > users have to convert the `List` back to
> > int[][]
> > > which is an overhead for users.
> > > Field indexes (int[][]) is required to project schemas with the
> > > utility org.apache.flink.table.connector.Projection.
> > >
> > >
> > > Best,
> > > Jark
> > >
> > >
> > >
> > > On Wed, 2 Aug 2023 at 07:40, Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu>
> > > wrote:
> > >
> > > > Thanks Becket for the suggestion. That makes sense. Let me try it out
> > and
> > > > get back to you.
> > > >
> > > > Regards
> > > > Venkata krishnan
> > > >
> > > >
> > > > On Tue, Aug 1, 2023 at 9:04 AM Becket Qin 
> > wrote:
> > > >
> > > > > This is a very useful feature in practice.
> > > > >
> > > > > It looks to me that the key issue here is that Flink
> > ResolvedExpression
> > > > > does not have necessary abstraction for nested field access. So the
> > > > Calcite
> > > > > RexFieldAccess does not have a counterpart in the
> ResolvedExpression.
> > > The
> > > > > FieldReferenceExpression only supports direct access to the fields,
> > not
> > > > > nested access.
> > > > >
> > > > > Theoretically speaking, this nested field reference is also
> required
> > by
> > > > > projection pushdown. However, we addressed that by using an int[][]
> > in
> > > > the
> > > > > SupportsProjectionPushDown interface. Maybe we can do the
> following:
> > > > >
> > > > > 1. Extend the FieldReferenceExpression to include an int[] for
> nested
> > > > field
> > > > > access,
> > > > > 2. By doing (1),
> > > > > SupportsFilterPushDown#applyFilters(List) can
> > > support
> > > > > nested field access.
> > > > > 3. Evolve the SupportsProjectionPushDown.applyProjection(int[][]
> > > > > projectedFields, DataType producedDataType) to
> > > > > applyProjection(List projectedFields,
> > > DataType
> > > > > producedDataType)
> > > > >
> > > > > This will need a FLIP.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Tue, Aug 1, 2023 at 11:42 PM Venkatakrishnan Sowrirajan <
> > > > > vsowr...@asu.edu>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the response. Looking forward to your pointers. In the
> > > > > > meanwhile, let me figure out how we can implement it. Will keep
> you
> > > > > posted.
> > > > > >
> > > > > > On Mon, Jul 31, 2023, 11:43 PM liu 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-06 Thread yh z
Hi, Venkatakrishnan,
I think this is a very useful feature. I have been focusing on the
development of the flink-table-planner module recently, so if you need some
help, I can assist you in completing the development of some sub-tasks or
code review.

Returning to the design itself, I think it's necessary to modify
FieldReferenceExpression or re-implement a NestedFieldReferenceExpression.
As for modifying the interface of SupportsProjectionPushDown, I think we
need to make some trade-offs. As a connector developer, the stability of
the interface is very important. If there are no unresolved bugs, I
personally do not recommend modifying the interface. However, when I first
read the code of SupportsProjectionPushDown, the design of int[][] was very
confusing for me, and it took me a long time to understand it by running
specify UT tests. Therefore, in terms of the design of this interface and
the consistency between different interfaces, there is indeed room for
improvement it.

Thanks,
Yunhong Zheng (Swuferhong)




Becket Qin  于2023年8月3日周四 07:44写道:

> Hi Jark,
>
> If the FieldReferenceExpression contains an int[] to support a nested field
> reference, List (or FieldReferenceExpression[])
> and int[][] are actually equivalent. If we are designing this from scratch,
> personally I prefer using List for consistency,
> i.e. always resolving everything to expressions for users. Projection is a
> simpler case, but should not be a special case. This avoids doing the same
> thing in different ways which is also a confusion to the users. To me, the
> int[][] format would become kind of a technical debt after we extend the
> FieldReferenceExpression. Although we don't have to address it right away
> in the same FLIP, this kind of debt accumulates over time and makes the
> project harder to learn and maintain. So, personally I prefer to address
> these technical debts as soon as possible.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:
>
> > Hi,
> >
> > I agree with Becket that we may need to extend FieldReferenceExpression
> to
> > support nested field access (or maybe a new
> > NestedFieldReferenceExpression).
> > But I have some concerns about evolving the
> > SupportsProjectionPushDown.applyProjection.
> > A projection is much simpler than Filter Expression which only needs to
> > represent the field indexes.
> > If we evolve `applyProjection` to accept `List
> > projectedFields`,
> > users have to convert the `List` back to
> int[][]
> > which is an overhead for users.
> > Field indexes (int[][]) is required to project schemas with the
> > utility org.apache.flink.table.connector.Projection.
> >
> >
> > Best,
> > Jark
> >
> >
> >
> > On Wed, 2 Aug 2023 at 07:40, Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu>
> > wrote:
> >
> > > Thanks Becket for the suggestion. That makes sense. Let me try it out
> and
> > > get back to you.
> > >
> > > Regards
> > > Venkata krishnan
> > >
> > >
> > > On Tue, Aug 1, 2023 at 9:04 AM Becket Qin 
> wrote:
> > >
> > > > This is a very useful feature in practice.
> > > >
> > > > It looks to me that the key issue here is that Flink
> ResolvedExpression
> > > > does not have necessary abstraction for nested field access. So the
> > > Calcite
> > > > RexFieldAccess does not have a counterpart in the ResolvedExpression.
> > The
> > > > FieldReferenceExpression only supports direct access to the fields,
> not
> > > > nested access.
> > > >
> > > > Theoretically speaking, this nested field reference is also required
> by
> > > > projection pushdown. However, we addressed that by using an int[][]
> in
> > > the
> > > > SupportsProjectionPushDown interface. Maybe we can do the following:
> > > >
> > > > 1. Extend the FieldReferenceExpression to include an int[] for nested
> > > field
> > > > access,
> > > > 2. By doing (1),
> > > > SupportsFilterPushDown#applyFilters(List) can
> > support
> > > > nested field access.
> > > > 3. Evolve the SupportsProjectionPushDown.applyProjection(int[][]
> > > > projectedFields, DataType producedDataType) to
> > > > applyProjection(List projectedFields,
> > DataType
> > > > producedDataType)
> > > >
> > > > This will need a FLIP.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Aug 1, 2023 at 11:42 PM Venkatakrishnan Sowrirajan <
> > > > vsowr...@asu.edu>
> > > > wrote:
> > > >
> > > > > Thanks for the response. Looking forward to your pointers. In the
> > > > > meanwhile, let me figure out how we can implement it. Will keep you
> > > > posted.
> > > > >
> > > > > On Mon, Jul 31, 2023, 11:43 PM liu ron  wrote:
> > > > >
> > > > > > Hi, Venkata
> > > > > >
> > > > > > Thanks for reporting this issue. Currently, Flink doesn't support
> > > > nested
> > > > > > filter pushdown. I also think that this optimization would be
> > useful,
> > > > > > especially for jobs, which may need to read a lot of data from
> the
> > > > > parquet
> > > > > > or orc file. We didn't move forward 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-02 Thread Becket Qin
Hi Jark,

If the FieldReferenceExpression contains an int[] to support a nested field
reference, List (or FieldReferenceExpression[])
and int[][] are actually equivalent. If we are designing this from scratch,
personally I prefer using List for consistency,
i.e. always resolving everything to expressions for users. Projection is a
simpler case, but should not be a special case. This avoids doing the same
thing in different ways which is also a confusion to the users. To me, the
int[][] format would become kind of a technical debt after we extend the
FieldReferenceExpression. Although we don't have to address it right away
in the same FLIP, this kind of debt accumulates over time and makes the
project harder to learn and maintain. So, personally I prefer to address
these technical debts as soon as possible.

Thanks,

Jiangjie (Becket) Qin

On Wed, Aug 2, 2023 at 8:19 PM Jark Wu  wrote:

> Hi,
>
> I agree with Becket that we may need to extend FieldReferenceExpression to
> support nested field access (or maybe a new
> NestedFieldReferenceExpression).
> But I have some concerns about evolving the
> SupportsProjectionPushDown.applyProjection.
> A projection is much simpler than Filter Expression which only needs to
> represent the field indexes.
> If we evolve `applyProjection` to accept `List
> projectedFields`,
> users have to convert the `List` back to int[][]
> which is an overhead for users.
> Field indexes (int[][]) is required to project schemas with the
> utility org.apache.flink.table.connector.Projection.
>
>
> Best,
> Jark
>
>
>
> On Wed, 2 Aug 2023 at 07:40, Venkatakrishnan Sowrirajan 
> wrote:
>
> > Thanks Becket for the suggestion. That makes sense. Let me try it out and
> > get back to you.
> >
> > Regards
> > Venkata krishnan
> >
> >
> > On Tue, Aug 1, 2023 at 9:04 AM Becket Qin  wrote:
> >
> > > This is a very useful feature in practice.
> > >
> > > It looks to me that the key issue here is that Flink ResolvedExpression
> > > does not have necessary abstraction for nested field access. So the
> > Calcite
> > > RexFieldAccess does not have a counterpart in the ResolvedExpression.
> The
> > > FieldReferenceExpression only supports direct access to the fields, not
> > > nested access.
> > >
> > > Theoretically speaking, this nested field reference is also required by
> > > projection pushdown. However, we addressed that by using an int[][] in
> > the
> > > SupportsProjectionPushDown interface. Maybe we can do the following:
> > >
> > > 1. Extend the FieldReferenceExpression to include an int[] for nested
> > field
> > > access,
> > > 2. By doing (1),
> > > SupportsFilterPushDown#applyFilters(List) can
> support
> > > nested field access.
> > > 3. Evolve the SupportsProjectionPushDown.applyProjection(int[][]
> > > projectedFields, DataType producedDataType) to
> > > applyProjection(List projectedFields,
> DataType
> > > producedDataType)
> > >
> > > This will need a FLIP.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Aug 1, 2023 at 11:42 PM Venkatakrishnan Sowrirajan <
> > > vsowr...@asu.edu>
> > > wrote:
> > >
> > > > Thanks for the response. Looking forward to your pointers. In the
> > > > meanwhile, let me figure out how we can implement it. Will keep you
> > > posted.
> > > >
> > > > On Mon, Jul 31, 2023, 11:43 PM liu ron  wrote:
> > > >
> > > > > Hi, Venkata
> > > > >
> > > > > Thanks for reporting this issue. Currently, Flink doesn't support
> > > nested
> > > > > filter pushdown. I also think that this optimization would be
> useful,
> > > > > especially for jobs, which may need to read a lot of data from the
> > > > parquet
> > > > > or orc file. We didn't move forward with this for some priority
> > > reasons.
> > > > >
> > > > > Regarding your three questions, I will respond to you later after
> my
> > > > > on-call is finished because I need to dive into the source code.
> > About
> > > > your
> > > > > commit, I don't think it's the right solution because
> > > > > FieldReferenceExpression doesn't currently support nested field
> > filter
> > > > > pushdown, maybe we need to extend it.
> > > > >
> > > > > You can also look further into reasonable solutions, which we'll
> > > discuss
> > > > > further later on.
> > > > >
> > > > > Best,
> > > > > Ron
> > > > >
> > > > >
> > > > > Venkatakrishnan Sowrirajan  于2023年7月29日周六
> 03:31写道:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Currently, I am working on adding support for nested fields
> filter
> > > push
> > > > > > down. In our use case running Flink on Batch, we found nested
> > fields
> > > > > filter
> > > > > > push down is key - without it, it is significantly slow. Note:
> > Spark
> > > > SQL
> > > > > > supports nested fields filter push down.
> > > > > >
> > > > > > While debugging the code using IcebergTableSource as the table
> > > source,
> > > > > > narrowed down the issue to missing support for
> > > > > > RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
> > > > > > As 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-02 Thread Jark Wu
Hi,

I agree with Becket that we may need to extend FieldReferenceExpression to
support nested field access (or maybe a new
NestedFieldReferenceExpression).
But I have some concerns about evolving the
SupportsProjectionPushDown.applyProjection.
A projection is much simpler than Filter Expression which only needs to
represent the field indexes.
If we evolve `applyProjection` to accept `List
projectedFields`,
users have to convert the `List` back to int[][]
which is an overhead for users.
Field indexes (int[][]) is required to project schemas with the
utility org.apache.flink.table.connector.Projection.


Best,
Jark



On Wed, 2 Aug 2023 at 07:40, Venkatakrishnan Sowrirajan 
wrote:

> Thanks Becket for the suggestion. That makes sense. Let me try it out and
> get back to you.
>
> Regards
> Venkata krishnan
>
>
> On Tue, Aug 1, 2023 at 9:04 AM Becket Qin  wrote:
>
> > This is a very useful feature in practice.
> >
> > It looks to me that the key issue here is that Flink ResolvedExpression
> > does not have necessary abstraction for nested field access. So the
> Calcite
> > RexFieldAccess does not have a counterpart in the ResolvedExpression. The
> > FieldReferenceExpression only supports direct access to the fields, not
> > nested access.
> >
> > Theoretically speaking, this nested field reference is also required by
> > projection pushdown. However, we addressed that by using an int[][] in
> the
> > SupportsProjectionPushDown interface. Maybe we can do the following:
> >
> > 1. Extend the FieldReferenceExpression to include an int[] for nested
> field
> > access,
> > 2. By doing (1),
> > SupportsFilterPushDown#applyFilters(List) can support
> > nested field access.
> > 3. Evolve the SupportsProjectionPushDown.applyProjection(int[][]
> > projectedFields, DataType producedDataType) to
> > applyProjection(List projectedFields, DataType
> > producedDataType)
> >
> > This will need a FLIP.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Aug 1, 2023 at 11:42 PM Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu>
> > wrote:
> >
> > > Thanks for the response. Looking forward to your pointers. In the
> > > meanwhile, let me figure out how we can implement it. Will keep you
> > posted.
> > >
> > > On Mon, Jul 31, 2023, 11:43 PM liu ron  wrote:
> > >
> > > > Hi, Venkata
> > > >
> > > > Thanks for reporting this issue. Currently, Flink doesn't support
> > nested
> > > > filter pushdown. I also think that this optimization would be useful,
> > > > especially for jobs, which may need to read a lot of data from the
> > > parquet
> > > > or orc file. We didn't move forward with this for some priority
> > reasons.
> > > >
> > > > Regarding your three questions, I will respond to you later after my
> > > > on-call is finished because I need to dive into the source code.
> About
> > > your
> > > > commit, I don't think it's the right solution because
> > > > FieldReferenceExpression doesn't currently support nested field
> filter
> > > > pushdown, maybe we need to extend it.
> > > >
> > > > You can also look further into reasonable solutions, which we'll
> > discuss
> > > > further later on.
> > > >
> > > > Best,
> > > > Ron
> > > >
> > > >
> > > > Venkatakrishnan Sowrirajan  于2023年7月29日周六 03:31写道:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Currently, I am working on adding support for nested fields filter
> > push
> > > > > down. In our use case running Flink on Batch, we found nested
> fields
> > > > filter
> > > > > push down is key - without it, it is significantly slow. Note:
> Spark
> > > SQL
> > > > > supports nested fields filter push down.
> > > > >
> > > > > While debugging the code using IcebergTableSource as the table
> > source,
> > > > > narrowed down the issue to missing support for
> > > > > RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
> > > > > As part of fixing it, I made changes by returning an
> > > > > Option(FieldReferenceExpression)
> > > > > with appropriate reference to the parent index and the child index
> > for
> > > > the
> > > > > nested field with the data type info.
> > > > >
> > > > > But this new ResolvedExpression cannot be converted to RexNode
> which
> > > > > happens in PushFilterIntoSourceScanRuleBase
> > > > > <
> > > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/logical/PushFilterIntoSourceScanRuleBase.java*L104__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Q5cMnVs$
> > > > > >
> > > > > .
> > > > >
> > > > > Few questions
> > > > >
> > > > > 1. Does FieldReferenceExpression support nested fields currently or
> > > > should
> > > > > it be extended to support nested fields? I couldn't figure this out
> > > from
> > > > > the PushProjectIntoTableScanRule that supports nested column
> > projection
> > > > > push down.
> > > > > 2. 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-01 Thread Venkatakrishnan Sowrirajan
Thanks Becket for the suggestion. That makes sense. Let me try it out and
get back to you.

Regards
Venkata krishnan


On Tue, Aug 1, 2023 at 9:04 AM Becket Qin  wrote:

> This is a very useful feature in practice.
>
> It looks to me that the key issue here is that Flink ResolvedExpression
> does not have necessary abstraction for nested field access. So the Calcite
> RexFieldAccess does not have a counterpart in the ResolvedExpression. The
> FieldReferenceExpression only supports direct access to the fields, not
> nested access.
>
> Theoretically speaking, this nested field reference is also required by
> projection pushdown. However, we addressed that by using an int[][] in the
> SupportsProjectionPushDown interface. Maybe we can do the following:
>
> 1. Extend the FieldReferenceExpression to include an int[] for nested field
> access,
> 2. By doing (1),
> SupportsFilterPushDown#applyFilters(List) can support
> nested field access.
> 3. Evolve the SupportsProjectionPushDown.applyProjection(int[][]
> projectedFields, DataType producedDataType) to
> applyProjection(List projectedFields, DataType
> producedDataType)
>
> This will need a FLIP.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Aug 1, 2023 at 11:42 PM Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu>
> wrote:
>
> > Thanks for the response. Looking forward to your pointers. In the
> > meanwhile, let me figure out how we can implement it. Will keep you
> posted.
> >
> > On Mon, Jul 31, 2023, 11:43 PM liu ron  wrote:
> >
> > > Hi, Venkata
> > >
> > > Thanks for reporting this issue. Currently, Flink doesn't support
> nested
> > > filter pushdown. I also think that this optimization would be useful,
> > > especially for jobs, which may need to read a lot of data from the
> > parquet
> > > or orc file. We didn't move forward with this for some priority
> reasons.
> > >
> > > Regarding your three questions, I will respond to you later after my
> > > on-call is finished because I need to dive into the source code. About
> > your
> > > commit, I don't think it's the right solution because
> > > FieldReferenceExpression doesn't currently support nested field filter
> > > pushdown, maybe we need to extend it.
> > >
> > > You can also look further into reasonable solutions, which we'll
> discuss
> > > further later on.
> > >
> > > Best,
> > > Ron
> > >
> > >
> > > Venkatakrishnan Sowrirajan  于2023年7月29日周六 03:31写道:
> > >
> > > > Hi all,
> > > >
> > > > Currently, I am working on adding support for nested fields filter
> push
> > > > down. In our use case running Flink on Batch, we found nested fields
> > > filter
> > > > push down is key - without it, it is significantly slow. Note: Spark
> > SQL
> > > > supports nested fields filter push down.
> > > >
> > > > While debugging the code using IcebergTableSource as the table
> source,
> > > > narrowed down the issue to missing support for
> > > > RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
> > > > As part of fixing it, I made changes by returning an
> > > > Option(FieldReferenceExpression)
> > > > with appropriate reference to the parent index and the child index
> for
> > > the
> > > > nested field with the data type info.
> > > >
> > > > But this new ResolvedExpression cannot be converted to RexNode which
> > > > happens in PushFilterIntoSourceScanRuleBase
> > > > <
> > > >
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/logical/PushFilterIntoSourceScanRuleBase.java*L104__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Q5cMnVs$
> > > > >
> > > > .
> > > >
> > > > Few questions
> > > >
> > > > 1. Does FieldReferenceExpression support nested fields currently or
> > > should
> > > > it be extended to support nested fields? I couldn't figure this out
> > from
> > > > the PushProjectIntoTableScanRule that supports nested column
> projection
> > > > push down.
> > > > 2. ExpressionConverter
> > > > <
> > > >
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/expressions/converter/ExpressionConverter.java*L197__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Z6jnkJm$
> > > > >
> > > > converts ResolvedExpression -> RexNode but the new
> > > FieldReferenceExpression
> > > > with the nested field cannot be converted to RexNode. This is why the
> > > > answer to the 1st question is key.
> > > > 3. Anything else that I'm missing here? or is there an even easier
> way
> > to
> > > > add support for nested fields filter push down?
> > > >
> > > > Partially working changes - Commit
> > > > <
> > > >
> > >
> >
> 

Re: FLINK-20767 - Support for nested fields filter push down

2023-08-01 Thread Becket Qin
This is a very useful feature in practice.

It looks to me that the key issue here is that Flink ResolvedExpression
does not have necessary abstraction for nested field access. So the Calcite
RexFieldAccess does not have a counterpart in the ResolvedExpression. The
FieldReferenceExpression only supports direct access to the fields, not
nested access.

Theoretically speaking, this nested field reference is also required by
projection pushdown. However, we addressed that by using an int[][] in the
SupportsProjectionPushDown interface. Maybe we can do the following:

1. Extend the FieldReferenceExpression to include an int[] for nested field
access,
2. By doing (1),
SupportsFilterPushDown#applyFilters(List) can support
nested field access.
3. Evolve the SupportsProjectionPushDown.applyProjection(int[][]
projectedFields, DataType producedDataType) to
applyProjection(List projectedFields, DataType
producedDataType)

This will need a FLIP.

Thanks,

Jiangjie (Becket) Qin

On Tue, Aug 1, 2023 at 11:42 PM Venkatakrishnan Sowrirajan 
wrote:

> Thanks for the response. Looking forward to your pointers. In the
> meanwhile, let me figure out how we can implement it. Will keep you posted.
>
> On Mon, Jul 31, 2023, 11:43 PM liu ron  wrote:
>
> > Hi, Venkata
> >
> > Thanks for reporting this issue. Currently, Flink doesn't support nested
> > filter pushdown. I also think that this optimization would be useful,
> > especially for jobs, which may need to read a lot of data from the
> parquet
> > or orc file. We didn't move forward with this for some priority reasons.
> >
> > Regarding your three questions, I will respond to you later after my
> > on-call is finished because I need to dive into the source code. About
> your
> > commit, I don't think it's the right solution because
> > FieldReferenceExpression doesn't currently support nested field filter
> > pushdown, maybe we need to extend it.
> >
> > You can also look further into reasonable solutions, which we'll discuss
> > further later on.
> >
> > Best,
> > Ron
> >
> >
> > Venkatakrishnan Sowrirajan  于2023年7月29日周六 03:31写道:
> >
> > > Hi all,
> > >
> > > Currently, I am working on adding support for nested fields filter push
> > > down. In our use case running Flink on Batch, we found nested fields
> > filter
> > > push down is key - without it, it is significantly slow. Note: Spark
> SQL
> > > supports nested fields filter push down.
> > >
> > > While debugging the code using IcebergTableSource as the table source,
> > > narrowed down the issue to missing support for
> > > RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
> > > As part of fixing it, I made changes by returning an
> > > Option(FieldReferenceExpression)
> > > with appropriate reference to the parent index and the child index for
> > the
> > > nested field with the data type info.
> > >
> > > But this new ResolvedExpression cannot be converted to RexNode which
> > > happens in PushFilterIntoSourceScanRuleBase
> > > <
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/logical/PushFilterIntoSourceScanRuleBase.java*L104__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Q5cMnVs$
> > > >
> > > .
> > >
> > > Few questions
> > >
> > > 1. Does FieldReferenceExpression support nested fields currently or
> > should
> > > it be extended to support nested fields? I couldn't figure this out
> from
> > > the PushProjectIntoTableScanRule that supports nested column projection
> > > push down.
> > > 2. ExpressionConverter
> > > <
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/expressions/converter/ExpressionConverter.java*L197__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Z6jnkJm$
> > > >
> > > converts ResolvedExpression -> RexNode but the new
> > FieldReferenceExpression
> > > with the nested field cannot be converted to RexNode. This is why the
> > > answer to the 1st question is key.
> > > 3. Anything else that I'm missing here? or is there an even easier way
> to
> > > add support for nested fields filter push down?
> > >
> > > Partially working changes - Commit
> > > <
> > >
> >
> https://urldefense.com/v3/__https://github.com/venkata91/flink/commit/00cdf34ecf9be3ba669a97baaed4b69b85cd26f9__;!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9XeOjJ_a$
> > > >
> > > Please
> > > feel free to leave a comment directly in the commit.
> > >
> > > Any pointers here would be much appreciated! Thanks in advance.
> > >
> > > Disclaimer: Relatively new to Flink code base especially Table planner
> > :-).
> > >
> > > Regards
> > > Venkata krishnan
> > >
> >
>


Re: FLINK-20767 - Support for nested fields filter push down

2023-08-01 Thread Venkatakrishnan Sowrirajan
Thanks for the response. Looking forward to your pointers. In the
meanwhile, let me figure out how we can implement it. Will keep you posted.

On Mon, Jul 31, 2023, 11:43 PM liu ron  wrote:

> Hi, Venkata
>
> Thanks for reporting this issue. Currently, Flink doesn't support nested
> filter pushdown. I also think that this optimization would be useful,
> especially for jobs, which may need to read a lot of data from the parquet
> or orc file. We didn't move forward with this for some priority reasons.
>
> Regarding your three questions, I will respond to you later after my
> on-call is finished because I need to dive into the source code. About your
> commit, I don't think it's the right solution because
> FieldReferenceExpression doesn't currently support nested field filter
> pushdown, maybe we need to extend it.
>
> You can also look further into reasonable solutions, which we'll discuss
> further later on.
>
> Best,
> Ron
>
>
> Venkatakrishnan Sowrirajan  于2023年7月29日周六 03:31写道:
>
> > Hi all,
> >
> > Currently, I am working on adding support for nested fields filter push
> > down. In our use case running Flink on Batch, we found nested fields
> filter
> > push down is key - without it, it is significantly slow. Note: Spark SQL
> > supports nested fields filter push down.
> >
> > While debugging the code using IcebergTableSource as the table source,
> > narrowed down the issue to missing support for
> > RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
> > As part of fixing it, I made changes by returning an
> > Option(FieldReferenceExpression)
> > with appropriate reference to the parent index and the child index for
> the
> > nested field with the data type info.
> >
> > But this new ResolvedExpression cannot be converted to RexNode which
> > happens in PushFilterIntoSourceScanRuleBase
> > <
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/logical/PushFilterIntoSourceScanRuleBase.java*L104__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Q5cMnVs$
> > >
> > .
> >
> > Few questions
> >
> > 1. Does FieldReferenceExpression support nested fields currently or
> should
> > it be extended to support nested fields? I couldn't figure this out from
> > the PushProjectIntoTableScanRule that supports nested column projection
> > push down.
> > 2. ExpressionConverter
> > <
> >
> https://urldefense.com/v3/__https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/expressions/converter/ExpressionConverter.java*L197__;Iw!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9Z6jnkJm$
> > >
> > converts ResolvedExpression -> RexNode but the new
> FieldReferenceExpression
> > with the nested field cannot be converted to RexNode. This is why the
> > answer to the 1st question is key.
> > 3. Anything else that I'm missing here? or is there an even easier way to
> > add support for nested fields filter push down?
> >
> > Partially working changes - Commit
> > <
> >
> https://urldefense.com/v3/__https://github.com/venkata91/flink/commit/00cdf34ecf9be3ba669a97baaed4b69b85cd26f9__;!!IKRxdwAv5BmarQ!fNgxcul8ZGwkNE9ygOeVGlWlU6m_MLMXf4A3S3oQu9LBzYTPF90pZ7uXSGMr-5dFmzRn37-e9XeOjJ_a$
> > >
> > Please
> > feel free to leave a comment directly in the commit.
> >
> > Any pointers here would be much appreciated! Thanks in advance.
> >
> > Disclaimer: Relatively new to Flink code base especially Table planner
> :-).
> >
> > Regards
> > Venkata krishnan
> >
>


Re: FLINK-20767 - Support for nested fields filter push down

2023-08-01 Thread liu ron
Hi, Venkata

Thanks for reporting this issue. Currently, Flink doesn't support nested
filter pushdown. I also think that this optimization would be useful,
especially for jobs, which may need to read a lot of data from the parquet
or orc file. We didn't move forward with this for some priority reasons.

Regarding your three questions, I will respond to you later after my
on-call is finished because I need to dive into the source code. About your
commit, I don't think it's the right solution because
FieldReferenceExpression doesn't currently support nested field filter
pushdown, maybe we need to extend it.

You can also look further into reasonable solutions, which we'll discuss
further later on.

Best,
Ron


Venkatakrishnan Sowrirajan  于2023年7月29日周六 03:31写道:

> Hi all,
>
> Currently, I am working on adding support for nested fields filter push
> down. In our use case running Flink on Batch, we found nested fields filter
> push down is key - without it, it is significantly slow. Note: Spark SQL
> supports nested fields filter push down.
>
> While debugging the code using IcebergTableSource as the table source,
> narrowed down the issue to missing support for
> RexNodeExtractor#RexNodeToExpressionConverter#visitFieldAccess.
> As part of fixing it, I made changes by returning an
> Option(FieldReferenceExpression)
> with appropriate reference to the parent index and the child index for the
> nested field with the data type info.
>
> But this new ResolvedExpression cannot be converted to RexNode which
> happens in PushFilterIntoSourceScanRuleBase
> <
> https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/rules/logical/PushFilterIntoSourceScanRuleBase.java#L104
> >
> .
>
> Few questions
>
> 1. Does FieldReferenceExpression support nested fields currently or should
> it be extended to support nested fields? I couldn't figure this out from
> the PushProjectIntoTableScanRule that supports nested column projection
> push down.
> 2. ExpressionConverter
> <
> https://github.com/apache/flink/blob/3f63e03e83144e9857834f8db1895637d2aa218a/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/expressions/converter/ExpressionConverter.java#L197
> >
> converts ResolvedExpression -> RexNode but the new FieldReferenceExpression
> with the nested field cannot be converted to RexNode. This is why the
> answer to the 1st question is key.
> 3. Anything else that I'm missing here? or is there an even easier way to
> add support for nested fields filter push down?
>
> Partially working changes - Commit
> <
> https://github.com/venkata91/flink/commit/00cdf34ecf9be3ba669a97baaed4b69b85cd26f9
> >
> Please
> feel free to leave a comment directly in the commit.
>
> Any pointers here would be much appreciated! Thanks in advance.
>
> Disclaimer: Relatively new to Flink code base especially Table planner :-).
>
> Regards
> Venkata krishnan
>