Hi guys,
Since there are no further comments, kindly pinging for the vote thread
[1] :D
Thanks,
Aitozi.
[1]: https://lists.apache.org/thread/7g5n2vshosom2dj9bp7x4n01okrnx4xx
On Mon, Jun 26, 2023 at 10:31, Aitozi wrote:
> Hi Lincoln,
> Thanks for your confirmation. I have updated the consensus to the
Hi Awake,
Thanks for your good point, updated
Best,
Aitozi.
On Wed, Jul 5, 2023 at 11:29, 宇航 李 wrote:
> Hi Aitozi,
>
> I think it is necessary to add the following description in FLIP to
> express the difference between user-defined asynchronous table function and
> AsyncTableFunction:
>
> User-defined
Hi Aitozi,
I think it is necessary to add the following description in FLIP to express the
difference between user-defined asynchronous table function and
AsyncTableFunction:
User-defined asynchronous table functions allow complex parameters (e.g., Row
type) to be passed to function, which is
Hi Lincoln,
Thanks for your confirmation. I have updated the consensus to the FLIP
doc.
If there are no other comments, I'd like to restart the vote process in [1]
today.
https://lists.apache.org/thread/7g5n2vshosom2dj9bp7x4n01okrnx4xx
Thanks,
Aitozi.
On Wed, Jun 21, 2023 at 22:29, Lincoln Lee wrote:
>
Hi Aitozi,
Thanks for your updates!
By the design of hints, the hints after the select clause belong to the query
hints category, and this new hint is also a kind of join hint[1].
Joining a table function is one of the join types defined by Flink SQL joins[2];
all existing join hints[1] omit the 'join'
Hi all,
Sorry for the late reply. I had a discussion with Lincoln offline,
mainly about
the naming of the hint option. Thanks, Lincoln, for the valuable suggestions.
Let me answer the last email inline.
>For `JavaAsyncTableFunc0` in flip, can you use a scenario like RPC call as
an example?
Hi Aitozi,
Thanks for your reply! Giving SQL users more flexibility to get
asynchronous processing capabilities via lateral join table function: +1 for
this
For `JavaAsyncTableFunc0` in flip, can you use a scenario like RPC call as
an example?
For the name of this query hint, 'LATERAL' (include
Hi Lincoln
Many thanks for your valuable questions. I will try to answer your
questions inline.
>Does the async udtf bring any additional benefits besides a
lighter implementation?
IMO, async udtf is more than a lighter implementation. It can act as a
general way for sql users to use the
Hi Aitozi,
Sorry for the late reply here! Supporting async UDTF (`AsyncTableFunction`)
directly in SQL seems like an attractive feature, but there are two issues
that need to be addressed before we can be sure to add it:
1. As mentioned in the flip[1], the current lookup function can already
Get your meaning now, thanks :)
Best,
Aitozi.
On Tue, Jun 13, 2023 at 11:16, Feng Jin wrote:
> Hi Aitozi,
>
> Sorry for the confusing description.
>
> What I meant was that if we need to remind users about thread safety issues,
> we should introduce the new UDTF interface instead of executing the
> original
Hi Aitozi,
Sorry for the confusing description.
What I meant was that if we need to remind users about thread safety issues,
we should introduce the new UDTF interface instead of executing the
original UDTF asynchronously. Therefore, I agree with introducing the
AsyncTableFunction.
Best,
Feng
On
Hi Feng,
Thanks for your question. We do not provide a way to switch a UDTF
between sync and async mode,
so there should be no thread safety problem here.
Best,
Aitozi
On Tue, Jun 13, 2023 at 10:31, Feng Jin wrote:
> Hi Aitozi, We do need to remind users about thread safety issues. Thank you
> for your
Hi Aitozi, We do need to remind users about thread safety issues. Thank you
for your efforts on this FLIP. I have no further questions.
Best, Feng
On Tue, Jun 13, 2023 at 6:05 AM Jing Ge wrote:
> Hi Aitozi,
>
> Thanks for taking care of that part. I have no other concern.
>
> Best regards,
>
Hi Aitozi,
Thanks for taking care of that part. I have no other concern.
Best regards,
Jing
On Mon, Jun 12, 2023 at 5:38 PM Aitozi wrote:
> BTW, if there are no more blocking issues or comments, I would like to
> start a VOTE in another thread this Wednesday, 6.14
>
> Thanks,
> Aitozi.
>
BTW, if there are no more blocking issues or comments, I would like to
start a VOTE in another thread this Wednesday, 6.14
Thanks,
Aitozi.
On Mon, Jun 12, 2023 at 23:34, Aitozi wrote:
> Hi, Jing,
> Thanks for your explanation. I get your point now.
>
> For the performance part, I think it's a good
Hi, Jing,
Thanks for your explanation. I get your point now.
For the performance part, I think it's a good idea to run a case that returns a
big table; the memory consumption
should be a point to take care of. Because in the ordered mode, the
head element in the buffer may affect the
total
Hi Aitozi,
Which key will be used for lookup is not an issue, only one row will be
required for each key in order to enrich it. True, it depends on the
implementation whether multiple rows or single row for each key will be
returned. However, for the lookup & enrichment scenario, one row/key is
Hi Jing,
I mean the join key does not have to be the primary key or a unique
index of the database.
In this situation, we may get multiple rows back for one join key. I think
that's why
LookupFunction#lookup returns a collection of RowData.
BTW, I think the behavior of lookup join
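
A minimal sketch of the multi-row case described above, assuming a plain in-memory table in place of an external database. It does not use Flink's `LookupFunction`; the class `MultiRowLookupSketch`, the method `lookupAsync`, and the sample rows are all hypothetical and only illustrate why a lookup on a non-unique join key naturally returns a collection.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class MultiRowLookupSketch {
    // Hypothetical dimension table as (joinKey, value) pairs.
    // The join key is NOT unique: "k1" appears twice.
    static final List<Map.Entry<String, String>> ROWS = List.of(
            Map.entry("k1", "a"),
            Map.entry("k1", "b"),
            Map.entry("k2", "c"));

    // Asynchronously collects every row whose key equals joinKey.
    // The result is a collection because one key may match many rows.
    public static CompletableFuture<Collection<String>> lookupAsync(String joinKey) {
        return CompletableFuture.supplyAsync(() -> {
            List<String> matches = new ArrayList<>();
            for (Map.Entry<String, String> row : ROWS) {
                if (row.getKey().equals(joinKey)) {
                    matches.add(row.getValue());
                }
            }
            return matches;
        });
    }

    public static void main(String[] args) {
        // join() blocks until the asynchronous lookup has finished.
        System.out.println(lookupAsync("k1").join());
        System.out.println(lookupAsync("k2").join());
    }
}
```

With a unique key the collection simply holds at most one element, so the single-row enrichment case is covered by the same shape.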
Hi Aitozi,
The keyRow used in this case contains all keys[1].
Best regards,
Jing
[1]
https://github.com/apache/flink/blob/191ec6ca3943d7119f14837efe112e074d815c47/flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/LookupFunction.java#L49
On Fri, Jun 9, 2023 at 3:42
Hi Jing,
The performance test is added to the FLIP.
As far as I know, the lookup join can return multiple rows; it depends on
whether the join key
is the primary key of the external database or not. The `lookup` [1] will
return a collection of
joined results, and each of them will be collected
Hi Aitozi,
Thanks for the feedback. Looking forward to the performance tests.
Afaik, lookup returns one row for each key [1] [2]. Conceptually, the
lookup function is used to enrich column(s) from the dimension table. If,
for the given key, there will be more than one row, there will be no way
Hi Feng,
Thanks for your good question. It would be very attractive if we could support
running the original
UDTF asynchronously without introducing new UDTFs.
But I think it's not easy, because the original UDTFs are executed with one
instance per parallelism,
so there is no thread safety problem for the user. But
hi, Aitozi
Thank you for your proposal.
In our production environment, we often encounter efficiency issues with
user-defined functions (UDFs), which can lead to slower processing speeds.
I believe that this FLIP will make it easier for UDFs to be executed more
efficiently.
I have a small
Hi Jing
Thanks for your good questions. I have added the example to the FLIP.
> Only one row for each lookup
lookup can also return multiple rows, based on the query result. [1]
[1]:
Hi Aitozi,
Thanks for the clarification. The code example looks interesting. I would
suggest adding them into the FLIP. The description with code examples will
help readers understand the motivation and how to use it. Afaiac, it is a
valid feature for Flink users.
As we knew, lookup join is
Hi Mason,
Thanks for your input. I think if we support the user-defined async
table function,
users will be able to use it to hold a batch of data and then handle it all at
once in the customized function.
AsyncSink is meant for the sink operator. I have not figured out how to
integrate it in this case.
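
A rough sketch of the "hold a batch, then handle it at one time" idea, written in plain Java so it runs on its own. Nothing here is Flink API: `BatchingSketch`, `submit`, `flush`, and the `batchCall` function standing in for one RPC over many keys are hypothetical names. It also assumes the batch call returns one result per key, in the same order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

public class BatchingSketch {
    private final int batchSize;
    // Stand-in for a single RPC that handles many keys at once (assumption).
    private final Function<List<String>, List<String>> batchCall;
    private final List<String> keys = new ArrayList<>();
    private final List<CompletableFuture<String>> pending = new ArrayList<>();

    public BatchingSketch(int batchSize, Function<List<String>, List<String>> batchCall) {
        this.batchSize = batchSize;
        this.batchCall = batchCall;
    }

    // Buffers the key and returns a future that completes when its batch is flushed.
    public synchronized CompletableFuture<String> submit(String key) {
        CompletableFuture<String> future = new CompletableFuture<>();
        keys.add(key);
        pending.add(future);
        if (keys.size() >= batchSize) {
            flush();
        }
        return future;
    }

    // Sends all buffered keys in one call and completes the waiting futures in order.
    public synchronized void flush() {
        if (keys.isEmpty()) {
            return;
        }
        List<String> results = batchCall.apply(new ArrayList<>(keys));
        for (int i = 0; i < pending.size(); i++) {
            pending.get(i).complete(results.get(i));
        }
        keys.clear();
        pending.clear();
    }

    public static void main(String[] args) {
        BatchingSketch batcher = new BatchingSketch(2,
                ks -> ks.stream().map(String::toUpperCase).collect(Collectors.toList()));
        CompletableFuture<String> first = batcher.submit("a");
        CompletableFuture<String> second = batcher.submit("b"); // reaches batchSize, triggers flush
        System.out.println(first.join() + " " + second.join());
    }
}
```

A real function would also need a time-based flush so that a partially filled batch does not wait forever; that part is left out of the sketch.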
Hi Aitozi,
I think it makes sense to make it easier for SQL users to make RPCs. Do you
think your proposal can extend to the ability to batch data for the RPC?
This is also another common strategy to increase throughput. Also, have you
considered solving this a bit differently by leveraging
One more thing for discussion:
In our internal implementation, we reuse the options
`table.exec.async-lookup.buffer-capacity` and
`table.exec.async-lookup.timeout` to configure
the async UDTF. Do you think we should add two extra options to distinguish
them from the lookup options, such as
Hi Jing,
> what is the difference between the RPC call or query you mentioned
and the lookup in a very
general way
I think the RPC call or query service is quite similar to the lookup join.
But lookup join should work
with `LookupTableSource`.
Let's see how we can perform an async RPC call
Hi Aitozi,
Thanks for the update. Just out of curiosity, what is the difference
between the RPC call or query you mentioned and the lookup in a very
general way? Since Lateral join is used in the FLIP. Is there any special
thought for that? Sorry for asking so many questions. The FLIP contains
Hi Jing,
I have updated the proposed changes in the FLIP. IMO, lookup has its
clear
async call requirement because it is an IO-heavy operator. In our usage, SQL
users have
logic that makes RPC calls or queries a third-party service, which is also IO
intensive.
In these cases, we'd like to leverage
Hi Aitozi,
Sorry for the late reply. Would you like to update the proposed changes
with more details into the FLIP too?
I got your point. It looks like a rational idea. However, since lookup has
its clear async call requirement, are there any real use cases that
need this change? This will help
Hi Jing,
What do you think about it? Can we move forward with this feature?
Thanks,
Aitozi.
On Mon, May 29, 2023 at 09:56, Aitozi wrote:
> Hi Jing,
> > "Do you mean to support the AsyncTableFunction beyond the
> LookupTableSource?"
> Yes, I mean to support the AsyncTableFunction beyond the
Hi Jing,
> "Do you mean to support the AsyncTableFunction beyond the
LookupTableSource?"
Yes, I mean to support the AsyncTableFunction beyond the LookupTableSource.
The "AsyncTableFunction" is a function that can be executed asynchronously
(with AsyncWaitOperator).
The async lookup join is a
Hi Aitozi,
Thanks for the clarification. The naming "Lookup" might suggest using it
for table look up. But conceptually what the eval() method will do is to
get a collection of results(Row, RowData) from the given keys. How it will
be done depends on the implementation, i.e. you can implement
Hi Jing,
Thanks for your response. As stated in the FLIP, this FLIP is meant to
support
user-defined async table functions. As described in the Flink documentation [1]:
> Async table functions are special functions for table sources that perform
> a lookup.
So end users cannot directly
Hi Aitozi,
Thanks for your proposal. I am not quite sure if I understood your thoughts
correctly. You described a special case implementation of the
AsyncTableFunction with no public API changes. Would you please elaborate on
your purpose of writing a FLIP according to the FLIP documentation[1]?
May I ask for some feedback :D
Thanks,
Aitozi
On Tue, May 23, 2023 at 19:14, Aitozi wrote:
>
> Just caught a use case report from Giannis Polyzos for this usage:
>
> https://lists.apache.org/thread/qljwd40v5ntz6733cwcdr8s4z97b343b
>
> On Tue, May 23, 2023 at 17:45, Aitozi wrote:
> >
> > Hi guys,
> > I want to bring
Just caught a use case report from Giannis Polyzos for this usage:
https://lists.apache.org/thread/qljwd40v5ntz6733cwcdr8s4z97b343b
On Tue, May 23, 2023 at 17:45, Aitozi wrote:
>
> Hi guys,
> I want to bring up a discussion about adding support of User
> Defined AsyncTableFunction in Flink.
> Currently,
Hi guys,
I want to bring up a discussion about adding support of User
Defined AsyncTableFunction in Flink.
Currently, async table functions are special functions for table sources
to perform
async lookups. However, it's worth supporting user-defined async
table functions.
Because, in this way,