Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-19 Thread Yik San Chan
Regards, Dian. On April 16, 2021, at 8:24 PM, Fabian Paul wrote: Hi Yik San, I think the usage of vectorized UDFs highly depends on your input and

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-19 Thread Dian Fu
a-artisans.com>> wrote: Hi Yik San, I think the usage of vectorized UDFs highly depends on your input and output formats. For your example my first impression would say that parsing a JSON string i

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-19 Thread Yik San Chan
a JSON string is always a rather expensive operation and the vectorization has not much impact on that. I am ccing Dian Fu, who is more familiar with PyFlink. Best, Fabian. On

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-19 Thread Dian Fu
r with PyFlink. Best, Fabian. On 16. Apr 2021, at 11:04, Yik San Chan <mailto:evan.chanyik...@gmail.com> wrote: The question is cross-posted on Stack Overflow https://stackoverflow.com/questions/67122265/pyflin

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-19 Thread Yik San Chan
would say that parsing a JSON string is always a rather expensive operation and the vectorization has not much impact on that. I am ccing Dian Fu, who is more familiar with PyFlink. Best, Fabian. On 16. Apr 2021, at 11:0

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-19 Thread Yik San Chan
on that. I am ccing Dian Fu, who is more familiar with PyFlink. Best, Fabian. On 16. Apr 2021, at 11:04, Yik San Chan wrote: The question is cross-posted on Stack Overflow https://stackoverflow.com/questions/67122265/pyflink-udf-when-to-use-vecto

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-18 Thread Dian Fu
gmail.com>> wrote: The question is cross-posted on Stack Overflow https://stackoverflow.com/questions/67122265/pyflink-udf-when-to-use-vectorized-vs-scalar

Re: PyFlink UDF: When to use vectorized vs scalar

2021-04-16 Thread Fabian Paul
more familiar with PyFlink. Best, Fabian. On 16. Apr 2021, at 11:04, Yik San Chan wrote: The question is cross-posted on Stack Overflow https://stackoverflow.com/questions/67122265/pyflink-udf-when-to-use-vectorized-vs-scalar

PyFlink UDF: When to use vectorized vs scalar

2021-04-16 Thread Yik San Chan
The question is cross-posted on Stack Overflow: https://stackoverflow.com/questions/67122265/pyflink-udf-when-to-use-vectorized-vs-scalar Is there a simple set of rules to follow when deciding between vectorized and scalar PyFlink UDFs? According to [docs]( https://ci.apache.org/projects/flink