Re: puzzle on OperatorChain

2024-07-04 Thread Yunfeng Zhou
Hi Enric, Yes that even if there is only one operator, StreamTask will still create an OperatorChain for it. OperatorChain provides an abstract to process events like endInputs, checkpoints and OperatorEvents in a unified way, no matter how may operators are running in the StreamTask. You may

Re: watermark and barrier

2024-07-03 Thread Yunfeng Zhou
Hi Enric, OperatorCoordinator is a mechanism allowing subtasks of the same operator to communicate with each other and thus unifying the behavior of subtasks running on different machines. It has mainly been used in source operators to distribute source splits. As for watermarks, there are

Re: 退订

2024-05-09 Thread Yunfeng Zhou
Hi, 退订请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org . Best, yunfeng On Thu, May 9, 2024 at 5:58 PM xpfei0811 wrote: > > 退订 > > 回复的原邮件 > | 发件人 | wangfengyang | > | 发送日期 | 2024年04月23日 18:10 | > | 收件人 | user-zh | > | 主题 | 退订 | > 退订

Re: Coordinator of operator ... does not exist or the job vertex this operator belongs to is not initialized.

2024-05-05 Thread Yunfeng Zhou
Hi Eduard, You may need to set log level = INFO to see if there are any other error messages generated in the JM or TM's log. The current exception message seems to be a result error generated from the JM, but the causing error message should still be lying somewhere in the TM's log. Best

Re: Flink流批一体应用在实时数仓数据核对场景下有哪些注意事项?

2024-04-18 Thread Yunfeng Zhou
流模式和批模式在watermark和一些算子语义等方面上有一些不同,但没看到Join和Window算子上有什么差异,这方面应该在batch mode下应该是支持的。具体的两种模式的比较可以看一下这个文档 https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/datastream/execution_mode/ On Thu, Apr 18, 2024 at 9:44 AM casel.chen wrote: > > 有人尝试这么实践过么?可以给一些建议么?谢谢! > > > > > > > > > > > >

Re: Understanding event time wrt watermarking strategy in flink

2024-04-14 Thread Yunfeng Zhou
> does this mean that value of A should be less than that of B because records > with timestamp less than T - B would have already been dropped at the source. > > If this is not the case than how does lateness work with our of order > boundedness ? > > Thanks > Sachi

Re: Understanding event time wrt watermarking strategy in flink

2024-04-12 Thread Yunfeng Zhou
Hi Sachin, 1. When your Flink job performs an operation like map or flatmap, the output records would be automatically assigned with the same timestamp as the input record. You don't need to manually assign the timestamp in each step. So the windowing result in your example should be as you have

Re: Debugging Kryo Fallback

2024-04-08 Thread Yunfeng Zhou
gt; > Regards, > > Salva > > On Sun, Apr 7, 2024 at 5:43 AM Yunfeng Zhou > wrote: >> >> Hi Salva, >> >> According to the description of the configuration >> `pipeline.generic-types`, after setting this to false you should be >> able to find Unsuppo

Re: Combining multiple stages into a multi-stage processing pipeline

2024-04-06 Thread Yunfeng Zhou
Hi Mark, IMHO, your design of the Flink application is generally feasible. In Flink ML, I have once met a similar design in ChiSqTest operator, where the input data is first aggregated to generate some results and then broadcast and connected with other result streams from the same input

Re: HBase SQL连接器为啥不支持ARRAY/MAP/ROW类型

2024-04-06 Thread Yunfeng Zhou
应该是由于这些复杂集合在HBase中没有一个直接与之对应的数据类型,所以Flink SQL没有直接支持的。 一种思路是把这些数据类型按照某种格式(比如json)转换成字符串/序列化成byte array,把字符串存到HBase中,读取出来的时候也再解析/反序列化。 On Mon, Apr 1, 2024 at 7:38 PM 王广邦 wrote: > > HBase SQL 连接器(flink-connector-hbase_2.11) 为啥不支持数据类型:ARRAY、MAP / MULTISET、ROW > 不支持? >

Re: Error flink 1.18 not found ExecutionConfig

2023-11-27 Thread Yunfeng Zhou
environment for Flink job submission and the environment of the Flink cluster are the same. Best, Yunfeng Zhou On Mon, Nov 27, 2023 at 8:39 PM Dulce Morim wrote: > > Hi, > > In my IDE it works. I'm trying to run it in the EAR with all dependencies > jars, but it gives an error. Could

Re: 退订

2023-11-06 Thread Yunfeng Zhou
Hi, 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 来取消订阅邮件。 Best Yunfeng Zhou On Mon, Nov 6, 2023 at 5:30 PM maozhaolin wrote: > > 退订

Re: 退订

2023-10-06 Thread Yunfeng Zhou
Hi, 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 来取消订阅邮件。 Best, Yunfeng On Wed, Oct 4, 2023 at 10:07 AM 1 wrote: > >

Re: Side outputs documentation

2023-09-25 Thread Yunfeng Zhou
scenarios where it's really a must. > > Regards, > Alexis. > > Am Mo., 25. Sept. 2023 um 05:17 Uhr schrieb Yunfeng Zhou > : >> >> Hi Alexis, >> >> If you create OutputTag with the constructor `OutputTag(String id)`, >> you need to make it anonymou

Re: 退订

2023-09-24 Thread Yunfeng Zhou
Hi, 请发送任意内容的邮件到 user-zh-unsubscr...@flink.apache.org 地址来取消订阅来自 user-zh@flink.apache.org 邮件组的邮件,你可以参考[1][2] 管理你的邮件订阅。 Please send email to user-zh-unsubscr...@flink.apache.org if you want to unsubscribe the mail from user-zh@flink.apache.org , and you can refer [1][2] for more details. Best,

Re: Side outputs documentation

2023-09-24 Thread Yunfeng Zhou
Hi Alexis, If you create OutputTag with the constructor `OutputTag(String id)`, you need to make it anonymous for Flink to analyze the type information. But if you use the constructor `OutputTag(String id, TypeInformation typeInfo)`, you need not make it anonymous as you have provided the type

Re: New flink release date

2022-10-09 Thread Yunfeng Zhou
Hi Marco, The progress of the release of 1.16.0 can be tracked on d...@flink.apache.org. You can subscribe to this mail list and pay attention to related emails, like "[summary] Flink 1.16 release sync". As I can see there is an ongoing vote for the 1.16.0 release candidate #1, and you may get

Re: Global window in batch mode

2022-09-30 Thread Yunfeng Zhou
/datastream/EndOfStreamWindows.java The usage would be like: .keyBy(new MyKeySelector()) .window(EndOfStreamWindows.get()) .reduce(new MyReduceFunction()) Best, Yunfeng Zhou On Thu, Sep 29, 2022 at 9:36 PM Vararu, Vadim wrote: > Hi all, > > > > I need to configure a keyed global wi

Re: Is there any Natural Language Processing samples for flink?

2022-07-26 Thread Yunfeng Zhou
Hi John, So far as I know, Flink does not have an official library or sample specializing in NLP cases yet. You can refer to Flink ML[1] for machine learning samples or Deep Learning on Flink[2] for deep learning samples. [1] https://github.com/apache/flink-ml [2]

Re: jobmanager 与taskmanager间的对象传递

2022-07-03 Thread Yunfeng Zhou
你好。 如果只是需要从各个subtask中收集一些信息,在JobManager中汇总的话,我觉得可以用累加器和计数器[1]。 如果需要双向通信的话,可以考虑一下FLIP-27[2]引入的OperatorCoordinator。如何通过通信来传递对象可以通过自定义算子或函数来实现。 在自定义算子中使用OperatorCoordinator可能还有一些不方便的地方,可以追踪一下相关ticket的进展[3]。 [1]

Re: Table DataStream Conversion Lost Watermark

2021-12-06 Thread Yunfeng Zhou
AX_WATERMARK. I will > definitely forward this. > > But toDataStream forwards watermarks correctly. > > I hope this helps. Or do you think we should also rediscuss the > fromDataStream watermark behavior? > > Regards, > Timo > > > On 06.12.21 10:26, Yunfeng Zhou w

Table DataStream Conversion Lost Watermark

2021-11-04 Thread Yunfeng Zhou
Hi, I found that if I convert a Datastream into Table and back into Datastream, watermark of the stream will be lost. As shown in the program below, the TestOperator before the conversion will have its processWatermark() method triggered and watermark value printed, but the one after the