Re: How to combine historical data with real-time data in Flink

2021-12-05 Thread ????
600w (6 million) rows ---- From: "user-zh"

Re: Buffering when connecting streams

2021-12-05 Thread Piotr Nowojski
Hi Alexis and David, This actually cannot happen. There are mechanisms in the code to make sure none of the inputs is starved IF there is some data to be read. The only time an input can be blocked is during the alignment phase of aligned checkpoints under back pressure. If there was a back
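The alignment phase Piotr mentions can be illustrated with a small, self-contained model: once a checkpoint barrier arrives on one channel, that channel's records are buffered rather than processed until the barrier has arrived on every channel. This is only a conceptual sketch, not Flink's actual implementation.

```python
from collections import deque

def align_checkpoint(channels):
    """Conceptual sketch of aligned-checkpoint barrier alignment: once a
    barrier arrives on one input channel, that channel is blocked (its
    records are buffered, not processed) until the barrier has arrived on
    every channel. Not Flink's actual implementation."""
    queues = {name: deque(records) for name, records in channels.items()}
    blocked, processed, buffered = set(), [], []
    while any(queues.values()) and blocked != set(queues):
        for name, q in queues.items():
            if not q:
                continue
            record = q.popleft()
            if record == "BARRIER":
                blocked.add(name)          # stop reading this channel's data
            elif name in blocked:
                buffered.append(record)    # held back until alignment ends
            else:
                processed.append(record)
    # blocked == all channels: alignment is done, the snapshot is taken here
    return processed, buffered

processed, buffered = align_checkpoint({
    "a": ["a1", "BARRIER", "a2"],
    "b": ["b1", "b2", "BARRIER"],
})
```

Only records behind an already-seen barrier are held back; the other channel keeps flowing, which is why neither input starves as long as data is available.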

RE: Any way to require .uid(...) calls?

2021-12-05 Thread Schwalbe Matthias
Hi Dan, In case you also want to keep automatic UID assignment, we do something like this (scala): override def run(args: ApplicationArguments): Unit = { require(jobName != null, "a specific jobName needs to be configured, if hosted in Spring Boot, configure 'flink.job.name' in
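The Scala `require` guard above fails the job fast when no explicit name is configured; a minimal Python analogue (the helper names here are hypothetical, not a Flink API) looks like this:

```python
def require(condition, message):
    # Python analogue of Scala's require: fail fast with a clear message.
    if not condition:
        raise ValueError(message)

def validate_job_config(config):
    # Hypothetical pre-flight check, run once before building the pipeline.
    require(config.get("flink.job.name"),
            "a specific jobName needs to be configured: set 'flink.job.name'")
    return config["flink.job.name"]

job_name = validate_job_config({"flink.job.name": "orders-enrichment"})
```

Failing at submission time with a clear message is much cheaper than discovering a missing uid or name after state has been written.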

Re: flink hang: Flink appearing dead due to es_rejected_execution_exception

2021-12-05 Thread Leonard Xu
Hi ren, I think the root cause is that you didn't set a proper FailureHandler for the Elasticsearch connector; the `RetryRejectedExecutionFailureHandler` can resolve your issue. See the Elasticsearch connector docs[1] for more information. You can also set 'connector.failure-handler to
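The policy Leonard points at can be sketched in a few lines: re-queue an action that Elasticsearch rejected because its bulk queue was full, and rethrow anything else. This is only an illustration of the retry-rejected idea, not the connector's code.

```python
def on_failure(action, failure, retry_queue):
    """Sketch of the retry-rejected policy: re-queue an action that
    Elasticsearch rejected due to a full bulk queue (a transient
    condition), rethrow everything else. Illustrative only."""
    if failure == "es_rejected_execution_exception":
        retry_queue.append(action)   # retried later instead of failing the job
        return "requeued"
    raise RuntimeError(f"non-retriable failure: {failure}")

queue = []
status = on_failure({"doc_id": 42}, "es_rejected_execution_exception", queue)
```

Without such a handler, a transient bulk-queue rejection bubbles up and the job restarts, which is exactly the "freeze" behavior in the subject.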

Re: enable.auto.commit=true and checkpointing turned on

2021-12-05 Thread Hang Ruan
Hi, 1. Yes, the Kafka source will use the Kafka committed offset for the group id to start the job. 2. No, the auto.offset.reset is a Kafka consumer config, which defines what to do when there is no initial offset in
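The precedence Hang describes can be modeled in a few lines: a committed offset for the group id wins, and `auto.offset.reset` is only consulted when no committed offset exists. A simplified model, not Kafka's actual code.

```python
def resolve_start_offset(committed, auto_offset_reset, earliest, latest):
    """Sketch of consumer start-position resolution: a committed offset
    for the group id wins; auto.offset.reset only applies when there is
    no committed offset. Simplified model, not Kafka's actual code."""
    if committed is not None:
        return committed               # resume where the group left off
    if auto_offset_reset == "earliest":
        return earliest
    if auto_offset_reset == "latest":
        return latest
    raise ValueError("no committed offset and auto.offset.reset=none")

resumed = resolve_start_offset(1500, "earliest", earliest=0, latest=2000)
fresh = resolve_start_offset(None, "earliest", earliest=0, latest=2000)
```

So a group that has already committed offset 1500 resumes there regardless of the reset policy; only a brand-new group falls back to earliest/latest.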

Unsubscribe

2021-12-05 Thread 高耀军
Unsubscribe

Re: How to combine historical data with real-time data in Flink

2021-12-05 Thread Leonard Xu
If your data source is a database, you can try Flink CDC Connectors[1]. These connectors are hybrid sources: they first read the full historical data and then the incremental data, with a seamless switch between the historical and incremental phases. Best, Leonard [1] https://ververica.github.io/flink-cdc-connectors/release-2.1/content/connectors/mysql-cdc.html > On 2021-12-02 at 2:40 PM, 张阳 wrote: > > The metrics have a large amount of historical data. How can the historical data be aggregated together with today's real-time data?
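The snapshot-then-changelog behavior Leonard describes can be sketched as a single generator that drains a bounded historical phase before switching to the unbounded incremental phase. This is only an illustration of the hybrid-source idea, not the CDC connector's code.

```python
def hybrid_source(snapshot_rows, changelog):
    """Sketch of the hybrid-source idea behind the CDC connectors: emit
    the full historical snapshot first, then switch seamlessly to the
    incremental changelog. Illustrative only."""
    for row in snapshot_rows:          # bounded, historical phase
        yield ("snapshot", row)
    for change in changelog:           # unbounded, incremental phase
        yield ("incremental", change)

events = list(hybrid_source(["row1", "row2"], ["+row3"]))
```

Downstream aggregations see one continuous stream, which is why historical totals and today's real-time updates can be summed by the same query.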

Some questions about the issue https://issues.apache.org/jira/browse/FLINK-22848

2021-12-05 Thread 董剑辉
I have been working with Flink and Calcite for a while now, and I have some questions about this issue: https://issues.apache.org/jira/browse/FLINK-22848. I hope someone in the community can clear them up when they have time. First, the issue discussion mentions that a regex was originally used because Calcite did not support parsing the set a=b (unquoted) syntax. However, when I checked the commit history a couple of days ago, Calcite appears to have supported parsing the SET syntax since v1.14 or even earlier (see SqlSetOption); its problem is that it will treat the tokens true/false/unknown/null

Re: apache-flink - aggregate function produces no output in pyflink 1.14

2021-12-05 Thread su wenwen
Hi zebing! You can go to localhost:8081 and check whether the job is running. Also, the data written to Kafka should use double quotes. Example: {"amount": 500, "course_code": "97iscn4g8k","event_time":"2021-12-01 17:54:41"} Window aggregation also requires paying attention to the progress of the watermark.
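The double-quotes point can be checked directly with the standard `json` module: valid JSON requires double-quoted keys and string values, and a single-quoted record is rejected by the parser (and would likewise fail in the Kafka source's JSON format).

```python
import json

# JSON requires double-quoted keys and string values.
record = {"amount": 500, "course_code": "97iscn4g8k",
          "event_time": "2021-12-01 17:54:41"}
payload = json.dumps(record)       # what should be written to Kafka
parsed = json.loads(payload)       # round-trips cleanly

try:
    json.loads("{'amount': 500}")  # single-quoted: rejected by the parser
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False
```

A producer that serializes with `json.dumps` instead of `str(dict)` avoids the problem entirely, since Python's `str()` of a dict emits single quotes.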

Re: apache-flink - aggregate function produces no output in pyflink 1.14

2021-12-05 Thread duanzebing
Thanks for the pointers. I tested with your method and can now receive data, but I've hit a new problem: I only receive output once. I sent the producer's test data in a for loop one thousand times, and my expectation was that insert into print_sink select count(amount) as amount ,max(course_code) as course_code ,tumble_start(event_time, interval '2' minute) as

Re: [ANNOUNCE] Open source of remote shuffle project for Flink batch data processing

2021-12-05 Thread Lijie Wang
As one of the contributors of flink remote shuffle, I'm glad to hear all the warm responses! Welcome more people to try the flink remote shuffle and look forward to your feedback. Best, Lijie Yingjie Cao 于2021年12月1日周三 17:50写道: > Hi Jiangang, > > Great to hear that, welcome to work together to

Re: How to combine historical data with real-time data in Flink

2021-12-05 Thread Xianxun Ye
Flink 1.14 has hybrid source, which can switch from a bounded to an unbounded source. On 2021-12-03 15:21, Michael Ran wrote: jdbc scan ... On 2021-12-02 14:40:06, "" wrote: ...

Re: apache-flink - aggregate function produces no output in pyflink 1.14

2021-12-05 Thread Dian Fu
This is probably a watermark problem: with multiple parallel instances and little test data, some instances receive no data, so the watermark does not advance. If that is the cause, there are two fixes: 1) t_env.get_config().get_configuration().set_string("parallelism.default", "1") 2) t_env.get_config().get_configuration().set_string("table.exec.source.idle-timeout", "5000 ms") On Sat, Dec 4, 2021 at 6:25 PM duanzebing
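A minimal model of the stall Dian Fu describes: the downstream watermark is the minimum over all parallel inputs, so one instance that never received data holds everything back, and an idle timeout fixes this by excluding idle instances from the minimum. A simplified sketch, not Flink's watermark machinery.

```python
def operator_watermark(per_instance_watermarks, idle=frozenset()):
    """Sketch of why an idle parallel source instance stalls a job: the
    downstream watermark is the minimum over all inputs, so one instance
    stuck at its initial watermark holds everything back. With an idle
    timeout, idle instances are excluded from the minimum."""
    active = [wm for name, wm in per_instance_watermarks.items()
              if name not in idle]
    return min(active) if active else float("-inf")

# One of four parallel source instances never received any test data:
wms = {"src-0": 1000, "src-1": 1200, "src-2": 1100, "src-3": float("-inf")}
stalled = operator_watermark(wms)                    # held back by src-3
advancing = operator_watermark(wms, idle={"src-3"})  # idle instance excluded
```

This is also why `parallelism.default = 1` works: with a single instance there is no empty input to drag the minimum down.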

Re: Flink's DispatcherRestEndpoint thread stuck even though TaskManager failed to execute job

2021-12-05 Thread Chesnay Schepler
The thread is blocked in your user code, so whether we can unblock it depends on what said user code is doing. On 05/12/2021 19:13, Yuval Itzchakov wrote: Hi, Flink 1.14.0, Scala 2.12, Flink on Kubernetes. I use Lyft's FlinkOperator, which sets up a job cluster in Kubernetes and then submits

Re: PyFlink import internal packages

2021-12-05 Thread Королькевич Михаил
Hi, thank you! It was very helpful! 03.12.2021, 12:48, "Shuiqiang Chen": Hi, Actually, you are able to develop your app in the clean Python way. It's fine to split the code into multiple files and there is no need to call

Flink's DispatcherRestEndpoint thread stuck even though TaskManager failed to execute job

2021-12-05 Thread Yuval Itzchakov
Hi, Flink 1.14.0, Scala 2.12, Flink on Kubernetes. I use Lyft's FlinkOperator, which sets up a job cluster in Kubernetes and then submits the job via the REST API. At times, the job fails. Specifically, one case I am analyzing fails due to an invalid state migration. I see the following error when

Re: Unable to create new native thread error

2021-12-05 Thread Ilan Huchansky
Hi David, Thanks for your fast response. Do you think that changing the submission method could solve the problem, i.e. using the CLI instead of the REST API? Another question: I see that the most critical issue (FLINK-25022) is in progress and should be released with version 1.13.4. Do you

apache-flink - aggregate function produces no output in pyflink 1.14

2021-12-05 Thread duanzebing
Hi everyone: I am a pyflink beginner and ran into a problem where an aggregation in Flink SQL cannot sink. My code follows the official documentation exactly, but in the end I still could not solve it, so I can only ask you all for help. My source statement is: CREATE TABLE random_source ( amount int, course_code string, `event_time` TIMESTAMP(3) , WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND ) WITH ( 'connector' =

Re: Re: Re: Re: Question about using the flink sql collect function

2021-12-05 Thread casel.chen
This is just an example I gave; the real business has scenarios that need group by followed by collecting into a collection. On 2021-12-04 15:51:01, "RS" wrote: >SELECT class_no, collect(info) > >FROM ( > >SELECT class_no, ROW(student_no, name, age) AS info > >FROM source_table > >) > >GROUP BY class_no; > > >This is the closest approach I could come up with at the SQL level, but a multiset cannot be converted to an array >
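The group-by-then-collect pattern in the quoted SQL is easy to express outside SQL; a plain-Python sketch of the same shape (the row layout here mirrors the example columns, nothing else is assumed):

```python
from collections import defaultdict

def collect_by_class(rows):
    """Sketch of the group-by-then-collect pattern from the SQL above:
    group rows by class_no and gather each student's info into a plain
    list (the array the poster wants, rather than a multiset)."""
    groups = defaultdict(list)
    for class_no, student_no, name, age in rows:
        groups[class_no].append(
            {"student_no": student_no, "name": name, "age": age})
    return dict(groups)

result = collect_by_class([
    ("c1", "s001", "Ann", 12),
    ("c1", "s002", "Bob", 13),
    ("c2", "s003", "Cai", 12),
])
```

The list per key is exactly the array-valued result the multiset-to-array conversion is after; in SQL this is where a UDF or a cast would have to bridge the gap.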