Re: Apache Flink - High Available K8 Setup Wrongly Marked Dead TaskManager

2019-11-28 Thread Eray Arslan
Hi Chesnay, Thank you for the reply. I figured out that issue by using a livenessProbe on the Task Manager deployment, but I think it is still a workaround. I am using Flink 1.9.1 (currently the latest version) and I am getting a "connection unexpectedly closed by remote task manager" error on the Task Manager.

Re: ProcessFunction collect and close, when to use?

2019-11-28 Thread bupt_ljy
Hi Shuwen, > When to call close() ? After every element processed? Or on > ProcessFunction.close() ? Or never to use it? IMO, the #close() function is used to manage the lifecycle of the #Collector rather than of a single element. I think it should not be called in a user function unless you have

ProcessFunction collect and close, when to use?

2019-11-28 Thread shuwen zhou
Hi Community, In the ProcessFunction class, in the processElement function, there is a Collector that has 2 methods: collect() and close(). I would like to know: 1. When to call close()? After every element is processed? Or on ProcessFunction.close()? Or never to use it? If it's been closed already, can the

Re: Auto Scaling in Flink

2019-11-28 Thread Caizhi Weng
Hi Akash, Flink doesn't support auto scaling in core currently; it may be supported in the future, when the new scheduling architecture is implemented https://issues.apache.org/jira/browse/FLINK-10407 . You can do it externally by cancelling the job with a savepoint, updating the parallelism, and
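The external rescaling procedure Caizhi describes can be sketched with the flink CLI; this is a sketch only, and the savepoint directory, job ID, jar name, and parallelism are placeholders:

```shell
# 1. Cancel the running job, taking a savepoint first.
#    The command prints the resulting savepoint path.
flink cancel -s hdfs:///flink/savepoints <jobId>

# 2. Resubmit the same job from that savepoint with the new parallelism.
flink run -s hdfs:///flink/savepoints/savepoint-abc123 -p 8 my-job.jar
```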

Re: Auto Scaling in Flink

2019-11-28 Thread vino yang
Hi Akash, You can use the Pravega connector to integrate with Flink; the source code is here [1]. In short, relying on its rescalable state feature [2], Flink supports rescalable streaming jobs. Currently, the mainstream solution for auto-scaling is Flink + K8s; I can share some resources with you [3].

Auto Scaling in Flink

2019-11-28 Thread Akash Goel
Hi, Does Flink support auto scaling? I read that it is supported using Pravega? Is it incorporated in any version? Thanks, Akash Goel

Re: How to recover state from savepoint on embedded mode?

2019-11-28 Thread Arvid Heise
Just to add, if you use LocalStreamEnvironment, you can pass a configuration and set "execution.savepoint.path" to point to your savepoint. Best, Arvid On Wed, Nov 27, 2019 at 1:00 PM Congxian Qiu wrote: > Hi, > > You can recover from a checkpoint/savepoint if the JM can read from the >
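A minimal sketch of what Arvid describes, passing the configuration key from the thread to a local environment; the savepoint path is a placeholder and whether the local environment honors this key may depend on the Flink version:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ResumeFromSavepointLocally {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Key mentioned in the thread; the path below is a placeholder.
        conf.setString("execution.savepoint.path",
                "file:///tmp/savepoints/savepoint-abc123");
        // Local environment with parallelism 1 and the custom configuration.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1, conf);
        // ... build the pipeline, then call env.execute();
    }
}
```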

Re: Flink 'Job Cluster' mode Ui Access

2019-11-28 Thread Chesnay Schepler
Could you try accessing :/#/overview ? The REST API is obviously accessible, and hence the WebUI should be too. How did you set up the session cluster? Are you using some custom Flink build or something which potentially excluded flink-runtime-web from the classpath? On 28/11/2019 10:02,

Re: ***UNCHECKED*** Error while confirming Checkpoint

2019-11-28 Thread Piotr Nowojski
Thank you all for the investigation/reporting/discussion. I have merged an older PR [1] that fixes this issue; it was previously rejected because we didn't realise this was a production issue. The fix should land in the Flink 1.10, 1.9.2 and 1.8.3 releases. Piotrek [1]

Re: Apache Flink - High Available K8 Setup Wrongly Marked Dead TaskManager

2019-11-28 Thread Chesnay Schepler
The akka.watch configuration options haven't been used for a while irrespective of FLINK-13883 (but I can't quite tell atm since when). Let's start with what version of Flink you are using, and what the taskmanager/jobmanager logs say. On 25/11/2019 12:05, Eray Arslan wrote: Hi, I have

Re: Flink 1.9.1 KafkaConnector missing data (1M+ records)

2019-11-28 Thread Kostas Kloudas
Hi Harrison, One thing to keep in mind is that Flink will only write files if there is data to write. If, for example, your partition is not active for a period of time, then no files will be written. Could this explain the fact that dt 2019-11-24T01 and 2019-11-24T02 are entirely skipped? In

Re: Streaming Files to S3

2019-11-28 Thread Arvid Heise
Hi Li, S3 file sink will write data into prefixes, with as many part-files as the degree of parallelism. This structure comes from the good ol' Hadoop days, where an output folder also contained part-files and is independent of S3. However, each of the part-files will be uploaded in a multipart

Re: Side output from Flink Sink

2019-11-28 Thread Robert Metzger
What do you mean by "from within a sink"? Do you have a custom sink? If you want to write to different Kafka topics from the same sink, you can do that using a custom KafkaSerializationSchema. It allows you to return a ProducerRecord with a custom target topic set. (A Kafka sink can write to
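The per-record topic routing Robert mentions can be sketched like this; "Event" is a hypothetical POJO (with isError() and toJson() methods assumed for illustration), and the topic names are placeholders:

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

// Routes each record to a topic chosen per element, instead of a fixed topic.
public class TopicRoutingSchema implements KafkaSerializationSchema<Event> {
    @Override
    public ProducerRecord<byte[], byte[]> serialize(Event element, Long timestamp) {
        // Pick the target topic based on the element itself.
        String topic = element.isError() ? "events-errors" : "events-main";
        byte[] value = element.toJson().getBytes(StandardCharsets.UTF_8);
        return new ProducerRecord<>(topic, value);
    }
}
```

An instance of this schema is then handed to the FlinkKafkaProducer constructor in place of a fixed-topic serialization schema.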

Re: OOM caused by using FsStateBackend

2019-11-28 Thread Congxian Qiu
Hi, Checkpointing does not cause state to be cleaned up. A checkpoint only backs up the state to remote storage for later recovery. If there are a large number of CopyOnWriteStateTable$StateTableEntry objects, first confirm whether your job really holds that much state. Also, what is your checkpoint-related configuration, and were there checkpoint failures before the OOM? Best, Congxian 李茂伟 wrote on Thu, Nov 28, 2019 at 9:02 PM: > hi all ~ > >

OOM caused by using FsStateBackend

2019-11-28 Thread 李茂伟
hi all ~ I am using FsStateBackend, and checkpointing with it causes an OOM; the heap contains a large number of CopyOnWriteStateTable$StateTableEntry objects that are never cleaned up. After a checkpoint with FsStateBackend, will the state in memory be cleaned up?
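For context, a minimal sketch of the setup being discussed (Flink 1.9-era API; the HDFS path and interval are placeholders). The point Congxian makes in the reply is visible here: FsStateBackend keeps the working state on the JVM heap and only writes snapshots to the file system at checkpoint time, so the heap must be large enough for the full state:

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FsBackendJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Working state stays on the heap; checkpoints copy it to this path.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
        env.enableCheckpointing(60_000); // checkpoint every 60 s
        // For state larger than the heap, RocksDBStateBackend is the usual
        // alternative, since it keeps state off-heap/on-disk.
    }
}
```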

Re: Flink Kudu Connector

2019-11-28 Thread Arvid Heise
Hi Rahul, can you check if the KuduSink tests of Apache Bahir shed any light? [1] Best, Arvid [1] https://github.com/apache/bahir-flink/blob/55240a993df999d66aefa36e587be719c29be92a/flink-connector-kudu/src/test/java/org/apache/flink/connectors/kudu/streaming/KuduSinkTest.java On Mon, Nov

Re: Idiomatic way to split pipeline

2019-11-28 Thread Arvid Heise
Hi Avi, it seems to me that you don't really need any split feature. As far as I can see in your picture, you want to apply two different windows to the same input data. In that case you simply use two different subgraphs. stream = ... stream1 = stream.window(...).addSink() stream2 =
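Arvid's pseudocode, expanded into a fuller sketch; the source, event type, key selector, field name, and sinks are all hypothetical placeholders:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TwoWindowsOneSource {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<MyEvent> stream = env.addSource(new MyEventSource());

        // Subgraph 1: one-minute windows.
        stream.keyBy(e -> e.getKey())
              .timeWindow(Time.minutes(1))
              .sum("value")
              .addSink(new MinuteSink());

        // Subgraph 2: one-hour windows over the very same stream,
        // without any explicit split operator.
        stream.keyBy(e -> e.getKey())
              .timeWindow(Time.hours(1))
              .sum("value")
              .addSink(new HourSink());

        env.execute("two-windows");
    }
}
```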

Re: JobGraphs not cleaned up in HA mode

2019-11-28 Thread 曾祥才
the chk-* directory is not found. I think it is missing because the jobmanager removes it automatically, but why is it still in zookeeper? -- From: "Vijay Bhaskar"

Re: Side output from Flink Sink

2019-11-28 Thread Arvid Heise
Hi Victor, you could implement your own SinkFunction that wraps the KafkaProducer. However, since you may need to check if the write operation is successful, you probably need to subclass KafkaProducer and implement your own error handling. Best, Arvid On Mon, Nov 25, 2019 at 7:51 AM vino yang

Re: Retrieving Flink job ID/YARN Id programmatically

2019-11-28 Thread Arvid Heise
Hi Lei, if you use public JobExecutionResult StreamExecutionEnvironment#execute() You can retrieve the job id through the result. result.getJobID() Best, Arvid On Mon, Nov 25, 2019 at 3:50 AM Ana wrote: > Hi Lei, > > To add, you may use Hadoop Resource Manager REST APIs >
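The approach Arvid describes, as a sketch (job name is a placeholder); note that in attached mode execute() blocks until the job terminates, so the ID is only available afterwards:

```java
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class JobIdExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // ... build the pipeline ...

        // execute() returns a JobExecutionResult carrying the job's ID.
        JobExecutionResult result = env.execute("my-job");
        System.out.println("Job ID: " + result.getJobID());
    }
}
```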

Re: JobGraphs not cleaned up in HA mode

2019-11-28 Thread Vijay Bhaskar
One more thing: you configured high-availability.cluster-id: /cluster-test; it should be high-availability.cluster-id: cluster-test. I don't think this is a major issue, but in case it helps, you can check. Can you check one more thing: is checkpointing happening or not? Were you able to see the chk-*
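For reference, a flink-conf.yaml fragment with the corrected value; the ZooKeeper quorum and storage directory are illustrative placeholders, not values from the thread:

```yaml
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.storageDir: hdfs:///flink/ha
# No leading slash, per the suggestion above:
high-availability.cluster-id: cluster-test
```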

Re: JobGraphs not cleaned up in HA mode

2019-11-28 Thread 曾祥才
hi, is there any difference? For me, using NAS is currently more convenient to test; from the docs it seems hdfs, s3, nfs etc. will all be fine. -- From: "vino yang"

Re: JobGraphs not cleaned up in HA mode

2019-11-28 Thread vino yang
Hi, Why do you not use HDFS directly? Best, Vino 曾祥才 wrote on Thu, Nov 28, 2019 at 6:48 PM: > > anyone have the same problem? pls help, thks > > > > -- Original message -- > *From:* "曾祥才"; > *Sent:* Thursday, Nov 28, 2019, 2:46 PM > *To:* "Vijay Bhaskar"; > *Cc:* "User-Flink"; > *Subject:* Re:

Re: Flink 'Job Cluster' mode Ui Access

2019-11-28 Thread vino yang
Hi Jatin, Which version do you use? Best, Vino Jatin Banger wrote on Thu, Nov 28, 2019 at 5:03 PM: > Hi, > > I checked the log file there is no error. > And I checked the pods internal ports by using rest api. > > # curl : 4081 > {"errors":["Not found."]} > 4081 is the Ui port > > # curl :4081/config >

Re: JobGraphs not cleaned up in HA mode

2019-11-28 Thread 曾祥才
anyone have the same problem? pls help, thks

flink 1.9.1 CEP backpressure and memory problems

2019-11-28 Thread 宋倚天
Hi all, I ran into the following problems using Flink CEP (an "A FollowedBy B" case). Case 1: my source produces only event A, sent at full speed. The job blocks after a few seconds of execution (no more data exchange between tasks), the source's backpressure reaches 1, and after a while the TM stops responding to heartbeats and the job terminates abnormally. The TaskManager process is still alive, but observing memory usage with jstat shows the memory is exhausted and full GCs run continuously. Case 2: my source alternates events A and B, sent at full speed

REST API /jobs/:jobid/stop returns 405

2019-11-28 Thread Moc
Hi all: I saw the /jobs/:jobid/stop endpoint in the official documentation, which says it can produce a savepoint. I wanted to use this endpoint to stop a Flink job running on YARN, but it returns an error. Is this endpoint not usable yet? Is there another endpoint I can use instead? The URLs I tried: http://host:8088/proxy/application_1572228642684_1907/v1/jobs/a1f207a50cdebd531824d9a4402162ca/stop and
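A 405 from this endpoint usually means the wrong HTTP method: in Flink 1.9, /jobs/:jobid/stop accepts only POST with a JSON body, so a plain GET returns 405 Method Not Allowed. A sketch reusing the URL from the thread (the target directory is a placeholder); note also that the YARN web proxy may itself restrict non-GET requests, in which case sending the request to the JobManager's REST port directly can help:

```shell
# POST, not GET; "drain" and "targetDirectory" follow the
# stop-with-savepoint request body.
curl -X POST \
     -H "Content-Type: application/json" \
     -d '{"drain": false, "targetDirectory": "hdfs:///flink/savepoints"}' \
     "http://host:8088/proxy/application_1572228642684_1907/v1/jobs/a1f207a50cdebd531824d9a4402162ca/stop"
```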

Re: Flink 'Job Cluster' mode Ui Access

2019-11-28 Thread Jatin Banger
Hi, I checked the log file there is no error. And I checked the pods internal ports by using rest api. # curl : 4081 {"errors":["Not found."]} 4081 is the Ui port # curl :4081/config {"refresh-interval":3000,"timezone-name":"Coordinated Universal

Re: What happens to the channels when there is backpressure?

2019-11-28 Thread Felipe Gutierrez
Hi Yingjie, I read this post and the next one as well ( https://flink.apache.org/2019/07/23/flink-network-stack-2.html ). I mean the bandwidth of the channels between two physical operators when they are on different hosts, i.e. when the channels are network channels. Thanks *--* *-- Felipe