Re: jobmanager log anomaly

2019-08-05 Thread Biao Liu
Hi, > org.apache.flink.runtime.entrypoint.ClusterEntrypoint - RECEIVED > SIGNAL 15: SIGTERM. Shutting down as requested. This means the process received signal 15 [1]. Wong is right: search the yarn node manager or yarn resource manager logs. 1. https://access.redhat.com/solutions/737033 Thanks, Biao /'bɪ.aʊ/ On Tue,

Re: Re: Flink RocksDBStateBackend question

2019-08-05 Thread Yun Tang
@lvwenyuan The first thing to clarify is whether your "FileSystem" refers to the file system where checkpoint data is stored, or to the FsStateBackend; it would help to state the question more precisely next time. * If you mean the remote file system that stores checkpoint data, then with incremental
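To make the two meanings concrete, here is a minimal sketch of configuring each backend in Flink 1.8; the hdfs:/// checkpoint path is a placeholder, and in a real job you would pick only one of the two setStateBackend calls.

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class BackendSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // FsStateBackend: working state lives on the TaskManager heap,
            // snapshots are written to the configured file system.
            env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));

            // RocksDBStateBackend: working state lives in local RocksDB instances,
            // snapshots (here incremental) go to the same kind of remote file system.
            // This second call replaces the FsStateBackend set above.
            env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
        }
    }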

Re: jobmanager log anomaly

2019-08-05 Thread Wong Victor
Hi, you can check the yarn log on the node where the jobmanager runs and search for why the corresponding container was killed; Regards On 2019/8/6, 11:40 AM, "戴嘉诚" wrote: Hi all: My Flink is deployed on YARN as a session. This morning the jobmanager exited on its own and YARN restarted it, which caused the jobs running inside to restart. But when I checked the logs, the jobmanager log showed no exceptions at all, and the jobmanager had neither long full gc pauses nor frequent gc. Below is the jobmanager log:

Re: Re: Flink RocksDBStateBackend question

2019-08-05 Thread athlon...@gmail.com
What you are describing is the MemoryStateBackend, whose state lives in the JM's memory; the FsStateBackend stores state to the file system. athlon...@gmail.com From: 戴嘉诚 Sent: 2019-08-06 11:42 To: user-zh Subject: Re: Flink RocksDBStateBackend question As I recall, with FileSystem the stored state cannot exceed the TM's (or was it the JM's?) memory, whereas the data stored in rocksdb can grow without bound; on the other hand, FileSystem throughput is higher than rocksdb's. lvwenyuan wrote on Tue, Aug 6, 2019 at 11:39 AM: > Could anyone advise: >

Re: Flink RocksDBStateBackend question

2019-08-05 Thread 戴嘉诚
As I recall, with FileSystem the stored state cannot exceed the TM's (or was it the JM's?) memory, whereas the data stored in rocksdb can grow without bound; on the other hand, FileSystem throughput is higher than rocksdb's. lvwenyuan wrote on Tue, Aug 6, 2019 at 11:39 AM: > Could anyone advise: > In RocksDBStateBackend, is the content stored in rocksdb the same as the data stored on the FileSystem? If not, what is stored in each? Thanks for any answers

jobmanager log anomaly

2019-08-05 Thread 戴嘉诚
Hi all: My Flink is deployed on YARN as a session. This morning the jobmanager exited on its own, then YARN restarted it, which caused the jobs running inside to restart. But when I checked the logs, the jobmanager log showed no exceptions at all, and the jobmanager had neither long full gc pauses nor frequent gc. Below is the jobmanager log: right at 06:44 the log records that a stop request was received, and then the jobmanager simply shut down... What could have caused this? 2019-08-06 06:43:58,891 INFO >

flink dynamic table output question

2019-08-05 Thread 金圣哲
Hello, Flink community experts: A question: after implementing a dynamic table with flink sql, I want to query the dynamic table and emit results on a schedule. Does anyone have ideas on how to implement this? Much appreciated. SQL 1: tableEnv.sqlQuery("select date_id, id, latest(status, utime) as status, latest(user_id, utime) as user_id from waybillAppendTable group by date_id, id"); Query SQL 2: tableEnv.sqlQuery("select date_id, user_id,
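For reference, one possible shape of the custom latest(value, utime) aggregate used in those queries, written as a Flink AggregateFunction; the String value type, long utime type, and registration name are assumptions, not taken from the original mail.

    import org.apache.flink.table.functions.AggregateFunction;

    // Hypothetical implementation of latest(value, utime): keeps the value
    // with the largest utime seen so far within each group.
    public class Latest extends AggregateFunction<String, Latest.Acc> {

        public static class Acc {
            public long utime = Long.MIN_VALUE;
            public String value;
        }

        @Override
        public Acc createAccumulator() {
            return new Acc();
        }

        @Override
        public String getValue(Acc acc) {
            return acc.value;
        }

        // Called by the runtime for each input row of the group.
        public void accumulate(Acc acc, String value, long utime) {
            if (utime >= acc.utime) {
                acc.utime = utime;
                acc.value = value;
            }
        }
    }
    // Registered before use, e.g.: tableEnv.registerFunction("latest", new Latest());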

Re: An ArrayIndexOutOfBoundsException after a few message with Flink 1.8.1

2019-08-05 Thread Yun Gao
Hi Nicolas: Are you using a custom partitioner? If so, you might need to check whether Partitioner#partition has returned a value that is greater than or equal to the parallelism of the downstream tasks. The expected return value should be in the interval [0, the parallelism of the
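If a custom partitioner is indeed the cause, a range-safe one might look like the following sketch (the Long key type is an assumption):

    import org.apache.flink.api.common.functions.Partitioner;

    // Always returns a partition index in [0, numPartitions), which is the
    // contract described above; out-of-range values trigger the
    // ArrayIndexOutOfBoundsException downstream.
    public class SafePartitioner implements Partitioner<Long> {
        @Override
        public int partition(Long key, int numPartitions) {
            // Math.floorMod keeps the result non-negative even for negative hash codes.
            return Math.floorMod(key.hashCode(), numPartitions);
        }
    }
    // usage: stream.partitionCustom(new SafePartitioner(), keySelector)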

Re: Memory constrains running Flink on Kubernetes

2019-08-05 Thread Yun Tang
You are correct, the default value of the write buffer size is 64 MB [1]. However, the Java doc for this value is not correct [2]; I have already created a PR to fix this. [1] https://github.com/facebook/rocksdb/blob/30edf1874c11762a6cacf4434112ce34d13100d3/include/rocksdb/options.h#L191 [2]
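For anyone who wants to override that 64 MB default, a sketch using Flink 1.8's OptionsFactory; the checkpoint path and the 16 MB figure are placeholders, not recommendations.

    import org.apache.flink.contrib.streaming.state.OptionsFactory;
    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.rocksdb.ColumnFamilyOptions;
    import org.rocksdb.DBOptions;

    public class SmallWriteBuffers {
        public static RocksDBStateBackend configure() throws Exception {
            RocksDBStateBackend backend =
                    new RocksDBStateBackend("hdfs:///flink/checkpoints", true);
            backend.setOptions(new OptionsFactory() {
                @Override
                public DBOptions createDBOptions(DBOptions currentOptions) {
                    return currentOptions; // keep DB-level defaults
                }
                @Override
                public ColumnFamilyOptions createColumnFamilyOptions(ColumnFamilyOptions currentOptions) {
                    // RocksDB's own default write buffer size is 64 MB; shrink it
                    // when container memory is tight.
                    return currentOptions.setWriteBufferSize(16 * 1024 * 1024);
                }
            });
            return backend;
        }
    }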

Re: Dynamically allocating right-sized task resources

2019-08-05 Thread Xintong Song
Hi Chad, If I understand correctly, the scenarios you talked about involve running batch jobs, right? At the moment (Flink 1.8 and earlier), Flink does not differentiate between the workloads of different tasks. It uses a slot-sharing approach [1] to balance workloads among workers. The general idea is to put
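As a small illustration of slot sharing, operators can be pinned to separate groups so heavy and light stages are never co-located in one slot; the operator bodies below are trivial placeholders.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SlotGroups {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<String> source = env.fromElements("a", "b", "c");

            // Light preprocessing lives in its own slot-sharing group...
            DataStream<String> parsed = source
                    .map((MapFunction<String, String>) String::trim)
                    .slotSharingGroup("light");

            // ...while the expensive stage is isolated in a second group, so it
            // never shares a slot with the light operators.
            parsed.map((MapFunction<String, String>) String::toUpperCase)
                    .slotSharingGroup("heavy");

            env.execute("slot sharing sketch");
        }
    }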

Re: getting an exception

2019-08-05 Thread Gaël Renoux
Hi Avi and Victor, I just opened this ticket on JIRA: https://issues.apache.org/jira/browse/FLINK-13586 (I hadn't seen these e-mails). Backward compatibility is broken between 1.8.0 and 1.8.1 if you use Kafka connectors. Can you upgrade your flink-connector-kafka dependency to 1.8.1? It won't

Re: getting an exception

2019-08-05 Thread Wong Victor
Hi Avi: It seems you are submitting your job with an older Flink version (< 1.8); please check your flink-dist version. Regards, Victor From: Avi Levi Date: Monday, August 5, 2019 at 9:11 PM To: user Subject: getting an exception Hi, I'm using Flink 1.8.1. Our code is mostly Scala.

StreamingFileSink not committing file to S3

2019-08-05 Thread Ravi Bhushan Ratnakar
Thanks for your quick response. I am using a custom implementation of BoundedOutOfOrdernessTimestampExtractor, tweaked so that the initial watermark is not a negative value. One more observation: when the job's parallelism is around 120, it works well even with an idle stream and Flink
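A sketch of the kind of tweak described here, assuming a simple Event POJO with an epoch-millis timestamp field; the stock BoundedOutOfOrdernessTimestampExtractor starts from Long.MIN_VALUE, so this custom assigner clamps the watermark at zero instead.

    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    // Hypothetical event type with an epoch-millis timestamp field.
    class Event {
        long timestamp;
    }

    // Behaves like BoundedOutOfOrdernessTimestampExtractor, except the emitted
    // watermark is never negative, even before the first event arrives.
    public class NonNegativeBoundedAssigner implements AssignerWithPeriodicWatermarks<Event> {

        private final long maxOutOfOrdernessMs;
        private long currentMaxTimestamp = 0L; // start at 0, not Long.MIN_VALUE

        public NonNegativeBoundedAssigner(long maxOutOfOrdernessMs) {
            this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        }

        @Override
        public long extractTimestamp(Event event, long previousElementTimestamp) {
            currentMaxTimestamp = Math.max(currentMaxTimestamp, event.timestamp);
            return event.timestamp;
        }

        @Override
        public Watermark getCurrentWatermark() {
            return new Watermark(Math.max(0L, currentMaxTimestamp - maxOutOfOrdernessMs));
        }
    }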

Re: [DataStream API] Best practice to handle custom session window - time gap based but handles event for ending session

2019-08-05 Thread Fabian Hueske
Hi Jungtaek, I would recommend implementing the logic in a ProcessFunction and avoiding Flink's windowing API. IMO, the windowing API is difficult to use, because there are many pieces like WindowAssigner, Window, Trigger, Evictor, and WindowFunction that are orchestrated by Flink. This makes it very
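A minimal sketch of this suggestion: a KeyedProcessFunction keeping one session-end timer per key. The Tuple2<String, Long> event shape (key, event-time millis), the 30s gap, and the String output are assumptions for illustration.

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // Emits one record per session, where a session ends after a 30s event-time gap.
    public class SessionGapFunction
            extends KeyedProcessFunction<String, Tuple2<String, Long>, String> {

        private static final long GAP_MS = 30_000L;
        private transient ValueState<Long> sessionEnd;

        @Override
        public void open(Configuration parameters) {
            sessionEnd = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("session-end", Long.class));
        }

        @Override
        public void processElement(Tuple2<String, Long> event, Context ctx, Collector<String> out)
                throws Exception {
            // Push the session end further out and replace the pending timer.
            Long oldEnd = sessionEnd.value();
            if (oldEnd != null) {
                ctx.timerService().deleteEventTimeTimer(oldEnd);
            }
            long newEnd = event.f1 + GAP_MS;
            sessionEnd.update(newEnd);
            ctx.timerService().registerEventTimeTimer(newEnd);
            // Session contents (counts, aggregates, ...) would live in extra state here.
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out)
                throws Exception {
            // The gap expired: close the session and clear per-key state.
            out.collect("session closed for key " + ctx.getCurrentKey());
            sessionEnd.clear();
        }
    }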

getting an exception

2019-08-05 Thread Avi Levi
Hi, I'm using Flink 1.8.1. Our code is mostly Scala. When I try to submit my job (on my local machine) it crashes with the error below (BTW, in the IDE it runs perfectly). Any assistance would be appreciated. Thanks Avi 2019-08-05 12:58:03.783 [Flink-DispatcherRestEndpoint-thread-3] ERROR

Re: Memory constrains running Flink on Kubernetes

2019-08-05 Thread wvl
Btw, with regard to: > The default writer-buffer-number is 2 at most for each column family, and the default write-buffer-memory size is 4MB. This isn't what I see when looking at the OPTIONS-XX file in the rocksdb directories in state: [CFOptions "xx"] ttl=0

Re: some confuse for data exchange of parallel operator

2019-08-05 Thread Biao Liu
Hi Kylin, > Can this map record all data? Or this map only record data from one parallelism of upstream operator? Neither of your guesses is correct. It depends on the partitioner between the map operator and the upstream operator. You could find more in this document [1]. 1.
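A small sketch of that point: which records a parallel map instance sees depends entirely on the connection pattern between the operators. The data and functions here are placeholders.

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PartitionerDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(4);
            DataStream<String> words = env.fromElements("a", "b", "a", "c", "b");

            // keyBy: every record with the same key reaches the same map instance,
            // so a per-key count kept in that instance is complete for its keys.
            words.keyBy(w -> w).map(w -> "keyed: " + w).print();

            // rebalance: records are spread round-robin, so each map instance
            // sees only an arbitrary share of the stream.
            words.rebalance().map(w -> "rebalanced: " + w).print();

            env.execute("partitioning sketch");
        }
    }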

Re: StreamingFileSink not committing file to S3

2019-08-05 Thread Theo Diefenthal
Hi Ravi, Please check out [1] and [2]. That is related to Kafka but probably applies to Kinesis as well. If one stream is empty, there is no way for Flink to know about the watermark of that stream, and Flink can't advance the watermark. Downstream operators can thus not know if there
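One possible workaround sketch (my own illustration, not from the linked threads): let the watermark follow the wall clock once a source has been idle for a while. It reuses the hypothetical Event POJO from the earlier sketch and assumes epoch-millis event timestamps; the idle timeout is arbitrary.

    import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    // If no event has arrived for idleTimeoutMs, advance the watermark based on
    // wall-clock time so downstream operators are not stalled by an empty stream.
    public class IdleAwareAssigner implements AssignerWithPeriodicWatermarks<Event> {

        private final long maxOutOfOrdernessMs;
        private final long idleTimeoutMs;
        private long currentMaxTimestamp = 0L;
        private long lastEventWallClock = System.currentTimeMillis();

        public IdleAwareAssigner(long maxOutOfOrdernessMs, long idleTimeoutMs) {
            this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
            this.idleTimeoutMs = idleTimeoutMs;
        }

        @Override
        public long extractTimestamp(Event event, long previousElementTimestamp) {
            currentMaxTimestamp = Math.max(currentMaxTimestamp, event.timestamp);
            lastEventWallClock = System.currentTimeMillis();
            return event.timestamp;
        }

        @Override
        public Watermark getCurrentWatermark() {
            long now = System.currentTimeMillis();
            if (now - lastEventWallClock > idleTimeoutMs) {
                // Stream looks idle: move the watermark forward anyway.
                return new Watermark(now - maxOutOfOrdernessMs);
            }
            return new Watermark(currentMaxTimestamp - maxOutOfOrdernessMs);
        }
    }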

some confuse for data exchange of parallel operator

2019-08-05 Thread tangkailin
Hello, I don't understand how parallel operators exchange data in Flink. For example, I define a map in an operator with parallelism n (n > 1) for counting. Does this map see all the data, or only the data from one parallel instance of the upstream operator? Thanks, Kylin