blink(????flink1.5.1????)????????????hadoop????????

2020-01-21 Thread Yong
flinkhadoop ?? ??blink??flink standalonehadoop hdfs ??kerberos??TM??jobHadoop

Re: where does flink store the intermediate results of a join and what is the key?

2020-01-21 Thread kant kodali
Hi Jark, 1) shouldn't it be a col1 to List of row? multiple rows can have the same joining key right? 2) Can I use state processor API from an external application to query the intermediate results in near real-time? I

Re: Influxdb reporter not honouring the metrics scope

2020-01-21 Thread Gaurav Singhania
Thanks for the response and the fix. On Wed, 22 Jan 2020 at 01:43, Chesnay Schepler wrote: > The solution for 1.9 and below is to create a customized version of the > influx db reporter which excludes certain tags. > > On 21/01/2020 19:27, Yun Tang wrote: > > Hi, Gaurav > > InfluxDB metrics

Re: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-21 Thread tison
那你看下 TM 那台机器上的 TM 日志,从 JM 端来看 TM 曾经成功起来过并注册了自己,你看看 TM 是怎么挂的或者别的什么情况 Best, tison. 郑 洁锋 于2020年1月22日周三 上午11:54写道: > TM没有起来,服务器本身内存cpu都是够的,还很空闲 > > > zjfpla...@hotmail.com > > 发件人: tison > 发送时间: 2020-01-22 11:25 > 收件人:

Re: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-21 Thread 郑 洁锋
TM没有起来,服务器本身内存cpu都是够的,还很空闲 zjfpla...@hotmail.com 发件人: tison 发送时间: 2020-01-22 11:25 收件人: user-zh 主题: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_02_0 was removed.

Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-21 Thread tison
20/01/22 11:08:49 INFO yarn.YarnResourceManager: Closing TaskExecutor connection container_e10_1579661300080_0005_01_02 because: The heartbeat of TaskManager with id container_e10_1579661300080_0005_01_02 timed out. 你请求资源的时候把 slot 请求发到这台机器上了,然后它心跳超时了,你看看 TM 有没有正常起来,有没有资源不够或者挂了 Best,

Re: Custom Metrics outside RichFunctions

2020-01-21 Thread Yun Tang
Hi David FlinkKafkaConsumer in itself is RichParallelSourceFunction, and you could call function below to register your metrics group: getRuntimeContext().getMetricGroup().addGroup("MyMetrics").counter("myCounter") Best Yun Tang From: David Magalhães Sent:

flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-21 Thread 郑 洁锋
大家好, flink on yarn任务启动时,发现报错了The assigned slot container_e10_1579661300080_0005_01_02_0 was removed. 环境:flink1.8.1,cdh5.14.2,kafka0.10,jdk1.8.0_241 flink版本为1.8.1,yarn上的日志: 20/01/22 11:07:53 INFO entrypoint.ClusterEntrypoint:

Re: Flink configuration on Docker deployment

2020-01-21 Thread Yang Wang
Hi Soheil, Since you are not using any container orchestration framework(e.g. docker-compose, Kubernetes, mesos), so you need to manually update the flink-conf.yaml in your docker images. Usually, it is located in the path "/opt/flink/conf". Docker volume also could be used to override the flink

Re: Flink Job Submission Fails even though job is running

2020-01-21 Thread tison
I guess it is a jm internal error which crashes the dispatcher or race condition so that the returning future never completed, possibly related to jdk bug. But again, never have a log in the case I cannot conclude anything. Best, tison. tison 于2020年1月22日周三 上午10:49写道: > It is a known issue

Re: Flink Job Submission Fails even though job is running

2020-01-21 Thread tison
It is a known issue reported multiple times that if you are in an early jdk 1.8.x version, upgrade the bugfix version and the issue will vanish. I don't ever have a log on jm side when this issue reported so I'm sorry unable to explain more... Best, tison. Yang Wang 于2020年1月22日周三 上午10:46写道:

Re: Replacing a server in Zookeeper Quorum

2020-01-21 Thread tison
Good to know :-) Best, tison. Aaron Langford 于2020年1月22日周三 上午10:44写道: > My apologies, I ended up resolving this through experimentation. AWS > replaces master nodes with the same internal DNS names, so configurations > need not be changed. > > Aaron > > > On Tue, Jan 21, 2020, 6:33 PM Yang

Re: Flink Job Submission Fails even though job is running

2020-01-21 Thread Yang Wang
The "web.timeout" will be used for all web monitor asynchronous operations, including the "DispatcherGateway.submitJob" in the "JobSubmitHandler". So when you increase the timeout, does it still could not work? Best, Yang satya brat 于2020年1月21日周二 下午8:57写道: > How does web.timeout help hear??

Re: Replacing a server in Zookeeper Quorum

2020-01-21 Thread Aaron Langford
My apologies, I ended up resolving this through experimentation. AWS replaces master nodes with the same internal DNS names, so configurations need not be changed. Aaron On Tue, Jan 21, 2020, 6:33 PM Yang Wang wrote: > Hi Aaron, > > I think it is not the responsibility of Flink. Flink uses

Re: Replacing a server in Zookeeper Quorum

2020-01-21 Thread tison
I second Yang that it would be a workaround that you set a static ip for EMR master node. Even in ZooKeeper world reconfig is a new and immature feature since 3.5.3 while Flink uses ZooKeeper 3.4.x. It would be a breaking change if we "just" upgrade zk version but hopefully the Flink community

Re: Replacing a server in Zookeeper Quorum

2020-01-21 Thread Yang Wang
Hi Aaron, I think it is not the responsibility of Flink. Flink uses zookeeper curator to connect the zk server. If multiple zk server are specified, it has an automatic retry mechanism. However, your problem is ip address will change when the EMR instance restarts. Currently, Flink can not

Re: java.lang.StackOverflowError

2020-01-21 Thread 刘建刚
I am using flink 1.6.2 on yarn. State backend is rocksdb. > 2020年1月22日 上午10:15,刘建刚 写道: > > I have a flink job which fails occasionally. I am eager to avoid this > problem. Can anyone help me? The error stacktrace is as following: > java.io.IOException: java.lang.StackOverflowError >

java.lang.StackOverflowError

2020-01-21 Thread 刘建刚
I have a flink job which fails occasionally. I am eager to avoid this problem. Can anyone help me? The error stacktrace is as following: java.io.IOException: java.lang.StackOverflowError at

Re: where does flink store the intermediate results of a join and what is the key?

2020-01-21 Thread Jark Wu
Hi Kant, 1) Yes, it will be stored in rocksdb statebackend. 2) In old planner, the left state is the same with right state which are both `>>`. It is a 2-level map structure, where the `col1` is the join key, it is the first-level key of the state. The key of the MapState is the input row,

Re: Influxdb reporter not honouring the metrics scope

2020-01-21 Thread Chesnay Schepler
The solution for 1.9 and below is to create a customized version of the influx db reporter which excludes certain tags. On 21/01/2020 19:27, Yun Tang wrote: Hi, Gaurav InfluxDB metrics reporter has a fixed format of reporting metrics which cannot be controlled by the scope. If you don't

Re: Influxdb reporter not honouring the metrics scope

2020-01-21 Thread Yun Tang
Hi, Gaurav InfluxDB metrics reporter has a fixed format of reporting metrics which cannot be controlled by the scope. If you don't want some tags stored, you can try to use `metrics.reporter..scope.variables.excludes` which introduced in flink-1.10 [1], to exclude specific variables. However,

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-21 Thread Aaron Langford
Senthil, One of the key steps in debugging this for me was enabling debug level logs on my cluster, and then looking at the logs in the resource manager. The failure you are after happens before the exceptions you have reported here. When your Flink application is starting, it will attempt to

Replacing a server in Zookeeper Quorum

2020-01-21 Thread Aaron Langford
Hello Flink Community, I'm working on a HA setup of Flink 1.8.1 on AWS EMR and have some questions about how Flink interacts with Zookeeper when one of the servers in the quorum specified in flink-conf.yaml goes down and is replaced by a machine with a new IP address. Currently, I configure

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-21 Thread Senthil Kumar
Yang, I appreciate your help! Please let me know if I can provide with any other info. I resubmitted my executable jar file as a step to the flink EMR and here’s are all the exceptions. I see two of them. I fished them out of /var/log/Hadoop//syslog 2020-01-21 16:31:37,587 ERROR

Re: How to get Task metrics with StatsD metric reporter?

2020-01-21 Thread John Smith
I think I figured it out. I used netcat to debug. I think the Telegraf StatsD server doesn't support spaces in the stats names. On Mon, 20 Jan 2020 at 12:19, John Smith wrote: > Hi, running Flink 1.8 > > I'm declaring my metric as such. > > invalidList = getRuntimeContext() >

Re: Question regarding checkpoint/savepoint and State Processor API

2020-01-21 Thread Jin Yi
Hi Seth, Thanks for the prompt response! Regarding my second question, once I have converted the existing savepoint to dataset, how can I convert the dataset into BroadcastState? For example, in my BroadcastProcessFunction: @Override public void processBroadcastElement(String key, Context

Re: Question regarding checkpoint/savepoint and State Processor API

2020-01-21 Thread Jin Yi
Hi Seth, Thanks for the prompt response! Regarding my second question, once I have converted the existing savepoint to dataset, how can I convert the dataset into BroadcastState? For example, in my BroadcastProcessFunction: @Override public void processBroadcastElement(String key, Context

Call for presentations for ApacheCon North America 2020 now open

2020-01-21 Thread Rich Bowen
Dear Apache enthusiast, (You’re receiving this message because you are subscribed to one or more project mailing lists at the Apache Software Foundation.) The call for presentations for ApacheCon North America 2020 is now open at https://apachecon.com/acna2020/cfp ApacheCon will be held at

GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-01-21 Thread Mark Harris
Hi, We're using flink 1.7.2 on an EMR cluster v emr-5.22.0, which runs hadoop v "Amazon 2.8.5". We've recently noticed that some TaskManagers fail (causing all the jobs running on them to fail) with an "java.lang.OutOfMemoryError: GC overhead limit exceeded”. The taskmanager (and jobs that

Re: Flink Performance

2020-01-21 Thread Dharani Sudharsan
Thanks David. But I don’t see any solutions provided for the same. On Jan 21, 2020, at 7:13 PM, David Magalhães mailto:speeddra...@gmail.com>> wrote: I've found this ( https://stackoverflow.com/questions/50580756/flink-window-dragged-stream-performance ) post on StackOverflow, where someone

Re: Flink Performance

2020-01-21 Thread David Magalhães
I've found this ( https://stackoverflow.com/questions/50580756/flink-window-dragged-stream-performance ) post on StackOverflow, where someone complains about performance drop in KeyBy. On Tue, Jan 21, 2020 at 1:24 PM Dharani Sudharsan < dharani.sudhar...@outlook.in> wrote: > Hi All, > >

Flink Performance

2020-01-21 Thread Dharani Sudharsan
Hi All, Currently, I’m running a flink streaming application, the configuration below. Task slots: 45 Task Managers: 3 Job Manager: 1 Cpu : 20 per machine My sample code below: Process Stream: datastream.flatmap().map().process().addsink Data size: 330GB approx. Raw Stream:

Re: Re: CountEvictor 与 TriggerResult.FIRE_AND_PURGE 清理窗口数据有区别吗?

2020-01-21 Thread tison
你读一下 EvictingWindowOperator 相关代码或者说 Evictor#evictBefore 的调用链,里面关于 window state 的处理是比较 hack 的,用文字说也起不到简练的作用 private void emitWindowContents(W window, Iterable> contents, ListState> windowState) throws Exception { timestampedCollector.setAbsoluteTimestamp(window.maxTimestamp()); // Work

Re: Flink Job Submission Fails even though job is running

2020-01-21 Thread satya brat
How does web.timeout help hear?? The issue is with respect to aka dispatched timing out. The job is submitted to the task managers but the response doesn't reach the client. On Tue, Jan 21, 2020 at 12:34 PM Yang Wang wrote: > Hi satya, > > Maybe the job has been submitted to Dispatcher

Re:Re: CountEvictor 与 TriggerResult.FIRE_AND_PURGE 清理窗口数据有区别吗?

2020-01-21 Thread USERNAME
evict 丢弃掉的数据,在内存或者RocksDB中也会同步删除吗? 在 2020-01-21 17:27:38,"tison" 写道: >正好看到这一部分,还是有的,你考虑下滑动的计数窗口 > >[1] 会在 fire 之后把整个 windowState 丢掉,[2] 其实会重新计算 evict 之后的 windowState > >Best, >tison. > > >USERNAME 于2020年1月21日周二 下午5:21写道: > >> 大家,新年快乐~ >> >> >> [1] TriggerResult.FIRE_AND_PURGE >> >>

Flink configuration on Docker deployment

2020-01-21 Thread Soheil Pourbafrani
Hi, I need to set up a Flink cluster using the docker(and not using the docker-compose). I successfully could strat the jobmanager and taskmanager but the problem is I have no idea how to change the default configuration for them. For example in the case of giving 8 slots to the taskmanager or

Re: Implementing a tick service

2020-01-21 Thread Dominik Wosiński
Hey, you have access to context in `onTimer` so You can easily reschedule the timer when it is fired. Best, Dom.

Re: Implementing a tick service

2020-01-21 Thread Benoît Paris
Hello all! Please disregard the last message; I used Thread.sleep() and Stateful Source Functions . But just out of curiosity, can processing-time Timers get rescheduled inside the

[no subject]

2020-01-21 Thread Ankush Khanna

where does flink store the intermediate results of a join and what is the key?

2020-01-21 Thread kant kodali
Hi All, If I run a query like this StreamTableEnvironment.sqlQuery("select * from table1 join table2 on table1.col1 = table2.col1") 1) Where will flink store the intermediate result? Imagine flink-conf.yaml says state.backend = 'rocksdb' 2) If the intermediate results are stored in rockdb then

Re: Questions of "State Processing API in Scala"

2020-01-21 Thread Izual
Sry for wrong post. > This can probably be confirmed by looking at the exception stack trace. > Can you post a full copy of that? I missed the history jobs, but I think u r right. When I debug the program to find reason, came into these code snippet. ``` TypeSerializerSchemaCompatibility

Re: CountEvictor 与 TriggerResult.FIRE_AND_PURGE 清理窗口数据有区别吗?

2020-01-21 Thread tison
正好看到这一部分,还是有的,你考虑下滑动的计数窗口 [1] 会在 fire 之后把整个 windowState 丢掉,[2] 其实会重新计算 evict 之后的 windowState Best, tison. USERNAME 于2020年1月21日周二 下午5:21写道: > 大家,新年快乐~ > > > [1] TriggerResult.FIRE_AND_PURGE > >

CountEvictor 与 TriggerResult.FIRE_AND_PURGE 清理窗口数据有区别吗?

2020-01-21 Thread USERNAME
大家,新年快乐~ [1] TriggerResult.FIRE_AND_PURGE https://github.com/apache/flink/blob/1662d5d0cda6a813e5c59014acfd7615b153119f/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/evictors/CountEvictor.java#L74 [2] CountEvictor

about registering completion function for worker shutdown

2020-01-21 Thread Dominique De Vito
Hi, For a Flink batch job, some value are writing to Kafka through a Producer. I want to register a hook for closing (at the end) the Kafka producer a worker is using hook to be executed, of course, on worker side. Is there a way to do so ? Thanks. Regards, Dominique