Re: Apache Flink - Are counters reliable and accurate ?

2019-06-27 Thread Chesnay Schepler
So here's the thing: Metrics are accurate, so long as the job is running. Once the job terminates metrics are cleaned up and not persisted anywhere, with the exception of a few metrics (like numRecordsIn). Another thing that is always good to double-check is to enable DEBUG logging and re-run

[External] Regarding kinesis data analytics for flink

2019-06-27 Thread Vishal Sharma
Hi Leo, Recently, I came across kinesis data analytics which provides managed flink in AWS. It seems promising. Some concerns that I have with my very little exploration till now are - => it uses flink 1.6 (not the latest one) => Doesn't support kafka streams directly => No option to configure

Re:Flink batch job memory/disk leak when invoking set method on a static Configuration object.

2019-06-27 Thread Haibo Sun
Hi, Vadim This similar issue has occurred in earlier versions, see https://issues.apache.org/jira/browse/FLINK-4485. Are you running a Flink session cluster on YARN? I think it might be a bug. I'll see if I can reproduce it with the master branch code, and if yes, I will try to investigate

About "Flink 1.7.0 HA based on zookeepers "

2019-06-27 Thread 胡逸才
HI Tan: I have the same problem with you when running "flink-1.7.2 ON KUBERNATE HA" mode, may I ask if you have solved this problem? How? After I started the two jobmanagers normally, when I tried to kill one of them, he could not restart normally. Both jobmanagers reported this error. The

Re: HDFS checkpoints for rocksDB state backend:

2019-06-27 Thread Yang Wang
Hi, Andrea If you are running flink cluster on Yarn, the jar `flink-shaded-hadoop2-uber-1.6.4.jar` should exist in the lib dir of the flink client, so that it could be uploaded to the Yarn Distributed Cache and then be available on JM and TM. And if you are running flink standalone cluster, the

Re: Apache Flink - Running application with different flink configurations (flink-conf.yml) on EMR

2019-06-27 Thread M Singh
Hi Xintong:  Thanks for your pointers. I tried -Dparallelism.default=2 locally (in IDE) and it did not work.  Do you know if there is a common way that would work both for emr, locally and ide ? Thanks again. On Thursday, June 27, 2019, 12:03:06 AM EDT, Xintong Song wrote: Hi Singh,

Re: Apache Flink - Are counters reliable and accurate ?

2019-06-27 Thread M Singh
Hi Chesnay: Thanks for your response. My job runs for a few minutes and i've tried setting the reporter interval to 1 second. I will try the counter on a longer running job. Thanks again. On Thursday, June 27, 2019, 11:46:17 AM EDT, Chesnay Schepler wrote: 1) None that I'm aware of.

Flink batch job memory/disk leak when invoking set method on a static Configuration object.

2019-06-27 Thread Vadim Vararu
Hi guys, I have a simple batch job with a custom output formatter that writes to a local file. public class JobHadoop { public static void main(String[] args) throws Exception { ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

Debug Kryo.Serialization Exception

2019-06-27 Thread Fabian Wollert
Hi, we have some Flink Jobs (Flink Version 1.7.1) consuming from a Custom Source and Ingesting into an Elasticsearch Cluster (V.5.6). In recent times, we see more and more Exceptions happening like this: com.esotericsoftware.kryo.KryoException: Unable to find class: com. ^ at

Re: Apache Flink - Are counters reliable and accurate ?

2019-06-27 Thread Chesnay Schepler
1) None that I'm aware of. 2) You should use counters. 3) No, counters are not checkpointed, but you could store the value in state yourself. 4) None that I'm aware of that doesn't require modifications to the application logic. How long does your job run for, and how do you access metrics?

Apache Flink - Are counters reliable and accurate ?

2019-06-27 Thread M Singh
Hi: I need to collect application metrics which are counts (per unit of time eg: minute)  for certain events.  There are two ways of doing this: 1. Create separate streams (using split stream etc) in the application explicitly, then aggregate the counts in a window and save them.  This mixes

Re: Limit number of jobs or Job Managers

2019-06-27 Thread Pankaj Chand
Hi Haibo Sun, It's exactly what I needed. Thank you so much! Best, Pankaj On Thu, Jun 27, 2019 at 7:45 AM Haibo Sun wrote: > Hi, Pankaj Chand > > If you're running Flink on YARN, you can do this by limiting the number of > applications in the cluster or in the queue. As far as I know, Flink

Re:Limit number of jobs or Job Managers

2019-06-27 Thread Haibo Sun
Hi, Pankaj Chand If you're running Flink on YARN, you can do this by limiting the number of applications in the cluster or in the queue. As far as I know, Flink does not limit that. The following are the configuration items for YARN : yarn.scheduler.capacity.maximum-applications

Limit number of jobs or Job Managers

2019-06-27 Thread Pankaj Chand
Hi everyone, Is there any way (parameter or function) I can limit the number of concurrent jobs executing in my Flink cluster? Or alternatively, limit the number of concurrent Job Managers (since there has to be one Job Manager for every job)? Thanks! Pankaj

Re: HDFS checkpoints for rocksDB state backend:

2019-06-27 Thread Andrea Spina
HI Qiu, my jar does not contain the class `org.apache.hadoop.hdfs.protocol.HdfsConstants*`, *but I do expect it is contained within `flink-shaded-hadoop2-uber-1.6.4.jar` which is located in Flink cluster libs. Il giorno gio 27 giu 2019 alle ore 04:03 Congxian Qiu < qcx978132...@gmail.com> ha

Re: Hello-world example of Flink Table API using a edited Calcite rule

2019-06-27 Thread JingsongLee
Got it, it's clear, TableStats is the important functions of ExternalCatalog. It is right way. Best, JingsongLee -- From:Felipe Gutierrez Send Time:2019年6月27日(星期四) 14:53 To:JingsongLee Cc:user Subject:Re: Hello-world example of

Re: Hello-world example of Flink Table API using a edited Calcite rule

2019-06-27 Thread Felipe Gutierrez
Hi JingsongLee, Sorry for not explain very well. I am gonna try a clarification of my idea. 1 - I want to use InMemoryExternalCatalog in a way to save some statistics which I create by listening to a stream. 2 - Then I will have my core application using Table API to execute some

Re: Batch mode with Flink 1.8 unstable?

2019-06-27 Thread Biao Liu
Hi Ken again, In regard to TimeoutException, I just realized that there is no akka.remote.OversizedPayloadException in your log file. There might be some other reason caused this. 1. Have you ever tried increasing the configuration "akka.ask.timeout"? 2. Have you ever checked the garbage