Re: BlinkPlanner limitation related clarification

2020-01-23 Thread Jingsong Li
Hi RKandoji, IMO, yes, you can not reuse table env, you should create a new tEnv after executing, 1.9.1 still has this problem. Related issue is [1], fixed in 1.9.2 and 1.10. [1] https://issues.apache.org/jira/browse/FLINK-13708 Best, Jingsong Lee On Fri, Jan 24, 2020 at 11:14 AM RKandoji

BlinkPlanner limitation related clarification

2020-01-23 Thread RKandoji
Hi Team, I've been using Blink Planner and just came across this page https://ci.apache.org/projects/flink/flink-docs-stable/release-notes/flink-1.9.html#known-shortcomings-or-limitations-for-new-features and saw below limitation: Due to a bug with how transformations are not being cleared on

Re: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-23 Thread tison
你上面的是 taskmanager.err,需要的是 taskmanager.log Best, tison. 郑 洁锋 于2020年1月23日周四 下午10:22写道: > 之前挂过 后面启动的时候 是checkpoints的文件丢了? 你是这个意思吗? > > > zjfpla...@hotmail.com > > 发件人: zhisheng > 发送时间: 2020-01-22 16:45 > 收件人:

How to debug a job stuck in a deployment/run loop?

2020-01-23 Thread Jason Kania
I am attempting to migrate from 1.7.1 to 1.9.1 and I have hit a problem where previously working jobs can no longer launch after being submitted. In the UI, the submitted jobs show up as deploying for a period, then go into a run state before returning to the deploy state and this repeats

FileStreamingSink is using the same counter for different files

2020-01-23 Thread Pawel Bartoszek
Hi, Flink Streaming Sink is designed to use global counter when creating files to avoid overwrites. I am running Flink 1.8.2 with Kinesis Analytics (managed flink provided by AWS) with bulk writes (rolling policy is hardcoded to roll over on checkpoint). My job is configured to checkpoint every

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-23 Thread Aaron Langford
When creating your cluster, you can provide configurations that EMR will find the right home for. Example for the aws cli: aws emr create-cluster ... --configurations '[{ > "Classification": "flink-log4j", > "Properties": { > "log4j.rootLogger": "DEBUG,file" > } > },{ >

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-23 Thread Senthil Kumar
Could you tell us how to turn on debug level logs? We attempted this (on driver) sudo stop hadoop-yarn-resourcemanager followed the instructions here https://stackoverflow.com/questions/27853974/how-to-set-debug-log-level-for-resourcemanager and sudo start hadoop-yarn-resourcemanager but we

PostgreSQL JDBC connection drops after inserting some records

2020-01-23 Thread Soheil Pourbafrani
Hi, I have a peace of Flink Streaming code that reads data from files and inserts them into the PostgreSQL table. After inserting 6 to 11 million records, I got the following errors: *Caused by: java.lang.RuntimeException: Execution of JDBC statement failed. at

Re: where does flink store the intermediate results of a join and what is the key?

2020-01-23 Thread Benoît Paris
Hi all! @Jark, out of curiosity, would you be so kind as to expand a bit on "Query on the intermediate state is on the roadmap"? Are you referring to working on QueryableStateStream/QueryableStateClient [1], or around "FOR SYSTEM_TIME AS OF" [2], or on other APIs/concepts (is there a FLIP?)?

Re: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-23 Thread 郑 洁锋
之前挂过 后面启动的时候 是checkpoints的文件丢了? 你是这个意思吗? zjfpla...@hotmail.com 发件人: zhisheng 发送时间: 2020-01-22 16:45 收件人: user-zh 主题: Re: flink on yarn任务启动报错 The assigned slot

Re: REST rescale with Flink on YARN

2020-01-23 Thread Vasily Melnik
Hi all, I've found some solution for this issue. Problem is that with YARN ApplicationMaster URL we communicate with JobManager via proxy which is implemented on Jetty 6 (for Hadoop 2.6). So to use PATCH method we need to locate original JobManager URL. Using /jobmanager/config API we could get

Re: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_000002_0 was removed.

2020-01-23 Thread 郑 洁锋
日志已经在前面的邮件里面了 zjfpla...@hotmail.com 发件人: tison 发送时间: 2020-01-22 12:10 收件人: user-zh 主题: Re: Re: flink on yarn任务启动报错 The assigned slot container_e10_1579661300080_0005_01_02_0 was removed. 那你看下 TM

Re: Old offsets consumed from Kafka after Flink upgrade to 1.9.1 (from 1.2.1)

2020-01-23 Thread Tzu-Li (Gordon) Tai
Hi Somya, I'll have to take a closer look at the JIRA history to refresh my memory on potential past changes that caused this. My first suspection is this: It is expected that the Kafka consumer will *ignore* the configured startup position if the job was restored from a savepoint. It will

Re: REST rescale with Flink on YARN

2020-01-23 Thread Chesnay Schepler
Older versions of Jetty don't support PATCH requests. You will either have to update it or create a custom Flink version that uses POST for the rescale operation. On 23/01/2020 13:23, Vasily Melnik wrote: Hi all. I'm using Flink 1.8 on YARN with CDH 5.12 When i try to perform rescale request:

REST rescale with Flink on YARN

2020-01-23 Thread Vasily Melnik
Hi all. I'm using Flink 1.8 on YARN with CDH 5.12 When i try to perform rescale request: curl -v -X PATCH '/proxy/application_1576854986116_0079/jobs/11dcfc3163936fc019e049fc841b075b/rescaling?parallelism=3

Re: batch job OOM

2020-01-23 Thread Jingsong Li
Fanbin, I have no idea now, can you created a JIRA to track it? You can describe complete SQL and some data informations. Best, Jingsong Lee On Thu, Jan 23, 2020 at 4:13 PM Fanbin Bu wrote: > Jingsong, > > Do you have any suggestions to debug the above mentioned > IndexOutOfBoundsException

Re: Usage of KafkaDeserializationSchema and KafkaSerializationSchema

2020-01-23 Thread Aljoscha Krettek
Hi, the reason the new schema feels a bit weird is that it implements a new paradigm in a FlinkKafkaProducer that still follows a somewhat older paradigm. In the old paradigm, partitioning and topic where configured on the sink, which made it fixed for all produced records. The new schema

Re: Old offsets consumed from Kafka after Flink upgrade to 1.9.1 (from 1.2.1)

2020-01-23 Thread Somya Maithani
Hey, Any ideas about this? We are blocked on the upgrade because we want async timer checkpointing. Regards, Somya Maithani Software Developer II Helpshift Pvt Ltd On Fri, Jan 17, 2020 at 10:37 AM Somya Maithani wrote: > Hey Team, > > *Problem* > Recently, we were trying to upgrade Flink

Re: Apache Flink - Sharing state in processors

2020-01-23 Thread Chesnay Schepler
1. a/b) No, they are deserialized into separate instances in any case and are independent afterwards. 2. a/b) No, see 1). 3. a/b) No, as individual tasks are isolated by different class-loaders. On 23/01/2020 09:25, M Singh wrote: Thanks Yun for your answers. By processor I did mean user

Re: Usage of KafkaDeserializationSchema and KafkaSerializationSchema

2020-01-23 Thread Chesnay Schepler
That's a fair question; the interface is indeed weird in this regard and does have some issues. From what I can tell you have 2 options: a) have the user pass the topic to the serialization schema constructor, which in practice would be identical to the topic they pass to the producer. b)

Re: java.lang.StackOverflowError

2020-01-23 Thread Piotr Nowojski
Hi, Thanks for reporting the issue. Could you first try to upgrade to Flink 1.6.4? This might be a known issue fixed in a later bug fix release [1]. Also, are you sure you are using (unmodified) Flink 1.6.2? Stack traces somehow do not match with the 1.6.2 release tag in the repository, for

Re: Flink ParquetAvroWriters Sink

2020-01-23 Thread aj
Hi Arvid, I am not clear with this " Note that I still recommend to just bundle the schema with your Flink application and not reinvent the wheel." Can you please help with some sample code on how it should be written. Or can we connect some way so that I can understand with you . On Thu, Jan

Re: Flink ParquetAvroWriters Sink

2020-01-23 Thread Arvid Heise
The issue is that your are not providing any meaningful type information, so that Flink has to resort to Kryo. You need to extract the schema during query compilation (in your main) and pass it to your deserialization schema. public TypeInformation getProducedType() { return

Re: Apache Flink - Sharing state in processors

2020-01-23 Thread M Singh
Thanks Yun for your answers. By processor I did mean user defined processor function. Keeping that in view, do you have any advice on how the shared state - ie, the parameters passed to the processor as mentioned above (not the key state or operator state) will be affected in a distributed

debug flink in intelliJ on EMR

2020-01-23 Thread Fanbin Bu
Hi, I m following https://cwiki.apache.org/confluence/display/FLINK/Remote+Debugging+of+Flink+Clusters to debug flink program running on EMR. how do I specify the host in the `edit configurations` if the terminal on emr master is hadoop@ip-10-200-46-186 ? Thanks, Fanbin

Re: batch job OOM

2020-01-23 Thread Fanbin Bu
Jingsong, Do you have any suggestions to debug the above mentioned IndexOutOfBoundsException error? Thanks, Fanbin On Wed, Jan 22, 2020 at 11:07 PM Fanbin Bu wrote: > I got the following error when running another job. any suggestions? > > Caused by: java.lang.IndexOutOfBoundsException > at >