Re: Get the savepointPath of a particular savepoint

2019-01-14 Thread Gary Yao
Hi, The API still returns the location of a completed savepoint. See the example in the Javadoc [1]. Best, Gary [1]

Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-14 Thread sohimankotia
Hi, Flink 1.5.5. My streaming job checkpoints every minute. I am getting the following exception. 2019-01-15 01:59:04,680 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 469 for job e9a08c0661a6c31b5af540cf352e1265 (2736 bytes in 124 ms). 2019-01-15

Re: Flink Streaming Job Task is getting cancelled silently and causing job to restart

2019-01-14 Thread sohimankotia
Hi, any update/help please? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Get watermark metric as a delta of current time

2019-01-14 Thread Cristian
Hello. Flink emits watermark metrics (currentWatermark) as a Unix timestamp, which is useful in some contexts but troublesome in others. For instance, when sending data to Datadog, there is no way to meaningfully see or act upon this metric, because there is no support for timestamps. A more
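The transformation being asked for can be sketched as follows. This is a minimal, self-contained illustration (the function name `watermark_lag_ms` is hypothetical, not a Flink API); in an actual job, this value would be registered as a custom gauge metric rather than computed standalone.

```python
import time

def watermark_lag_ms(current_watermark_ms, now_ms=None):
    """Report the watermark as a lag (delta from wall-clock time) instead of
    a raw Unix timestamp, which a monitoring system like Datadog can graph
    and alert on directly."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - current_watermark_ms
```

A gauge exposing this delta stays near zero when the pipeline keeps up and grows when event time falls behind, which is the actionable signal the raw timestamp hides.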

Re: One TaskManager per node or multiple TaskManager per node

2019-01-14 Thread Ethan Li
Thank you Jamie! Sorry, I didn’t add more context because it’s mostly a general question without any specific use case in mind. We currently deploy Flink on bare metal and then submit jobs to it, which is also how we deploy our Storm cluster. It looks like we need to move away from this setup for Flink. We

Duplicate record writes to sink after job failure

2019-01-14 Thread Slotterback, Chris
We are running a Flink job that uses FlinkKafkaProducer09 as a sink with consumer checkpointing enabled. When our job runs into communication issues with our Kafka cluster and throws an exception after the configured retries, our job restarts, but we want to ensure at-least-once processing, so we
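With at-least-once delivery, a restart replays records since the last checkpoint, so duplicates at the sink are expected; a common remedy is an idempotent consumer that filters on a record key. A minimal sketch of that idea (the `deduplicate` helper and the `"id"` field are hypothetical, not part of any Flink or Kafka API):

```python
def deduplicate(records, key=lambda r: r["id"]):
    """Drop records whose key has already been seen. Masks the duplicates
    that at-least-once delivery produces after a job restart, at the cost
    of keeping seen keys in memory (bounded in practice by a TTL or
    windowing strategy)."""
    seen = set()
    out = []
    for record in records:
        k = key(record)
        if k not in seen:
            seen.add(k)
            out.append(record)
    return out
```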

Re: One TaskManager per node or multiple TaskManager per node

2019-01-14 Thread Jamie Grier
There are a lot of different ways to deploy Flink. It would be easier to answer your question with a little more context about your use case but in general I would advocate the following: 1) Don't run a "permanent" Flink cluster and then submit jobs to it. Instead what you should do is run an

One TaskManager per node or multiple TaskManager per node

2019-01-14 Thread Ethan Li
Hello, I am setting up a standalone Flink cluster and I am wondering what’s the best way to distribute TaskManagers. Do we usually launch one TaskManager (with many slots) per node, or multiple TaskManagers per node (with a smaller number of slots per TM)? Also, with one TaskManager per node, I

Re: Recovery problem 1 of 2 in Flink 1.6.3

2019-01-14 Thread John Stone
Is this a known issue? Should I create a Jira ticket? Does anyone have anything they would like me to try? I’m very lost at this point. I’ve now seen this issue happen without destroying pods, i.e. the running job crashes after several hours and fails to recover once all task slots are

Re: Parallelism questions

2019-01-14 Thread Dawid Wysakowicz
Hi Alexandru, As for 2, generally speaking the number of required slots depends on the number of slot sharing groups. By default all operators belong to the default slot sharing group, which means a job requires as many slots as the maximum parallelism in the job. More on the distributed runtime you can
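The slot-counting rule described above can be sketched as a small calculation: each slot sharing group needs as many slots as its highest operator parallelism, and the job needs the sum across groups. This is an illustrative model, not Flink code; the `required_slots` helper is hypothetical.

```python
def required_slots(operators):
    """operators: list of (slot_sharing_group, parallelism) pairs.
    Each group needs slots equal to the max parallelism of its operators;
    the job needs the sum over groups. With the default single group this
    reduces to the maximum parallelism in the job."""
    maxima = {}
    for group, parallelism in operators:
        maxima[group] = max(maxima.get(group, 0), parallelism)
    return sum(maxima.values())
```

For example, three operators with parallelisms 4, 2, and 4 in the default group need 4 slots, but splitting one operator into its own group raises the total.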

RE: Subtask much slower than the others when creating checkpoints

2019-01-14 Thread Pasquale Vazzana
I have the same problem, and even more impactful: some subtasks stall forever quite consistently. I am using Flink 1.7.1, but I've tried downgrading to 1.6.3 and it didn't help. The backend doesn't seem to make any difference; I've tried the Memory, FS and RocksDB backends but nothing changes. I've

Parallelism questions

2019-01-14 Thread Alexandru Gutan
Hi everyone! 1. Is there a way to increase the parallelism (e.g. through REST) of some operators in a job without re-deploying the job? I found this answer which mentions scaling at runtime on

Re: Multiple select single result

2019-01-14 Thread dhanuka ranasinghe
Hi Fabian, +1  Cheers Dhanuka On Mon, 14 Jan 2019, 21:29 Fabian Hueske Hi, > > That's a Java limitation. Methods cannot be larger than 64kb and code that > is generated for this predicate exceeds the limit. > There is a Jira issue to fix the problem. > > In the meantime, I'd follow a hybrid

Re: Get the savepointPath of a particular savepoint

2019-01-14 Thread Dawid Wysakowicz
Hi, The path of a savepoint is a user-specified parameter, therefore it is not tracked by Flink. It is up to the user to know where the savepoint should end up. As for an API to check the status of a savepoint, you can use [1] Best, Dawid [1]

Re: Multiple select single result

2019-01-14 Thread Fabian Hueske
Hi, That's a Java limitation. Methods cannot be larger than 64 KB and the code that is generated for this predicate exceeds the limit. There is a Jira issue to fix the problem. In the meantime, I'd follow a hybrid approach and UNION ALL only as many tables as you need to avoid the code compilation
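The hybrid approach suggested here can be sketched as splitting a long list of OR'd predicates into batches, so the code generated for each SELECT stays under the JVM's 64 KB method limit, and then combining the per-batch queries with UNION ALL. This is an illustrative query builder under assumed names (`batched_union_query`, a table `events`), not Flink Table API code, and the batch size that actually avoids the limit would need to be found by experiment.

```python
def batched_union_query(table, column, values, batch_size=50):
    """Build one SELECT per batch of OR'd equality predicates, then join
    the batches with UNION ALL. Keeps each generated filter method small
    enough to compile while still covering all values."""
    selects = []
    for i in range(0, len(values), batch_size):
        batch = values[i:i + batch_size]
        predicate = " OR ".join(f"{column} = '{v}'" for v in batch)
        selects.append(f"SELECT * FROM {table} WHERE {predicate}")
    return "\nUNION ALL\n".join(selects)
```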

Re: What happen to state in Flink Task Manager when crash?

2019-01-14 Thread Dawid Wysakowicz
Hi, Pretty much just a rephrase of what others said. Flink's state is usually backed by some highly available distributed fs, and upon checkpoint a consistent view of all local states is written there, so it can later be restored from. As of now, any failure of a Task slot (e.g. if a TM fails, all
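The checkpoint-and-restore cycle described above can be modeled in a few lines. This is a toy sketch of the concept only (class and method names are invented), not how Flink's state backends are implemented: on checkpoint, a consistent copy of local state goes to durable storage; after a failure, the task restores the latest snapshot and reprocesses input from that point.

```python
import copy

class ToyStatefulTask:
    """Toy model of checkpointed local state backed by durable storage."""

    def __init__(self, durable_store):
        self.state = {}              # local, lost on failure
        self.durable = durable_store # survives task failures

    def process(self, key):
        # Count occurrences per key as stand-in stateful processing.
        self.state[key] = self.state.get(key, 0) + 1

    def checkpoint(self, checkpoint_id):
        # Write a consistent snapshot of local state to durable storage.
        self.durable[checkpoint_id] = copy.deepcopy(self.state)

    def recover(self, checkpoint_id):
        # After a failure, restore local state from the snapshot; input
        # processed after that checkpoint must be replayed.
        self.state = copy.deepcopy(self.durable[checkpoint_id])
```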

Re: Multiple select single result

2019-01-14 Thread dhanuka ranasinghe
Hi Fabian, I encountered the below error with 200 OR operators, so I guess this is a JVM-level limitation. Error: of class "datastreamcalcrule" grows beyond 64 kb Cheers Dhanuka On Mon, 14 Jan 2019, 20:30 Fabian Hueske Hi, > > you should avoid the UNION ALL approach because the query will scan

Re: Multiple select single result

2019-01-14 Thread Fabian Hueske
Hi, you should avoid the UNION ALL approach because the query will scan the (identical?) Kafka topic 200 times, which is highly inefficient. You should rather use your second approach and scale the query appropriately. Best, Fabian On Mon, Jan 14, 2019 at 08:39 dhanuka ranasinghe <