Re: NoSuchMethodError - getColumnIndexTruncateLength after upgrading Flink from 1.11.2 to 1.12.1

2021-06-29 Thread Thomas Wang
Thanks Matthias. Could you advise how I can confirm this in my environment? Thomas On Tue, Jun 29, 2021 at 1:41 AM Matthias Pohl wrote: > Hi Rommel, Hi Thomas, > Apache Parquet was bumped from 1.10.0 to 1.11.1 for Flink 1.12 in > FLINK-19137 [1]. The error you're seeing looks like some dependen

Re: Yarn Application Crashed?

2021-06-29 Thread Thomas Wang
Thanks Piotr. This is helpful. Thomas On Mon, Jun 28, 2021 at 8:29 AM Piotr Nowojski wrote: > Hi, > > You should still be able to get the Flink logs via: > > > yarn logs -applicationId application_1623861596410_0010 > > And it should give you more answers about what has happened. > > About the

Re: How can I tell if a record in a bounded job is the last record?

2021-06-29 Thread Paul Lam
Hi Yik San, Maybe you could use watermark to trigger the last flush. Source operations will emit MAX_WATERMARK to trigger all the timers when it terminates (see [1]). [1] https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/Stre

RE: FW: Hadoop3 with Flink

2021-06-29 Thread V N, Suchithra (Nokia - IN/Bangalore)
Hi Yangze Guo, Thanks for the reply. I am using flink in Kubernetes environment. Hence can you please suggest how to use hadoop3 with flink in k8s. Regards, Suchithra -Original Message- From: Yangze Guo Sent: Monday, June 28, 2021 3:16 PM To: V N, Suchithra (Nokia - IN/Bangalore) Cc:

Re: Job Recovery Time on TM Lost

2021-06-29 Thread Lu Niu
Hi, Gen Thanks for replying! The reasoning overall makes sense. But in this case, when JM sends a cancel request to a killed TM, why the call timeout 30s instead of returning "connection refused" immediately? Best Lu On Tue, Jun 29, 2021 at 7:52 PM Gen Luo wrote: > Hi Lu, > > We found almost

Re: Questions about UPDATE_BEFORE row kind in ElasticSearch sink

2021-06-29 Thread Kai Fu
Thank you for the reply, Jark. In our case, we found that there are no UPDATE_BEFORE records generated since the join is using -D/+I row kinds. *> Note: "+I" represents "INSERT", "-D" represents "DELETE", "+U" represents "UPDATE_AFTER",* * "-U" represents "UPDATE_BEFORE". We forward input RowK

Re: Job Recovery Time on TM Lost

2021-06-29 Thread Gen Luo
Hi Lu, We found almost the same thing when we were trying failover in a large scale job. The akka.ask.timeout and heartbeat.timeout were set to 10min for the test, and we found that the job would take 10min to recover from TM lost. We reached the conclusion that the behavior is expected in the Fl

How can I tell if a record in a bounded job is the last record?

2021-06-29 Thread Yik San Chan
Hi community, I have a batch job that consumes records from a bounded source (e.g., Hive), walk them through a BufferingSink as described in [docs]( https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/state/#checkpointedfunction). In the BufferingSink,

Re: Converting Table API query to Datastream API

2021-06-29 Thread JING ZHANG
Hi Le, link is a bit outdated. Since Flink 1.9 version, TableAPI & SQL is no longer translated to DataStream API. TableAPI & SQL and DataStream are at the same level, and both translated into

Session cluster configmap removal

2021-06-29 Thread Sweta Kalakuntla
Hi, We have flink session clusters in kubernetes and several long running flink jobs deployed in them with HA enabled. After we have enabled HA, we are seeing configmaps created for every new job. Whenever we stop/cancel any existing jobs, these configmaps do not get deleted. Is that right, these

Re: Savepoint failure with operation not found under key

2021-06-29 Thread Rainie Li
I see, then it passed longer than 5 mins. Thanks for the help. Best regards Rainie On Tue, Jun 29, 2021 at 12:29 AM Chesnay Schepler wrote: > How much time has passed between the requests? (You can only query the > status for about 5 minutes) > > On 6/29/2021 6:37 AM, Rainie Li wrote: > > Thank

Converting Table API query to Datastream API

2021-06-29 Thread Le Xu
Hello! I have a basic question about the concept of using Flink Table API. Based on the link here it seems like if I implement stream query with Table API the program is translated to datastr

Re: Recommended way to submit a SQL job via code without getting tied to a Flink version?

2021-06-29 Thread Sonam Mandal
Hi Matthias, Thanks for getting back to me. We are trying to build a system where users can focus on writing Flink SQL applications and we handle the full lifecycle of their Flink cluster and job. We would like to let users focus on just their SQL and UDF logic. In such an environment, we canno

Re: Use Flink to write a Kafka topic to s3 as parquet files

2021-06-29 Thread Arvid Heise
Hi Thomas, The usual way with Avro would be to generate a class from your schema [1]. Then PlaySession would already be a SpecificRecord and you would avoid the extra step. I'm quite positive that the same way works with ParquetAvroWriters. Note that you would need to use ParquetAvroWriters#forSp

Re: Recommended way to submit a SQL job via code without getting tied to a Flink version?

2021-06-29 Thread Matthias Pohl
Hi Sonam, what's the reason for not using the Flink SQL client? Because of the version issue? I only know that FlinkSQL's state is not backwards-compatible between major Flink versions [1]. But that seems to be unrelated to what you describe. I'm gonna add Jark and Timo to this thread. Maybe, they

Re: Monitoring Exceptions using Bugsnag

2021-06-29 Thread Matthias Pohl
Hi Kevin, I haven't worked with Bugsnag. So, I cannot give more input on that one. For Flink, exceptions are handled by the job's scheduler. Flink collects these exceptions in some bounded queue called the exception history [1]. It collects task failures but also global failures which make the job

Re: NoSuchMethodError - getColumnIndexTruncateLength after upgrading Flink from 1.11.2 to 1.12.1

2021-06-29 Thread Matthias Pohl
Hi Rommel, Hi Thomas, Apache Parquet was bumped from 1.10.0 to 1.11.1 for Flink 1.12 in FLINK-19137 [1]. The error you're seeing looks like some dependency issue where you have a version other than 1.11.1 of org.apache.parquet:parquet-column:jar on your classpath? Matthias [1] https://issues.apac

Re: Savepoint failure with operation not found under key

2021-06-29 Thread Chesnay Schepler
How much time has passed between the requests? (You can only query the status for about 5 minutes) On 6/29/2021 6:37 AM, Rainie Li wrote: Thanks for the context Chesnay. Yes, I sent both requests to the same JM. Best regards Rainie On Mon, Jun 28, 2021 at 8:33 AM Chesnay Schepler