Kafka exceptions in Flink log file

2018-04-02 Thread Alexander Smirnov
I see a lot of messages like the one below in the Flink log. What's the cause? 02 Apr 2018 04:09:13,554 ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: org.apache.kafka.common.KafkaException: Error registering mbean kafka.producer:type=producer-node-met

Re: DFS problem with removing checkpoint

2018-04-02 Thread Stephan Ewen
Can you clarify which one does not get deleted? The file in the "high-availability.storageDir", or the "state.backend.fs.checkpointdir/JobId/check-", or both? Could you also tell us which file system you use? There is a known issue in some versions of Flink that S3 "directories" are not deleted.

Re: Unable to load AWS credentials: Flink 1.2.1 + S3 + Kubernetes

2018-04-02 Thread Bajaj, Abhinav
Hi, Thanks for the suggestions. We are using Kube2iam for our Kubernetes cluster and it seems to be set up correctly to support IAM Roles. I also checked AWS documentation to troubleshoot th

Side outputs never getting consumed

2018-04-02 Thread Julio Biason
Hey guys, I have a pipeline that generates two different types of data (both using the same trait), and I need to save each to a different sink. So far, things were working with splits, but it seems using splits with side outputs (for the late data, for which we'll plug in late-arrival handling) caus

Watermark Question on Failed Process

2018-04-02 Thread Chengzhi Zhao
Hello, Flink community, I am using periodic watermarks and extracting the event time from each record from files in S3. I am using the `TimeLagWatermarkGenerator` as mentioned in the Flink documentation. Currently, a new watermark will be generated using processing time by a fixed amount override de
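The time-lag idea in the post above (a watermark that trails processing time by a fixed amount) can be sketched in plain Java. This is only an illustration under stated assumptions: it uses no Flink classes, and the class name, method names, and the 5-second lag are hypothetical rather than taken from the poster's job.

```java
// Plain-Java sketch of a time-lag watermark; no Flink classes are used.
// All names and the lag value below are illustrative assumptions.
final class TimeLagWatermarkSketch {

    // Hypothetical lag: the watermark trails processing time by this many milliseconds.
    static final long MAX_TIME_LAG_MS = 5_000L;

    // The watermark is the given processing time minus the fixed lag.
    static long currentWatermark(long processingTimeMs) {
        return processingTimeMs - MAX_TIME_LAG_MS;
    }

    // A record counts as late when its event timestamp is at or below the watermark.
    static boolean isLate(long eventTimestampMs, long watermarkMs) {
        return eventTimestampMs <= watermarkMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long watermark = currentWatermark(now);
        // A record stamped 10 seconds ago falls behind the watermark.
        System.out.println("old record late? " + isLate(now - 10_000L, watermark));
        // A record stamped "now" is still ahead of the watermark.
        System.out.println("fresh record late? " + isLate(now, watermark));
    }
}
```

Because such a watermark depends only on the wall clock, records whose event times are much older than the current processing time would be classified as late in this sketch, regardless of the order in which they are read.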

Re: Issue in Flink/Zookeeper authentication via Kerberos

2018-04-02 Thread Shuyi Chen
Hi Sarthak, Happy to help. Could you please share the jobmanager/taskmanager log and flink conf again? Also, Flink 1.4.0 has a regression in Kerberos security (the keytab path in the TaskManager is set incorrectly), which is fixed in 1.4.1. (see https://issues.apache.org/jira/browse/FLINK-8275) Shuyi

Multiple (non-consecutive) keyBy operators in a dataflow

2018-04-02 Thread au.fp2018
Hello Flink Community, I am relatively new to Flink. In the project I am currently working on, I have a dataflow with a keyBy() operator, which I want to convert to a dataflow with multiple keyBy() operators like this: Source --> KeyBy() --> Stateful process() function that generates a more gra

Re: Multiple (non-consecutive) keyBy operators in a dataflow

2018-04-02 Thread 李玥
Hello, In my opinion, it would be meaningful only in this situation: 1. The total size of all your state is large enough, e.g. 1GB+. 2. Splitting your job into multiple KeyBy processes would reduce the size of your state. Because the operation of saving state is synchronized and all working threa

subuquery about flink sql

2018-04-02 Thread ??????
Dear all, I have a question about subqueries in Flink SQL. My SQL looks like this: select user, count(product), TUMBLE_START(t, INTERVAL '60' SECOND) as wStart, TUMBLE_END(t, INTERVAL '60' SECOND) as wEnd from ( select distinct(user

subuquery about flink sql

2018-04-02 Thread ??????
-- Original Message -- From: <453673...@qq.com>; Sent: 2018-04-03 11:23; To: "user"; Cc: "skycrab68"; Subject: subuquery about flink sql Dear all, I have a question about subqueries in Flink SQL. My SQL looks like this: selec

Re: subuquery about flink sql

2018-04-02 Thread 李玥
The exception log says that your table “myFlinkTable” does not contain a column/field named “t”. Something could be wrong with your table registration. It would be helpful to show us your table registration code, like: // register a Table tableEnv.registerTable("table1", ...)