Re: Can't access Debezium metadata fields in Kafka table

2021-09-23 Thread Harshvardhan Shinde
Hi, Here's the complete error log: [ERROR] Could not execute SQL statement. Reason: org.apache.flink.table.api.ValidationException: Invalid metadata key 'value.ingestion-timestamp' in column 'origin_ts' of table 'flink_hive.harsh_test.testflink'. The DynamicTableSource class 'org.apache.flink.stre

Re: stream processing savepoints and watermarks question

2021-09-23 Thread JING ZHANG
Hi Macro, Do you specified drain flag when stop a job with a savepoint? If the --drain flag is specified, then a MAX_WATERMARK will be emitted before the last checkpoint barrier. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-fi

stream processing savepoints and watermarks question

2021-09-23 Thread Marco Villalobos
Something strange happened today. When we tried to shutdown a job with a savepoint, the watermarks became equal to 2^63 - 1. This caused timers to fire indefinitely and crash downstream systems with overloaded untrue data. We are using event time processing with Kafka as our source. It seems imp

Re: pyflink keyed stream checkpoint error

2021-09-23 Thread Curt Buechter
Guess my last reply didn't go through, so here goes again... Possibly, but I don't think so. Since I submitted this, I have done some more testing. It works fine with file system or memory state backends, but not with rocksdb. I will try again and check the logs, though. I've also tested rocksdb c

Re: pyflink keyed stream checkpoint error

2021-09-23 Thread Dian Fu
PS: there are more information about this configuration in https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/python/python_config/#python-fn-execution-bundle-size > 2021年9月24日 上午10:07,Dian Fu 写道: > > I agree with Roman that it seems that the Python process has crashed. > > Be

Re: pyflink keyed stream checkpoint error

2021-09-23 Thread Dian Fu
I agree with Roman that it seems that the Python process has crashed. Besides the suggestions from Roman, I guess you could also try to configure the bundle size to smaller value via “python.fn-execution.bundle.size”. Regards, Dian > 2021年9月24日 上午3:48,Roman Khachatryan 写道: > > Hi, > > Is it

byte array as keys in Flink

2021-09-23 Thread Dan Hill
*Context* I want to perform joins based on UUIDs. String version is less efficient so I figured I should use the byte[] version. I did a shallow dive into the Flink code I'm not sure it's safe to use byte[] as a key (since it uses object equals/hashcode). *Request* How do other Flink devs do for

Exact S3 Permissions to allow a flink job to use s3 for checkpointing

2021-09-23 Thread Thomas Wang
Hi, I'm trying to figure out what exact s3 permissions does a flink job need to work appropriately when using s3 for checkpointing. Currently, I have the following IAM Policy, but it seems insufficient. Can anyone help me figure this out? Thanks. { Action = [ "s3:PutObject", "s3:GetObject", ] Eff

Re: Resource leak would happen if exception thrown when flink redisson

2021-09-23 Thread Roman Khachatryan
I'd suggest to check that shutdown() in close() always completes: @Override public void close() { this.redisson.shutdown(); log.info(String.format("Shut down redisson instance in close method, RedissonRxClient shutdown is %s", redisson.isShutdown())); } maybe by logging on open and then com

Re: pyflink keyed stream checkpoint error

2021-09-23 Thread Roman Khachatryan
Hi, Is it possible that the python process crashed or hung up? (probably performing a snapshot) Could you validate this by checking the OS logs for OOM killer messages or process status? Regards, Roman On Wed, Sep 22, 2021 at 6:30 PM Curt Buechter wrote: > > Hi, > I'm getting an error after ena

Re: Can't access Debezium metadata fields in Kafka table

2021-09-23 Thread Roman Khachatryan
Hi, could you please share the full error message? I think it should list the supported metadata columns. Do you see the same error with 'debezium-json' format instead of 'debezium-avro-confluent' ? Regards, Roman On Wed, Sep 22, 2021 at 5:12 PM Harshvardhan Shinde wrote: > > Hi, > I'm trying

Re: Relation between Flink Configuration and TableEnv

2021-09-23 Thread Paul Lam
Sorry, I mean the relation between Flink Configuration and TableConfig, not TableEnv. Best, Paul Lam Paul Lam 于2021年9月24日周五 上午12:24写道: > Hi all, > > Currently, Flink creates a new Configuration in TableConfig of > StreamTableEnvironment, and synchronizes options in it back to the > Configuratio

Relation between Flink Configuration and TableEnv

2021-09-23 Thread Paul Lam
Hi all, Currently, Flink creates a new Configuration in TableConfig of StreamTableEnvironment, and synchronizes options in it back to the Configuration of the underlying StreamExecutionEnvironment afterward. However, only "relevant" options are set back [1], others are dropped silently. That block

Re: flink rest endpoint creation failure

2021-09-23 Thread Curt Buechter
Thanks Robert, But, no, the rest.bind-port is not set to 35485 in the configuration. Other jobs use different ports, so it is getting set dynamically. #== # Rest & web frontend #

RE: hdfs lease issues on flink retry

2021-09-23 Thread Shah, Siddharth
Hi David/Matthias, Thank you for your suggestion, it seems to be working fine. Had a quick question – would these _temporary directories created by DataSink task on retry require clean up or flink internally would take care of clean up part? From: David Morávek Sent: Monday, September 20, 2021

Re: DataStreamAPI and Stateful functions

2021-09-23 Thread Igal Shilman
No worries! Glad everything worked out! Cheers, Igal On Thu, Sep 23, 2021 at 2:42 PM Barry Higgins wrote: > Hi Igal, > Apologies you are correct. I had my wires crossed. I had been trying to > get everything working through my local ide before I deployed to our > ververica cluster. > I was only

Re: DataStreamAPI and Stateful functions

2021-09-23 Thread Barry Higgins
Hi Igal, Apologies you are correct. I had my wires crossed. I had been trying to get everything working through my local ide before I deployed to our ververica cluster. I was only able to get the code running through IntelliJ by following the steps below. Once I reverted the hack and changed the

Re: DataStreamAPI and Stateful functions

2021-09-23 Thread Igal Shilman
Hi Barry! Glad to hear that it works for you! I just didn't understand: a) what is "flink.yaml" perhaps you are referring to "flink-conf.yaml"? b) why is it bundled with the distribution jar? I couldn't find it there (nor it should be there) I've looked manually by: jar tf statefun-flink-distribut

Re: DataStreamAPI and Stateful functions

2021-09-23 Thread Barry Higgins
Hi Igal, I just wanted to give you an update on my deployment of stateful functions to an existing Flink cluster. The good news is that it works now when I submit my config with the statefun-flink-distribution. Thank you very much for your help. There was one gotcha and that was down to the requi