FileSource example

2024-09-12 Thread Jacob Rollings
Hello, Is there an example for FileSource that continuously monitors NFS directories looking out for files that match pattern specified at runtime? I was searching for documents around it but could not find. Can flink file source monitor on nfs directories without any issues? I was using a custo

Checkpointing

2024-05-08 Thread Jacob Rollings
Hello, I'm curious about how Flink checkpointing would aid in recovering data if the data source is not Kafka but another system. I understand that checkpoint snapshots are taken at regular time intervals. What happens to the data that were read after the previous successful checkpoint if the sys

Async code inside Flink Sink

2024-04-17 Thread Jacob Rollings
Hello, I have a use case where I need to do a cache file deletion after a successful sunk operation(writing to db). My Flink pipeline is built using Java. I am contemplating using Java completableFuture.runasync() to perform the file deletion activity. I am wondering what issues this might cause i

Global connection open and close

2024-03-21 Thread Jacob Rollings
Hello, Is there a way in Flink to instantiate or open connections (to cache/db) at global level, so that it can be reused across many process functions rather than doing it in each operator's open()?Along with opening, also wanted to know if there is a way to close them at job level stop, such tha

Fwd: Flink Checkpoint & Offset Commit

2024-03-07 Thread Jacob Rollings
Hello, I am implementing proof of concepts based Flink realtime streaming solutions. I came across below lines in out-of-the-box Flink Kafka connector documents. *https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/*

Re: [E] Re: Kubernetes operator expose UI rest service as NodePort instead of default clusterIP

2022-09-02 Thread Jeesmon Jacob
I remember testing the operator with the rest service exposed as NodePort. NodePort requires rbac.nodeRoules.create: true (default is false) in values.yaml. Maybe you missed that? https://github.com/apache/flink-kubernetes-operator/blob/release-1.1/helm/flink-kubernetes-operator/values.yaml#L34-L3

Flink Kubernetes Operator v1.0 ETA

2022-06-01 Thread Jeesmon Jacob
Hi there, Is there an ETA on v1.0 release of operator? We are prototyping with a CI build from release-1.0 branch but would like to know the approximate ETA of official 1.0 release so that we can plan accordingly. Thanks, Jeesmon

Re: Python Job Type Support in Flink Kubernetes Operator

2022-05-24 Thread Jeesmon Jacob
Hi Gyula, Any idea on this? We are exploring current limitations of using the operator for Flink deployment and if there is a plan to support Python jobs in future will help us. Thanks, Jeesmon On Fri, May 20, 2022 at 3:46 PM Jeesmon Jacob wrote: > Hi there, > > Is there a plan t

Python Job Type Support in Flink Kubernetes Operator

2022-05-20 Thread Jeesmon Jacob
Hi there, Is there a plan to support Python Job Type in Flink Kubernetes Operator? If yes, any ETA? According to this previous operator overview only Java jobs are supported in operator. This page was recently modified to remove the features table. https://github.com/apache/flink-kubernetes-oper

Exception handling

2021-04-27 Thread Jacob Sevart
but we've recently encountered some setbacks in the game of whack-a-mole with pathological messages, and are hoping to mitigate the losses when these do occur. Jacob

Re: "stepless" sliding windows?

2020-10-22 Thread Jacob Sevart
on every event. The windows need to be of a fixed >>> size, but to have their start and end times update continuously, and I'd >>> like to trigger on every event. Is this a bad idea? I've googled and read >>> the docs extensively and haven't been able to identify built-in >>> functionality or examples that map cleanly to my requirements. >>> >>> OK, I just found DeltaTrigger, which looks promising... Does it make >>> sense to write a WindowAssigner that makes a new Window on every event, >>> allocation rates aside? >>> >>> Thanks! >>> >>> -0xe1a >>> >> -- Jacob Sevart Software Engineer, Safety

Re: Checkpoints for kafka source sometimes get 55 GB size (instead of 2 MB) and flink job fails during restoring from such checkpoint

2020-04-17 Thread Jacob Sevart
e don’t find clear way to > reproduce > > this problem (when the flink job creates “abnormal” checkpoints). > > > > Configuration: > > > > We are using flink 1.8.1 on emr (emr 5.27) > > > > Kafka: confluence kafka 5.4.1 > > > > Flink kafka connector: > > org.apache.flink:flink-connector-kafka_2.11:1.8.1 (it includes > > org.apache.kafka:kafka-clients:2.0.1 dependencies) > > > > Our input kafka topic has 32 partitions and related flink source has > 32 > > parallelism > > > > We use pretty much all default flink kafka concumer setting. We only > > specified: > > > > CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, > > > > ConsumerConfig.GROUP_ID_CONFIG, > > > > CommonClientConfigs.SECURITY_PROTOCOL_CONFIG > > > > Thanks a lot in advance! > > > > Oleg > > > > > -- Jacob Sevart Software Engineer, Safety

Re: Very large _metadata file

2020-03-21 Thread Jacob Sevart
https://github.com/apache/flink/pull/11475 On Sat, Mar 21, 2020 at 10:37 AM Jacob Sevart wrote: > Thanks, will do. > > I only want the time stamp to reset when the job comes up with no state. > Checkpoint recoveries should keep the same value. > > Jacob > > On Sat, Mar 2

Re: Very large _metadata file

2020-03-21 Thread Jacob Sevart
Thanks, will do. I only want the time stamp to reset when the job comes up with no state. Checkpoint recoveries should keep the same value. Jacob On Sat, Mar 21, 2020 at 10:16 AM Till Rohrmann wrote: > Hi Jacob, > > if you could create patch for updating the union state

Re: Very large _metadata file

2020-03-20 Thread Jacob Sevart
ation? Jacob On Tue, Mar 17, 2020 at 2:51 AM Till Rohrmann wrote: > Did I understand you correctly that you use the union state to synchronize > the per partition state across all operators in order to obtain a global > overview? If this is the case, then this will only work in case o

Re: Very large _metadata file

2020-03-16 Thread Jacob Sevart
imes. How would you go about implementing something like that? On Mon, Mar 16, 2020 at 1:54 PM Till Rohrmann wrote: > Hi Jacob, > > I think you are running into some deficiencies of Flink's union state > here. The problem is that for every entry in your list state, Flink stores

Re: Very large _metadata file

2020-03-13 Thread Jacob Sevart
Oh, I should clarify that's 43MB per partition, so with 48 partitions it explains my 2GB. On Fri, Mar 13, 2020 at 7:21 PM Jacob Sevart wrote: > Running *Checkpoints.loadCheckpointMetadata *under a debugger, I found > something: > *subtaskState.managedOperatorState[0].sateNameToPa

Re: Very large _metadata file

2020-03-13 Thread Jacob Sevart
ava.time.Instant). I see a way to end up fewer items in the list, but I'm not sure how the actual size is related to the number of offsets. Can you elaborate on that? Incidentally, 42.5MB is the number I got out of https://issues.apache.org/jira/browse/FLINK-14618. So I think my two problems ar

Re: Very large _metadata file

2020-03-05 Thread Jacob Sevart
t the end. If it is putting state in there, under normal circumstances, does it make sense that it would be interleaved with metadata? I would expect all the metadata to come first, and then state. Jacob Jacob On Thu, Mar 5, 2020 at 10:53 AM Kostas Kloudas wrote: > Hi Jacob, > > As I s

Re: Very large _metadata file

2020-03-04 Thread Jacob Sevart
Kostas and Gordon, Thanks for the suggestions! I'm on RocksDB. We don't have that setting configured so it should be at the default 1024b. This is the full "state.*" section showing in the JobManager UI. [image: Screen Shot 2020-03-04 at 9.56.20 AM.png] Jacob On Wed, Mar 4,

Very large _metadata file

2020-03-03 Thread Jacob Sevart
which look like HDFS paths, which leaves a lot of that file-size unexplained. What else is in there, and how exactly could this be happening? We're running 1.6. Jacob

State key serializer has not been configured in the config.

2016-06-22 Thread Jacob Bay Larsen
ovide some help ? Best regards Jacob private ListState> deltaPositions; @Override public void open(org.apache.flink.configuration.Configuration parameters) throws Exception { // Create state variable ListStateDescriptor> descriptor = new Lis