> problems? Have you checked
> whether there are errors reported for EFS, whether there might be
> duplicate mounts of the same EFS, or whether anything else has ever
> deleted the directory?
>
> Best,
> Yun
>
>
> --Original Mail --
> *Sender:*Navneeth
Hi All,
Any suggestions?
Thanks
On Mon, Jan 18, 2021 at 7:38 PM Navneeth Krishnan
wrote:
> Hi All,
>
> We are running our streaming job on flink 1.7.2 and we are noticing the
> below error. Not sure what's causing it, any pointers would help. We have
> 10 TM's checkpoin
Hi All,
We are running our streaming job on flink 1.7.2 and we are noticing the
below error. Not sure what's causing it, any pointers would help. We have
10 TM's checkpointing to AWS EFS.
AsynchronousException{java.lang.Exception: Could not materialize
checkpoint 11 for operator Processor ->
Jan 3, 2021 at 19:09 Navneeth Krishnan
> wrote:
>
>> Hi All,
>>
>> Currently we are using flink in session cluster mode and we manually
>> deploy the jobs i.e. through the web UI. We use AWS ECS for running the
>> docker container with 2 services definitions, on
Hi All,
Currently we are using flink in session cluster mode and we manually deploy
the jobs i.e. through the web UI. We use AWS ECS for running the docker
container with 2 services definitions, one for JM and other for TM. How is
everyone managing the CICD process? Is there a better way to run a
Hello All,
First of all Happy New Year!! Thanks for the excellent community support.
I have a job which requires a 2-second tumbling time window per key. For
each user we wait for 2 seconds to collect enough data and then proceed to
further processing. My question is: should I use the regular DSL
>> [2]
>> https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html
>> [3] https://github.com/redis/jedis
>> [4] https://lettuce.io/
>> [5]
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html
>>
>>
Hi All,
We have a flink streaming job processing around 200k events per second. The
job requires a lot of less frequently changing data (sort of static but
there will be some changes over time, say 5% change once per day or so).
There are about 12 caches with some containing approximately 20k
Hi All,
Any feedback on how this can be resolved? This is causing downtime in
production.
Thanks
On Tue, Oct 20, 2020 at 4:39 PM Navneeth Krishnan
wrote:
> Hi All,
>
> I'm facing an issue in our flink application. This happens in version
> 1.4.0 and 1.7.2. We have both ver
Hi All,
I'm facing an issue in our flink application. This happens in version 1.4.0
and 1.7.2. We have both versions and we are seeing this problem on both. We
are running flink on ECS and checkpointing enabled to EFS. When the
pipeline restarts due to some node failure or any other reason, it
Hi All,
We are currently using flink in production and use keyBy for performing a
CPU intensive computation. There is a cache lookup for a set of keys and
since keyBy cannot guarantee the data is sent to a single node we are
basically replicating the cache on all nodes. This is causing more
Hi All,
We are currently on a very old version of flink 1.4.0 and it has worked
pretty well. But lately we have been facing checkpoint timeout issues. We
would like to minimize any changes to the current pipelines and go ahead
with the migration. With that said our first pick was to migrate to
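One commonly cited mitigation for checkpoint timeouts when moving off 1.4 is incremental RocksDB checkpoints, which upload only changed SST files instead of the full state. A minimal flink-conf.yaml sketch (the S3 path is a placeholder, and exact key availability depends on the target version):

```yaml
# RocksDB keeps state off-heap and supports incremental checkpoints
state.backend: rocksdb
state.backend.incremental: true
# Durable checkpoint storage (placeholder bucket)
state.checkpoints.dir: s3://my-bucket/flink/checkpoints
```

Checkpoint interval, timeout, and minimum pause are typically set in code via `StreamExecutionEnvironment#enableCheckpointing` and `CheckpointConfig`.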
Hi All,
I was looking at the documentation for windows and got a little confused.
As per my understanding, a tumbling window per key will create
non-overlapping windows based on when the data for that key arrived. For
example, consider a tumbling window of 30 seconds:
user1 - 10:01:01
user2 -
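The point of confusion above can be shown without Flink at all: tumbling windows are aligned to the epoch (per window size), not to each key's first event, so the window an element lands in depends only on its timestamp. A minimal plain-Java sketch of that start-time bucketing (for non-negative timestamps and a zero offset):

```java
import java.time.Instant;

public class TumblingWindows {
    // Start of the tumbling window containing `timestampMillis`, for
    // epoch-aligned windows of the given size (offset 0, non-negative ts).
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - (timestampMillis % windowSizeMillis);
    }

    public static void main(String[] args) {
        long size = 30_000; // 30-second tumbling window
        long t1 = Instant.parse("2021-01-01T10:01:01Z").toEpochMilli();
        long t2 = Instant.parse("2021-01-01T10:01:29Z").toEpochMilli();
        long t3 = Instant.parse("2021-01-01T10:01:31Z").toEpochMilli();
        // 10:01:01 and 10:01:29 fall in window [10:01:00, 10:01:30),
        // regardless of which key they belong to; 10:01:31 starts the next.
        System.out.println(windowStart(t1, size) == windowStart(t2, size)); // true
        System.out.println(windowStart(t1, size) == windowStart(t3, size)); // false
    }
}
```

So a key whose first event arrives at 10:01:01 does not get a private window starting at 10:01:01; it joins the epoch-aligned window already in progress.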
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#keyed-state-and-operator-state
>
> On Thu, Apr 23, 2020 at 7:44 AM Navneeth Krishnan <
> reachnavnee...@gmail.com> wrote:
>
>> Hi All,
>>
>> Is there a way for an upstre
Hi All,
Is there a way for an upstream operator to know how the downstream
operator tasks are assigned? Basically I want to group my messages to be
processed on slots in the same node based on some key.
Thanks
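For context on why this is hard: keyed routing in Flink is deterministic per key (key → key group → subtask index), but which node hosts a given subtask is decided by the scheduler and is not exposed upstream. A plain-Java sketch of that two-step routing (Flink actually applies a murmur hash to `hashCode()`; plain `hashCode()` is used here only to keep the sketch self-contained):

```java
public class KeyToSubtask {
    // Step 1: hash the key into one of `maxParallelism` key groups.
    static int keyGroup(Object key, int maxParallelism) {
        return Math.abs(key.hashCode() % maxParallelism);
    }

    // Step 2: map the key group onto one of `parallelism` subtasks.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        return keyGroup(key, maxParallelism) * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128, parallelism = 10;
        // The same key always routes to the same subtask index...
        System.out.println(subtaskFor("user-42", maxParallelism, parallelism)
                == subtaskFor("user-42", maxParallelism, parallelism)); // true
        // ...but the subtask -> node placement is the scheduler's choice,
        // which is why an upstream operator cannot target a specific node.
    }
}
```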
nouncing-ververica-platform-community-edition?utm_campaign=Ververica%20Platform%20-%20Community%20Edition&utm_content=123140986&utm_medium=social&utm_source=twitter&utm_channel=tw-2581958070
> [2]
>
> https://www.ververica.com/blog/how-to-get-started-with-data-artisans-platform-on-aws-eks
>
>
>
> O
Hi All,
I'm very new to EKS and trying to deploy a flink job in cluster mode. Are
there any good documentations on what are the steps to deploy on EKS?
From my understanding, with Flink 1.10, running it on EKS will automatically
scale up and down with the Kubernetes integration based on the load. Is
he.org/projects/flink/flink-docs-stable/ops/filesystems/s3.html
>
> On Sat, Feb 1, 2020, 07:42 Navneeth Krishnan
> wrote:
>
>> Hi Arvid,
>>
>> Thanks for the response.
>>
>> I have both the jars under /opt/flink/plugins but I'm still getting the
>> sa
ojects/flink/flink-docs-master/ops/plugins.html#isolation-and-plugin-structure
>
> On Thu, Jan 30, 2020 at 10:26 AM Navneeth Krishnan <
> reachnavnee...@gmail.com> wrote:
>
>> Hi All,
>>
>> I'm trying to migrate from NFS to S3 for checkpointing and I'm facing few
Hi All,
I'm trying to migrate from NFS to S3 for checkpointing and I'm facing a few
issues. I have Flink running in Docker with the flink-s3-fs-hadoop jar copied
to the plugins folder. Even after adding the jar I'm getting the following
error: Caused by:
/org/apache/flink/table/runtime/operators/aggregate/MiniBatchLocalGroupAggFunction.java#L89
>
>
> --
> *From:* Navneeth Krishnan
> *Sent:* Wednesday, January 8, 2020 15:36
> *To:* Yun Tang
> *Cc:* user
> *Subject:* Re: Using redis cache in flink
>
> Hi Yun,
here, and we cannot ensure there would be a
> performance gain. Actually, I believe the CPU time spent on serialization
> is much less than the time consumed by the network.
>
> Best
> Yun Tang
> --
> *From:* Navneeth Krishnan
> *Sent:* Wednes
Hi All,
I want to use Redis as a near/far cache to store data that is common across
slots, i.e. to share data across slots. This data is required for processing
every single message, and it's better to store it in an in-memory cache
backed by Redis rather than RocksDB, since it has to be serialized for
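The near/far split described above can be sketched in plain Java: a small on-heap LRU "near" cache sits in front of a slower "far" lookup (Redis in the mail; a plain `Function` stands in for a Redis GET here so the sketch stays self-contained):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class NearCache<K, V> {
    private final int maxEntries;
    private final Function<K, V> farLookup; // stands in for a Redis GET
    private final Map<K, V> near;

    public NearCache(int maxEntries, Function<K, V> farLookup) {
        this.maxEntries = maxEntries;
        this.farLookup = farLookup;
        // accessOrder=true turns LinkedHashMap into an LRU map
        this.near = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > NearCache.this.maxEntries;
            }
        };
    }

    public V get(K key) {
        V v = near.get(key);          // near-cache hit updates LRU order
        if (v == null) {
            v = farLookup.apply(key); // miss: go to the far cache
            near.put(key, v);
        }
        return v;
    }

    public static void main(String[] args) {
        int[] farCalls = {0};
        NearCache<String, String> cache =
                new NearCache<>(2, k -> { farCalls[0]++; return "v:" + k; });
        cache.get("a"); cache.get("a"); // one far call, then a near hit
        cache.get("b"); cache.get("c"); // "a" is evicted (capacity 2)
        cache.get("a");                 // far call again after eviction
        System.out.println(farCalls[0]); // 4
    }
}
```

The trade-off is the one the mail raises: the near layer avoids per-message serialization, while the far layer keeps the slots consistent with each other.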
Another concern is that since the 1.4 version is very far behind, all
> maintenance and responses are not as timely as for the recent versions. I
> personally recommend upgrading as soon as possible.
>
> I can ping @Piotr Nowojski and see if it is
> possible to explain the cause of this p
> Have you ever checked whether this problem exists on Flink 1.9?
>
> Best,
> Congxian
>
>
> vino yang 于2020年1月3日周五 下午3:54写道:
>
>> Hi Navneeth,
>>
>> Did you check whether the path contained in the exception really cannot
>> be found?
>>
>> Best,
Hi All,
We are running into checkpoint timeout issue more frequently in production
and we also see the below exception. We are running flink 1.4.0 and the
checkpoints are saved on NFS. Can someone suggest how to overcome this?
java.lang.IllegalStateException: Could not
docs-release-1.9/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint
> [3] https://issues.apache.org/jira/browse/FLINK-10461
> Best,
> Congxian
>
>
> Navneeth Krishnan 于2019年11月8日周五 下午3:38写道:
>
>> Hello All,
>>
>> I have a streaming job running in p
Hello All,
I have a streaming job running in production which is processing over 2
billion events per day and it does some heavy processing on each event. We
have been facing some challenges in managing flink in production like
scaling in and out, restarting the job with savepoint etc. Flink
ether filtering is sufficient.
> In general, you can use timers as you suggested as the windowing itself
> works in a similar way.
>
> Best,
> Andrey
>
> On Thu, Oct 17, 2019 at 11:10 PM Navneeth Krishnan <
> reachnavnee...@gmail.com> wrote:
>
>> Hi All,
>>
Hi All,
I'm currently using a tumbling window of 5 seconds using TumblingTimeWindow,
but due to a change in requirements I no longer have to window every incoming
record. With that said, I'm planning to use a process function to achieve
this selective windowing.
I looked at the example provided in the
tasks of the same
>> operator or tasks of other operators.
>> This is true for every type of state, including broadcast state.
>>
>> Best, Fabian
>>
>>
>> Am Di., 1. Okt. 2019 um 08:22 Uhr schrieb Navneeth Krishnan <
>> reachnavnee...@gmail.com>:
>>
>
ed?
>
> Best,
> Congxian
>
>
> Navneeth Krishnan 于2019年10月1日周二 上午10:15写道:
>
>> Thanks Oytun. The problem with doing that is the same data will have
>> to be stored multiple times, wasting memory. In my case there will be
>> around a million entries, which
> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan <
>> reachnavnee...@gmail.com> wrote:
>>
>>> H
Hi All,
Is it possible to access a broadcast state across the pipeline? For
example, say I have a KeyedBroadcastProcessFunction which adds the incoming
data to state and I have downstream operator where I need the same state as
well, would I be able to just read the broadcast state with a
25, 2019 at 1:16 AM Terry Wang wrote:
> Hi, Navneeth,
>
> I think both are OK.
> IMO, running one container with as many slots as virtual cores may be
> better, since slots can share the Flink framework and thus reduce memory cost.
>
> Best,
> Terry Wang
>
>
>
Hi All,
I’m currently running flink on amazon ecs and I have assigned task slots
based on vcpus per instance. Is it beneficial to run a separate container
with one slot each or one container with number of slots same as virtual
cores?
Thanks
Hi All,
I looked at the RocksDB KV store implementation and I found that
deserialization has to happen for each key lookup. Given a scenario where
the key lookup has to happen for every single message would it still be a
good idea to store it in rocksdb store or would in-memory store/cache be
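The cost being weighed above can be made concrete with a toy sketch: a RocksDB-style store keeps values as bytes and must deserialize on every lookup, while an on-heap map hands back the live object. The "serialization" below is just UTF-8 string encoding, purely to keep the sketch self-contained:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class SerializedStoreSketch {
    static int deserializations = 0;

    // Stand-in for a state backend's value deserializer
    static String deserialize(byte[] bytes) {
        deserializations++;
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, byte[]> serializedStore = new HashMap<>();
        serializedStore.put("user1",
                "profile-data".getBytes(StandardCharsets.UTF_8));

        // A per-message lookup pays the deserialization cost every time
        for (int i = 0; i < 1000; i++) {
            deserialize(serializedStore.get("user1"));
        }
        System.out.println(deserializations); // 1000

        // An on-heap cache avoids that per-lookup cost, at the price of
        // GC pressure and losing RocksDB's spill-to-disk and snapshotting.
    }
}
```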
vinced with is
> switching between event and processing time.
> Write a custom trigger and fire the event-time window if you
> don't see any activity. That's the only way.
>
> On Mon, Jul 29, 2019, 11:07 PM Navneeth Krishnan
> wrote:
>
>> Hi All,
>>
>>
Hi All,
Any suggestions?
Thanks
On Thu, Jul 25, 2019 at 11:45 PM Navneeth Krishnan
wrote:
> Hi All,
>
> I'm working on a very short tumbling window for 1 second per key. What I
> want to achieve is if the event time per key doesn't progress after a
> second I want to e
Hi All,
I'm working on a very short tumbling window for 1 second per key. What I
want to achieve is if the event time per key doesn't progress after a
second I want to evict the window, basically a combination of event time
and processing time. I'm currently achieving it by registering a
Hi All,
Currently I have a keyBy user and I see uneven load distribution since some
of the users would have very high load versus some users having very few
messages. Is there a recommended way to achieve even distribution of
workload? Has someone else encountered this problem and what was the
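One commonly suggested answer to this kind of keyBy() skew is key salting: split each hot key into N subkeys so one heavy user is spread over N parallel subtasks, then merge the N partial results in a second, much cheaper aggregation keyed by the real user. A plain-Java sketch of the idea (the salt count of 4 is an arbitrary illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class KeySalting {
    static final int SALTS = 4;

    // Hot key "user1" becomes one of "user1#0" .. "user1#3"
    static String saltedKey(String user) {
        return user + "#" + ThreadLocalRandom.current().nextInt(SALTS);
    }

    public static void main(String[] args) {
        // First stage: partial aggregates, spread over SALTS subkeys
        Map<String, Integer> partialCounts = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            partialCounts.merge(saltedKey("user1"), 1, Integer::sum);
        }
        System.out.println(partialCounts.size()); // 4 (with overwhelming probability)

        // Second stage: merge partials back per real user
        Map<String, Integer> totals = new HashMap<>();
        partialCounts.forEach((k, v) ->
                totals.merge(k.substring(0, k.indexOf('#')), v, Integer::sum));
        System.out.println(totals.get("user1")); // 10000
    }
}
```

The trade-off is that per-key state (and any cache) is now split across the subkeys, which matters for the cache-replication concern raised in the mail.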
Hi All,
Any pointers on the below checkpoint failure scenario. Appreciate all the
help. Thanks
Thanks
On Sun, Jul 7, 2019 at 9:23 PM Navneeth Krishnan
wrote:
> Hi All,
>
> Occasionally I run into failed checkpoints error where 2 or 3 consecutive
> checkpoints fails after running
Hi All,
Occasionally I run into failed checkpoints error where 2 or 3 consecutive
checkpoints fails after running for a minute and then it recovers. This is
causing delay in processing the incoming data since there is huge amount of
data buffered during the failed checkpoints. I don't see any
Hi All,
Where can I get the videos of latest flink forward talks?
Thanks,
Navneeth
Hi All,
We have some streaming jobs in production and today we manually deploy the
flink jobs in each region/environment. Before we start automating it I just
wanted to check if anyone has already created a CICD script for Jenkins or
other CICD tools to deploy the latest JAR on to running flink
Hi,
Is this feature present in flink 1.5? I have a requirement to connect a
keyed stream and broadcast stream.
https://issues.apache.org/jira/browse/FLINK-3659
Thanks,
Navneeth
Hi All,
I'm having issues with creating side outputs. There are two input sources
(both from kafka) and they are connected and fed into a co-process
function. Inside the co-process, the regular data stream outputs a POJO and
in processElement2 there is a periodic timer which creates the side
Hi,
Is there way to get the kafka timestamp in deserialization schema? All
records are written to kafka with timestamp and I would like to set that
timestamp to every record that is ingested. Thanks.
Hi,
Is there a way for a script to be called whenever a job gets restarted? My
scenario is lets say there are 20 slots and the job runs on all 20 slots.
After a while a task manager goes down and now there are only 14 slots and
I need to readjust the parallelism of my job to ensure the job runs
Thanks Sendoh. Is there a way to advance the watermark even when there are no
incoming events? What exactly does setAutoWatermarkInterval do?
Also, I don't see the watermark displayed in the Flink dashboard.
Will the watermark advance only when there is data from all consumed Kafka
topics and partitions?
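The behaviour behind both questions can be sketched in plain Java. A periodic bounded-out-of-orderness watermark is derived from the maximum event timestamp seen so far minus the lateness bound; `setAutoWatermarkInterval` only controls how often that value is *emitted*, so with no new events the watermark cannot advance (and since an operator's watermark is the minimum over its inputs, one idle Kafka partition holds everything back):

```java
public class WatermarkSketch {
    static final long MAX_OUT_OF_ORDERNESS = 1_000; // 1s lateness bound
    long maxTimestampSeen = Long.MIN_VALUE;

    void onEvent(long eventTimestamp) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestamp);
    }

    // Called periodically (what setAutoWatermarkInterval schedules)
    long currentWatermark() {
        return maxTimestampSeen - MAX_OUT_OF_ORDERNESS;
    }

    public static void main(String[] args) {
        WatermarkSketch wm = new WatermarkSketch();
        wm.onEvent(10_000);
        System.out.println(wm.currentWatermark()); // 9000
        // Periodic emissions with no further events: value is unchanged
        System.out.println(wm.currentWatermark()); // 9000
        wm.onEvent(12_000);
        System.out.println(wm.currentWatermark()); // 11000
    }
}
```

Advancing the watermark without events requires a different strategy (e.g. an idleness-aware or processing-time-based watermark generator), not a shorter emission interval.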
be more than one window per key depending on the
> watermarks.
>
> Hope this helps,
> Fabian
>
> 2018-01-21 6:48 GMT+01:00 Navneeth Krishnan <reachnavnee...@gmail.com>:
>
>> Hi,
>>
>> I'm facing issues with frequent young generation garbage collections i
Thanks Chesnay.
On Tue, Jan 23, 2018 at 6:54 AM, Chesnay Schepler <ches...@apache.org>
wrote:
> I could reproduce this locally and opened a JIRA
> <https://issues.apache.org/jira/browse/FLINK-8496>.
>
>
> On 21.01.2018 04:32, Navneeth Krishnan wrote:
>
> Hi,
>
Hi,
Any suggestions would really help. Thanks.
On Mon, Jan 15, 2018 at 12:07 AM, Navneeth Krishnan <
reachnavnee...@gmail.com> wrote:
> Hi All,
>
> Has anyone tried scaling out flink cluster on EMR based on CPU usage/
> kafka lag/ back pressure monitoring? If so can you pro
Hi,
I am having few issues with event time windowing. Here is my scenario, data
is ingested from a kafka consumer and then keyed by user followed by a
Tumbling event window for 10 seconds. The max lag tolerance limit is 1
second.
I have the BoundedOutOfOrdernessGenerator that extends
Hi,
I'm facing issues with frequent young-generation garbage collections in my
task managers, which happen approximately every few seconds. I have 3 task
managers with 12GB heap allocated on each and I have set the config to use
G1GC. My program ingests binary data from kafka source and the
Hi,
We recently upgraded from flink 1.3 to 1.4 and in the task manager UI it
shows there are 0 memory segments whereas in 1.3 I think it was default
32k. I have even tried adding the below config but still it shows 0.
taskmanager.network.numberOfBuffers: 2048
Regards,
Hi All,
Has anyone tried scaling out flink cluster on EMR based on CPU usage/ kafka
lag/ back pressure monitoring? If so can you provide some insights on how
it could be achieved and sample scripts if possible. Thanks a lot.
Thanks,
Navneeth
Hi,
I have a requirement to initialize few guava caches per jvm and some static
helper classes. I tried few options but nothing worked. Need some help.
Thanks a lot.
1. Operator level static variables:
public static Cache loadingCache;
public void open(Configuration parameters)
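One reason plain operator-level statics are unreliable is that each task (and, depending on deployment, each classloader) may see its own copy, and several open() calls can race to initialize it. The usual workaround is a lazily initialized per-JVM holder that every open() goes through. A plain-Java sketch, with a `ConcurrentHashMap` standing in for the Guava cache mentioned above (and with the caveat that per-job classloading can still produce separate copies across jobs):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedCacheHolder {
    private static volatile Map<String, String> cache;

    // Double-checked locking: the expensive init runs at most once per
    // JVM/classloader, no matter how many operator instances call this.
    static Map<String, String> getOrCreate() {
        Map<String, String> local = cache;
        if (local == null) {
            synchronized (SharedCacheHolder.class) {
                local = cache;
                if (local == null) {
                    local = new ConcurrentHashMap<>(); // expensive init goes here
                    cache = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // Two "operator instances" in the same JVM share one cache instance
        System.out.println(SharedCacheHolder.getOrCreate()
                == SharedCacheHolder.getOrCreate()); // true
    }
}
```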
Hi,
I'm receiving this error and due to which I'm not able to run my job. Any
help is greatly appreciated. Thanks.
On Tue, Dec 12, 2017 at 10:21 AM, Navneeth Krishnan <
reachnavnee...@gmail.com> wrote:
> Hi,
>
> I have a kafka source and sink in my pipeline and when I start
2 PM, Piotr Nowojski <pi...@data-artisans.com>
wrote:
> Hi,
>
> Reporting once per 10 seconds shouldn’t create problems. Best to try it
> out. Let us know if you get into some troubles :)
>
> Piotrek
>
> On 11 Dec 2017, at 18:23, Navneeth Krishnan <reachnavnee.
Hi,
I have a kafka source and sink in my pipeline and when I start my job I get
this error and the job goes to failed state. I checked the kafka node and
everything looks good. Any suggestion on what is happening here? Thanks.
java.lang.Exception: Failed to send data to Kafka: The server
etrics
> 2.
> I don’t think that reporting per node/jvm is possible with Flink’s metric
> system. For that you would need some other solution, like report your
> metrics using JMX (directly register MBeans from your code)
>
> Piotrek
>
> > On 10 Dec 2017, at 18:51, Navnee
Hi,
I have a streaming pipeline running on flink and I need to collect metrics
to identify how my algorithm is performing. The entire pipeline is
multi-tenanted and I also need metrics per tenant. Lets say there would be
around 20 metrics to be captured per tenant. I have the following ideas for
Hi All,
I have developed a streaming pipeline in java and I need to pass some of
the configuration parameters that are passed during program startup to user
functions. I used the below link as reference.
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/best_practices.html
I have
Hi All,
Is there a way to read the yarn application id/ name within flink so that
the logs can be sent to an external logging stack like ELK or CloudWatch
merged by the application?
Thanks,
Navneeth
Hello,
I'm running flink on AWS EMR and I would like to know how I can pass a
custom log4j properties file. I changed the log4j.properties file in flink
conf directory but it doesn't seem like the changes are reflected. Thanks.
I'm using the below command to start my flink job.
> flink run -m
not install it on your own.
> I think you can find it in the advanced options.
>
> On 26. Sep 2017, at 07:14, Navneeth Krishnan <reachnavnee...@gmail.com>
> wrote:
>
> Hello All,
>
> I'm trying to deploy flink on AWS EMR and I'm very new to EMR. I'm running
> into mu
g internal
> APIs and if the other approach works for you I would stay with that.
>
> Best,
> Aljoscha
>
> [1] https://github.com/apache/beam/blob/be9fb29901cf4a1ae7b4a9d8e9f25f
> 4ea78359fd/runners/flink/src/main/java/org/apache/beam/runners/flink/
> FlinkStreamingTransformTra
Hello All,
I'm trying to deploy flink on AWS EMR and I'm very new to EMR. I'm running
into multiple issues and need some help.
*Issue1:*
How did others resolve this multiple bindings issue?
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
ter setup that can help us pin down the problem, would be
> appreciated.
>
> Thanks,
> Kostas
>
> On Sep 13, 2017, at 7:12 PM, Navneeth Krishnan <reachnavnee...@gmail.com>
> wrote:
>
> Hi,
>
> I am sure I have provided the right job manager details because the
> connecti
Hi,
Any suggestions on this could be achieved?
Thanks
On Thu, Sep 7, 2017 at 8:02 AM, Navneeth Krishnan <reachnavnee...@gmail.com>
wrote:
> Hi All,
>
> Any suggestions on this would really help.
>
> Thanks.
>
> On Tue, Sep 5, 2017 at 2:42 PM, Navneeth Krishnan &
Hi,
Any idea on how to solve this issue?
Thanks
On Wed, Sep 13, 2017 at 10:12 AM, Navneeth Krishnan <
reachnavnee...@gmail.com> wrote:
> Hi,
>
> I am sure I have provided the right job manager details because the
> connection timeout ip is the task manager where the stat
Hi,
I am sure I have provided the right job manager details because the
connection timeout ip is the task manager where the state is kept. I guess
the client is able to reach the job manager and figure out where the state
is. Also if I provide a wrong state name, I'm receiving unknown state
Hi All,
Any suggestions would really be helpful. Thanks
On Sun, Sep 10, 2017 at 12:04 AM, Navneeth Krishnan <
reachnavnee...@gmail.com> wrote:
> Hi All,
>
> I'm running a streaming job on flink 1.3.2 with few queryable states.
> There are 3 task managers and a job manager. I
Hi All,
I'm running a streaming job on flink 1.3.2 with few queryable states. There
are 3 task managers and a job manager. I'm getting timeout exception when
trying to query a state and also a warning message in the job manager log.
*Client:*
final Configuration config = new Configuration();
Sorry, my bad. I figured out it was a change done at our end which created
different keys. Thanks.
On Fri, Sep 8, 2017 at 5:32 PM, Navneeth Krishnan <reachnavnee...@gmail.com>
wrote:
> Hi,
>
> I'm experiencing a weird issue where any data put into map state, when
> retrieved
Hi,
I'm experiencing a weird issue where any data put into map state, when
retrieved with the same key, is returned as null, and hence it puts the same
value again and again. I used the RocksDB state backend but tried with the
Memory state backend too, and the issue still exists.
Each time when I set the
Hi All,
Any suggestions on this would really help.
Thanks.
On Tue, Sep 5, 2017 at 2:42 PM, Navneeth Krishnan <reachnavnee...@gmail.com>
wrote:
> Hi All,
>
> I looked into an earlier email about the topic broadcast config through
> connected stream and I couldn't find the
ed
> classes and methods.
>
> Hope this helps,
> Fabian
>
>
> 2017-09-05 19:35 GMT+02:00 Navneeth Krishnan <reachnavnee...@gmail.com>:
>
>> Thanks Gordon for your response. I have around 80 parallel flatmap
>> operator instances and each instance r
Hi,
Is there a reason behind removing the default value option in
MapStateDescriptor? I was using it in the earlier version to initialize a
Guava cache with a loader, etc., and in the new version an empty map is
returned by default.
Thanks
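Without a descriptor-level default, the usual pattern is to treat an absent entry as the default at the point of access. A minimal plain-Java sketch, with a `HashMap` standing in for the map state (which likewise returns nothing for absent keys):

```java
import java.util.HashMap;
import java.util.Map;

public class MapStateDefaults {
    public static void main(String[] args) {
        Map<String, Long> state = new HashMap<>();
        // Read-with-default instead of relying on a pre-initialized map
        long hits = state.getOrDefault("user1", 0L);
        state.put("user1", hits + 1);
        System.out.println(state.get("user1")); // 1
    }
}
```

Any loader-style initialization (as with the Guava cache mentioned above) moves into this access path rather than into the descriptor.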
Hi All,
I looked into an earlier email about the topic broadcast config through
connected stream and I couldn't find the conclusion.
I can't do the below approach since I need the config to be published to
all operator instances but I need keyed state for external querying.
Thanks Gordon for your response. I have around 80 parallel flatmap operator
instances and each instance requires 3 states. Out of which one is user
state in which each operator will have unique user's data and I need this
data to be queryable. The other two states are kind of static states which
the timestamp provided by the timer to see if the
> current key should be evicted.
> Checkout the example on the ProcessFunction page.
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/
> process_function.html
>
> Best regards,
> Kien
>
> On 9/5/
Hi All,
I have a streaming pipeline which is keyed by userid and then to a flatmap
function. I need to clear the state after sometime and I was looking at
process function for it.
Inside the process element function if I register a timer wouldn't it
create a timer for each incoming message?
//
Hi All,
I have couple of questions regarding state maintenance in flink.
- I have a connected stream and then a keyby operator followed by a flatmap
function. I use MapState and keys get added by data from stream1 and
removed by messges from stream2. Stream2 acts as a control stream in my