Hi,
Observations on Watermarks:
Read this great article:
https://data-artisans.com/blog/watermarks-in-apache-flink-made-easy
* A watermark means, for any event timestamp, when to stop waiting for the
arrival of earlier events.
* A watermark t means all events with timestamp < t have already arrived (a
minimal sketch follows below).
* When t
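To make this concrete, here is a minimal sketch using Flink's
BoundedOutOfOrdernessTimestampExtractor. Treating each event as a bare
epoch-millisecond Long and picking a 5-second bound are my own placeholder
choices, not something from the article:

import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

// Events are their own timestamps (epoch millis) to keep the sketch small.
public class FiveSecondWatermarks
        extends BoundedOutOfOrdernessTimestampExtractor<Long> {

    public FiveSecondWatermarks() {
        // Emitted watermark = (max timestamp seen) - 5s, i.e. "stop waiting
        // for events earlier than t once the stream is 5s past t".
        super(Time.seconds(5));
    }

    @Override
    public long extractTimestamp(Long element) {
        return element;
    }
}

Attach it with stream.assignTimestampsAndWatermarks(new FiveSecondWatermarks()).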
I have 2 shards in the Kinesis stream; I need to figure out how to check
from the logs whether records are being written to both shards.
Not sure if this is what you are looking for in terms of the number of shards
read; it seems like 1 from the logs below:
DEBUG org.apache.http.wire
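If the wire logs are inconclusive, one hedged alternative is to inspect the
shards directly with the AWS SDK for Java (v1); the stream name "my-stream"
and the limit of 10 below are placeholders:

import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.GetRecordsRequest;
import com.amazonaws.services.kinesis.model.GetRecordsResult;
import com.amazonaws.services.kinesis.model.Shard;

public class ShardCheck {
    public static void main(String[] args) {
        AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();
        // DescribeStream returns shards one page at a time; with only 2
        // shards, the first page is enough.
        for (Shard shard : kinesis.describeStream("my-stream")
                .getStreamDescription().getShards()) {
            String iterator = kinesis.getShardIterator(
                    "my-stream", shard.getShardId(), "TRIM_HORIZON")
                    .getShardIterator();
            GetRecordsResult result = kinesis.getRecords(
                    new GetRecordsRequest().withShardIterator(iterator).withLimit(10));
            System.out.println(shard.getShardId() + ": "
                    + result.getRecords().size() + " record(s) at TRIM_HORIZON");
        }
    }
}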
Hi Steffen,
Thanks for reporting this.
Internally, Flink does not keep any open connections to S3. It only buffers
data internally until it reaches a min-size limit (by default 5 MB) and then
uploads it as a part of an MPU in one go. Given this, I will have to dig a bit deeper
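For context, this buffering happens inside the S3 writer used by the
StreamingFileSink; here is a minimal sketch of such a sink. The bucket path is
a placeholder, and the 5 MB part size is the default mentioned above, not
something configured here:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class S3SinkSketch {
    public static StreamingFileSink<String> buildSink() {
        // Each in-progress part is buffered locally and uploaded as one
        // multipart-upload (MPU) part once it reaches the min part size.
        return StreamingFileSink
                .forRowFormat(new Path("s3://my-bucket/out"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();
    }
}

Attach it with stream.addSink(S3SinkSketch.buildSink()).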
Dear Flink Community,
Is there a way of troubleshooting the timer service? The docs say that
the service might degrade job performance significantly. Is there a way
to expose and see timer service metrics, i.e. the length of the priority
queue, how many times the service fires, etc.?
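I don't know of built-in timer metrics, but one hedged workaround is to count
registrations and firings yourself with custom metrics; the metric names and
the one-minute timer below are my own placeholders:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Counts timer registrations and firings; apply this on a keyed stream,
// since event-time timers require keyed state.
public class TimerCountingFunction extends ProcessFunction<Long, Long> {

    private transient Counter timersRegistered;
    private transient Counter timersFired;

    @Override
    public void open(Configuration parameters) {
        timersRegistered =
                getRuntimeContext().getMetricGroup().counter("timersRegistered");
        timersFired =
                getRuntimeContext().getMetricGroup().counter("timersFired");
    }

    @Override
    public void processElement(Long value, Context ctx, Collector<Long> out) {
        // Placeholder policy: fire one minute after each element's timestamp.
        ctx.timerService().registerEventTimeTimer(value + 60_000L);
        timersRegistered.inc();
        out.collect(value);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Long> out) {
        timersFired.inc();
    }
}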
Hi,
looks like a very useful extension to Flink. Thanks for letting us know!
You can also use the community@flink.apache.org mailing list to spread
the news, because the user@ list is more for user support questions and help.
Regards,
Timo
On 14.12.18 at 09:23, GezimSejdiu wrote:
Dear all,
Thanks, works great! This should be very useful for real-time
dashboards that want to compute in event time, especially for
multi-tenant systems or other specialized Kafka topics that can have
gaps in the traffic.
- Original Message -
From: "Aljoscha Krettek"
To: "William Saar"
Cc:
Sent: Fr
Sure, let me try with more debug logs and get back to you
Regards
Bhaskar
On Fri, Dec 14, 2018 at 4:41 PM Tzu-Li (Gordon) Tai wrote:
Hi,
(Removed dev@ from the mail thread)
I took a look at the logs you provided, and it seems like the sink operators
were properly torn down, thereby closing the RestHighLevelClient used
internally.
At this point I'm not really sure what else could have caused this besides a
Hi Chang,
I think there is a JIRA [1] aimed at hardening this case.
In fact, Flink creates this directory on startup, and since there were no
other warnings, we can assume that it was created. So it might have been
deleted by some cleanup process (by Flink or by the fs).
Best,
tison.
[1] https://issues.apache.org
Hi William,
there is currently no official way of doing this, but the Flink community will
be working on it as part of an upcoming source refactoring.
For now, there is this watermark extractor that I once wrote:
https://github.com/aljoscha/flink/commit/6e4419e550caa0e5b162bc0d2ccc43f6b0b3860f
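For reference (and not claiming this is what the commit above does), here is a
minimal sketch of the general idea, assuming events are bare epoch-millisecond
Long timestamps and picking an arbitrary 1-second lag:

import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

// Watermarks advance with processing time when no events arrive, so
// event-time windows can still fire on an idle source partition.
public class ProcessingTimeBackedWatermarks
        implements AssignerWithPeriodicWatermarks<Long> {

    private static final long LAG_MS = 1_000L; // assumed lag, tune as needed
    private long maxTimestamp = Long.MIN_VALUE + LAG_MS;

    @Override
    public long extractTimestamp(Long element, long previousElementTimestamp) {
        maxTimestamp = Math.max(maxTimestamp, element);
        return element;
    }

    @Override
    public Watermark getCurrentWatermark() {
        // Take whichever is further along: event time minus the lag, or
        // wall-clock time minus the lag.
        long processingTimeBound = System.currentTimeMillis() - LAG_MS;
        return new Watermark(Math.max(maxTimestamp - LAG_MS, processingTimeBound));
    }
}

Periodic assigners emit at the auto-watermark interval, so make sure
env.getConfig().setAutoWatermarkInterval(...) is set to something reasonable.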
My question is: whatever the Flink user is doing, as long as they do all
actions through the Flink-provided ways (the Flink CLI or Flink APIs in code),
they should not be able to touch this directory, right?
Because this directory is for the JobManager and managed by Flink.
Best regards/祝好,
Hi Chesnay,
What do you mean by "...we can make a small adjustment to the code…"? Do you
mean that I, as a Flink application developer, can do this in my own code, or
does it have to be a code change in Flink itself?
And more importantly, I would like to pinpoint the root cause of this because
I canno
Are there any standardized components to generate watermarks based on
processing time in an event-time stream when there is no data from a source?
The docs for event time [1] indicate that people are doing this, but
the only suggestion on Stack Overflow [2] is to make every window
operator in the stream have a
Hi,
I suspect that this is the issue:
https://issues.apache.org/jira/browse/FLINK-11164.
One more thing to clarify to be sure of this:
do you have multiple shards in the Kinesis stream, and if so, are some of them
actually empty?
Meaning that, even though you mentioned some records were w
Dear all,
The Smart Data Analytics group (http://sda.tech) is happy to announce SANSA
0.5, the fifth release of the Scalable Semantic Analytics Stack. SANSA
employs distributed computing via Apache Spark and Apache Flink in order to
allow scalable machine learning, inference, and querying capabilities.