Hi Ben,
Sorry for the late reply, but I guess that your question was answered
on Stack Overflow, right?
Did that answer solve your problem?
Cheers,
Kostas
On Mon, Dec 21, 2020 at 9:09 AM Ben Beasley wrote:
>
> First off I want to thank the folks in this email list for their help thus
> far.
>
>
Glad I could help!
On Mon, Dec 21, 2020 at 3:42 AM Ben Beasley wrote:
>
> That worked. Thank you, Kostas.
>
>
>
> From: Kostas Kloudas
> Date: Sunday, December 20, 2020 at 7:21 AM
> To: Ben Beasley
> Cc: user@flink.apache.org
> Subject: Re: No execution.target s
Hi Ben,
You can try using StreamExecutionEnvironment
streamExecutionEnvironment =
StreamExecutionEnvironment.getExecutionEnvironment();
instead of directly creating a new one. This will allow it to pick up
the configuration parameters you pass through the command line.
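For example (a minimal, untested sketch; the job logic is just a placeholder):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MyJob {
        public static void main(String[] args) throws Exception {
            // getExecutionEnvironment() returns the environment configured by
            // the CLI (including options like execution.target); a manually
            // constructed environment would ignore those parameters.
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            env.fromElements(1, 2, 3).print();
            env.execute("my-job");
        }
    }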
I hope this helps,
Kostas
On
Hi Lalala,
Even in session mode, the jobgraph is created before the job is
executed. So all the above hold.
Although I am not super familiar with the catalogs, what you want is
that two or more jobs share the same readers of a source. This is not
done automatically in DataStream or DataSet and I
Hi Hector,
The main reasons for deprecating readFileStream() were that:
1) it was only capable of parsing Strings and in a rather limited way
as one could not even specify the encoding
2) it was not fault-tolerant, so your concerns about exactly-once were
not covered
One concern that I can
I am also cc'ing Timo to see if he has anything more to add on this.
Cheers,
Kostas
On Thu, Nov 19, 2020 at 9:41 PM Kostas Kloudas wrote:
>
> Hi,
>
> Thanks for reaching out!
>
> First of all, I would like to point out that an interesting
> alternative to the per-job cluster
Hi,
Thanks for reaching out!
First of all, I would like to point out that an interesting
alternative to the per-job cluster could be running your jobs in
application mode [1].
Given that you want to run arbitrary SQL queries, I do not think you
can "share" across queries the part of the job
Hi Ashwin,
Do you have any filtering or aggregation (or any operation that emits
less data than it receives) in your logic? If yes, you could for
example put it before the rescaling operation so that it gets chained
to your source and you reduce the amount of data you ship through the
network.
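Roughly like this (untested sketch; the source, the filter condition and
the map function are placeholders, not from your job):

    DataStream<Event> events = env.addSource(new MySource());

    events
        .filter(e -> e.isRelevant())    // chained to the source, drops data early
        .rescale()                      // the shuffle now ships only the survivors
        .map(new ExpensiveEnrichment());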
Hi Nikola,
Apart from the potential overhead you mentioned about having one more
operator, I cannot find any other. Also even this one I think is
negligible.
The reason why we recommend attaching the Watermark Generator to the
source is more about semantics rather than efficiency. It seems
Hi Flavio,
Could this https://issues.apache.org/jira/browse/FLINK-20020 help?
Cheers,
Kostas
On Thu, Nov 5, 2020 at 9:39 PM Flavio Pompermaier wrote:
>
> Hi everybody,
> I was trying to use the JobListener in my job but onJobExecuted() on Flink
> 1.11.0 but I can't understand if the job
Could you also post the ticket here @Flavio Pompermaier and we will
have a look before the upcoming release.
Thanks,
Kostas
On Wed, Nov 4, 2020 at 10:58 AM Chesnay Schepler wrote:
>
> Good find, this is an oversight in the CliFrontendParser; no help is
> printed for the run-application action.
>>
>> To be clear, I, personally, don't have a problem with removing it (we
>> have removed other connectors in the past that did not have a migration
>> plan), I just reject the argumentation.
>>
>> On 10/28/2020 12:21 PM, Kostas Kloudas wrote:
>> > No, I do
Hi all,
I will have a look in the whole stack trace in a bit.
@Chesnay Schepler I think that we are setting the correct classloader
during jobgraph creation [1]. Is that what you mean?
Cheers,
Kostas
[1]
On Wed, Oct 28, 2020 at 10:04 AM Chesnay Schepler wrote:
>
> If the conclusion is that we shouldn't remove it if _anyone_ is using
> it, then we cannot remove it because the user ML obviously does not
> reach all users.
>
> On 10/28/2020 9:28 AM, Kostas Kloudas wrote:
> >
> > Seth
> >
> > https://github.com/apache/flink/blob/2ff3b771cbb091e1f43686dd8e176cea6d435501/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L170-L172
> >
> > On Thu, Oct 15, 2020 at 2:57 PM Ko
Thanks Piyush for the message.
After this, I revoke my +1. I agree with the previous opinions that we
cannot drop code that is actively used by users, especially if it is
something as deep in the stack as support for a cluster management
framework.
Cheers,
Kostas
On Fri, Oct 23, 2020 at 4:15 PM
+1 for adding a warning about the removal of Mesos support and I would
also propose to state explicitly in the warning the version that we
are planning to actually remove it (e.g. 1.13 or even 1.14 if we feel
it is too aggressive).
This will help as a reminder to users and devs about the upcoming
to remove it at this point.
>
> On Tue, Oct 13, 2020 at 2:33 PM Kostas Kloudas wrote:
>
> > @Chesnay Schepler Off the top of my head, I cannot find an easy way
> > to migrate from the BucketingSink to the StreamingFileSink. It may be
> > possible but it will require so
versions of the module compatible with 1.12+?
>
> On 10/12/2020 4:30 PM, Kostas Kloudas wrote:
> > Hi all,
> >
> > As the title suggests, this thread is to discuss the removal of the
> > flink-connector-filesystem module which contains (only) the deprecated
> > BucketingSink.
Hi all,
As the title suggests, this thread is to discuss the removal of the
flink-connector-filesystem module which contains (only) the deprecated
BucketingSink. The BucketingSink has been deprecated since Flink 1.9 [1] in
favor of the relatively recently introduced StreamingFileSink.
For the sake of a
Hi Jason,
Your analysis seems correct.
As an alternative, you could:
1) either call readFile multiple times on the
StreamExecutionEnvironment (once for each dir you want to monitor) and
then union the streams, or
2) you could put all the dirs you want to monitor under a common
parent dir and
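To illustrate option 1, a rough (untested) sketch, with the directory
paths and the polling interval as placeholders:

    TextInputFormat fmt1 = new TextInputFormat(new Path("/data/dir1"));
    TextInputFormat fmt2 = new TextInputFormat(new Path("/data/dir2"));

    DataStream<String> s1 = env.readFile(fmt1, "/data/dir1",
            FileProcessingMode.PROCESS_CONTINUOUSLY, 10_000L);
    DataStream<String> s2 = env.readFile(fmt2, "/data/dir2",
            FileProcessingMode.PROCESS_CONTINUOUSLY, 10_000L);

    // one stream with the data of all monitored directories
    DataStream<String> all = s1.union(s2);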
Hi all,
I will have a look.
Kostas
On Mon, Sep 28, 2020 at 3:56 PM Till Rohrmann wrote:
>
> Hi Cristian,
>
> thanks for reporting this issue. It looks indeed like a very critical problem.
>
> The problem seems to be that the ApplicationDispatcherBootstrap class
> produces an exception (that
periodically fetching a
> new version of data from some external storage.
>
> Thanks,
>
> Dongwon
>
> > On Sep 23, 2020, at 4:59 AM, Kostas Kloudas wrote:
> >
> > Hi Dongwon,
>
>
>
>
>
> >
> > If you know the data in advance, you can
Hi Dongwon,
If you know the data in advance, you can always use the Yarn options
in [1] (e.g. the "yarn.ship-directories") to ship the directories with
the data you want only once to each Yarn container (i.e. TM) and then
write a UDF which reads them in the open() method. This will allow the
data
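The open() part could look roughly like this (untested sketch; the file
name, the record types and the parsing are placeholders, and the shipped
directory is assumed to end up in the container's working directory):

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;

    public class EnrichFn extends RichMapFunction<String, String> {
        private transient Map<String, String> lookup;

        @Override
        public void open(Configuration parameters) throws Exception {
            // read the shipped file once, when the task starts
            lookup = new HashMap<>();
            for (String line : Files.readAllLines(Paths.get("data/lookup.csv"))) {
                String[] parts = line.split(",", 2);
                lookup.put(parts[0], parts[1]);
            }
        }

        @Override
        public String map(String key) {
            return lookup.getOrDefault(key, "unknown");
        }
    }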
Thanks a lot for the discussion!
I will open a voting thread shortly!
Kostas
On Mon, Aug 24, 2020 at 9:46 AM Kostas Kloudas wrote:
>
> Hi Guowei,
>
> Thanks for the insightful comment!
>
> I agree that this can be a limitation of the current runtime, but I
> think
available in the BATCH mode in the current
> implementation.
> So maybe we need more checks in the AUTOMATIC execution mode.
>
> Best,
> Guowei
>
>
> On Thu, Aug 20, 2020 at 10:27 PM Kostas Kloudas wrote:
>>
>> Hi all,
>>
>> Thanks for the comments!
>>
>>
timers at the end of
> a job would be interesting, and would help in (at least some of) the cases I
> have in mind. I don't have a better idea.
>
> David
>
> On Mon, Aug 17, 2020 at 8:24 PM Kostas Kloudas wrote:
>>
>> Hi Kurt and David,
>>
>> Thanks a lot for t
"bounded
>> > streaming" to be treated differently. If I've understood it correctly, the
>> > section on scheduling allows me to choose STREAMING scheduling even if I
>> > have bounded sources. I like that approach, because it recognizes that
>> > even
> readFile, readFileStream(...), socketTextStream(...), socketTextStream(...)
> (deprecated in 1.2)
>
> Looking forward to more opinions on the issue.
>
> Best,
>
> Dawid
>
>
> On 17/08/2020 12:49, Kostas Kloudas wrote:
>
Thanks a lot for starting this Dawid,
ame job as in production, except with different sources and
> sinks. While it might be a reasonable default, I'm not convinced that
> switching a processing time streaming job to read from a bounded source
> should always cause it to fail.
>
> David
>
> On Wed, Aug 12, 2020 a
Thanks a lot for starting this Dawid,
Big +1 for the proposed clean-up, and I would also add the deprecated
methods of the StreamExecutionEnvironment like:
enableCheckpointing(long interval, CheckpointingMode mode, boolean force)
enableCheckpointing()
isForceCheckpointing()
Hi Dasraj,
Yes, I would recommend using Public and, if necessary, PublicEvolving
APIs as they provide better guarantees for future maintenance.
Unfortunately there are no docs about which APIs are Public or
PublicEvolving, but you can see the annotations of the classes in the
source code.
I
Hi Narasimha,
I am not sure why the TMs are not shutting down, as Yun said, so I am
cc'ing Till here as he may be able to shed some light.
For the application mode, the page in the documentation that you
pointed is the recommended way to deploy an application in application
mode.
Cheers,
Kostas
Hi all,
As described in FLIP-131 [1], we are aiming at deprecating the DataSet
API in favour of the DataStream API and the Table API. After this work
is done, the user will be able to write a program using the DataStream
API and this will execute efficiently on both bounded and unbounded
data.
Hi Dasraj,
You are right. In your previous email I did not notice that you had
migrated from 1.9.
Since 1.9 the ClusterClient has changed significantly as it is not
annotated as @Public API.
I am not sure how easy it is to use the old logic in your settings.
You could try copying the old code
Hi Dasraj,
Could you please specify where the clusterClient.run() method is and
how it submits a job to a cluster?
Is the clusterClient your custom code?
Any details will help us pin down the problem.
One thing that is worth looking at is the release-notes of 1.11 [1].
There you will find
d Not Found consistently returned a correct result.
> It had never occurred before and I am afraid now I could no longer observe it
> again. I appreciate it does not give too much information so I will come back
> with more info on this thread if it happens again.
>
> -----Original Mess
Hi Alex,
Maybe Seth (cc'ed) may have an opinion on this.
Cheers,
Kostas
On Thu, Jul 23, 2020 at 12:08 PM Александр Сергеенко
wrote:
>
> Hi,
>
> We use so-called "control stream" pattern to deliver settings to the Flink
> job using Apache Kafka topics. The settings are in fact an unlimited
Hi Senthil,
You can see the configuration in the WebUI, or you can get it from the
REST API [1].
In addition, if you enable debug logging, you will have a line
starting with "Effective executor configuration:" in your client logs
(although I am not 100% sure if this will contain all the
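For example, assuming the default REST port:

    curl http://<jobmanager-host>:8081/jobmanager/config

should return the cluster-level configuration as a JSON list of
key/value pairs.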
Hi Lorenzo,
If you want to learn how Flink uses watermarks, it is worth checking [1].
Now in a nutshell, what a watermark will do in a pipeline is that it
may fire timers that you may have registered, or windows that you may
have accumulated.
If you have no time-sensitive operations between the
Hi Basanth,
If I understand your usecase correctly:
1) you want to find all A->B->C->D
2) for all of them you want to know how long it took to complete
3) if one completed within X it is considered ok, if not, it is
considered problematic
4) you want to count each one of them
One way to go is
Hi Tomasz,
Thanks a lot for reporting this issue. If you have verified that the
job is running AND that the REST server is also up and running (e.g.
check the overview page) then I think that this should not be
happening. I am cc'ing Chesnay who may have an additional opinion on
this.
Cheers,
Hi Alexander,
Routing of input data in Flink can be done through keying and this can
guarantee collocation constraints. This means that you can send two
records to the same node by giving them the same key, e.g. the topic
name. Keep in mind that elements with different keys do not
necessarily go
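As a sketch (the record type and the key extractor are placeholders):

    DataStream<Record> records = ...;

    records
        .keyBy(r -> r.getTopic())             // same topic => same parallel instance
        .process(new MyPerTopicFunction());   // hypothetical KeyedProcessFunction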
Hi John,
I think that using different plugins is not going to be an issue,
assuming that the scheme of your FS's do not collide. This is already
the case for S3 within Flink, where we have 2 implementations, one
based on Presto and one based on Hadoop. For the first you can use the
scheme s3p
Hi Alan,
Unfortunately not but the release is expected to come out in the next
couple of weeks, so then it will be available.
Until then, you can either copy the code of the AvroWriterFactory to
your project and use it from there, or download the project from
github, as you said.
Cheers,
Kostas
Hi Alan,
In the upcoming Flink 1.11 release, there will be support for Avro
using the AvroWriterFactory as seen in [1].
Do you think that this can solve your problem?
You can also download the current release-1.11 branch and also test it
out to see if it fits your needs.
Cheers,
Kostas
[1]
I understand. Thanks for looking into it Senthil!
Kostas
On Tue, Jun 9, 2020 at 7:32 PM Senthil Kumar wrote:
>
> OK, will do and report back.
>
> We are on 1.9.1,
>
> 1.10 will take some time
>
> On 6/9/20, 2:06 AM, "Kostas Kloudas" wrote:
>
>
is cancelled, Flink sends an Interrupt signal to the Thread
> running the Source.run method
>
>
>
> For some reason (unknown to me), this does not happen when a Stop command
> is issued.
>
>
>
> We ran into some minor issues because of said behavior.
>
>
>
>
Hi Omkar,
For the first part of the question where you set the "drain" to false
and the state gets drained, this can be an issue on our side. Just to
clarify, no matter what the value of "drain" is, Flink always
takes a savepoint. Drain simply means that we also send MAX_WATERMARK
before
Hi Senthil,
From a quick look at the code, it seems that the cancel() of your
source function should be called, and the thread that it is running on
should be interrupted.
In order to pin down the problem and help us see if this is an actual
bug, could you please:
1) enable debug logging and
What Arvid said is correct.
The only thing I have to add is that "stop" allows also exactly-once sinks
to push out their buffered data to their final destination (e.g.
Filesystem). In other words, it takes into account side-effects, so it
guarantees exactly-once end-to-end, assuming that you are
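For reference, a sketch of the CLI usage (the savepoint path is a placeholder):

    bin/flink stop -p s3://bucket/savepoints <jobId>

while "cancel" does not wait for in-flight data to be flushed to the sinks.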
Hi all,
@Venkata, Do you have many small files being created as Arvid suggested? If
yes, then I tend to agree that S3 is probably not the best sink. Although I
did not get that from your description.
In addition, instead of PrintStream you can have a look at the code of the
SimpleStringEncoder in
Hi Singh,
The only thing to add to what Yang said is that the "execution.target"
configuration option (in the config file) is also used for the same
purpose from the execution environments.
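For example, putting something like

    execution.target: yarn-per-job

in the flink-conf.yaml should have the same effect as specifying the
target on the command line (the value here is just an example; it
depends on your deployment).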
Cheers,
Kostas
On Wed, May 27, 2020 at 4:49 AM Yang Wang wrote:
>
> Hi M Singh,
>
> The Flink CLI picks
Hi all,
I would like to bring the discussion in
https://issues.apache.org/jira/browse/FLINK-17745 to the dev mailing
list, just to hear the opinions of the community.
In a nutshell, in the early days of Flink, users could submit their
jobs as fat-jars that had a specific structure. More
Hi Eyal and Dawid,
@Eyal I think Dawid explained pretty well what is happening and why in
distributed settings, the underlying FS on which the StreamingFileSink
writes has to be durable and accessible to all parallel instances of
the job. Please let us know if you have any further questions.
Roshan Punnoose wrote:
>>
>> Nope just the s3a. I'll keep looking around to see if there is anything else
>> I can see. If you think of anything else to try, let me know.
>>
>> On Thu, Apr 9, 2020, 7:41 AM Kostas Kloudas wrote:
>>>
>>> It should no
exceptions there. Not sure where to look for
>>> s3 exceptions in particular.
>>>
>>> On Thu, Apr 9, 2020, 7:16 AM Kostas Kloudas wrote:
>>>>
>>>> Yes, this is why I reached out for further information.
>>>>
>>>> Incrementin
Hi Roshan,
Your logs refer to a simple run without any failures or re-running
from a savepoint, right?
I am asking because I am trying to reproduce it by running a modified
ParquetStreamingFileSinkITCase [1] and so far I cannot.
The ITCase runs against the local filesystem, and not S3, but I
Hi Eyal,
This is a known issue which is fixed now (see [1]) and will be part of
the next releases.
Cheers,
Kostas
[1] https://issues.apache.org/jira/browse/FLINK-16371
On Tue, Mar 24, 2020 at 11:10 AM Eyal Pe'er wrote:
>
> Hi all,
>
> I am trying to write a sink function that retrieves string
he
>> `fileStateSizeThreshold` argument when constructing the `FsStateBackend`.
>> The purpose of that threshold is to ensure that the backend does not create
>> a large amount of very small files, where potentially the file pointers are
>> actually larger than the state itse
Hey Kostas,
>
> We’re a little bit off from a 1.10 update but I can certainly see if that
> CompressWriterFactory might solve my use case for when we do.
>
> If there is anything I can do to help document that feature, please let me
> know.
>
> Thanks!
>
> Austin
>
>
Hi Jacob,
Could you specify which StateBackend you are using?
The reason I am asking is that, from the documentation in [1]:
"Note that if you use the MemoryStateBackend, metadata and savepoint
state will be stored in the _metadata file. Since it is
self-contained, you may move the file and
tince/flink-streaming-file-sink-compression/tree/unbounded
>
> On Tue, Mar 3, 2020 at 3:50 AM Kostas Kloudas wrote:
>
>> Hi Austin and Rafi,
>>
>> @Rafi Thanks for providing the pointers!
>> Unfortunately there is no progress on the FLIP (or the issue).
>>
>>
Hi Austin,
Dawid is correct in that you need to enable checkpointing for the
StreamingFileSink to work.
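A minimal sketch (the interval, path and encoder are placeholders):

    StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60_000L);  // without this, part files are never finalized

    StreamingFileSink<String> sink = StreamingFileSink
            .forRowFormat(new Path("s3://bucket/out"),
                    new SimpleStringEncoder<String>("UTF-8"))
            .build();

    stream.addSink(sink);  // "stream" is your DataStream<String>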
I hope this solves the problem,
Kostas
On Mon, Feb 24, 2020 at 11:08 AM Dawid Wysakowicz
wrote:
>
> Hi Austin,
>
> If I am not mistaken the StreamingFileSink by default flushes on
Hi Juergen,
I will reply to your questions inline. As a general comment, I would
suggest also having a look at [3] so that you have an idea of some of
the alternatives.
With that said, here come the answers :)
1) We receive files every day, which are exports from some database
tables, containing
Hi Hemant,
Why not use simple connected streams, one containing the
measurements, and the other being the control stream with the
thresholds which are updated from time to time.
Both will be keyed by the device class, to make sure that the
measurements and the thresholds for a specific device
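A rough sketch of what I mean (Measurement, Threshold, Alert and their
fields are placeholders, not from your pipeline):

    DataStream<Alert> alerts = measurements
        .connect(thresholds)
        .keyBy(m -> m.deviceClass, t -> t.deviceClass)
        .flatMap(new RichCoFlatMapFunction<Measurement, Threshold, Alert>() {
            private transient ValueState<Double> limit;

            @Override
            public void open(Configuration conf) {
                limit = getRuntimeContext().getState(
                        new ValueStateDescriptor<>("limit", Double.class));
            }

            @Override
            public void flatMap1(Measurement m, Collector<Alert> out) throws Exception {
                Double l = limit.value();   // latest threshold for this device class
                if (l != null && m.value > l) {
                    out.collect(new Alert(m.deviceClass, m.value));
                }
            }

            @Override
            public void flatMap2(Threshold t, Collector<Alert> out) throws Exception {
                limit.update(t.limit);      // the control stream updates the threshold
            }
        });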
Parallelism for the source function is 1 and for the process function it's currently 2.
>
> Thanks for the response.
>
> —
> Akshay
>
> > On Feb 12, 2020, at 2:07 AM, Kostas Kloudas wrote:
> >
> > Hi Akshay,
> >
> > Could you be more specific on what you are
expressed in DataStream API.
>
> ср, 12 февр. 2020 г. в 13:06, Kostas Kloudas :
>>
>> Hi Oleg,
>>
>> Could you be more specific on what you mean by
>> "for events of last n seconds (time units in general) for every incoming
>> event."?
>>
>
Hi Salva,
Yes, the same applies to the Operator API as the output is not
thread-safe and there is no way of "checkpointing" the "in-flight"
data without explicit handling.
If you want to dig deeper, I would recommend to have a look also at
the source code of the AsyncWaitOperator to see how you
Hi Akshay,
Could you be more specific on what you are trying to achieve with this scheme?
I am asking because if your source is too fast and you want to slow
it down so that it produces data at the same rate as your process
function can consume them, then Flink's backpressure will eventually
Hi Salva and Yun,
Yun is correct that the collector is not thread-safe, so writing
should be guarded.
In addition, such a pattern that issues a request to a 3rd party
multi-threaded library and registers a callback for the future does
not play well with checkpointing. In your case, if a
Hi Apoorv,
I am not so familiar with the internals of RocksDB and how the number
of open files correlates with the number of (keyed) states and the
parallelism you have, but as a starting point you can have a look at
[1] for recommendations on how to tune RocksDB for large state and I
am also
Hi Oleg,
Could you be more specific on what you mean by
"for events of last n seconds (time units in general) for every incoming event"?
Do you mean that you have a stream of parallelism 1 and you want for
each incoming element to have your function fire with input the event
itself and all
Hi Fatima,
I am not super familiar with Parquet, but your issue seems to be
related to [1], which appears to be expected behaviour on the Parquet
side.
The reason for this behaviour seems to be the format of the parquet
files which store only the leaf fields but not the structure of the
groups, so
Hi John,
As you suggested, I would also lean towards increasing the number of
allowed open handles, but
for recommendations on best practices, I am cc'ing Piotr who may be
more familiar with the Kafka consumer.
Cheers,
Kostas
On Tue, Feb 11, 2020 at 9:43 PM John Smith wrote:
>
> Just wondering
[1] https://issues.apache.org/jira/browse/FLINK-13027
[2] https://issues.apache.org/jira/browse/FLINK-15476
On Mon, Feb 3, 2020 at 8:14 PM Kostas Kloudas wrote:
> Hi Mark,
>
> Currently no, but if rolling on every checkpoint is ok with you, in future
> versions it is easy to allow to roll on every checkpoi
Kostas
On Mon, Feb 3, 2020 at 4:11 PM Mark Harris wrote:
> Hi Kostas,
>
> Sorry, stupid question: How do I set that for a StreamingFileSink?
>
> Best regards,
>
> Mark
> ----------
> *From:* Kostas Kloudas
> *Sent:* 03 February 2020 14:58
> *To:* Ma
Hi Mark,
Have you tried to set your rolling policy to close inactive part files
after some time [1]?
If the part files in the buckets are inactive and there are no new part
files, then the state handle for those buckets will also be removed.
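For reference, on a row-format sink this looks roughly like the
following (the intervals are placeholders, and depending on your Flink
version the builder entry point may be DefaultRollingPolicy.create()
instead of builder()):

    StreamingFileSink<String> sink = StreamingFileSink
            .forRowFormat(new Path("s3://bucket/out"),
                    new SimpleStringEncoder<String>("UTF-8"))
            .withRollingPolicy(
                    DefaultRollingPolicy.builder()
                            .withRolloverInterval(TimeUnit.MINUTES.toMillis(15))
                            .withInactivityInterval(TimeUnit.MINUTES.toMillis(5))
                            .withMaxPartSize(128 * 1024 * 1024)
                            .build())
            .build();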
Cheers,
Kostas
> Thanks,
> Pawel
>
>
> On Fri, 24 Jan 2020 at 20:45, Kostas Kloudas wrote:
>>
>> Hi Pawel,
>>
>> You are correct that counters are unique within the same bucket but
>> NOT across buckets. Across buckets, you may see the same counter being
>> used.
>
Hi Pawel,
You are correct that counters are unique within the same bucket but
NOT across buckets. Across buckets, you may see the same counter being
used.
The max counter is used only upon restoring from a failure, resuming
from a savepoint or rescaling and this is done to guarantee that n
Oops, sorry for not sending the reply to everyone
and thanks David for reposting it here.
Great to hear that you solved your issue!
Kostas
On Wed, Jan 15, 2020 at 1:57 PM David Magalhães wrote:
>
> Sorry, I've only saw the replies today.
>
> Regarding my previous email,
>
>> Still, there is
lose" from
> "success finish close" in StreamingFileSink?
>
> Best,
> Jingsong Lee
>
> On Mon, Dec 9, 2019 at 10:32 PM Kostas Kloudas wrote:
>>
>> Hi Li,
>>
>> This is the expected behavior. All the "exactly-once" sinks in
Hi Krzysztof,
If I get it correctly, your main reason for not using side outputs
is that a "side output", by its name, seems to be a
"second-class citizen" compared to the main output.
I see your point but in terms of functionality, there is no difference
between the different
Hi Kristoff,
The recommended alternative is to use SideOutputs as described in [1].
Could you elaborate why you think side outputs are not a good choice
for your use case?
Cheers,
Kostas
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html
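For quick reference, the pattern from [1] looks roughly like this (the
tag name and the types are placeholders):

    final OutputTag<String> sideTag = new OutputTag<String>("side") {};

    SingleOutputStreamOperator<Integer> main = input
        .process(new ProcessFunction<Integer, Integer>() {
            @Override
            public void processElement(Integer value, Context ctx,
                    Collector<Integer> out) throws Exception {
                out.collect(value);                     // main output
                ctx.output(sideTag, "seen: " + value);  // side output
            }
        });

    DataStream<String> side = main.getSideOutput(sideTag);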
On Thu, Dec 19, 2019
Thanks a lot for reporting this!
I believe that this can be really useful for the community!
Cheers,
Kostas
On Tue, Dec 17, 2019 at 1:29 PM spoganshev wrote:
>
> In case you experience an exception similar to the following:
>
>
Hi all,
With the feature-freeze for the release-1.10 already past us, it is
time to focus a little bit on documenting the new features that the
community added to this release, and improving the already existing
documentation based on questions that we see in Flink's mailing lists.
To this end,
minutes. You see that the preceding commits follow this
> pattern of one commit per checkpoint interval, which is what we expect. It's
> very strange that two files for the same TopicPartition (same TaskManager)
> are committed.
>
>
> I am eager to hear your reply and understand what
Hi Pankaj,
When you start a session cluster with the bin/yarn-session.sh script,
Flink will create the cluster and then write a "Yarn Properties file"
named ".yarn-properties-YOUR_USER_NAME" in the directory:
either the one specified by the option "yarn.properties-file.location"
in the
Hi Harrison,
One thing to keep in mind is that Flink will only write files if there
is data to write. If, for example, your partition is not active for a
period of time, then no files will be written.
Could this explain the fact that dt 2019-11-24T01 and 2019-11-24T02
are entirely skipped?
In
As a side note, I am assuming that you are using the same Flink Job
before and after the savepoint and the same Flink version.
Am I correct?
Cheers,
Kostas
On Mon, Nov 25, 2019 at 2:40 PM Kostas Kloudas wrote:
>
> Hi Singh,
>
> This behaviour is strange.
> One thing I can r
Hi Singh,
This behaviour is strange.
One thing I can recommend to see if the two jobs are identical is to
also launch the second job without a savepoint,
just start from scratch, and simply look at the web interface to see
if everything is there.
Also could you please provide some code from your
Hi Vinay,
You are correct when saying that the bulk formats only support
the OnCheckpointRollingPolicy.
The reason for this has to do with the fact that currently Flink
relies on the Hadoop writer for Parquet.
Bulk formats keep important details about how they write the actual
data (such as
Hi Amran,
If you want to know from which partition your input data come from,
you can always have a separate bucket for each partition.
As described in [1], you can extract the offset/partition/topic
information for an incoming record and based on this, decide the
appropriate bucket to put the
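A sketch of such a bucket assigner (TopicAwareRecord is a placeholder
type that carries its source topic; in practice you would populate it
e.g. with a KafkaDeserializationSchema as described in [1]):

    public class TopicBucketAssigner
            implements BucketAssigner<TopicAwareRecord, String> {

        @Override
        public String getBucketId(TopicAwareRecord element, Context context) {
            return element.topic;   // one bucket (sub-directory) per topic
        }

        @Override
        public SimpleVersionedSerializer<String> getSerializer() {
            return SimpleVersionedStringSerializer.INSTANCE;
        }
    }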
Hi all,
Big +1 for contributing Stateful Functions to Flink and as for the
main question at hand, I would vote for putting it in the main
repository.
I understand that this can couple the release cadence of Flink and
Stateful Functions although I think the pros of having a "you break
it,
you fix
Hi Anton,
First of all, there is this PR
https://github.com/apache/flink/pull/9581 that may be interesting to
you.
Second, I think you have to keep in mind that the hourly bucket
reporting will be per-subtask. So if you have parallelism of 4, each
of the 4 tasks will report individually that
Hi Debasish,
So far I am not aware of any concrete timeline for Flink 1.9.1 but
I think that Gordon and Kurt (cc'ed) who were the release-1.9
managers are the best to answer this question.
Cheers,
Kostas
On Mon, Sep 9, 2019 at 9:38 AM Debasish Ghosh wrote:
>
> Hello -
>
> Is there a plan for a
Hi Sidhartha,
Your explanation is correct.
If you stopped the job with a savepoint and then you try to restore
from that savepoint, then Flink will try to restore its state
which is, of course, included in its old bucket.
But new data will go to the new bucket.
One solution is either to restart
Hi Gyula,
Thanks for looking into it.
I did not dig into it but in the trace you posted there is the line:
Failed to TRUNCATE_FILE ... for
**DFSClient_NONMAPREDUCE_-1189574442_56** on 172.31.114.177 because
**DFSClient_NONMAPREDUCE_-1189574442_56 is already the current lease
holder**.
The
Congratulations Andrey!
Well deserved!
Kostas
On Wed, Aug 14, 2019 at 4:04 PM Yun Tang wrote:
>
> Congratulations Andrey.
>
> Best
> Yun Tang
>
> From: Xintong Song
> Sent: Wednesday, August 14, 2019 21:40
> To: Oytun Tez
> Cc: Zili Chen ; Till Rohrmann ;
>
Congratulations Rong!
On Thu, Jul 11, 2019 at 4:40 PM Jark Wu wrote:
> Congratulations Rong Rong!
> Welcome on board!
>
> On Thu, 11 Jul 2019 at 22:25, Fabian Hueske wrote:
>
>> Hi everyone,
>>
>> I'm very happy to announce that Rong Rong accepted the offer of the Flink
>> PMC to become a