Hi Daniel,
The discussion for releasing Flink 1.4.3 hasn't been started (until now).
The community is still working on the 1.5.0 release but AFAIK, there are no
blockers for 1.4.3.
Development and release discussions take place on the dev@f.a.o list.
Would you kicking off a discussion there?
Hi Makeyang,
Would you mind opening a JIRA issue [1] for your improvement suggestion?
It would be good to add the Flink version that you are running.
Thanks, Fabian
[1] https://issues.apache.org/jira/projects/FLINK
2018-04-20 6:21 GMT+02:00 makeyang :
> one of my
Hi,
You can run Flink applications locally in your IDE and debug a Flink
program just like a regular Java/Scala application.
Best, Fabian
2018-04-19 0:53 GMT+02:00 Qian Ye :
> Hi
>
> I’m wondering if new debugging methods/tools are urgent for Flink
> development. I know
Hi Teena,
I'd go with approach 2. The performance difference shouldn't be significant
compared to 1. but it is much easier to implement, IMO.
Avoid approach 3. It will be much slower because you need at least one call
to an external data store and more difficult to implement.
Flink's
>
>
> I did mean like “finding series of consecutive events”, as it was
> described in [2].
>
>
>
> Are these features already in Flink and how well they are documented ?
>
>
>
> Can I use Scala or only Java ?
>
>
>
> I would like some example codes,
The over window operates on an unbounded stream of data. Hence it is not
possible to sort the complete stream.
Instead we can sort ranges of the stream. Flink uses watermarks to define
these ranges.
The operator processes the records in timestamp order that are not late,
i.e., have timestamps
most close preceding event in w2.
> If condition meets, I'd like to use the price value and timestamp in that
> event to get one matching event from another raw stream (r2).
>
> CEP sounds to be a good idea. But I need to refer to event in other stream
> (r2) in current pattern condition
Hi Esa,
What do you mean by "individual searches in the Table API"?
There is some work (a pending PR [1]) to integrate the MATCH_RECOGNIZE
clause (SQL 2016) [2] into Flink's SQL which basically adds a SQL syntax
for the CEP library.
Best, Fabian
[1] https://github.com/apache/flink/pull/4502
[2]
retest without the explicit parallelism
> when I get a chance.
>
> Michael
>
>
> On Apr 16, 2018, at 2:05 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> (re-adding user mailing list)
>
> A non-serializable function object should cause the job to fail, but not
> to ignor
Hi Viktor,
Flink does not modify user code.
It distributes the job JAR file to the cluster and serializes the function
objects using Java serialization to ship them to the worker nodes where
they are deserialized.
What type of annotation gets dropped?
Can you show us a small example of the code?
;
> On Mon, Apr 16, 2018 at 3:14 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Fully agree Juho!
>>
>> Do you want to contribute the docs fix?
>> If yes, we should update FLINK-5479 to make sure that the warning is
>> removed once the bug is fixed.
>&g
Hi Bill,
Flink's built-in aggregation functions are implemented against the same
interface as UDAGGs and are applied in parallel.
The performance depends of course on the implementation of the UDAGG. For
example, you should try to keep the size of the accumulator as small as
possible because it
Hi Darshan,
You are right. there's currently no rebalancing operation on the Table API.
I see that this might be a good feature, not sure though how easy it would
be to integrate because we need to pass it through the Calcite optimizer
and rebalancing is not a relational operation.
For now,
Hi Adrian,
Thanks reaching out to the community. I don't think that this is an issue
with Flink's SQL support. SQL queries are translated into regular streaming
(or batch) jobs.
The JM might just be overloaded by too many jobs. Since you are running in
a YARN environment, it might make sense to
ts stalled on an idle partition
> it blocks everything.
>
> Link to current documentation:
> https://ci.apache.org/projects/flink/flink-docs-
> master/dev/connectors/kafka.html#kafka-consumers-and-
> timestamp-extractionwatermark-emission
>
> On Mon, Dec 4, 2017 at 4:29 PM, Fabi
Thanks for starting the discussion Elias.
I see two ways to address this issue.
1) Add an interface that a deserialization schema can implement to register
metrics. Each source would need to check for the interface and call it to
setup metrics.
2) Check for null returns in the source functions
wse/FLINK-9182?jql=
> project%20%3D%20FLINK%20AND%20issuetype%20%3D%
> 20Improvement%20AND%20created%20%3E%3D%20-10m
>
>I'll pull the code ASAP.
>
>
>
>
> MaKeyang
> TIG.JD.COM
> 京东基础架构 <http://tig.jd.com/>
> tig.jd.com
> TIG官网
>
>
> -
.
>
>
> Michael
>
> On Apr 14, 2018, at 6:12 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> Hi,
>
> The number of Taskmanagers is irrelevant für the parallelism of a job or
> operator. The scheduler only cares about the number of slots.
>
> How did you set
Sorry, I forgot to CC the user mailing list in my reply.
2018-04-12 17:27 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
> Hi,
>
> Assuming you are using event time, the right function to generate a row
> time attribute from a window would be "w1.rowtime" instead of &q
Sorry, I forgot to CC the user mailing list in my reply.
2018-04-12 17:26 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
> Hi,
>
> Sorry for the long delay. Many contributors are traveling due to Flink
> Forward.
>
> Your use case should be well supported by Fli
Hi,
SerializationSchema is a public interface that you can implement.
It has a single method to turn an object into a byte array.
I would suggest to implement your own SerializationSchema.
Best, Fabian
2018-04-11 15:56 GMT+02:00 Luigi Sgaglione :
> Hi,
>
> I'm
Hi Ivan,
You can certainly do these things with Flink.
Michael pointed you in a good direction if you want to implement the logic
in the DataStream API / ProcessFunctions.
Flink's SQL support should also be able to handle the use case you
described.
The "ingredients" would be
- a TUMBLE window
Hi Josh,
FileInputFormat stores the currently read FileInputSplit in a protected
variable FileInputFormat.currentSplit.
You can override your FileInputFormat and access the path of the read file
from the FileInputSplit in the method that emits the records from the
format.
Best,
Fabian
t; Thanks for the clarification. I was just trying to understand the intended
> behavior. It would have been nice if Flink tracked state for downstream
> operators by key, but I can do that with a map in the downstream functions.
>
> Michael
>
> Sent from my iPad
>
> On Apr 5, 2018
Amit is correct. keyBy() ensures that all records with the same key are
processed by the same paralllel instance of a function.
This is different from "a parallel instance only sees records of one key".
I had a look at the docs [1].
I agree that "Logically partitions a stream into disjoint
Hi,
Thanks for the feedback!
As Till explained, the problem is that the JM first tries to schedule the
job to the failed TM (which hasn't been detected as failed yet).
The configured three restart attempts are "consumed" by these attempts and
the job fails afterwards.
Best, Fabian
2018-04-05
Window operators drop late events by default. When they receive a late
event, they already computed and emitted a result.
Since there is not good default behavior to hay ndle a late event in this
case, they are simply dropped.
However, Flink offers multiple ways to explicitly handle late events
Hi Josh,
You are right, FLINK-2646 is related to the problem of non-finialized
files. If we could distinguish the cases why close() is called, we could do
a proper clean-up if the job terminated because all data was processed.
Right now, the source and sink interfaces of the DataStream API are
ata. The amount of appended
>> data is quite small so I wanted to see if I can use accumulators and once
>> the job is done I can get all the data and then use the way I want to use.
>>
>> I guess I will try the metrics as well to see how that goes.
>>
>> Thanks
&g
Hi Darshan,
Accumulators are not exposed to UDFs of the Table API / SQL.
What's your use case for these? Would metrics do the job as well?
Best, Fabian
2018-04-04 21:31 GMT+02:00 Darshan Singh :
> Hi,
>
> I would like to use accumulators with table /scalar functions.
Hi Juho,
Thanks for raising this point!
I'll add Chesnay and Till to the thread who contributed to the REST API.
Best, Fabian
2018-04-04 15:02 GMT+02:00 Juho Autio :
> I just learned that Flink savepoints API was refactored to require using
> HTTP POST.
>
> That's fine
Hi,
you can do that with a ProcessFunction [1].
The Context parameter of the ProcessFunction.processElement() method gives
access to the current watermark and the timestamp of the current element.
In case you don't just want to log the late data but send it to a different
DataStream (and sink it
Thanks for reporting this Juho.
I've created FLINK-9130 [1] to address the issue.
Best Fabian
[1] https://issues.apache.org/jira/browse/FLINK-9130
2018-04-04 12:03 GMT+02:00 Juho Autio :
> Thank you, it works!
>
> I would still expect this to be documented.
>
> If I
Hi,
This type of applications are not super well supported by Flink, yet. The
missing feature is on the roadmap and called Side Inputs [1].
There are (at least) two alternatives but both have some drawbacks:
1) Ingest the static data set as regular DataStream, keyBy the static and
the actual
Hi,
The issue might be related to garbage collection pauses during which the TM
JVM cannot communicate with the JM.
The metrics contain a stats for memory consumpion [1] and GC activity [2]
that can help to diagnose the problem.
Best, Fabian
[1]
Hi Pete,
Broadcast variables are a feature of the DataSet API [1], i.e., available
for batch processing.
Broadcast variables are computed based on the complete input (which is
possible because they are only available for bounded data sets and not for
unbounded streams) and shared with all
Hi Chengzhi,
You can access the current watermark from the Context object of a
ProcessFunction [1] and store it in operator state [2].
In case of a restart, the state will be restored with the watermark that
was active when the checkpoint (or savepoint) was taken. Note, this won't
be the last
Hi Maxim,
I think Ken's approach is a good idea. However, you would need to a add a
stateful operator to join the results of the individual queries if that is
needed.
In order to join the results, you would need a unique id on which you can
keyBy() to collect all 20 records that originated from
Thank you Edward and Christophe!
2018-03-29 17:55 GMT+02:00 Edward Alexander Rojas Clavijo <
edward.roja...@gmail.com>:
> Hi all,
>
> I did some tests based on the PR Christophe mentioned above and by making
> a change on the NettyClient to use CanonicalHostName instead of
> HostNameAddress to
Hi James,
The answer to your question depends on your use case.
The AsyncIOFunction approach works if you have a DataStream that you would
like to enrich with data in a Cassandra table but not if you would like to
create a DataStream from a Cassandra table.
The Flink code base contains a
Hi Darshan,
What you observe is the result of what's supposed to be an optimization. By
fusing the two select() calls, we reduce the number of operators in the
resulting plan (one MapFunction less).
This optimization is only applied for ScalarFunctions but not for
TableFunctions.
With a better
Hi Navneeth,
Flink's KafkaConsumer automatically attaches Kafka's ingestion timestamp if
you configure EventTime for an application [1].
Since Flink treats record timestamps as meta data, they are not directly
accessible by most functions. You can implement a ProcessFunction [2] to
access the
Hi,
Yes, I've updated the PR.
It needs a review and should be included in Flink 1.5.
Cheers, Fabian
simone <simone.povosca...@gmail.com> schrieb am Mo., 26. März 2018, 12:01:
> Hi Fabian,
>
> any update on this? Did you fix it?
>
> Best, Simone.
>
> On 22/03/2018
Thanks for the feedback!
Fabian
2018-03-23 8:02 GMT+01:00 James Yu :
> Hi,
>
> When I proceed further to timePrediction exercise (http://training.data-
> artisans.com/exercises/timePrediction.html), I realize that the
> nycTaxiRides.gz's
> format is fine.
> The problem is in
Yes, that would be great!
Thank you, Fabian
2018-03-23 3:06 GMT+01:00 Ashish Pokharel <ashish...@yahoo.com>:
> Fabian, that sounds good. Should I recap some bullets in an email and
> start a new thread then?
>
> Thanks, Ashish
>
>
> On Mar 22, 2018, at 5:16 AM, Fabia
Hi Gregory,
Event-time timestamps are handled a bit differently in Flink's SQL compared
to the DataStream API.
In the DataStream API, timestamps are hidden from the user and implicitly
used for time-based operations such as windows.
In SQL, the query semantics cannot depend on hidden fields.
Hi James,
the exercise does not require to filter on pickup events. It says:
"This is done by counting every five minutes the number of taxi rides that
started and ended in the same area within the last 15 minutes. Arrival and
departure locations should be separately counted."
That is achieved
in restoreState() if you don't
want to do it in the constructor. That method is always called when the
operator is started.
Best,
Fabian
2018-03-21 19:55 GMT+01:00 Ken Krugler <kkrugler_li...@transpac.com>:
> Hi Fabian,
>
> On Mar 20, 2018, at 6:38 AM, Fabian Hueske <fhue..
r a sub-set of job graph that are “healthy”.
>
> Thanks, Ashish
>
>
> On Mar 20, 2018, at 9:53 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> Well, that's not that easy to do, because checkpoints must be coordinated
> and triggered the JobManager.
> Also, the checkpointing
7.
>>>>> val namedStream = dataStream.map((value:String) => {
>>>>>
>>>>>
>>>>> Should i file a bug report ?
>>>>>
>>>>> On Tue, Mar 20, 2018 at 9:30 AM, karim amer <karim.amer...@gmail.com>
>>
Hi,
I don't think the CEP library is that flexible, but I loop in Kostas (CC)
who knows more about it.
I'm not exactly sure what you mean by "manipulate" event-time, but I don't
think that's necessary.
You can implement rules also with state and timers in the ProcessFunction.
The function
Hi Ankit,
The env.java.opts parameter is used for all JVMs started by Flink, i.e., JM
and TM.
Since the JM process is started before the TM, the port is already in use
when you start the TM.
You can use
env.java.opts.taskmanager
to pass parameters only for TM JVMs.
Best, Fabian
2018-03-20
Hi,
That was a bit too early.
I found an issue with my approach. Will come back once I solved that.
Best, Fabian
2018-03-21 23:45 GMT+01:00 Fabian Hueske <fhue...@gmail.com>:
> Hi,
>
> I've opened a pull request [1] that should fix the problem.
> It would be great if yo
e
> problem go away?
>
> If either of that is the case, it must be a mutability bug somewhere in
> either accidental object reuse or accidental serializer sharing.
>
>
> On Tue, Mar 20, 2018 at 3:34 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi Simone an
Hi,
I'm afraid, Flink CEP does not distinguish work days from non-work days.
Of course, you could implement the logic in a DataStream program (probably
using ProcessFunction).
Best, Fabian
2018-03-20 15:44 GMT+01:00 shishal :
> I am using flink CEP , and to match a event
> Hi Fabian,
>
> This simple code reproduces the behavior -> https://github.com/xseris/
> Flink-test-union
>
> Thanks, Simone.
>
> On 19/03/2018 15:44, Fabian Hueske wrote:
>
> Hmmm, I still don't see the problem.
> IMO, the result should be correct for both plans. T
Hi Pedro,
Can you reopen FLINK-7100 and post a comment with your error message and
environment?
Thanks,
Fabian
2018-03-20 14:58 GMT+01:00 PedroMrChaves :
> Hello,
>
> I still have the same issue with Flink Version 1.4.2.
>
> java.lang.IllegalArgumentException: A
Hi,
TBH, I don't have much experience with logging, but you might want to
consider using Side Outputs [1] to route invalid records into a separate
stream.
The stream can then separately handled, be written to files or Kafka or
wherever.
Best,
Fabian
[1]
Hi,
I'm quite sure that is not supported.
You'd have to take a savepoint and restart the application.
Depending on the sink system, you could start the new job before shutting
the old job down.
Best, Fabian
2018-03-20 10:31 GMT+01:00 Rohil Surana :
> Hi,
>
> We have a
Well, that's not that easy to do, because checkpoints must be coordinated
and triggered the JobManager.
Also, the checkpointing mechanism with flowing checkpoint barriers (to
ensure checkpoint consistency) won't work once a task failed because it
cannot continue processing and forward barriers. If
Thanks for reporting back!
2018-03-20 10:42 GMT+01:00 James Yu :
> Just found out that IDE seems auto import wrong class.
> While "org.apache.flink.streaming.api.datastream.DataStream" is required,
> "org.apache.flink.streaming.api.scala.DataStream" was imported.
>
> This is a
Hi,
The BucketingSink closes files once they reached a certain size (BatchSize)
or have not been written to for a certain amount of time
(InactiveBucketThreshold).
While being written to, files are in an in-progress state and only moved to
to completed once being closed. When that happens, other
is, it is
either the Flink runtime code or your functions.
Given that one plan produces good results, it might be the Flink runtime
code.
Coming back to my previous question.
Can you provide a minimal program to reproduce the issue?
Thanks, Fabian
2018-03-19 15:15 GMT+01:00 Fabian Hueske <f
nto...@gmail.com>:
> Are there plans to address all or few of the above apart from the "JM LB
> not possible" which seems understandable ?
>
> On Mon, Mar 19, 2018 at 9:58 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Queryable state is "just"
Ah, thanks for the update!
I'll have a look at that.
2018-03-19 15:13 GMT+01:00 Fabian Hueske <fhue...@gmail.com>:
> HI Simone,
>
> Looking at the plan, I don't see why this should be happening. The pseudo
> code looks fine as well.
> Any chance that you can create a minimal
>
> reuse is not enabled. I attach the plan of the execution.
>
> Thanks,
> Simone
>
> On 19/03/2018 11:36, Fabian Hueske wrote:
>
> Hi,
>
> Union is actually a very simple operator (not even an operator in Flink
> terms). It just merges to inputs. There is no additiona
Hi,
Data is partitioned by key across machines and state is kept per key. It is
not possible to interact with two keys at the same time.
Best, Fabian
2018-03-19 14:47 GMT+01:00 Dhruv Kumar :
> In other words, while using the Flink streaming APIs, is it possible to
> take
ere are certain gaps in it being trully
> considered a Highly Available Key Value store.
>
> Regards.
>
>
>
>
>
> On Mon, Mar 19, 2018 at 5:53 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi Vishal,
>>
>> In general, Queryable State should b
| 2003-12-15 16:31:00.884 | 2003-12-15 16:31:00.884
> | 2003-12-15 16:31:00.884 | 2003-12-15 16:31:00.884 | 2003-12-15
> 16:31:00.884},107151667,12)
> 3> (LogLine{time=2003-12-15 16:31:03.784, line=' | 2003-12-15
> 16:31:03.784},1071516670000,1)
> 3> (LogLine{time=2003-12-15
Hi,
Union is actually a very simple operator (not even an operator in Flink
terms). It just merges to inputs. There is no additional logic involved.
Therefore, it should also not emit records before either of both
ReduceFunctions sorted its data.
Once the data has been sorted for the
plan to adding these features to flink SQL ?
>
> Thanks
> LiYue
> tig.jd.com
>
>
>
> 在 2018年3月14日,上午7:48,Fabian Hueske <fhue...@apache.org> 写道:
>
> Hi,
>
> Chesnay is right.
> SQL and Table API do not support early window results and no allowed
> lateness to upda
Hi Vishal,
In general, Queryable State should be ready to use.
There are a few things to consider though:
- State queries are not synchronized with the application code, i.e., they
can happen at the same time. Therefore, the Flink application should not
modify objects that have been put into or
Hi,
The timestamps of the stream records should be increasing (strict
monotonicity is not required, a bit out of orderness can be handled due to
watermarks).
So, the events should also be generated with increasing timestamps. It
looks like your generator generates random dates. I'd also generate
gt;
>I try modify the flink resource and start a thread to clean the
> expired keyed sate, but it doesn't work well because of concurrency issues.
>
>
>
>
>
>
> Best,
> Deqiang
>
> 2018-03-16 16:03 GMT+08:00 Fabian Hueske <fhue...@gmail.com>:
&
Hmmm, that's a strange behavior that is unexpected (to me).
Flink optimizes the Table API / SQL queries when a Table is converted into
a DataStream (or DataSet) or emitted to a TableSink.
So, given that you convert the result tables in addSink() into a DataStream
and write them to a sink function,
One thing that changed in Flink 1.4 with respect to Hadoop is that Hadoop
is now an optional dependency.
Since Hadoop dependencies are now dynamically loaded, you might use
different versions on the client and the cluster?
Also the order in which classes are loaded changed.
You could try to
Thanks for managing this release Gordon!
Cheers, Fabian
2018-03-15 21:24 GMT+01:00 Tzu-Li (Gordon) Tai :
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.3.3, which is the third bugfix release for the Apache Flink 1.3
> series.
>
>
Hi,
AFAIK, that's not possible.
The only "solution" is to reduce the number of timers. Whether that's
possible or not, depends on the application.
For example, if you use timers to clean up state, you can work with an
upper and lower bound and only register one timer for each (upper - lower)
gt; Yan
> ------
> *From:* Fabian Hueske <fhue...@gmail.com>
> *Sent:* Thursday, March 15, 2018 11:55:12 AM
>
> *To:* Yan Zhou [FDS Science]
> *Cc:* user@flink.apache.org
> *Subject:* Re: flink sql: "slow" performance on Over Window a
it's not the case. the end-to-end latency of application using
> process time is only has 1/3 of the other(50ms vs 150ms). Why was that?
> Please help me to understand.
>
>
> Best
>
> Yan
>
> --
> *From:* Fabian Hueske <fhue...@gmail.c
Hi Felipe,
Just like the ReduceFunction, the WindowFunction is applied in the context
of a single key. So, it will be called for each key and always just see a
single record (the reduced record of the key).
You'd have to add a non-keyed window (allWindow) for your sorting
WindowFunction.
Note
If I understand fine-grained recovery correctly, one would still need to
take checkpoints.
Ashish would like to avoid checkpointing and accept to lose the state of
the failed task.
However, he would like to avoid losing more state than necessary due to
restarting of tasks that did not fail.
fixed amount of lateness
> <https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/event_timestamp_extractors.html#assigners-allowing-a-fixed-amount-of-lateness>
> is used. So even the slowest source task is not that slow.
>
>
> Best
>
> Yan
>
>
>
>
&
Hi,
Flink advances watermarks based on all parallel source tasks. If one of the
source tasks lags behind the others, the event time progresses as
determined by the "slowest" source task.
Hence, records ingested from a faster task might have a higher processing
latency.
Best, Fabian
2018-03-14
Hi,
To be honest, I did not understand your requirements and what you are
looking for.
stream.keyBy("partition").addSink(...) will partition the output on the
"partition" attribute before handing it to the sink.
Hence, all records with the same "partition" value will be handled by the
same
Hi,
Flink supports upgrading of serializers [1] [2] since version 1.3.
You probably need to upgrade to Flink 1.3 before you can use the feature.
Best, Fabian
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/custom_serialization.html
[2]
sink)
>
> I failed to reproduce it without actually used table schemas and SQL
> queries in my production. And at last I wrote my own JDBC sink with
> connection pooling to migrate this problem. Maybe someone familiar with the
> implementation of union operator would figure out what's
Hi,
Chesnay is right.
SQL and Table API do not support early window results and no allowed
lateness to update results with late arriving data.
If you need such features, you should use the DataStream API.
Best, Fabian
2018-03-13 12:10 GMT+01:00 Chesnay Schepler :
> I don't
Hi Bill,
The size of the program depends on the number and complexity SQL queries
that you are submitting.
Each query might be translated into a sequence of multiple operators. Each
operator has a string with generated code that will be compiled on the
worker nodes. The size of the code depends
Hi Gregory,
Your understanding is correct. It is not possible to assign UUID to the
operators generated by the SQL/Table planner.
To be honest, I am not sure whether the use case that you are describing
should be the scope of the "officially" supported use cases of the API.
It would require in
; exhausted for all other pipes. I am very interested in how holding back the
>> busy source does not create a pathological issue where that source is
>> forever held back. Is there a FLIP for it ?
>>
>> On Thu, Mar 8, 2018 at 11:29 AM, Fabian Hueske <fhue...@gmail.com>
Hi Gytis,
Flink does currently not support holding back individual streams, for
example it is not possible to align streams on (offset) event-time.
However, the Flink community is working on a windowed join for the
DataStream API, that only holds the relevant tail of the stream as state.
If your
We hope to pick up FLIP-20 after Flink 1.5.0 has been released.
2018-03-07 22:05 GMT-08:00 Shailesh Jain :
> In addition to making the addition of patterns dynamic, any updates on
> FLIP 20 ?
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-
>
Jayant Ameta <wittyam...@gmail.com>:
> Thanks Fabian.
> Will there be any performance issues if I use NFS as the shared filesystem
> (as compared to HDFS or S3)?
>
> Jayant Ameta
>
> On Mon, Mar 5, 2018 at 10:31 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
&g
filesystem (HDFS, S3, etc). Is my understanding correct?
>
>
> Jayant Ameta
>
> On Mon, Mar 5, 2018 at 9:51 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi,
>> RockDB is an embedded key-value storage that is internally used by Flink.
>> There
Hi,
RockDB is an embedded key-value storage that is internally used by Flink.
There is no need to setup a RocksDB database or service yourself. All of
that is done by Flink.
As a Flink user that uses the RockDB state backend, you won't get in touch
with RocksDB itself.
Besides that, RocksDB is
Hi Alberto,
You can also add another MapState. The key is a timestamps and
the value is the key that you want to discard.
When onTimer() is called, you look up the key in the MapState
and and remove it from the original MapState.
Best, Fabian
2018-03-05 0:48 GMT-08:00
That does not matter.
2018-03-01 13:32 GMT+01:00 Esa Heikkinen <esa.heikki...@student.tut.fi>:
> Hi
>
>
>
> Should the custom source function be written by Java, but no Scala, like
> in that RideCleansing exercise ?
>
>
>
> Best, Esa
>
>
>
>
Are you sure the program is doing anything at all?
Do you call execute() on the StreamExecutionEnvironment?
2018-03-01 15:55 GMT+01:00 Supun Kamburugamuve :
> Thanks Piotrek,
>
> I did it this way on purpose to see how Flink performs. With 128000
> messages it takes an
t sure how that’ll go. If you’ve got an idea for a
> work around, I’d be all ears too.
>
>
>
>
>
> *From: *Philip Doctor <philip.doc...@physiq.com>
> *Date: *Tuesday, February 27, 2018 at 10:02 PM
> *To: *"Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
601 - 700 of 1535 matches
Mail list logo