I think what you're looking for is a side output. Change the logic to a
ProcessFunction: records that evaluate to true go to the collector, and false
ones go to a side output. That gives you two streams.
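Roughly like this (an untested sketch, assuming input is a DataStream<String>
and isValid() is a hypothetical predicate of yours):

```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Anonymous subclass so Flink can capture the tag's generic type.
final OutputTag<String> falseTag = new OutputTag<String>("false-records") {};

SingleOutputStreamOperator<String> trueStream = input
        .process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String value, Context ctx, Collector<String> out)
                    throws Exception {
                if (isValid(value)) {            // hypothetical predicate
                    out.collect(value);          // main output: the "true" stream
                } else {
                    ctx.output(falseTag, value); // side output: the "false" stream
                }
            }
        });

DataStream<String> falseStream = trueStream.getSideOutput(falseTag);
```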
On Mon, May 10, 2021, 8:14 PM Nikola Hrusov wrote:
> Hi Arvid,
>
> In my case it's the latter, thus I have also th
tagSet.addAll(tags);
>
> }
>
>
>
> for (String tag : tagSet) {
>
> outputMap.putIfAbsent(tag, new OutputTag(tag) {});
>
> ctx.output(outputMap.get(tag), value);
>
> }
>
> }
>
> }
>
>
>
> Exception com
Can you please share your code?
On Mon, Jan 11, 2021, 6:47 PM Priyanka Kalra A <
priyanka.a.ka...@ericsson.com> wrote:
> Hi Team,
>
>
>
> We are generating multiple side-output tags and using default processing
> time on a non-keyed stream. The class $YYY extends *ProcessFunction*<I, O> and implem
> json4s or
> https://github.com/FasterXML/jackson-module-scala both only seem to
> consume strings.
>
> Best,
> Georg
>
> Am Do., 9. Juli 2020 um 19:17 Uhr schrieb Taher Koitawala <
> taher...@gmail.com>:
>
>> You can try the Jackson ObjectMapper library and t
You can try the Jackson ObjectMapper library; that will get you from
JSON to an object.
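For example, something like this (an untested sketch; Event is a stand-in
POJO whose fields match your JSON, and the mapper is created in open() so it
never has to be shipped with the function):

```
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class JsonToEvent extends RichMapFunction<String, Event> {
    private transient ObjectMapper mapper;

    @Override
    public void open(Configuration parameters) {
        // One mapper per parallel instance, created on the task manager.
        mapper = new ObjectMapper();
    }

    @Override
    public Event map(String json) throws Exception {
        return mapper.readValue(json, Event.class);
    }
}

// usage: kafkaStream.map(new JsonToEvent());
```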
Regards,
Taher Koitawala
On Thu, Jul 9, 2020, 9:54 PM Georg Heiler wrote:
> Hi,
>
> I want to map a stream of JSON documents from Kafka to a Scala case class.
> How can this be accomplish
The open method would be great for that! And the close method could close it
when the operator closes.
Also, for external calls AsyncIO is a great operator; give that a look.
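Just as a hedged sketch of the open/close idea (plain JDBC; the URL, Event,
EnrichedEvent and the enrich() helper are all placeholders):

```
import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class EnrichFromDb extends RichMapFunction<Event, EnrichedEvent> {
    private transient Connection connection;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Called once per parallel instance when the operator starts.
        connection = DriverManager.getConnection("jdbc:...", "user", "password");
    }

    @Override
    public EnrichedEvent map(Event event) throws Exception {
        return enrich(event, connection); // hypothetical lookup helper
    }

    @Override
    public void close() throws Exception {
        // Called when the operator shuts down.
        if (connection != null) {
            connection.close();
        }
    }
}
```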
Regards,
Taher Koitawala
On Sat, May 30, 2020, 10:17 PM Aissa Elaffani
wrote:
> Hello Guys,
> I want to enrich a data strea
Would AsyncIO operator not be an option for you to connect to RDBMS?
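Sketch of what that could look like (untested; AsyncDbClient is a stand-in
for whatever non-blocking client your database offers, returning a
CompletableFuture):

```
import java.util.Collections;
import java.util.concurrent.TimeUnit;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncDbLookup extends RichAsyncFunction<String, String> {
    private transient AsyncDbClient client; // stand-in async client

    @Override
    public void open(Configuration parameters) {
        client = new AsyncDbClient("jdbc:...");
    }

    @Override
    public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
        // The lookup runs off the operator thread; complete() hands the result back.
        client.lookup(key)
              .thenAccept(row -> resultFuture.complete(Collections.singleton(row)));
    }

    @Override
    public void close() {
        client.shutdown();
    }
}

// Wiring it in, with a timeout and a cap on in-flight requests:
DataStream<String> enriched = AsyncDataStream.unorderedWait(
        input, new AsyncDbLookup(), 5, TimeUnit.SECONDS, 100);
```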
On Tue, Jan 28, 2020, 12:45 PM Alexey Trenikhun wrote:
> Thank you Yun Tang.
> My implementation could potentially block for a significant amount of time,
> because I wanted to do RDBMS maintenance (create partitions for new data
You can do this by writing a custom trigger or evictor.
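For the daily-aggregation job quoted below, one hedged option is the built-in
ContinuousProcessingTimeTrigger rather than a fully custom Trigger; it fires
the day window periodically so downstream sees running totals (getSiteId()
and VolumeAndGmvAggregate are placeholders):

```
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.ContinuousProcessingTimeTrigger;

sourceStream
        .keyBy(event -> event.getSiteId())                            // placeholder key
        .window(TumblingProcessingTimeWindows.of(Time.days(1)))       // one window per day
        .trigger(ContinuousProcessingTimeTrigger.of(Time.minutes(1))) // emit running totals
        .aggregate(new VolumeAndGmvAggregate());                      // placeholder AggregateFunction
```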
On Fri, Nov 1, 2019 at 3:08 PM Qi Kang wrote:
> Hi all,
>
>
> We have a Flink job which aggregates sales volume and GMV data of each
> site on a daily basis. The code skeleton is shown as follows.
>
>
> ```
> sourceStream
> .map(message ->
Beware when using BucketingSink, as it does not follow exactly-once
semantics. It also has issues with S3 consistency.
On Sat, Oct 19, 2019, 1:42 PM Ravi Bhushan Ratnakar <
ravibhushanratna...@gmail.com> wrote:
> Hi,
>
> As an alternative, you may use BucketingSink which provides you the
> prov
> On Sun 18 Aug 2019 at 16:24, taher koitawala wrote:
>
>> Hi Swapnil,
>> We faced this problem once; I think changing the checkpoint dir to
>> HDFS and keeping the sink dir on S3 with EMRFS S3 consistency enabled solves
>> this problem. If you are not using e
Hi Swapnil,
We faced this problem once; I think changing the checkpoint dir to HDFS
and keeping the sink dir on S3 with EMRFS S3 consistency enabled solves this
problem. If you are not using EMR then I don't know how else it can be
solved. But in a nutshell, because EMRFS S3 consistency uses Dynamo D
I believe Flink serialization is really fast and GC is much better since the
Flink 1.6 release; beyond that, the state cost depends on what you do with it.
Each task manager has its own instance of RocksDB and is responsible for
snapshotting its own instance upon checkpointing.
Furthermore, if you used a key
progress?
>
> Thanks
>
> On Mon, Jul 29, 2019 at 10:51 AM taher koitawala
> wrote:
>
>> I believe the approach to this is wrong... For firing windows we can
>> write our own custom triggers... However, what I'm not convinced
>> about is switching between
I believe the approach to this is wrong... For firing windows we can write
our own custom triggers... However, what I'm not convinced about is
switching between event and processing time.
Write a custom trigger and fire the event-time window if you
don't see any activity. That's th
As far as I know, it is completely safe.
On Fri, Jul 19, 2019, 1:35 AM M Singh wrote:
> Just wanted to see if there is any advice on this question. Thanks
>
> On Sunday, July 14, 2019, 09:07:45 AM EDT, M Singh
> wrote:
>
>
> Hi:
>
> Is it safe to manipulate the state of an object in the evictor
Looks like you need a window.
On Tue, Jul 16, 2019, 9:24 PM sri hari kali charan Tummala <
kali.tumm...@gmail.com> wrote:
> Hi All,
>
> I am trying to write toRetractStream to CSV, which is kind of working OK, but
> I get extra values like True and then my output data values.
>
> Question1 :-
> I don
No particular reason for not using the process function, just wanted to
clarify if that was the correct way to do it. Thanks Knauf.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Wed, Feb 27, 2019 at 8:23 PM Konstantin Knauf
wrote:
> Hi Taher ,
>
> a ProcessFunction is act
java.lang.IllegalStateException: Consecutive multiple splits
are not supported. Splits are deprecated. Please use side-outputs
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
https://github.com/apache/flink/blob/master/flink-streaming-java/src/test/java/org/apache/flink/streaming/api/functions/sink/filesystem/BucketAssignerITCases.java
> On 26/01/2019 08:18, Taher Koitawala wrote:
>
> Can someone please help with this?
>
> On Fri 25 Jan, 2019, 1:47 PM Taher Koita
Can someone please help with this?
On Fri 25 Jan, 2019, 1:47 PM Taher Koitawala wrote:
> Hi All,
> Is there a way to specify *batch size* and *compression* properties
> when using StreamingFileSink, just like we did in BucketingSink? The only
> parameters it accepts are Inactivi
the same Kafka
topics, however doing different operations. Each Flink job is writing
files of a different size, and we would want to make that consistent.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
and
folders, we are only facing this issue with StreamingFileSink.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
similar to the ones
> that the older BucketingSink was doing.
>
> Cheers,
> Kostas
>
> On Thu, Jan 10, 2019 at 10:47 AM Taher Koitawala <
> taher.koitaw...@gslab.com> wrote:
>
>> Hi Kostas,
>> Thank you for the clarification; also, can you please
Hi Kostas,
Thank you for the clarification. Also, can you please point out
how StreamingFileSink uses TwoPhaseCommit? Can you also point out the
implementing class for that?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Thu, Jan 10, 2019 at 3:10 PM Kostas Kloudas wrote
Koitawala
On Thu 10 Jan, 2019, 2:40 PM Kostas Kloudas wrote:
> Hi Taher,
>
> The StreamingFileSink implements a version of TwoPhaseCommit. Can you
> elaborate a bit on what you mean by "TwoPhaseCommit is not being used"?
>
> Cheers,
> Kostas
>
> On Thu, Jan 10, 201
Hi All,
As per my understanding and the API of StreamingFileSink,
TwoPhaseCommit is not being used. Can someone please confirm if that's
right? Also, if StreamingFileSink does not support
TwoPhaseCommit, what is the best way to implement it?
Regards,
Taher Koitawa
please elaborate and explain how the row format and the bulk format work?
The documentation only stresses how records will be serialized.
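For reference, my current understanding, as a hedged sketch (paths and the
Event type are placeholders):

```
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Row format: each record is encoded and appended to the open part file as
// it arrives, so files can roll on size/time policies.
StreamingFileSink<String> rowSink = StreamingFileSink
        .forRowFormat(new Path("hdfs:///out/rows"), new SimpleStringEncoder<String>("UTF-8"))
        .build();

// Bulk format: records go through a block-oriented BulkWriter (e.g. Parquet),
// so part files only roll on checkpoints.
StreamingFileSink<Event> bulkSink = StreamingFileSink
        .forBulkFormat(new Path("hdfs:///out/parquet"),
                ParquetAvroWriters.forReflectRecord(Event.class))
        .build();
```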
Taher Koitawala
GS Lab Pune
+91 8407979163
Hi All,
Is there a way to send hints to the job graph builder, like
specifically disabling or enabling chaining?
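Concretely, the kind of hints I have in mind (a sketch; ParseFunction and
ValidFilter are placeholder operators):

```
// Global switch: build the job graph with no operator chaining at all.
env.disableOperatorChaining();

// Per-operator hints on the DataStream API:
stream
        .map(new ParseFunction())   // placeholder operator
        .startNewChain()            // this map begins a new chain
        .filter(new ValidFilter())  // placeholder operator
        .disableChaining();         // this filter is never chained with neighbors
```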
May I ask why you want to have two different window times? What's
the use case?
On Mon 26 Nov, 2018, 5:53 PM Abhijeet Kumar wrote:
> Hello Team,
>
> I have to join two streams where one stream is coming late. So, I planned
> on doing it by creating two windows; for the first window the size will be 5
> minut
is by the key. Rebalancing will not shuffle this partitioning?
> e.g
> .addSource(source)
> .rebalance
> .keyBy(_.id)
> .mapWithState(...)
>
>
> On Mon, Nov 26, 2018 at 8:32 AM Taher Koitawala
> wrote:
>
>> Hi Avi,
>> No,
in my code then
> it would be a great help.
>
> Thanks,
>
>
> *Abhijeet Kumar*
> Software Development Engineer,
> Sentienz Solutions Pvt Ltd
> Cognitive Data Platform - Perceive the Data !
> abhijeet.ku...@sentienz.com | www.sentienz.com | Bengaluru
>
>
> On 26-N
Hi Avi,
No, rebalance does not change the number of Kafka partitions.
Let's say you have 6 Kafka partitions and your Flink parallelism is 8; in
this case, using rebalance will send records to all downstream operators in
a round-robin fashion.
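Hedged sketch of that 6-partition / parallelism-8 situation (topic name,
schema, properties and MyMapper are placeholders):

```
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

Properties props = new Properties(); // bootstrap.servers, group.id, ...

DataStream<String> source = env
        .addSource(new FlinkKafkaConsumer<>("topic", new SimpleStringSchema(), props))
        .setParallelism(6); // matches the Kafka partition count

source
        .rebalance()         // round-robin to all downstream subtasks
        .map(new MyMapper()) // placeholder operator
        .setParallelism(8);  // all 8 subtasks now receive records
```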
Regards,
Taher Koitawala
GS Lab Pune
+91
ributed file system
> available on all TMs.
> The location is set in the Flink option 'state.checkpoints.dir'.
> This way the job can restore from it with a different set of TMs.
>
> Best,
> Andrey
>
> > On 26 Oct 2018, at 08:29, Taher Koitawala
> wrote:
> >
> > H
we
have an even spread when the RocksDB files are written?
Thanks,
Taher Koitawala
Sounds smashing; I think the initial integration will help 60% or so of Flink
SQL users, and a lot of other use cases will emerge when we solve the first one.
Thanks,
Taher Koitawala
On Fri 12 Oct, 2018, 10:13 AM Zhang, Xuefu, wrote:
> Hi Taher,
>
> Thank you for your input. I think you e
ing"
The way we use this is:
Using streaming_table as configuration select count(*) from processingtable
as streaming;
This way users can now pass Flink SQL info easily and get rid of the Flink
SQL configuration file altogether. This is simple and easy to understand,
and I think most user
tream;
select count(*) from flink_mailing_list process as batch;
This way we could completely get rid of Flink SQL configuration files.
Thanks,
Taher Koitawala
Integrating
On Fri 12 Oct, 2018, 2:35 AM Zhang, Xuefu, wrote:
> Hi Rong,
>
> Thanks for your feedback. Some of my earlier comment
t fine.
Thanks,
Taher Koitawala
On Fri 5 Oct, 2018, 1:28 AM Andrew Kowpak,
wrote:
> Hi all,
>
> I apologize if this has been discussed to death in the past, but I'm
> finding myself very confused, and Google is not proving helpful.
>
> Based on the documentation, I unde
Hi Dawid,
Thanks for the answer; how do I get the state of the window,
then? I do understand that elements go into state, as the window in
itself is a stateful operator. How do I get access to those elements?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Tue, Sep 25
ValueState<T> valueState =
    ctx.globalState().getState(new ValueStateDescriptor<>("valueState",
        TypeInformation.of(new TypeHint<T>() {})));
System.out.println(valueState.value());
collector.collect(t);
})
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
Thanks a lot for the explanation. That was exactly what I thought should
happen. However, it is always good to get a clear confirmation.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Fri, Sep 21, 2018 at 6:26 PM Piotr Nowojski
wrote:
> Hi,
>
> Yes, in your case half of the Kaf
ribing
the number of TMs than the number of partitions in the Kafka topic
guarantee high throughput?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
ord X' from Stream2 arrives after the maxOutOfOrderness time has passed. In
this scenario, as per my knowledge, X will be maintained in the Flink state.
However, when X' comes, how do I look up X from the Flink state and
carry on the further aggregation or whatever I want to do?
Regards,
's an ideal state, and it's often difficult to achieve
this state, because event time always has more or less delay as events are
transmitted from the source to the processing system.
Thanks, vino.
[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/event_timestamps_watermarks.
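For example, a common assigner pattern from that era, as a hedged sketch
(assuming events carry an epoch-millis timestamp via a hypothetical
getTimestampMillis() accessor):

```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

DataStream<Event> withTs = events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(5)) {
            @Override
            public long extractTimestamp(Event element) {
                return element.getTimestampMillis(); // hypothetical accessor
            }
        });
// The watermark trails the max timestamp seen by 5s, so an event-time window
// [12:00, 12:05) fires once the watermark passes the window end, i.e. after
// an event stamped around 12:05:05 or later has been seen.
```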
Hi All,
Can someone show a good example of how watermarks need to be
generated when using EventTime windows? What happens if the watermark is
the same as the timestamp? How does the watermark help the window to be
triggered, and what if watermarks are kept behind the currentTimestamps in
g on the
sender side (batch only).
2018-09-12 7:30 GMT-04:00 Taher Koitawala :
> So Flink TMs read one line at a time from HDFS in parallel, keep
> filling memory, and keep passing the records to the next operator? I
> just want to know how data comes into memory? How it is parti
>> by directly playing back the complete data set. If a TaskManager fails, Flink
>> will kick it out of the cluster, and the Task running on it will fail, but
>> the result of stream processing and batch Task failure is different. For
>> stream processing, it triggers a restart of t
Furthermore, how does Flink deal with TaskManagers dying when it is using
the DataSet API? Is checkpointing done on a DataSet too? Or does the whole
dataset have to be re-read?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Tue, Sep 11, 2018 at 11:18 PM, Taher Koitawala wrote:
> Hi
Hi All,
Just like Spark, does Flink read a dataset, keep it in memory,
and keep applying transformations? Or are all records read by Flink via
async parallel reads? Furthermore, how does Flink deal with
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163