I think what you're looking for is a side output. Change the logic to a
ProcessFunction: records that evaluate to true go to the collector, and false
ones go to a side output. That gives you two streams.
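Roughly like this (an untested sketch, assuming input is a DataStream<String>
and isValid() is a hypothetical predicate of yours):

```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

// Anonymous subclass so Flink can capture the tag's generic type.
final OutputTag<String> falseTag = new OutputTag<String>("false-records") {};

SingleOutputStreamOperator<String> trueStream = input
        .process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String value, Context ctx, Collector<String> out)
                    throws Exception {
                if (isValid(value)) {            // hypothetical predicate
                    out.collect(value);          // main output: the "true" stream
                } else {
                    ctx.output(falseTag, value); // side output: the "false" stream
                }
            }
        });

DataStream<String> falseStream = trueStream.getSideOutput(falseTag);
```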
On Mon, May 10, 2021, 8:14 PM Nikola Hrusov wrote:
> Hi Arvid,
>
> In my case it's the latter, thus I have also th
tagSet.addAll(tags);
>
> }
>
>
>
> for (String tag : tagSet) {
>
> outputMap.putIfAbsent(tag, new OutputTag(tag) {});
>
> ctx.output(outputMap.get(tag), value);
>
> }
>
> }
>
> }
>
>
>
> Exception com
Can you please share your code?
On Mon, Jan 11, 2021, 6:47 PM Priyanka Kalra A <
priyanka.a.ka...@ericsson.com> wrote:
> Hi Team,
>
>
>
> We are generating multiple side-output tags and using default processing
> time on a non-keyed stream. The class $YYY extends *ProcessFunction*<I, O> and implem
> json4s or
> https://github.com/FasterXML/jackson-module-scala both only seem to
> consume strings.
>
> Best,
> Georg
>
> Am Do., 9. Juli 2020 um 19:17 Uhr schrieb Taher Koitawala <
> taher...@gmail.com>:
>
>> You can try the Jackson ObjectMapper library and t
You can try the Jackson ObjectMapper library; that will get you from
JSON to an object.
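For example, something like this (an untested sketch; Event is a stand-in
POJO whose fields match your JSON, and the mapper is created in open() so it
never has to be shipped with the function):

```
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class JsonToEvent extends RichMapFunction<String, Event> {
    private transient ObjectMapper mapper;

    @Override
    public void open(Configuration parameters) {
        // One mapper per parallel instance, created on the task manager.
        mapper = new ObjectMapper();
    }

    @Override
    public Event map(String json) throws Exception {
        return mapper.readValue(json, Event.class);
    }
}

// usage: kafkaStream.map(new JsonToEvent());
```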
Regards,
Taher Koitawala
On Thu, Jul 9, 2020, 9:54 PM Georg Heiler wrote:
> Hi,
>
> I want to map a stream of JSON documents from Kafka to a Scala case class.
> How can this be accomplish
The open method would be great for that! And the close method could close it
when the operator closes.
Also, for external calls AsyncIO is a great operator; give that a look.
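Just as a hedged sketch of the open/close idea (plain JDBC; the URL, Event,
EnrichedEvent and the enrich() helper are all placeholders):

```
import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class EnrichFromDb extends RichMapFunction<Event, EnrichedEvent> {
    private transient Connection connection;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Called once per parallel instance when the operator starts.
        connection = DriverManager.getConnection("jdbc:...", "user", "password");
    }

    @Override
    public EnrichedEvent map(Event event) throws Exception {
        return enrich(event, connection); // hypothetical lookup helper
    }

    @Override
    public void close() throws Exception {
        // Called when the operator shuts down.
        if (connection != null) {
            connection.close();
        }
    }
}
```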
Regards,
Taher Koitawala
On Sat, May 30, 2020, 10:17 PM Aissa Elaffani
wrote:
> Hello Guys,
> I want to enrich a data strea
Would AsyncIO operator not be an option for you to connect to RDBMS?
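Sketch of what that could look like (untested; AsyncDbClient is a stand-in
for whatever non-blocking client your database offers, returning a
CompletableFuture):

```
import java.util.Collections;
import java.util.concurrent.TimeUnit;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncDbLookup extends RichAsyncFunction<String, String> {
    private transient AsyncDbClient client; // stand-in async client

    @Override
    public void open(Configuration parameters) {
        client = new AsyncDbClient("jdbc:...");
    }

    @Override
    public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
        // The lookup runs off the operator thread; complete() hands the result back.
        client.lookup(key)
              .thenAccept(row -> resultFuture.complete(Collections.singleton(row)));
    }

    @Override
    public void close() {
        client.shutdown();
    }
}

// Wiring it in, with a timeout and a cap on in-flight requests:
DataStream<String> enriched = AsyncDataStream.unorderedWait(
        input, new AsyncDbLookup(), 5, TimeUnit.SECONDS, 100);
```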
On Tue, Jan 28, 2020, 12:45 PM Alexey Trenikhun wrote:
> Thank you Yun Tang.
> My implementation could potentially block for a significant amount of time,
> because I wanted to do RDBMS maintenance (create partitions for new data
You can do this by writing a custom trigger or evictor.
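For the daily-aggregation job quoted below, one hedged option is the built-in
ContinuousProcessingTimeTrigger rather than a fully custom Trigger; it fires
the day window periodically so downstream sees running totals (getSiteId()
and VolumeAndGmvAggregate are placeholders):

```
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.ContinuousProcessingTimeTrigger;

sourceStream
        .keyBy(event -> event.getSiteId())                            // placeholder key
        .window(TumblingProcessingTimeWindows.of(Time.days(1)))       // one window per day
        .trigger(ContinuousProcessingTimeTrigger.of(Time.minutes(1))) // emit running totals
        .aggregate(new VolumeAndGmvAggregate());                      // placeholder AggregateFunction
```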
On Fri, Nov 1, 2019 at 3:08 PM Qi Kang wrote:
> Hi all,
>
>
> We have a Flink job which aggregates sales volume and GMV data of each
> site on a daily basis. The code skeleton is shown as follows.
>
>
> ```
> sourceStream
> .map(message ->
Beware when using BucketingSink, as it does not follow exactly-once
semantics. It also has issues with S3 consistency.
On Sat, Oct 19, 2019, 1:42 PM Ravi Bhushan Ratnakar <
ravibhushanratna...@gmail.com> wrote:
> Hi,
>
> As an alternative, you may use BucketingSink which provides you the
> prov
> On Sun 18 Aug 2019 at 16:24, taher koitawala wrote:
>
>> Hi Swapnil,
>> We faced this problem once; I think changing the checkpoint dir to
>> HDFS and keeping the sink dir on S3 with EMRFS S3 consistency enabled solves
>> this problem. If you are not using e
Hi Swapnil,
We faced this problem once; I think changing the checkpoint dir to HDFS
and keeping the sink dir on S3 with EMRFS S3 consistency enabled solves this
problem. If you are not using EMR then I don't know how else it can be
solved. But in a nutshell, because EMRFS S3 consistency uses Dynamo D
I believe Flink serialization is really fast and GC is much better since the
Flink 1.6 release; beyond that, the state cost depends on what you do with it.
Each task manager has its own instance of RocksDB and is responsible for
snapshotting its own instance upon checkpointing.
Furthermore, if you used a key
progress?
>
> Thanks
>
> On Mon, Jul 29, 2019 at 10:51 AM taher koitawala
> wrote:
>
>> I believe the approach to this is wrong... For firing windows we can
>> write our own custom triggers... However, what I'm not convinced
>> about is switching between
I believe the approach to this is wrong... For firing windows we can write
our own custom triggers... However, what I'm not convinced about is
switching between event and processing time.
Write a custom trigger and fire the event-time window if you
don't see any activity. That's th
As far as I know, it is completely safe.
On Fri, Jul 19, 2019, 1:35 AM M Singh wrote:
> Just wanted to see if there is any advice on this question. Thanks
>
> On Sunday, July 14, 2019, 09:07:45 AM EDT, M Singh
> wrote:
>
>
> Hi:
>
> Is it safe to manipulate the state of an object in the evictor
Looks like you need a window.
On Tue, Jul 16, 2019, 9:24 PM sri hari kali charan Tummala <
kali.tumm...@gmail.com> wrote:
> Hi All,
>
> I am trying to write toRetractStream to CSV, which is kind of working OK, but
> I get extra values like True and then my output data values.
>
> Question1 :-
> I don
No particular reason for not using the process function, just wanted to
clarify if that was the correct way to do it. Thanks Knauf.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Wed, Feb 27, 2019 at 8:23 PM Konstantin Knauf
wrote:
> Hi Taher ,
>
> a ProcessFunction is act
java.lang.IllegalStateException: Consecutive multiple splits
are not supported. Splits are deprecated. Please use side-outputs
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
https://github.com/apache/flink/blob/master/flink-streaming-java/src/test/java/org/apache/flink/streaming/api/functions/sink/filesystem/BucketAssignerITCases.java
> On 26/01/2019 08:18, Taher Koitawala wrote:
>
> Can someone please help with this?
>
> On Fri 25 Jan, 2019, 1:47 PM Taher Koita
Can someone please help with this?
On Fri 25 Jan, 2019, 1:47 PM Taher Koitawala wrote:
> Hi All,
> Is there a way to specify *batch size* and *compression* properties
> when using StreamingFileSink, just like we did in BucketingSink? The only
> parameters it accepts are Inactivi
the same Kafka
topics, however doing different operations. Each Flink job is writing
files of a different size, and we would want to make that consistent.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
and
folders, we are only facing this issue with StreamingFileSink.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
similar to the ones
> that the older BucketingSink was doing.
>
> Cheers,
> Kostas
>
> On Thu, Jan 10, 2019 at 10:47 AM Taher Koitawala <
> taher.koitaw...@gslab.com> wrote:
>
>> Hi Kostas,
>> Thank you for the clarification; also, can you please
Hi Kostas,
Thank you for the clarification. Also, can you please point out
how StreamingFileSink uses TwoPhaseCommit? Can you also point out the
implementing class for that?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Thu, Jan 10, 2019 at 3:10 PM Kostas Kloudas wrote
Koitawala
On Thu 10 Jan, 2019, 2:40 PM Kostas Kloudas wrote:
> Hi Taher,
>
> The StreamingFileSink implements a version of TwoPhaseCommit. Can you
> elaborate a bit on what you mean by "TwoPhaseCommit is not being used"?
>
> Cheers,
> Kostas
>
> On Thu, Jan 10, 201
Hi All,
As per my understanding and the API of StreamingFileSink,
TwoPhaseCommit is not being used. Can someone please confirm if that's
right? Also, if StreamingFileSink does not support
TwoPhaseCommit, what is the best way to implement it?
Regards,
Taher Koitawa
please elaborate and explain how the row format and the bulk format work?
The documentation only stresses how records will be serialized.
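For reference, my current understanding, as a hedged sketch (paths and the
Event type are placeholders):

```
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Row format: each record is encoded and appended to the open part file as
// it arrives, so files can roll on size/time policies.
StreamingFileSink<String> rowSink = StreamingFileSink
        .forRowFormat(new Path("hdfs:///out/rows"), new SimpleStringEncoder<String>("UTF-8"))
        .build();

// Bulk format: records go through a block-oriented BulkWriter (e.g. Parquet),
// so part files only roll on checkpoints.
StreamingFileSink<Event> bulkSink = StreamingFileSink
        .forBulkFormat(new Path("hdfs:///out/parquet"),
                ParquetAvroWriters.forReflectRecord(Event.class))
        .build();
```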
Taher Koitawala
GS Lab Pune
+91 8407979163
Hi All,
Is there a way to send hints to the job graph builder, like
specifically disabling or enabling chaining?
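Concretely, the kind of hints I have in mind (a sketch; ParseFunction and
ValidFilter are placeholder operators):

```
// Global switch: build the job graph with no operator chaining at all.
env.disableOperatorChaining();

// Per-operator hints on the DataStream API:
stream
        .map(new ParseFunction())   // placeholder operator
        .startNewChain()            // this map begins a new chain
        .filter(new ValidFilter())  // placeholder operator
        .disableChaining();         // this filter is never chained with neighbors
```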
May I ask why you want to have two different window times? What's
the use case?
On Mon 26 Nov, 2018, 5:53 PM Abhijeet Kumar wrote:
> Hello Team,
>
> I have to join two streams where one stream is coming late. So, I planned
> on doing it by creating two windows; for the first window the size will be 5
> minut
is by the key. Rebalancing will not shuffle this partitioning?
> e.g
> .addSource(source)
> .rebalance
> .keyBy(_.id)
> .mapWithState(...)
>
>
> On Mon, Nov 26, 2018 at 8:32 AM Taher Koitawala
> wrote:
>
>> Hi Avi,
>> No,
in my code then
> it would be a great help.
>
> Thanks,
>
>
> *Abhijeet Kumar*
> Software Development Engineer,
> Sentienz Solutions Pvt Ltd
> Cognitive Data Platform - Perceive the Data !
> abhijeet.ku...@sentienz.com | www.sentienz.com | Bengaluru
>
>
> On 26-N
Hi Avi,
No, rebalance does not change the number of Kafka partitions.
Let's say you have 6 Kafka partitions and your Flink parallelism is 8; in
this case, using rebalance will send records to all downstream operators in
a round-robin fashion.
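Hedged sketch of that 6-partition / parallelism-8 situation (topic name,
schema, properties and MyMapper are placeholders):

```
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

Properties props = new Properties(); // bootstrap.servers, group.id, ...

DataStream<String> source = env
        .addSource(new FlinkKafkaConsumer<>("topic", new SimpleStringSchema(), props))
        .setParallelism(6); // matches the Kafka partition count

source
        .rebalance()         // round-robin to all downstream subtasks
        .map(new MyMapper()) // placeholder operator
        .setParallelism(8);  // all 8 subtasks now receive records
```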
Regards,
Taher Koitawala
GS Lab Pune
+91
ributed file system
> available on all TMs.
> The location is set in the Flink option 'state.checkpoints.dir'.
> This way the job can restore from it with a different set of TMs.
>
> Best,
> Andrey
>
> > On 26 Oct 2018, at 08:29, Taher Koitawala
> wrote:
> >
> > H
we
have an even spread when the RocksDB files are written?
Thanks,
Taher Koitawala
Sounds smashing; I think the initial integration will help 60% or so of Flink
SQL users, and a lot of other use cases will emerge when we solve the first one.
Thanks,
Taher Koitawala
On Fri 12 Oct, 2018, 10:13 AM Zhang, Xuefu, wrote:
> Hi Taher,
>
> Thank you for your input. I think you e
ing"
The way we use this is:
Using streaming_table as configuration select count(*) from processingtable
as streaming;
This way users can now pass Flink SQL info easily and get rid of the Flink
SQL configuration file altogether. This is simple and easy to understand,
and I think most user
tream;
select count(*) from flink_mailing_list process as batch;
This way we could completely get rid of Flink SQL configuration files.
Thanks,
Taher Koitawala
Integrating
On Fri 12 Oct, 2018, 2:35 AM Zhang, Xuefu, wrote:
> Hi Rong,
>
> Thanks for your feedback. Some of my earlier comment
t fine.
Thanks,
Taher Koitawala
On Fri 5 Oct, 2018, 1:28 AM Andrew Kowpak,
wrote:
> Hi all,
>
> I apologize if this has been discussed to death in the past, but I'm
> finding myself very confused, and Google is not proving helpful.
>
> Based on the documentation, I unde
Hi Dawid,
Thanks for the answer; how do I get the state of the window,
then? I do understand that elements go into state, as the window in
itself is a stateful operator. How do I get access to those elements?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Tue, Sep 25
ValueState<T> valueState =
    ctx.globalState().getState(new ValueStateDescriptor<>("valueState",
        TypeInformation.of(new TypeHint<T>() {})));
System.out.println(valueState.value());
collector.collect(t);
})
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
Thanks a lot for the explanation. That was exactly what I thought should
happen. However, it is always good to get a clear confirmation.
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Fri, Sep 21, 2018 at 6:26 PM Piotr Nowojski
wrote:
> Hi,
>
> Yes, in your case half of the Kaf
ribing
the number of TMs than the number of partitions in the Kafka topic
guarantee high throughput?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
ord X' from Stream2 arrives after the maxOutOfOrderness time has passed. In
this scenario, as per my knowledge, X will be maintained in the Flink state.
However, when X' comes, how do I look up X from the Flink state and
carry on the further aggregation or whatever I want to do?
Regards,
's an ideal state, and it's often difficult to achieve
this state, because event time always has more or less delay as events are
transmitted from the source to the processing system.
Thanks, vino.
[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/event_timestamps_watermarks.
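For example, a common assigner pattern from that era, as a hedged sketch
(assuming events carry an epoch-millis timestamp via a hypothetical
getTimestampMillis() accessor):

```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

DataStream<Event> withTs = events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(5)) {
            @Override
            public long extractTimestamp(Event element) {
                return element.getTimestampMillis(); // hypothetical accessor
            }
        });
// The watermark trails the max timestamp seen by 5s, so an event-time window
// [12:00, 12:05) fires once the watermark passes the window end, i.e. after
// an event stamped around 12:05:05 or later has been seen.
```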
Hi All,
Can someone show a good example of how watermarks need to be
generated when using EventTime windows? What happens if the watermark is
the same as the timestamp? How does the watermark help the window to be
triggered, and what if watermarks are kept behind the currentTimestamps in
g on the
sender side (batch only).
2018-09-12 7:30 GMT-04:00 Taher Koitawala :
> So Flink TMs read one line at a time from HDFS in parallel, keep
> filling memory, and keep passing the records to the next operator? I
> just want to know how data comes into memory? How it is parti
>> by directly playing back the complete data set. If a TaskManager fails, Flink
>> will kick it out of the cluster, and the Task running on it will fail, but
>> the result of stream processing and batch Task failure is different. For
>> stream processing, it triggers a restart of t
Furthermore, how does Flink deal with TaskManagers dying when it is using
the DataSet API? Is checkpointing done on a DataSet too? Or does the whole
dataset have to be re-read?
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163
On Tue, Sep 11, 2018 at 11:18 PM, Taher Koitawala wrote:
> Hi
Hi All,
Just like Spark, does Flink read a dataset, keep it in memory,
and keep applying transformations? Or are all records read by Flink via
async parallel reads? Furthermore, how does Flink deal with
Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163