Re: Side outputs from sinks

2023-10-18 Thread Péter Váry
k1 -> if OK > sink2 -> if OK sink3 > > \-> if NOK sink4 \-> if NOK sink4 > > In order to achieve it, we would like to have some kind of side output > coming out from our sinks but side outputs are not available in > SinkFunctions. The only way we have c

Side outputs from sinks

2023-10-18 Thread Aian Cantabrana
> filters/maps/process -> sink1 -> if OK sink2 -> if OK sink3 \-> if NOK sink4 \-> if NOK sink4 In order to achieve it, we would like to have some kind of side output coming out from our sinks but side outputs are not available in SinkFunctions. The only way we have come up

Re: Side outputs documentation

2023-09-26 Thread Alexis Sarda-Espinosa
ype information. > >> > >> The second constructor is introduced after the document and the first > >> constructor, and I think the document might have been outdated and not > >> match with OutputTag's current behavior. A ticket and PR could be > >> adde

Re: Side outputs documentation

2023-09-25 Thread Yunfeng Zhou
ment and the first >> constructor, and I think the document might have been outdated and not >> match with OutputTag's current behavior. A ticket and PR could be >> added to fix the document. What do you think? >> >> Best, >> Yunfeng >> >> On Fri, Sep 22, 2023 a

Re: Side outputs documentation

2023-09-25 Thread Alexis Sarda-Espinosa
behavior. A ticket and PR could be > added to fix the document. What do you think? > > Best, > Yunfeng > > On Fri, Sep 22, 2023 at 4:55 PM Alexis Sarda-Espinosa > wrote: > > > > Hello, > > > > very quick question, the documentation for side outputs state

Re: Side outputs documentation

2023-09-24 Thread Yunfeng Zhou
, 2023 at 4:55 PM Alexis Sarda-Espinosa wrote: > > Hello, > > very quick question, the documentation for side outputs states that an > OutputTag "needs to be an anonymous inner class, so that we can analyze the > type" (this is written in a comment in the example). Is th

Side outputs documentation

2023-09-22 Thread Alexis Sarda-Espinosa
Hello, very quick question, the documentation for side outputs states that an OutputTag "needs to be an anonymous inner class, so that we can analyze the type" (this is written in a comment in the example). Is this really true? I've seen many examples where it's a static element an
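The quoted comment ("so that we can analyze the type") points at a general Java technique: an anonymous subclass records its generic type argument in its class metadata, where it can be read back by reflection. A minimal sketch in plain Java follows. `TypedTag`, `capturedType`, and `TypeCaptureDemo` are hypothetical names made up for illustration; this is a stand-in for the idea, not Flink's OutputTag implementation.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical stand-in for a typed tag; this is NOT Flink's OutputTag.
abstract class TypedTag<T> {
    private final String id;

    protected TypedTag(String id) {
        this.id = id;
    }

    String id() {
        return id;
    }

    // Reads the type argument that the subclass declared for TypedTag<T>.
    Type capturedType() {
        Type superType = getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        throw new IllegalStateException("no type argument recorded");
    }
}

class TypeCaptureDemo {
    static Type capture() {
        // The trailing {} creates an anonymous subclass of TypedTag<String>,
        // which is what makes the type argument visible to reflection.
        TypedTag<String> tag = new TypedTag<String>("late-data") {};
        return tag.capturedType();
    }

    public static void main(String[] args) {
        System.out.println(capture());
    }
}
```

Whether a tag is held in a static field is a separate matter from whether it was created as an anonymous subclass, so the two observations in the question need not conflict.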

Re: behavior differences when sinking side outputs to a custom KeyedCoProcessFunction

2021-06-09 Thread Arvid Heise
age: Screen Shot 2021-05-20 at 3.12.22 PM.png] >>> these join functions have a time window/duration or interval associated >>> with them to define the duration of join state and inference window. this >>> is set per operator to allow for in order and out of order join threshol

Re: behavior differences when sinking side outputs to a custom KeyedCoProcessFunction

2021-06-03 Thread Jin Yi
tate and inference window. this >> is set per operator to allow for in order and out of order join thresholds >> for id based joins, and this window acts as the scope for inference when a >> right event that is an inference candidate (missing foreign key id) is >> about to b

Re: Does Flink 1.12.1 DataStream API batch execution mode support side outputs?

2021-05-23 Thread Marco Villalobos
I found the problem. I tried to assign timestamps to the operator (I don't know why), and when I did that, because I used the Flink API fluently, I was no longer referencing the operator that contained the side-outputs. Disregard my question. On Sat, May 22, 2021 at 9:28 PM Marco Villalobos

Does Flink 1.12.1 DataStream API batch execution mode support side outputs?

2021-05-22 Thread Marco Villalobos
I have been struggling for two days with an issue using the DataStream API in Batch Execution mode. It seems as though my side-output has no elements available to downstream operators. However, I am certain that the downstream operator received events. I logged the side-output element just

Re: behavior differences when sinking side outputs to a custom KeyedCoProcessFunction

2021-05-21 Thread Jin Yi
perator to allow for in order and out of order join thresholds > for id based joins, and this window acts as the scope for inference when a > right event that is an inference candidate (missing foreign key id) is > about to be evicted from state. > > problem: > > i have things coded up

behavior differences when sinking side outputs to a custom KeyedCoProcessFunction

2021-05-20 Thread Jin Yi
about to be evicted from state. problem: i have things coded up with side outputs for duplicate, late and dropped events. the dropped events case is the one i am focusing on since events that go unmatched are dropped when they are evicted from state. only rhs events are the ones being dropped, w

Re: Side outputs PyFlink

2021-05-20 Thread Dian Fu
PI? This already works pretty neatly in the DataStream API but couldn't find > any communication on adding this to PyFlink. > > In the meantime, what do you suggest for a workaround on side outputs? > Intuitively, I would copy a stream and add a filter for each side output but >

Side outputs PyFlink

2021-05-20 Thread Wouter Zorgdrager
for a workaround on side outputs? Intuitively, I would copy a stream and add a filter for each side output but this seems a bit inefficient. In that setup, each side output will need to go over the complete stream. Any ideas? Thanks in advance! Regards, Wouter

Re: numRecordsOutPerSecond metric and side outputs

2021-01-04 Thread Arvid Heise
Hi Alexey, side outputs should be counted in numRecordsOutPerSecond. However, there is a bug that this is not happening for side-outputs in the middle of the chain [1]. [1] https://issues.apache.org/jira/browse/FLINK-18808 On Tue, Dec 22, 2020 at 1:14 AM Alexey Trenikhun wrote: > He

numRecordsOutPerSecond metric and side outputs

2020-12-21 Thread Alexey Trenikhun
Hello, Does the numRecordsOutPerSecond metric take into account the number of records sent to side outputs, or does it provide the rate only for the main output? Thanks, Alexey

Re: Multiple side outputs of same type?

2020-12-18 Thread Alex Cruise
and filtering out >> the wrong records in each suffix stream, but it's not super efficient... >> Unfortunately, from what I can see, using side outputs isn't an option >> because each output tag has a single type parameter, and the output record >> is dispatched based on i

Re: Multiple side outputs of same type?

2020-12-18 Thread Arvid Heise
ffix streams, which are allocated on boot > based on configuration. > > At the moment I'm just duplicating the input records, and filtering out > the wrong records in each suffix stream, but it's not super efficient... > Unfortunately, from what I can see, using side outputs isn't an optio

Multiple side outputs of same type?

2020-12-18 Thread Alex Cruise
, but it's not super efficient... Unfortunately, from what I can see, using side outputs isn't an option because each output tag has a single type parameter, and the output record is dispatched based on its runtime type. Is there a better way to do this? Thanks! -0xe1a

Re: How to setup Regions for Fault Tolerance in Flink when using Side Outputs

2020-11-25 Thread Till Rohrmann
Hi Patrick, at the moment it is not possible to disconnect side outputs from other streaming operators. I guess what you would like to have is an operator which consumes on a best effort basis but which can also lose some data while it is being restarted. This is currently not supported by Flink

Re: How to setup Regions for Fault Tolerance in Flink when using Side Outputs

2020-11-25 Thread Eifler, Patrick
Hi Till, Thanks for your reply. Is there any option to disconnect the side outputs from the pipelined data exchanges of the main stream? The benefit of side outputs is very high regarding performance and usability, plus it fits the use case here very nicely. Though this pipelined connection

Re: How to setup Regions for Fault Tolerance in Flink when using Side Outputs

2020-11-24 Thread Till Rohrmann
of side outputs, I think they are connected via pipelined data exchanges with the main stream and, hence, are part of the same failover region as the main stream. [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy Cheers

How to setup Regions for Fault Tolerance in Flink when using Side Outputs

2020-11-24 Thread Eifler, Patrick
Hi all, We are trying to set up regions to enable Flink to only stop failing tasks based on region instead of failing the entire stream. We are using one main stream that is reading from a Kafka topic and a bunch of side outputs for processing each event from that topic differently

Re: Is outputting from components other than sinks or side outputs a no-no ?

2020-07-27 Thread Tom Fennelly
No, I think David answered the specific question that I asked i.e. is it okay (or not) for operators other than sinks and side outputs to do I/O. Purging DLQ entries is something we'll need to be able to do anyway (for some scenarios - aside from successful checkpoint retries) and I specifically

Re: Is outputting from components other than sinks or side outputs a no-no ?

2020-07-27 Thread Stephen Connolly
ncio.html > > <https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/stream/operators/asyncio.html> > > > > Best, > > David > > > > On Sun, Jul 26, 2020 at 11:08 AM Tom Fennelly > > wrote: > > > >> Hi. > >> > >

Re: Is outputting from components other than sinks or side outputs a no-no ?

2020-07-27 Thread Tom Fennelly
ors/asyncio.html> > > Best, > David > > On Sun, Jul 26, 2020 at 11:08 AM Tom Fennelly > wrote: > >> Hi. >> >> What are the negative side effects of (for example) a filter function >> occasionally making a call out to a DB ? Is this a big no-no and should all >> outputs be done through sinks and side outputs, no exceptions ? >> >> Regards, >> >> Tom. >> >

Re: Is outputting from components other than sinks or side outputs a no-no ?

2020-07-26 Thread David Anderson
ve side effects of (for example) a filter function > occasionally making a call out to a DB ? Is this a big no-no and should all > outputs be done through sinks and side outputs, no exceptions ? > > Regards, > > Tom. >

Is outputting from components other than sinks or side outputs a no-no ?

2020-07-26 Thread Tom Fennelly
Hi. What are the negative side effects of (for example) a filter function occasionally making a call out to a DB? Is this a big no-no, and should all outputs be done through sinks and side outputs, no exceptions? Regards, Tom.

Re: Why side-outputs are only supported by Process functions?

2020-06-22 Thread Arvid Heise
Hi Ivneet, Q1) you can read about the deprecation of split in FLINK-11084 [1]. In general side-outputs subsume the functionality and allow some advanced cases (like emitting the same record into two outputs). Q2) It's simply a matter of API design. The basic idea is to keep most interfaces

Why side-outputs are only supported by Process functions?

2020-06-21 Thread ivneet kaur
Hi folks, I want to split my stream for some invalid message handling, and need help understanding a few things. Question 1: Why is the *split* operator deprecated? Question 2: Why are side-outputs only supported for ProcessFunction, KeyedProcessFunction etc.? The doc on side-outputs says: "*Yo

Re: Side Outputs from RichAsyncFunction

2020-02-20 Thread Chesnay Schepler
I don't think this is possible. At the very least you should be able to work around this by having your AsyncFunction return an Either, and having a subsequent ProcessFunction do the side-output business. On 19/02/2020 22:25, KristoffSC wrote: Hi, any thoughts about this one? Regards,

Re: Side Outputs from RichAsyncFunction

2020-02-19 Thread KristoffSC
Hi, any thoughts about this one? Regards, Krzysztof

Side Outputs from RichAsyncFunction

2020-02-18 Thread KristoffSC
Hi all, Is there a way to emit a side output from a RichAsyncFunction operator, like it is possible with ProcessFunctions via ctx.output(outputTag, value)? At first glance I don't see a way to do it. In my use case RichAsyncFunction is used to call REST services and I would like to handle REST error

Side outputs in Async I/O

2019-11-06 Thread Romain Gilles
Hi flink, I would like to know if you think that providing access to the side output from Async I/O could be a good idea. Thanks in advance, Romain

Re: Side outputs never getting consumed

2018-04-06 Thread Timo Walther
Hi Julio, thanks for this great example. I could reproduce it on my machine and I could find the problem. You need to store the newly created branch of your pipeline in some variable like `val test = pipeline.process()` in order to access the side outputs via `test.getSideOutput

Re: Side outputs never getting consumed

2018-04-04 Thread Julio Biason
/SideoutputSample.scala The thing to notice is that we do the split to side outputs _after_ the window functions -- because we want to split the results just before the sinks (we had a split there instead, but the job would, sometimes, crash because "splits can't be used with side outputs", or something ar

Re: Side outputs never getting consumed

2018-04-03 Thread Timo Walther
Timo Am 02.04.18 um 21:53 schrieb Julio Biason: Hey guys, I have a pipeline that generates two different types of data (but both use the same trait) and I need to save each on a different sink. So far, things were working with splits, but it seems using splits with side outputs (for the

Side outputs never getting consumed

2018-04-02 Thread Julio Biason
Hey guys, I have a pipeline that generates two different types of data (but both use the same trait) and I need to save each on a different sink. So far, things were working with splits, but it seems using splits with side outputs (for the late data, which we'll plug a late arrival handling

Re: Out of the blue: "Cannot use split/select with side outputs"

2018-03-19 Thread Julio Biason
a weird problem with my pipeline. > > The pipeline processes lines from our logs and generates different metrics > based on them (I mean, quite the standard procedure). It uses side outputs > for dead letter queues, in case it finds something wrong with the logs and > a metric can't be gener

Out of the blue: "Cannot use split/select with side outputs"

2018-03-19 Thread Julio Biason
Hey guys, I got a weird problem with my pipeline. The pipeline processes lines from our logs and generates different metrics based on them (I mean, quite the standard procedure). It uses side outputs for dead letter queues, in case it finds something wrong with the logs and a metric can't

Re: Batch Side Outputs

2017-06-06 Thread Fabian Hueske
ne is currently working on adding side outputs > for the DataSet API. The workaround is to output one common type from a > function and have several parallel filters after that for filtering out the > elements of the correct type for the respective stream. > > Best, > Aljoscha >

Re: Batch Side Outputs

2017-06-06 Thread Aljoscha Krettek
Hi Flavio, As far as I am aware no one is currently working on adding side outputs for the DataSet API. The workaround is to output one common type from a function and have several parallel filters after that for filtering out the elements of the correct type for the respective stream. Best

Batch Side Outputs

2017-06-06 Thread Flavio Pompermaier
Hi to all, will side outputs [FLINK-4460 <https://issues.apache.org/jira/browse/FLINK-4460>] be eventually available also for batch API? Best, Flavio

Re: Side outputs

2017-02-08 Thread Chen Qin
FLINK-4460 Thanks, Chen On Wed, Feb 8, 2017 at 9:04 AM, Newport, Billy <billy.newp...@gs.com> wrote: > I’ve implemented side outputs right now using an enum approach as > recommended by others. Basically I have a mapper which wants to generate 4 > outputs (DATA, INSERT,

Side outputs

2017-02-08 Thread Newport, Billy
I've implemented side outputs right now using an enum approach as recommended by others. Basically I have a mapper which wants to generate 4 outputs (DATA, INSERT, UPDATES, DELETE). It emits a Tuple2<Enum,Record> right now and I then use 4 filters to write each 'stream' to a dif
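
The enum approach described in this post (tag every record with its output kind, then run one filter per kind) can be sketched in plain Java collections. This is only an illustration of the pattern and does not use the Flink API; `TagAndFilterDemo`, `select`, `sample`, and the string payloads are hypothetical, and `Map.Entry` stands in for Tuple2.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TagAndFilterDemo {
    // The four output kinds named in the post.
    enum Kind { DATA, INSERT, UPDATES, DELETE }

    // Stand-in for one filter: keeps only the payloads tagged with the wanted kind.
    static List<String> select(List<Map.Entry<Kind, String>> tagged, Kind wanted) {
        return tagged.stream()
                .filter(e -> e.getKey() == wanted)
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    // Hypothetical mapper output: each record carries its tag.
    static List<Map.Entry<Kind, String>> sample() {
        return List.of(
                Map.entry(Kind.DATA, "r1"),
                Map.entry(Kind.INSERT, "r2"),
                Map.entry(Kind.DELETE, "r3"),
                Map.entry(Kind.DATA, "r4"));
    }

    public static void main(String[] args) {
        List<Map.Entry<Kind, String>> tagged = sample();
        // One pass over the full tagged output per kind, i.e. four passes in total.
        for (Kind kind : Kind.values()) {
            System.out.println(kind + " -> " + select(tagged, kind));
        }
    }
}
```

Each filter scans the complete tagged output, which is the inefficiency other threads in this listing mention when comparing this workaround with side outputs.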