Re: [EXT] Sliding windows
Joe Thanks for the options. Kafka Flink option is interesting but to big at the moment. I’ll build a custom processor for this occasion. Much appreciated! Craig Knell > On 4 Jun 2019, at 21:56, Joe Witt wrote: > > ...from the description it isn't clear what you're trying to achieve so > lets first try to expand the detail on the use case. > > We should distinguish whether you're wanting to 'combine various objects in > a data stream together on some time bound' from 'processing various objects > in a data stream to make some observation over some time bound'. > > If you're wanting to merge data together to make a larger object comprised > of those smaller objects then MergeContent is your friend. > > If you're wanting to look at a given stream or set of streams at once and > make a time window based observation over that data then I recommend > looking at something like Apache Flink which is purpose built for that and > should be better than NiFi at that part. If it is a pretty straight > forward single stream window evaluation and you want to avoid having > another system in play then I'd just write a little custom processor in > NiFi for your case. Once you have a more complex data distribution and > processing requirement and you want a powerful low latency combination I'd > say put NiFi, Kafka, and Flink together for a pretty hard to beat combo. > > Thanks > > On Tue, Jun 4, 2019 at 9:51 AM Peter Wicks (pwicks) > wrote: > >> Craig, >> >> If you have a timestamp set as an attribute on the processor, then this is >> kind of possible. >> >> Have a regular MergeContent processor, with "Maximum Group Size" set to 1 >> mb, set "Max Bin Age" to 3 min; you may need to tweak settings to get the >> right cadence, but these are generally the settings you need to touch. Use >> the "Merged" relationship for whatever you need. To create the Window, pass >> the "Original" relationship to a RouteOnAttribute processor. >> >> In the RouteOnAttribute use NiFi Expression Language to calculate how old >> the FlowFile is (using the timestamp attribute I mentioned). If the >> FlowFile is older than x, drop it, else send it back to the MergeContent >> processor. >> >> Using this process, it should be easy to get a 5 min rolling window (drop >> any FlowFile older than 5 min in RouteOnAttribute). >> >> I don't know that this perfectly answers what you asked, but does it give >> you a good direction to investigate? >> >> Thanks, >> Peter >> >> -Original Message- >> From: Craig Knell >> Sent: Tuesday, June 4, 2019 1:32 AM >> To: dev@nifi.apache.org >> Subject: [EXT] Sliding windows >> >> Hi Folks >> >> We have a stream of data that I need to window to 5 minutes and the window >> is to slide every 3 minutes. Each minute is 1 mb, I therefore have to >> deliver 5mb per 3 minutes. >> >> What is the best way of achieving this in nifi? >> >> Best regards >> >> Craig >>
Re: [EXT] Sliding windows
...from the description it isn't clear what you're trying to achieve so lets first try to expand the detail on the use case. We should distinguish whether you're wanting to 'combine various objects in a data stream together on some time bound' from 'processing various objects in a data stream to make some observation over some time bound'. If you're wanting to merge data together to make a larger object comprised of those smaller objects then MergeContent is your friend. If you're wanting to look at a given stream or set of streams at once and make a time window based observation over that data then I recommend looking at something like Apache Flink which is purpose built for that and should be better than NiFi at that part. If it is a pretty straight forward single stream window evaluation and you want to avoid having another system in play then I'd just write a little custom processor in NiFi for your case. Once you have a more complex data distribution and processing requirement and you want a powerful low latency combination I'd say put NiFi, Kafka, and Flink together for a pretty hard to beat combo. Thanks On Tue, Jun 4, 2019 at 9:51 AM Peter Wicks (pwicks) wrote: > Craig, > > If you have a timestamp set as an attribute on the processor, then this is > kind of possible. > > Have a regular MergeContent processor, with "Maximum Group Size" set to 1 > mb, set "Max Bin Age" to 3 min; you may need to tweak settings to get the > right cadence, but these are generally the settings you need to touch. Use > the "Merged" relationship for whatever you need. To create the Window, pass > the "Original" relationship to a RouteOnAttribute processor. > > In the RouteOnAttribute use NiFi Expression Language to calculate how old > the FlowFile is (using the timestamp attribute I mentioned). If the > FlowFile is older than x, drop it, else send it back to the MergeContent > processor. > > Using this process, it should be easy to get a 5 min rolling window (drop > any FlowFile older than 5 min in RouteOnAttribute). > > I don't know that this perfectly answers what you asked, but does it give > you a good direction to investigate? > > Thanks, > Peter > > -Original Message- > From: Craig Knell > Sent: Tuesday, June 4, 2019 1:32 AM > To: dev@nifi.apache.org > Subject: [EXT] Sliding windows > > Hi Folks > > We have a stream of data that I need to window to 5 minutes and the window > is to slide every 3 minutes. Each minute is 1 mb, I therefore have to > deliver 5mb per 3 minutes. > > What is the best way of achieving this in nifi? > > Best regards > > Craig >
RE: [EXT] Sliding windows
Craig, If you have a timestamp set as an attribute on the processor, then this is kind of possible. Have a regular MergeContent processor, with "Maximum Group Size" set to 1 mb, set "Max Bin Age" to 3 min; you may need to tweak settings to get the right cadence, but these are generally the settings you need to touch. Use the "Merged" relationship for whatever you need. To create the Window, pass the "Original" relationship to a RouteOnAttribute processor. In the RouteOnAttribute use NiFi Expression Language to calculate how old the FlowFile is (using the timestamp attribute I mentioned). If the FlowFile is older than x, drop it, else send it back to the MergeContent processor. Using this process, it should be easy to get a 5 min rolling window (drop any FlowFile older than 5 min in RouteOnAttribute). I don't know that this perfectly answers what you asked, but does it give you a good direction to investigate? Thanks, Peter -Original Message- From: Craig Knell Sent: Tuesday, June 4, 2019 1:32 AM To: dev@nifi.apache.org Subject: [EXT] Sliding windows Hi Folks We have a stream of data that I need to window to 5 minutes and the window is to slide every 3 minutes. Each minute is 1 mb, I therefore have to deliver 5mb per 3 minutes. What is the best way of achieving this in nifi? Best regards Craig