Bryan's guess on the history is probably right. More to the point, given what we have available these days with the record processors and so on, I think we should just change it back to one. I agree with Peter's statement on user expectation for sure. Peter, any chance you want to file that JIRA/PR?
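For anyone following along, here is a rough stand-alone model of why all 14 files went through in a single trigger. It is not the NiFi code: plain long values stand in for FlowFile sizes, the names BatchPullSketch, pullBatch, pullOne and queueOf are made up for illustration, and the 10 KB file size is an assumption (the thread only gives the count). The filter logic reflects my understanding of a size-based filter: always take the first item, then keep taking until the byte limit or the count limit would be exceeded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Stand-in model only: longs represent FlowFile sizes in bytes; no NiFi classes are used.
class BatchPullSketch {

    // Models a size-based batch pull: the first queued item is always taken,
    // later items are taken until the byte or count limit would be exceeded.
    static List<Long> pullBatch(Deque<Long> queue, long maxBytes, int maxCount) {
        List<Long> taken = new ArrayList<>();
        long total = 0L;
        while (!queue.isEmpty()) {
            long next = queue.peekFirst();
            boolean first = taken.isEmpty();
            if (!first && (total + next > maxBytes || taken.size() + 1 > maxCount)) {
                break; // limit reached: leave the rest in the queue
            }
            queue.pollFirst();
            total += next;
            taken.add(next);
        }
        return taken;
    }

    // Models a plain single-item pull: at most one item per trigger.
    static List<Long> pullOne(Deque<Long> queue) {
        List<Long> taken = new ArrayList<>();
        if (!queue.isEmpty()) {
            taken.add(queue.pollFirst());
        }
        return taken;
    }

    // Hypothetical helper: builds a queue of `count` items, each `sizeBytes` long.
    static Deque<Long> queueOf(int count, long sizeBytes) {
        Deque<Long> queue = new ArrayDeque<>();
        for (int i = 0; i < count; i++) {
            queue.addLast(sizeBytes);
        }
        return queue;
    }

    public static void main(String[] args) {
        long oneMb = 1024L * 1024;
        long tenKb = 10L * 1024; // assumed file size, not from the thread

        // 14 small files, limits of 1 MB / 100 items: everything fits in one pull.
        System.out.println(pullBatch(queueOf(14, tenKb), oneMb, 100).size()); // prints 14

        // Same queue with a single-item pull: one file per trigger.
        System.out.println(pullOne(queueOf(14, tenKb)).size()); // prints 1
    }
}
```

With the single-item pull, the run schedule alone decides how fast files move through, which is what the user expected.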
On Fri, May 4, 2018 at 9:13 AM, Bryan Bende <bbe...@gmail.com> wrote:
> I don't know the history of this particular processor, but I think the
> purpose of the session.get() with batches is similar to the concept of
> @SupportsBatching. Basically both of them should have better
> performance because you are handling multiple flow files in a single
> session. The supports batching concept is a bit more flexible as it is
> configurable by the user, whereas this case is hard-coded into the
> processor.
>
> I suppose if there is some reason why you need to process 1 flow file
> at a time, you could set the back-pressure threshold to 1 on the queue
> leading into ReplaceText.
>
> On Fri, May 4, 2018 at 3:50 AM, Peter Wicks (pwicks) <pwi...@micron.com>
> wrote:
>> Had a user notice today that a ReplaceText processor, scheduled to run every
>> 20 minutes, had processed all 14 files in queue at once. I looked at the
>> code and see that ReplaceText does not do a standard session.get, but
>> instead calls:
>>
>> final List<FlowFile> flowFiles =
>> session.get(FlowFileFilters.newSizeBasedFilter(1, DataUnit.MB, 100));
>>
>> Was there a design reason behind this? To us it was just really confusing
>> that we didn't have full control over how quickly FlowFiles move through
>> this processor.
>>
>> Thanks,
>> Peter