Thanks Joe

On Tue, Apr 26, 2016 at 5:48 PM, Joe Witt <[email protected]> wrote:
> We're a gentle bunch. Fire away and ask questions as needed. You can
> reply to this to get this back on list. All good.
>
> So yeah we need to handle these cases better. You should not have to
> worry about this so that is on us to improve. Let's go ahead and
> start this dialogue.
>
> Thanks
> Joe
>
> On Tue, Apr 26, 2016 at 1:31 PM, Jim Wagoner <[email protected]> wrote:
> > It is a few hundred thousand so that definitely sounds like the issue I am
> > seeing. On a previous project I had the same issue, but had a much more
> > powerful machine compared to the EC2 instance I am running now so I could
> > just throw hardware at the problem.
> >
> > The dev list is fine. Just wasn't sure of the etiquette when submitting to
> > the dev list.
> >
> > On Tue, Apr 26, 2016 at 5:27 PM, Joe Witt <[email protected]> wrote:
> >>
> >> Jim,
> >>
> >> I suspect the issue lies elsewhere really unless there are hundreds of
> >> thousands of objects in each zip file. We don't load the data in
> >> memory while compressing or decompressing because we have a streaming
> >> API for operating on the data. However, we do keep pointers and
> >> attributes of those flow files so after hundreds of thousands those
> >> could become problematic.
> >>
> >> Any reason to be off-list or is it ok if I put this on the dev list?
> >>
> >> Thanks
> >> Joe
> >>
> >> On Tue, Apr 26, 2016 at 1:21 PM, Jim Wagoner <[email protected]> wrote:
> >> > Joe,
> >> >
> >> > I have been using the unpack processor to uncompress some fairly large
> >> > files and have been running into memory and performance issues. I wrote
> >> > my own unpacker that transfers files as they are uncompressed (clearly
> >> > not side-effect free) and it seems to help with memory usage and
> >> > performance, and gives me the option of handling corrupt zips. I wanted
> >> > to see if you could think of any issues within the framework with taking
> >> > this approach. Is there a pattern I should look at when creating a large
> >> > number of flow files in a single processor call within NiFi?
> >> >
> >> > Thanks,
> >> > Jim
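
For illustration only, here is a minimal sketch of the "transfer each file as it is uncompressed" idea from the quoted thread. It uses plain java.util.zip and no NiFi API; the class name StreamingUnzip and the EntryHandler callback are hypothetical stand-ins for whatever hands each entry downstream, and this is not the actual unpacker or the unpack processor's implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of NiFi: streams a zip one entry at a time.
class StreamingUnzip {

    /** Hypothetical callback that receives one entry and passes it on. */
    interface EntryHandler {
        // The handler should read 'content' to the end but must not close it,
        // because it is the shared ZipInputStream positioned at this entry.
        void handle(String name, InputStream content) throws IOException;
    }

    /**
     * Walks the archive entry by entry and hands each file entry to the
     * handler immediately, so no per-entry state accumulates in this method.
     * Returns the number of file entries handed off.
     */
    static int unpack(InputStream archive, EntryHandler handler) throws IOException {
        int count = 0;
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    handler.handle(entry.getName(), zip);
                    count++;
                }
                zip.closeEntry(); // skip any bytes the handler left unread
            }
        }
        return count;
    }
}
```

With this shape, a corrupt archive surfaces as an IOException partway through, after earlier entries have already been handed off. That is the "not side-effect free" trade-off mentioned above: the caller has to decide what to do with the entries that were already emitted.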
