We don't. We've talked about putting it on there. We will in the future.

On Fri, Oct 23, 2015 at 3:46 PM, Mark Payne <marka...@hotmail.com> wrote:
> So digging in a bit more, the issue that I was concerned about is not
> really an issue, as the data is still cleaned up elsewhere in the code if
> the FlowFile repo is filled up.
>
> There certainly must have been something that would have caused the
> content repo to fill up, though. Any chance that you have system
> metrics/reporting applications running on this box, such as Ganglia? That
> would help indicate when the disk started filling up, to narrow down the
> timeframe to look at in the logs.
>
> > On Oct 23, 2015, at 3:27 PM, Ryan H <rhendrickson.w...@gmail.com> wrote:
> >
> > Yea.. they're all on the same partition. (We know.. not good..)
> >
> > I grepped and haven't found any OutOfMem errors...
> >
> > Ryan
> >
> > On Fri, Oct 23, 2015 at 3:15 PM, Mark Payne <marka...@hotmail.com> wrote:
> >
> >> This is indicating that it's unable to write to the FileSystemRepository -
> >> presumably because it is out of disk space.
> >>
> >> Any other error messages that you find?
> >>
> >> I'm wondering specifically if perhaps you received an OutOfMemoryError or
> >> anything of that nature? If it's not cleaning up after itself, that would
> >> tend to indicate that the cleanup thread is no longer running. So I'm
> >> looking for anything that could explain that.
> >>
> >> Also of note, you're seeing that the Content Repo is out of space, and you
> >> also mentioned an issue writing to the Provenance Repository.
> >>
> >> Was your FlowFile Repository also out of disk space? Looking through the
> >> code now, I'm seeing a case where if the FlowFile Repository is also out of
> >> disk space, you may start running into issues with the content repo filling
> >> up as well.
> >>
> >> Thanks
> >> -Mark
> >>
> >>
> >>> On Oct 23, 2015, at 3:00 PM, Ryan H <rhendrickson.w...@gmail.com> wrote:
> >>>
> >>> Random mentions of FileSystemRepo
> >>>
> >>> nifi-app_2015-10-23_10.0.log:
> >>> org.apache.nifi.processor.exception.FlowFileAccessException: Failed to
> >>> import data from java.io.ByteArrayInputStream@639fe470 for
> >>> StandardFlowFileRecord[uuid=881506e8-c62f-442f-a564-88675e4f0372,claim=,offset=0,name=914750081936080,size=0]
> >>> due to org.apache.nifi.processor.exception.FlowFileAccessException: Unable
> >>> to create ContentClaim due to java.io.IOException: Failed to write to
> >>> FileSystemRepository Stream [StandardContentClaim
> >>> [resourceClaim=StandardResourceClaim[id=1445594395635-74075,
> >>> container=default, section=347], offset=0, length=-1]]
> >>> nifi-app_2015-10-23_10.0.log:Caused by:
> >>> org.apache.nifi.processor.exception.FlowFileAccessException: Unable to
> >>> create ContentClaim due to java.io.IOException: Failed to write to
> >>> FileSystemRepository Stream [StandardContentClaim
> >>> [resourceClaim=StandardResourceClaim[id=1445594395635-74075,
> >>> container=default, section=347], offset=0, length=-1]]
> >>> nifi-app_2015-10-23_10.0.log:Caused by: java.io.IOException: Failed to
> >>> write to FileSystemRepository Stream [StandardContentClaim
> >>> [resourceClaim=StandardResourceClaim[id=1445594395635-74075,
> >>> container=default, section=347], offset=0, length=-1]]
> >>> nifi-app_2015-10-23_10.0.log:    at
> >>> org.apache.nifi.controller.repository.FileSystemRepository$2.write(FileSystemRepository.java:913)
> >>> ~[nifi-framework-core-0.3.0.jar:0.3.0]
> >>> nifi-app_2015-10-23_10.0.log:    at
> >>> org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:689)
> >>> ~[nifi-framework-core-0.3.0.jar:0.3.0]
> >>> nifi-app_2015-10-23_10.0.log:    at
> >>> org.apache.nifi.controller.repository.FileSystemRepository$2.write(FileSystemRepository.java:910)
> >>> ~[nifi-framework-core-0.3.0.jar:0.3.0]
> >>> nifi-app_2015-10-23_10.0.log:org.apache.nifi.processor.exception.FlowFileAccessException:
> >>> Failed to import data from /opt/nifi/logs/nifi-app_2015-10-23_05.0.log for
> >>> StandardFlowFileRecord[uuid=1b365988-1287-4d12-a31c-75c99dfc12ad,claim=,offset=0,name=914754786738219,size=0]
> >>> due to java.io.IOException: Failed to write to FileSystemRepository Stream
> >>> [StandardContentClaim
> >>> [resourceClaim=StandardResourceClaim[id=1445594400001-74079,
> >>> container=default, section=351], offset=0, length=-1]]
> >>>
> >>>
> >>> On Fri, Oct 23, 2015 at 2:59 PM, Ryan H <rhendrickson.w...@gmail.com> wrote:
> >>> Here's our provenance settings:
> >>>
> >>> # Persistent Provenance Repository Properties
> >>> nifi.provenance.repository.directory.default=./provenance_repository
> >>> nifi.provenance.repository.max.storage.time=24 hours
> >>> nifi.provenance.repository.max.storage.size=1 GB
> >>> nifi.provenance.repository.rollover.time=30 secs
> >>> nifi.provenance.repository.rollover.size=100 MB
> >>> nifi.provenance.repository.query.threads=2
> >>> nifi.provenance.repository.index.threads=1
> >>> nifi.provenance.repository.compress.on.rollover=true
> >>> nifi.provenance.repository.always.sync=false
> >>> nifi.provenance.repository.journal.count=16
> >>> # Comma-separated list of fields. Fields that are not indexed will not
> >>> be searchable.
> >>> Valid fields are:
> >>> # EventType, FlowFileUUID, Filename, TransitURI, ProcessorID,
> >>> AlternateIdentifierURI, ContentType, Relationship, Details
> >>> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID,
> >>> Filename, ProcessorID, Relationship
> >>> # FlowFile Attributes that should be indexed and made searchable
> >>> nifi.provenance.repository.indexed.attributes=
> >>> # Large values for the shard size will result in more Java heap usage
> >>> when searching the Provenance Repository
> >>> # but should provide better performance
> >>> nifi.provenance.repository.index.shard.size=500 MB
> >>> # Indicates the maximum length that a FlowFile attribute can be when
> >>> retrieving a Provenance Event from
> >>> # the repository. If the length of any attribute exceeds this value, it
> >>> will be truncated when the event is retrieved.
> >>> nifi.provenance.repository.max.attribute.length=65536
> >>>
> >>> # Volatile Provenance Repository Properties
> >>> nifi.provenance.repository.buffer.size=100000
> >>>
> >>> On Fri, Oct 23, 2015 at 2:57 PM, Elli Schwarz
> >>> <eliezer_schw...@yahoo.com.invalid> wrote:
> >>> We had a max storage size of 1GB, but that's for the provenance repo, and our
> >>> problem was with content_repo. Our disk was 60GB, all on one partition, and
> >>> 55GB were taken up by content_repo. Now, it only contains 233MB.
> >>>
> >>>
> >>> On Friday, October 23, 2015 2:50 PM, Mark Payne <marka...@hotmail.com> wrote:
> >>>
> >>> OK, so this is interesting. Do you have your content repository and
> >>> provenance repository both pointing to the same partition? What do you
> >>> have the "nifi.provenance.repository.max.storage.size" property set to?
> >>> How large is the actual disk?
> >>>
> >>> Thanks
> >>> -Mark
> >>>
> >>>
> >>>> On Oct 23, 2015, at 2:45 PM, Ryan H <rhendrickson.w...@gmail.com> wrote:
> >>>>
> >>>> I've got this one... let me look for that
> >>>>
> >>>> 2015-10-23 09:00:33,625 WARN [Provenance Maintenance Thread-1]
> >>>> o.a.n.p.PersistentProvenanceRepository
> >>>> java.io.IOException: No space left on device
> >>>>     at java.io.FileOutputStream.writeBytes(Native Method) ~[na:1.8.0_51]
> >>>>     at java.io.FileOutputStream.write(FileOutputStream.java:326) ~[na:1.8.0_51]
> >>>>     at org.apache.lucene.store.FSDirectory$FSIndexOutput$1.write(FSDirectory.java:390)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:73) ~[na:1.8.0_51]
> >>>>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) ~[na:1.8.0_51]
> >>>>     at org.apache.lucene.store.OutputStreamIndexOutput.writeBytes(OutputStreamIndexOutput.java:51)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:53)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.codecs.lucene40.BitVector.writeBits(BitVector.java:272)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.codecs.lucene40.BitVector.write(BitVector.java:227)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.writeLiveDocs(Lucene40LiveDocsFormat.java:107)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.ReadersAndUpdates.writeLiveDocs(ReadersAndUpdates.java:326)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter$ReaderPool.release(IndexWriter.java:520)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter$ReaderPool.release(IndexWriter.java:505)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:299)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3312)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3303)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2989)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3134)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3101)
> >>>> ~[lucene-core-4.10.4.jar:4.10.4 1662817 - mike - 2015-02-27 16:38:43]
> >>>>     at org.apache.nifi.provenance.lucene.DeleteIndexAction.execute(DeleteIndexAction.java:66)
> >>>> ~[nifi-persistent-provenance-repository-0.3.0.jar:0.3.0]
> >>>>     at org.apache.nifi.provenance.PersistentProvenanceRepository.purgeOldEvents(PersistentProvenanceRepository.java:906)
> >>>> ~[nifi-persistent-provenance-repository-0.3.0.jar:0.3.0]
> >>>>     at
> >>>> org.apache.nifi.provenance.PersistentProvenanceRepository$2.run(PersistentProvenanceRepository.java:260)
> >>>> [nifi-persistent-provenance-repository-0.3.0.jar:0.3.0]
> >>>>
> >>>> On Fri, Oct 23, 2015 at 2:44 PM, Mark Payne <marka...@hotmail.com> wrote:
> >>>>
> >>>>> Ryan, Elli,
> >>>>>
> >>>>> Do you by chance have any error messages in your logs from the
> >>>>> FileSystemRepository?
> >>>>>
> >>>>> I.e., if you perform:
> >>>>>
> >>>>> grep FileSystemRepository logs/*
> >>>>>
> >>>>> Do you get anything interesting in there?
> >>>>>
> >>>>> Thanks
> >>>>> -Mark
> >>>>>
> >>>>>
> >>>>>> On Oct 23, 2015, at 2:38 PM, Elli Schwarz
> >>>>>> <eliezer_schw...@yahoo.com.INVALID> wrote:
> >>>>>>
> >>>>>> I've been working with Ryan. There appear to be a few issues here:
> >>>>>>
> >>>>>> - We upgraded from 0.2.0 to 0.3.0, and it appears that the
> >>>>>> content_repository archive is now true by default. In 0.2.0 it was false,
> >>>>>> and the documentation still states it is false by default.
> >>>>>> - When we ran out of disk space overnight, the problem was solved by
> >>>>>> me simply restarting NiFi, and that cleared out the archive by itself.
> >>>>>> - In order to clear up the archive, I had to set archive to true,
> >>>>>> set max usage to 1%, and restart NiFi. That cleared it up, and then I set
> >>>>>> archive to false and restarted again so we don't run out of space.
> >>>>>> - Based on the above, it appears that something happened yesterday
> >>>>>> that prevented NiFi from clearing out the archive even though disk usage
> >>>>>> reached 100%. However, restarting NiFi apparently enabled it to perform the
> >>>>>> clearing of the archive. So apparently the max usage setting doesn't work
> >>>>>> under some conditions, but we don't know what conditions occurred overnight
> >>>>>> to cause this problem.
> >>>>>>
> >>>>>> Thanks!
> >>>>>> -Elli
> >>>>>>
> >>>>>>
> >>>>>> On Friday, October 23, 2015 2:29 PM, Ryan H
> >>>>>> <rhendrickson.w...@gmail.com> wrote:
> >>>>>>
> >>>>>> Agree, they concern the archive... although it sounds like there are 2
> >>>>>> archives?
> >>>>>>
> >>>>>> Within the content_repository folder, there are subfolders with the name
> >>>>>> 'archive' and files inside them.
> >>>>>>
> >>>>>> Example:
> >>>>>> ./nfii/content_repository/837/archive/1445611320767-837
> >>>>>>
> >>>>>> Settings:
> >>>>>> nifi.content.repository.archive.max.retention.period=12 hours
> >>>>>> nifi.content.repository.archive.max.usage.percentage=50%
> >>>>>> nifi.content.repository.archive.enabled=true
> >>>>>>
> >>>>>> Last night, our server ran out of disk space because the content_repository
> >>>>>> grew too large. NiFi didn't crash, but the log file contained errors
> >>>>>> saying the disk was full.
> >>>>>>
> >>>>>> We're not sure how, but the content_repository did not respect the above
> >>>>>> settings.
> >>>>>>
> >>>>>> We restarted NiFi, and only then did it start to remove files, such as:
> >>>>>> ./nfii/content_repository/837/archive/1445611320767-837
> >>>>>>
> >>>>>> We've turned off archiving for now.
> >>>>>>
> >>>>>> Ryan
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Oct 23, 2015 at 1:51 PM, Aldrin Piri <aldrinp...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Ryan,
> >>>>>>>
> >>>>>>> Those items only concern the archive. Did you have data enqueued in
> >>>>>>> connections in your flow? If so, those items are not eligible and could
> >>>>>>> explain why your disk was filled. Otherwise, can you please provide some
> >>>>>>> additional information so we can dig into why this may have arisen.
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> On Fri, Oct 23, 2015 at 10:25 AM, Ryan H
> >>>>>>> <rhendrickson.w...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> I've got the following set:
> >>>>>>>>
> >>>>>>>> nifi.content.repository.archive.max.retention.period=12 hours
> >>>>>>>> nifi.content.repository.archive.max.usage.percentage=50%
> >>>>>>>> nifi.content.repository.archive.enabled=true
> >>>>>>>>
> >>>>>>>> Yet, the content repo filled my disk last night...
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Oct 23, 2015 at 1:16 PM, Aldrin Piri <aldrinp...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Ryan,
> >>>>>>>>>
> >>>>>>>>> Those archive folders map to the nifi.content.repository.archive.enabled
> >>>>>>>>> property.
> >>>>>>>>>
> >>>>>>>>> What this property provides is retention of files no longer in the system,
> >>>>>>>>> for historical context of your flow's processing and the ability to view
> >>>>>>>>> them in conjunction with provenance events, as well as allowing replay.
> >>>>>>>>> The size of the archive, when enabled, is bounded by the
> >>>>>>>>> properties nifi.content.repository.archive.max.retention.period and
> >>>>>>>>> nifi.content.repository.archive.max.usage.percentage.
> >>>>>>>>>
> >>>>>>>>> Additional detail is available in the system properties section of our
> >>>>>>>>> Administration Guide [1].
> >>>>>>>>>
> >>>>>>>>> Let us know if you have additional questions.
> >>>>>>>>>
> >>>>>>>>> --aldrin
> >>>>>>>>>
> >>>>>>>>> [1]
> >>>>>>>>> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#system_properties
> >>>>>>>>>
> >>>>>>>>> On Fri, Oct 23, 2015 at 10:09 AM, Ryan H
> >>>>>>>>> <rhendrickson.w...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Interesting.. So what would
> >>>>>>>>>>
> >>>>>>>>>> ./nfii/content_repository/837/archive/1445611320767-837
> >>>>>>>>>>
> >>>>>>>>>> typically be?
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Oct 23, 2015 at 12:56 PM, Andrew Grande
> >>>>>>>>>> <agra...@hortonworks.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Attachments don't go through, view at imagebin:
> >>>>>>>>>>> http://ibin.co/2K3SwR0z8yWX
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 10/23/15, 12:52 PM, "Andrew Grande" <agra...@hortonworks.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Ryan,
> >>>>>>>>>>>>
> >>>>>>>>>>>> ./conf/archive is to create a snapshot of your entire flow, not
> >>>>>>>>>>>> the content repository data. See the attached screenshot (Settings menu
> >>>>>>>>>>>> on the right).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Andrew
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 10/23/15, 12:47 PM, "ryan.andrew.hendrick...@gmail.com on behalf of
> >>>>>>>>>>>> Ryan H" <rhendrickson.w...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>> I'm noticing my Content Repo growing large. There's a number of
> >>>>>>>>>>>>> files...
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> content_repo/837/archive/144...-837
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Is this new in 3.0? My conf file says any archiving should be
> >>>>>>>>>>>>> going into ./conf/archive, but I don't see anything in there.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Ryan
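
[Editor's note] The layout discussed above - archived content claims living in per-section `archive/` subdirectories of the content repository, named `<rollover-timestamp>-<section>` - can be inspected from the shell. A minimal sketch, assuming only the directory and file names mimic the example path from the thread (they are illustrative, not taken from any real install):

```shell
# Simulate the layout from the thread; a real install keeps this under the
# NiFi home directory (e.g. ./content_repository).
mkdir -p content_repository/837/archive
printf 'claim data' > content_repository/837/archive/1445611320767-837

# List archived content claims (files under any */archive/ subdirectory)
find content_repository -path '*/archive/*' -type f

# Total disk usage of the content repository, archive included
du -sh content_repository
```

Comparing `du` on the whole repository against the `archive/` subdirectories shows whether the space is held by active claims (e.g. data enqueued in connections, which the archive settings do not bound) or by the archive itself.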