[ https://issues.apache.org/jira/browse/NIFI-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984665#comment-15984665 ]
Andre F de Miranda commented on NIFI-71:
----------------------------------------

[~mcgilman] - apologies for necrobumping another JIRA, but just checking whether this is still relevant with the introduction of the WAL provenance repo?

> Persistent Prov Repo should compress in blocks
> ----------------------------------------------
>
>                 Key: NIFI-71
>                 URL: https://issues.apache.org/jira/browse/NIFI-71
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Matt Gilman
>            Priority: Minor
>
> Currently we write a bunch of events to a file and then compress the file. We
> then index the file offset of the uncompressed version of the file.
> We should instead compress in chunks of X number of events or X number of
> bytes. Then index the offset of the chunk in the compressed version. This
> way, we can use FileInputStream.skip to seek to the appropriate offset and
> then wrap the stream in GZIPInputStream. This allows us to avoid reading a
> lot of compressed data to get to the desired offset.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
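
For reference, the block-compression scheme proposed in the quoted issue could look roughly like the sketch below. It is an illustration only, not NiFi's actual provenance repository code: the class name `BlockCompressedLog`, the methods `write` and `readBlock`, and the record layout (an event count followed by length-prefixed UTF strings) are all assumptions made for the example. Each block is written as an independent gzip member, and the offset recorded is the block's position in the compressed file.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, for illustration only (not part of NiFi).
public class BlockCompressedLog {

    /**
     * Writes events as independent gzip blocks of at most eventsPerBlock events.
     * Returns the offset of each block within the COMPRESSED file.
     */
    public static List<Long> write(Path file, List<String> events, int eventsPerBlock)
            throws IOException {
        List<Long> blockOffsets = new ArrayList<>();
        try (FileOutputStream fos = new FileOutputStream(file.toFile())) {
            for (int start = 0; start < events.size(); start += eventsPerBlock) {
                // FileOutputStream is unbuffered, so the channel position is the
                // exact compressed-file offset at which this block begins.
                blockOffsets.add(fos.getChannel().position());
                int end = Math.min(start + eventsPerBlock, events.size());

                GZIPOutputStream gzip = new GZIPOutputStream(fos);
                DataOutputStream out = new DataOutputStream(gzip);
                out.writeInt(end - start); // number of events in this block
                for (String event : events.subList(start, end)) {
                    out.writeUTF(event);
                }
                out.flush();
                gzip.finish(); // completes this gzip member without closing fos
            }
        }
        return blockOffsets;
    }

    /** Skips to a block's compressed offset and decompresses only that block. */
    public static List<String> readBlock(Path file, long blockOffset) throws IOException {
        try (FileInputStream fis = new FileInputStream(file.toFile())) {
            // skip() may skip fewer bytes than requested, so loop until done.
            long remaining = blockOffset;
            while (remaining > 0) {
                long skipped = fis.skip(remaining);
                if (skipped <= 0) {
                    throw new EOFException("Could not skip to offset " + blockOffset);
                }
                remaining -= skipped;
            }
            DataInputStream in = new DataInputStream(new GZIPInputStream(fis));
            int count = in.readInt();
            List<String> events = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                events.add(in.readUTF());
            }
            return events;
        }
    }
}
```

With this layout, an index would store the compressed offset returned by `write` for the block containing each event, and a lookup would call `readBlock` with that offset, so only one block is decompressed rather than everything that precedes the event.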