[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397270#comment-16397270 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2487

> FlowFile Repository should write updates to the same FlowFile to the same partition
> -----------------------------------------------------------------------------------
>
>                 Key: NIFI-4774
>                 URL: https://issues.apache.org/jira/browse/NIFI-4774
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.6.0
>
>
> As-is, in the case of power loss or Operating System crash, we could have an
> update that is lost, and then an update for the same FlowFile that is not
> lost, because the updates for a given FlowFile can span partitions. If an
> update were written to Partition 1 and then to Partition 2 and Partition 2 is
> flushed to disk by the Operating System and then the Operating System crashes
> or power is lost before Partition 1 is flushed to disk, we could lose the
> update to Partition 1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
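The failure described in the issue can be modelled in a few lines. The sketch below is not NiFi code: PartitionSketch, SamePartitionSketch and partitionFor are hypothetical names, and the hash-based routing only illustrates the idea in the issue title. It is not the change that was merged, which (per the commit message in this thread) lets the user choose the write-ahead log implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one write-ahead log partition; not NiFi code. */
final class PartitionSketch {
    private final List<String> buffered = new ArrayList<>(); // written, not yet on disk
    private final List<String> durable = new ArrayList<>();  // survived a flush

    void write(String update) {
        buffered.add(update);
    }

    /** Stands in for the operating system flushing this partition to disk. */
    void flush() {
        durable.addAll(buffered);
        buffered.clear();
    }

    /** Stands in for a crash or power loss: anything unflushed is gone. */
    List<String> recoverAfterCrash() {
        buffered.clear();
        return new ArrayList<>(durable);
    }
}

final class SamePartitionSketch {
    /** Maps a FlowFile id to a fixed partition so all of its updates share one log. */
    static int partitionFor(String flowFileId, int numPartitions) {
        return Math.floorMod(flowFileId.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        PartitionSketch p1 = new PartitionSketch();
        PartitionSketch p2 = new PartitionSketch();

        // Failure case from the issue: two updates to one FlowFile span partitions.
        p1.write("ff-1: first update");
        p2.write("ff-1: second update");
        p2.flush(); // only Partition 2 reaches disk before the crash

        System.out.println("Partition 1 after crash: " + p1.recoverAfterCrash());
        System.out.println("Partition 2 after crash: " + p2.recoverAfterCrash());

        // With id-based routing, every update for "ff-1" goes to the same partition.
        System.out.println("ff-1 always routes to partition " + partitionFor("ff-1", 2));
    }
}
```

In this model the second update survives the crash while the first is lost, which is the inconsistency the issue describes. Routing by id keeps both updates in one log so they are flushed together in the model; whether that is sufficient in practice is exactly what the discussion further down the thread is about.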
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397266#comment-16397266 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2487

    +1 LGTM, I tested switching back and forth and everything looks good. Thanks for the improvement! Merging to master
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397267#comment-16397267 ]

ASF subversion and git services commented on NIFI-4774:
-------------------------------------------------------

Commit d14229e4407ce4587ba422fbd85e15a9d4f66f85 in nifi's branch refs/heads/master from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d14229e ]

NIFI-4774: Allow user to choose which write-ahead log implementation should be used by the WriteAheadFlowFileRepository

Removed TODO comment

Signed-off-by: Matthew Burgess

This closes #2487
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16396993#comment-16396993 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user mosermw commented on the issue:

    https://github.com/apache/nifi/pull/2487

    @mattyb149 and @markap14 with the 1.6.0 release approaching, do you think we've allowed enough time for review and testing? Is this good to go? Thanks.
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388037#comment-16388037 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user mosermw commented on the issue:

    https://github.com/apache/nifi/pull/2487

    I tested this and I was able to switch back and forth between MinimalLockingWriteAheadLog and SequentialAccessWriteAheadLog. +1 from me.

    It's best to make this switch while there are 0 flowfiles in the repository. With flowfiles in the system, going from MinimalLockingWAL to SequentialAccessWAL worked, but the opposite had some issues.
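As a rough illustration of what "switching" means here, the sketch below reads the property named in the diff further down this thread (nifi.flowfile.repository.wal.implementation) and picks one of the two implementations mentioned above. It is a sketch under assumptions, not the NiFi code: WalSelectionSketch and chooseImplementation are hypothetical, the simple class names stand in for whatever fully qualified names NiFi expects, and which implementation is the default is assumed rather than taken from the thread.

```java
import java.util.Properties;

/** Hypothetical sketch of selecting a write-ahead log implementation by property. */
final class WalSelectionSketch {
    // Property name as shown in the WriteAheadFlowFileRepository diff in this thread.
    static final String WRITE_AHEAD_LOG_IMPL = "nifi.flowfile.repository.wal.implementation";

    // Simple names taken from the review comments; the fully qualified class
    // names that NiFi expects are not given in this thread.
    static final String SEQUENTIAL = "SequentialAccessWriteAheadLog";
    static final String MINIMAL_LOCKING = "MinimalLockingWriteAheadLog";

    // Assumption: the thread mentions a default but does not say which one it is.
    static final String ASSUMED_DEFAULT = SEQUENTIAL;

    /** Returns the configured implementation name, or the assumed default if unset. */
    static String chooseImplementation(Properties props) {
        String configured = props.getProperty(WRITE_AHEAD_LOG_IMPL, ASSUMED_DEFAULT).trim();
        if (configured.equals(SEQUENTIAL) || configured.equals(MINIMAL_LOCKING)) {
            return configured;
        }
        throw new IllegalArgumentException(
                "Unknown write-ahead log implementation: " + configured);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println("unset      -> " + chooseImplementation(props));

        props.setProperty(WRITE_AHEAD_LOG_IMPL, MINIMAL_LOCKING);
        System.out.println("configured -> " + chooseImplementation(props));
    }
}
```

Rejecting unknown values up front, instead of silently falling back, is a design choice of this sketch. Given the note above that switching with flowfiles in the repository had issues in one direction, failing loudly on a typo seems the safer behaviour.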
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377044#comment-16377044 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2487#discussion_r170630297

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java ---
    @@ -129,17 +138,22 @@ public WriteAheadFlowFileRepository() {
             checkpointDelayMillis = 0l;
             numPartitions = 0;
             checkpointExecutor = null;
    -        flowFileRepositoryPath = null;
    +        walImplementation = null;
    --- End diff --

    It doesn't really matter what it's set to - that constructor will only be used for service loading and no method in the class will ever be called.
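The point about the no-argument constructor can be illustrated with a small, self-contained sketch. Everything here is hypothetical (the class name, the second constructor taking java.util.Properties, the placeholder default value); it only shows the general pattern in which a no-arg constructor exists so that a loader can instantiate the class reflectively, while a separate constructor builds the instance that is actually used.

```java
import java.util.Properties;

/** Hypothetical sketch of a class with a discovery-only no-arg constructor. */
final class ServiceLoadedRepositorySketch {
    private final String walImplementation;

    /**
     * Exists only so the class can be instantiated during discovery.
     * No methods are expected to be called on this instance, so the
     * field value is irrelevant and is left null.
     */
    public ServiceLoadedRepositorySketch() {
        this.walImplementation = null;
    }

    /** Constructor for the instance that is really used (assumed shape). */
    public ServiceLoadedRepositorySketch(Properties props) {
        this.walImplementation = props.getProperty(
                "nifi.flowfile.repository.wal.implementation",
                "hypothetical-default"); // placeholder, not a real NiFi default
    }

    String walImplementation() {
        return walImplementation;
    }

    public static void main(String[] args) throws Exception {
        // Reflective no-arg instantiation, similar in spirit to service loading.
        ServiceLoadedRepositorySketch discovered =
                ServiceLoadedRepositorySketch.class.getDeclaredConstructor().newInstance();
        System.out.println("discovered instance: " + discovered.walImplementation());

        ServiceLoadedRepositorySketch configured =
                new ServiceLoadedRepositorySketch(new Properties());
        System.out.println("configured instance: " + configured.walImplementation());
    }
}
```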
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377042#comment-16377042 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2487#discussion_r170630080

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java ---
    @@ -80,6 +80,15 @@
      */
     public class WriteAheadFlowFileRepository implements FlowFileRepository, SyncListener {
         private static final String FLOWFILE_REPOSITORY_DIRECTORY_PREFIX = "nifi.flowfile.repository.directory";
    +    private static final String WRITE_AHEAD_LOG_IMPL = "nifi.flowfile.repository.wal.implementation";
    +
    +    // TODO: Update Admin Guide
    --- End diff --

    Yup, this is done :) If you get a chance to merge, please delete the comment.
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373414#comment-16373414 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2487#discussion_r170084710

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java ---
    @@ -129,17 +138,22 @@ public WriteAheadFlowFileRepository() {
             checkpointDelayMillis = 0l;
             numPartitions = 0;
             checkpointExecutor = null;
    -        flowFileRepositoryPath = null;
    +        walImplementation = null;
    --- End diff --

    Should this be DEFAULT_WAL_IMPLEMENTATION?
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373415#comment-16373415 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2487#discussion_r170084644

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java ---
    @@ -80,6 +80,15 @@
      */
     public class WriteAheadFlowFileRepository implements FlowFileRepository, SyncListener {
         private static final String FLOWFILE_REPOSITORY_DIRECTORY_PREFIX = "nifi.flowfile.repository.directory";
    +    private static final String WRITE_AHEAD_LOG_IMPL = "nifi.flowfile.repository.wal.implementation";
    +
    +    // TODO: Update Admin Guide
    --- End diff --

    I think this TODO is DONE :)
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373163#comment-16373163 ]

Mark Payne commented on NIFI-4774:
----------------------------------

I have submitted a new PR that allows the user to configure which implementation of the write-ahead log to use.
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373160#comment-16373160 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

GitHub user markap14 opened a pull request:

    https://github.com/apache/nifi/pull/2487

    NIFI-4774: Allow user to choose which write-ahead log implementation …

    …should be used by the WriteAheadFlowFileRepository

    Thank you for submitting a contribution to Apache NiFi.

    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:

    ### For all changes:
    - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
    - [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
    - [ ] Is your initial contribution a single, squashed commit?

    ### For code changes:
    - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    - [ ] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markap14/nifi NIFI-4774-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2487.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2487

commit 07c1b1140178b83d69f31a4ec99e8a23d134d962
Author: Mark Payne
Date:   2018-02-22T17:51:38Z

    NIFI-4774: Allow user to choose which write-ahead log implementation should be used by the WriteAheadFlowFileRepository
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371911#comment-16371911 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2416

    The proposed solution does address the issue that was raised in NIFI-4774, but in doing so introduces a new issue of data loss. This is why I provided the solution that I did in the PR, as I believe that it addresses both of these issues. Again, I do not have a problem with making this new solution one that the user is able to opt out of, though. If you want to submit a PR that also incorporates your proposed solution into the MinimalLockingWriteAheadLog, that is okay too, and we can review and incorporate that change as well.
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371636#comment-16371636 ]

Brandon DeVries commented on NIFI-4774:
---------------------------------------

-1. I was under the impression the PR for this was still a work in progress. I commented on the PR (https://github.com/apache/nifi/pull/2416).
[jira] [Commented] (NIFI-4774) FlowFile Repository should write updates to the same FlowFile to the same partition
[ https://issues.apache.org/jira/browse/NIFI-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332522#comment-16332522 ]

ASF GitHub Bot commented on NIFI-4774:
--------------------------------------

Github user devriesb commented on the issue:

    https://github.com/apache/nifi/pull/2416

    While I in no way object to a new implementation, I'm not sure that it is the correct solution to the bug described in NIFI-4774 [1]. A new implementation would need to be tested to a degree that a tweak to the existing implementation would not, and fixing this bug in a timely fashion would seem to be a worthy goal.

    [1] https://issues.apache.org/jira/browse/NIFI-4774