[jira] [Commented] (NIFI-12731) GetHBase should save state whenever the session is committed
[ https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813483#comment-17813483 ]

ASF subversion and git services commented on NIFI-12731:

Commit 2f42b44efa2f29a0380b5087e249d7975118a737 in nifi's branch refs/heads/main from Matt Burgess
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=2f42b44efa ]

NIFI-12731: Ensure state is updated in GetHBase whenever the session is committed

Signed-off-by: Pierre Villard

This closes #8346.

> GetHBase should save state whenever the session is committed
>
> Key: NIFI-12731
> URL: https://issues.apache.org/jira/browse/NIFI-12731
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Major
> Fix For: 2.0.0, 1.26.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Currently there is a place in the GetHBase code where the session is
> committed after each set of 500 rows/FlowFiles (so as not to run out of
> memory buffering millions of rows/FlowFiles) but the state is not updated. If
> an error occurs during processing of the entire table, the state is not
> updated but FlowFiles have already been sent downstream, so restarting the
> processor results in duplicate data.
> GetHBase should save the current state whenever the session is committed.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-12731) GetHBase should save state whenever the session is committed
[ https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813482#comment-17813482 ]

ASF subversion and git services commented on NIFI-12731:

Commit f4a1004d30b7be6edcf26781b5c39cdb395c0f81 in nifi's branch refs/heads/support/nifi-1.x from Matt Burgess
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=f4a1004d30 ]

NIFI-12731: Ensure state is updated in GetHBase whenever the session is committed

Signed-off-by: Pierre Villard

This closes #8347.

> GetHBase should save state whenever the session is committed
>
> Key: NIFI-12731
> URL: https://issues.apache.org/jira/browse/NIFI-12731
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Major
> Fix For: 2.0.0, 1.26.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently there is a place in the GetHBase code where the session is
> committed after each set of 500 rows/FlowFiles (so as not to run out of
> memory buffering millions of rows/FlowFiles) but the state is not updated. If
> an error occurs during processing of the entire table, the state is not
> updated but FlowFiles have already been sent downstream, so restarting the
> processor results in duplicate data.
> GetHBase should save the current state whenever the session is committed.
[jira] [Commented] (NIFI-12731) GetHBase should save state whenever the session is committed
[ https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813436#comment-17813436 ]

Matt Burgess commented on NIFI-12731:

There are two PRs due to merge conflicts, one based on main and one based on support/nifi-1.x

> GetHBase should save state whenever the session is committed
>
> Key: NIFI-12731
> URL: https://issues.apache.org/jira/browse/NIFI-12731
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Major
> Fix For: 2.0.0, 1.26.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently there is a place in the GetHBase code where the session is
> committed after each set of 500 rows/FlowFiles (so as not to run out of
> memory buffering millions of rows/FlowFiles) but the state is not updated. If
> an error occurs during processing of the entire table, the state is not
> updated but FlowFiles have already been sent downstream, so restarting the
> processor results in duplicate data.
> GetHBase should save the current state whenever the session is committed.
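
The failure mode described in the issue can be illustrated with a small stand-alone simulation. This is only a sketch, not the GetHBase code or the committed fix: it does not use the NiFi API, and the names `run_scan`, `simulate`, `ScanFailure`, the integer row ids, and the `last_row` state key are all hypothetical simplifications. The only detail taken from the issue is the commit after each set of 500 rows.

```python
BATCH_SIZE = 500  # the issue mentions a commit after each set of 500 rows


class ScanFailure(Exception):
    """Simulated error partway through the table scan (hypothetical)."""


def run_scan(rows, state, downstream, save_state_on_commit, fail_at=None):
    """Scan rows newer than state['last_row'], committing every BATCH_SIZE rows.

    rows       -- ordered integer row ids (a stand-in for HBase rows)
    state      -- dict with 'last_row', the last row recorded as processed
    downstream -- list receiving committed rows (a stand-in for FlowFiles)
    """
    pending = []
    last_seen = None
    start = state["last_row"]
    for row in rows:
        if row <= start:
            continue  # already covered by the saved state
        if fail_at is not None and row == fail_at:
            # Uncommitted rows in `pending` are dropped, like a rollback.
            raise ScanFailure(f"error at row {row}")
        pending.append(row)
        last_seen = row
        if len(pending) >= BATCH_SIZE:
            downstream.extend(pending)  # "session commit": rows are now downstream
            if save_state_on_commit:
                state["last_row"] = last_seen  # checkpoint together with the commit
            pending = []
    # Final commit at the end of the scan always saves state.
    downstream.extend(pending)
    if last_seen is not None:
        state["last_row"] = last_seen


def simulate(save_state_on_commit):
    """Fail once mid-scan, restart, and return the number of duplicate rows."""
    rows = list(range(1, 1201))
    state = {"last_row": 0}
    downstream = []
    try:
        run_scan(rows, state, downstream, save_state_on_commit, fail_at=1100)
    except ScanFailure:
        pass  # the processor is restarted below
    run_scan(rows, state, downstream, save_state_on_commit)
    return len(downstream) - len(set(downstream))


if __name__ == "__main__":
    print("duplicates, state saved only at end of scan:", simulate(False))
    print("duplicates, state saved on every commit:  ", simulate(True))
```

In this simplified model, the rows committed before the failure are sent again after the restart unless the state is saved alongside each intermediate commit, which is the behaviour the issue asks for.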