[jira] [Commented] (NIFI-12731) GetHBase should save state whenever the session is committed

2024-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813483#comment-17813483
 ] 

ASF subversion and git services commented on NIFI-12731:


Commit 2f42b44efa2f29a0380b5087e249d7975118a737 in nifi's branch 
refs/heads/main from Matt Burgess
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=2f42b44efa ]

NIFI-12731:  Ensure state is updated in GetHBase whenever the session is 
committed

Signed-off-by: Pierre Villard 

This closes #8346.


> GetHBase should save state whenever the session is committed
> 
>
> Key: NIFI-12731
> URL: https://issues.apache.org/jira/browse/NIFI-12731
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Major
> Fix For: 2.0.0, 1.26.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> GetHBase commits the session after each set of 500 rows/FlowFiles (so as not 
> to run out of memory buffering millions of rows/FlowFiles), but it does not 
> update the state at that point. If an error occurs before the whole table has 
> been processed, the state is never updated even though FlowFiles have already 
> been sent downstream, so restarting the processor results in duplicate data.
> GetHBase should save the current state whenever the session is committed.
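
The failure mode described above can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the NiFi code or the committed fix: `FakeSession`, `scan`, and `count_duplicates` are hypothetical names, the state is modeled as a single `last_ts` timestamp, and only the batch size of 500 comes from the issue description.

```python
BATCH_SIZE = 500  # batch size quoted in the issue description


class FakeSession:
    """Hypothetical stand-in for a processor session: buffers rows until commit."""

    def __init__(self, downstream):
        self.downstream = downstream
        self.pending = []

    def transfer(self, row):
        self.pending.append(row)

    def commit(self):
        # Committed rows are visible downstream and cannot be taken back.
        self.downstream.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


def scan(rows, state, downstream, save_state_on_commit, fail_at=None):
    """Emit rows newer than state['last_ts'], committing every BATCH_SIZE rows.

    rows is a list of (timestamp, value) pairs sorted by timestamp.
    fail_at is the list position at which a failure is simulated.
    """
    session = FakeSession(downstream)
    start_ts = state.get("last_ts", -1)
    latest_ts = start_ts
    try:
        for position, (ts, value) in enumerate(rows):
            if ts <= start_ts:
                continue  # already covered by the saved state
            if fail_at is not None and position == fail_at:
                raise RuntimeError("simulated failure mid-scan")
            session.transfer((ts, value))
            latest_ts = ts
            if len(session.pending) >= BATCH_SIZE:
                session.commit()
                if save_state_on_commit:
                    # The behavior the issue asks for: state follows each commit.
                    state["last_ts"] = latest_ts
        session.commit()
        state["last_ts"] = latest_ts
    except RuntimeError:
        # Only uncommitted rows are discarded; earlier batches stay downstream.
        session.rollback()


def count_duplicates(save_state_on_commit):
    """Run a scan that fails partway, restart it, and count duplicated rows."""
    rows = [(i, f"row-{i}") for i in range(1200)]
    state, downstream = {}, []
    scan(rows, state, downstream, save_state_on_commit, fail_at=1100)
    scan(rows, state, downstream, save_state_on_commit)  # restart after the error
    return len(downstream) - len(set(downstream))
```

In this sketch, `count_duplicates(False)` returns 1000, because the two batches committed before the failure are emitted again after the restart, while `count_duplicates(True)` returns 0, because the restart resumes from the last committed batch.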



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NIFI-12731) GetHBase should save state whenever the session is committed

2024-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813482#comment-17813482
 ] 

ASF subversion and git services commented on NIFI-12731:


Commit f4a1004d30b7be6edcf26781b5c39cdb395c0f81 in nifi's branch 
refs/heads/support/nifi-1.x from Matt Burgess
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=f4a1004d30 ]

NIFI-12731:  Ensure state is updated in GetHBase whenever the session is 
committed

Signed-off-by: Pierre Villard 

This closes #8347.




[jira] [Commented] (NIFI-12731) GetHBase should save state whenever the session is committed

2024-02-01 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813436#comment-17813436
 ] 

Matt Burgess commented on NIFI-12731:
-

There are two PRs because of merge conflicts: one based on main and one based 
on support/nifi-1.x.
