[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2022-09-16 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605945#comment-17605945
 ] 

Guozhang Wang commented on KAFKA-10575:
---

I've just created a new KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-869%3A+Improve+Streams+State+Restoration+Visibility

> StateRestoreListener#onRestoreEnd should always be triggered
> 
>
> Key: KAFKA-10575
> URL: https://issues.apache.org/jira/browse/KAFKA-10575
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Guozhang Wang
> Assignee: highluck
> Priority: Major
>
> Today we only trigger `StateRestoreListener#onRestoreEnd` when we complete 
> the restoration of an active task and transition it to the running state. 
> However, the restoration can also be stopped when the restoring task gets 
> closed (because it gets migrated to another client, for example). We should 
> also trigger the callback, indicating its progress, whenever the restoration 
> stops, in any scenario.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2022-09-16 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605911#comment-17605911
 ] 

Guozhang Wang commented on KAFKA-10575:
---

Hello [~nicktelford], thanks for your input! Yes, I'm now thinking about 
introducing a new API on `StateRestoreListener` for the paused scenarios, 
and creating a KIP for that new API as well as a couple of related metrics 
changes that will be introduced by KAFKA-10199.

Regarding the {{TaskStateChangeListener}}, I think it is worth a separate 
discussion thread of its own --- personally I think only very advanced users 
would make use of it, since {{Tasks}} are a concept that the Streams library 
wants to more or less abstract away from common users: they should not have to 
worry too much about the unit of parallelism after all.



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2022-09-16 Thread Nicholas Telford (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605885#comment-17605885
 ] 

Nicholas Telford commented on KAFKA-10575:
--

On a related note, based on what [~ableegoldman] said, it might be a good idea 
to introduce a {{TaskStateChangeListener}}, which is notified of changes to a 
Task state. What do you think?



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2022-09-16 Thread Nicholas Telford (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605878#comment-17605878
 ] 

Nicholas Telford commented on KAFKA-10575:
--

Hi [~guozhang], I ran into this issue today and have some thoughts on it.

I agree that {{onRestoreEnd}} currently implies that the restoration 
_completed_, since that's what the documentation says. I suggest we add a new 
method, {{StateRestoreListener#onRestoreAbort}} (or {{onRestoreSuspend}} or 
{{onRestorePaused}}, etc.), which handles the case that restoration was stopped 
before it could complete (i.e. because the {{Task}} was closed).

It should be enough to simply call this new method in 
{{StoreChangelogReader#unregister}}, which is called when {{Task}}s are 
closed/migrated.

For backwards compatibility, this new method should have a {{default}} 
implementation in the {{StateRestoreListener}} interface that is a no-op.

What do you think? And since this involves adding a new method, do we need a 
KIP for this?
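
A minimal sketch of the backwards-compatibility point above. It uses a simplified stand-in interface ({{SketchRestoreListener}}, with a String in place of Kafka's TopicPartition) so that it compiles without the Kafka dependency; the interface, the listener classes and the {{onRestoreAbort}} signature are illustrative assumptions, not the actual Streams API.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Kafka Streams' StateRestoreListener. The real interface
// identifies the changelog with a TopicPartition; a String is used here so the
// sketch compiles without the Kafka dependency.
interface SketchRestoreListener {
    void onRestoreStart(String changelogPartition, String storeName, long startOffset, long endOffset);

    void onRestoreEnd(String changelogPartition, String storeName, long totalRestored);

    // Hypothetical new callback. The empty default body means implementations
    // written before it existed keep compiling and simply ignore the event.
    default void onRestoreAbort(String changelogPartition, String storeName, long totalRestored) {
    }
}

// A listener written against the old interface: it never mentions onRestoreAbort.
class LegacyListener implements SketchRestoreListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void onRestoreStart(String changelogPartition, String storeName, long startOffset, long endOffset) {
        events.add("start:" + storeName);
    }

    @Override
    public void onRestoreEnd(String changelogPartition, String storeName, long totalRestored) {
        events.add("end:" + storeName);
    }
}

// A listener that opts in to the new callback by overriding the default.
class AbortAwareListener extends LegacyListener {
    @Override
    public void onRestoreAbort(String changelogPartition, String storeName, long totalRestored) {
        events.add("abort:" + storeName + ":" + totalRestored);
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        LegacyListener legacy = new LegacyListener();
        legacy.onRestoreStart("app-counts-changelog-0", "counts", 0L, 100L);
        legacy.onRestoreAbort("app-counts-changelog-0", "counts", 40L); // falls through to the no-op
        System.out.println(legacy.events);

        AbortAwareListener aware = new AbortAwareListener();
        aware.onRestoreStart("app-counts-changelog-0", "counts", 0L, 100L);
        aware.onRestoreAbort("app-counts-changelog-0", "counts", 40L);
        System.out.println(aware.events);
    }
}
```

The legacy listener records only the start event, while the opt-in listener also records the abort with the number of records restored so far.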



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2022-06-13 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553806#comment-17553806
 ] 

Guozhang Wang commented on KAFKA-10575:
---

I took another look at this ticket while working on KAFKA-10199 in parallel, 
and here are some updates:

1. I can confirm that today we only call `onRestoreEnd` for case 1 in 
[~ableegoldman]'s list, and not for cases 2 and 3 (in fact, case 3 is just a 
special case of case 2, since we would first transition to CLOSED anyway).
2. On second thought, there may be different groups of users with different 
expectations about the semantics of these callbacks, for example:

a) The original complaint that drove this ticket is based on the expectation 
that each `onRestoreStart` would always be paired with an `onRestoreEnd`. This 
is not actually the case, because of cases 2 and 3.
b) Others may expect that `onRestoreEnd` is only triggered when the 
restoration has actually completed. In fact, this is what we explicitly state 
in the javadocs.

So, if we just call `onRestoreEnd` in cases 2 and 3, we may make users in a) 
happier, but we would break compatibility for users in b). In addition, for 
the same topic-partition and store name there might be multiple restoration 
processes happening at the same time (e.g. when there are standby replicas), 
so it is not straightforward to pair each `onRestoreStart` with a unique 
`onRestoreEnd`.

With those thoughts, I'm now leaning towards not just calling `onRestoreEnd` 
for cases 2 and 3, but instead introducing a new API, e.g. `onRestorePaused`, 
for those cases, and also documenting clearly that not every `onRestoreStart` 
will be paired with exactly one `onRestoreEnd` or `onRestorePaused`, to reduce 
users' unrealistic expectations.

Thoughts?
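
To illustrate the pairing concern, here is a rough sketch of how user code could track restorations without assuming a strict start/end pairing. `RestoreTracker` and its method signatures are hypothetical (a String stands in for the topic-partition, and `onRestorePaused` is only the name proposed above); it keeps a count per partition and store, so overlapping restorations of the same store and unmatched callbacks do not corrupt its state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: counts restorations per (partition, store) instead of
// assuming each start is followed by exactly one end.
class RestoreTracker {
    private final Map<String, Integer> inFlight = new HashMap<>();
    private int completed;
    private int paused;

    synchronized void onRestoreStart(String changelogPartition, String storeName) {
        // Several restorations may be active for the same key, so count them.
        inFlight.merge(key(changelogPartition, storeName), 1, Integer::sum);
    }

    synchronized void onRestoreEnd(String changelogPartition, String storeName) {
        completed++;
        release(key(changelogPartition, storeName));
    }

    // Name taken from the proposal above; not an existing callback in this sketch's stand-in.
    synchronized void onRestorePaused(String changelogPartition, String storeName) {
        paused++;
        release(key(changelogPartition, storeName));
    }

    synchronized int inFlightCount(String changelogPartition, String storeName) {
        return inFlight.getOrDefault(key(changelogPartition, storeName), 0);
    }

    synchronized int completedCount() {
        return completed;
    }

    synchronized int pausedCount() {
        return paused;
    }

    // Decrement the count, dropping the entry at zero; an unmatched
    // end or pause for an unknown key is tolerated and changes nothing.
    private void release(String key) {
        inFlight.computeIfPresent(key, (k, n) -> n > 1 ? n - 1 : null);
    }

    private static String key(String changelogPartition, String storeName) {
        return changelogPartition + "/" + storeName;
    }
}
```

Because the tracker only counts, it cannot say which start a given end or pause belongs to, which is exactly the ambiguity described above.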



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2021-03-25 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308931#comment-17308931
 ] 

A. Sophie Blee-Goldman commented on KAFKA-10575:


[~high.lee] this ticket is aimed at making sure the #onRestoreEnd callback is 
always invoked (if it exists) any time a task ends ongoing restoration. 
Specifically, we probably want to invoke it in all of these scenarios:
# transition from RESTORING to RUNNING: i.e., restoration has actually completed
# transition from RESTORING to CLOSED: this one might be a bit tricky, since 
technically a closing task must transition as RESTORING -> SUSPENDED -> CLOSED, 
but we do not want to invoke this callback if the task is going to be resumed 
after suspension instead of closed. However, a suspended task is only ever 
resumed under EAGER rebalancing, which is likely not used in the latest 
versions and will be removed soon anyway, so maybe it's fine to just invoke it 
during the RESTORING -> SUSPENDED transition -- WDYT [~guozhang]? Also note, 
this transition will cover the case of a restoring task that hits TaskCorrupted 
and must be revived
# a RESTORING active task is recycled from active to standby

As of today, I believe we only trigger this callback in the 1st case, and 
possibly the 3rd. So this ticket would encompass confirming that we do this for 
recycled tasks and implementing it for closing tasks.
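
The transitions listed above can be sketched as a small decision function. This is only an illustration: the `TaskState`, `RestoreOutcome` and `RestoreTransitions` types are hypothetical and much simpler than the real Streams task state machine, and the recycling case (3) is not modelled.

```java
// Hypothetical, simplified task states; not the actual Streams Task.State enum.
enum TaskState { RESTORING, RUNNING, SUSPENDED, CLOSED }

// What a listener would be told when a transition happens.
enum RestoreOutcome { COMPLETED, STOPPED_EARLY, NONE }

class RestoreTransitions {
    // Decide whether a transition ends an ongoing restoration, and how.
    static RestoreOutcome outcome(TaskState from, TaskState to) {
        if (from != TaskState.RESTORING) {
            return RestoreOutcome.NONE; // nothing was being restored
        }
        switch (to) {
            case RUNNING:
                return RestoreOutcome.COMPLETED;     // case 1: restoration finished
            case SUSPENDED:
                return RestoreOutcome.STOPPED_EARLY; // case 2: first step towards CLOSED
            default:
                return RestoreOutcome.NONE;
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(TaskState.RESTORING, TaskState.RUNNING));
        System.out.println(outcome(TaskState.RESTORING, TaskState.SUSPENDED));
        System.out.println(outcome(TaskState.SUSPENDED, TaskState.CLOSED));
    }
}
```

Firing on RESTORING -> SUSPENDED rather than on SUSPENDED -> CLOSED follows the suggestion above and accepts that a task resumed under EAGER rebalancing would be reported as stopped.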



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2021-03-25 Thread highluck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308411#comment-17308411
 ] 

highluck commented on KAFKA-10575:
--

[~ableegoldman] 
Yes, I am willing to work on it. However, I am not sure what needs to be done.
Could you tell me more about the direction you would like to take with this 
ticket?



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2021-03-24 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308309#comment-17308309
 ] 

A. Sophie Blee-Goldman commented on KAFKA-10575:


[~high.lee] have you been able to work on this ticket? If you have other PRs 
that you are currently focusing on, you can always unassign this one and 
re-assign it once you're ready to pick it up. (No pressure, I've just been 
trying to clean up some tickets and make sure the owners are active)



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2020-12-07 Thread highluck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17245318#comment-17245318
 ] 

highluck commented on KAFKA-10575:
--

[~Yohan123] 
Are you working on this task?



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2020-11-12 Thread Richard Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231123#comment-17231123
 ] 

Richard Yu commented on KAFKA-10575:


Thanks for letting me know! I was already looking at StoreChangelogReader since 
that was probably one of only two places where onRestoreEnd was called. 
Hopefully, I will be able to pull together a PR that can tackle this issue.



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2020-11-10 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229436#comment-17229436
 ] 

Guozhang Wang commented on KAFKA-10575:
---

Of course. You can take a look at the StoreChangelogReader class as the entry 
point for a fix.



[jira] [Commented] (KAFKA-10575) StateRestoreListener#onRestoreEnd should always be triggered

2020-11-09 Thread Richard Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17228890#comment-17228890
 ] 

Richard Yu commented on KAFKA-10575:


[~guozhang] I'm interested in picking this one up. May I try my hand at it?
