GitHub user shanthoosh opened a pull request:
https://github.com/apache/samza/pull/494
SAMZA-1692: Standalone stability fixes.
* Currently, on session expiration, a processorListener with an incorrect
generationId is registered with ZooKeeper (the ZkUtils generationId is
incremented on reconnect, but the generationId in processorListener always
stays zero). When this happens to the immediate successor of the leader, that
processor skips the leader expiration event. This prevents leader re-election
when the current leader dies and stalls the processor group. The fix is to
re-instantiate and then register processorChangeListener on session expiration.
* Add processorId to the debounce thread name (this aids debugging when
multiple processors are running within a JVM).
* After the ScheduleAfterDebounceTime queue is shut down, don't accept new
schedule requests. The current ZkJobCoordinator shutdown sequence comprises the
following steps:
A. Shut down the ScheduleAfterDebounceTime queue.
B. Stop the zkClient and relinquish its resources.
After ScheduleAfterDebounceTime is shut down and before zkClient is stopped,
zkClient can still schedule new operations in the ScheduleAfterDebounceTime
queue. This results in a RejectedExecutionException, since the executorService
is stopped.
Sample exception:
`Caused by: java.util.concurrent.RejectedExecutionException: Task
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@23f962a8
rejected from java.util.concurrent.ScheduledThreadPoolExecutor@43408be8
`
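The second and third fixes above can be sketched roughly as follows. This is a minimal illustration, not the Samza implementation: `DebounceScheduler` and its method names are hypothetical stand-ins for ScheduleAfterDebounceTime, and the thread-name format is an assumption. It shows a scheduler that names its thread after the processorId and quietly ignores schedule requests that arrive after shutdown instead of letting the executor throw RejectedExecutionException.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for ScheduleAfterDebounceTime; not the actual Samza class.
final class DebounceScheduler {
    private final ScheduledExecutorService executor;
    private boolean shutDown = false; // guarded by "this"

    DebounceScheduler(String processorId) {
        // Include the processorId in the thread name so that threads of
        // different processors in the same JVM can be told apart.
        // The exact name format here is an assumption.
        String threadName = "debounce-thread-" + processorId;
        this.executor = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, threadName);
            thread.setDaemon(true);
            return thread;
        });
    }

    /**
     * Schedules the action after the given delay.
     * Returns false (and does nothing) if the scheduler was already stopped.
     */
    synchronized boolean schedule(Runnable action, long delayMs) {
        if (shutDown) {
            // A late request, e.g. from a client callback that fires between
            // "stop the queue" and "stop the client", is dropped here.
            return false;
        }
        executor.schedule(action, delayMs, TimeUnit.MILLISECONDS);
        return true;
    }

    /** Stops the executor; later schedule() calls are ignored. */
    synchronized void stop() {
        shutDown = true;
        executor.shutdownNow();
    }
}
```

Because `schedule` and `stop` synchronize on the same object, a request cannot slip in between the shutdown check and the call into the executor.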
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shanthoosh/samza master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/samza/pull/494.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #494
----
commit 5f9e5bdd4adc9ba14ef06b3de536787cb36cd0cc
Author: Shanthoosh Venkataraman <svenkataraman@...>
Date: 2018-04-30T04:28:14Z
SAMZA-1692: Standalone stability fixes.
- Currently, on session expiration, a processorListener with an incorrect
generationId is registered with ZooKeeper (the ZkUtils generationId is
incremented on reconnect, but the generationId in processorListener always
stays zero). When a session reconnect happens on the processor that
immediately succeeds the leader, the leader expiration event is skipped. This
prevents leader re-election when the current leader dies and stalls the
processor group. The fix is to re-instantiate and then register
processorChangeListener on session expiration.
- Add processorId to the debounce thread name (this aids debugging when
multiple processors are running within a JVM).
- After the ScheduleAfterDebounceTime queue is shut down, don't accept new
schedule requests. The current ZkJobCoordinator shutdown sequence comprises the
following steps:
- Shut down the ScheduleAfterDebounceTime queue.
- Stop the zkClient and relinquish its resources.
After ScheduleAfterDebounceTime is shut down and before zkClient is stopped,
new operations can still be scheduled in the ScheduleAfterDebounceTime queue.
This results in a RejectedExecutionException, since the executorService is
stopped.
```
Caused by: java.util.concurrent.RejectedExecutionException: Task
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@23f962a8
rejected from java.util.concurrent.ScheduledThreadPoolExecutor@43408be8
```
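The first fix in the commit message above (the stale generationId in the listener) can be illustrated with a small sketch. Everything here is hypothetical: `GenerationSketch`, `onNewSession`, and the rule that a listener drops events when its generation does not match the current one are assumptions made for illustration, not the actual ZkUtils or ZkJobCoordinator code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the generationId mismatch; not the Samza classes.
final class GenerationSketch {
    // Stand-in for the generationId that is incremented on every reconnect.
    static final AtomicInteger currentGeneration = new AtomicInteger(0);

    static final class ProcessorChangeListener {
        private final int generationId;
        final List<String> handled = new ArrayList<>();

        ProcessorChangeListener(int generationId) {
            this.generationId = generationId;
        }

        void onChange(String event) {
            // Assumed behavior: a listener from an older generation skips events.
            if (generationId != currentGeneration.get()) {
                return;
            }
            handled.add(event);
        }
    }

    // On a new session, bump the generation and build a fresh listener with
    // the new value instead of re-registering the old (generation 0) instance.
    static ProcessorChangeListener onNewSession() {
        int generation = currentGeneration.incrementAndGet();
        return new ProcessorChangeListener(generation);
    }
}
```

In this model, a listener created once with generation 0 and re-registered after a reconnect would skip a leader expiration event, while the listener returned by `onNewSession()` would handle it.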
----