[
https://issues.apache.org/jira/browse/CASSANDRA-20363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930780#comment-17930780
]
Stefan Miklosovic edited comment on CASSANDRA-20363 at 2/26/25 6:11 PM:
------------------------------------------------------------------------
Well ... the primary motivation for doing it this way is that there might, in
theory, be different implementations of this interface, and a user _might_
want to pass some configuration properties to them, which ParameterizedClass
enables. If we have just a simple system property, we lose the ability to
configure the implementation.
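To illustrate the configurability point, here is a minimal sketch of a settings object built from the kind of string-to-string parameter map that a ParameterizedClass entry carries. The class name `HandlerSettings` and the keys `poll_interval_ms` and `directories` are made up for this example and do not exist in Cassandra.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical settings holder; neither this class nor its keys exist in Cassandra. */
public final class HandlerSettings {
    public final long pollIntervalMs;
    public final List<String> directories;

    public HandlerSettings(Map<String, String> parameters) {
        // Assumed key: how often to re-check the disk, defaulting to 5 seconds.
        this.pollIntervalMs = Long.parseLong(parameters.getOrDefault("poll_interval_ms", "5000"));
        // Assumed key: comma-separated list of directories to watch; blanks are ignored.
        String dirs = parameters.getOrDefault("directories", "");
        this.directories = Arrays.stream(dirs.split(","))
                                 .map(String::trim)
                                 .filter(s -> !s.isEmpty())
                                 .collect(Collectors.toList());
    }
}
```

A single system property could name the implementation class, but per-implementation values like these would each need their own property.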
Tommy also wrote:
{code}
// if disk_failure_policy is stop and problem is disk unavailable schedule a task
// that monitor when disk is available again and open gossip and transports
{code}
That got me thinking ... what does it actually mean "to schedule a task"? So an
error comes to the "handleFSError" method and then he does what? 1) How does he
tell that the disk is not available? 2) How is the actual scheduling done? If
he just starts a thread by submitting it to some thread pool, will he then be
looping in that thread until "the disk is back again"? How does he know which
disk to look at? Should this not be configurable? Etc. etc ...
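To make the questions above concrete, here is one possible reading of "schedule a task", purely as a sketch and not Tommy's actual implementation. It assumes that "available" means the directory exists and is readable and writable, that the caller supplies the directory to watch, and that the recovery action (for example re-opening gossip and transports) is passed in as a plain Runnable. The class `DiskRecoveryMonitor` and all of its methods are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of Cassandra: polls a directory until it is usable again. */
public class DiskRecoveryMonitor implements AutoCloseable {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "disk-recovery-monitor");
            t.setDaemon(true); // never keep the JVM alive just for polling
            return t;
        });
    private ScheduledFuture<?> running; // guarded by "this"

    /** Assumption: a disk counts as available if the directory exists and is readable and writable. */
    public static boolean isAvailable(Path dir) {
        return Files.isDirectory(dir) && Files.isReadable(dir) && Files.isWritable(dir);
    }

    /**
     * Starts polling dir at a fixed delay and runs onRecovered once when it becomes available.
     * Returns false if a poll is already in progress, so repeated errors do not pile up tasks.
     */
    public synchronized boolean monitor(Path dir, long period, TimeUnit unit, Runnable onRecovered) {
        if (running != null && !running.isDone())
            return false;
        running = executor.scheduleWithFixedDelay(() -> {
            if (isAvailable(dir)) {
                try {
                    onRecovered.run();
                } finally {
                    cancelCurrent(); // stop polling after the first successful check
                }
            }
        }, 0, period, unit);
        return true;
    }

    private synchronized void cancelCurrent() {
        if (running != null)
            running.cancel(false);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

Even this small sketch needs a directory, a polling period and a definition of "available", which is the kind of input a custom handler would probably want to take from configuration.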
To summarize, I think that what I am doing now is "guessing" what Tommy's
implementation would look like. I would appreciate it if we had the full
picture here so we can tailor the solution to Tommy's needs, rather than build
something Tommy will have a hard time integrating with.
[~tommy_s] would you mind sharing with us what your custom implementation would
look like?
> Add option to set a custom FSErrorHandler
> -----------------------------------------
>
> Key: CASSANDRA-20363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20363
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Legacy/Core
> Reporter: Tommy Stendahl
> Assignee: Tommy Stendahl
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Add java property to override the DefaultFSErrorHandler with a custom
> implementation.
> The use case I am looking at is a customer deployment that uses network
> disks, and these can go off-line sometimes. I would like to use
> "disk_failure_policy: stop" but automatically detect when the disk is on-line
> again and just open gossip and transports so the node comes back UP without
> triggering a restart of the node.