[
https://issues.apache.org/jira/browse/ARTEMIS-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033593#comment-17033593
]
Francesco Nigro commented on ARTEMIS-2618:
------------------------------------------
Sorry [~riconeubauer] that I've -1'd this so far; I do appreciate both the
proposal and the approach :(
I'm ignorant about Windows systems TBH, but I suppose Windows could be
similarly affected because of how NIO works: NIO makes use of the OS page
cache (as on the *nix side), and that's the reason why not failing fast could
be dangerous: the risk is to cause a misalignment between the process-side
view (the OS page cache) and the actual disk state (which is triggered just
once, on a real I/O error).
> Improve Handling of Shutdown on critical I/O Error
> --------------------------------------------------
>
> Key: ARTEMIS-2618
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2618
> Project: ActiveMQ Artemis
> Issue Type: Improvement
> Affects Versions: 2.11.0
> Reporter: Rico Neubauer
> Priority: Major
> Attachments: Improve-Handling-of-Shutdown-on-critic.patch
>
>
> Would like to request an improvement in the handling of critical I/O errors
> on opening journal files.
> If {{org.apache.activemq.artemis.core.io.nio.NIOSequentialFile}} fails to
> open a journal file, the whole server shuts down with {{@Message(id = 222010,
> value = "Critical IO Error, shutting down the server. file=1, message=0"}}.
> We have seen this in the wild, where backup software locked the file for a
> short time while the journal was about to be opened, resulting in the
> shutdown.
> The proposed improvement is to have a short-running retry for opening the
> journal files and to fail fatally only if the error persists.
> Will attach a proposal patch. Can also create a PR if you accept.
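
For illustration, the short-running retry proposed in the description could
look roughly like the sketch below. It is not the attached patch and not
Artemis code: the class name RetryingOpen, the method openWithRetry, and the
maxAttempts/delayMillis parameters are hypothetical, and it relies only on the
standard java.nio FileChannel.open call. Whether retrying is safe at all is
the concern raised in the comment above; the sketch covers only the open call
itself.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Artemis.
public final class RetryingOpen {
    private RetryingOpen() {}

    /**
     * Tries to open an existing file for read/write, retrying a bounded
     * number of times before giving up with the last IOException.
     */
    public static FileChannel openWithRetry(Path file, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // No CREATE option: a missing file is reported as an error.
                return FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
            } catch (IOException e) {
                last = e;
                // Wait briefly before the next attempt, e.g. for a transient lock.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // The error persisted across all attempts: let the caller fail fatally.
        throw last;
    }
}
```

A caller would treat the exception thrown after the last attempt exactly like
the current immediate failure, so the critical-error path stays in place and
is only delayed by at most (maxAttempts - 1) * delayMillis.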
--
This message was sent by Atlassian Jira
(v8.3.4#803005)