[ 
https://issues.apache.org/jira/browse/ARTEMIS-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033593#comment-17033593
 ] 

Francesco Nigro commented on ARTEMIS-2618:
------------------------------------------

Sorry [~riconeubauer] that I've -1'd this so far; I do appreciate both the 
proposal and the approach :(
TBH I don't know Windows systems well, but I suppose Windows could be 
similarly affected because of how NIO works: NIO relies on the OS page cache 
(as it does on *nix), and that's why not failing fast could be dangerous. 
The risk is a mismatch between the process-side view of the file (the OS 
page cache) and the actual disk state, since a real I/O error is reported 
just once.

> Improve Handling of Shutdown on critical I/O Error
> --------------------------------------------------
>
>                 Key: ARTEMIS-2618
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2618
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>    Affects Versions: 2.11.0
>            Reporter: Rico Neubauer
>            Priority: Major
>         Attachments: Improve-Handling-of-Shutdown-on-critic.patch
>
>
> Would like to request an improvement in the handling of critical I/O errors 
> on opening journal files.
> If {{org.apache.activemq.artemis.core.io.nio.NIOSequentialFile}} fails to 
> open a journal file, the whole server shuts down with {{@Message(id = 222010, 
> value = "Critical IO Error, shutting down the server. file=1, message=0"}}.
> We have seen this in the wild, where backup software locked the file for a 
> short time just as the journal was about to be opened, resulting in the shutdown.
> Proposed improvement would be to have a short-running retry for opening the 
> journal files and only fail fatally if error persists.
> Will attach a proposal patch. Can also create a PR if you accept.
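
For illustration only, the retry described in the issue could look roughly like 
the sketch below. This is not the attached patch and not Artemis code: the 
class JournalOpenRetry, the Opener interface, and the maxAttempts and 
delayMillis parameters are all hypothetical names. Only FileChannel.open and 
StandardOpenOption are real JDK API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: bounded retry around the "open journal file" step only. */
public final class JournalOpenRetry {

    /** Hypothetical hook so the open step can be replaced in tests. */
    @FunctionalInterface
    public interface Opener<T> {
        T open() throws IOException;
    }

    private JournalOpenRetry() {
    }

    /**
     * Calls opener up to maxAttempts times, sleeping delayMillis between
     * failed attempts. If every attempt fails, the last IOException is
     * rethrown so the caller can still treat it as a critical error.
     */
    public static <T> T openWithRetry(Opener<T> opener, int maxAttempts, long delayMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return opener.open();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and give up immediately.
                        Thread.currentThread().interrupt();
                        e.addSuppressed(ie);
                        throw e;
                    }
                }
            }
        }
        throw last;
    }

    /** Example use: open a file for read/write with a short retry window. */
    public static FileChannel openJournal(Path file, int maxAttempts, long delayMillis)
            throws IOException {
        return openWithRetry(
                () -> FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE),
                maxAttempts, delayMillis);
    }
}
```

Note that the sketch retries only the open call. Errors on write or sync are 
deliberately not covered, since, per the comment above, retrying those could 
leave the page cache and the disk state out of step.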



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
