On 2023/02/16 16:58, Gabriele Svelto wrote:
On 16/02/23 04:48, ISHIKAWA,chiaki wrote:
On 2023/02/14 18:43, Gabriele Svelto wrote:
EXCEPTION_IN_PAGE_ERROR_READ / STATUS_UNEXPECTED_NETWORK_ERROR 1  2.00%

I know this is tangential to the original topic discussed.

I can certainly understand that a crash in the face of a malfunctioning I/O device may be unavoidable.

But a network error while Firefox (presumably) tries to load a page should not cause FF to crash, IMHO. I mean, we need to be resilient to malicious data (or the lack of it). Correct?

This is not a network error per se but rather an I/O error on a network-mounted filesystem. In that case Windows will deliver us a fatal exception because it cannot fill a page with data, and there's really not much we can do. This usually means the connection dropped in the middle of a transfer or the filesystem was unmounted from under us.

 Gabriele

Thank you for the comment.

I agree that there is not much we can do.
But still, for Thunderbird, I would like to see a graceful shutdown with an easy-to-understand error message such as "network file system did not respond". Otherwise, the user is left with a bitter taste in the mouth, thinking "Was my last e-mail sent successfully?", "Was the last downloaded e-mail stored safely?", etc.

For idempotent operations (that is, operations that can be tried many times and always return the same result), a crash is OK. E.g. FF fetching a page that would come back the same no matter how many times we try, or TB letting users look at the headers of already downloaded messages. These are idempotent operations.
(I am ignoring cache updates, setting of the already-read flag, etc.)

For non-idempotent operations (and TB's mail handling, such as receiving/writing/sending e-mails, is not idempotent), a crash is too harsh on the user. That is why I try to make TB's behavior a bit more acceptable in the face of serious trouble with network file systems and other I/O operations by handling such errors sensibly (and exiting gracefully if not much can be done). However, it is an uphill battle since low-level I/O error handling was not considered/tested well in TB.

But such attention should be given to FF users as well.
I suspect that an FF user in the middle of an important transaction (such as banking/payment), which is definitely NOT an idempotent operation, would have a similar sentiment if FF crashed just because the underlying network file system did not respond, etc.

BTW, there is a subtle difference between Windows and Linux regarding network file system operations (and their errors). The Windows I/O primitives try various network error recovery schemes such as retrying, including automatically handling short reads: when the remote server returns fewer octets than requested on the initial call and octets still remain on the server, Windows tries to read as many of them as possible. So in that sense, if a Windows I/O system call fails for a network operation, that is when we know a hard, unrecoverable error has occurred.
Windows has already tried a few error recovery methods.
OTOH, under Linux, the system call obviously does not do such extra error recovery and everything is passed to user code, which needs to take care of the short read and any other recovery measures.
Currently TB, and presumably FF too, does not handle such recovery very well. At least I have produced a patch for the short-read issues for TB under Linux and have tested it locally for several years. I learned the difference between Windows and Linux network I/O error handling at the OS level because C-C TB under Linux could not talk to a congested remote server which occasionally returned short responses, whereas C-C TB under Windows did not show such behavior.
I investigated and realized that C-C TB under Linux needed a fix.
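
To illustrate what "taking care of the short read" in user code means, here is a minimal sketch in Python (chosen for brevity). It is not the actual TB patch; the function name read_fully is hypothetical and only the standard os.read call is assumed. The loop keeps asking for the remaining octets until the request is satisfied or end-of-file is reached, and lets a real I/O error propagate to the caller.

```python
import os


def read_fully(fd, count):
    """Read up to `count` bytes from `fd`, retrying after short reads.

    Returns fewer than `count` bytes only when end-of-file is reached.
    An OSError (e.g. EIO from a failed network mount) is not swallowed;
    it propagates so the caller can report it and shut down gracefully.
    """
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:                   # empty result means end-of-file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller would use read_fully(fd, n) wherever it currently assumes a single read returns all n octets, and treat a shorter result as a genuine end-of-file.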

While testing the code by mimicking a failing remote file server (unplugging the network cable many times over a few weeks), I learned there are still other I/O issues, such as failures of ftell and lseek, which are still not handled perfectly in my patch. (It *IS* rare, and I suspect the ftell wrapper has a bug somewhere. The error is thrown back as a signal which C-C TB did not catch, thus crashing.) Coping gracefully with a misbehaving remote network file system is very hard to test without an instance of a malfunctioning remote file server which can be controlled to "err" on demand.
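
One way to approximate a file server that "errs" on demand, without touching cables, is to inject faults at the read layer in a test. The following is only a sketch under that assumption; the class FlakyReader and its parameters max_chunk and fail_after are hypothetical names, not anything that exists in TB or FF.

```python
import errno
import io


class FlakyReader:
    """Hypothetical test double: wraps a binary file object and injects faults."""

    def __init__(self, raw, max_chunk=None, fail_after=None):
        self.raw = raw
        self.max_chunk = max_chunk    # cap on bytes per call (forces short reads)
        self.fail_after = fail_after  # raise EIO once this many bytes were delivered
        self.delivered = 0

    def read(self, count):
        if self.fail_after is not None and self.delivered >= self.fail_after:
            raise OSError(errno.EIO, "injected I/O error")
        if self.max_chunk is not None:
            count = min(count, self.max_chunk)
        if self.fail_after is not None:
            # Never deliver past the configured failure point.
            count = min(count, self.fail_after - self.delivered)
        data = self.raw.read(count)
        self.delivered += len(data)
        return data
```

For example, FlakyReader(io.BytesIO(data), max_chunk=3, fail_after=5) returns at most 3 octets per call and raises EIO after 5 octets in total, so the short-read path and the hard-error path can both be exercised repeatably.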

Anyway, I know this topic is tangential to the original discussion.
At least, being able to know the causes of a crash, including possible hardware issues such as network failure, is great.
So thank you for showing how to do it.

Chiaki

