Re: [PATCH] block: report errno when flock fcntl fails

Vladimir Sementsov-Ogievskiy Wed, 16 Dec 2020 06:29:05 -0800

16.12.2020 16:53, Kevin Wolf wrote:

Am 16.12.2020 um 14:06 hat David Edmondson geschrieben:

On Wednesday, 2020-12-16 at 12:57:46 +01, Kevin Wolf wrote:

Am 16.12.2020 um 12:38 hat David Edmondson geschrieben:

On Wednesday, 2020-12-16 at 12:29:40 +01, Kevin Wolf wrote:

Am 15.12.2020 um 20:01 hat David Edmondson geschrieben:

When a call to fcntl(2) for the purpose of manipulating file locks
fails, report the error returned by fcntl.

Signed-off-by: David Edmondson <david.edmond...@oracle.com>


Is appending "Resource temporarily unavailable" in the common case (a
file locked by another process) actually an improvement for the message
or more confusing? It would be good to mention the motivation for the
change in the commit message.


Distinguishing between the common case and others is at least a start
when trying to figure out why it failed. We have potentially useful
information to assist in diagnosis, why throw it away?


I agree in principle, just when I saw the result, it felt more confusing
than helpful. Maybe the best option would be to include the errno only
if it's different from EAGAIN? Then the qemu-iotests reference output
would probably stay unchanged, too.


This is a pretty low-level error report even today - unless you
understand the implementation then it doesn't shout "the file is being
shared".


Hm, certainly true for raw_apply_lock_bytes(), which shouldn't normally
fail, so I agree that we don't need to simplify messages there.

 From raw_check_lock_bytes(), you get messages like 'Failed to get
"write" lock', which I think is fairly obvious? I'm not saying that it's
perfect and couldn't be improved, of course, but it doesn't feel
horrible.

Given that, I don't see any value in eliding the EAGAIN message, as I
have to know that the lack of the errno string means that it was EAGAIN,
meaning that another process holds a lock.


When you debug an error, you'll look at the code anyway. And a normal
user won't know what EAGAIN or "Resource temporarily unavailable" means
even if it's displayed. Temporarily sounds more like it will go away by
itself, not that I have to shut down a different process first.

I wonder if anyone else has an opinion on this? If people think that
displaying it is preferable or it doesn't matter much, we can add it
despite my concerns.


[a bit offtopic, but related]

I think, there are two kinds of errors:

1. Simple errors, when user (or in our case, most probably some vm management 
tool developer) understand the error, understand what he is doing wrong and 
don't report the problem to Qemu developers.

These errors should look well, do not contain information which may confuse the 
user.

2. Errors that should not happen. Often the reason is some bug, sometimes, it's 
actually type [1] error, but user didn't understand what's going wrong.. These 
errors are reported to Qemu developers..

For these errors it doesn't matter how they look for the user, user doesn't 
understand what's going on anyway. These errors should contain as much 
information as possible..


And it's a pity, that in pursuit of well-formed error messages for [1], we 
loose information, and have to deal with well-formed, but not informative error 
messages, wasting time on debugging.

Even such simple thing like determine, from what line of code the error comes 
not always obvious, due to different reasons.

I'd prefer that error message always contain the call-stack with a function 
parameters, not saying about errno :)

--
Best regards,
Vladimir

Re: [PATCH] block: report errno when flock fcntl fails

Reply via email to