On 9/30/21 11:24, Daniel P. Berrangé wrote:
On Thu, Sep 30, 2021 at 11:15:18AM -0600, Jim Fehlig wrote:
On 9/29/21 21:29, Jim Fehlig wrote:
Hi All,

Likely Christian received a bug report that motivated commit aeda1b8c56,
which was later reverted by Michal with commit 72adaf2f10. In the past,
I recall being asked about "internal error: End of file from qemu
monitor" on normal VM shutdown and gave a hand wavy response using some
of Michal's words from the revert commit message.

I recently received a bug report (sorry, but no public link) from a
concerned user about this error and wondered if there is some way to
improve it? I went down some dead ends before circling back to
Christian's patch. When rebased to latest master, I cannot reproduce the
hangs reported by Michal [1]. Perhaps Nikolay's series to resolve
hangs/crashes of libvirtd [2] has now made Christian's patch viable?

Hmm, Nikolay's series improves thread management at daemon shutdown and
doesn't touch VM shutdown logic. But there has been some behavior change
from the time aeda1b8c56 was committed (3.4.0 dev cycle) to current git
master. At least I don't see libvirtd hanging when running Michal's test on
master + rebased aeda1b8c56.

This particular "eof" error message has been a source of never ending
complaints from people who mistakenly think it indicates a significant
problem.

Nod. I've probably been asked about it more times than I'm remembering. "error" and "fail" often catch the attention of users and monitor tools.

I've always been wary of hiding it by default as there are potentially
scenarios it which it is interesting to see. I think I'm coming around
to the idea though that we're better off hiding it by default. The
scenarios which care about it will probably already need the user to
contribute full debug level logs in order to diagnose properly.

I've started a variant of Michal's test on git master + rebased aeda1b8c56. The test starts 6 VMs, waits 90sec, shuts them down in parallel, waits for shutdowns to finish, then repeats. All while continuously calling GetAllDomainStats from another client. I'll see how that holds up before re-proposing Christian's patch.

Regards,
Jim


Reply via email to