It may be possible that disk I/O is saturated (i.e. more writes than the
physical disk can handle).

-----Original Message-----
From: owner-m...@openbsd.org <owner-m...@openbsd.org> On Behalf Of Jan Johansson
Sent: Wednesday, 10 March 2021 6:21 AM
To: misc@openbsd.org
Cc: Mike Larkin <mlar...@nested.page>; Ian Darwin <i...@darwinsys.com>
Subject: Re: vmm/vmd disk issue

Mike Larkin <mlar...@nested.page> wrote:
> On Tue, Mar 09, 2021 at 09:38:57AM -0500, Ian Darwin wrote:
> > On Tue, Mar 09, 2021 at 09:52:03AM +0100, Jan Johansson wrote:
> > > If I try to cp or dd the disk image on the host it fails
> > >
> > > dd if=disk.raw.old of=disk.raw.bak bs=1m
> > > dd: disk.raw.old: Input/output error
> > > 8858+0 records in
> > > 8858+0 records out
> > > 9288286208 bytes transferred in 102.048 secs (91018010 bytes/sec)
> > >
> > > The host shows no other signs of failing hardware.
> > >
> > > Is this a software or a hardware error?
> >
> > Given that it gives an error outside the VM, it's likely hardware.
> >
>
> Agreed. Sorta hard to fault vmd(8) if it's not even running.

Since these are sparse files, could vioblk(4) somehow write incorrect data
that later makes the file unreadable, such as a pointer pointing into
nothingness?

The messages

vmd[39543]: vioblk write error: Input/output error
vmd[39543]: wr vioblk: disk write error

were produced at 01:30, when all 4 guests and the host run the daily
script (which makes backups and performs other maintenance tasks), if that
could have any impact.

Shouldn't something on the host, such as sd(4) or ahci(4), be logging errors
to dmesg/syslog?
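
A rough sketch of how the host logs could be searched for such messages. The
function name disk_errors and the pattern are my own assumptions about what
typical error lines might contain; this is not an exhaustive list of what
sd(4) or ahci(4) actually print.

```shell
#!/bin/sh
# Filter log text on stdin for lines that look like disk or controller
# errors. The pattern is an illustrative guess, not taken from the drivers.
disk_errors() {
    grep -E -i '(^|[^a-z])(sd|wd|ahci)[0-9]+.*(error|timeout|fail)|input/output error'
}

# Possible use on the host:
#   dmesg | disk_errors
#   disk_errors < /var/log/messages
```

If nothing matches, that does not prove the hardware is fine; it only means
no line with this wording was logged.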

(If it is not obvious my understanding of how the virtio/vioblk stuff hooks in 
to the disk stack is very limited)

This drive was installed in August 2020 and if I recall correctly it was
because of this issue. So I am thinking cable or motherboard.

If I decide to replace it, would it make sense to make this a softraid mirror
(RAID1) to avoid or get a better indication of this kind of problem in the
future, or would that only add more parts that can break?

I am currently trying to provoke the drive from the host with

dd if=/dev/random of=test.raw bs=1m count=17000

then cp/dd and cmp to see if I can make it break for real.
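
The write, copy and compare steps could be wrapped up roughly as below. This
is only a sketch: the function name stress_copy, the file names and the
return codes are made up for illustration, and the block size is given in
bytes so the same line works with dd implementations that do not accept
"1m". It reads /dev/urandom rather than /dev/random as an assumption, to
avoid blocking on systems where /dev/random can block.

```shell
#!/bin/sh
# stress_copy DIR SIZE_MB
# Write SIZE_MB MiB of random data into DIR, copy it, and compare the copy.
# Returns 0 if all steps succeed, 1-3 to show which step failed.
stress_copy() {
    dir=$1
    size_mb=$2
    src="$dir/test.raw"
    dst="$dir/test.copy"

    # Step 1: write random data to the disk under test.
    dd if=/dev/urandom of="$src" bs=1048576 count="$size_mb" 2>/dev/null || return 1

    # Step 2: read it all back; a read error makes dd exit non-zero.
    dd if="$src" of="$dst" bs=1048576 2>/dev/null || return 2

    # Step 3: byte-for-byte comparison; cmp -s is silent, non-zero on mismatch.
    cmp -s "$src" "$dst" || return 3

    return 0
}

# Possible use, with a directory on the suspect disk:
#   stress_copy /path/on/suspect/disk 17000; echo "exit status: $?"
```

One caveat: if the test file is smaller than the memory available for
caching, the reads may be served from cache and never touch the disk, so a
large size is probably needed for the test to mean anything.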


