On Fri, Sep 16, 2022 at 05:59:20PM -0700, Michael Truog wrote:
> Hi,
>
> I was attempting to have a RAID 5 softraid0 setup on a sparc64 machine (boot
> log output below) but ran into problems when attempting to create a single
> partition with the size 5.5TB (RAID 5 with 4 x 2TB hard drives). I found an
> interesting problem when using disklabel on the softraid0 hard drive device,
> when attempting to make this 5.5TB partition. The partition "a" would only
> be allowed as 1.5TB and any partition >= "d" would only be allowed as 2TB,
> however the limit occurred silently after disklabel had exited. When inside
> disklabel, I could allocate a single "a" partition to be 5.5TB successfully
> and was able to write the partition successfully. However, when the
> disklabel process exited, either with the q command or a kill signal 9, the
> partition would be shrunk to the limit described above. If the disklabel
> process was suspended (ctrl-Z), this wouldn't happen and newfs would see the
> 5.5TB partition, though usage of the partition wouldn't work. The partition
> would have inaccessible blocks that fsck showed extreme anger at, when it
> saw it at boot time.
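A quick sanity check on the quoted numbers may help here. This is only a sketch in Python, assuming "2TB" means 2 * 10^12 bytes per drive and ignoring any softraid metadata overhead: RAID 5 over four drives leaves three drives' worth of data, which lands close to the 5.5TB mentioned.

```python
# RAID 5 spends one drive's worth of space on parity,
# so usable space is (number of drives - 1) * drive size.
DRIVES = 4
DRIVE_BYTES = 2 * 10**12   # assumption: vendor (decimal) terabytes
TIB = 2**40                # binary unit, assumed to be what disklabel -h calls "T"

usable_bytes = (DRIVES - 1) * DRIVE_BYTES
usable_tib = usable_bytes / TIB

print(f"usable: {usable_bytes} bytes, about {usable_tib:.2f} TiB")
```

The drive size and the unit interpretation are assumptions; the real volume size is whatever dmesg reports for the softraid sd(4) device.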

It would really help to showcase your issue with commands/output.

This issue is not related to softraid(4); it is most probably an old
sparc(64) quirk:

1. create big dummy disk for a single filesystem:

	$ ldomctl create-vdisk -s 10T sparse-10T.img

2. pass it to guest domain in order to have a "real" 10T sized sd(4):

	# dmesg | grep ^sd2
	sd2 at scsibus3 targ 0 lun 0: <SUN, Virtual Disk, 1.1>
	sd2: 10485760MB, 512 bytes/sector, 21474836480 sectors
	# echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd2
	# disklabel -h sd2
	# /dev/rsd2c:
	type: SCSI
	disk: SCSI disk
	label: Virtual Disk
	duid: c4befc09bf56efed
	flags: vendor
	bytes/sector: 512
	sectors/track: 255
	tracks/cylinder: 511
	sectors/cylinder: 130305
	cylinders: 164804
	total sectors: 21474836480 # total bytes: 10.0T
	boundstart: 0
	boundend: 21474836480

	16 partitions:
	#                size           offset  fstype [fsize bsize   cpg]
	  a:             2.0T                0  4.2BSD   8192 65536     1
	  c:            10.0T                0  unused
	disklabel: warning, partition a: size % cylinder-size != 0

3. compare against amd64/vmm:

	$ vmctl create -s 10T 10T-sparse.img
	vmctl: create imagefile operation failed: File too large
	$ vmctl create -s 7T 7T-sparse.img
	vmctl: raw imagefile created

   (Not quite sure why 7T is the maximum here... 8T wouldn't work, either)

	# vmctl start -c -b /bsd.rd -d 7T-sparse.img t
	...
	sd0 at scsibus0 targ 0 lun 0: <VirtIO, Block Device, >
	sd0: 7340032MB, 512 bytes/sector, 15032385536 sectors
	...
	(I)nstall, (U)pgrade, (A)utoinstall or (S)hell? s
	# cd /dev ; MAKEDEV sd0
	sh: MAKEDEV: not found
	# cd /dev ; sh MAKEDEV sd0
	# echo '/ 1M-* 100%' | disklabel -wAT/dev/stdin sd0
	# disklabel -h sd0
	# /dev/rsd0c:
	type: SCSI
	disk: SCSI disk
	label: Block Device
	duid: 24ff0fe5062adbdc
	flags:
	bytes/sector: 512
	sectors/track: 255
	tracks/cylinder: 511
	sectors/cylinder: 130305
	cylinders: 115363
	total sectors: 15032385536 # total bytes: 7.0T
	boundstart: 0
	boundend: 15032385536

	16 partitions:
	#                size           offset  fstype [fsize bsize   cpg]
	  a:             7.0T                0  4.2BSD   8192 65536     1
	  c:             7.0T                0  unused

So that makes it look like a purely sparc64 related issue.
I don't *see* silent truncation on amd64.

>
> I did bump into a kernel panic when doing the sequence (kernel panic output
> is below the boot log): disklabel single partition 5.5TB written, suspend
> disklabel process, newfs on partition, kill -9 disklabel process, write a
> single file to the filesystem ("the_first_file" in the command line output
> below).

Same as above; clear steps to reproduce would be helpful.

>
> The 1.5TB/2TB partition limit is known and expected on sparc64, isn't it? I
> didn't see the limit mentioned in documentation, though the disklabel
> manpage does say "On some machines, such as Sparc64, partition tables may
> not exhibit the full functionality described above.". I bumped into the
> same limit when attempting to use softraid0 RAID c too.

This disklabel(8) CAVEATS entry is pretty vague; the CVS log shows it
originally mentioned amiga3 and sparc, with minor tweaks arriving at sparc64.

> OpenBSD 7.1 (GENERIC.MP) #1269: Mon Apr 11 22:05:10 MDT 2022
> dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP

Can you try with a snapshot, please?

> mpi0 at pci8 dev 0 function 0 "Symbios Logic SAS1068E" rev 0x04: msi
> mpi0: UNUSED, firmware 1.27.0.0
> ultra# vi the_first_file
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~the_first_file: new file: line 1This is the
> first file ever written here.Copying file for recovery...:wqpanic: biodone
> already
> Stopped at db_enter+0x8: nop
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> biodone(4004b5f7718, 430cf, 290, 0, 0, 1997348) at biodone+0x1a4
> sd_buf_done(40048954010, 2018000, 0, 0, 0, 6) at sd_buf_done+0x78
> scsi_done(40048954010, 10000, 4004aa73800, 0, 10000, 6) at scsi_done+0x18
> mpi_scsi_cmd_done(4004a9a3e88, 4004a980b00, 4004a993400, 22600, 200, a) at mpi_scsi_cmd_done+0x208
> mpi_reply(4004a997c00, 0, 640, 0, 4004a991920, c7) at mpi_reply+0x84
> mpi_intr(4004a997c00, 1ca6638, 20, 4004a82cfc0, 1c00, 6) at mpi_intr+0x30
> intr_handler(2017ec8, 4004a991900, e9643408, 1ca76e8, 1cd4000, a4) at intr_handler+0x50
> sparc_intr_retry(0, 0, 17970f8, 0, 1c00, 12) at sparc_intr_retry+0x5c
> cpu_idle_cycle(1ca6618, 2018000, 18b3b98, 1ca76e8, 0, 1997348) at cpu_idle_cycle+0x2c
> sched_idle(2018360, 4004a92c000, 17970f8, 0, 0, 3b9ac800) at sched_idle+0x158
> proc_trampoline(0, 0, 0, 0, 0, 0) at proc_trampoline+0x14
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb{0}> show panic
> *cpu0: biodone already

Can you try reproducing your disklabel/suspend/newfs/kill/write-file dance
without mpi(4) and softraid(4) in between?

Does it also crash if you create partitions that are smaller than 2.0T?
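
For reference, the sizes in the transcripts can be cross-checked from the sector counts that dmesg and disklabel print. The sketch below is Python, not something from the thread. The 32-bit limit it computes is purely an assumption about where a 2.0T cap could come from (a size field holding at most 2^32 sectors); nothing here confirms that this is what sparc64 actually does.

```python
SECTOR_BYTES = 512   # "512 bytes/sector" in both dmesg outputs
TIB = 2**40          # assumed meaning of "T" in disklabel -h output


def sectors_to_tib(sectors):
    """Convert a sector count to TiB for 512-byte sectors."""
    return sectors * SECTOR_BYTES / TIB


sparc64_total = 21474836480   # sd2 in the sparc64 guest domain
amd64_total = 15032385536     # sd0 in the amd64 vmm guest

# Hypothetical cap: a size field limited to 2**32 sectors.
limit_sectors = 2**32

print("sparc64 disk:", sectors_to_tib(sparc64_total), "TiB")
print("amd64 disk:  ", sectors_to_tib(amd64_total), "TiB")
print("2**32 sectors:", sectors_to_tib(limit_sectors), "TiB")
print("sparc64 disk exceeds 2**32 sectors:", sparc64_total > limit_sectors)
```

If the assumption holds, any partition below 2**32 sectors would fit in such a field, which is why testing with a partition smaller than 2.0T is a useful data point.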