I had this nicely formatted when I sent it, but it seems to have been
reformatted elsewhere in transit. Hopefully this helps but if not I will
leave it be.
On 2/14/21 6:27 PM, Judah Kocher wrote:
Thanks to each of you for your replies,
Lesson 1: always get machines with remote console access. It will save
the day some day and help in diagnosing issues.
Having remote console access would be sweet, but unfortunately that goes
far beyond the hobbyist price point I currently have to work with.
On the system that succeeded when you were watching on the console,
did automatic sysupgrades start to work after that?
It did not. This unit still fails the weekly upgrade.
In general, my guess would be boot.conf contents that prevent the
automatic upgrade from working. Or maybe you have very old boot
loaders on the failing machines.
I do not have an /etc/boot.conf file on any of these systems. Both the
units having issues are less than 12 months old so I wouldn't think the
bootloader age would be an issue. I have both older and newer units that
are working correctly. Are there major recent changes you are aware of
that might lead to something like this?
BTW, kernel # cannot be used to identify a kernel.
Interesting. When I realized two of my machines were still on old
snapshots, it seemed like a simple metric to use for tracking this. The
actual number isn't really as relevant to me at this point as the fact
that it does or does not change. Could the update be partially and/or
completely succeeding with the kernel # staying the same? In my limited
experience following -current for the last 4+ years, every snapshot
comes with a different #.
-Otto
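Since the #NNN counter isn't a unique identifier, the build timestamp in
the full version string is a steadier thing to compare. A minimal sketch,
with the caveat that the sample string below is illustrative, not real
output; on a live OpenBSD system the string would come from
`sysctl -n kern.version`:

```shell
# Extract the build timestamp from a kernel version string instead of
# relying on the #NNN counter. The sample string is made up for
# illustration; a real one would come from `sysctl -n kern.version`.
version="OpenBSD 6.9-beta (GENERIC.MP) #302: Sun Feb 14 10:00:00 MST 2021"
build_date=${version#*: }   # strip everything up to and including "#NNN: "
echo "$build_date"
```

Comparing that timestamp before and after the reboot would catch the
(unlikely) case where two different snapshots carry the same #.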
Care to show a script? Otherwise it looks like a rather lengthy
mathematical problem with quite some variables.
The script is very basic. I didn't show it originally because I
know from reading various threads in the past that many of the folks on
this list find a "non-default" install abhorrent and I wasn't interested
in dragging that into this. However, here are the complete script contents:
#!/bin/sh
# Fetch current system version (e.g. GENERIC.MP#302)
CURRENT=$(uname -a | awk '{print $4}')
# Download latest snapshot but do not install
doas sysupgrade -n -s
# Delete unwanted sets
doas rm /home/_sysupgrade/x*
doas rm /home/_sysupgrade/g*
doas rm /home/_sysupgrade/c*
# Log system time (single date call, so the fields are consistent)
STAMP=$(date "+%H:%M:%S on %m/%d/%Y")
echo "System Snapshot upgrade started from $CURRENT at $STAMP" >> /home/$USER/systemLog
doas reboot
I have a separate script that runs on each boot which checks if an
upgrade attempt was the last logged item and fetches the current
system version to compare. If it is the same it logs a failure. If it
is different it logs the new version #. Either way it emails me the
results. This script is running on all 6 systems.
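For reference, a minimal sketch of what that boot-time check could look
like. Only the log line format is taken from the upgrade script above;
the function name, messages, and mail invocation are my assumptions, not
the original script:

```shell
#!/bin/sh
# Hedged reconstruction of the boot-time check described above; the
# function name and messages are assumptions, only the log line format
# is taken from the upgrade script.

# Compare the version recorded when the upgrade started against the
# version running now. Prints FAILED/OK plus the running version.
check_upgrade() {
    last_line=$1   # last line of the log file
    running=$2     # current kernel version, e.g. GENERIC.MP#302
    case $last_line in
    "System Snapshot upgrade started from "*)
        prev=${last_line#System Snapshot upgrade started from }
        prev=${prev%% *}   # keep only the version field
        if [ "$prev" = "$running" ]; then
            echo "FAILED: still on $running"
        else
            echo "OK: now on $running"
        fi
        ;;
    *)
        echo "no upgrade pending"
        ;;
    esac
}

# On boot this would be driven by something like:
#   check_upgrade "$(tail -1 /home/$USER/systemLog)" \
#                 "$(uname -a | awk '{print $4}')" | mail -s sysupgrade me
```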
What are the permissions on the bsd.upgrade that's left behind? If
they are still +x then your issue is with the boot loader, maybe that
boot.conf otto suggested. If they are -x then the boot loader started
the install kernel but something went wrong.
The permissions of the left-behind bsd.upgrade are -rw------- 1 root
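For anyone wanting to script the check suggested above, a small sketch.
The helper name and messages are mine; the +x/-x interpretation is taken
from the preceding paragraph (sysupgrade leaves /bsd.upgrade executable,
the boot loader boots it once, and the installer normally removes it):

```shell
# Sketch of the suggested check; helper name and wording are mine.
# +x on a leftover bsd.upgrade -> the boot loader never booted it;
# -x -> the install kernel was started but something went wrong later.
check_pending() {
    if [ -x "$1" ]; then
        echo "still executable: the boot loader never booted it"
    else
        echo "execute bit cleared: the install kernel was started"
    fi
}

# usage: check_pending /bsd.upgrade
```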
On 14 February 2021 18:02:07 CET, Judah Kocher <koche...@hotmail.com>
wrote:
Hello folks,
I am having an issue with sysupgrade and I have had trouble finding the
source of the problem so I hope someone here might be able and willing
to point me in the right direction.
I have 6 small systems running OpenBSD -current and I have a basic
script which upgrades to the latest snapshot weekly. The systems are all
relatively similar. Three are the exact same piece of hardware, two are
slightly different, and one is a VM configured to match the first three
as closely as possible with virtual hardware.
The script checks the current kernel version (e.g. "GENERIC.MP#302"),
logs it, runs sysupgrade, and after the reboot it checks the kernel
version again. If it is different it logs it as a "success", and if it
is still the same it logs it as a failure.
All 6 systems were configured using the same autoinstall configuration
and the upgrade script is identical on each unit. However, two of the
three identical units always fail. When I remote into either system and
manually run the upgrade script it also fails. I was able to get onsite
with one of them, where I connected a monitor and keyboard and manually
ran the script to observe the results, but oddly enough it succeeded so
I learned nothing actionable. However, it continues to fail the weekly
upgrade. I have confirmed that the script permissions are identical on
the working and nonworking units.
The 4 units that successfully upgrade leave a mail message with a log of
the upgrade process. However, I have been unable to find any record or
log on the systems that are failing to help me figure out why this isn't
working. The only difference I can identify between the systems is that
"auto_upgrade.conf" and "bsd.upgrade" are both present in "/" on the two
systems that fail, but are properly removed on the 4 that succeed.
I would appreciate any suggestions of what else I can try or check to
figure out what is causing this issue.
Thanks
Judah