Hi,
I believe I'm hitting a defect, but I'm not sure whether the norm is to just
file an issue on GitHub or to send email first.
From a simple kernel test job I'm seeing the following failure in the
debug/client.0.log after the kernel build (apparently from client/kernel.py in
_add_kernel_to_bootloader):
21:02:46 INFO | GOOD build kernel.install timestamp=1337994166
localtime=May 25 21:02:46
21:02:46 DEBUG| Persistent state client.steps now set to [([],
'job.end_reboot_and_verify', [1337994166, '2.6.36-autotest::#1 SMP Fri May 25
20:56:11 EDT 2012', 'build', []], {}), ([], 'step_test',
('http://serverxxxx.hp.com/kernels/kernel-2.6.36.tar.bz2',), {})]
21:02:46 DEBUG| No kernel found for title "autotest". Assuming no entry exists,
and emulating boottool(.pl) behavior and being silent about it.
21:02:46 ERROR| grubby fatal error: unable to find a suitable template
21:02:47 DEBUG| Running 'touch /fastboot'
That fatal error happens in the bootloader.add_kernel() call, but it doesn't
cause autotest to give up on the kernel install. As a result, autotest proceeds
to reboot the system, which then sits forever at "booting from disk C" because
there are no longer any kernels in the bootloader. At that point we just
have to re-install the test system.
This is on a test client with RHEL 6.2 server and with autotest from the
following commit ID installed:
716554702f8bbc86738b96272df0d27ce8be889c
I think we also saw the same issue with the following version of autotest
installed, but I need to test 1 more time to confirm:
https://github.com/autotest/autotest/commit/ed05905987207e30b8ebfeb4d6e1dcf9e63d8979
Older versions of autotest (e.g. 13.0), which used the older version of
grubby and boottool, are able to successfully install the exact same kernel.
Below is the output I see in the logs for that same step:
02/04 20:22:19 INFO | kernel:0016| --- END kernel.install ---
02/04 20:22:19 INFO | job:0211| GOOD build kernel.install
timestamp=1328404939 localtime=Feb 04 20:22:19
02/04 20:22:19 DEBUG| base_job:0347| Persistent state client.steps now set to
[([], 'job.end_reboot_and_verify', [1328404939, '2.6.36-autotest::#1 SMP Sat
Feb 4 20:08:57 EST 2012', 'build', []], {}), ([], 'step_test',
('http://serverxxxx.hp.com/kernels/kernel-2.6.36.tar.bz2',), {})]
02/04 20:22:19 DEBUG|base_utils:0074| Running
'/usr/local/autotest/tools/boottool "--remove-kernel=autotest"'
02/04 20:22:19 DEBUG|base_utils:0074| Running
'/usr/local/autotest/tools/boottool "--info=all"'
02/04 20:22:20 DEBUG|base_utils:0074| Running
'/usr/local/autotest/tools/boottool "--add-kernel=/boot/vmlinuz-autotest"
"--title=autotest" "--args=_dummy_" "--initrd=/boot/initrd-autotest"
"--position=end"'
02/04 20:22:20 DEBUG|base_utils:0074| Running
'/usr/local/autotest/tools/boottool "--update-kernel=autotest"
"--args=console=ttyS0"'
02/04 20:22:20 DEBUG|base_utils:0074| Running
'/usr/local/autotest/tools/boottool "--update-kernel=autotest"
"--args=IDENT=1328404939"'
02/04 20:22:20 DEBUG|base_utils:0074| Running
'/usr/local/autotest/tools/boottool "--update-kernel=autotest"
"--remove-args=_dummy_"'
02/04 20:22:20 DEBUG|base_utils:0074| Running 'touch /fastboot'
I recognize I likely don't have enough information in this message to debug the
actual grubby fatal error, and I'm still trying to triage this failure a little
more (to see if I can tell exactly which grubby command is failing during
that fatal error). However, I think the general practice should be that if
autotest sees an error when adding the kernel to the bootloader, it should
NOT proceed to reboot the system. Rebooting anyway just makes the problem
harder to debug and leaves a system that no longer boots.
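To illustrate the behavior I'm suggesting, here is a minimal sketch. It is not
the actual autotest code: BootloaderError, add_kernel_or_abort, and
install_and_reboot are hypothetical names, add_cmd stands in for whatever
boottool/grubby command line is really used, and it assumes the failing command
reports the error through a non-zero exit status.

```python
import subprocess


class BootloaderError(Exception):
    """Hypothetical error type: the bootloader entry could not be added."""


def add_kernel_or_abort(add_cmd, run=subprocess.run):
    """Run the bootloader 'add kernel' command and fail loudly on error.

    add_cmd is a hypothetical argument list (for example a boottool or
    grubby invocation). Assumes failure is signalled by a non-zero exit.
    """
    result = run(add_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise BootloaderError(
            "bootloader add failed (exit %d): %s"
            % (result.returncode, result.stderr.strip()))
    return result


def install_and_reboot(add_cmd, reboot, run=subprocess.run):
    """Call reboot() only if the bootloader entry was added successfully."""
    # If this raises, reboot() below is never reached, so the machine
    # stays up and the existing bootloader state can be inspected.
    add_kernel_or_abort(add_cmd, run=run)
    reboot()
```

With something along these lines, the job would end with an error after the
failed add, and the client would still be up for debugging instead of hanging
at boot.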
-Dan