Hi,

I believe I'm hitting a defect, but I'm not sure whether the norm is to file
an issue on GitHub directly or to send email first.

From a simple kernel test job I'm seeing the following failure in
debug/client.0.log after the kernel build (apparently from client/kernel.py in
_add_kernel_to_bootloader):

21:02:46 INFO |        GOOD    build   kernel.install timestamp=1337994166   
localtime=May 25 21:02:46
21:02:46 DEBUG| Persistent state client.steps now set to [([], 
'job.end_reboot_and_verify', [1337994166, '2.6.36-autotest::#1 SMP Fri May 25 
20:56:11 EDT 2012', 'build', []], {}), ([], 'step_test', 
('http://serverxxxx.hp.com/kernels/kernel-2.6.36.tar.bz2',), {})]
21:02:46 DEBUG| No kernel found for title "autotest". Assuming no entry exists, 
and emulating boottool(.pl) behavior and being silent about it.
21:02:46 ERROR| grubby fatal error: unable to find a suitable template
21:02:47 DEBUG| Running 'touch /fastboot'


That fatal error happens in the bootloader.add_kernel() call, but it doesn't
cause autotest to give up on the kernel install.  As a result, autotest
proceeds to reboot the system, which then sits forever at the point of booting
from disk C because there are no longer any kernels in the bootloader.  At that
point we just have to re-install the test system.

This is on a test client with RHEL 6.2 server and with autotest from the 
following commit ID installed:
716554702f8bbc86738b96272df0d27ce8be889c

I think we also saw the same issue with the following version of autotest
installed, but I need to test one more time to confirm:
https://github.com/autotest/autotest/commit/ed05905987207e30b8ebfeb4d6e1dcf9e63d8979

Older versions of autotest (e.g. 13.0), which used the older version of
grubby and boottool, successfully install the exact same kernel.  Below is the
output I see in the logs for that same step:

 02/04 20:22:19 INFO |    kernel:0016| --- END kernel.install ---
02/04 20:22:19 INFO |       job:0211|         GOOD    build   kernel.install    
    timestamp=1328404939   localtime=Feb 04 20:22:19      
02/04 20:22:19 DEBUG|  base_job:0347| Persistent state client.steps now set to 
[([], 'job.end_reboot_and_verify', [1328404939, '2.6.36-autotest::#1 SMP Sat 
Feb 4 20:08:57 EST 2012', 'build', []], {}), ([], 'step_test', 
('http://serverxxxx.hp.com/kernels/kernel-2.6.36.tar.bz2',), {})]
02/04 20:22:19 DEBUG|base_utils:0074| Running 
'/usr/local/autotest/tools/boottool "--remove-kernel=autotest"'
02/04 20:22:19 DEBUG|base_utils:0074| Running 
'/usr/local/autotest/tools/boottool "--info=all"'
02/04 20:22:20 DEBUG|base_utils:0074| Running 
'/usr/local/autotest/tools/boottool "--add-kernel=/boot/vmlinuz-autotest" 
"--title=autotest" "--args=_dummy_" "--initrd=/boot/initrd-autotest" 
"--position=end"'
02/04 20:22:20 DEBUG|base_utils:0074| Running 
'/usr/local/autotest/tools/boottool "--update-kernel=autotest" 
"--args=console=ttyS0"'
02/04 20:22:20 DEBUG|base_utils:0074| Running 
'/usr/local/autotest/tools/boottool "--update-kernel=autotest" 
"--args=IDENT=1328404939"'
02/04 20:22:20 DEBUG|base_utils:0074| Running 
'/usr/local/autotest/tools/boottool "--update-kernel=autotest" 
"--remove-args=_dummy_"'
02/04 20:22:20 DEBUG|base_utils:0074| Running 'touch /fastboot'


I recognize I likely don't have enough information in this message to debug the
actual grubby fatal error, and I'm still triaging this failure (to see if I can
tell exactly which grubby command fails with that fatal error).  However, I
think the general practice should be that if autotest sees an error when adding
the kernel to the bootloader, it should NOT proceed to reboot the system.
Rebooting just makes the problem harder to debug and leaves a system that no
longer boots.
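
To make the suggestion concrete, here is a rough Python sketch of the behavior
I have in mind.  It is not autotest code: BootloaderError, run_checked and
install_and_reboot are hypothetical names, and it assumes the bootloader tool
signals failure through a non-zero exit status, which I have not confirmed for
grubby in this case.

```python
import subprocess
import sys


class BootloaderError(Exception):
    """Hypothetical error type: the bootloader entry could not be added."""


def run_checked(cmd):
    """Run cmd (a list of arguments) and raise on a non-zero exit status.

    Assumption: the bootloader tool reports failure via its exit status.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise BootloaderError(
            "%s exited with status %d: %s"
            % (cmd[0], result.returncode, result.stderr.strip()))
    return result.stdout


def install_and_reboot(add_kernel_cmd, reboot):
    """Call reboot() only if the bootloader update succeeded.

    add_kernel_cmd is whatever command adds the kernel entry; reboot is a
    callable supplied by the caller.  Returns True if reboot() was called.
    """
    try:
        run_checked(add_kernel_cmd)
    except BootloaderError as err:
        # Leave the machine up so the failure can be debugged in place.
        print("bootloader update failed, not rebooting: %s" % err,
              file=sys.stderr)
        return False
    reboot()
    return True
```

The point is only the ordering: the reboot step is reached solely when the
bootloader step reported success, so a failed add leaves the system running
with its existing configuration for inspection.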

-Dan
_______________________________________________
Autotest mailing list
[email protected]
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest
