Lustre does get unmounted before the NFS filesystems, as the log messages
below show. The problem is that LNET is still up when the openibd modules
are removed.
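
A minimal sketch of the workaround, assuming it is sourced into the openibd
init script and called at the top of its stop path, before any InfiniBand
modules are unloaded. The names run, stop_lustre_before_ib and DRY_RUN are
hypothetical and exist only for this illustration; the commands themselves
are the ones mentioned in the original report.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical function: take Lustre and LNET down before the IB stack.
stop_lustre_before_ib() {
    # Unmount every mounted filesystem of type lustre.
    run umount -a -t lustre || return 1
    # Unload the Lustre and LNET kernel modules so that LNET no longer
    # holds references on the InfiniBand device.
    run lustre_rmmod || return 1
}
```

Setting DRY_RUN=1 before calling stop_lustre_before_ib prints the two
commands without running them, which is a safe way to check the ordering
before touching a production init script.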

Nirmal

On 09/09/2010 02:28 PM, Andreas Dilger wrote:
> On 2010-09-09, at 10:56, Nirmal Seenu wrote:
>> I just upgraded Lustre from 1.8.1.1 to 1.8.4 and I can't reboot my
>> Lustre clients cleanly anymore. I am using the latest RHEL kernel and
>> the openibd that comes as part of that RHEL kernel, plus the patchless
>> Lustre client installed from the tarball.
>>
>> The Lustre client gets unmounted cleanly, but the system deadlocks once
>> the openibd driver is removed. I had to modify the openibd stop script to
>> include "umount lustre" and "lustre_rmmod" as a workaround.
>
> If you put "_netdev" in the lustre mount options, the shutdown scripts  
> should unmount it before trying to stop the networking.
>
>
>> The following is the error message that I get when I try to reboot the 
>> lustre client:
>>
>> Scientific Linux SLF release 5.3 (Lederman)
>> Kernel 2.6.18-194.11.1.el5 on an x86_64
>>
>> INIT:Shutting down smartd: [  OK  ]
>> Stopping atd: [  OK  ]
>> Shutting down process accounting:  [  OK  ]
>> Stopping xinetd: [  OK  ]
>> Stopping autofs:  Stopping automount: [  OK  ]
>> [  OK  ]
>> Stopping acpi daemon: [  OK  ]
>> Shutting down ntpd: [  OK  ]
>> Unmounting network block filesystems:  LustreError: 3697:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
>> LustreError: 3697:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
>> Lustre: client ffff81020f145400 umount complete
>> [  OK  ]
>> Unmounting NFS filesystems:  [  OK  ]
>> Stopping system message bus: [  OK  ]
>> Stopping RPC idmapd: [  OK  ]
>> Stopping NFS locking: [  OK  ]
>> Stopping NFS statd: [  OK  ]
>> Stopping portmap: [  OK  ]
>> Stopping PC/SC smart card daemon (pcscd): [  OK  ]
>> Shutting down kernel logger: [  OK  ]
>> Shutting down system logger: [  OK  ]
>> Unloading OpenIB kernel modules:NET: Unregistered protocol family 27
>>
>> Failed to unload rdma_cm
>>
>> Failed to unload ib_cm
>>
>> Failed to unload iw_cm
>> LustreError: 131-3: Received notification of device removal
>> Please shutdown LNET to allow this to proceed
>> INFO: task rmmod:4151 blocked for more than 120 seconds.
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> rmmod         D ffff810227061420     0  4151   3795                     (NOTLB)
>>   ffff81021c8ddce8 0000000000000082 000000000000000f 0000000000000292
>>   00000000000000ef 0000000000000001 ffff81020ecdd100 ffff8102271ef040
>>   0000004a957c4bd9 000000000095dc57 ffff81020ecdd2e8 0000000480076646
>> Call Trace:
>>   [<ffffffff80063167>] wait_for_completion+0x79/0xa2
>>   [<ffffffff8008cfa1>] default_wake_function+0x0/0xe
>>   [<ffffffff80063b05>] mutex_lock+0xd/0x1d
>>   [<ffffffff8838d155>] :rdma_cm:cma_remove_one+0x171/0x1a2
>>   [<ffffffff80076525>] do_flush_tlb_all+0x0/0x6a
>>   [<ffffffff8817d5f0>] :ib_core:ib_unregister_device+0x30/0xdb
>>   [<ffffffff881a918a>] :ib_mthca:__mthca_remove_one+0x30/0x11a
>>   [<ffffffff80063b05>] mutex_lock+0xd/0x1d
>>   [<ffffffff881a928c>] :ib_mthca:mthca_remove_one+0x18/0x25
>>   [<ffffffff8015daeb>] pci_device_remove+0x24/0x3a
>>   [<ffffffff801c7a3e>] __device_release_driver+0x9f/0xe9
>>   [<ffffffff801c7e04>] driver_detach+0xad/0x101
>>   [<ffffffff801c6ffe>] bus_remove_driver+0x6f/0x92
>>   [<ffffffff801c7e8b>] driver_unregister+0xd/0x16
>>   [<ffffffff8015ddb4>] pci_unregister_driver+0x2a/0x79
>>   [<ffffffff881bc398>] :ib_mthca:mthca_cleanup+0x10/0x16
>>   [<ffffffff800a6674>] sys_delete_module+0x196/0x1c5
>>   [<ffffffff8005d116>] system_call+0x7e/0x83
>>
>>
>> Nirmal
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
