Hi guys,

I just deployed a new Lustre filesystem and was unable to mount it on a client on the first attempt. I was able to reach everything using:

lctl ping i...@o2ib3

All networks were up, but the client hung while doing the mount. I decided to reboot the client and was then able to mount the filesystem without any issues. Here is what I saw on the MDS server. Any ideas what the problem might be?

Thanks in advance for your input.

--
..snip..
Jan 14 17:06:36 resmds01 kernel: Lustre: 7790:0:(client.c:1383:ptlrpc_expire_one_request()) @@@ Request x1324894173265938 sent from reshpcfs-OST0001-osc to NID 10.0.250...@o2ib3 0s ago has failed due to network error (limit 15s).
Jan 14 17:06:36 resmds01 kernel: Lustre: 7722:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7712:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7719:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7712:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7722:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7709:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7719:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7709:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7719:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7709:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7717:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7717:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7712:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 5616:0:(o2iblnd_cb.c:1953:kiblnd_peer_connect_failed()) Deleting messages for 10.0.250...@o2ib3: connection failed
Jan 14 17:07:36 resmds01 kernel: LustreError: 11-0: an error occurred while communicating with 10.0.250...@o2ib3. The ost_connect operation failed with -19
Jan 14 17:13:55 resmds01 kernel: LustreError: 11-0: an error occurred while communicating with 0...@lo. The mds_connect operation failed with -16
Jan 14 17:14:20 resmds01 kernel: LustreError: 11-0: an error occurred while communicating with 0...@lo. The mds_connect operation failed with -16
..snip..
--
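In case it helps when reading the LustreError lines above: the trailing numbers in "operation failed with -19" and "operation failed with -16" appear to be negated Linux errno values. The sketch below is only an illustration under that assumption; the helper name decode() and the regular expression are hypothetical and not part of Lustre, and the two sample lines are copied from the log with the NIDs left elided as posted.

```python
import errno
import os
import re

# Two LustreError lines copied from the MDS log (NIDs stay elided as posted).
LOG_LINES = [
    "LustreError: 11-0: an error occurred while communicating with "
    "10.0.250...@o2ib3. The ost_connect operation failed with -19",
    "LustreError: 11-0: an error occurred while communicating with "
    "0...@lo. The mds_connect operation failed with -16",
]

# Hypothetical pattern: "The <operation> operation failed with -<number>".
FAILED_RE = re.compile(r"The (\w+) operation failed with -(\d+)")


def decode(line):
    """Return (operation, code, errno name, description), or None if no match.

    Assumption: the number after "failed with -" is a negated Linux errno.
    """
    match = FAILED_RE.search(line)
    if match is None:
        return None
    operation = match.group(1)
    code = int(match.group(2))
    name = errno.errorcode.get(code, "UNKNOWN")
    return operation, code, name, os.strerror(code)


if __name__ == "__main__":
    for line in LOG_LINES:
        print(decode(line))
```

On Linux this maps -19 to ENODEV and -16 to EBUSY. It does not decode the "Rx from ... failed: 5" lines, which are not covered by this sketch.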
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss