Jason, Michael, thanks a lot for your replies. I pinged everyone from all directions, but the router is still marked "down" on the client. I even removed and re-added the router entry via lctl --net tcp1 del_route xyz@o2ib and lctl --net tcp1 add_route xyz@o2ib. No luck, so I think I'll wait for the next maintenance window. Oh, and I forgot to mention that the servers and the router run 1.6.7.2 and the clients run 1.8.5. That combination works well so far.
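For anyone who wants to repeat the remove/re-add attempt described above, here is a minimal sketch of it as a shell function. The function name readd_route and the LCTL override variable are made up for illustration; the del_route/add_route invocations are the ones quoted above, and lctl show_route is assumed to be available to print the routing table afterwards. In the case reported here, this did not bring the router back "up", so treat it as something to try, not as a fix.

```shell
#!/bin/sh
# Sketch only: cycle an LNET route entry, then show the routing table.
# LCTL can be overridden (e.g. for a dry run); it defaults to the real lctl.
LCTL="${LCTL:-lctl}"

# readd_route TARGET_NET GATEWAY_NID   (hypothetical helper name)
# Example arguments from the message above: tcp1 xyz@o2ib
readd_route() {
    target_net="$1"
    gateway_nid="$2"
    if [ -z "$target_net" ] || [ -z "$gateway_nid" ]; then
        echo "usage: readd_route TARGET_NET GATEWAY_NID" >&2
        return 2
    fi
    # A failing delete is ignored: the route may already be gone.
    "$LCTL" --net "$target_net" del_route "$gateway_nid" || true
    # A failing add is reported to the caller.
    "$LCTL" --net "$target_net" add_route "$gateway_nid" || return 1
    # Print the routing table so the up/down state can be inspected.
    "$LCTL" show_route
}
```

Running it with LCTL=echo prints the three commands instead of executing them, which is a cheap way to check the arguments before touching a live node.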
Thanks,
Michael

On Tuesday, 25.01.2011, at 15:12 +0100, Temple Jason wrote:
> I've found that even with the Protocol Error, it still works.
>
> -Jason
>
> -----Original Message-----
> From: lustre-discuss-boun...@lists.lustre.org
> [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Michael Shuey
> Sent: Tuesday, 25 January 2011 14:45
> To: Michael Kluge
> Cc: Lustre Diskussionsliste
> Subject: Re: [Lustre-discuss] "up" a router that is marked "down"
>
> You'll want to add the "dead_router_check_interval" lnet module
> parameter as soon as you are able. As near as I can tell, without
> that there's no automatic check to make sure the router is alive.
>
> I've had some success in getting machines to recognize that a router
> is alive again by doing an lctl ping of their side of a router (e.g.,
> on a tcp0 client, `lctl ping <routerIP>@tcp0`, then `lctl ping
> <routerIP>@o2ib0` from an o2ib0 client). If you have a server/client
> version mismatch, where lctl ping returns a protocol error, you may be
> out of luck.
>
> --
> Mike Shuey
>
> On Tue, Jan 25, 2011 at 8:38 AM, Michael Kluge
> <michael.kl...@tu-dresden.de> wrote:
> > Hi list,
> >
> > If a Lustre router is down, comes back to life, and the servers do not
> > actively test the routers periodically: is it possible to mark a Lustre
> > router as "up"? Or to tell the servers to ping the router?
> >
> > Or can I enable the "router pinger" in a live system without unloading
> > and loading the Lustre kernel modules?
> >
> > Regards, Michael
> >
> > --
> > Michael Kluge, M.Sc.
> >
> > Technische Universität Dresden
> > Center for Information Services and
> > High Performance Computing (ZIH)
> > D-01062 Dresden
> > Germany
> >
> > Contact:
> > Willersbau, Room A 208
> > Phone: (+49) 351 463-34217
> > Fax: (+49) 351 463-37773
> > e-mail: michael.kl...@tu-dresden.de
> > WWW: http://www.tu-dresden.de/zih
> >
> > _______________________________________________
> > Lustre-discuss mailing list
> > Lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss

--
Michael Kluge, M.Sc.
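The lctl ping approach from Mike's reply can be sketched as a small shell helper. This is only an illustration: the function name ping_router_nids and the LCTL override variable are hypothetical, and it assumes lctl ping exits non-zero when the ping fails. Each node should be given the router NID on its own network, i.e. the router's tcp0 NID on a tcp0 client and its o2ib0 NID on an o2ib0 node.

```shell
#!/bin/sh
# Sketch only: ping one or more router NIDs and report the result per NID.
# LCTL can be overridden for testing; it defaults to the real lctl.
LCTL="${LCTL:-lctl}"

# ping_router_nids NID [NID...]   (hypothetical helper name)
# Returns 0 if every ping succeeded, 1 if at least one failed.
ping_router_nids() {
    failed=0
    for nid in "$@"; do
        # Assumption: lctl ping returns a non-zero status on failure.
        if "$LCTL" ping "$nid" >/dev/null 2>&1; then
            echo "ok   $nid"
        else
            echo "FAIL $nid"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, with the router address filled in by hand:
#   ping_router_nids "<routerIP>@tcp0"     # on a tcp0 client
#   ping_router_nids "<routerIP>@o2ib0"    # on an o2ib0 node
```

As the thread notes, a version mismatch that makes lctl ping return a protocol error may show up here as a FAIL even though traffic still flows, and the lasting fix remains setting dead_router_check_interval when the lnet module can next be reloaded.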