Hi Ashish,

On Mon, 2007-02-26 at 16:04, Batwara, Ashish wrote:
> Hi,
> I am trying to bring up opensm, but it is not letting me. When I look at
> /var/log/messages, I see that it comes UP for a moment and then
> goes down again. Look for "SUBNET UP" in the logs below. Does anyone
> know what the problem is? I am using OFED-1.1.1 with patches from almost 1
> month ago.
>
> Thanks
> Ashish
>
>
> Feb 26 14:38:37 p49 run_srp_daemon[7640]: failed srp_daemon: [HCA=mthca0] [port=2] [exit status=0]
> Feb 26 14:38:37 p49 run_srp_daemon[7642]: failed srp_daemon: [HCA=mthca0] [port=1] [exit status=0]
> Feb 26 14:38:46 p49 OpenSM[7433]: SM port is down
> Feb 26 14:38:53 p49 run_srp_daemon[7653]: starting srp_daemon: [HCA=mthca0] [port=2]
> Feb 26 14:38:53 p49 run_srp_daemon[7658]: starting srp_daemon: [HCA=mthca0] [port=1]
> Feb 26 14:38:56 p49 OpenSM[7433]: SM port is down
> Feb 26 14:38:56 p49 run_srp_daemon[7675]: failed srp_daemon: [HCA=mthca0] [port=2] [exit status=0]
> Feb 26 14:38:56 p49 run_srp_daemon[7680]: failed srp_daemon: [HCA=mthca0] [port=1] [exit status=0]
> Feb 26 14:39:06 p49 OpenSM[7433]: SM port is down
> Feb 26 14:39:26 p49 last message repeated 2 times
> Feb 26 14:39:26 p49 run_srp_daemon[7691]: starting srp_daemon: [HCA=mthca0] [port=1]
> Feb 26 14:39:26 p49 run_srp_daemon[7692]: starting srp_daemon: [HCA=mthca0] [port=2]
> Feb 26 14:39:29 p49 run_srp_daemon[7715]: failed srp_daemon: [HCA=mthca0] [port=1] [exit status=0]
> Feb 26 14:39:29 p49 run_srp_daemon[7716]: failed srp_daemon: [HCA=mthca0] [port=2] [exit status=0]
> Feb 26 14:39:36 p49 OpenSM[7433]: SM port is down
> Feb 26 14:39:56 p49 last message repeated 2 times
> Feb 26 14:39:59 p49 run_srp_daemon[7728]: starting srp_daemon: [HCA=mthca0] [port=1]
> Feb 26 14:39:59 p49 run_srp_daemon[7727]: starting srp_daemon: [HCA=mthca0] [port=2]
> Feb 26 14:40:02 p49 run_srp_daemon[7752]: failed srp_daemon: [HCA=mthca0] [port=1] [exit status=0]
> Feb 26 14:40:02 p49 run_srp_daemon[7751]: failed srp_daemon: [HCA=mthca0] [port=2] [exit status=0]
> Feb 26 14:40:06 p49 OpenSM[7433]: SM port is down
> Feb 26 14:40:26 p49 last message repeated 2 times
> Feb 26 14:40:32 p49 run_srp_daemon[7791]: starting srp_daemon: [HCA=mthca0] [port=2]
> Feb 26 14:40:32 p49 run_srp_daemon[7792]: starting srp_daemon: [HCA=mthca0] [port=1]
> Feb 26 14:40:35 p49 run_srp_daemon[7812]: failed srp_daemon: [HCA=mthca0] [port=1] [exit status=0]
> Feb 26 14:40:35 p49 run_srp_daemon[7817]: failed srp_daemon: [HCA=mthca0] [port=2] [exit status=0]
> Feb 26 14:40:36 p49 OpenSM[7433]: SM port is down
> Feb 26 14:40:46 p49 OpenSM[7433]: SM port is down
> Feb 26 14:40:56 p49 OpenSM[7433]: Entering MASTER state
> Feb 26 14:40:56 p49 OpenSM[7433]: SUBNET UP
> Feb 26 14:41:05 p49 run_srp_daemon[7823]: starting srp_daemon: [HCA=mthca0] [port=1]
> Feb 26 14:41:05 p49 run_srp_daemon[7832]: starting srp_daemon: [HCA=mthca0] [port=2]
> Feb 26 14:41:06 p49 OpenSM[7433]: SM port is down
> Feb 26 14:41:08 p49 run_srp_daemon[7847]: failed srp_daemon: [HCA=mthca0] [port=2] [exit status=0]
> Feb 26 14:41:14 p49 run_srp_daemon[7853]: failed srp_daemon: [HCA=mthca0] [port=1] [exit status=0]
> Feb 26 14:41:16 p49 OpenSM[7433]: SM port is down
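
The srp_daemon lines make the pattern in the quoted log hard to see. As a rough aid, here is a minimal Python sketch that keeps only the OpenSM lines and reports how many seconds passed between each "SUBNET UP" and the next "SM port is down". The helper names (opensm_events, seconds_up) and the year default are assumptions made for illustration, since syslog timestamps carry no year; nothing here is part of OpenSM or OFED.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Feb 26 14:40:56 p49 OpenSM[7433]: SUBNET UP
OPENSM_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ OpenSM\[\d+\]: (?P<msg>.*)$"
)


def opensm_events(lines, year=2007):
    """Return (datetime, message) pairs for OpenSM lines only.

    `year` is an assumption: syslog timestamps do not include one.
    """
    events = []
    for line in lines:
        m = OPENSM_LINE.match(line.strip())
        if m:
            stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
            events.append((stamp, m["msg"]))
    return events


def seconds_up(events):
    """For each SUBNET UP, seconds until the next 'SM port is down' (None if none follows)."""
    result = []
    for i, (stamp, msg) in enumerate(events):
        if msg == "SUBNET UP":
            down = next(
                (t for t, later in events[i + 1:] if later == "SM port is down"),
                None,
            )
            result.append(None if down is None else (down - stamp).total_seconds())
    return result


# A few lines copied from the quoted log (wrapped lines rejoined).
SAMPLE = """\
Feb 26 14:40:46 p49 OpenSM[7433]: SM port is down
Feb 26 14:40:56 p49 OpenSM[7433]: Entering MASTER state
Feb 26 14:40:56 p49 OpenSM[7433]: SUBNET UP
Feb 26 14:41:05 p49 run_srp_daemon[7823]: starting srp_daemon: [HCA=mthca0] [port=1]
Feb 26 14:41:06 p49 OpenSM[7433]: SM port is down
""".splitlines()

if __name__ == "__main__":
    print(seconds_up(opensm_events(SAMPLE)))  # [10.0]
```

On these lines the subnet stays up for 10 seconds before the port is reported down again. The sketch only measures the gap between log messages; it says nothing about why the port went down.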

It appears your SM port to some switch (?) is losing physical
connectivity. Try a different (known good) cable.

-- Hal

> _______________________________________________
> openib-general mailing list
> openib-general@openib.org
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general