SOLVED. Thanks to Scott Adair and the Sun Support Engineers for all your help.

Problem
LDOMs would drop off the network after a few days, and I needed to reboot the primary domain to recover. Sun backline support identified that the LogicalChannels which the virtual network devices use had become corrupt and could not be reset.

Solution
The latest firmware patch for the T5240 (136936-06). The system has now been stable for 3 weeks. I was originally running 136936-03. Patch revisions 04 and 05 were recalled/bad.

BEFORE
sc> showhost
Sun System Firmware 7.1.0.g 2008/04/03 18:27
Host flash versions:
   Hypervisor 1.6.0.b 2008/03/01 01:47
   OBP 4.28.0 2008/01/22 21:12
   POST 4.28.0 2008/01/22 21:39

AFTER (Patch 136936-06)
sc> showhost
Sun System Firmware 7.1.3.e 2008/07/29 13:41
Host flash versions:
   Hypervisor 1.6.4.b 2008/07/11 08:06
   OBP 4.28.10 2008/07/12 12:38
   POST 4.28.10 2008/07/12 13:03

Andy

2008/8/27 Scott Adair <scott at adair.cc>
>
> Excellent, hopefully they will be able to figure it out this time around. Let
> me know how it goes
>
> S.
>
> On Wed, Aug 27, 2008 at 12:36 PM, Andy Paton <andy.paton at gmail.com> wrote:
>>
>> Thanks for that.
>>
>> I've got a Solaris god looking at it, doing mdb stack traces and
>> writing dtrace scripts. Starting to get beyond my understanding :-)
>>
>> Andy
>>
>> 2008/8/27 Scott Adair <scott at adair.cc>:
>> > Hi Andy
>> >
>> > I am, unfortunately, back at work. I would have much rather been still on
>> > vacation, but I guess we all have to get back to reality at some point.
>> >
>> > My case number is 65951668. The Sun tech was pretty much stumped, and we
>> > never got a resolution to the issue. Attached to this case are various
>> > "snoop" log files during the network outage event.
>> >
>> > If you need anything else please let me know. Good luck!
>> >
>> > S.
>> >
>> > On Wed, Aug 27, 2008 at 11:06 AM, Andy Paton <andy.paton at gmail.com>
>> > wrote:
>> >>
>> >> Scott,
>> >>
>> >> If you're unlucky enough to be back from vacation could you send me your
>> >> case id.
>> >>
>> >> I've had Sun backline support logged on to my box for two days trying to
>> >> resolve the problem.
>> >>
>> >> Thanks
>> >>
>> >> Andy
>> >>
>> >> 2008/8/12 Scott Adair <scott at adair.cc>:
>> >> > Hey Andy
>> >> >
>> >> > Your issue sounds very similar to what I was experiencing.
>> >> >
>> >> > We had 5 LDoms on a T5240, and the network dropped for one (or
>> >> > sometimes two) of the LDoms at a time, but not all of them. Only one
>> >> > vsw was configured in the primary. Our support rep couldn't determine
>> >> > if the issue was with the vsw code or the vnic driver. It was noted
>> >> > that for us it was generally happening with heavyish network load
>> >> > (6 - 8MB/s). Our LDom would come back online after about 5-10 mins.
>> >> >
>> >> > I'll send you the case number when I get back to work (on vacation now).
>> >> > I went through all the information-gathering things with Sun, so
>> >> > hopefully there is some information attached to my case that would
>> >> > help you out with yours.
>> >> >
>> >> > If you need any more specifics please let me know. Hopefully Sun can
>> >> > come to a resolution for this issue.
>> >> >
>> >> > Scott
>> >> >
>> >> > On Mon, Aug 11, 2008 at 11:23 AM, Andy Paton <andy.paton at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> I'm escalating this through Sun (call open) and won't let it go until
>> >> >> it is resolved.
>> >> >> Currently I'm going through the "we need additional information"
>> >> >> routine, but I will let you know how I get on.
>> >> >>
>> >> >> I have 7 guest LDOMs defined and one LDOM has dropped the network
>> >> >> completely; it can't ping other guest LDOMs on the local machine or
>> >> >> any other network device.
>> >> >> Only workaround so far is to reboot the primary domain.
>> >> >>
>> >> >> Andy
>> >> >>
>> >> >> 2008/8/11 Raghuram.Kothakota <Raghuram.Kothakota at sun.com>:
>> >> >> > Scott Adair wrote:
>> >> >> >>
>> >> >> >> Hi Andy
>> >> >> >>
>> >> >> >> There hasn't been a solution that I know of so far. We worked with
>> >> >> >> Sun support for about three weeks and weren't able to find a
>> >> >> >> resolution.
>> >> >> >>
>> >> >> >
>> >> >> > Is there an escalation and a CR filed against this issue?
>> >> >> >
>> >> >> > -Raghuram.
>> >> >> >>
>> >> >> >> We have since switched to zones on the same physical server and no
>> >> >> >> longer have the dropped network connections.
>> >> >> >>
>> >> >> >> Scott
>> >> >> >>
>> >> >> >> On 8/10/08, Andy Paton <andy.paton at gmail.com> wrote:
>> >> >> >>
>> >> >> >>>
>> >> >> >>> I'm also having the same problem, on a Sun T5240, Sol10u5, LDOM
>> >> >> >>> 1.03 with 8 LDOMs
>> >> >> >>>
>> >> >> >>> Any definitive solutions to resolve the issue?
>> >> >> >>>
>> >> >> >>> Many Thanks
>> >> >> >>>
>> >> >> >>> Andy
>> >> >> >>> _______________________________________________
>> >> >> >>> ldoms-discuss mailing list
>> >> >> >>> ldoms-discuss at opensolaris.org
>> >> >> >>> http://mail.opensolaris.org/mailman/listinfo/ldoms-discuss
>> >> >> >>>
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >> --
>> >> >> Blog - http://apaton.blogspot.com
>> >> >> Bookmarks - http://del.icio.us/apaton
>> >> >
>> >>
>> >> --
>> >> Blog - http://apaton.blogspot.com
>> >> Bookmarks - http://del.icio.us/apaton
>> >
>>
>> --
>> Blog - http://apaton.blogspot.com
>> Bookmarks - http://del.icio.us/apaton
>

--
Blog - http://apaton.blogspot.com
Bookmarks - http://del.icio.us/apaton
