Re: ksoftirqd using 100% CPU
On Thu, Feb 09, 2012 at 04:43:25PM -0600, Ron Foster at Baldor-IS wrote:
> The only messages I can find have to do with a hipersockets time out.
>
> Feb 9 11:21:39 bus0104 kernel: NETDEV WATCHDOG: hsi0: transmit timed out
> Feb 9 11:21:39 bus0104 kernel: qeth: Recovery of device 0.0.5100 started ...

Hmm, that shouldn't happen.

If you have a support agreement, can you open a service request with Novell / SUSE, please? We'll need the files generated by running supportconfig and dbginfo.sh, and if you already have opened a case with IBM, the PMR would be handy.

Regards,
Joerg

--
Joerg Reuter http://yaina.de/jreuter
And I make my way to where the warm scent of soil fills the evening air. Everything is waiting quietly out there (Anne Clark)

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/
Re: ksoftirqd using 100% CPU
On Fri, Feb 10th, 2012 at 11:00 PM, Joerg Reuter wrote:
> On Thu, Feb 09, 2012 at 04:43:25PM -0600, Ron Foster at Baldor-IS wrote:
>> The only messages I can find have to do with a hipersockets time out.
>>
>> Feb 9 11:21:39 bus0104 kernel: NETDEV WATCHDOG: hsi0: transmit timed out
>> Feb 9 11:21:39 bus0104 kernel: qeth: Recovery of device 0.0.5100 started ...
>
> Hmm, that shouldn't happen.

Lol ... I was wondering if the OP could see anything from z/VM, but if it really is a softirq/tasklet problem, that probably means a dodgy (Linux) driver. And on almost all occasions that means network ...

Keeping an eye on /proc/interrupts might be instructive.

Shane ...
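
One way to keep an eye on /proc/interrupts is to sample it twice and look at which counters grew in between. The following is only a rough sketch in Python, not something from the thread: the function names (parse_interrupts, busiest, watch) are made up for illustration, and it assumes the usual layout of a header row of CPU columns followed by "label: count count ..." rows.

```python
import time


def parse_interrupts(text):
    """Return {label: total_count}, summing the per-CPU columns of each row."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Only the first ncpu fields are counters; the rest is description text.
        counts = [int(f) for f in fields[:ncpu] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals


def busiest(before, after, top=5):
    """Rank interrupt sources by how much their count grew between two samples."""
    deltas = {k: after[k] - before.get(k, 0) for k in after}
    ranked = sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
    return [(k, d) for k, d in ranked[:top] if d > 0]


def watch(interval=5.0, path="/proc/interrupts"):
    """Take two samples `interval` seconds apart and print the fastest growers."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    for label, delta in busiest(first, second):
        print(f"{label:>10} +{delta}")
```

Calling watch() while ksoftirqd is busy would show whether one source dominates; which row corresponds to the qeth/HiperSockets device depends on the kernel and platform, so the labels need to be read against the actual file.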
Re: ksoftirqd using 100% CPU
Shane,

As best we can tell, no relevant messages came out on the VM operator console.

Ron

From: Linux on 390 Port [LINUX-390@VM.MARIST.EDU] On Behalf Of Shane G [ibm-m...@tpg.com.au]
Sent: Friday, February 10, 2012 6:46 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: ksoftirqd using 100% CPU

> I was wondering if the OP could see anything from z/VM, but if it really is a softirq/tasklet problem that probably means (Linux) dodgy driver.
Re: ksoftirqd using 100% CPU
Joerg,

We have a support agreement with Novell, so I will be opening a service request.

The developers were getting hostile, so I had to revert the development system that was consistently having the problem to a previous level. But I have found another system that had the problem once yesterday, which we can use to gather the documentation.

Ron

I did not know that we could open a PMR with IBM directly.

From: Linux on 390 Port [LINUX-390@VM.MARIST.EDU] On Behalf Of Joerg Reuter [jreu...@suse.de]
Sent: Friday, February 10, 2012 6:00 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: ksoftirqd using 100% CPU

> If you have a support agreement, can you open a service request with Novell / SUSE, please? We'll need the files generated by running supportconfig and dbginfo.sh, and if you already have opened a case with IBM, the PMR would be handy.
Re: Anyone lose network connectivity during upgrade to SLES11 SP1
Hi, new to the group (I am taking over a 200-server z/VM Linux install). Perhaps I can shed some light that may help.

I know that on x86 Linux you can delete the .rules file and reboot. The system will automagically recreate the rules as needed when the udev / hald stuff starts running. Maybe as part of the upgrade process you should remove those entries.

If one of the longtime MAINFRAME gurus knows something I don't, please chime in.

Ben Duncan - Business Network Solutions, Inc.
336 Elton Road
Jackson MS, 39212
Never attribute to malice, that which can be adequately explained by stupidity - Hanlon's Razor

Original Message
Subject: Re: Anyone lose network connectivity during upgrade to SLES11 SP1
From: Marcy Cortes marcy.d.cor...@wellsfargo.com
Date: Thu, February 09, 2012 5:12 pm
To: LINUX-390@VM.MARIST.EDU

Ah, you are right! I must have looked at a SLES 10 server! Glad it was useful, though!

Marcy

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@vm.marist.edu] On Behalf Of Ron Foster at Baldor-IS
Sent: Thursday, February 09, 2012 2:52 PM
To: LINUX-390@vm.marist.edu
Subject: Re: [LINUX-390] Anyone lose network connectivity during upgrade to SLES11 SP1

Marcy,

Thanks for the info. It appears that on SLES11 SP1 the name of the file changed to 70-persistent-net.rules.

Thanks for giving the information on where the rules files are.
Ron

From: Linux on 390 Port [LINUX-390@VM.MARIST.EDU] On Behalf Of Marcy Cortes [marcy.d.cor...@wellsfargo.com]
Sent: Friday, February 03, 2012 9:36 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Anyone lose network connectivity during upgrade to SLES11 SP1

So you can modify /etc/udev/rules.d/30-net_persistent_names.rules

For the servers we have with more than 1 interface, it contains this:

    SUBSYSTEM=="net", ACTION=="add", ENV{PHYSDEVPATH}=="*0.0.3000", IMPORT="/lib/udev/rename_netiface %k eth0"
    SUBSYSTEM=="net", ACTION=="add", ENV{PHYSDEVPATH}=="*0.0.4000", IMPORT="/lib/udev/rename_netiface %k eth1"

That file refers you to /usr/share/doc/packages/sysconfig/README.Persistent_Interface_Names for more info.

Marcy

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@vm.marist.edu] On Behalf Of Ron Foster at Baldor-IS
Sent: Friday, February 03, 2012 7:13 AM
To: LINUX-390@vm.marist.edu
Subject: Re: [LINUX-390] Anyone lose network connectivity during upgrade to SLES11 SP1

Tobias,

I will have to do further testing, but I think you have come up with the cause of our problem. I am just going to have to figure out the best way to fix it.

Here is the problem:

1. Almost all of our systems have more than one network interface.

2. So in order to avoid an error message, we have to modify our default route to include the interface. To keep that interface name from changing, we used the SLES10 version of persistent interface names. In other words, our default route looks like this:

    default 10.80.200.1 - qeth-bus-ccw-0.0.0700

3. You may ask why we used qeth-bus-ccw-0.0.0700 instead of the short interface name eth0. The reason is that, in our experience, the short interface names have a habit of changing. We would come in some Saturday night, take all the Linux systems down, apply some z/VM maintenance, and then bring up our Linux systems. At that point, some number of our Linux systems no longer communicated with the outside world.
I finally got tired of that and changed all the default routes to use the long names.

4. You might say that that problem is fixed and does not happen anymore. On the system I am upgrading, the interface we want to use for communication with the outside world is eth4. I start the upgrade, and the second part of the upgrade is where I lose network connectivity. The interface name has changed to eth0, and it looks like SLES11 SP1 no longer honors the long interface name, so this Linux system cannot communicate with the outside world during the second part of the upgrade.

I suppose my next step is to put the SLES10 SP4 system back, change the default route to use eth4, and see whether the upgrade process also changes the default route when it changes the interface name.

Does anyone know of documentation that describes this behavior?

Thanks,
Ron

From: Linux on 390 Port [LINUX-390@VM.MARIST.EDU] On Behalf Of Tobias Doerkes [tdoer...@hotmail.com]
Sent: Friday, February 03, 2012 2:03 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Anyone lose network connectivity during upgrade to SLES11 SP1

Hi,

what about routing? Is it possible that the default route does point to an inactive interface?

Kind regards,
Tobias.

PS: Sorry I forgot the subject.
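
The failure Tobias suspects (a default route that names an interface that is no longer there) can be checked mechanically before and after an upgrade. What follows is a hedged sketch in Python, not a tool from the thread: the function names are hypothetical, it assumes the SUSE routes file layout of "destination gateway netmask interface" with "-" for an unset column, and it assumes the file lives at /etc/sysconfig/network/routes. It only compares literal names, so a long SLES10-style name such as qeth-bus-ccw-0.0.0700 will be reported even on a system where the network scripts still resolve it; treat a hit as something to review, not as proof of a fault.

```python
import os


def route_interfaces(routes_text):
    """Yield (destination, interface) for every route that names an interface."""
    for raw in routes_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        # Assumed columns: destination gateway netmask interface [type] [options]
        if len(fields) >= 4 and fields[3] != "-":
            yield fields[0], fields[3]


def stale_routes(routes_text, present):
    """Return routes whose interface name is not in the set `present`."""
    return [(dest, ifc) for dest, ifc in route_interfaces(routes_text)
            if ifc not in present]


def check(routes_path="/etc/sysconfig/network/routes", sysfs="/sys/class/net"):
    """Compare the routes file against the interface names the kernel knows."""
    with open(routes_path) as f:
        return stale_routes(f.read(), set(os.listdir(sysfs)))
```

Running check() on the SLES10 SP4 system and again after the first stage of the upgrade would show whether the name in the default route stopped matching anything.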
Re: ksoftirqd using 100% CPU
On Friday, 02/10/2012 at 10:21 EST, Ron Foster at Baldor-IS rfos...@baldor.com wrote:
> I did not know that we could open a PMR with IBM directly.

Only if your Linux support contract is with IBM.

Alan Altmark
Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
Re: ksoftirqd using 100% CPU
On 2/10/2012 at 11:33 AM, Alan Altmark alan_altm...@us.ibm.com wrote:
> On Friday, 02/10/2012 at 10:21 EST, Ron Foster at Baldor-IS rfos...@baldor.com wrote:
>> I did not know that we could open a PMR with IBM directly.
>
> Only if your Linux support contract is with IBM.

But, if you've already opened a PMR with IBM for z/VM or for hardware (for whatever reason related to the incident), we'd like to know about it anyway. I know the same is true in the reverse, if a customer comes to us first and then later needs to work with IBM on the problem because it involves IBM software or hardware.

Mark Post
Re: Share in Atlanta Call for Chair's
Hello to everyone,

This e-mail is a request for people to chair sessions for a conference called SHARE.

SHARE Inc. is an independent, volunteer-run association providing enterprise technology professionals with continuous education and training, valuable professional networking and effective industry influence.

History

In 1955, just two years after the release of IBM's first computer, a handful of the earliest IT professionals collaborated to form SHARE. Thus came into being the world's first organization of computing professionals. Over the past five decades, SHARE has become synonymous with high-quality, user-driven education and resources to make enterprise computing specialists more effective professionals.

SHARE serves more than 20,000 individuals representing over 2,000 of IBM's top enterprise computing customers. Our constituency includes many of the top international corporations (including the majority of the FORTUNE 500), universities and colleges, municipal through federal government organizations, and industry-leading consultants. While independent, SHARE maintains a close partnership with IBM and its subsidiaries, as well as with leading solution providers to continually strengthen SHARE's benefits for its members.

Our Mission

To enable people in Information Technology environments to achieve business results.

Our Vision

We will be an indispensable partner with our members and IBM - the community where users and technology meet to shape the future of Information Technology.

The Value of SHARE Membership

Participation in SHARE provides the opportunity to build relationships with a diverse community of IT professionals, enhances your professional development, and positions you as a thought leader in the industry.

Just in case you need it, this is a reminder that SHARE is approaching rapidly and the Linux and VM program is looking for session chairs. Below is a list of the sessions.
If you are planning on attending SHARE in Atlanta GA, please volunteer to chair a session or two! Please respond to me, off list, and let me know what you are interested in. Also, when you reply to me, please include your e-mail address.

Detailed abstracts on each session are available on www.share.org.

ID     Session Title
       Day, Date and Time; Speakers

10738  Current Future Features SUSE Linux Enterprise Server for System z
       Mon 2012-03-12, 09:30; Marcus Kraft (Speaker)
10311  z/VM Platform Update
       Mon 2012-03-12, 11:00; Alan Altmark (Speaker)
10177  Introduction to Automating Linux System Administration using Cfengine 3
       Mon 2012-03-12, 11:00; Aleksey Tsalolikhin (Speaker)
10771  Social Business in a Heterogeneous Private Cloud: How to do more, Faster than before with Less
       Mon 2012-03-12, 11:00; Michael Wojton (Speaker)
10174  The Cloud Computing Cookbook: The Hypervisor Side
       Mon 2012-03-12, 13:30; Michael D. MacIsaac (Speaker)
10443  Introduction to Virtualization: z/VM Basic Concepts and Terms
       Mon 2012-03-12, 13:30; Bill Bitner (Speaker)
10308  Best Practices for Red Hat Enterprise Linux on System z
       Mon 2012-03-12, 13:30; Bradford E. Hinson (Speaker)
10175  The Cloud Computing Cookbook: The Linux Side
       Mon 2012-03-12, 15:00; Michael D. MacIsaac (Speaker)
10567  Using z/VM DirMaint in an SSI Cluster
       Mon 2012-03-12, 16:30; Pamela Bryant (Speaker)
10310  Current Future State of Red Hat Enterprise Linux
       Mon 2012-03-12, 16:30; Bradford E. Hinson (Speaker)
10245  Automation Scenarios for a z/VM Cluster and Linux on System z Guests
       Mon 2012-03-12, 16:30; Tracy Dean (Speaker)
10493  Introduction to REXX Workshop (Part 1 of 2) (BYOC)
       Tue 2012-03-13, 09:30