Re: ksoftirqd using 100% CPU

2012-02-10 Thread Joerg Reuter
On Thu, Feb 09, 2012 at 04:43:25PM -0600, Ron Foster at Baldor-IS wrote:

> The only messages I can find have to do with a HiperSockets timeout.
> Feb  9 11:21:39 bus0104 kernel: NETDEV WATCHDOG: hsi0: transmit timed out
> Feb  9 11:21:39 bus0104 kernel: qeth: Recovery of device 0.0.5100 started ...

Hmm, that shouldn't happen. If you have a support agreement,
can you open a service request with Novell / SUSE, please? We'll
need the files generated by running supportconfig and dbginfo.sh,
and if you already have opened a case with IBM, the PMR would be handy.

Regards,
Joerg
--
Joerg Reuter    http://yaina.de/jreuter
And I make my way to where the warm scent of soil fills the evening air.
Everything is waiting quietly out there (Anne Clark)

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/


Re: ksoftirqd using 100% CPU

2012-02-10 Thread Shane G
On Fri, Feb 10th, 2012 at 11:00 PM, Joerg Reuter wrote:

> On Thu, Feb 09, 2012 at 04:43:25PM -0600, Ron Foster at Baldor-IS wrote:
>
> > The only messages I can find have to do with a HiperSockets timeout.
> > Feb  9 11:21:39 bus0104 kernel: NETDEV WATCHDOG: hsi0: transmit timed out
> > Feb  9 11:21:39 bus0104 kernel: qeth: Recovery of device 0.0.5100 started ...
>
> Hmm, that shouldn't happen.

Lol ...
I was wondering if the OP could see anything from z/VM, but if it really is a
softirq/tasklet problem, that probably means a dodgy (Linux) driver.
And on almost all occasions that means network ...

Keeping an eye on /proc/interrupts might be instructive.
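
A minimal sketch of one way to watch it, assuming Python is available on the guest: take two snapshots of /proc/interrupts a few seconds apart and list the sources whose counters grew the most. The function names (parse_interrupts, busiest, sample) are made up for illustration, and the parser only assumes the usual layout of a CPU header line followed by "label: count count ... description" lines.

```python
import time


def parse_interrupts(text):
    """Return {label: total count summed over all CPUs}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())        # header line: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                     # not a "label: counters" line
        total = 0
        for token in rest.split()[:ncpus]:
            if not token.isdigit():
                break                    # counters end where the description starts
            total += int(token)
        totals[label.strip()] = total
    return totals


def busiest(before, after, top=5):
    """Return up to `top` (delta, label) pairs, largest increase first."""
    deltas = [(count - before.get(label, 0), label) for label, count in after.items()]
    return sorted((d for d in deltas if d[0] > 0), reverse=True)[:top]


def sample(path="/proc/interrupts", interval=5.0):
    """Read `path` twice, `interval` seconds apart, and compare the snapshots."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return busiest(before, after)
```

Calling sample() while ksoftirqd is busy returns the (delta, label) pairs for that interval; printing them in a loop gives a rough picture of which interrupt source is climbing.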

Shane ...



Re: ksoftirqd using 100% CPU

2012-02-10 Thread Ron Foster at Baldor-IS
Shane,
As best we can tell, no relevant messages came out on the z/VM operator
console.
Ron



Re: ksoftirqd using 100% CPU

2012-02-10 Thread Ron Foster at Baldor-IS
Joerg,

We have a support agreement with Novell, so I will be opening up a service 
request.

The developers were getting hostile, so I had to revert the development system 
that was consistently having the problem to a previous level.  But I have found 
another system that had the problem once yesterday that we can use to gather 
the documentation.

Ron

I did not know that we could open a PMR with IBM directly.





Re: Anyone lose network connectivity during upgrade to SLES11 SP1

2012-02-10 Thread Ben Duncan
Hi, new to the group (I am taking over a 200-server z/VM Linux install).

Perhaps I can shed some light that may help. I know that on x86 Linux you
can delete the .rules file and reboot. The system will automagically
recreate the rules as needed when the udev / hald stuff starts running.

Maybe as part of the upgrade process you should remove those entries.

If one of the longtime mainframe gurus knows something I don't,
please chime in.


Ben Duncan - Business Network Solutions, Inc. 336 Elton Road Jackson MS, 39212
Never attribute to malice, that which can be adequately explained by stupidity
- Hanlon's Razor




-------- Original Message --------
Subject: Re: Anyone lose network connectivity during upgrade to SLES11 SP1
From: Marcy Cortes <marcy.d.cor...@wellsfargo.com>
Date: Thu, February 09, 2012 5:12 pm
To: LINUX-390@VM.MARIST.EDU

Ah, you are right! I must have looked at a SLES 10 server!
Glad it was useful, though!

Marcy 
-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@vm.marist.edu] On Behalf Of Ron Foster at Baldor-IS
Sent: Thursday, February 09, 2012 2:52 PM
To: LINUX-390@vm.marist.edu
Subject: Re: [LINUX-390] Anyone lose network connectivity during upgrade to SLES11 SP1

Marcy,

Thanks for the info.

It appears that on SLES11 SP1 the name of the file changed to:

70-persistent-net.rules

Thanks for giving the information on where the rules files are.

Ron

From: Linux on 390 Port [LINUX-390@VM.MARIST.EDU] On Behalf Of Marcy Cortes [marcy.d.cor...@wellsfargo.com]
Sent: Friday, February 03, 2012 9:36 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Anyone lose network connectivity during upgrade to SLES11 SP1

So you can modify /etc/udev/rules.d/30-net_persistent_names.rules

For the servers we have with more than 1 interface, it contains this:

SUBSYSTEM=="net", ACTION=="add", ENV{PHYSDEVPATH}=="*0.0.3000", IMPORT="/lib/udev/rename_netiface %k eth0"
SUBSYSTEM=="net", ACTION=="add", ENV{PHYSDEVPATH}=="*0.0.4000", IMPORT="/lib/udev/rename_netiface %k eth1"


That file refers you to
/usr/share/doc/packages/sysconfig/README.Persistent_Interface_Names
for more info.
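
For anyone who wants to check which interface name each device would get, here is a rough sketch in Python that pulls the device-number-to-name pairs out of rules shaped like the two above. It assumes each rule sits on one line and accepts the values with or without double quotes. The function name persistent_names and the regular expression are illustrative only and are not part of udev or SLES; rules of any other shape are simply skipped.

```python
import re

# Matches only the rule shape quoted above:
#   ENV{PHYSDEVPATH}=="*<bus id>", IMPORT="/lib/udev/rename_netiface %k <name>"
RULE = re.compile(
    r'ENV\{PHYSDEVPATH\}=="?\*?(?P<bus_id>[0-9a-fA-F.]+)"?,\s*'
    r'IMPORT="?/lib/udev/rename_netiface %k (?P<name>[\w-]+)"?'
)


def persistent_names(rules_text):
    """Return {device bus id: interface name} for the matching rules."""
    mapping = {}
    for line in rules_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                     # skip blank lines and comments
        match = RULE.search(line)
        if match:
            mapping[match.group("bus_id")] = match.group("name")
    return mapping
```

Feeding it the contents of the rules file gives a dictionary such as 0.0.3000 -> eth0, which is easy to compare before and after an upgrade.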



Marcy

-----Original Message-----
From: Linux on 390 Port [mailto:LINUX-390@vm.marist.edu] On Behalf Of Ron Foster at Baldor-IS
Sent: Friday, February 03, 2012 7:13 AM
To: LINUX-390@vm.marist.edu
Subject: Re: [LINUX-390] Anyone lose network connectivity during upgrade to SLES11 SP1

Tobias,

I will have to do further testing, but I think you have come up with the
cause of our problem. I am just going to have to figure out the best way
to fix it.

Here is the problem:

1. Almost all of our systems have more than one network interface.
2. So in order to avoid an error message, we have to modify our default
   route to include the interface. In order to keep that interface name
   from changing, we used the SLES10 version of persistent interface
   names. In other words, our default route looks like this:
   default 10.80.200.1 - qeth-bus-ccw-0.0.0700
3. You may ask why we used qeth-bus-ccw-0.0.0700 instead of the short
   interface name eth0. The reason is that, in our experience, the short
   interface names have a habit of changing. We would come in some
   Saturday night, take all the Linux systems down, apply some z/VM
   maintenance, and then bring up our Linux systems. At that point, some
   number of our Linux systems no longer communicated with the outside
   world. I finally got tired of that and changed all the default routes
   to use the long names.
4. You might say that that problem is fixed and does not happen anymore.
   On the system I am upgrading, the interface we want to use for
   communication to the outside world is eth4. I start the upgrade, and
   the second part of the upgrade is where I lose network connectivity.
   The interface name has changed to eth0, and it looks like SLES11 SP1
   is no longer honoring the long interface name, so this Linux system
   cannot communicate with the outside world during the second part of
   the upgrade.

I suppose my next step is to put the SLES10 SP4 system back, change the
default route to use eth4, and see whether the upgrade process also
changes the default route when it changes the interface name.
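
A quick way to spot this before an upgrade might be to compare the device named in the default route with the interface names the kernel currently knows. The sketch below is in Python. It assumes the routes file uses the column order of the default route line quoted in this message (destination, gateway, netmask, device) and that /sys/class/net lists the kernel interface names. All function names are hypothetical. A SLES10-style name such as qeth-bus-ccw-0.0.0700 is too long to be a kernel interface name, so it would be reported as something to review.

```python
import os


def default_route_interface(routes_text):
    """Return the device field of the first 'default' line, or None."""
    for line in routes_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split columns
        if len(fields) >= 4 and fields[0] == "default":
            return fields[3]
    return None


def route_problem(routes_text, interfaces):
    """Return a warning string if the default route names an unknown device."""
    iface = default_route_interface(routes_text)
    if iface is None or iface == "-":
        return None                              # no device given, nothing to check
    if iface not in interfaces:
        return "default route names %r, but only %s exist" % (
            iface, sorted(interfaces))
    return None


def current_interfaces(sys_class_net="/sys/class/net"):
    """Kernel interface names, assuming sysfs is mounted in the usual place."""
    return set(os.listdir(sys_class_net))
```

For example, route_problem(open("/etc/sysconfig/network/routes").read(), current_interfaces()) returns None when the default route points at an existing interface and a short warning otherwise; the path to the routes file is the usual SUSE location and should be adjusted if yours differs.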

Does anyone know of documentation that describes this behavior?

Thanks,
Ron


From: Linux on 390 Port [LINUX-390@VM.MARIST.EDU] On Behalf Of Tobias Doerkes [tdoer...@hotmail.com]
Sent: Friday, February 03, 2012 2:03 AM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Anyone lose network connectivity during upgrade to SLES11 SP1

Hi,

What about routing? Is it possible that the default route points to
an inactive interface?

Kind regards,
Tobias.

PS: Sorry, I forgot the subject.


Re: ksoftirqd using 100% CPU

2012-02-10 Thread Alan Altmark
On Friday, 02/10/2012 at 10:21 EST, Ron Foster at Baldor-IS
<rfos...@baldor.com> wrote:

> I did not know that we could open a PMR with IBM directly.

Only if your Linux support contract is with IBM.

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: ksoftirqd using 100% CPU

2012-02-10 Thread Mark Post
On 2/10/2012 at 11:33 AM, Alan Altmark <alan_altm...@us.ibm.com> wrote:
> On Friday, 02/10/2012 at 10:21 EST, Ron Foster at Baldor-IS
> <rfos...@baldor.com> wrote:
>
> > I did not know that we could open a PMR with IBM directly.
>
> Only if your Linux support contract is with IBM.

But, if you've already opened a PMR with IBM for z/VM or for hardware (for 
whatever reason related to the incident), we'd like to know about it anyway.  I 
know the same is true in the reverse, if a customer comes to us first, and then 
later needs to work with IBM on the problem because it involves IBM software or 
hardware.


Mark Post



Re: SHARE in Atlanta Call for Chairs

2012-02-10 Thread Jagos, Brian V
Hello to everyone,

This e-mail is a request for people to chair sessions at a conference called
SHARE.

SHARE Inc. is an independent, volunteer-run association providing enterprise
technology professionals with continuous education and training, valuable
professional networking and effective industry influence.

History
In 1955, just two years after the release of IBM's first computer, a handful of 
the earliest IT professionals collaborated to form SHARE. Thus came into being 
the world's first organization of computing professionals.

Over the past five decades, SHARE has become synonymous with high-quality, 
user-driven education and resources to make enterprise computing specialists 
more effective professionals. SHARE serves more than 20,000 individuals 
representing over 2,000 of IBM's top enterprise computing customers. Our 
constituency includes many of the top international corporations (including the 
majority of the FORTUNE 500), universities and colleges, municipal through 
federal government organizations, and industry-leading consultants. While 
independent, SHARE maintains a close partnership with IBM and its subsidiaries, 
as well as with leading solution providers to continually strengthen SHARE's 
benefits for its members.

Our Mission
To enable people in Information Technology environments to achieve business 
results.

Our Vision
We will be an indispensable partner with our members and IBM - the community 
where users and technology meet to shape the future of Information Technology.

The Value of SHARE Membership
Participation in SHARE provides the opportunity to build relationships with a
diverse community of IT professionals, enhances your professional development,
and positions you as a thought leader in the industry.



Just in case you need it, this is a reminder that SHARE is approaching rapidly 
and the Linux and VM program is looking for session chairs.



Below is a list of the sessions.  If you are planning on attending SHARE in
Atlanta GA, please volunteer to chair a session or two!  Please respond to me,
off list, and let me know what you are interested in.  Also, when you reply to
me, please include your e-mail address. Detailed abstracts on each session are
available on http://www.share.org.


id     Day  Date and Time      Session Title / Speakers
10738  Mon  2012-03-12, 09:30  Current & Future Features SUSE Linux Enterprise Server for System z
                               Marcus Kraft (Speaker)
10311  Mon  2012-03-12, 11:00  z/VM Platform Update
                               Alan Altmark (Speaker)
10177  Mon  2012-03-12, 11:00  Introduction to Automating Linux System Administration using Cfengine 3
                               Aleksey Tsalolikhin (Speaker)
10771  Mon  2012-03-12, 11:00  Social Business in a Heterogeneous Private Cloud: How to do more, Faster than before with Less
                               Michael Wojton (Speaker)
10174  Mon  2012-03-12, 13:30  The Cloud Computing Cookbook: The Hypervisor Side
                               Michael D. MacIsaac (Speaker)
10443  Mon  2012-03-12, 13:30  Introduction to Virtualization: z/VM Basic Concepts and Terms
                               Bill Bitner (Speaker)
10308  Mon  2012-03-12, 13:30  Best Practices for Red Hat Enterprise Linux on System z
                               Bradford E. Hinson (Speaker)
10175  Mon  2012-03-12, 15:00  The Cloud Computing Cookbook: The Linux Side
                               Michael D. MacIsaac (Speaker)
10567  Mon  2012-03-12, 16:30  Using z/VM DirMaint in an SSI Cluster
                               Pamela Bryant (Speaker)
10310  Mon  2012-03-12, 16:30  Current & Future State of Red Hat Enterprise Linux
                               Bradford E. Hinson (Speaker)
10245  Mon  2012-03-12, 16:30  Automation Scenarios for a z/VM Cluster and Linux on System z Guests
                               Tracy Dean (Speaker)
10493  Tue  2012-03-13, 09:30  Introduction to REXX Workshop (Part 1 of 2) (BYOC)