Kernel bug in SLES 9

2009-01-06 Thread van Sleeuwen, Berry
Hello listers,
 
Our data warehouse machine suffers from a kernel bug. Over the past year we
have hit this issue repeatedly, sometimes as often as every week. The bug
usually appears when performing an archive, but it also happens at other
times. In all cases the machine is under a high I/O load.
 
We have upgraded the kernel but that did not help in solving the error.
 
Anyone have the same problems, especially with regard to Oracle systems?
Any help or suggestions would be appreciated.
 
System: SLES 9 SP4, Kernel 2.6.5-7.
Oracle 10.1.0.5.0.
 
Message from /var/log/messages:
 
kernel: kernel BUG at fs/aio.c:733!
kernel: illegal operation: 0001 [#1]
kernel: CPU:0Not tainted (2.6.5-7.315-s390x
SLES9_SP4_BRANCH-200811261403180100)
kernel: Process oracle (pid: 24794, task: 000144b81858, ksp:
dd52b630)
kernel: Krnl PSW : 07018000 001d2eaa
(__aio_run_iocbs+0x1a2/0x39c)
kernel: Krnl GPRS: 0008  001f
00134d62
kernel:001d2ea8  001d3b26

kernel:001d39ca 2000 00013f7da300
00013f7da300
kernel:39bb8900 00378ac0 001d2ea8
dd52bc60
kernel: Krnl Code: 00 00 b9 04 00 2c b9 04 00 39 a7 49 00 00 c0 e5 ff ff
fd d0
kernel: Call Trace:
kernel:  [001d39ca] io_submit_one+0x1b6/0x268
kernel:  [001d3b26] sys_io_submit+0xaa/0x13c
kernel:  [0011fc9c] sysc_noemu+0x10/0x16

I did find a discussion about fs/aio.c. More specifically, it concerned the
line spin_lock(&info->ring_lock) in the routine aio_read_evt, and it looks
like this is exactly where we hit our bug. Looking at the kernel levels, even
up to the most current one, that code has not been changed yet.
 
 
With kind regards,
Berry van Sleeuwen
Atos Origin (http://www.atosorigin.com/), MO OC Mainframe Services
Flight Forum 3000, 5657 EW Eindhoven
+31 (0)6 22564276

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390

Re: Kernel bug in SLES 9

2009-01-06 Thread Christian Borntraeger
Hello Berry,

I have no advice on your specific problem, but a kernel bug, oops, warning or
panic is a very strong indication that this is a real code problem. I suggest
opening a service request/problem ticket.

hope this helps

Christian



Re: Setting up TSM backups to a Mainframe Server

2009-01-06 Thread David Boyes
On 1/5/09 7:18 PM, Thomas Kern <tlk_sysp...@yahoo.com> wrote:

> Or set it to run as root under CRON and you can specify the filesystem
> such as 'dsmc incremental / /srv /oradb'.
>
> You can also have the output processed into a status message to be sent
> to a central administrator or to XYMON (formerly Hobbit).

Yes, that's another way to do it. The downside of that approach is that you
lose the centralized scheduling and result reporting that TSM already
provides, and you have to come up with your own result management. You can
specify the filesystems in the dsm.opt file if you know which ones you want.
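
For the cron route, "your own result management" can start as small as turning the dsmc exit status into a one-line summary. The following is a minimal sketch, assuming the return codes documented for the backup-archive client command line (0, 4, 8, 12). The function name, log path and mail address are hypothetical and only there for illustration.

```shell
#!/bin/bash
# Hypothetical helper: map a dsmc exit status to a one-line summary.
# Assumes the documented client return codes 0, 4, 8 and 12.
tsm_status_label() {
    case "$1" in
        0)  echo "OK: all operations completed" ;;
        4)  echo "WARN: completed, some files skipped" ;;
        8)  echo "WARN: completed with warnings" ;;
        12) echo "ERROR: completed with errors" ;;
        *)  echo "UNKNOWN: return code $1" ;;
    esac
}

# Example use from a wrapper script in root's crontab (not run here):
#   dsmc incremental / /srv /oradb >/var/log/dsmc-nightly.log 2>&1
#   tsm_status_label "$?" | mail -s "TSM backup $(hostname)" admin@example.com
```

This only covers the reporting half. The scheduling and history that the TSM server provides still have to be replaced by cron and whatever collects the mails.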



Re: Kernel bug in SLES 9

2009-01-06 Thread van Sleeuwen, Berry
Hello Christian,

Yes, that was my idea also but there is a problem with my accounts at
Novell and IBM. I can't open any issue at this time and my manager
expects me to solve this within a few days.

Regards, Berry. 



Multiple Network Interface Routing Problem

2009-01-06 Thread Sollenberger, Justin W Mr CIV US DISA CDB24
We are currently running a SLES 10 SP2 guest (hostx) with access to
three networks (eth0, eth1, eth2).  The issue we are having is that the
default route is not correct after an IPL.  If I delete the incorrect
route (using: route del) and add the correct route (using: route add),
everything works as it should.  We have another system (hostz) running
with the same three interfaces and are having no problems there.  I'm
just not sure how Linux is creating its routing table from the info
provided in the ifroute files.  hostz comes up with the correct default
gateway (x.y.z.129) and hostx does not.  The IPs/gateways/masks are what
was given to me by our networking office.  I've included the output from
route and the /etc/sysconfig/network/ifroute* files from hostx and from
hostz:

hostx:/etc/sysconfig/network # route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
x.y.z.128       0.0.0.0         255.255.255.224 U     0      0        0 eth0
1.0.0.0         0.0.0.0         255.0.0.0       U     0      0        0 eth2
2.0.0.0         0.0.0.0         255.0.0.0       U     0      0        0 eth1
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
default         1.a.1.1         0.0.0.0         UG    0      0        0 eth0

hostx:/etc/sysconfig/network # cat ifroute-qeth-bus-ccw-0.0.0*
default         x.y.z.129       0.0.0.0         eth0

default         1.a.1.1         0.0.0.0         eth2

default         2.b.1.1         0.0.0.0         eth1

hostz:/etc/sysconfig/network # route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
x.y.z.128       0.0.0.0         255.255.255.224 U     0      0        0 eth0
1.0.0.0         0.0.0.0         255.0.0.0       U     0      0        0 eth2
2.0.0.0         0.0.0.0         255.0.0.0       U     0      0        0 eth1
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
default         x.y.z.129       0.0.0.0         UG    0      0        0 eth0

hostz:/etc/sysconfig/network # cat ifroute-qeth-bus-ccw-0.0.0*
default         x.y.z.129       0.0.0.0         eth0

default         1.a.1.1         0.0.0.0         eth2

default         2.b.1.1         0.0.0.0         eth1


Any ideas on why hostx and hostz come up with different default gateways
when the configuration files are the same?  What am I missing?  How can
I correct it?  Thanks in advance for your help.

VR,
  
Justin Sollenberger



Re: Multiple Network Interface Routing Problem

2009-01-06 Thread Mark Post
On 1/6/2009 at 11:59 AM, Sollenberger, Justin W Mr CIV US DISA CDB24
<justin.sollenber...@csd.disa.mil> wrote:
-snip-
> Any ideas on why hostx and hostz come up with different default gateways
> when the configuration files are the same?  What am I missing?  How can
> I correct it?  Thanks in advance for your help.

What are the contents of /etc/sysconfig/network/routes on the two systems?


Mark Post



Re: Multiple Network Interface Routing Problem

2009-01-06 Thread David Boyes
There's no guarantee of the order in which network interfaces initialize, so
I think the confusion comes from the multiple default route specifications,
and you're getting lucky with the other machine picking the right one. There
should be only one default route specified. The route del/add flushes the
extra entries, which is why it then works.

I think I would try removing the default entry from the ifroute files for
all but one interface (the one that should actually be the default network
if there is not a more specific route) and add an init script to establish
any other needed routes later in the boot process. That way you can control
when the additional connectivity becomes available and ensure that the routes
are correctly inserted.
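
A quick way to spot the competing entries is to list every "default" line in the per-interface files. This is a minimal sketch, assuming the whitespace-separated layout shown in the ifroute output earlier in the thread (destination in the first field). The function names are made up for illustration, and it does not look at /etc/sysconfig/network/routes.

```shell
#!/bin/bash
# List every "default" entry found in ifroute-* files under a directory.
# More than one line of output means competing default routes.
list_default_routes() {
    local dir="${1:-/etc/sysconfig/network}"
    local f
    for f in "$dir"/ifroute-*; do
        [ -f "$f" ] || continue
        # Field 1 is the destination; print the file name and the whole entry.
        awk -v file="${f##*/}" '$1 == "default" { print file ": " $0 }' "$f"
    done
}

# Print just the number of default entries found.
count_default_routes() {
    list_default_routes "$1" | wc -l | tr -d ' '
}
```

Running count_default_routes with no argument checks /etc/sysconfig/network. Anything above 1 means the winner depends on which interface comes up last.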



Re: Kernel bug in SLES 9

2009-01-06 Thread Alan Altmark
On Tuesday, 01/06/2009 at 09:42 EST, van Sleeuwen, Berry
<berry.vansleeu...@atosorigin.com> wrote:
> Yes, that was my idea also but there is a problem with my accounts at
> Novell and IBM. I can't open any issue at this time and my manager
> expects me to solve this within a few days.

Solving a transient kernel problem within a few days may or may not be
reasonable, but I think it assumes that you have a support structure in
place.  When you combine a false assumption with Management's natural
desire to expect you to deliver more than is reasonable, alarms should
begin to sound.  Perform an MER, Management Expectation Reset.  In my
experience, that's less of a hit than accepting an unreasonable request
and then failing.

And, if you do then happen to perform a miracle, they will call you
Scotty and worship the quicksand you walk on.  It doesn't get any better
than that!  ;-)

Alan Altmark
z/VM Development
IBM Endicott



Re: Multiple Network Interface Routing Problem

2009-01-06 Thread Ron Foster at Baldor-IS

Justin,

We have systems with multiple network interfaces.  I finally had to
modify the default route in /etc/sysconfig/network/routes to look like this:

default 32.71.175.1 - qeth-bus-ccw-0.0.0600

Notice I had to use the persistent name for the interface.  The eth0,
eth1 type of name can change from IPL to IPL.

Ron





Re: Multiple Network Interface Routing Problem

2009-01-06 Thread Ron Foster at Baldor-IS

Mark,

Come to think of it, the last time I remember having device names change
on me was on a few of the systems that I upgraded to SLES10 SP2.  I don't
remember what level of code I was coming from: SLES10 or SLES10 SP1.

I still think I will stick with the long names for the time being.  On
most of our systems we have two hipersocket interfaces defined, 5100 and
5200.  hsi0 is not always 5100.

Now that I know that the rules file is there, I can go fix it.  But I
have other things to do right now.

Thanks for the info,

Ron

Mark Post wrote:

> On 1/6/2009 at 12:32 PM, Ron Foster at Baldor-IS <rfos...@baldor.com> wrote:
> -snip-
>> Notice I had to use the persistent name for the interface.  The eth0,
>> eth1 type of name can change from IPL to IPL.
>
> If you're running SLES10, that shouldn't be true.  The udev rules in
> /etc/udev/rules.d/30-net_persistent_names.rules should provide consistency
> across IPLs.
>
> Mark Post



Re: Kernel bug in SLES 9

2009-01-06 Thread Mark Post
On 1/6/2009 at  9:40 AM, van Sleeuwen, Berry
<berry.vansleeu...@atosorigin.com> wrote:
> Hello Christian,
>
> Yes, that was my idea also but there is a problem with my accounts at
> Novell and IBM. I can't open any issue at this time and my manager
> expects me to solve this within a few days.

Even if you were able to open service requests, a fix within a few days is not
likely to happen.  Looking at Bugzilla, I see a number of different problems
with BUG reports in fs/aio.c, and most of them went on for months and months.
I believe the workaround is to disable the use of AIO in Oracle.  That will
likely cause a performance degradation, but it might keep you limping along
until your support contracts are ironed out.
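
Whichever Oracle setting ends up being changed (Oracle documents the initialization parameters disk_asynch_io and filesystemio_options in this area, though whether they are the right knob for this system is an assumption), the kernel side can be checked independently. This is a minimal sketch, assuming the running kernel exposes the fs.aio-nr and fs.aio-max-nr counters under /proc/sys/fs. The function name is made up, and the function simply reports when the files are missing.

```shell
#!/bin/bash
# Report how many kernel AIO request slots are currently reserved system-wide.
# Paths default to the fs.aio-nr / fs.aio-max-nr sysctl files; both can be
# overridden, which also makes the function easy to try on a test box.
aio_usage() {
    local nr_file="${1:-/proc/sys/fs/aio-nr}"
    local max_file="${2:-/proc/sys/fs/aio-max-nr}"
    if [ ! -r "$nr_file" ] || [ ! -r "$max_file" ]; then
        echo "aio counters not available"
        return 1
    fi
    echo "aio-nr=$(cat "$nr_file") aio-max-nr=$(cat "$max_file")"
}
```

If aio-nr stays above zero after the database has been restarted with AIO switched off, some process is still setting up AIO contexts.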


Mark Post



Re: Kernel bug in SLES 9

2009-01-06 Thread Berry van Sleeuwen
Hello Mark,

Thanks for your comments. This workaround would be an option. We'll have
to look into that.

I did find, when I was looking into aio.c, that some work has been done on
it, and also that some parts have been under discussion for some time now.
aio.c (and aio.h) have been changed to some extent in later kernel
versions, but I can't say whether those changes would have any effect on our
problem.

Just for my own understanding (and for some input at our next meeting
tomorrow): suppose we were able to open a service request, what would
be the chance that a kernel bug gets fixed through it? Based
on the discussions about aio.c, my guess would be that a kernel fix
would be difficult to get implemented, especially when a quick result is
expected. Am I correct in expecting that the service would come down to
advice like changing specific parameters and/or installing a newer
kernel or patch? (Note that this is in no way aimed at you personally. I
expect kernel changes to belong to the kernel development team, so any
change in that part is outside the scope of any company that provides
support. The problem is how to convince the upper level to view it the
same way. As Alan suggested, input for an MER.)

Do you know by any chance which button to push for AIO in Oracle? I know
that we covered that when the server was installed 4 years ago, but I
didn't do that part myself. IIRC it was asynch_IO=yes or some parameter
like that. Correct?

Is there any guess as to what performance penalty disabling it would
give us? The server hits its limits quite often, so I expect this
question to be the next one.

Regards, Berry.



Re: Multiple Network Interface Routing Problem

2009-01-06 Thread Sollenberger, Justin W Mr CIV US DISA CDB24
Thanks to all who replied.  I was able to get it working properly by
removing all of the default routes except for one.  In the ifroute files,
I replaced the removed default routes with the more specific routes for
those networks.

Now my route table looks like this:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
x.y.z.128       *               255.255.255.224 U     0      0        0 eth0
link-local      *               255.255.0.0     U     0      0        0 eth0
1.0.0.0         1.a.1.1         255.0.0.0       UG    0      0        0 eth2
2.0.0.0         2.b.1.1         255.0.0.0       UG    0      0        0 eth1
loopback        *               255.0.0.0       U     0      0        0 lo
default         x.y.z.129       0.0.0.0         UG    0      0        0 eth0

VR,
  
Justin Sollenberger 

Linux on System z Administrator
Operating Systems, CDB24
DECC Mechanicsburg
DSN: 430-8386
Comm: 717-605-8386
justin.sollenber...@csd.disa.mil




Re: Kernel bug in SLES 9

2009-01-06 Thread Mark Post
On 1/6/2009 at  3:51 PM, Berry van Sleeuwen <berry.vansleeu...@xs4all.nl>
wrote:
-snip-
> Just for my good understanding (and for some input tomorrow at our next
> meeting). Suppose we would be able to open a service request, what would
> be the chance a kernel bug can be fixed through a service request?

For a true software bug (and when the program itself is telling you it's a bug,
it's hard to argue that it's not), assuming the proper support contract is in
place, a fix will be created and a PTF sent to the customer for testing.  The
major Linux distribution providers have a significant amount of kernel
development skills on staff.  Many of the heavy hitters in Linux kernel
development work directly for the distribution providers.

> Based
> on the discussions on the aio.c my guess would be that a kernel fix
> would be difficult to get implemented, especially when a quick result is
> expected. Am I correct when I expect that the service would be close to
> an advice like changing specific parameters and/or install a newer
> kernel or patch?

If you're not already at the latest kernel, then most likely you would be asked 
to upgrade to that first and test.  If the problem still exists, then a PTF 
would (eventually) be produced.  You didn't provide the specific kernel RPM 
version in your first note, so I don't know if that would apply to you or not.  
The most current version is 2.6.5-7.315.

In general, parameter changes may be requested, if you're not following the 
recommendations from the ISV that certified the software.  If you are, and 
there is a bug, we will work to fix it.  In many cases, PTFs are provided that 
we know won't fix the problem, but might provide more clues as to just what is 
happening to cause the problem.  The speed with which that happens is dependent 
on many factors, such as severity and business impact to the customer, how 
difficult it is to figure out what's wrong, getting the customer to test new 
PTF packages, etc.  For this particular piece of code, the relatively long 
resolution times appear to be mostly related to trying to figure out exactly 
what is wrong, due to the complexity of the code, and being able to get the 
affected customers to test.

> (Note that this is in no way to you personally, I
> expect kernel changes to be part of the kernel development team so any
> change in that part is outside the scope of any company that provides
> support. The problem is how to convince the upper level to view it the
> same way. As Alan did suggest, input for a MER.)

Once a fix has been created, then attempts are made to get that fix accepted in 
the mainline kernel source.  If that is not successful, then the Linux 
distribution provider is responsible for maintaining that fix outside the 
official tree, until the release is at end of life.  It's part of what you pay 
for when you buy support.

> Do you know by any chance what button to push for AIO in Oracle? I know
> that we did cover that when the server was installed 4 years ago but I
> didn't do that part myself. IIRC it was asynch_IO=yes or some parameter
> like that. Correct?

Sorry, no I don't.

> Is there any guess as to what performance penalty this disable would
> give us? The server does hit it's limits quite often so I expect this
> question to be the next one.

I would have no clue.  The decision will have to come down to what's better for 
your company, taking the performance hit, or living with the bug.


Mark Post



bash shell question

2009-01-06 Thread John McKown
I've got a small problem. I have a daemon which I cannot easily restart
because it is production and people are using it. The daemon is started
with something like:

daemon args >>daemon.log

The file daemon.log is getting very huge. The correct way to fix this is
to stop the daemon, mv or rm the daemon.log, then restart the
daemon. But I have a vague memory that it is sometimes possible to reset
a file to empty simply by doing a:

daemon.log

and that will, at times, work even if the daemon is not restarted. Is my
memory correct? Or is that some sort of special case which does not
apply when bash does a >> redirect of stdin?

We plan to fix this by not doing that!, but instead piping the stdout
from the daemon to a process called cronolog which works by
automatically changing the output file at midnight by changing the date
portion of the log name.

--
Q: What do theoretical physicists drink beer from?
A: Ein Stein.

Maranatha!
John McKown



Re: bash shell question

2009-01-06 Thread Larry Ploetz

On 1/6/09 1:50 PM, John McKown wrote:

> I've got a small problem. I have a daemon which I cannot easily restart
> because it is production and people are using it. The daemon is started
> with something like:
>
>   daemon args >>daemon.log
>
> The file daemon.log is getting very huge. The correct way to fix this is
> to stop the daemon, mv or rm the daemon.log, then restart the
> daemon. But I have a vague memory that it is sometimes possible to reset
> a file to empty simply by doing a:
>
>   daemon.log

You're probably thinking of "> daemon.log" to truncate a file.

> and that will, at times, work even if the daemon is not restarted. Is my
> memory correct? Or is that some sort of special case which does not
> apply when bash does a >> redirect of stdin?

No, append isn't a special case.

> We plan to fix this by not doing that!, but instead piping the stdout
> from the daemon to a process called cronolog which works by
> automatically changing the output file at midnight by changing the date
> portion of the log name.

I believe some people don't think this is kosher, but I like:

bzip2 -c daemon.log > daemon.log.$(date +%Y%m%d).bz2 && > daemon.log

but if daemon keeps daemon.log open, daemon.log will become a sparse
file the next time daemon writes to it. A sparse file doesn't take up
any disk space, in this case, except what's subsequently written to it
by daemon. You have to remember not to copy the first line of daemon.log
if you do this again (e.g., tail -n +2 daemon.log | ...), or if you want to
be more efficient:

tail -c $(( $(stat -c '%b*%B' daemon.log) )) daemon.log | tr -d '\000' | ...

(unless you're worried about embedded nulls in the real portion of
daemon.log, in which case replace tr -d '\000' with a slightly less
efficient sed -e '1s/^\x00*//', assuming GNU sed.)

If you do the above on a regular basis, you never have to bounce daemon,
and can still have, e.g. daily, log files. And bonus -- eventually you
can have a file that looks like it's larger than the file system it's on!

A good backup program will back up sparse files sparsely. I don't know
about TSM, but Legato Networker isn't a good backup program :-(
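
The truncation behaviour can be tried out without touching the real daemon. This is a minimal sketch, assuming bash on Linux. File descriptors 3 and 4 stand in for the daemon's open stdout, and the file names are made up for the demonstration.

```shell
#!/bin/bash
# Demonstrate truncating a log that another writer still holds open.
# File descriptors 3 and 4 stand in for the daemon's stdout.
workdir=$(mktemp -d)

# Case 1: the writer opened the log with >> (O_APPEND).
exec 3>>"$workdir/append.log"
echo "old data" >&3
: > "$workdir/append.log"          # truncate while fd 3 is still open
echo "new" >&3                     # lands at the new end of file (offset 0)
append_size=$(( $(wc -c < "$workdir/append.log") ))

# Case 2: the writer opened the log with > (no O_APPEND).
exec 4>"$workdir/plain.log"
echo "old data" >&4
: > "$workdir/plain.log"           # truncate while fd 4 is still open
echo "new" >&4                     # lands at the old offset (9), after a gap
plain_size=$(( $(wc -c < "$workdir/plain.log") ))

exec 3>&- 4>&-
echo "append mode: $append_size bytes, plain mode: $plain_size bytes"
rm -rf "$workdir"
```

With >> the next write lands at the new end of the file, so the log really does restart from empty (4 bytes here). With > the write lands at the writer's old offset and the gap in front reads back as NULs (13 bytes here). So whether the truncated log turns sparse depends on how the daemon's stdout was opened.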

- Larry



Re: Multiple Network Interface Routing Problem

2009-01-06 Thread gah

Sollenberger, Justin W Mr CIV US DISA CDB24 wrote:

> We are currently running a SLES 10 SP2 guest (hostx) with access to
> three networks (eth0, eth1, eth2).  The issue we are having is that the
> default route is not correct after an ipl.  If I delete the incorrect
> route (using: route del) and add the correct route (using: route add)
> everything works as it should.

On a network with three interfaces I would likely do dynamic
routing, and run routed or gated.  If you don't need them yet, soon
you will want redundant routes, and that means that hosts
should be able to choose the best route dynamically.

-- glen



Re: bash shell question

2009-01-06 Thread Kok Leong Chan
Perhaps logrotate can help you here.

From the manpage:

logrotate  is  designed to ease administration of systems that generate
large numbers of log files.  It allows automatic rotation, compression,
removal, and mailing of log files.  Each log file may be handled daily,
weekly, monthly, or when it grows too large.
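
For a daemon that cannot be restarted, the relevant logrotate directive is copytruncate, which copies the log and then truncates the original in place instead of moving it. This is a minimal sketch of a stanza, written out by a small helper function. The log path, the retention of 7 days and the function name are assumptions for illustration, not taken from the original setup.

```shell
#!/bin/bash
# Write a hypothetical logrotate stanza for a daemon that cannot be restarted.
# The log path below is an assumption; adjust it to the real location.
write_logrotate_conf() {
    cat > "$1" <<'EOF'
/var/log/daemon.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
EOF
}

# Example: write_logrotate_conf /etc/logrotate.d/daemon
```

Note that copytruncate leaves a short window between the copy and the truncate in which freshly written log lines can be lost.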



- Chan Kok Leong



