Re: Curious about crash.

2015-12-23 Thread Tom Huegel
Here is my little TRKCOMP exec. Sometimes I am hesitant to put something
like this out because there is going to be at least one critic that says it
is 'incorrect'. It is just something I did with copy/paste of other code...
Do with it as you please.

/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
/* Dependencies and pre-reqs:                                          */
/*                                                                     */
/* The pipe stage 'trackread' is used which is available in the        */
/* Princeton University pipelines package.                             */
/* http://vm.marist.edu/~pipeline/index.html#Runtime                   */
/*                                                                     */
/* PICKPIPE which can be downloaded from IBM's VM download page.       */
/* http://www.vm.ibm.com/download/packages/                            */
/*                                                                     */
/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

parse source . . exname .                  /* get the EXEC name for displays */
rrc = 0

say 'Enter disk disks -- DISK-1 DISK-2'    /* ask for disks to compare */
pull disk disk1                            /* get response */
if disk = '' then disk = 100               /* default to 100 */
if disk1 = '' then disk1 = 101             /* default to 101 */
erase 'dasd'||disk out a
erase 'dasd'||disk1 out a
erase combined out a
say 'START & END cylinders'                /* ask for cylinder range */
pull start end                             /* get the response */
if start = '' then start = 0               /* if not specified start on cyl 0 */
if end = '' then end = start               /* if not specified end on the start cyl */
'pipe query version | spec 1-3 1 | var pipelevel'
if pipelevel ¬= 'PIP' then,                /* do we have correct pipes? */
   pickpipe u '(quiet'                     /* no then load the UPLEVEL pipes */
if rc ¬= 0 then                            /* was pipe load successful? */
   call msg 16, 'An uplevel pipe package is not available-- 'exname ' cannot continue'

'pipe devinfo' disk '| var temp'           /* get device characteristics */
if rc ¬= 0 then                            /* device error */
   call msg 20, 'Device error trying to access ' disk '--' exname 'cannot continue'
parse var temp cuu clas dvtyp cutyp cyls trkpc trklen .   /* parse it out to vars */

if dvtyp ¬= '3390' then                    /* device type must be 3390 */
   call msg 24, 'Device' cuu 'is not a supported device type' dvtyp '¬= 3390'

if end = '*' | end = 'END' then end = cyls -1   /* calculate the end cylinder */

if end < start then                        /* verify cylinder range */
   call msg 26, 'END less than START on' disk ' --' exname 'cannot continue'

if end > cyls -1 then                      /* verify cylinder range */
   call msg 28, 'END ('end') > last cylinder ('cyls -1') on' disk ' --' exname 'cannot continue'

do start = start by 1 until start >= end   /* loop read specified cylinders */ /*'do-a'*/
   cc = start                              /* read cylinder */
   hh = 0                                  /* and track */
   do hh = hh by 1 until hh = trkpc -1     /* do all tracks on the cylinder */ /*'do-b'*/
      call readtrk                         /* read disk 1 */
      call readtrk1                        /* read disk 2 */
      do x = 1 by 1 for utrack.0           /* look for mismatches */ /*'do-c'*/
         if utrack.x <> u1track.x then do  /* do the records match? */ /*'do-d'*/
            say 'CC HH R =' cc hh x        /* no then display address */
            'pipe literal' cc hh x '| >> combined out a'    /* write out record address to combined file */
            'pipe var utrack.x | >>' dasd||disk 'out a'     /* write out disk 1 mismatch */
            'pipe var utrack.x | >> combined out a'         /* write out combined mismatch-1 */
            'pipe var u1track.x | >>' dasd||disk1 'out a'   /* write out disk 2 mismatch */
            'pipe var u1track.x | >> combined out a'        /* write out combined mismatch-2 */
         end                               /* end 'do-d' */
      end                                  /* end 'do-c' */
   end                                     /* end 'do-b' */
end                                        /* end 'do-a' */
signal finish

readtrk:

'PIPE (endchar ? )',                       /* start the pipe */
   '| trackread' disk cc hh,               /* read a track */
   '| trackdeblock ',                      /* deblock the track */
   '| drop 2',                             /* drop HA & R0 records */
   '| stem utrack.'                        /* save the records in a stem */
if rc ¬= 0 then do                         /* device error? */
   call msg 00, ' '
   call msg 00, 'Invalid pointer see previous messages --' exname
end

return
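
For readers without CMS Pipelines at hand, the core of the exec (walk the records of one track from each disk and report the CC HH R address of every mismatch) can be sketched in Python. This is only an illustration of the comparison loop, not a port of the exec: `compare_tracks` and the sample byte strings are hypothetical, the track contents are plain in-memory lists rather than the output of `trackread`, and a record missing on the second disk is treated as a mismatch, which is an assumption of this sketch.

```python
def compare_tracks(records_a, records_b, cc, hh):
    """Return (cc, hh, r) for every record of disk 1 that differs on disk 2.

    Like the loop in the exec, this walks the records found on the first
    disk. Records are numbered from 1 after the HA and R0 records have
    been dropped.
    """
    mismatches = []
    for r, rec_a in enumerate(records_a, start=1):
        # Assumption of this sketch: a record absent on disk 2 is a mismatch.
        rec_b = records_b[r - 1] if r <= len(records_b) else None
        if rec_a != rec_b:
            mismatches.append((cc, hh, r))
    return mismatches


# Hypothetical contents of cylinder 0, track 1 on the two disks.
track_disk1 = [b"REC-A", b"REC-B", b"REC-C"]
track_disk2 = [b"REC-A", b"REC-X"]

for cc, hh, r in compare_tracks(track_disk1, track_disk2, cc=0, hh=1):
    print("CC HH R =", cc, hh, r)
```

The tuples play the same role as the `CC HH R` lines the exec writes to `COMBINED OUT A`.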


Re: Curious about crash.

2015-12-22 Thread Mark Post
>>> On 12/22/2015 at 05:40 PM, Rick Troth  wrote: 
> On 12/22/2015 04:16 PM, Tom Huegel wrote:
>> My ignorance overflows..
>> When I look on my INTEL LINUX I can find grub.cfg in boot/grub2 but I
>> cannot find it in the zLINUX.

Most likely because it doesn't exist on your system(s).

> Probably me making ASSumptions.
> GRUB recently learned how to play in z land,

Not really.  GRUB still doesn't understand DASD or SCSI over FCP (as 
implemented by z Systems).  GRUB only knows how to read the file systems Linux 
uses to find the various kernels and initrds specified in /boot/grub2/grub.cfg.

> and the distros have picked
> that up.

To my knowledge, only SUSE Linux Enterprise Server 12 and 12 SP1 offer this.

-snip-
> In any case, the fact that /boot/grub exists suggests that we do want to
> go the GRUB route and not traditional ZIPL.

The zipl command is still needed on z Systems.  It's just not as 
front-and-center as it used to be in the past.  In particular, SUSE doesn't 
provide an /etc/zipl.conf any more, since there's no real need for customer 
modifications there.

The way things work is that:
1. A Linux kernel needs to be booted from DASD or SCSI over FCP.  That means 
that zipl has to write out the pointers to the kernel and initrd so that the 
boot loader can find them.
2. Once the Linux system gets up to a point where GRUB can be invoked, it then 
tries to find both grub.cfg and then the Linux kernels and initrds that it 
references.
3. After the user specifies a specific kernel/initrd combination, possibly 
adding parameters to what is in /boot/grub2/grub.cfg, or the timeout happens, 
GRUB then boots the "real" system.


Mark Post

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/


Re: Curious about crash.

2015-12-22 Thread Tom Huegel
My ignorance overflows..
When I look on my INTEL LINUX I can find grub.cfg in boot/grub2 but I
cannot find it in the zLINUX.

ls /boot
bootmap                             symvers-2.6.32-573.8.1.el6.s390x.gz
config-2.6.32-431.el6.s390x         System.map-2.6.32-431.el6.s390x
config-2.6.32-573.8.1.el6.s390x     System.map-2.6.32-573.8.1.el6.s390x
grub                                tape0
initramfs-2.6.32-431.el6.s390x.img  vmlinuz-2.6.32-431.el6.s390x
symvers-2.6.32-431.el6.s390x.gz     vmlinuz-2.6.32-573.8.1.el6.s390x
[root@tom129 /]# ls /boot/grub
splash.xpm.gz
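
One thing a listing like this can be checked for mechanically is whether every installed kernel has a matching initramfs. The sketch below is only an illustration: `kernels_missing_initramfs` is a hypothetical helper, the `vmlinuz-<version>` / `initramfs-<version>.img` naming is taken from the listing above, and a missing initramfs is not by itself a diagnosis of the boot failure.

```python
def kernels_missing_initramfs(names):
    """Return kernel versions that have a vmlinuz-* file but no initramfs-*.img."""
    kernels = {n[len("vmlinuz-"):] for n in names if n.startswith("vmlinuz-")}
    initramfs = {
        n[len("initramfs-"):-len(".img")]
        for n in names
        if n.startswith("initramfs-") and n.endswith(".img")
    }
    return sorted(kernels - initramfs)


# Kernel-related entries copied from the `ls /boot` output above.
boot_listing = [
    "bootmap",
    "config-2.6.32-431.el6.s390x",
    "config-2.6.32-573.8.1.el6.s390x",
    "initramfs-2.6.32-431.el6.s390x.img",
    "symvers-2.6.32-431.el6.s390x.gz",
    "symvers-2.6.32-573.8.1.el6.s390x.gz",
    "System.map-2.6.32-431.el6.s390x",
    "System.map-2.6.32-573.8.1.el6.s390x",
    "vmlinuz-2.6.32-431.el6.s390x",
    "vmlinuz-2.6.32-573.8.1.el6.s390x",
]

print(kernels_missing_initramfs(boot_listing))
```

On a running system the list could come from `os.listdir("/boot")` instead of a pasted listing.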


Two additional points: 1) I'll post my XCAT experience as best I can
remember, but probably not until I return from the holiday break in
January. 2) I wrote up the little exec (pipe trackread) I had mentioned to
compare raw dasd tracks but I couldn't determine the problem because there
were too many differences.. But if anyone would like the exec I can post it
here.. Someone may have a use for it.



On Tue, Dec 22, 2015 at 9:37 AM, Rick Troth  wrote:

> On 12/22/2015 11:55 AM, Tom Huegel wrote:
> > Why these LINUX machines are setup the way they are is a mystery to me.
> As
> > part of the lab exercise I used XCAT to provision the machines..
>
> So we're all learning from your experience.
> Kind of you to walk point with XCAT for the rest of us.   :-)
>
>
> > Following what Rick said I was able to boot the failing machine.. I don't
> > know what to do next, but it is booted.
>
> Fab!
>
> Now you can try to run that 'diff' which Russ asked for.
>
>
> mv grub.cfg grub.cfg-BAK
> grub2-mkconfig -o /boot/grub2/grub.cfg
> diff grub.cfg-BAK grub.cfg
>
>
> I don't know what to tell you about stamping a now-good bootstrap (with
> GRUB).
> In times past it would be 'mkinitrd' followed by 'zipl'. Can someone say
> if that's viable here?
>
> -- R; <><
>



Re: Curious about crash.

2015-12-22 Thread Rick Troth
On 12/22/2015 04:16 PM, Tom Huegel wrote:
> My ignorance overflows..
> When I look on my INTEL LINUX I can find grub.cfg in boot/grub2 but I
> cannot find it in the zLINUX.

Probably me making ASSumptions.
GRUB recently learned how to play in z land, and the distros have picked
that up.
There are some advantages.


[edited]
> # ls /boot
> bootmap                             symvers-2.6.32-573.8.1.el6.s390x.gz
> config-2.6.32-431.el6.s390x         System.map-2.6.32-431.el6.s390x
> config-2.6.32-573.8.1.el6.s390x     System.map-2.6.32-573.8.1.el6.s390x
> grub                                tape0
> initramfs-2.6.32-431.el6.s390x.img  vmlinuz-2.6.32-431.el6.s390x
> symvers-2.6.32-431.el6.s390x.gz     vmlinuz-2.6.32-573.8.1.el6.s390x
>
> # ls /boot/grub
> splash.xpm.gz

Look under /etc too. And compare the dead system to the good system.

It's all new to me too. I believe /boot/grub2/grub.cfg gets built using
the "helper scripts" found under /etc/grub.d.

In any case, the fact that /boot/grub exists suggests that we do want to
go the GRUB route and not traditional ZIPL.

Since the system has booted, can you [re]run a 'yum update'?


> Two additional points: 1) I'll post my XCAT experience as best I can
> remember, but probably not until I return from the holiday break in
> January. 2) I wrote up the little exec (pipe trackread) I had mentioned to
> compare raw dasd tracks but I couldn't determine the problem because there
> were too many differences.. But if anyone would like the exec I can post it
> here.. Someone may have a use for it.
>
>

I for one would be interested in the tool, but the task of
track-by-track analysis of a Linux filesystem would be mind numbing at
best.

Hang in there, Tom.
Oh, and, of course, ... Mey Christmaaas!

-- Sir Santa; <><



Re: Curious about crash.

2015-12-22 Thread Rick Troth
On 12/22/2015 11:55 AM, Tom Huegel wrote:
> Why these LINUX machines are setup the way they are is a mystery to me. As
> part of the lab exercise I used XCAT to provision the machines..

So we're all learning from your experience.
Kind of you to walk point with XCAT for the rest of us.   :-)


> Following what Rick said I was able to boot the failing machine.. I don't
> know what to do next, but it is booted.

Fab!

Now you can try to run that 'diff' which Russ asked for.


mv grub.cfg grub.cfg-BAK
grub2-mkconfig -o /boot/grub2/grub.cfg
diff grub.cfg-BAK grub.cfg
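
Where `diff` is not handy, or the comparison needs to be scripted, the same before/after check can be sketched with Python's standard `difflib`. `config_diff` and the two sample configurations are hypothetical and stand in for the real `grub.cfg-BAK` and `grub.cfg` contents.

```python
import difflib


def config_diff(old_text, new_text, old_name="grub.cfg-BAK", new_name="grub.cfg"):
    """Return unified-diff lines between two configuration texts."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


# Hypothetical file contents, for illustration only.
old_cfg = "set timeout=5\nset default=0\n"
new_cfg = "set timeout=5\nset default=1\n"

for line in config_diff(old_cfg, new_cfg):
    print(line)
```

An empty result means the regenerated file is identical to the backup.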


I don't know what to tell you about stamping a now-good bootstrap (with
GRUB).
In times past it would be 'mkinitrd' followed by 'zipl'. Can someone say
if that's viable here?

-- R; <><



Re: Curious about crash.

2015-12-22 Thread Tom Huegel
Why these LINUX machines are setup the way they are is a mystery to me. As
part of the lab exercise I used XCAT to provision the machines..

Following what Rick said I was able to boot the failing machine.. I don't
know what to do next, but it is booted.

On Mon, Dec 21, 2015 at 6:45 PM, Rick Troth  wrote:

> On 12/16/2015 09:45 AM, Tom Huegel wrote:
> > VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
> > unknown-block(0,0)
> > Please append a correct "root=" boot option; here are the available
> > partitions:
>
> Nothin.
> No partitions. (More significantly, no disks at all.)
>
> I didn't see it mentioned, but if the two systems use the same device
> addresses then you might be able to boot from the good system.
> Guessing that the INITRD is shot. So ...
>
> #cp link good 100 500 rr
> #cp ipl 500 clear
>
>  ... and the INITRD there will have the needed modules and startup magic
> to "see" your 100 disk. You do not need R/W access to the link. It
> should not try to mount that disk (or a partition thereof) as root.
> You're only using the bootstrap, the IPL text.
>
> -- R; <><
>



Re: Curious about crash.

2015-12-21 Thread Alan Ackerman
I'm a little surprised that you are using -part2. We format all DASD with a 
single partition. If we want to break something up, we break it up into 
separate minidisks, each with a single partition. That allows us to tune using 
monitor data. 

Are you really using partition 2? If so, I'd be curious to know why.


Alan Ackerman
alan.ackerma...@gmail.com



> On Dec 16, 2015, at 6:45 AM, Tom Huegel  wrote:
> 
> VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
> unknown-block(0,0)



Re: Curious about crash.

2015-12-21 Thread Rick Troth
On 12/16/2015 09:45 AM, Tom Huegel wrote:
> VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
> unknown-block(0,0)
> Please append a correct "root=" boot option; here are the available
> partitions:

Nothin.
No partitions. (More significantly, no disks at all.)

I didn't see it mentioned, but if the two systems use the same device
addresses then you might be able to boot from the good system.
Guessing that the INITRD is shot. So ...

#cp link good 100 500 rr
#cp ipl 500 clear

 ... and the INITRD there will have the needed modules and startup magic
to "see" your 100 disk. You do not need R/W access to the link. It
should not try to mount that disk (or a partition thereof) as root.
You're only using the bootstrap, the IPL text.

-- R; <><



Re: Curious about crash.

2015-12-17 Thread Tom Huegel
Like I mentioned earlier, this is not a production system that is down. It is
just a lab system that can be re-installed easily. It would be nice to have
some idea of what killed it in the first place.
Using DITTO I was visually trying to compare the DASD from the working and
nonworking systems, but that was inconclusive. If I get a chance later I'll
write a quick PIPE to do trackreads and compare the two disks, maybe
that'll show something.
Thanks for all of the tips and comments.
Tom


On Thu, Dec 17, 2015 at 6:14 AM, Jonathan Quay 
wrote:

> Isn't it complaining about not finding the root filesystem, not about not
> finding the /boot filesystem?  I would take a look at that partition in
> question that it can't find.
>



Re: Curious about crash.

2015-12-17 Thread Jonathan Quay
Isn't it complaining about not finding the root filesystem, not about not
finding the /boot filesystem?  I would take a look at that partition in
question that it can't find.



Re: Curious about crash.

2015-12-16 Thread R P Herrold
On Wed, 16 Dec 2015, John McKown wrote:

> Or, as one co-worker tells it, they have their CEC up against the back
> wall. Unbeknownst to them, on the other side of the wall is a mega-Gauss
> electromagnet. Magnet on, CEC fails.

one of our site design rules is to never wrap the Token Ring 
network cables around the cyclotron  ;)

-- R



Re: Curious about crash.

2015-12-16 Thread Tom Huegel
Well Mark, I tried all 3 IPLs; they all failed with the same message.

On Wed, Dec 16, 2015 at 12:43 PM, Mark Post  wrote:

> >>> On 12/16/2015 at 10:44 AM, Tom Huegel  wrote:
> -snip-
> > CP I 100
>
> You're IPLing from device number 100, so that means the disk is available
> to the guest.
>
> -snip-
> > Kernel command line: root=/dev/disk/by-path/ccw-0.0.0100-part2 rd_NO_LUKS
> > LANG=en_US.UTF-8  KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb
> > -sun16  rd_DASD=0.0.0100 rd_NO_LVM rd_NO_DM
> > BOOT_IMAGE=0
>
> I'm not sure, but it's possible the rd_dasd parameter is case sensitive.
> On our systems that use dracut, the parameter is actually rd.dasd, not
> rd_dasd.  The same with the other parms: rd.something, not rd_something.
>
> -snip-
> > VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
> > unknown-block(0,0)
>
> And up to this point, I see nothing from the DASD driver talking about
> activating device number 100.  Which is what leads me to wonder about the
> kernel parms you have.
>
> Something you could try to work around the problem would be this:
> ipl 100 parm rd.dasd=0.0.0100
>
> and if that doesn't work, then
> ipl 100 parm rd_dasd=0.0.0100
>
> and if that doesn't work, then just for grins
> ipl 100 parm rd.dasd=0.0.0100 rd_dasd=0.0.0100
>
>
> Mark Post
>



Re: Curious about crash.

2015-12-16 Thread Mark Post
>>> On 12/16/2015 at 10:44 AM, Tom Huegel  wrote: 
-snip-
> CP I 100

You're IPLing from device number 100, so that means the disk is available to 
the guest.

-snip-
> Kernel command line: root=/dev/disk/by-path/ccw-0.0.0100-part2 rd_NO_LUKS
> LANG=en_US.UTF-8  KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb
> -sun16  rd_DASD=0.0.0100 rd_NO_LVM rd_NO_DM
> BOOT_IMAGE=0

I'm not sure, but it's possible the rd_dasd parameter is case sensitive.  On 
our systems that use dracut, the parameter is actually rd.dasd, not rd_dasd.  
The same with the other parms: rd.something, not rd_something.

-snip-
> VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
> unknown-block(0,0)

And up to this point, I see nothing from the DASD driver talking about 
activating device number 100.  Which is what leads me to wonder about the 
kernel parms you have.

Something you could try to work around the problem would be this:
ipl 100 parm rd.dasd=0.0.0100

and if that doesn't work, then
ipl 100 parm rd_dasd=0.0.0100

and if that doesn't work, then just for grins
ipl 100 parm rd.dasd=0.0.0100 rd_dasd=0.0.0100
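
To see at a glance which spelling a given kernel command line uses, the parameters can be sorted by prefix. This is a minimal sketch: `split_rd_params` is a hypothetical helper, the command line is the one quoted above, and which spelling a particular dracut level actually honors is not something this code can decide.

```python
def split_rd_params(cmdline):
    """Split dracut-style parameter names into underscore and dotted spellings."""
    underscore, dotted = [], []
    for token in cmdline.split():
        name = token.split("=", 1)[0]  # drop any "=value" part
        if name.startswith("rd_"):
            underscore.append(name)
        elif name.startswith("rd."):
            dotted.append(name)
    return underscore, dotted


# Kernel command line from the failing console log quoted above.
cmdline = (
    "root=/dev/disk/by-path/ccw-0.0.0100-part2 rd_NO_LUKS "
    "LANG=en_US.UTF-8 KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb-sun16 "
    "rd_DASD=0.0.0100 rd_NO_LVM rd_NO_DM BOOT_IMAGE=0"
)

underscore, dotted = split_rd_params(cmdline)
print("rd_ style:", underscore)
print("rd. style:", dotted)
```

For this command line every `rd` parameter is in the underscore form and none is dotted.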


Mark Post



Re: Curious about crash.

2015-12-16 Thread John McKown
On Wed, Dec 16, 2015 at 11:58 AM, Paul Dembry  wrote:

> [AA]
> A gamma ray collided with a 1 and knocked it over, turning it into a 0.
> That's the only explanation I can think of.  I used to think it was
> sunspots or coronal mass ejections, but I've moved on.
>
> [PD]
> Never ignore that possibility; it reminds me of a problem that Sun
> Microsystems had in the early 2000s with the E1 Starfire systems. There
> was some problem with the cache memory that would randomly alter a bit
> (http://www.eweek.com/c/a/IT-Infrastructure/AmericaWest-Flight-Plan/2/). I
> spent a week or so poring over a customer's core dump and was able to show
> that the value stored in memory had in fact been altered by one bit within
> about 10 instructions, 0x1001 became 0x0001.
> Paul
>
>
Or, as one co-worker tells it, they have their CEC up against the back
wall. Unbeknownst to them, on the other side of the wall is a mega-Gauss
electromagnet. Magnet on, CEC fails.


-- 

Schrodinger's backup: The condition of any backup is unknown until a
restore is attempted.

Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown



Re: Curious about crash.

2015-12-16 Thread Tom Huegel
Right the kernel appears to be different.
From the working system:
Linux version 2.6.32-431.66.1.el6.s390x
(mockbu...@s390-002.build.bos.redhat.com) (gcc version 4.4.7 20120313
(Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Oct 2 13:20:16 EDT 2015

setup: Linux is running as a z/VM guest operating system in 64-bit mode



On Wed, Dec 16, 2015 at 7:46 AM, R P Herrold  wrote:

> On Wed, 16 Dec 2015, Alan Altmark wrote:
>
> > On Wednesday, 12/16/2015 at 02:48 GMT, Tom Huegel 
> > wrote:
>
> > > Now the registered machine fails to boot, the other one works fine.
> >
> > A gamma ray collided with a 1 and knocked it over, turning it into a 0.
> > That's the only explanation I can think of.  I used to think it was
> > sunspots or coronal mass ejections, but I've moved on.
>
> Assumedly the non-registered machine is still running the
> older kernel (please check which kernels are installed thus:
> rpm -qa kernel\*
> )
>
> As such mkinitrd (which seems to be failing, perhaps due to an
> improperly specified path to an initrd) was in play (per the
> earlier message quoted)
>
> I have found 'grub2' and grubby, to be quite sensitive to the
> configuration file it is handed.  There are 'order issues' on
> what needs to appear before and after other items, not well
> documented.  From my notes:
>
> The following will generate the  correct grub.cfg file
> ( /boot/grub2/ is RHEL / ClefOS 7 and later ... )
>
> cd /boot/grub2/
> mv grub.cfg grub.cfg-BAK
> grub2-mkconfig -o /boot/grub2/grub.cfg
>
> but sadly the 'fix' does not persist when a new kernel is
> installed. not sure why as I have been off solving other
> issues
>
> -- Russ herrold
>



Re: Curious about crash.

2015-12-16 Thread R P Herrold
On Wed, 16 Dec 2015, Tom Huegel wrote:

> Right the kernel appears to be different.

* nod *

Next diagnostic step would be a reinstall of a registered
system, and then after a yum update, but before the reboot,
please try the code below, for the phase one to chainload
into, to see if it can 'solve' the path

> > mv grub.cfg grub.cfg-BAK
> > grub2-mkconfig -o /boot/grub2/grub.cfg

if that works I would be interested in a comparison of the BAK
file and its successor

-- Russ herrold



Re: Curious about crash.

2015-12-16 Thread Paul Dembry
[AA]
A gamma ray collided with a 1 and knocked it over, turning it into a 0.
That's the only explanation I can think of.  I used to think it was
sunspots or coronal mass ejections, but I've moved on.

[PD]
Never ignore that possibility; it reminds me of a problem that Sun
Microsystems had in the early 2000s with the E1 Starfire systems. There
was some problem with the cache memory that would randomly alter a bit
(http://www.eweek.com/c/a/IT-Infrastructure/AmericaWest-Flight-Plan/2/). I
spent a week or so poring over a customer's core dump and was able to show
that the value stored in memory had in fact been altered by one bit within
about 10 instructions, 0x1001 became 0x0001.
Paul
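
The one-bit claim is easy to check: XOR the two values and list the bit positions that are set in the result. A small sketch follows; `flipped_bits` is a hypothetical helper name, and bit 0 is taken to be the least significant bit.

```python
def flipped_bits(before, after):
    """Return the positions (bit 0 = least significant) where two values differ."""
    diff = before ^ after  # a 1 in every position where the values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


print(flipped_bits(0x1001, 0x0001))
```

For 0x1001 and 0x0001 the XOR is 0x1000, so exactly one bit (bit 12) differs.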



Re: Curious about crash.

2015-12-16 Thread Gregg Levine
Hello!
I missed that clue! Well that's a first.

It seems reading a machine log for the big guy is a special talent.
-
Gregg C Levine gregg.drw...@gmail.com
"This signature fought the Time Wars, time and again."


On Wed, Dec 16, 2015 at 11:51 AM, Tom Huegel  wrote:
> Right the kernel appears to be different.
> From the working system:
> Linux version 2.6.32-431.66.1.el6.s390x
> (mockbu...@s390-002.build.bos.redhat.com) (gcc version 4.4.7 20120313
> (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Oct 2 13:20:16 EDT 2015
>
> setup: Linux is running as a z/VM guest operating system in 64-bit mode
>
>
>
> On Wed, Dec 16, 2015 at 7:46 AM, R P Herrold  wrote:
>
>> On Wed, 16 Dec 2015, Alan Altmark wrote:
>>
>> > On Wednesday, 12/16/2015 at 02:48 GMT, Tom Huegel 
>> > wrote:
>>
>> > > Now the registered machine fails to boot, the other one works fine.
>> >
>> > A gamma ray collided with a 1 and knocked it over, turning it into a 0.
>> > That's the only explanation I can think of.  I used to think it was
>> > sunspots or coronal mass ejections, but I've moved on.
>>
>> Assumedly the non-registered machine is still running the
>> older kernel (please check which kernels are installed thus:
>> rpm -qa kernel\*
>> )
>>
>> As such mkinitrd (which seems to be failing, perhaps due to an
>> improperly specified path to an initrd) was in play (per the
>> earlier message quoted)
>>
>> I have found 'grub2' and grubby, to be quite sensitive to the
>> configuration file it is handed.  There are 'order issues' on
>> what needs to appear before and after other items, not well
>> documented.  From my notes:
>>
>> The following will generate the  correct grub.cfg file
>> ( /boot/grub2/ is RHEL / ClefOS 7 and later ... )
>>
>> cd /boot/grub2/
>> mv grub.cfg grub.cfg-BAK
>> grub2-mkconfig -o /boot/grub2/grub.cfg
>>
>> but sadly the 'fix' does not persist when a new kernel is
>> installed. not sure why as I have been off solving other
>> issues
>>
>> -- Russ herrold
>>
>



Re: Curious about crash.

2015-12-16 Thread Mark Post
>>> On 12/16/2015 at 10:35 PM, Thomas Anderson  wrote: 
> Not to state the obvious, but the "kernel command line" is telling the
> kernel loader that it will find a root file system and a loadable kernel
> image at
> "/dev/disk/by-path/ccw-0.0.0100-part2"
> However when it tries to read it, either it can't access the device or it
> isn't finding the "IPLTEXT" (for want of a better term).
>
>> Kernel command line: root=/dev/disk/by-path/ccw-0.0.0100-part2 rd_NO_LUKS
>> LANG=en_US.UTF-8  KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb
>> -sun16  rd_DASD=0.0.0100 rd_NO_LVM rd_NO_DM
>> BOOT_IMAGE=0
>
> Blah, blah, blah...
>
>>
>> VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
>> unknown-block(0,0)
> blah, blah, blah,
>
> Kind of like you told CP to "ipl " and there was nothing "ipl-able"
> at

If there wasn't a kernel and initrd on the 100 disk at the expected location, 
he wouldn't have gotten nearly as far as he did.


Mark Post



Re: Curious about crash.

2015-12-16 Thread Mark Post
>>> On 12/16/2015 at 03:58 PM, Tom Huegel  wrote: 
> Well Mark, I tried all 3 ipl's they all failed with the same message.

One more thing to try is perhaps root=/dev/dasda1.  If you only have one disk, 
perhaps you'll get lucky with the device name matching.

The other thing to try is turning on debug output:
debug rd.debug (or rd_debug, depending)

What does your /etc/zipl.conf file look like on the working system?


Mark Post



Re: Curious about crash.

2015-12-16 Thread Thomas Anderson
Not to state the obvious, but the “kernel command line”  is telling the kernel 
loader that it will find a root file system and a loadable kernel image at
"/dev/disk/by-path/ccw-0.0.0100-part2” 
However when it tries to read it, either it can’t access the device or it isn’t 
finding the “IPLTEXT” (for want of a better term).

> Kernel command line: root=/dev/disk/by-path/ccw-0.0.0100-part2 rd_NO_LUKS
> LANG=en_US.UTF-8  KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb
> -sun16  rd_DASD=0.0.0100 rd_NO_LVM rd_NO_DM
> BOOT_IMAGE=0

Blah, blah, blah…

> 
> VFS: Cannot open root device "disk/by-path/ccw-0.0.0100-part2" or
> unknown-block(0,0)
blah, blah, blah, 

Kind of like you told CP to “ipl ” and there was nothing “ipl-able” at 


WAGs:
- Transitory disk error or pregnant pause while trying to read the device?  (Try
  re-booting a couple more times, keeping an eye out for DASD errors?  “Thank
  you for calling Microsoft support…”)
- Somebody sharing the device that is supposed to hold your kernel?  (Oh, were YOU
  using that disk?  I saved my grandma’s fruitcake recipe there.)
- Something “interesting” happened during the “yum update” and the kernel image
  wasn’t properly updated.
    “Just check the yum logs.”
    “If the system would boot so I could see the logs, I wouldn’t need to
    check the yum logs, now would I?”
    “Thank you for calling Linux support, is there anything else I can help
    you with today?”


Actually you COULD allocate the disk that is supposed to contain the root file 
system on the failing system to the instance that is working, mount it and poke 
around,
but it doesn’t sound like this system is worth that much work and your time 
would be better spent on last minute Christmas shopping. :O


Tom Anderson
Ex ignorantia ad sapientiam
e tenebris ad lucem!


Re: Curious about crash.

2015-12-16 Thread Gregg Levine
Hello!
Actually Alan you're thinking of why hundreds of satellites ignored yesterday.

Actually it looks like that penguin ignored the root setting for
things. First time I've seen one that revolting.

Is it possible to see the log entries before that first line?
-
Gregg C Levine gregg.drw...@gmail.com
"This signature fought the Time Wars, time and again."


On Wed, Dec 16, 2015 at 10:19 AM, Alan Altmark  wrote:
> On Wednesday, 12/16/2015 at 02:48 GMT, Tom Huegel 
> wrote:
>> Now the registered machine fails to boot, the other one works fine.
>
> A gamma ray collided with a 1 and knocked it over, turning it into a 0.
> That's the only explanation I can think of.  I used to think it was
> sunspots or coronal mass ejections, but I've moved on.
>
> Alan Altmark
>
> Senior Managing z/VM and Linux Consultant
> Lab Services System z Delivery Practice
> IBM Systems & Technology Group
> ibm.com/systems/services/labservices
> office: 607.429.3323
> mobile; 607.321.7556
> alan_altm...@us.ibm.com
> IBM Endicott
>



Re: Curious about crash.

2015-12-16 Thread Alan Altmark
On Wednesday, 12/16/2015 at 02:48 GMT, Tom Huegel  
wrote:
> Now the registered machine fails to boot, the other one works fine.

A gamma ray collided with a 1 and knocked it over, turning it into a 0. 
That's the only explanation I can think of.  I used to think it was 
sunspots or coronal mass ejections, but I've moved on.

Alan Altmark

Senior Managing z/VM and Linux Consultant
Lab Services System z Delivery Practice
IBM Systems & Technology Group
ibm.com/systems/services/labservices
office: 607.429.3323
mobile; 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: Curious about crash.

2015-12-16 Thread Tom Huegel
Here is the whole console log.
In the mean time I am attributing it to global warming.



CP I 100

Booting default (2.6.32-573.8.1.el6.s390x)
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-573.8.1.el6.s390x (mockbu...@s390-003.build.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Fri Sep 25 19:21:49 EDT 2015
setup: Linux is running as a z/VM guest operating system in 64-bit mode
crashkernel=auto resulted in zero bytes of reserved memory.
Zone PFN ranges:
  DMA      0x -> 0x0008
  Normal   0x0008 -> 0x0008
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x -> 0x0008
PERCPU: Embedded 12 pages/cpu @02b0 s19456 r8192 d21504 u65536
pcpu-alloc: s19456 r8192 d21504 u65536 alloc=16*4096
pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 [0] 06 [0] 07
pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13 [0] 14 [0] 15
pcpu-alloc: [0] 16 [0] 17 [0] 18 [0] 19 [0] 20 [0] 21 [0] 22 [0] 23
pcpu-alloc: [0] 24 [0] 25 [0] 26 [0] 27 [0] 28 [0] 29 [0] 30 [0] 31
pcpu-alloc: [0] 32 [0] 33 [0] 34 [0] 35 [0] 36 [0] 37 [0] 38 [0] 39
pcpu-alloc: [0] 40 [0] 41 [0] 42 [0] 43 [0] 44 [0] 45 [0] 46 [0] 47
pcpu-alloc: [0] 48 [0] 49 [0] 50 [0] 51 [0] 52 [0] 53 [0] 54 [0] 55
pcpu-alloc: [0] 56 [0] 57 [0] 58 [0] 59 [0] 60 [0] 61 [0] 62 [0] 63

HOLDING   SCZVMLX2
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 517120
Kernel command line: root=/dev/disk/by-path/ccw-0.0.0100-part2 rd_NO_LUKS LANG=en_US.UTF-8  KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb-sun16  rd_DASD=0.0.0100 rd_NO_LVM rd_NO_DM BOOT_IMAGE=0
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 2048224k/2097152k available (5118k kernel code, 0k reserved, 3531k data, 260k init)
Write protected kernel read-only data: 0x10 - 0x7f
Hierarchical RCU implementation.
console [ttyS0] enabled
allocated 8388608 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
pid_max: default: 65536 minimum: 512
Security Framework initialized
SELinux: Initializing.
Mount-cache hash table entries: 256
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Initializing cgroup subsys blkio
Initializing cgroup subsys perf_event
Initializing cgroup subsys net_prio

HOLDING   SCZVMLX2
cpu: 1 configured CPUs, 0 standby CPUs
cpu: Processor 0 started, address 0, identification 1B6ED6
Brought up 1 CPUs
devtmpfs: initialized
regulator: core version 0.5
NET: Registered protocol family 16
bio: create slab  at 0
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
Switching to clocksource tod
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 65536 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 65536 bind 65536)
TCP reno registered
NET: Registered protocol family 1
futex hash table entries: 4096 (order: 8, 1048576 bytes)
audit: initializing netlink socket (disabled)
type=2000 audit(1450280489.672:1): initialized
HugeTLB registered 1 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)

HOLDING   SCZVMLX2
msgmni has been set to 4001
ksign: Installing public key data
Loading keyring
- Added public key E5A322A2BCB59BD8
- User ID: Red Hat, Inc. (Kernel Module GPG key)
- Added public key D4A26C9CCD09BEDA
- User ID: Red Hat Enterprise Linux Driver Update Program <secal...@redhat.com>
- Added public key CB445E0DA3EA1F65
- User ID: Red Hat Enterprise Linux Driver Update Program (key 4) <secal...@redhat.com>
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
brd: module loaded
loop: module loaded
cio: Channel measurement facility initialized using format extended (mode autodetected)
GRE over IPv4 demultiplexor driver
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 17
registered taskstats version 1
Initalizing network drop monitor service
md: Waiting for all devices to be available before autodetect
md: If you don't use raid, use raid=noautodetect

HOLDING   SCZVMLX2
md: Autodetecting RAID arrays.
md: Scanned 0 and added 0 devices.
md: autorun ...
md: ... autorun DONE.

VFS: Cannot open root device 

Re: Curious about crash.

2015-12-16 Thread Stuart, David
I'm going to go with my son's favorite response to tech support questions:

"Because it hates you!"


Dave 


Dave Stuart
Principal Information Systems Support Analyst
Information Technology Services
County of Ventura, CA
805-662-6731
david.stu...@ventura.org




Re: Curious about crash.

2015-12-16 Thread R P Herrold
On Wed, 16 Dec 2015, Alan Altmark wrote:

> On Wednesday, 12/16/2015 at 02:48 GMT, Tom Huegel 
> wrote:

> > Now the registered machine fails to boot, the other one works fine.
>
> A gamma ray collided with a 1 and knocked it over, turning it into a 0.
> That's the only explanation I can think of.  I used to think it was
> sunspots or coronal mass ejections, but I've moved on.

Assumedly the non-registered machine is still running the
older kernel (please check which kernels are installed thus:
rpm -qa kernel\*
)

As such, mkinitrd was in play (per the earlier message quoted),
and it seems to be failing, perhaps due to an improperly
specified path to an initrd.

I have found 'grub2' and grubby to be quite sensitive to the
configuration file they are handed.  There are 'order issues' on
what needs to appear before and after other items, not well
documented.  From my notes:

The following will generate the correct grub.cfg file
( /boot/grub2/ is RHEL / ClefOS 7 and later ... )

cd /boot/grub2/
mv grub.cfg grub.cfg-BAK
grub2-mkconfig -o /boot/grub2/grub.cfg

but sadly the 'fix' does not persist when a new kernel is
installed.  Not sure why, as I have been off solving other
issues.

-- Russ herrold
