Re: How to unformat a dasd drive?
Mark Post writes:
> >>> On 5/19/2016 at 07:33 AM, Malcolm Beattie wrote:
> > my $dev = sprintf("/dev/disk/by-path/ccw-0.0.%04s", $devno);
>
> You should not assume that the first two pieces of the busid will always
> be 0.0. Even today, it can be 0.1 or 0.2, etc., depending on what CSS
> the device is in.

OK, best make it

    my $dev = "/dev/disk/by-path/ccw-$devno";

and have the caller ensure the argument is passed in canonical form.

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group

--
For LINUX-390 subscribe / signoff / archive access instructions, send email
to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit http://wiki.linuxvm.org/
Re: How to unformat a dasd drive?
Mark Post writes:
> >>> On 5/18/2016 at 04:27 PM, Malcolm Beattie wrote:
> -snip-
> > So an ad hoc and quick and dirty way to write a couple of tracks
> > of a lone R0 that'll be treated by Linux as "n/f" is (starting
> > with DASD device 777 offline):
>
> Cool, it does indeed work. So I wrapped a script around that and uploaded
> it to http://wiki.linuxvm.org/wiki/Projects_and_Software/Scripts for
> anyone that's interested in it.
>
> Bug reports are welcome, but don't expect rapid turnaround. :)

Having the following "llunformat" as the low-level script that does the
unformatting (invoke as "llunformat devno") may be preferable in terms of
keeping people's lunch down compared to the Perl one-liner:

#!/usr/bin/perl
# Copyright 2016 IBM United Kingdom Ltd
# Author: Malcolm Beattie, IBM
# Last update: 19 May 2016
# Sample code - NO WARRANTY
#
use strict;

sub ckd {
    my ($c, $h, $r, $key, $data) = @_;
    my $count = pack("nnCCn", $c, $h, $r, length($key), length($data));
    return $count . $key . $data;
}

sub track {
    my ($data) = @_;
    return $data . ("\xff" x 8) . ("\0" x (65536 - 8 - length($data)));
}

my $devno = shift @ARGV or die "Usage: llunformat devno\n";
my $dev = sprintf("/dev/disk/by-path/ccw-0.0.%04s", $devno);
my @cmd = (qw(dd bs=65536 oflag=direct), "of=$dev");
open(DD, "|-", @cmd) or die "dd: $!\n";
for (my $h = 0; $h < 2; $h++) {
    print DD track(ckd(0, $h, 0, "", "\0" x 8));
}

--Malcolm
Re: How to unformat a dasd drive?
Mark Post writes:
> >>> On 5/17/2016 at 11:51 AM, Scott Rohling wrote:
> > I'm wondering if something like this would work?:
> >
> > You can add 'count=xx' to only write so many blocks... not sure how many
> > it takes to wipe formatting info - nothing to play with at the moment..
> >
> > dd if=/dev/zero of=/dev/dasdX iflag=nocache oflag=direct bs=4096
>
> It doesn't seem to work. You're still going to be limited by the
> dasd_eckd_mod driver to writing in the formatted space and not the raw
> device itself. I even tried turning on the raw_track_access and that
> didn't help either. Trying to use both that and oflag=direct caused an
> I/O error and the dd aborted.

You do indeed need to enable raw_track_access (it must be done while the
device is offline to Linux) but you also need to write valid track images
and use 64KB O_DIRECT I/Os. Linux interprets track images as starting from
the R0 (no key, 8 bytes of \0 data), not starting from the 5-byte HA as
done in AWS format, and having 8 0xff bytes terminate the track data
(followed by padding to 64KB).

So an ad hoc and quick and dirty way to write a couple of tracks of a lone
R0 that'll be treated by Linux as "n/f" is (starting with DASD device 777
offline):

# chccwdev -a raw_track_access=1 -e 777
# perl -e 'for ($h=0;$h<2;$h++){printf "\0\0\0%c\0\0\0\x8%s",$h,(("\0"x8).("\xff"x8).("\0"x65512))}' \
    | dd bs=65536 count=2 oflag=direct of=/dev/disk/by-path/ccw-0.0.0777

Then take the device offline and online (in normal mode) again with

# chccwdev -d 777
# chccwdev -a raw_track_access=0 -e 777

Works for me. Note that you need the explicit resetting of raw_track_access
back to zero since the attribute is "sticky" across varying offline/online.

--Malcolm
Re: How to find a memory leak?
Alan Altmark writes:
> On Thursday, 07/09/2015 at 04:25 EDT, Mark Post wrote:
> > > The next question is - can this ever be done by a non-root user? I
> > > tried
> >
> > No.
> > # ls -l /proc/sys/vm/drop_caches
> > -rw-r--r-- 1 root root 0 Jul 9 16:23 /proc/sys/vm/drop_caches
>
> Thank heavens! That's all we need -- unprivileged users messing with the
> cache

Even unprivileged programs have limited and controlled access to
influencing the caching behaviour for files that they deal with, whether
via read/write or mapped into memory. There are the POSIXy interfaces:

  madvise(..., MADV_RANDOM) and fadvise(..., POSIX_FADV_RANDOM)
  madvise(..., MADV_SEQUENTIAL) and fadvise(..., POSIX_FADV_SEQUENTIAL)

Similarly WILLNEED, DONTNEED and a few extras like:

  fsync(...)
  fdatasync(...)

and one or two where the APIs or functionality aren't as standardised or
common, like readahead(...).

Linux has "per-open-file" tracking of readahead window information and
per-page marks in the page cache itself, and does a good job of deducing
the right amount of sync/async readahead based on access pattern and
memory pressure in most common cases. However, it's nice to be able to
give it a hint or two (e.g. "I'm going to stream through this file once
and then won't need it again") while continuing to use the usual simple
file APIs, without having to mess around reinventing your own buffering or
fiddle around with separate threads, async I/Os or separate access methods
(or equivalent) in O/Ses where caching is all-or-nothing or
privileged-control-only.

--Malcolm
Re: Single User mode Linux Guest
Michael MacIsaac writes:
> What I see is a prompt for the root password:
>
> INIT: Going single user
> INIT: Sending processes the TERM signal
> *Give root password* for maintenance
> (or type Control-D to continue):

If it's under z/VM, you can do

  ipl ... parm init=/bin/sh

and it'll just start up a shell on the console without even starting up the
real init. Then you can

  passwd root

followed by a

  #cp signal shutdown

If you're not under z/VM then you only get to specify the loadparm from the
HMC Load panel, so put "prompt" (no quotes) in the loadparm, do the Load
and then, in the Operating System Messages/SCLP applet, respond to the menu
prompt with

  1 init=/bin/sh

(where the number 1 is the menu entry to use rather than a runlevel
number).

--Malcolm
Re: SCSI device issue on RHEL 6.3
Grzegorz Powiedziuk writes:
> [0:0:0:1074675744] disk IBM 2107900 .278 /dev/sda
> [0:0:0:1075527712] disk IBM 2107900 .278 /dev/sdb
>
> I've seen that before. I don't like the LUN number - 1074675744
> Make sure that in your ZFCP you have it right.
> I've seen bogus numbers like this when I typed too many zeros in the LUN
> number field.

Those are normal LUN numbers for DS8k and similar scsimask-style LUNs.
Unlike V7000, XIV, SVC etc., which start at LUN ids 0, 1, 2, 3 for each
host, a DS8k assigns a unique 4-hex-digit volume ID 0xWXYZ for every
volume it creates (whether ECKD or SCSI) and, for scsimask-mode host
connections (such as the Linux one above) to a DS8k volume group
containing volume ID 0xWXYZ, it makes it visible as LUN id 0x40WX40YZ.

In the above list, Linux maps it to the old-style "host:bus:target:lun" by
using for the LUN those upper 8 hex digits (the first 32 bits of the
64-bit LUN). 1074675744 is 0x400e4020 and 1075527712 is 0x401b4020, so
those are valid LUN numbers for what the DS8k would refer to as volume ids
0x0e20 and 0x1b20 (in extent pools 0 and 1 respectively, by looking at the
first hex digit of each).

--Malcolm
Re: Enabling SSL for access to a z/OS guest
Cameron Seay writes:
> This may be a question for the z/OS board, but since all of our z/OS lives
> inside z/VM guests I will ask it here first.

The IBMVM mailing list would have been better still, but there's a
good-sized overlap between the IBMVM and LINUX-390 lists.

> Our IT staff wants us to use SSL so that outside users can access the z/VM
> LPAR without having to get vpn accounts. Currently they do. We access
> z/OS via logging into a VM LPAR and then dialing into the z/OS guest. The
> 3270 client we use has SSL capability. What needs to be enabled/turned on
> on the VM side to allow a connection via SSL? The IT folks are going to
> open a port for this purpose.

Follow the "Configuring the SSL Server" chapter of the "z/VM TCP/IP
Planning and Customization" manual to get the base SSL and TLS support set
up with your certificate and to get the SSL service virtual machine(s) set
up. There were significant changes introduced in z/VM 6.2 for SSL (e.g.
multiple server pools), so the exact method depends on what level of z/VM
you're using, and if you're still on 5.4 then there'll be a bit of
tweaking you'll need to remember to do to the configuration when you
upgrade.

Then for SSL-secured tn3270 access you follow the "Configuring the TCP/IP
Server" chapter. You need to choose one or both of:

(a) having z/VM TCPIP and the tn3270 clients negotiate SSL via TLS (no
need for a separate port). You use INTERCLIENTPARAMS statements to
configure it: TLSLABEL to choose your certificate label and
SECURECONNECTION NEVER|REQUIRED|PREFERRED|ALLOWED to set your policy on
whether clients can/must negotiate SSL. Some tn3270 clients that support
SSL don't support TLS-negotiated SSL, and some of those that support TLS
have problems depending on which end tries to negotiate first, so that may
influence your SECURECONNECTION or TLS choice.

(b) having the tn3270 client make an immediate SSL-protocol connection, in
which case you need a separate port and add "SECURE your_cert_label" to
the relevant "portnum TCP INTCLIEN" line in the PORT section of your
PROFILE TCPIP.

--Malcolm
Re: NFS migration
Jake anderson writes:
> Recently we did a migration from one NFS storage server to another NFS
> storage server. During this migration all the copied files had owners as
> root. In the recent NFS storage server the FTP server option is no more
> available so we have mounted the NFS storage to a linux running on VMware
> infra (as a ftp server). So when we try to change the owner of any file
> mounted to Linux we get a permission denied (even when we try it as
> root). The message we get is "permission denied" (this is the only
> message). The ls -l clearly gives that all the files have the owner as
> root.
>
> Has anyone undergone this situation? Why can a root not change the owner
> (root) to some other ID?
> Since the files have the User and Group copied from the previous NFS
> storage, aren't there any ways to change the Owner and Group from Linux?

It's the NFS server that's forbidding it. It's very common in all but the
snazziest of NFS environments for the NFS server to "squash" the root user
of NFS clients and treat it as an unprivileged, anonymous user. This
avoids having a root user on any NFS client getting root-level access to
all exported files on the server.

For a Linux-based NFS server, the export options "root_squash" (which is
the default) and "all_squash" (probably not the case here) do this. You
need an explicit export option "no_root_squash" to allow root on the
chosen NFS clients to chown and access exported files as though they were
uid 0 on the server. Other NFS servers or appliances may present the
option differently.

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant
IBM UK Systems and Technology Group
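As a sketch of what that looks like on a Linux-based NFS server (paths and
hostnames are made up; see exports(5) for your distro's exact syntax):

```
# /etc/exports -- illustrative entries only
# Default behaviour: root on the client is mapped ("squashed") to the
# anonymous user, so a client-side chown as root fails.
/export/data   client1.example.com(rw,sync,root_squash)

# Trusted admin host: root keeps uid 0 on the server, so chown works.
/export/data   adminbox.example.com(rw,sync,no_root_squash)
```

After editing, run "exportfs -ra" on the server to re-read the export
table.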
Re: Bash script for FTP to mainframe
Bauer, Bobby (NIH/CIT) [E] writes:
> That's interesting, it is prompting for the password!?
>
> ftp -inv $HOST << EOF
> user $USER
> PASS $PASS

Many of the user-visible commands to ftp clients are the same as or
similar to the underlying ftp protocol commands that the ftp client sends
over the network to the server. That sometimes makes it easy to conflate
the two concepts and can cause confusion about what's actually going on.

In the ftp protocol itself, the client program sends a "user ..." followed
by a "pass ..." to the server to complete the logon process. However, the
ftp client program gets the information from the end user differently. In
your case, you're using the -i option (or else it would prompt
interactively for the username) and you're using the -n option so it's not
auto-logging in with username/password from the ~/.netrc file. (You might
wish to consider holding the password there instead of stashing it in the
script.)

So the program starts up and you use the end-user "user username
[password]" command. The program uses the "username" component and sends a
"user username" protocol command. However, it then needs the password to
send. To get it, it either takes it from the second argument of your
"user" command or, if not there, prompts you on the terminal for it
(bypassing stdin). Although the ftp client program then sends a "pass ..."
protocol command to the server, it's not an end-user command which can be
used.

So, to return to your original try:

> HOST=nih
> USER=me
> PASS=password
> ftp -inv $HOST << EOF
> user $USER $PASS
[...]
> Remote system type is MVS.
> (username) 331 Send password please.
> 530 new passwords are not the same
> Login failed.
>
> I know the password is correct. I don't know what it is doing/complaining
> about when it says the new password is not the same. Anybody know how to
> do this?

I look up the "530 new passwords are not the same" error from the z/OS
Communications Server IP and SNA Codes manual and find:

  530 new passwords are not the same
  Explanation: The PASS command was issued using the format
  old_password/new_password/new_password to change the password of the
  user ID, but the second “new password” was not identical to the first
  “new password”. Both “new passwords” must be the same.

So I wonder if there may have been a slash character in your password?
Then the z/OS ftp server interprets a slash as an implicit
logon-and-change-password request, which is failing. I'm mildly surprised
it would have decided to check the equality of the two new passwords (and
return the error) before verifying that the actual password (to the left
of the first slash) was valid.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
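For the ~/.netrc route mentioned above, a sketch (the hostname and userid
come from the script; the password is a placeholder):

```
# ~/.netrc on the client -- must be chmod 600 or ftp will refuse to use it
machine nih
  login me
  password secretpw
```

With that in place, "ftp -v nih" (dropping the -n option, which suppresses
.netrc processing) logs in automatically without the password appearing in
the script or in the process listing.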
Re: Putting devices offline automatically
Mauro Souza writes:
> Hi guys,
>
> I have a client with a peculiar problem. They have 4 z/VM partitions,
> sharing the same LCUs; every partition sees every DASD, and each
> partition has its own range of disks, defined with Offline_at_IPL on
> SYSTEM CONFIG. They use this setup because sometimes they need to access
> in one partition a disk belonging to another.
>
> When this need arises, they issue a VARY ON 2345 and get the disk online,
> use it, copy something, VARY OFF 2345. Works fine.
>
> The problem: some channels are going offline sometimes. We know that
> people are always messing with the cables, and from time to time one
> fiber or another gets loose, they fix it, and so on. And when the device
> comes back online, every single DASD on that CHPID comes online too,
> ignoring the Offline_at_IPL statement. As it should, because
> Offline_at_IPL is just for IPL.
>
> We are thinking of a "method and apparatus" for getting those DASDs
> offline when the CHPID gets back to life. I already have a REXX (I just
> deleted a lot of lines of CPSYNTAX and added a few more) that parses
> SYSTEM CONFIG, looks whether a given DASD is in the Offline_at_IPL range,
> and can put a DASD offline. I just could not make the exec run by itself
> (or by MAINT, or another CMS machine) every time a channel status
> changes.
>
> I tried to set up PROP, and it looks fine, except it doesn't react at
> all. My PROP RTABLE is configured to run my exec, but when the channel
> gets back, it does nothing. If I send the message by hand from MAINT, the
> exec runs, and puts the device offline if it is in the Offline_at_IPL
> range. I guess I will have to read everything about PROP again (I could
> find little documentation and few examples), in case I have missed
> something.
>
> I saw the NOTACCEPTed statement for DEVICES on SYSTEM CONFIG, but it
> looks like it will take the device offline forever, and we will need to
> bring it online sometimes.
>
> Does anyone have any idea for us?
"NOTACCEPTED" is reversible: provided you have

  FEATURES ENABLE SET_DEVICES

in your SYSTEM CONFIG, you could try doing

  CP VARY OFFLINE rdev
  CP SET DEVICES NOTACCEPTED rdev

and see if that prevents the disappearance/reappearance of the channel
triggering it coming online again. The CP Commands and Utilities Reference
describes the behaviour as:

  NOTACCEPTed tells CP not to accept the specified device or devices when
  the device(s) is dynamically added to VM from another partition. When VM
  is running second level it will prevent a device(s) from being accepted
  when a device is attached to the virtual machine in which VM is running.
  If VM dynamically defines a device for its own partition the NOTACCEPTed
  designation is overridden.

and it's not completely clear to me which case is covered by "channel
reappears". Worth a try though.

Bringing it online again, you should only need a

  CP VARY ONLINE rdev

CP will still retain some knowledge of the device though, so if that
doesn't work and you want CP to forget about it even more (though still
not entirely), you could try

  CP VARY OFFLINE rdev
  CP SET RDEVICE rdev CLEAR
  CP SET DEVICES NOTSENSED rdev
  CP SET DEVICES NOTACCEPTED rdev

After that lot, to bring it back with default DASD options you'll want
something like

  CP SET DEVICES SENSED rdev
  CP VARY ONLINE rdev

or, if you want non-default settings for the rdev's setting of SHARED,
EQID or MDC, you'll need instead an explicit

  CP SET RDEVICE rdev ... TYPE DASD ...

which includes your options, followed by

  CP VARY ONLINE rdev

(maybe preceded by

  CP SET DEVICES ACCEPTED rdev
  CP SET DEVICES SENSED rdev

for the sake of completeness).

--Malcolm
Re: Issues using VMUR
Shumate, Scott writes:
> Contents of /etc/zipl.conf
[...]
> parameters="root=/dev/mapper/VolGroup00-lv_root rd_DASD=0.0.0701
> rd_DASD=0.0.0700 rd_NO_LUKS rd_DASD=0.0.0702 LANG=en_US.UTF-8 rd_NO_MD
> KEYTABLE=us cio_ignore=all,!0.0.0009,!0.0.000c
> rd_LVM_LV=VolGroup00/lv_root SYSFONT=latarcyrheb-sun16 crashkernel=auto
> rd_LVM_LV=VolGroup00/lv_swap rd_NO_DM"

That shows the !0.0.000c and looks fine, so 00c should be available after
the next reboot, provided zipl is run after the edit and before rebooting.

> Output from cat /proc/cmdline (I don't see !0.0.000c)
>
> [root@wil-zvmdb01 ~]# cat /proc/cmdline
> root=/dev/mapper/VolGroup00-lv_root rd_DASD=0.0.0701 rd_DASD=0.0.0700
> rd_NO_LUKS rd_DASD=0.0.0702 LANG=en_US.UTF-8 rd_NO_MD KEYTABLE=us
> cio_ignore=all,!0.0.0009,!0.0.0009 rd_LVM_LV=VolGroup00/lv_root
> SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=VolGroup00/lv_swap
> rd_NO_DM BOOT_IMAGE=0

This shows two copies of !0.0.0009 in place for the current boot. Since
the original was

  cio_ignore=all,!0.0.0009

it looks as though an additional !0.0.0009 had been added, giving

  cio_ignore=all,!0.0.0009,!0.0.0009

instead of the second one being !0.0.000c. I've just tried on an RHEL 6.2
system with the exact same kernel as yours and !0.0.000c works fine for
me. Any chance the second !0.0.0009 was added, then zipl run (which writes
the change to the boot block) and only then was it noticed and changed to
!0.0.000c without then rerunning zipl and rebooting?

> Output from cio_ignore -l
>
> Ignored devices:
> =================
> 0.0.0000-0.0.0008
> 0.0.000a-0.0.000b
> 0.0.000d-0.0.06ff

This, though, shows that 00c is not being ignored and should be usable at
the moment. Given that it isn't in the kernel cmdline shown above for the
current boot, it must have been dynamically set via a cio_ignore command
done directly or indirectly. Red Hat uses various scripts triggered from
udev hot-plug rules to fiddle with cio_ignore for things like dasd, but I
hadn't thought they'd done anything as polished for vmur.

> Output from zipl
>
> [root@wil-zvmdb01 ~]# zipl
> Using config file '/etc/zipl.conf'
> Run /lib/s390-tools/zipl_helper.device-mapper /boot/
> Building bootmap in '/boot/'
> Building menu 'rh-automatic-menu'
> Adding #1: IPL section 'linux-2.6.32-220.el6.s390x' (default)
> Preparing boot device: dasdb.
> Done.

Running zipl doesn't just tell you the current configuration, it writes
the current zipl.conf settings into the boot block to be used for the next
reboot.

It looks as though maybe some of the various steps and tests (dynamic
cio_ignore, editing zipl.conf, running zipl, rebooting, seeing
/proc/cmdline and the current cio_ignore table and accessing the device)
weren't in the intended order. If (1) zipl.conf really does have the
!0.0.000c in it as shown and (2) zipl has been run after that change was
made, then you should find that after the next reboot you see the newly
stamped setting via /proc/cmdline and the corresponding omission of
0.0.000c in the output from cio_ignore -l.

--Malcolm
Re: Issues using VMUR
Shumate, Scott writes:
> That worked, but I'm having issues with making it permanent. I added it
> to /etc/zipl.conf and reran zipl. I rebooted but it was still on the
> blacklist.

Have a look at the output of

  # cat /proc/cmdline

and check the syntax closely. For example, there must be no spaces within
the cio_ignore= value, there must be an exclamation mark to remove the
device number rather than add it, the !0.0.000c has to appear after the
"all", not before, and I can imagine that the device number may have to be
spelled out in the full canonical form of "zero dot zero dot four hex
digits".

Also check you edited the zipl.conf stanza for the kernel you then
actually booted. Since the contents of /proc/cmdline show the information
from the current boot, you'd be able to tell if it's the wrong stanza
because it wouldn't have your edits in place.

Send the output (along with the output of "cio_ignore -l" and "lscss" for
good measure) if it's still not clear.

--Malcolm
Re: Issues using VMUR
Shumate, Scott writes:
> I'm having issues, wondering if someone could help me out. I'm trying to
> receive files from my reader. I'm currently running RHEL6.
[...]
> I list the rdr with vmur
> [root@wil-zvmdb01 dev]# vmur li
> ORIGINID FILE CLASS RECORDS CPY HOLD DATE  TIME     NAME  TYPE DIST
> RSCS     0007 B PUN 0006    001 NONE 03/04 16:11:39 SCOTT EXEC SYSPROG
> RSCS     0008 B PUN 0006    001 NONE 03/04 17:19:44 SCOTT EXEC SYSPROG
> LXP10001 0004 T CON 0744    001 NONE 02/26 17:04:46            LXP10001

This works because listing the contents of the reader is done via a DIAG
call rather than Linux prodding the device itself.

> I try to bring rdr online
> [root@wil-zvmdb01 dev]# chccwdev -e 000c
> Device 0.0.000c not found

This is because Red Hat defaults to ignoring all devices via the
cio_ignore blacklist and only enabling those that are explicitly removed
from the blacklist. To do this dynamically, do:

  # cio_ignore -r 00c

You can then bring it online with chccwdev. You can ensure that the device
is not ignored at the next boot by modifying the cio_ignore parameter in
the kernel command line in /etc/zipl.conf and rerunning zipl. For example,
you could change

  cio_ignore=all,!0.0.0009

to

  cio_ignore=all,!0.0.0009,!0.0.000c

or get rid of the cio_ignore blacklist altogether and allow all the
virtual machine's devices to be seen.

--Malcolm
Re: Speed of BASH script vs. Python vs. Perl vs. compiled
John McKown writes:
> This is more a curiosity question. I have written a bash script which
> reads a bzip2 compressed set of files. For each record in the file, it
> writes the record into a file name based on the first two "words" in
> the record and the "generation number" from the input file name. Do to
> the extreme size of the input (47 files, each of which would be around
> 120 Gb to 180 Gb expanded or 23 to 27 million lines - very large).
> Basically there are probably around 50 or so (don't know) possible
> combinations of the "words". I'm wondering if rewriting the script
> into either Python or Perl (both basically interpreted) would be worth
> my while.

Perl (and Python) aren't simply interpreted. In the case of perl, it
compiles the source into an internal op tree (rather like bytecode) while
performing a decent amount of cheap optimisation (peephole optimisation
mostly) and then runs that internal structure. Python does something
similar, but the internal representation is different. Most of this isn't
relevant to your situation here, though.

> Or should I go with a compiler such as C/C++? Or, lastly, is
> it basically irrelevant due to the extremely large number of records
> and the minimal processing; which means that I/O will dominate the
> application.

It's not I/O that's dominating in your implementation below, it is (as
others have spotted) the opening and closing of the relevant file on every
single line of input. Either Perl or Python will let you remove this cost
entirely. In fact, in your bash script below, bash seems to read each
character from its uncompressed input in a separate read syscall, which is
dreadful but may be fixable.

> If you're interested, the bash script looks like:
>
> #!/bin/bash
> for i in irradu00.g*.bz2;do
>   gen=${i#irradu00.};  # remove prefix
>   gen=${gen%.bz2};     # remove suffix, leaving generation
>   bzcat $i |\
>   while read line;do
>     fn=${line%% *}  # remove all trailing characters after a space
>     ft=${line:9:8}  # get second word
>     ft=${ft%% *}    # and remove trailing spaces
>     echo "${line}" >>${fn}.${ft}.${gen}.tx2;
>   done;
> done

This Perl program (or an analogue in Python or whatever) is likely to give
(and strace on some small test data shows) much, much better behaviour for
larger input files, since it caches an open filehandle per output file:

#!/usr/bin/perl
use strict;
use IO::File;

my %fhcache;

sub newfh {
    my $filename = shift;
    my $fh = IO::File->new($filename, "a") or die "$filename: $!\n";
    $fhcache{$filename} = $fh;
    return $fh;
}

sub getfh {
    my $filename = shift;
    return $fhcache{$filename} || newfh($filename);
}

foreach my $infile (<irradu00.g*.bz2>) {
    open(IN, "bzcat $infile|") or die "bzcat $infile: $!\n";
    my ($gen) = $infile =~ /\.(.*)\.bz2/;
    while (<IN>) {
        my ($fn, $ft) = split;
        getfh("$fn.$ft.$gen.tx2")->print($_);
    }
    close(IN);
}

--Malcolm
Re: Sending a signal along with associated data.
Thomas Anderson writes:
> You are correct in that the use of signals is pretty limited and there
> isn't a convenient way to pass data to the target application.

POSIX real-time signals allow an int or pointer's-worth of data to be sent
along with the signal, which is then queued and also carries the sender's
uid, gid and pid. See the "Real-time Signals" section of signal(7) for the
care needed to select an unused rt signal number and see sigqueue(3) for
how to send them.

--Malcolm
Re: SSH and LDAP/RACF
Florian Bilek writes:
> 2.) In principle the login via SSH is working very well. I recently
> encountered a kind of weakness in the configuration: a RACF user that
> uses its own RSA keys to log into the system. When I do a RACF revoke on
> that user, it seems that the LDAP check does not take place and the user
> can still log in. What can be done about that?

There's a section of the sshd(8) man page beginning:

  Regardless of the authentication type, the account is checked to ensure
  that it is accessible. An account is not accessible if it is locked,
  listed in DenyUsers or its group is listed in DenyGroups. The definition
  of a locked account is system dependant. Some platforms...

which then (as I try to ignore the misspelling of "dependent") gives
O/S-specific ways that it checks for locked accounts, usually by special
contents of a directly-accessed shadow password field such as "*LK",
"Nologin", "!". From that, I'd guess that sshd may not invoke PAM in a way
that would let you use pam_ldap to do the appropriate lookup via LDAP.

What about, as a workaround, creating a RACF group named NOLOGIN,
connecting revoked users to that group (an extra step, but that's why I
called it a workaround, not a proper solution) and then putting
"DenyGroups nologin" in your sshd_config? If z/VM LDAP doesn't
special-case group membership lookups for revoked users then I think that
may work.

--Malcolm
Re: Start networkadapter without reboot
Ursula Braun writes: > looking through your transcript, I find the definition of the NIC (vmcp > def nic 800 type qdio), but I do not see a vmcp couple command to bind > the created NIC to a VSWITCH or GuestLAN. This would explain the failing > STARTLAN command for this qeth device. I intentionally didn't bother with a COUPLE since I was trying to reproduce Berry's problem and also expecting the vNIC to act like a normal NIC and let me configure it and even ifconfig it up before plugging it into a switch. I'd thought that that used to work but maybe not. Would it be considered a bug or an "unimplemented feature" that it doesn't act that way? Actually, even when I then couple it to a VSWITCH, the state remains in HARDSETUP (even after I do an "echo 1 > recover" too) and an "ifup eth1" still fails. That makes it even more unlike a normal NIC and seems very inconvenient. I'll send you the trace for that too in case that part is unexpected. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Start networkadapter without reboot
Ursula Braun writes: > correct, state HARDSETUP should be a temporary state during online > setting for a qeth device only. If the device stays in state HARDSETUP > something unexpected occurred. To understand the reason, the trace > file /sys/kernel/debugfs/qeth_setup/hex_ascii should be checked by a > qeth expert. It's a wrap around buffer; thus it needs to be saved > immediately after the problem shows up. I can reproduce this easily on SLES11SP1 kernel 2.6.32.12-0.7-default (not up to date on service at all). I'll send you a transcript. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Start networkadapter without reboot
van Sleeuwen, Berry writes: > I've found some more information when looking through the /sys/ directory. > The state for this device is in HARDSETUP. > > nlzlx204:~ # cat /sys/devices/qeth/0.0.0f10/state > HARDSETUP > > Searching for this I've found some information in patch reports, for instance > at git390.marist.edu and kerneltrap.org. This status is a result of a boot > without the actual device available. Indeed, what we have here now. When the > cable is plugged (or the vswitch connected) the state should switch to > SOFTSETUP or eventualy even to UP. But it doesn't. Would it be possible to > get it to UP dynamically or is this a bug that can be resolved with a reboot > or network restart only? (kernel level is at 2.6.32.12-0.7).

Try

  # echo 1 > recover

You may need to take the device offline first (echo 0 > online) and then bring it back again after the recovery attempt (echo 1 > online).

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: bash question.
McKown, John writes: > Very nice! Thanks. I guess that I'm going to end up dedicating a > weekend day to just read the entire output from "info bash". Luckily, > I can create a text file from it, convert it to PDF format, then > read the PDF directly on my Kindle DX or Android tablet.

In case you weren't aware of it already, the utilities used to process the *roff macros used in man pages support typesetting to PostScript as well as generating simple text output. So typing

  man -t bash > bash-man.ps

will generate you a nicely formatted PostScript version of the man page in bash-man.ps, fancy fonts and all, instead of what you'd get from just taking the text version. That's suitable for direct printing but you can instead just

  ps2pdf bash-man.ps

to produce your bash-man.pdf PDF version.

Using "info bash" instead of "man bash" uses a slightly different source of documentation (the FSF document their own programs in their own GNU info format instead of man pages) but you'll nearly always find that some nice people have already ensured that your distro has man pages for the programs as well and that they have either exactly the same content or are "close enough" for most purposes. There are ways of generating various typeset-like formats from info format too but I forget what they are and I don't think they are as simple as just adding "-t" to your man command invocation.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: New book: Linux Health Checker 1.0 User's Guide
David Boyes writes: > > Do we have any equivalent "System Health Checker" for z/VM? > > Would be an interesting Summer of Code project if someone were > willing to mentor the student. You'd need a college that still had > a VM system, though -- which pretty much limits it to a few > candidates. I provide Linux guests and second level z/VM systems, not just z/OS, on the Zeus hub used by EMEA universities in the z Academic Initiative program. I'd have thought the US zAI folks would likely do the same on their hubs. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Oracle in virtual environments
Harder, Pieter writes: > Yes, this policy of non-support of Oracle on virtual systems > applies to VMware as well. That is why we still run our main Oracle > on Sparc iron. Political trouble ahead ;-) Berry van Sleeuwen writes: > In a discussion with our Oracle group they claim that there is no > support for Oracle in virtual systems and that they therefore will > not support on virtual systems too. So when a (performance) problem > is found they first advise to migrate to a dedicated server, and > increase resources, before they attempt to solve the problem. This > is not (only) true for z but also for other virtual systems as > well, we discovered this because of the advice to migrate off of > cloud systems. So basically any 'cloud' service is not advised to > run Oracle. [...] > > Can anyone confirm this statement? Is it Oracle or is it the > interpretation or our Oracle group? Is there a formal statement > from Oracle itself?

I'm not Oracle and I should think you'll get an official response soon, but just to give you the good news as soon as possible: I've seen public statements from Oracle that System z virtualisation is handled specially by them and is fully supported, unlike most other virtualisation. The nearest public statement I have to hand is the 11th foil (labelled 22) of Oracle's SHARE presentation from 13 August 2008, titled "Virtualizing Oracle Servers with Linux on IBM System z" by Barry Perkins of Oracle and IBM's own Kathryn Arrell, which says: IBM System z Server Virtualization – A Proven Platform ... System z virtualization is fully supported by Oracle Database, Real Application Clusters, Fusion middleware (and that statement is highlighted in red to indicate its importance).
--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Porting old application- numeric keypad
Smith, Ann (ISD, IT) writes: > They have tried diff. As John says, GNU diff, as available on any Linux, provides a lot of powerful functions including recursion support (specified explicitly via the -r option). I'd encourage anyone using diff to use the "-u" option to provide "unidiff" format output which is much more human-readable and provides more context information used by patching software to behave more robustly in the face of applying patches to "slightly modified" files. Using diff to do "diff -ur dir1 dir2" and suchlike is something I do fairly frequently and I've never found any glaring omissions in its functionality. > Has some functions but appareently not all that dircmp -d provides. What functionality do they think is missing from GNU diff? I wouldn't be surprised if education plus possibly some minor pre/post-processing with other utilities solved their problems. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
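To make the recursive, unified-diff workflow above concrete, here is a sketch (directory and file names invented for illustration):

```shell
# Build two small trees that differ, then compare them recursively.
tmp=$(mktemp -d) && cd "$tmp"
mkdir -p dir1/sub dir2/sub
echo "alpha" > dir1/sub/f.txt
echo "beta"  > dir2/sub/f.txt
echo "only here" > dir1/extra.txt
# -u: unified output with context; -r: recurse into subdirectories.
# diff exits 1 when the trees differ, so don't treat that as a failure.
diff -ur dir1 dir2 > result.diff || true
cat result.diff
```

The "Only in dir1: ..." lines are how diff -r reports files present on one side only, which covers the most common dircmp-style comparison.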
Re: Showing a process running in BG
Tom Duerbusch writes: > I have a process that may or may not be running in background. > > When I use any of the forms of "ps", it shows the process running, but, I > don't understand if any of the fields being displayed, indicate that this is > a BG process. It all looks the same to me . > > If the process is running in the background, I need to follow the path of how > did it get there (bg). If the process isn't running in background, I have a > different problem all together.

Use the "-j" or "j" option of ps to list the process group, session ID and controlling terminal of your processes. So if you prefer your ps options SysV-flavoured, do

  ps -ejww

or if you prefer them BSD-flavoured, do

  ps ajxww

You may be able to get away without knowing the precise details of how processes, process groups, sessions and controlling terminals interact. The common cases are

(1) The process is a daemon: no controlling terminal (TTY column "?"), pid = pgid = sid.

(2) The process is an interactive shell: has a controlling terminal, pid = pgid = sid.

(3) The process is a part of a foreground or background job of an interactive shell: has a controlling terminal (that of the shell that started the job), sid is same as the shell's sid. The leader of the process group (pid = pgid) is usually the first command in a pipeline (e.g. for a | b | c, the pgid will be the pid of a, and b and c will have the same pgid but, of course, different pids).

The difference between "foreground" and "background" is whether the pgid is the one set on the controlling terminal. The shell uses tcsetpgrp() on the controlling terminal to switch between foreground and background. The important consequences are things like hitting Ctrl/C on the terminal sends an interrupt signal to its process group (foreground processes) and processes attempting I/O to their controlling terminal get refused with SIGTTOU sent if their process group doesn't match (i.e. background processes), although that behaviour is configurable.
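As a quick illustration of those columns, here is a sketch (it assumes a procps-style ps; note that a non-interactive script shell does not give background jobs their own process group the way an interactive, job-control shell does):

```shell
# Start a background pipeline and print the pid/pgid/sid columns for
# this shell ($$) and for the last process in the pipeline ($!).
sleep 2 | cat &
ps -o pid,pgid,sid,tty,comm -p "$$,$!"
wait   # reap the background job before exiting
```

Under an interactive bash, the sleep and cat would share a process group of their own, distinct from the shell's.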
--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: DB2 Performance Questions
Shedlock, George writes: > We are conducting a proof of concept for one of our divisions. This includes > DB/2 9.7 and Suse 10 SP 3 with CKD dasd. > > The first opportunity is the defines of tablespaces. In our x86 environment, > this runs in less that 3-4 minutes (yes, there are a lot of tables). In our > z/Linux guest that same set of defined runs in about 3-4 HOURS!. The only > thing we have seen as far as activity is a very large number of disk I/O's to > dasd. The tables define approx. 800-900 GB. What we think we are seeing is > that the table spaces are being formatted. > > We have tried the "no file system caching" option on the define, but is > flagged as an invalid option. IBM is saying that this is the default, but > after the tables are defined and we look at the tables, we see that the > option is turned on. If we then try to turn it off, it is again flagged as an > invalid option.

Unless things have changed recently, DB2 V9 for LUW on Linux for System z only supports "no file system caching" (i.e. direct I/O) when using FCP SCSI disk access, not with ECKD disk access. I find this documented in Table 16 ("Supported configuration for table spaces without file system caching") on pp160-161 of the DB2 V9 "Administration Guide: Implementation" manual (SC10-4221-00). My copy of the manual is from quite a while back so a newer one may have some changes.

Assuming the restriction is still in place, I share your pain. Within the constraints of doing without direct I/O, you can at least try to ensure that your DB2 data is spread and striped across a large enough number of device numbers (virtual and real, possibly including PAV devices of various flavours if appropriate) and across enough channels, back-end disks, ranks and whatever other objects your back-end DASD subsystem needs to ensure good performance. There are presentations around which describe the various configuration issues you need to cover.
Without one of those and/or the help of someone else who knows where to look, you can easily run into bottlenecks due to configuration rather than fundamental hardware or software. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Standby storage and user direct
Brad Hinson writes: > Anyone know if there's support in z/VM's user direct file for defining > standby storage/memory when defining a user?

There's no specific keyword so you do it via

  COMMAND DEFINE STORAGE AS size STANDBY size

You can include a "RESERVED size" in there too but unless you're wanting to simulate closely an LPAR environment where other LPARs have used up the spare memory before you, it doesn't seem much use.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
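So a directory entry might look roughly like this (user name, password and sizes invented for illustration; check the DEFINE STORAGE syntax against your z/VM level):

```
USER LINUX01 XXXXXXXX 2G 4G G
 COMMAND DEFINE STORAGE AS 2G STANDBY 2G
 IPL CMS
```

The COMMAND statement runs at logon, so the storage it defines should take effect in place of the initial size on the USER statement (within the maximum given there).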
Re: Z10 vs x86 or Sparc
Dave Jones writes: > z196: > quad core chips > chip speed: 5.2 GHz > L1: 64K I / 128K D private/core > L2: 1.5M I+D private/core > L3: 24MB/chip - shared plus L4: 192MB/book - shared between the 20 or 24 cores (then multiply by the 1-4 books). --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: selinux question
Neale Ferguson writes: > Thanks. I used the chcon command to change the context but am still having > problems and seeing this in the audit log: > > type=AVC msg=audit(1296596790.809:1547): avc: denied { execute } ... Now it's complaining about execute whereas before it was only complaining about read. I'm no expert here, but I believe the types of object are in general different from the types of subjects for Type Enforcement which is the usual SELinux policy. If you look in the selinux-policy SRPM (just do a build-prepare with rpmbuild -bp), you'll find the source for the snmpd policy in directory serefpolicy-3.7.19/policy/modules/services in files snmp.fc, snmp.if and snmp.te for, respectively, the contexts for particular directory names (for use with restorecon), the interfaces and the underlying types. I'm looking at Fedora 13 but it's probably close. I see stuff in there for it reading lib files and executing init scripts and so on but I see nothing for loading dynamic modules. If you want to solve this properly rather than using a blunt hammer then you could maybe look at the apache.* policy files in the same directory and see how the httpd_modules_t type is implemented there to handle Apache DSOs and use similar type and interface definitions for snmpd. --Malcolm -- Malcolm Beattie IBM Mainframe Systems and Software Business, Europe IBM UK
Re: Notification when a spool file arrives
Alan Altmark writes: > On Tuesday, 12/07/2010 at 09:07 EST, Malcolm Beattie > wrote: > > First of all, I've added an "arrived" > > attribute to ur devices which gets incremented each time a file > > arrives so > > cat /sys/bus/ccw/drivers/vmur/0.0.000c/arrived > > contains a number checkable by scripts or programs (useful in case of > > wakeups or restarts of an app so it can check if anything really > > happened). > > But you're not opening a spool file, you're opening a special character > device. No, I'm opening a sysfs file--maybe I was unclear. The notification mechanism I'm talking about is via the sysfs driver model file (/sys/...) and not the special character device file (/dev/...). The latter is, as both you and I have written, a non-blocking model. In the Linux driver model, device-related notifications can be sent as uevents (broadcast over an AF_NETLINK socket, one of whose listeners is udevd with its configurable rulesets under /etc/udev) or, more recently, via a POLLPRI condition on a descriptor opened on a sysfs file. Neither of these affects the open/read/write/close behaviour of the special character device file in /dev. Or I may have misunderstood your point? --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Notification when a spool file arrives
Eddie Chen writes: > Over the weekend I took a look at VMUR.CCP source to insert "FD_ZERO(), > FD_SET() and select()". I thought adding would be trivial. > However, when I took a look at the device driver VMUR.C on the internet, I > found that the OPEN calls diag_read_next_file and if there > are no data(no reader file) it will returns ENODATA(No data). That means > when I open the "/dev/00c" to get the file descriptor it will come back > "NO data" thus no file descriptor for the select(). Also I notice that it > does not allow OPEN in "write" mode as well. (Subject line changed to clarify thread contents.) Indeed, the interface to unit record devices (whether via channel programs or via DIAG) is not a blocking one: you try a read from the device and it either gives you data or it comes back immediately with "no there's no data". Hence why I wanted to add an asynchronous notification that a new file has appeared in the reader. The I/O model is that when this happens, CP (presumably modelling what would happen with real hardware) presents an unsolicited interrupt to the driver. I've looked at two ways in which that could conveniently be sent through to userland. First of all, I've added an "arrived" attribute to ur devices which gets incremented each time a file arrives so cat /sys/bus/ccw/drivers/vmur/0.0.000c/arrived contains a number checkable by scripts or programs (useful in case of wakeups or restarts of an app so it can check if anything really happened). One of the notification methods is via a sysfs KOBJ_CHANGE uevent which can be caught by the udev subsystem. I added this on Friday and it seems to work nicely. 
In practice, it means you can either have a script which sits waiting for a file to appear by doing udevmonitor (or udevadm monitor depending on kernel) to wait for such events or you can add a udev rule to /etc/udev/rules.d with something like

  BUS=="ccw", DRIVERS=="vmur", ACTION=="change", RUN+="/do/something"

to trigger a program to run when a file arrives. There's a fancy netlink API to uevents too if scripting doesn't appeal.

The other notification method is indeed via a blocking I/O but not for the device itself. The good news is that sysfs *now* allows a driver to "wake up" readers of a sysfs attribute by triggering poll() to see a POLLPRI condition. I can add that easily to the "arrived" attribute, meaning that something roughly like

  fd = open("/sys/bus/ccw/drivers/vmur/0.0.000c/arrived", O_RDONLY);
  struct pollfd pi = { .fd = fd, .events = POLLPRI };
  rc = poll(&pi, 1, -1);

will block nicely and wake up with POLLPRI in pi.revents when a new spool file arrives. The only annoyance is that when I say *now*, I mean in modern kernels because that API (sysfs_notify and, even easier, sysfs_notify_dirent) is not in kernels of SLES10 SP2 era. I need to think about how best to ensure that people with older kernels can distinguish clearly what's available and what isn't (depending on how much backporting various people want to do).

Of course, nothing here should be taken as meaning that IBM is committing to add this functionality and I'm only here talking about what I tried on my own test system.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Question to smsgiucv
Florian Bilek writes: > I am looking for a possibility using the Virtual Reader under z/VM in > z/LINUX. The idea is to process files received from a z/OS via RSCS. > Off course I could regullarily start VMUR to poll the RDR but couldn't that > be done much smarter with an event starting VMUR ? I've kept meaning to add select() support or similar to vmur since I wrote the original but it's never quite made it to the top of my priority list. It should just be a few lines of code (catch the unsolicited interrupt and wake any waiters) in the right place. I'll try to take a look soon if nobody gets in there first. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: vmur usage ?
David Boyes writes: > > > Now it starts to look interesting :) > > > Where is source of that example program ? > > Re: AF_IUCV. Yes, it would also work, but the programming is more > > complex. There's something to the ability to just use plain old 'cat' > > or any language that understands file I/O to deal with IUCV that I like > > about Neale's driver. > > In fact, it's so small, here's the whole sample program. This code takes > anything delivered to the guest via *MSG (the classes you specify) and copies > it to syslog and a terminal. Works for CPCONIO, MSG, SMSG, etc, etc -- > anything you can SET IUCV. echo Write daemon integrating AF_IUCV with dbus >> ToDo --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: IBM zEnterprise System announced???
Dave Jones writes: > Thanks, Alan, that's what I wanted to know. We still treat these > blades as "distributed" servers, only they're connected to the z via a > secure, fast, internal network. Excellent. But wait, there's more... Once a blade is purchased and entitled to be put in the zBX, as soon as it's put in the zBX it becomes part of the "z world". Assuming there's the usual z hardware support in place from IBM, the support immediately changes to 24x7 for the blade, it integrates into the "call home" mechanism of the box, it's monitored and watched just like any other z component and if anything goes wrong the usual z CE comes out and does the repair/replacement. Similarly, all firmware/hypervisor changes are done via the z HMC in the same way as, for example, channel cards, crypto cards and so on. I've already heard of one customer that's considering adding a zBX to a coupling-facility-only footprint (even though there's going to be no app data connectivity between the z196 and the zBX) purely to get the benefits of moving that level of management of the blade estate into the arena of z technology and z support. Oh, and when the zBX is installed, it doesn't just get dumped at the data centre door by the truck driver (as I'm told some blade chassis arrive)--it counts as "z" and so the full installation gets done in the same way as other z hardware. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: IBM zEnterprise System announced???
Eric Spencer writes: [> Mark Post writes:] > > >>> On 7/22/2010 at 06:19 PM, Marcy Cortes > > wrote: > > > Sounds like it will be talking over that private IP network rather than > > some > > > sort of CP co-processor though, so anything is possible. > > > > I didn't see any mention of an IP network, just that it was "private." > > That could mean a lot of things. > I think its "private" as in not visible outside the box(s), its not a part of > your ip network in general. it is not proprietary I believe it's using > standard ip protocols. The IEDN (intraensemble data network) is a flat, VLAN-aware, 10GbE switched network and you can use IPv4 or IPv6 as you wish. You can, if you want, have the IEDN "behind" the z196 and bring in all your external network connections into ordinary OSA-Express ports on the z196. In this case, connectivity between the z196 and the IEDN is via OSA-Express ports configured with a special CHPID type (OSX instead of OSD). However, you also have the option to bring your external data network directly into the IEDN via the TOR switch in which case it is the customer's particular responsibility to configure VLANs correctly via the TOR switch and consider firewalling requirements. In case there's any confusion, the "other" special network is the INMN (intranode management network) and that indeed is private: it connects the HMCs, SEs, zBX (at "hardware, firmware and hypervisor" level) and CPC (via OSA-Express ports configured as CHPID type OSM) for private chatter of management stuff. It's still IP though (IPv6 link-local, in fact). For more details, see section 7.4 of the IBM zEnterprise System Technical Guide redpiece (SG24-7833). 
--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: automatic email?
John McKown writes: > On Mon, 23 Feb 2009, Adam Thornton wrote: > > > On Feb 23, 2009, at 8:13 PM, John McKown wrote: > > > > > > > > > Many thanks for any ideas. > > > > I wrote a very simple MTA based on Net::SMTP in perl to do this. It's > > straightforward. Net::SMTP makes it very, very easy, assuming you > > already speak Perl. However: > > > > Most Linux distros let you configure an MTA to use a remote host as > > its smarthost with a couple of clicks. I would not recommend sendmail > > for anything other than an emetic in this day and age, but certainly > > Debian's packaging of Exim lets you set up a mailserver pointing to a > > smarthost trivially, and I believe I remember that it's a single line > > in postfix as well. > > > > Adam > > > > Thanks for the pointers to Exim. > > I do know Perl, somewhat. I'll look at Net::SMTP as my needs are minimal. > > I'm not the sysadmin on this particular box (I support a vendor > application), so installing software is a bit difficult. I need to request > it from the sysadmin and then it needs to be approved by corporate > security (believe it or not). The advantage to having an MTA running locally is that it handles all the corner cases of SMTP, queueing and logging so that you don't have to: what to do when the server is unavailable, what to do when the server is responding slowly, what to do when the server sends a temporary 4xx error, what to do when the server plays protocol games (more usual for externally facing servers, but still...), what to do when you suddenly want to send lots of mails at once. It's your local MTA's job to queue them for you locally, keep track of them, ensure they get sent out reliably eventually and let you know exactly whether they have been received successfully. If someone blames you for an email going missing and the remote folks can't/won't find what happened from their logs (maybe that doesn't happen these days but it sure used to...) 
then it's nice to be able to grep what happened out of your logs (exigrep is a useful utility if you use Exim) and tell them exactly when the email was sent across and exactly what their server said. Can you tell this is from the heart and from real experience? ;-) You don't need to have the MTA listening on an external interface (just localhost) so you don't have to worry about incoming mail and mail relay security. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK
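For the record, the Postfix "single line" smarthost setup mentioned above amounts to roughly the following in main.cf (the hub name is a placeholder):

```
# /etc/postfix/main.cf
# Forward all outbound mail to the corporate hub; [] suppresses MX lookup.
relayhost = [mailhub.example.com]
# Listen only on loopback so there is no incoming-mail or relay exposure.
inet_interfaces = loopback-only
```

The loopback-only value needs Postfix 2.2 or later; on older versions use inet_interfaces = localhost instead.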
Re: Udev ressize 256 too short
Marcy Cortes writes: > Does anyone know what these mean? This is SLES10 SP 2 - new install. > The volume groups are very large - vg02-vg06 each contain 22 mod 54 > volumes (almost a Terabyte). vg01 contains 12 mod 54. Volume group > "system" is pretty small at 6G. > > > There was some discussion on this list a year ago, but no resolution was > posted. > > Waiting for udev to settle... > Scanning for LVM volume groups... > Reading all physical volumes. This may take a while... > Found volume group "vg02" using metadata type lvm2 > Found volume group "vg01" using metadata type lvm2 > Found volume group "vg04" using metadata type lvm2 > Found volume group "vg05" using metadata type lvm2 > Found volume group "system" using metadata type lvm2 > Found volume group "vg06" using metadata type lvm2 > Found volume group "vg03" using metadata type lvm2 > Activating LVM volume groups... > 1 logical volume(s) in volume group "vg02" now active > udevd-event[4269]: run_program: ressize 256 too short [lots more "ressize 256 too short" lines snipped] Some initial googling makes it look like udevd is invoking a program with util_run_program() or udev_exec() and the caller is expecting that program to produce a result (by writing to stdout) of no more than 256 characters. The called program is producing too much stdout which makes its caller sad. A search of the udev codebase shows #defines of length of 256 for some filename and directory name buffers and one or two other things. You could try running udevmonitor while the scan takes place and then work through which udev rules and external programs get invoked to try to catch the one constructing the undesirably long output. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK
Re: z10 BC is Here
Bruce Hayden writes: > It must be you.. It is basically a single frame z10 EC in size and is > not a "rack". It does use an MCM, but like the z9 BC, it isn't > mounted in a book and you can't add another. Actually, it's not an MCM (Multi Chip Module) in the z10 BC, it's six SCMs (Single Chip Modules): 4 separate SCMs for the 4 separate Enterprise Quad Core chips (3 active in each) and 2 other SCMs for the SC (System Controller) chips. For pictures and more detail the draft of the redbook "IBM System z10 Business Class Technical Overview" (SG24-7632) is now available at http://www.redbooks.ibm.com/redpieces/abstracts/sg247632.html --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: var subdirectory
Gentry, Stephen writes: > I think I've painted myself into a corner. My root subdirectory has > gotten full. When I built this linux, /opt and a couple other subdir > were installed as separate mount points. Now, I'd like to move var to a > separate dasd/mount point. When I try to rm the var subdirectory, I use > the -rf command. However, 5 subdir's won't delete. They are: > > `var/lib/nfs/rpc_pipefs/statd' > `var/lib/nfs/rpc_pipefs/portmap' > `var/lib/nfs/rpc_pipefs/nfs' > `var/lib/nfs/rpc_pipefs/mount' > `var/lib/nfs/rpc_pipefs/lockd' > > I get an "operation not permitted" message. I figure that maybe there > is a task running, so I go and kill some tasks that look like they might > be related to this, but still no luck. I need to remove this subdir so > I can mount the new one (var). I do not have a 2nd linux running > therefore, I cannot mount this disk to a 2nd one and delete var. > Basically, all I have is a linux command line in a 3270 session. I can't > putty into this linux under existing conditions. > Does anyone have any suggestion? You will find that there is a (pseudo)filesystem mounted on /var/lib/nfs/rpc_pipefs which supports some fancy NFS functionality. You will need to unmount it first or else avoid descending into it when attempting to remove files under /var. I'm surprised that going down to runlevel 1 doesn't unmount it but perhaps the init.d scripts don't tidy up everything or else some nfs-related kernel module keeps some refcount on it. After stopping all nfs-related services via their init.d scripts, try a "fuser -m /var/lib/nfs/rpc_pipefs" to see if any processes are still around. Once those are stopped, you should be able to "umount /var/lib/nfs/rpc_pipefs" unless there's a kernel refcount held on it somehow. 
--Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
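For scripted cleanups like this, it can help to check the mount state programmatically before attempting the removal; a minimal Python sketch using os.path.ismount (the rpc_pipefs path is the one from the post and may well not exist on other machines):

```python
import os

def mounted(path: str) -> bool:
    """True if something is mounted at path. ismount compares st_dev with
    the parent directory, which is how a mount boundary shows up."""
    return os.path.ismount(path)

# "/" is always a mount point; the rpc_pipefs path is from the post.
for p in ("/", "/var/lib/nfs/rpc_pipefs"):
    print(p, mounted(p) if os.path.exists(p) else "absent")
```

If it reports mounted, stop the nfs services, check with fuser as described above, and umount before removing the tree.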
Re: NTP daemon problem (was: Weird application freeze problem)
Edmund R. MacKenty writes: > Does anyone know of a Linux tool that would give more accurate information > about process wake-ups? It would be nice to be able to profile Linux daemons > like this and see which ones play nice in a VM environment, because ntpd sure > doesn't! Try strace -tt -T -o strace.log -p $pid and use filter options to avoid too much output. "man strace" for details. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: DDR'ing 3390 DASD To Remote Location
Rob van der Heij writes: > On Thu, Jun 19, 2008 at 5:11 PM, Malcolm Beattie <[EMAIL PROTECTED]> wrote: > > > I wrote writetrack and readtrack kernel modules for Linux 5 years > > ago which implement ioctls to do that along with simplistic userland > > utilities and they worked OK for me to transfer various VM and z/OS > > disks as images via plain Linux files. The internal kernel API for > > DASD driver disciplines was a bit icky back then so they would need a > > good polish (or simply a rewrite based on the existing template). > > IMHO the disadvantage of that approach is that you need another > userspace tool to get the data in and out of the driver. It is harder > to integrate in existing backup processes. > The design we discussed back then was to show the cylinders of the > volume as files in a directory (bonus points when arranged according > to CMS formatted minidisks). That way you could backup the data as any > other Linux data, and have automatically a way for incremental (per > cylinder) backups etc. Should be straightforward to use FUSE to add such a filesystem "wrapper" around the underlying readtrack/writetrack ioctl. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: DDR'ing 3390 DASD To Remote Location
Rob van der Heij writes: > When we saw the first new z/VM installations with Linux show up, I > proposed a new feature for the Linux disk driver that would allow > arbitrary tracks to be read and written (like the pipeline stages). > That way a Linux guest could be used to backup the VM packs along with > the Linux data. And for D/R restore you could first IPL one Linux > guest native, restore the VM packs (from TSM) and then IPL VM again. > Something like that would fit your needs. > The design of the driver appeared to be very simple after a few beers, > but next morning it turned out to be harder. I wrote writetrack and readtrack kernel modules for Linux 5 years ago which implement ioctls to do that along with simplistic userland utilities and they worked OK for me to transfer various VM and z/OS disks as images via plain Linux files. The internal kernel API for DASD driver disciplines was a bit icky back then so they would need a good polish (or simply a rewrite based on the existing template). The unit record driver I did was eventually noticed and requested by enough customers (I assume) that Boe asked me for it, polished it and pushed it upstream so perhaps a similar thing might work if people are interested in full track read/write for Linux. No guarantees, since I don't know how those requests were routed and prioritised before they ended up as a request to me. I suggest requests be sent via whatever the usual official route is for customer requests (i.e. not directly to me, I'm afraid). --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Measuring CPU performance? Which is right?
CHAPLIN, JAMES (CTR) writes: > On the zLinux guest (ZP013), using sar I get a CPU usage of about 15%: [...] > But under Perfkit (zVM) we get the following exception message, 33.5% > CPU: > > 11:51:51 FCXUSL317A User ZP013 %CPU 33.5 exceeded threshold 30.0 for 5 > min. [...] > We have two > IFLs defined to the guest. [...] > Why are the numbers from PERFKIT different from the zLinux environment? PerfKit percentages are calculated as "percentage of one engine". Linux percentages calculate "percentage of CPU resource available to the image". For your Linux guest with 2 engines, Linux tells you it's using ~15% of its 2-engines'-worth. PerfKit spells that as ~30% of a nominally-100%-utilised single engine. Same resource usage, different way of displaying the measurement. [For the purposes of this posting, I'm treating any remaining few percent difference as a second order effect or else we'd muddy the waters with discussing a bunch of more complex measurement issues.] --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
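The arithmetic is just a change of base between "percent of one engine" (PerfKit) and "percent of the engines available to the image" (Linux); a trivial sketch with the numbers from the post (function names are mine):

```python
def to_single_engine_pct(linux_pct: float, engines: int) -> float:
    """Convert Linux's 'percent of all my CPUs' to PerfKit-style
    'percent of one nominally-100%-utilised engine'."""
    return linux_pct * engines

def to_image_pct(perfkit_pct: float, engines: int) -> float:
    """The inverse conversion."""
    return perfkit_pct / engines

# The guest in the post: sar reports ~15% with 2 IFLs defined.
print(to_single_engine_pct(15.0, 2))  # ~30: PerfKit's view of the same usage
```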
Re: recover root password
RPN01 writes: > To be completely compliant, everything done by / with root > will need to be logged, showing what was done, and by whom. Can you do that > now, with two or more people logging into root? Can you do it with even one > person logging into root? Not on any distribution I know today. Quick plug: I'll be covering Linux native tools for auditing (auditd/auditctl), accounting (acct/sa) and other things beginning with "A"[1] in my technical session at the z Tech Conference in Dresden next month. There are trade-offs involved in enabling such things but if you really want to audit everything root does, you can. --Malcolm [1] ACLs and Activity reporting. -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: how can I mount /tmp as a tmpfs in SLES9?
Rob van der Heij writes: > On Tue, Mar 18, 2008 at 5:18 AM, Mark Post <[EMAIL PROTECTED]> wrote: > The "normal" usage of Linux in /tmp is pretty limited, so I don't > think I'd be scared about a few MBs there. But since those files > probably remain in page cache while you need them, you do not win > anything there. Others have discussed a lot of the aspects of /tmp configuration in this thread but I'll just point out that there is a much bigger win with tmpfs beyond the "data in page cache" that would apply even with "/tmp on a normal filesystem on a fast block device". Linux internally models the whole filesystem hierarchy (directories, sub-directories, files, etc) with its VFS layer and caches it in the structures in its "dcache". A normal filesystem has to take those internal structures and record them into blocks (and read them from blocks) so that the block layer can do the I/O. Directory contents have to be squished into a format which can be used as metadata blocks and recorded on the block device, as does inode data like "last access time" and so on. tmpfs doesn't have to do any of that at all since it's just a thin layer around the VFS. That reduces the path length for filesystem operations from file op -> VFS -> fs -> block layer -> device driver (e.g. DIAG) to file op -> VFS+mm That's a particularly big win for metadata-intensive operations. A surprising number of applications do indeed use /tmp (often creating and immediately unlinking the file so you may not see them around much) and I think there are meta-data heavy ones too although my experience with those is out of date. Such apps do things like extracting tar files to /tmp and then walk/read through the results. I've just tried out an example: make a script which untars a tar file with ~4000 files of about 10KB each (I used /etc) into a directory and then does rm -rf on it. 
The only system I can do the test on at the moment is dreadful for proper measurement (tiny SLES10SP1 under z/VM 4.4 as a capped guest under z/VM 5.x hence no DIAG either so there's dasd_fba driver overhead there that wouldn't be present). Running a couple of those test scripts in parallel, I get that tmpfs is twice as fast as ext2 (mounted noatime) on VDISK (FBA not DIAG though) with the CPU pegged at its cap of ~30% but that system setup is so unusual it's probably not very useful. An internal throughput test of a similar nature on a proper system would be interesting. (Trying a different mail setup; let's see if it works.) --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
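The ad hoc test described above (untar a tree of small files, walk it, rm -rf it) can be sketched like this; file counts, sizes and paths are made-up stand-ins, and to compare tmpfs against a disk-backed filesystem you would point the working directory at each mount in turn:

```python
import os, shutil, tarfile, tempfile, time

def make_source(root, nfiles=200, size=10 * 1024):
    """Create a tree of small files standing in for the post's /etc copy."""
    src = os.path.join(root, "src")
    os.makedirs(src)
    for i in range(nfiles):
        with open(os.path.join(src, f"file{i:04d}"), "wb") as f:
            f.write(b"x" * size)
    return src

def untar_walk_rm(tarball, workdir):
    """Time the metadata-heavy cycle the post measures: extract, stat, remove."""
    t0 = time.monotonic()
    os.makedirs(workdir, exist_ok=True)
    with tarfile.open(tarball) as tf:
        tf.extractall(workdir)
    for dirpath, _dirs, files in os.walk(workdir):
        for name in files:
            os.stat(os.path.join(dirpath, name))
    shutil.rmtree(workdir)
    return time.monotonic() - t0

# Point the temporary directory at tmpfs (e.g. /dev/shm) or a disk-backed
# filesystem to compare the two, as in the post's experiment.
with tempfile.TemporaryDirectory() as tmp:
    src = make_source(tmp)
    tarball = os.path.join(tmp, "tree.tar")
    with tarfile.open(tarball, "w") as tf:
        tf.add(src, arcname="tree")
    print("cycle took %.3fs" % untar_walk_rm(tarball, os.path.join(tmp, "work")))
```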
Re: Dynamic configuration changes
Mark Post writes: > If an additional CPU gets DEFINEd (via CP), or configured online (in an > LPAR), you then need to bring it online to the Linux system by echoing a "1" > into /sys/devices/system/cpu/cpu?/online, where ? equals the number of the > CPU, (starting with 0). Now that Mark's explained nicely what to do, I'd like to follow up with a big fat warning about what *not* to do: do not use "CP DETACH CPU n" to take a virtual CPU out of your guest's configuration once present. A "DETACH CPU" will immediately trigger the effects of a "CP SYSTEM CLEAR" and your guest will be dead in the water with all its memory zeroed. Once a CPU has been DEFINEd into the guest's configuration, only use Linux sysfs to vary it online or offline and don't DETACH it. Another minor warning: I seem to remember one version of SLES9 having problems with hot CPU support. I'm not sure which service pack level; it might even have been before SP1. The symptom was that you could vary offline (and then online) CPUs that existed and were online when the guest was IPLed but if you did a dynamic "CP DEFINE CPU" then a following "echo 1 > .../online" would fail to add it and would leave an uninterruptible process around or something similar. Since the additional_cpus and possible_cpus boot parameters came along, I think things now all work as they should. --Malcolm -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
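For inspection (no root needed), the sysfs layout Mark described can be read back like this; a sketch that falls back to a plain CPU count on machines without the sysfs files. Actually varying a CPU means writing "1" or "0" to /sys/devices/system/cpu/cpuN/online as root, which this deliberately does not do:

```python
import os

def cpu_online_map(sysfs="/sys/devices/system/cpu"):
    """Map cpuN -> online state as seen in sysfs (read-only)."""
    cpus = {}
    if os.path.isdir(sysfs):
        for entry in sorted(os.listdir(sysfs)):
            if entry.startswith("cpu") and entry[3:].isdigit():
                online_path = os.path.join(sysfs, entry, "online")
                if os.path.exists(online_path):
                    with open(online_path) as f:
                        cpus[entry] = f.read().strip() == "1"
                else:
                    cpus[entry] = True  # cpu0 often has no 'online' file; it is up
    if not cpus:  # non-Linux or /sys not mounted: fall back to a plain count
        cpus = {f"cpu{i}": True for i in range(os.cpu_count() or 1)}
    return cpus

print(cpu_online_map())
```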
Re: Strange file type cannot be moved with scp or ftp
Rick Troth writes: > I've never understood named sockets or how they differ > from named pipes. Prefix char is "s". They are real sockets, living in the Unix domain rather than (for example) the internet domain. So just as the endpoints of a TCP/IP connection are sockets referenced by an IP address and a port number, so a stream connection in the Unix domain is between sockets referenced by a filesystem object with a path name like any other filesystem object. A TCP/IP application server creates a socket, binds it to a chosen [server IP address, portnumber] pair and listens for connections. A client creates a socket, binds it to its own [client IP address, random-portnumber] (often automatically) and then connect()s to the server's well known [IP address, portnumber]. In the Unix domain, a server program creates a socket, binds it to a well-known pathname (e.g. /var/run/myapp) which creates a visible object there (the thing that ls shows with the "s"). A client creates a socket, binds to, say, $HOME/.myappclient/foo, and connects to /var/run/myapp. Like a TCP/IP application (and unlike named pipes) you can have multiple concurrent connections from clients to the same server (the server program sits in accept() which hands out a new file descriptor for each new connection that arrives). Unix domain socket connections, similarly to TCP/IP, can choose between a stream-based connection (bidirectional stream of bytes) and a datagram one. Much more interestingly, though, Unix domain sockets support functionality that TCP/IP can't. Since both endpoints are in the same Unix system, you can pass Unixy information across them (with the help of the kernel): I'll mention two here. You can pass file descriptors across them--not just the number. A server can open a file and pass the file descriptor through the socket where it magically appears at the other end as an open file descriptor in the receiving process. Or a client can pass an open file descriptor to a server.
This is different from just passing a name and getting the other side to open it: think about what happens when the processes on both sides have different security credentials (uid, gid etc); what happens when you pass a descriptor to an opened file with its seek pointer in the middle; what happens with an opened socket or an opened file that was then unlinked/removed before passing across the socket. (Trivia: any implementation of this requires a garbage collection algorithm in the kernel. As far as I know, this is the only part of a Unix-like kernel which absolutely requires a garbage collection algorithm.) You can also (on Linux systems--other Unices may have different APIs or be unable to do this) pass your identity credentials to the other side. This means that the server side program can receive the pid, uid and gid of the client side of each connection in an unforgeable way. It provides a good way for servers to mediate client security in the case that you don't want to mess with passwords and such like. A server daemon can, for example, restrict incoming admin connections to only those coming from a given uid. In the case of Linux, you can also set permissions on the server socket (a client will be unable to connect if it doesn't have read/write permission to that socket object) but most other Unices ignore socket permissions. For more details, "man unix" should pick up the Section 7 man page for Unix domain sockets. Hmm, that turned out to be a longer email than I intended. --Malcolm Malcolm Beattie <[EMAIL PROTECTED]> -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
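Both Linux-specific features can be sketched in a few lines of Python, with a socketpair standing in for a pathname-bound client/server pair; the file contents and offsets are invented for illustration:

```python
import array, os, socket, struct, tempfile

# A connected pair of Unix-domain stream sockets in one process, standing in
# for the client and server ends of a pathname-bound socket.
parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# 1. Pass an open file descriptor across the socket (SCM_RIGHTS).
with tempfile.TemporaryFile() as f:
    f.write(b"hello from the other side")
    f.seek(6)                       # the seek pointer travels with the descriptor
    fds = array.array("i", [f.fileno()])
    parent.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])

    msg, ancdata, flags, addr = child.recvmsg(1, socket.CMSG_SPACE(fds.itemsize))
    received = array.array("i")
    data = ancdata[0][2]
    received.frombytes(data[:len(data) - (len(data) % received.itemsize)])
    with os.fdopen(received[0], "rb") as dup:
        payload = dup.read()        # picks up at offset 6, not 0
print(payload)

# 2. Read the peer's credentials (SO_PEERCRED, Linux-only; struct ucred is
# three native ints: pid, uid, gid).
creds = child.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                         struct.calcsize("3i"))
pid, uid, gid = struct.unpack("3i", creds)
print(pid, uid, gid)                # unforgeable: filled in by the kernel

parent.close()
child.close()
```

Since both ends here live in one process, the credentials come back as our own pid/uid/gid; between two real processes the server would see the client's.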
Re: cmsfs and cmsfs.o
Mark Perry writes: > is there a way to use the bus/device address - such as 0.0.0190 rather > than having to find out the /dev/dasdx? The udev configuration on SLES9 and SLES10 gives you /dev/disk/by-path/ccw-0.0.0190 and, if you're careful to avoid duplicate volsers (as seen by the Linux guest) then you can use /dev/disk/by-id/VOLSER on SLES9 or /dev/disk/by-id/ccw-VOLSER on SLES10. Or tweak your udev configuration to follow your own naming conventions. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Moodle
Stephen Frazier writes: > Is anyone running Moodle? > > Moodle is a course management system (CMS) - a free, Open Source software > package designed using > sound pedagogical principles, to help educators create effective online > learning communities. Yes, I use it on Zeus (the hub supporting the European universities in the System z University Program for Europe/Academic Initiative). I started using it last year and am rather impressed with it. (Still on leave; still posting from home.) --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: How can we print to a VM virtual printer ? SUSE 9 and VM 00E ?
David Boyes writes: > > Would like to send/route print from z/Linux guest to the guests > virtual > > printer 00E. > > No RSCS, not VTAM. Sure we could FTP but processing the spooled print > > output > > in CMS REXX is so much simpler. > > Any suggestions would be appreciated. > > There is no supported unit-record driver for Linux (Malcolm Beattie > wrote one long ago, but AFAIK it hasn't been updated for 2.6 kernels, so > really isn't much help any more). I did a 2.6 version which was taken up by Boeblingen late last year and is, perhaps, headed for mainline--I haven't heard recently. I don't know the timescale or how many changes it'll go through first. I'm on leave at the mo, so won't be checking right now. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: scp question.
Alan Altmark writes: > On Tuesday, 02/20/2007 at 10:13 CST, "McKown, John" > <[EMAIL PROTECTED]> wrote: > > Is there any definative documentation, such as an RFC, which states how > > scp handles the files that it transfers? In particular, I have the > > Cygwin scp on my Windows XP system. I am running IBM's "Ported Tools" > > version of OpenSSL and SSHD server on z/OS 1.6. When I do a simple: > > > > scp file [EMAIL PROTECTED]:file > > > > The contents of the file on z/OS has automagically been converted from > > ASCII to EBCDIC. This just seems __wrong__ to me. > > Start with RFC 4251, the Secure Shell Protocol Architecture. It will lead > you to other RFCs. ssh data transfer has no concept of "text" or > "binary". It just moves bytes around. Here's some more rather surprising behaviour, using just ssh:

thinkpad% echo -n ABC | ssh zos 'od -t x1'
0000000 c1 c2 c3
0000003

So the three bytes (0x41, 0x42, 0x43) sent by the ssh client end up being read on stdin by od as 0xc1, 0xc2, 0xc3, i.e. converted from ASCII to EBCDIC. There's no scp there, just a stream of bytes to move around. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM Europe System z -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: a history question
Richard Troth writes: > Like Rd said, no timestamps in BASH history. (Other shells have history > too, and still no timestamps.) Er, both bash and tcsh can timestamp history. tcsh does so by default and bash does so if you set HISTTIMEFORMAT. man bash says:

    HISTTIMEFORMAT
        If this variable is set and not null, its value is used as a
        format string for strftime(3) to print the time stamp associated
        with each history entry displayed by the history builtin. If this
        variable is set, time stamps are written to the history file so
        they may be preserved across shell sessions.

Often a bit of care needs to be taken to consider what behaviour is wanted for saving history between sessions, multiple shells, login v. non-login shells etc. The man pages for tcsh and bash go into the details. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: FCP
McKown, John writes: > No. Only z/VM and Linux understand FCP connected DASD. z/OS and z/VSE cannot > access it. Actually, z/VSE as of 3.1 can indeed access SCSI disks via FCP. There's a chapter in the z/VSE Planning guide (chapter 9: "Using SCSI Disks With Your z/VSE System ") which is a good place to start reading about it. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: possible to boot zLinux (SLES8) in different user mode?
Peter 1 Oberparleiter writes: > [EMAIL PROTECTED] wrote: > > is it possible to boot zLinux (SLES8) in different user mode (2)? We're > > having the problem that a process is hanging during startup, > unfortunately > > before the sshd daemon gets started... > > Newer distributions support a boot menu which can be activated at IPL time > to specifiy an additional command line parameter. Unfortunately, as far as > I know, this feature is not available on SLES8 systems. If you're running under VM and don't mind delving into a few CP commands then you can use the following quick and dirty way. I saw it on an IBM internal forum recently. I've added some extra explanation, shown some example output and mentioned the case where the parmline is EBCDIC instead of ASCII. I've tested it (the ASCII one anyway) and it works for me but no guarantees. Create a trace trap for when Linux starts running:

CP TRACE I R 1

Then IPL from your normal boot device:

IPL vdev

You'll almost immediately see

Tracing active at IPL -> 0001 BASR 0DD0CC 0

Display the current kernel command line (the first X'100' bytes) as hex and ASCII:

D TX10480.100

If your parmline is in ASCII (see below if not), it'll show something like

R00010480 64617364 3D323830 302D3238 30462072 *dasd=2800-280F r*
R00010490 6F6F743D 2F646576 2F646173 64613120 *oot=/dev/dasda1 *
R000104A0 766D706F 66663D4C 4F474F46 4620766D *vmpoff=LOGOFF vm*
R000104B0 68616C74 3D4C4F47 4F46460A 00000000 *halt=LOGOFF.....*
R000104C0 00000000 00000000 00000000 00000000 *................*
R000104D0 to 0001057F suppressed line(s) same as above

The parmline is terminated with a newline character (ASCII 0x0A above). You can append a space and S to the parmline (which tells Linux to boot into single user mode) by overwriting the newline with space+S+newline. 
Do that as follows (still assuming your parmline is ASCII), replacing the address with wherever your trailing newline character lives:

STORE S104BB 20530A

Note that the leading S before the address 104BB (no intervening space) says that the following data (20530A) is hex for a byte string (in ASCII, 0x20 = space, 0x53 = S, 0x0A = newline). Repeat the display to check you got it right:

D TX10480.100
R00010480 64617364 3D323830 302D3238 30462072 *dasd=2800-280F r*
R00010490 6F6F743D 2F646576 2F646173 64613120 *oot=/dev/dasda1 *
R000104A0 766D706F 66663D4C 4F474F46 4620766D *vmpoff=LOGOFF vm*
R000104B0 68616C74 3D4C4F47 4F464620 530A0000 *halt=LOGOFF S...*
R000104C0 00000000 00000000 00000000 00000000 *................*
R000104D0 to 0001057F suppressed line(s) same as above

Now let the guest continue the boot process:

B

and you'll see the usual boot messages as it comes up in single user mode. You can then unregister the trace with

#CP TRACE END ALL

As mentioned above, Linux can also cope with EBCDIC kernel command lines so that you can easily create the parmline from CMS for example. I think most people who've completed a full install will be using zipl to write the text in ASCII but, for completeness, if your parmline data is in EBCDIC, you'll need to use "D T10480.100" instead of "D TX10480.100" (i.e. with no "X") in order to interpret the data being displayed correctly and you'll need to use EBCDIC for the hex codes to append to the parmline. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: VM Shutdown
Post, Mark K writes: > Depending on what version of z/VM and Linux you're running, updating > /etc/inittab to have something like this: > # What to do at the "Three Finger Salute". > ca::ctrlaltdel:/sbin/shutdown -t5 -r now Change that "-r" (meaning "reboot") into "-h" (meaning "halt") so that the SIGNAL SHUTDOWN magic described elsewhere in this thread behaves as expected. For cleanliness, it's also good to include "vmpoff=LOGOFF" into the kernel parmline so that when the guest finishes shutting down and does a "halt -p" (a "power-off" halt), the kernel will do a "CP LOGOFF". This logs the guest off; CP then knows that the guest has finished its signal shutdown processing cleanly and will log a nice message to say so. [The vmpoff assumes that your distribution uses a "halt -p" in its shutdown scripts, as SLES does. If your distribution ends up doing a "halt" without the -p then the relevant parmline addition would be "vmhalt=LOGOFF". I usually add both...belt and braces.] --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
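Pulling the pieces of this thread together, the fragments would look something like the following; device setup, file locations and the rest of the parameter line vary by distribution, so treat this as a sketch:

```
# /etc/inittab: halt (not reboot) on the three-finger salute,
# so SIGNAL SHUTDOWN results in a clean shutdown:
ca::ctrlaltdel:/sbin/shutdown -t5 -h now

# kernel parameter line (e.g. the "parameters=" entry in /etc/zipl.conf):
# log the guest off whether the final halt is power-off or plain
... vmpoff=LOGOFF vmhalt=LOGOFF
```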
Re: tar up directory structure but not contents
John Campbell writes: > I would probably write off tar and, instead, do: > > cd(somewhere) > find . -xdev -type d -print | cpio -ocv >tree.cpio And combining this with the NUL-separation to ensure whitespace in filenames is handled correctly, this would become find . -xdev -type d -print0 | cpio -0ocv >tree.cpio And if you use "-H ustar" instead of the -c option (which corresponds to "-H newc") then cpio will write out a tar archive which tar can extract for you: find . -xdev -type d -print0 | cpio -0ov -H ustar > tree.tar --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
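As a cross-check of what that pipeline produces, the same directories-only "skeleton" archive can be built with Python's tarfile where cpio isn't available; a sketch, not Malcolm's method:

```python
import os, tarfile, tempfile

def archive_tree_skeleton(root: str, out: str) -> list[str]:
    """Record only the directory structure under root, as the find|cpio
    pipeline in the post does, and return the member names written."""
    with tarfile.open(out, "w") as tf:
        for dirpath, _dirs, _files in os.walk(root):
            rel = os.path.relpath(dirpath, root)
            tf.add(dirpath, arcname=rel, recursive=False)  # dir entry only
    with tarfile.open(out) as tf:
        return tf.getnames()

# Demo on a throwaway tree: directories appear, the file does not.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a/b"))
    open(os.path.join(tmp, "a/file.txt"), "w").close()   # should NOT appear
    names = archive_tree_skeleton(tmp, os.path.join(tmp, "skel.tar"))
    print(sorted(names))
```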
Re: Deleting files in a directory more than 5 days old
Post, Mark K writes: > Finally, there is a subtle difference between doing an -exec rm and piping > the output of find to an xargs rm command. The difference there is that the > find command will invoke the rm command once for each file that it finds > that matches your criteria. The xargs version will "batch" them up to the > maximum line length that is allowed on your system, and invoke rm once for > each maximum number of arguments, thus reducing the amount of system > overhead required for process creation and destruction, etc. I tend to use > that a lot these days. It really does speed things up when there are a lot > of objects to be handled. However, if you use xargs be extremely careful about the possibility of whitespace in filenames. If you have a file called "old price.list" and use a pipe such as

find -print | xargs rm

(or, equivalently, omit the "-print" since it's the default action) then xargs will parse its input stream

foo
bar
old price.list
baz

for arguments to rm by separating at whitespace and end up attempting to remove file "old" (which probably doesn't exist) and file "price.list" (which may be your new file which you definitely don't want removed). It's much safer to use

find -print0 | xargs -0 rm

(those are zeroes) which are GNU extensions that force find to print the filenames terminated with NUL (a.k.a. \0 a.k.a. ASCII code 0) and force xargs to split its input stream at the \0 character (which cannot appear in filenames) and thus safely remove exactly the right files. It also therefore handles filenames containing \n correctly which, although not a common mistake, can form part of a malicious attack against some programs which mis-parse such things. Talking of attacks, there were examples elsewhere in the thread of using find to traverse a directory such as /tmp to clean things up. 
I should warn people that there are race conditions that are easy to miss when doing such recursive operations on filesystems which are writable by potential attackers. These involve the order in which directories are read, lists built up, directories and symbolic links traversed and the resulting actions executed. There have been known exploits in the past resulting from such automation. An example includes versions of some of the automated /tmp cleanup scripts run from cron in various older distributions. Any of you whose threat models require you to give attention to possible attacks from local users should be careful how such automated scripts are coded. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
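The whitespace hazard above can be demonstrated without touching real files by splitting the same name list the way xargs and xargs -0 each see it (a simulation of the parsing, not the real tools):

```python
# Filenames as produced by plain `find -print` (newline-separated, then
# whitespace-split by xargs) vs `find -print0 | xargs -0` (NUL-separated).
names = ["foo", "bar", "old price.list", "baz"]

# What plain xargs sees: one whitespace-delimited stream.
stream = "\n".join(names)
naive_args = stream.split()          # splits "old price.list" in two!

# What xargs -0 sees: records split only at \0, which can't occur in a name.
stream0 = "\0".join(names)
safe_args = stream0.split("\0")

print(naive_args)  # ['foo', 'bar', 'old', 'price.list', 'baz']
print(safe_args)   # ['foo', 'bar', 'old price.list', 'baz']
```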
Re: web server proof of concept - code set translation problem
SOLLENBERGER, JUSTIN writes: > Most of the binaries are .pdf, .ppt, .gif, .jpg, .etc that are linked to be the web > pages. Shouldn't be anything that would cause a problem. What about looking at this from the other direction? Do a recursive "wget -r" from the Linux system to pull the data directly from the original web server and let the web server decide how to serve the files up to a web client wanting ASCII. Provided all the data is static and doesn't have any server-side includes, hierarchy oddities or complex permissions, you'll have a starting point with the data in the right format. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Performance with Multiple CPUs
Adam Thornton writes: > On Tue, 2004-06-22 at 10:58, Walter Wojcik wrote: > > Does anyone have experience running zLinux in an LPAR which has multiple > > CPUs defined? We have experienced negative performance characteristics > > with Intel Linux versions (RedHat, SUSSE) when the Intel machine had 4 > > processors. We were wondering if the same performance degradations appear > > on the mainframe. > > Without addressing your actual question at all: > > If you need 4 engines in an LPAR, I would question whether the zSeries > is the right tool for the job. You're clearly doing something pretty > computationally-intensive at that point, and machine cycles on other > hardware are generally a whole lot cheaper. I'd take a look at the app > design and see if there isn't some way to offload the CPU-intensive > stuff to a different box, and keep the I/O-intensive stuff on the > zSeries. Both the original question and this response are rather ambiguous. I suspect that when Adam refers to "4 engines in an LPAR", he is thinking of Linux running by itself (no z/VM) in that LPAR and that he assumes the original poster means the same. When I read the original question, I assumed it was just a question about scalability and that the LPAR would be using z/VM to run multiple Linux guests, some of which may, at times, want to peak at using 4 CPUs. Provided the sizing is done properly and any such peaks "separate out" nicely, I would imagine zSeries would do a very good job given the excellent "raw" scaling capability of zSeries (as others have said) and the wonders of z/VM to provide such "time-shift" sharing. If the original poster really did mean a 4-way LPAR with Linux running "native" in it, then my response is irrelevant of course (though I can still think of some interesting configurations: some CPU-intensive applications may simply need all that CPU in order to drive even larger amounts of I/O that the hardware is good at). 
--Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Adoption of UML Copy-On-Write
Matt Zimmerman writes: > On Fri, Jun 18, 2004 at 04:42:43PM -0700, Brandon Darbro wrote: > > > Huh? EVMS or LVM2 has a method of adding a writable layer to read only dasd? > > EVMS supports writable snapshots, using copy-on-write from an EVMS volume. > I haven't tried it with read-only DASD, but in theory it should be possible > for it to be used this way. Unfortunately, LVM2 snapshots (I haven't looked at EVMS to see how they do it) are the "wrong way around" for this use. When you write to a volume that you are snapshotting, the original data from the block is written to the snapshot volume and then the new block data is written back to the underlying original volume. The advantage is that the original volume always contains up-to-date data but the disadvantage is that you can't have the original volume read-only. A further benefit of having a "new block data gets written to new volume" device mapper would be that you could take raw copies (e.g. with DDR) of un-quiesced (i.e. mounted live and somewhat active) Linux journalled filesystems and be confident that mounting the resulting copy would, after replaying the journal (and hence provided the copy itself lives on a writable volume), give you a filesystem with consistent metadata. The result corresponds to a point-in-time shutdown of the filesystem, which journalling is designed to cope with. Atomic snapshotting at device level (e.g. ESS FlashCopy) can give you similar functionality but is less widely applicable or convenient. Writing such a device mapper shouldn't be too hard given the nice new dm infrastructure (which I haven't looked at in detail and really must sometime): reads/writes in non-snapshot mode go straight through to the original volume. In snapshot mode, a write to block m causes the new block data to be written to the "next free" block, n, on the snapshot device along with an entry in a map at the front of the snapshot device mapping m -> n. 
Reads look up the block in the map on the snapshot device and return the redirected block, or else read the original block if unmapped. You then need a feature to merge back the new data into the original volume when desired. And you really, really don't want the snapshot device to fill up, because the failure mode is much, much worse than letting an archive-style LVM snapshot fill up. I'd love to hear if EVMS has a mapping facility in that direction already. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
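The write-redirection direction described above can be modelled with plain files. This is a toy sketch only (file names and contents are made up, and it is nothing like real device-mapper code): writes land on the snapshot store with a map entry, while the "original volume" is never touched.

```shell
# "Original volume" of four one-byte blocks, never modified below.
printf 'ABCD' > orig.img
: > snap.img    # empty snapshot store
: > snap.map    # records "m n" pairs: original block m now lives at snapshot block n
# Write new data X to block 2: append to the snapshot, record 2 -> 0.
printf 'X' >> snap.img
echo "2 0" >> snap.map
# Read block 2: the map redirects it to the snapshot copy...
n=$(awk '$1 == 2 { print $2 }' snap.map)
dd if=snap.img bs=1 skip="$n" count=1 2>/dev/null; echo
# ...while an unmapped block (0 here) still comes straight from the original.
dd if=orig.img bs=1 skip=0 count=1 2>/dev/null; echo
```

The two reads print X and A respectively; merging back would mean walking snap.map and copying each redirected block into orig.img.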
Re: Linux Partition - DR readiness
Adam Thornton writes: > On Wed, 2004-06-09 at 20:44, Ranga Nathan wrote: > > Thanks Adam I have given our zipl.conf below. From what you say, we > > should be OK. > > Well, as long as 230A is the *first* DASD that's detected, you should > be. If something else comes up as /dev/dasda you have problems. The line in zipl.conf was parameters="dasd=2300-230F root=/dev/dasda1" which means that Linux will allocate a "slot" for each abstract device number from 2300 to 230F regardless of whether each is available, online or whatever. So /dev/dasda will always refer to device 2300, /dev/dasdb will always refer to 2301 and device 230A would always be /dev/dasdk. That means that it is not going to work if the root disk suddenly becomes 230A. The best solution, as others have suggested, is probably to arrange for the device numbers to be the same at both sites. In the absence of that, you can remove the dependency on device numbers for all non-root filesystems either by using mount-by-filesystem-label (for ext2/ext3, using e2label and "LABEL=foo" in /etc/fstab instead of the device name) or LVM (which pools together all PVs--physical volumes--and sorts out the logical volumes itself). However, that doesn't help for the root filesystem, which needs to be coded explicitly in /etc/zipl.conf. In the absence of a boot-time choice of image and arguments, the only way would be to create a little initrd (with mkinitrd) on which you can put a little script or program to query where you're running and choose your own root filesystem early in the boot procedure. I still think it would be easier all around to get the device numbers to match though [subliminal message: VM, VM, VM]. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
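For the non-root filesystems, the label approach sketched above looks like this (device name and label are placeholders invented for illustration):

```
# Label the ext2/ext3 filesystem once, from a running system:
#   e2label /dev/dasdb1 webdata
# then refer to the label instead of the device name in /etc/fstab:
LABEL=webdata   /srv/web   ext3   defaults   0 0
```

The mount then works whichever device number the volume shows up under; as noted above, the root filesystem still needs explicit handling in zipl.conf.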
Re: IBM 3900/4000 printers in linux
David Boyes writes: > If you're not running Linux under VM, you'll need to modify the UR > driver that Mr. Beattie wrote to emit the right CCWs for printer > devices, then define the 3900 printer to CUPS as spool:/dev/printer > (or whatever you tell the driver to do). If anyone is missing functionality they need from the ur driver, please let me know and I can probably add it without too much difficulty. I've nearly finished porting it to the 2.6 kernel (well, it compiles, so it's all over bar the shouting :-) and the new driver model means I can add features fairly cleanly. For printers, I'd guess the best thing would be an attribute like echo 1 > /sys/bus/ccw/drivers/ur/0.0.001E/carriage-control where the first character of each line is taken to be the CCW command for that line (i.e. what "traditional" printer data already includes as "carriage control"). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: VDISK for /tmp
Rob van der Heij writes: > But this does not mean it is always wise to do. I don't know enough > about Linux to tell whether it is important to have high bandwidth to > /tmp. More important is low latency. Lots of creation, small reads and writes, and unlinks. Compilation with gcc, for example, uses /tmp to store its partial assembly and object files when compiling (unless you use the -pipe option). Since files in /tmp don't need to survive across a reboot (assuming either sanity, compliance with LSB or both), having a filesystem which doesn't even try to dribble them out to disk can be convenient. That is the reason for the existence of the "tmpfs" filesystem. If you do mount -t tmpfs none /tmp then you get a filesystem which exists only in page cache. You can set the size limit for the filesystem at mount time (see the man page for mount) or else it defaults to half of (what Linux thinks is) main memory. When compared with a "normal" filesystem backed by VDISK or by a DCSS, it'll produce a different mixture of pressure and behaviour but it's not clear under what circumstances it may provide a win. With tmpfs all the /tmp pages would be mixed with everything else but at least would be backed by a nice fast paging hierarchy (one hopes). With a normal filesystem on VDISK the /tmp activity would all be focussed on one memory area but would have a longer path length (through the block layer and into CP for VDISK, or just the block layer and some page faults for DCSS). I can't remember off the top of my head whether ext2 will allocate blocks just released by an unlinked file or whether it'll allocate fresh blocks (you mention this point elsewhere but I've snipped it now). If it reuses just-released blocks (or can be persuaded to do so) then the memory footprint on the VDISK would be much friendlier, and the usual tricks like mounting noatime and nodiratime would help avoid flurries of metadata writes that don't need to hit disk, er, backing memory. 
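An fstab entry for the tmpfs approach, with an explicit size cap rather than the half-of-memory default (the 256m figure is an arbitrary example, not a recommendation):

```
# /etc/fstab: /tmp lives in page cache only, capped at 256MB
none   /tmp   tmpfs   size=256m   0 0
```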
--Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Can't locate module char-major-10-224
Peter E. Abresch Jr. - at Pepco writes: > I am running SuSE Linix SLES8 with the latest patches applied on a native > IBM 9672-R26 LPAR. I keep receiving the following message on my console > and logs: > > Mar 12 07:30:00 mainpepl modprobe: modprobe: Can't locate module > char-major-10-224 > > > What is causing this and how can I correct this problem or eliminate the > message. Thanks. Something is trying to open a character device with major number 10 and minor number 224. If you "cat /proc/devices" you'll see that major 10 belongs to the "misc" device driver (which is a way for simple device drivers to ask for a single character device easily). According to LANANA (www.lanana.org), minor 224 is assigned to some TCPA chip which is unlikely to be what you're using, so something else has usurped that number. Assuming that the device node for it has been created in /dev, do "ls -lR /dev | grep '10,'" and look for the name of the device node associated with "10, 224" to see if it reminds you of what's been installed. Whatever software it is, it'll be expecting you to have put a line alias char-major-10-224 foo in /etc/modules.conf so that whenever something tries to open the device, the kernel will automatically do "modprobe foo" instead of "modprobe char-major-10-224". When the device driver is loaded, it'll register via the misc device API and then a "cat /proc/misc" will show the association between its minor number and name. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
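Since a plain grep for "10," can also match dates or names, matching the major/minor columns explicitly may be safer. A sketch (the sample ls -l line below is fabricated purely to demonstrate the match):

```shell
# For device nodes, ls -l prints "major, minor" where the size normally goes,
# so match field 5 against "10," and field 6 against 224.
find_misc_224() {
  awk '$5 == "10," && $6 == 224'
}
# Normally fed from: ls -lR /dev | find_misc_224
# Demonstrate on a fabricated line:
echo "crw-rw----  1 root root  10, 224 Mar 12 07:30 fabricated_dev" | find_misc_224
```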
Re: ipl clear, SINGLE USER mode??
[following up to myself, sorry] Malcolm Beattie writes: > line. Use another automount daemon over a separate mountpoint with a > timeout of only a few seconds. Except the timeout from non-usage of the filesystem will only trigger automount into unmounting it and the underlying minidisk will still be linked. automount doesn't provide a hook there, as far as I know. Darn. Needs thought. It's certainly solvable (e.g. have a daemon sitting around and looking at the directory of real mountpoints and doing the unlinks when necessary while avoiding races) but it's not nice (not sure if autofs creating the mountpoint will trigger a dnotify event which could be waited for) and probably not as easy and clean as I'd hoped. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: ipl clear, SINGLE USER mode??
Rob van der Heij writes: > vmlinx linux007 201 /mnt/tmp > Yes, vmlinx is a little bash script that issues the CP LINK command > through hcp, adds the new device to /proc/dasd/devices and issues the > mount. If you replace the invocation of mount with something that just outputs a map line like "-fstype=auto :/dev/$dasd$partition" then you can turn it into an automount script for autofs. Then you can have vi /mdisk/linux007.201/etc/inittab automount the filesystem for you and auto-unmount it when you haven't touched it for a while. For an encore, have a similar script which detects if anyone else has linked to the minidisk and waits until it's free before outputting the line. Use another automount daemon over a separate mountpoint with a timeout of only a few seconds. Then you have a somewhat basic shared filesystem that at least can be used for letting things like cp /mdiskshare/lxconfig.300/someconfig /etc/someconfig report_summary > /mdiskshare/lxconfig.300/`hostname`.`date +%j` be automated. Yes, it's fragile and any guest that holds a file open on it or cd's into a directory on it can block other guests indefinitely, but I can still imagine it being useful in some environments. Naming is a bit finicky (best to enforce canonical naming, say lower-case guest name, non-zero-padded lower-case hex, and partition number as :2,:3,:4 with omission defaulting to partition 1, otherwise the automounter's keys and mappings give rise to a few interesting problems). Similar things are doable for /devno/123 and /volser/ABCDEF too (with the latter having further interesting namespace issues with the duplicate volsers that tend to happen on minidisks rather than real volumes). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
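A sketch of such an autofs executable map script for the post above. The key format follows the "guest.devno" naming discussed there, but the dasd name it "resolves" to is made up, and the real CP LINK / /proc/dasd/devices wiring from Rob's vmlinx script is elided:

```shell
# autofs passes the lookup key, e.g. "linux007.201", as the first argument;
# an executable map prints a map entry on stdout instead of mounting anything.
mdisk_map() {
  key=$1
  guest=${key%%.*}      # "linux007" -- which guest to LINK to
  devno=${key#*.}       # "201"      -- which minidisk
  # ...hcp "LINK $guest $devno ..." and bring-the-dasd-online steps would go
  # here; pretend they produced /dev/dasdq, partition 1:
  dasd=dasdq
  partition=1
  echo "-fstype=auto :/dev/$dasd$partition"
}
mdisk_map linux007.201
```

autofs then performs the mount itself from the emitted map entry, which is exactly the substitution the post describes.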
Re: filesystem overhead
Post, Mark K writes: > Since ext3 is advertised as "ext2 with a journal," I would say that the > journal is the only additional overhead you'll see in terms of disk space > usage. Hmm, "sort of". From the point of view of purely disk space usage, that's true, as you say. Further, ext3 is indeed "ext2 with a journal", so you're right there too. The reason for the "sort of" is that what "ext2" means there is a little more ambiguous than you might imagine. During ext2's development history, a number of changes have been made to improve performance. Some of those have been folded into ext3 and some haven't. If you want to make an accurate comparison of ext2 and ext3, you may find in practice that those differences become significant for some workloads. Three of the changes which spring to mind are
* htree indexing (from Daniel Phillips) for faster directory lookups
* the Orlov allocator for choosing where to allocate disk blocks for new files (i.e. when do you put them "near" recently created files to get good locality of reference and when do you put them far away in order to allow room for the files to expand without fragmentation)
* locking (if I recall, the locking requirements for the journalling sometimes mean the kernel has to (or wants to) do the locking in the ext3 filesystem differently from ext2)
What I can't remember is which changes were carried across from ext2 to ext3 and when. There's also the difference that the journal I/Os will affect the I/O scheduling for the ordinary filesystem I/Os themselves unless the journal has been placed somewhere else carefully enough. All in all, I'd suggest people do some thinking, testing and measuring when moving from ext2 to ext3 if the workload is I/O intensive enough that it might make a difference. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
Re: z/VM access to EMC (was: Accessing DASD on a Shark from Linux under z/VM)
Jim Sibley writes: > > > I'm curious. One of the benefits touted, and true, > about > > Linux on zSeries > > > vs. some other platform, is the zSeries' strength > in I/O. > > Is this still true > > > with FCP attached SCSI DASD? Why would the zSeries > drive > > SCSI DASD better > > > than Intel or Sun? > > > John McKown > > > Senior Systems Programmer > > Basically you can attach more dasd space and have more > simultaneous (NOT just concurrent) data transfers > going on at the same time. > > The I/O advantage of the mainframe is that it usually > has more paths (256 channels) to more devices(65,536) > thus giving a lot more parallel I/O, not that any > particular device is more efficient. If you have a lot > threads active, more I/O can be done in parallel that > most intel and other boxes. > > With 256 channels at say 12 MB/sec (shark), the > total aggregate rate of the mainframe would be about 3 > GB/sec. Obviously, that's limited by the 2 GB backend > buss on the TREXX. The general idea is right but the bus limit is wrong: 2GB/sec I/O for an entire box would be very poor. Rather than have zSeries damned with faint praise, allow me to hype up its I/O capabilities a bit more. 2GByte/sec is the speed of a single STI bus and the smallest T-Rex (one book) has 12 STI buses while the largest (four books) has 48 STI buses for a total of 96GByte/sec bandwidth. Channel cards, whether ESCON or FICON, are spread over domains/slots to take advantage of the STI buses available. You can't fill all of that bandwidth with DASD I/O (there's a limit of 120 x FICON 2Gbit/sec ports--60 features on z990--making a nominal 24Gbyte/sec) but it's way more than 2GB. ESCON hits the limit of number of channels way before any hardware bandwidth limit but even so you only have 16 ESCON ports per card. Each STI bus fans out to four slots and, for ordinary I/O, gets multiplexed down to 333MByte/s, 500MByte/sec or 1000MByte/sec as appropriate. 
For ESCON, it uses 333Mbyte/s (which nicely encompasses the 16 x 20MByte/s nominal signalling for an ESCON card) and for FICON, 500MByte/sec (which nicely encompasses the 2 x 200MByte/s nominal for the dual-port FICON Express cards). The buses and features are, IMHO, very well designed to ensure that there are no bottlenecks or caps right through to the backend memory bus adapters (MBAs) of the memory subsystem. For those interested in the details, Chapter 3 of the "z990 Technical Guide" redbook (SG24-6947) from www.redbooks.ibm.com elaborates on this and describes it very well. > Also, the main frame typically has 2 processors > dedicated to driving the devices (SAPs), so less "real > cpu" is used for I/O. In fact, not just the SAPs (which deal with initiating the I/Os). Each channel card is also fairly powerful and has the responsibility of doing much of the I/O work itself. For example, each z900 FICON card has two 333MHz PowerPC processors (cross-checked for reliability) to do the work. Again, for lots of detail, see the "z900 I/O subsystem" paper by Stigliani et al in the z900 edition of the IBM Journal of R&D (Vol 46 No 4/5 Jul/Sept 2002). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
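The back-of-envelope numbers above, spelled out (shell arithmetic only; the figures come from the post, not from measurement):

```shell
# STI aggregate bandwidth: 2 GByte/sec per STI bus.
echo "one book:   $((12 * 2)) GByte/sec"    # 12 STI buses on the smallest T-Rex
echo "four books: $((48 * 2)) GByte/sec"    # 48 STI buses on the largest
# DASD-side cap: 120 FICON 2Gbit/sec ports at ~200 MByte/sec nominal each.
echo "FICON cap:  $((120 * 200 / 1000)) GByte/sec nominal"
```

This prints 24, 96 and 24 GByte/sec respectively, matching the "nominal 24Gbyte/sec" DASD cap against the 96GByte/sec of raw STI bandwidth quoted above.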
Re: Basevol/Guestvol
On Thu, 26 Feb 2004 09:45:53 -0500, Scully, William P <[EMAIL PROTECTED]> wrote: > >I recently gave a presentation at HillGang which describes a simplified > >approach for Basevol/Guestvol in the SuSE operating system. If you > >forward me at "William dot Scully at CA doc COM" your e-mail address, > >I'll fire off to you a copy of an HTML document which describes the > >approach I used. (I believe Mark Post also has a copy and intends to > >put it on the LinuxVM.org site, when he next updates those pages.) Bob writes: > Thanks for the reply. Yes, I saw that presentation material and between > that material and the Redbook I have been able to understand and setup > everything except for where to put the (mount --bind)'s for the guestvol > packs into the rc.d directory structure to have them so that the mounts > are done at the correct time. I'd like the presentation too, please. As for the boot time (and shutdown time) details: tweaking RedHat's scripts and ordering was the main nuisance when I was designing basevol+guestvol. It turned out to be rather easier for SLES7 but I didn't have a chance to do it properly (or for SLES8) since my test VM/Linux system is very tight on disk space and I don't have the time/focus of a residency period to extend things. I wish Al Viro would finish off the unionfs he's been talking on and off about writing for years: we could do plenty of marvellous sharing setups with that. Even a cut-down version would be almost as useful (two layers only, bottom layer only read-only, no merging, no white-outs for unlink(), just mkdir to create an empty directory on top of one below). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: nfs hangs on NetApp NAS device
Adam Thornton writes: > On Wed, 2004-02-25 at 12:42, McKown, John wrote: > > If you do not recommend the "soft" option (at least for R/W), what else is > > possible? If the NFS server "dies" or is unavailable for some reason, does > > that mean that all the client boxes which use it should die as well? > > Yes. > > If you're mounting files you need to have read-write, and the underlying > filesystem goes away, you absolutely do not want to continue operations > with the files you have open. If you do keep going, e.g. with a soft > mount, you're looking at Data Corruption City. To expand on this a little: there are two independent two-way choices for "how do I want the NFS filesystem to behave when it stops behaving like the local filesystem it's pretending to be?". One choice is soft v. hard, the other choice is intr v. nointr. The defaults are hard and nointr. The four combinations have the following properties:
hard,nointr
The default. Makes the filesystem behave (a little more) like a local filesystem in the sense that a read or write of n bytes will wait uninterruptibly until it has fully succeeded or failed[*].
hard,intr
The useful alternative. Weakens the pretence of local filesystem semantics, but only a little. If an interrupt (SIGINT, Ctrl/C, ...) occurs "during" a read(), then it returns with errno EINTR or a short read (not sure if NFS will actually do the latter). This doesn't usually confuse applications since EINTR must be handled anyway in the case where it arrives "just before" the read, and if the application is designed to cope with reading from terminals, pipes or devices then it needs to cope with short reads anyway. An EINTR in the middle of a write() is a bit nastier since you don't know what happened server-side (but then if you cared about exactly what data is on the server you'd either take more care of the NFS server or not use NFS).
soft,nointr (or soft,intr I suppose)
This weakens the pretence of a normal local filesystem even more, at least insofar as people trust "quality of implementation" as well as the letter of the law. If the NFS server times out (either because it's down, or because the network's congested, or because various timeout values have been tweaked) then the read()/write() returns with errno EIO, meaning an I/O error. Now, many applications follow the methodology of "if you can't handle it, don't test for it" and others follow the methodology of "being coded by a lazy git who doesn't even test for errors", in which case your data is toast. Yes, it would also be toast if the local filesystem started giving I/O errors, but such things are normally handled at a different level (shout at whoever implemented the RAID solution and/or the hardware vendor).
Of the choices available, "hard,intr" tends to give much more useful and safe semantics than "soft" but, even so, needs careful thought and effort which could have been prevented by more effort in making the NFS server more reliable. A default "hard" mount will pick up the read/write transparently when the server comes back up again given the statelessness of NFS[*] so it's only "long" outages that matter. --Malcolm [*] Yes, those are lies but are close enough for this explanation. -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
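An /etc/fstab line for the "hard,intr" combination recommended above (server name and paths are placeholders):

```
# hard: retry until the server answers; intr: allow signals to break the wait
nfsserver:/export/home   /home   nfs   hard,intr   0 0
```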
Re: ulimit settings
Kevin Ellsperman writes: > I need to be able to set the ulimit for nofile - the maximum number of open > files. It defaults to 1024 and WebSphere V5 needs this value set > significantly higher. I can change this value from the command line as > root, I cannot set this value in /etc/security/limits.conf. Anything I > specify in limits.conf just gets ignored. This is especially crucial for > us because we do not run WebSphere as root, but as a non-root user. Has > anybody been able to change this value on a permanent basis for a user? The configuration file /etc/security/limits.conf is only used if the method you use to start the user session uses the PAM module pam_limits. In SLES8, for example, the default configurations are such that login and sshd use pam_limits but su doesn't. Look in /etc/pam.d/sshd and /etc/pam.d/login and you'll see that the last line of each is session required pam_limits.so which is what sets the resource limits based on /etc/security/limits.conf. If your WebSphere startup script uses su to get from root to the non-root user (or if it does its own setgroups/setgid/setuid stuff) then nothing will be looking at limits.conf. Another thing to note is that pam_limits will fail to grant the session at all if the attempt to set the chosen limits fails. In particular (as I've just found out by testing), if you put lines in limits.conf which have "foo hard nofile 11000" then you will no longer be able to log in to username foo by ssh because the ssh daemon itself has inherited the default limit of 1024 from its parent shell and so can't increase its child's limit beyond its own. Similarly, if you add the line session required pam_limits.so to /etc/pam.d/su then you will not be able to su to a username which has a limit higher than 1024 for nofile configured in limits.conf. The answer for sshd is to start the daemon off with a higher limit of its own, e.g. 
add lines to /etc/sysconfig/ssh (which in SLES8 anyway gets sourced at sshd startup time):
ulimit -H -n 2
ulimit -S -n 2
to set the process' hard and soft open files limits to 2 before the sshd itself gets execed. For su, you're going to have to set the limits before the su which means it's probably not worth using limits.conf at all: if you have to raise the limits before su'ing then you might as well set them to the right values to start with and not bother using pam_limits and limits.conf. In other words, just edit the startup script for WebSphere (or an /etc/sysconfig file if it's nicely behaved enough to source one) to set the limits higher before it starts up, using ulimit commands as above for bash. Note that the exact syntax is shell-dependent since such commands are necessarily shell builtins (it's no good calling out to a separate program because the rlimits are inherited only by children and so your own shell wouldn't have its own limits changed). For Bourne flavoured shells, ulimit is what you want; for csh flavoured shells you'd use "limit" with a different syntax (not that you'd ever be writing scripts in csh of course, but just fyi for interactive use). The sysctl fs.file-max (equivalently /proc/sys/fs/file-max) is a system-wide limit which you may want to raise too if you think it's in danger of being reached. For SLES8 (at least), it appears to be 9830 by default which is rather more than the per-user value of 1024 that you're hitting first but still may be worth increasing if there are going to be a number of processes all wanting more than a couple of thousand or so open files. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
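The inheritance mechanics can be seen without touching any configuration: lowering a limit needs no privilege, so a subshell demonstrates it safely (the 512 is arbitrary):

```shell
# Lower the soft open-files limit in a subshell and show the child sees it;
# the parent shell's own limit is untouched.
( ulimit -S -n 512; ulimit -S -n )
```

Raising a limit back above the inherited hard limit is exactly what an unprivileged sshd or su cannot do, which is the trap described above.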
Re: Chan Attached Tape Major & Minors Redux
James Tison writes: > Back a couple of summers ago, I recall Sergey Korzhevsky, myself, and maybe > a couple others involved in trying to figure out what the tape majors and > minors were. IIRC, Sergey finally put all the pieces together. I just want > to review them now that I've had a chance to actually channel attach a few > 3490s and run tests on the drivers & devices. > > By the way, I'm running SLES 8.0 without a maintenance agreement, so I > could easily be wrong. There just seems to be no good document where all > this (very simple) stuff is written down. I don't run the devfs, either. It's all documented in the "Device Drivers and Installation Commands" manual (LNUX-1313-02, Chapter 5 "Channel-attached tape device driver") which is available directly as http://www10.software.ibm.com/developerworks/opensource/linux390/docu/lx24jun03dd01.pdf which is the link on the "Linux on zSeries Library" web page at http://www-1.ibm.com/servers/eserver/zseries/os/linux/library/index.html > The tape device major -- whether block or character -- is always 254. Not necessarily: they are dynamically allocated (presumably nobody got around to getting a number allocated from LANANA) which means that the driver will look for the first free number available starting at 254 and going downwards. For example, if you have cpint loaded first then cpint would allocate char major 254 for itself and the tape char device would get major 253 whilst the tape block device would get major 254 (assuming that no other block device had been loaded that had snaffled major 254 first). Rather than guess, look in /proc/devices after the driver is loaded and look for the allocated numbers in there. > The block device minors are always single within the major. For example, > /dev/btibm0 is 254:0, /dev/btibm1 is 254:1, etc. Hmm, TFM says Character device [...] The minor number for the non-rewind device is the tape device number of /proc/tapedevices multiplied with 2. 
The minor number for the rewind device is the non-rewind number +1. Block device [...] The device nodes have the same minor as the matching non-rewinding character device. which would imply that block device minors would be 0,2,4,... > The character device minors come in pairs, and they're sequential within > the device major. The rewindable member of the pair is ODD. The > non-rewindable member of the pair is EVEN. That agrees with the manual. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
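The minor-number arithmetic from the manual is easy to sanity-check. A small sketch (the major numbers and the tape index are illustrative only; read the real majors from /proc/devices and the index from /proc/tapedevices, and the ntibm/rtibm/btibm names follow the thread's naming):

```shell
# Minor numbers for the i-th tape listed in /proc/tapedevices.
# Majors below are examples only; the driver allocates them dynamically
# starting at 254 and counting down, so check /proc/devices.
i=1                            # position of the drive in /proc/tapedevices
char_major=253                 # from the "Character devices" section
block_major=254                # from the "Block devices" section
nonrw_minor=$((i * 2))         # non-rewind minor = tape index * 2
rw_minor=$((nonrw_minor + 1))  # rewind minor = non-rewind minor + 1
echo "ntibm$i -> char  $char_major:$nonrw_minor"
echo "rtibm$i -> char  $char_major:$rw_minor"
echo "btibm$i -> block $block_major:$nonrw_minor"
```

So for the second drive (i=1) the block minor is 2, not 1 — consistent with the 0,2,4,... sequence the manual implies.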
Re: RE : RE : signal shutdown
Monteleone writes: > Have a look please to the response i get when i try to compile ext_int: > > lnxtrs7:/ext_int # gcc ext_int.c -o ext_int [...] > Is there a particularity to compile this module ? Yes, it's a kernel module so it's not the same as an ordinary userland executable. For longer modules, I normally provide nice READMEs and Makefiles but this one was so short I didn't. Sorry. The following is the sort of thing you need: gcc -D__KERNEL__ -I/lib/modules/2.4.19-3suse-SMP/build/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce -DMODULE -c ext_int.c That works on SLES8, which makes the necessary kernel include files available in /lib/modules/2.4.19-3suse-SMP/build/include (for the kernel version I have). If you can't find an appropriate directory in /lib/modules for your kernel version (or it doesn't have a "build" subdirectory) then we'll have to play games installing the kernel source package, in which case let me know what distribution you have (and it may have a kernel-includes package). An older convention for kernel include files was to put them in /usr/include/linux and /usr/include/asm or to use them from a source tree in /usr/src/linux/include but that can lead to hard-to-find problems when you have multiple kernels or source trees installed. Given that this module only uses four particular kernel functions, it's not really sensitive to versioning differences but I don't want to do anything tasteless like send a binary module around. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
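For a kernel other than 2.4.19-3suse-SMP, the include directory can be derived from uname. A hedged sketch (it only echoes the command rather than running it — drop the echo to compile for real — and the flag list is the one above, not something universal):

```shell
KVER=$(uname -r)                          # e.g. 2.4.19-3suse-SMP
INCDIR=/lib/modules/$KVER/build/include   # kernel headers for that version
CFLAGS="-D__KERNEL__ -I$INCDIR -Wall -Wstrict-prototypes -O2 \
  -fomit-frame-pointer -fno-strict-aliasing -D__SMP__ -pipe \
  -fno-strength-reduce -DMODULE"
cmd="gcc $CFLAGS -c ext_int.c"
echo "$cmd"                               # drop the echo to really compile
```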
Re: RE : signal shutdown
Monteleone writes: > This is what i get when i run bootshell > > lnxtrs7 login: /sbin/bootshell: /sbin/bootshell: cannot execute binary > file [...] > - run gcc -c ./bootshell-1.3.cc -o /sbin/bootshell [...] The "-c" option produces an object file, not an executable. Leave out the "-c" option and gcc will also do the link stage and create an executable for you. Another option for consideration may be the ext_int kernel module I wrote which lets you trap the external interrupt number of your choice and have it deliver a signal of your choice to the PID of your choice. I used that when doing the Large Scale Linux Deployment redbook to be able to trigger a remote shutdown of a Linux guest before the SIGNAL SHUTDOWN support was widely available. See section 9.8 of that redbook for details. Using it to trigger a shutdown is nice and simple since you only need to deliver a SIGINT (signal 2) to init (PID 1) and init will then do the ctrlaltdel line in your /etc/inittab (similar to how the SIGNAL SHUTDOWN does it, except that that communicates extra data (timeout info) out of band rather than just being the external interrupt). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
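The delivery mechanism itself is ordinary Unix signalling. Here is a tiny self-contained demo, with a throwaway child shell standing in for init (actually sending SIGINT to PID 1 would, of course, shut the box down):

```shell
# A stand-in "init": it traps SIGINT and runs its shutdown action, just
# as PID 1 runs the ctrlaltdel entry from /etc/inittab when sent SIGINT.
result=$(sh -c 'trap "echo shutdown action runs" INT; kill -INT $$; :')
echo "$result"
# For the real thing (as root):  kill -INT 1
```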
Re: Too Many Open Files
Craig, Roger C [ITS] writes: > We are running Linux under VM on a mainframe. We keep running into this > "Too many open files" problem with one of our WebLogic Servers: > Sep 8, 2003 5:58:25 AM CDT> <000203> > ><000204> > 1,063,013,260 seconds, java.net.SocketException: Too many open files> ><000206> > > > We end up having to bounce the server (or Linux image) when we get this > condition. Has anyone experienced this? Also is there a good way to > display the number of open files? It's just an administrative limit these days (either via /etc/security/limits.conf for sessions initiated via PAM using pam_limits, or via a default of, usually, 1024). Raise the limit in whichever way you like: WebLogic may have a preferred way of doing this depending on how its username starts up a session, or you can use limits.conf (if WebLogic goes via a PAM config that includes pam_limits) or else use ulimit (bash) or limit (tcsh) in your daemon startup script. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
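A quick sketch of both the per-process and the persistent forms (the "weblogic" username in the limits.conf lines is a placeholder, not something from the original post):

```shell
ulimit -n                                    # current soft limit, often 1024
newlim=$(sh -c 'ulimit -n 512; ulimit -n')   # change applies to the child only
echo "child ran with nofile=$newlim"
# To display a given process's open files:  ls /proc/<pid>/fd | wc -l
# Persistent form for PAM sessions, in /etc/security/limits.conf
# ("weblogic" is a placeholder username):
#   weblogic  soft  nofile  4096
#   weblogic  hard  nofile  8192
```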
Re: db2 using tsm
Noll, Ralph writes: > db2 error message using tsm > > anyone seen this > > db2 => backup db police online use tsm > DB21019E An error occurred while accessing the directory "/root". > db2 => /root is typically root's home directory. If you were running db2 as a non-root username then you would not have permission to access /root. This might happen, for example, if you used "su db2user" rather than "su - db2user" from root (which would leave the HOME environment variable set to /root) and db2 then tried to access some per-user configuration file living under $HOME. Type "env" and see which environment variables refer to /root. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
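The su distinction can be seen without touching db2 at all. A minimal simulation (HOME is set explicitly here instead of actually running su):

```shell
# Plain "su user" keeps the caller's environment, so HOME stays /root;
# "su - user" starts a login shell and resets it.
stale=$(HOME=/root sh -c 'echo $HOME')
echo "a plain su would leave HOME=$stale"
# To hunt for the culprit on the real system:  env | grep /root
```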
Re: adding dasd under SUSE Enterprise 8...
Stefan Kopp writes: > > > more minidisks). I only have to reboot Linux and the > > > disks are online. > > > > You don't have to reboot Linux It was answered > > here yesterday, use > > echo "add device range=xxx-yyy" >> /proc/dasd/devices > > Ooops, sorry, you're right. I've always thought I have to reboot when I've > updated the user.direct because the new adresses were not active. Now I've > spend some time with the bookmanager, nice thingy. A "#cp define mdisk" returns > "Invalid option - MDISK", which I've solved with the entry "OPTION DEVMAINT" > for the designated z/VM user. Now I can enter "#cp define mdisk 205 1 1500 > xyz" - wohaa - Linux recognizes the new disk. Ouch, you don't want to do that. DEFINE MDISK is intended for a privileged user to bypass the table of "real" minidisks and just carve out any extent at all from a device. Dangerous stuff and rarely needed. Take that OPTION DEVMAINT off the directory entry because all you need to do is CP LINK * 205 205 W on the Linux guest and it will pick up the changes to the directory which were made behind its back (adjust link mode to taste). This will also trigger Linux into noticing the presence of the new disk and it will bring it online (if it's in the list of eligible DASD devices and hasn't had a "set device range=... off" done on it). Regards, --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
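In command form, the suggested non-DEVMAINT route looks like this (a z/VM console transcript, reusing devno 205 and link mode W from the example above; the echo line is only needed if Linux does not bring the disk online by itself):

```
#CP LINK * 205 205 W                               (from the Linux guest)
echo "add device range=205" >> /proc/dasd/devices  (if not auto-detected)
```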
Re: Far 3390-mod 9's
James Melin writes: > Using this strategy from the link on Linux VM > > mount /dev/dasdh1 /mnt > cd /var > tar -clpSf - . | (cd /mnt; tar -xpSf - ) > > produced 3 errors > > tar: ./lib/mysql/mysql.sock: socket ignored > tar: ./run/printer: socket ignored > tar: ./run/.nscd_socket: socket ignored > > Those three errors translated into missing items in the copy > > rockhopper:/var # diff -r /var /mnt > Binary files /var/db2/.fmcd.lock and /mnt/db2/.fmcd.lock differ > Only in /var/lib/mysql: mysql.sock > Only in /var/run: .nscd_socket > Only in /var/run: printer > > I am concerned that such things are not being copied in this manner. Is > there a way to make TAR grab these as well? These are Unix domain sockets which are created when an application binds an AF_UNIX socket into the filesystem namespace. They do not have any use outside of the context of the process which bound them or clients which connect to them (assuming the process even exists any more). They are not copyable and don't contain data that you need to be concerned about. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
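For anyone wanting to convince themselves, the same pipeline can be exercised on scratch directories (the -l flag is dropped here; on the old GNU tar it only restricted the copy to one filesystem):

```shell
src=$(mktemp -d); dst=$(mktemp -d)
echo data > "$src/file"                     # an ordinary file copies fine
(cd "$src" && tar -cpSf - .) | (cd "$dst" && tar -xpSf -)
copied=$(cat "$dst/file")
echo "copied contents: $copied"
rm -rf "$src" "$dst"
```

Any socket in $src would be skipped with the same "socket ignored" warning, while regular files, directories, and permissions come across intact.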
Re: Questions about Linux 2.5 SMP & threading issues
and included, for example, in SLES8. > Can some of the internals gurus on this list comment on these claims and > whether they are relevant to Linux under VM? I have my own speculations, but > I wanted to see what the facts were. When a new version of a piece of software comes out (or is about to come out), some people get in a state of mind where they think that it means that the previous version is dreadful. Sometimes this is because they've worked on (part of) the new version and so concentrated on the particular narrow problem area so much that they can no longer see the big picture. Sometimes it's because somebody less familiar with the detail of the changes sees a list of all the nice new features and improvements in the new version and thinks that means that the previous version was bad in absolute terms rather than relative terms in all those areas. Letting such people have too much influence over the choice of what software to run for a real workload in real life in the here and now is unlikely to be a very good idea. Hope this helps. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HELP - using sed to translate upper to lower
Adam Thornton writes: > On Mon, 2003-06-16 at 10:55, McKown, John wrote: > > As best as I can tell, the following sed script should change all the upper > > case to lower case. It is not working (SLES7) > > > > echo "XX" | sed 'y/[A-Z]/[a-z]/' > > > > What am I doing wrong? > > I always use tr: > > echo "XX" | tr '[A-Z]' '[a-z]' Note that if you need to enter the murky waters of i18n then you also need to distinguish between tr A-Z a-z which will only lowercase the 26 "unadorned" uppercase letters and tr '[:upper:]' '[:lower:]' which will also lowercase accented characters for reasonably straightforward locale settings. If you want to handle more complex Unicode lowercasing then you want to be using Perl's "tr" operator (and/or uc(), lc(), regexps etc.). If you want *really* weird Unicode stuff in all its full glory then even Perl may not get you there (and you'll also have my full sympathy). (Actually, the y/// syntax in Perl is a synonym for tr/// for those who like the sed syntax plus you still get the nicer range behaviour and hence echo "XX" | perl -pe 'y/A-Z/a-z/' works as you'd expect it would.) --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
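Side by side, on ASCII-only input both forms agree; only on accented input would the [:upper:] form do extra work:

```shell
ascii=$(echo "MiXeD Case" | tr 'A-Z' 'a-z')              # 26 unadorned letters
posix=$(echo "MiXeD Case" | tr '[:upper:]' '[:lower:]')  # locale-aware classes
echo "$ascii"
echo "$posix"
```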
Re: "batch" compiles
McKown, John writes: > OK, so I have a corrupted mindset, coming from MVS . But suppose that > I want to compile a LOT of programs. In MVS, I code up some JCL and submit > it to run later. When it completes, I get a notify to my TSO id and look at > the output in SDSF. I repeat this for however many compiles that I want to > do. Perhaps doing the submissions over a period of time. How do I do that in > Linux (or any UNIX)? In VM/CMS, I remember a CMSBATCH virtual machine which > worked a bit like the MVS initiator. The best that I can think of to do in > Linux is: I'm surprised I haven't yet seen anyone else mention the "batch" command that comes as part of the "at" suite. It'll probably already be installed. It's very useful to be able to do echo somecommand | at now and "at" will "package up" your current environment variables and arrange for the command to run "now" in the background, with all the output mailed to your username once the job is finished. For more complex resource control and timing, "batch" lets you set up queues which run at particular times and when the load average is low enough. It's certainly not as powerful as JES but it may suffice for basic batch usage. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
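Illustrative invocations ("make all" is a placeholder job; both commands read the job from stdin and mail its output back):

```
echo "make all" | at now       # runs immediately, in the background
echo "make all" | batch        # runs when the load average is low enough
atq                            # list the jobs still queued
```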
Re: reset a computer
Tzafrir Cohen writes: > On Sun, Mar 16, 2003 at 02:25:49AM +0200, Tzafrir Cohen wrote: > > Trying to explain the question once again > > > > On Thu, Mar 13, 2003 at 05:59:03PM +0200, Tzafrir Cohen wrote: > > > Hi > > > > > > Short version of the question: > > > > > > How do do a "hard-reset" to a linux guest from within linux? > > > > > > Note that I don't mean to IPL the boot specific device: I need to re-run > > > profile.exec from cms . I know I can do that using hcp. > > > > (As if the user has logged-off and re-logged-on) > > The answer is, of course, "hcp 'i cms'". I have no idea why it didn't work > for me when I first tried it (it got the system stalled in CMS, so I > figured it as yet another one of he things that don't work. You may want to do hcp 'i cms parm autocr' so that CMS doesn't wait for you to hit Enter. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: QDIO for virtual NIC
Gregg C Levine writes: > Has anyone actually tried out IPv6 under Linux, while its running on > an appropriate Z-Box, and his relatives? I've tried it over the last couple of days after reading up on it again (the last time I played with IPv6 and the 6bone was before/during the first address bit allocation wars which dates my activity somewhat :-). The good news is that I can set up a sit tunnel from a Linux/ia32 box to a Linux SLES8 guest under VM and it works fine. SLES8 comes out of the box with the basic IPv6 tools and ping6/tracepath6 work fine. That shows the general IPv6 stuff is OK. The bad news is that I can't get a virtual QDIO interface (i.e. on a QDIO GuestLAN) to work with IPv6. This may well have something to do with the fact that the kernel logs the line qeth: IPv6 not supported on eth0 but I ploughed on regardless. This is z/VM 4.3 service level 0202 running 64-bit (second level) on a z900. The GuestLAN is type QDIO (i.e. not HIPER) and each of two Linux guests has a virtual NIC defined and coupled to it. They run SLES8 (I've tried both the shipped qeth driver and the qeth-susekernel-2.4.19-s390-1 driver, which developerworks implies is later and fixes a few bugs). One of the fixes in the latter is described as "MAC address could not be determined for VM Guest LAN interfaces", but even with the new driver "ip link show eth0" still shows zeroes for the MAC address. Both those drivers work fine with IPv4. 
As far as IPv6 is concerned, an "ip -6 addr ls" shows

  1: lo: mtu 16436 qdisc noqueue
      inet6 ::1/128 scope host
  3: eth0: mtu 1492 qdisc pfifo_fast qlen 100
      inet6 fe80::200:ff:fe00:0/10 scope link
  4: tr0: mtu 1492 qdisc pfifo_fast qlen 100
      inet6 fe80::a00:5aff:fe0c:c6aa/10 scope link

from which we can determine that the link local IPv6 address for tr0 is behaving (with the low bits correctly calculated from its MAC address) but that the link local address for eth0 (the GuestLAN interface) doesn't look right (especially when an "hcp q nic 7000" shows its (faked) MAC address as 00-04-AC-00-00-00). Interestingly, the other Linux guest (still using the original SLES8 qeth module) shows exactly the same link local address (oops), which led to a short "hooray, ping6 of the other guest's IPv6 link-local address works" before I realised that the duplicate address actually meant it was pinging itself. Regardless of the link-local address, I tried adding site-local addresses (fec0::2 and fec0::9) with appropriate routes to the guests sharing the GuestLAN but although setting the addresses and routes didn't give any errors, a ping6 from one guest to the other just sat there (no errors; the behaviour you'd get from packets dropped on the floor). The latest "Device Drivers and Installation Commands" manual (for the May 2002 stream) says about the qeth driver: "Support for IPv6 applies to Gigabit Ethernet (GbE) and Fast Ethernet (FENET) only." which may mean "we don't support GuestLAN NICs" or may mean "we support GuestLAN NICs because they're the virtual equivalent of a real Gigabit Ethernet NIC". Given the ultra-concise "qeth: IPv6 not supported on eth0" message, it's possibly the former but, unfortunately, I can't go check the code to tell. Does anyone know any further detail for sure about IPv6 support for QDIO (GuestLAN and otherwise)? --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
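The "low bits correctly calculated" remark refers to the EUI-64 derivation of link-local addresses: flip the universal/local bit of the first MAC octet and splice ff:fe into the middle. A sketch of that computation, using tr0's MAC from the listing:

```shell
mac=08:00:5a:0c:c6:aa                     # tr0's MAC address
set -- $(echo "$mac" | tr ':' ' ')        # split into six octets
g1=$(( ((0x$1 ^ 0x02) << 8) | 0x$2 ))     # flip u/l bit, join octets 1-2
g2=$(( (0x$3 << 8) | 0xff ))              # octet 3 followed by ff
g3=$(( 0xfe00 | 0x$4 ))                   # fe followed by octet 4
g4=$(( (0x$5 << 8) | 0x$6 ))              # octets 5-6
lladdr=$(printf 'fe80::%x:%x:%x:%x' $g1 $g2 $g3 $g4)
echo "$lladdr"                            # matches the tr0 line above
```

Feeding an all-zero MAC through the same computation yields fe80::200:ff:fe00:0, which is exactly the suspicious eth0 address in the listing, so the duplicate addresses follow directly from the zeroed GuestLAN MACs.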
Re: Kernel patch to add VM IPL PARM support
Lucius, Leland writes: > So, which should it be? Append or prepend the PARMs to the command line? I > haven't looked too deeply, but it appears that processing kernel parameters > isn't too consistent and some rtns take the first parameter encountered. As > it is, appending PARMs wouldn't allow you to override in those situations. You could allow both and combine the keyword with the eyecatcher: PRE foo=bar POST baz=quux where you allow one or both of PRE/POST but require at least one of them to be the first word to act as an eyecatcher. You may also want to think up better keywords than PRE and POST--short enough not to take up too many of the precious 64 chars but long enough to be meaningful. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
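A hypothetical sketch of the parsing side of the proposal (keyword names, the sample parameters, and the pre/post split are illustrative, not the actual patch):

```shell
parm="PRE dasd=777 POST root=/dev/dasda1"  # the 64-char CP IPL PARM string
mode=post                                  # words before any marker: append
pre=""; post=""
for w in $parm; do
  case $w in
    PRE)  mode=pre  ;;                     # subsequent words are prepended
    POST) mode=post ;;                     # subsequent words are appended
    *)    if [ "$mode" = pre ]
          then pre="$pre $w"
          else post="$post $w"
          fi ;;
  esac
done
echo "prepend to command line:$pre"
echo "append to command line:$post"
```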
Re: Network Problems with new kernel....
Geyer, Thomas L. writes: > I am running SLES7 under zVM 4.3 using a Guest Lan. The current kernel is > 2.4.7. I have built kernel 2.4.19, when I reboot with the new kernel I see > the following errors: > > Initializing random number generator 7 [80C [10D [1;32mdone > [m" > [m" > modprobe: modprobe: Can't locate module eth0 modprobe looks for a module or alias called "eth0", looks up its module dependencies and then tries to load it/them. Check whether you have a line alias eth0 qeth in /etc/modules.conf or else modprobe won't even look for qeth. Since you later say it works for an earlier kernel, I guess this isn't the problem. [...] > When I logon onto the virtual machine through the TN3270, I see (using the > lsmod command) that the qdio.o and qeth.o modules have not been loaded. I > then use the insmod command to load qdio.o and qeth.o followed by ifconfig > and route command to get the Linux virtual machine on the network. If you are using insmod on qdio then qeth then you are resolving the module dependencies yourself. I suspect if you tried "modprobe qeth" (without loading qdio) then you might run into the same problem. The table of module dependencies is per-kernel-version-tree. You'll need to run a depmod -a to rebuild the dependencies for a new kernel. You may need to fiddle with explicit options to depmod to ensure you build the dependencies for the right kernel and put them in the right place. Look at the man page for depmod for details. Often, distributions will run an automatic depmod sometime during boot. This normally removes the need to do a manual depmod but equally makes it easy to forget when one *does* need to do one. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
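The sequence, as root and with an illustrative version string (substitute your new kernel's version):

```
depmod -a 2.4.19                            # rebuild modules.dep for it
grep qeth /lib/modules/2.4.19/modules.dep   # qeth should now list qdio
modprobe qeth                               # pulls in qdio automatically
```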
Re: TCPDUMP
Eddie Chen writes: > I am look at a output from a tcpdump, and I found that the datagram of > fragmented data are sent from the "last fragmented" datagram first. > Is this correcrt > > > > (frag 9311:920@8880) (DF) > (frag 9311:1480@7400+) (DF) > (frag 9311:1480@5920+) (DF) > (frag 9311:1480@4440+) (DF) > (frag 9311:1480@2960+) (DF) > (frag 9311:1480@1480+) (DF) > 1472 proc-7 (frag 9311:1480@0+) Yes, it's a useful performance optimisation. It means the recipient can allocate a network buffer just the right size for the whole datagram as soon as it receives the first fragment. That saves it having to reallocate larger and larger buffers for each fragment that comes in. IIRC, it used to confuse one or two grotty old embedded TCP/IP stacks but that was years ago and I'd hope that everything today can handle it. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
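The offsets in the trace line up with that explanation: each non-final fragment carries 1480 bytes of payload (a 1500-byte MTU minus the 20-byte IP header), so each fragment's offset is the running byte total. A quick check of the arithmetic:

```shell
payload=1480                 # per-fragment payload: 1500 MTU - 20 IP header
off=0
for frag in 1 2 3 4 5 6; do  # the six fragments marked with '+'
  off=$((off + payload))
done
echo "final fragment offset: $off"   # 8880, as in the trace
total=$((off + 920))                 # plus the 920-byte final piece
echo "datagram payload: $total bytes"
```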
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: > Ok - I've ditched the uneven device and reverted back to an even boundary. > > z/VM now sees the following *after* trying to initialize the qeth module: > > > Q NIC DETAILS > Adapter 0960 Type: HIPER Name: UNASSIGNED Devices: 3 > Port 0 MAC: 00-04-AC-00-00-0E LAN: SYSTEM LNXLAN02 MFS: 16384 > Connection Name: HALLOLE State: Startup >Device: 0960 Unit: 000 Role: CTL-READ > Unassigned Devices: >Device: 0961 Unit: 001 Role: Unassigned >Device: 0962 Unit: 002 Role: Unassigned > > > The dev numbers do match with the contents of /proc/subchannels. I'm > slightly perplexed as to why the nic is in "State: Startup" and why 0961 > and 0962 are "Unassigned". > > Linux, on the other hand, reports: > > > qeth: Trying to use card with devnos 0x960/0x961/0x962 > qeth: received an IDX TERMINATE on irq 0x11/0x12 with cause code 0x17 > qeth: IDX_ACTIVATE on read channel irq 0x11: negative reply > qeth: There were problems in hard-setting up the card. > > > Back to scratch. OK, let's keep going at it. What's the output of # cat /proc/chandev on the Linux side (1) when you've freshly rebooted it, (2) after you've caused the chandev settings to take effect (whether you use SuSE's rcchandev, echo a read_conf to /proc/chandev or whatever) and also (3) after you do the "modprobe qeth"? --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: > > Quoting Malcolm Beattie <[EMAIL PROTECTED]>: > > > > > > > > > > Better make that triple of device numbers start on an even boundary. > > > > > > --Malcolm > > Why? I'm sure I've seen somewhere that it's a requirement but I can't remember exactly which part of the system requires it and the only reference I can find at the moment is one which only mentions the requirement for OSE and not OSD (i.e. for non-QDIO). However, something does look a bit odd about your new try: > Adapter 0963 Type: HIPER Name: UNASSIGNED Devices: 3 > Port 0 MAC: 00-04-AC-00-00-0C LAN: SYSTEM LNXLAN02 MFS: 16384 > Connection Name: HALLOLE State: Session Established > Device: 0964 Unit: 001 Role: CTL-READ > Device: 0965 Unit: 002 Role: CTL-WRITE > Device: 0963 Unit: 000 Role: DATA Notice that VM shows that the triple of device numbers 963,964,965 have been switched around to the order 964,965,963 in order for the first even number to become the CTL-READ device. The error message from your Linux guest was > qeth: Trying to use card with devnos 0x963/0x964/0x965 > qeth: received an IDX TERMINATE on irq 0x14/0x15 with cause code 0x08 > qeth: IDX_ACTIVATE on read channel irq 0x14: negative reply > qeth: There were problems in hard-setting up the card. and it may be worth checking whether Linux has decided to switch around the device numbers in the same way, perhaps by checking in /proc/subchannels or /proc/chandev whether subchannel 0x14 really is the control read device. On the other hand, it may be simpler just to enforce the "even boundary" constraint, if only to avoid having those permuted device numbers appearing. I guess that there may even be other differences since this time you're using a hipersockets device instead of a qdio one and it'll have a different portname and so on (which is case sensitive and so may be worth checking too: even if your OS/390 people see/quote it in upper case it's possible that the underlying portname could be lower case). 
Setting up QDIO/Hipersockets connections has quite a few little subtle requirements and getting any of them wrong can lead to the sort of errors you're seeing. It's a bit of a nuisance but usually it's just a question of checking every little thing one more time to find the one that you're running into. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: > Thanks Malcolm. I checked my chandev.conf and it did contain the > underscore. I probably messed up my original post. > > I have now defined a new hipersocket and when trying to initialize it I get: > > - > qeth: Trying to use card with devnos 0x963/0x964/0x965 > qeth: received an IDX TERMINATE on irq 0x14/0x15 with cause code 0x08 > qeth: IDX_ACTIVATE on read channel irq 0x14: negative reply > qeth: There were problems in hard-setting up the card. > - > > At least it is a different cause code. Better make that triple of device numbers start on an even boundary. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: > I'm having problems getting my qeth interface to work running on a > virgin 2.4.19 kernel patched with the may 2002 stream. > > I suspect that it might be a problem with chandev and syntax, and I have > been screwing around with chandev for some time but to no avail. > > insmod qeth returns: > [...] > /etc/chandev.conf contains: > > noauto;qeth0,0x0960,0x0961,0x0962;addparms,0x10,0x0960,0x0962,portname:LNXLAN02 That "addparms" needs to be "add_parms" instead (i.e. with an underscore). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: More NSS Info
David Boyes writes: > You would need at least one non-root/swap address mounted as /config or > something for storing the configuration of what goes where, and you'd > have to move at least a few of the utilities (eg mount, ifconfig, etc) > from /usr to /sbin (generating statically linked versions) and include > /sbin in the root filesystem. The basevol+guestvol environment I describe in the "...zSeries...Large Scale Linux Deployment" redbook (SG246824) (I really ought to bind that phrase to a single keystroke :-) lets you have a readonly root filesystem which is linked to (readonly) and booted by any number of clones. The boot process then mounts a (potentially very) small guest-specific readwrite volume (whatever disk is at devno 777) and binds all the necessary writable directories into the filesystem. Other parts of the redbook then describe how you can then bootstrap yourself to get other information (via a PROP guest and then via LDAP). We can do better than Sun since we have shared disks in known, manageable namespaces at boot time and since we have Al Viro's namespace support in Linux for bind mounts (again, described in the redbook for those unfamiliar with the concept). [Next is updates-in-place with CLONE_NEWNS and pivot_root() and/or immediate kernel-to-kernel reboots when kexec() is stable...] I'll set up the NSS stuff on my own VM system and get it to work nicely with basevol+guestvol (which I've just got working properly with SuSE SLES7; the original redbook environment having some dependencies on the RedHat boot scripts). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
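Schematically, the boot-time sequence is (device names, the devno 777 mount point, and the particular directories bound are illustrative, not the redbook's exact layout):

```
mount -o ro /dev/dasdb1 /          (shared base volume, read-only)
mount /dev/dasda1 /guestvol        (guest-specific r/w disk at devno 777)
mount --bind /guestvol/etc /etc    (bind each writable directory into
mount --bind /guestvol/var /var     place on the shared root)
```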
Re: Virtual network topology questions...
Nix, Robert P. writes: > 9672, so no hiper-sockets. In trial mode, so no money to buy a distribution or >support, but with the potential to do so if / when it goes into production. >Potentially running DB2 and WebSphere, so SuSE instead of RedHat, as IBM supports >SuSE more so than RedHat, in our experience. > > I'd like to work within the confines I have. You don't need physical hiper-sockets hardware for the GuestLAN and virtual hipersockets provided by z/VM 4.3. GuestLAN (or virtual hsi) simplifies many things. Unless there's absolutely no way for you to use z/VM 4.3, you're good to go. Part 2 of the "...zSeries... Large Scale Linux Deployment" redbook (SG246824) covers these sorts of issues and includes chapters on "Hipersockets and z/VM GuestLAN", "TCP/IP direct connection" and "TCP/IP routing". --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: OSA express gb adapter
Adam Thornton writes: > On Wed, Nov 06, 2002 at 09:07:07PM -0500, David Boyes wrote: > > > Does Red Hat include the OCO modules for QDIO on an OSA? Thanks. > > > Kyle Stewart > > > The Kroger Co. > > No. > > However, IBM does supply the modules built for RH, and they also have a > procedure for building a new initrd with those modules: > > >http://oss.software.ibm.com/developerworks/opensource/linux390/special_oco_rh_2.4.shtml Plus there's a detailed practical run-through of a RedHat+OCO install in Appendix B of the "...zSeries...Large Scale Linux Deployment" redbook, SG246824 (go to http://www.redbooks.ibm.com and type "large scale linux" into the search field). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: OSA express gb adapter
Crowley, Glen L writes: > I have an LPAR that shares a gb ethernet osa express adapter with OS/390 > LPAR's. I am using SUSE distribution as it is the only one that I found > that includes the OCO modules. I get the error that follows when setting up > my network definition. I have on ocasion be able to get this to work, but > 99% of the time it fails. Anybody have any ideas that might help me. > > Enter the device addresses for the qeth module, e.g. '0xf800,0xf801,0xf802' > or auto for autoprobing (auto): > > Starting with microcode level 0146, OSA-Express QDIO require a portname to > be set in the device driver. It identifies the port for sharing with other. > OS images, for example the PORTNAME dataset used by OS/390. > Do you have OSA Express microcode level 0146 or higher? > y > Note: If you share the card, you must use same portname > on all guest/lpars using the card. > Please enter the portname (must be 1 to 8 characters) to use: > osa1 That name might be case-sensitive; I can't remember if I've ever tried without explicit uppercase so it's only a guess. Does trying "OSA1" in caps make a difference? > qeth: Trying to use card with devnos 0xC40/0xC41/0xC42 > qeth: received an IDX TERMINATE on irq 0xAF4/0xAF5 with cause code 0x22 -- > try > another portname Are those device numbers right for the card you were intending to use? If anyone adds another OSA to your LPAR without telling you, you may end up with the wrong one. It might be safer to give the triple of device numbers explicitly at the "Enter the device addresses" prompt instead of letting it autodetect. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Probably the first published shell code example for Linux/390
Jan Jaeger writes: > When we are talking about storing (ie overlaying) programs (trojans) on the > stack space, then only hardware protection can really help. One would need > to come to a model where instructions cannot be executed from the stack. > One can achive this in S/390, by making the stack space a separate space, > which is only addressable thru an access register (like an MVS data space). > This way instructions can never be executed from the stack space, however, I > am afraid that such an implementation would break a few things. Solar Designer did a non-executable stack patch for Linux/ia32 (using segment protection for the stack space since ia32 page-level protection does not distinguish read from execute). The things that a non-executable stack breaks are mainly (1) gcc trampolines (used for nested functions), (2) signal delivery and (3) application-specific run-time code generation. He handled (1) and (2) by detecting such code and disabling the non-exec stack on the fly (yes, this is a slight exposure). For (3), he supported an ELF executable marker which disabled non-exec stack for the whole program. It was fairly popular and worked well against the sort of attacks which it was designed to prevent. Needless to say, people then worked out how to do some exploits even with non-exec stack ("return into libc" et al). The arms race continues, as always. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Plagued by YAST 3270 problems
Post, Mark K writes: > Hey, the card reader on my Linux/390 guests have always worked, particularly > with Malcolm Beattie's Unit Record device driver. Having VM helps with > that, but David Boyes did verify its functionality with a piece of real UR > hardware. And for those who weren't aware: the latest version of the driver, along with a userland utility (complete with man page) are now documented in Appendix A of the "... zSeries ... Large Scale Linux Deployment" Redbook and available for download from the redbooks site: The UR device driver provides a Linux character device interface to an attached unit record device for a Linux guest. The UR utility provides a user interface to the UR device driver. Using the UR driver and utility, it is possible to exchange files between a Linux guest and a z/VM virtual machine (initiated within the Linux guest). The UR utility provides an interface for copying files between UR devices (typically the reader, punch, and printer defined by the virtual machine). It can handle any file block size, and record length, and will perform EBCDIC-to-ASCII conversion as required. The UR device driver and utility can be downloaded from the Internet as described in Appendix D, Additional material on page 279. For the ur utility, the syntax is: ur copy [ -tbf ] [ infile | - ] [ outfile | - ] ur info devfile ur list ur add minor devno blksz reclen flags [ devname [ perm ] ] ur remove minor with the last two lines providing dynamic device support. The Redbook is available online (HTML and PDF) by going to http://www.redbooks.ibm.com/ and entering "SG246824" in the search box at the top. The direct URL to the HTML online version is http://www.redbooks.ibm.com/redbooks/SG246824.html and the direct URL to the PDF version is http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg246824.pdf --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Antwort: Max number of dasd devices
Jim Sibley writes: > When will SuSE have the devfs as the default for zSeries so we don't > have to compile the kernel to use it and get away from the double > mapping we have to do between device and device node? It is a real > nuisance to try and map 100 devices per LPAR for 7 or 8 LPARs. Then try > moving 20 or thirty of those volumes to another LPAR when business > needs dictate! W/O devfs, I can vouch that it is both a pain and error > prone. devfs is not the only way of handling these device management issues. devfs carries along with it a certain amount of design and implementation "history". Let's just say that distributions wouldn't gratuitously omit it just to make your life harder. There are two issues: the cleanliness of the kernel side and device management in userland. They only overlap slightly. In the medium to long term, the "stick together multiple majors and index everything into arrays of stuff" issue on kernel side should be solved via the combination of 32-bit dev_t (12 bit major, 20 bit minor), nice struct device, struct gendisk or whatever and devicefs. This assumes that Al Viro and co make the scramble before the 2.5 feature freeze next week (or get it in afterwards anyway :-). Linus gave him the OK two weeks ago so I have high hopes. For the userland issue, I've often wondered why someone hasn't done a version of scsidev for z/Linux (presumably "dasddev" would be the obvious name). It would simply go look at all the DASD information available via /proc/dasd/devices, /proc/partitions, query all the volumes for their volsers and build up a set of nodes and symlinks so you can refer to your volumes by label, /dev/dasdvol/VLABEL, or devno, /dev/dasdno/2345 and so on. I must admit, I haven't quite wondered hard enough for it to reach the top of my todo list though... --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: z/VM 4.2 Dispatching Problem running about 120 Linux copies
Now that you've raised that STORBUF constraint, what happens if the Linux guests *do* start all making heavy, random use of all of their memory all together all at the same time? CP is going to have to start paging stuff both in and out in order to keep up. At what stage does it decide that it's a bad idea to do all that I/O to page memory in for a guest only to have to page it out again to make some available for another guest a tiny instant later? And then do some more paging soon after to get back the first guest's memory that it now needs again? At some point, it may be better to kick out an entire guest (or a few) for a while so that the rest can get useful work done without thrashing. Then, later on, it can kick out those guests and bring the first lot back again. *That* is the job of the LDUBUF constraint that we mentioned before. If you've raised STORBUF and CP has to start paging *heavily* on behalf of the guests then it'll only allow a certain number of Q3 guests to do that (the Q3 LDUBUF percentage of your paging exposures, as we saw above) before kicking some guests out to E3 to sit it out. So LDUBUF acts as a backstop for STORBUF once the overcommit of real storage starts to pinch. > DSPBUF : Q1=32767 Q2=32767 Q3=32767 The DSPBUF setting for a queue simply limits the number of guests allowed to be in the associated dispatch queue. Having the DSPBUF Q3=32767 means you can have 32767 Q3 guests all dispatched at the same time before getting kicked out to E3. In other words, it's effectively unlimited unless you want to play Test Plan Foo games. If you set DSPBUF Q3 down to, say, 40 then when Linux guest number 41 started up (without timer-on-demand and without QUICKDSP), it would get kicked straight into E3 and sit there like a lemon. (This is what happened to us on the Large Scale Linux on VM residency recently until I figured out it was the DSPBUF setting that was too low.)
I (personally) haven't come across any situation involving Linux-only guests under VM in which setting DSPBUF Q3 low enough to be relevant has ever been a useful constraint. Obviously, this doesn't mean that I recommend setting it to a huge number on every system and, as with any performance tuning, it needs to be done carefully by someone who understands what is going on and after getting full information on the specific system involved. (OK, enough CYA.) > q alloc page > EXTENT EXTENT TOTAL PAGES HIGH% > VOLID RDEV START END PAGES IN USE PAGE USED > -- -- -- -- -- -- > 420RES 2271194277 15120 15120 15120 100% >639688 9000 8997 9000 99% > VMPASP 2281 0 1499 27 133939 269977 49% > VMPA01 227E 0 3338 601020 132574 285116 22% > VMPA02 227F 0 3338 601020 136540 285106 22% > VMPA03 2280 0 3338 601020 131633 285112 21% > VMPA04 2282 0 3338 601020 131783 277199 21% > VMPA05 2283 0 3338 601020 135393 284971 22% > -- -- > SUMMARY 3299K 825979 25% > USABLE 3299K 825979 25% As mentioned earlier, this shows you how many paging exposures you have, so that you can work out when LDUBUF will start kicking in. (It also shows you have a couple of paging extents on your sysres volume which probably isn't really optimal. You have a good number of other paging volumes and CP can do special optimisation tricks for those (seldom-ending channel programs, yadda yadda). It's unlikely to be able to do those on your sysres pack which is a bit of a pity.) > ind > AVGPROC-035% 01 > MDC READS-03/SEC WRITES-00/SEC HIT RATIO-100% > STORAGE-094% PAGING-0001/SEC STEAL-000% > Q0-00126(0) DORMANT-00012 That 126 for Q0 shows that you have 126 guests with QUICKDSP ON so those guests are bypassing all those SRM constraints I mention above and getting dispatched regardless. > Q1-0(0) E1-0(0) > Q2-0(0) EXPAN-001 E2-0(0) > Q3-1(0) EXPAN-002 E3-0(0) That 1 for Q3 is the one Linux guest you did *not* give QUICKDSP.
If you wish, you may care to change the SRM settings (raise LDUBUF and STORBUF as described above) and then remove QUICKDSP from the Linux guests. You should find (if you get the STORBUF setting right) that the Linux guests can survive without QUICKDSP. This would have the advantage that CP can still have its SRM constraints as a backstop in case the Linux guests start thrashing the system. If that does happen then it may well be preferable to have some of them go into eligible rather than have the system thrash without getting any useful work done. Hope this (long) explanation has been useful to some. --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
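For the record, the kind of change being suggested boils down to a couple of privileged CP commands. The percentages below are a commonly cited starting point for Linux-heavy systems, not a recommendation for any specific one, and they need re-issuing (or placing in the system config or an AUTOLOG-driven exec) after each IPL:

```text
CP SET SRM STORBUF 300 250 200
CP SET SRM LDUBUF 100 100 100
CP QUERY SRM
```

STORBUF 300 250 200 lets the dispatch classes collectively overcommit real storage well past 100%; LDUBUF 100 100 100 lets each class use all of the paging exposures. Treat both as starting points to be measured and tuned, per the caveats above.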
Re: z/VM 4.2 Dispatching Problem running about 120 Linux copies
Davis, Jeff writes: > Your indicate command shows that you have 8 users in the E list. This > happens when you don't have enough available real storage to fit a user's > working set into real storage. When a user is in the E list, the system > appears to be unavailable or down to that user. He cannot run. You can do > a couple of things. The best is to get more real storage. Unfortunately, > that's not so easy. Second, you can set your SRM settings to over-allocate > real storage. This will prevent users from going into the E list, but will > also drive up your paging rate. Make sure you have plenty of paging > resource to do this. If you post the result of QUERY SRM along with QUERY STORE, QUERY XSTORE and QUERY ALLOC PAGE, then we can check whether some of your SRM settings are unnecessarily preventing the Linux guests from being dispatched. For example, if your DSPBUF setting was inherited from one intended for CMS guests then it may include a "reserve" for Q1 and Q2 guests which Linux will never be able to make use of. As Jeff says, the next most likely reason is that storage is not being overcommitted enough. In the absence of the timer-on-demand patch and/or fancy footwork with mm configuration changes/patches, CP sees the entire storage allocation of each Linux guest as one large clumped working set, despite Linux being fairly happy (usually) to have some of it ripped away. Because of this, in many Linux situations you can allow CP to overcommit real storage rather heavily (while increasing paging space to back it, of course) without the paging space actually being needed most of the time. That's the STORBUF setting for SRM. There's also the LDUBUF setting which causes guests to be kicked into the eligible list if CP thinks that they are loading the paging subsystem too much. That estimate is calculated based on how many "exposures" the paging subsystem has: basically, how many volumes it can page to in parallel, modulo some tweakable multipliers.
The Q ALLOC PAGE I asked for above will let you work out if that's likely to need tweaking when you start increasing STORBUF (and assuming Linux guests turn out to need to page significantly). --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: VSAM or Lightweight Database?
Paul Raulerson writes: > Pretty much the last time I tried to use it for anything serious was in Solaris 7, > but yes, I am thinking of what was delivered with > BSD 4.1 and above. So you are saying the -ldb will give me multi-user, multi-key, > transactional access to record-based data under > Linux/390? Multi-user: yes. Transactional: yes. Multi-key: weeell, it depends on what you mean by multi-key. Of the four current access methods (hash, btree, queue, recno), the btree and hash ones are the generic key-based ones. If by multi-key, you mean you want to have fields k1 and k2 so that lookup by the pair (k1, k2) is fast and so is a lookup by (k1), then you can use a btree with a flattened key field consisting of the concatenation of the k1 and k2 fields (with canonicalised length). The btree will mean that you can look up by (k1) and, by locality of reference, walk through all ordered (k1,k2) tuples nicely. If instead you want multiple independent key fields to data then you'd have to build your own indices with either a natural primary key for the main data or else the recno access method, and then manage separate index databases of key -> record_id (recno or primary key) mappings yourself. Although db3 will do the ACID stuff for you, it won't do all the fancy constraint and index management that a proper relational database will do (whether DB2, PostgreSQL or whatever) but from my minimal knowledge of ISAM, I don't think that does either. > libdb last time I looked was just a disk based associative array handler... Time to look again. Start, for example, at http://www.sleepycat.com/docs/ref/am_conf/intro.html --Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Determining the 'mass' of a file system tree.
James Melin writes: > Is there a good tool to say analyze part of a file system tree and report > how much space it is using? > > Say like /usr/sbin - which is not in its own file system but part of a > larger one. du -s /usr/sbin Useful variations on a theme are du -s /foo/bar/*/ to get subtotals of each subdirectory (note the trailing / to force the glob to match only directories) and including the option "--total" to print a grand total. > I'm trying to size a new deployment based on another and adjust for growth. > I am limited at the moment to a mod-9 drive size, so it's kinda critical to > know what parts of the root FS contain the most mass. > > I am also limited on the number of volumes I can actually have, so I'm > trying to figure out the best distribution of limited resources. > > My thought was this: Your suggested breakdown of filesystems doesn't fit with usual practice. If you want to split your filesystem amongst many volumes (and there are frequently good reasons for doing this), then start with separate filesystems for: /, swap, /usr, /tmp, /var, /opt, /home and /usr/local. These need not be full 3390-3 or 3390-9 volumes but can be partitions instead (by using the CDL disk layout to get up to 3 partitions on each volume). Typically, you would want to keep the root filesystem smallish in such a setup. For a larger Linux system, you would mount extra volumes wherever needed (application-specific data filesystems might want to be on /var/lib/foo/data123, /opt/foo/data/blah, /home/biggroupname, /usr/local/foo or a variety of other conventions). This all assumes that you have a large enough Linux system to make it worth the complexity of splitting everything up. You can go a long way with a single filesystem for the entire base system, a swap partition and, if needed, a separate /usr before you necessarily need to consider splitting off /var, /tmp, /opt or whatever.
--Malcolm -- Malcolm Beattie <[EMAIL PROTECTED]> Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself