Re: Tapetype question

2001-10-09 Thread Marty Shannon, RHCE

[EMAIL PROTECTED] wrote:
> My tapetype run:
> 
> hostname:~# tapetype SONY-SDX500C /dev/nst1
> st1: Block limits 2 - 16777215 bytes.
> wrote 693509 32Kb blocks in 6289 seconds
> wrote 4829 32Kb sections
> define tapetype SONY-SDX500C {
> comment "just produced by tapetype program"
> length 21671 mbytes
> filemark 4554 kbytes
> speed 3528 kbytes
> }
> could not rewind /dev/nst1: Input/output error
> hostname:~#

This indicates that the drive's firmware cannot cope properly with
writing at (or past) the physical end of medium.

> Why is tapetype attempting to rewind a non-rewinding device (/dev/nst1)?

Of course, it wants to leave the drive so that all you need to do is hit
the eject button to continue to use the drive.  Note that "non-rewind"
only applies to the default action when the device is closed; it has no
other implication.

> I am receiving the following in syslog:
> 
> Oct 8 19:55:29 hostname kernel: st1: Error with sense data: Current
> st09:01: sense key Hardware Error

Again, the firmware didn't cope with writing on/past end of medium.  It
may also be reporting incorrect status via the scsi layer.

> I tried to run a mt rewind as well (I know I am using the non rewinding
> device, yet this was working yesterday.):
> 
> hostname:~# mt -f /dev/nst1 rewind
> mt: /dev/nst1: Input/output error
> hostname:~#

The driver attempted to ask the drive to rewind, but because of the
confused internal state of the drive, it was unable to do so.

> The tape will not eject using the eject button.  I rebooted the server and
> cycled the power on the tape device and *poof* everything is working
> again.  I can tar to the device, rewind and erase the tape with mt, and
> label the tape again.  Can anyone shed any light on exactly what happened
> here?  Is this indicative of problems I will have with amanda running with
> this drive?

I'm pretty sure the only thing you would have to do is power-cycle the
drive.  It is the drive that is confused, not the kernel (driver).

Cheers,
Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: amtape: flaky false "label doesn't match labelstr" errors

2001-10-17 Thread Marty Shannon, RHCE

Jay Lessert wrote:
> 
> Amanda 2.4.2p2, on sparc Solaris 2.6.
> 
> This just started happening on an Amanda config that has been running
> stably for months.
> 
> The config has a labelstr of:
> 
> labelstr "^ARCHIVE-[0-9][1-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$"

You goofed on the second character class; it should probably be '[0-9]'
rather than '[1-9]'.
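
To see the effect, here is a minimal sketch in Python.  It uses Python's
"re" module rather than Amanda's own regex matcher, and the tape labels
are made up for illustration, but for a pattern this simple the two
should agree.

```python
import re

# labelstr as quoted, and with the second character class widened to [0-9].
QUOTED = r"^ARCHIVE-[0-9][1-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$"
FIXED = r"^ARCHIVE-[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$"


def label_matches(labelstr, label):
    """Return True when the tape label matches the labelstr pattern."""
    return re.search(labelstr, label) is not None


# Hypothetical labels: only the second digit of the first field differs.
for label in ("ARCHIVE-09-01-0001", "ARCHIVE-10-01-0001"):
    print(label, label_matches(QUOTED, label), label_matches(FIXED, label))
```

Any label whose first field ends in 0 (10, 20, ...) is rejected by the
quoted pattern, which would be consistent with a config that ran fine for
months and then started complaining.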


--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: dat compression control, lack thereof

2002-01-19 Thread Marty Shannon, RHCE

The proper mechanism for this is described in "man stinit" under recent
versions of Linux; Solaris has an analogous mechanism.  Other systems
probably also have it, but I don't know for sure.  To use hardware
compression, just refer to the device name that has compression
enabled.

For instance, on my tape server, my /etc/stinit.def looks like:

# The common definitions that can usually be used
{buffer-writes read-ahead async-writes }

# Seagate AIT
manufacturer=SEAGATE model="AIT" {
can-bsr can-partitions scsi2logical auto-lock
mode1 blocksize=0 density=0x30 compression=0  # native, no compression
mode2 blocksize=0 density=0x30 compression=1  # native, w/ compression
}

Under Linux, you will almost certainly need to use "mknod" to create
appropriate devices; mine (on a SCSI interface) look like:

crw-rw-rw-   1 root disk   9, 128 May  5  1998 /dev/nst0
crw-rw-rw-   1 root root   9, 160 Dec 11  1999 /dev/nst0c
crw-rw-rw-   1 root disk   9,   0 May  5  1998 /dev/st0
crw-rw-rw-   1 root root   9,  32 Dec 11  1999 /dev/st0c

(Actual backups are done on a different machine & drive; these are for
illustration only.)
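
As a rough sketch of where those minor numbers come from, the following
Python builds the matching mknod command lines without running them.  The
layout (32 per mode step, plus 128 for non-rewind) is inferred from the
listing above, not taken from the st driver documentation, and the helper
names are made up.

```python
def st_minor(drive, mode, rewind):
    """Minor number for drive N (0-based), stinit mode 1-4, rewind or not.

    Assumed layout, inferred from the listing above: drive number in the
    low bits, 32 per mode step, plus 128 for the non-rewinding device.
    """
    if not 0 <= drive < 32 or not 1 <= mode <= 4:
        raise ValueError("drive must be 0-31 and mode 1-4")
    return drive + 32 * (mode - 1) + (0 if rewind else 128)


def mknod_commands(drive, modes):
    """Build (not run) mknod command lines; modes maps mode -> name suffix."""
    commands = []
    for mode, suffix in sorted(modes.items()):
        for rewind in (True, False):
            name = "/dev/%sst%d%s" % ("" if rewind else "n", drive, suffix)
            minor = st_minor(drive, mode, rewind)
            commands.append("mknod %s c 9 %d" % (name, minor))
    return commands


# Mode 1 is the plain device, mode 2 gets a "c" suffix as in the listing.
for line in mknod_commands(0, {1: "", 2: "c"}):
    print(line)
```

The "c" suffix is only my naming habit; any name will do as long as the
major and minor numbers are right.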

    Cheers,
Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Amanda & DNS problems

2002-01-21 Thread Marty Shannon, RHCE

Ian Eure wrote:
> 
> Amanda Backup Client Hosts Check
> 
> ERROR: nephrite: [host
> axinite.gh.landmark-appraisal.com.0.168.192.in-addr.arpa: hostname lookup
> failed]
> Client check: 1 host checked in 0.208 seconds, 1 problem found

You have a typo in your DNS data, almost certainly a name not terminated
with a ".", probably in the "reverse" zone.  Use nslookup and/or dig to
verify.
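
A minimal Python sketch of why a missing dot produces exactly that name.
The function is illustrative, not part of any DNS library, and the origin
is simply read off the error message.

```python
def expand_zone_name(name, origin):
    """Expand a name from a zone file the way a name server reads it.

    A name ending in "." is already fully qualified; anything else has
    the zone origin appended.
    """
    if name.endswith("."):
        return name
    return name + "." + origin


ORIGIN = "0.168.192.in-addr.arpa."  # reverse zone, taken from the error

# PTR target written without and with the terminating dot.
print(expand_zone_name("axinite.gh.landmark-appraisal.com", ORIGIN))
print(expand_zone_name("axinite.gh.landmark-appraisal.com.", ORIGIN))
```

The first line printed is the bogus name from the amcheck error; the
second is what the PTR record was presumably meant to say.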

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Confirmation for subscribe amanda-users

2002-01-24 Thread Marty Shannon, RHCE

[EMAIL PROTECTED] wrote:
> 
> I installed advfs.diff on the client (wookie) and reran the test dump
> but the sendsize*dump file still says:
> 
> **
> sendsize: debug 1 pid 8615 ruid 33 euid 33 start time Thu Jan 24 17:13:43 2002
> /usr/local/libexec/sendsize: version 2.4.2p2
> calculating for amname '/boot', dirname '/boot'
> sendsize: getting size via dump for /boot level 0
> sendsize: running "/sbin/dump 0Ssf 1048576 - /dev/sda2"
> running /usr/local/libexec/killpgrp
>   DUMP: Warning: unable to translate LABEL=/
>   DUMP: Warning: unable to translate LABEL=/boot
> /dev/sda2: Permission denied while opening filesystem
> .
> (no size line match in above dump output)
> .
> asking killpgrp to terminate
> sendsize: pid 8615 finish time Thu Jan 24 17:13:44 2002
> **

The only *problem* is that dump can't open /dev/sda2 for reading.  Try:

su amanda -c id

If that doesn't tell you that amanda is in whatever group owns the disk,
that's your problem, and you need to verify that the (numeric) group id
used in /etc/passwd is the group that owns the disk.
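
For what it's worth, the same check can be sketched in Python.  This is an
illustration only: the function names are made up, and the user name and
group id in the usage note are assumptions, not values from the debug
file.

```python
import os
import stat


def group_can_read(st_mode, st_gid, user_gids):
    """True if the file is group-readable and its group is one of user_gids."""
    return bool(st_mode & stat.S_IRGRP) and st_gid in user_gids


def user_can_read_via_group(username, primary_gid, path):
    """Look up the user's groups and compare them with the owner of path.

    username and primary_gid are what /etc/passwd holds for the account;
    os.getgrouplist adds the supplementary groups from /etc/group.
    """
    info = os.stat(path)
    gids = os.getgrouplist(username, primary_gid)
    return group_can_read(info.st_mode, info.st_gid, gids)
```

Something like user_can_read_via_group("amanda", 6, "/dev/sda2") would
then answer the question directly, where 6 stands in for whatever numeric
group id the amanda entry in /etc/passwd really has.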

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Kernel 2.4.17 crashing

2002-02-04 Thread Marty Shannon, RHCE

Chris Dahn wrote:
> 
>   So, I upgraded my linux kernel to 2.4.17. I'm running a redhat system,
> originally 7.1.  I have the mt-st package version .5beta release 10.
> Occasionally when I attempt to run an amcheck, my system merely stops. No
> kernel panic, not segfaulting, it simply stops. Does anyone know of this
> problem? I'm going to try upgrading mt-st next, but I was hoping for a little
> confirmation.

Well, it's generally a really bad idea to run a kernel that hasn't been
all that thoroughly beaten on by the masses.  What's a worse idea is
running a "bleeding-edge" kernel on the machine that does your backups! 
I would never even consider doing backups on a machine with anything but
a stock release + vendor recommended updates!

I would *very* strongly suggest you revert to the 7.1 or 7.2 kernel as
shipped by Red Hat (or superseded by Red Hat recommended updates, of
course).

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Compression on Exb 8500

2002-02-04 Thread Marty Shannon, RHCE

Glen Eustace wrote:
> 
> It would seem that we have some kind of conflict with hardware and software
> compression.  If both are on, the backups can not be restored. I have managed
> to turn off software compression and that seems to cure the problem but would
> prefer to do software and not hardware.
> 
> I have tried turning off compression using 'mt -f tape setdensity 0x15',
> which seems to work but doesn't stick, the next time amanda writes to the
> drive it has gone back to 0x8c.
> 
> This system is on RH6.2 on SPARC, is there anyway of permanently turning off
> compression on the Exb8500 ?

Try: "man stinit" and after you've created the proper device nodes
(search the archive for an earlier post of mine), add "post-install st
/sbin/stinit" to /etc/conf.modules.  You already have the proper density
codes, so it shouldn't take more than a few minutes to get it all set
up.

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Compression on Exb 8500

2002-02-05 Thread Marty Shannon, RHCE

Glen Eustace wrote:
> 
> On Tue, 05 Feb 2002 18:35, you wrote:
> > Try: "man stinit" and after you've created the proper device nodes
> > (search the archive for an earlier post of mine), add "post-install st
> > /sbin/stinit" to /etc/conf.modules.  You already have the proper density
> > codes, so it shouldn't take more than a few minutes to get it all set
> > up.
> 
> Thanks for the info, I read your previous letter and am still a bit confused.
> 
> I only want the single mode on the drive, so didn't put a mode line in the
> stinit.def file, stinit complains it is missing.
> 
> If I add a mode1, then it complains there is not device /dev/nst1, which is
> true it is /dev/nst0.
> 
> It doesn't like mode0, any clues?

Hmmm, if I said mode0-3, my bad.  Must be mode1-4, but you don't need
all of them.  You do, however, want to be certain to identify the drive
to stinit, using the "manufacturer=" and "model=" (which you get from
the kernel boot messages in /var/log/messages).  For instance, the
customized part of my stinit.def file for my AIT-1 drive:

manufacturer=SEAGATE model="AIT" {
can-bsr can-partitions scsi2logical auto-lock
mode1 blocksize=0 density=0x30 compression=0  # native, no compression
mode2 blocksize=0 density=0x30 compression=1  # native, w/ compression
}

(note that you have your own density codes, and probably should not use
"compression=" for the Exabyte drives).  Also, after editing
/etc/stinit.def, do "rmmod st", then "modprobe st".  Oh, yeah, the
device files:

$ ls -l /dev | grep ' 9, ' | grep st0
crw-rw-rw-   1 root disk   9, 128 May  5  1998 nst0
crw-rw-rw-   1 root root   9, 160 Dec 11  1999 nst0c
crw-rw-rw-   1 root disk   9,   0 May  5  1998 st0
crw-rw-rw-   1 root root   9,  32 Dec 11  1999 st0c

stinit is smart enough to find them, no matter what you've called them,
but you must have both a rewind and non-rewind device for each mode you
specify.  If, in fact, you have more than 1 tape drive, and the drive
you're messing with is actually st1, increment all the device minor
numbers by 1 (1, 33, 129, 161).
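
A small Python sketch of the pairing rule, in case it helps to check a set
of device nodes before running stinit.  The bit layout is inferred from
the listing above rather than quoted from the st driver, and the function
names are made up.

```python
def decode_st_minor(minor):
    """Split a minor number into (drive, mode, rewinding).

    Assumed layout, inferred from the listing above: low five bits are
    the drive, the next two select mode 1-4, the top bit means non-rewind.
    """
    if not 0 <= minor <= 255:
        raise ValueError("expected a minor number between 0 and 255")
    drive = minor & 0x1F
    mode = ((minor >> 5) & 0x3) + 1
    rewinding = (minor & 0x80) == 0
    return drive, mode, rewinding


def unpaired_modes(minors):
    """Return sorted (drive, mode) pairs lacking a rewind or non-rewind node."""
    seen = {}
    for minor in minors:
        drive, mode, rewinding = decode_st_minor(minor)
        seen.setdefault((drive, mode), set()).add(rewinding)
    return sorted(key for key, kinds in seen.items() if kinds != {True, False})


print(unpaired_modes([0, 32, 128, 160]))   # the st0 listing above
print(unpaired_modes([1, 33, 129]))        # st1 with one node missing
```

An empty list means every mode has both of its nodes; anything else names
the drive and mode that still needs a mknod.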

Let me know if any of that helps.

(Copied back to the list, as the problem continues to crop up for
folks.)

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Kernel 2.4.17 crashing

2002-02-06 Thread Marty Shannon, RHCE

Michael Hicks wrote:
> 
> Dan Wilder <[EMAIL PROTECTED]> wrote:
> >
> > Is there a reason you need to run a 2.4 kernel?
> 
> When you're making backups, it's really nice to be able to create files
> bigger than 2GB in size.  2.4 allows big files, even on 32-bit systems.

Except that it is utterly unnecessary with amanda.  Any other reasons?

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: gtar causing kernel hang

2000-10-24 Thread Marty Shannon, RHCE

This is a kernel problem, *not* a gtar problem.  Gtar is trying to do an
lstat() system call, and the kernel has apparently run out of resources
-- unless the disk hasn't been fsck'ed, in which case, there may well be
a block that is corrupted (and it probably contains inodes).  Be
assured, if /var/log/messages gives you a "kernel:" message, then the
problem is either in the kernel itself, or possibly in the hardware
(i.e., you could have an uncorrected memory error (do not *ever* use
non-ECC memory on a machine you want to be even remotely reliable!)).

Recommendation: fsck the partition that gtar was working on.  Run a
serious memory checker (wish I had one, but then, I only use ECC
memory).

Good luck,
Marty

[EMAIL PROTECTED] wrote:
> 
> Hello everyone.  On one machine, and one machine only, the backup process
> is crashing the server.  In particular, it appears to be gtar that is
> making it hang.  This machine is configured similarly to every other
> machine that I'm backing up, but this is the only one having problems.
> 
> Here is the error that shows up in /var/log/messages:
> 
> Oct 24 00:50:08 banjo kernel: Unable to handle kernel paging request at
> virtual address 0004a2bd
> Oct 24 00:50:08 banjo kernel: current->tss.cr3 = 133c, %%cr3 =
> 133c
> Oct 24 00:50:08 banjo kernel: *pde = 
> Oct 24 00:50:08 banjo kernel: Oops: 
> Oct 24 00:50:08 banjo kernel: CPU:0
> Oct 24 00:50:08 banjo kernel: EIP:0010:[dput+295/328]
> Oct 24 00:50:08 banjo kernel: EFLAGS: 00010296
> Oct 24 00:50:08 banjo kernel: eax: 0004a27d   ebx: dc027120
> ecx: c5616fa0   edx: d6794f80
> Oct 24 00:50:08 banjo kernel: esi:    edi: cf5a8330
> ebp: 0568   esp: dcce3e7c
> Oct 24 00:50:08 banjo kernel: ds: 0018   es: 0018   ss: 0018
> Oct 24 00:50:08 banjo kernel: Process gtar (pid: 7926, process nr: 53,
> stackpage=dcce3000)
> Oct 24 00:50:08 banjo kernel: Stack: c56165a0 c012e758 dc027120 dcce3ed0
> dcce3ed0 c023715c 1006 dcce3ed0
> Oct 24 00:50:08 banjo kernel:0001 1006 c012f6d3 f562
> 1006  c0268f60 c023715c
> Oct 24 00:50:08 banjo kernel:c0268f60 cf6598c0 dd26d320 dd26d36c
>  dcce3ed0 dcce3ed0 c012f73a
> Oct 24 00:50:08 banjo kernel: Call Trace: [prune_dcache+248/300]
> [try_to_free_inodes+199/264] [grow_inodes+30/384] [get_new_inode+18
> 5/280] [iget+88/96] [ext2_lookup+84/124] [real_lookup+79/160]
> Oct 24 00:50:08 banjo kernel:[lookup_dentry+296/488]
> [__namei+40/88] [sys_newlstat+14/96] [system_call+52/56] [startup_32+43
> /286]
> Oct 24 00:50:08 banjo kernel: Code: 8b 40 40 50 56 68 60 41 1f c0 e8 ee 49
> fe ff c7 05 00 00 00
> 
> After this, the machine is fully hung, and I have to reboot it.  I'm
> thinking maybe gtar is hosing the stack somehow.
> 
> Any ideas?  Or do I need to send this to the tar bug list?  Here's the
> stats on the server itself:
> 
> 400MHz PII, 512MB RAM, RAID-5 SCSI with a DPT Decade card.  Kernel
> 2.2.16-3, Amanda 2.4.1p1, gtar 1.12.
> 
> Thanks for any help!
> 
> -D

--
Marty Shannon, RHCE, Sr. Systems Developer
mailto:[EMAIL PROTECTED]



Re: Sony TSL-11000

2000-11-18 Thread Marty Shannon, RHCE

"Jonathan F. Dill" wrote:
> 
> Does anyone know how can I enable disconnect for the linux aic7xxx
> driver?  It's not in the README.aic7xxx nor the comments in the source
> code.  Do I have to change something in the card setup before bootup?
> 
> I have determined that the TSL-11000 is not causing the reset directly.
> The problem appears to be with the settings of the linux aic7xxx
> driver.  The tape changer appears to "tie up" the bus while it's in
> action.  When the aic7xxx module fails to read or write to partitions on
> /dev/sda for a certain number of attempts, it decides that the bus is
> "hung" and performs a bus reset.
> 
> I thought "disconnect" might help if it would allow the TSL-11000 to
> perform its operation without hanging all of the other devices on the
> bus, but I don't know what other problems, if any, enabling disconnect
> might cause.
> 
> I increased the selection timeout to the max value of 256ms in case that
> would help--it's the only parameter that could be changed without
> compiling a custom kernel that looks like it might help.  Next, I plan
> to see if I can change any parameters through /proc/scsi, what
> parameters are adjustable when compiling the kernel, and what info I can
> dredge up from the aic7xxx web pages and mailing list archives.
> 
> --
> "Jonathan F. Dill" ([EMAIL PROTECTED])

Someone will undoubtedly correct me if I'm wrong on this, but I'm pretty
sure you need to enable disconnect from Adaptec's BIOS settings page
(type control-A to get there).

Cheers,
Marty
--
Marty Shannon, RHCE, Sr. Systems Developer
mailto:[EMAIL PROTECTED]



Re: Sony TSL-11000

2000-11-18 Thread Marty Shannon, RHCE

Joi Ellis wrote:
> 
> On Sat, 18 Nov 2000, Jonathan F. Dill wrote:
> 
> >Date: Sat, 18 Nov 2000 15:30:47 -0500
> >From: Jonathan F. Dill <[EMAIL PROTECTED]>
> >To: "Marty Shannon, RHCE" <[EMAIL PROTECTED]>
> >Cc: [EMAIL PROTECTED]
> >Subject: Re: Sony TSL-11000
> >
> >"Marty Shannon, RHCE" wrote:
> >> Someone will undoubtedly correct me if I'm wrong on this, but I'm pretty
> >> sure you need to enable disconnect from Adaptec's BIOS settings page
> >> (type control-A to get there).
> >
> >Thanks for the tip--I looked up the user reference for the AHA-2940U2W
> >from adaptec.com and it is indeed one of the BIOS settings.  It's on by
> >default, and I doubt that I changed it, but I'll have to check.
> 
> Would enabling disconnect on the scsi controller be a good thing to do
> generically?
> 
> I have a similar card on my machine, perhaps I could reconfigure it
> to enhance performance over what I see now?
> 
> --
> Joi Ellis
> [EMAIL PROTECTED], http://www.visi.com/~gyles19/

I would say that, in general, if your OS supports it (all versions of
Linux in the 2.0 and 2.2 series that I know of do support it; I
*believe* SunOS supports it), it allows devices to release the SCSI bus,
so, yes.  Of course, if your OS doesn't support it, you're out of luck. 
I seem to recall that some drives need to be jumpered properly to enable
disconnect, as well.

Cheers,
Marty
--
Marty Shannon, RHCE, Sr. Systems Developer
mailto:[EMAIL PROTECTED]



Re: Pre-compiled Sun 2.5.1 binaries?

2001-02-16 Thread Marty Shannon, RHCE

Steve Fulton wrote:
> 
> Here is the output from config.log .. as you will see, there is an I/O error.  In
> a nutshell, the /usr partition is fragged - due to some poor administration
> by a previous sysadmin.  Since it is a production machine, I cannot take it
> down to fix - at least not for the short-term, but I need to back it up -
> which is why I want to install the Amanda client on it - which is why I need
> a pre-compiled binary to test.
> 
> Steve
> 
> configure:4153: checking for gcc
> configure:4230: checking whether the C compiler (gcc  -g) works
> configure:4244: gcc -o conftest  -g conftest.c   1>&5
> conftest.c:0: /usr/include/.: I/O error
> configure: failed program was:
> #line 4240 "configure"
> #include "confdefs.h"
> main(){return(0);}

You will probably have no more luck backing this machine up than you had
trying to build Amanda.  This is a seriously broken machine.  At the
very least, you have a media error (i.e., bad block) in the middle of
the directory "/usr/include".  This machine needs to be taken down, all
the filesystems checked, and the lost data restored.  I would not even
consider trying to back this machine up until after those steps are
complete.

Good luck (you *WILL* need it),
Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Failure on W2k client

2001-04-03 Thread Marty Shannon, RHCE

David Lloyd wrote:
> 
> The error I see is:
> 
> "Host is down or invalid password"

In some sense, that is the final answer.

More explicitly (as has been stated in the FAQ, and on this mailing
list, innumerable times): if smbclient can't get to the windows host,
Amanda is not involved.  You must first make smbclient work before you
can even begin to test with Amanda.

Again: THIS IS NOT AMANDA'S FAULT!  IT IS SMBCLIENT'S FAULT!

Sorry folks.  I'm just really sick & tired of seeing the same old
problems flogged to death here because folks refuse to read either the
FAQ or the archives of this mailing list.

Marty

P.S.  Technically, it is Microsoft's fault for violating their own
(unpublished) standards with the SMB implementation for w2k.

P.P.S.  The folks who provide Samba do an amazing job in spite of
Microsoft, and if you get the very latest version from them, I'd be
willing to bet a buck that it will solve your problem -- unless you have
your Samba misconfigured, that is.
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Local Filesystem

2001-04-07 Thread Marty Shannon, RHCE

C Scott wrote:
> 
> Will Amanda, using DUMP, stay in a local filesystem if NFS shares
> are mounted in the local partition? Example disklist:
> 
> hostname  /  comp-user
> 
> What if something was mounted in /mnt/hostname2 when the backup ran. Would
> Dump ignore it? If not, is there anyway to configure Dump to stay in the
> local filesystem? I can not use Gnu Tar because I must preserve the
> modification dates of files.
> 
> Casey Scott

Dump can *only* access the local filesystem: it reads the actual
partition, and thus only sees mount points as directories.

    Cheers,
Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Local Filesystem

2001-04-08 Thread Marty Shannon, RHCE

Joi Ellis wrote:
> 
> On Sat, 7 Apr 2001, Marty Shannon, RHCE wrote:
> 
> >
> >Dump can *only* access the local filesystem: it reads the actual
> >partition, and thus only sees mount points as directories.
> >
> 
> This depends upon the version of dump and the platform.
> Linux dump groks directories anywhere on the disk, not just partition
> device names or mount points.

True enough.  (Though with all the other things that have been said
about Linux dump, I'm not sure I would trust it to do that correctly.)

> But, the version I'm using doesn't
> store dump levels for anything except a full partition or mountpoint.
> A subdirectory below the mountpoint can be backed up with dump but the
> dump timestamps aren't recorded.  Or something like that.

But using it that way would render it useless with Amanda.

> --
> Joi Ellis
> [EMAIL PROTECTED], http://www.visi.com/~gyles19/

--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Strange amdump

2001-04-08 Thread Marty Shannon, RHCE

C Scott wrote:
> 
> Can anyone think of any reason why I can only get one backup on a
> tape. The tape is only 38% used, but the amdump report (email) says that
> next tape expected is a new tape. If amdump is run again, it reports that
> amflush must be used because it can't overwrite the active tape. There is
> plenty of room on the tape.
> 
> Casey Scott

Amanda will only put 1 session (which may include dumps from more than 1
filesystem) on a tape.  The reasons for this have been discussed
extensively, but the bottom line is that appending to a tape is not
reliable.
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Changer don't run after installing a new hard drive

2001-04-18 Thread Marty Shannon, RHCE

Juergen Knott wrote:
> 
> Hi All!
> 
> Amanda, version 2.4.2-19991216-beta1, ran very well for a long time.
> Now I have installed a new SCSI hard drive, and when I try to start Amanda,
> it doesn't run.
> 
> Amanda Tape Server Host Check
> -
> Holding disk /dumps/amanda: 2095692 KB disk space available, that's plenty
> amcheck-server: could not get changer info:  could not read result
> from "/usr/local/libexec/chg-scsi" (got signal 11)

Odds are excellent that you have set the SCSI ID of your new drive to
the same as either the tape drive or the changer.  Remember that the
changer typically uses an ID different from the actual tape drive.

> Amanda Backup Client Hosts Check
> 
> Client check: 2 hosts checked in 0.042 seconds, 0 problems found
> 
> (brought to you by Amanda 2.4.2-19991216-beta1)
> amanda@fileserver:/usr/bin >
> 
> Who can help me?
> 
> Bye Juergen
> 
> --
> This is a Microsoft-free mail!

--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: problem backing up a host with more than 171 disklist entries of root-tar

2001-05-11 Thread Marty Shannon, RHCE

This sounds like a classic case of running out of file descriptors --
either on a per-process basis, or on a system-wide basis (more likely
per-process, as you seem to be able to reproduce it at will with the
same number of disklist entries on that "host").

It seems to me that Amanda should specifically check for the
open/socket/whatever system call that is returning with errno set to
EMFILE (or, on some brain damaged systems, EAGAIN).  When that happens,
Amanda should wait for some of the existing connections to be taken down
(i.e., closed).
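
A minimal sketch of that idea, in Python rather than Amanda's C.  The
names "opener" and "wait_for_close" are hypothetical callables, not Amanda
functions, and the retry count is arbitrary.

```python
import errno
import time


def open_with_retry(opener, wait_for_close, attempts=5):
    """Call opener(); if descriptors are exhausted, wait and try again.

    opener and wait_for_close are hypothetical callables: the first opens
    a file or socket, the second blocks until an existing connection has
    been closed (here it may simply sleep).
    """
    for attempt in range(attempts):
        try:
            return opener()
        except OSError as exc:
            # EMFILE: per-process limit; ENFILE: system-wide table full;
            # EAGAIN: what some systems report instead.
            if exc.errno not in (errno.EMFILE, errno.ENFILE, errno.EAGAIN):
                raise
            if attempt == attempts - 1:
                raise
            wait_for_close()


def sleep_briefly():
    """Simplest possible stand-in for waiting on a connection to close."""
    time.sleep(0.1)
```

Any other error is passed straight through, so a genuine failure is not
hidden behind the retry loop.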

Cheers,
Marty

"Bernhard R. Erdmann" wrote:
> 
> Hi,
> 
> I'm using Amanda 2.4.2p2 on/for a Linux Box (RH 6.2, 2.2.19, GNU tar
> 1.13.17) to backup home directories on a NetApp Filer mounted with NFS.
> 
> Up to and including 171 disklist entries of type root-tar, everything is
> ok. amcheck complains about the home directories being not accessible
> (amanda has uid 37), but runtar gets them running with euid 0 (NFS
> export with no root squashing). It takes about 3 secs for amcheck to
> check these lines.
> 
> If I add some more disklist entries of the same type, amcheck hangs for
> a minute (ctimeout 60) and then reports "selfcheck request timed out.
> Host down?"
> 
> /tmp/amanda gets three more files: amanda..debug, amcheck...
> and selfcheck...
> With up to 171 entries, selfcheck..debug grows to 28387 Bytes
> containing 171 lines "could not access". Using 172 entries, it stops at
> 16427 Bytes and contains only 100 lines "could not access" (o.k. because
> of NFS permissions). The last line of the disklist is checked first.
> /tmp/amanda/selfcheck... ends with:
> selfcheck: checking disk /home/User/cb
> selfcheck: device /home/User/cb
> selfcheck: could not access /home/User/cb (/home/User/cb): Permission
> denied
> selfcheck: checking disk /home/User/ca
> selfcheck: device /home/User/ca
> 
> After adding one or more lines to the disklist file, only the last 100
> lines get checked, then an amandad and a selfcheck process is hanging
> around:
> $ ps x
>   PID TTY      STAT   TIME COMMAND
> 28833 pts/2    S      0:00 -bash
> 28854 pts/2    S      0:00 emacs -nw disklist
> 29000 pts/1    S      0:00 -bash
> 29149 ?        S      0:00 amandad
> 29151 ?        S      0:00 /usr/libexec/amanda/selfcheck
> 29182 pts/3    S      0:00 -bash
> 29227 pts/3    S      0:00 less selfcheck.20010511233745.debug
> 29230 pts/1    R      0:00 ps x
> 
> Killing selfcheck spawns another selfcheck process and this one's debug
> file stops after having checked the last 100 disklist lines, too.
> $ kill 29151
> $ ps x
>   PID TTY      STAT   TIME COMMAND
> 28833 pts/2    S      0:00 -bash
> 28854 pts/2    S      0:00 emacs -nw disklist
> 29000 pts/1    S      0:00 -bash
> 29182 pts/3    S      0:00 -bash
> 29231 ?        S      0:00 amandad
> 29233 ?        S      0:00 /usr/libexec/amanda/selfcheck
> 29234 pts/1    R      0:00 ps x
> $ kill 29233
> $ ps x
>   PID TTY      STAT   TIME COMMAND
> 28833 pts/2    S      0:00 -bash
> 28854 pts/2    S      0:00 emacs -nw disklist
> 29000 pts/1    S      0:00 -bash
> 29182 pts/3    S      0:00 -bash
> 29238 ?        S      0:00 amandad
> 29240 ?        S      0:00 /usr/libexec/amanda/selfcheck
> 29241 pts/1    R      0:00 ps x
> $ kill 29240
> $ ps x
>   PID TTY      STAT   TIME COMMAND
> 28833 pts/2    S      0:00 -bash
> 28854 pts/2    S      0:00 emacs -nw disklist
> 29000 pts/1    S      0:00 -bash
> 29182 pts/3    S      0:00 -bash
> 29244 ?        S      0:00 amandad
> 29246 ?        D      0:00 /usr/libexec/amanda/selfcheck
> 29247 pts/1    R      0:00 ps x
> $ kill 29246
> $ ps x
>   PID TTY      STAT   TIME COMMAND
> 28833 pts/2    S      0:00 -bash
> 28854 pts/2    S      0:00 emacs -nw disklist
> 29000 pts/1    S      0:00 -bash
> 29182 pts/3    S      0:00 -bash
> 29251 pts/1    R      0:00 ps x
> 
> Now it's got killed...
> 
> Any ideas?

--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: compiling error on a client...

2001-04-02 Thread Marty Shannon, RHCE

verdon wrote:
> 
> I'm currently installing amanda on a few servers (redhat 6.2) "this is
> probably why I'm always asking strange questions" :)
> 
> usually when running ./configure i get:
> 
> checking disk device prefixes... /dev/ - /dev/
> checking whether posix fcntl locking works... (cached) yes
> 
> but on one server I get the following, and compilation aborts with
> some errors:
> 
> checking disk device prefixes... /dev/ - /dev/
> checking whether posix fcntl locking works... (cached) no
> checking whether flock locking works... (cached) no
> checking whether lockf locking works... (cached) no
> checking whether lnlock locking works... (cached) no
> configure: warning: *** No working file locking capability found!
> configure: warning: *** Be VERY VERY careful.
> 
> this server is running like the other:
> redhat-6.2
> glibc-2.1.3-15
> glibc-devel-2.1.3-15
> 
> I have looked in my kernel configuration for an optional "posix support"
> that I might not have enabled, but I didn't find anything about it.
> I have just changed my kernel to 2.4.2, but this didn't change anything.
> 
> I've looked for similar errors in the amanda mailing list archives, without
> great success...

You cannot run glibc-2.1.* with a 2.4 kernel.  They are incompatible in
a number of ways.  Either downgrade the kernel to a 2.2 or upgrade glibc
to a 2.2.  In general, it isn't a good idea (unless you're a kernel
hacker) to run anything more advanced than what appears in the
appropriate updates directory on update.redhat.com.  It surprises me
more than a little bit that the system works well enough to do much of
anything on it!

Cheers,
Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: Amcheck troubleshooting

2001-06-06 Thread Marty Shannon, RHCE

Denise Ives wrote:
> 
> On 6 Jun 2001, Alexandre Oliva wrote:
> 
> > On Jun  5, 2001, Denise Ives <[EMAIL PROTECTED]> wrote:
> >
> > > can not access hdg1 (hdg1): No such file or directory
> >
> > Looks like it's not looking for it as /dev/hdg1, assuming this device
> > name exists.  Try /dev/hdg1 explicitly, if this is the case.
> 
> Same results are generated if I explicitly add /dev/hdg1 to my disklist.
> I tried it both ways before I made the post. Is there anything else I can
> try here?

First, try:

$ ls -l /dev/hdg1

If that says "No such file or directory", then you haven't actually
created the device file (and certainly don't have a filesystem mounted
there).
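
If you'd rather script the check, here is a minimal Python sketch (the
function name is made up) that distinguishes a missing device file from
one that exists but is not a block device:

```python
import os
import stat


def describe_device(path):
    """Say whether path is missing, a block device, or something else."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISBLK(mode):
        return "block device"
    return "not a block device"
```

Calling describe_device("/dev/hdg1") and getting "missing" is the same
answer as the "No such file or directory" from ls.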

Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: rewinding problem (I've got it!)

2001-08-31 Thread Marty Shannon, RHCE

Sandor Feher wrote:
> After I rebooted everything seems OK.  amcheck reports no errors, and
> the tape runs both forward and backward.  But when I cd to /misc/cd
> (SCSI CD-ROM with autofs) the tape hangs and amcheck reports the error
> from my recent mail (ERROR: /dev/nst0: rewinding tape: Input/output
> error).  Any idea?

Odds are pretty good that you have a SCSI cabling problem.  Either
16-bit -> 8-bit not terminated properly, or maybe just the length of
cable.  Probably others can give better details.

Cheers,
    Marty
--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: make on solaris 7 dies

2001-09-21 Thread Marty Shannon, RHCE

Put /usr/ccs/bin in your $PATH before running configure.
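
To confirm that ar will be found before re-running configure, a quick
Python sketch along these lines should do.  The helper name is made up,
and the only directory it assumes is the /usr/ccs/bin mentioned above.

```python
import os
import shutil


def find_tool(name, extra_dirs=(), path=None):
    """Look for an executable in extra_dirs first, then in path ($PATH)."""
    if path is None:
        path = os.environ.get("PATH", "")
    dirs = list(extra_dirs) + [d for d in path.split(os.pathsep) if d]
    return shutil.which(name, path=os.pathsep.join(dirs))


# Prints the full path of ar if one is found, otherwise None.
print(find_tool("ar", extra_dirs=["/usr/ccs/bin"]))
```

If it prints None even with /usr/ccs/bin added, the problem lies
elsewhere.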

Marty

Toby Bluhm wrote:
> 
> Trying to install 2.4.2 on solaris 7. Make dies here:
> 
> gcc -DHAVE_CONFIG_H -I. -I. -I../config -I./../regex-src
> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
> -D_FILE_OFFSET_BITS=64 -g -O2   -c version.c
> gcc -DHAVE_CONFIG_H -I. -I. -I../config -I./../regex-src
> -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
> -D_FILE_OFFSET_BITS=64 -g -O2   -c pipespawn.c
> rm -f libamanda.a
> cru libamanda.a alloc.o amflock.o debug.o dgram.o error.o file.o
> fileheader.o match.o protocol.o regcomp.o regerror.o regexec.o regfree.o
> security.o statfs.o stream.o token.o util.o version.o versuff.o
> pipespawn.o
> /usr/bin/sh: cru: not found
> make[1]: *** [libamanda.a] Error 1
> make[1]: Leaving directory `/usr/local/src/amanda-2.4.2/common-src'
> make: *** [all-recursive] Error 1
> 
> Seems to be tied to common-src/Makefile not having AR defined.
> My rh71  box has AR = /usr/bin/ar. Can't find any utility called ar on
> the sparc system.
> 
> Is this the problem? What is the solution?
> 
> Thanks
> 
> --
> toby
> 
> "I believe we are on an irreversible trend toward more freedom and
> democracy -- but that could change" - Dan Quayle

--
Marty Shannon, RHCE, Independent Computing Consultant
mailto:[EMAIL PROTECTED]



Re: drive compression discovery

2003-03-10 Thread Marty Shannon, RHCE
Eric Sproul wrote:
> Also, my system is Linux, so there
> are no compression-related device names.

Actually, all Linux Amanda users should read and understand (and implement) the
stinit man page.  Once you've built a proper "/etc/stinit.def" for your
drive(s), and made the proper device nodes, you should never have any issues
with compression again.
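
For reference, a sketch of what such an entry might look like.  The
manufacturer and model strings are placeholders (they have to match what
your drive reports in its SCSI inquiry), and the choice of two modes is
only an example; check the stinit man page for the options your drive
actually needs.

```
# /etc/stinit.def (sketch; manufacturer and model are placeholders)
# mode1 is reached through /dev/st0  and /dev/nst0
# mode2 is reached through /dev/st0l and /dev/nst0l
manufacturer=EXAMPLE model = "EXAMPLE-TAPE" {
mode1 blocksize=0 compression=1
mode2 blocksize=0 compression=0
}
```

With something like this in place, pointing tapedev at /dev/nst0l should
give Amanda the drive with hardware compression off, assuming the nodes
exist (the Linux st driver uses char major 9, with minor 128 for nst0 and
160 for nst0l) and stinit is run at boot.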
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: Pornography

2003-03-20 Thread Marty Shannon, RHCE
Brian Cuttler wrote:
> 
> Seth,
> 
> I'm not getting that spam via the list. Maybe my address got
> out that way, but it's not the vector for the incoming.

That's very odd.  All the spam I get is through the list -- porn or otherwise.

Why is the list set up to allow non-subscribers to post to it?

    Marty
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: Pornography

2003-03-20 Thread Marty Shannon, RHCE
Joshua Baker-LePain wrote:
> 
> On Thu, 20 Mar 2003 at 12:46pm, Marty Shannon, RHCE wrote
> 
> > > I'm not getting that spam via the list. Maybe my address got
> > > out that way, but it's not the vector for the incoming.
> >
> > That's very odd.  All the spam I get is through the list -- porn or otherwise.
> >
> > Why is the list set up to allow non-subscribers to post to it?
> 
> To allow open discussion.  Most lists *I'm* on are like this (and
> unmoderated).

Ok, is it worth:

subscribers getting spammed;

the additional load/cost on the server (and recipients) to send (and receive)
spam that will ostensibly never be seen;

the loss of subscribers who can neither filter their email
(companies/organizations do not necessarily behave rationally) nor access an
external mail address (perhaps due to privacy issues)?

I think not.

*All* lists *I'm* on -- with the exception of this one -- do not permit any
access by non-subscribers (except to subscribe, of course).

Marty
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: Pornography

2003-03-20 Thread Marty Shannon, RHCE
Scott Mcdermott wrote:
> 
> Marty Shannon, RHCE on Thu 20/03 12:46 -0500:
> > That's very odd.  All the spam I get is through the list
> > -- porn or otherwise.
> 
> I don't get any spam off the amanda-users list either btw.
> 
> What must be happening is that the spammers are forging
> from-headers and sending to addresses they have harvested
> from the list.  They forge the from-header to look like the
> address of the email system that sends out the mails to the
> recipients they harvested.

No, I've checked.  The spam is being sent through the list mail server,
suggesting that omniscient.com has enabled foreign relaying.  The spam does not
have the normal set of headers you would expect from a legitimate list message,
but it does have a legitimate origin in omniscient.com.  Perhaps they're not as
omniscient as they think?  :-)

Marty
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Here's the headers from the latest spam

2003-03-20 Thread Marty Shannon, RHCE
Unlike many I've seen, it actually came *through* the list -- or so it appears.

Received: from guinness.omniscient.com (localhost [127.0.0.1]) by
guinness.omniscient.com (8.12.8/8.12.8) with ESMTP id h2KJr8EL029329
(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for
<[EMAIL PROTECTED]>; Thu, 20 Mar 2003
14:53:08 -0500 (EST)
Received: (from [EMAIL PROTECTED]) by guinness.omniscient.com
(8.12.8/8.12.8/Submit) id h2KJr8ZR029328 for amanda.org.amanda-users-list; Thu,
20 Mar 2003 14:53:08 -0500 (EST)
X-Authentication-Warning: guinness.omniscient.com: majordom set sender to
[EMAIL PROTECTED] using -f
Received: from sapphire.omniscient.com (sapphire.omniscient.com [64.134.101.71])
by guinness.omniscient.com (8.12.8/8.12.8) with ESMTP id h2KJr7EL029321
(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 20
Mar 2003 14:53:07 -0500 (EST)
Received: from mail.sinirsizhost.net ([62.29.78.32]) by sapphire.omniscient.com
(8.12.8/8.12.8) with SMTP id h2KJqpOF029091; Thu, 20 Mar 2003 14:53:00 -0500
(EST)
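
Read from the bottom up, the chain is mail.sinirsizhost.net to
sapphire.omniscient.com to guinness.omniscient.com, followed by what
looks like majordomo's resend on guinness.  A rough sketch for pulling
that chain out of a saved message, in case anyone wants to check their
own copies; received_hops is a made-up helper name, and the pattern
matching is deliberately simple-minded.

```shell
# received_hops (hypothetical helper): print "sender -> receiver" for each
# Received: header in a saved message, newest hop first.  Reads the named
# file(s), or standard input if none are given.
received_hops() {
    awk '
        function emit(h,    from, by) {
            from = "?"; by = "?"
            if (match(h, /from [^ ]+/)) from = substr(h, RSTART + 5, RLENGTH - 5)
            if (match(h, / by [^ ]+/))  by   = substr(h, RSTART + 4, RLENGTH - 4)
            gsub(/[()]/, "", from)      # tidy the "(from user)" form
            print from " -> " by
        }
        # A new Received: header starts; flush the previous one.
        /^Received:/        { if (h != "") emit(h); h = $0; next }
        # Folded continuation lines start with whitespace; join them.
        /^[ \t]/ && h != "" { sub(/^[ \t]+/, ""); h = h " " $0; next }
        # Any other header line ends the current Received: header.
        h != ""             { emit(h); h = "" }
        END                 { if (h != "") emit(h) }
    ' "$@"
}

# usage: received_hops saved-message.txt
```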

--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: Pornography

2003-03-20 Thread Marty Shannon, RHCE
Kirk Strauser wrote:
> 
> At 2003-03-20T20:42:52Z, Scott Mcdermott <[EMAIL PROTECTED]> writes:
> 
> > There is no reason someone should not want to subscribe to the list to
> > post.
> 
> That's not true:
> 
> "Oh, crap, my server just melted, and I'm having trouble getting Amanda to
> work!  I need an answer ASAP!  I know, I'll send an email to
> [EMAIL PROTECTED] from my home email address, even though I don't want
> to flood my slow home connection with a bunch of mailing lists..."

You forgot the part about, "But, my job is on the line, so I'll do whatever it
takes to get this machine restored, and worry about unsubscribing later."

Marty
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: amcheck fails when client has multiple IP addresses

2003-03-22 Thread Marty Shannon, RHCE
lloyd wrote:
> 
> On 22 Mar 2003 12:42:52 -0800
> philo vivero <[EMAIL PROTECTED]> wrote:
> 
> > On Sat, 2003-03-22 at 03:27, lloyd wrote:
> > > I added a second IP address / interface (eth0:1)
> > >selfcheck request timed out.  Host down?
> > > If I shut down eth0:1 amcheck works.
> >
> > If you're on a Redhat-based system, edit
> > /etc/sysconfig/network-scripts/ifcfg-eth0:1 (or whatever has the
> > definition for interface eth0:1) and delete the GATEWAY= entry from it.
> [...]
> > The network script that brings up that interface sets a new default
> > gateway that then causes routing to not function.
> 
> Thanks for the suggestion - I tried it, but it didn't work.  Is there something else 
> I can try?

Did you actually look at the output of "netstat -r" and/or "route"?
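
What you're looking for is the default route: which gateway it points at
and through which interface.  A small sketch that filters it out of the
routing table; default_routes is a hypothetical helper, and it assumes
the usual Linux net-tools column layout (destination first, gateway
second, interface last).

```shell
# default_routes (hypothetical helper): read "netstat -rn" or "route -n"
# style output on stdin and print gateway and interface of each default route.
default_routes() {
    awk '$1 == "0.0.0.0" || $1 == "default" { print $2, $NF }'
}

# Only query the live routing table if netstat is installed.
if command -v netstat >/dev/null 2>&1; then
    netstat -rn | default_routes
fi
```

If the gateway printed is not the one you expect (or nothing is printed
at all), that points at routing rather than at Amanda.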

Marty
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: Backup size doubled (sendbackup.debug)

2003-06-25 Thread Marty Shannon, RHCE
Internet Support wrote:
> 
> You can see that Amanda correctly estimates the size at estimated 4752738
> tape blocks
> But actually dumps  DUMP: 9029315 tape blocks

Yes, dump is actually dumping the holding area along with all
the other files on that partition.  If you were using tar
instead of dump, you could use an "exclude" to prevent tar from
considering the holding area.  Since you're using dump, your
only option is to let Amanda know that it's the holding disk by
adding the "holding-disk yes" option in a new dumptype.  This
is all from memory; do check the gory details in the supplied
amanda.conf file.
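
For comparison, the sample amanda.conf shipped with 2.4.x defines, as far
as I can tell, something close to the sketch below.  Note that there the
dumptype is named holding-disk while the option itself is spelled
"holdingdisk no", so treat the exact spelling above as unverified and
compare against your own copy.  The host name and mount point in the
disklist line are placeholders.

```
# amanda.conf (sketch)
define dumptype holding-disk {
    comment "The master-host holding disk itself"
    holdingdisk no      # send this dump straight to tape
    priority medium
}

# disklist (host and mount point are placeholders)
# amandahost.example.com  /dumps  holding-disk
```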

Marty
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]


Re: PINAR ALTUĞ

2003-08-10 Thread Marty Shannon, RHCE
No mas!  I must unsubscribe in order to keep my job.  I've
already been disciplined for receiving "AMANDA" mail that
includes *PORN*.  I cannot believe that the list manager really
wants to lose subscribers this way, but the last "discussion"
indicated that it was more important to have the list receive
porn than to make the list useful.  Do not bother to respond. 
I do *NOT* need to get fired just because I subscribe to this
list.

What a waste of a mailing list.

Deity help the newbies.  (Just for the record, she will
absolutely not do so -- in my humble opinion.)

Tugba wrote:
> 
>  [Image]   [Image]
> 
>  [Image]   [Image]
> 
>  [Image]   [Image]
> if you dont want receive mail please send blank email to
> [EMAIL PROTECTED]

You gotta be kidding, right?  This is what the AMANDA list is
all about, right?
--
Marty Shannon, RHCE (@ home)
mailto:[EMAIL PROTECTED]