Re: amanda 3.3.3 "too many files"

2013-06-07 Thread Brian Cuttler

Jean-Louis,
Jon,

I've updated my amanda.conf to use auth="local" for the
dumptypes I have in use in my disklist.

> ulimit
unlimited

Per solaris instructions...
> echo 'rlim_fd_max/d' | mdb -k
rlim_fd_max:
rlim_fd_max:0

> amcheck finsen
Amanda Tape Server Host Check
-
Holding disk /lstripe: 546222 MB disk space available, using 546122 MB
slot 9: volume 'Finsen31'
Will write to volume 'Finsen31' in slot 9.
NOTE: skipping tape-writable test
NOTE: info dir 
/usr/local/etc/amanda/finsen/DailySet1/curinfo/finsen/_export2_samba_maldi does 
not exist
NOTE: it will be created on the next run.
NOTE: index dir 
/usr/local/etc/amanda/finsen/DailySet1/index/finsen/_export2_samba_maldi does 
not exist
NOTE: it will be created on the next run.
Server check took 4.691 seconds

Amanda Backup Client Hosts Check

ERROR: finsen: service selfcheck: selfcheck: Error opening pipe to child: Too 
many open files
ERROR: finsen: service /usr/local/libexec/amanda/selfcheck failed: pid 5457 
exited with code 1
Client check: 1 host checked in 130.727 seconds.  2 problems found.

(brought to you by Amanda 3.3.3)

The new DLE in fact did cause the retained snapshot to change by
one DLE, in alpha order. It is (re)verified that this is not random
and is tied to list position.

So much for the solaris run time work-around.

export LD_PRELOAD_32=/usr/lib/extendedFILE.so.1

then run amcheck.
> amcheck finsen
ld.so.1: amcheck: warning: /usr/lib/extendedFILE.so.1: open failed: illegal 
insecure pathname
Amanda Tape Server Host Check
-
Holding disk /lstripe: 546222 MB disk space available, using 546122 MB
slot 9: volume 'Finsen31'

FILE.so.1: open failed: illegal insecure pathname
ERROR: finsen: Application 'amgtar': can't run support command
ERROR: finsen: Application 'amgtar': ld.so.1: amgtar: warning: 
/usr/lib/extendedFILE.so.1: open failed: illegal insecure pathname
ERROR: finsen: Application 'amgtar': can't run support command
ERROR: finsen: Application 'amgtar': ld.so.1: amgtar: warning: 
/usr/lib/extendedFILE.so.1: open failed: illegal insecure pathname
ERROR: finsen: Application 'amgtar': can't run support command

related to suid programs?

Don't want to make further changes before the weekend, think I'll
implement auth="local" for amdump on Monday and see how it performs.


thank you,

Brian




On Wed, Jun 05, 2013 at 01:41:16PM -0400, Brian Cuttler wrote:
> 
> Jean-Louis,
> 
> Yes, I did find some information on a run time mechanism to
> increase the 256 file limit (file limit stored in unsigned character).
> 
> The work-around employed requires the execution of /usr/lib/extendedFILE.so.1
> prior to the binary being executed.
> 
> Following up on your maxcheck and Spindle number, I wonder if I 
> couldn't automatically build an alternate disklist file with 
> spindle number and swap it in and out. It would have to be done
> dynamically (since my disklist changes and making changes in 
> multiple locations is error prone), but that can be scripted and
> called from cron.
> 
> /* I need something that will handle both formats of DLE
>  *
> finsen  /export2 zfs-snapshot2
> finsen  /export/home-AZ /export/home   {
> user-tar2
> include "./[A-Z]*"
> }
>  *
>  */
> 
> Since this is an amanda-client issue, rather than an amanda server
> issue, I need to ask you, how to execute this on the client-side
> before attempting to check the DLE list. Is there a way to invoke
> this from the amanda daemon?
> 
>  - Alternatively, if someone better versed than I am on the Solaris
>    inetd or in SMF knows how to insert the requisite command on the
>    client side - I would be appreciative if they would share their
>    information.
> 
>   thank you,
> 
>   Brian
> 
> 
> On Wed, Jun 05, 2013 at 11:54:35AM -0400, Jean-Louis Martineau wrote:
> > Brian,
> > 
> > Can you increase the number of open files at the system level?
> > 
> > amcheck checks all DLEs in parallel; you can try to add spindle (in the
> > disklist) to reduce parallelism but that can have a bad impact on dump 
> > performance, so it is not a good workaround.
> > 
> > You would like a maxcheck  setting similar to maxdump, I put it in my 
> > TODO list.
> > 
> > Jean-Louis
> > 
> > On 06/05/2013 11:05 AM, Brian Cuttler wrote:
> > >Hello amanda users,
> > >
> > >I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
> > >The system is both the server and the client, there are no other
> > >clients of this system.
> > >
> > >We have ~265 DLEs on this system (large zfs arrays and all
> > >samba shares are their own file systems and DLE, thank goodness
> > >I was able to talk my manager out of making all user directories
> > >their own DLE as well, though they are their own zfs f

Re: amanda 3.3.3 "too many files"

2013-06-07 Thread Brian Cuttler


Jean-Louis,

added a couple of switches to # ls, got a much more informative output.

[finsen]: /proc/734/fd > ls -F -C /proc/10832/fd
0=  1=  10  12|  13|  16|  17|  2=  20|  21|  3>  6|  8|




On Thu, Jun 06, 2013 at 11:09:20AM -0400, Jean-Louis Martineau wrote:
> On 06/05/2013 11:54 AM, Jean-Louis Martineau wrote:
> >Brian,
> >
> >Can you increase the number of open files at the system level?
> >
> >amcheck checks all DLEs in parallel; you can try to add spindle (in the
> >disklist) to reduce parallelism but that can have a bad impact on dump 
> >performance, so it is not a good workaround.
> 
> Forget that idea, adding spindle will not help.
> 
> I think the problem is a file descriptor leak (files not closed), but it 
> can be in any process.
> Can you monitor all open files for all amanda processes?
> I don't know how to do it on Solaris, but you can 'ls /proc/PID/fd' on Linux.
> It will help to find which process leaks.
> 
> Jean-Louis
---
   Brian R Cuttler                 brian.cutt...@wadsworth.org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773



Re: amanda 3.3.3 "too many files"

2013-06-06 Thread Jean-Louis Martineau

On 06/05/2013 11:54 AM, Jean-Louis Martineau wrote:

Brian,

Can you increase the number of open files at the system level?

amcheck checks all DLEs in parallel; you can try to add spindle (in the
disklist) to reduce parallelism but that can have a bad impact on dump 
performance, so it is not a good workaround.


Forget that idea, adding spindle will not help.

I think the problem is a file descriptor leak (files not closed), but it 
can be in any process.

Can you monitor all open files for all amanda processes?
I don't know how to do it on Solaris, but you can 'ls /proc/PID/fd' on Linux.
It will help to find which process leaks.
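
A minimal sketch of the monitoring suggested above. It assumes the Amanda processes run as a user named amanda (a guess; substitute the real account) and that /proc/PID/fd is listable, which the ls output elsewhere in this thread suggests is the case on this Solaris host. The function names are made up for illustration.

```sh
#!/bin/sh
# count_fds PID: print the number of open file descriptors of one process.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# report_fds USER: print "PID COUNT COMMAND" for every process owned by USER.
report_fds() {
    for pid in $(pgrep -u "$1"); do
        printf '%s %s %s\n' "$pid" "$(count_fds "$pid")" \
            "$(ps -o comm= -p "$pid")"
    done
}

# Example: sample once per second while amcheck runs in another terminal.
#   while :; do report_fds amanda; sleep 1; done
```

A process whose count keeps climbing toward the limit while amcheck runs is the likely candidate for the leak.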

Jean-Louis


Re: amanda 3.3.3 "too many files"

2013-06-05 Thread Jean-Louis Martineau

On 06/05/2013 03:56 PM, Brian Cuttler wrote:

Jean-Louis,

Thank you, I'm sorry I was unclear. Yes, of course the disklist
needs to be in place when I invoke amcheck on the server.

I'd meant that I need to find out how to up the file limit on
the client, which is a more difficult proposition since its
SMF/INET and not simply something I can script in cron on the
server. The fact that the client and the server are the same
box doesn't help much in this case.

It can help. Use the 'local' auth, which forks amandad instead of
connecting to it.
If you increase the limit for amcheck, then that amandad will get the
same limit.
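
A quick way to see why this works, as a sketch rather than anything Amanda-specific: a soft limit set in the invoking shell is inherited across fork and exec, so a forked amandad would see whatever limit amcheck was started with. The function name and the value 100 are only illustrative.

```sh
# show_inherited_limit N: set the soft fd limit to N in a subshell, then
# ask a grandchild process which limit it sees.
show_inherited_limit() {
    ( ulimit -S -n "$1" && sh -c 'sh -c "ulimit -S -n"' )
}

# Prints the limit as seen two process levels below the shell that set it.
show_inherited_limit 100
```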


Jean-Louis


thank you,

Brian

On Wed, Jun 05, 2013 at 03:08:45PM -0400, Jean-Louis Martineau wrote:

On 06/05/2013 01:41 PM, Brian Cuttler wrote:

Jean-Louis,

Yes, I did find some information on a run time mechanism to
increase the 256 file limit (file limit stored in unsigned character).

The work-around employed requires the execution of
/usr/lib/extendedFILE.so.1
prior to the binary being executed.

Following up on your maxcheck and Spindle number, I wonder if I
couldn't automatically build an alternate disklist file with
spindle number and swap it in and out. It would have to be done
dynamically (since my disklist changes and making changes in
multiple locations is error prone), but that can be scripted and
called from cron.

/* I need something that will handle both formats of DLE
  *
finsen  /export2 zfs-snapshot2
finsen  /export/home-AZ /export/home   {
 user-tar2
 include "./[A-Z]*"
 }
  *
  */

Since this is an amanda-client issue, rather than an amanda server
issue, I need to ask you, how to execute this on the client-side
before attempting to check the DLE list. Is there a way to invoke
this from the amanda daemon?

It must be done on the server before amcheck is executed.

./script-add-spindle < disklist > disklist.spindle
./amcheck CONF -odiskfile=disklist.spindle

Jean-Louis







Re: amanda 3.3.3 "too many files"

2013-06-05 Thread Brian Cuttler

Jean-Louis,

Thank you, I'm sorry I was unclear. Yes, of course the disklist
needs to be in place when I invoke amcheck on the server.

I'd meant that I need to find out how to up the file limit on
the client, which is a more difficult proposition since its
SMF/INET and not simply something I can script in cron on the
server. The fact that the client and the server are the same
box doesn't help much in this case.

thank you,

Brian

On Wed, Jun 05, 2013 at 03:08:45PM -0400, Jean-Louis Martineau wrote:
> On 06/05/2013 01:41 PM, Brian Cuttler wrote:
> >Jean-Louis,
> >
> >Yes, I did find some information on a run time mechanism to
> >increase the 256 file limit (file limit stored in unsigned character).
> >
> >The work-around employed requires the execution of
> >/usr/lib/extendedFILE.so.1
> >prior to the binary being executed.
> >
> >Following up on your maxcheck and Spindle number, I wonder if I
> >couldn't automatically build an alternate disklist file with
> >spindle number and swap it in and out. It would have to be done
> >dynamically (since my disklist changes and making changes in
> >multiple locations is error prone), but that can be scripted and
> >called from cron.
> >
> >/* I need something that will handle both formats of DLE
> >  *
> >finsen  /export2 zfs-snapshot2
> >finsen  /export/home-AZ /export/home   {
> > user-tar2
> > include "./[A-Z]*"
> > }
> >  *
> >  */
> >
> >Since this is an amanda-client issue, rather than an amanda server
> >issue, I need to ask you, how to execute this on the client-side
> >before attempting to check the DLE list. Is there a way to invoke
> >this from the amanda daemon?
> It must be done on the server before amcheck is executed.
> 
> ./script-add-spindle < disklist > disklist.spindle
> ./amcheck CONF -odiskfile=disklist.spindle
> 
> Jean-Louis
> 



Re: amanda 3.3.3 "too many files"

2013-06-05 Thread Jean-Louis Martineau

On 06/05/2013 01:41 PM, Brian Cuttler wrote:

Jean-Louis,

Yes, I did find some information on a run time mechanism to
increase the 256 file limit (file limit stored in unsigned character).

The work-around employed requires the execution of /usr/lib/extendedFILE.so.1
prior to the binary being executed.

Following up on your maxcheck and Spindle number, I wonder if I
couldn't automatically build an alternate disklist file with
spindle number and swap it in and out. It would have to be done
dynamically (since my disklist changes and making changes in
multiple locations is error prone), but that can be scripted and
called from cron.

/* I need something that will handle both formats of DLE
  *
finsen  /export2 zfs-snapshot2
finsen  /export/home-AZ /export/home   {
 user-tar2
 include "./[A-Z]*"
 }
  *
  */

Since this is an amanda-client issue, rather than an amanda server
issue, I need to ask you, how to execute this on the client-side
before attempting to check the DLE list. Is there a way to invoke
this from the amanda daemon?

It must be done on the server before amcheck is executed.

./script-add-spindle < disklist > disklist.spindle
./amcheck CONF -odiskfile=disklist.spindle
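
One possible shape for such a script, as a sketch only. It assumes the spindle is the field that follows the dumptype (or the closing brace of an inline dumptype block) and that spreading DLEs round-robin over a handful of spindle numbers is acceptable; the function name add_spindle and the default of 4 groups are invented here. Note that a later reply in this thread withdraws the spindle idea, so this only illustrates handling both DLE formats.

```sh
#!/bin/sh
# add_spindle [GROUPS]: read a disklist on stdin, write it to stdout with a
# spindle number (0 .. GROUPS-1, assigned round-robin) added to each DLE.
add_spindle() {
    awk -v groups="${1:-4}" '
        # Comments and blank lines pass through unchanged.
        /^[ \t]*#/ || /^[ \t]*$/ { print; next }

        # Inside a { ... } block: copy lines until the closing brace, then
        # append the spindle there unless something already follows it.
        inblock {
            if ($1 == "}") {
                inblock = 0
                if (NF == 1) $0 = $0 " " (n++ % groups)
            }
            print
            next
        }

        # A line ending in "{" opens a multi-line DLE.
        /[{][ \t]*$/ { inblock = 1; print; next }

        # Single-line DLE already ending in a number: assume it is a spindle.
        $NF ~ /^-?[0-9]+$/ { print; next }

        # Single-line DLE: host disk [device] dumptype
        { print $0 " " (n++ % groups) }
    '
}

# Example, matching the commands above:
#   add_spindle 4 < disklist > disklist.spindle
```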

Jean-Louis



Re: amanda 3.3.3 "too many files"

2013-06-05 Thread Brian Cuttler

Jean-Louis,

Yes, I did find some information on a run time mechanism to
increase the 256 file limit (file limit stored in unsigned character).

The work-around employed requires the execution of /usr/lib/extendedFILE.so.1
prior to the binary being executed.

Following up on your maxcheck and Spindle number, I wonder if I 
couldn't automatically build an alternate disklist file with 
spindle number and swap it in and out. It would have to be done
dynamically (since my disklist changes and making changes in 
multiple locations is error prone), but that can be scripted and
called from cron.

/* I need something that will handle both formats of DLE
 *
finsen  /export2 zfs-snapshot2
finsen  /export/home-AZ /export/home   {
user-tar2
include "./[A-Z]*"
}
 *
 */

Since this is an amanda-client issue, rather than an amanda server
issue, I need to ask you, how to execute this on the client-side
before attempting to check the DLE list. Is there a way to invoke
this from the amanda daemon?

 - Alternatively, if someone better versed than I am on the Solaris
   inetd or in SMF knows how to insert the requisite command on the
   client side - I would be appreciative if they would share their
   information.

thank you,

Brian


On Wed, Jun 05, 2013 at 11:54:35AM -0400, Jean-Louis Martineau wrote:
> Brian,
> 
> Can you increase the number of open files at the system level?
> 
> amcheck checks all DLEs in parallel; you can try to add spindle (in the
> disklist) to reduce parallelism but that can have a bad impact on dump 
> performance, so it is not a good workaround.
> 
> You would like a maxcheck  setting similar to maxdump, I put it in my 
> TODO list.
> 
> Jean-Louis
> 
> On 06/05/2013 11:05 AM, Brian Cuttler wrote:
> >Hello amanda users,
> >
> >I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
> >The system is both the server and the client, there are no other
> >clients of this system.
> >
> >We have ~265 DLEs on this system (large zfs arrays and all
> >samba shares are their own file systems and DLE, thank goodness
> >I was able to talk my manager out of making all user directories
> >their own DLE as well, though they are their own zfs file systems).
> >
> >The following errors are -not- new with 3.3.3, we've had them for
> >a while, I'd hoped the upgrade would take care of it.
> >
> >Also the amcheck leaves an amanda-check file around for one of
> >the zfs file systems (yes, configured to use zfs snapshot). [I'm
> >pretty sure these two errors are related to one another]
> >
> >The filesystem amanda-*-check file left is for the same filesystem
> >each night, unless we add/remove DLE/filesystems. So I think it is
> >the nth filesystem and at the limit of the open file counter, rather
> >than something in the file system itself.
> >
> >I was hoping there was an easy fix for this. Last I recall on the
> >topic it had to do with the fillm being a 32 bit rather than 64 bit
> >value (I could be wrong about this).
> >
> >Otherwise all # amcheck tests run successfully. Will run # amdump
> >this evening but do not anticipate any issues there.
> >
> > thank you,
> >
> > Brian
> >
> >>amcheck -c finsen
> >Amanda Backup Client Hosts Check
> >
> >ERROR: finsen: service selfcheck: selfcheck: Error opening pipe to child: 
> >Too many open files
> >ERROR: finsen: service /usr/local/libexec/amanda/selfcheck failed: pid 
> >8590 exited with code 1
> >Client check: 1 host checked in 83.304 seconds.  2 problems found.
> >
> >(brought to you by Amanda 3.3.3)
> >
> >
> >from /var/log/conlog
> >
> >Jun  5 10:55:04 finsen amandad[8583]: [ID 927837 daemon.info] connect from 
> >finsen.wadsworth.org
> >Jun  5 10:56:27 finsen selfcheck[8590]: [ID 702911 daemon.error] Error 
> >opening pipe to child: Too many open files
> >
> >
> >
> >
> >
> >---
> >Brian R Cuttler brian.cutt...@wadsworth.org
> >Computer Systems Support(v) 518 486-1697
> >Wadsworth Center(f) 518 473-6384
> >NYS Department of HealthHelp Desk 518 473-0773
> >
> 



Re: amanda 3.3.3 "too many files"

2013-06-05 Thread Jon LaBadie
On Wed, Jun 05, 2013 at 11:05:36AM -0400, Brian Cuttler wrote:
> 
> Hello amanda users,
> 
> I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
> The system is both the server and the client, there are no other
> clients of this system.
> 
> We have ~265 DLEs on this system (large zfs arrays and all
> samba shares are their own file systems and DLE, thank goodness
> I was able to talk my manager out of making all user directories
> their own DLE as well, though they are their own zfs file systems).
> 
> The following errors are -not- new with 3.3.3, we've had them for
> a while, I'd hoped the upgrade would take care of it.
> 
> Also the amcheck leaves an amanda-check file around for one of
> the zfs file systems (yes, configured to use zfs snapshot). [I'm
> pretty sure these two errors are related to one another]
> 
> The filesystem amanda-*-check file left is for the same filesystem
> each night, unless we add/remove DLE/filesystems. So I think it is
> the nth filesystem and at the limit of the open file counter, rather
> than something in the file system itself.
> 
> I was hoping there was an easy fix for this. Last I recall on the
> topic it had to do with the fillm being a 32 bit rather than 64 bit
> value (I could be wrong about this).

It could very well be too many files open.  My research says
Solaris 10 default value for "process.max-file-descriptor"
is 256.  Check it with "ulimit -n".  Solaris maintains both
a "soft" and a "hard" set of limits for some parameters.
To check the hard limit try "ulimit -H -n".

For process.max-file-descriptor an ordinary user can reduce
the hard limit but not increase it.  That user can also
reduce the soft limit and can raise it to a maximum of
the hard limit.

To raise the soft limit try either:

 ulimit -S -n 1024

or something like:

  prctl -n process.max-file-descriptor -t basic -v  1024 -r -i process $$

These could go in amanda's .profile and that would help for
login sessions, but I doubt it would help for cron started
jobs.  You may have to run it in a wrapper.
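
Such a wrapper might look like the following sketch. The function name run_with_fd_limit and the crontab path are hypothetical; the only real commands used are ulimit and exec. The raise only succeeds up to the hard limit reported by "ulimit -H -n".

```sh
#!/bin/sh
# run_with_fd_limit LIMIT COMMAND [ARGS...]
# Set the soft file-descriptor limit, then run COMMAND under that limit.
run_with_fd_limit() {
    limit=$1
    shift
    (
        # Fails (and skips the command) if LIMIT exceeds the hard limit.
        ulimit -S -n "$limit" || exit 1
        exec "$@"
    )
}

# Example crontab entry, assuming this is saved as an executable script
# that ends with:  run_with_fd_limit 1024 "$@"
#   0 1 * * * /usr/local/sbin/amanda-wrapper amdump finsen
```

Running the limit change in a subshell keeps it from affecting the rest of the calling script.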

With root access you can change the system default, but I
doubt you want it changed at the system level.  Maybe I'm
wrong, you probably only need to change the amanda server(s).

Jon
-- 
Jon H. LaBadie j...@jgcomp.com
 11226 South Shore Rd.  (703) 787-0688 (H)
 Reston, VA  20190  (609) 477-8330 (C)


Re: amanda 3.3.3 "too many files"

2013-06-05 Thread Jean-Louis Martineau

Brian,

Can you increase the number of open files at the system level?

amcheck checks all DLEs in parallel; you can try to add spindle (in the
disklist) to reduce parallelism but that can have a bad impact on dump 
performance, so it is not a good workaround.


You would like a maxcheck  setting similar to maxdump, I put it in my 
TODO list.


Jean-Louis

On 06/05/2013 11:05 AM, Brian Cuttler wrote:

Hello amanda users,

I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
The system is both the server and the client, there are no other
clients of this system.

We have ~265 DLEs on this system (large zfs arrays and all
samba shares are their own file systems and DLE, thank goodness
I was able to talk my manager out of making all user directories
their own DLE as well, though they are their own zfs file systems).

The following errors are -not- new with 3.3.3, we've had them for
a while, I'd hoped the upgrade would take care of it.

Also the amcheck leaves an amanda-check file around for one of
the zfs file systems (yes, configured to use zfs snapshot). [I'm
pretty sure these two errors are related to one another]

The filesystem amanda-*-check file left is for the same filesystem
each night, unless we add/remove DLE/filesystems. So I think it is
the nth filesystem and at the limit of the open file counter, rather
than something in the file system itself.

I was hoping there was an easy fix for this. Last I recall on the
topic it had to do with the fillm being a 32 bit rather than 64 bit
value (I could be wrong about this).

Otherwise all # amcheck tests run successfully. Will run # amdump
this evening but do not anticipate any issues there.

thank you,

Brian


amcheck -c finsen

Amanda Backup Client Hosts Check

ERROR: finsen: service selfcheck: selfcheck: Error opening pipe to child: Too 
many open files
ERROR: finsen: service /usr/local/libexec/amanda/selfcheck failed: pid 8590 
exited with code 1
Client check: 1 host checked in 83.304 seconds.  2 problems found.

(brought to you by Amanda 3.3.3)


from /var/log/conlog

Jun  5 10:55:04 finsen amandad[8583]: [ID 927837 daemon.info] connect from 
finsen.wadsworth.org
Jun  5 10:56:27 finsen selfcheck[8590]: [ID 702911 daemon.error] Error opening 
pipe to child: Too many open files










amanda 3.3.3 "too many files"

2013-06-05 Thread Brian Cuttler

Hello amanda users,

I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
The system is both the server and the client, there are no other
clients of this system.

We have ~265 DLEs on this system (large zfs arrays and all
samba shares are their own file systems and DLE, thank goodness
I was able to talk my manager out of making all user directories
their own DLE as well, though they are their own zfs file systems).

The following errors are -not- new with 3.3.3, we've had them for
a while, I'd hoped the upgrade would take care of it.

Also the amcheck leaves an amanda-check file around for one of
the zfs file systems (yes, configured to use zfs snapshot). [I'm
pretty sure these two errors are related to one another]

The filesystem amanda-*-check file left is for the same filesystem
each night, unless we add/remove DLE/filesystems. So I think it is
the nth filesystem and at the limit of the open file counter, rather
than something in the file system itself.

I was hoping there was an easy fix for this. Last I recall on the
topic it had to do with the fillm being a 32 bit rather than 64 bit
value (I could be wrong about this).

Otherwise all # amcheck tests run successfully. Will run # amdump
this evening but do not anticipate any issues there.

thank you,

Brian

> amcheck -c finsen

Amanda Backup Client Hosts Check

ERROR: finsen: service selfcheck: selfcheck: Error opening pipe to child: Too 
many open files
ERROR: finsen: service /usr/local/libexec/amanda/selfcheck failed: pid 8590 
exited with code 1
Client check: 1 host checked in 83.304 seconds.  2 problems found.

(brought to you by Amanda 3.3.3)


from /var/log/conlog

Jun  5 10:55:04 finsen amandad[8583]: [ID 927837 daemon.info] connect from 
finsen.wadsworth.org
Jun  5 10:56:27 finsen selfcheck[8590]: [ID 702911 daemon.error] Error opening 
pipe to child: Too many open files




