Re: amanda 3.3.3 "too many files"
Jean-Louis, Jon,

I've updated my amanda.conf to use auth="local" for the dumptypes
I have in use in my disklist.

> ulimit unlimited

Per Solaris instructions...

> echo 'rlim_fd_max/d' | mdb -k
rlim_fd_max:
rlim_fd_max:    0

> amcheck finsen
Amanda Tape Server Host Check
-----------------------------
Holding disk /lstripe: 546222 MB disk space available, using 546122 MB
slot 9: volume 'Finsen31'
Will write to volume 'Finsen31' in slot 9.
NOTE: skipping tape-writable test
NOTE: info dir /usr/local/etc/amanda/finsen/DailySet1/curinfo/finsen/_export2_samba_maldi does not exist
NOTE: it will be created on the next run.
NOTE: index dir /usr/local/etc/amanda/finsen/DailySet1/index/finsen/_export2_samba_maldi does not exist
NOTE: it will be created on the next run.
Server check took 4.691 seconds

Amanda Backup Client Hosts Check
ERROR: finsen: service selfcheck: selfcheck: Error opening pipe to child: Too many open files
ERROR: finsen: service /usr/local/libexec/amanda/selfcheck failed: pid 5457 exited with code 1
Client check: 1 host checked in 130.727 seconds.  2 problems found.

(brought to you by Amanda 3.3.3)

The new DLE did in fact cause the retained snapshot to change by one DLE,
in alpha order. This (re)verifies that the failure is not random and is
tied to list position.

So much for the Solaris run-time work-around:

export LD_PRELOAD_32=/usr/lib/extendedFILE.so.1

then run amcheck.
> amcheck finsen
ld.so.1: amcheck: warning: /usr/lib/extendedFILE.so.1: open failed: illegal insecure pathname
Amanda Tape Server Host Check
-----------------------------
Holding disk /lstripe: 546222 MB disk space available, using 546122 MB
slot 9: volume 'Finsen31'
FILE.so.1: open failed: illegal insecure pathname
ERROR: finsen: Application 'amgtar': can't run support command
ERROR: finsen: Application 'amgtar': ld.so.1: amgtar: warning: /usr/lib/extendedFILE.so.1: open failed: illegal insecure pathname
ERROR: finsen: Application 'amgtar': can't run support command
ERROR: finsen: Application 'amgtar': ld.so.1: amgtar: warning: /usr/lib/extendedFILE.so.1: open failed: illegal insecure pathname
ERROR: finsen: Application 'amgtar': can't run support command

Is this related to suid programs?

I don't want to make further changes before the weekend. I think I'll
implement auth="local" for amdump on Monday and see how it performs.

thank you,

Brian
Re: amanda 3.3.3 "too many files"
Jean-Louis,

I added a couple of switches to # ls and got much more informative output.

[finsen]: /proc/734/fd > ls -F -C /proc/10832/fd
0=   1=   10   12|  13|  16|  17|  2=   20|  21|  3>   6|   8|

On Thu, Jun 06, 2013 at 11:09:20AM -0400, Jean-Louis Martineau wrote:
> On 06/05/2013 11:54 AM, Jean-Louis Martineau wrote:
> >Brian,
> >
> >Can you increase the number of open files at the system level?
> >
> >amcheck checks all DLEs in parallel; you can try to add spindle (in the
> >disklist) to reduce parallelism, but that can have a bad impact on dump
> >performance, so it is not a good workaround.
>
> Forget that idea, adding spindle will not help.
>
> I think the problem is a file descriptor leak (files not closed), but it
> can be in any process.
> Can you monitor all open files for all amanda processes?
> I don't know how to do it on Solaris, but you can use 'ls /proc/PID/fd'
> on Linux.
> It will help to find which process leaks.
>
> Jean-Louis

---
   Brian R Cuttler                 brian.cutt...@wadsworth.org
   Computer Systems Support        (v) 518 486-1697
   Wadsworth Center                (f) 518 473-6384
   NYS Department of Health        Help Desk 518 473-0773
Re: amanda 3.3.3 "too many files"
On 06/05/2013 11:54 AM, Jean-Louis Martineau wrote:
> Brian,
>
> Can you increase the number of open files at the system level?
>
> amcheck checks all DLEs in parallel; you can try to add spindle (in the
> disklist) to reduce parallelism, but that can have a bad impact on dump
> performance, so it is not a good workaround.

Forget that idea, adding spindle will not help.

I think the problem is a file descriptor leak (files not closed), but it
can be in any process.
Can you monitor all open files for all amanda processes?
I don't know how to do it on Solaris, but you can use 'ls /proc/PID/fd'
on Linux.
It will help to find which process leaks.

Jean-Louis
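As a rough illustration of the per-process check suggested above, here is a minimal Python sketch that counts the entries under /proc/PID/fd. It assumes a /proc filesystem that exposes one fd directory per process (Linux does; Brian's ls of /proc/10832/fd elsewhere in this thread suggests his Solaris 10 box does too). The function names and the proc_root parameter are hypothetical helpers for this example, not part of Amanda.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of open descriptors for pid, or None if unreadable."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    try:
        return len(os.listdir(fd_dir))
    except OSError:
        # The process has exited, or we are not allowed to inspect it.
        return None


def fd_report(pids, proc_root="/proc"):
    """Map each readable pid to its open-descriptor count."""
    report = {}
    for pid in pids:
        count = count_open_fds(pid, proc_root)
        if count is not None:
            report[pid] = count
    return report
```

Calling fd_report repeatedly on the amandad, selfcheck and amcheck PIDs while a check is running should show which process, if any, keeps accumulating descriptors as it approaches the limit.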
Re: amanda 3.3.3 "too many files"
On 06/05/2013 03:56 PM, Brian Cuttler wrote:
> Jean-Louis,
>
> Thank you, I'm sorry I was unclear.
>
> Yes, of course the disklist needs to be in place when I invoke
> amcheck on the server. I'd meant that I need to find out how to
> up the file limit on the client, which is a more difficult proposition
> since it's SMF/INET and not simply something I can script in cron on
> the server.
>
> The fact that the client and the server are the same box doesn't
> help much in this case.

It can help. Use the 'local' auth, which forks amandad directly instead of
connecting to it. If you increase the limit for amcheck, then that amandad
will get the same limit.

Jean-Louis
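The 'local' auth suggestion relies on ordinary Unix behaviour: a child process inherits the resource limits of its parent. A small Python sketch of that inheritance follows, with a child Python interpreter standing in for the forked amandad. Nothing here is Amanda code, and the helper name is made up for illustration.

```python
import resource
import subprocess
import sys


def child_nofile_soft_limit():
    """Spawn a child interpreter and return the soft RLIMIT_NOFILE it sees."""
    code = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


parent_soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# The child reports the soft limit its parent had when it was spawned.
print(parent_soft, child_nofile_soft_limit())
```

By the same mechanism, a limit raised in the shell that starts amcheck would carry over to an amandad that amcheck forks, which is not the case when amandad is started separately by inetd or SMF.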
Re: amanda 3.3.3 "too many files"
Jean-Louis,

Thank you, I'm sorry I was unclear.

Yes, of course the disklist needs to be in place when I invoke
amcheck on the server. I'd meant that I need to find out how to
up the file limit on the client, which is a more difficult proposition
since it's SMF/INET and not simply something I can script in cron on
the server.

The fact that the client and the server are the same box doesn't
help much in this case.

thank you,

Brian
Re: amanda 3.3.3 "too many files"
On 06/05/2013 01:41 PM, Brian Cuttler wrote:
> Jean-Louis,
>
> Yes, I did find some information on a run-time mechanism to
> increase the 256 file limit (file limit stored in unsigned character).
>
> The work-around requires the execution of /usr/lib/extendedFILE.so.1
> prior to the binary being executed.
>
> Following up on your maxcheck and spindle number, I wonder if I
> couldn't automatically build an alternate disklist file with
> spindle numbers and swap it in and out. It would have to be done
> dynamically (since my disklist changes and making changes in
> multiple locations is error-prone), but that can be scripted and
> called from cron.
>
> /* I need something that will handle both formats of DLE
>  *
> finsen /export2 zfs-snapshot2
> finsen /export/home-AZ /export/home {
>         user-tar2
>         include "./[A-Z]*"
>         }
>  *
>  */
>
> Since this is an amanda-client issue, rather than an amanda-server
> issue, I need to ask you how to execute this on the client side
> before attempting to check the DLE list. Is there a way to invoke
> this from the amanda daemon?

It must be done on the server before amcheck is executed.

./script-add-spindle < disklist > disklist.spindle
./amcheck CONF -odiskfile=disklist.spindle

Jean-Louis
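The thread never shows script-add-spindle itself, so the following Python function is only a guess at what such a filter could look like, handling both DLE formats Brian mentions. It assumes that no DLE already carries a spindle or interface field, that comments start with '#', that a block DLE opens on a line ending in '{' and closes on a line containing only '}', and that cycling through a fixed number of spindle values (the hypothetical groups parameter) is an acceptable way to assign them. Note that Jean-Louis withdraws the spindle idea in a later reply, so this only shows the shape of the transformation.

```python
def add_spindles(lines, groups=4):
    """Append a spindle number to every DLE, cycling through `groups` values.

    `lines` are disklist lines without trailing newlines.
    """
    out = []
    dle_index = 0
    in_block = False
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)  # blank line or comment: copy unchanged
        elif in_block:
            if stripped == "}":
                # Closing brace of a block DLE: the spindle follows it.
                out.append(f"{line.rstrip()} {dle_index % groups}")
                dle_index += 1
                in_block = False
            else:
                out.append(line)  # dumptype option inside the block
        elif stripped.endswith("{"):
            in_block = True  # block DLE starts; its spindle comes later
            out.append(line)
        else:
            # Single-line DLE: host disk [device] dumptype
            out.append(f"{line.rstrip()} {dle_index % groups}")
            dle_index += 1
    return out
```

To use it as a stdin-to-stdout filter in the way the commands above expect, the caller would split the input with splitlines(), pass the list to add_spindles, and write the result back joined with newlines.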
Re: amanda 3.3.3 "too many files"
Jean-Louis,

Yes, I did find some information on a run-time mechanism to
increase the 256 file limit (file limit stored in unsigned character).

The work-around requires the execution of /usr/lib/extendedFILE.so.1
prior to the binary being executed.

Following up on your maxcheck and spindle number, I wonder if I
couldn't automatically build an alternate disklist file with
spindle numbers and swap it in and out. It would have to be done
dynamically (since my disklist changes and making changes in
multiple locations is error-prone), but that can be scripted and
called from cron.

/* I need something that will handle both formats of DLE
 *
finsen /export2 zfs-snapshot2
finsen /export/home-AZ /export/home {
        user-tar2
        include "./[A-Z]*"
        }
 *
 */

Since this is an amanda-client issue, rather than an amanda-server
issue, I need to ask you how to execute this on the client side
before attempting to check the DLE list. Is there a way to invoke
this from the amanda daemon?

- Alternatively, if someone better versed than I am in the Solaris
  inetd or in SMF knows how to insert the requisite command on the
  client side, I would be appreciative if they would share their
  information.

thank you,

Brian
Re: amanda 3.3.3 "too many files"
On Wed, Jun 05, 2013 at 11:05:36AM -0400, Brian Cuttler wrote:
>
> Hello amanda users,
>
> I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
> The system is both the server and the client, there are no other
> clients of this system.
>
> We have ~265 DLEs on this system (large zfs arrays and all
> samba shares are their own file systems and DLE, thank goodness
> I was able to talk my manager out of making all user directories
> their own DLE as well, though they are their own zfs file systems).
>
> The following errors are -not- new with 3.3.3, we've had them for
> a while, I'd hoped the upgrade would take care of it.
>
> Also the amcheck leaves an amanda-check file around for one of
> the zfs file systems (yes, configured to use zfs snapshot). [I'm
> pretty sure these two errors are related to one another]
>
> The filesystem amanda-*-check file left is for the same filesystem
> each night, unless we add/remove DLE/filesystems. So I think it is
> the nth filesystem and at the limit of the open file counter, rather
> than something in the file system itself.
>
> I was hoping there was an easy fix for this. Last I recall on the
> topic it had to do with the fillm being a 32 bit rather than 64 bit
> value (I could be wrong about this).

It could very well be too many files open.

My research says the Solaris 10 default value for
"process.max-file-descriptor" is 256. Check it with "ulimit -n".

Solaris maintains both a "soft" and a "hard" set of limits for some
parameters. To check the hard limit try "ulimit -H -n".

For process.max-file-descriptor an ordinary user can reduce the
hard limit but not increase it. That user can also reduce the
soft limit and can raise it to a maximum of the hard limit.

To raise the soft limit try either:

    ulimit -S -n 1024

or something like:

    prctl -n process.max-file-descriptor -t basic -v 1024 -r -i process $$

These could go in amanda's .profile and that would help for login
sessions, but I doubt it would help for cron-started jobs.
You may have to run it in a wrapper.

With root access you can change the system default, but I doubt you
want it changed at the system level. Maybe I'm wrong, you probably
only need to change the amanda server(s).

Jon
--
Jon H. LaBadie                 j...@jgcomp.com
11226 South Shore Rd.          (703) 787-0688 (H)
Reston, VA  20190              (609) 477-8330 (C)
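One way to picture the wrapper Jon mentions is the Python sketch below: it raises the soft descriptor limit as far as the hard limit allows and then replaces itself with the real command, so the command starts with the raised limit. The function names and the default of 1024 are illustrative assumptions, and the sketch only touches the process resource limit; it does nothing about the separate 256-descriptor stdio restriction discussed elsewhere in this thread.

```python
import os
import resource


def raise_nofile_soft_limit(target=1024):
    """Raise the soft RLIMIT_NOFILE toward `target`, capped by the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    inf = resource.RLIM_INFINITY
    if soft == inf:
        return soft  # already unlimited, nothing to raise
    wanted = target if hard == inf else min(target, hard)
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        soft = wanted
    return soft


def run_with_raised_limit(argv, target=1024):
    """Raise the limit, then replace this process with the given command."""
    raise_nofile_soft_limit(target)
    os.execvp(argv[0], argv)  # does not return on success
```

A cron entry could then call a small script whose only job is run_with_raised_limit(["amcheck", "finsen"]), which avoids relying on .profile being read.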
Re: amanda 3.3.3 "too many files"
Brian,

Can you increase the number of open files at the system level?

amcheck checks all DLEs in parallel; you can try to add spindle (in the
disklist) to reduce parallelism, but that can have a bad impact on dump
performance, so it is not a good workaround.

You would like a maxcheck setting similar to maxdump; I put it in my
TODO list.

Jean-Louis
amanda 3.3.3 "too many files"
Hello amanda users,

I just updated amanda 3.3.0 to 3.3.3 on a Solaris 10/x86 system.
The system is both the server and the client, there are no other
clients of this system.

We have ~265 DLEs on this system (large zfs arrays and all
samba shares are their own file systems and DLE, thank goodness
I was able to talk my manager out of making all user directories
their own DLE as well, though they are their own zfs file systems).

The following errors are -not- new with 3.3.3, we've had them for
a while, I'd hoped the upgrade would take care of it.

Also the amcheck leaves an amanda-check file around for one of
the zfs file systems (yes, configured to use zfs snapshot). [I'm
pretty sure these two errors are related to one another]

The filesystem amanda-*-check file left is for the same filesystem
each night, unless we add/remove DLE/filesystems. So I think it is
the nth filesystem and at the limit of the open file counter, rather
than something in the file system itself.

I was hoping there was an easy fix for this. Last I recall on the
topic it had to do with the fillm being a 32 bit rather than 64 bit
value (I could be wrong about this).

Otherwise all # amcheck tests run successfully. Will run # amdump
this evening but do not anticipate any issues there.

thank you,

Brian

> amcheck -c finsen
Amanda Backup Client Hosts Check
ERROR: finsen: service selfcheck: selfcheck: Error opening pipe to child: Too many open files
ERROR: finsen: service /usr/local/libexec/amanda/selfcheck failed: pid 8590 exited with code 1
Client check: 1 host checked in 83.304 seconds.  2 problems found.

(brought to you by Amanda 3.3.3)

from /var/log/conlog

Jun  5 10:55:04 finsen amandad[8583]: [ID 927837 daemon.info] connect from finsen.wadsworth.org
Jun  5 10:56:27 finsen selfcheck[8590]: [ID 702911 daemon.error] Error opening pipe to child: Too many open files