Hi Pam,

During incremental backup, the peak memory utilization usually occurs when
the client is processing the directory with the largest number of files. In
this case, unless your machine has only 8 MB of RAM ;-) I do not see how
~15K objects could cause memory to be exhausted. It doesn't pass the "sniff
test".
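
As a rough sanity check on that, here is a back-of-the-envelope sketch in
Python. The figure of about 300 bytes of client memory per object is an
assumption (a commonly quoted rule of thumb for incremental backup, not a
measured value for this system), and the function name is made up for
illustration.

```python
# Rough estimate of client memory needed to hold one directory's object list.
# ASSUMPTION: ~300 bytes per object, a commonly quoted rule of thumb;
# the real figure depends on client version and path lengths.
ASSUMED_BYTES_PER_OBJECT = 300


def estimate_directory_memory_mib(object_count, bytes_per_object=ASSUMED_BYTES_PER_OBJECT):
    """Return the estimated memory in MiB for a directory with object_count objects."""
    return object_count * bytes_per_object / (1024 * 1024)


if __name__ == "__main__":
    # 15551 is the largest TOTAL_OBJECTS value from Pam's SELECT output;
    # the other two counts are only there for comparison.
    for count in (15_551, 200_000, 1_000_000):
        print(f"{count:>9} objects: ~{estimate_directory_memory_mib(count):.1f} MiB")
```

Under that assumption the largest directory here needs only about 4 to 5 MiB,
which is why ~15K objects does not explain the failure.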

Regarding the 6.4 error where you see the return code 11: This most likely
corresponds to errno EAGAIN, which means there were insufficient system
resources to create a new thread. That points to a shortage of some other
system resource, not to insufficient memory.
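
If you want to check this mapping yourself, here is a minimal Python sketch
that looks up the symbolic names and message text for a numeric errno on
whatever system it runs on. It assumes, as above, that the rc in the ANS0361I
message is a raw errno value, which the message itself does not confirm.
errno numbering is platform-specific, so run it on the AIX host itself. The
function name is made up for illustration.

```python
import errno
import os


def describe_errno(rc):
    """Return (symbolic names, message text) for a numeric errno on this platform."""
    # Several names can share one value (for example EAGAIN and EWOULDBLOCK
    # on many systems), so collect every match instead of just one.
    names = sorted(
        name
        for name in dir(errno)
        if name.startswith("E") and getattr(errno, name) == rc
    )
    return names, os.strerror(rc)


if __name__ == "__main__":
    # rc=11 is taken from "ANS0361I DIAG: Thread creation failed; rc=11."
    names, message = describe_errno(11)
    print(f"errno 11 -> {names or ['<unknown>']}: {message}")
```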

A shot in the dark, but... by any chance is the AIX system configured to
use 64 KB page sizes? I ask because of this AIX APAR which *might* be a
match:

http://www.ibm.com/support/docview.wss?uid=isg1IZ27457

(See the Comments section of the APAR to match the actual 6.1 maintenance
level.)

Best regards,

Andy

____________________________________________________________________________

Andrew Raibeck | IBM Spectrum Protect Level 3 | stor...@us.ibm.com

IBM Tivoli Storage Manager links:
Product support:
https://www.ibm.com/support/entry/portal/product/tivoli/tivoli_storage_manager

Online documentation:
http://www.ibm.com/support/knowledgecenter/SSGSG7/landing/welcome_ssgsg7.html

Product Wiki:
https://www.ibm.com/developerworks/community/wikis/home/wiki/Tivoli%20Storage%20Manager

"ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 2016-04-27
15:56:42:

> From: "Pagnotta, Pamela (CONTR)" <pamela.pagno...@hq.doe.gov>
> To: ADSM-L@VM.MARIST.EDU
> Date: 2016-04-27 15:59
> Subject: Re: TSM Client upgrade on AIX
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
>
> Andy,
>
> Here are the top few entries from the select statement. From what I
> can tell, none of the filesystems have more than 200K objects on this
> client.
>
> FILESPACE_NAME: /gridfs
>        HL_NAME: /oraem/Oracle/admin/emrep/adump/
>  TOTAL_OBJECTS: 15551
>
> FILESPACE_NAME: /gridfs
>        HL_NAME: /oraem/Oracle/middleware/oms/sysman/archives/emgc/deployments/GCDomain/emgc.ear/em.war/cabo/jsLibs/resources/
>  TOTAL_OBJECTS: 2473
>
> FILESPACE_NAME: /gridfs
>        HL_NAME: /oraem/Oracle/middleware/logs/
>  TOTAL_OBJECTS: 2344
>
> Since moving back to version 6.4.2.0 there is a new message
>
> 04/27/16   02:33:22 ANS0361I DIAG: Thread creation failed; rc=11.
> 04/27/16   02:33:24 ANS1999E Incremental processing of '/usr' stopped.
>
> The only Technote I can find that is close to this message and rc is
> TSM server related on a Linux system.  Is there anywhere where these
> diagnostic return codes are defined for TSM administrators?
>
> I have moved this backup to a quieter time of the night to see if
> that helps at all.
>
> I will open a ticket for this new error tomorrow.
>
> Thank you,
>
> Pam Pagnotta
> Sr. System Engineer
> Criterion Systems, Inc./ActioNet
> Contractor to U.S. Department of Energy
> Office of the CIO/IM-622
> Office: 301-903-5508
> Mobile: 301-335-8177
>
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On
> Behalf Of Andrew Raibeck
> Sent: Wednesday, April 27, 2016 2:19 PM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] TSM Client upgrade on AIX
>
> Hi Pam,
>
> Do any of the file systems happen to have directories that contain large
> numbers of files (say, more than a million)? You could try running this
> SELECT statement from an administrative command line client to assess
> this. Make sure to replace the node name RAIBECK with your node name (in
> all upper case):
>
> select filespace_name, hl_name, count(*) as total_objects from backups
> where node_name='RAIBECK' and state='ACTIVE_VERSION' group by
> filespace_name, hl_name order by 3 desc
>
> You can cancel the output after the first few lines. What you are looking
> for is the top of the list, which would show you which file system and
> directory has the largest number of files. How many objects are there?
>
> If there are millions of files involved, it could be that memory is being
> exhausted (how much memory is available on this system?); though I would
> normally expect a proper "out of memory" message, rather than the more
> cryptic message you are seeing. In the past, I have heard customers say
> that this occurs after upgrading the client, but what really happened was
> that the number of files in the directory was growing continuously, and
> eventually the backup could not allocate enough memory; and the upgrade
> just happened to roughly coincide with the onset of the issue. I cannot
> say whether this is possible in your situation, but I am just sharing
> some of my past experiences with this issue.
>
> From which client version and bit-architecture did you upgrade to 7.1? I
> see you put 6.4 on as a "workaround", but what was the original version?
> Earlier client versions did see an increase in memory usage when the
> clients were changed from 32-bit to 64-bit, as 64-bit software tends to
> use more memory (pointer variables are 8 bytes rather than 4 bytes, and
> that is one chief contributor). But no such change occurred from 6.4 to
> 7.1, so why memory would be exhausted in 7.1 but not 6.4, I have no
> immediate idea.
>
> If the affected machine does not really have any directories with huge
> numbers of files, then this could be something else... I would invite you
> to reopen your PMR, let me know, and I will have it escalated to our
> Level 2 support for further investigation. As I mentioned earlier, the
> cryptic calloc() error does not seem right.
>
> Best regards,
>
> Andy
>
> "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 2016-04-27
> 13:15:50:
>
> > From: "Pagnotta, Pamela (CONTR)" <pamela.pagno...@hq.doe.gov>
> > To: ADSM-L@VM.MARIST.EDU
> > Date: 2016-04-27 13:17
> > Subject: Re: Re: TSM Client upgrade on AIX
> > Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> >
> > Hi Dave,
> >
> > Thank you for the information. I, of course, could not get the
> > person assigned to my ticket to even acknowledge that this might be
> > due to some different memory requirements for the newer TSM clients.
> > The only response I received was that we just must not have enough
> > memory on our system to do the backup despite being told that there
> > was no issue with an older client.
> >
> > Pam
> >
> >
> > -----Original Message-----
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On
> > Behalf Of David Bronder
> > Sent: Wednesday, April 27, 2016 12:56 PM
> > To: ADSM-L@VM.MARIST.EDU
> > Subject: Re: [ADSM-L] Re: TSM Client upgrade on AIX
> >
> > This isn't really helpful for your specific situation, Pam (I don't
> > think I've had the specific errors you've seen).  But I have noticed
> > (with much dismay) that the 7.x clients for AIX have required
> > significantly more memory than earlier versions.  I have clients with
> > 1+ million files in a filesystem that had no problems with 6.x and
> > earlier clients, but consistently required huge data ulimits after
> > upgrading to 7.x (and would fail, often completely silently, if the
> > ulimit wasn't high enough).
> >
> > I don't know what IBM did with the 7.x clients to make them so
> > memory-greedy compared to earlier versions.  Maybe the client-side
> > dedupe support or something, though I'm not using those newer features
> > currently, so I would hope that wouldn't be a factor.  Then again, I
> > would hope IBM would realize that setting the data ulimit to unlimited
> > isn't really a best practice and that having successful backups
> > shouldn't require risking breaking services on the systems those
> > backups are protecting.  (</soapbox>)
> >
> > So far, I've gotten by with a non-unlimited ulimit, but it seems like
> > I do have to keep raising it with each new 7.x client release...
> >
> > =Dave
> >
> >
> > On 04/27/2016 09:09 AM, Pagnotta, Pamela (CONTR) wrote:
> > > Hi Matthew,
> > >
> > > Yes, the root user ulimits are set to unlimited on all the AIX
> > > servers.
> > >
> > > Regards,
> > >
> > > Pam Pagnotta
> > >
> > > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On
> > > Behalf Of Matthew McGeary
> > > Sent: Wednesday, April 27, 2016 9:56 AM
> > > To: ADSM-L@VM.MARIST.EDU
> > > Subject: Re: [ADSM-L] TSM Client upgrade on AIX
> > >
> > > Good morning Pam,
> > >
> > > We encountered errors backing up filesystems with large numbers of
> > > files until we set the root user ulimits to unlimited.  That fixed
> > > the problem but can have other consequences, obviously.  Do you know
> > > if your AIX admin tried changing the ulimits?
> > >
> > > Regards,
> > > __________________________
> > >
> > > Matthew McGeary
> > > Senior Technical Specialist - Infrastructure
> > > PotashCorp
> > > T: (306) 933-8921
> > > www.potashcorp.com
> > >
> > > From:        "Pagnotta, Pamela (CONTR)" <pamela.pagno...@hq.doe.gov>
> > > To:        ADSM-L@VM.MARIST.EDU
> > > Date:        04/27/2016 07:47 AM
> > > Subject:        [ADSM-L] TSM Client upgrade on AIX
> > > Sent by:        "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> > >
> > > ________________________________
> > >
> > >
> > >
> > > Hello,
> > >
> > > Recently one of our AIX administrators upgraded the TSM client to
> > > 7.1.4.4 on her servers. Many of them started receiving errors like
> > >
> > > calloc() failed: Size 31496 File ../mem/mempool.cpp Line 1092
> > >
> > > I looked this up and the indication is that the AIX server could
> > > not supply enough memory to TSM to complete the backup. We opened a
> > > ticket and were told to try memoryefficientbackup with
> > > diskcachemethod. This did not fix the issue.
> > >
> > > In frustration the administrator reinstalled a TSM client version
> > > of 6.4.2.0 and is no longer experiencing the memory problems.
> > >
> > > Any thoughts?
> > >
> > > Thank you,
> > >
> > > Pam
> > >
> >
> > --
> > Hello World.                                David Bronder - Systems Architect
> > Segmentation Fault                                      ITS-EI, Univ. of Iowa
> > Core dumped, disk trashed, quota filled, soda warm.   david-bron...@uiowa.edu
> >