The du -hs /backup/pool /backup/cpool /backup/pc/* command has finished.
In short, one host was taking up 6.9 TB of data, with 2.8 TB in the cpool
directory, while most of the other hosts averaged about a GB each.

That one host is our file server, which I happen to know has a 2 TB volume
(1.3 TB currently used) holding our main fileshare.

I looked through the error logs for this PC, focusing on the backups with
the most errors, and found thousands of entries like this:

Unable to read 8388608 bytes from
/var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0,
seekPosn=1501757440 (0,512,147872,1499463680,2422719488)
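For what it's worth, a quick way to check whether other hosts are hitting the
same error is to tally "Unable to read" lines per host. This is only a sketch
under assumptions: it expects plain-text *LOG* files under each per-host
directory, and BackupPC normally compresses its XferLOG files, so you may need
to pipe through BackupPC_zcat instead of cat on a real installation:

```shell
#!/bin/sh
# Sketch only: count "Unable to read" lines in each host's logs.
# Assumes uncompressed *LOG* files under $1/pc/<host>/ -- adjust the
# glob (or decompress with BackupPC_zcat) to match your installation.
count_read_errors() {
  base=${1:-/var/lib/BackupPC}
  for d in "$base"/pc/*/; do
    [ -d "$d" ] || continue
    host=$(basename "$d")
    n=$(cat "$d"/*LOG* 2>/dev/null | grep -c 'Unable to read')
    printf '%s %s\n' "$host" "$n"
  done
}
```

Hosts with non-zero counts would be the ones worth investigating first.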

I didn't see any of the "BackupPC_link got error -4" errors. So now I'm
running this command:

du -hs /backup/pool /backup/cpool /backup/pc/myfileserver/*

to see which backups are doing the most damage. I'll report back once that
finishes.
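As an aside, the reason this ordering works: du counts each hardlinked inode
only once, at the first path where it sees it, so listing the pool directories
first makes each backup's figure show only data that is NOT linked into the
pool. A toy demonstration with throwaway paths (not real BackupPC data):

```shell
#!/bin/sh
# Toy demo: du counts a hardlinked file once, under the first listed path.
tmp=$(mktemp -d)
mkdir -p "$tmp/pool" "$tmp/pc/host/0"
dd if=/dev/zero of="$tmp/pool/pooled" bs=1024 count=100 2>/dev/null
ln "$tmp/pool/pooled" "$tmp/pc/host/0/pooled"   # linked into the "pool"
dd if=/dev/zero of="$tmp/pc/host/0/orphan" bs=1024 count=50 2>/dev/null
# With the pool listed first, the backup dir reports only the unlinked
# "orphan" data, not the hardlinked "pooled" file:
du -sk "$tmp/pool" "$tmp/pc/host/0"
rm -rf "$tmp"
```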

Thanks for all your help!


Regards,
Craig


On Wed, Oct 30, 2013 at 10:24 PM, Holger Parplies <wb...@parplies.de> wrote:

> Hi,
>
> Adam Goryachev wrote on 2013-10-31 09:04:48 +1100 [Re: [BackupPC-users]
> Disk space used far higher than reported pool size]:
> > On 31/10/13 07:51, Holger Parplies wrote:
> > > [...]
> > > Aside from that, I would think it might be worth the effort of
> determining
> > > whether all hosts are affected or not (though I can't really see why
> there
> > > should be a difference between hosts). If some aren't, you could at
> least
> > > keep their history.
> > I suspect at least some hosts OR some backups are correct, or else OP
> > wouldn't have anything in the pool.
>
> as I understand it, the backups from before the change from smb to rsyncd
> are
> linked into the pool. Since the change, some or all are not. Whether the
> change of XferMethod has anything to do with the problem or whether it
> coincidentally happened at about the same point in time remains to be seen.
> I still suspect the link to $topDir as cause, and BackupPC_link is
> independent
> of the XferMethod used (so a change in XferMethod shouldn't have any
> influence).
>
> > [...] you might want to look at one individual host like this:
> > du -sm /backup/pool /backup/cpool /backup/pc/host1/*
> >
> > This should be a *lot* quicker than the previous du command, and also
> > should show minimal disk usage for each backup for host1. It is quicker
> > because you are only looking at the set of files for the pool, plus one
> > host.
>
> Just keep in mind that *incrementals* might be small even if not linked to
> pool files.
>
> Oh, and there is still another method that is *orders of magnitude* faster:
> look into the log file(s), or even at the *size* of the log files. If it
> happens every day, for each host, it shouldn't be hard to find. You can
> even
> write a Perl one-liner to show you which hosts it happens for (give me a
> sample log line and I will).
>
> If the log files show nothing, we're back to finding the problem, but I
> doubt
> that. You can't "break pooling" by copying, as was suggested. Yes, you get
> independent copies of files, and they might stay independent, but changed
> files should get pooled again, and your file system usage wouldn't continue
> growing in such a way as it seems to be. If pooling is currently "broken",
> there's a reason for that, and there should be log messages indicating
> problems.
>
> > PS, at this stage, you may want to look at the recent thread regarding
> > disk caches, and caching directory entries instead of file contents. It
> > might help with all the directory based searches you are doing to find
> > the problem. Long term you may (or not) want to keep the settings.
>
> Yes, but remember that for a similarly sized pool it used up about 32 GB of
> 96 GB available memory. If you can do your investigation on a reasonably
> idle
> system (i.e. not running backups, without long pauses), you should get all
> the
> benefits of caching your amount of memory allows without any tuning. And
> even
> tuning won't let you hold 32 GB of file system metadata in 4 GB of memory
> :-).
> It all depends on file count and hardware memory configuration.
>
> Regards,
> Holger
>
>
> ------------------------------------------------------------------------------
> Android is increasing in popularity, but the open development platform that
> developers love is also attractive to malware creators. Download this white
> paper to learn more about secure code signing practices that can help keep
> Android apps secure.
> http://pubads.g.doubleclick.net/gampad/clk?id=65839951&iu=/4140/ostg.clktrk
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users@lists.sourceforge.net
> List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki:    http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/
>
