As I said in my previous email, I'd recommend using the poolCnt files since
they accurately reflect what is being stored for each host.  But it will
take a bit more coding.  You can look in some of the BackupPC utilities
(eg: BackupPC_refCountUpdate) for examples of how to read the poolCnt files.

Craig
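
The apportioning idea quoted further down (share each pool file's size among
clients in proportion to their reference counts) could be sketched roughly as
follows. This is only an illustration in Python, not BackupPC code: it does not
read real poolCnt files, whose on-disk format is not described in this thread,
and the inputs `pool_sizes` and `host_refcounts` as well as the function name
`apportion_pool_usage` are hypothetical. It also assumes that the total
reference count of a pool file is the sum of the per-host counts.

```python
from collections import defaultdict


def apportion_pool_usage(pool_sizes, host_refcounts):
    """Split each pool file's size among hosts by reference-count ratio.

    pool_sizes:     {digest: size_in_bytes}            (hypothetical input)
    host_refcounts: {host: {digest: reference_count}}  (hypothetical input)
    Returns {host: apportioned_bytes}.
    """
    # Total reference count per pool file, summed over all hosts (assumption).
    total_refs = defaultdict(int)
    for counts in host_refcounts.values():
        for digest, count in counts.items():
            total_refs[digest] += count

    usage = {}
    for host, counts in host_refcounts.items():
        share = 0.0
        for digest, count in counts.items():
            # Skip digests with no known size or no references at all.
            if digest in pool_sizes and total_refs[digest] > 0:
                share += pool_sizes[digest] * count / total_refs[digest]
        usage[host] = share
    return usage


if __name__ == "__main__":
    # Made-up example data: two pool files, two hosts.
    pool_sizes = {"digestA": 1000, "digestB": 300}
    host_refcounts = {
        "host1": {"digestA": 3, "digestB": 1},
        "host2": {"digestA": 1},
    }
    for host, size in apportion_pool_usage(pool_sizes, host_refcounts).items():
        print(f"{host}: {size:.0f} bytes")
```

With these made-up numbers, host1 is charged 1050 bytes and host2 250 bytes,
which together equal the 1300 bytes held in the pool.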

On Wed, Sep 2, 2020 at 11:03 PM Ján ONDREJ (SAL) <ondr...@salstar.sk> wrote:

> Hello,
>
>   I'm trying to write a script that computes backup sizes, but without
> success so far. I can parse XferLOG, extract sizes, and identify links, but
> after summing, some backups have zero or almost zero size. It looks like
> this happens when there is no initial backup (backup with id 0) in which
> all files were transferred. My XferLOGs contain only changes.
>
> Here is what I counted from XferLOG for one server:
> digitall/XferLOG.625.z: 4.788 MB
> digitall/XferLOG.628.z: 0.899 MB
> digitall/XferLOG.629.z: 19.221 MB
> digitall/XferLOG.623.z: 0.059 MB
> digitall/XferLOG.627.z: 0.000 MB
> digitall/XferLOG.622.z: 0.065 MB
> digitall/XferLOG.624.z: 0.138 MB
> digitall/XferLOG.626.z: 0.060 MB
> HOST: digitall: 25.231 MB
>
> As you can see, all the files from XferLOG use 25 MB of disk space.
> But I see a 4 GB file stored in my backup when checking from the BackupPC
> web interface.
>
> I still think that this backup File Size information is useless.
> What does it mean?
> 1. How much data has been transferred: NO
> 2. How much data is stored on disk: NO
> 3. How much data is on the source filesystem: NO
>
> I really don't know what this means.
>
> Please at least let me know how I can check how much data is stored
> in each backup. I can walk them all using BackupPC_tarCreate or something
> similar, or using backuppcfs, but this is very slow for large backups.
>
>                                                         SAL
>
> On Mon, Aug 31, 2020 at 12:20:13PM -0700, Craig Barratt wrote:
> > I've tried to suggest a couple of reasons that could explain what you are
> > seeing, based on very incomplete information.
> >
> > Without you confirming what the issue actually is, your conclusion is
> > already that it's "absolutely buggy" and "useless".  To pick another
> > example, rsync -aHv will also report a total file size that is the sum of
> > the hardlink file sizes, and it also reports the actual bytes transferred
> > and the speedup.  So by your logic, does that mean rsync is also
> > "absolutely buggy" and "useless"?
> >
> > I'd recommend you actually understand the issue, and then decide what the
> > best options are.
> >
> > If your question is "is there a reasonable way to apportion pool usage among
> > backup clients?" you are asking a question that doesn't have a simple
> > answer, because of hardlinks on the clients and pooling among all the
> > clients.
> >
> > That said, it wouldn't be too hard to write a script that reads the
> > reference counts for a client (which includes all the backups for that
> > client) and apportions the pool file sizes to that client based on the
> > ratio of its own reference count to the total pool reference count for
> > each pool file.  But that's just one way of doing it.  And commercial
> > auditing/billing tools are well out of scope for BackupPC, but you are
> > most welcome to contribute anything you develop.
> >
> > Craig
> >
> > On Mon, Aug 31, 2020 at 9:38 AM Ján ONDREJ (SAL) <ondr...@salstar.sk> wrote:
> >
> > > Hello,
> > >
> > >   thanks for the explanation, but how can I check in BackupPC which user
> > > uses how much disk space on my BackupPC storage? As it is, the File Size
> > > counter is absolutely buggy.
> > >
> > >   I need to check which backup uses most of my space and find out
> > > where I should exclude more files. But there is no information that
> > > I can use. New Files covers only new files; it doesn't show how many
> > > files there are in total. In Total Files, hardlinked files are counted
> > > multiple times, which makes this counter show 10x more space usage than
> > > is real.
> > >
> > >   As it is, the Total Files counter is useless; it is only useful for
> > > Windows users, who don't use hardlinks.
> > >
> > >                                                         SAL
> > >
> > > On Mon, Aug 31, 2020 at 09:09:02AM -0700, Craig Barratt via BackupPC-users wrote:
> > > > That file is a hardlink, not a symlink. In the backup stats, each
> > > > instance of a hardlink is counted towards the total file size.
> > > >
> > > > If your file system has a lot of hardlinks, perhaps that's why the
> > > > reported number is higher than you expect?
> > > >
> > > > Craig
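
To illustrate how counting every hardlink instance inflates a total, here is a
minimal Python sketch (not BackupPC code) that walks a directory tree and
returns both the sum over every directory entry and the sum over unique
inodes. The function name `apparent_and_unique_sizes` is hypothetical, and the
demo assumes a filesystem that supports hardlinks.

```python
import os
import stat
import tempfile


def apparent_and_unique_sizes(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes counts each directory entry, so a file with three
    hardlinks is counted three times; unique_bytes counts each inode once.
    """
    apparent = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices, sockets, ...
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)  # identifies the underlying inode
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "catalogue.pdf")
        with open(original, "wb") as fh:
            fh.write(b"x" * 1000)
        # Two extra names for the same inode.
        os.link(original, os.path.join(tmp, "copy1.pdf"))
        os.link(original, os.path.join(tmp, "copy2.pdf"))
        print(apparent_and_unique_sizes(tmp))
```

A tree with many hardlinked files shows the same pattern at scale: the first
number grows with every link, while the second tracks the data actually present.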
> > > >
> > > > On Mon, Aug 31, 2020 at 12:40 AM Ján ONDREJ (SAL) <ondr...@salstar.sk> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > On Mon, Aug 31, 2020 at 12:08:50AM -0700, Craig Barratt via BackupPC-users wrote:
> > > > > > Does your backup include sparse files?
> > > > >
> > > > > I don't think so.
> > > > >
> > > > > > Look in the XferLOG file to see the sizes of individual files - it
> > > > > > shouldn't be too hard to spot one that is large.
> > > > >
> > > > > There is no single large file. As I wrote, the restored backup is not
> > > > > that large either.
> > > > > But you pointed me to the right place. I see this line in XferLOG:
> > > > >
> > > > >     new    recv hf..tpog... rw-r--r--     1000,    1000  25089367 var/www/public/media/598522/catalogue.pdf => var/www/private/import/docs/catalogue.pdf
> > > > >
> > > > > This is a symlink and its size is counted as 25089367.
> > > > > According to the "=>" symbol, this symlink is properly identified as
> > > > > a symlink, but its size is stored as that of the symlink's target
> > > > > file. This is why the backup size is larger than my filesystem. Can
> > > > > this be fixed?
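
For reference, the `h` at the start of the flag string and the `=>` separator
in the XferLOG line quoted above are how rsync's itemized output and its `%L`
log escape mark a hard link; a symlink would show `L` as the second flag
character and `->` as the separator. A minimal Python sketch of classifying
such a line follows. The column layout is an assumption inferred from that
single quoted line, and `classify_xferlog_line` is a hypothetical helper, not
part of BackupPC.

```python
import re

# Assumed column layout, inferred from the one XferLOG line quoted in this
# thread: status, operation, itemized flags, permissions, "uid, gid", size,
# then the path (optionally followed by " => target" or " -> target").
LINE_RE = re.compile(
    r"^\s*(?P<status>\S+)\s+(?P<op>\S+)\s+(?P<flags>\S+)\s+(?P<perms>\S+)"
    r"\s+(?P<uid>\d+),\s+(?P<gid>\d+)\s+(?P<size>\d+)\s+(?P<rest>.+)$"
)


def classify_xferlog_line(line):
    """Hypothetical helper: describe one log line, or return None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    flags = m.group("flags")
    if flags[0] == "h":        # rsync itemize: first char 'h' = hard link
        kind, sep = "hardlink", " => "
    elif flags[1:2] == "L":    # rsync itemize: second char 'L' = symlink
        kind, sep = "symlink", " -> "
    else:
        kind, sep = "other", None
    path, target = m.group("rest"), None
    if sep is not None and sep in path:
        path, target = path.split(sep, 1)
    return {"kind": kind, "size": int(m.group("size")),
            "path": path, "target": target}


if __name__ == "__main__":
    line = ("    new    recv hf..tpog... rw-r--r--     1000,    1000  25089367 "
            "var/www/public/media/598522/catalogue.pdf => "
            "var/www/private/import/docs/catalogue.pdf")
    info = classify_xferlog_line(line)
    print(info["kind"], info["size"])  # prints: hardlink 25089367
```

Summing `size` only for lines whose kind is not `hardlink` would give a figure
closer to the data actually transferred, under the same layout assumption.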
> > > > >
> > > > >                                                         SAL
> > > > >
> > > > > > On Sun, Aug 30, 2020 at 11:51 PM Ján ONDREJ (SAL) <ondr...@salstar.sk> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > >   I have been using BackupPC for years, but after updating to v4
> > > > > > > (4.4.0), some backups have inconsistent sizes displayed in the
> > > > > > > "File Size/Count Reuse Summary" table.
> > > > > > >
> > > > > > >   These are my server's filesystems, which should be in the backup:
> > > > > > >
> > > > > > > Filesystem                 Size  Used Avail Use% Mounted on
> > > > > > > /dev/md0                   4.0G  3.0G  1.1G  75% /
> > > > > > > /dev/mapper/vg_server-www  200G  119G   82G  60% /var/www
> > > > > > >
> > > > > > > There is no other filesystem mounted on or bound to this directory.
> > > > > > >
> > > > > > > Its total size is 200 GB, with only 118 GB used. Some files are
> > > > > > > excluded from the backup, so a full backup should be about 120 GB
> > > > > > > uncompressed, and less after compression. But our File Size table
> > > > > > > looks like this:
> > > > > > >
> > > > > > >               Totals                      Existing Files     New Files
> > > > > > > Backup# Type  #Files   Size/MiB  MiB/sec  #Files   Size/MiB  #Files  Size/MiB
> > > > > > > 0       full  3220584  943488.6  26.86    2625461  872577.7  1232    537.4
> > > > > > >
> > > > > > > As you can see, this backup's total size is 921 GiB. How is it
> > > > > > > possible that a 200 GB partition is stored as 900 GB?
> > > > > > > Also, according to the inode counts on Linux, my server has:
> > > > > > >
> > > > > > > Filesystem                   Inodes   IUsed     IFree IUse% Mounted on
> > > > > > > /dev/md0                    4194240   51082   4143158    2% /
> > > > > > > /dev/mapper/vg_fusion-www 209715200 2470434 207244766    2% /var/www
> > > > > > >
> > > > > > > So there are 2.5 million files, some of them excluded, but the
> > > > > > > backup reports 3.2 million.
> > > > > > >
> > > > > > > I tried to restore the files. The restore downloaded a 68 GB tar
> > > > > > > package. This looks realistic, but if there are only 70 GB of data,
> > > > > > > why is it displayed as 900 GB on the BackupPC status page?
> > > > > > >
> > > > > > > I need to find out which server is using most of my backup space.
> > > > > > > I know that this is hard to determine, because files are shared
> > > > > > > between servers (deduplicated), but at least I can estimate it. But
> > > > > > > if the statistics display multiples of the real usage, then it's
> > > > > > > impossible to approximate.
> > > > > > >
> > > > > > > I deleted all backups of this server in the hope that it would
> > > > > > > help, but it didn't. :-(
> > > > > > >
> > > > > > > Thanks for your help.
> > > > > > >
> > > > > > >                                                 SAL
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > BackupPC-users mailing list
> > > > > > > BackupPC-users@lists.sourceforge.net
> > > > > > > List:
> > > https://lists.sourceforge.net/lists/listinfo/backuppc-users
> > > > > > > Wiki:    https://github.com/backuppc/backuppc/wiki
> > > > > > > Project: https://backuppc.github.io/backuppc/
