Re: Estimating backup usage with dir-merge filter
>> It sounds like you missed the point of Kevin's message (in the other fork of
>> this thread). The point wasn't to use `du`, it was that you can run your
>> stats against the backed-up files, not the source. Then you're only running
>> stats against the results of running the backup using the filters, so you
>> don't need to filter them again.
>
> I got that but neglected to respond to the whole group. My mistake.
> The backups are being performed using BackupPC to a central server
> where compression and de-duplication are done. While it's true that
> the actual storage on the backup server being consumed by each user is
> less because of these, I don't have any problem hiding this from them
> and instead telling them what their uncompressed and duplicated usage
> is. It has more of an effect that way, if you know what I mean.
>
>> If that doesn't make sense or isn't possible (backups are on some remote
>> server), then just use your rsync command with '--list-only', and
>> post-process that list.
>
> I've been tinkering with using --verbose and --dry-run then parsing
> the total size out of the last line of the output and I think I'm
> close. Curiously, when I don't include the --filter option as a
> baseline, I'm not getting the same results as "du".
>
> $ du -sb . | awk '{print $1}'
> 508625653
>
> $ rsync --dry-run --verbose -a . /tmp/does_not_exist | tail -1 | awk '{print $4}'
> 506037893
>
> The difference is minimal and probably negligible for this purpose but
> I'm still curious where it's coming from. Maybe there are some sparse
> files in there somewhere.

Do you have the same discrepancy if you use the --stats option?

This email is protected by LBackup
http://www.lbackup.org

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: Estimating backup usage with dir-merge filter
On Thu, Oct 6, 2011 at 4:01 PM, Benjamin R. Haskell wrote:
> It sounds like you missed the point of Kevin's message (in the other fork of
> this thread). The point wasn't to use `du`, it was that you can run your
> stats against the backed-up files, not the source. Then you're only running
> stats against the results of running the backup using the filters, so you
> don't need to filter them again.

I got that but neglected to respond to the whole group. My mistake.
The backups are being performed using BackupPC to a central server
where compression and de-duplication are done. While it's true that
the actual storage on the backup server being consumed by each user is
less because of these, I don't have any problem hiding this from them
and instead telling them what their uncompressed and duplicated usage
is. It has more of an effect that way, if you know what I mean.

> If that doesn't make sense or isn't possible (backups are on some remote
> server), then just use your rsync command with '--list-only', and
> post-process that list.

I've been tinkering with using --verbose and --dry-run then parsing
the total size out of the last line of the output and I think I'm
close. Curiously, when I don't include the --filter option as a
baseline, I'm not getting the same results as "du".

$ du -sb . | awk '{print $1}'
508625653

$ rsync --dry-run --verbose -a . /tmp/does_not_exist | tail -1 | awk '{print $4}'
506037893

The difference is minimal and probably negligible for this purpose but
I'm still curious where it's coming from. Maybe there are some sparse
files in there somewhere.

Paul
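
[Editorial sketch, not from the thread.] One way to narrow down the gap, offered as a guess rather than a diagnosis: `du -b` is shorthand for `--apparent-size --block-size=1`, so sparse files should not change its figure, but GNU du releases of that era also added the size of each directory entry itself, while the rsync man page says its total file size counts nothing for directories or special files and does include symlinks. The helper below assumes GNU find (for `-printf`); `regular_file_bytes` is a hypothetical name, not part of du or rsync.

```shell
#!/bin/sh
# Sum the apparent sizes of regular files and symlinks under a directory,
# skipping directories and special files, to mirror what rsync totals.
# Assumption: GNU find is available (-printf is a GNU extension).
# regular_file_bytes is a hypothetical helper name.
regular_file_bytes() {
    find "${1:-.}" \( -type f -o -type l \) -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Running `regular_file_bytes .` next to `du -sb .` shows how much of the difference comes from directory entries. Hard-linked files are another possible contributor, since du counts them once.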
Re: Estimating backup usage with dir-merge filter
On Thu, 6 Oct 2011, Wayne Davison wrote:

> On Thu, Oct 6, 2011 at 1:01 PM, Benjamin R. Haskell wrote:
>> use your rsync command with '--list-only', and post-process that list.
>
> Even easier, just make a note of the verbose output from the copy (get
> better stats via --stats with or w/o --verbose). Or, if you need a special
> run, --dry-run (-n) will tell you the file-size totals w/o transferring
> anything.

Depends on what stats are needed. If you just need total bytes, yeah,
that's easier. My example didn't do it, but it sounded like Paul wanted
some kind of per-user statistics.

Important bits, if you need more granularity:

First column is an `ls -l` style mode display (first character = 'd' for
dirs, '-' for normal files, etc.)
Second column is the size in bytes.
Third is date.
Fourth is time.
Fifth-through-rest is the path relative to the transfer root. (Spaces
aren't escaped, but other special chars are listed as \#NNN where N's are
octal digits).

--
Best,
Ben
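
[Editorial sketch, not from the thread.] The column layout described above is enough for a rough per-user breakdown. The function below reads `rsync --list-only` output on stdin and totals regular-file bytes by the first path component. It assumes each user's data sits in its own top-level directory of the transfer; `per_top_dir_bytes` is a hypothetical name. Commas are stripped from sizes because newer rsync releases, I believe, group digits by default.

```shell
#!/bin/sh
# Read `rsync --list-only` output on stdin and print total bytes of regular
# files per top-level path component, one "bytes<TAB>name" line each.
# Assumption: columns are mode, size, date, time, path, as described above.
# per_top_dir_bytes is a hypothetical helper name.
per_top_dir_bytes() {
    awk '
        /^-/ {
            size = $2
            gsub(/,/, "", size)      # tolerate 1,234 style digit grouping
            # Strip the four leading columns so spaces in the path survive.
            path = $0
            sub(/^[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ /, "", path)
            split(path, parts, "/")
            total[parts[1]] += size
        }
        END { for (d in total) printf "%.0f\t%s\n", total[d], d }
    ' | sort -k2
}
```

A possible invocation is `rsync --list-only -a --filter='dir-merge .rsync-filter' /source/ /dest | per_top_dir_bytes`. The trailing slash on the source matters here: without it every path starts with `source/` and everything lands in one bucket.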
Re: Estimating backup usage with dir-merge filter
On Thu, Oct 6, 2011 at 1:01 PM, Benjamin R. Haskell wrote:

> use your rsync command with '--list-only', and post-process that list.

Even easier, just make a note of the verbose output from the copy (get
better stats via --stats with or w/o --verbose). Or, if you need a special
run, --dry-run (-n) will tell you the file-size totals w/o transferring
anything.

..wayne..
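
[Editorial sketch, not from the thread.] If only the grand total is needed, the `Total file size:` line of `--stats` output is easier to pick out by label than by position on the last line. The function below reads the stats text on stdin; `stats_total_bytes` is a hypothetical name, and the comma stripping assumes some rsync versions group digits.

```shell
#!/bin/sh
# Extract the "Total file size" figure, in bytes, from `rsync --stats`
# output read on stdin.
# stats_total_bytes is a hypothetical helper name.
stats_total_bytes() {
    awk -F': *' '/^Total file size:/ {
        n = $2
        gsub(/,/, "", n)     # drop digit grouping if present
        sub(/ .*/, "", n)    # drop the trailing unit word
        print n
    }'
}
```

A possible invocation, reusing the filter from this thread: `rsync -a --dry-run --stats --filter='dir-merge .rsync-filter' /source/ /dest | stats_total_bytes`.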
Re: Estimating backup usage with dir-merge filter
On Thu, 6 Oct 2011, Paul Dugas wrote:

> I appreciate the suggestions so far but I know how to measure usage with
> 'du' et al. The hitch here is that I want to exclude files the
> --filter='dir-merge .rsync-filter' excludes. Hence the thought to use
> rsync itself.

It sounds like you missed the point of Kevin's message (in the other fork
of this thread). The point wasn't to use `du`, it was that you can run
your stats against the backed-up files, not the source. Then you're only
running stats against the results of running the backup using the filters,
so you don't need to filter them again.

If that doesn't make sense or isn't possible (backups are on some remote
server), then just use your rsync command with '--list-only', and
post-process that list.

E.g., if your command is:

    rsync -a --filter='dir-merge .rsync-filter' /source /dest

It becomes, with a post-processing command that just counts bytes from
files (not dirs/sockets/etc.):

(all one command line -- munged for emailing)

    rsync --list-only -a --filter='dir-merge .rsync-filter' /source /dest |
      awk '/^-/ { total += $2 } END { print total }'

Post-processing is made simpler by the fact that rsync escapes "special"
characters already. (So, you don't have to worry about null bytes or
newlines or anything in the filenames.)

--
Best,
Ben
[Bug 8508] out of memory in glob_expand_module
https://bugzilla.samba.org/show_bug.cgi?id=8508

--- Comment #3 from Wayne Davison 2011-10-06 15:55:44 UTC ---
The asprintf() call in question is one of only 2 in all of rsync that test
the success of the function via "<= 0" rather than "< 0". Perhaps just
removing the '=' in those calls would fix the issue? Is the SunOS
asprintf() working OK if it returns 0?
[Bug 8090] full_fname out of memory error on missing file SunOS 5.8
https://bugzilla.samba.org/show_bug.cgi?id=8090

Wayne Davison changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #3 from Wayne Davison 2011-10-06 15:54:21 UTC ---
The asprintf() call in question is one of only 2 in all of rsync that test
the success of the function via "<= 0" rather than "< 0". Perhaps just
removing the '=' in those calls would fix the issue? Is asprintf() working
OK if it returns 0?
Re: Estimating backup usage with dir-merge filter
I appreciate the suggestions so far but I know how to measure usage with
'du' et al. The hitch here is that I want to exclude files the
--filter='dir-merge .rsync-filter' excludes. Hence the thought to use
rsync itself.

On Oct 6, 2011 11:02 AM, "K S Braunsdorf" wrote:
>>that processes any filter files into --exclude parameters for "du" but
>>recently, I've been wondering if there's an easier way that would use
>
> If your backups are all on a single partition you might try quot(8)
> ("quot -- display disk space occupied by each user"). I wrote a
> very simple perl script to munge quot output to create a "diskhogs"
> report about 20 years ago, and I still use it today. I suggest you
> take the output of
>     quot -kvf $BACKUP_DEVICE
>
> and filter it to fit your needs. If you can't find a "quot" for your
> OS I might have a C program that works as a replacement.
>
> --ksb at_host sac.fedex.com
[Bug 5478] rsync: writefd_unbuffered failed to write 4092 bytes [sender]: Broken pipe (32)
https://bugzilla.samba.org/show_bug.cgi?id=5478

--- Comment #19 from Eric Shubert 2011-10-06 15:28:32 UTC ---
(In reply to comment #14)
> So, looking at your trace, it looks like the sender is waiting to write more
> data (its select lists the write and error FDs). The receiving side's 2 pids
> are waiting to read more data. So, something in between has either failed to
> deliver the data, or blocked, or something (it would appear). Are there other
> processes in between the 2 sides (e.g. ssh) that could have blocked? Or
> perhaps a failed pipe (or unix domain socket)?

After further testing, I'm tending to agree that the problem lies outside of
rsync. Somehow, the select() is getting stuck (not receiving a signal from the
kernel?). I'm not yet certain which side gets stuck first, but I would guess
that it's the target, given that in one test there was about 2MB of data "in
the pipe", that had been written by the source but not read yet by the target.

Working on the idea that the failure happens when the pipe is saturated, I
decided to try the bwlimit option to slow things down a bit. The upload speed
at the source is limited to about 100KBPS (DSL, shared with other users), so I
tried bwlimit=64. This worked noticeably better, although it also failed after
a few hours. There may have been usage spikes that maxed out the available bw
at the time, but I can't tell if that happened or not. So I reduced it to
bwlimit=32, and it's been running for 7 hours now without failure (hasn't
finished yet, but I'm optimistic).

So I'm inclined to think that this problem lies in the transport somewhere. It
also appears to come to light when the transport is saturated.

Does anyone have any ideas regarding how to go about tracking this down
further, or where to go for further help? While this bug is definitely
affecting rsync, I think the solution is going to end up being elsewhere, in
glibc or the kernel perhaps.