On 12 Dec 2017, at 17:48, Bill Cole wrote:

On 12 Dec 2017, at 1:10 (-0500), Steven M. Bellovin wrote:

On 11 Dec 2017, at 23:26, Bill Cole wrote:

On 10 Dec 2017, at 21:14 (-0500), Steven M. Bellovin wrote:

My suspicion is that the problem has to do with very large directories on APFS file systems

This would be shocking. One of the rationales for APFS existing is that the HFS foundation was played out for dealing with large directories efficiently. I haven't looked into the details (life is short...) but if APFS is *worse* than HFS{+,X} with large directories then Apple is in a worse state than I had thought...

Yah. I have no other explanation, though. To give a current example, on a machine -- an old one, to be sure -- a Time Machine backup started almost 10 hours ago. It's dumped 77.5 MB -- out of a total of 152.7 MB -- in that time, and it's been at about 77 MB for the last ~7-8 hours. At some point, though, it will pass the expensive point and run at a reasonable rate. This dump is to a directly connected USB 2.0 drive. And the CPU is about 96% idle, according to 'top'.

Btw: by "big", I mean that I have one mailbox with 114K messages; the directory itself is 3.6 MB. No other mailbox is more than half that size, though I have four that are over 1 MB.

Oh my.

Yah. I knew some were large, but I didn't think *that* large. Worse yet, one of the top few is my inbox, which I haven't been cleaning out of late. I've been following the MailMate mantra: just create smart folders...
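For anyone wanting to check their own mail store for directories of this size, here is a minimal Python sketch. It only counts directory entries. The function name `large_directories` and the 1000-entry default threshold are assumptions chosen for illustration, not anything MailMate or Time Machine provides.

```python
import os


def large_directories(root, min_entries=1000):
    """Return (entry_count, path) pairs for every directory under root
    that directly contains at least min_entries files and subdirectories,
    sorted with the largest directory first."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Count only the immediate children of this directory.
        count = len(dirnames) + len(filenames)
        if count >= min_entries:
            results.append((count, dirpath))
    results.sort(reverse=True)
    return results
```

Calling `large_directories(path_to_mail_store)` and printing the pairs gives a quick list of candidates for splitting. The path is whatever your client uses; none is assumed here.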

Since the backup disk can't be APFS (Time Machine relies on hard-linked directories, which APFS won't do), you're still dealing with that huge directory in HFS+ on the write side. If that directory has changes, it is going to be spectacularly slow for TM to make 114k file hard links and copy a handful of changed files into a new directory.

Right, which explains older slowness, but not the sudden problem.
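To make the cost described above concrete, here is a rough Python sketch of what "hard-link every file into a new directory" amounts to. It is not Time Machine's actual implementation, and the function name `hardlink_snapshot` is invented for illustration. The point is only that the work is one link operation per file, per backup, regardless of how few files changed.

```python
import os


def hardlink_snapshot(src_dir, dst_dir):
    """Create dst_dir and hard-link every regular file in src_dir into it.

    Returns the number of links created. src_dir and dst_dir must be on
    the same filesystem, since hard links cannot cross filesystems.
    """
    os.mkdir(dst_dir)
    linked = 0
    with os.scandir(src_dir) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain files are linked.
            if entry.is_file(follow_symlinks=False):
                os.link(entry.path, os.path.join(dst_dir, entry.name))
                linked += 1
    return linked
```

With a 114k-message mailbox, each such pass means roughly 114k link calls and as many new directory entries, which is where a filesystem that handles large directories poorly would hurt most.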

Also, USB 2.0 historically has been cripplingly slow on MacOS X, at least through El Capitan. I haven't tested it on my Sierra machine and don't have anything on High Sierra yet, so I can't say whether that might be a part of the problem.

My USB 2.0-only machine is, as of about 1:30 AM today, officially retired from hot spare status; I just got a new laptop and have moved my previous one to hot spare status. But the problem was on the 3.0 machine, too.

Three suggestions, in order from least to greatest effect on your specific issue (although the first two are good general TM housekeeping):

1. Use the 'tmutil' tool to thin your old backups more aggressively than TM does. (See the man page for details.) This reduces the complexity of the filesystem btrees, which makes it easier for TM to do its work, and it also frees space so that you can avoid TM's arbitrary deletion of the oldest backups when the disk fills.

Possibly, though I doubt it will help with this issue. Time Machine did a massive delete on one of my disks (and then a massive new backup...), but it was just as slow afterwards.

2. Rebuild the filesystem btree structures on the Time Machine disk. This can be done with fsck_hfs using the "-Race" option or with Alsoft's DiskWarrior software. This will tidy up the mess that TM creates by building a full image of the source disk with mostly hard links every hour and then thinning them out over time, which usually results in suboptimal structures. Note that either fsck_hfs or DiskWarrior may take an hour or more to rebuild the btrees, but it will more than pay for itself in faster backups and especially in speeding up the filesystem verification TM does occasionally, which is also the source of the dreaded "you need to create a new backup" alert.

Ah, I didn't know about that one. I'll certainly try it.

3. Split that huge mailbox up into smaller slices that mostly never change, so that TM never has to do the appalling task of making the umpteenth hard link of each one of 114k files into a new huge directory. HFS+ starts to get noticeably slow with around 1k files in a directory. I try to keep my archives split into subfolders with nothing more than 2k messages because it feels like the speed degradation is worse than O(n) and is painful by 2k.

That's my current plan (though not to that small), but since it's just about intersession (I'm a professor) I have enough time to play and try to understand more of what's happening.
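The splitting idea in suggestion 3 can be sketched in Python as below. This assumes a plain directory of message files (an export, for example). It is not meant to be run against a mail client's live store, which should be reorganized from within the client. The function name `split_into_slices`, the `slice-0000` naming, and the 2000-file default are all hypothetical choices made for the example.

```python
import os
import shutil


def split_into_slices(src_dir, dst_root, slice_size=2000):
    """Move the regular files in src_dir into numbered subdirectories of
    dst_root (slice-0000, slice-0001, ...), at most slice_size per slice.

    Returns the number of slices created.
    """
    if slice_size < 1:
        raise ValueError("slice_size must be at least 1")
    with os.scandir(src_dir) as entries:
        names = sorted(
            e.name for e in entries if e.is_file(follow_symlinks=False)
        )
    slices = 0
    for start in range(0, len(names), slice_size):
        slice_dir = os.path.join(dst_root, f"slice-{slices:04d}")
        os.makedirs(slice_dir, exist_ok=True)
        for name in names[start:start + slice_size]:
            shutil.move(os.path.join(src_dir, name),
                        os.path.join(slice_dir, name))
        slices += 1
    return slices
```

Files are sorted by name before slicing so that, when names follow arrival order, older slices stop changing and later backups only have to deal with the newest one.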

        --Steve Bellovin, https://www.cs.columbia.edu/~smb


_______________________________________________
mailmate mailing list
mailmate@lists.freron.com
https://lists.freron.com/listinfo/mailmate
