This patchset changes how we do space reservations for delayed refs. We were hitting probably 20-40 enospc abort's per day in production while running delayed refs at transaction commit time. This means we ran out of space in the global reserve and couldn't easily get more space in use_block_rsv().
The global reserve has grown to cover everything we don't reserve space explicitly for, and we've grown a lot of weird ad-hoc hueristics to know if we're running short on space and when it's time to force a commit. A failure rate of 20-40 file systems when we run hundreds of thousands of them isn't super high, but cleaning up this code will make things less ugly and more predictible. Thus the delayed refs rsv. We always know how many delayed refs we have outstanding, and although running them generates more we can use the global reserve for that spill over, which fits better into it's desired use than a full blown reservation. This first approach is to simply take how many times we're reserving space for and multiply that by 2 in order to save enough space for the delayed refs that could be generated. This is a niave approach and will probably evolve, but for now it works. With this patchset we've gone down to 2-8 failures per week. It's not perfect, there are some corner cases that still need to be addressed, but is significantly better than what we had. Thanks, Josef