My read of the situation is that the filesystem guys have spent a lot of time optimizing ordinary write but they haven't gotten around to optimizing fallocate because it's so rarely used -- which means that if one uses fallocate one gets lousy performance.
It's a chicken-and-egg problem. If coreutils started using fallocate now, one can be pretty sure the filesystem developers would tune their filesystems over the next few years to make fallocate compatible with delayed-write optimizations. On the other hand, if nobody uses fallocate, they will have little incentive to make it go fast. It's a question of whether we want to inflict temporary pain on users for a long-term benefit (early warning of a full filesystem, which is something I'd dearly love to have).
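
To make the "early warning" benefit concrete, here is a rough sketch of a copy that reserves space before writing anything. It is not what coreutils does; the function name copy_with_preallocation and the chunk size are made up for illustration. It assumes a POSIX system where Python's os.posix_fallocate is available.

```python
import errno
import os


def copy_with_preallocation(src_path, dst_path, chunk_size=1 << 16):
    """Copy src_path to dst_path, reserving the space up front.

    Hypothetical helper, for illustration only. Returns the number of
    bytes copied. If the filesystem cannot hold the file, the OSError
    (ENOSPC) is raised before any data is written.
    """
    size = os.path.getsize(src_path)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if size > 0:  # posix_fallocate rejects a zero length
            try:
                # Ask the filesystem to reserve `size` bytes now, so a
                # full disk is reported here rather than mid-copy.
                os.posix_fallocate(dst.fileno(), 0, size)
            except OSError as e:
                # Assumption: treat "not supported" as non-fatal and
                # fall back to a plain copy; re-raise everything else,
                # including ENOSPC.
                if e.errno not in (errno.EOPNOTSUPP, errno.ENOSYS, errno.EINVAL):
                    raise
        copied = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

With this shape, a full filesystem shows up as an error at the start of the copy instead of partway through it. The cost is exactly the one described above: on a filesystem that has not tuned fallocate, the preallocation step may be slower than just writing.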
