>>> This doesn't sound right.  A FIEMAP_EXTENT_UNWRITTEN extent is all zeros,
>>> and so it should act as if it were a hole.  The goal is not to copy the
>>> exact fiemap structure of the source (that's impossible): the goal is to
>>> use as little time and space as possible.

> A FIEMAP_EXTENT_UNWRITTEN extent is marked to allocated although
> read it will return ZEROs through the filesystem. So why not using
> fallocate(2) to deal with it? IMHO, it meet the goal to use little
> time and space as possible, Am I miss something?

It's faster to simply skip around that extent while reading it, and
to skip around it when writing it, than to allocate it with fallocate
when writing it.  Logically, a FIEMAP_EXTENT_UNWRITTEN extent is a
hole, and should be optimized when reading and writing, just like any
hole.

>> I see fiemap as optimizing reads,
>> posix_fallocate() as optimizing writing zeros
>> and fallocate() as optimizing allocation.

It may not be quite that simple.  Some platforms won't have fallocate
and so posix_fallocate will have to do double duty as optimizing
allocation too.  Also, lseek is part of the process of optimizing
reads, and of optimizing writing zeros.  Most important, the
heuristics for optimizing the writes should use info derived from
optimizing the reads.

I'm not objecting to breaking these improvements into two or three
pieces, if someone wants to do that.  However, it shouldn't be
required to break them up; it's OK if someone wants to do it all at
once.  (This stuff is not that hard, after all.)  I was planning to
give it a shot at some point but obviously have not done so yet.