New submission from Mingye Wang <arthur200...@gmail.com>:

Consider this interaction:

cmd> echo > 1.txt
cmd> python -c "__import__('os').truncate('1.txt', 1024 ** 3)"
cmd> fsutil sparse queryFlag 1.txt

Not only does the truncate take a long time, as is typical for a zero-write,
but fsutil also reports the file as non-sparse, as an actual write would
suggest. This is because internally, _chsize_s and friends enlarge files by
writing zeros in a loop.[1]
  [1]: https://github.com/leelwh/clib/blob/master/c/chsize.c

On Unix systems, ftruncate for enlarging is described as "... as if the extra 
space is zero-filled", but this is not to be taken literally. In practice, 
sparse files are used whenever available (GNU dd expects that) and people do 
expect the operation to be very fast without a lot of real writes. There is
a FreeBSD bug report about ftruncate being too slow on UFS.
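
To check the expectation above on a given system, a minimal sketch along
these lines compares the apparent size of a grown file with the space
actually allocated for it. The helper name grow_and_measure is made up for
this example, and st_blocks is assumed to count 512-byte units where the
platform provides it (it is absent on Windows, hence the None).

```python
import os
import tempfile


def grow_and_measure(length):
    """Grow an empty temp file to `length` bytes; return (size, allocated).

    `allocated` is None when the platform has no st_blocks (e.g. Windows).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, length)  # enlarge without writing any data ourselves
        st = os.fstat(fd)
        blocks = getattr(st, "st_blocks", None)
        # Assumption: st_blocks is reported in 512-byte units.
        allocated = None if blocks is None else blocks * 512
        return st.st_size, allocated
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    size, allocated = grow_and_measure(1024 ** 3)
    print("apparent size:", size, "allocated:", allocated)
```

On a filesystem with sparse file support, the allocated figure should stay
far below the apparent size; the exact number depends on the filesystem.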

The aria2 downloader gives a good example of how to truncate into a sparse file
on Windows.[2] First an FSCTL_SET_SPARSE control is issued, and then a seek +
SetEndOfFile finishes the job. Of course, an lseek to the end is needed first
to determine the current size of the file, so we know whether we are
enlarging (mark sparse) or shrinking (don't mark sparse).
  [2]: https://github.com/aria2/aria2/blob/master/src/AbstractDiskWriter.cc#L507
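
The steps above can be sketched in Python with ctypes. This is only an
illustration of the idea, not aria2's code and not a proposed patch: the
function name truncate_sparse is hypothetical, the FSCTL_SET_SPARSE call is
treated as best effort (its result is ignored, since not every filesystem
supports sparse files), and on other platforms or when shrinking it simply
falls back to os.ftruncate.

```python
import ctypes
import os
import sys

FSCTL_SET_SPARSE = 0x000900C4  # control code as defined in winioctl.h
FILE_BEGIN = 0                 # move method for SetFilePointerEx


def truncate_sparse(path, length):
    """Resize `path` to `length` bytes, marking it sparse when enlarging on Windows."""
    fd = os.open(path, os.O_RDWR | getattr(os, "O_BINARY", 0))
    try:
        # Seek to the end to learn the current size: enlarging or shrinking?
        current = os.lseek(fd, 0, os.SEEK_END)
        if sys.platform != "win32" or length <= current:
            os.ftruncate(fd, length)
            return

        import msvcrt
        from ctypes import wintypes

        k32 = ctypes.WinDLL("kernel32", use_last_error=True)
        k32.DeviceIoControl.argtypes = [
            wintypes.HANDLE, wintypes.DWORD, wintypes.LPVOID, wintypes.DWORD,
            wintypes.LPVOID, wintypes.DWORD, ctypes.POINTER(wintypes.DWORD),
            wintypes.LPVOID,
        ]
        k32.DeviceIoControl.restype = wintypes.BOOL
        k32.SetFilePointerEx.argtypes = [
            wintypes.HANDLE, wintypes.LARGE_INTEGER,
            ctypes.POINTER(wintypes.LARGE_INTEGER), wintypes.DWORD,
        ]
        k32.SetFilePointerEx.restype = wintypes.BOOL
        k32.SetEndOfFile.argtypes = [wintypes.HANDLE]
        k32.SetEndOfFile.restype = wintypes.BOOL

        handle = msvcrt.get_osfhandle(fd)
        returned = wintypes.DWORD(0)
        # Best effort: the result is ignored, so a filesystem without sparse
        # support just ends up with an ordinary file of the requested size.
        k32.DeviceIoControl(handle, FSCTL_SET_SPARSE, None, 0, None, 0,
                            ctypes.byref(returned), None)
        # Seek to the new length, then declare that position the end of file.
        if not k32.SetFilePointerEx(handle, length, None, FILE_BEGIN):
            raise ctypes.WinError(ctypes.get_last_error())
        if not k32.SetEndOfFile(handle):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        os.close(fd)
```

With something like this in place, truncate_sparse('1.txt', 1024 ** 3)
followed by "fsutil sparse queryFlag 1.txt" would be the way to check
whether the sparse flag was actually set.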

----------
components: Library (Lib)
messages: 363717
nosy: Artoria2e5, steve.dower
priority: normal
severity: normal
status: open
title: os.ftruncate on Windows should be sparse
versions: Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39910>
_______________________________________
_______________________________________________
Python-bugs-list mailing list