On 3/5/23 09:35, aapost wrote:
I have run into this a few times and finally reproduced it. Whether it is expected behavior I am not sure, since it is partly on the user, but I can think of scenarios where it would be undesirable. This occurs on 3.11.1 and 3.11.2 using Debian 12 (testing), in case the cause lies somewhere else.

If a file is still open, even after all write operations on it have ceased for a time, the tail of the written data does not get flushed to the file until close() is issued and the file closes cleanly.

2 methods to recreate - 1st, run from the interpreter directly:

f = open("abc", "w")
for i in range(50000):
    f.write(str(i) + "\n")

You can cat the file and see that it stops at 49626, until you issue an f.close()
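The same experiment can be checked from within Python itself. This is a sketch using a temp file instead of "abc" in the working directory; it shows the tail of the writes sitting in Python's userspace buffer until an explicit f.flush():

```python
import os
import tempfile

# Same experiment as above, in a temp directory so it cleans up easily.
path = os.path.join(tempfile.mkdtemp(), "abc")
f = open(path, "w")
for i in range(50000):
    f.write(str(i) + "\n")

expected = sum(len(str(i)) + 1 for i in range(50000))  # total bytes written
size_before = os.path.getsize(path)  # smaller: the tail is still buffered
f.flush()                            # push Python's buffer out to the OS
size_after = os.path.getsize(path)   # now the whole file is visible
f.close()
```

So a flush() at the right moment gets the data out without having to close the file.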

A script to recreate:

f = open("abc", "w")
for i in range(50000):
    f.write(str(i) + "\n")
while True:
    pass

cat out the file and it's the same thing: it stops at 49626. A Ctrl-C exit closes the file cleanly, but if the process exits uncleanly, i.e. via a kill command or something else catastrophic, the remaining buffer is lost.

Of course one SHOULD manage the closing of their files, and this is partially on the user, but if by design something holds a file open while waiting for something, and then a crash occurs, they lose a portion of what was assumed already complete...
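For that long-running-writer scenario, a hedged sketch (the "progress.log" name is made up here): flush() only moves data from Python's buffer into the OS page cache, while os.fsync() additionally asks the kernel to push it to stable storage, so a kill -9 or power loss after that point should not lose the record:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "progress.log")
f = open(path, "w")
f.write("step 1 complete\n")
f.flush()                  # survives a crash of the Python process
os.fsync(f.fileno())       # survives an OS/power crash as well
durable_size = os.path.getsize(path)  # record is on disk even if we die now
f.close()
```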

>Cameron
>Eryk

Yeah, I later noticed open() has a buffering option in the docs, and there's a warning on a subsequent page:

Warning
Calling f.write() without using the with keyword or calling f.close() might result in the arguments of f.write() not being completely written to the disk, even if the program exits successfully.

I will have to set the buffering arg to 1. I just hadn't thought about buffering in quite a while, since Python handles most of the things lower-level languages don't. I guess my (of course incorrect) assumptions leaned toward some sort of automatic flushing, or an unbuffered default (not saying it should be).
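For reference, a quick sketch of what buffering=1 buys here: it selects line buffering (text mode only), so every newline written triggers a flush and the on-disk file keeps pace with the writes, no explicit flush() needed:

```python
import os
import tempfile

# buffering=1 = line buffering: each "\n" flushes the buffer automatically.
path = os.path.join(tempfile.mkdtemp(), "abc")
f = open(path, "w", buffering=1)
f.write("42\n")
visible = os.path.getsize(path)  # the line is already in the file
f.close()
```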

And I understand why it is the way it is from a developer standpoint. It's sort of a mental thing in the moment: I was in a sysadmin way of thinking, switching around from doing things in bash across multiple terminals, and forgetting the fundamentals of what the Python interpreter is versus a sequence of terminal commands.

That being said, while "with" is great for many use cases, I think its overuse causes concepts like flushing, and the underlying "whys," to atrophy (especially since it is obviously a concept that is still important). It also doesn't work well for quick-and-dirty work in the interpreter, building a file on the fly with a sequence of commands you haven't completely thought through yet: besides not wanting to close the file yet, the subsequent indentation requirement is annoying. f = open("fn", "w", 1) will be the go-to for that type of work, now that I know. Again, just nitpicking, lol.

--
https://mail.python.org/mailman/listinfo/python-list
