[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: To elaborate a bit on the patch: - it is pointless to call flush() if the buffered is in a bad state (self->ok == 0) or it has started finalizing already - you need to own the reference, since flush() can release the GIL and, if the reference is borrowed, the

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: Just apply the following patch to the original PR and it should work fine: diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c index 50c87c1..2ba98f2 100644 --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -409,12 +409,12 @@ static

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-05 Thread Neil Schemenauer
Neil Schemenauer added the comment: I reverted because of the crash in test_threading. I'm pretty sure there is a bug with the locking of bufferedio.c, related to threads and flush. Here is the stacktrace I get (my patch applied, I'm trying to write a Python test that triggers the SEGV

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-05 Thread STINNER Victor
STINNER Victor added the comment: > FAIL: test_4_daemon_threads (test.test_threading.ThreadJoinOnShutdown) Oh, it also failed on: http://buildbot.python.org/all/builders/x86%20Gentoo%20Installed%20with%20X%203.x/builds/957/steps/test/logs/stdio and

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-05 Thread STINNER Victor
STINNER Victor added the comment: > New changeset e38d12ed34870c140016bef1e0ff10c8c3d3f213 by Neil Schemenauer in > branch 'master': > bpo-17852: Maintain a list of BufferedWriter objects. Flush them on exit. > (#1908) This change introduced a regression:

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-04 Thread Neil Schemenauer
Neil Schemenauer added the comment: New changeset db564238db440d4a2d8eb9d60ffb94ef291f6d30 by Neil Schemenauer in branch 'master': Revert "bpo-17852: Maintain a list of BufferedWriter objects. Flush them on exit. (#1908)" (#3337)

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-04 Thread Neil Schemenauer
Changes by Neil Schemenauer: -- pull_requests: +3352

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-09-04 Thread Neil Schemenauer
Neil Schemenauer added the comment: New changeset e38d12ed34870c140016bef1e0ff10c8c3d3f213 by Neil Schemenauer in branch 'master': bpo-17852: Maintain a list of BufferedWriter objects. Flush them on exit. (#1908)

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-06-01 Thread Xavier G. Domingo
Changes by Xavier G. Domingo: -- nosy: +xgdomingo

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-06-01 Thread Neil Schemenauer
Changes by Neil Schemenauer: -- pull_requests: +1987

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-06-01 Thread Neil Schemenauer
Neil Schemenauer added the comment: "Did you get any ResourceWarning?" I already knew that explicitly closing the file would fix the issue. However, think of the millions of lines of Python 2 that hopefully will be converted to Python 3. There will be many ResourceWarning errors. It is not

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-06-01 Thread STINNER Victor
STINNER Victor added the comment: Neil Schemenauer: "Well, I just spent a couple of hours debugging a problem caused by this issue." Did you get any ResourceWarning? You need to run python3 with -Wd to see them. By the way, I enhanced ResourceWarning in Python 3.6: if you enable tracemalloc,
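
The ResourceWarning mentioned above can also be observed from inside a program rather than through the -Wd command-line option. The following is a minimal sketch, assuming CPython (where dropping the last reference finalizes the file at once); the helper names leak_file and collect_resource_warnings are hypothetical and not part of any patch in this thread.

```python
import gc
import warnings


def leak_file(path):
    """Open a file for writing and drop it without calling close()."""
    f = open(path, "w")
    f.write("bar")
    # No close(): the file object is finalized when f goes out of scope.


def collect_resource_warnings(path):
    """Return the ResourceWarning records raised while leaking a file."""
    with warnings.catch_warnings(record=True) as caught:
        # Equivalent in spirit to running with -Wd: do not hide the warning.
        warnings.simplefilter("always")
        leak_file(path)
        # Assumption: on implementations without reference counting,
        # an explicit collection may be needed to trigger the finalizer.
        gc.collect()
    return [w for w in caught if issubclass(w.category, ResourceWarning)]
```

Calling collect_resource_warnings() on a scratch path returns the recorded warnings, whose message names the unclosed file object.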

[issue17852] Built-in module _io can loose data from buffered files at exit

2017-05-31 Thread Neil Schemenauer
Neil Schemenauer added the comment: Well, I just spent a couple of hours debugging a problem caused by this issue. You could argue that I should be calling close() on all of my file-like objects but I agree with Armin that the current "most of the time it works" behaviour is quite poor. In

[issue17852] Built-in module _io can loose data from buffered files at exit

2015-02-13 Thread Martin Panter
Changes by Martin Panter vadmium...@gmail.com: -- nosy: +vadmium ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17852 ___ ___ Python-bugs-list

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: I hate to repeat myself, but if the C standard says files are flushed at exit; if Python 2 follows this standard; and if Python 3 follows it most of the time but *not always*... then it seems to me that something is very, very buggy in the worst possible way

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Do you have an example that works in 3.4+? -- stage: -> test needed versions: +Python 3.5 -Python 3.3

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: If I understood correctly, Python 3.4 tries harder to find cycles and call destructors at the end of the program, but that's not a full guarantee. For example you can have a reference from a random C extension module. While trying to come up with an example, I

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: (Ah, it's probably a reference from the trace function -> func_globals -> f).

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is an example. -- stage: test needed -> needs patch

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: Added file: http://bugs.python.org/file37357/gcio.py

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Note the example I posted doesn't involve the shutdown sequence. It calls gc.collect() explicitly.

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: To add to the confusion: Antoine's example produces an empty file on the current trunk cd282dd0cfe8. When I first tried it on a slightly older trunk (157 changesets ago), it correctly emitted a file with barXXX, but only if gc.collect() was present. Without

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: The problem is that the order of tp_finalize calls is arbitrary when there is a reference cycle (same thing, of course, with tp_clear). So depending on the exact layout of the garbage list, the TextIOWrapper could be collected before the BufferedWriter and

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: Maybe accepting the fact that relying on finalizers is a bad idea here?

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: You mean encourage people to explicitly use "with" or call close()? Yes, that's the simplest answer. It's why we added ResourceWarning.

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread STINNER Victor
STINNER Victor added the comment: "The problem is that the order of tp_finalize calls is arbitrary when there is a reference cycle" If we want to guarantee that all files are properly flushed at exit, the best option is to maintain a list of open files. I'm not interested in implementing that,

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: On 04/12/2014 12:05, STINNER Victor wrote: The problem is that the order of tp_finalize calls is arbitrary when there is a reference cycle If we want to guarantee that all files are properly flushed at exit, the best option is to maintain a list of open

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: "How do you decide which object should be flushed? In which order? We have to take care of signals, threads and forks, stay portable, etc." The order doesn't matter. If you call flush() on a TextIOWrapper, the flushes of the buffered writer and the raw file will be called
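
The claim that flushing the outermost layer is enough can be checked with a small experiment. This is only an illustrative sketch: it builds the three layers by hand on a scratch file and compares the on-disk size before and after a single flush() on the TextIOWrapper. The function name flush_through_layers is hypothetical.

```python
import io
import os


def flush_through_layers(path):
    """Write through TextIOWrapper -> BufferedWriter -> FileIO and
    return the file size before and after flushing only the text layer."""
    raw = io.FileIO(path, "w")
    buffered = io.BufferedWriter(raw)
    text = io.TextIOWrapper(buffered, encoding="utf-8")

    text.write("bar")                  # stays in the in-memory buffers
    size_before = os.path.getsize(path)

    text.flush()                       # one call on the outermost layer
    size_after = os.path.getsize(path)

    text.close()
    return size_before, size_after
```

After the flush, all three bytes of "bar" are on disk even though neither the BufferedWriter nor the FileIO was flushed directly.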

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: Antoine: sorry if I wasn't clear enough. Obviously you want to encourage people to close their files, but I think personally that it is very bad for the implementation to *most of the time* work anyway and only rarely fail to flush the files. So, speaking only

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: On 04/12/2014 13:53, Armin Rigo wrote: "So, speaking only about the implementation, it is (imho) a bad idea to rely on finalizers to flush the files, and something else should be done." But what else?

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: Antoine: I'm trying to explain what in the three lines that follow the parts you quoted. I already tried to explain it a few times above. Now I feel that I'm not going anywhere, so I'll quote back myself from 2013-04-27: Feel free to close anyway as not-a-bug;

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Armin: I don't see how linked lists would help.

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is how it looks to me. There should be a global collection of weak references (WeakSet?). Every instance of TextIOWrapper, BufferedWriter and BufferedRandom adds itself to this collection on creation and removes itself on close. A function registered with atexit calls flush()
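
The idea described above can be approximated in pure Python, outside the io module itself. The sketch below is not the patch discussed in this thread: tracked_open, flush_all and _open_writers are hypothetical names, and registration happens in a wrapper function instead of in the constructors of the io classes.

```python
import atexit
import weakref

# Weak references only: tracking a file must not keep it alive.
_open_writers = weakref.WeakSet()


def tracked_open(path, mode="w", **kwargs):
    """Open a file and remember it so it can be flushed at exit."""
    f = open(path, mode, **kwargs)
    _open_writers.add(f)
    return f


def flush_all():
    """Flush every tracked file that is still alive and still open."""
    # Copy to a list first: the set can shrink while we iterate
    # if a file is garbage-collected in the meantime.
    for f in list(_open_writers):
        if f.closed:
            continue
        try:
            f.flush()
        except (OSError, ValueError):
            # Best effort at shutdown; a failing file must not block the rest.
            pass


# Run the flush during normal interpreter shutdown.
atexit.register(flush_all)
```

A WeakSet is used so that a file that is closed or collected normally disappears from the registry on its own. Files that were already closed are skipped rather than removed explicitly.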

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: -- nosy: -pitrou

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Armin Rigo added the comment: Here is a proof-of-concept. It changes both _pyio.py and bufferedio.c and was tested with the examples here. (See what I meant with linked lists.) The change in textio.c is still missing, but should be very similar to bufferedio.c. This is similar to the

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Armin Rigo
Changes by Armin Rigo ar...@users.sourceforge.net: -- keywords: +patch Added file: http://bugs.python.org/file37359/pyio.diff

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread STINNER Victor
STINNER Victor added the comment: By the way, I fixed various issues to ensure that ResourceWarnings are displayed at Python exit. There are still corner cases when the warnings are not emitted. For example, when a daemon thread is used. Or when the warning comes very late during Python

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-04 Thread Nikolaus Rath
Nikolaus Rath added the comment: This will probably be too radical, but I think it should at least be mentioned as a possible option. We could just not attempt to implicitly flush buffers in the finalizer at all. This means scripts relying on this will break, but in contrast to the current

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-03 Thread Nikolaus Rath
Changes by Nikolaus Rath nikol...@rath.org: -- nosy: +nikratio

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-03 Thread Nikolaus Rath
Nikolaus Rath added the comment: Serhiy, I believe this still happens in Python 3.4, but it is harder to reproduce. I couldn't get Armin's script to produce the problem either, but I'm pretty sure that this is what causes e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=771452#60.

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-12-03 Thread Charles-François Natali
Charles-François Natali added the comment: Serhiy, I believe this still happens in Python 3.4, but it is harder to reproduce. I couldn't get Armin's script to produce the problem either, but I'm pretty sure that this is what causes e.g.

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-11-20 Thread Armin Rigo
Armin Rigo added the comment: Victor: there is the GIL, you don't need any locking.

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-11-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: In 3.4+ the example script always writes the string "bar" to the file "foo". Tested by running it in a loop hundreds of times. Cleaning up at shutdown was enhanced in 3.4. -- nosy: +serhiy.storchaka

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-11-19 Thread STINNER Victor
STINNER Victor added the comment: Is anyone interested in working on the "maintain a list of open file objects" idea? I consider that Python 3 does its best to flush data at exit, but it's not a good practice to rely on the destructors to flush data. I mean, there is no guarantee that files will be

[issue17852] Built-in module _io can loose data from buffered files at exit

2014-11-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm +1 on closing this. I agree with Charles-François that it's never been guaranteed by the Python specification. The Pythonic way to work with files is to use the with statement or, if you need a long-lived file stream, to carefully close files in the finally block
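
The two styles named above look like this in practice. This is a minimal sketch; the function names are hypothetical and chosen only for illustration.

```python
def write_with_context_manager(path, data):
    """Short-lived file: the with statement closes (and flushes) it,
    even if write() raises."""
    with open(path, "w") as f:
        f.write(data)


def write_with_finally(path, chunks):
    """Longer-lived stream: close explicitly in a finally block so the
    buffer is flushed no matter how the loop ends."""
    f = open(path, "w")
    try:
        for chunk in chunks:
            f.write(chunk)
    finally:
        f.close()
```

In both cases the data reaches the file without relying on finalizers or on the interpreter shutdown sequence.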

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-30 Thread Charles-François Natali
Charles-François Natali added the comment: "Hum, POSIX (2004) is not so strict: 'Whether open streams are flushed or closed, or temporary files are removed is implementation-defined.' http://pubs.opengroup.org/onlinepubs/009695399/functions/exit.html" You're looking at the wrong section: The

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-30 Thread STINNER Victor
STINNER Victor added the comment: "It's guaranteed for exit(), not _Exit()/_exit()." Oops ok, thanks. 2013/4/30 Charles-François Natali rep...@bugs.python.org: Charles-François Natali added the comment: Hum, POSIX (2004) is not so strict: Whether open streams are flushed or closed, or
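
The exit() versus _exit() distinction has a direct Python-level analogue: os._exit() terminates the process without running finalizers or flushing Python's own buffers. The sketch below illustrates that analogue rather than the C stdio behaviour itself; CHILD and size_after_hard_exit are hypothetical names.

```python
import os
import subprocess
import sys

# Child script: write to a buffered file, never close it,
# then terminate immediately with os._exit().
CHILD = """
import os, sys
f = open(sys.argv[1], "w")
f.write("bar")
os._exit(0)
"""


def size_after_hard_exit(path):
    """Run the child in a fresh interpreter and return the size of the
    file it left behind."""
    subprocess.run([sys.executable, "-c", CHILD, path], check=True)
    return os.path.getsize(path)
```

Because os._exit() skips all cleanup, the buffered "bar" never reaches the disk and the file is left empty.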

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-29 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@gmail.com: -- nosy: +haypo

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-29 Thread STINNER Victor
STINNER Victor added the comment: In Python 2, a buffered file opened for writing is flushed by the C library when the process exits. When you say Python 2, I assume you mean CPython 2, right? Because - AFAICT - files got flushed only by accident, not by design. It looks to be a feature of

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-29 Thread Charles-François Natali
Charles-François Natali added the comment: When you say Python 2, I assume you mean CPython 2, right? Because - AFAICT - files got flushed only by accident, not by design. It looks to be a feature of the standard C library, at least the GNU libc. Its libio library installs an exit handler

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-29 Thread STINNER Victor
STINNER Victor added the comment: It looks to be a feature of the standard C library, at least the GNU libc. Yes, it's guaranteed by POSIX/ANSI (see man exit). Hum, POSIX (2004) is not so strict: Whether open streams are flushed or closed, or temporary files are removed is

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-28 Thread Charles-François Natali
Charles-François Natali added the comment: It used to be a consistently reliable behavior in Python 2 (and we made it so in PyPy too), provided of course that the process exits normally; but it no longer is in Python 3. Well I can see the reasons for not flushing files, if it's clearly

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-27 Thread Armin Rigo
New submission from Armin Rigo: In Python 2, a buffered file opened for writing is flushed by the C library when the process exits. In Python 3, the _pyio and _io modules don't do it reliably. They rely on __del__ being called, which is not necessarily the case. The attached example ends

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-27 Thread Antoine Pitrou
Antoine Pitrou added the comment: There are actually several issues here: * collection of globals at shutdown is wonky: you should add an explicit del a,f; gc.collect() at the end of the script * order of tp_clear calls, or another issue with TextIOWrapper: if you open the file in binary

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-27 Thread Charles-François Natali
Charles-François Natali added the comment: Code relying on garbage collection/shutdown hook to flush files is borked: - it's never been guaranteed by the Python specification (neither does Java/C#...) - even with an implementation based on C stdio streams, streams won't get flushed in case of

[issue17852] Built-in module _io can loose data from buffered files at exit

2013-04-27 Thread Armin Rigo
Armin Rigo added the comment: It used to be a consistently reliable behavior in Python 2 (and we made it so in PyPy too), provided of course that the process exits normally; but it no longer is in Python 3. Well I can see the reasons for not flushing files, if it's clearly documented