[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

Please consider the following before making a decision:

>> io.BufferedReader does not implement read1 (the last lines of trace below)
>
> It does. You made a mistake in your experiment (you called read1() on a
> FileIO object, not a BufferedReader object).

Please see the following lines:

    >>> cfl = open ('c:/temp9/Capability/Analyzing Data.mp4', 'rb', buffering = -1)
    >>> type(cfl)
    <class '_io.BufferedReader'>

As far as I can tell, it is an _io.BufferedReader and not just an _io.FileIO (the base class). Please tell me if I am wrong here.

--
status: pending -> open

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17440
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17440] Some IO related problems on x86 windows
Antoine Pitrou added the comment:

You called read1() on fl (a FileIO object) and not cfl (a BufferedReader object). Your fault for choosing confusing variable names :-)

    >>> len(fl.read1(70934549))
    Traceback (most recent call last):
      File "<pyshell#44>", line 1, in <module>
        len(fl.read1(70934549))
    AttributeError: '_io.FileIO' object has no attribute 'read1'

Please try to call cfl.read1() and see whether it works (it should).
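The distinction drawn above can be checked without the original video file. The following is a minimal sketch, assuming a throwaway temporary file; the names `sample_path`, `raw` and `buffered` are illustrative and do not come from the thread.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on a real path.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1000)
    sample_path = tmp.name  # hypothetical stand-in for the reporter's .mp4 file

raw = open(sample_path, "rb", buffering=0)  # unbuffered binary: a FileIO object
buffered = open(sample_path, "rb")          # default buffering: a BufferedReader

raw_type = type(raw).__name__
buffered_type = type(buffered).__name__
raw_has_read1 = hasattr(raw, "read1")
buffered_has_read1 = hasattr(buffered, "read1")

# read1() is available on the buffered object and returns at most 100 bytes here.
chunk = buffered.read1(100)

raw.close()
buffered.close()
os.remove(sample_path)

print(raw_type, raw_has_read1)
print(buffered_type, buffered_has_read1, len(chunk))
```

Giving the two objects clearly different names avoids the fl/cfl mix-up that caused the AttributeError in the report.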
[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

Please consider the following before making a decision:

>> io.FileIO does not implement a single OS system call on read() - instead
>> it reads a file until EOF, i.e. ignores the arguments supplied to read()
>
> Your experiments show otherwise, the argument supplied to read() is
> observed: if you call read(1024), at most 1024 bytes are returned, etc.
> It's only if you call read() without an argument that the file is being
> read until EOF.

I said this because I saw the following in the docs:

    class io.RawIOBase
        read(n=-1)
            Read up to n bytes from the object and return them. As a
            convenience, if n is unspecified or -1, readall() is called.
            Otherwise, only one system call is ever made. Fewer than n
            bytes may be returned if the operating system call returns
            fewer than n bytes.

If only one system call is being made, then I think that fl.read(256) and fl.read(70934549) should take the same amount of time to complete, assuming disk I/O is the time-consuming factor in this operation (as compared to memory processing). I am only saying that, instead of one system call being made, the whole size passed to read() is being read (by multiple system calls, as it appears to me). Please tell me if I am wrong here.
[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

@Antoine - wait, I will do it.
[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

@Antoine It worked. I was wrong to say read1() was not implemented. Sorry. But please do consider the other issues.
[issue17440] Some IO related problems on x86 windows
Antoine Pitrou added the comment:

> If only one system call is being made, then I think that fl.read(256)
> and fl.read(70934549) should take same amount of time to complete -
> assuming disk I/O is the time consuming factor in this operation (as
> compared to memory processing).

What do you mean? Reading a large number of bytes will most certainly always be slower than reading a small number of bytes, even if it only takes one system call. You still have to copy the data from disk or filesystem buffers into userspace. A reasonable expectation is for read(N) to be O(N), but not O(1). You might want to check that by timing it with different N values.
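The suggested timing check could look roughly like the sketch below. It assumes a 4 MiB temporary file rather than the 70 MB video from the report, and the helper name `time_raw_read` is made up for illustration. Absolute timings depend heavily on the OS file cache, so no particular numbers should be expected.

```python
import os
import tempfile
import time

FILE_SIZE = 4 * 1024 * 1024  # assumed test size, much smaller than the reporter's file

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(FILE_SIZE))
    sample_path = tmp.name


def time_raw_read(path, n):
    """Time one unbuffered read(n); return (bytes actually read, seconds elapsed)."""
    with open(path, "rb", buffering=0) as raw:
        start = time.perf_counter()
        data = raw.read(n)
        elapsed = time.perf_counter() - start
    return len(data), elapsed


results = [time_raw_read(sample_path, n) for n in (256, 65536, FILE_SIZE)]
os.remove(sample_path)

for length, elapsed in results:
    print(f"{length:>8} bytes in {elapsed * 1e6:.1f} microseconds")
```

A fresh file object is opened for each measurement so that every read starts at offset 0 and the calls do not affect each other's file position.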
[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

I did the following to understand the time taken for an in-memory copy:

    1  fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'rb')
    2  byt = fl.read(70934549)
    3  byt2 = None
    4  byt2 = byt[:]
    5  fl.close()
    6  fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'rb')
    7  byt = fl.read(1)

I found that the Python interpreter blocked for a negligible time on line 4 (and line 7) compared to line 2. I assume that line 4 is the correct syntax for an in place memory copy. That is why I assumed that multiple system calls could be taking place. Please tell me if I am incorrect.
[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

Sorry, typo in the last post: I meant an in-memory (memory-to-memory) copy, not an in-place memory copy.
[issue17440] Some IO related problems on x86 windows
Antoine Pitrou added the comment:

Bytes objects are immutable, so trying to copy them doesn't actually copy anything (it's an optimization):

    >>> b = b"x" * 10
    >>> id(b)
    139720033059920
    >>> b2 = b[:]
    >>> id(b2)
    139720033059920

FileIO.read() only calls the underlying read() once; you can check the implementation:
http://hg.python.org/cpython/file/8002f45377d4/Modules/_io/fileio.c#l703
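To measure a real memory-to-memory copy, a mutable bytearray is a better choice than bytes, since slicing a bytearray has to produce a new object. The sketch below contrasts the two; the variable names are illustrative, and the fact that a full slice of a bytes object returns the same object is a CPython implementation detail rather than a language guarantee.

```python
immutable = bytes(1000)      # 1000 zero bytes, immutable
immutable_slice = immutable[:]

mutable = bytearray(1000)    # 1000 zero bytes, mutable
mutable_slice = mutable[:]

# On CPython the full slice of a bytes object is typically the very same object.
bytes_shared = immutable_slice is immutable

# A bytearray slice must be an independent copy, because either side may change.
bytearray_shared = mutable_slice is mutable
mutable_slice[0] = 1         # modifies only the copy

print("bytes slice is same object:", bytes_shared)
print("bytearray slice is same object:", bytearray_shared)
print("original first byte:", mutable[0], "copy first byte:", mutable_slice[0])
```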
[issue17440] Some IO related problems on x86 windows
Gurmeet Singh added the comment:

Thanks for letting me know about the optimization. I trusted you that the system call is made only once, though I did look at the code to see whether the size of the read buffer is passed to the C routine. I should apologize for raising this issue, since it is incorrect. But I think you would be interested (out of CURIOSITY) in the findings from the last experiment I did to understand this issue:

     1  import io
     2  fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'rb')
     3  barr = bytearray(70934549)
     4  barr2= bytearray(70934549)
     5  id(barr)
        29140440
     6  id(barr2)
        26433560
     7  fl.readinto(barr)
        70934549
     8  barr2 = barr[:]
     9  fl.close()
    10  fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'rb')
    11  barrt = bytearray(1)
    12  id(barrt)
        34022512
    13  fl.readinto(barrt)
        1
    14  fl.close()

The time taken by line 7 was much greater than that of line 13. It was also greater than that of line 8 (though not by as much as for line 11). But I cannot say for sure that the time for line 13 plus line 8 is equal to or less than that of line 7; it looks less, but it needs more precise testing to say anything further.

I tried to reason about the situation as follows (for this I looked up the hyperlink that you gave). I feel that the underlying system call takes the size argument, so I guess that a large value makes the C compiler ask the disk subsystem to read the longer data; hence it takes more time, since disk access is slower.

Thanks for your time. Sorry for the incorrect issue.
[issue17440] Some IO related problems on x86 windows
Antoine Pitrou added the comment:

> The time of line 7 was much greater than line 13.

Well, yes, reading 70 MB takes much longer than reading a single byte :-)

> I feel that the underlying system call takes the size argument

Indeed it does. It would be totally inefficient if it didn't.

> so I guess that large value suggests the C compiler to make ask the
> disk subsystem to read up the longer data - hence it takes the time
> since disk access is slower.

It's not the C compiler. It's the OS kernel which reads data from the disk when you ask it to.
[issue17440] Some IO related problems on x86 windows
Antoine Pitrou added the comment:

Anyway, I'm now closing the issue as invalid.

--
status: open -> closed
[issue17440] Some IO related problems on x86 windows
Changes by Serhiy Storchaka storch...@gmail.com:

--
components: +IO
nosy: +benjamin.peterson, hynek, pitrou, stutzbach
[issue17440] Some IO related problems on x86 windows
New submission from Gurmeet Singh:

1. The read mode is not the default mode as mentioned on docs.python.org. In particular, see the first traceback below: 'b' does not work (as it does in C though) and you are forced to use 'rb' instead.

2. io.BufferedReader does not implement read1 (the last lines of trace below).

3. io.FileIO does not implement a single OS system call on read() - instead it reads a file until EOF, i.e. ignores the arguments supplied to read() - larger arguments are slower to execute (see the read calls in the trace below).

    >>> import io
    >>> fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'r')
    >>> byt = fl.read()
    >>> len(byt)
    70934549
    >>> fl.close()
    >>> fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'r')
    >>> byt = fl.read(256)
    >>> len(byt)
    256
    >>> byt = fl.read(512)
    >>> len(byt)
    512
    >>> byt = fl.read(1024)
    >>> len(byt)
    1024
    >>> byt = fl.read(4096)
    >>> len(byt)
    4096
    >>> byt = fl.read(10240)
    >>> len(byt)
    10240
    >>> len(fl.read(40960))
    40960
    >>> len(fl.read(102400))
    102400
    >>> len(fl.read(1048576))
    1048576
    >>> fl.close()
    >>> fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'r')
    >>> len(fl.read(70934549))
    70934549
    >>> len(fl.read(70934549))
    0
    >>> fl.close()
    >>> fl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'r')
    >>> b = bytearray(70934549)
    >>> fl.readinto(b)
    70934549
    >>> fl.close()
    >>> fl = open ('c:/temp9/Capability/Analyzing Data.mp4', 'b', buffering = 0)
    Traceback (most recent call last):
      File "<pyshell#31>", line 1, in <module>
        fl = open ('c:/temp9/Capability/Analyzing Data.mp4', 'b', buffering = 0)
    ValueError: Must have exactly one of create/read/write/append mode and at most one plus
    >>> fl = open ('c:/temp9/Capability/Analyzing Data.mp4', 'rb', buffering = 0)
    >>> type(fl)
    <class '_io.FileIO'>
    >>> cfl = io.FileIO('c:/temp9/Capability/Analyzing Data.mp4', 'r')
    >>> type(cfl)
    <class '_io.FileIO'>
    >>> cfl.close()
    >>> cfl = open ('c:/temp9/Capability/Analyzing Data.mp4', 'rb', buffering = -1)
    >>> type(cfl)
    <class '_io.BufferedReader'>
    >>> io.DEFAULT_BUFFER_SIZE
    8192
    >>> len(fl.read(70934549))
    70934549
    >>> cfl.close()
    >>> cfl = open ('c:/temp9/Capability/Analyzing Data.mp4', 'rb', buffering = -1)
    >>> len(fl.read1(70934549))
    Traceback (most recent call last):
      File "<pyshell#44>", line 1, in <module>
        len(fl.read1(70934549))
    AttributeError: '_io.FileIO' object has no attribute 'read1'

--
messages: 184330
nosy: gsingh
priority: normal
severity: normal
status: open
title: Some IO related problems on x86 windows
[issue17440] Some IO related problems on x86 windows
Antoine Pitrou added the comment:

> 1. The read mode is not the default mode as mentioned in the
> docs.python.org.

It is. If you don't mention a mode, the mode is 'r' by default. But if you mention a mode, then you are required to specify one of 'r', 'w', 'a'.

> io.BufferedReader does not implement read1 (the last lines of trace
> below)

It does. You made a mistake in your experiment (you called read1() on a FileIO object, not a BufferedReader object).

> io.FileIO does not implements single OS system call on read() - instead
> reads a file until EOF i.e. ignores the arguments supplied to read()

Your experiments show otherwise: the argument supplied to read() is observed. If you call read(1024), at most 1024 bytes are returned, etc. It's only if you call read() without an argument that the file is read until EOF.
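Both points can be reproduced on a small file. The sketch below assumes a 5000-byte temporary file; `sample_path` and the other variable names are illustrative only. It shows that omitting the mode gives 'r', that a mode of 'b' alone is rejected, and that read(n) honours its argument while read() with no argument reads to EOF.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a" * 5000)
    sample_path = tmp.name  # hypothetical stand-in for the reporter's file

# No mode given: the file is opened for reading in text mode.
with open(sample_path) as f:
    default_mode = f.mode

# A mode string was given, but it lacks one of the required r/w/a letters.
try:
    open(sample_path, "b")
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None

# Unbuffered reads: read(1024) returns at most 1024 bytes, read() returns the rest.
with open(sample_path, "rb", buffering=0) as raw:
    first = raw.read(1024)
    rest = raw.read()

os.remove(sample_path)

print("default mode:", default_mode)
print("error for mode 'b':", error_message)
print("read(1024) returned", len(first), "bytes; read() returned", len(rest))
```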
[issue17440] Some IO related problems on x86 windows
Changes by Amaury Forgeot d'Arc amaur...@gmail.com:

--
resolution: -> invalid
status: open -> pending