On 2/27/07, Paul Moore <[EMAIL PROTECTED]> wrote:
[...]
> Documenting the revised open() factory in this PEP would be useful. It
> needs to address encoding issues, so it's not a simple copy of the
> existing open().
Check the doc again. I added one at the end. It could use some review.
I also added an elaboration in the p3yk branch in svn; that could use
some review as well.

> Also, should there be a factory method for opening raw byte streams?

The open() I added returns a raw byte stream when you specify binary
mode with buffering=0.

> Once we start down this route, we open the can of worms, of course
> (does socket.socket need to be specified in terms of the new IO
> layers?

No, but check the io.py in svn; it has a SocketIO class that wraps a
socket. Sockets themselves are much lower level than this; they have
all sorts of other APIs. The SocketIO class only works for stream
sockets (e.g., TCP/IP).

> what about the mmap module, the gzip/zipfile/tarfile modules,
> etc?) These should probably be noted in an "open issues" section, and
> otherwise deferred for now.

Agreed that we should add these to the open issues section. I don't
think we should mess with mmap, but *perhaps* an mmap wrapper could be
provided (by the mmap module). gzip, bzip2 etc. should probably be
redefined in terms of the buffered (bytes) reader/writer protocol.
zipfile and tarfile should take bytes readers/writers; the API they
*provide* should be defined in terms of bytes and perhaps (when
appropriate, I don't recall if they have read/write methods) in terms
of buffered byte streams.

It *may* even be useful if many of these supported non-blocking I/O;
we're currently considering adding a standard API for returning
"EWOULDBLOCK" errors (e.g. return None from read() and write()) --
though we won't be providing an API to turn that on (since it depends
on the underlying implementation, e.g. sockets vs. files).

> > The BufferedReader implementation is for sequential-access read-only
> > objects. It does not provide a .flush() method, since there is no
> > sensible circumstance where the user would want to discard the read
> > buffer.
>
> It's not something I've done personally, but programs sometimes flush
> a read buffer before (eg) reading a password from stdin, to avoid
> typeahead problems. I don't know if that would be relevant here.

We discussed this briefly at the sprint and came to the conclusion
that this is outside the scope of the PEP; you can do this by
(somehow) enabling non-blocking mode and then reading until you get
None.

> > Another way to do it is as follows (we should pick one or the other):
> >
> >     .__init__(self, buffer, encoding=None, newline=None)
> >
> >     Same as above, but if newline is not None use that as the
> > newline pattern (for reading and writing); if newline is not set,
> > attempt to find the newline pattern from the file, and if we can't
> > for some reason, use the system default newline pattern.
>
> I'm not sure that can work - the point of universal newlines is that
> *any* of \n, \r or \r\n count as a newline, so there's no one pattern.
> So I think that explicitly specifying universal newlines is necessary
> (even though it's clunky).

I think for input we should always accept all three line endings, so
you never need to specify anything; for output, we should pick a
platform default (\r\n on Windows, \n everywhere else) and have an API
to override it. So the API you quote above sounds about right:

    .__init__(self, buffer, encoding=None, newline=None)

I'd like to constrain newline to be either \n or \r\n for writing; for
reading IMO it should not be specified.

I've appended a few rough sketches below to make some of this more
concrete.
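On the raw-byte-stream question: here's roughly what I have in mind
from the caller's side. This is only a sketch; I'm using io.open,
RawIOBase and BufferedIOBase as placeholder names, which may not match
the io.py in svn exactly.

    import io

    # Create a small file to experiment with.
    with io.open('dump.bin', 'wb') as f:
        f.write(b'hello world')

    # Binary mode with buffering=0 returns the raw, unbuffered layer.
    f = io.open('dump.bin', 'rb', buffering=0)
    assert isinstance(f, io.RawIOBase)     # e.g. a FileIO-like object
    data = f.read(4096)                    # one low-level read; may be short
    f.close()

    # Default buffering wraps the raw object in a buffered reader instead.
    f = io.open('dump.bin', 'rb')
    assert isinstance(f, io.BufferedIOBase)
    f.close()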
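To give an idea of what a socket wrapper amounts to -- this is a
from-scratch sketch of the concept, not the SocketIO code in svn, and
the class and method names are just placeholders:

    import io

    class SocketIOSketch(io.RawIOBase):
        """Expose a connected *stream* socket (e.g. TCP) as a raw byte
        stream; datagram sockets don't fit this model."""

        def __init__(self, sock):
            self._sock = sock

        def readable(self):
            return True

        def writable(self):
            return True

        def readinto(self, b):
            # Fill the caller's buffer directly; returns the byte count
            # (0 means EOF on a stream socket).
            return self._sock.recv_into(b)

        def write(self, b):
            # send() may accept fewer bytes than offered; a buffered
            # layer on top is expected to call write() again.
            return self._sock.send(b)

    # Typical use would be to stack a buffered reader on top:
    #   reader = io.BufferedReader(SocketIOSketch(conn))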
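For anyone who really wants the "discard typeahead" trick Paul
mentions, on Unix it could be spelled roughly like this. Again a
sketch only: it assumes a raw byte stream with a real fileno() and the
proposed convention that read() returns None when it would block.

    import fcntl
    import os

    def drain_pending_input(raw, bufsize=4096):
        """Discard whatever input is already pending on `raw`, then
        restore the original blocking mode (Unix only)."""
        fd = raw.fileno()
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        try:
            while True:
                chunk = raw.read(bufsize)
                if not chunk:      # None (would block) or b'' (EOF)
                    break
        finally:
            fcntl.fcntl(fd, fcntl.F_SETFL, flags)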
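And to make the newline proposal concrete, the translation I have in
mind amounts to something like this. These are illustrative helper
functions only, not proposed API names:

    import re
    import sys

    # Reading: \r\n, \r and \n all count as line endings; normalize to \n.
    # (\r\n must come first in the pattern so it isn't seen as two endings.)
    def translate_input(text):
        return re.sub(r'\r\n|\r', '\n', text)

    # Writing: \n is translated to the chosen newline; only '\n' and '\r\n'
    # are accepted, defaulting to the platform convention.
    def translate_output(text, newline=None):
        if newline is None:
            newline = '\r\n' if sys.platform == 'win32' else '\n'
        if newline not in ('\n', '\r\n'):
            raise ValueError("newline must be '\\n' or '\\r\\n'")
        return text.replace('\n', newline)

    assert translate_input('a\r\nb\rc\n') == 'a\nb\nc\n'
    assert translate_output('a\nb\n', newline='\r\n') == 'a\r\nb\r\n'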
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
