Re: [python-tulip] Async iterators
On Tue, Mar 24, 2015 at 4:10 AM, Victor Stinner wrote:
> 2015-03-24 2:44 GMT+01:00 Guido van Rossum :
> > For seekable() I couldn't find any dynamic implementations,
>
> The first call to io.FileIO.seekable() calls lseek(0, SEEK_CUR).

Oops. :(

> It's safer to expect that any file method can block on I/O.

Yup.

> If you doubt that syscalls can block, try unbuffered FileIO on an NFS
> share with metadata cache disabled ("mount -o noac" on Linux). Unplug
> the network cable and enjoy :-)
>
> I checked yesterday with fstat(): the syscall blocks until the network
> cable is plugged again. At least on Linux, it's not possible to
> interrupt fstat() with a signal like CTRL+c :-(

That's a sad state of the world. NFS just sucks in so many ways...

This also means that if you use a thread pool for this, it might fill up with tasks that won't make progress, and eventually your thread pool will block all tasks (unless it's not really a thread pool :-). I guess we need timeouts on everything and eventually just kill the process. :-(

--
--Guido van Rossum (python.org/~guido)
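[Editor's note: the "timeouts on everything" idea above can be sketched as follows. This is a hedged illustration, not code from the thread, and it uses today's async/await syntax rather than the yield-from style the thread predates. The function names are made up for the example.]

```python
import asyncio
import os
import tempfile

# Sketch: dispatch a potentially blocking syscall to the default thread
# pool and bound the *caller's* wait with a timeout. Note the limitation
# Guido alludes to: the timeout only unblocks the coroutine -- a worker
# thread stuck in an uninterruptible syscall (e.g. fstat() on a dead NFS
# mount) stays stuck, so the pool can still fill up with wedged threads.

async def fstat_with_timeout(fd, timeout=5.0):
    loop = asyncio.get_running_loop()
    fut = loop.run_in_executor(None, os.fstat, fd)
    return await asyncio.wait_for(fut, timeout)

def demo():
    with tempfile.TemporaryFile() as f:
        f.write(b"hello")
        f.flush()
        st = asyncio.run(fstat_with_timeout(f.fileno()))
        return st.st_size

print(demo())
```

On the healthy local file above this returns immediately; against a wedged NFS mount the caller would get asyncio.TimeoutError after 5 seconds, while the worker thread remains blocked, which is exactly the thread-pool exhaustion problem described.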
Re: [python-tulip] Async iterators
2015-03-24 2:44 GMT+01:00 Guido van Rossum :
> For seekable() I couldn't find any dynamic implementations,

The first call to io.FileIO.seekable() calls lseek(0, SEEK_CUR).

It's safer to expect that any file method can block on I/O.

If you doubt that syscalls can block, try unbuffered FileIO on an NFS share with metadata cache disabled ("mount -o noac" on Linux). Unplug the network cable and enjoy :-)

I checked yesterday with fstat(): the syscall blocks until the network cable is plugged again. At least on Linux, it's not possible to interrupt fstat() with a signal like CTRL+c :-(

Victor
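[Editor's note: Victor's conservative rule, "expect that any file method can block", suggests routing even innocuous-looking methods through a thread. A hedged sketch, in modern async/await syntax; the helper names are illustrative, not an actual aiofiles API.]

```python
import asyncio
import io
import tempfile

# Sketch: treat *every* raw file method, even seekable() or isatty(),
# as a potentially blocking call and run it in the default thread pool.

async def in_thread(func, *args):
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, func, *args)

async def probe(path):
    raw = await in_thread(io.FileIO, path, 'rb')  # open(2) blocks too
    try:
        seekable = await in_thread(raw.seekable)  # first call does lseek(2)
        tty = await in_thread(raw.isatty)         # also makes a syscall
        return seekable, tty
    finally:
        await in_thread(raw.close)                # close(2) can block on NFS

def demo():
    with tempfile.NamedTemporaryFile() as f:
        return asyncio.run(probe(f.name))

print(demo())
```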
Re: [python-tulip] Async iterators
On Mon, Mar 23, 2015 at 8:39 PM, Tin Tvrtković wrote:
> f = yield from aiofiles.open('test.bin', mode='rb')
> try:
>     data = yield from f.read(512)
> finally:
>     yield from f.close()

That's awesome, Tin!

> I've run into two difficulties - first, it's difficult for me to tell which
> calls may actually block (does 'isatty' block? does 'seekable' block [I
> think so]?) and which don't have to go through an executor. But this is a
> question for another day. :)

I'd recommend taking a look at the Node.js filesystem API. Their philosophy is: anything that needs to go to disk is blocking, and everything that is blocking must have a callback. Just look at the API for their fs module:

https://nodejs.org/api/fs.html

For convenience, some functions have a non-callback "synchronous" version. Those have a Sync suffix, e.g.

fs.stat(path, callback)
fs.statSync(path)

Cheers,

Luciano

--
Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Professor at: http://python.pro.br | Twitter: @ramalhoorg
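[Editor's note: Luciano's fs.stat / fs.statSync pairing can be transplanted into Python with the convention inverted: the coroutine gets the plain name and the blocking variant gets an explicit suffix. A hedged sketch; the names are made up for illustration and are not an aiofiles API.]

```python
import asyncio
import os

def stat_sync(path):
    # Plain blocking call, analogous to Node's fs.statSync(path).
    return os.stat(path)

async def stat(path):
    # Analogous to fs.stat(path, callback), but a coroutine instead
    # of a callback: the blocking work runs in the default executor.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, stat_sync, path)

print(asyncio.run(stat(".")).st_size == stat_sync(".").st_size)
```

The design choice mirrors Node's: making the asynchronous spelling the default nudges library users toward the non-blocking path, while the suffixed name flags every blocking call at the call site.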
Re: [python-tulip] Async iterators
On Mon, Mar 23, 2015 at 4:39 PM, Tin Tvrtković wrote:
> Hello,
>
> following the discussion from
> https://groups.google.com/forum/?fromgroups#!topic/python-tulip/iGPv24gTpAI,
> I've been working on a small library for async access to files through a
> thread pool. I've been aiming to emulate the existing file API as much as
> possible:
>
> f = yield from aiofiles.open('test.bin', mode='rb')
> try:
>     data = yield from f.read(512)
> finally:
>     yield from f.close()

Cool project!

> I've run into two difficulties - first, it's difficult for me to tell
> which calls may actually block (does 'isatty' block? does 'seekable' block
> [I think so]?) and which don't have to go through an executor. But this is
> a question for another day. :)

isatty() can definitely make a system call -- e.g. http://opensource.apple.com/source/Libc/Libc-167/gen.subproj/isatty.c -- so it should be considered blocking.

For seekable() I couldn't find any dynamic implementations, but IIRC it's possible to implement this by trying to seek to the current position and catching the error (and then caching the result so subsequent calls won't have to do this). You should probably try to find at least one such implementation -- if you can't find one, assume it won't be needed. (After all, you're *defining* how things will behave in your version here.)

> The second is that certain nifty file operations can't really be ported to
> the async world; for example context managers. A file close may block, I
> believe, so __exit__ would need to be yielded from, and that's currently
> impossible, right?

Right.

> Also, iterating over the file is presenting me with difficulties. There's
> no way for __next__ to be a coroutine, right? So __next__ would have to
> return futures. But how to know when to raise StopIteration without
> actually doing IO? Also, all the futures would basically be the same --
> calling readline() in an executor. So if a user accidentally (or on
> purpose maybe) doesn't actually yield from the futures right away, the
> iteration would spin infinitely.
>
> I'm thinking implementing something like this isn't worth the trouble, and
> users should just be instructed to use a while loop and readline() until
> an empty result comes back. I'd appreciate comments on my conclusions,
> from the experts. :)

Sounds like a plan. This is where I left it with the asyncio.streams API as well.

> I will say one thing, I've learned a lot about Python 3's file IO stack :)

You're welcome!

--
--Guido van Rossum (python.org/~guido)
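[Editor's note: the while-loop-plus-readline() pattern Tin proposes and Guido endorses can be sketched as below. A hedged illustration in modern async/await syntax: aiofiles is stood in for by a plain file object whose readline() is pushed through the default executor, so the example is self-contained.]

```python
import asyncio
import tempfile

async def readline_async(f):
    # Stand-in for an aiofiles-style coroutine readline().
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, f.readline)

async def read_all_lines(f):
    # The recommended pattern: loop until readline() returns an empty
    # result (EOF), instead of trying to make __next__ asynchronous.
    lines = []
    while True:
        line = await readline_async(f)
        if not line:
            break
        lines.append(line)
    return lines

def demo():
    with tempfile.TemporaryFile() as f:
        f.write(b"alpha\nbeta\n")
        f.seek(0)
        return asyncio.run(read_all_lines(f))

print(demo())
```

Because EOF is detected from the awaited result itself, there is no need to know when to raise StopIteration ahead of the I/O, and a caller who forgets to await cannot make the loop spin: each iteration starts only after the previous line has actually arrived.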