On Fri, Sep 5, 2014 at 12:32 PM, Christoph Hellwig <h...@infradead.org> wrote:
> On Fri, Sep 05, 2014 at 12:27:21PM -0400, Milosz Tanski wrote:
>> I would prefer an interface more like recv() where I can specify the
>> flag if I want blocking behavior for this read or not. Let me explain
>> why:
>>
>> In a VLDB like workload this would enable me to lower the latency of
>> common fast requests and. By fast requests I mean ones that do not
>> require much data, the data is cached, or there's a predictable read
>> pattern (read-ahead). Obviously it would be at the expense of the
>> latency of large/slow requests (they have to make 2 read calls, the
>> first one always EWOULDBLOCK) ... but in that case it doesn't matter
>> since the time to do actual IO would trump any kind of extra latency.
>
> This is another good suggestion. I've actually heard people asking
> for allowing per-I/O flags for other use cases. The one I can
> remember is applying O_DSYNC only for FUA writes on a SCSI target,
> the other one would be Samba again, as SMB allows per-I/O flags on
> the wire as well.
>
>> Essentially, it's using the kernel facilities (page cache) to help me
>> perform better (in a more predictable fashion). I would implement this
>> in our application tomorrow. It's frustrating that there is a similar
>> interface (recv* family) that I cannot use.
>>
>> I know there's been a bunch of attempts at buffered AIO and none of
>> them made it into the kernel. It would let me build a buffered AIO
>> implementation in user-space using a threadpool. And cached data would
>> not end up getting blocked behind other non-cached requests sitting in
>> the queue. I know there's other sources of blocking (locking, metadata
>> lookups) but direct AIO already suffers from these so I'm fine to
>> paper over that for now.
>
> Although I still think providing useful AIO at the kernel level would be
> better than having everyone reimplement it, it still would be useful to
> allow people to sanely reimplement it. If only to avoid the discussion
> about what API to use between the non-standard and not really that nice
> Linux io_submit and the utterly horrible Posix aio_ semantics.

Yeah, I would love for that to happen, but I've been lurking and
following the non-blocking buffered AIO discussions and attempts on lkml
since about 2008, and the threads go back much further than that, about
12 years. I would take a much less ambitious read/pread syscall that
gets me 90% of the way there, and I can build the remainder in
user-space. It also has the nice side effect of providing a not-horrible
fallback for older/non-Linux systems, where all IO goes into the thread
pool (without the option to skip it).

-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: mil...@adfin.com
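
P.S. To make the fast-path/slow-path idea concrete, here is a rough
sketch in C. It is only an illustration under stated assumptions:
nb_pread_fn stands in for the hypothetical per-call non-blocking read
discussed in this thread, and read_cached_or_block, read_result and the
two stubs are names made up for the example. The slow path calls pread()
inline where a real server would hand the request to its thread pool.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* HYPOTHETICAL: stand-in for the proposed per-call non-blocking read.
 * Assumed contract: returns the byte count when the data can be served
 * without blocking, or -1 with errno == EAGAIN/EWOULDBLOCK otherwise. */
typedef ssize_t (*nb_pread_fn)(int fd, void *buf, size_t len, off_t off);

enum read_path { READ_FAST, READ_SLOW };

struct read_result {
    ssize_t n;            /* bytes read, or -1 with errno set */
    enum read_path path;  /* which path produced the result */
};

static struct read_result
read_cached_or_block(int fd, void *buf, size_t len, off_t off,
                     nb_pread_fn try_nb)
{
    struct read_result r;

    assert(try_nb != NULL);

    /* Fast path: one non-blocking attempt on the calling thread. */
    r.path = READ_FAST;
    r.n = try_nb(fd, buf, len, off);
    if (r.n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
        return r;  /* served without blocking, or a real error */

    /* Slow path: the extra call is noise next to the actual IO.
     * A real server would queue this to a thread pool instead of
     * blocking here; the inline pread() keeps the sketch short. */
    r.path = READ_SLOW;
    r.n = pread(fd, buf, len, off);
    return r;
}

/* Illustration-only stub: pretends the data was already cached. */
static ssize_t stub_cache_hit(int fd, void *buf, size_t len, off_t off)
{
    (void)fd; (void)off;
    memset(buf, 'x', len);
    return (ssize_t)len;
}

/* Illustration-only stub: pretends the read would have blocked. */
static ssize_t stub_cache_miss(int fd, void *buf, size_t len, off_t off)
{
    (void)fd; (void)buf; (void)len; (void)off;
    errno = EAGAIN;
    return -1;
}
```

On a system without any non-blocking read, passing a stub that always
fails with EAGAIN (like stub_cache_miss) sends every request down the
slow path, which is the fallback behavior mentioned above.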