On Wed, Jan 25, 2017 at 2:52 PM, Andres Freund <and...@anarazel.de> wrote:
> You'll, depending on your workload, still have a lot of lseeks even if
> we were to use pread/pwrite because we do lseek(SEEK_END) to get file
> sizes.
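
For concreteness, the system-call patterns under discussion look roughly like the sketch below. This is not PostgreSQL code: the function names are hypothetical and exist only to illustrate the syscall sequences, assuming a POSIX system.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: positioned read done as two system calls.
 * lseek() moves the descriptor's file offset, then read() consumes from it. */
ssize_t read_at_with_lseek(int fd, void *buf, size_t len, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t) -1)
        return -1;
    return read(fd, buf, len);
}

/* Hypothetical helper: the same read as one system call.
 * pread() takes the offset as an argument and does not change the file offset. */
ssize_t read_at_with_pread(int fd, void *buf, size_t len, off_t offset)
{
    return pread(fd, buf, len, offset);
}

/* Hypothetical helper: file size via lseek(SEEK_END).
 * This lseek remains even if every read and write uses pread/pwrite.
 * Note that it also leaves the file offset at end of file. */
off_t file_size_with_lseek(int fd)
{
    return lseek(fd, 0, SEEK_END);
}
```

Switching reads to the pread variant removes one lseek per read, but the size query in the last function is unaffected, which is the point of the quoted remark.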

I'm pretty convinced that the lseek overhead that we're incurring right now is excessive. I mean, the Linux kernel guys fixed lseek to scale better more or less specifically because of PostgreSQL, which indicates that we're hitting it harder than most people.[1] And, more concretely, I've seen strace -c output where the time spent in lseek is far ahead of any other system call -- so if lseek overhead is negligible, then all of our system call overhead taken together is negligible, too.

Having said that, it's probably not a big percentage of our runtime right now -- on normal workloads, it's probably some number of tenths of one percent. But I'm not sure that's a good reason to ignore it. The more we CPU-optimize other things (say, expression evaluation!) the more significant the things that remain will become. And we've certainly made performance fixes to save far fewer cycles than we're talking about here.[2]

I'm no longer very sure fixing this is a very simple thing to do, partly because of the use of lseek to get the file size which you note above, and partly because of the possibility that this may, for example, break read-ahead, as Tom worried about previously.[3] But I think dismissing this as not-really-a-problem is the wrong approach.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://www.postgresql.org/message-id/201110282133.18125.and...@anarazel.de
[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2781b4bea7db357be59f9a5fd73ca1eb12ff5a79
[3] https://www.postgresql.org/message-id/6352.1471461075%40sss.pgh.pa.us

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers