On Sat, Aug 8, 2015 at 5:31 AM, Shay Rojansky <r...@roji.org> wrote: > Hi everyone, I'm seeing some strange behavior and wanted to confirm it. > > When executing a query that selects a long result set, if the code > processing the results takes its time (i.e.g more than statement_timeout), a > timeout occurs. My expectation was that statement_timeout only affects query > *processing*, and does not cover the frontend actually processing the > result. > > First, I wanted to confirm that this is the case (and not some sort of bug > in my code). > > If this is the case, is this intended? Here are some points: > * It makes statement_timeout very difficult to use; clients do very diverse > things with database results, it may be very difficult (or impossible) to > estimate how much time they should take (e.g. random load factors on the > client machine or on some other machine receiving results). > * It makes it impossible to specifically detect problematic *queries* which > take too long to execute (as opposed to the time taken to process their > results). > > If you do insist that this behavior is correct, a documentation update for > statement_timeout might make this clearer.
I think the issue here is that we start returning rows to the client while the query is still executing. Suppose each row of output takes 100 ms of CPU time to generate and there are 1,000,000 rows. Then it's quite conceivable that we could be under the statement_timeout when we start returning rows to the client, but over the statement_timeout by the time we finish -- and the user would probably want statement_timeout to kick in in that case, because that's a lotta CPU time. I suppose we could try to toll statement_timeout while we're blocked waiting for the client, but nobody wrote the code for that yet. And it would mean that you can't use statement_timeout to prevent xmin from lagging, which could be why you set it in the first place. There might also be some usability difficulties: pg_stat_activity shows the time the query started, so if you know what statement_timeout is you can tell how close it is to being killed. If some of the time isn't counted, then you can't tell any more. Another approach (which I think might be better) is to have GUCs like statement_cpu_limit and statement_io_limit that kill a query when it uses more than the configured amount of that resource. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers