Re: [Firebird-devel] RFC: Timeouts
OK, after reading the thread again and again, I think I'm starting to understand what was Vlad shooting for. And I think his implementation makes sense (so I'm fine moving it forward). This also makes sense reading some of Vlad's scenarios in docs (although I see it more like client-side only feature). *But.* With that I see another *highly* related feature. Some kind of resource governor. In our case (Sean, Mark, originally me) the simplest form of _working_ time consumed (so it's disk IO, index IO, memory, CPU all together - I think we don't have resources for extra detailed implementation). That would be only server-side configuration (_maybe_ option for DPB), because I see it strictly as DBA safety mechanism. -- Mgr. Jiří Činčura Independent IT Specialist -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 21:15, Jiří Činčura wrote: >> But, client side already can set it own timer and cancel the statement. > > To add to what others said. Isn't this feature, also, about helping i.e. > DBA to keep bad queries slowing down the server (considering (s)he has > no control over the application's code itself)? Sure. And it is stated in docs at first place: --- The feature could be useful for: - database administrators get instrument to limit heavy queries from consuming too much resources - application developers could use statement timeout when creating\debugging complex queries with unknown in advance execution time - testers could use statement timeout to detect long running queries and ensure finite run time of the test suites - and so on --- Regards, Vlad -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
> But, client side already can set it own timer and cancel the statement. To add to what others said. Isn't this feature, also, about helping i.e. DBA to keep bad queries slowing down the server (considering (s)he has no control over the application's code itself)? -- Mgr. Jiří Činčura Independent IT Specialist -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
>But, client side already can set it own timer and cancel the statement. At current time, you may forget about cancel of operation. Because it works incorrectly. Dmitry Kovalenko. -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 17:21, Adriano dos Santos Fernandes wrote: > > But, client side already can set it own timer and cancel the statement. It was considered. However, it would mean that our implementation is completely useless for Java and .NET clients, they would have to implement timeouts from scratch. Dmitry -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
Em 25/02/2017 08:03, Mark Rotteveel escreveu: > On 25-2-2017 10:49, Dmitry Yemanov wrote: >> Depending on the plan, statement may take 99% of its "working" time >> inside execute() or inside fetch() or that time could be distributed >> among the API calls. Neither client nor DBA has any control on that. So >> I consider seriously wrong removing fetching time from the accounting. >> >> From the client side, a timeout can be seen from two different angles. >> It could be either statement execution time (including fetches, see >> above) - this is what we have implemented now. > > But it is not the execution time (which, in my view, is the time spent > in the engine) that is constrained by the current implementation - as I > understood it (correct my if I'm wrong) -, it is the total wall-clock > time which includes time 'waiting' on the user and doing nothing in the > server. > It also seems server doing client jobs. This timeout will start counting when engine receives the query, but from client POV, it should start when client executes the statement. That makes a whole difference in network and small timeout in milliseconds. Adriano -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
Em 25/02/2017 06:49, Dmitry Yemanov escreveu: > So we have different requirements for the same feature and they conflict > with each other. The positive side of the implemented solution is that > it suits both client-side and server-side usage. Sean's suggestion does > not fit client-side usage, IMO. Also, if the timer is suspended/resumed > too often (per every fetch call), it gonna be costly. But, client side already can set it own timer and cancel the statement. Adriano -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
On 25-2-2017 11:15, Dmitry Yemanov wrote: > 25.02.2017 12:37, Mark Rotteveel wrote: > >>> Do you/anyone know if these engines return full results sets or follow the >>> "page set" approach? >> >> As far as I know Oracle[1], PostgreSQL[2], SQL Server[3] support it. I >> believe MySQL does as well. Don't know about other database systems, but >> I assume most of them will support this (the SQL CLI standard also >> defines it). > > AFAIK, in these databases page sets are primarily a transport/API > feature. The important question, however, is whether the engine does > some work during fetch() besides copying the row(s) for the next batch. > I suspect InterBase/Firebird is the only engine that may process > execute() as a no-op and perform all the data retrieval / joining / etc > during fetching. I know Oracle does, although it might depend on some optimizer decisions and hints (eg first_rows(n) hint). Not sure about the others. Mark -- Mark Rotteveel -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
On 25-2-2017 10:49, Dmitry Yemanov wrote: > Depending on the plan, statement may take 99% of its "working" time > inside execute() or inside fetch() or that time could be distributed > among the API calls. Neither client nor DBA has any control on that. So > I consider seriously wrong removing fetching time from the accounting. > > From the client side, a timeout can be seen from two different angles. > It could be either statement execution time (including fetches, see > above) - this is what we have implemented now. But it is not the execution time (which, in my view, is the time spent in the engine) that is constrained by the current implementation - as I understood it (correct my if I'm wrong) -, it is the total wall-clock time which includes time 'waiting' on the user and doing nothing in the server. > Or it could be the API > call timeout, to avoid "blocking" for more than the specified time. This > means that any higher-level accounting should be done by the application > or connectivity library. I guess the latter approach gonna cost much > more due to often timer resets. It also adds more work to the > client-side developers. > > Now about what is "working time". If the client specifies 10 seconds, I > really doubt it expects to see the statement timing out after 30 > seconds. From the client POV, "working time" includes all the waits, > round-trips and so on. It's not about server resources, they are outside > client's business. It's rather about an application reaction time. So I > consider Vlad's position perfectly valid. And I'm surprised that Mark > and Jiri disagree. As a client of a database, I want to make progress with my work, which means that processing rows is making progress, while waiting for an execute or a fetch of rows is not making progress. Yes, I might also have a deadline time for doing my own work, but that is not something I want my database driver to dictate or control. I will also quote what I said last year (2016-08-21): "I think a single timeout that is measured over the entire execute + all fetches is too brittle. I'd prefer if the timeout is applied to the execute and each individual fetch." > For server-side statement timeouts, situation is a bit different. It's > really about long-running queries and server resources. DBA can hardly > guess about application logic (time between fetches) or about network > latencies. So if we speak only about engine doing some work, performing > disk I/O or waiting for something, then Sean's point has some value. Of > course, it has nothing to do with CPU time spent, it's more about time > spent inside the engine for whatever reason. But it surely defines > "long-running" statements from one side, as something being served by > the engine. > > From another side, we have long-running "sleeping" (even if > occasionally) statements that block metadata objects, occupy memory and > freeze transaction counters. These are different resources but it does > not mean they should be ignored. Vlad's solution accounts that, Sean's > does not. That can still happen, even if the statement is not being executed, but only prepared, or when my client does a lot of other things before finally committing the transaction. A statement level timeout does not necessarily solve that problem. And as a statement is only really cancelled - if I understood it correctly - when the next fetch occurs, executing a statement, retrieving the first row(s) and then doing nothing for a long time will still tie up those resources. > So we have different requirements for the same feature and they conflict > with each other. The positive side of the implemented solution is that > it suits both client-side and server-side usage. Sean's suggestion does > not fit client-side usage, IMO. Also, if the timer is suspended/resumed > too often (per every fetch call), it gonna be costly. > > That said, I'd vote against reworking the current design. Perhaps, we > could additionally implement what Sean suggests, but *only* at the > server side. But I have no idea how to mix them nicely. Having two > independent timeouts looks ugly, complicates the engine code and is > likely to confuse users. The problem I foresee is that if I implement this in Jaybird, that I am going to get confused or irate Jaybird users, who either expect the timeout only to be applied to the execute, or only on 'time spent waiting for the server'. From the perspective of JDBC the statement query timeout is to be applied to the execute, and optionally for result set method calls. The wording in the specification is a bit vague, but the expectation is that the timeout is applied to each method individually. The requirements of JDBC would not entirely fit with having a 'total time spent in engine' timeout, but it is better than a 'lifetime of statement' timeout in my opinion. That would then mean I'd probably have to not use this and implement a client side timeout mechanism,
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 11:47, Mark Rotteveel wrote: > On 25-2-2017 09:31, Vlad Khorsun wrote: >>If think a bit deeper, it will be clear that there is no other way as >> send big result sets >> in parts over the wire. Also, why don't you ask if they fully fetch >> resultset at the server side ? >> >>What you offer is very hard to use in practice as nobody able to explain >> why statement was >> cancelled at this moment and it is impossible to predict moment when timeout >> should fire. >> >>Well, as we can't agree i offer to choose one of the following : >> 1. Leave it as is >> 2. Completely exclude fetches from timeout scope >> 3. Remove the whole feature > > Would it be possible to apply the timeout to execute + first fetch? Yes. I'm afraid to ask - why... > Or apply the timeout to the execute only, and separately for each fetch > (where the fetch call would/could include a timeout value as well)? Here you again catched by the same trap - there is no direct relation between API call fetch() and network packet op_fetch. It is unknown and not predictable. And it will work absolutely differently for network and embedded cases. And it will be different even for INET\XNET cases. Users will be "happy". If you need to set timeout on network wait, call it this way, ok. But don't mix it with statemenet execution timeout. And be ready that only reliable way to handle network timeout is to abort whole connection. While statement execution timeouts is handled in a much less *harsh* way. Regards, Vlad -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 12:37, Mark Rotteveel wrote: >> Do you/anyone know if these engines return full results sets or follow the >> "page set" approach? > > As far as I know Oracle[1], PostgreSQL[2], SQL Server[3] support it. I > believe MySQL does as well. Don't know about other database systems, but > I assume most of them will support this (the SQL CLI standard also > defines it). AFAIK, in these databases page sets are primarily a transport/API feature. The important question, however, is whether the engine does some work during fetch() besides copying the row(s) for the next batch. I suspect InterBase/Firebird is the only engine that may process execute() as a no-op and perform all the data retrieval / joining / etc during fetching. Dmitry -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 10:49, Dmitry Yemanov wrote: > Depending on the plan, statement may take 99% of its "working" time > inside execute() or inside fetch() or that time could be distributed > among the API calls. Neither client nor DBA has any control on that. So > I consider seriously wrong removing fetching time from the accounting. Nobody argued with that. Questionable is time between end of one fetch() call and start of the next fetch() call. -- WBR, SD. -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
All, Let me jump into discussion and share my own concerns. Depending on the plan, statement may take 99% of its "working" time inside execute() or inside fetch() or that time could be distributed among the API calls. Neither client nor DBA has any control on that. So I consider seriously wrong removing fetching time from the accounting. From the client side, a timeout can be seen from two different angles. It could be either statement execution time (including fetches, see above) - this is what we have implemented now. Or it could be the API call timeout, to avoid "blocking" for more than the specified time. This means that any higher-level accounting should be done by the application or connectivity library. I guess the latter approach gonna cost much more due to often timer resets. It also adds more work to the client-side developers. Now about what is "working time". If the client specifies 10 seconds, I really doubt it expects to see the statement timing out after 30 seconds. From the client POV, "working time" includes all the waits, round-trips and so on. It's not about server resources, they are outside client's business. It's rather about an application reaction time. So I consider Vlad's position perfectly valid. And I'm surprised that Mark and Jiri disagree. For server-side statement timeouts, situation is a bit different. It's really about long-running queries and server resources. DBA can hardly guess about application logic (time between fetches) or about network latencies. So if we speak only about engine doing some work, performing disk I/O or waiting for something, then Sean's point has some value. Of course, it has nothing to do with CPU time spent, it's more about time spent inside the engine for whatever reason. But it surely defines "long-running" statements from one side, as something being served by the engine. From another side, we have long-running "sleeping" (even if occasionally) statements that block metadata objects, occupy memory and freeze transaction counters. These are different resources but it does not mean they should be ignored. Vlad's solution accounts that, Sean's does not. So we have different requirements for the same feature and they conflict with each other. The positive side of the implemented solution is that it suits both client-side and server-side usage. Sean's suggestion does not fit client-side usage, IMO. Also, if the timer is suspended/resumed too often (per every fetch call), it gonna be costly. That said, I'd vote against reworking the current design. Perhaps, we could additionally implement what Sean suggests, but *only* at the server side. But I have no idea how to mix them nicely. Having two independent timeouts looks ugly, complicates the engine code and is likely to confuse users. Dmitry -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
On 25-2-2017 01:55, Leyne, Sean wrote: >>MSSQL implements timeouts at client side: From a quick glance, in the ms-sql JDBC driver, the (client side) query timeout is applied for waiting for a response from the server for an execute, fetch, etc. Each action has its own the timeout timer, so each action has the full timeout duration. >>Server-side statement timeouts implemented in MySQL: > >>PostgreSQL docs is very limited: The pgjdbc driver has a client side timeout which only seems to cover the initial execute and row fetch. Depending on the configuration and transaction auto-commit status, that can be the whole result set, or just the first (fetchSize) rows. Subsequent fetches are not covered by this timeout. > Do you/anyone know if these engines return full results sets or follow the > "page set" approach? As far as I know Oracle[1], PostgreSQL[2], SQL Server[3] support it. I believe MySQL does as well. Don't know about other database systems, but I assume most of them will support this (the SQL CLI standard also defines it). Side note: the way Firebird implements fetch is not a 'page set', but a row set, because a fetch contains complete rows, while the term page is usually reserved for a fixed amount of bytes. [1]: https://oracle.github.io/odpi/doc/public_functions/dpiStmt.html (see dpiStmt_fetchRows) [2]: https://www.postgresql.org/docs/9.6/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY (search for "Once a portal exists, it can be executed using an Execute message.") [3]: https://github.com/Microsoft/mssql-jdbc/blob/master/src/main/java/com/microsoft/sqlserver/jdbc/SQLServerResultSet.java#L1005 Mark -- Mark Rotteveel -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 03:55, Leyne, Sean wrote: > > it is the value that represent a direct CPU cost of a SQL statement. You actually seem wanting CPU quotas. But they're not timeouts. A long-running statement may produce almost zero CPU load. Dmitry -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
On 25-2-2017 00:50, Vlad Khorsun wrote: > 24.02.2017 20:14, Mark Rotteveel wrote: > ... >> It would be nice to know exactly what changes are involved in the wire >> protocol for statement-specific timeouts without having to dive into the >> implementation. > >Read, please, "Remote client implementation notes" at both > README.statement_timeouts > and README.session_idle_timeouts. The only piece not described there could be > the way how > Statement::setTimeout() pass user timeout value with op_execute and > op_execute2 packets. > It is easy - if protocol version is at least 15, add 4-bytes with timeout > value to the > contents of op_execute\op_execute2 packet: > > https://github.com/FirebirdSQL/firebird/commit/2c49e6fcf20b55cd87d497dae7309c593a68bd62#diff-eabce942e207c1fedfc16fe4a70fd258 Thanks Mark -- Mark Rotteveel -- Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] RFC: Timeouts
25.02.2017 2:55, Leyne, Sean wrote: > Vlad, > > These connections perform only a few heavy weight SQL statements (taking max 3-4 of real execution time). > Most of the time is spent in the Firebird engine waiting for the next fetch, due to network latencies. In the engine ? >>> >>> Yes, the engine would be **waiting** for the next fetch request from the >> client. >> >>Engine never waits for the client ! This is against client-server >> architecture. > > Oh yes it does! Engine is passive (except of events notification). It does nothing by own initiative. > I can Execute the SQL, and in 23 seconds the first "page" of rows will be > returned to the client. > At this point, the server stops and waits for the client to perform a > "fetch". So, only when I First, stop to mix server and engine. Second, *server* at this point doesn't stop, it continues to fetch records from *engine* and caches it to be able to send it immediately to the client when next op_fetch arrives. ... >>> Based on your logic, the SELECT would be killed after 10 minutes, with only >> 3,000 rows having been processed by the client application. >>> >>> Following my logic, any time waiting for a "fetch" would not count, >>> and thus all 10,000 rows would be processed -- >> > but the transaction/connection would be 'active' for 33 minutes. >> >>Following *logic* developer should set timeout based on application >> processing time or (much better) fetch whole resultset, *commit* ASAP, and >> then process data. >> You describe very bad application (sorry) which holds open transaction 33 >> times longer than necessary. > > "Bad" or not has nothing to do with my point. I consider it has. > I am saying that applications are not perfect, we need a solution that > provides the best > possible outcome for all usage patterns. We need a solution for *most* usage patterns and i am against support of bad practices. > The fact that the transaction is open for longer that it should, has nothing > to do with my issue. It is impossible (and very wrong) to ignore such important details. > I want to use timeout to control very bad SQL statements -- which is a > separate/unrelated > issue to the length of a transaction. Statement and transaction lifetime is bound, despite of you wishes. > Perhaps we are talking about different timeout values? Execution vs. > Transaction vs Connection? I speak about statement exection timeout. And i alredy wrote here that there is no "transaction timeout" - does you read that ? > I am more concerned with Execution timeout -- since it is the value that > represent a direct CPU > cost of a SQL statement. It represent a tens of things except of CPU cost. If you need to control CPU cost, ask for it explicitly, but it is very, very different thing. If you insist on changes in implementation, please, specify exactly what you need and where it is implemented in a such way. >>> >>> Add logic to stop and start the timer in locations where the server is >>> waiting for client requests/"fetch" operations. >> >>No, sorry, without me. This is against my feeling of common sence and >> against all my experience. I could agree to completely exclude fetches from >> timeout scope, i.e. stop timer right after execute()\open(), but i'm not >> sure it >> is correct way. > > Well, Jiri and Mark agree with me. So, my POV is not unreasonable. You mixed few things in one, found no confirmation of existance of such feature and doesn't listen to the arguments - yes, of course, it is very reasonable ! >>> 1- I don't know of any other engines which allow for results to be fetched >>> in >> "pages". >> > It is the fact that the results can be fetch in "pages" with Firebird >> that, IMO, >> raises > the need for the additional level of 'accounting'. >> >>I see no relation of batch fetches (if you speak about it) with all said >> above. > > My point was that I don't know if those engines allow for results to be > return in pages > or whether result is returned as a single block. > > In which case, the execution time == the "cost" of the query since there > would be no > interaction with the client (there are no page fetches). > > Whereas your implementation currently just represents the elapsed time since > the query started -- Yes, exactly. > including time that the engine is doing nothing. And many other "interesting" times you do not want to consider. >>MSSQL implements timeouts at client side: > >>Server-side statement timeouts implemented in MySQL: > >>PostgreSQL docs is very limited: > > Do you/anyone know if these engines return full results sets or follow the > "page set" approach? If think a bit deeper, it will be clear that there is no other way as send big result sets in parts over the wire. Also, why don't you ask if they fully fetch resultset at the server side ? What you