Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Dec 9, 2013, at 8:01 PM, Warren Young wrote: >> I remember reading an essay by a user of controlled substances > > Your next reading assignment is a book[3] on a functional programming > language, So… do you make your functions wear a purity ring? To keep them, hmmm, chaste? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 12/7/2013 12:53, James K. Lowden wrote: On Thu, 05 Dec 2013 17:52:47 -0700 Warren Young wrote: To prove my point, I decided to divide the SQLite commands[1] into those that modify the DB and those that do not: Oh, let me help you out here: these aren't functions. I was careful to call them commands, and to treat what SQL calls "functions" separately. Nevertheless, I think you're trying to draw a dictionary-based line here instead of looking at fundamental concepts. A proper mathematical function takes N arguments and returns a single constant result for those arguments. If you have a static SQLite DB file, any SELECT statement against it involving only tables and the pure SQL functions qualifies as a pure function itself. The point of this exercise was to dig down to this conceptual level, bypassing the fuzzy terminology. SQL is a mixed bag of true functions and non-functional [1] elements. This thread is about one confusion that can result when these two aspects of SQL intermix[2] in unexpected ways. I remember reading an essay by a user of controlled substances Your next reading assignment is a book[3] on a functional programming language, preferably one with immutable-by-default values. Haskell is the current hotness, but Erlang would work just as well. There are less pure FP languages that can teach the same lessons, if you diligently avoid the impure bits: the ML family[4], Scala, Scheme... The rest of your post I answered indirectly in my reply to your other message in this thread. Footnotes: [1] In the mathematical sense. I.e. not meaning "broken". [2] e.g. "SELECT ... date('now')" [3] Free online FP books: http://learnyouahaskell.com/chapters http://learnyousomeerlang.com/content http://ocaml.org/learn/books.html https://en.wikibooks.org/wiki/F_Sharp_Programming http://www.scala-lang.org/documentation/books.html http://www.scheme.com/tspl4/ https://mitpress.mit.edu/sicp/ [4] OCaml and F# are the most-used flavors of ML in practice currently. Academia still has a lot of Standard ML holdouts. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 12/7/2013 12:15, James K. Lowden wrote: On Wed, 04 Dec 2013 12:04:07 -0700 Warren Young wrote: Determinism is a property of a function; there is no such thing as a function that is sometimes deterministic and sometimes not. databases are about as far from side-effect-free as you can get. I'm not sure what you're referring to. I think your sense of the term "side effect" comes from the everyday use, which is most influenced by medical side effects. i.e., something bad and unintended. The term means something rather different in CS: https://en.wikipedia.org/wiki/Side_effect_%28computer_science%29 Specifically here, I mean that most SQL statements other than SELECT modify global state: the SQLite DB file. Any statement that modifies the DB file has the potential to change the result from *any* SQL statement, including SELECT. Example: SELECT * FROM foo WHERE id=42; UPDATE foo SET bar='qux' where id=42; SELECT * FROM foo WHERE id=42; The first and third statements return different results, even though they are side effect free, because UPDATE is not side effect free. Consider also that the UPDATE could come from another process, at an indeterminate time. This is why concerns over side effects -- in the CS sense -- matter. SQLite offers many ways to *control* this indeterminacy, features generally grouped under the acronym ACID, but you can't say "DBMS X is ACID compliant therefore it will never surprise me with unexpected results." ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Thu, 05 Dec 2013 17:52:47 -0700 Warren Young wrote: > To prove my point, I decided to divide the SQLite commands[1] into > those that modify the DB and those that do not: > > Has side effects Limited side effects No side effects > ~~ ~~~ > ALTER TABLE ATTACH DATABASEANALYZE > CREATE TRIGGER CREATE TABLE CREATE INDEX > DELETE CREATE VIEWDROP INDEX > DETACH DATABASE[2] CREATE VIRTUAL TABLE EXPLAIN > DROP TABLE DROP TRIGGER[3]PRAGMA > DROP VIEW REINDEX > INSERT SELECT > REPLACE VACUUM > UPDATE Oh, let me help you out here: these aren't functions. I suppose you could think of them as functions, in the sense that strcpy(3) and unlink(2) are functions. But they're more commonly called verbs or commands or operations, components of a query or statement. I remember reading an essay by a user of controlled substances in which he observed that when we speak of drugs having effects and side-effects the distinction is wholly subjective: the side-effect is an *effect*, just not the desired one. ISTM that e.g. INSERT doesn't have a "side effect", but rather just an effect, to the extent it does what it says on the tin. If it also prints a message saying your father smells of elderberries, that would be a side effect, I would think. > Commands in the first column clearly have side effects: they can > affect the results of another process's SELECT queries. That's not supposed to be the case. An INSERT, say, might change the results of the same SELECT query executed before and after it, but it's not supposed to affect a SELECT while it's in progress. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Wed, 04 Dec 2013 12:04:07 -0700 Warren Young wrote: > > Determinism is a property of a function; there is no such > > thing as a function that is sometimes deterministic and sometimes > > not. > > databases are about as far from side-effect-free as you can get. I'm not sure what you're referring to. The functions I'm thinking of are things like COUNT and DATE. One is deterministic, the other not. Their determinism is not affected by how they are used. Unless you have an example of a function whose determinism is, um, syntatically determined, why introduce syntax around it? Regarding side-effects, every standard SQL function is side-effect free. True, in SQLite a user-defined function could send flowers on Valentine's day. But at that point it's surely wandered off the reservation. > Anyway, all this arguing over how SQLite *should* behave seems > misguided to me. What matters is how SQL is specified. SQLite > should follow the spec in areas like this. Yes. Better to color inside the lines. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 12/5/2013 14:45, Klaas V wrote: Warren wrote 4 dec 2013: | There are tiny corners of the programming world (FP) where this is not the case, but then you get into | questions of purity, and databases are about as far from side-effect-free as you can get. That's a wee bit exaggerated, To prove my point, I decided to divide the SQLite commands[1] into those that modify the DB and those that do not: Has side effects Limited side effects No side effects ~~ ~~~ ALTER TABLE ATTACH DATABASEANALYZE CREATE TRIGGER CREATE TABLE CREATE INDEX DELETE CREATE VIEWDROP INDEX DETACH DATABASE[2] CREATE VIRTUAL TABLE EXPLAIN DROP TABLE DROP TRIGGER[3]PRAGMA DROP VIEW REINDEX INSERT SELECT REPLACE VACUUM UPDATE Commands in the first column clearly have side effects: they can affect the results of another process's SELECT queries. The second column is for commands that are unlikely to affect another process's queries, but they do alter the user-visible DB schema, so it's hard to argue that they're side-effect-free. The third column looks longer than it really is. You can look at it as SELECT plus a bunch of "DBA commands." Those latter commands merely affect how well the SQLite DB engine runs, and they're unlikely to be used very often once the DB is set up and working well. If you strip away the setup, DBA, and rarely-used commands from the other columns, too, you're still left with SELECT on one side vs about half a dozen SQL commands with side effects on the other. That's what I was getting at with my quoted comment above. We should also consider SQLite's "functions". Most of the core functions[4] are pure, but there are several impure ones: changes(), last_insert_rowid(), random(), randomblob(), and total_changes(). The date and time functions[5] are pure, unless you pass 'now', which is what all the argument here is about, of course. The aggregate functions[6] are all pure. I think my point stands: SQL is awfully impure, from an FP/mathematical standpoint. - Footnotes: [1] From https://www.sqlite.org/lang.html but leaving out the keywords that aren't independent commands. I also left out the TRANSACTION and SAVEPOINT commands, since they just group other commands. [2] I put the DETACH and DROP commands in the first column even though their corresponding ATTACH and CREATE commands are in the second because there is nothing stopping these destructive commands from affecting another process's queries. [3] DROP TRIGGER is interesting: From the perspective of figuring out whether it could affect another process through side effects, this command actually /stops/ future side effects from occurring, even though it modifies the DB file to do so. Thus, it goes into the second column, not the first, where all the other DROP commands are. [4] https://www.sqlite.org/lang_corefunc.html [5] https://www.sqlite.org/lang_datefunc.html [6] https://www.sqlite.org/lang_aggfunc.html ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
Warren wrote 4 dec 2013: | There are tiny corners of the programming world (FP) where this is not the case, but then you get into | questions of purity, and databases are about as far from side-effect-free as you can get. That's a wee bit exaggerated, but who am I to start a discussion; it's a valid, even respectable opinion and | Anyway, all this arguing over how SQLite *should* behave seems misguided to me. What | matters is how SQL is specified. SQLite should follow the spec in areas like this. Not even this me seems. The developers of SQLite can choose not to "obey" rules, advices and guidelines of SQL standard at will if one of the three guys, one of us users feels it's appropriate and fits their philosophy better or for whatever reason even rewrite (some of) the standard and create a new or adapted version = skipping, changibg adding things = let's call it SQL13 (14 etc) or a less prozaic name. What about iSQL or SQLNT? ;-) Klaas `Z4us` V The fun with standards is there are so many to choose from ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 12/3/2013 17:29, James K. Lowden wrote: Determinism is a property of a function; there is no such thing as a function that is sometimes deterministic and sometimes not. Unless you're new to this computing thing, you must have noticed that software developers almost never mean the same thing as mathematicians when we use the word "function". There are tiny corners of the programming world (FP) where this is not the case, but then you get into questions of purity, and databases are about as far from side-effect-free as you can get. Anyway, all this arguing over how SQLite *should* behave seems misguided to me. What matters is how SQL is specified. SQLite should follow the spec in areas like this. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Tue, 3 Dec 2013 11:29:03 -0800 Scott Hess wrote: > Probably we should add a new API that allows the application > > to state auxiliary properties about application-defined functions > > (such as whether or not it is "constant", whether or not it can > > return NULL, whether or not it might change the encoding of its > > input parameters, etc.) But that has not been done yet. > > > IMHO, the default should be that a given function with given > parameters should be assumed to return a single value which is good > across the entire statement [...]. > > That said, I'm a little nervous about having that be an attribute of > the function rather than the statement. Why nervous? Determinism is a property of a function; there is no such thing as a function that is sometimes deterministic and sometimes not. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 27-11-2013 03:55, James K. Lowden wrote: On Mon, 25 Nov 2013 19:44:15 +0100 Luuk wrote: On 25-11-2013 13:41, Simon Slavin wrote: I'm wondering whether there's an argument that it should be evaluated just once for a transaction. I'm still thinking about this question, i can think of some benefits if its evaluated just once per transaction. but i hope anyone has some examples why its better to evalutate it once per statement. The easy answer is: that's how SQL is defined. A more complicated answer is that, inside a user-defined transaction, you have control over the meaning of "now". You can insert "now" into a table and re-use it as often as you like, without fear of it being changed by another process. If, on the other hand, you'd like to execute several statements in a transaction, you might also like to know when "now" is for each one. The per-statement definition of "now" supports that use. There's actually nothing special about "now". Other functions, e.g. changes(), may vary between statements in a transaction. (It happens SQLite doesn't have many such.) You really just want to preserve the function's defintion as "returns correct output each time it's called". HTH. ok, thanks for the answer. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Mon, 25 Nov 2013 19:44:15 +0100 Luuk wrote: > On 25-11-2013 13:41, Simon Slavin wrote: > > I'm wondering whether there's an argument that it should be > > evaluated just once for a transaction. > > I'm still thinking about this question, > > i can think of some benefits if its evaluated just once per > transaction. > > but i hope anyone has some examples why its better to evalutate it > once per statement. The easy answer is: that's how SQL is defined. A more complicated answer is that, inside a user-defined transaction, you have control over the meaning of "now". You can insert "now" into a table and re-use it as often as you like, without fear of it being changed by another process. If, on the other hand, you'd like to execute several statements in a transaction, you might also like to know when "now" is for each one. The per-statement definition of "now" supports that use. There's actually nothing special about "now". Other functions, e.g. changes(), may vary between statements in a transaction. (It happens SQLite doesn't have many such.) You really just want to preserve the function's defintion as "returns correct output each time it's called". HTH. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 25-11-2013 13:41, Simon Slavin wrote: I'm wondering whether there's an argument that it should be evaluated just once for a transaction. I'm still thinking about this question, i can think of some benefits if its evaluated just once per transaction. but i hope anyone has some examples why its better to evalutate it once per statement. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
If a function is deterministic for then it can be executed once and its result can be memoized for the rest of the statement (or transaction even). If a function is idempotent for a statement then it can be executed once per-statement, and its result(s) can be memoized and reused for the life of that statement. If a function is neither idempotent (even if it is deterministic in the sense of not being random, for example, a monotonically increasing counter) nor deterministic (e.g., a true random()) then it should be called exactly as many times as the statement seems to imply, no more and no fewer (e.g., once per-row of a correlated sub-query, ...). Can idempotence and determinism be considered synonymous for the purposes of an RDBMS engine? I think not quite: deterministic should imply that a memoization cache can be used always, across many statements and even transactions, whereas idempotence might imply only that memoization is permitted (but not required) on a per-statement basis. Another desirable attribute might be whether the function is fast or slow: for fast enough deterministic functions there will be no point in having a large memoization cache, or even any memoization cache. Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 25 Nov 2013, at 11:58am, Luuk wrote: > On 24-11-2013 19:36, Petite Abeille wrote: >> >> On Nov 24, 2013, at 7:10 PM, Valentin Davydov wrote: >> >>> Wait a second... and you'll get different value of datetime('now'). In this >>> sense datetime() is as deterministic as random(): it may give the same >>> result >>> next invocation or may not, dependng on various circumstances not related to >>> the function itself. >> >> Nah… in SQL… ‘now’ is supposed to be computed once per statement. And that’s >> all. > > Then 3.7.11, on windows is wrong, luckily this works OK on 3.8.1 Yes. This characteristic ('now' is evaluated once per statement) was discussed and corrected recently. I too was surprise that SQL was specific about it being evaluated just once for a statement. Though now I know that that, I'm wondering whether there's an argument that it should be evaluated just once for a transaction. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 24-11-2013 19:36, Petite Abeille wrote: On Nov 24, 2013, at 7:10 PM, Valentin Davydov wrote: Wait a second... and you'll get different value of datetime('now'). In this sense datetime() is as deterministic as random(): it may give the same result next invocation or may not, dependng on various circumstances not related to the function itself. Nah… in SQL… ‘now’ is supposed to be computed once per statement. And that’s all. Then 3.7.11, on windows is wrong, luckily this works OK on 3.8.1 C:\temp>\util\sqlite3 test.sqlite3 SQLite version 3.7.11 2012-03-20 11:35:50 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> update test set d=datetime('now'); sqlite> select d, count(*) from test group by d; 2013-11-25 11:53:40|94941 2013-11-25 11:53:41|164850 2013-11-25 11:53:42|152478 2013-11-25 11:53:43|156193 2013-11-25 11:53:44|160673 2013-11-25 11:53:45|172547 2013-11-25 11:53:46|163959 2013-11-25 11:53:47|166997 2013-11-25 11:53:48|170299 2013-11-25 11:53:49|170700 2013-11-25 11:53:50|173197 2013-11-25 11:53:51|171121 2013-11-25 11:53:52|113143 sqlite> .quit C:\temp>sqlite3 test.sqlite3 SQLite version 3.8.1 2013-10-17 12:57:35 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> update test set d=datetime('now'); sqlite> select d, count(*) from test group by d; 2013-11-25 11:56:22|2031098 sqlite> ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Sun, 24 Nov 2013 16:53:01 +0200 RSmith wrote: > Similarly if one was to add a function which returns a date-dependant > value, such as 'dayOfWeek(x)' and mark it as deterministic for inside > a single query, that would make sense, even though it would be very > indeterministic (or referentially opaque, if you will) between > queries. This is all dandy unless you have queries (as seen on this > forum) that run for three days or more... would it matter then? A SQL statement is atomic irrespective of how long it runs. Your wday() function would return to the day of the week as of the time the statement was evaluated, even if it took a month of Sundays to execute. ;-) Another way to think about it: a deterministic function always returns the same output for a given input. In an SQL statement, a function is provided input only once, regardless of how long it runs. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Sun, 24 Nov 2013 09:17:20 -0500 Doug Currie wrote: > in computer science we have referential transparency > > http://en.wikipedia.org/wiki/Referential_transparency_ > (computer_science) > > and pure functions > > http://en.wikipedia.org/wiki/Pure_function https://en.wikipedia.org/wiki/Deterministic_algorithm In computer science we also have deterministic and nondeterministic. Those are also IIRC the terms used in the SQL standard. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Nov 24, 2013, at 7:10 PM, Valentin Davydov wrote: > Wait a second... and you'll get different value of datetime('now'). In this > sense datetime() is as deterministic as random(): it may give the same result > next invocation or may not, dependng on various circumstances not related to > the function itself. Nah… in SQL… ‘now’ is supposed to be computed once per statement. And that’s all. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Sat, Nov 23, 2013 at 06:18:29AM -0500, Richard Hipp wrote: > On Sat, Nov 23, 2013 at 5:26 AM, Pepijn Van Eeckhoudt < > pep...@vaneeckhoudt.net> wrote: > > > Is datetime special in thuis context or will constant expression hoisting > > like this happen for any function? > > SQLite must know that the function always gives the same output given the > same inputs. No every function works that way. Counterexamples include > random() and last_insert_rowid(). But most built-in functions are > factorable in the same way that datetime() is. Wait a second... and you'll get different value of datetime('now'). In this sense datetime() is as deterministic as random(): it may give the same result next invocation or may not, dependng on various circumstances not related to the function itself. > Currently there is no API to designate an application-defined function as > being "constant" in the sense that it always generates the same output > given the same inputs. Hence, SQLite assumes the worst about > application-defined functions and never tries to factor them out of the > inner loop. Probably we should add a new API that allows the application > to state auxiliary properties about application-defined functions (such as > whether or not it is "constant", whether or not it can return NULL, whether > or not it might change the encoding of its input parameters, etc.) But > that has not been done yet. To my opinion, the most general solution is to let to the application programmer to decide whether to calcucale the function once (say, at the beginning of a transaction), store the result and then access the stored value, or to make the new call to the function each iteration, depending on the application semantics. Valentin Davydov. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Sun, Nov 24, 2013 at 4:30 PM, Petite Abeille wrote: > > On Nov 24, 2013, at 3:17 PM, Doug Currie wrote: > > > There is value in compatibility, but those adjectives are awful. > > FWIW, DETERMINISTIC is what Oracle uses: > > > http://docs.oracle.com/cd/E11882_01/appdev.112/e25519/function.htm#LNPLS1183 > There's also this: http://docs.oracle.com/cd/B13789_01/appdev.101/b10807/13_elems039.htm that specifies various levels of "purity", i.e. if the package function reads from or writes to the database or global state. Each level allows for more aggressive optimization when the function takes part in a query. It's been a while since I've coded in PL/SQL but I think that if the RESTRICT_REFERENCES pragma is not specified for a package function the SQL engine will not accept calls to it from an SQL statement. Also, if the implementation of a function violates its RESTRICT_REFERENCES pragma the PL/SQL compiler will not compile it. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
Ugh, my last thought was not well-formed - apologies. When I said: "...can add a function to replace an SQL function to improve it many times for the specific purpose". This would of course hardly matter if the SQL (or SQLite specifically) function was already deterministic (read: cached). My thinking was more towards enforcing determinism on something that isn't usually, if it does not affect your query's outcome (i.e specific purpose). Hope that is more clear - thanks. Ryan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
Agreed - also some functions might not be intrinsically deterministic, but it may well be so for the duration of a query. There may need to be some thinking on this. I refer back to a discussion earlier (and subsequent SQLite adaption) which made a date-time reference deterministic within a single query for transactional integrity. Similarly if one was to add a function which returns a date-dependant value, such as 'dayOfWeek(x)' and mark it as deterministic for inside a single query, that would make sense, even though it would be very indeterministic (or referentially opaque, if you will) between queries. This is all dandy unless you have queries (as seen on this forum) that run for three days or more... would it matter then? I for one would very much like the ability to specify added functions as deterministic or not (or whatever term would indicate 'cacheability of the result') as this might be an enormous efficiency improvement in itself, and also allow you to now add things to improve speed in some standard queries. Think of the people who have queries running for ages and can add a function to replace an SQL function to improve it many times for the specific purpose.* Actually, thinking about it, this was always achievable with some clever programming and managing the result cache yourself - but I still think this addition would be an improvement. That's my 2 cents. Have a great day all! Ryan *This is not to say the SQL way is not good, but it often has to cater for a wide variety of things where a user-added function might be very tuned to a specific purpose - ridding a lot of CPU-time fat. On 2013/11/24 16:30, Petite Abeille wrote: On Nov 24, 2013, at 3:17 PM, Doug Currie wrote: There is value in compatibility, but those adjectives are awful. FWIW, DETERMINISTIC is what Oracle uses: http://docs.oracle.com/cd/E11882_01/appdev.112/e25519/function.htm#LNPLS1183 I would personally stick to that if such functionality was ever introduced in SQLite. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Nov 24, 2013, at 3:17 PM, Doug Currie wrote: > There is value in compatibility, but those adjectives are awful. FWIW, DETERMINISTIC is what Oracle uses: http://docs.oracle.com/cd/E11882_01/appdev.112/e25519/function.htm#LNPLS1183 I would personally stick to that if such functionality was ever introduced in SQLite. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Nov 24, 2013, at 6:47 AM, Alek Paunov wrote: > > BTW, I see the term "deterministic" in the SQL99 BNFs: > … > but different in PostgreSQL ("immutable", "stable", etc): There is value in compatibility, but those adjectives are awful. In computer science we have referential transparency http://en.wikipedia.org/wiki/Referential_transparency_(computer_science) and pure functions http://en.wikipedia.org/wiki/Pure_function e ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On 23.11.2013 13:18, Richard Hipp wrote: SQLite must know that the function always gives the same output given the same inputs. No every function works that way. Counterexamples include random() and last_insert_rowid(). But most built-in functions are factorable in the same way that datetime() is. BTW, I see the term "deterministic" in the SQL99 BNFs: http://savage.net.au/SQL/sql-99.bnf.html#deterministic%20characteristic http://savage.net.au/SQL/sql-2003-2.bnf.html#deterministic%20characteristic aslo found in MySQL: http://dev.mysql.com/doc/refman/5.6/en/create-procedure.html but different in PostgreSQL ("immutable", "stable", etc): http://www.postgresql.org/docs/9.3/static/sql-createfunction.html I think "deterministic" is used also in the Prolog, whit the same meaning. Kind Regards, Alek ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
On Sat, Nov 23, 2013 at 5:26 AM, Pepijn Van Eeckhoudt < pep...@vaneeckhoudt.net> wrote: > Is datetime special in thuis context or will constant expression hoisting > like this happen for any function? > SQLite must know that the function always gives the same output given the same inputs. No every function works that way. Counterexamples include random() and last_insert_rowid(). But most built-in functions are factorable in the same way that datetime() is. Currently there is no API to designate an application-defined function as being "constant" in the sense that it always generates the same output given the same inputs. Hence, SQLite assumes the worst about application-defined functions and never tries to factor them out of the inner loop. Probably we should add a new API that allows the application to state auxiliary properties about application-defined functions (such as whether or not it is "constant", whether or not it can return NULL, whether or not it might change the encoding of its input parameters, etc.) But that has not been done yet. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQLite version 3.8.2 running 2.5x faster for some queries.
Is datetime special in thuis context or will constant expression hoisting like this happen for any function? Pepijn > Op 22-nov.-2013 om 15:35 heeft Richard Hipp het volgende > geschreven: > > The www.sqlite.org server logs are stored in an SQLite database (of > course). We have a script that is run daily on that database with a dozen > or so queries that look something like this: > > .print > .print Downloads in the past 24 hours > .mode column > .width -6 -6 100 > .header on > SELECT count(*) AS count, count(distinct ip) AS IPs, url > FROM log > WHERE date>datetime('now','-24 hours') >AND (url GLOB '*/2013/*' > OR url GLOB '*/snapshot/*' > OR url GLOB '*/xfer' > OR url GLOB '*/blob/*') > GROUP BY url > ORDER BY count DESC; > > Today's database contains 6 days worth of data, is 590MB in size, and is > just short of 2 million records. There are no indices, so each query is a > full table scan. Using SQLite 3.8.1, the query above took 1.793 seconds on > a recent test run. But the code on trunk (and the latest snapshots at > http://www.sqlite.org/download.html) took only 0.686 seconds. The > difference is that the string constants and the datetime() function call > are factored out of the inner loop in 3.8.2. > > To be fair: Our production script does not contain exactly the SQL shown > above. Rather than using the bare datetime() call, the real script says: > "(SELECT datetime('now','-24 hours'))". Putting the datetime() call inside > a subquery is a trick that causes the datetime function to only be invoked > once even without the new constant-function factoring optimization of > 3.8.2. And with that trick, the performance difference is not nearly so > dramatic (though 3.8.2 is still faster by about 10%). Also, the > performance difference here would only apply to a full table scan that > spends a lot of time looping inside of the virtual machine. There would > not be nearly as dramatic a difference if the database held an index on the > "log.date" column. > > Nevertheless, we suspect that queries like the above are common in the > wild, and so we hope that the upcoming 3.8.2 release will make a big > difference for some people. > > If you try the 3.8.2 pre-release snapshot in your application and see a > performance improvement, we'd appreciate hearing from you. > > For testing and comparison purposes, the optimization that doubles the > performance of the query above can be disabled using the following API: > >sqlite3_test_control(SQLITE_TESTCTRL_OPTIMIZATIONS, db, 8); > > where "db" is the database connection create by sqlite3_open(), or if you > are using the command-line shell: > >.testctrl optimizations 8 > > -- > D. Richard Hipp > d...@sqlite.org > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users