Re: [sqlite] windowing functions != recursive functions
Thank you and AndrewP for the pointers - a very interesting read. Some clarifications.

From my perspective - that of an R user, not an API user - it is more natural to do the complicated operations in R than to write additional functions in C, Tcl, etc. I do not know how common that attitude is among SQLite users who do not use R. (In general, R users tend to be statisticians with little knowledge of SQL.) This is why my mail was not an attempt to propose extensions to core SQLite [it is so good for its intended purpose that adding more stuff could spoil it :-)].

If any computational additions were considered for SQLite, it seems plausible to argue that they should be easy to implement and should not change the spirit of the project. So something more like 'adding a log function' than something (probably) much bigger like Oracle OLAP.

It could further be argued that, for all such ideas for SQLite extensions, the best strategy is to have 'the core sqlite' and 'the extended sqlite'. Any non-conventional extensions (like log or OLAP) could be implemented in the 'extended sqlite' version, used from there, and eventually migrated to the core if there is sufficient interest.

- Original Message -
From: [EMAIL PROTECTED]
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] windowing functions != recursive functions
Date: Wed, 12 Oct 2005 22:48:35 -0400

> "pilot pirx" <[EMAIL PROTECTED]> wrote:
> > Now, for the recursive function like exponential moving average
> > the definition is that
> >
> > ema(i+1) = val(i) * coef + ema(i) * (1-coef).
> >
> > That is, I have to know the previous value of both EMA _and_
> > VALUE (while for moving average I need to know _only_ the
> > previous value(s) of VALUE).
>
> You could write an "ema()" function for SQLite using the
> scarcely documented API functions sqlite3_get_auxdata() and
> sqlite3_set_auxdata(). (Those routines were intended to allow
> functions like "regexp" to compile a constant regular expression
> once and then reuse the compiled regular expression on
> subsequent calls. But they have never been used for anything,
> as far as I am aware.)
>
> The ema() function would work like this:
>
>    SELECT ema(x, 0.45) FROM table1;
>
> Where 0.45 is the "coef".
>
> I was wondering if it would be possible to write a "prev()"
> function that returned the value of a column in the result
> set from the previous row. prev() could be used to implement
> ema() in pure SQL. But, alas, I do not see how you could
> write a general-purpose prev() function using the current
> API. Some kind of extension would be required, I think.
> --
> D. Richard Hipp <[EMAIL PROTECTED]>
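For readers who want to try the ema() idea without writing C, here is a minimal sketch using Python's standard sqlite3 module. It is not the sqlite3_get_auxdata()/sqlite3_set_auxdata() approach suggested above: the running value lives in a Python closure instead. The helper names make_ema and ema_column, the in-memory table, and the choice to seed the average with the first value are all assumptions made for illustration. The result is only meaningful if SQLite visits the rows in the intended order, which the sketch requests with ORDER BY rowid.

```python
import sqlite3


def make_ema():
    """Return a stateful ema(x, coef) callable (hypothetical helper).

    Assumption: the first value seeds the average; afterwards
    ema_new = x * coef + ema_old * (1 - coef), as in the formula quoted above.
    """
    state = {"prev": None}

    def ema(x, coef):
        if x is None:
            # Skip NULLs and keep the last computed value.
            return state["prev"]
        if state["prev"] is None:
            state["prev"] = float(x)
        else:
            state["prev"] = x * coef + state["prev"] * (1.0 - coef)
        return state["prev"]

    return ema


def ema_column(values, coef):
    """Load values into table1 and return ema(x, coef) for each row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (x REAL)")
    conn.executemany("INSERT INTO table1 (x) VALUES (?)", [(v,) for v in values])
    # Register the Python callable as a two-argument SQL function.
    conn.create_function("ema", 2, make_ema())
    rows = conn.execute(
        "SELECT ema(x, ?) FROM table1 ORDER BY rowid", (coef,)
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(ema_column([1, 2, 3, 4, 5], 0.5))
```

A fresh closure is registered for each query, so state from one SELECT does not leak into the next.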
[sqlite] window functions and recursive functions
> I don't see why this is such a great feature. Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction. No?

That would require writing in C, binding, etc. Or, for some other databases, writing in some stored procedure language. Here I get the same result with standard-looking SQL in a small database. One 'quirk' of the implementation (I am not sure it was intended) gives an enormous additional programming facility without any additional work :-)

> Vastly more useful for moving average and the like would be real
> windowing/grouping functions, like Oracle's "analytic" functions. I'm

An example of computing a moving average with standard SQL is in one of my earlier mails. I think that adding OLAP functions to such a small engine would be overkill, especially as most such functionality can be expressed with standard SQL (admittedly, in a somewhat convoluted way). The owner of the project will ultimately decide. It may not fit the 'lite' image this project is after.

P.S. I remember seeing some public domain code that added correlation and regression to SQLite. It was about 3 years old and rather short. Yet even that did not make it into the mainstream project.

- Original Message -
From: "Andrew Piskorski" <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 08:34:02 -0400

> On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
>
> > While using SQLite for some time (with R package, www.r-project.org)
> > I did admire its functionality and speed. Then I did discover a
> > hidden SQLite feature of immense usefulness - not available in other
> > databases. SQLite can compute Fibonacci numbers! (I will explain why
>
> Transaction visibility features do vary, although often it doesn't
> matter anyway. E.g., here's a discussion of how (at least as of early
> 2004), PostgreSQL's docs were quite confused about certain subtleties,
> but what I find interesting, is this was still something that in
> practice had never really mattered to the mostly hard-core RDBMS
> programmers talking about it in that thread:
>
>   http://openacs.org/forums/message-view?message_id=176198
>
> > UPDATE fib SET
> >   val = (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> >         (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > WHERE pos > 2;
>
> I don't see why this is such a great feature. Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction. No?
>
> > This is an _immensely_ useful functionality when one needs to
> > compute various recursive functions. For example exponential moving
> > average, used frequently in financials. Or Kalman filter (and many
>
> Vastly more useful for moving average and the like would be real
> windowing/grouping functions, like Oracle's "analytic" functions. I'm
> not thrilled by their particular syntax, but the functionality is
> INCREDIBLY useful. (And on the other hand, I haven't thought of any
> obviously better syntax, either.)
>
> Hm, an amendment to the SQL:1999 spec added windowing support, and
> SQL:2003 includes that, I think as features T611, "Elementary OLAP
> functions" and T612, "Advanced OLAP functions". Apparently Fred Zemke
> of Oracle was the author of that SQL spec, and IBM also supported it,
> so the SQL:2003 syntax and behavior is probably very similar (maybe
> identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
> PostgreSQL, as of 8.0, doesn't support it yet.
>
>   http://www.wintercorp.com/rwintercolumns/SQL_99snewolapfunctions.html
>   http://www.ncb.ernet.in/education/modules/dbms/SQL99/OLAP-99-154r2.pdf
>   http://www.wiscorp.com/sql/SQL2003Features.pdf
>   http://troels.arvin.dk/db/rdbms/#select-limit-offset
>   http://www.postgresql.org/docs/8.0/interactive/features.html
>   http://en.wikipedia.org/wiki/SQL
>   http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf
>   http://www.oracle.com/oramag/oracle/01-jul/o41industry.html
>
> SQLite basically supports just SQL-92, it doesn't have any of these
> newer SQL:1999 or SQL:2003 features, right?
>
> Using SQLite in conjunction with a powerful statistical data analysis
> programming language like R is an excellent example of a use where
> windowing functions can be hugely helpful. Unfortunately, I've never
> had a compelling need to use SQLite for that, otherwise I'd probably
> take a shot at adding support for the SQL:2003 Window/OLAP stuff. :)
>
> --
> Andrew Piskorski <[EMAIL PROTECTED]>
> http://www.piskorski.com/
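The "simple little loop" alternative mentioned in the quoted message could look roughly like the sketch below. It is written in Python with the standard sqlite3 module rather than C, purely for brevity. The function name fib_by_loop and the in-memory database are assumptions for illustration. Each row gets its own UPDATE, so the result does not depend on the order in which a single multi-row UPDATE processes rows.

```python
import sqlite3


def fib_by_loop(n):
    """Fill fib(pos, val) for pos = 1..n using one UPDATE per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fib (pos INTEGER PRIMARY KEY, val INTEGER)")
    # "with conn" commits on success and rolls back on an exception,
    # so the inserts and all per-row updates form a single transaction.
    with conn:
        conn.executemany(
            "INSERT INTO fib (pos, val) VALUES (?, ?)",
            [(p, 1 if p <= 2 else None) for p in range(1, n + 1)],
        )
        for pos in range(3, n + 1):
            conn.execute(
                "UPDATE fib SET val = "
                "(SELECT h1.val FROM fib AS h1 WHERE h1.pos = ?) + "
                "(SELECT h2.val FROM fib AS h2 WHERE h2.pos = ?) "
                "WHERE pos = ?",
                (pos - 1, pos - 2, pos),
            )
    values = [r[0] for r in conn.execute("SELECT val FROM fib ORDER BY pos")]
    conn.close()
    return values


if __name__ == "__main__":
    print(fib_by_loop(10))
```

The trade-off debated in the thread is visible here: the loop is portable across databases but needs a host language, while the single UPDATE needs no host code but relies on how one engine applies the updates.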
[sqlite] windowing functions != recursive functions
I was unaware of the windowing functions discussion. Having looked at the first link, I think we may be talking about two subtly different issues. The windowing functions described in the link are different from recursive functions.

There is no problem computing a moving average, a cumulative sum, etc. (or dealing with time windows in general) with the existing SQL - it is just that the SQL gets nasty. Examples are in 'SQL for Smarties' by Joe Celko. Therefore adding such functions to SQLite is 'nice' but does not really increase the functionality. In contrast, being able to compute recursive functions does add functionality.

An example with moving averages, since they are mentioned in the windowing functions discussion. Assuming the original series is

    VALUE = 1 2 3 4 5 6 7

the following statement will compute a 5-day moving average for the whole column:

    UPDATE data SET
      MAVE5 = (SELECT AVG(val) FROM data AS h1
               WHERE h1.dayno <= data.dayno AND h1.dayno > data.dayno - 5);

But this statement always operates on existing data in an existing column - to compute a new value of MAVE5 it only needs to know values of VALUE. (When there is not enough data, as for the first element, because there is no dayno < 1, SQL simply averages over the data that exists.)

Now, for a recursive function like the exponential moving average, the definition is

    ema(i+1) = val(i) * coef + ema(i) * (1-coef).

That is, I have to know the previous value of both EMA _and_ VALUE (while for the moving average I need to know _only_ the previous value(s) of VALUE).

- Original Message -
From: "Andrew Piskorski" <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 08:34:02 -0400

> On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
>
> > While using SQLite for some time (with R package, www.r-project.org)
> > I did admire its functionality and speed.
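The 5-day moving average statement from this message can be tried with a short Python sketch using the standard sqlite3 module. The column names (dayno, val, mave5), the in-memory table, and the helper name moving_average_5 are assumptions for illustration. Because the statement reads only val and writes only mave5, the result does not depend on the order in which rows are updated.

```python
import sqlite3


def moving_average_5(series):
    """Store series as days 1..n and return the 5-day moving average column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (dayno INTEGER PRIMARY KEY, val REAL, mave5 REAL)")
    conn.executemany(
        "INSERT INTO data (dayno, val) VALUES (?, ?)",
        list(enumerate(series, start=1)),
    )
    # Correlated subquery: average val over the current day and the 4 days before it.
    # Early rows simply average over however many days exist.
    conn.execute(
        "UPDATE data SET mave5 = "
        "(SELECT AVG(h1.val) FROM data AS h1 "
        " WHERE h1.dayno <= data.dayno AND h1.dayno > data.dayno - 5)"
    )
    result = [r[0] for r in conn.execute("SELECT mave5 FROM data ORDER BY dayno")]
    conn.close()
    return result


if __name__ == "__main__":
    print(moving_average_5([1, 2, 3, 4, 5, 6, 7]))
```

For the series 1..7 the first four rows average over fewer than five days, and from day 5 onward the window is full.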
Re: [sqlite] SQL Window/OLAP functions
I have used R extensively for years, and together with SQLite for the last year. Here are my observations, for what they are worth, about statistics and databases.

A stats package, especially one as powerful as R, makes any database functions less relevant. After trying various approaches, I now use the database mostly:

- to reduce the data somewhat before getting them into R. For example, with 2 mln records in 500 groups I can compute group averages in the database and read only 500 records into R.
- to reduce overall memory requirements. I can process the whole data set on a group-by-group basis, reading one group at a time and thus requiring less memory internally.

Any stats package will have many functions, most of them impossible to implement in standard SQL and very difficult to implement in general (clustering, etc.).

So why do I write filters in the database instead of using the vastly superior R capabilities? Firstly, for fun - I just like to push SQL as far as possible. Secondly, it is sometimes useful to compute some basic things during or immediately after data acquisition. If we can compute some _simple_ metrics at that stage _quickly_, then R does not have to read and process as much, since only data within some range of the metric may be of interest. Also, we can index on a metric and provide very fast extraction of data subsets.

In summary, it seems to me that adding overly heavy functions to any database may be difficult and, for SQLite, would go against the basic idea of having a 'lite' db.

- Original Message -
From: Laurent <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 15:36:22 +0200

> Hello,
>
> I was just looking for a statistical package linked with SQLite.
>
> > Using SQLite in conjunction with a powerful statistical data analysis
> > programming language like R is an excellent example of a use where
> > windowing functions can be hugely helpful. Unfortunately, I've never
> > had a compelling need to use SQLite for that, otherwise I'd probably
> > take a shot at adding support for the SQL:2003 Window/OLAP stuff. :)
>
> I can confirm that there would be some interest in having such a library.
>
> Best regards,
>
> Laurent.
>
> ==
>
> - Original Message -
> From: "Andrew Piskorski" <[EMAIL PROTECTED]>
> To:
> Sent: Wednesday, October 12, 2005 2:34 PM
> Subject: [sqlite] SQL Window/OLAP functions
>
> > On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
> >
> > > While using SQLite for some time (with R package, www.r-project.org)
> > > I did admire its functionality and speed. Then I did discover a
> > > hidden SQLite feature of immense usefulness - not available in other
> > > databases. SQLite can compute Fibonacci numbers! (I will explain why
> >
> > Transaction visibility features do vary, although often it doesn't
> > matter anyway. E.g., here's a discussion of how (at least as of early
> > 2004), PostgreSQL's docs were quite confused about certain subtleties,
> > but what I find interesting, is this was still something that in
> > practice had never really mattered to the mostly hard-core RDBMS
> > programmers talking about it in that thread:
> >
> >   http://openacs.org/forums/message-view?message_id=176198
> >
> > > UPDATE fib SET
> > >   val = (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> > >         (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > > WHERE pos > 2;
> >
> > I don't see why this is such a great feature. Without it, worst case,
> > you could still write a simple little loop which would issue one
> > update statement for each row, all within a single transaction. No?
> >
> > > This is an _immensely_ useful functionality when one needs to
> > > compute various recursive functions. For example exponential moving
> > > average, used frequently in financials.
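The two uses of the database described in this message (reducing data to group averages, and reading one group at a time) can be sketched as follows. The sketch uses Python's standard sqlite3 module in place of R, and the table obs, its columns grp and metric, and the helper names are all hypothetical.

```python
import sqlite3


def load_observations(records):
    """Create an in-memory table of (grp, metric) rows, indexed by group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs (grp INTEGER, metric REAL)")
    conn.executemany("INSERT INTO obs (grp, metric) VALUES (?, ?)", records)
    # The index makes per-group extraction fast on large tables.
    conn.execute("CREATE INDEX obs_grp_ix ON obs(grp)")
    conn.commit()
    return conn


def group_means(conn):
    """Let the database reduce the data: one output row per group."""
    return conn.execute(
        "SELECT grp, AVG(metric) FROM obs GROUP BY grp ORDER BY grp"
    ).fetchall()


def iter_groups(conn):
    """Yield (grp, metrics) one group at a time to limit client-side memory."""
    groups = [g for (g,) in conn.execute("SELECT DISTINCT grp FROM obs ORDER BY grp")]
    for g in groups:
        rows = conn.execute(
            "SELECT metric FROM obs WHERE grp = ? ORDER BY rowid", (g,)
        ).fetchall()
        yield g, [m for (m,) in rows]


if __name__ == "__main__":
    conn = load_observations([(1, 1.0), (1, 3.0), (2, 10.0)])
    print(group_means(conn))
    for grp, metrics in iter_groups(conn):
        print(grp, metrics)
    conn.close()
```

With 2 million rows in 500 groups, group_means would return 500 rows, and iter_groups would hold only one group's values in memory at a time.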
[sqlite] Please, please do _not_ remove this feature from SQLite...
While using SQLite for some time (with the R package, www.r-project.org) I came to admire its functionality and speed. Then I discovered a hidden SQLite feature of immense usefulness - not available in other databases. SQLite can compute Fibonacci numbers! (I will explain why this is important later.)

They are defined as follows: F1 = 1, F2 = 1; F3 = F1 + F2; F4 = F2 + F3, etc. So they are defined recursively, with the first two values known and each next value depending on the previous values. The SQL statement for that is (the full script is at the end of this mail):

    UPDATE fib SET
      val = (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
            (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
    WHERE pos > 2;

Now, in standard SQL that should result in 1, 1, 2, null, null, null... - because the assumption is that the operations on all rows are done at the same time. So F4 will get null, because F3 was null, etc. All other databases seem to do that - I tested HSQLDB, Firebird, SQL Server and Derby.

Fortunately (for me) SQLite apparently stores each row after computing it, so the previous row's value is available when the next row is being computed. Also, the operations are executed in row order. So I do get 1, 1, 2, 3, 5, 8 ... (though some table/index declarations have to be right to achieve that for larger tables).

This is an _immensely_ useful functionality when one needs to compute various recursive functions. For example the exponential moving average, used frequently in financials. Or the Kalman filter (and many other filters) used in data smoothing and analysis.

Naturally, one could argue that databases are not for numerical computations. But it is very useful to be able to do simple computations in the database, especially if one does not have to write stored procedures or external procedures for that - in essence getting something for free and without added complexity.

So I hope that this feature will stay...

P.S. On a somewhat related note: since SQLite is written in C, why does it not expose some basic functions from the standard C library (log, exp, sqrt, sin), at least optionally? Understandably, the idea is to keep it 'lite'. But maybe an approach similar to that of ant and other packages could be applied to SQLite - that is, a set of standard (but still simple) extensions, including things which may add some bulk but do not require any large implementation effort.

= the test

DROP TABLE fib;

CREATE TABLE fib (
  pos INTEGER,
  val INTEGER);

CREATE UNIQUE INDEX fib_ix ON fib(pos);

INSERT INTO fib VALUES (1,1);
INSERT INTO fib VALUES (2,1);
INSERT INTO fib VALUES (3,NULL);
INSERT INTO fib VALUES (4,NULL);
INSERT INTO fib VALUES (5,NULL);
INSERT INTO fib VALUES (6,NULL);
INSERT INTO fib VALUES (7,NULL);
INSERT INTO fib VALUES (8,NULL);
INSERT INTO fib VALUES (9,NULL);
INSERT INTO fib VALUES (10,NULL);
INSERT INTO fib VALUES (11,NULL);
INSERT INTO fib VALUES (12,NULL);
INSERT INTO fib VALUES (13,NULL);
INSERT INTO fib VALUES (14,NULL);
INSERT INTO fib VALUES (15,NULL);
INSERT INTO fib VALUES (16,NULL);
INSERT INTO fib VALUES (17,NULL);
INSERT INTO fib VALUES (18,NULL);
INSERT INTO fib VALUES (19,NULL);
INSERT INTO fib VALUES (20,NULL);

UPDATE fib SET -- compute fibonacci numbers
  val = (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
        (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
WHERE pos > 2;

select * from fib; -- show the results
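To check how a given SQLite build behaves, the test script above can be wrapped in a small Python harness using the standard sqlite3 module. This is only a sketch: the function names run_fib_update and fib_reference are hypothetical, and whether the single UPDATE yields Fibonacci numbers or NULLs from position 4 onward depends on how the engine applies the updates, so the harness reports the outcome instead of assuming it.

```python
import sqlite3


def fib_reference(n):
    """Plain-Python Fibonacci numbers F1..Fn, used only for comparison."""
    vals = []
    a, b = 1, 1
    for _ in range(n):
        vals.append(a)
        a, b = b, a + b
    return vals


def run_fib_update(n=20):
    """Run the single multi-row UPDATE from the script; return the val column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fib (pos INTEGER, val INTEGER)")
    conn.execute("CREATE UNIQUE INDEX fib_ix ON fib(pos)")
    conn.executemany(
        "INSERT INTO fib VALUES (?, ?)",
        [(p, 1 if p <= 2 else None) for p in range(1, n + 1)],
    )
    conn.execute(
        "UPDATE fib SET val = "
        "(SELECT h1.val FROM fib AS h1 WHERE h1.pos = fib.pos - 1) + "
        "(SELECT h2.val FROM fib AS h2 WHERE h2.pos = fib.pos - 2) "
        "WHERE pos > 2"
    )
    values = [r[0] for r in conn.execute("SELECT val FROM fib ORDER BY pos")]
    conn.close()
    return values


if __name__ == "__main__":
    got = run_fib_update(20)
    print(got)
    # True means rows saw earlier rows' new values; False means they did not.
    print("matches Fibonacci:", got == fib_reference(20))
```

Position 3 is 2 under either behaviour, because positions 1 and 2 are never modified. Only from position 4 onward do the two interpretations differ.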