Re: [sqlite] windowing functions != recursive functions

2005-10-13 Thread pilot pirx
Thank you and AndrewP for the pointers - very interesting read. 

Some clarifications. Coming from my perspective - that of R user, 
not an API user - it is natural to do more 
complicated operations in R than to try to write additional
functions in C/tcl etc. I do not know how common will such
attitude be among SQLite users not using R. (In general R
users tend to be statisticians with little knowledge of SQL.)
This is why my mail was not an attempt to propose some
extensions to core SQLite [it is so good for the intended
purpose that adding more stuff can spoil it :-)]

Were any additions for computing considered for SQLITE 
it seems plausible to argue that the should be easy to
implemenent and do not change the spirit of the project.
So more like 'adding a log function' rather than
something (probably) much bigger like Oracle olap.

It could be further argued that, for all such ideas for
SQLite extentions, the strategic approach of having 
'the core sqlite' and 'the extended sqlite' is the
best way to proceed. Any non-conventional extensions
(like log or OLAP) could be implemented in the 'extended
sqlite' version, used from there - and eventually migrated to 
the core if there is a sufficient interest.










- Original Message -
From: [EMAIL PROTECTED]
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] windowing functions != recursive functions
Date: Wed, 12 Oct 2005 22:48:35 -0400

> 
> "pilot pirx" <[EMAIL PROTECTED]> wrote:
> 
> > Now, for the recursive function like exponential moving average 
> > the defintion is that
> >
> > ema(i+1) =  val(i) * coef  + ema(i) * (1-coef).
> >
> > That is I have to know the previous value of both EMA _and_  
> > VALUE (while for moving avearage I need to know _only_ the 
> > previous value(s)
> > of VALUE.
> 
> You could write an "ema()" function for SQLite using the
> scarcely documented API functions sqlite3_get_auxdata() and
> sqlite3_set_auxdata().  (Those routines were intended to allow
> functions like "regexp" to compile a constant regular expression
> once and then reused the compiled regular expression on
> subsequent calls.  But they have never been used for anything,
> as far as I am aware.)
> 
> The ema() function would work like this:
> 
> SELECT ema(x, 0.45) FROM table1;
> 
> Where 0.45 is the "coef".
> 
> I was wondering if it would be possible to write a "prev()"
> function that returned the value of a column in the result
> set from the previous row.  prev() could be used to implement
> ema() in pure SQL.  But, alas, I do not see how you could
> write a general-purpose prev() function using the current
> API.  Some kind of extension would be required, I think.
> --
> D. Richard Hipp <[EMAIL PROTECTED]>


-- 
___
Play 100s of games for FREE! http://games.mail.com/



[sqlite] wiindow functions and recursive functions

2005-10-12 Thread pilot pirx
> I don't see why this is such a great feature.  Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction.  No?

that would require writing in C, bindind etc. Or, for some other databases,
writing in some stored procedure language. Here I have that with
stadard-looking sql in a small database. One 'quirk' of the implementation
(i am not sure it was intended) gives enormous additional
programming facility without any additional work :-)

> > Vastly more useful for moving average and the like would be real
> > windowing/grouping functions, like Oracle's "analytic" functions.  I'm

Example of computing moving average with standard sql is in one of my earlier 
mails.
I think that adding OLAP functions to such a small engine would be an overkill,
especially as most such functionality can be expressed with standard SQL
(admittedly, convoluted a bit).
The owner of the project will ultimately decide. 
It may not fit into the 'lite' 
image this project is after.


P.S. I remember seeing a public domain code adding 
correlation and regression to SQLite, but it was
about 3 years old - it was rather short. Yet, even that
did not make it into the mainstream project. 

- Original Message -
From: "Andrew Piskorski" <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 08:34:02 -0400

> 
> On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
> 
> > While using SQLite for some time (with R package, www.r-project.org)
> > I did admire its functionality and speed. Then I did discover a
> > hidden SQLite feature of immense usefulness - not available in other
> > databases. SQLite can compute Fibonacci numbers! (I will explain why
> 
> Transaction visibility features do vary, although often it doesn't
> matter anyway.  E.g., here's a dicussion of how (at least as of early
> 2004), PostgreSQL's docs were quite confused about certain subtleties,
> but what I find interesting, is this was still something that in
> practice had never really mattered to the mostly hard-core RDBMS
> programmers talking about it in that thread:
> 
>http://openacs.org/forums/message-view?message_id=176198
> 
> > UPDATE fib SET
> > val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> >(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > WHERE pos > 2;
> 
> I don't see why this is such a great feature.  Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction.  No?
> 
> > This is an _immensely_ useful functionality when one needs to
> > compute various recursive functions. For example exponential moving
> > average, used frequently in financials. Or Kalman filter (and many
> 
> Vastly more useful for moving average and the like would be real
> windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> not thrilled by their particular syntax, but the functionality is
> INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> obviously better syntax, either.)
> 
> Hm, an amendement to the SQL:1999 spec added windowing support, and
> SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> functions" and T612, "Advanced OLAP functions".  Apparently Fred Zemke
> of Oracle was the author of that SQL spec, and IBM also supported it,
> so the SQL:2003 syntax and behavior is probably very similar (maybe
> identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
> PostgreSQL, as of 8.0, doesn't support it yet.
> 
>http://www.wintercorp.com/rwintercolumns/SQL_99snewolapfunctions.html
>http://www.ncb.ernet.in/education/modules/dbms/SQL99/OLAP-99-154r2.pdf
>http://www.wiscorp.com/sql/SQL2003Features.pdf
>http://troels.arvin.dk/db/rdbms/#select-limit-offset
>http://www.postgresql.org/docs/8.0/interactive/features.html
>http://en.wikipedia.org/wiki/SQL
>http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf
>http://www.oracle.com/oramag/oracle/01-jul/o41industry.html
> 
> SQLite basically supports just SQL-92, it doesn't have any of these
> newer SQL:1999 or SQL:2003 features, right?
> 
> Using SQLite in conjunction with a powerful statistical data analysis
> programming language like R is an excellent example of a use where
> windowing functions can be hugely helpful.  Unfortunately, I've never
> had a compelling need to use SQLite for that, otherwise I'd probably
> take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)
> 
> --
> Andrew Piskorski <[EMAIL PROTECTED]>
> http://www.piskorski.com/


-- 
___
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm



[sqlite] windowing functions != recursive functions

2005-10-12 Thread pilot pirx
I was unaware of the windowing functions discusssion.
Having a look at the first link, it looks like we may be
talking about two subtly different issues.
The windowing functions described in the link
are different from recursive functions. 
There is no problem in computing moving average
or cumulative sum etc 
with the existing SQL (or dealing with time windows
in general)  - it just the SQL gets nasty - examples
are in 'SQL for smarties' by Joe Celko. Therefore
adding such functions to SQLite is 'nice' but does
not really increase the functionality.

In contrast, computation of the recursion function
adds it. 

example with moving averages, since they
are mentioned in the windowing functions

assuming original series is
VALUE = 1 2 3 4 5 6 7

following statement will compute 5-day moving average for the whole column
UPDATE data SET 
MAVE5 = (SELECT AVG(val) FROM data AS h1 WHERE h1.dayno <=  data.dayno 
AND h1.dayno > data.dayno-05);
But this statement operates always on existing data in the existing column - to 
compute new value of MAVE5 it only needs to know values of VALUE. (When, for 
the element one, there is not enough data (because there is no dayno < 1)  SQL 
simply averages over existing data.)

Now, for the recursive function like exponential moving average the defintion 
is that
ema(i+1) =  val(i) * coef  + ema(i) * (1-coef). That is I have to know the 
previous value of
both EMA _and_  VALUE (while for moving avearage I need to know _only_
the previous value(s) of VALUE. 


- Original Message -
From: "Andrew Piskorski" <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 08:34:02 -0400

> 
> On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
> 
> > While using SQLite for some time (with R package, www.r-project.org)
> > I did admire its functionality and speed. Then I did discover a
> > hidden SQLite feature of immense usefulness - not available in other
> > databases. SQLite can compute Fibonacci numbers! (I will explain why
> 
> Transaction visibility features do vary, although often it doesn't
> matter anyway.  E.g., here's a dicussion of how (at least as of early
> 2004), PostgreSQL's docs were quite confused about certain subtleties,
> but what I find interesting, is this was still something that in
> practice had never really mattered to the mostly hard-core RDBMS
> programmers talking about it in that thread:
> 
>http://openacs.org/forums/message-view?message_id=176198
> 
> > UPDATE fib SET
> > val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> >(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > WHERE pos > 2;
> 
> I don't see why this is such a great feature.  Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction.  No?
> 
> > This is an _immensely_ useful functionality when one needs to
> > compute various recursive functions. For example exponential moving
> > average, used frequently in financials. Or Kalman filter (and many
> 
> Vastly more useful for moving average and the like would be real
> windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> not thrilled by their particular syntax, but the functionality is
> INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> obviously better syntax, either.)
> 
> Hm, an amendement to the SQL:1999 spec added windowing support, and
> SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> functions" and T612, "Advanced OLAP functions".  Apparently Fred Zemke
> of Oracle was the author of that SQL spec, and IBM also supported it,
> so the SQL:2003 syntax and behavior is probably very similar (maybe
> identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
> PostgreSQL, as of 8.0, doesn't support it yet.
> 
>http://www.wintercorp.com/rwintercolumns/SQL_99snewolapfunctions.html
>http://www.ncb.ernet.in/education/modules/dbms/SQL99/OLAP-99-154r2.pdf
>http://www.wiscorp.com/sql/SQL2003Features.pdf
>http://troels.arvin.dk/db/rdbms/#select-limit-offset
>http://www.postgresql.org/docs/8.0/interactive/features.html
>http://en.wikipedia.org/wiki/SQL
>http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf
>http://www.oracle.com/oramag/oracle/01-jul/o41industry.html
> 
> SQLite basically supports just SQL-92, it doesn't have any of these
> newer SQL:1999 or SQL:2003 features, right?
> 
> Using SQLi

Re: [sqlite] SQL Window/OLAP functions

2005-10-12 Thread pilot pirx
I did use R extensively for years and with SQLite for the last year. 
My observations, for what they are worth, about statistics and databases.

A stats package, especially as powerful as R, makes any database functions
less relevant. After trying various approaches I use now the database mostly 
 - to reduce data somewhat before getting them to R. 
   For example, with 2 mln records in 500 groups I can compute group 
   averages in the database and read into R only 500 records. 
- to reduce overall memory requirements - I can process the whole 
  data set on group-by-group basis - reading one group at a time.
  thus requiring less memory internally

Any stats package will have many functions, most of them impossible to
implement in standard SQL and very difficult to implement in general.
(like clustering etc).

So why do I write filters in the database, instead of using vastly superior 
R capabilities? Firstly, for fun - just liked to push SQL as far as possible.
Secondly - it is sometimes useful to compute some basic things during
or immediately after data acquisition. If we can compute some 
_simple_ metrics at that stage _quickly_ then R does not have to
read and process that much, as only data withing some range of metric
may be of interest. Also, we can index on a metric and provide very
fast extraction of data subsets.

In summary it seems to me that adding too heavy functions to any database
may be difficult and, for SQLite, going against the basic idea of 
having a 'lite' db. 






- Original Message -
From: Laurent <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 15:36:22 +0200

> 
> Hello,
> 
> I was just looking for a statiscal package linked with SQLITE.
> >
> > Using SQLite in conjunction with a powerful statistical data analysis
> > programming language like R is an excellent example of a use where
> > windowing functions can be hugely helpful.  Unfortunately, I've never
> > had a compelling need to use SQLite for that, otherwise I'd probably
> > take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)
> >
> I can confirm that there would be some interest in having such a library.
> 
> Best regards,
> 
> Laurent.
> 
> ==
> 
> - Original Message -
> From: "Andrew Piskorski" <[EMAIL PROTECTED]>
> To: 
> Sent: Wednesday, October 12, 2005 2:34 PM
> Subject: [sqlite] SQL Window/OLAP functions
> 
> 
> > On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > > Subject: [sqlite] Please, please do _not_ remove this feature from
> SQLite...
> >
> > > While using SQLite for some time (with R package, www.r-project.org)
> > > I did admire its functionality and speed. Then I did discover a
> > > hidden SQLite feature of immense usefulness - not available in other
> > > databases. SQLite can compute Fibonacci numbers! (I will explain why
> >
> > Transaction visibility features do vary, although often it doesn't
> > matter anyway.  E.g., here's a dicussion of how (at least as of early
> > 2004), PostgreSQL's docs were quite confused about certain subtleties,
> > but what I find interesting, is this was still something that in
> > practice had never really mattered to the mostly hard-core RDBMS
> > programmers talking about it in that thread:
> >
> >   http://openacs.org/forums/message-view?message_id=176198
> >
> > > UPDATE fib SET
> > > val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> > >(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > > WHERE pos > 2;
> >
> > I don't see why this is such a great feature.  Without it, worst case,
> > you could still write a simple little loop which would issue one
> > update statement for each row, all within a single transaction.  No?
> >
> > > This is an _immensely_ useful functionality when one needs to
> > > compute various recursive functions. For example exponential moving
> > > average, used frequently in financials. Or Kalman filter (and many
> >
> > Vastly more useful for moving average and the like would be real
> > windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> > not thrilled by their particular syntax, but the functionality is
> > INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> > obviously better syntax, either.)
> >
> > Hm, an amendement to the SQL:1999 spec added windowing support, and
> > SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> > functions" and T612, "

[sqlite] Please, please do _not_ remove this feature from SQLite...

2005-10-12 Thread pilot pirx
While using SQLite for some time (with R package, www.r-project.org) I did 
admire its functionality and speed. Then I did discover a hidden SQLite feature 
of immense usefulness - not available in other databases. SQLite can compute 
Fibonacci numbers! (I will explain why this is important later).

The are defined as follows: F1 = 1, F2 = 1; F3 = F1 + F2; F4 = F2 +F3  etc. So 
they are defined recursively, with the first two values known and each next 
value depending on the previous values. The SQL statement for that is (the full 
script at the end of this mail):

UPDATE fib SET
val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
   (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
WHERE pos > 2;

Now, in the standard SQL that should result in: 1,1,2,null,null,null... - 
because the assumption is that the operations on all rows are done at the same 
time.  So F4 will get null, because F3 was null etc. All other databases seem 
to do it - I tested HSQLDB, Firebird, SQLServer, Derby. 

Fortunately (for me) SQLite apparently stores rows after computation and the 
previous row's value is available when the next row is being computed. Also, 
the operations are executed in row order.  So I do get 1,1, 2, 3, 5, 8 ... 
(though some table/index declarations have to be right to achieve that for 
larger tables).

This is an _immensely_ useful functionality when one needs to compute various 
recursive functions. For example exponential moving average, used frequently in 
financials. Or Kalman filter (and many other filters) used in data smoothing 
and analysis.  Naturally, one could argue that databases are not for numerical 
computations. But it is very useful to be able to do simple computations in 
database,  especially if one does not have to write stored procedures or write 
external procedures for that - in essence getting something for free and 
without added complexity. So I hope that this feature will stay...


P.S. On somewhat related note: since sqlite is written in C - why it does not 
expose some basic functions from the standard C library (log, exp, sqrt, sin), 
at least optionally? Understandably, the idea is to keep it 'lite'. But, may 
be, an approach similar to ant and other packages could be applied to SQLite - 
that is there is a set of standard (but still simple) extensions, including 
things which may add some bulk, but do not require any large implementation 
effort.


= the test

DROP TABLE fib;
CREATE TABLE fib ( pos INTEGER, val INTEGER);
CREATE UNIQUE INDEX fib_ix ON fib(pos);
INSERT INTO fib VALUES (1,1);
INSERT INTO fib VALUES (2,1);
INSERT INTO fib VALUES (3,NULL);
INSERT INTO fib VALUES (4,NULL);
INSERT INTO fib VALUES (5,NULL);
INSERT INTO fib VALUES (6,NULL);
INSERT INTO fib VALUES (7,NULL);
INSERT INTO fib VALUES (8,NULL);
INSERT INTO fib VALUES (9,NULL);
INSERT INTO fib VALUES (10,NULL);
INSERT INTO fib VALUES (11,NULL);
INSERT INTO fib VALUES (12,NULL);
INSERT INTO fib VALUES (13,NULL);
INSERT INTO fib VALUES (14,NULL);
INSERT INTO fib VALUES (15,NULL);
INSERT INTO fib VALUES (16,NULL);
INSERT INTO fib VALUES (17,NULL);
INSERT INTO fib VALUES (18,NULL);
INSERT INTO fib VALUES (19,NULL);
INSERT INTO fib VALUES (20,NULL);
UPDATE fib SET -- compute fibonacci numbers
val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
   (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
WHERE pos > 2;

select * from fib; -- show the results



-- 
___
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm