subject:"Re\: \[sqlite\] Break on cumulative sum"

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Petite Abeille


On Mar 5, 2013, at 9:24 AM, Nico Williams  wrote:

> +1 re: recursive queries.  

There is a standard for that  (in ANSI SQL-99?): recursive 'with' clause, aka 
recursive subquery factoring, aka recursive common table expressions.

http://www.postgresql.org/docs/9.2/static/queries-with.html

This kills two birds with one stone: named, reusable subqueries *and* 
recursion. A double win. 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Petite Abeille

On Mar 5, 2013, at 8:19 AM, James K. Lowden  wrote:

>> Postgresql has also had them for a while. 
> 
> Yes, and if I'm not mistaken Ingres's QUEL had them long before that.

Yes, many databases have them, from Postgres, MSSQL, Sybase IQ, some flavor of 
DB2, to Oracle, etc, ...

> (I see Microsoft calls some of them e.g.. PERCENTILE_RANK
> "analytical".  Why?  "analysis", after all, means "take apart".  They
> might be used for analysis, but window functions are every bit as
> synthetic as normal aggregate functions.)  

Oracle call them "analytic functions":

SQL for Analysis and Reporting
http://docs.oracle.com/cd/E14072_01/server.112/e10810/analysis.htm

Analytic functions really brings SQL to a whole new level of functionality and 
usefulness. Once one goes, errr, analytic, one never goes back.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Joe.Fisher

One of the reasons why we have to use PostgreSQL and/or Oracle is for 
the 'Window Functions' (AKA: Analytic Functions).
We use SQLite all the time and love it but this one missing feature 
bites us.
Could we get some advice from the SQLite core team on the feasibility of 
this?

Could it be just an extension or would it have to be more than that?

Joe

On Mar 4, 2013, at 1:32 AM, James K. Lowden  
wrote:

> What do you have in mind?  I've benn adding some user defined functions
> and am thinking of creating a repository for them.

All the so-called window functions from SQL:2003 (aka analytic functions):

"Windowed Tables and Window Functions in SQL"
http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread James K. Lowden

On Tue, 5 Mar 2013 02:24:53 -0600
Nico Williams  wrote:

> Lies!  :)  There's one more type of UDF in SQLite3: virtual tables.
> You could have a virtual table where selecting with an equality test
> for some column is "passing an argument" to a "table function".  I
> know, I've done this.  But it gets better!  SQLite3 is re-entrant, so
> you can actually format a SQL statement in the virtual table given its
> "arguments" and execute it (being careful to not create a SQL
> injection vulnerability).  IIRC FTS works this way.

Hmm, OK.  I think you're right; this could work, even without parsing
any SQL.  I'm imagining a virtual table like

CREATE VIRTUAL TABLE output
USING tfn( 'create table output ...', 'select ...' );

where the function executes the first argument to create a table,
solely to use the metadata to determine the names and types of its own
columns.  (That saves reinventing a column-defintion language.)  Then
it executes the SQL in the second argument, the output of which become
the rows in the table.  It could even be executed recursively (3rd
argument).  

> With a tiny bit of syntactic sugar we could have a way to define table
> functions given simple scalar functions, which would be very nice
> indeed.

I don't know if that's the route; perhaps the good doctor will offer
his views.  

I think of a table function not as a disaggregator -- ick! -- but as a
parameterized view.  Or, really, any function whose output can be
expressed as a relation using any combination of C and SQL.  

Most "real" DBMSs have some form of stored procedure that's just a
named blob of SQL.  Table functions could take the place of stored
procedures in SQLite in a more disciplined way.  That would indeed be
very nice.  

> +1 re: recursive queries.  Once or twice I've resorted to a UNION ALL
> of LEFT OUTER self-JOINs, each sub-query having more and more
> self-joins -- this limits recursion depth effectively, but there's
> going to be a limit anyways.  

I wrote a UDF that executes recursively and returns a string
indicating the number of recursions and total number of rows.  You pass
in the query and target table (because we don't have table functions!)
and then select from it.   That's not as bad as it sounds.   Consider
for example a table "tree" representing a filesystem hierarchy: 

create temporary table t 
( p int -- parent
, c int -- child
, primary key (c,p)
); 
insert into t 
select * from tree 
where inode in ( 
select p from t -- "recursion"
UNION 
select inode from inodes -- seed
where filename = 'usr'
)
and not exists (-- limit
select 1 from t 
where p = pnode 
and c = inode
)

can be invoked as 

$ sqlite3 -list ../inodes.db "create temporary table t (p int, c int,
primary key (c,p)); select recurse('insert into t select tree.* from
tree where inode in (select p from t UNION select inode from inodes
where filename = ''usr'') and not exists (select 1 from t where p =
pnode and c = inode)'); \
select catname(parent_name, filename) \
from inodes where inode in \
(select c from t) \
order by catname(parent_name, filename);"

17 rows returned in 10 iterations 
catname(parent_name, filename) 
/usr/src/external
/usr/src/external/gpl3
/usr/src/external/gpl3/gcc
/usr/src/external/gpl3/gcc/dist
/usr/src/external/gpl3/gcc/dist/gcc
/usr/src/external/gpl3/gcc/dist/gcc/testsuite
/usr/src/external/gpl3/gcc/dist/gcc/testsuite/gcc.dg
/usr/src/external/gpl3/gcc/dist/gcc/testsuite/gcc.dg/cpp
/usr/src/external/gpl3/gcc/dist/gcc/testsuite/gcc.dg/cpp/usr
/usr/src/gnu
/usr/src/gnu/dist
/usr/src/gnu/dist/gcc4
/usr/src/gnu/dist/gcc4/gcc
/usr/src/gnu/dist/gcc4/gcc/testsuite
/usr/src/gnu/dist/gcc4/gcc/testsuite/gcc.dg
/usr/src/gnu/dist/gcc4/gcc/testsuite/gcc.dg/cpp
/usr/src/gnu/dist/gcc4/gcc/testsuite/gcc.dg/cpp/usr

Unlike a bunch of self-joins, this produces a correct result regardless
of depth; the recurse() function stops when an iteration produces zero
rows.  But it bears the overhead of inserting into another table and
re-executing the query (and retesting the target) on every iteration.
Moving the recursion inside the query engine would avoid all that. 

--jkl

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Dominique Devienne

On Tue, Mar 5, 2013 at 9:47 AM, Nico Williams  wrote:
> Well, under the hood table functions could use temp, gensym'ed virtual
> tables for all I care.  The point is I want table functions :)

SQLite already has table functions of sort: PRAGMAs...

They have their own issues (not real tables, so cannot be used in joins,
AFAIK), and of course you cannot add your own PRAGMA in client code using
the SQLite3 C API.

I'm hoping there's no big leap between current PRAGMAs and real table
functions, with SQL grammar changes to be able to <<>>, and that Dr. Hipp agrees table functions
would be a valuable addition of course :). --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Nico Williams

On Tue, Mar 5, 2013 at 2:44 AM, Dominique Devienne  wrote:
> On Tue, Mar 5, 2013 at 9:24 AM, Nico Williams  wrote:

> Right. Virtual Tables are very flexible, but the syntax is indeed not
> practical, and it also forces you to name and "instantiate" tables, when
> often you want to use table functions on the fly in a JOIN.

I agree with everything you said.

> VTables just isn't the right fit for table functions IMHO. My $0.02. --DD

Well, under the hood table functions could use temp, gensym'ed virtual
tables for all I care.  The point is I want table functions :)

> PS: Table functions are basically the "reverse" of aggregate functions to
> me. They take scalar inputs, and generate rows, while aggregate functions
> take rows and generate scalar(s)..

Indeed!

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Dominique Devienne

On Tue, Mar 5, 2013 at 9:24 AM, Nico Williams  wrote:
> On Tue, Mar 5, 2013 at 1:19 AM, James K. Lowden
> > 2.  Aggregation.  The engine passes the rows to be aggregated to the
> > function.  After the last row, it calls the function's "xFinal"
> > function, which returns a value.  Example: median().
>
> Lies!  :)  There's one more type of UDF in SQLite3: virtual tables.
> You could have a virtual table where selecting with an equality test
> for some column is "passing an argument" to a "table function".  I
> know, I've done this.  But it gets better!  SQLite3 is re-entrant, so
> you can actually format a SQL statement in the virtual table given its
> "arguments" and execute it (being careful to not create a SQL
> injection vulnerability).  IIRC FTS works this way.
>
> With a tiny bit of syntactic sugar we could have a way to define table
> functions given simple scalar functions, which would be very nice
> indeed.

Right. Virtual Tables are very flexible, but the syntax is indeed not
practical, and it also forces you to name and "instantiate" tables, when
often you want to use table functions on the fly in a JOIN.

Tables functions are very useful to "re-tabulate" de-normalized data
stuffed into a CSV or JSON or XML text column for example.

Once you have table functions, windowing analytics probably becomes easier
as well to express (although not necessarily efficient).

VTables just isn't the right fit for table functions IMHO. My $0.02. --DD

http://www.oracle-base.com/articles/misc/pipelined-table-functions.php

PS: Table functions are basically the "reverse" of aggregate functions to
me. They take scalar inputs, and generate rows, while aggregate functions
take rows and generate scalar(s)..
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-05 Thread Nico Williams

On Tue, Mar 5, 2013 at 1:19 AM, James K. Lowden
 wrote:
> Right you are, except it's not a question of efficient but of
> possible. A UDF works one of two ways:
>
> 1.  Per row.  The engine calls the function for each row.  The function
> transforms the input into the output.  Examples are sqrt() and pow().
>
> 2.  Aggregation.  The engine passes the rows to be aggregated to the
> function.  After the last row, it calls the function's "xFinal"
> function, which returns a value.  Example: median().

Lies!  :)  There's one more type of UDF in SQLite3: virtual tables.
You could have a virtual table where selecting with an equality test
for some column is "passing an argument" to a "table function".  I
know, I've done this.  But it gets better!  SQLite3 is re-entrant, so
you can actually format a SQL statement in the virtual table given its
"arguments" and execute it (being careful to not create a SQL
injection vulnerability).  IIRC FTS works this way.

With a tiny bit of syntactic sugar we could have a way to define table
functions given simple scalar functions, which would be very nice
indeed.

>> The engine would need to become window-aware, which is
>> probably enough work to delay its introduction until a consortium
>> member decides they need it.
>
> I would say the same.  It changes the grammar and almost certainly
> intrudes on the  GROUP BY implementation.  AFAIK it's not on the
> sqlite4 agenda.

I've often wondered if there's interest in a heavier-weight fork of
SQLite*.  Clearly there isn't: it'd have been done already.  Without
support from the consortium (and a fork wouldn't get that support) it
couldn't possibly hold a candle to SQLite in terms of robustness.

> Adding nonstandard equivalent functionality by extending aggregate
> functions might be easier. But the inputs to the OVER clause --
> partition, range, order -- would still be required.

For aggregate functions like group_concat() an ORDER clause to apply
to its inputs would be wonderful.

> I'm surprised there's much interest in using SQLite for fancy
> grouping.  My hobbyhorse is recursive queries, which IMO are much more
> of a nuisance to carry out.

+1 re: recursive queries.  Once or twice I've resorted to a UNION ALL
of LEFT OUTER self-JOINs, each sub-query having more and more
self-joins -- this limits recursion depth effectively, but there's
going to be a limit anyways.  I've also use recursive triggers to good
effect (though triggers slow things down plenty).

Nico
--
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-04 Thread James K. Lowden

On Mon, 04 Mar 2013 16:51:35 -0500
Ryan Johnson  wrote:

> On 04/03/2013 2:20 PM, Petite Abeille wrote:
> > On Mar 4, 2013, at 1:32 AM, James K. Lowden
> >  wrote:
> >
> >> What do you have in mind?  I've benn adding some user defined
> >> functions and am thinking of creating a repository for them.
> > All the so-called window functions from SQL:2003 (aka analytic
> > functions):
> >
> > "Windowed Tables and Window Functions in SQL"
> > http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf
>
> Postgresql has also had them for a while. 

Yes, and if I'm not mistaken Ingres's QUEL had them long before that.
(I see Microsoft calls some of them e.g.. PERCENTILE_RANK
"analytical".  Why?  "analysis", after all, means "take apart".  They
might be used for analysis, but window functions are every bit as
synthetic as normal aggregate functions.)  

> I'd love to see sqlite3 gain them, but I don't think they can be
> added efficiently as custom functions. 

Right you are, except it's not a question of efficient but of
possible. A UDF works one of two ways:

1.  Per row.  The engine calls the function for each row.  The function
transforms the input into the output.  Examples are sqrt() and pow().  

2.  Aggregation.  The engine passes the rows to be aggregated to the
function.  After the last row, it calls the function's "xFinal"
function, which returns a value.  Example: median().  

In the per-row case, the function must produce its answer based on the
(single) value provided.  Ranking, for example, would be impossible
because the "rows ahead" haven't been seen yet.  

In the case of aggregation, the calls to xFinal are determined by the
GROUP BY clause.  

> The engine would need to become window-aware, which is 
> probably enough work to delay its introduction until a consortium
> member decides they need it. 

I would say the same.  It changes the grammar and almost certainly
intrudes on the  GROUP BY implementation.  AFAIK it's not on the
sqlite4 agenda.  

Adding nonstandard equivalent functionality by extending aggregate
functions might be easier. But the inputs to the OVER clause --
partition, range, order -- would still be required.  

I'm surprised there's much interest in using SQLite for fancy
grouping.  My hobbyhorse is recursive queries, which IMO are much more
of a nuisance to carry out.  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-04 Thread Ryan Johnson


On 04/03/2013 2:20 PM, Petite Abeille wrote:

On Mar 4, 2013, at 1:32 AM, James K. Lowden  wrote:


What do you have in mind?  I've benn adding some user defined functions
and am thinking of creating a repository for them.

All the so-called window functions from SQL:2003 (aka analytic functions):

"Windowed Tables and Window Functions in SQL"
http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf
Postgresql has also had them for a while. I'd love to see sqlite3 gain 
them, but I don't think they can be added efficiently as custom 
functions. The engine would need to become window-aware, which is 
probably enough work to delay its introduction until a consortium member 
decides they need it. That said, the VDBE architecture should make it 
significantly easier to implement inside sqlite3 than for most other 
engines.


Ryan

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-04 Thread Petite Abeille


On Mar 4, 2013, at 1:32 AM, James K. Lowden  wrote:

> What do you have in mind?  I've benn adding some user defined functions
> and am thinking of creating a repository for them.  

All the so-called window functions from SQL:2003 (aka analytic functions):

"Windowed Tables and Window Functions in SQL"
http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread James K. Lowden

On Sun, 3 Mar 2013 07:19:11 -0600
"Michael Black"  wrote:

> One thing I'll point out...the trigger solution is almost linear
> performance and constant memory where the single SQL statement will
> suffer notably as time goes on in both those measures.  Also the
> single SQL statement doesn't seem like a good idea if the data is
> coming in real time. He never mentioned how big his table is going to
> be.

If TextField 'A' is deleted, doesn't the entire table have to be
updated?  Sure, that's O(n), but where n is the size of the table, not
the size of the update.  

Acknowledged, sometimes there are reasons to store derived results.
But that choice bears costs, both in updating the database and in
engineering it, costs that can be avoided if they can be avoided.
Knuth said premature optimization is the root of all evil, and I
agree.  :-)  

Normalized data minimize the cost of manipulation.  That is why we
normalize our databases.  As a side-effect, normalized data permit more
efficient use of memory, benefiting SELECT, too.  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread James K. Lowden

On Sun, 3 Mar 2013 18:50:35 +0100
Petite Abeille  wrote:

> Ah? if only? SQLite had analytical functions? oh, well..

What do you have in mind?  I've benn adding some user defined functions
and am thinking of creating a repository for them.  

--jkl

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread Petite Abeille


On Mar 3, 2013, at 2:10 PM, "James K. Lowden"  wrote:

> There's some cruft, too.  I was only demonstrating that it could be
> done.  If you find a way to simplify it, you'll know you understand
> it.  

Ah… if only… SQLite had analytical functions… oh, well..

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread Michael Black

One thing I'll point out...the trigger solution is almost linear performance
and constant memory where the single SQL statement will suffer notably as
time goes on in both those measures.  Also the single SQL statement doesn't
seem like a good idea if the data is coming in real time.
He never mentioned how big his table is going to be.

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of James K. Lowden
Sent: Sunday, March 03, 2013 7:10 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Break on cumulative sum

On Sun, 3 Mar 2013 11:46:26 +0100
Gert Van Assche  wrote:

> My SQL knowledge is not yet good enough to understand what you
> proposed, but I hope one day that I will be able to understand and
> replicate myself the code you wrote. I started of with a join on the
> table itself as well, but I got stuck. I'll study your code and I'll
> grow.

Happy to be service, Gert.  

The trick is always the same: get your ducks in a row and shoot.  Figure
out the formula that computes the answer you want, and put the
constituent parts on the row, then compute.  

The first order of business in your case was a running sum.  That's
done by adding up everything "before" the current row, where "before"
means "less than" something, however ordered.   

select   a.*, sum(b.ValueField) as total
from Test as a join Test as b
on a.TextField >= b.TextField
group by a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint

For each row in "a", get all the rows in "b" whose TextField is less
than (or equal to, here) to a's.  Here it is, live and in color: 

$ sqlite3 -echo db < self.sql 
select   a.TextField
, a.ValueField
, a.BreakPoint
 , sum(b.ValueField) as total
from Test as a join Test as b
on a.TextField >= b.TextField
group by a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint
;
TextField   ValueField  BreakPoint  total 
--  --  --  --
A   2   10  2 
B   3   10  5 
C   2   10  7 
D   4   10  11
E   5   10  16
F   1   10  17
G   1   10  18
H   5   10  23
I   11  10  34
J   8   10  42
K   2   10  44

That formed my building block.  

Your formula is almost

GN = 1 + total / Breakpoint

By using integer arithmetic, I could divide "total" by BreakPoint to
get a multiple, which is almost what you call a GroupName, except for
that "one past" rule I remarked on, where for example 11 is part of
the first group.  But by joining my block to itself, offset by one, we
can slide the multiplier down a row.  

For CumulativeValue, your formula is 

CV = total - [max total of previous GN]

That requires computing the GN, grouping on it, and finding the maximum
total for it, and joining that result back to the main one, where it
can be subtracted from the main total.  

So, yes, there are a lot of joins.  But I didn't write the query
top-down, and you shouldn't read it that way.  Read it inside-out.
Each piece can be separately evaluated, and each operand used in the
calculation can be exposed in the SELECT clause.  

There's some cruft, too.  I was only demonstrating that it could be
done.  If you find a way to simplify it, you'll know you understand
it.  

Regards, 

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread James K. Lowden

On Sun, 3 Mar 2013 11:46:26 +0100
Gert Van Assche  wrote:

> My SQL knowledge is not yet good enough to understand what you
> proposed, but I hope one day that I will be able to understand and
> replicate myself the code you wrote. I started of with a join on the
> table itself as well, but I got stuck. I'll study your code and I'll
> grow.

Happy to be service, Gert.  

The trick is always the same: get your ducks in a row and shoot.  Figure
out the formula that computes the answer you want, and put the
constituent parts on the row, then compute.  

The first order of business in your case was a running sum.  That's
done by adding up everything "before" the current row, where "before"
means "less than" something, however ordered.   

select   a.*, sum(b.ValueField) as total
from Test as a join Test as b
on a.TextField >= b.TextField
group by a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint

For each row in "a", get all the rows in "b" whose TextField is less
than (or equal to, here) to a's.  Here it is, live and in color: 

$ sqlite3 -echo db < self.sql 
select   a.TextField
, a.ValueField
, a.BreakPoint
 , sum(b.ValueField) as total
from Test as a join Test as b
on a.TextField >= b.TextField
group by a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint
;
TextField   ValueField  BreakPoint  total 
--  --  --  --
A   2   10  2 
B   3   10  5 
C   2   10  7 
D   4   10  11
E   5   10  16
F   1   10  17
G   1   10  18
H   5   10  23
I   11  10  34
J   8   10  42
K   2   10  44

That formed my building block.  

Your formula is almost

GN = 1 + total / Breakpoint

By using integer arithmetic, I could divide "total" by BreakPoint to
get a multiple, which is almost what you call a GroupName, except for
that "one past" rule I remarked on, where for example 11 is part of
the first group.  But by joining my block to itself, offset by one, we
can slide the multiplier down a row.  

For CumulativeValue, your formula is 

CV = total - [max total of previous GN]

That requires computing the GN, grouping on it, and finding the maximum
total for it, and joining that result back to the main one, where it
can be subtracted from the main total.  

So, yes, there are a lot of joins.  But I didn't write the query
top-down, and you shouldn't read it that way.  Read it inside-out.
Each piece can be separately evaluated, and each operand used in the
calculation can be exposed in the SELECT clause.  

There's some cruft, too.  I was only demonstrating that it could be
done.  If you find a way to simplify it, you'll know you understand
it.  

Regards, 

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread Gert Van Assche

James,

I'm very impressed by your solution. My SQL knowledge is not yet good
enough to understand what you proposed, but I hope one day that I will be
able to understand and replicate myself the code you wrote. I started of
with a join on the table itself as well, but I got stuck. I'll study your
code and I'll grow.
That is the beauty of groups like this: hobby developers like me get advice
from experts on problems we're facing. Very helpful, not just the code, but
also the on-problem training!
Thanks for your work, all!

gert

2013/3/3 James K. Lowden 

> On Sat, 2 Mar 2013 14:46:40 +0100
> Gert Van Assche  wrote:
>
> > All, I don't know how to achieve this: I need to put the cumulative
> > sum in a field, and create a group as soon as that cumulative sum is
> > over a breakpoint value (10).
>
> This was a bit of a challenge because the group definition is IMO a
> little strange: The first value *after* the breakpoint is included in
> the group.  That made the SQL more convoluted than it would be
> otherwise.
>
> That said, the attached SQL does the job in one statement.  No triggers
> are needed.
>
> $ sqlite3 db < running_sum.sql
> TextField   ValueField  CumulativeValue  BreakPoint  GroupName
> --  --  ---  --  --
> A   2   210  1
> B   3   510  1
> C   2   710  1
> D   4   11   10  1
> E   5   510  2
> F   1   610  2
> G   1   710  2
> H   5   12   10  2
> I   11  11   10  3
> J   8   810  4
> K   2   10   10  4
>
> This lets you define the results in terms of the base data instead of
> keeping a table of derived values.  No updates required, just a view.
> Unless you have data-warehouse rowcounts or a very limited machine, it
> should perform just fine, despite what one might guess, because the I/O
> is limited to one table.  It should also outperform any per-row
> solution.
>
> Two bits of advice if I may.
>
> Be careful of using triggers to maintain data consistency.  Triggers
> are very good for enforcing referential integrity in situations that
> DRI can't handle. Beyond that they get very complex and error prone.
>
> Using triggers to compute derived values signals redundancy in the
> database.  That redundancy often is not as desirable as the
> denormalize-for-performance crowd thinks.  Better to compute derived
> values on demand (as in a view).  If that's demonstrated to be too
> slow, a periodic update in a cron job or similar should provide better
> thoughput than recomputing on every insert/update/delete.  You'll also
> do yourself a favor by segregating derived values in a different table
> that can be regenerated at will.
>
> My other suggestion is to eschew abstract nouns in column names.  Words
> like data, field, and value add no meaning.  After all, it's a
> database.  If it's not a value, what is it?
>
> This table,
>
> CREATE TABLE [Test]
> ( Name
> , Value
> , RunningSum
> , Breakpoint /* (one word, small 'p') */ DEFAULT 10
> , GroupName
> , primary key (Name)
> );
>
> is at least  as clear, wouldn't you say?
>
> HTH.
>
> --jkl
>
> P.S. In case the listserv strips attachments, here it is in plain
> text.
>
> [SQL]
> select   A.TextField, A.ValueField
>, A.total - coalesce(C.GroupTotal, 0) as CumulativeValue
>, A.Breakpoint
> --   , coalesce(C.GroupTotal, 0) as GroupTotal
>, coalesce(B.GroupName, A.GroupName) as GroupName
> from (
> select   a.*, sum(b.ValueField) as total
>, 1 + sum(b.ValueField) / (1+a.Breakpoint) as GroupName
> from Test as a join Test as b
> on a.TextField >= b.TextField
> group by a.TextField
> , a.ValueField
> , a.CumulativeValue
> , a.BreakPoint
> ) as A
> left join (
> select   a.*, sum(b.ValueField) as total
>, 1 + sum(b.ValueField) / (1 + a.Breakpoint) as GroupName
> from Test as a join Test as b
> on a.TextField >= b.TextField
> group by a.TextField
> , a.ValueField
> , a.CumulativeValue
> , a.BreakPoint
> ) as B
> on B.TextField = (select max(TextField) from Test where TextField <
> A.TextField) left join (
> select1 + T.GroupName as GroupName
> , max(T.total) as GroupTotal
> from (
> selectA.TextField
> , A.ValueField
> , A.total
> , coalesce(B.GroupName, 1) as GroupName
> from (

Re: [sqlite] Break on cumulative sum

2013-03-03 Thread James K. Lowden

On Sat, 2 Mar 2013 14:46:40 +0100
Gert Van Assche  wrote:

> All, I don't know how to achieve this: I need to put the cumulative
> sum in a field, and create a group as soon as that cumulative sum is
> over a breakpoint value (10).

This was a bit of a challenge because the group definition is IMO a
little strange: The first value *after* the breakpoint is included in
the group.  That made the SQL more convoluted than it would be
otherwise.  

That said, the attached SQL does the job in one statement.  No triggers
are needed.  

$ sqlite3 db < running_sum.sql
TextField   ValueField  CumulativeValue  BreakPoint  GroupName 
--  --  ---  --  --
A   2   210  1 
B   3   510  1 
C   2   710  1 
D   4   11   10  1 
E   5   510  2 
F   1   610  2 
G   1   710  2 
H   5   12   10  2 
I   11  11   10  3 
J   8   810  4 
K   2   10   10  4 

This lets you define the results in terms of the base data instead of
keeping a table of derived values.  No updates required, just a view.
Unless you have data-warehouse rowcounts or a very limited machine, it
should perform just fine, despite what one might guess, because the I/O
is limited to one table.  It should also outperform any per-row
solution.  

Two bits of advice if I may.  

Be careful of using triggers to maintain data consistency.  Triggers
are very good for enforcing referential integrity in situations that
DRI can't handle. Beyond that they get very complex and error prone.  

Using triggers to compute derived values signals redundancy in the
database.  That redundancy often is not as desirable as the
denormalize-for-performance crowd thinks.  Better to compute derived
values on demand (as in a view).  If that's demonstrated to be too
slow, a periodic update in a cron job or similar should provide better
thoughput than recomputing on every insert/update/delete.  You'll also
do yourself a favor by segregating derived values in a different table
that can be regenerated at will.  

My other suggestion is to eschew abstract nouns in column names.  Words
like data, field, and value add no meaning.  After all, it's a
database.  If it's not a value, what is it?  

This table, 

CREATE TABLE [Test] 
( Name
, Value
, RunningSum
, Breakpoint /* (one word, small 'p') */ DEFAULT 10
, GroupName
, primary key (Name)
); 

is at least  as clear, wouldn't you say?  

HTH.  

--jkl

P.S. In case the listserv strips attachments, here it is in plain
text.  

[SQL]
select   A.TextField, A.ValueField
   , A.total - coalesce(C.GroupTotal, 0) as CumulativeValue
   , A.Breakpoint
--   , coalesce(C.GroupTotal, 0) as GroupTotal
   , coalesce(B.GroupName, A.GroupName) as GroupName
from (
select   a.*, sum(b.ValueField) as total
   , 1 + sum(b.ValueField) / (1+a.Breakpoint) as GroupName
from Test as a join Test as b
on a.TextField >= b.TextField
group by a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint
) as A
left join (
select   a.*, sum(b.ValueField) as total
   , 1 + sum(b.ValueField) / (1 + a.Breakpoint) as GroupName
from Test as a join Test as b
on a.TextField >= b.TextField
group by a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint
) as B
on B.TextField = (select max(TextField) from Test where TextField <
A.TextField) left join (
select1 + T.GroupName as GroupName
, max(T.total) as GroupTotal
from (
selectA.TextField
, A.ValueField
, A.total
, coalesce(B.GroupName, 1) as GroupName
from (
select   a.*, sum(b.ValueField) as total
   , 1 + sum(b.ValueField) / (1 +
a.Breakpoint) as GroupName from Test as a join Test as b
on a.TextField >= b.TextField
group by  a.TextField
, a.ValueField
, a.CumulativeValue
, a.BreakPoint
) as A
left join (
select   a.*, sum(b.ValueField) as total
   , 1 + sum(b.ValueField) / (1 +
a.Breakpoint) as GroupN

Re: [sqlite] Break on cumulative sum

2013-03-02 Thread Michael Black

And I just noticed that should probably be

when new.BreakPoint <=


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Gert Van Assche
Sent: Saturday, March 02, 2013 10:49 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Break on cumulative sum

Michael, this is working perfectly!
I learned a lot with your code.
Thanks a lot for your help.

gert

2013/3/2 Michael Black 

> I think your "K" row was a typo on the CumulativeValue?
>
> CREATE TABLE [Test] (TextField, ValueField, CumulativeValue,  BreakPoint
> DEFAULT 10, GroupName);
> CREATE TABLE [MyGroup](GroupName);
> insert into [MyGroup] values(1);
> create trigger trig1 before insert on [Test]
> when 10 <= (select sum(ValueField) from [Test] where GroupName=(select
> GroupName from [MyGroup]))
> begin
> update [MyGroup] set GroupName = (select GroupName+1 from
> [MyGroup]);
> end;
> create trigger trig2 after insert on [Test]
> begin
> update [Test] set GroupName = (select GroupName from [MyGroup])
> where
>  rowid=new.rowid;
> update [Test] set CumulativeValue = (select sum(ValueField) from
> [Test] where GroupName=(select GroupName from [MyGroup])) where
> rowid=new.rowid;
> end;
> insert into [Test] values('A', '2', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('B', '3', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('C', '2', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('D', '4', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('E', '5', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('F', '1', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('G', '1', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('H', '5', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('I', '11', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('J', '8', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('K', '2', null, '10',(select GroupName from
> MyGroup));
> select * from [Test];
> A|2|2|10|1
> B|3|5|10|1
> C|2|7|10|1
> D|4|11|10|1
> E|5|5|10|2
> F|1|6|10|2
> G|1|7|10|2
> H|5|12|10|2
> I|11|11|10|3
> J|8|8|10|4
> K|2|10|10|4
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:
> sqlite-users-boun...@sqlite.org] On Behalf Of Gert Van Assche
> Sent: Saturday, March 02, 2013 7:47 AM
> To: sqlite-users
> Subject: [sqlite] Break on cumulative sum
>
> All, I don't know how to achieve this: I need to put the cumulative sum in
> a field, and create a group as soon as that cumulative sum is over a
> breakpoint value (10).
>
> This is an example table:
>
> CREATE TABLE [Test] (TextField, ValueField, CumulativeValue,  BreakPoint
> DEFAULT 10, GroupName);
> insert into [Test] values('A', '2', null, '10');
> insert into [Test] values('B', '3', null, '10');
> insert into [Test] values('C', '2', null, '10');
> insert into [Test] values('D', '4', null, '10');
> insert into [Test] values('E', '5', null, '10');
> insert into [Test] values('F', '1', null, '10');
> insert into [Test] values('G', '1', null, '10');
> insert into [Test] values('H', '5', null, '10');
> insert into [Test] values('I', '11', null, '10');
> insert into [Test] values('J', '8', null, '10');
> insert into [Test] values('K', '2', null, '10');
>
> I'd like to end up with a table that looks like this:
>
> TextField ValueField CumulativeValue BreakPoint GroupName
> A 2 2 10 1
> B 3 5 10 1
> C 2 7 10 1
> D 4 11 10 1
> E 5 5 10 2
> F 1 6 10 2
> G 1 7 10 2
> H 5 12 10 2
> I 11 11 10 3
> J 8 8 10 4
> K 2 2 10 4
>
> I spent hours trying to update the CumulativeValue field untill the
> BreakPoint value is crossed, and restarting the cumulative counter, but I
> have too little sql knowledge to do this.
>
> Could anyone help me?
>
> thanks
>
> gert
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-02 Thread Gert Van Assche

Michael, this is working perfectly!
I learned a lot with your code.
Thanks a lot for your help.

gert

2013/3/2 Michael Black 

> I think your "K" row was a typo on the CumulativeValue?
>
> CREATE TABLE [Test] (TextField, ValueField, CumulativeValue,  BreakPoint
> DEFAULT 10, GroupName);
> CREATE TABLE [MyGroup](GroupName);
> insert into [MyGroup] values(1);
> create trigger trig1 before insert on [Test]
> when 10 <= (select sum(ValueField) from [Test] where GroupName=(select
> GroupName from [MyGroup]))
> begin
> update [MyGroup] set GroupName = (select GroupName+1 from
> [MyGroup]);
> end;
> create trigger trig2 after insert on [Test]
> begin
> update [Test] set GroupName = (select GroupName from [MyGroup])
> where
>  rowid=new.rowid;
> update [Test] set CumulativeValue = (select sum(ValueField) from
> [Test] where GroupName=(select GroupName from [MyGroup])) where
> rowid=new.rowid;
> end;
> insert into [Test] values('A', '2', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('B', '3', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('C', '2', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('D', '4', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('E', '5', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('F', '1', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('G', '1', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('H', '5', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('I', '11', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('J', '8', null, '10',(select GroupName from
> MyGroup));
> insert into [Test] values('K', '2', null, '10',(select GroupName from
> MyGroup));
> select * from [Test];
> A|2|2|10|1
> B|3|5|10|1
> C|2|7|10|1
> D|4|11|10|1
> E|5|5|10|2
> F|1|6|10|2
> G|1|7|10|2
> H|5|12|10|2
> I|11|11|10|3
> J|8|8|10|4
> K|2|10|10|4
>
>
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:
> sqlite-users-boun...@sqlite.org] On Behalf Of Gert Van Assche
> Sent: Saturday, March 02, 2013 7:47 AM
> To: sqlite-users
> Subject: [sqlite] Break on cumulative sum
>
> All, I don't know how to achieve this: I need to put the cumulative sum in
> a field, and create a group as soon as that cumulative sum is over a
> breakpoint value (10).
>
> This is an example table:
>
> CREATE TABLE [Test] (TextField, ValueField, CumulativeValue,  BreakPoint
> DEFAULT 10, GroupName);
> insert into [Test] values('A', '2', null, '10');
> insert into [Test] values('B', '3', null, '10');
> insert into [Test] values('C', '2', null, '10');
> insert into [Test] values('D', '4', null, '10');
> insert into [Test] values('E', '5', null, '10');
> insert into [Test] values('F', '1', null, '10');
> insert into [Test] values('G', '1', null, '10');
> insert into [Test] values('H', '5', null, '10');
> insert into [Test] values('I', '11', null, '10');
> insert into [Test] values('J', '8', null, '10');
> insert into [Test] values('K', '2', null, '10');
>
> I'd like to end up with a table that looks like this:
>
> TextField ValueField CumulativeValue BreakPoint GroupName
> A 2 2 10 1
> B 3 5 10 1
> C 2 7 10 1
> D 4 11 10 1
> E 5 5 10 2
> F 1 6 10 2
> G 1 7 10 2
> H 5 12 10 2
> I 11 11 10 3
> J 8 8 10 4
> K 2 2 10 4
>
> I spent hours trying to update the CumulativeValue field untill the
> BreakPoint value is crossed, and restarting the cumulative counter, but I
> have too little sql knowledge to do this.
>
> Could anyone help me?
>
> thanks
>
> gert
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

2013-03-02 Thread Michael Black

I think your "K" row was a typo on the CumulativeValue?

CREATE TABLE [Test] (TextField, ValueField, CumulativeValue,  BreakPoint 
DEFAULT 10, GroupName);
CREATE TABLE [MyGroup](GroupName);
insert into [MyGroup] values(1);
create trigger trig1 before insert on [Test]
when 10 <= (select sum(ValueField) from [Test] where GroupName=(select 
GroupName from [MyGroup]))
begin
update [MyGroup] set GroupName = (select GroupName+1 from [MyGroup]);
end;
create trigger trig2 after insert on [Test]
begin
update [Test] set GroupName = (select GroupName from [MyGroup]) where
 rowid=new.rowid;
update [Test] set CumulativeValue = (select sum(ValueField) from [Test] 
where GroupName=(select GroupName from [MyGroup])) where rowid=new.rowid;
end;
insert into [Test] values('A', '2', null, '10',(select GroupName from MyGroup));
insert into [Test] values('B', '3', null, '10',(select GroupName from MyGroup));
insert into [Test] values('C', '2', null, '10',(select GroupName from MyGroup));
insert into [Test] values('D', '4', null, '10',(select GroupName from MyGroup));
insert into [Test] values('E', '5', null, '10',(select GroupName from MyGroup));
insert into [Test] values('F', '1', null, '10',(select GroupName from MyGroup));
insert into [Test] values('G', '1', null, '10',(select GroupName from MyGroup));
insert into [Test] values('H', '5', null, '10',(select GroupName from MyGroup));
insert into [Test] values('I', '11', null, '10',(select GroupName from 
MyGroup));
insert into [Test] values('J', '8', null, '10',(select GroupName from MyGroup));
insert into [Test] values('K', '2', null, '10',(select GroupName from MyGroup));
select * from [Test];
A|2|2|10|1
B|3|5|10|1
C|2|7|10|1
D|4|11|10|1
E|5|5|10|2
F|1|6|10|2
G|1|7|10|2
H|5|12|10|2
I|11|11|10|3
J|8|8|10|4
K|2|10|10|4


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Gert Van Assche
Sent: Saturday, March 02, 2013 7:47 AM
To: sqlite-users
Subject: [sqlite] Break on cumulative sum

All, I don't know how to achieve this: I need to put the cumulative sum in
a field, and create a group as soon as that cumulative sum is over a
breakpoint value (10).

This is an example table:

CREATE TABLE [Test] (TextField, ValueField, CumulativeValue,  BreakPoint
DEFAULT 10, GroupName);
insert into [Test] values('A', '2', null, '10');
insert into [Test] values('B', '3', null, '10');
insert into [Test] values('C', '2', null, '10');
insert into [Test] values('D', '4', null, '10');
insert into [Test] values('E', '5', null, '10');
insert into [Test] values('F', '1', null, '10');
insert into [Test] values('G', '1', null, '10');
insert into [Test] values('H', '5', null, '10');
insert into [Test] values('I', '11', null, '10');
insert into [Test] values('J', '8', null, '10');
insert into [Test] values('K', '2', null, '10');

I'd like to end up with a table that looks like this:

TextField ValueField CumulativeValue BreakPoint GroupName
A 2 2 10 1
B 3 5 10 1
C 2 7 10 1
D 4 11 10 1
E 5 5 10 2
F 1 6 10 2
G 1 7 10 2
H 5 12 10 2
I 11 11 10 3
J 8 8 10 4
K 2 2 10 4

I spent hours trying to update the CumulativeValue field untill the
BreakPoint value is crossed, and restarting the cumulative counter, but I
have too little sql knowledge to do this.

Could anyone help me?

thanks

gert
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

Re: [sqlite] Break on cumulative sum

21 matches

Site Navigation

Mail list logo

Footer information