Re: [sqlite] Break on cumulative sum
On Mar 5, 2013, at 9:24 AM, Nico Williams wrote: > +1 re: recursive queries. There is a standard for that (in ANSI SQL-99?): recursive 'with' clause, aka recursive subquery factoring, aka recursive common table expressions. http://www.postgresql.org/docs/9.2/static/queries-with.html This kills two birds with one stone: named, reusable subqueries *and* recursion. A double win. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Mar 5, 2013, at 8:19 AM, James K. Lowden wrote: >> Postgresql has also had them for a while. > > Yes, and if I'm not mistaken Ingres's QUEL had them long before that. Yes, many databases have them, from Postgres, MSSQL, Sybase IQ, some flavor of DB2, to Oracle, etc, ... > (I see Microsoft calls some of them e.g.. PERCENTILE_RANK > "analytical". Why? "analysis", after all, means "take apart". They > might be used for analysis, but window functions are every bit as > synthetic as normal aggregate functions.) Oracle call them "analytic functions": SQL for Analysis and Reporting http://docs.oracle.com/cd/E14072_01/server.112/e10810/analysis.htm Analytic functions really brings SQL to a whole new level of functionality and usefulness. Once one goes, errr, analytic, one never goes back. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
One of the reasons why we have to use PostgreSQL and/or Oracle is for the 'Window Functions' (AKA: Analytic Functions). We use SQLite all the time and love it but this one missing feature bites us. Could we get some advice from the SQLite core team on the feasibility of this? Could it be just an extension or would it have to be more than that? Joe On Mar 4, 2013, at 1:32 AM, James K. Lowden wrote: > What do you have in mind? I've benn adding some user defined functions > and am thinking of creating a repository for them. All the so-called window functions from SQL:2003 (aka analytic functions): "Windowed Tables and Window Functions in SQL" http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Tue, 5 Mar 2013 02:24:53 -0600 Nico Williams wrote: > Lies! :) There's one more type of UDF in SQLite3: virtual tables. > You could have a virtual table where selecting with an equality test > for some column is "passing an argument" to a "table function". I > know, I've done this. But it gets better! SQLite3 is re-entrant, so > you can actually format a SQL statement in the virtual table given its > "arguments" and execute it (being careful to not create a SQL > injection vulnerability). IIRC FTS works this way. Hmm, OK. I think you're right; this could work, even without parsing any SQL. I'm imagining a virtual table like CREATE VIRTUAL TABLE output USING tfn( 'create table output ...', 'select ...' ); where the function executes the first argument to create a table, solely to use the metadata to determine the names and types of its own columns. (That saves reinventing a column-defintion language.) Then it executes the SQL in the second argument, the output of which become the rows in the table. It could even be executed recursively (3rd argument). > With a tiny bit of syntactic sugar we could have a way to define table > functions given simple scalar functions, which would be very nice > indeed. I don't know if that's the route; perhaps the good doctor will offer his views. I think of a table function not as a disaggregator -- ick! -- but as a parameterized view. Or, really, any function whose output can be expressed as a relation using any combination of C and SQL. Most "real" DBMSs have some form of stored procedure that's just a named blob of SQL. Table functions could take the place of stored procedures in SQLite in a more disciplined way. That would indeed be very nice. > +1 re: recursive queries. Once or twice I've resorted to a UNION ALL > of LEFT OUTER self-JOINs, each sub-query having more and more > self-joins -- this limits recursion depth effectively, but there's > going to be a limit anyways. I wrote a UDF that executes recursively and returns a string indicating the number of recursions and total number of rows. You pass in the query and target table (because we don't have table functions!) and then select from it. That's not as bad as it sounds. Consider for example a table "tree" representing a filesystem hierarchy: create temporary table t ( p int -- parent , c int -- child , primary key (c,p) ); insert into t select * from tree where inode in ( select p from t -- "recursion" UNION select inode from inodes -- seed where filename = 'usr' ) and not exists (-- limit select 1 from t where p = pnode and c = inode ) can be invoked as $ sqlite3 -list ../inodes.db "create temporary table t (p int, c int, primary key (c,p)); select recurse('insert into t select tree.* from tree where inode in (select p from t UNION select inode from inodes where filename = ''usr'') and not exists (select 1 from t where p = pnode and c = inode)'); \ select catname(parent_name, filename) \ from inodes where inode in \ (select c from t) \ order by catname(parent_name, filename);" 17 rows returned in 10 iterations catname(parent_name, filename) /usr/src/external /usr/src/external/gpl3 /usr/src/external/gpl3/gcc /usr/src/external/gpl3/gcc/dist /usr/src/external/gpl3/gcc/dist/gcc /usr/src/external/gpl3/gcc/dist/gcc/testsuite /usr/src/external/gpl3/gcc/dist/gcc/testsuite/gcc.dg /usr/src/external/gpl3/gcc/dist/gcc/testsuite/gcc.dg/cpp /usr/src/external/gpl3/gcc/dist/gcc/testsuite/gcc.dg/cpp/usr /usr/src/gnu /usr/src/gnu/dist /usr/src/gnu/dist/gcc4 /usr/src/gnu/dist/gcc4/gcc /usr/src/gnu/dist/gcc4/gcc/testsuite /usr/src/gnu/dist/gcc4/gcc/testsuite/gcc.dg /usr/src/gnu/dist/gcc4/gcc/testsuite/gcc.dg/cpp /usr/src/gnu/dist/gcc4/gcc/testsuite/gcc.dg/cpp/usr Unlike a bunch of self-joins, this produces a correct result regardless of depth; the recurse() function stops when an iteration produces zero rows. But it bears the overhead of inserting into another table and re-executing the query (and retesting the target) on every iteration. Moving the recursion inside the query engine would avoid all that. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Tue, Mar 5, 2013 at 9:47 AM, Nico Williams wrote: > Well, under the hood table functions could use temp, gensym'ed virtual > tables for all I care. The point is I want table functions :) SQLite already has table functions of sort: PRAGMAs... They have their own issues (not real tables, so cannot be used in joins, AFAIK), and of course you cannot add your own PRAGMA in client code using the SQLite3 C API. I'm hoping there's no big leap between current PRAGMAs and real table functions, with SQL grammar changes to be able to <<>>, and that Dr. Hipp agrees table functions would be a valuable addition of course :). --DD ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Tue, Mar 5, 2013 at 2:44 AM, Dominique Devienne wrote: > On Tue, Mar 5, 2013 at 9:24 AM, Nico Williams wrote: > Right. Virtual Tables are very flexible, but the syntax is indeed not > practical, and it also forces you to name and "instantiate" tables, when > often you want to use table functions on the fly in a JOIN. I agree with everything you said. > VTables just isn't the right fit for table functions IMHO. My $0.02. --DD Well, under the hood table functions could use temp, gensym'ed virtual tables for all I care. The point is I want table functions :) > PS: Table functions are basically the "reverse" of aggregate functions to > me. They take scalar inputs, and generate rows, while aggregate functions > take rows and generate scalar(s).. Indeed! Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Tue, Mar 5, 2013 at 9:24 AM, Nico Williams wrote: > On Tue, Mar 5, 2013 at 1:19 AM, James K. Lowden > > 2. Aggregation. The engine passes the rows to be aggregated to the > > function. After the last row, it calls the function's "xFinal" > > function, which returns a value. Example: median(). > > Lies! :) There's one more type of UDF in SQLite3: virtual tables. > You could have a virtual table where selecting with an equality test > for some column is "passing an argument" to a "table function". I > know, I've done this. But it gets better! SQLite3 is re-entrant, so > you can actually format a SQL statement in the virtual table given its > "arguments" and execute it (being careful to not create a SQL > injection vulnerability). IIRC FTS works this way. > > With a tiny bit of syntactic sugar we could have a way to define table > functions given simple scalar functions, which would be very nice > indeed. Right. Virtual Tables are very flexible, but the syntax is indeed not practical, and it also forces you to name and "instantiate" tables, when often you want to use table functions on the fly in a JOIN. Tables functions are very useful to "re-tabulate" de-normalized data stuffed into a CSV or JSON or XML text column for example. Once you have table functions, windowing analytics probably becomes easier as well to express (although not necessarily efficient). VTables just isn't the right fit for table functions IMHO. My $0.02. --DD http://www.oracle-base.com/articles/misc/pipelined-table-functions.php PS: Table functions are basically the "reverse" of aggregate functions to me. They take scalar inputs, and generate rows, while aggregate functions take rows and generate scalar(s).. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Tue, Mar 5, 2013 at 1:19 AM, James K. Lowden wrote: > Right you are, except it's not a question of efficient but of > possible. A UDF works one of two ways: > > 1. Per row. The engine calls the function for each row. The function > transforms the input into the output. Examples are sqrt() and pow(). > > 2. Aggregation. The engine passes the rows to be aggregated to the > function. After the last row, it calls the function's "xFinal" > function, which returns a value. Example: median(). Lies! :) There's one more type of UDF in SQLite3: virtual tables. You could have a virtual table where selecting with an equality test for some column is "passing an argument" to a "table function". I know, I've done this. But it gets better! SQLite3 is re-entrant, so you can actually format a SQL statement in the virtual table given its "arguments" and execute it (being careful to not create a SQL injection vulnerability). IIRC FTS works this way. With a tiny bit of syntactic sugar we could have a way to define table functions given simple scalar functions, which would be very nice indeed. >> The engine would need to become window-aware, which is >> probably enough work to delay its introduction until a consortium >> member decides they need it. > > I would say the same. It changes the grammar and almost certainly > intrudes on the GROUP BY implementation. AFAIK it's not on the > sqlite4 agenda. I've often wondered if there's interest in a heavier-weight fork of SQLite*. Clearly there isn't: it'd have been done already. Without support from the consortium (and a fork wouldn't get that support) it couldn't possibly hold a candle to SQLite in terms of robustness. > Adding nonstandard equivalent functionality by extending aggregate > functions might be easier. But the inputs to the OVER clause -- > partition, range, order -- would still be required. For aggregate functions like group_concat() an ORDER clause to apply to its inputs would be wonderful. > I'm surprised there's much interest in using SQLite for fancy > grouping. My hobbyhorse is recursive queries, which IMO are much more > of a nuisance to carry out. +1 re: recursive queries. Once or twice I've resorted to a UNION ALL of LEFT OUTER self-JOINs, each sub-query having more and more self-joins -- this limits recursion depth effectively, but there's going to be a limit anyways. I've also use recursive triggers to good effect (though triggers slow things down plenty). Nico -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Mon, 04 Mar 2013 16:51:35 -0500 Ryan Johnson wrote: > On 04/03/2013 2:20 PM, Petite Abeille wrote: > > On Mar 4, 2013, at 1:32 AM, James K. Lowden > > wrote: > > > >> What do you have in mind? I've benn adding some user defined > >> functions and am thinking of creating a repository for them. > > All the so-called window functions from SQL:2003 (aka analytic > > functions): > > > > "Windowed Tables and Window Functions in SQL" > > http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf > > Postgresql has also had them for a while. Yes, and if I'm not mistaken Ingres's QUEL had them long before that. (I see Microsoft calls some of them e.g.. PERCENTILE_RANK "analytical". Why? "analysis", after all, means "take apart". They might be used for analysis, but window functions are every bit as synthetic as normal aggregate functions.) > I'd love to see sqlite3 gain them, but I don't think they can be > added efficiently as custom functions. Right you are, except it's not a question of efficient but of possible. A UDF works one of two ways: 1. Per row. The engine calls the function for each row. The function transforms the input into the output. Examples are sqrt() and pow(). 2. Aggregation. The engine passes the rows to be aggregated to the function. After the last row, it calls the function's "xFinal" function, which returns a value. Example: median(). In the per-row case, the function must produce its answer based on the (single) value provided. Ranking, for example, would be impossible because the "rows ahead" haven't been seen yet. In the case of aggregation, the calls to xFinal are determined by the GROUP BY clause. > The engine would need to become window-aware, which is > probably enough work to delay its introduction until a consortium > member decides they need it. I would say the same. It changes the grammar and almost certainly intrudes on the GROUP BY implementation. AFAIK it's not on the sqlite4 agenda. Adding nonstandard equivalent functionality by extending aggregate functions might be easier. But the inputs to the OVER clause -- partition, range, order -- would still be required. I'm surprised there's much interest in using SQLite for fancy grouping. My hobbyhorse is recursive queries, which IMO are much more of a nuisance to carry out. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On 04/03/2013 2:20 PM, Petite Abeille wrote: On Mar 4, 2013, at 1:32 AM, James K. Lowden wrote: What do you have in mind? I've benn adding some user defined functions and am thinking of creating a repository for them. All the so-called window functions from SQL:2003 (aka analytic functions): "Windowed Tables and Window Functions in SQL" http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf Postgresql has also had them for a while. I'd love to see sqlite3 gain them, but I don't think they can be added efficiently as custom functions. The engine would need to become window-aware, which is probably enough work to delay its introduction until a consortium member decides they need it. That said, the VDBE architecture should make it significantly easier to implement inside sqlite3 than for most other engines. Ryan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Mar 4, 2013, at 1:32 AM, James K. Lowden wrote: > What do you have in mind? I've benn adding some user defined functions > and am thinking of creating a repository for them. All the so-called window functions from SQL:2003 (aka analytic functions): "Windowed Tables and Window Functions in SQL" http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/courses/SS2008/NEDM/RDDM.Chapter.06.Windows_and_Query_Functions_in_SQL.pdf ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Sun, 3 Mar 2013 07:19:11 -0600 "Michael Black" wrote: > One thing I'll point out...the trigger solution is almost linear > performance and constant memory where the single SQL statement will > suffer notably as time goes on in both those measures. Also the > single SQL statement doesn't seem like a good idea if the data is > coming in real time. He never mentioned how big his table is going to > be. If TextField 'A' is deleted, doesn't the entire table have to be updated? Sure, that's O(n), but where n is the size of the table, not the size of the update. Acknowledged, sometimes there are reasons to store derived results. But that choice bears costs, both in updating the database and in engineering it, costs that can be avoided if they can be avoided. Knuth said premature optimization is the root of all evil, and I agree. :-) Normalized data minimize the cost of manipulation. That is why we normalize our databases. As a side-effect, normalized data permit more efficient use of memory, benefiting SELECT, too. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Sun, 3 Mar 2013 18:50:35 +0100 Petite Abeille wrote: > Ah? if only? SQLite had analytical functions? oh, well.. What do you have in mind? I've benn adding some user defined functions and am thinking of creating a repository for them. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Mar 3, 2013, at 2:10 PM, "James K. Lowden" wrote: > There's some cruft, too. I was only demonstrating that it could be > done. If you find a way to simplify it, you'll know you understand > it. Ah… if only… SQLite had analytical functions… oh, well.. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
One thing I'll point out...the trigger solution is almost linear performance and constant memory where the single SQL statement will suffer notably as time goes on in both those measures. Also the single SQL statement doesn't seem like a good idea if the data is coming in real time. He never mentioned how big his table is going to be. -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of James K. Lowden Sent: Sunday, March 03, 2013 7:10 AM To: sqlite-users@sqlite.org Subject: Re: [sqlite] Break on cumulative sum On Sun, 3 Mar 2013 11:46:26 +0100 Gert Van Assche wrote: > My SQL knowledge is not yet good enough to understand what you > proposed, but I hope one day that I will be able to understand and > replicate myself the code you wrote. I started of with a join on the > table itself as well, but I got stuck. I'll study your code and I'll > grow. Happy to be service, Gert. The trick is always the same: get your ducks in a row and shoot. Figure out the formula that computes the answer you want, and put the constituent parts on the row, then compute. The first order of business in your case was a running sum. That's done by adding up everything "before" the current row, where "before" means "less than" something, however ordered. select a.*, sum(b.ValueField) as total from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint For each row in "a", get all the rows in "b" whose TextField is less than (or equal to, here) to a's. Here it is, live and in color: $ sqlite3 -echo db < self.sql select a.TextField , a.ValueField , a.BreakPoint , sum(b.ValueField) as total from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint ; TextField ValueField BreakPoint total -- -- -- -- A 2 10 2 B 3 10 5 C 2 10 7 D 4 10 11 E 5 10 16 F 1 10 17 G 1 10 18 H 5 10 23 I 11 10 34 J 8 10 42 K 2 10 44 That formed my building block. Your formula is almost GN = 1 + total / Breakpoint By using integer arithmetic, I could divide "total" by BreakPoint to get a multiple, which is almost what you call a GroupName, except for that "one past" rule I remarked on, where for example 11 is part of the first group. But by joining my block to itself, offset by one, we can slide the multiplier down a row. For CumulativeValue, your formula is CV = total - [max total of previous GN] That requires computing the GN, grouping on it, and finding the maximum total for it, and joining that result back to the main one, where it can be subtracted from the main total. So, yes, there are a lot of joins. But I didn't write the query top-down, and you shouldn't read it that way. Read it inside-out. Each piece can be separately evaluated, and each operand used in the calculation can be exposed in the SELECT clause. There's some cruft, too. I was only demonstrating that it could be done. If you find a way to simplify it, you'll know you understand it. Regards, --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
On Sun, 3 Mar 2013 11:46:26 +0100 Gert Van Assche wrote: > My SQL knowledge is not yet good enough to understand what you > proposed, but I hope one day that I will be able to understand and > replicate myself the code you wrote. I started of with a join on the > table itself as well, but I got stuck. I'll study your code and I'll > grow. Happy to be service, Gert. The trick is always the same: get your ducks in a row and shoot. Figure out the formula that computes the answer you want, and put the constituent parts on the row, then compute. The first order of business in your case was a running sum. That's done by adding up everything "before" the current row, where "before" means "less than" something, however ordered. select a.*, sum(b.ValueField) as total from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint For each row in "a", get all the rows in "b" whose TextField is less than (or equal to, here) to a's. Here it is, live and in color: $ sqlite3 -echo db < self.sql select a.TextField , a.ValueField , a.BreakPoint , sum(b.ValueField) as total from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint ; TextField ValueField BreakPoint total -- -- -- -- A 2 10 2 B 3 10 5 C 2 10 7 D 4 10 11 E 5 10 16 F 1 10 17 G 1 10 18 H 5 10 23 I 11 10 34 J 8 10 42 K 2 10 44 That formed my building block. Your formula is almost GN = 1 + total / Breakpoint By using integer arithmetic, I could divide "total" by BreakPoint to get a multiple, which is almost what you call a GroupName, except for that "one past" rule I remarked on, where for example 11 is part of the first group. But by joining my block to itself, offset by one, we can slide the multiplier down a row. For CumulativeValue, your formula is CV = total - [max total of previous GN] That requires computing the GN, grouping on it, and finding the maximum total for it, and joining that result back to the main one, where it can be subtracted from the main total. So, yes, there are a lot of joins. But I didn't write the query top-down, and you shouldn't read it that way. Read it inside-out. Each piece can be separately evaluated, and each operand used in the calculation can be exposed in the SELECT clause. There's some cruft, too. I was only demonstrating that it could be done. If you find a way to simplify it, you'll know you understand it. Regards, --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
James, I'm very impressed by your solution. My SQL knowledge is not yet good enough to understand what you proposed, but I hope one day that I will be able to understand and replicate myself the code you wrote. I started of with a join on the table itself as well, but I got stuck. I'll study your code and I'll grow. That is the beauty of groups like this: hobby developers like me get advice from experts on problems we're facing. Very helpful, not just the code, but also the on-problem training! Thanks for your work, all! gert 2013/3/3 James K. Lowden > On Sat, 2 Mar 2013 14:46:40 +0100 > Gert Van Assche wrote: > > > All, I don't know how to achieve this: I need to put the cumulative > > sum in a field, and create a group as soon as that cumulative sum is > > over a breakpoint value (10). > > This was a bit of a challenge because the group definition is IMO a > little strange: The first value *after* the breakpoint is included in > the group. That made the SQL more convoluted than it would be > otherwise. > > That said, the attached SQL does the job in one statement. No triggers > are needed. > > $ sqlite3 db < running_sum.sql > TextField ValueField CumulativeValue BreakPoint GroupName > -- -- --- -- -- > A 2 210 1 > B 3 510 1 > C 2 710 1 > D 4 11 10 1 > E 5 510 2 > F 1 610 2 > G 1 710 2 > H 5 12 10 2 > I 11 11 10 3 > J 8 810 4 > K 2 10 10 4 > > This lets you define the results in terms of the base data instead of > keeping a table of derived values. No updates required, just a view. > Unless you have data-warehouse rowcounts or a very limited machine, it > should perform just fine, despite what one might guess, because the I/O > is limited to one table. It should also outperform any per-row > solution. > > Two bits of advice if I may. > > Be careful of using triggers to maintain data consistency. Triggers > are very good for enforcing referential integrity in situations that > DRI can't handle. Beyond that they get very complex and error prone. > > Using triggers to compute derived values signals redundancy in the > database. That redundancy often is not as desirable as the > denormalize-for-performance crowd thinks. Better to compute derived > values on demand (as in a view). If that's demonstrated to be too > slow, a periodic update in a cron job or similar should provide better > thoughput than recomputing on every insert/update/delete. You'll also > do yourself a favor by segregating derived values in a different table > that can be regenerated at will. > > My other suggestion is to eschew abstract nouns in column names. Words > like data, field, and value add no meaning. After all, it's a > database. If it's not a value, what is it? > > This table, > > CREATE TABLE [Test] > ( Name > , Value > , RunningSum > , Breakpoint /* (one word, small 'p') */ DEFAULT 10 > , GroupName > , primary key (Name) > ); > > is at least as clear, wouldn't you say? > > HTH. > > --jkl > > P.S. In case the listserv strips attachments, here it is in plain > text. > > [SQL] > select A.TextField, A.ValueField >, A.total - coalesce(C.GroupTotal, 0) as CumulativeValue >, A.Breakpoint > -- , coalesce(C.GroupTotal, 0) as GroupTotal >, coalesce(B.GroupName, A.GroupName) as GroupName > from ( > select a.*, sum(b.ValueField) as total >, 1 + sum(b.ValueField) / (1+a.Breakpoint) as GroupName > from Test as a join Test as b > on a.TextField >= b.TextField > group by a.TextField > , a.ValueField > , a.CumulativeValue > , a.BreakPoint > ) as A > left join ( > select a.*, sum(b.ValueField) as total >, 1 + sum(b.ValueField) / (1 + a.Breakpoint) as GroupName > from Test as a join Test as b > on a.TextField >= b.TextField > group by a.TextField > , a.ValueField > , a.CumulativeValue > , a.BreakPoint > ) as B > on B.TextField = (select max(TextField) from Test where TextField < > A.TextField) left join ( > select1 + T.GroupName as GroupName > , max(T.total) as GroupTotal > from ( > selectA.TextField > , A.ValueField > , A.total > , coalesce(B.GroupName, 1) as GroupName > from (
Re: [sqlite] Break on cumulative sum
On Sat, 2 Mar 2013 14:46:40 +0100 Gert Van Assche wrote: > All, I don't know how to achieve this: I need to put the cumulative > sum in a field, and create a group as soon as that cumulative sum is > over a breakpoint value (10). This was a bit of a challenge because the group definition is IMO a little strange: The first value *after* the breakpoint is included in the group. That made the SQL more convoluted than it would be otherwise. That said, the attached SQL does the job in one statement. No triggers are needed. $ sqlite3 db < running_sum.sql TextField ValueField CumulativeValue BreakPoint GroupName -- -- --- -- -- A 2 210 1 B 3 510 1 C 2 710 1 D 4 11 10 1 E 5 510 2 F 1 610 2 G 1 710 2 H 5 12 10 2 I 11 11 10 3 J 8 810 4 K 2 10 10 4 This lets you define the results in terms of the base data instead of keeping a table of derived values. No updates required, just a view. Unless you have data-warehouse rowcounts or a very limited machine, it should perform just fine, despite what one might guess, because the I/O is limited to one table. It should also outperform any per-row solution. Two bits of advice if I may. Be careful of using triggers to maintain data consistency. Triggers are very good for enforcing referential integrity in situations that DRI can't handle. Beyond that they get very complex and error prone. Using triggers to compute derived values signals redundancy in the database. That redundancy often is not as desirable as the denormalize-for-performance crowd thinks. Better to compute derived values on demand (as in a view). If that's demonstrated to be too slow, a periodic update in a cron job or similar should provide better thoughput than recomputing on every insert/update/delete. You'll also do yourself a favor by segregating derived values in a different table that can be regenerated at will. My other suggestion is to eschew abstract nouns in column names. Words like data, field, and value add no meaning. After all, it's a database. If it's not a value, what is it? This table, CREATE TABLE [Test] ( Name , Value , RunningSum , Breakpoint /* (one word, small 'p') */ DEFAULT 10 , GroupName , primary key (Name) ); is at least as clear, wouldn't you say? HTH. --jkl P.S. In case the listserv strips attachments, here it is in plain text. [SQL] select A.TextField, A.ValueField , A.total - coalesce(C.GroupTotal, 0) as CumulativeValue , A.Breakpoint -- , coalesce(C.GroupTotal, 0) as GroupTotal , coalesce(B.GroupName, A.GroupName) as GroupName from ( select a.*, sum(b.ValueField) as total , 1 + sum(b.ValueField) / (1+a.Breakpoint) as GroupName from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint ) as A left join ( select a.*, sum(b.ValueField) as total , 1 + sum(b.ValueField) / (1 + a.Breakpoint) as GroupName from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint ) as B on B.TextField = (select max(TextField) from Test where TextField < A.TextField) left join ( select1 + T.GroupName as GroupName , max(T.total) as GroupTotal from ( selectA.TextField , A.ValueField , A.total , coalesce(B.GroupName, 1) as GroupName from ( select a.*, sum(b.ValueField) as total , 1 + sum(b.ValueField) / (1 + a.Breakpoint) as GroupName from Test as a join Test as b on a.TextField >= b.TextField group by a.TextField , a.ValueField , a.CumulativeValue , a.BreakPoint ) as A left join ( select a.*, sum(b.ValueField) as total , 1 + sum(b.ValueField) / (1 + a.Breakpoint) as GroupN
Re: [sqlite] Break on cumulative sum
And I just noticed that should probably be when new.BreakPoint <= -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Gert Van Assche Sent: Saturday, March 02, 2013 10:49 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Break on cumulative sum Michael, this is working perfectly! I learned a lot with your code. Thanks a lot for your help. gert 2013/3/2 Michael Black > I think your "K" row was a typo on the CumulativeValue? > > CREATE TABLE [Test] (TextField, ValueField, CumulativeValue, BreakPoint > DEFAULT 10, GroupName); > CREATE TABLE [MyGroup](GroupName); > insert into [MyGroup] values(1); > create trigger trig1 before insert on [Test] > when 10 <= (select sum(ValueField) from [Test] where GroupName=(select > GroupName from [MyGroup])) > begin > update [MyGroup] set GroupName = (select GroupName+1 from > [MyGroup]); > end; > create trigger trig2 after insert on [Test] > begin > update [Test] set GroupName = (select GroupName from [MyGroup]) > where > rowid=new.rowid; > update [Test] set CumulativeValue = (select sum(ValueField) from > [Test] where GroupName=(select GroupName from [MyGroup])) where > rowid=new.rowid; > end; > insert into [Test] values('A', '2', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('B', '3', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('C', '2', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('D', '4', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('E', '5', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('F', '1', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('G', '1', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('H', '5', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('I', '11', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('J', '8', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('K', '2', null, '10',(select GroupName from > MyGroup)); > select * from [Test]; > A|2|2|10|1 > B|3|5|10|1 > C|2|7|10|1 > D|4|11|10|1 > E|5|5|10|2 > F|1|6|10|2 > G|1|7|10|2 > H|5|12|10|2 > I|11|11|10|3 > J|8|8|10|4 > K|2|10|10|4 > > > -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto: > sqlite-users-boun...@sqlite.org] On Behalf Of Gert Van Assche > Sent: Saturday, March 02, 2013 7:47 AM > To: sqlite-users > Subject: [sqlite] Break on cumulative sum > > All, I don't know how to achieve this: I need to put the cumulative sum in > a field, and create a group as soon as that cumulative sum is over a > breakpoint value (10). > > This is an example table: > > CREATE TABLE [Test] (TextField, ValueField, CumulativeValue, BreakPoint > DEFAULT 10, GroupName); > insert into [Test] values('A', '2', null, '10'); > insert into [Test] values('B', '3', null, '10'); > insert into [Test] values('C', '2', null, '10'); > insert into [Test] values('D', '4', null, '10'); > insert into [Test] values('E', '5', null, '10'); > insert into [Test] values('F', '1', null, '10'); > insert into [Test] values('G', '1', null, '10'); > insert into [Test] values('H', '5', null, '10'); > insert into [Test] values('I', '11', null, '10'); > insert into [Test] values('J', '8', null, '10'); > insert into [Test] values('K', '2', null, '10'); > > I'd like to end up with a table that looks like this: > > TextField ValueField CumulativeValue BreakPoint GroupName > A 2 2 10 1 > B 3 5 10 1 > C 2 7 10 1 > D 4 11 10 1 > E 5 5 10 2 > F 1 6 10 2 > G 1 7 10 2 > H 5 12 10 2 > I 11 11 10 3 > J 8 8 10 4 > K 2 2 10 4 > > I spent hours trying to update the CumulativeValue field untill the > BreakPoint value is crossed, and restarting the cumulative counter, but I > have too little sql knowledge to do this. > > Could anyone help me? > > thanks > > gert > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
Michael, this is working perfectly! I learned a lot with your code. Thanks a lot for your help. gert 2013/3/2 Michael Black > I think your "K" row was a typo on the CumulativeValue? > > CREATE TABLE [Test] (TextField, ValueField, CumulativeValue, BreakPoint > DEFAULT 10, GroupName); > CREATE TABLE [MyGroup](GroupName); > insert into [MyGroup] values(1); > create trigger trig1 before insert on [Test] > when 10 <= (select sum(ValueField) from [Test] where GroupName=(select > GroupName from [MyGroup])) > begin > update [MyGroup] set GroupName = (select GroupName+1 from > [MyGroup]); > end; > create trigger trig2 after insert on [Test] > begin > update [Test] set GroupName = (select GroupName from [MyGroup]) > where > rowid=new.rowid; > update [Test] set CumulativeValue = (select sum(ValueField) from > [Test] where GroupName=(select GroupName from [MyGroup])) where > rowid=new.rowid; > end; > insert into [Test] values('A', '2', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('B', '3', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('C', '2', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('D', '4', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('E', '5', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('F', '1', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('G', '1', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('H', '5', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('I', '11', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('J', '8', null, '10',(select GroupName from > MyGroup)); > insert into [Test] values('K', '2', null, '10',(select GroupName from > MyGroup)); > select * from [Test]; > A|2|2|10|1 > B|3|5|10|1 > C|2|7|10|1 > D|4|11|10|1 > E|5|5|10|2 > F|1|6|10|2 > G|1|7|10|2 > H|5|12|10|2 > I|11|11|10|3 > J|8|8|10|4 > K|2|10|10|4 > > > -Original Message- > From: sqlite-users-boun...@sqlite.org [mailto: > sqlite-users-boun...@sqlite.org] On Behalf Of Gert Van Assche > Sent: Saturday, March 02, 2013 7:47 AM > To: sqlite-users > Subject: [sqlite] Break on cumulative sum > > All, I don't know how to achieve this: I need to put the cumulative sum in > a field, and create a group as soon as that cumulative sum is over a > breakpoint value (10). > > This is an example table: > > CREATE TABLE [Test] (TextField, ValueField, CumulativeValue, BreakPoint > DEFAULT 10, GroupName); > insert into [Test] values('A', '2', null, '10'); > insert into [Test] values('B', '3', null, '10'); > insert into [Test] values('C', '2', null, '10'); > insert into [Test] values('D', '4', null, '10'); > insert into [Test] values('E', '5', null, '10'); > insert into [Test] values('F', '1', null, '10'); > insert into [Test] values('G', '1', null, '10'); > insert into [Test] values('H', '5', null, '10'); > insert into [Test] values('I', '11', null, '10'); > insert into [Test] values('J', '8', null, '10'); > insert into [Test] values('K', '2', null, '10'); > > I'd like to end up with a table that looks like this: > > TextField ValueField CumulativeValue BreakPoint GroupName > A 2 2 10 1 > B 3 5 10 1 > C 2 7 10 1 > D 4 11 10 1 > E 5 5 10 2 > F 1 6 10 2 > G 1 7 10 2 > H 5 12 10 2 > I 11 11 10 3 > J 8 8 10 4 > K 2 2 10 4 > > I spent hours trying to update the CumulativeValue field untill the > BreakPoint value is crossed, and restarting the cumulative counter, but I > have too little sql knowledge to do this. > > Could anyone help me? > > thanks > > gert > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Break on cumulative sum
I think your "K" row was a typo on the CumulativeValue? CREATE TABLE [Test] (TextField, ValueField, CumulativeValue, BreakPoint DEFAULT 10, GroupName); CREATE TABLE [MyGroup](GroupName); insert into [MyGroup] values(1); create trigger trig1 before insert on [Test] when 10 <= (select sum(ValueField) from [Test] where GroupName=(select GroupName from [MyGroup])) begin update [MyGroup] set GroupName = (select GroupName+1 from [MyGroup]); end; create trigger trig2 after insert on [Test] begin update [Test] set GroupName = (select GroupName from [MyGroup]) where rowid=new.rowid; update [Test] set CumulativeValue = (select sum(ValueField) from [Test] where GroupName=(select GroupName from [MyGroup])) where rowid=new.rowid; end; insert into [Test] values('A', '2', null, '10',(select GroupName from MyGroup)); insert into [Test] values('B', '3', null, '10',(select GroupName from MyGroup)); insert into [Test] values('C', '2', null, '10',(select GroupName from MyGroup)); insert into [Test] values('D', '4', null, '10',(select GroupName from MyGroup)); insert into [Test] values('E', '5', null, '10',(select GroupName from MyGroup)); insert into [Test] values('F', '1', null, '10',(select GroupName from MyGroup)); insert into [Test] values('G', '1', null, '10',(select GroupName from MyGroup)); insert into [Test] values('H', '5', null, '10',(select GroupName from MyGroup)); insert into [Test] values('I', '11', null, '10',(select GroupName from MyGroup)); insert into [Test] values('J', '8', null, '10',(select GroupName from MyGroup)); insert into [Test] values('K', '2', null, '10',(select GroupName from MyGroup)); select * from [Test]; A|2|2|10|1 B|3|5|10|1 C|2|7|10|1 D|4|11|10|1 E|5|5|10|2 F|1|6|10|2 G|1|7|10|2 H|5|12|10|2 I|11|11|10|3 J|8|8|10|4 K|2|10|10|4 -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Gert Van Assche Sent: Saturday, March 02, 2013 7:47 AM To: sqlite-users Subject: [sqlite] Break on cumulative sum All, I don't know how to achieve this: I need to put the cumulative sum in a field, and create a group as soon as that cumulative sum is over a breakpoint value (10). This is an example table: CREATE TABLE [Test] (TextField, ValueField, CumulativeValue, BreakPoint DEFAULT 10, GroupName); insert into [Test] values('A', '2', null, '10'); insert into [Test] values('B', '3', null, '10'); insert into [Test] values('C', '2', null, '10'); insert into [Test] values('D', '4', null, '10'); insert into [Test] values('E', '5', null, '10'); insert into [Test] values('F', '1', null, '10'); insert into [Test] values('G', '1', null, '10'); insert into [Test] values('H', '5', null, '10'); insert into [Test] values('I', '11', null, '10'); insert into [Test] values('J', '8', null, '10'); insert into [Test] values('K', '2', null, '10'); I'd like to end up with a table that looks like this: TextField ValueField CumulativeValue BreakPoint GroupName A 2 2 10 1 B 3 5 10 1 C 2 7 10 1 D 4 11 10 1 E 5 5 10 2 F 1 6 10 2 G 1 7 10 2 H 5 12 10 2 I 11 11 10 3 J 8 8 10 4 K 2 2 10 4 I spent hours trying to update the CumulativeValue field untill the BreakPoint value is crossed, and restarting the cumulative counter, but I have too little sql knowledge to do this. Could anyone help me? thanks gert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users