Re: [sqlite] Window functions?
On 2014-08-26 18:00, sqlite-users-requ...@sqlite.org wrote: SELECT employee_name, employee_id, salary, rank() OVER(PARTITION BY dept ORDER BY salary DESC), 100.0*salary/sum(salary) OVER (PARTITION BY dept) FROM employee; I don't know if the above is valid SQL or not. But is seems like something somebody might like to do. And it also seems hard to implement. Yes it's valid. But this one actually is not that difficult to implement. Basically read rows must be ordered by dept, salary DESC. Then: - rank() - it only needs to know if dept or salary is different than in previous row. that's all. - sum() - probably the easiest way to implement it is: -- first pass: calculate aggregate and write result to a temporary table -- second pass: simply read calculated aggregate and append it to the result Below are explains of both functions implemented in sqlite 3.3.8 based database. However it doesn't mean that all window functions are easy to implement. For example I have no idea how to implement efficiently moving aggregates (BETWEEN x PRECEDING AND y FOLLOWING) - I think those have to recalculated for every row separately making whole query really slow (sum() can be optimized, but most likely most aggregates can't). There are many more problems. Most databases still haven't implemented everything from standard. Explains: - table definition: create table employee(salary, dept) - table is already sorted by dept, salary DESC (so no sorting/index is visible in explains) - no window specification defaults to ROWS UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (behavior not specified in a standard) - especially for rank() there is very little code, but it should be even shorter (for example opcodes 24, 25, 26 are useless) SELECT rank() OVER(PARTITION BY dept ORDER BY salary DESC) FROM employee: addr | opcode| p1 | p2 | p3 | ++---+-++-+-- 0 | Noop | 0 | 0 | | 1 | MemNull | 2 | 0 | | 2 | Goto | 0 | 32 | | 3 | Integer | 2 | 0 | | 4 | OpenRead | 0 | 2 | | 5 | SetNumColumns | 0 | 2 | | 6 | MemInt| 0 | 1 | | 7 | Rewind| 0 | 30 | | 8 | MemLoad | 5 | 0 | | 9 | Column| 0 | 1 | | 10 | Ne| 512 | 12 | collseq(BINARY) | 11 | Goto | 0 | 15 | | 12 | MemNull | 2 | 0 | | 13 | Column| 0 | 1 | | 14 | MemStore | 5 | 1 | | 15 | MemLoad | 6 | 0 | | 16 | Column| 0 | 0 | | 17 | Ne| 512 | 20 | collseq(BINARY) | 18 | MemInt| 0 | 4 | | 19 | Goto | 0 | 23 | | 20 | MemInt| 1 | 4 | | 21 | Column| 0 | 0 | | 22 | MemStore | 6 | 1 | | 23 | WindowStep| 2 | 0 | rank(0) | 24 | MemStore | 3 | 0 | | 25 | Pop | 1 | 0 | | 26 | MemLoad | 3 | 0 | | 27 | Callback | 1 | 0 | | 28 | MemIncr | 1 | 1 | | 29 | Next | 0 | 8 | | 30 | Close | 0 | 0 | | 31 | Halt | 0 | 0 | | 32 | Transaction | 2 | 0 | | 33 | VerifyCookie | 2 | 1 | | 34 | Goto | 0 | 3 | | SELECT sum(salary) OVER (PARTITION BY dept) FROM employee: addr | opcode| p1 | p2 | p3 | ++---+-++-+-- 0 | Noop | 0 | 0 | | 1 | OpenEphemeralList | 2 | 2 | | 2 | MemInt| 1 | 2 | | 3 | MemNull | 3 | 0 | | 4 | MemInt| -1 | 0 | | 5 | Goto | 0 | 61 | | 6 | Integer | 2 | 0 | | 7 | OpenRead | 0 | 2 | | 8 | SetNumColumns | 0 | 2 | | 9 | Rewind| 0 | 35 | | 10 | IfMemPos | 2 | 28 | | 11 | MemLoad | 4 | 0 | | 12 | Column| 0 | 1 | | 13 | Ne| 512 | 17 | collseq(BINARY) | 14 | Column| 0 | 0 | | 15 | AggStep | 3 | 1 | sum(1) | 16 | Goto | 0 | 33 | | 17 | AggFinal | 3 | 0 | sum(1) | 18 | MemLoad | 0 | 0 |
Re: [sqlite] Finding rows with MIN(MTIME) for all IDs
With regular ISO window functions, one could typically write something along these lines: with DataSet as ( select foo.*, lag( 0, 1, 1 ) over ( partition by id order by time ) as is_lag fromfoo ) select * fromDataSet where DataSet.is_lag = 1 you're right. it was my fault and teradata is not guilty at all - they still allow to write it easier than standard: select * from tab qualify row_number() over(partition by id order by mtime) = 1 or if you prefer: select * from tab qualify lag(0, 1, 1) over(partition by id order by mtime) = 1 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Finding rows with MIN(MTIME) for all IDs
select id, a, min(mtime) over(partition by id order by mtime) m from tab qualify row_number() over(partition by id order by mtime) = 1 While using analytics would indeed be the best approach overall, these are sadly not supported in SQLite in any ways or forms. (For the record, if using analytics, the only thing one really need to do is to mark the lead row for selection. No point to over complicate things as above). what do you mean by "to mark the lead row for selection" ? is there a database that has something simpler to use than qualify + row_number() ? (yes i know, min can be replaced by first_value() or order by in most aggregates can/should be removed, but after those changes it's still the same query) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bug
> I believe OVER() is an Oracle-specific extension to SQL, not a > standard in any way. ISO/IEC 9075-2:2003: ::= OVER ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bug
> Apparently, using such a function in ORDER BY > clause alone doesn't make the statement aggregate (whether it should is > perhaps debatable) I suppose this may be in the standart. I'm 100% sure that this one is allowed by standart: ... ORDER BY avg(a) OVER() so likely ORDER BY avg(a) is also allowed. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bug
> In the first query, there is an aggregate in the result set, so an > implicit GROUP BY is used. The ORDER BY is meaningless, but not an > error (and could be more easily written "ORDER BY 1"; see below). The order is not meaningless. It can return an error or do nothing. If aggregate in order by isn't allowed it should return an error. From the first query we can see that it is allowed. It would be nice if database behaved consistently. We already know that MS SQL and Firebird do. Same with MySQL. I'm not sure, but I think I'v tried also postgres with same result. I'm sure that in sqlite it was missed and it is not an intentional behaviour. And yes - obviously the query with a bug makes no sense and there is no reason to ever use it. Just like adding order by to any other query that return 1 row before ordering. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Bug
SQLite version 3.7.9 2011-11-01 00:52:41 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t(a); sqlite> select avg(a) from t order by avg(a); -- order by aggregate possible sqlite> select 1 from t order by a; -- order by column not in result possible sqlite> select 1 from t group by 1 order by avg(a); -- order by aggregate not in result possible sqlite> select 1 from t order by avg(a); -- should be possible Error: misuse of aggregate: avg() sqlite> ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Big FLOAT bug
> Changing the 2 "15g" entries in sqlite3.c to "16g" corrects this problem. 15 > digits is all that is guaranteed but the vast majority of 16-digit values are > representable. > > Is this a valid solution? Or are there other side effects? It should be ok. However there's another bug that will appear after the change: C:\Users\vic\Desktop\sqlite-amalgamation-3070800>cat test.c #include "sqlite3.h" #include int main() { double d = 8901.0; char * c = sqlite3_mprintf("%.15g\n%.16g\n", d, d); printf("%s%.16g\n", c, d); } C:\Users\vic\Desktop\sqlite-amalgamation-3070800>cl /nologo test.c sqlite3.c test.c sqlite3.c Generating Code... C:\Users\vic\Desktop\sqlite-amalgamation-3070800>test.exe 8901 8901.0001 8901 C:\Users\vic\Desktop\sqlite-amalgamation-3070800>gcc test.c sqlite3.c C:\Users\vic\Desktop\sqlite-amalgamation-3070800>a.exe 8901 8901.0001 8901 So with .15g result is correct but with .16g is not. The problem with . 16g doesn't appear in gcc's and visual's printf. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Big FLOAT bug
SQLite version 3.7.8 2011-09-19 14:49:19 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t(a float); sqlite> insert into t values(1125899906842624); sqlite> select a = cast(cast(a as text) as float) from t; 0 Yes, I know - 16 digits. But representable 16 digits. .dump also will export only 15 digits ant without any explicit casts precision will be lost after importing. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Read only scaling optimization
Look at disc transfer. With 16 queries in one thread on single disc, disc may be accesed linearly (depeding on query). With 16 threads accesing disc at the same time linear disc access is impossible (however os may do some prefetching) and queries will by slower. You may try increasing page size - bigger block means less near-random reads from the disc. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Need Help! -- SQlite database on server
http://www.sqlite.org/cvstrac/wiki?p=SqliteNetwork ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] sqlite seems to hang while indexing
> I was trying to create an index on an integer column in a very large > table with over 400,000,000 rows on an Ubuntu ... I > increased the cache size to 2 but to no avail. That's only 200M of cache and your table is much larger. Sqlite can't index/sort efficiently lare data on disc - you need to do this in memory, so the cache size must be large enough to store whole index. The problem is that if data in indexed column isn't sorted and you will try to use created index for example to SELECT ... ORDER BY, it will be again slow with very large index because the speed will be limited by random reads from disc (reads from index will be sequential but finding corresponding rows in the table won't be). It 'should speed up after all pages will be read into the cache (so again you will need big cache and preferably large page size). Also SSD disc will help. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bug
> In short, you cannot assume conditions are processed left-to-right, > including JOIN conditions. I admit that i haven't checked the snadart but I did check other engines and they evaluate from left to right. I think that long time ago I'v read in a standart that 3 table join is basicly equivalent to joining first 2 tables and than third (I'm not sure though) which implies left to right. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bug
> If you join > tables which have the same column names, you need to use the table names. You are right, but joined tables don't have the same column names. SELECT * FROM t1 JOIN t2 ON a = b -- there is only one 'a' and that select is correctly implemented SELECT * FROM t1 JOIN t2 ON a = b JOIN t3 ON 1 -- adding another select does't change column names in first join Reported error would be correct in following query: SELECT * FROM t1 JOIN t2 ON t1.a = t2.b JOIN t3 ON t1.a = t3.a AND a ! = 1; ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Bug
SQLite version 3.7.5 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> CREATE TABLE t1 (a INT); sqlite> CREATE TABLE t2 (b INT); sqlite> CREATE TABLE t3 (a INT); sqlite> SELECT * FROM t1 JOIN t2 ON t1.a = t2.b AND a /* obviously t1.a */ != 1 JOIN t3 ON t1.a = t3.a; Error: ambiguous column name: a ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Slowdown when switching from Version 3.2.5 to 3.7.4
There is a lot more synchronization in 3.7.4. If you disable it new version may be faster. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] group_concat + distinct
SQLite version 3.7.2 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t(a); sqlite> select group_concat(distinct a) from t; sqlite> select group_concat(distinct a, ',') from t; Error: DISTINCT aggregates must have exactly one argument Both queries should return the same result. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Bug
SQLite version 3.6.23.1 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t(a int); sqlite> insert into t values(1); sqlite> select 0 where 0; sqlite> update t set a = (select 0 where 0); sqlite> select * from t; sqlite> select 0 limit 0; sqlite> update t set a = (select 2 limit 0); sqlite> select * from t; 2 sqlite> ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Berkeley DB adds SQL using SQLite API !!
> There were many problems with > that approach: ... > (3) Each table and index is in a > separate file so your "database" was a directory full of files instead > of a single file This one is not a problem. Actually I don't see how 1 file is better than 1 directory. For example mac application is a directory not a file and no one complains. And with several files database would be faster (for example dropping a table is instant or fragmentation is handled by OS without need for vacuuming whole database). Also with current SQLite implementation only tables would be locked by a transation not a whole database (a few years ago there were even document on SQLite website listing splittnig database to several files as one way to implement table level locks in SQLite). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bug
> Already fixed. Seehttp://www.sqlite.org/src/info/f74beaabde I'm might be wrong because I didn't check with the newest binary, but if I remember correctly this fix will compare t3.a to t1.a (instead of t2.a). So it will produce different results (first select will be 1|2 and second will be empty) but still incorrect because error should be returned. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Bug
SQLite version 3.6.21 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t1(a int); sqlite> create table t2(a int); sqlite> create table t3(a int); sqlite> insert into t1 values(1); sqlite> insert into t2 values(2); sqlite> insert into t3 values(1); sqlite> select * from t1 join t2 on t1.a < t2.a join t3 using(a); sqlite> update t3 set a = 2; sqlite> select * from t1 join t2 on t1.a < t2.a join t3 using(a); 1|2 First join produces 2 columns named a so result of the query should by similar to: MySQL: Column 'a' in from clause is ambiguous PostgreSQL: ERROR: common column name "a" appears more than once in left table ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Bug: ambiguous column name
There's no reason for following error: SQLite version 3.6.21 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> .headers ON sqlite> create table t1(a int); sqlite> create table t2(a int); sqlite> create table t3(a int); sqlite> insert into t1 values(1); sqlite> select * from t1 left join t2 using(a) left join t3 using(a); a 1 sqlite> select a from t1 left join t2 using(a) left join t3 using(a); Error: ambiguous column name: a ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] USING bug
SQLite version 3.6.21 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t1(a int); sqlite> create table t2(a int); sqlite> create table t3(a int, b int); sqlite> insert into t1 values(1); sqlite> insert into t3 values(1, 1); sqlite> select * from t1 left join t2 using(a) left join t3 using(a); 1| sqlite> select * from (t1 left join t2 using(a)) left join t3 using(a); 1| I think that correct result should be 1|1 MySQL and PostgreSQL agree with me. -- Kup wlasne mieszkanie za 72 tys. zl. Sprawdz najlepsze oferty >>> http://link.interia.pl/f24ce ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] problem using random() in queries
> Richard, Before you "fix" it, I'm not convinced it is broken. > > >From MS SQL server > > create table _names (N varchar(5)); > insert into _names values('a'); > insert into _names values('b'); > insert into _names values('c'); > insert into _names values('d'); > insert into _names values('e'); > > select N, RAND() as RNDValue from _names > a 0.301745013642105 > b 0.301745013642105 > c 0.301745013642105 > d 0.301745013642105 > e 0.301745013642105 > > > select N, RAND() as RNDValue from _names where RAND() >=.5; > a 0.0427909435260437 > b 0.0427909435260437 > c 0.0427909435260437 > d 0.0427909435260437 > e 0.0427909435260437 MSSQL isn't the only database that behaves that way (it thinks it can optimize the call if arguments are the same), but it makes difficult to do some tasks (for example ORDER BY random() LIMIT will work fine in sqlite but it will return the same value many times in some other databases). I agree that it may be a matter of documenting the bahaviour (however I like that sqlite evaluating functions in every row) but evaluating functions twice causes problems hard to document: sqlite> select distinct(random() / 500) from (select 1 union select 2); 0 0 --- Nasilaja sie kradzieze. Mieszkancy osiedli zaniepokojeni. Prosimy o pomoc w tej sprawie >>> http://link.interia.pl/f1eef ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users