Re: [sqlite] Window functions?

2014-08-26 Thread Wiktor Adamski

On 2014-08-26 18:00, sqlite-users-requ...@sqlite.org wrote:

SELECT
   employee_name,
   employee_id,
   salary,
   rank() OVER(PARTITION BY dept ORDER BY salary DESC),
   100.0*salary/sum(salary) OVER (PARTITION BY dept)
FROM employee;

I don't know if the above is valid SQL or not.  But is seems like something
somebody might like to do.  And it also seems hard to implement.


Yes it's valid. But this one actually is not that difficult to 
implement. Basically read rows must be ordered by dept, salary DESC. Then:
- rank() - it only needs to know if dept or salary is different than in 
previous row. that's all.

- sum() - probably the easiest way to implement it is:
-- first pass: calculate aggregate and write result to a temporary table
-- second pass: simply read calculated aggregate and append it to the result
Below are explains of both functions implemented in sqlite 3.3.8 based 
database.
However it doesn't mean that all window functions are easy to implement. 
For example I have no idea how to implement efficiently moving 
aggregates (BETWEEN x PRECEDING AND y FOLLOWING) - I think those have to 
recalculated for every row separately making whole query really slow 
(sum() can be optimized, but most likely most aggregates can't). There 
are many more problems. Most databases still haven't implemented 
everything from standard.


Explains:
- table definition: create table employee(salary, dept)
- table is already sorted by dept, salary DESC (so no sorting/index is 
visible in explains)
- no window specification defaults to ROWS UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING (behavior not specified in a standard)
- especially for rank() there is very little code, but it should be even 
shorter (for example opcodes 24, 25, 26 are useless)


SELECT rank() OVER(PARTITION BY dept ORDER BY salary DESC) FROM employee:
addr | opcode| p1  | p2 | p3  |
++---+-++-+--
   0 | Noop  |   0 |  0 | |
   1 | MemNull   |   2 |  0 | |
   2 | Goto  |   0 | 32 | |
   3 | Integer   |   2 |  0 | |
   4 | OpenRead  |   0 |  2 | |
   5 | SetNumColumns |   0 |  2 | |
   6 | MemInt|   0 |  1 | |
   7 | Rewind|   0 | 30 | |
   8 | MemLoad   |   5 |  0 | |
   9 | Column|   0 |  1 | |
  10 | Ne| 512 | 12 | collseq(BINARY) |
  11 | Goto  |   0 | 15 | |
  12 | MemNull   |   2 |  0 | |
  13 | Column|   0 |  1 | |
  14 | MemStore  |   5 |  1 | |
  15 | MemLoad   |   6 |  0 | |
  16 | Column|   0 |  0 | |
  17 | Ne| 512 | 20 | collseq(BINARY) |
  18 | MemInt|   0 |  4 | |
  19 | Goto  |   0 | 23 | |
  20 | MemInt|   1 |  4 | |
  21 | Column|   0 |  0 | |
  22 | MemStore  |   6 |  1 | |
  23 | WindowStep|   2 |  0 | rank(0) |
  24 | MemStore  |   3 |  0 | |
  25 | Pop   |   1 |  0 | |
  26 | MemLoad   |   3 |  0 | |
  27 | Callback  |   1 |  0 | |
  28 | MemIncr   |   1 |  1 | |
  29 | Next  |   0 |  8 | |
  30 | Close |   0 |  0 | |
  31 | Halt  |   0 |  0 | |
  32 | Transaction   |   2 |  0 | |
  33 | VerifyCookie  |   2 |  1 | |
  34 | Goto  |   0 |  3 | |


SELECT sum(salary) OVER (PARTITION BY dept) FROM employee:
addr | opcode| p1  | p2 | p3  |
++---+-++-+--
   0 | Noop  |   0 |  0 | |
   1 | OpenEphemeralList |   2 |  2 | |
   2 | MemInt|   1 |  2 | |
   3 | MemNull   |   3 |  0 | |
   4 | MemInt|  -1 |  0 | |
   5 | Goto  |   0 | 61 | |
   6 | Integer   |   2 |  0 | |
   7 | OpenRead  |   0 |  2 | |
   8 | SetNumColumns |   0 |  2 | |
   9 | Rewind|   0 | 35 | |
  10 | IfMemPos  |   2 | 28 | |
  11 | MemLoad   |   4 |  0 | |
  12 | Column|   0 |  1 | |
  13 | Ne| 512 | 17 | collseq(BINARY) |
  14 | Column|   0 |  0 | |
  15 | AggStep   |   3 |  1 | sum(1)  |
  16 | Goto  |   0 | 33 | |
  17 | AggFinal  |   3 |  0 | sum(1)  |
  18 | MemLoad   |   0 |  0 |

Re: [sqlite] Finding rows with MIN(MTIME) for all IDs

2012-09-11 Thread Wiktor Adamski



With regular ISO window functions, one could typically write something along 
these lines:

with
DataSet
as
(
   select  foo.*,
   lag( 0, 1, 1 ) over ( partition by id order by time ) as is_lag
   fromfoo
)
select  *
fromDataSet

where   DataSet.is_lag = 1


you're right. it was my fault and teradata is not guilty at all - they 
still allow to write it easier than standard:
select * from tab qualify row_number() over(partition by id order by 
mtime) = 1

or if you prefer:
select * from tab qualify lag(0, 1, 1) over(partition by id order by 
mtime) = 1



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Finding rows with MIN(MTIME) for all IDs

2012-09-11 Thread Wiktor Adamski

select id, a, min(mtime) over(partition by id order by mtime) m from tab 
qualify row_number() over(partition by id order by mtime) = 1

While using analytics would indeed be the best approach overall, these are 
sadly not supported in SQLite in any ways or forms.

(For the record, if using analytics, the only thing one really need to do is to 
mark the lead row for selection. No point to over complicate things as above).


what do you mean by "to mark the lead row for selection" ? is there a 
database that has something simpler to use than qualify + row_number() ? 
(yes i know, min can be replaced by first_value() or order by in most 
aggregates can/should be removed, but after those changes it's still the 
same query)

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug

2011-11-23 Thread Wiktor Adamski
> I believe OVER() is an Oracle-specific extension to SQL, not a
> standard in any way.

ISO/IEC 9075-2:2003:
 ::=  OVER 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug

2011-11-23 Thread Wiktor Adamski
> Apparently, using such a function in ORDER BY
> clause alone doesn't make the statement aggregate (whether it should is
> perhaps debatable)

I suppose this may be in the standart. I'm 100% sure that this one is
allowed by standart:
... ORDER BY avg(a) OVER()
so likely ORDER BY avg(a) is also allowed.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug

2011-11-23 Thread Wiktor Adamski
>   In the first query, there is an aggregate in the result set, so an
>   implicit GROUP BY is used.  The ORDER BY is meaningless, but not an
>   error (and could be more easily written "ORDER BY 1"; see below).

The order is not meaningless. It can return an error or do nothing. If
aggregate in order by isn't allowed it should return an error. From
the first query we can see that it is allowed. It would be nice if
database behaved consistently. We already know that MS SQL and
Firebird do. Same with MySQL. I'm not sure, but I think I'v tried also
postgres with same result. I'm sure that in sqlite it was missed and
it is not an intentional behaviour.
And yes - obviously the query with a bug makes no sense and there is
no reason to ever use it. Just like adding order by to any other query
that return 1 row before ordering.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug

2011-11-23 Thread Wiktor Adamski
SQLite version 3.7.9 2011-11-01 00:52:41
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t(a);
sqlite> select avg(a) from t order by avg(a); -- order by aggregate
possible

sqlite> select 1 from t order by a; -- order by column not in result
possible
sqlite> select 1 from t group by 1 order by avg(a); -- order by
aggregate not in result possible
sqlite> select 1 from t order by avg(a); -- should be possible
Error: misuse of aggregate: avg()
sqlite>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Big FLOAT bug

2011-10-07 Thread Wiktor Adamski
> Changing the 2 "15g" entries in sqlite3.c to "16g" corrects this problem. 15 
> digits is all that is guaranteed but the vast majority of 16-digit values are 
> representable.
>
> Is this a valid solution?  Or are there other side effects?

It should be ok. However there's another bug that will appear after
the change:

C:\Users\vic\Desktop\sqlite-amalgamation-3070800>cat test.c
#include "sqlite3.h"
#include 

int main() {
  double d = 8901.0;
  char * c = sqlite3_mprintf("%.15g\n%.16g\n", d, d);
  printf("%s%.16g\n", c, d);
}

C:\Users\vic\Desktop\sqlite-amalgamation-3070800>cl /nologo test.c
sqlite3.c
test.c
sqlite3.c
Generating Code...

C:\Users\vic\Desktop\sqlite-amalgamation-3070800>test.exe
8901
8901.0001
8901

C:\Users\vic\Desktop\sqlite-amalgamation-3070800>gcc test.c sqlite3.c

C:\Users\vic\Desktop\sqlite-amalgamation-3070800>a.exe
8901
8901.0001
8901

So with .15g result is correct but with .16g is not. The problem with .
16g doesn't appear in gcc's and visual's printf.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Big FLOAT bug

2011-10-05 Thread Wiktor Adamski
SQLite version 3.7.8 2011-09-19 14:49:19
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t(a float);
sqlite> insert into t values(1125899906842624);
sqlite> select a = cast(cast(a as text) as float) from t;
0

Yes, I know - 16 digits. But representable 16 digits. .dump also will
export only 15 digits ant without any explicit casts precision will be
lost after importing.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read only scaling optimization

2011-08-10 Thread Wiktor Adamski
Look at disc transfer. With 16 queries in one thread on single disc,
disc may be accesed linearly (depeding on query). With 16 threads
accesing disc at the same time linear disc access is impossible
(however os may do some prefetching) and queries will by slower. You
may try increasing page size - bigger block means less near-random
reads from the disc.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Need Help! -- SQlite database on server

2011-08-05 Thread Wiktor Adamski
http://www.sqlite.org/cvstrac/wiki?p=SqliteNetwork
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite seems to hang while indexing

2011-06-16 Thread Wiktor Adamski
> I was trying to create an index on an integer column in a very large
> table with over 400,000,000 rows on an Ubuntu ... I
> increased the cache size to 2 but to no avail.

That's only 200M of cache and your table is much larger. Sqlite can't
index/sort efficiently lare data on disc - you need to do this in
memory, so the cache size must be large enough to store whole index.
The problem is that if data in indexed column isn't sorted and you
will try to use created index for example to SELECT ... ORDER BY, it
will be again slow with very large index because the speed will be
limited by random reads from disc (reads from index will be sequential
but finding corresponding rows in the table won't be). It 'should
speed up after all pages will be read into the cache (so again you
will need big cache and preferably large page size). Also SSD disc
will help.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug

2011-02-23 Thread Wiktor Adamski
>   In short, you cannot assume conditions are processed left-to-right,
>   including JOIN conditions.  

I admit that i haven't checked the snadart but I did check other
engines and they evaluate from left to right. I think that long time
ago I'v read in a standart that 3 table join is basicly equivalent to
joining first 2 tables and than third (I'm not sure though) which
implies left to right.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug

2011-02-23 Thread Wiktor Adamski
> If you join
> tables which have the same column names, you need to use the table names.

You are right, but joined tables don't have the same column names.
SELECT * FROM t1 JOIN t2 ON a = b -- there is only one 'a' and that
select is correctly implemented
SELECT * FROM t1 JOIN t2 ON a = b JOIN t3 ON 1 -- adding another
select does't change column names in first join
Reported error would be correct in following query:
SELECT * FROM t1 JOIN t2 ON t1.a = t2.b JOIN t3 ON t1.a = t3.a AND a !
= 1;
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug

2011-02-23 Thread Wiktor Adamski
SQLite version 3.7.5
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> CREATE TABLE t1 (a INT);
sqlite> CREATE TABLE t2 (b INT);
sqlite> CREATE TABLE t3 (a INT);
sqlite> SELECT * FROM t1 JOIN t2 ON t1.a = t2.b AND a /* obviously
t1.a */ != 1 JOIN t3 ON t1.a = t3.a;
Error: ambiguous column name: a
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Slowdown when switching from Version 3.2.5 to 3.7.4

2010-12-17 Thread Wiktor Adamski
There is a lot more synchronization in 3.7.4. If you disable it new
version may be faster.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] group_concat + distinct

2010-09-22 Thread Wiktor Adamski
SQLite version 3.7.2
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t(a);
sqlite> select group_concat(distinct a) from t;

sqlite> select group_concat(distinct a, ',') from t;
Error: DISTINCT aggregates must have exactly one argument

Both queries should return the same result.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug

2010-05-25 Thread Wiktor Adamski
SQLite version 3.6.23.1
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t(a int);
sqlite> insert into t values(1);
sqlite> select 0 where 0;
sqlite> update t set a = (select 0 where 0);
sqlite> select * from t;

sqlite> select 0 limit 0;
sqlite> update t set a = (select 2 limit 0);
sqlite> select * from t;
2
sqlite>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Berkeley DB adds SQL using SQLite API !!

2010-03-31 Thread Wiktor Adamski
> There were many problems with  
> that approach:
...
> (3) Each table and index is in a  
> separate file so your "database" was a directory full of files instead  
> of a single file

This one is not a problem. Actually I don't see how 1 file is better
than 1 directory. For example mac application is a directory not a
file and no one complains. And with several files database would be
faster (for example dropping a table is instant or fragmentation is
handled by OS without need for vacuuming whole database). Also with
current SQLite implementation only tables would be locked by a
transation not a whole database (a few years ago there were even
document on SQLite website listing  splittnig database to several
files as one way to implement table level locks in SQLite).
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug

2009-12-31 Thread Wiktor Adamski
> Already fixed.  Seehttp://www.sqlite.org/src/info/f74beaabde

I'm might be wrong because I didn't check with the newest binary, but
if I remember correctly this fix will compare t3.a to t1.a (instead of
t2.a). So it will produce different results (first select will be 1|2
and second will be empty) but still incorrect because error should be
returned.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug

2009-12-31 Thread Wiktor Adamski
SQLite version 3.6.21
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t1(a int);
sqlite> create table t2(a int);
sqlite> create table t3(a int);
sqlite> insert into t1 values(1);
sqlite> insert into t2 values(2);
sqlite> insert into t3 values(1);
sqlite> select * from t1 join t2 on t1.a < t2.a join t3 using(a);
sqlite> update t3 set a = 2;
sqlite> select * from t1 join t2 on t1.a < t2.a join t3 using(a);
1|2

First join produces 2 columns named a so result of the query should by
similar to:
MySQL: Column 'a' in from clause is ambiguous
PostgreSQL: ERROR: common column name "a" appears more than once in
left table
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug: ambiguous column name

2009-12-23 Thread Wiktor Adamski
There's no reason for following error:

SQLite version 3.6.21
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .headers ON
sqlite> create table t1(a int);
sqlite> create table t2(a int);
sqlite> create table t3(a int);
sqlite> insert into t1 values(1);
sqlite> select * from t1 left join t2 using(a) left join t3 using(a);
a
1
sqlite> select a from t1 left join t2 using(a) left join t3 using(a);
Error: ambiguous column name: a
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] USING bug

2009-12-08 Thread Wiktor Adamski
SQLite version 3.6.21
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t1(a int);
sqlite> create table t2(a int);
sqlite> create table t3(a int, b int);
sqlite> insert into t1 values(1);
sqlite> insert into t3 values(1, 1);
sqlite> select * from t1 left join t2 using(a) left join t3 using(a);
1|
sqlite> select * from (t1 left join t2 using(a)) left join t3 using(a);
1|

I think that correct result should be
1|1
MySQL and PostgreSQL agree with me.

--
Kup wlasne mieszkanie za 72 tys. zl.
Sprawdz najlepsze oferty >>> http://link.interia.pl/f24ce

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] problem using random() in queries

2008-08-30 Thread Wiktor Adamski
> Richard, Before you "fix" it, I'm not convinced it is broken.
>
> >From MS SQL server
>
> create table _names (N  varchar(5));
> insert into _names values('a');
> insert into _names values('b');
> insert into _names values('c');
> insert into _names values('d');
> insert into _names values('e');
>
> select N, RAND() as RNDValue from _names
> a 0.301745013642105
> b 0.301745013642105
> c 0.301745013642105
> d 0.301745013642105
> e 0.301745013642105
>
>
> select N, RAND() as RNDValue from _names where RAND() >=.5;
> a 0.0427909435260437
> b 0.0427909435260437
> c 0.0427909435260437
> d 0.0427909435260437
> e 0.0427909435260437

MSSQL isn't the only database that behaves that way (it thinks it can 
optimize the call if arguments are the same), but it makes difficult to do 
some tasks (for example ORDER BY random() LIMIT will work fine in sqlite but 
it will return the same value many times in some other databases). I agree 
that it may be a matter of documenting the bahaviour (however I like that 
sqlite evaluating functions in every row) but evaluating functions twice 
causes problems hard to document:

sqlite> select distinct(random() / 500) from (select 1 union 
select 2);
0
0 


---
Nasilaja sie kradzieze.
Mieszkancy osiedli zaniepokojeni.
Prosimy o pomoc w tej sprawie >>> http://link.interia.pl/f1eef


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users