Re: [sqlite] Group contiguous rows (islands)

2017-02-16 Thread Rossel, Jonathan
@ Pasma and Hainaut,

Thanks again, that looks promising !

Jonathan

Message: 42
Date: Wed, 15 Feb 2017 21:10:10 +0100
From: "E.Pasma" <pasm...@concepts.nl>
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Group contiguous rows (islands)
Message-ID: <55fa9699-22fe-4dd9-9b86-36a190485...@concepts.nl>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes


Jean-Luc Hainaut:

> On 15/02/2017 18:34, E.Pasma wrote:
>>
>> Hello,  the query below is simpler. May be slower. But looks pretty  
>> relational. Thanks, E Pasma.
>>
>> create table T(date integer,test char(12));
>> insert into T
>> values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
>> (13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');
>>
>> select min(date) as fromdate, max(date) as enddate, test
>> from(--get closest preceeding different key
>>select t.*, max(t2.date) as key2
>>from t
>>left join t t2
>>on t2.datet.test
>>group by t.date
>>)
>> group by key2
>
> Quite nice solution indeed!
> For those who may feel uncomfortable with outer joins, the from  
> clause could be written as a subquery:
>
> from (select date, test, (select  max(date)
>  fromt t2
>  where  t2.date < t.date
>  and  t2.test <> t.test)  
> as key2)
>
> Thanks
>
> J-L
>
this way you may also try to optimise speed by using ORDER BY & LIMIT  
1 instead of MAX

from (select date, test, (select t2.date
   from  t t2
   where t2.date < t.date
   and t2.test <>  t.test
   order by t2.date desc limit 1)  
as key2


***
This e-mail message is intended only for the addressee(s) and contains
information which may be confidential. If you are not the intended
recipient please do not read, save, forward, disclose or copy the contents
of this e-mail. If this e-mail has been sent to you in error, please delete 
this 
e-mail and any copies or links to this e-mail completely and immediately
from your system. We also like to inform you that communication via e-mail
over the Internet is insecure because third parties may have the possibility
to access and manipulate e-mails.

Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
The Swatch Group Ltd.
***
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread E.Pasma


Jean-Luc Hainaut:


On 15/02/2017 18:34, E.Pasma wrote:


Hello,  the query below is simpler. May be slower. But looks pretty  
relational. Thanks, E Pasma.


create table T(date integer,test char(12));
insert into T
values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
(13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');

select min(date) as fromdate, max(date) as enddate, test
from(--get closest preceeding different key
   select t.*, max(t2.date) as key2
   from t
   left join t t2
   on t2.datet.test
   group by t.date
   )
group by key2


Quite nice solution indeed!
For those who may feel uncomfortable with outer joins, the from  
clause could be written as a subquery:


from (select date, test, (select  max(date)
 fromt t2
 where  t2.date < t.date
 and  t2.test <> t.test)  
as key2)


Thanks

J-L

this way you may also try to optimise speed by using ORDER BY & LIMIT  
1 instead of MAX


from (select date, test, (select t2.date
  from  t t2
  where t2.date < t.date
  and t2.test <>  t.test
  order by t2.date desc limit 1)  
as key2

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Jean-Luc Hainaut

On 15/02/2017 18:34, E.Pasma wrote:


Hello,  the query below is simpler. May be slower. But looks pretty 
relational. Thanks, E Pasma.


create table T(date integer,test char(12));
insert into T
values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
(13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');

select min(date) as fromdate, max(date) as enddate, test
from(--get closest preceeding different key
select t.*, max(t2.date) as key2
from t
left join t t2
on t2.datet.test
group by t.date
)
group by key2


Quite nice solution indeed!
For those who may feel uncomfortable with outer joins, the from clause 
could be written as a subquery:


from (select date, test, (select  max(date)
  fromt t2
  where  t2.date < t.date
  and  t2.test <> t.test) 
as key2)


Thanks

J-L

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread E.Pasma

15 feb 2017, Jean-Luc Hainaut:



You could try this, inspired by classic algorithms of temporal  
databases:


create table T(date integer,test char(12));
insert into T
values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
(13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');

create table TT(seq integer not null primary key autoincrement,date  
integer,test char(12));

insert into TT(date,test) select * from T order by date;

select T1.date, T3.date, T1.test
from   TT T1, TT T3
-- More efficient than "where  T1.date <= T3.date"
where  T1.seq <= T3.seq
andT1.test = T3.test
andnot exists(select * from TT where seq = T1.seq-1 and test =  
T1.test)
andnot exists(select * from TT where seq = T3.seq+1 and test =  
T3.test)

andnot exists(select *
from   TT T2
-- More efficient than "where  T2.date between  
T1.date and T3.date"

where  T2.seq between T1.seq and T3.seq
andT2.test <> T1.test);

Result:

+--+--+--+
| date | date | test |
+--+--+--+
| 1| 3| clim |
| 7| 10   | amb  |
| 12   | 12   | xxx  |
| 13   | 20   | clim |
| 22   | 25   | amb  |
+--+--+--+

Hello,  the query below is simpler. May be slower. But looks pretty  
relational. Thanks, E Pasma.


create table T(date integer,test char(12));
insert into T
values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
(13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');

select min(date) as fromdate, max(date) as enddate, test
from(--get closest preceeding different key
select t.*, max(t2.date) as key2
from t
left join t t2
on t2.datet.test
group by t.date
)
group by key2
;


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Rossel, Jonathan
@ Simon,

Thanks for the input ! I was afraid someone was going to mention the dreaded 
recursive CTEs.

Jonathan




***
This e-mail message is intended only for the addressee(s) and contains
information which may be confidential. If you are not the intended
recipient please do not read, save, forward, disclose or copy the contents
of this e-mail. If this e-mail has been sent to you in error, please delete 
this 
e-mail and any copies or links to this e-mail completely and immediately
from your system. We also like to inform you that communication via e-mail
over the Internet is insecure because third parties may have the possibility
to access and manipulate e-mails.

Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
The Swatch Group Ltd.
***

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Rossel, Jonathan
@ Clemens, Petite Abeille,

Thanks, that's what I thought, but it's comforting to know for sure...

@ Jean-Luc,

Thanks a lot for the detailed answer, that's awesome ! I'll give it a try and 
see how it compares with an external "manual" grouping


Jonathan 



--

Message: 79
Date: Wed, 15 Feb 2017 11:16:24 +0100
From: Clemens Ladisch <clem...@ladisch.de>
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Group contiguous rows (islands)
Message-ID: <d689fbd5-0f4f-34ac-db98-72872d7a7...@ladisch.de>
Content-Type: text/plain; charset=us-ascii

Rossel, Jonathan wrote:
> Other database engines have solutions for this task (like windowing in
> postgre) but I wonder if there is an efficient recipe in SQLite.

SQLite does not have windowing functions.  So the most efficient method
would be to read the data with a simple ORDER BY, and do the grouping
in your code.


Regards,
Clemens


--

Message: 83
Date: Wed, 15 Feb 2017 12:02:32 +0100
From: Jean-Luc Hainaut <jean-luc.hain...@unamur.be>
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Group contiguous rows (islands)
Message-ID: <58a43548@unamur.be>
Content-Type: text/plain; charset=UTF-8; format=flowed


You could try this, inspired by classic algorithms of temporal databases:

create table T(date integer,test char(12));
insert into T
values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
(13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');

create table TT(seq integer not null primary key autoincrement,date 
integer,test char(12));
insert into TT(date,test) select * from T order by date;

select T1.date, T3.date, T1.test
from   TT T1, TT T3
-- More efficient than "where  T1.date <= T3.date"
where  T1.seq <= T3.seq
andT1.test = T3.test
andnot exists(select * from TT where seq = T1.seq-1 and test = T1.test)
andnot exists(select * from TT where seq = T3.seq+1 and test = T3.test)
andnot exists(select *
  from   TT T2
  -- More efficient than "where  T2.date between T1.date 
and T3.date"
  where  T2.seq between T1.seq and T3.seq
  andT2.test <> T1.test);

Result:

+--+--+--+
| date | date | test |
+--+--+--+
| 1| 3| clim |
| 7| 10   | amb  |
| 12   | 12   | xxx  |
| 13   | 20   | clim |
| 22   | 25   | amb  |
+--+--+--+

Working table TT is recommended to create an ordered sequence of rows in 
which "next" and "previous" rows are more easily described than in the 
source table. Avoid "order by" on views. It works in SQLite but it 
should not!

The idea is to identify maximal sequences of identical "test" values as 
follow:
- T1 denotes the first row of a sequence
- T3 the last row
- T2 any "disturbing" row lying between T1 and T3 but with a different 
value of "test"
- first "not exists" condition states that T1 must be the very first of 
the sequence: it must not be immediately preceded by a row with same 
value of "test"
- same for second "not exists" condition: T3 must be the last
- the third "not exists" condition states that there is no "disturbing" 
row between T1 and T3.

Valid if maximal sequences do not overlap. This query also detects 
single row sequences (e.g., 'xxx').
An index on TT.test may be useful to support T1*T3 join.

For large tables, an iterative procedure will be faster, though less 
elegant!

Regards

Jean-Luc Hainaut


--

Message: 89
Date: Wed, 15 Feb 2017 12:58:07 +0100
From: Petite Abeille <petite.abei...@gmail.com>
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Group contiguous rows (islands)
Message-ID: <4b88b85b-75eb-4391-989e-198ebe31e...@gmail.com>
Content-Type: text/plain; charset=us-ascii


> On Feb 15, 2017, at 11:16 AM, Clemens Ladisch <clem...@ladisch.de> wrote:
> 
> SQLite does not have windowing functions.

A continuous/continual tragedy indeed :|

Still, worthwhile mentioning The Tabibitosan Method, for reference purpose:

http://www.orchestrapit.co.uk/?p=53
https://community.oracle.com/message/3991678

Rather nifty in its simplicity and power. Sadly, out of reach to SQLite 
dwellers.






--

Subject: Digest Footer

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


--

End of sqlite-users Digest, Vol 110, Issue 15
*
***
This e-mail message is intended only for the addressee(s) and contains
information which may be

Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Simon Slavin

On 15 Feb 2017, at 11:58am, Petite Abeille  wrote:

> On Feb 15, 2017, at 11:16 AM, Clemens Ladisch  wrote:
> 
>> SQLite does not have windowing functions.
> 
> A continuous/continual tragedy indeed :|

Windowing breaks the philosophy behind SQL.  Rows are meant to be members of a 
set, and your operations on them are meant to be set operations.  There is no 
implicit order for set elements.  That’s why bare-bones SQL implementations 
don’t have cursors or windowing.

> Still, worthwhile mentioning The Tabibitosan Method, for reference purpose:
> 
> http://www.orchestrapit.co.uk/?p=53
> https://community.oracle.com/message/3991678
> 
> Rather nifty in its simplicity and power. Sadly, out of reach to SQLite 
> dwellers.

Actually SQLite can do it, by iterating using recursive common table 
expressions:



.  It looks like an interesting programming exercise, though there’s a danger 
it will lead to unreadable code (a perpetual danger with WITH).  You would need 
a spare column in the table to store the 'group' values in.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Petite Abeille

> On Feb 15, 2017, at 11:16 AM, Clemens Ladisch  wrote:
> 
> SQLite does not have windowing functions.

A continuous/continual tragedy indeed :|

Still, worthwhile mentioning The Tabibitosan Method, for reference purpose:

http://www.orchestrapit.co.uk/?p=53
https://community.oracle.com/message/3991678

Rather nifty in its simplicity and power. Sadly, out of reach to SQLite 
dwellers.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Jean-Luc Hainaut


You could try this, inspired by classic algorithms of temporal databases:

create table T(date integer,test char(12));
insert into T
values (1,'clim'),(3,'clim'),(7,'amb'),(10,'amb'),(12,'xxx'),
(13,'clim'),(15,'clim'),(20,'clim'),(22,'amb'),(25,'amb');

create table TT(seq integer not null primary key autoincrement,date 
integer,test char(12));

insert into TT(date,test) select * from T order by date;

select T1.date, T3.date, T1.test
from   TT T1, TT T3
-- More efficient than "where  T1.date <= T3.date"
where  T1.seq <= T3.seq
andT1.test = T3.test
andnot exists(select * from TT where seq = T1.seq-1 and test = T1.test)
andnot exists(select * from TT where seq = T3.seq+1 and test = T3.test)
andnot exists(select *
 from   TT T2
 -- More efficient than "where  T2.date between T1.date 
and T3.date"

 where  T2.seq between T1.seq and T3.seq
 andT2.test <> T1.test);

Result:

+--+--+--+
| date | date | test |
+--+--+--+
| 1| 3| clim |
| 7| 10   | amb  |
| 12   | 12   | xxx  |
| 13   | 20   | clim |
| 22   | 25   | amb  |
+--+--+--+

Working table TT is recommended to create an ordered sequence of rows in 
which "next" and "previous" rows are more easily described than in the 
source table. Avoid "order by" on views. It works in SQLite but it 
should not!


The idea is to identify maximal sequences of identical "test" values as 
follow:

- T1 denotes the first row of a sequence
- T3 the last row
- T2 any "disturbing" row lying between T1 and T3 but with a different 
value of "test"
- first "not exists" condition states that T1 must be the very first of 
the sequence: it must not be immediately preceded by a row with same 
value of "test"

- same for second "not exists" condition: T3 must be the last
- the third "not exists" condition states that there is no "disturbing" 
row between T1 and T3.


Valid if maximal sequences do not overlap. This query also detects 
single row sequences (e.g., 'xxx').

An index on TT.test may be useful to support T1*T3 join.

For large tables, an iterative procedure will be faster, though less 
elegant!


Regards

Jean-Luc Hainaut

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Group contiguous rows (islands)

2017-02-15 Thread Clemens Ladisch
Rossel, Jonathan wrote:
> Other database engines have solutions for this task (like windowing in
> postgre) but I wonder if there is an efficient recipe in SQLite.

SQLite does not have windowing functions.  So the most efficient method
would be to read the data with a simple ORDER BY, and do the grouping
in your code.


Regards,
Clemens


> This e-mail message is intended only ...

This e-mail contains public information intended for any subscriber of
this mailing list and for anybody else who bothers to read it; it will
be copied, disclosed and distributed to the public.  If you think you
are not the intended recipient, please commit suicide immediately.
These terms apply also to any e-mails quoted in, referenced from, or
answering this e-mail, and supersede any confidentiality notices in
those e-mails.  Additionally, confidentiality notices in those e-mails
will incur legal processing fees of $42 per line; you have agreed to
this by reading this confidentiality notice.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Group contiguous rows (islands)

2017-02-15 Thread Rossel, Jonathan
Dear all,

I need to perform a kind of partial GROUP BY to determine the beginnings and 
ends of sets of identical data. I can't use a full GROUP BY because these sets 
can be repeated and their repetition must be conserved. Other database engines 
have solutions for this task (like windowing in postgre) but I wonder if there 
is an efficient recipe in SQLite.

Example:
===

Table: mytable


date test
-- --
1   clim
3   clim
7   amb
10  amb
13  clim
15  clim
20  clim
22  amb
25  amb

Desired result
-

date_fromdate_totest
---   --
1 3  clim
7 10amb
13   15 clim
22   25 amb


(non optimal) solution found
=

CREATE VIEW mytablebydate
AS  -- Pre-order table to avoid ordering it twice in sub-queries
SELECT * FROM mytable ORDER BY date

CREATE VIEW mytablenext
AS
SELECT  date,
test,
(
-- first row > this row
SELECT date   -- NULL if not exists
FROM mytablebydate
WHERE date > MT.date
LIMIT 1
) as date_next,
(
-- first row > this row
SELECT test   -- NULL if not exists
FROM mytablebydate
WHERE date > MT.date
LIMIT 1
) as test_next  
FROM mytable MT
WHERE test != test_next

-- Get desired results
SELECT  (
--  Date of the previous row
SELECT MAX( date_next )
FROM mytablenext
WHERE date_next < mt.date
) AS date_from,

date AS date_to,   -- this row
test
FROM mytablenext mt


Comments


This method returns a Null for the first date_from and the last group is not 
returned. It is therefore incomplete. In addition, it involves quite a lot of 
subqueries. For completeness, it is inspired by 
http://stackoverflow.com/questions/30455227/date-range-for-set-of-same-data/30460263#30460263.
 So, is there a better / official way in SQLite ?

Any help will be welcome,

Jonathan

***
This e-mail message is intended only for the addressee(s) and contains
information which may be confidential. If you are not the intended
recipient please do not read, save, forward, disclose or copy the contents
of this e-mail. If this e-mail has been sent to you in error, please delete 
this 
e-mail and any copies or links to this e-mail completely and immediately
from your system. We also like to inform you that communication via e-mail
over the Internet is insecure because third parties may have the possibility
to access and manipulate e-mails.

Any views expressed in this message are those of the individual sender,
except where the sender specifically states them to be the views of
The Swatch Group Ltd.
***
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users