Hi Peter, By "schema changes Clemens suggested" I assume you mean replacing the constraint -
....not exists (select * from edges where child=nodes.id); with ...where id not in (select child from edges); For this leaf count query, I need to constrain the result set to exclude nodes which are parents - sqlite> CREATE VIEW v_count_leaves as with recursive r(id, top) as (select id, id from nodes union all select t.id, top from nodes as t, edges as e, r where e.parent=r.id and t.id=e.child) select top, count(*) from r where top<>id and id not in (select parent from edges) group by top; sqlite> .timer on sqlite> select * from v_count_leaves where top=4465; 4465 1 Run Time: real 75.678 user 75.671875 sys 0.000000 sqlite> select count(*) from nodes; 10000 sqlite> select count(*) from edges; 9986 It is still very slow or did you mean something else? On 1 January 2018 at 18:04, petern <peter.nichvolo...@gmail.com> wrote: > Shane. I sent you a query to work with the crippled schema and index you > proposed for TABLE edges. > Clemens then explicitly suggested you correct the schema to have use of > automatic covering index. > > >CREATE TABLE edges(parent not null references nodes, child not null > >references nodes, primary key(parent, child)); > > Try your leaf counter again - after making the schema changes Clemens > suggested. > > Peter > > > On Mon, Jan 1, 2018 at 8:13 AM, Shane Dev <devshan...@gmail.com> wrote: > > > Hi, > > > > I want to the count the number of leaves (descendants without children) > for > > each node in a DAG > > > > DAG definition - > > > > CREATE TABLE nodes(id integer primary key, description text); > > CREATE TABLE edges(parent not null references nodes, child not null > > references nodes, primary key(parent, child)); > > > > My query - > > > > CREATE VIEW v_count_leaves as with recursive r(id, top) as ( > > select id, id from nodes > > union all > > select t.id, top from nodes as t, edges as e, r where e.parent=r.id and > > t.id > > =e.child) > > select top, count(*) from r where top<>id and id not in (select parent > from > > edges where parent=id) group by top; > > > > It seems to work but is complex to understand and debug despite my aim to > > keep it simple as possible, but more importantly - it is very slow when > > there are more than a few thousand nodes and edges. > > > > It there a more efficient (and ideally simpler) way? > > _______________________________________________ > > sqlite-users mailing list > > sqlite-users@mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users