Re: [sqlite] Is there a way to inner join on named intervals?

Peng Yu Fri, 13 Aug 2010 09:48:24 -0700

On Fri, Aug 13, 2010 at 11:32 AM, Pavel Ivanov <[email protected]> wrote:
> I don't understand where do you see a problem but it looks like this
> join will do what you want:
>
> select * from A, B
> where A.name = B.name
> and A.left < B.right
> and A.right > B.left
>
>> I could use an external program (such as python
>> sqlite package) to enumerate all the named interval from table A and
>> search for overlapping named intervals in table B, but this operation
>> has a complexity of M log (N), where M is the length of table A and N
>> is the length of table B. If some sort of "inner join" could be used,
>> the complexity should be reduced to log(M+N).
>
> How did you come to this conclusion? Any inner join will execute with
> complexity either M log(N) or N log(M) anyway. The only benefit is
> that SQLite can decide which way to join depending on relative size of
> tables A and B. And constant in operation complexity is a bit smaller
> with C code that in python code...


Thank all the people that replied my email. I don't really understand
how inner join work. Sorry for the confusion.

The comparison that I actually need is the following one (logically
correct, but I'm not sure if it is the fastest comparison in term of
performance). My next question is what index I should create on table
A and B to speed up such an inner join? Are indexes on each of the
first three column of each table enough? What is the best comparison
(along with the appropriate indexes) in term of the performance?

create table A (name text, left integer, right integer, tag text);
create table B (name text, left integer, right integer, attr text);
insert into A values('a', 1, 10, 'tag1');
insert into A values('a', 5, 15, 'tag2');
insert into A values('a', 21, 30, 'tag3');
insert into A values('b', 3, 12, 'tag4');
insert into A values('b', 15, 25, 'tag5');
insert into A values('b', 19, 30, 'tag6');

insert into B values('a', 3, 7, 'attr1');
insert into B values('a', 8, 12, 'attr2');
insert into B values('a', 16, 18, 'attr3');
insert into B values('a', 25, 35, 'attr4');
insert into B values('b', 31, 32, 'attr5');

select * from A inner join B on A.name=B.name AND max(A.left, B.left)
< min(A.right, B.right);

name        left        right       tag         name        left
 right       attr
----------  ----------  ----------  ----------  ----------  ----------
 ----------  ----------
a           1           10          tag1        a           3
 7           attr1
a           1           10          tag1        a           8
 12          attr2
a           5           15          tag2        a           3
 7           attr1
a           5           15          tag2        a           8
 12          attr2
a           21          30          tag3        a           25
 35          attr4


> On Fri, Aug 13, 2010 at 11:07 AM, Peng Yu <[email protected]> wrote:
>> Hi,
>>
>> Suppose that I have a table "A", each row represents a interval. For
>> example, the first row represents an interval [1,10) with a name "a".
>> The first and second rows are considered overlapping because the
>> interval [1,10) and interval [5,15) intersect and both rows have the
>> same name "a".
>>
>> name left right   tag
>> -------------------------------------
>> a          1     10   tag1
>> a          5     15   tag2
>> a        21     30   tag3
>> b          3     12   tag4
>> b        15     25   tag5
>> b        19     30   tag6
>>
>> I want to "inner join" the above table and the following table "B"
>> based on the named interval overlapping.
>>
>> name left right   attr
>> -------------------------------------
>> a          3       7   attr1
>> a          8     12   attr2
>> a        16     18   attr3
>> a        25     35   attr4
>> b        31     32   attr5
>>
>> The result is the following. In each row, the named interval from A
>> overlaps the named interval from B. I don't see there is an easy way
>> to do this in sqlite3. I could use an external program (such as python
>> sqlite package) to enumerate all the named interval from table A and
>> search for overlapping named intervals in table B, but this operation
>> has a complexity of M log (N), where M is the length of table A and N
>> is the length of table B. If some sort of "inner join" could be used,
>> the complexity should be reduced to log(M+N). I'm wondering if there
>> something that can help do this kind of named interval inner join
>> easily.
>>
>> A.name A.left A.right A.tag B.name B.left B.right B.attr
>> ------------------------------------------------------------------------
>> a                  1     10     tag1        a          3       7    attr1
>> a                  1     10     tag1        a          8     12    attr2
>> a                  5     15     tag2        a          3       7    attr1
>> a                  5     15     tag2        a          8     12    attr2
>> a                21     30     tag3        a        16     18    attr3
>>
>> --
>> Regards,
>> Peng
>> _______________________________________________
>> sqlite-users mailing list
>> [email protected]
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Regards,
Peng
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Is there a way to inner join on named intervals?

Reply via email to