Re: [sqlite] finding pairwise similarity

Igor Tandetnik Sat, 14 Jan 2012 13:12:00 -0800

Mr. Puneet Kishor <punk.k...@gmail.com> wrote:
> I have two tables like so
> 
>    CREATE TABLE c (c_no INTEGER PRIMARY KEY, c_name TEXT);
>    CREATE TABLE t (t_no INTEGER PRIMARY KEY, t_name TEXT);
>    CREATE TABLE c_t (c_no INTEGER, t_no INTEGER);
> 
> Every row in `c` has one or more `t`. I want, in a single SQL, the following
> 
>    source_c, target_c, similarity
> 
> where `similarity` is a measure of overlap of `t` between each pair of `c` 
> and is calculated as the number of `t` common between
> a given pair divided by the total number of unique `t` in the pair.


select src.c_no source_c, trg.c_no target_c,
(select count(*) from (
 select t_no from c_t where c_t.c_no = src.c_no
 intersect
 select t_no from c_t where c_t.c_no = trg.c_no)) * 1.0
/
(select count(*) from (
 select t_no from c_t where c_t.c_no = src.c_no
 union
 select t_no from c_t where c_t.c_no = trg.c_no)) similarity
from c src join c trg on (src.c_no < trg.c_no);

-- 
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] finding pairwise similarity

Reply via email to