Re: [GENERAL] Find similar records (compare tsvectors)

2015-03-07 Thread Patrick Dung
Thanks.
smlar is fast and quite good.I need find tuning on the search result. 

 On Saturday, March 7, 2015 12:07 AM, Oleg Bartunov  
wrote:
   

 On Fri, Mar 6, 2015 at 5:05 PM, Patrick Dung  wrote:
> Resend.
>
> How to quickly compare the similarity of two tsvector?
>

check http://www.sai.msu.su/~megera/postgres/talks/pgcon-2012.pdf


>
> On Monday, March 2, 2015 11:01 PM, Patrick Dung 
> wrote:
>
>
> Hello,
>
> I had a database with articles or attachment stored in bytea format.
> I also had a trigger: it insert/update the tsv column when a record is
> added/updated.
> The tsv column had a GIN index.
> With this setting, I can do very fast keyword search on the tsv.
>
> Suppose I had a specific record (id=10).
> How to list similar records based on ranking?
> In that case, I had to compare a tsvector with another tsvector.
>
> I had this SQL which make the original tsv as a text and then to tsquery,
> Then I can compare a tsv and a tsquery.
> SELECT ts_rank(i.tsv, replace(strip(original.tsv)::text, ' ', '|')::tsquery)
> as similarity, i.company, i.industry, i.post_timestamp, i.id FROM items i,
> (SELECT tsv, id FROM items WHERE id=10) AS original WHERE i.id !=
> original.id ORDER BY similarity;
>
> items table:
> id bigint
> company varchar
> industry varchar
> description varchar
> post_timestamp timestamp
> attachment bytea
> tsv tsvector
>
> The problem is that this is very slow.
> Any comment?
>
> Thank and regards,
> Patrick
>
>


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




Re: [GENERAL] Find similar records (compare tsvectors)

2015-03-06 Thread Oleg Bartunov
On Fri, Mar 6, 2015 at 5:05 PM, Patrick Dung  wrote:
> Resend.
>
> How to quickly compare the similarity of two tsvector?
>

check http://www.sai.msu.su/~megera/postgres/talks/pgcon-2012.pdf


>
> On Monday, March 2, 2015 11:01 PM, Patrick Dung 
> wrote:
>
>
> Hello,
>
> I had a database with articles or attachment stored in bytea format.
> I also had a trigger: it insert/update the tsv column when a record is
> added/updated.
> The tsv column had a GIN index.
> With this setting, I can do very fast keyword search on the tsv.
>
> Suppose I had a specific record (id=10).
> How to list similar records based on ranking?
> In that case, I had to compare a tsvector with another tsvector.
>
> I had this SQL which make the original tsv as a text and then to tsquery,
> Then I can compare a tsv and a tsquery.
> SELECT ts_rank(i.tsv, replace(strip(original.tsv)::text, ' ', '|')::tsquery)
> as similarity, i.company, i.industry, i.post_timestamp, i.id FROM items i,
> (SELECT tsv, id FROM items WHERE id=10) AS original WHERE i.id !=
> original.id ORDER BY similarity;
>
> items table:
> id bigint
> company varchar
> industry varchar
> description varchar
> post_timestamp timestamp
> attachment bytea
> tsv tsvector
>
> The problem is that this is very slow.
> Any comment?
>
> Thank and regards,
> Patrick
>
>


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Find similar records (compare tsvectors)

2015-03-06 Thread Patrick Dung
Resend.
How to quickly compare the similarity of two tsvector?
 

 On Monday, March 2, 2015 11:01 PM, Patrick Dung  
wrote:
   

 Hello,
I had a database with articles or attachment stored in bytea format.I also had 
a trigger: it insert/update the tsv column when a record is added/updated.The 
tsv column had a GIN index.With this setting, I can do very fast keyword search 
on the tsv.
Suppose I had a specific record (id=10).How to list similar records based 
on ranking?In that case, I had to compare a tsvector with another tsvector.
I had this SQL which make the original tsv as a text and then to tsquery, Then 
I can compare a tsv and a tsquery.
SELECT ts_rank(i.tsv, replace(strip(original.tsv)::text, ' ', '|')::tsquery) as 
similarity, i.company, i.industry, i.post_timestamp, i.id FROM items i, (SELECT 
tsv, id FROM items WHERE id=10) AS original WHERE i.id != original.id ORDER 
BY similarity;
items table:id bigint
company varchar
industry varchardescription varcharpost_timestamp timestampattachment bytea
tsv tsvector

The problem is that this is very slow.Any comment?
Thank and regards,Patrick




[GENERAL] Find similar records (compare tsvectors)

2015-03-02 Thread Patrick Dung
Hello,
I had a database with articles or attachment stored in bytea format.I also had 
a trigger: it insert/update the tsv column when a record is added/updated.The 
tsv column had a GIN index.With this setting, I can do very fast keyword search 
on the tsv.
Suppose I had a specific record (id=10).How to list similar records based 
on ranking?In that case, I had to compare a tsvector with another tsvector.
I had this SQL which make the original tsv as a text and then to tsquery, Then 
I can compare a tsv and a tsquery.
SELECT ts_rank(i.tsv, replace(strip(original.tsv)::text, ' ', '|')::tsquery) as 
similarity, i.company, i.industry, i.post_timestamp, i.id FROM items i, (SELECT 
tsv, id FROM items WHERE id=10) AS original WHERE i.id != original.id ORDER 
BY similarity;
items table:id bigint
company varchar
industry varchardescription varcharpost_timestamp timestampattachment bytea
tsv tsvector

The problem is that this is very slow.Any comment?
Thank and regards,Patrick