Re: [GENERAL] Trigram is slow when 10m rows
On Sun, Nov 13, 2016 at 3:54 AM, Aaron Lewis wrote: > I have a simple table with Trigram index, > > create table mytable(hash char(40), title text); > create index title_trgm_idx on mytable using gin(title gin_trgm_ops); > > When I run a query with 10m rows, it uses the Trigram index, but takes > 3s to execute, very slow. > (I have 80m rows, but only inserted 10m for testing purpose) > > test=# select count(*) from mytable; > count > -- > 13971887 > (1 row) > > test=# explain analyze select * from mytable where title ilike 'x264'; > QUERY PLAN > > -- > Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 > width=83) (actual time=2937.308..2937.308 rows=0 loops=1) >Recheck Cond: (title ~~* 'x264'::text) >Rows Removed by Index Recheck: 11402855 >Heap Blocks: exact=39557 lossy=158010 >-> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 > rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1) > Index Cond: (title ~~* 'x264'::text) > Planning time: 0.611 ms > Execution time: 2937.729 ms > (8 rows) > > Any ideas to speed things up? > What version of postgresql and pg_trgm are you using? It might work better under 9.6/1.3 Since your query doesn't use wildcards, it is probably more well suited to a regular btree index, perhaps with citext. Cheers, Jeff
Re: [GENERAL] Trigram is slow when 10m rows
On Sun, Nov 13, 2016 at 5:25 PM, Aaron Lewis wrote: > Thanks Oleg. > > I've increased work_mem to 128MB, now the query falls down to 1.7s, > faster but still not good enough. > > Is there any other thing I can do about it? > your query 'x264' is short in terms of the number of trigrams, so trigram index isn't good. Did you tried text_pattern_ops for btree ? Something like create index title_btree_idx on mytable using btree(title text_pattern_ops ); > > test=# explain analyze select * from mytable where title ilike 'x264'; > QUERY PLAN > > -- > Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 > width=83) (actual time=1754.656..1754.656 rows=0 loops=1) >Recheck Cond: (title ~~* 'x264'::text) >Rows Removed by Index Recheck: 1220793 >Heap Blocks: exact=197567 >-> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 > rows=1380 width=0) (actual time=346.663..346.663 rows=1220793 loops=1) > Index Cond: (title ~~* 'x264'::text) > Planning time: 1.168 ms > Execution time: 1755.944 ms > > > On Sun, Nov 13, 2016 at 10:04 PM, Oleg Bartunov > wrote: > > > > > > On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis > > > wrote: > >> > >> I have a simple table with Trigram index, > >> > >> create table mytable(hash char(40), title text); > >> create index title_trgm_idx on mytable using gin(title gin_trgm_ops); > >> > >> When I run a query with 10m rows, it uses the Trigram index, but takes > >> 3s to execute, very slow. > >> (I have 80m rows, but only inserted 10m for testing purpose) > >> > >> test=# select count(*) from mytable; > >> count > >> -- > >> 13971887 > >> (1 row) > >> > >> test=# explain analyze select * from mytable where title ilike 'x264'; > >> QUERY PLAN > >> > >> > -- > >> Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 > >> width=83) (actual time=2937.308..2937.308 rows=0 loops=1) > >>Recheck Cond: (title ~~* 'x264'::text) > >>Rows Removed by Index Recheck: 11402855 > >>Heap Blocks: exact=39557 lossy=158010 > >>-> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 > >> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1) > >> Index Cond: (title ~~* 'x264'::text) > >> Planning time: 0.611 ms > >> Execution time: 2937.729 ms > >> (8 rows) > >> > >> Any ideas to speed things up? > > > > > >Rows Removed by Index Recheck: 11402855 > >Heap Blocks: exact=39557 lossy=158010 > > > > You need to increase work_mem > >> > >> > >> -- > >> Best Regards, > >> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ > >> Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 > >> > >> > >> -- > >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > >> To make changes to your subscription: > >> http://www.postgresql.org/mailpref/pgsql-general > > > > > > > > -- > Best Regards, > Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ > Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 >
Re: [GENERAL] Trigram is slow when 10m rows
Thanks Oleg. I've increased work_mem to 128MB, now the query falls down to 1.7s, faster but still not good enough. Is there any other thing I can do about it? test=# explain analyze select * from mytable where title ilike 'x264'; QUERY PLAN -- Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 width=83) (actual time=1754.656..1754.656 rows=0 loops=1) Recheck Cond: (title ~~* 'x264'::text) Rows Removed by Index Recheck: 1220793 Heap Blocks: exact=197567 -> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 rows=1380 width=0) (actual time=346.663..346.663 rows=1220793 loops=1) Index Cond: (title ~~* 'x264'::text) Planning time: 1.168 ms Execution time: 1755.944 ms On Sun, Nov 13, 2016 at 10:04 PM, Oleg Bartunov wrote: > > > On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis > wrote: >> >> I have a simple table with Trigram index, >> >> create table mytable(hash char(40), title text); >> create index title_trgm_idx on mytable using gin(title gin_trgm_ops); >> >> When I run a query with 10m rows, it uses the Trigram index, but takes >> 3s to execute, very slow. >> (I have 80m rows, but only inserted 10m for testing purpose) >> >> test=# select count(*) from mytable; >> count >> -- >> 13971887 >> (1 row) >> >> test=# explain analyze select * from mytable where title ilike 'x264'; >> QUERY PLAN >> >> -- >> Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 >> width=83) (actual time=2937.308..2937.308 rows=0 loops=1) >>Recheck Cond: (title ~~* 'x264'::text) >>Rows Removed by Index Recheck: 11402855 >>Heap Blocks: exact=39557 lossy=158010 >>-> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 >> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1) >> Index Cond: (title ~~* 'x264'::text) >> Planning time: 0.611 ms >> Execution time: 2937.729 ms >> (8 rows) >> >> Any ideas to speed things up? > > >Rows Removed by Index Recheck: 11402855 >Heap Blocks: exact=39557 lossy=158010 > > You need to increase work_mem >> >> >> -- >> Best Regards, >> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ >> Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > -- Best Regards, Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigram is slow when 10m rows
On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis wrote: > I have a simple table with Trigram index, > > create table mytable(hash char(40), title text); > create index title_trgm_idx on mytable using gin(title gin_trgm_ops); > > When I run a query with 10m rows, it uses the Trigram index, but takes > 3s to execute, very slow. > (I have 80m rows, but only inserted 10m for testing purpose) > > test=# select count(*) from mytable; > count > -- > 13971887 > (1 row) > > test=# explain analyze select * from mytable where title ilike 'x264'; > QUERY PLAN > > -- > Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 > width=83) (actual time=2937.308..2937.308 rows=0 loops=1) >Recheck Cond: (title ~~* 'x264'::text) >Rows Removed by Index Recheck: 11402855 >Heap Blocks: exact=39557 lossy=158010 >-> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 > rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1) > Index Cond: (title ~~* 'x264'::text) > Planning time: 0.611 ms > Execution time: 2937.729 ms > (8 rows) > > Any ideas to speed things up? > Rows Removed by Index Recheck: 11402855 Heap Blocks: exact=39557 lossy=158010 You need to increase work_mem > > -- > Best Regards, > Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ > Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
[GENERAL] Trigram is slow when 10m rows
I have a simple table with Trigram index, create table mytable(hash char(40), title text); create index title_trgm_idx on mytable using gin(title gin_trgm_ops); When I run a query with 10m rows, it uses the Trigram index, but takes 3s to execute, very slow. (I have 80m rows, but only inserted 10m for testing purpose) test=# select count(*) from mytable; count -- 13971887 (1 row) test=# explain analyze select * from mytable where title ilike 'x264'; QUERY PLAN -- Bitmap Heap Scan on mytable (cost=462.69..5639.67 rows=1380 width=83) (actual time=2937.308..2937.308 rows=0 loops=1) Recheck Cond: (title ~~* 'x264'::text) Rows Removed by Index Recheck: 11402855 Heap Blocks: exact=39557 lossy=158010 -> Bitmap Index Scan on title_trgm_idx (cost=0.00..462.35 rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1) Index Cond: (title ~~* 'x264'::text) Planning time: 0.611 ms Execution time: 2937.729 ms (8 rows) Any ideas to speed things up? -- Best Regards, Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general