Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-23 Thread Lucas Possamai
Hi there,

The problem was solved by using lowercase

*New index:*

> CREATE INDEX CONCURRENTLY  ON public.ja_jobs (clientid, lower(title)
> varchar_pattern_ops, time_job);

*New query:*

> SELECT DISTINCT title
> FROM public.ja_jobs WHERE lower(title) LIKE lower('RYAN
> WER')
> AND clientid = 31239
> AND time_job > 1457826264
> order BY title
> limit 10;


- Improvement of 1400%


Thanks

Lucas


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-14 Thread Jeff Janes
On Thu, May 12, 2016 at 2:32 PM, Lucas Possamai  wrote:

>> ->  Bitmap Index Scan on "ix_jobs_trgm_gin"
>> (cost=0.00..335.64 rows=485 width=0) (actual time=3883.886..3883.886 rows=32
>> loops=1)
>>   Index Cond: (("title")::"text" ~~* '%RYAN
>> WER%'::"text")
>>   Buffers: shared hit=5945
>> Total runtime: 3945.554 ms

So it is not cold-cache or IO problems, but a CPU problem.  Your query
only has 6 trigrams in it, and that is causing nearly 6000 buffer
hits.  I'm guessing the "  w" trigram is extremely common in your data
set.  Anyway, you have some huge posting lists there, and they were
not dealt with very well in 9.2 or 9.3.

Cheers,

Jeff


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-12 Thread Lucas Possamai
>
> With those sizes, the gin index will probably be naturally kept mostly
> in the file-system cache, if it is used regularly.  So the original
> slowness of your first query is likely just a cold-cache problem.  Can
> you generate a stream of realistic queries and see what it stabilizes
> at?
>
>
> > I just wanted to understand why the GIN index is not working, but it
> works
> > here:
> https://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/
>
> In your first email, the gin index did "work", according to the
> execution plan.  It just wasn't as fast as you wanted.  In general,
> the longer the query string is between the %%, the worse it will
> perform (until version 9.6, and to a smaller degree even with 9.6).
> But it still seems oddly slow to me, unless you have a cold-cache and
> really bad (or overloaded) IO.
>
> >
> >>
> >>
> >> It would be interesting to see the output of explain (analyze,
> >> buffers) with track_io_timing turned on.
> >
> >
> > explain analyze buffer with track_io_timing turned on:
> ...
>
> That is the wrong query.  The CTE (i.e. the WITH part) is an
> optimization fence, so it can't use the gin index, simply because of
> the way you query is written.  (I think Melvin suggested it because he
> noticed that using the gin index actually slowed down the query, so he
> wanted to force it to not be used.)
>


Oh ok.

- Here is the explain analyze buffer with the original query I posted here
with the gin index:

Query:

>  explain (analyze, buffers)
>  SELECT title
> FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
> and clientid = 31239  AND time_job > 1457826264
> order BY title
> limit 10


Explain analyze:

> Limit  (cost=390.07..390.08 rows=1 width=20) (actual
> time=3945.263..3945.280 rows=4 loops=1)
>   Buffers: shared hit=5956 read=10
>   I/O Timings: read=60.323
>   ->  Sort  (cost=390.07..390.08 rows=1 width=20) (actual
> time=3945.256..3945.260 rows=4 loops=1)
> Sort Key: "title"
> Sort Method: quicksort  Memory: 25kB
> Buffers: shared hit=5956 read=10
> I/O Timings: read=60.323
> ->  Bitmap Heap Scan on "ja_jobs"  (cost=386.05..390.06 rows=1
> width=20) (actual time=3944.857..3945.127 rows=4 loops=1)
>   Recheck Cond: (("clientid" = 31239) AND ("time_job" >
> 1457826264) AND (("title")::"text" ~~* '% WER%'::"text"))
>   Buffers: shared hit=5951 read=10
>   I/O Timings: read=60.323
>   ->  BitmapAnd  (cost=386.05..386.05 rows=1 width=0) (actual
> time=3929.540..3929.540 rows=0 loops=1)
> Buffers: shared hit=5950 read=7
> I/O Timings: read=45.021
> ->  Bitmap Index Scan on "ix_jobs_client_times"
>  (cost=0.00..50.16 rows=1660 width=0) (actual time=45.536..45.536 rows=795
> loops=1)
>   Index Cond: (("clientid" = 31239) AND
> ("time_job" > 1457826264))
>   Buffers: shared hit=5 read=7
>   I/O Timings: read=45.021
> ->  Bitmap Index Scan on "ix_jobs_trgm_gin"
>  (cost=0.00..335.64 rows=485 width=0) (actual time=3883.886..3883.886
> rows=32 loops=1)
>   Index Cond: (("title")::"text" ~~* '%RYAN WER
> %'::"text")
>   Buffers: shared hit=5945
> Total runtime: 3945.554 ms


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-12 Thread Jeff Janes
On Wed, May 11, 2016 at 11:59 PM, Lucas Possamai  wrote:
>
>>
>> How big is the table?  The gin index?  shared_buffers?  RAM?  What
>> kind of IO system do you have, and how many other things were going on
>> with it?
>
>
> - Just a reminder that I'm not running these tests on my prod server.. I'm
> running on my test server. So the confs will be different
>
>> The table is 9GB big
>> The gin index is 400MB big
>> shared_buffers = 1536MB
>> RAM = 8 GB

With those sizes, the gin index will probably be naturally kept mostly
in the file-system cache, if it is used regularly.  So the original
slowness of your first query is likely just a cold-cache problem.  Can
you generate a stream of realistic queries and see what it stabilizes
at?


> I just wanted to understand why the GIN index is not working, but it works
> here: https://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/

In your first email, the gin index did "work", according to the
execution plan.  It just wasn't as fast as you wanted.  In general,
the longer the query string is between the %%, the worse it will
perform (until version 9.6, and to a smaller degree even with 9.6).
But it still seems oddly slow to me, unless you have a cold-cache and
really bad (or overloaded) IO.

>
>>
>>
>> It would be interesting to see the output of explain (analyze,
>> buffers) with track_io_timing turned on.
>
>
> explain analyze buffer with track_io_timing turned on:
...

That is the wrong query.  The CTE (i.e. the WITH part) is an
optimization fence, so it can't use the gin index, simply because of
the way you query is written.  (I think Melvin suggested it because he
noticed that using the gin index actually slowed down the query, so he
wanted to force it to not be used.)

Cheers,

Jeff


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-12 Thread Andreas Joseph Krogh
På torsdag 12. mai 2016 kl. 10:05:01, skrev Andreas Joseph Krogh <
andr...@visena.com >:
[snp] I created this test:
 
create table ja_jobs(id bigserial primary key, title varchar not null, 
clientid bigint not null, time_job bigint not null);
CREATE INDEX ix_ja_jobs_trgm_clientid_gin ON public.ja_jobs USING gin (title 
gin_trgm_ops, clientid);

 --- insert some test-data
 
As you see, this uses the index (when casting clientid to bigint):


 
andreak=# explain analyze SELECT DISTINCT title  
 FROM ja_jobs WHERE title ILIKE '%ras du%'
 and clientid = 12::bigint AND time_job > 257826264
 order BY title
 limit 10;
   QUERY 
PLAN   
 
---
  Limit  (cost=8.43..8.44 rows=1 width=32) (actual time=0.033..0.034 rows=1 
loops=1)
    ->  Unique  (cost=8.43..8.44 rows=1 width=32) (actual time=0.032..0.032 
rows=1 loops=1)
  ->  Sort  (cost=8.43..8.43 rows=1 width=32) (actual 
time=0.032..0.032 rows=1 loops=1)
    Sort Key: title
    Sort Method: quicksort  Memory: 25kB
    ->  Bitmap Heap Scan on ja_jobs  (cost=7.20..8.42 rows=1 
width=32) (actual time=0.025..0.025 rows=1 loops=1)
  Recheck Cond: (((title)::text ~~* '%ras du%'::text) AND 
(clientid = '12'::bigint))
  Filter: (time_job > 257826264)
  Heap Blocks: exact=1
  ->  Bitmap Index Scan on ix_ja_jobs_trgm_clientid_gin  
(cost=0.00..7.20 rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=1)
    Index Cond: (((title)::text ~~* '%ras du%'::text) 
AND (clientid = '12'::bigint))
  Planning time: 0.169 ms
  Execution time: 0.061 ms
 (13 rows)

 
Forgot to say, this is in PG-9.6 (master), but should work on previous 
versions.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 


 


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-12 Thread Andreas Joseph Krogh
På torsdag 12. mai 2016 kl. 09:57:58, skrev Andreas Joseph Krogh <
andr...@visena.com >:
På torsdag 12. mai 2016 kl. 02:30:33, skrev Lucas Possamai <
drum.lu...@gmail.com >:
Hi there!  
I've got a simple but slow query:
 
 SELECT DISTINCT title  
 FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
 and clientid = 31239  AND time_job > 1457826264
 order BY title
 limit 10 
 
Explain analyze: 
 
Limit  (cost=5946.40..5946.41 rows=1 width=19) (actual time=2746.759..2746.772 
rows=1 loops=1)
   ->  Unique  (cost=5946.40..5946.41 rows=1 width=19) (actual 
time=2746.753..2746.763 rows=1 loops=1)
         ->  Sort  (cost=5946.40..5946.41 rows=1 width=19) (actual 
time=2746.750..2746.754 rows=4 loops=1)
               Sort Key: "title"
               Sort Method: quicksort  Memory: 25kB
               ->  Bitmap Heap Scan on "ja_jobs"  (cost=49.02..5946.39 rows=1 
width=19) (actual time=576.275..2746.609 rows=4 loops=1)
                     Recheck Cond: (("clientid" = 31239) AND ("time_job" > 
1457826264))
                     Filter: (("title")::"text" ~~* '%RYAN WER%'::"text")
                     Rows Removed by Filter: 791
                     ->  Bitmap Index Scan on "ix_jobs_client_times" 
 (cost=0.00..49.02 rows=1546 width=0) (actual time=100.870..100.870 rows=795 
loops=1)
                           Index Cond: (("clientid" = 31239) AND ("time_job" > 
1457826264))
 Total runtime: 2746.879 ms  
Then, I created a trgm index:
 
CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title 
gin_trgm_ops);
 
Explain analyze after the index: (Yes, I ran the analyze)
 
Limit  (cost=389.91..389.91 rows=1 width=20) (actual time=3720.511..3720.511 
rows=0 loops=1)
   ->  Unique  (cost=389.91..389.91 rows=1 width=20) (actual 
time=3720.507..3720.507 rows=0 loops=1)
         ->  Sort  (cost=389.91..389.91 rows=1 width=20) (actual 
time=3720.505..3720.505 rows=0 loops=1)
               Sort Key: "title"
               Sort Method: quicksort  Memory: 25kB
               ->  Bitmap Heap Scan on "ja_jobs"  (cost=385.88..389.90 rows=1 
width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
                     Recheck Cond: (("clientid" = 31239) AND ("time_job" > 
1457826264) AND (("title")::"text" ~~ '%RYAN WER%'::"text"))
                     Rows Removed by Index Recheck: 4
                     ->  BitmapAnd  (cost=385.88..385.88 rows=1 width=0) 
(actual time=3720.469..3720.469 rows=0 loops=1)
                           ->  Bitmap Index Scan on "ix_jobs_client_times" 
 (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795 
loops=1)
                                 Index Cond: (("clientid" = 31239) AND 
("time_job" > 1457826264))
                           ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin" 
 (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213 rows=32 
loops=1)
                                 Index Cond: (("title")::"text" ~~ '%RYAN WER
%'::"text")
 Total runtime: 3720.653 ms   
 
so.. the query is still slow.. 
Do you guys  know what can be done ? related to the ILIKE?

 
cheers
Lucas

 
It uses available indexes, but that isn't good enough.
 
Try including clientid in the index, using the btree_gin extension:
 
CREATE INDEX ix_ja_jobs_trgm_clientid_gin ON public.ja_jobs USING gin (title 
gin_trgm_ops, clientid);
 
Note that if clientid is a bigint you have to cast the value to bigint for 
btree_gin to use it (note that this isn't necessary if you use a prepared 
statement):
 
SELECT DISTINCT title  
         FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
         and clientid = 31239::bigint  AND time_job > 1457826264
         order BY title
         limit 10

 
Also note that the index cannot ant won't be used for sorting. A bitmap-AND is 
also inevitable because GIN-indexes cannot be used for the '>' operator, so PG 
uses the ix_jobs_client_times btree-index and bigmap-ANDs the result.
 
Can you post you complete schema?
 
I created this test:
 
create table ja_jobs(id bigserial primary key, title varchar not null, 
clientid bigint not null, time_job bigint not null);
CREATE INDEX ix_ja_jobs_trgm_clientid_gin ON public.ja_jobs USING gin (title 
gin_trgm_ops, clientid);

 --- insert some test-data
 
As you see, this uses the index (when casting clientid to bigint):


 
andreak=# explain analyze SELECT DISTINCT title  
 FROM ja_jobs WHERE title ILIKE '%ras du%'
 and clientid = 12::bigint AND time_job > 257826264
 order BY title
 limit 10;
   QUERY 
PLAN   
 
---
  Limit  (cost=8.43..8.44 rows=1 width=32) (actual time=0.033..0.034 rows=1 
loops=1)
    ->  Unique  (cost=8.43..8.44 rows=1 width=32) (actual time=0.032..0.032 
rows=1 loops=1)

Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-12 Thread Andreas Joseph Krogh
På torsdag 12. mai 2016 kl. 02:30:33, skrev Lucas Possamai >:
Hi there!  
I've got a simple but slow query:
 
 SELECT DISTINCT title  
 FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
 and clientid = 31239  AND time_job > 1457826264
 order BY title
 limit 10 
 
Explain analyze: 
 
Limit  (cost=5946.40..5946.41 rows=1 width=19) (actual time=2746.759..2746.772 
rows=1 loops=1)
   ->  Unique  (cost=5946.40..5946.41 rows=1 width=19) (actual 
time=2746.753..2746.763 rows=1 loops=1)
         ->  Sort  (cost=5946.40..5946.41 rows=1 width=19) (actual 
time=2746.750..2746.754 rows=4 loops=1)
               Sort Key: "title"
               Sort Method: quicksort  Memory: 25kB
               ->  Bitmap Heap Scan on "ja_jobs"  (cost=49.02..5946.39 rows=1 
width=19) (actual time=576.275..2746.609 rows=4 loops=1)
                     Recheck Cond: (("clientid" = 31239) AND ("time_job" > 
1457826264))
                     Filter: (("title")::"text" ~~* '%RYAN WER%'::"text")
                     Rows Removed by Filter: 791
                     ->  Bitmap Index Scan on "ix_jobs_client_times" 
 (cost=0.00..49.02 rows=1546 width=0) (actual time=100.870..100.870 rows=795 
loops=1)
                           Index Cond: (("clientid" = 31239) AND ("time_job" > 
1457826264))
 Total runtime: 2746.879 ms  
Then, I created a trgm index:
 
CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title 
gin_trgm_ops);
 
Explain analyze after the index: (Yes, I ran the analyze)
 
Limit  (cost=389.91..389.91 rows=1 width=20) (actual time=3720.511..3720.511 
rows=0 loops=1)
   ->  Unique  (cost=389.91..389.91 rows=1 width=20) (actual 
time=3720.507..3720.507 rows=0 loops=1)
         ->  Sort  (cost=389.91..389.91 rows=1 width=20) (actual 
time=3720.505..3720.505 rows=0 loops=1)
               Sort Key: "title"
               Sort Method: quicksort  Memory: 25kB
               ->  Bitmap Heap Scan on "ja_jobs"  (cost=385.88..389.90 rows=1 
width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
                     Recheck Cond: (("clientid" = 31239) AND ("time_job" > 
1457826264) AND (("title")::"text" ~~ '%RYAN WER%'::"text"))
                     Rows Removed by Index Recheck: 4
                     ->  BitmapAnd  (cost=385.88..385.88 rows=1 width=0) 
(actual time=3720.469..3720.469 rows=0 loops=1)
                           ->  Bitmap Index Scan on "ix_jobs_client_times" 
 (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795 
loops=1)
                                 Index Cond: (("clientid" = 31239) AND 
("time_job" > 1457826264))
                           ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin" 
 (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213 rows=32 
loops=1)
                                 Index Cond: (("title")::"text" ~~ '%RYAN WER
%'::"text")
 Total runtime: 3720.653 ms   
 
so.. the query is still slow.. 
Do you guys  know what can be done ? related to the ILIKE?

 
cheers
Lucas

 
It uses available indexes, but that isn't good enough.
 
Try including clientid in the index, using the btree_gin extension:
 
CREATE INDEX ix_ja_jobs_trgm_clientid_gin ON public.ja_jobs USING gin (title 
gin_trgm_ops, clientid);
 
Note that if clientid is a bigint you have to cast the value to bigint for 
btree_gin to use it (note that this isn't necessary if you use a prepared 
statement):
 
SELECT DISTINCT title  
         FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
         and clientid = 31239::bigint  AND time_job > 1457826264
         order BY title
         limit 10

 
Also note that the index cannot ant won't be used for sorting. A bitmap-AND is 
also inevitable because GIN-indexes cannot be used for the '>' operator, so PG 
uses the ix_jobs_client_times btree-index and bigmap-ANDs the result.
 
Can you post you complete schema?
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 


 


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-12 Thread Lucas Possamai
> How big is the table?  The gin index?  shared_buffers?  RAM?  What
> kind of IO system do you have, and how many other things were going on
> with it?
>

- Just a reminder that I'm not running these tests on my prod server.. I'm
running on my test server. So the confs will be different

The table is 9GB big
> The gin index is 400MB big
> shared_buffers = 1536MB
> RAM = 8 GB


I just wanted to understand why the GIN index is not working, but it works
here: https://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/


>
> It would be interesting to see the output of explain (analyze,
> buffers) with track_io_timing turned on.
>

explain analyze buffer with track_io_timing turned on:


Limit  (cost=93466.83..93466.83 rows=1 width=218) (actual
> time=24025.463..24025.478 rows=5 loops=1)
>   Buffers: shared hit=8 read=42285
>   I/O Timings: read=23599.672
>   CTE ja_jobs
> ->  HashAggregate  (cost=93451.05..93455.90 rows=485 width=20) (actual
> time=23946.801..23967.660 rows=16320 loops=1)
>   Buffers: shared hit=3 read=42285
>   I/O Timings: read=23599.672
>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=877.70..93374.92
> rows=30453 width=20) (actual time=161.372..23835.632 rows=48472 loops=1)
> Recheck Cond: (("clientid" = 14635) AND ("time_job" >
> 1436731799))
> Buffers: shared hit=3 read=42285
> I/O Timings: read=23599.672
> ->  Bitmap Index Scan on "ix_jobs_client_times"
>  (cost=0.00..870.09 rows=30453 width=0) (actual time=133.920..133.920
> rows=48472 loops=1)
>   Index Cond: (("clientid" = 14635) AND ("time_job" >
> 1436731799))
>   Buffers: shared hit=3 read=244
>   I/O Timings: read=120.137
>   ->  Sort  (cost=10.92..10.93 rows=1 width=218) (actual
> time=24025.457..24025.462 rows=5 loops=1)
> Sort Key: "ja_jobs"."title"
> Sort Method: quicksort  Memory: 25kB
> Buffers: shared hit=8 read=42285
> I/O Timings: read=23599.672
> ->  CTE Scan on "ja_jobs"  (cost=0.00..10.91 rows=1 width=218)
> (actual time=23977.095..24025.325 rows=5 loops=1)
>   Filter: (("title")::"text" ~~* '%To Electrical%'::"text")
>   Rows Removed by Filter: 16315
>   Buffers: shared hit=3 read=42285
>   I/O Timings: read=23599.672
> Total runtime: 24028.551 ms




>
> There have been improvements in this area since 9.2, you should
> consider upgrading to at least 9.4.
>
>
Yep I know. The upgrade will happen, but I don't know when.


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Jeff Janes
On Wed, May 11, 2016 at 5:30 PM, Lucas Possamai  wrote:
> Hi there!
>
> I've got a simple but slow query:
>
>>  SELECT DISTINCT title
>> FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
>> and clientid = 31239  AND time_job > 1457826264
>> order BY title
>> limit 10


>
> CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title
> gin_trgm_ops);
>
> Explain analyze after the index: (Yes, I ran the analyze)
>
>> Limit  (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.511..3720.511 rows=0 loops=1)
>>   ->  Unique  (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.507..3720.507 rows=0 loops=1)
>> ->  Sort  (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.505..3720.505 rows=0 loops=1)
>>   Sort Key: "title"
>>   Sort Method: quicksort  Memory: 25kB
>>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=385.88..389.90
>> rows=1 width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
>> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
>> 1457826264) AND (("title")::"text" ~~ '%RYAN WER%'::"text"))
>> Rows Removed by Index Recheck: 4
>> ->  BitmapAnd  (cost=385.88..385.88 rows=1 width=0)
>> (actual time=3720.469..3720.469 rows=0 loops=1)
>>   ->  Bitmap Index Scan on "ix_jobs_client_times"
>> (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795
>> loops=1)
>> Index Cond: (("clientid" = 31239) AND
>> ("time_job" > 1457826264))
>>   ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin"
>> (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213 rows=32
>> loops=1)
>> Index Cond: (("title")::"text" ~~ '%RYAN
>> WER%'::"text")
>> Total runtime: 3720.653 ms


How big is the table?  The gin index?  shared_buffers?  RAM?  What
kind of IO system do you have, and how many other things were going on
with it?

It would be interesting to see the output of explain (analyze,
buffers) with track_io_timing turned on.

There have been improvements in this area since 9.2, you should
consider upgrading to at least 9.4.

Cheers,

Jeff


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Jan de Visser
On Wed, May 11, 2016 at 10:03 PM, Lucas Possamai 
wrote:

>
>>>
>> Trying redoing the query with CTE as below:
>>
>> WITH ja_jobs as
>>   (SELECT DISTINCT title
>>  FROM ja_jobs
>> WHERE clientid = 31239  AND time_job > 1457826264
>>   )
>> SELECT title
>>   FROM ja_jobs
>>  WHERE title ILIKE 'RYAN WER%'
>>  ORDER BY title
>>  LIMIT 10;
>>
>
> hmm.. still slow =(
>
>
> and it's not hitting the index: (i had to change the clientid because the
> previous one was in cache)
>
> Limit  (cost=93790.08..93790.09 rows=1 width=218) (actual
>> time=284.293..284.308 rows=5 loops=1)
>>   Buffers: shared hit=42284
>>   CTE ja_jobs
>> ->  HashAggregate  (cost=93774.31..93779.16 rows=485 width=20)
>> (actual time=207.235..228.141 rows=16320 loops=1)
>>   Buffers: shared hit=42284
>>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=882.98..93697.86
>> rows=30578 width=20) (actual time=21.942..133.380 rows=48472 loops=1)
>> Recheck Cond: (("clientid" = 14635) AND ("time_job" >
>> 1436731799))
>> Buffers: shared hit=42284
>> ->  Bitmap Index Scan on "ix_jobs_client_times"
>>  (cost=0.00..875.34 rows=30578 width=0) (actual time=12.389..12.389
>> rows=48472 loops=1)
>>   Index Cond: (("clientid" = 14635) AND ("time_job" >
>> 1436731799))
>>   Buffers: shared hit=243
>>   ->  Sort  (cost=10.92..10.93 rows=1 width=218) (actual
>> time=284.289..284.293 rows=5 loops=1)
>> Sort Key: "ja_jobs"."title"
>> Sort Method: quicksort  Memory: 25kB
>> Buffers: shared hit=42284
>> ->  CTE Scan on "ja_jobs"  (cost=0.00..10.91 rows=1 width=218)
>> (actual time=236.248..284.263 rows=5 loops=1)
>>   Filter: (("title")::"text" ~~* '%To Electrical%'::"text")
>>   Rows Removed by Filter: 16315
>>   Buffers: shared hit=42284
>> Total runtime: 287.633 ms
>
>
>

I think a GIN index can't be used for sorting.


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Lucas Possamai
>
>
>>
> Trying redoing the query with CTE as below:
>
> WITH ja_jobs as
>   (SELECT DISTINCT title
>  FROM ja_jobs
> WHERE clientid = 31239  AND time_job > 1457826264
>   )
> SELECT title
>   FROM ja_jobs
>  WHERE title ILIKE 'RYAN WER%'
>  ORDER BY title
>  LIMIT 10;
>

hmm.. still slow =(


and it's not hitting the index: (i had to change the clientid because the
previous one was in cache)

Limit  (cost=93790.08..93790.09 rows=1 width=218) (actual
> time=284.293..284.308 rows=5 loops=1)
>   Buffers: shared hit=42284
>   CTE ja_jobs
> ->  HashAggregate  (cost=93774.31..93779.16 rows=485 width=20) (actual
> time=207.235..228.141 rows=16320 loops=1)
>   Buffers: shared hit=42284
>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=882.98..93697.86
> rows=30578 width=20) (actual time=21.942..133.380 rows=48472 loops=1)
> Recheck Cond: (("clientid" = 14635) AND ("time_job" >
> 1436731799))
> Buffers: shared hit=42284
> ->  Bitmap Index Scan on "ix_jobs_client_times"
>  (cost=0.00..875.34 rows=30578 width=0) (actual time=12.389..12.389
> rows=48472 loops=1)
>   Index Cond: (("clientid" = 14635) AND ("time_job" >
> 1436731799))
>   Buffers: shared hit=243
>   ->  Sort  (cost=10.92..10.93 rows=1 width=218) (actual
> time=284.289..284.293 rows=5 loops=1)
> Sort Key: "ja_jobs"."title"
> Sort Method: quicksort  Memory: 25kB
> Buffers: shared hit=42284
> ->  CTE Scan on "ja_jobs"  (cost=0.00..10.91 rows=1 width=218)
> (actual time=236.248..284.263 rows=5 loops=1)
>   Filter: (("title")::"text" ~~* '%To Electrical%'::"text")
>   Rows Removed by Filter: 16315
>   Buffers: shared hit=42284
> Total runtime: 287.633 ms


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Melvin Davidson
On Wed, May 11, 2016 at 9:36 PM, Lucas Possamai 
wrote:

>
>>>
>> The main problem is WHERE title ILIKE '%RYAN WER%'
>> When you put a % on the left of the text, there is no way to optimize
>> that, so yes, it will be slow.
>>
>> If you can eliminate the leading percent and just have trailing, it will
>> be much faster.
>>
>>
>
> Hmm.. yep.. I suppose I can do that.
>
> But, taking the left % off, the query is still slow:
>
> Limit  (cost=418.57..418.58 rows=1 width=20) (actual
>> time=4439.367..4439.381 rows=1 loops=1)
>>   Buffers: shared hit=6847
>>   ->  Unique  (cost=418.57..418.58 rows=1 width=20) (actual
>> time=4439.363..4439.374 rows=1 loops=1)
>> Buffers: shared hit=6847
>> ->  Sort  (cost=418.57..418.58 rows=1 width=20) (actual
>> time=4439.360..4439.365 rows=4 loops=1)
>>   Sort Key: "title"
>>   Sort Method: quicksort  Memory: 25kB
>>   Buffers: shared hit=6847
>>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=414.55..418.56
>> rows=1 width=20) (actual time=4439.312..4439.329 rows=4 loops=1)
>> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
>> 1457826264) AND (("title")::"text" ~~* 'RYAN SHOWER%'::"text"))
>> Buffers: shared hit=6847
>> ->  BitmapAnd  (cost=414.55..414.55 rows=1 width=0)
>> (actual time=4439.280..4439.280 rows=0 loops=1)
>>   Buffers: shared hit=6843
>>   ->  Bitmap Index Scan on "ix_jobs_client_times"
>>  (cost=0.00..50.67 rows=1711 width=0) (actual time=0.142..0.142 rows=795
>> loops=1)
>> Index Cond: (("clientid" = 31239) AND
>> ("time_job" > 1457826264))
>> Buffers: shared hit=8
>>   ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin"
>>  (cost=0.00..363.62 rows=483 width=0) (actual time=4439.014..4439.014
>> rows=32 loops=1)
>> Index Cond: (("title")::"text" ~~* 'RYAN
>> SHOWER%'::"text")
>> Buffers: shared hit=6835
>> Total runtime: 4439.427 ms
>
>
> Here [1] it appears to be working even with two %.. But it's not for
> me
>
> [1] https://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/
>
>
> Any ideia? lol
>

Trying redoing the query with CTE as below:

WITH ja_jobs as
  (SELECT DISTINCT title
 FROM ja_jobs
WHERE clientid = 31239  AND time_job > 1457826264
  )
SELECT title
  FROM ja_jobs
 WHERE title ILIKE 'RYAN WER%'
 ORDER BY title
 LIMIT 10;


-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Lucas Possamai
>
>
>>
> The main problem is WHERE title ILIKE '%RYAN WER%'
> When you put a % on the left of the text, there is no way to optimize
> that, so yes, it will be slow.
>
> If you can eliminate the leading percent and just have trailing, it will
> be much faster.
>
>

Hmm.. yep.. I suppose I can do that.

But, taking the left % off, the query is still slow:

Limit  (cost=418.57..418.58 rows=1 width=20) (actual
> time=4439.367..4439.381 rows=1 loops=1)
>   Buffers: shared hit=6847
>   ->  Unique  (cost=418.57..418.58 rows=1 width=20) (actual
> time=4439.363..4439.374 rows=1 loops=1)
> Buffers: shared hit=6847
> ->  Sort  (cost=418.57..418.58 rows=1 width=20) (actual
> time=4439.360..4439.365 rows=4 loops=1)
>   Sort Key: "title"
>   Sort Method: quicksort  Memory: 25kB
>   Buffers: shared hit=6847
>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=414.55..418.56
> rows=1 width=20) (actual time=4439.312..4439.329 rows=4 loops=1)
> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
> 1457826264) AND (("title")::"text" ~~* 'RYAN SHOWER%'::"text"))
> Buffers: shared hit=6847
> ->  BitmapAnd  (cost=414.55..414.55 rows=1 width=0)
> (actual time=4439.280..4439.280 rows=0 loops=1)
>   Buffers: shared hit=6843
>   ->  Bitmap Index Scan on "ix_jobs_client_times"
>  (cost=0.00..50.67 rows=1711 width=0) (actual time=0.142..0.142 rows=795
> loops=1)
> Index Cond: (("clientid" = 31239) AND
> ("time_job" > 1457826264))
> Buffers: shared hit=8
>   ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin"
>  (cost=0.00..363.62 rows=483 width=0) (actual time=4439.014..4439.014
> rows=32 loops=1)
> Index Cond: (("title")::"text" ~~* 'RYAN
> SHOWER%'::"text")
> Buffers: shared hit=6835
> Total runtime: 4439.427 ms


Here [1] it appears to be working even with two %.. But it's not for me

[1] https://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/


Any ideia? lol


Re: [GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Melvin Davidson
On Wed, May 11, 2016 at 8:30 PM, Lucas Possamai 
wrote:

> Hi there!
>
> I've got a simple but slow query:
>
>  SELECT DISTINCT title
>> FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
>> and clientid = 31239  AND time_job > 1457826264
>> order BY title
>> limit 10
>
>
> Explain analyze:
>
> Limit  (cost=5946.40..5946.41 rows=1 width=19) (actual
>> time=2746.759..2746.772 rows=1 loops=1)
>>   ->  Unique  (cost=5946.40..5946.41 rows=1 width=19) (actual
>> time=2746.753..2746.763 rows=1 loops=1)
>> ->  Sort  (cost=5946.40..5946.41 rows=1 width=19) (actual
>> time=2746.750..2746.754 rows=4 loops=1)
>>   Sort Key: "title"
>>   Sort Method: quicksort  Memory: 25kB
>>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=49.02..5946.39
>> rows=1 width=19) (actual time=576.275..2746.609 rows=4 loops=1)
>> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
>> 1457826264))
>> Filter: (("title")::"text" ~~* '%RYAN WER%'::"text")
>> Rows Removed by Filter: 791
>> ->  Bitmap Index Scan on "ix_jobs_client_times"
>>  (cost=0.00..49.02 rows=1546 width=0) (actual time=100.870..100.870
>> rows=795 loops=1)
>>   Index Cond: (("clientid" = 31239) AND
>> ("time_job" > 1457826264))
>> Total runtime: 2746.879 ms
>
>
> Then, I created a trgm index:
>
> CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title
> gin_trgm_ops);
>
> Explain analyze after the index: (Yes, I ran the analyze)
>
> Limit  (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.511..3720.511 rows=0 loops=1)
>>   ->  Unique  (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.507..3720.507 rows=0 loops=1)
>> ->  Sort  (cost=389.91..389.91 rows=1 width=20) (actual
>> time=3720.505..3720.505 rows=0 loops=1)
>>   Sort Key: "title"
>>   Sort Method: quicksort  Memory: 25kB
>>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=385.88..389.90
>> rows=1 width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
>> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
>> 1457826264) AND (("title")::"text" ~~ '%RYAN WER%'::"text"))
>> Rows Removed by Index Recheck: 4
>> ->  BitmapAnd  (cost=385.88..385.88 rows=1 width=0)
>> (actual time=3720.469..3720.469 rows=0 loops=1)
>>   ->  Bitmap Index Scan on "ix_jobs_client_times"
>>  (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795
>> loops=1)
>> Index Cond: (("clientid" = 31239) AND
>> ("time_job" > 1457826264))
>>   ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin"
>>  (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213
>> rows=32 loops=1)
>> Index Cond: (("title")::"text" ~~ '%RYAN
>> WER%'::"text")
>> Total runtime: 3720.653 ms
>
>
>
> so.. the query is still slow..
> Do you guys  know what can be done ? related to the ILIKE?
>
> cheers
> Lucas
>

The main problem is WHERE title ILIKE '%RYAN WER%'
When you put a % on the left of the text, there is no way to optimize that,
so yes, it will be slow.

If you can eliminate the leading percent and just have trailing, it will be
much faster.


-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


[GENERAL] index on ILIKE/LIKE - PostgreSQL 9.2

2016-05-11 Thread Lucas Possamai
Hi there!

I've got a simple but slow query:

 SELECT DISTINCT title
> FROM ja_jobs WHERE title ILIKE '%RYAN WER%'
> and clientid = 31239  AND time_job > 1457826264
> order BY title
> limit 10


Explain analyze:

Limit  (cost=5946.40..5946.41 rows=1 width=19) (actual
> time=2746.759..2746.772 rows=1 loops=1)
>   ->  Unique  (cost=5946.40..5946.41 rows=1 width=19) (actual
> time=2746.753..2746.763 rows=1 loops=1)
> ->  Sort  (cost=5946.40..5946.41 rows=1 width=19) (actual
> time=2746.750..2746.754 rows=4 loops=1)
>   Sort Key: "title"
>   Sort Method: quicksort  Memory: 25kB
>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=49.02..5946.39
> rows=1 width=19) (actual time=576.275..2746.609 rows=4 loops=1)
> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
> 1457826264))
> Filter: (("title")::"text" ~~* '%RYAN WER%'::"text")
> Rows Removed by Filter: 791
> ->  Bitmap Index Scan on "ix_jobs_client_times"
>  (cost=0.00..49.02 rows=1546 width=0) (actual time=100.870..100.870
> rows=795 loops=1)
>   Index Cond: (("clientid" = 31239) AND
> ("time_job" > 1457826264))
> Total runtime: 2746.879 ms


Then, I created a trgm index:

CREATE INDEX ix_ja_jobs_trgm_gin ON public.ja_jobs USING gin (title
gin_trgm_ops);

Explain analyze after the index: (Yes, I ran the analyze)

Limit  (cost=389.91..389.91 rows=1 width=20) (actual
> time=3720.511..3720.511 rows=0 loops=1)
>   ->  Unique  (cost=389.91..389.91 rows=1 width=20) (actual
> time=3720.507..3720.507 rows=0 loops=1)
> ->  Sort  (cost=389.91..389.91 rows=1 width=20) (actual
> time=3720.505..3720.505 rows=0 loops=1)
>   Sort Key: "title"
>   Sort Method: quicksort  Memory: 25kB
>   ->  Bitmap Heap Scan on "ja_jobs"  (cost=385.88..389.90
> rows=1 width=20) (actual time=3720.497..3720.497 rows=0 loops=1)
> Recheck Cond: (("clientid" = 31239) AND ("time_job" >
> 1457826264) AND (("title")::"text" ~~ '%RYAN WER%'::"text"))
> Rows Removed by Index Recheck: 4
> ->  BitmapAnd  (cost=385.88..385.88 rows=1 width=0)
> (actual time=3720.469..3720.469 rows=0 loops=1)
>   ->  Bitmap Index Scan on "ix_jobs_client_times"
>  (cost=0.00..50.00 rows=1644 width=0) (actual time=0.142..0.142 rows=795
> loops=1)
> Index Cond: (("clientid" = 31239) AND
> ("time_job" > 1457826264))
>   ->  Bitmap Index Scan on "ix_ja_jobs_trgm_gin"
>  (cost=0.00..335.63 rows=484 width=0) (actual time=3720.213..3720.213
> rows=32 loops=1)
> Index Cond: (("title")::"text" ~~ '%RYAN
> WER%'::"text")
> Total runtime: 3720.653 ms



so.. the query is still slow..
Do you guys  know what can be done ? related to the ILIKE?

cheers
Lucas