Re: [GENERAL] LIKE and REGEX optimization

2008-01-17 Thread hubert depesz lubaczewski
On Tue, Jan 15, 2008 at 04:49:41PM -0600, Scott Marlowe wrote:
 This query is not capable of using an index on name, since you can't
 use an index with a like beginning with a %... So

actually you can. you just can't use index for like %something%, but it
can be solved using trigrams or another approaches. for example:
http://www.depesz.com/index.php/2007/09/15/speeding-up-like-xxx/

depesz

-- 
quicksil1er: postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV! :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] LIKE and REGEX optimization

2008-01-16 Thread Richard Huxton

Chris Browne wrote:

The only idea that comes to mind to follow that is to look into
tsearch.  On PostgreSQL 8.3, it's built in; on 8.2 and earlier, it's a
contrib module.  It allows indexing on words inside columns, which
would seem to fit your requirement.


You might want to google for postgresql trigram too. Some notes at the 
location below.

http://www.sai.msu.su/~megera/oddmuse/index.cgi/ReadmeTrgm

It's more of a substring search than tsearch2 is, so might meet your 
needs better. It's in the contrib package / source directory.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


[GENERAL] LIKE and REGEX optimization

2008-01-16 Thread Kico Zaninetti
Hi all.

I have a database with 62 million registers and I have to make a
SELECT using LIKE.

This is my select:
SELECT * FROM phone WHERE name LIKE = '%ZANINETTI%' AND city = 'SAO
PAULO' AND state = 'SP'

I have an index created like this:
CREATE INDEX telefones_idx2 ON public.phone
  USING btree (name varchar_pattern_ops, city, state);

When I explain the query I have this:
QUERY PLAN
Bitmap Heap Scan on telefones  (cost=1031528.27..2726942.75 rows=4
width=145)
  Recheck Cond: (((city)::text = 'SAO PAULO'::text) AND ((state)::text
= 'SP'::text))
  Filter: ((name)::text ~~ '%ZANINETTI%'::text)
  -  Bitmap Index Scan on telefones_idx2  (cost=0.00..1031528.27
rows=1712760 width=0)
Index Cond: (((city)::text = 'SAO PAULO'::text) AND
((state)::text = 'SP'::text))


The cost is over than 1 million! It's to high and I have to reduce it.
Does someone know how can I make it?

Thanks in advance.

Kico Zaninetti
carpe diem

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] LIKE and REGEX optimization

2008-01-15 Thread Chris Browne
Kico Zaninetti [EMAIL PROTECTED] writes:
 Hi all.

 I have a database with 62 million registers and I have to make a
 SELECT using LIKE.

 This is my select:
 SELECT * FROM phone WHERE name LIKE = '%ZANINETTI%' AND city = 'SAO
 PAULO' AND state = 'SP'

 I have an index created like this:
 CREATE INDEX telefones_idx2 ON public.phone
   USING btree (name varchar_pattern_ops, city, state);

 When I explain the query I have this:
 QUERY PLAN
 Bitmap Heap Scan on telefones  (cost=1031528.27..2726942.75 rows=4
 width=145)
   Recheck Cond: (((city)::text = 'SAO PAULO'::text) AND ((state)::text
 = 'SP'::text))
   Filter: ((name)::text ~~ '%ZANINETTI%'::text)
   -  Bitmap Index Scan on telefones_idx2  (cost=0.00..1031528.27
 rows=1712760 width=0)
 Index Cond: (((city)::text = 'SAO PAULO'::text) AND
 ((state)::text = 'SP'::text))


 The cost is over than 1 million! It's to high and I have to reduce it.
 Does someone know how can I make it?

You may be misreading what the cost is...

It looks to me like the planner is estimating that there are 1712760
rows where city = 'SAO PAULO' and state = 'SP'; is that estimate way
off?

If it is, then maybe you need to ANALYZE the table, perhaps combined
with altering the histogram size for city/state.

(e.g. - alter table phone alter column city set statistics 100;
alter table phone alter column state set statistics 100;)

If altering statistics, then re-ANALYZEing the table helps, that may
resolve things.

Unfortunately, the LIKE clause isn't going to readily take advantage
of regular indices, because you have put in '%' at beginning and end,
which means that regular indices will not be usable.

I wouldn't be too surprised if that is the case; Sao Paulo is about
the most populous South American city, with ~11M people, so having a
database with 1.7M phone numbers in that city does not seem out
there.

The only idea that comes to mind to follow that is to look into
tsearch.  On PostgreSQL 8.3, it's built in; on 8.2 and earlier, it's a
contrib module.  It allows indexing on words inside columns, which
would seem to fit your requirement.
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/sap.html
Signs  of  a  Klingon  Programmer #3:  By   filing this  TPR you have
challenged the honor of my family. Prepare to die!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] LIKE and REGEX optimization

2008-01-15 Thread Scott Marlowe
On Jan 15, 2008 2:29 PM, Chris Browne [EMAIL PROTECTED] wrote:
 Kico Zaninetti [EMAIL PROTECTED] writes:
  Hi all.
 
  I have a database with 62 million registers and I have to make a
  SELECT using LIKE.
 
  This is my select:
  SELECT * FROM phone WHERE name LIKE = '%ZANINETTI%' AND city = 'SAO
  PAULO' AND state = 'SP'

This query is not capable of using an index on name, since you can't
use an index with a like beginning with a %... So

  I have an index created like this:
  CREATE INDEX telefones_idx2 ON public.phone
USING btree (name varchar_pattern_ops, city, state);

This index serves no purpose, because they get used left to right.
Since you can't match name, the other two parts aren't used.

Does the OP have an index on city,state?  That might help.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] LIKE and REGEX optimization

2008-01-15 Thread Chris Browne
fit
to come to Christ; they are so wicked that Christ will never accept
them. And then it may be they set themselves upon a new course of
fruitless endeavors, in their own strength, to make themselves better,
and still meet with new disappointments. They are earnest to inquire
what they shall do. They do not know but there is something else to be
done, in order to their obtaining converting grace, that they have never
done yet. It may be they hope that they are something better than they
were; but then the pleasing dream all vanishes again. If they are told
that they trust too much to their own strength and righteousness, they
cannot unlearn this practice all at once, and find not yet the
appearance of any good, but all looks as dark as midnight to them. Thus
they wander about from mountain to hill, seeking rest, and finding none.
When they are beat out of one refuge, they fly to another; till they are
as it were debilitated, broken, and subdued with legal humblings; in
which God gives them a conviction of their own utter helplessness and
insufficiency, and discovers the true remedy in a clearer knowledge of
Christ and His gospel.

When they begin to seek salvation, they are commonly profoundly ignorant
of themselves; they are not sensible how blind they are; and how little
they can do towards bringing themselves to see spiritual things aright,
and towards putting forth gracious exercises in their own souls. They
are not sensible how remote they are from love to God, and other holy
dispositions, and how dead they are in sin. When they see unexpected
pollution in their own hearts, they go about to wash away their own
defilements, and make the



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster