subject:"\[GENERAL\] Avoid huge perfomance loss on string concatenation"

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-07 Thread Andrus

Thank you.
I ran manually analyze command.
After that query runs fast.
I have enabled autovacuum and statitics collection in config file.
Every day a lot of rows are added to dok table.

However it seems that statitics is not collected (autovacuum is not running)

Any idea autovacuum is not running ?

Andrus. 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-06 Thread Andrus

 just small correction here...expressions like that in the create index
 need an extra set of parens (but I agree with your sentiment):
 CREATE INDEX dokindex ON dok ((kuupaeve||kellaaeg))

I tried

CREATE INDEX dok_kuupaev_kellaaeg_idx ON dok ((kuupaev||kellaaeg));

but got error

ERROR: functions in index expression must be marked IMMUTABLE
SQL state: 42P17

How to create such index ?

Andrus. 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-06 Thread Andrus

 I'm not sure what that comment is supposed to mean.

 PG is using the index for the condition
 dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

 but there is no index that matches the expression
 dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

 If you look at your explain output, you'll see that step is taking a
 lot of time, and it's inside a nested loop, which means it's run
 repeatedly.

Postgres must use index to filter out rows matching to the  condition

dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

this returns 121 rows.

Additional condition

 dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

should be evaluated only for these 121 rows.

 Have you run a VACUUM ANALYZE on the tables involved with this query
 recently?  It's possible that PG has outdated statistics and is
 running a poor plan as a result.

I have the follwing command in end of postgresql.conf file:

stats_start_collector = on
stats_row_level = on
autovacuum = on

So autovacuum should be running.

 Just add the index and rerun to see if it helps.  If it doesn't, then
 drop the index.  I have absolutely no way to investigate this for you.

I tried

CREATE INDEX dok_kuupaev_kellaaeg_idx ON dok ((kuupaev||kellaaeg));

but got error

ERROR: functions in index expression must be marked IMMUTABLE
SQL state: 42P17

How to create such index ?

 For example, in your query, you have a text string meant to represent
 a date: '2007-12-0423 59'

 This is not a valid date/time, but PostgreSQL has no way to know that
 because it's just a text string.  As a result, you're query is liable
 to give you outright incorrect results.

My fields are kuupaev date  and  kellaaeg char(5)
kellaaeg is in format hh mm

I compare this always with kuupaev||kellaaeg

'2007-12-0423 59' is my valid datetime for to be used for this conversion.

I can probably convert kuupaev||kellaaeg to a datetime and use datetime 
comparison instead of this. Will this increase perfomance ?

Andrus. 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-06 Thread Andrus

Thank you very much for quick reply.

 can you please give us the types of  dok.kuupaev and dok.kellaaeg?  I
 think a simple fix is possible here.

dok.kuupaev type is DATE

dok.kellaaeg type is character(5) NOT NULL DEFAULT ''
and is used to represent dokument time in format   hh mm

Database encoding is UTF-8 , cluster locale is estonian, OS is Windows 2003 
server.

 You provide zero information on the table layout

dok table full definition is below.
What other information do you need ?

, and the explain output
 has been horribly mangled by your MUA.

I used copy and paste from pgAdmin.
I checked my message and it seems that explain output is OK, havent found 
any truncation.
So I do'nt understand this.

 I would suspect the problem is that there's no index that can be used
 for that final comparison.

Postgres must use index on kuupaev in both queries.
This index filters out most rows.

 Do you have an index along the lines of
 CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

I do'nt have this index.
dok.kuupaev||dok.kellaaeg conditon should applied after index search is 
performed.
It filters out only a small number of rows additionally to the plain kuupaev 
filter.
So adding index on dok.kuupaev||dok.kellaaeg  is not reasonable IMHO.

Please confirm that most reasonable way to fix this to add this index, I 
will add this.

 Overall, the fact that you're concatenating two text fields to generate a
 date field tends to suggest that your database schema has some fairly
 major design problems, but I can only speculate at this point.

This schema is migrated from dbms where there was no datetime support.
char(5) field is used to express time  in form   hh mm
This schema is deployed in a large number of servers.
Its change would be very expensive. change requires huge amout of  work time 
to re-write applications, create database conversion scripts, re-write 
pl/sql triggers, test and fix new bugs causes by change.

Andrus.


CREATE TABLE firma1.dok
(
  doktyyp character(1) NOT NULL,
  dokumnr integer NOT NULL DEFAULT nextval('dok_dokumnr_seq'::regclass),
  kuupaev date NOT NULL,
  oper character(3),
  klient character(12),
  laonr numeric(2),
  raha character(3),
  tasudok character(25),
  knr character(10),
  tasukuup date,
  yksus character(10),
  sihtyksus character(10),
  pais2obj character(10),
  saaja character(12),
  krdokumnr integer,
  eimuuda ebool,
  kasutaja character(10),
  username character(10),
  kellaaeg character(5) NOT NULL DEFAULT ''::bpchar,
  arvekonto character(10),
  maksetin character(5),
  exchrate numeric(11,6),
  ratefound date,
  kurss numeric(10,5),
  tekst1 text,
  viitenr character(20),
  objrealt ebool,
  arvenumber character(25),
  pais3obj character(10),
  pais4obj character(10),
  pais5obj character(10),
  pais6obj character(10),
  pais7obj character(10),
  pais8obj character(10),
  pais9obj character(10),
  masin character(5),
  tegmasin character(5),
  guid character(36) NOT NULL,
  doksumma numeric(12,2),
  kinnitatud ebool,
  tasumata numeric(12,2),
  sularaha numeric(12,2),
  kaardimaks numeric(12,2),
  kalkliik character(1),
  kalktoode character(20),
  inventuur ebool,
  algus date,
  lopp date,
  taidetud ebool,
  kaal numeric(7,3),
  timestamp character(14) NOT NULL DEFAULT to_char(now(), 
'MMDDHH24MISS'::text),
  vmnr integer,
  tellimus character(25),
  volitaisik character(36),
  liikmesrii character(2),
  tehingulii character(2),
  tarneklaus character(10),
  statprots character(2),
  CONSTRAINT dok_pkey PRIMARY KEY (dokumnr),
  CONSTRAINT dok_arvekonto_fkey FOREIGN KEY (arvekonto)
  REFERENCES firma1.konto (kontonr) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_kalktoode_fkey FOREIGN KEY (kalktoode)
  REFERENCES firma1.toode (toode) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_kasutaja_fkey FOREIGN KEY (kasutaja)
  REFERENCES kasutaja (kasutaja) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_klient_fkey FOREIGN KEY (klient)
  REFERENCES firma1.klient (kood) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_knr_fkey FOREIGN KEY (knr)
  REFERENCES firma1.konto (kontonr) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_krdokumnr_fkey FOREIGN KEY (krdokumnr)
  REFERENCES firma1.dok (dokumnr) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_liikmesrii_fkey FOREIGN KEY (liikmesrii)
  REFERENCES riik (kood) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_maksetin_fkey FOREIGN KEY (maksetin)
  REFERENCES maksetin (maksetin) MATCH SIMPLE
  ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,

[GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-04 Thread Andrus

Using string concatenation in where clause causes huge perfomance loss:

explain analyze select
   rid.toode
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik using(grupp,liik)
   WHERE rid.toode='NAH S'
AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

Nested Loop Left Join  (cost=68.75..5064.86 rows=1 width=24) (actual 
time=8.081..26995.552 rows=567 loops=1)
  Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik = 
artliik.liik))
  -  Nested Loop  (cost=68.75..5062.19 rows=1 width=43) (actual 
time=8.045..26965.731 rows=567 loops=1)
-  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1 
width=43) (actual time=0.023..0.026 rows=1 loops=1)
  Index Cond: ('NAH S'::bpchar = toode)
-  Nested Loop  (cost=68.75..5053.91 rows=1 width=24) (actual 
time=8.016..26964.698 rows=567 loops=1)
  -  Index Scan using dok_kuupaev_idx on dok 
(cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543 
loops=1)
Index Cond: ((kuupaev = '2007-11-01'::date) AND 
(kuupaev = '2007-12-04'::date))
Filter: kuupaev)::text || (kellaaeg)::text) = 
'2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) = 
'2007-12-0423 59'::text))
  -  Bitmap Heap Scan on rid  (cost=68.75..72.76 rows=1 
width=28) (actual time=7.577..7.577 rows=0 loops=3543)
Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND 
(rid.toode = 'NAH S'::bpchar))
-  BitmapAnd  (cost=68.75..68.75 rows=1 width=0) 
(actual time=7.574..7.574 rows=0 loops=3543)
  -  Bitmap Index Scan on rid_dokumnr_idx 
(cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14 
loops=3543)
Index Cond: (dok.dokumnr = rid.dokumnr)
  -  Bitmap Index Scan on rid_toode_idx 
(cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144 
loops=3543)
Index Cond: (toode = 'NAH S'::bpchar)
  -  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual 
time=0.007..0.020 rows=27 loops=567)
Total runtime: 26996.399 ms

takes 26 seconds !

If I remove last line it takes only 0 seconds:

SET SEARCH_PATH TO FIRMA1,public;
explain analyze select
   rid.toode
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik using(grupp,liik)
   WHERE rid.toode='NAH S'
AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

Hash Left Join  (cost=4313.85..7702.10 rows=24 width=24) (actual 
time=10.138..48.884 rows=567 loops=1)
  Hash Cond: ((toode.grupp = artliik.grupp) AND (toode.liik = 
artliik.liik))
  -  Nested Loop  (cost=4311.17..7699.14 rows=24 width=43) (actual 
time=10.049..47.877 rows=567 loops=1)
-  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1 
width=43) (actual time=0.043..0.046 rows=1 loops=1)
  Index Cond: ('NAH S'::bpchar = toode)
-  Hash Join  (cost=4311.17..7690.63 rows=24 width=24) (actual 
time=9.998..47.341 rows=567 loops=1)
  Hash Cond: (rid.dokumnr = dok.dokumnr)
  -  Index Scan using rid_toode_idx on rid 
(cost=0.00..3372.45 rows=1354 width=28) (actual time=0.089..24.265 
rows=21144 loops=1)
Index Cond: (toode = 'NAH S'::bpchar)
  -  Hash  (cost=4286.20..4286.20 rows=1998 width=4) (actual 
time=9.871..9.871 rows=3543 loops=1)
-  Index Scan using dok_kuupaev_idx on dok 
(cost=0.00..4286.20 rows=1998 width=4) (actual time=0.057..6.779 rows=3543 
loops=1)
  Index Cond: ((kuupaev = '2007-11-01'::date) AND 
(kuupaev = '2007-12-04'::date))
  -  Hash  (cost=2.27..2.27 rows=27 width=19) (actual time=0.060..0.060 
rows=27 loops=1)
-  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual 
time=0.009..0.027 rows=27 loops=1)
Total runtime: 49.409 ms


How to rewrite the query

select
   rid.toode,
  artliik.*
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik using(grupp,liik)
   WHERE rid.toode='NAH S'
AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

so it runs fast ?

Andrus.


PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 
(mingw-special) 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-04 Thread Bill Moran

Andrus [EMAIL PROTECTED] wrote:

 Using string concatenation in where clause causes huge perfomance loss:
 
 explain analyze select
rid.toode
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
LEFT JOIN artliik using(grupp,liik)
WHERE rid.toode='NAH S'
 AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
 and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

You provide zero information on the table layout, and the explain output
has been horribly mangled by your MUA.

I would suspect the problem is that there's no index that can be used
for that final comparison.  Do you have an index along the lines of
CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

Overall, the fact that you're concatenating two text fields to generate a
date field tends to suggest that your database schema has some fairly
major design problems, but I can only speculate at this point.

If neither of those help, I expect you'll need to provide more information.

 Nested Loop Left Join  (cost=68.75..5064.86 rows=1 width=24) (actual 
 time=8.081..26995.552 rows=567 loops=1)
   Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik = 
 artliik.liik))
   -  Nested Loop  (cost=68.75..5062.19 rows=1 width=43) (actual 
 time=8.045..26965.731 rows=567 loops=1)
 -  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1 
 width=43) (actual time=0.023..0.026 rows=1 loops=1)
   Index Cond: ('NAH S'::bpchar = toode)
 -  Nested Loop  (cost=68.75..5053.91 rows=1 width=24) (actual 
 time=8.016..26964.698 rows=567 loops=1)
   -  Index Scan using dok_kuupaev_idx on dok 
 (cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543 
 loops=1)
 Index Cond: ((kuupaev = '2007-11-01'::date) AND 
 (kuupaev = '2007-12-04'::date))
 Filter: kuupaev)::text || (kellaaeg)::text) = 
 '2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) = 
 '2007-12-0423 59'::text))
   -  Bitmap Heap Scan on rid  (cost=68.75..72.76 rows=1 
 width=28) (actual time=7.577..7.577 rows=0 loops=3543)
 Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND 
 (rid.toode = 'NAH S'::bpchar))
 -  BitmapAnd  (cost=68.75..68.75 rows=1 width=0) 
 (actual time=7.574..7.574 rows=0 loops=3543)
   -  Bitmap Index Scan on rid_dokumnr_idx 
 (cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14 
 loops=3543)
 Index Cond: (dok.dokumnr = rid.dokumnr)
   -  Bitmap Index Scan on rid_toode_idx 
 (cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144 
 loops=3543)
 Index Cond: (toode = 'NAH S'::bpchar)
   -  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual 
 time=0.007..0.020 rows=27 loops=567)
 Total runtime: 26996.399 ms
 
 takes 26 seconds !
 
 If I remove last line it takes only 0 seconds:
 
 SET SEARCH_PATH TO FIRMA1,public;
 explain analyze select
rid.toode
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
LEFT JOIN artliik using(grupp,liik)
WHERE rid.toode='NAH S'
 AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
 
 Hash Left Join  (cost=4313.85..7702.10 rows=24 width=24) (actual 
 time=10.138..48.884 rows=567 loops=1)
   Hash Cond: ((toode.grupp = artliik.grupp) AND (toode.liik = 
 artliik.liik))
   -  Nested Loop  (cost=4311.17..7699.14 rows=24 width=43) (actual 
 time=10.049..47.877 rows=567 loops=1)
 -  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1 
 width=43) (actual time=0.043..0.046 rows=1 loops=1)
   Index Cond: ('NAH S'::bpchar = toode)
 -  Hash Join  (cost=4311.17..7690.63 rows=24 width=24) (actual 
 time=9.998..47.341 rows=567 loops=1)
   Hash Cond: (rid.dokumnr = dok.dokumnr)
   -  Index Scan using rid_toode_idx on rid 
 (cost=0.00..3372.45 rows=1354 width=28) (actual time=0.089..24.265 
 rows=21144 loops=1)
 Index Cond: (toode = 'NAH S'::bpchar)
   -  Hash  (cost=4286.20..4286.20 rows=1998 width=4) (actual 
 time=9.871..9.871 rows=3543 loops=1)
 -  Index Scan using dok_kuupaev_idx on dok 
 (cost=0.00..4286.20 rows=1998 width=4) (actual time=0.057..6.779 rows=3543 
 loops=1)
   Index Cond: ((kuupaev = '2007-11-01'::date) AND 
 (kuupaev = '2007-12-04'::date))
   -  Hash  (cost=2.27..2.27 rows=27 width=19) (actual time=0.060..0.060 
 rows=27 loops=1)
 -  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual 
 time=0.009..0.027 rows=27 loops=1)
 Total runtime: 49.409 ms
 
 
 How to rewrite the query
 
 select
rid.toode,
   artliik.*
FROM dok JOIN rid USING (dokumnr)
JOIN toode USING (toode)
LEFT JOIN artliik using(grupp,liik)
WHERE rid.toode='NAH S'
 AND dok.kuupaev BETWEEN '2007-11-01' AND

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

2007-12-04 Thread Merlin Moncure

On Dec 4, 2007 8:02 PM, Bill Moran [EMAIL PROTECTED] wrote:
 Andrus [EMAIL PROTECTED] wrote:
  Using string concatenation in where clause causes huge perfomance loss:
 
  explain analyze select
 rid.toode
 FROM dok JOIN rid USING (dokumnr)
 JOIN toode USING (toode)
 LEFT JOIN artliik using(grupp,liik)
 WHERE rid.toode='NAH S'
  AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
  and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

can you please give us the types of  dok.kuupaev and dok.kellaaeg?  I
think a simple fix is possible here.


 You provide zero information on the table layout, and the explain output
 has been horribly mangled by your MUA.

 I would suspect the problem is that there's no index that can be used
 for that final comparison.  Do you have an index along the lines of
 CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

 Overall, the fact that you're concatenating two text fields to generate a
 date field tends to suggest that your database schema has some fairly
 major design problems, but I can only speculate at this point.

just small correction here...expressions like that in the create index
need an extra set of parens (but I agree with your sentiment):
CREATE INDEX dokindex ON dok ((kuupaeve||kellaaeg))

merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

[GENERAL] Avoid huge perfomance loss on string concatenation

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

Re: [GENERAL] Avoid huge perfomance loss on string concatenation

7 matches

Site Navigation

Mail list logo

Footer information