Re: [HACKERS] Optimizer bug in 8.1.0?

2005-11-18 Thread Tom Lane
Alexey Slynko [EMAIL PROTECTED] writes:
 Any suggestions?

Fix contrib/intarray to have some selectivity estimation procedures
for its operators?

Without any way to estimate the number of rows matching the @@
condition, the optimizer can hardly be expected to guess right...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Ismail Kizir
Hi All,

EXPLAIN ANALYZE
SELECT COUNT(*) AS c FROM articletbl WHERE (tarih BETWEEN '2004-04-24' AND
'2004-05-24')

result is :

QUERY PLAN
Aggregate  (cost=40987.02..40987.02 rows=1 width=0) (actual
time=39209.476..39209.478 rows=1 loops=1)
  -  Seq Scan on articletbl  (cost=0.00..40947.39 rows=15851 width=0)
(actual time=1233.369..39153.741 rows=19515 loops=1)
Filter: ((tarih = '2004-04-24'::date) AND (tarih =
'2004-05-24'::date))
Total runtime: 39210.077 ms

I use fedora core 1 SMP kernel
Configuration :
Dual PIII-550
Dual SCSI (10ms. access time i think)
3x128 mb. SDRAM

Regards
Ismail Kizir
- Original Message -
From: Stephan Szabo [EMAIL PROTECTED]
To: Ismail Kizir [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, May 24, 2004 11:52 PM
Subject: Re: [HACKERS] Optimizer bug??



 On Mon, 24 May 2004, Ismail Kizir wrote:

  Hi all,
 
   give us the result of these queries:
   SELECT COUNT(*) FROM articletbl;
 
  268726 records, it takes 34169 ms. to compute this
 
   SELECT COUNT(*) AS c FROM articletbl WHERE
   mydate BETWEEN '2004-04-24' AND '2004-05-24';
 
  18982 records, it takes 34249 ms. to compute this.
 
   SELECT COUNT(*) AS c FROM articletbl WHERE
   mydate = '2004-04-24';
  850 records, it takes only 40 ms. to compute this.
 
  It is evident that there is a problem here! Am I wrong??

 What does explain analyze show for the between query (not just explain)
 and what does it show if enable_seqscan is set to false?  It's possible
 that it's badly overestimating the cost of the range query, but that's
 hard to say at this point.  There is a point at which in general an index
 scan becomes more costly than a sequence scan, and it's possible to move
 that point by changing optimizer settings in the configuration.






---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Ismail Kizir
Gaetano,

I've changed my settings as :

#fsync = true   # turns forced synchronization on or off
#wal_sync_method = fsync# the default varies across platforms:
#effective_cache_size = 1000# typically 8KB each
random_page_cost = 2# units are one sequential page fetch cost
cpu_tuple_cost = 0.009  # (same)
cpu_index_tuple_cost = 0.0009   # (same)
cpu_operator_cost = 0.0025  # (same)
# fsync, fdatasync, open_sync, or
open_datasync
#wal_buffers = 8# min 4, 8KB each

But it still doesn't optimize for that range.
Finally, i've set seq_scan off and, it works now.
But i think, there must be a way to handle those settings automatically for
cpu, ram and hdd settings(is it a sweet dream??)

Thanks
Ismail Kizir

- Original Message -
From: Gaetano Mendola [EMAIL PROTECTED]
Newsgroups: comp.databases.postgresql.hackers
To: Ismail Kizir [EMAIL PROTECTED]
Sent: Tuesday, May 25, 2004 12:47 AM
Subject: Re: Optimizer bug??


 Ismail Kizir wrote:

  Hi all,
 
 
 give us the result of these queries:
 SELECT COUNT(*) FROM articletbl;
 
 
  268726 records, it takes 34169 ms. to compute this
 
 
 SELECT COUNT(*) AS c FROM articletbl WHERE
 mydate BETWEEN '2004-04-24' AND '2004-05-24';
 
 
  18982 records, it takes 34249 ms. to compute this.
 
 
 SELECT COUNT(*) AS c FROM articletbl WHERE
 mydate = '2004-04-24';
 
  850 records, it takes only 40 ms. to compute this.
 
  It is evident that there is a problem here! Am I wrong??

 Try to do an explain analyze for both queries, and repeat
 the explain analyze but before disable the sequential scan
 ( set enable_seq_scan=off ).

 For sure what do you have is extimated cost higher, what
 do you have to do is decrease that cost, try with these
 values:


random_page_cost = 2.5
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.0005
 cpu_operator_cost = 0.0025


 these are values that are good for a modern machine.

 Hackers, what about to decrease the default values for this
 quantities ?




 Regards
 Gaetano Mendola




















---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Gaetano Mendola
Ismail Kizir wrote:
Gaetano,
I've changed my settings as :
#fsync = true   # turns forced synchronization on or off
#wal_sync_method = fsync# the default varies across platforms:
#effective_cache_size = 1000# typically 8KB each
random_page_cost = 2# units are one sequential page fetch cost
cpu_tuple_cost = 0.009  # (same)
cpu_index_tuple_cost = 0.0009   # (same)
cpu_operator_cost = 0.0025  # (same)
# fsync, fdatasync, open_sync, or
open_datasync
#wal_buffers = 8# min 4, 8KB each
But it still doesn't optimize for that range.
Finally, i've set seq_scan off and, it works now.
But i think, there must be a way to handle those settings automatically for
cpu, ram and hdd settings(is it a sweet dream??)
Did you SIGHUP the postmaster after ?
You can change these settings also from command line, what you have to
do is decrease the cost of the index scan till is less of the sequential
scan cost.
Good luck.
Regards
Gaetano Mendola

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Josh Berkus
Gaetano,

 Hackers, what about to decrease the default values for this
 quantities ?

Oh, I don't think Ismail has a bug, I think he probably just needs to tune his 
database.

Ismail, join the PGSQL-PERFORMANCE mailing list and discuss this over there.   
-HACKERS is not the appropriate place for a tuning discussion.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Gaetano Mendola
Josh Berkus wrote:
Gaetano,

Hackers, what about to decrease the default values for this
quantities ?

Oh, I don't think Ismail has a bug, I think he probably just needs to tune his 
database.
I just only suggesting to decrease that values that are oversized for a modern
hardware.
Regards
Gaetano Mendola

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Ismail Kizir
Thanks Josh,

I'll subscribe to that group.

Regards
Ismail Kizir
- Original Message -
From: Josh Berkus [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: Ismail Kizir [EMAIL PROTECTED]
Sent: Tuesday, May 25, 2004 8:34 PM
Subject: Re: [HACKERS] Optimizer bug??


 Gaetano,

  Hackers, what about to decrease the default values for this
  quantities ?

 Oh, I don't think Ismail has a bug, I think he probably just needs to tune
his
 database.

 Ismail, join the PGSQL-PERFORMANCE mailing list and discuss this over
there.
 -HACKERS is not the appropriate place for a tuning discussion.

 --
 Josh Berkus
 Aglio Database Solutions
 San Francisco

 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html





---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Gaetano Mendola
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Josh Berkus wrote:
| Gaetano,
|
|
|I just only suggesting to decrease that values that are oversized for a
|
| modern
|
|hardware.
|
|
| Hey, have you had success with those settings that you suggested?   I've tried
| tinkering with the relative CPU cost settings, and had mixed results.
| That's why I have no particular recommmendation for them.
|
Usually yes, decreasing that values I'm able to decrease the index scan
cost, so when I enable again the sequential scan the index one is choosed.
Regards
Gaetano Mendola

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFAs5Je7UpzwH2SGd4RAg/1AKCUYdTGIm5c7kG/ZXvmb49RWybs2ACgg3Wk
zOHkWnCvbyPgeDCU3pn6UfQ=
=Td0B
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Tom Lane
Gaetano Mendola [EMAIL PROTECTED] writes:
 I just only suggesting to decrease that values that are oversized for
 a modern hardware.

I've seen no evidence saying that random_page_cost needs to be decreased
for modern hardware.  Disk seek speed versus bandwidth hasn't changed
that much.

People sometimes find it profitable to decrease that setting to
compensate for other optimizer issues, but that doesn't mean we
ought to change the default.

As for the other settings you mentioned, I'd agree that the defaults are
pretty arbitrary, but what evidence have you got to suggest better ones?

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Optimizer Bug issue

2004-05-25 Thread Greg Stark
Ismail Kizir [EMAIL PROTECTED] writes:

 I have a database of 20 tables, ~1gb total size. My biggest table contains
 ~270,000 newspaper article from Turkish journals. I am actually working on
 fulltext search program of my own.

How much RAM does the machine have? Have you already executed the query and
are repeating it? It's likely the entire data set is cached in RAM. That's not
the long-term average as your data set grows.

The numbers there are appropriate for a database where the data being fetched
cannot all fit in RAM and isn't all pre-cached. There are also scenarios where
the algorithms the optimizer uses to estimate costs don't capture everything.
tweaking the parameters to correct for these problems would cause other
queries to be handled even worse.

If anything the penalty for random disk accesses has increased over the years.
My desktop is about 100 times faster than my 486 router. But the hard drive in
the 486 is only about 10x slower than the hard drive in the desktop. And the
ratio of seek times is probably even less.


There is a parameter effective_cache_size which is supposed to help Postgres
take into account the likelihood that the data will already be in cache. How
exactly does this affect planning and perhaps this parameter needs to have
much more impact on the resultant plans. At least for databases that are small
relative to it.

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Optimizer bug??

2004-05-25 Thread Gaetano Mendola
Tom Lane wrote:
Gaetano Mendola [EMAIL PROTECTED] writes:
I just only suggesting to decrease that values that are oversized for
a modern hardware.

I've seen no evidence saying that random_page_cost needs to be decreased
for modern hardware.  Disk seek speed versus bandwidth hasn't changed
that much.
People sometimes find it profitable to decrease that setting to
compensate for other optimizer issues, but that doesn't mean we
ought to change the default.
As for the other settings you mentioned, I'd agree that the defaults are
pretty arbitrary, but what evidence have you got to suggest better ones?
regards, tom lane
---(end of broadcast)---
TIP 6: Have you searched our list archives?
   http://archives.postgresql.org

I had queries not using a index scan ( was the best method ) that
started to use the index scan decreasing that values.
What I also notice is that under certain values I'm not able to decrease
anymore the cost of a query.
I'm using now:
random_page_cost = 2.0
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.0005
cpu_operator_cost = 0.0025
vs these default costs:
#random_page_cost = 4   # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01  # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025 # (same)
look at this two queries (I just pick up one):
 DEFAULT VALUES 
test=# explain analyze select * from v_psr_guide_web;
   
   QUERY PLAN
--
 Subquery Scan v_psr_guide_web  (cost=750.58..750.85 rows=21 width=236) (actual 
time=196.420..197.210 rows=178 loops=1)
   -  Sort  (cost=750.58..750.64 rows=21 width=236) (actual time=196.411..196.532 
rows=178 loops=1)
 Sort Key: vg.estimated_start
 -  Hash Join  (cost=717.57..750.12 rows=21 width=236) (actual 
time=190.489..195.817 rows=178 loops=1)
   Hash Cond: (outer.id_publisher = inner.id_publisher)
   -  Hash Left Join  (cost=716.17..747.87 rows=128 width=208) (actual 
time=190.288..194.757 rows=178 loops=1)
 Hash Cond: (outer.id_drm_service = inner.id_drm_service)
 -  Hash Left Join  (cost=715.12..746.15 rows=128 width=188) 
(actual time=189.978..193.734 rows=178 loops=1)
   Hash Cond: (outer.id_cas_service = inner.id_cas_service)
   -  Subquery Scan vg  (cost=85.34..87.57 rows=127 
width=168) (actual time=125.807..128.751 rows=178 loops=1)
 Filter: (view_target_group  2)
 -  Unique  (cost=85.34..85.98 rows=127 width=324) 
(actual time=125.743..127.723 rows=192 loops=1)
   -  Sort  (cost=85.34..85.66 rows=127 
width=324) (actual time=125.738..126.573 rows=1298 loops=1)
 Sort Key: vp.id_package, 
s.estimated_start, sl.end_date
 -  Hash Join  (cost=71.92..80.91 
rows=127 width=324) (actual time=103.605..118.505 rows=1298 loops=1)
   Hash Cond: (outer.id_package = 
inner.id_package)
   -  Subquery Scan vp  
(cost=37.86..42.66 rows=384 width=304) (actual time=97.514..100.926 rows=384 loops=1)
 -  Sort  (cost=37.86..38.82 
rows=384 width=219) (actual time=97.488..97.744 rows=384 loops=1)
   Sort Key: 
p.id_publisher, p.name
   -  Hash Left Join  
(cost=1.96..21.37 rows=384 width=219) (actual time=1.003..95.690 rows=384 loops=1)
 Hash Cond: 
(outer.id_package = inner.id_package)
 -  Seq Scan on 
packages p  (cost=0.00..13.84 rows=384 width=203) (actual time=0.005..0.780 rows=384 
loops=1)
 -  Hash  
(cost=1.77..1.77 rows=77 width=20) (actual time=0.214..0.214 rows=0 loops=1)
   -  Seq 
Scan on package_security ps  (cost=0.00..1.77 rows=77 width=20) (actual time=0.011..0.126 
rows=77 loops=1)
   -  Hash  (cost=33.81..33.81 
rows=102 width=24) (actual time=5.756..5.756 rows=0 loops=1)
 -  Hash Join  
(cost=17.45..33.81 rows=102 width=24) (actual time=1.625..4.216 rows=1298 

Re: [HACKERS] Optimizer Bug issue

2004-05-25 Thread Tom Lane
Ismail Kizir [EMAIL PROTECTED] writes:
 I am sure default values are erronous.

Perhaps they are --- for your database, on your hardware.  Or perhaps
the problem is somewhere else (we know that the costing of nestloop
indexscan joins needs work, for instance).  But in any case there's
a reason why these numbers are tweakable.  It doesn't follow that the
defaults should be changed.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Optimizer Bug issue

2004-05-25 Thread Christopher Kings-Lynne

I am actually 
working on fulltext search program of my own.
No need, use contrib/tsearch2
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Hans-Jrgen Schnig
Ismail Kizir wrote:
Hi everybody,
1  EXPLAIN SELECT COUNT(*) AS c FROM articletbl WHERE
  ((mydate BETWEEN '2004-04-24' AND '2004-05-24' )
)
2  EXPLAIN SELECT COUNT(*) AS c FROM articletbl WHERE
  ((mydate = '2004-04-24')
)
(I ran VACUUM ANALYZE before running those)
mydate is an indexed date column.
The optimizer optimizes the second query but, it doesn't optimize the first
one and decides to make a sequential scan.
Is this a bug?
Or may someone explain me the reason?
Thanks in advance.
Ismail Kizir

If 2004-04-24 to 2004-05-24 make up let's say 90% of your data 
PostgreSQL will find out that it is cheaper to use a seq scan instead of 
an index.

This is not a bug at all - this is normal and desired behaviour ...
Regards,
Hans
--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/720/10 1234567 or +43/664/233 90 75
www.cybertec.at, www.postgresql.at, kernel.cybertec.at
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Ismail Kizir
2003-05-26;141
2003-05-25;138
2003-05-24;110
2003-05-23;132
2003-05-22;144
2003-05-21;157
2003-05-20;156
2003-05-19;131
2003-05-18;120
2003-05-17;130
2003-05-16;128
2003-05-15;157
2003-05-14;149
2003-05-13;158
2003-05-12;135
2003-05-11;143
2003-05-10;123
2003-05-09;128
2003-05-08;171
2003-05-07;177
2003-05-06;167
2003-05-05;151
2003-05-04;146
2003-05-03;130
2003-05-02;125
2003-05-01;156
2003-04-30;212
2003-04-29;206
2003-04-28;175
2003-04-27;172
2003-04-26;171
2003-04-25;205
2003-04-24;206
2003-04-23;179
2003-04-22;206
2003-04-21;163
2003-04-20;175
2003-04-19;162
2003-04-18;182
2003-04-17;210
2003-04-16;199
2003-04-15;196
2003-04-14;168
2003-04-13;158
2003-04-12;162
2003-04-11;175
2003-04-10;204
2003-04-09;203
2003-04-08;207
2003-04-07;160
2003-04-06;177
2003-04-05;140
2003-04-04;162
2003-04-03;173
2003-04-02;173
2003-04-01;186
2003-03-31;160
2003-03-30;154
2003-03-29;161
2003-03-28;170
2003-03-27;177
2003-03-26;200
2003-03-25;191
2003-03-24;153
2003-03-23;146
2003-03-22;161
2003-03-21;152
2003-03-20;171
2003-03-19;167
2003-03-18;207
2003-03-17;154
2003-03-16;149
2003-03-15;168
2003-03-14;172
2003-03-13;199
2003-03-12;182
2003-03-11;192
2003-03-10;170
2003-03-09;173
2003-03-08;169
2003-03-07;184
2003-03-06;186
2003-03-05;186
2003-03-04;208
2003-03-03;177
2003-03-02;160
2003-03-01;181
2003-02-28;162
2003-02-27;197
2003-02-26;186
2003-02-25;191
2003-02-24;182
2003-02-23;179
2003-02-22;172
2003-02-21;187
2003-02-20;189
2003-02-19;181
2003-02-18;204
2003-02-17;175
2003-02-16;155
2003-02-15;157
2003-02-14;141
2003-02-13;165
2003-02-12;166
2003-02-11;142
2003-02-10;132
2003-02-09;154
2003-02-08;142
2003-02-07;144
2003-02-06;154
2003-02-05;197
2003-02-04;189
2003-02-03;159
2003-02-02;114
2003-02-01;132
2003-01-31;182
2003-01-30;175
2003-01-29;181
2003-01-28;186
2003-01-27;160
2003-01-26;153
2003-01-25;149
2003-01-24;180
2003-01-23;171
2003-01-22;170
2003-01-21;183
2003-01-20;171
2003-01-19;163
2003-01-18;154
2003-01-17;157
2003-01-16;165
2003-01-15;159
2003-01-14;192
2003-01-13;148
2003-01-12;163
2003-01-11;161
2003-01-10;185
2003-01-09;142
2003-01-08;186
2003-01-07;184
2003-01-06;176
2003-01-05;158
2003-01-04;160
2003-01-03;167
2003-01-02;162
2003-01-01;148
2002-12-31;166
2002-12-30;155
2002-12-29;129
2002-12-28;177
2002-12-27;152
2002-12-26;176
2002-12-25;159
2002-12-24;191
2002-12-23;163
2002-12-22;148
2002-12-21;148
2002-12-20;175
2002-12-19;193
2002-12-18;201
2002-12-17;219
2002-12-16;192
2002-12-15;179
2002-12-14;186
2002-12-13;156
2002-12-12;181
2002-12-11;206
2002-12-10;190
2002-12-09;155
2002-12-08;151
2002-12-07;150
2002-12-06;163
2002-12-05;156
2002-12-04;194
2002-12-03;194
2002-12-02;166
2002-12-01;154
2002-11-30;191
2002-11-29;163
2002-11-28;179
2002-11-27;191
2002-11-26;184
2002-11-25;174
2002-11-24;129
2002-11-23;167
2002-11-22;170
2002-11-21;206
2002-11-20;200
2002-11-19;195
2002-11-18;144
2002-11-17;177
2002-11-16;161
2002-11-15;195
2002-11-14;221
2002-11-13;192
2002-11-12;199
2002-11-11;166
2002-11-10;157
2002-11-09;145
2002-11-08;197
2002-11-07;202
2002-11-06;168
2002-11-05;155
2002-11-04;151
2002-11-03;144
2002-11-02;188
2002-11-01;191
2002-10-31;189
2002-10-30;196
2002-10-29;193
2002-10-28;168
2002-10-27;155
2002-10-26;155
2002-10-25;184
2002-10-24;205
2002-10-23;187
2002-10-22;194
2002-10-21;202


Ismail Kizir


- Original Message -
From: Hans-Jürgen Schönig [EMAIL PROTECTED]
To: Ismail Kizir [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, May 24, 2004 8:17 PM
Subject: Re: [HACKERS] Optimizer bug??


 Ismail Kizir wrote:
  Hi everybody,
 
  1  EXPLAIN SELECT COUNT(*) AS c FROM articletbl WHERE
((mydate BETWEEN '2004-04-24' AND '2004-05-24' )
  )
 
  2  EXPLAIN SELECT COUNT(*) AS c FROM articletbl WHERE
((mydate = '2004-04-24')
  )
 
  (I ran VACUUM ANALYZE before running those)
 
  mydate is an indexed date column.
  The optimizer optimizes the second query but, it doesn't optimize the
first
  one and decides to make a sequential scan.
  Is this a bug?
  Or may someone explain me the reason?
 
  Thanks in advance.
 
  Ismail Kizir


 If 2004-04-24 to 2004-05-24 make up let's say 90% of your data
 PostgreSQL will find out that it is cheaper to use a seq scan instead of
 an index.

 This is not a bug at all - this is normal and desired behaviour ...

 Regards,

 Hans


 --
 Cybertec Geschwinde u Schoenig
 Schoengrabern 134, A-2020 Hollabrunn, Austria
 Tel: +43/720/10 1234567 or +43/664/233 90 75
 www.cybertec.at, www.postgresql.at, kernel.cybertec.at






---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Greg Stark
Ismail Kizir [EMAIL PROTECTED] writes:

 Thanks Hans,
 
 The optimizer does an indexed scan up to 20 days, and then, it decides to
 make a sequential scan.

What are the results of explain analyze query for the various queries?

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Alvaro Herrera
On Mon, May 24, 2004 at 08:27:01PM +0300, Ismail Kizir wrote:

 The optimizer does an indexed scan up to 20 days, and then, it decides to
 make a sequential scan.
 But i am still not sure about the efficiency of this decision.

Huh, so what was the EXPLAIN ANALYZE of the query with BETWEEN?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Ismail Kizir
Alvaro,

For the query :
EXPLAIN SELECT count(tarih) AS c FROM articletbl WHERE sitekodu = 12 AND
  ((tarih='2004-04-24' AND tarih'2004-05-24' ))

QUERY PLAN is
Aggregate  (cost=41619.34..41619.34 rows=1 width=4)
  -  Seq Scan on articletbl  (cost=0.00..41618.46 rows=353 width=4)
Filter: ((sitekodu = 12) AND (tarih = '2004-04-24'::date) AND
(tarih  '2004-05-24'::date))

And for the query :

EXPLAIN SELECT count(tarih) AS c FROM articletbl WHERE sitekodu = 12 AND
  ((tarih='2004-05-10' AND tarih'2004-05-24' ))

QUERY PLAN
Aggregate  (cost=20279.72..20279.72 rows=1 width=4)
  -  Index Scan using ind_articletbltrh on articletbl  (cost=0.00..20279.40
rows=127 width=4)
Index Cond: ((tarih = '2004-05-10'::date) AND (tarih 
'2004-05-24'::date))
Filter: (sitekodu = 12)

Have you got an idea?

Thanks in advance
Ismail Kizir
- Original Message -
From: Alvaro Herrera [EMAIL PROTECTED]
To: Ismail Kizir [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, May 24, 2004 9:23 PM
Subject: Re: [HACKERS] Optimizer bug??


 On Mon, May 24, 2004 at 08:27:01PM +0300, Ismail Kizir wrote:

  The optimizer does an indexed scan up to 20 days, and then, it decides
to
  make a sequential scan.
  But i am still not sure about the efficiency of this decision.

 Huh, so what was the EXPLAIN ANALYZE of the query with BETWEEN?

 --
 Alvaro Herrera (alvherre[a]dcc.uchile.cl)
 Syntax error: function hell() needs an argument.
 Please choose what hell you want to involve.


 ---(end of broadcast)---
 TIP 2: you can get off all lists at once with the unregister command
 (send unregister YourEmailAddressHere to [EMAIL PROTECTED])





---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Gaetano Mendola
Ismail Kizir wrote:
Hi everybody,
1  EXPLAIN SELECT COUNT(*) AS c FROM articletbl WHERE
  ((mydate BETWEEN '2004-04-24' AND '2004-05-24' )
)
2  EXPLAIN SELECT COUNT(*) AS c FROM articletbl WHERE
  ((mydate = '2004-04-24')
)
(I ran VACUUM ANALYZE before running those)
mydate is an indexed date column.
The optimizer optimizes the second query but, it doesn't optimize the first
one and decides to make a sequential scan.
Is this a bug?
Or may someone explain me the reason?
give us the result of these queries:
SELECT COUNT(*) FROM articletbl;
SELECT COUNT(*) AS c FROM articletbl WHERE
mydate BETWEEN '2004-04-24' AND '2004-05-24';
SELECT COUNT(*) AS c FROM articletbl WHERE
mydate = '2004-04-24';

Regards
Gaetano Mendola

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Ismail Kizir
Hi all,

 give us the result of these queries:
 SELECT COUNT(*) FROM articletbl;

268726 records, it takes 34169 ms. to compute this

 SELECT COUNT(*) AS c FROM articletbl WHERE
 mydate BETWEEN '2004-04-24' AND '2004-05-24';

18982 records, it takes 34249 ms. to compute this.

 SELECT COUNT(*) AS c FROM articletbl WHERE
 mydate = '2004-04-24';
850 records, it takes only 40 ms. to compute this.

It is evident that there is a problem here! Am I wrong??

Regards
Ismail Kizir



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Stephan Szabo

On Mon, 24 May 2004, Ismail Kizir wrote:

 Hi all,

  give us the result of these queries:
  SELECT COUNT(*) FROM articletbl;

 268726 records, it takes 34169 ms. to compute this

  SELECT COUNT(*) AS c FROM articletbl WHERE
  mydate BETWEEN '2004-04-24' AND '2004-05-24';

 18982 records, it takes 34249 ms. to compute this.

  SELECT COUNT(*) AS c FROM articletbl WHERE
  mydate = '2004-04-24';
 850 records, it takes only 40 ms. to compute this.

 It is evident that there is a problem here! Am I wrong??

What does explain analyze show for the between query (not just explain)
and what does it show if enable_seqscan is set to false?  It's possible
that it's badly overestimating the cost of the range query, but that's
hard to say at this point.  There is a point at which in general an index
scan becomes more costly than a sequence scan, and it's possible to move
that point by changing optimizer settings in the configuration.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Optimizer bug??

2004-05-24 Thread Gaetano Mendola
Ismail Kizir wrote:
Hi all,

give us the result of these queries:
SELECT COUNT(*) FROM articletbl;

268726 records, it takes 34169 ms. to compute this

SELECT COUNT(*) AS c FROM articletbl WHERE
mydate BETWEEN '2004-04-24' AND '2004-05-24';

18982 records, it takes 34249 ms. to compute this.

SELECT COUNT(*) AS c FROM articletbl WHERE
mydate = '2004-04-24';
850 records, it takes only 40 ms. to compute this.
It is evident that there is a problem here! Am I wrong??
Try to do an explain analyze for both queries, and repeat
the explain analyze but before disable the sequential scan
( set enable_seq_scan=off ).
For sure what do you have is extimated cost higher, what
do you have to do is decrease that cost, try with these
values:
random_page_cost = 2.5
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.0005
cpu_operator_cost = 0.0025
these are values that are good for a modern machine.
Hackers, what about to decrease the default values for this
quantities ?

Regards
Gaetano Mendola






---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html