Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-15 Thread Richard Huxton

Andrew Edson wrote:

Yes.  The only difference between the two selects was that the index
on the table in question was dropped.  As far as I know, that was the
only partial index on there, although since it's a test db, I could
probably go in and experiment on a few more if needed.

This problem may have already been solved; I'm using an older version
of Postgres; 8.1.3.  My boss has requested that it not be upgraded
just yet, however, so I'm stuck with it for the moment.


You should probably upgrade to 8.1.9 - there are a couple of bugfixes 
that mention bitmap scans, although nothing that describes your problem 
in-depth.


http://www.postgresql.org/docs/8.1/static/release.html#RELEASE-8-1-9

You don't need to dump/restore the database, it's just a matter of 
upgrading the binaries within the 8.1 series. You should plan on doing 
point-release updates fairly frequently.


I'd put the partial index back, see if you can reproduce it and if so 
tell your boss you should upgrade. If you can't talk him around, give 
him the URL for the release notes along with the three bugs you think 
are most dangerous. If that doesn't work, I'm sure we can get someone 
from the core developers to tell you the same thing I just have, and we 
can forward that to your boss :-)


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-15 Thread Andrew Edson


Tom Lane <[EMAIL PROTECTED]> wrote:  Ah. I think your result is explained by 
this 8.1.4 bug fix:

2006-05-18 14:57 tgl

* src/backend/optimizer/plan/createplan.c (REL8_1_STABLE): When a
bitmap indexscan is using a partial index, it is necessary to
include the partial index predicate in the scan's "recheck
condition". Otherwise, if the scan becomes lossy for lack of
bitmap memory, we would fail to enforce that returned rows satisfy
the predicate. Noted while studying bug #2441 from Arjen van der
Meijden.

IOW, once the bitmap goes lossy, we'd return *every* row on any page
that the index fingered as having *any* relevant row.

Better press him a little harder. There are a lot of known bugs in
8.1.3, and not any very good reason not to update to a later 8.1.x.

As a really short-term measure, you could possibly avoid this bug by
increasing work_mem enough that the bitmap doesn't get compressed.

regards, tom lane
I'll press as I can...thanks for pointing that out to me.  How would I 
determine how much to increase work_mem by (or to) until then?

   
-
Got a little couch potato? 
Check out fun summer activities for kids.

Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-15 Thread Tom Lane
Andrew Edson <[EMAIL PROTECTED]> writes:
>   This problem may have already been solved; I'm using an older
>   version of Postgres; 8.1.3.

Ah.  I think your result is explained by this 8.1.4 bug fix:

2006-05-18 14:57  tgl

* src/backend/optimizer/plan/createplan.c (REL8_1_STABLE): When a
bitmap indexscan is using a partial index, it is necessary to
include the partial index predicate in the scan's "recheck
condition".  Otherwise, if the scan becomes lossy for lack of
bitmap memory, we would fail to enforce that returned rows satisfy
the predicate.  Noted while studying bug #2441 from Arjen van der
Meijden.

IOW, once the bitmap goes lossy, we'd return *every* row on any page
that the index fingered as having *any* relevant row.

>   My boss has requested that it not be
>   upgraded just yet, however, so I'm stuck with it for the moment. 

Better press him a little harder.  There are a lot of known bugs in
8.1.3, and not any very good reason not to update to a later 8.1.x.

As a really short-term measure, you could possibly avoid this bug by
increasing work_mem enough that the bitmap doesn't get compressed.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-15 Thread Andrew Edson
Yes.  The only difference between the two selects was that the index on the 
table in question was dropped.  As far as I know, that was the only partial 
index on there, although since it's a test db, I could probably go in and 
experiment on a few more if needed.
   
  This problem may have already been solved; I'm using an older version of 
Postgres; 8.1.3.  My boss has requested that it not be upgraded just yet, 
however, so I'm stuck with it for the moment.
   
  Richard Huxton <[EMAIL PROTECTED]> wrote:
  Andrew Edson wrote:
> I apologize about the CC; I thought I had done so.

no problem

> There are fourteen (14) distinct values in rcrd_cd. And I don't know
> if this counts as something odd, but I got the following values by
> doing a vacuum full analyze, then running the set with index,
> dropping index, running set without.

Might want to do ALTER TABLE ... ALTER COLUMN rcrd_cd SET STATISTICS =
14 (or a few more than that if you think it might be useful) - won't
help you with this though.

So - are you saying that with these two queries...

>> attest=# select count(*) from ptrans where rcrd_cd = '0A'; 
>> 6701655

>> attest=# select count(*) from ptrans where rcrd_cd = '0A'; 
>> 204855

...the only difference is that you've dropped an index?

Because that's just strange - and I don't think it's anything you're doing.

Do you have other partial indexes for different values of rcrd_cd, and
do they have similar problems? If this can be reproduced it might point 
to something odd with bitmap scans.

Oh, remind me what version of PostgreSQL you're running?

-- 
Richard Huxton
Archonet Ltd


   
-
Moody friends. Drama queens. Your life? Nope! - their life, your story.
 Play Sims Stories at Yahoo! Games. 

Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-15 Thread Richard Huxton

Andrew Edson wrote:

I apologize about the CC; I thought I had done so.


no problem


There are fourteen (14) distinct values in rcrd_cd.  And I don't know
if this counts as something odd, but I got the following values by
doing a vacuum full analyze, then running the set with index,
dropping index, running set without.


Might want to do ALTER TABLE ... ALTER COLUMN rcrd_cd SET STATISTICS =
14 (or a few more than that if you think it might be useful) - won't
help you with this though.

So - are you saying that with these two queries...

attest=# select count(*) from ptrans where rcrd_cd = '0A'; 

>> 6701655

attest=# select count(*) from ptrans where rcrd_cd = '0A'; 

>> 204855

...the only difference is that you've dropped an index?

Because that's just strange - and I don't think it's anything you're doing.

Do you have other partial indexes for different values of rcrd_cd, and
do they have similar problems? If this can be reproduced it might point 
to something odd with bitmap scans.


Oh, remind me what version of PostgreSQL you're running?

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-15 Thread Andrew Edson
I apologize about the CC; I thought I had done so.
   
  There are fourteen (14) distinct values in rcrd_cd.  And I don't know if this 
counts as something odd, but I got the following values by doing a vacuum full 
analyze, then running the set with index, dropping index, running set without.
   
  Values with index:
> 
> attest=# select count(*) from ptrans;
>  16488578
> 
> attest=# select count(*) from ptrans where rcrd_cd = '0A';
>  6701655
> 
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd 
> = '   
>   
>>  0A';
>  Bitmap Heap Scan on ptrans  (cost=1223.86..151183.39 rows=87439 width=21) 
> (actu 
>   
>  >  al time=2255.640..70371.304 rows=6701655 loops=1)
>->  Bitmap Index Scan on ptrans_cid_trandt_idx  (cost=0.00..1223.86 
> rows=8743 
>   
>  >  9 width=0) (actual time=2216.856..2216.856 rows=204855 loops=1)
>  Total runtime: 89964.419 ms
> 
  Values without index:
> 
> attest=# select count(*) from ptrans;
>  16488578
> 
> attest=# select count(*) from ptrans where rcrd_cd = '0A';
>  204855
> 
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd 
> = '0A';
>  Seq Scan on ptrans  (cost=0.00..384813.22 rows=87439 width=21) (actual 
> time=20.286..65330.049 rows=204855 loops=1)
>Filter: (rcrd_cd = '0A'::bpchar)
>  Total runtime: 65945.160 ms
   
  Again, someone mind pointing out to me where I've managed to mess this one up?


Richard Huxton <[EMAIL PROTECTED]> wrote:
Don't forget to CC: the list.

Andrew Edson wrote:
> I apologize; You are correct in that I mistyped my original structure. Here 
> is the information for the correct explain and explain analyze statements.
> 
> attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';

No need for the simple explain - explain analyse includes all the 
information.

> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd 
> = '0A';
> QUERY PLAN
> 
> Bitmap Heap Scan on ptrans (cost=1223.86..149853.23 rows=85741 width=21) 
> (actual time=2302.363..70321.838 rows=6701655 loops=1)
> -> Bitmap Index Scan on ptrans_cid_trandt_idx (cost=0.00..1223.86 rows=85741 
> width=0) (actual time=2269.064..2269.064 rows=204855 loops=1)
> Total runtime: 89854.843 ms

Well, it's taking 90 seconds to return 6.7 million rows. Depending on 
your system and memory settings, that might not be unreasonable.

It *is* getting the estimate of returned rows wrong (it thinks 85,741 
will match) which is hugely out of line. Is there something odd with 
this table/column or haven't you analysed recently? How many unique 
values does rcrd_cd have, and how many rows does the table have?

I don't know that you'll get this down to sub-second responses though, 
not if you're trying to return 6 million rows from an even larger table.

-- 
Richard Huxton
Archonet Ltd


---(end of broadcast)---
TIP 6: explain analyze is your friend



   
-
Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV. 

Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-14 Thread Richard Huxton

Don't forget to CC: the list.

Andrew Edson wrote:

I apologize; You are correct in that I mistyped my original structure.  Here is 
the information for the correct explain and explain analyze statements.
   
  attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';


No need for the simple explain - explain analyse includes all the 
information.



  attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd 
= '0A';
   QUERY PLAN

 Bitmap Heap Scan on ptrans  (cost=1223.86..149853.23 rows=85741 width=21) 
(actual time=2302.363..70321.838 rows=6701655 loops=1)
   ->  Bitmap Index Scan on ptrans_cid_trandt_idx  (cost=0.00..1223.86 
rows=85741 width=0) (actual time=2269.064..2269.064 rows=204855 loops=1)
 Total runtime: 89854.843 ms


Well, it's taking 90 seconds to return 6.7 million rows. Depending on 
your system and memory settings, that might not be unreasonable.


It *is* getting the estimate of returned rows wrong (it thinks 85,741 
will match) which is hugely out of line. Is there something odd with 
this table/column or haven't you analysed recently? How many unique 
values does rcrd_cd have, and how many rows does the table have?


I don't know that you'll get this down to sub-second responses though, 
not if you're trying to return 6 million rows from an even larger table.


--
  Richard Huxton
  Archonet Ltd


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-14 Thread Richard Huxton

Andrew Edson wrote:

  The following is a copy of my index creation statement, index name, and 
explain and explain analyze output on the statement I was trying to run.  Would 
someone please help me figure out what I'm doing wrong here?
   
  > attest=# create index ptrans_cid_trandt_idx on ptrans(cntrct_id, tran_dt) where rcrd_cd = '0A';



attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = 0;



attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = 
0;


The index has a where clause that doesn't match your query. It wouldn't 
use the index anyway - you're not filtering or sorting on it.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] Select time jump after adding filter; please help me figure out what I'm doing wrong.

2007-08-14 Thread Andrew Edson
I've been working on a db project intended to perform modifications to one db 
based on the data stored in another one.  Specifically, I'm supposed to check a 
pair of tables based on two criteria; an id field, and a timestamp.  This will 
be crossreferenced with the second database; the timestamps will be used to 
update timestamps on anything which resides in the table of interest there.
   
  I was running the sequence through perl; with 76 records in the test copy of 
the second database, I was getting a four, five minute run.  Not really bad, 
but I'm expecting the number of records in the second db to hit several 
thousand in production, so I thought I'd see if I could speed things up by 
adding an index on what I'm searching for.
   
  After about an hour of run time, I killed the program and started looking 
into things.
   
  The following is a copy of my index creation statement, index name, and 
explain and explain analyze output on the statement I was trying to run.  Would 
someone please help me figure out what I'm doing wrong here?
   
  > attest=# create index ptrans_cid_trandt_idx on ptrans(cntrct_id, tran_dt) 
where rcrd_cd = '0A';
> 
> 
> "ptrans_cid_trandt_idx" btree (cntrct_id, tran_dt) WHERE rcrd_cd = 
> '0A'::bpchar
> 
> 
> attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = 0;
>QUERY PLAN
> 
>  Seq Scan on ptrans  (cost=0.00..426034.67 rows=82443 width=21)
>Filter: ((rcrd_cd)::text = '0'::text)
> (2 rows)
> 
> 
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd 
> = 0;
> QUERY PLAN
> --
>  Seq Scan on ptrans  (cost=0.00..426034.67 rows=82443 width=21) (actual 
> time=60585.740..60585.740 rows=0 loops=1)
>Filter: ((rcrd_cd)::text = '0'::text)
>  Total runtime: 60585.797 ms
> (3 rows)

   
-
Sick sense of humor? Visit Yahoo! TV's Comedy with an Edge to see what's on, 
when.