Tom,

I'm running PostgreSQL 7.3.2 on Red Hat Linux 7.3 with 512MB of RAM.

The table definition is:

                    Table "public.cdr"
        Column         |         Type          | Modifiers 
-----------------------+-----------------------+-----------
 carrier_id            | integer               | not null
 file_id               | integer               | not null
 service_num           | character varying(10) | not null
 day                   | date                  | not null
 time                  | integer               | not null
 destination           | character varying(20) | not null
 duration              | integer               | not null
 charge_wholesale      | numeric(8,2)          | not null
 charge_band_id        | integer               | 
 charge_retail         | numeric(8,2)          | not null
 rate_plan_id          | integer               | not null
 item_code             | integer               | not null
 cust_id               | integer               | not null
 bill_id               | integer               | 
 prefix                | character varying(12) | 
 charge_wholesale_calc | numeric(8,2)          | 
Indexes: cdr_ix1 btree ("day"),
         cdr_ix2 btree (service_num),
         cdr_ix3 btree (cust_id),
         cdr_ix4 btree (bill_id),
         cdr_ix5 btree (carrier_id),
         cdr_ix6 btree (file_id)

Does this make it a "wide" table?
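
In case it's useful, a rough row-width figure can be had from the planner's own
statistics (a quick sketch; it assumes the default 8K block size and that
relpages/reltuples are current, i.e. the table has been VACUUMed or ANALYZEd
recently):

    -- Average bytes per row, including per-row overhead:
    -- table size in bytes divided by the row count.
    SELECT relpages,
           reltuples,
           (relpages * 8192.0) / reltuples AS avg_bytes_per_row
    FROM pg_class
    WHERE relname = 'cdr';

If that comes out narrow (say around 100 bytes, i.e. dozens of rows per 8K page),
then a random 8.3% of the rows really would touch nearly every page, as you said.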

The data arrives ordered by service_num, day, time. This customer has one primary
service_num from which most of their calls are made. Each day, therefore, a clump
of CDRs is loaded for that customer, interspersed with CDRs from all the other
customers, so the records for a given service_num are clumped locally but spread
evenly throughout the table. For a customer with a single primary number, the same
pattern holds for the customer as a whole. For a customer with many service_nums
the pattern is less certain, since it depends on whether their service_nums happen
to arrive sequentially, which won't necessarily be the case.
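
If it helps to quantify that, the planner's view of the clumpiness is in pg_stats
(available in 7.3): the correlation column shows how closely the physical row
order tracks each column's value order, so a cust_id correlation near zero would
match the "clumped locally but scattered overall" picture above. A query along
these lines should show it:

    -- avg_width: average column width in bytes (gathered by ANALYZE)
    -- correlation: +/-1 = table physically ordered by this column, ~0 = scattered
    SELECT attname, avg_width, n_distinct, correlation
    FROM pg_stats
    WHERE tablename = 'cdr';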

I hope this makes sense. Does it help any?

Thanks,
David

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 18 February 2004 16:10
To: David Witham
Cc: [EMAIL PROTECTED]
Subject: Re: [SQL] Indexes and statistics 


"David Witham" <[EMAIL PROTECTED]> writes:
> One of the customers is quite large (8.3% of the records):

Hmm.  Unless your rows are quite wide, a random sampling of 8.3% of the
table would be expected to visit every page of the table, probably
several times.  So the planner's cost estimates do not seem out of line
to me; an indexscan *should* be slow.  The first question to ask is why the
estimate deviates from reality.  Are the rows for that customer ID likely to
be physically concentrated into a limited number of physical pages?
Do you have so much RAM that the whole table got swapped in, eliminating
the extra I/O that the planner is expecting?

                        regards, tom lane
