[SQL] Extremely slow query

2002-07-29 Thread Patrick Hatcher

System:
OS: RedHat 7.2
Dual PIII XEON
Mem 512 mg
PG: 7.2


I have what I think is a fairly simple summary query, but it takes 1:55 to
run on just 155k records.  The query hits against a single table that I use
for reporting purposes.  This table is truncated, refreshed, reindexed, and
vacuum analysed each night.  Other than the initial table update, no other
data is added during the day.

Any help would be appreciated.  My little Win2k with a PIII 500 and 256mgs
is out performing this monster machine.

query:
SELECT  gmmid, gmmname, divid, divname, feddept, deptname, fedvend,
vendor_name, masterid, master_desc, pageid, oz_description, (
 CASE
WHEN (masterid IS NULL) THEN pageid
ELSE masterid END)::character varying(15) AS pagemaster,
 CASE
WHEN (masterid IS NULL) THEN oz_description
ELSE master_desc
 END  AS pagemaster_desc,
 CASE
WHEN (masterid IS NULL) THEN price_original
ELSE NULL::float8
 END  AS org_price_display,
 CASE
WHEN (masterid IS NULL) THEN cur_price
ELSE NULL::float8
 END  AS cur_price_display, price_original, price_owned_retail,
cur_price, oz_color, oz_size, pageflag, itemnumber,
 sum(cur_demandu + cur_returnu) AS cur_net_units,
 sum(cur_demanddol + wtd_returndol) AS cur_net_dollar,
 sum(wtd_demandu + wtd_returnu) AS wtd_net_units,
 sum(wtd_demanddol + wtd_returndol) AS wtd_net_dollar,
 sum(lw_demand + lw_returnu) AS lw_net_units,
 sum(lw_demanddollar + lw_returndollar) AS lw_net_dollar,
 sum(ptd_demanddollar + ptd_returndollar) AS ptd_net_dollar,
 sum(ptd_demand + ptd_returnu) AS ptd_net_units,
 sum(std_demanddollar + std_returndollar) AS std_net_dollar,
 sum(std_demand + std_returnu) AS std_net_units,
 sum(total_curoh) AS total_curoh,
 sum(total_curoo) AS total_curoo,
 sum((float8(total_curoh) * price_owned_retail)) AS curoh_dollar,
 sum((float8(total_curoo) * price_owned_retail)) AS curoo_dollar,
 sum(total_oh) AS total_oh,
 sum(total_oo) AS total_oo,
 sum((float8(total_oh) * price_owned_retail)) AS oh_dollar,
 sum((float8(total_oh) * price_owned_retail)) AS oo_dollar, mkd_status,
option4_flag
FROM tbldetaillevel_report detaillevel_report_v
GROUP   BY gmmid, gmmname, divid, divname, feddept, deptname, fedvend,
vendor_name, masterid, master_desc, pageid, oz_description,
 CASE
WHEN (masterid IS NULL) THEN pageid
ELSE masterid
 END,
 CASE
WHEN (masterid IS NULL) THEN oz_description
ELSE master_desc
 END,
 CASE
WHEN (masterid IS NULL) THEN price_original
ELSE NULL::float8
 END,
 CASE
WHEN (masterid IS NULL) THEN cur_price
ELSE NULL::float8
 END, price_original, price_owned_retail, cur_price, oz_color, oz_size,
pageflag, itemnumber, mkd_status, option4_flag

EXPLAIN ANALYSE results:
Aggregate  (cost=56487.32..72899.02 rows=15267 width=356)
  ->  Group  (cost=56487.32..66029.01 rows=152667 width=356)
->  Sort  (cost=56487.32..56487.32 rows=152667 width=356)
  ->  Seq Scan on tbldetaillevel_report detaillevel_report_v
(cost=0.00..9932.67 rows=152667 width=356)


Table Def:
CREATE TABLE tbldetaillevel_report (
  pageid int4,
  feddept int4,
  fedvend int4,
  oz_description varchar(254),
  price_owned_retail float8,
  oz_color varchar(50),
  oz_size varchar(50),
  lw_demanddollar float8,
  ptd_demanddollar float8,
  std_demanddollar float8,
  lw_returndollar float8,
  ptd_returndollar float8,
  std_returndollar float8,
  lw_demand int4,
  ptd_demand int4,
  std_demand int4,
  lw_returnu int4,
  ptd_returnu int4,
  std_returnu int4,
  divid int4,
  divname varchar(35),
  gmmid int4,
  gmmname varchar(35),
  deptname varchar(35),
  total_oh int4,
  total_oo int4,
  vendorname varchar(40),
  dunsnumber varchar(9),
  current_week int4,
  current_period int4,
  week_end date,
  varweek int4,
  varperiod int4,
  upc int8,
  pageflag int2,
  upcflag int2,
  pid varchar(30),
  cur_price float8,
  vendor_name varchar(40),
  ly_lw_demanddollar float8,
  ly_ptd_demanddollar float8,
  ly_std_demanddollar float8,
  itemnumber varchar(15),
  mkd_status int2,
  lw_1_demanddollar float8,
  lw_2_demanddollar float8,
  lw_3_demanddollar float8,
  lw_4_demanddollar float8,
  masterid int4,
  master_desc varchar(254),
  cur_demandu int4,
  cur_demanddol float8,
  cur_returnu int4,
  cur_returndol float8,
  wtd_demandu int4,
  wtd_demanddol float8,
  wtd_returnu int4,
  wtd_returndol float8,
  total_curoh int4,
  total_curoo int4,
  curr_date date,
  lw_1_demand int4,
  lw_2_demand int4,
  lw_3_demand int4,
  lw_4_demand int4,
  option4_flag int2,
  option3_flag int2,
  price_original float8,
  price_ticket float8
)





Patrick Hatcher





---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[SQL] Is there a faster way to do this?

2004-06-15 Thread Patrick Hatcher




pg: 7.4.2
RedHat 7.2

Can I get some advice on a possible faster way of doing this:

Scenario:  Each day I update a column in a table with an internal
percentile value.  To arrive at  this value, I'll get a count of records
with sales > 0 and then divide this count by the total number of tile
groups I want.  So for example:
Total records w/sales > 0 = 730,000
 tile# = 100
total percentile groups (730,000 / 100):7300

Now that I have the total number of groups I need, I cycle through my
recordset, grab the key field and the current percentile number and stuff
the values into a temp table. I mod the current row counter against the
total percentile group number.  If it is 0 then I add 1 to the  current
percentile number .  After inserting records into the temp file I then
update the main table.

Using the example above, the first 7300 records would get a  percentile
number of 1, the next 7300  records would get  a percentile number of 2,
then 3,4,5,etc.

Unfortunately, I am going record by record in a loop and the process takes
upwards of 20mins.  Is there a faster way to do this?  I thought about
using limit and offset, but I'm not sure how I would accomplish it.

Below is the function I currently use.  Thanks for any help provided

CREATE OR REPLACE FUNCTION cdm.percentile_calc()
  RETURNS text AS
'DECLARE
 v_interval int4;
 v_tile int4;
 v_percentile int4;
 v_check int4;
 v_count int4;
 v_rowcount int4;
 myRec  RECORD;

BEGIN
 v_count:=0;
 v_tile:= 100;
 v_percentile:=1;
 v_rowcount :=1;
 v_check:=0;


 /* Get count of records with val_purch > 0 */
 select into v_count count(*)  from cdm.cdm_indiv_mast where
val_purch_com >0;

 /* this number will be used as part of our MOD to tell when to add one
to our percentile */
 v_interval := v_count / v_tile;

 CREATE TEMP TABLE cdmperct (f1 int8, f2 int2);


 FOR myRec IN  select indiv_key from cdm.cdm_indiv_mast where
val_purch_com  >0 order by val_purch_com desc  LOOP
  INSERT INTO cdmperct values (myRec.indiv_key,v_percentile);
  v_check = mod(v_rowcount,v_interval);
  IF v_check = 0 THEN
   v_percentile:=v_percentile+1;
  END IF;
  v_rowcount:= v_rowcount+1;
 END LOOP;


   UPDATE cdm.cdm_indiv_mast SET percentiler = f2 from  cdmperct where
indiv_key = f1;


 DROP TABLE cdmperct;
 RETURN  \'DONE\';
END; '
  LANGUAGE 'plpgsql' IMMUTABLE;

Patrick Hatcher
Macys.Com


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])