Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-12-14 Thread Markus Schaber
Hi, Christopher,
[sorry for the delay in answering, we were rather busy the last weeks]

On Thu, 04 Nov 2004 21:29:04 -0500
Christopher Browne <[EMAIL PROTECTED]> wrote:

> In an attempt to throw the authorities off his trail, [EMAIL PROTECTED] 
> (Markus Schaber) transmitted:
> > We should create a list of those needs, and then communicate those
> > to the kernel/fs developers. Then we (as well as other apps) can
> > make use of those features where they are available, and use the old
> > way everywhere else.
> 
> Which kernel/fs developers did you have in mind?  The ones working on
> Linux?  Or FreeBSD?  Or DragonflyBSD?  Or Solaris?  Or AIX?

All of them, and others (e.g. Windows).

Once we have a list of those needs, the advocates can talk to the OS
developers. Some OS developers will follow, others not.

Then the postgres folks (and other application developers that benefit
from these capabilities) can point interested users to our benchmarks and
tell them that Foox performs 3 times as fast as BaarOs because it provides
better support for database needs.

> Please keep in mind that many of the PostgreSQL developers are BSD
> folk that aren't particularly interested in creating bleeding edge
> Linux capabilities.

Then this should be motivation to add those things to BSD, maybe as a
patch or loadable module so it does not bloat the mainstream kernel. I
personally would prefer it to appear in BSD first, because in case it
really pays off, it won't be long until it appears in Linux as well :-)

> Jumping into a customized filesystem that neither hardware nor
> software vendors would remotely consider supporting just doesn't look
> like a viable strategy to me.

I did not argue for a custom filesystem, as the OP did. I argued for
isolating a set of useful capabilities PostgreSQL could exploit, and then
trying to convince the kernel developers to include these capabilities,
so they are likely to end up in the main distributions.

I don't know about the BSD market, but I know that Red Hat and SuSE often
ship their own patched kernels (and then officially support those
extensions), and most of those patches are likely to be merged into the
mainline kernel later.

> > Maybe Reiser4 is a step into the right way, and maybe even a
> > postgres plugin for Reiser4 will be worth the effort. Maybe XFS/JFS
> > etc. already have such capabilities. Maybe that's completely wrong.
> 
> The capabilities tend to be redundant.  They tend to implement vaguely
> similar transactional capabilities to what databases have to
> implement.  The similarities are not close enough to eliminate either
> variety of "commit" as redundant.

But a speed gain may be possible by coordinating DB and FS transactions.

Thanks,
Markus

-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com


Re: [PERFORM] Using LIMIT changes index used by planner

2004-12-14 Thread Sven Willenberger
On Mon, 2004-12-13 at 17:43 -0500, Tom Lane wrote:
> Sven Willenberger <[EMAIL PROTECTED]> writes:
> > explain analyze select storelocation,order_number from custacct where 
> > referrer = 1365  and orderdate between '2004-12-07' and '2004-12-07 
> > 12:00:00' order by custacctid limit 10;
>  
> >   QUERY PLAN 
> 
> > ---
> >   Limit  (cost=0.00..43065.76 rows=10 width=43) (actual 
> > time=1306957.216..1307072.111 rows=10 loops=1)
> > ->  Index Scan using custacct2_pkey on custacct 
> > (cost=0.00..92083209.38 rows=21382 width=43) (actual 
> > time=1306957.205..1307072.017 rows=10 loops=1)
> >   Filter: ((referrer = 1365) AND (orderdate >= '2004-12-07 
> > 00:00:00'::timestamp without time zone) AND (orderdate <= '2004-12-07 
> > 12:00:00'::timestamp without time zone))
> >   Total runtime: 1307072.231 ms
> > (4 rows)
> 
> I think this is the well-known issue of lack of cross-column correlation
> statistics.  The planner is well aware that this indexscan will be
> horridly expensive if run to completion --- 
> 
> There isn't any near-term fix in the wind for this, since storing
> cross-column statistics is an expensive proposition that we haven't
> decided how to handle.  Your workaround with separating the ORDER BY
> from the LIMIT is a good one.
> 

You are correct in that there is a high degree of correlation between
the custacctid (which is a serial key) and the orderdate as the orders
generally get entered in the order that they arrive. I will go with the
workaround subselect query plan then.

On a related note, is there a way (other than set enable_seqscan=off) to
give the planner a hint that it is cheaper to use an index scan versus a
seq scan? Using the "workaround" query on any time period greater than 12
hours results in the planner using a seq scan. Disabling the seq scan and
running the query on a full-day period, for example, shows:

explain analyze select foo.storelocation, foo.order_number from (select
storelocation,order_number from custacct where referrer = 1365 and
orderdate between '2004-12-09' and '2004-12-10' order by custacctid) as
foo limit 10 offset 100;

QUERY PLAN
---
 Limit  (cost=2661326.22..2661326.35 rows=10 width=100) (actual time=28446.605..28446.796 rows=10 loops=1)
   ->  Subquery Scan foo  (cost=2661324.97..2661866.19 rows=43297 width=100) (actual time=28444.916..28446.298 rows=110 loops=1)
         ->  Sort  (cost=2661324.97..2661433.22 rows=43297 width=41) (actual time=28444.895..28445.334 rows=110 loops=1)
               Sort Key: custacctid
               ->  Index Scan using orderdate_idx on custacct  (cost=0.00..2657990.68 rows=43297 width=41) (actual time=4.432..28145.212 rows=44333 loops=1)
                     Index Cond: ((orderdate >= '2004-12-09 00:00:00'::timestamp without time zone) AND (orderdate <= '2004-12-10 00:00:00'::timestamp without time zone))
                     Filter: (referrer = 1365)
 Total runtime: 28456.893 ms
(8 rows)


If I interpret the above correctly, the planner guesstimates a cost of
2661326 but the actual cost is much less (assuming time is equivalent to
cost). Would the set statistics command be of any benefit here in
"training" the planner?

Sven






Re: [PERFORM] Using LIMIT changes index used by planner

2004-12-14 Thread Tom Lane
Sven Willenberger <[EMAIL PROTECTED]> writes:
> On a related note, is there a way (other than set enable_seqscan=off) to
> give a hint to the planner that it is cheaper to use and index scan
> versus seq scan?

There are basically two things you can do.  One: if the planner's
rowcount estimates are badly off, you can try increasing the stats
targets for relevant columns in hopes of making the estimates better.
A too-large rowcount estimate will improperly bias the decision towards
seqscan.  Two: if the rowcounts are in the right ballpark but the
estimated costs have nothing to do with reality, you can try tuning
the planner's cost parameters to make the model match local reality
a bit better.  random_page_cost is the grossest knob here;
effective_cache_size is also worth looking at.  See the
pgsql-performance archives for more discussion.
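
To illustrate the kind of experiment I mean (the numbers are placeholders,
not recommendations):

SET random_page_cost = 2;           -- default is 4; lower values favor indexscans
SET effective_cache_size = 100000;  -- in 8K pages; roughly what the OS is caching
-- then re-run the EXPLAIN ANALYZE and compare which plan gets chosen

Both take effect per-session with SET, so you can experiment before
changing anything in postgresql.conf.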

>->  Index Scan using orderdate_idx on custacct
> (cost=0.00..2657990.68 rows=43297 width=41) (actual
> time=4.432..28145.212 rows=44333 loops=1)

In this case there's already a pretty good match between actual and
estimated rowcount, so increasing the stats targets isn't likely to
improve the plan choice; especially since a more accurate estimate would
shift the costs in the "wrong" direction anyway.  Look to the cost
parameters, instead.

Standard disclaimer: don't twiddle the cost parameters on the basis
of only one test case.

regards, tom lane



Re: [PERFORM] Speeding up pg_dump

2004-12-14 Thread Simon Riggs
On Tue, 2004-12-14 at 17:36, Rod Taylor wrote:
> Are there any tricks to speeding up pg_dump aside from doing them from a
> replicated machine?
> 
> I'm using -Fc with no compression.

Run separate pg_dumps for the larger tables and run them concurrently so
you use more CPU and disk resources.

The lower compression levels are fast and nearly as good (in my testing)
as full compression. Using compression tends to use CPU that would
otherwise sit idle while pg_dump waits on disk, and it then saves further
I/O by reducing the output file size.

-- 
Best Regards, Simon Riggs




[PERFORM] Query Optimization

2004-12-14 Thread sarlav kumar
Hi all, 
 
Can someone please help me optimize this query? Is there a better way to write it? I am generating a report of transactions, ordered by time, with details of the sender, receiver, etc.
 
SELECT DISTINCT
  a.time::date || '' || substring(a.time::time::text, 1, 8) AS Time,
  CASE WHEN a.what = 0  THEN 'Money Transfer'
       WHEN a.what = 15 THEN 'Purchase'
       WHEN a.what = 26 THEN 'Merchant Streamline'
       WHEN a.what = 13 THEN 'Reversal'
  END AS Transaction_Type,
  c1.account_no AS SenderAccount,
  c2.account_no AS RecieverAccount,
  b.country AS SenderCountry,
  d.country AS RecieverCountry,
  b.firstname AS SenderFirstName,
  b.lastname AS SenderLastName,
  d.firstname AS ReceiverFirstName,
  d.lastname AS ReceiverLastName,
  a.status AS status,
  (SELECT sum(td.amount * 0.01) FROM transaction_data td
     WHERE td.data_id = a2.id AND td.dir = 1 AND td.uid = a.target_uid) AS ReversedAmount,
  (SELECT sum(td.amount * 0.01) FROM transaction_data td
     WHERE td.data_id = a2.id AND td.dir = 0 AND td.uid = a.uid) AS DepositedAmount,
  a.flags,
  (a.amount * 0.01) AS Amount,
  (a.fee * 0.01) AS Fee
FROM data a, customerdata b, customerdata d, customer c1, customer c2,
     participant p, data a2
WHERE p.id = a.partner_id
  AND a.uid = c1.id
  AND a.target_uid = c2.id
  AND c1.id = b.uid
  AND c2.id = d.uid
  AND a.confirmation IS NOT NULL
  AND a2.ref_id = a.id
  AND (a2.what = 13 OR a2.what = 17)
ORDER BY time DESC;
 
 
QUERY PLAN
---
 Unique  (cost=2978.27..2981.54 rows=8 width=150) (actual time=502.29..506.75 rows=382 loops=1)
   ->  Sort  (cost=2978.27..2978.46 rows=77 width=150) (actual time=502.29..502.61 rows=461 loops=1)
         Sort Key: ((((a."time")::date)::text || ''::text) || "substring"(((a."time")::time without time zone)::text, 1, 8)), CASE WHEN (a.what = 0) THEN 'Money Transfer'::text WHEN (a.what = 15) THEN 'Purchase'::text WHEN (a.what = 26) THEN 'Merchant Streamline'::text WHEN (a.what = 13) THEN 'Reversal'::text ELSE NULL::text END, c1.account_no, c2.account_no, b.country, d.country, b.firstname, b.lastname, d.firstname, d.lastname, a.status, (subplan), (subplan), a.flags, ((a.amount)::numeric * 0.01), ((a.fee)::numeric * 0.01)
         ->  Hash Join  (cost=2687.00..2975.86 rows=77 width=150) (actual time=423.91..493.48 rows=461 loops=1)
               Hash Cond: ("outer".partner_id = "inner".id)
               ->  Nested Loop  (cost=2494.67..2781.99 rows=77 width=146) (actual time=413.19..441.61 rows=472 loops=1)
                     ->  Merge Join  (cost=2494.67..2526.04 rows=77 width=116) (actual time=413.09..429.86 rows=472 loops=1)
                           Merge Cond: ("outer".id = "inner".ref_id)
                           ->  Sort  (cost=1443.39..1458.57 rows=6069 width=108) (actual time=370.14..377.72 rows=5604 loops=1)
                                 Sort Key: a.id
                                 ->  Hash Join  (cost=203.50..1062.01 rows=6069 width=108) (actual time=20.35..335.44 rows=5604 loops=1)
                                       Hash Cond: ("outer".uid = "inner".id)
                                       ->  Merge Join  (cost=0.00..676.43 rows=6069 width=91) (actual time=0.42..255.33 rows=5611 loops=1)
                                             Merge Cond: ("outer".target_uid = "inner".uid)
                                             ->  Merge Join  (cost=0.00..1224.05 rows=6069 width=61) (actual time=0.34..156.74 rows=5611 loops=1)
                                                   Merge Cond: ("outer".target_uid = "inner".id)
                                                   ->  Index Scan using data_target_uid on data a  (cost=0.00..2263.05 rows=6069 width=44) (actual time=0.23..63.87 rows=5630 loops=1)
                                                         Filter: (confirmation IS NOT NULL)
                                                   ->  Index Scan using customer_pkey on customer c2  (cost=0.00..631.03 rows=6120 width=17) (actual time=0.05..50.97 rows=10862 loops=1)
                                             ->  Index Scan using customerdata_uid_idx on customerdata d  (cost=0.00..312.36 rows=6085 width=30) (actual time=0.06..48.95 rows=10822 loops=1)
                                       ->  Hash  (cost=188.20..188.20 rows=6120 width=17) (actual time=19.81..19.81 rows=0 loops=1)
                                             ->  Seq Scan on customer c1  (cost=0.00..188.20 rows=6120 width=17) (actual time=0.03..12.30 rows=6157 loops=1)
                           ->  Sort  (cost=1051.28..1052.52 rows=497 width=8) (actual time=42.05..42.51 rows=542 loops=1)
                                 Sort Key: a2.ref_id
                                 ->  Seq Scan on data a2  (cost=0.00..1029.00 rows=497 width=8) (actual time=0.21..41.14 rows=545 loops=1)
                                       Filter: ((what = 13) OR (what = 17))
                     ->  Index Scan using customerdata_uid_idx on

Re: [PERFORM] Partitioned table performance

2004-12-14 Thread Stacy White
Josh,

You're absolutely correct that the overhead becomes less significant as the
partitioning prunes more rows.  I can even see a two-partition table being
useful in some situations (e.g., a table divided into a relatively small
"recent data" partition and a much larger "historical data" partition).  The
break-even point is when your partitioning scheme prunes 20% of the rows
(assuming you're using the inheritance based scheme).

Thanks again for the reply.  So it sounds like the answer to my original
question is that it's expected that the pseudo-partitioning would introduce
a fairly significant amount of overhead.  Correct?



- Original Message - 
From: "Josh Berkus" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: "Stacy White" <[EMAIL PROTECTED]>
Sent: Friday, December 10, 2004 9:52 PM
Subject: Re: [PERFORM] Partitioned table performance


Stacy,

> Each set of test tables holds 1,000,000 tuples with a partition value of
> '1', and 1,000,000 with a partition value of '2'. The bar* columns are all
> set to non-null values. The 'one_big_foo' table stores all 2M rows in one
> table. 'super_foo' and 'union_foo' split the data into two tables, and use
> inheritance and union views (respectively) to tie them together, as
> described in my previous message.
>
> Query timings and 'EXPLAIN ANALYZE' results for full table scans and for
> partition scans follow:

Hmmm ... interesting.  I think you've demonstrated that pseudo-partitioning
doesn't pay off when there are only 2 partitions.  Examine this:

-> Index Scan using idx_sub_foo2_partition on sub_foo2 super_foo  (cost=0.00..2.01 rows=1 width=4) (actual time=0.221..0.221 rows=0 loops=1)
     Index Cond: (partition = 1::numeric)
Total runtime: 15670.463 ms

As you see, even though the aggregate operation requires a seq scan, the
planner is still able to scan, and discard, sub_foo2, using its index in 0.2
seconds.  Unfortunately, super_foo still needs to contend with:

-> Append (cost=0.00..28376.79 rows=164 width=4) (actual time=6.699..12072.483 rows=100 loops=1)

Right there, in the Append, you lose 6 seconds.   This means that
pseudo-partitioning via inheritance will become a speed gain once you can
"make up" that 6 seconds by being able to discard more partitions.   If you
want, do a test with 6 partitions instead of 2 and let us know how it comes
out.
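
For reference, here's a minimal sketch of the inheritance-based layout
we're discussing (the column types are guesses; your actual definitions
may differ):

CREATE TABLE super_foo (partition numeric, bar1 int, bar2 int);

-- one child table per partition value; the parent itself stays empty
CREATE TABLE sub_foo1 () INHERITS (super_foo);
CREATE TABLE sub_foo2 () INHERITS (super_foo);

-- index the partition column on each child, so a child holding only the
-- "wrong" partition can be scanned and discarded cheaply
CREATE INDEX idx_sub_foo1_partition ON sub_foo1 (partition);
CREATE INDEX idx_sub_foo2_partition ON sub_foo2 (partition);

Queries then go against super_foo with "WHERE partition = 1"; the Append
over all the children is the fixed overhead discussed above.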

Also, keep in mind that there are other reasons to do pseudo-partitioning
beyond your example.  Data write performance, expiring partitions, and
vacuum are big reasons that can motivate partitioning even in cases where
selects are slower.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Hardware purchase question

2004-12-14 Thread Andrew Hood
Joshua D. Drake wrote:
> An Opteron, properly tuned with PostgreSQL, will always beat a Xeon
> in terms of raw CPU.
> RAID 10 will typically always outperform RAID 5 with the same HD config.
> Fibre channel in general will always beat a normal (especially an LSI)
> raid.
>
> Dells suck for PostgreSQL.

Does anyone have any OS recommendations/experiences for PostgreSQL on
Opteron?

Thanks,
Andrew


[PERFORM] Speeding up pg_dump

2004-12-14 Thread Rod Taylor
Are there any tricks to speeding up pg_dump aside from doing them from a
replicated machine?

I'm using -Fc with no compression.

-- 

