Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Joe Conway
Castle, Lindsay wrote:
> I'm working on a project that has a data set of approximately 6 million rows
> with about 12,000 different elements, each element has 7 columns of data.
>
> I'm wondering what would be faster from a scanning perspective (SELECT
> statements with some calculations) for this type of set up;
>     one table for all the data
>     one table for each data element (12,000 tables)
>     one table per subset of elements (eg all elements that start with
>     "a" in a table)
I, for one, am having difficulty understanding exactly what your data 
looks like, so it's hard to give advice. Maybe some concrete examples of 
what you are calling "rows", "elements", and "columns" would help.

Does each of 6 million rows have 12000 elements, each with 7 columns? Or 
do you mean that out of 6 million rows, there are 12000 distinct kinds 
of elements?

> Can I do anything with Indexing to help with performance?  I suspect for the
> majority of scans I will need to evaluate an outcome based on 4 or 5 of the
> 7 columns of data.
Again, this isn't clear to me -- but maybe I'm just being dense ;-)
Does this mean you expect 4 or 5 items in your WHERE clause?
Joe



Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Joe Conway
Castle, Lindsay wrote:
> The data structure looks like this:
>     element
>     date
>     num1
>     num2
>     num3
>     num4
>     units
>
> There are approx 12,000 distinct elements for a total of about 6 million
> rows of data.
Ahh, that helps! So are the elements evenly distributed, i.e. are there 
approx 500 rows of each element? If so, it should be plenty quick to put 
all the data in one table with an index on "element" (and maybe a 
multicolumn key, depending on other factors).

> The scanning technology I want to use may need a different number of rows
> and different columns depending on the scan formula;
>     eg scan1 may need num1, num2 and num3 from the last 200 rows for
>        element "x"
>        scan2 may need num1, units from the last 10 rows for element "y"
When you say "last X rows", do you mean sorted by "date"? If so, you 
might want that index to be on (element, date). Then do:

SELECT num1, num2, num3 FROM mytable WHERE element = 'an_element'
order by date DESC LIMIT 20;

Replace num1, num2, num3 by whatever columns you want, and "LIMIT X" as 
the number of rows you want.
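
For concreteness, a minimal sketch of the single-table layout being
suggested (the column types and the index name below are assumptions;
the thread only gives the column names):

    CREATE TABLE mytable (
        element  varchar(32),   -- type assumed
        date     date,
        num1     real,          -- types of num1..num4 assumed
        num2     real,
        num3     real,
        num4     real,
        units    varchar(16)    -- type assumed
    );

    -- The (element, date) index suggested above; it lets the
    -- ORDER BY date DESC ... LIMIT query read just the newest rows
    -- for one element instead of scanning everything.
    CREATE INDEX mytable_element_date ON mytable (element, date);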

HTH,

Joe



[PERFORM] One table or many tables for data set

2003-07-22 Thread Castle, Lindsay
Hi all,

I'm working on a project that has a data set of approximately 6 million rows
with about 12,000 different elements, each element has 7 columns of data.

I'm wondering what would be faster from a scanning perspective (SELECT
statements with some calculations) for this type of set up;
    one table for all the data
    one table for each data element (12,000 tables)
    one table per subset of elements (eg all elements that start with
    "a" in a table)

The data is static once its in the database, only new records are added on a
regular basis.

I'd like to run quite a few different formulated scans in the longer term so
having efficient scans is a high priority.

Can I do anything with Indexing to help with performance?  I suspect for the
majority of scans I will need to evaluate an outcome based on 4 or 5 of the
7 columns of data.

Thanks in advance :-)

Linz



Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Castle, Lindsay
Thanks Joe,

This certainly helps me get going on the right path.


Lindsay Castle
EDS Australia
Midrange & Distributed Tools
Infrastructure Tools AP
Ph:   +61 (0)8 8464 7101
Fax:  +61 (0)8 8464 2135


-Original Message-
From: Joe Conway [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 23 July 2003 11:06 AM
To: Castle, Lindsay
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] One table or many tables for data set


Castle, Lindsay wrote:
> The data structure looks like this:
>   element
>   date
>   num1
>   num2
>   num3
>   num4
>   units
> 
> There are approx 12,000 distinct elements for a total of about 6 million
> rows of data.

Ahh, that helps! So are the elements evenly distributed, i.e. are there 
approx 500 rows of each element? If so, it should be plenty quick to put 
all the data in one table with an index on "element" (and maybe a 
multicolumn key, depending on other factors).

> The scanning technology I want to use may need a different number of rows
> and different columns depending on the scan formula;
>   eg scan1 may need num1, num2 and num3 from the last 200 rows for
> element "x"
>  scan2 may need num1, units from the last 10 rows for element "y"

When you say "last X rows", do you mean sorted by "date"? If so, you 
might want that index to be on (element, date). Then do:

SELECT num1, num2, num3 FROM mytable WHERE element = 'an_element'
order by date DESC LIMIT 20;

Replace num1, num2, num3 by whatever columns you want, and "LIMIT X" as 
the number of rows you want.

HTH,

Joe



Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Rod Taylor
On Tue, 2003-07-22 at 21:50, Rod Taylor wrote:
> Ok.. Unless I'm missing something, the data will be static (or near
> static).  It also sounds as if the structure is common for elements, so
> you probably only want 2 tables.

I misunderstood. Do what Joe suggested.




Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Castle, Lindsay
Thanks Rod 

My explanations will be better next time. :-)


-Original Message-
From: Rod Taylor [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 23 July 2003 11:41 AM
To: Castle, Lindsay
Cc: Postgresql Performance
Subject: Re: One table or many tables for data set


On Tue, 2003-07-22 at 21:50, Rod Taylor wrote:
> Ok.. Unless I'm missing something, the data will be static (or near
> static).  It also sounds as if the structure is common for elements, so
> you probably only want 2 tables.

I misunderstood. Do what Joe suggested.



Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Castle, Lindsay
Apologies, let me clear this up a bit (hopefully) :-)

The data structure looks like this:
    element
    date
    num1
    num2
    num3
    num4
    units

There are approx 12,000 distinct elements for a total of about 6 million
rows of data.

The scanning technology I want to use may need a different number of rows
and different columns depending on the scan formula;
    eg scan1 may need num1, num2 and num3 from the last 200 rows for
       element "x"
       scan2 may need num1, units from the last 10 rows for element "y"

I can either do the scans and calculate what I need within SQL or drag the
data out and process it outside of SQL; my preference is to go inside SQL as
I've assumed that would be faster and less development work.

If I went with the many tables design I would not expect to need to join
between tables; there is no relationship between the different elements that
I need to cater for.

Cheers,

Linz


Castle, Lindsay wrote:
> I'm working on a project that has a data set of approximately 6 million rows
> with about 12,000 different elements, each element has 7 columns of data.
> 
> I'm wondering what would be faster from a scanning perspective (SELECT
> statements with some calculations) for this type of set up;
>   one table for all the data
>   one table for each data element (12,000 tables)
>   one table per subset of elements (eg all elements that start with
> "a" in a table)
> 

I, for one, am having difficulty understanding exactly what your data 
looks like, so it's hard to give advice. Maybe some concrete examples of 
what you are calling "rows", "elements", and "columns" would help.

Does each of 6 million rows have 12000 elements, each with 7 columns? Or 
do you mean that out of 6 million rows, there are 12000 distinct kinds 
of elements?

> Can I do anything with Indexing to help with performance?  I suspect for the
> majority of scans I will need to evaluate an outcome based on 4 or 5 of the
> 7 columns of data.
> 

Again, this isn't clear to me -- but maybe I'm just being dense ;-)
Does this mean you expect 4 or 5 items in your WHERE clause?



Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Rod Taylor
Ok.. Unless I'm missing something, the data will be static (or near
static).  It also sounds as if the structure is common for elements, so
you probably only want 2 tables.

One with 6 million rows and any row information.  The other with 6
million * 12000 rows with the element data linking to the row
information line with an identifier, and have an 'element type' (I
assume there are 12000 types of elements -- or something of that
nature).

Unique constraint on (row_identifier, element_type)

The speed you achieve will be based on what indexes you create.

If you spend most of your time with one or a few (5% or less of the
structure) element types, create a partial index for those element types
only, and a partial index for all of the others.

If you have a standard mathematical operation on num1, num2, etc. you
may want to make use of functional indexes to index the result of the
calculation.

Be sure to create the tables WITHOUT OIDS, be prepared for the data load
to take a while, and CLUSTER the table based on your most commonly used
index (once the indexes have been set up).
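
A rough sketch of those suggestions in DDL (all names and types below are
hypothetical, not the poster's actual schema):

    CREATE TABLE element_data (
        row_identifier integer NOT NULL,
        element_type   integer NOT NULL,
        num1           real,
        units          text,
        CONSTRAINT element_data_uq UNIQUE (row_identifier, element_type)
    ) WITHOUT OIDS;

    -- General-purpose index; also the natural target for CLUSTER below.
    CREATE INDEX element_data_type ON element_data (element_type, row_identifier);

    -- Partial index for one heavily used element type ('42' is made up).
    CREATE INDEX element_data_type42 ON element_data (row_identifier)
        WHERE element_type = 42;

    -- Functional index on a calculation over num1 (7.3-era functional
    -- indexes must be a single function applied to columns).
    CREATE INDEX element_data_abs ON element_data (abs(num1));

    -- After the data load, order the table on the most commonly used index.
    CLUSTER element_data_type ON element_data;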

To help with speed, we would need to see EXPLAIN ANALYZE results and the
query being performed.

On Tue, 2003-07-22 at 21:00, Castle, Lindsay wrote:
> All rows have the same structure, the data itself will be different for each
> row, the structure is something like this:
> 
>   element
>   date
>   num1
>   num2
>   num3
>   num4
>   units
> 
> Thanks,
> 
> 
> Lindsay Castle
> EDS Australia
> Midrange & Distributed Tools
> Infrastructure Tools AP
> Ph:   +61 (0)8 8464 7101
> Fax:  +61 (0)8 8464 2135
> 
> 
> -Original Message-
> From: Rod Taylor [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 23 July 2003 10:24 AM
> To: Castle, Lindsay
> Cc: Postgresql Performance
> Subject: Re: One table or many tables for data set
> 
> 
> On Tue, 2003-07-22 at 20:34, Castle, Lindsay wrote:
> > Hi all,
> > 
> > I'm working on a project that has a data set of approximately 6 million rows
> > with about 12,000 different elements, each element has 7 columns of data.
> 
> Are these 7 columns the same for each element?
> 




Re: [PERFORM] One table or many tables for data set

2003-07-22 Thread Rod Taylor
On Tue, 2003-07-22 at 20:34, Castle, Lindsay wrote:
> Hi all,
> 
> I'm working on a project that has a data set of approximately 6 million rows
> with about 12,000 different elements, each element has 7 columns of data.

Are these 7 columns the same for each element?




[PERFORM] slow table updates

2003-07-22 Thread Reece Hart
I'm trying to update a table but it's taking a very long time. I would appreciate any tips folks may have about ways to speed it up.

The table is paprospect2, as below:

\d paprospect2
    Column    |  Type   |                            Modifiers
--------------+---------+------------------------------------------------------------------
 pfeature_id  | integer | not null default nextval('unison.pfeature_pfeature_id_seq'::text)
 pseq_id      | integer | not null
 pftype_id    | integer | not null
 start        | integer |
 stop         | integer |
 confidence   | real    |
 run_id       | integer | not null
 [snip 13 integer and real columns]
 run_id_new   | integer |

Indexes: paprospect2_redundant_alignment unique btree (pseq_id, "start", stop, run_id, pmodel_id),
 p2thread_p2params_id btree (run_id),
 p2thread_pmodel_id btree (pmodel_id)
Foreign Key constraints: pftype_id_exists FOREIGN KEY (pftype_id) REFERENCES pftype(pftype_id) ON UPDATE CASCADE ON DELETE CASCADE,
 p2thread_pmodel_id_exists FOREIGN KEY (pmodel_id) REFERENCES pmprospect2(pmodel_id) ON UPDATE CASCADE ON DELETE CASCADE,
 pseq_id_exists FOREIGN KEY (pseq_id) REFERENCES pseq(pseq_id) ON UPDATE CASCADE ON DELETE CASCADE
Triggers: p2thread_i_trigger


The columns pfeature_id..confidence and run_id_new are from an inherited table. Although the inheritance itself is probably not relevant here (correction welcome), I suspect it may be relevant that all existing rows were written before the table definition included run_id_new. p2thread_i_trigger is defined to fire on insert only (not update).

paprospect2 contains ~40M rows. The goal now is to migrate the data to the supertable-inherited column with

update paprospect2 set run_id_new=run_id;


The update's been running for 5 hours (unloaded dual 2.4 GHz Xeon w/2GB RAM, SCSI160 10K drive). There are no other jobs running. Load is ~1.2 and the update's using ~3-5% of the CPU.

$ ps -ostime,time,pcpu,cmd 28701
STIME TIME %CPU CMD
12:18 00:07:19  2.3 postgres: admin csb 128.137.116.213 UPDATE

This suggests that the update is I/O bound (duh) and vmstat supports this:

$ vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff   cache  si  so    bi    bo   in    cs  us  sy  id
 0  1  0      0  11288  94632 3558960   0   0    14     6   12    21   1   0   6
 0  1  0      0  12044  94632 3558956   0   0     0   972  332    16   0   1  99
 0  1  0      0  11092  94632 3558932   0   0    16  4420  309    25   0   2  97
 0  1  0      0  11456  94636 3558928   0   0     0   980  326    23   0   1  99
 1  0  0      0  12340  94636 3558924   0   0    16   532  329    14   0   0 100
 0  1  0      0  12300  94636 3558916   0   0     0  1376  324    16   1   0  99
 0  1  0      0  12252  94636 3558904   0   0    16  1888  325    18   0   0  99
 0  1  0      0  11452  94636     355   0   0    16  2864  324    23   1   1  98
 0  1  0      0  12172  94636 3558884   0   0     0   940  320    12   0   1  99
 0  1  0      0  12180  94636 3558872   0   0    16  1840  318    22   0   1  99
 0  1  0      0  11588  94636 3558856   0   0     0  2752  312    16   1   2  97


Presumably the large number of blocks written (bo) versus blocks read (bi) reflects an enormous amount of bookkeeping that has to be done for MVCC, logging, perhaps rewriting a row for the new definition (a guess -- I don't know how this is handled), indices, etc. There's no swapping and no processes are waiting. In short, it seems that this is ENTIRELY an I/O issue. Obviously, faster drives will help (but probably only by a small factor).

Any ideas how I might speed this up? Presumably this is all getting wrapped in a transaction -- does that hurt me for such a large update?
 
Thanks,
Reece


Bonus diversionary topic: In case it's not obvious, the motivation for this is that the subtable (paprospect2) contains a column (run_id) whose definition I would like to migrate to the inherited table (i.e., the 'super-table'). Although postgresql permits adding a column to a supertable with the same name as an extant column in a subtable, it appears that such "merged definition" columns do not have the same properties as a typical inherited column. In particular, dropping the column from the supertable does not drop it from the subtable (but renaming it does change both names). Hmm.
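
A sketch of the inheritance behavior being described, reduced to
hypothetical minimal tables (not the real Unison schema):

    -- The child table already has run_id; the parent gains a
    -- same-named column later.
    CREATE TABLE parent_demo (pfeature_id integer NOT NULL);
    CREATE TABLE child_demo (run_id integer) INHERITS (parent_demo);

    -- Permitted, but the new parent column "merges" with the existing
    -- child column rather than behaving like a normal inherited column:
    ALTER TABLE parent_demo ADD COLUMN run_id integer;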




-- 
Reece Hart, Ph.D.   [EMAIL PROTECTED], http://www.gene.com/
Genentech, Inc. 650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93http://www.in-machina.com/~reece/
South San Francisco, CA  94080-4990 [EMAIL PROTECTED], GPG: 0x25EC91A0
Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Jord Tanner
On Tue, 2003-07-22 at 11:50, Bruce Momjian wrote:
> Jord Tanner wrote:
> > On Tue, 2003-07-22 at 10:39, Bruce Momjian wrote:
> > > But CPU affinity isn't related to hyperthreading, as far as I know. 
> > > CPU affinity tries to keep processes on the same cpu in case there is
> > > still valuable info in the cpu cache.
> > > 
> > 
> > It is true that CPU affinity is designed to prevent the dump of valuable
> > CPU cache. My thought is that if you are trying to prevent CPU
> > contention, you could use CPU affinity to prevent 2 postmaster processes
> > from running simultaneously on the same die. Am I out to lunch here?
> > I've not worked with CPU affinity before, so I'm not familiar with the
> > intimate details.
> 
> I guess you could but it is the backends that use the cpu.  I don't
> think manually specifying affinity will work for most applications.

This is beating a dead horse, but I'll take one more kick at it.

CPU affinity is defined by a bit mask, so multiple processors can be
selected. It is also inherited by child processes, so assigning CPU 0
and CPU 2 (which I assume would be on different dies in a dual processor
hyper-threading system) to the parent postmaster should prevent CPU
contention with respect to the postgres backend. 

I would be very interested to see if any advantage could be gained by a
combination of multiple HT processors and cpu affinity over multiple
non-HT processors. Yet Another Performance Testing To Do (YAPTTD)!

-- 
Jord Tanner <[EMAIL PROTECTED]>




Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Bruce Momjian
Jord Tanner wrote:
> On Tue, 2003-07-22 at 10:39, Bruce Momjian wrote:
> > But CPU affinity isn't related to hyperthreading, as far as I know. 
> > CPU affinity tries to keep processes on the same cpu in case there is
> > still valuable info in the cpu cache.
> > 
> 
> It is true that CPU affinity is designed to prevent the dump of valuable
> CPU cache. My thought is that if you are trying to prevent CPU
> contention, you could use CPU affinity to prevent 2 postmaster processes
> from running simultaneously on the same die. Am I out to lunch here?
> I've not worked with CPU affinity before, so I'm not familiar with the
> intimate details.

I guess you could but it is the backends that use the cpu.  I don't
think manually specifying affinity will work for most applications.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] Wrong plan or what ?

2003-07-22 Thread Mendola Gaetano
"Josh Berkus" <[EMAIL PROTECTED]>
> Gaetano,
>
> > QUERY PLAN
> >  Hash Join  (cost=265.64..32000.76 rows=40612 width=263) (actual
> > time=11074.21..11134.28 rows=10 loops=1)
> >    Hash Cond: ("outer".id_user = "inner".id_user)
> >    ->  Seq Scan on user_logs ul  (cost=0.00..24932.65 rows=1258965
> > width=48) (actual time=0.02..8530.21 rows=1258966 loops=1)
>
> OK, here's your problem:
>
> The planner thinks that you're going to get 40612 rows out of the final
> join, not 10.  If the row estimate was correct, then the Seq Scan would
> be a reasonable plan.  But it's not.  Here's some steps you can take to
> clear things up for the planner:
>
> 1) Make sure you've VACUUM ANALYZED
> 2) Adjust the following postgresql.conf statistics:
> a) effective_cache_size: increase to 70% of available (not used by other
>    processes) RAM.
> b) random_page_cost: decrease, maybe to 2.
> c) default_statistics_target: try increasing to 100
>    (warning: this will significantly increase the time required to do
>    ANALYZE)
>
> Then test again!

No improvement at all:
I pushed default_statistics_target to 1000
but the rows expected are still 40612 :-(
Of course I restarted the postmaster and ran VACUUM ANALYZE on the DB.


Thank you
Gaetano


Re: [PERFORM] Tunning FreeeBSD and PostgreSQL

2003-07-22 Thread Vivek Khera
> "BM" == Bruce Momjian <[EMAIL PROTECTED]> writes:

BM> I know Linux has pagable shared memory, and you can resize the maximum
BM> in a running kernel, so it seems they must have abandonded the linkage
BM> between shared page tables and the kernel.  This looks interesting:

Thanks for the info.  You can resize it in FreeBSD as well, using the
sysctl command to set the various kern.ipc.shm* values.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/



Re: [PERFORM] Wrong plan or what ?

2003-07-22 Thread Josh Berkus
Gaetano,

> QUERY PLAN
>  Hash Join  (cost=265.64..32000.76 rows=40612 width=263) (actual
> time=11074.21..11134.28 rows=10 loops=1)
>Hash Cond: ("outer".id_user = "inner".id_user)
>->  Seq Scan on user_logs ul  (cost=0.00..24932.65 rows=1258965 width=48)
> (actual time=0.02..8530.21 rows=1258966 loops=1)

OK, here's your problem:

The planner thinks that you're going to get 40612 rows out of the final join, 
not 10.   If the row estimate was correct, then the Seq Scan would be a 
reasonable plan.   But it's not.   Here's some steps you can take to clear 
things up for the planner:

1) Make sure you've VACUUM ANALYZED
2) Adjust the following postgresql.conf statistics:
a) effective_cache_size: increase to 70% of available (not used by other 
processes) RAM.
b) random_page_cost: decrease, maybe to 2.
c) default_statistics_target: try increasing to 100
(warning: this will significantly increase the time required to do 
ANALYZE)

Then test again!
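
For example, these can be tried per-session before editing postgresql.conf
(the numbers below are illustrative only; effective_cache_size is measured
in 8k pages, so ~90000 pages is roughly 700MB, i.e. 70% of a 1GB machine):

    VACUUM ANALYZE;
    SET effective_cache_size = 90000;
    SET random_page_cost = 2;
    SET default_statistics_target = 100;
    ANALYZE user_logs;   -- re-gather statistics at the new target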


-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] Wrong plan or what ?

2003-07-22 Thread Mendola Gaetano
Forget my PS to last message.




Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Jord Tanner
On Tue, 2003-07-22 at 10:39, Bruce Momjian wrote:
> But CPU affinity isn't related to hyperthreading, as far as I know. 
> CPU affinity tries to keep processes on the same cpu in case there is
> still valuable info in the cpu cache.
> 

It is true that CPU affinity is designed to prevent the dump of valuable
CPU cache. My thought is that if you are trying to prevent CPU
contention, you could use CPU affinity to prevent 2 postmaster processes
from running simultaneously on the same die. Am I out to lunch here?
I've not worked with CPU affinity before, so I'm not familiar with the
intimate details.
 

-- 
Jord Tanner <[EMAIL PROTECTED]>




Re: [PERFORM] Wrong plan or what ?

2003-07-22 Thread Mendola Gaetano
"Josh Berkus" <[EMAIL PROTECTED]>
> Gaetano,
>
> > SELECT * from user_logs where id_user in (
> >  10943,   10942,   10934,   10927,   10910,  10909
> > );
> > [SNIPPED]
>
> > Why does the planner or the executor ( I don't know which ) not follow
> > the same strategy ?
>
> It is, actually, according to the query plan.
>
> Can you post the EXPLAIN ANALYZE for the above query?

Index Scan using idx_user_user_logs, idx_user_user_logs, idx_user_user_logs,
idx_user_user_logs, idx_user_user_logs, idx_user_user_logs on user_logs
(cost=0.00..5454.21 rows=2498 width=48) (actual time=0.09..0.28 rows=10
loops=1)
   Index Cond: ((id_user = 10943) OR (id_user = 10942) OR (id_user = 10934)
OR (id_user = 10927) OR (id_user = 10910) OR (id_user = 10909))
 Total runtime: 0.41 msec
(3 rows)


Thank you
Gaetano


PS: if I execute the query I obtain 10 rows instead of the 3 that the
EXPLAIN ANALYZE says.


Re: [PERFORM] Wrong plan or what ?

2003-07-22 Thread Josh Berkus
Gaetano,

> SELECT * from user_logs where id_user in (
>  10943,   10942,   10934,   10927,   10910,  10909
> );
> [SNIPPED]

> Why does the planner or the executor ( I don't know which ) not follow
> the same strategy ?

It is, actually, according to the query plan.   

Can you post the EXPLAIN ANALYZE for the above query?

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Bruce Momjian

But CPU affinity isn't related to hyperthreading, as far as I know. 
CPU affinity tries to keep processes on the same cpu in case there is
still valuable info in the cpu cache.

---

Jord Tanner wrote:
> The Linux 2.6 kernel will have the ability to set CPU affinity for
> specific processes. There is a patch for the 2.4 kernel at
> http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity
> 
> RedHat 9 already has support for CPU affinity built in.
> 
> The July 2003 issue of Linux Journal includes a little C program (on
> page 20) that gives you a shell level interface to the CPU affinity
> system calls, so you can dynamically assign processes to specific CPUs.
> I haven't tried it, but it looks very cool (my only SMP machine is in
> production, and I don't want to mess with it). If you try it out, please
> share your experiences with the list.
> 
> 
> Jord Tanner
> Independent Gecko Consultants
> 
> > On Tue, 2003-07-22 at 10:10, SZUCS Gábor wrote:
> > "by default" -- do you mean there is a way to tell Linux to favor the second
> > real cpu over the HT one? how?
> > 
> > G.
> > --- cut here ---
> > - Original Message - 
> > From: "Bruce Momjian" <[EMAIL PROTECTED]>
> > Sent: Tuesday, July 22, 2003 6:26 PM
> > Subject: Re: [PERFORM] Dual Xeon + HW RAID question
> > 
> > 
> > > Right, I simplified it.  The big deal is whether the OS favors the
> > > second real CPU over one of the virtual CPU's on the same die --- by
> > > default, it doesn't.  Even if it did work perfectly, you are talking
> > > about going from 1 to 1.4 or 2 to 2.8, which doesn't seem like much.
> > 
> > 
> -- 
> Jord Tanner <[EMAIL PROTECTED]>
> 
> 
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Bruce Momjian
SZUCS Gábor wrote:
> "by default" -- do you mean there is a way to tell Linux to favor the second
> real cpu over the HT one? how?

Right now there is no way the kernel can tell which virtual cpu's are on
which physical cpu, and that is the problem.  Once there is a way,
hyperthreading will be more useful, but even then, it doesn't double
your CPU throughput, just increases it by about 40%.


> > Right, I simplified it.  The big deal is whether the OS favors the
> > second real CPU over one of the virtual CPU's on the same die --- by
> > default, it doesn't.  Even if it did work perfectly, you are talking
> > about going from 1 to 1.4 or 2 to 2.8, which doesn't seem like much.
> 
> 
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Jord Tanner
The Linux 2.6 kernel will have the ability to set CPU affinity for
specific processes. There is a patch for the 2.4 kernel at
http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity

RedHat 9 already has support for CPU affinity built in.

The July 2003 issue of Linux Journal includes a little C program (on
page 20) that gives you a shell level interface to the CPU affinity
system calls, so you can dynamically assign processes to specific CPUs.
I haven't tried it, but it looks very cool (my only SMP machine is in
production, and I don't want to mess with it). If you try it out, please
share your experiences with the list.


Jord Tanner
Independent Gecko Consultants

On Tue, 2003-07-22 at 10:10, SZUCS Gábor wrote:
> "by default" -- do you mean there is a way to tell Linux to favor the second
> real cpu over the HT one? how?
> 
> G.
> --- cut here ---
> - Original Message - 
> From: "Bruce Momjian" <[EMAIL PROTECTED]>
> Sent: Tuesday, July 22, 2003 6:26 PM
> Subject: Re: [PERFORM] Dual Xeon + HW RAID question
> 
> 
> > Right, I simplified it.  The big deal is whether the OS favors the
> > second real CPU over one of the virtual CPU's on the same die --- by
> > default, it doesn't.  Even if it did work perfectly, you are talking
> > about going from 1 to 1.4 or 2 to 2.8, which doesn't seem like much.
> 
> 
-- 
Jord Tanner <[EMAIL PROTECTED]>




Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread scott.marlowe
On Tue, 22 Jul 2003, Jim C. Nasby wrote:

> On Tue, Jul 22, 2003 at 03:27:20PM +0200, Alexander Priem wrote:
> > Wow, I never figured how many different RAID configurations one could think
> > of   :)
> > 
> > After reading lots of material, forums and of course, this mailing-list, I
> > think I am going for a RAID5 configuration of 6 disks (18Gb, 15.000 rpm
> > each), one of those six disks will be a 'hot spare'. I will just put the OS,
> > the WAL and the data on one volume. RAID10 is way too expensive   :)
> > 
> > If I understand correctly, this will give great read-performance, but less
> > write-performance. But since this server will be equipped with an embedded
> > RAID controller featuring 128Mb of battery-backed cache, I figure that this
> > controller will negate that (at least somewhat). I will need to find out
> > whether this cache can be configured so that it will ONLY cache WRITES, not
> > READS
>  
> I think the bigger issue with RAID5 write performance in a database is
> that it hits every spindle.

This is a common misconception.

If you are writing 4k out to a RAID5 of 10 disks, this is what happens:

(assuming 64k stripes...)
READ data stripe (64k read)
READ parity stripe (64k read)
make changes to data stripe
XOR new data stripe with old parity stripe to get a new parity stripe
write new parity stripe (64k)
write new data stripe (64k)

So it's not as bad as you might think.  No modern controller (or sw raid 
for linux) hits all the spindles anymore for writes.  As you add more 
drives to a RAID5 writes actually get faster on average, because there's 
less chance of having contention for the same drives (remember, parity 
moves about in RAID5 so the parity disk isn't a choke point in RAID5 like 
it is in RAID4.)

> The real performance bottleneck you run into
> is latency, especially the latency of positioning the heads. I don't
> have any proof to this theory, but I believe this is why moving WAL
> and/or temp_db to separate drives from the main database files can be a
> big benefit for some applications; not because of disk bandwidth but
> because it drastically cuts down the amount of time the heads have to
> spend flying around the disk.

This is absolutely true.  Moving the heads costs hugely.  While most
modern drives have SEEK times <10 ms, the SETTLE times tend to be about
that as well, followed by an average of about 3 ms of rotational latency
to allow the proper sector to be under the head (10krpm drives rotate once
about every 6 ms.)
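
To put rough numbers on that (using the figures above as assumptions):

    10 ms seek + 10 ms settle + 3 ms rotational latency  =  ~23 ms per random access
    1000 ms / 23 ms                                      =  ~43 random accesses/sec per drive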





Re: [PERFORM] Tunning FreeeBSD and PostgreSQL

2003-07-22 Thread Bruce Momjian
Vivek Khera wrote:
> > "BM" == Bruce Momjian <[EMAIL PROTECTED]> writes:
> 
> >> not anything pre-allocated (from my understanding).  These settings
> >> allow for up to 100,000 shared buffers (I currently only use 30,000
> >> buffers)
> 
> BM> I think the only downside to making them too big is that you allocate
> BM> page tables and prevent that address range from being used by other
> 
> Does this apply in general or just on FreeBSD?

Let me tell you how it traditionally worked  --- each process has the
kernel address space accessible at a fixed address --- it has to, so the
process can make kernel calls and run those kernel calls in its own
address space, though with a kernel stack and data space.

What they did with shared memory was to put shared memory in the same
address space with the kernel, because everyone had that address range
mapped into their address space already.  If each process had its own
private copy of the kernel page tables, there is bloat in having the
kernel address space be larger than required.  However, if the kernel
page tables are shared by all processes, then there isn't much bloat,
just less addressable user memory, and if you don't have anything near 4
gigs of RAM, it isn't a problem.

I know Linux has pagable shared memory, and you can resize the maximum
in a running kernel, so it seems they must have abandonded the linkage
between shared page tables and the kernel.  This looks interesting:

http://www.linux-tutorial.info/cgi-bin/display.pl?312&0&0&0&3

and the Contents on the left show additional info like the i386 virtual
directory/page tables:

http://www.linux-tutorial.info/cgi-bin/display.pl?261&0&0&0&3

So it seems Linux has moved in the direction of making shared memory act
just like ordinary allocated memory, except it is shared, meaning I
think each process has its own page tables for the shared memory.  Once
you do that, you get the ability to size it however you want, but you
lose shared page tables, and it can now be swapped out, which can be bad
for performance.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread SZUCS Gábor
"by default" -- do you mean there is a way to tell Linux to favor the second
real cpu over the HT one? how?

G.
--- cut here ---
- Original Message - 
From: "Bruce Momjian" <[EMAIL PROTECTED]>
Sent: Tuesday, July 22, 2003 6:26 PM
Subject: Re: [PERFORM] Dual Xeon + HW RAID question


> Right, I simplified it.  The big deal is whether the OS favors the
> second real CPU over one of the virtual CPU's on the same die --- by
> default, it doesn't.  Even if it did work perfectly, you are talking
> about going from 1 to 1.4 or 2 to 2.8, which doesn't seem like much.




[PERFORM] Wrong plan or what ?

2003-07-22 Thread Mendola Gaetano
Hi all,
I'm running Postgres 7.3.3 and I'm performing this simple select:

select *
from user_logs ul,
 user_data ud,
 class_default cd
where
 ul.id_user = ud.id_user and
 ud.id_class = cd.id_class and
 cd.id_provider = 39;

these are the number of rows for each table:

user_logs:  1258955
class_default: 31   ( only one with id_provider = 39 )
user_data: 10274;


this is the explain analyze for that query:

QUERY PLAN
 Hash Join  (cost=265.64..32000.76 rows=40612 width=263) (actual time=11074.21..11134.28 rows=10 loops=1)
   Hash Cond: ("outer".id_user = "inner".id_user)
   ->  Seq Scan on user_logs ul  (cost=0.00..24932.65 rows=1258965 width=48) (actual time=0.02..8530.21 rows=1258966 loops=1)
   ->  Hash  (cost=264.81..264.81 rows=331 width=215) (actual time=30.22..30.22 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..264.81 rows=331 width=215) (actual time=29.95..30.20 rows=6 loops=1)
               ->  Seq Scan on class_default cd  (cost=0.00..1.39 rows=1 width=55) (actual time=0.08..0.10 rows=1 loops=1)
                     Filter: (id_provider = 39)
               ->  Index Scan using idx_user_data_class on user_data ud  (cost=0.00..258.49 rows=395 width=160) (actual time=29.82..29.96 rows=6 loops=1)
                     Index Cond: (ud.id_class = "outer".id_class)
 Total runtime: 11135.65 msec
(10 rows)


I'm able to perform that select with these 3 steps:

SELECT id_class from class_default where id_provider = 39;
 id_class
----------
       48
(1 row)

SELECT id_user from user_data where id_class in ( 48 );
 id_user
---------
   10943
   10942
   10934
   10927
   10910
   10909
(6 rows)


SELECT * from user_logs where id_user in (
 10943,   10942,   10934,   10927,   10910,  10909
);
[SNIPPED]

and the time amount is a couple of milliseconds.

Why does the planner or the executor ( I don't know which ) not follow
the same strategy ?
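
For reference, the three steps fold into a single statement along these
lines (shown only as a sketch; this is not from the original post):

    SELECT *
    FROM user_logs
    WHERE id_user IN (
        SELECT id_user
        FROM user_data
        WHERE id_class IN (
            SELECT id_class FROM class_default WHERE id_provider = 39
        )
    );

Note, though, that 7.3 often planned IN (subquery) poorly as well; an
EXISTS rewrite was the usual workaround on that release.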



Thank you
Gaetano Mendola


Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Bruno Wolff III
On Tue, Jul 22, 2003 at 11:40:35 +0200,
  Vincent van Leeuwen <[EMAIL PROTECTED]> wrote:
> 
> About RAID types: the fastest RAID type by far is RAID-10. However, this will
> cost you a lot of useable diskspace, so it isn't for everyone. You need at
> least 4 disks for a RAID-10 array. RAID-5 is a nice compromise if you want as
> much useable diskspace as possible and still want to be redundant. RAID-1 is
> very useful for small (2-disk) arrays.

Note that while raid 10 requires 4 disks, you get the space of 2 disks.
This is the same ratio as for raid 1.



Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Bruce Momjian
Mindaugas Riauba wrote:
> 
> > > I missed your orig. post, but AFAIK multiprocessing kernels will handle HT
> > > CPUs as 2 CPUs each. Thus, our dual Xeon 2.4 is recognized as 4 Xeon 2.4
> > > CPUs.
> > >
> > > This way, I don't think HT would improve any single query (afaik no
> > > postgres process uses more than one cpu), but overall multi-query
> > > performance has to improve.
> >
> > When you use hyperthreading, each virtual cpu runs at 70% of a full CPU,
> > so hyperthreading could be slower than non-hyperthreading.  On a fully
> > loaded dual cpu system, you are looking at 2.8 cpu's (0.70 * 4), while
> > if it isn't loaded, you are looking at slowing down if you are only
> > using 1 or 2 cpu's.
> 
>   Virtual cpus are not running at 70% of real cpus :). Slowdown will happen
> if the scheduler runs 2 processes on the same real cpu. And I read that
> there are patches for the Linux kernel to fix that. Sooner rather than
> later they will appear in Linus' kernel.

Right, I simplified it.  The big deal is whether the OS favors the
second real CPU over one of the virtual CPU's on the same die --- by
default, it doesn't.  Even if it did work perfectly, you are talking
about going from 1 to 1.4 or 2 to 2.8, which doesn't seem like much.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Vivek Khera
> "AP" == Alexander Priem <[EMAIL PROTECTED]> writes:

AP> Hmmm. I keep changing my mind about this. My Db would be mostly
AP> 'selecting', but there would also be pretty much inserting and
AP> updating done. But most of the work would be selects. So would
AP> this config be OK?

I'm about to order a new server.  I haven't decided exactly how many
disks I will get, but my plan is to get an 8-disk RAID10 with 15k RPM
drives.  I don't need the volume, just the speed and number of
spindles, so I'm buying the smallest drives that meet my speed needs,
probably 18Gb each (sheesh! I remember getting my first 5Mb disk for
my 8088 PC in college and thinking that was too much space).

My mix is nearly even read/write, but probably a little biased towards
the reading.

This machine is replacing a 5-disk box that was switched from RAID5 to
4-disk RAID10 for data plus one system disk in January (what a pain
that was to re-index, but that's another story).  The switch from
RAID5 to RAID10 made an enormous improvement in performance.  The
speedup wasn't from recreating the database:  It was restored from a
file-level backup so the actual files were not compacted or secretly
"improved" in any way, other than my occasional reindexing.

So I think your 6-disk RAID10 will be good.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/



Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Ron Johnson
On Tue, 2003-07-22 at 10:01, Alexander Priem wrote:
> OK, another change of plans :)
> 
> ext2 seems to be a bad idea. So i'll stick with ext3. Better safe than
> sorry...

Don't forget noatime!

> About the RAID-config: Maybe RAID-10 with six disks is affordable after all.
> I would have to take the smallest disks in this case, 18Gb per disk. So six
> 18Gb disks (15000rpm) would result in a total capacity of 54 Gb, right? This
> volume would hold OS, WAL and data, but since RAID10 appears to deliver such
> great performance (according to several people), in combination with the
> 128Mb of battery backed cache, this would be a good solution?
> 
> Hmmm. I keep changing my mind about this. My Db would be mostly 'selecting',
> but there would also be pretty much inserting and updating done. But most of
> the work would be selects. So would this config be OK?

Others may disagree, but I'd put the OS and executables on a separate
disk from the db and WAL, and make it an IDE drive, since it's so 
much less expensive than SCSI disks.  (Make a copy of the disk, and
if it craps out, pop out the old disk, stick in the new disk, and
fire the box right back up...)

Thus, you'll have an OS/executables disk, and a separate DB disk,
and never the twain shall meet.  Theoretically, you could pick up
those 6 drives and controller, move them to another machine, and
the data should be just as it was on the other box.

-- 
+-+
| Ron Johnson, Jr.Home: [EMAIL PROTECTED] |
| Jefferson, LA  USA  |
| |
| "I'm not a vegetarian because I love animals, I'm a vegetarian  |
|  because I hate vegetables!"|
|unknown  |
+-+





Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Alexander Priem
OK, another change of plans :)

ext2 seems to be a bad idea. So I'll stick with ext3. Better safe than
sorry...

About the RAID-config: Maybe RAID-10 with six disks is affordable after all.
I would have to take the smallest disks in this case, 18Gb per disk. So six
18Gb disks (15000rpm) would result in a total capacity of 54 Gb, right? This
volume would hold OS, WAL and data, but since RAID10 appears to deliver such
great performance (according to several people), in combination with the
128Mb of battery backed cache, this would be a good solution?

Hmmm. I keep changing my mind about this. My Db would be mostly 'selecting',
but there would also be pretty much inserting and updating done. But most of
the work would be selects. So would this config be OK?

Kind regards,
Alexander.


- Original Message -
From: "Jim C. Nasby" <[EMAIL PROTECTED]>
To: "Alexander Priem" <[EMAIL PROTECTED]>
Cc: "Vincent van Leeuwen" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Tuesday, July 22, 2003 4:33 PM
Subject: Re: [PERFORM] Tuning PostgreSQL


> On Tue, Jul 22, 2003 at 03:27:20PM +0200, Alexander Priem wrote:
> > Wow, I never figured how many different RAID configurations one could
> > think of   :)
> >
> > After reading lots of material, forums and of course, this mailing-list,
> > I think I am going for a RAID5 configuration of 6 disks (18Gb, 15.000 rpm
> > each), one of those six disks will be a 'hot spare'. I will just put the
> > OS, the WAL and the data on one volume. RAID10 is way too expensive   :)
> >
> > If I understand correctly, this will give great read-performance, but
> > less write-performance. But since this server will be equipped with an
> > embedded RAID controller featuring 128Mb of battery-backed cache, I
> > figure that this controller will negate that (at least somewhat). I will
> > need to find out whether this cache can be configured so that it will
> > ONLY cache WRITES, not READS
>
> I think the bigger issue with RAID5 write performance in a database is
> that it hits every spindle. The real performance bottleneck you run into
> is latency, especially the latency of positioning the heads. I don't
> have any proof to this theory, but I believe this is why moving WAL
> and/or temp_db to separate drives from the main database files can be a
> big benefit for some applications; not because of disk bandwidth but
> because it drastically cuts down the amount of time the heads have to
> spend flying around the disk.
>
> Of course, this is also highly dependent on how the filesystem operates,
> too. If it puts your WALs, temp_db, and database files very close to
> each other on the drive, splitting them out to separate spindles won't
> help as much.
> --
> Jim C. Nasby, Database Consultant  [EMAIL PROTECTED]
> Member: Triangle Fraternity, Sports Car Club of America
> Give your computer some brain candy! www.distributed.net Team #1828
>
> Windows: "Where do you want to go today?"
> Linux: "Where do you want to go tomorrow?"
> FreeBSD: "Are you guys coming, or what?"
>




Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Jim C. Nasby
On Tue, Jul 22, 2003 at 03:27:20PM +0200, Alexander Priem wrote:
> Wow, I never figured how many different RAID configurations one could think
> of   :)
> 
> After reading lots of material, forums and of course, this mailing-list, I
> think I am going for a RAID5 configuration of 6 disks (18Gb, 15.000 rpm
> each), one of those six disks will be a 'hot spare'. I will just put the OS,
> the WAL and the data on one volume. RAID10 is way too expensive   :)
> 
> If I understand correctly, this will give great read-performance, but less
> write-performance. But since this server will be equipped with an embedded
> RAID controller featuring 128Mb of battery-backed cache, I figure that this
> controller will negate that (at least somewhat). I will need to find out
> whether this cache can be configured so that it will ONLY cache WRITES, not
> READS
 
I think the bigger issue with RAID5 write performance in a database is
that it hits every spindle. The real performance bottleneck you run into
is latency, especially the latency of positioning the heads. I don't
have any proof to this theory, but I believe this is why moving WAL
and/or temp_db to separate drives from the main database files can be a
big benefit for some applications; not because of disk bandwidth but
because it drastically cuts down the amount of time the heads have to
spend flying around the disk.

Of course, this is also highly dependent on how the filesystem operates,
too. If it puts your WALs, temp_db, and database files very close to
each other on the drive, splitting them out to separate spindles won't
help as much.
-- 
Jim C. Nasby, Database Consultant  [EMAIL PROTECTED]
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Ron Johnson
On Tue, 2003-07-22 at 07:53, Alexander Priem wrote:
> Wow, I never figured how many different RAID configurations one could think
[snip]
> Also because of this battery backed cache controller, I will go for the ext2
> file system, mounted with 'noatime'. I will use a UPS, so I don't think I
> need the journaling of ext3.

Oo, I don't think I'd do that!  It's akin to saying, "I
don't need to make backups, because I have RAID[1,5,10,1+0]."

If the power is out for 26 minutes and your UPS only lasts for 25
minutes, you could be in for a long, painful boot process if
the box crashes.  (For example, the UPS auto-shutdown daemon doesn't
work properly, and no one can get to the console to shut it down
properly before the batteries die.)


-- 
+-+
| Ron Johnson, Jr.Home: [EMAIL PROTECTED] |
| Jefferson, LA  USA  |
| |
| "I'm not a vegetarian because I love animals, I'm a vegetarian  |
|  because I hate vegetables!"|
|unknown  |
+-+





Re: [PERFORM] Tunning FreeeBSD and PostgreSQL

2003-07-22 Thread Vivek Khera
> "BM" == Bruce Momjian <[EMAIL PROTECTED]> writes:

>> not anything pre-allocated (from my understanding).  These settings
>> allow for up to 100,000 shared buffers (I currently only use 30,000
>> buffers)

BM> I think the only downside to making them too big is that you allocate
BM> page tables and prevent that address range from being used by other

Does this apply in general or just on FreeBSD?

BM> processes.  Of course, if you have much less than 4 gigs of RAM in the
BM> machine, it probably isn't an issue.

Probably, but wasting page table entries is never a good idea...

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD   +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/



Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Andrew Sullivan
On Tue, Jul 22, 2003 at 03:27:20PM +0200, Alexander Priem wrote:

> file system, mounted with 'noatime'. I will use a UPS, so I don't think I
> need the journaling of ext3. XFS is not natively supported by RedHat and I

Just in case you're still thinking, why do you suppose that only
power failures lead to system crashes?  Surprise kernel panics due to
bad hardware or OS upgrades with bugs in them, sudden failures
because of bad memory, &c: all these things also can lead to crashes,
and though super-redundant hardware can mitigate that risk, it
can't eliminate it completely.  This is not advice, of course, but
for my money, it's a bad idea not to use a journalled filesystem (or
something similar) for production systems.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110




Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Alexander Priem
Wow, I never figured how many different RAID configurations one could think
of   :)

After reading lots of material, forums and of course, this mailing-list, I
think I am going for a RAID5 configuration of 6 disks (18Gb, 15.000 rpm
each), one of those six disks will be a 'hot spare'. I will just put the OS,
the WAL and the data on one volume. RAID10 is way too expensive   :)

If I understand correctly, this will give great read-performance, but less
write-performance. But since this server will be equipped with an embedded
RAID controller featuring 128Mb of battery-backed cache, I figure that this
controller will negate that (at least somewhat). I will need to find out
whether this cache can be configured so that it will ONLY cache WRITES, not
READS

Also because of this battery backed cache controller, I will go for the ext2
file system, mounted with 'noatime'. I will use a UPS, so I don't think I
need the journaling of ext3. XFS is not natively supported by RedHat and I
will go for the easy way here   :)

1 Gb of RAM should be enough, I think. That is about the only point that
almost everyone agrees on   :)   Do you think ECC is very important? The
server I have in mind does not support it. Another one does, but it is about
1.000 euros more expensive   :(

One CPU should also be enough.

As for postgresql.conf settings, I think I will start with the following :

max_connections = 128
superuser_reserved_connections = 1
shared_buffers = 8192
max_fsm_relations = 1000
max_fsm_pages = 10
wal_buffers = 32
sort_mem = 2048
vacuum_mem = 32768
effective_cache_size = 28672 (this one I'm not sure about, maybe this one
needs to be higher)
random_page_cost = 2
geqo_threshold = 20

This pretty much sums it up. What do you think about this config? It may not
be the fastest, but a server like this will cost about 4750 euros, and that
is including an Intel Xeon 2.4GHz cpu, redundant power supply, WITHOUT the
UPS. Seems very reasonable to me...

Kind regards,
Alexander Priem.



- Original Message -
From: "Vincent van Leeuwen" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, July 22, 2003 11:40 AM
Subject: Re: [PERFORM] Tuning PostgreSQL


> On 2003-07-22 09:04:42 +0200, Alexander Priem wrote:
> > Hi all,
> >
> > Vincent, You said that using RAID1, you don't have real redundancy. But
> > RAID1 is mirroring, right? So if one of the two disks should fail, there
> > should be no data lost, right?
> >
>
> Right. But the proposal was a single disk for WAL, without redundancy, and
I
> argued that wasn't really safe. RAID1 by itself is extremely safe,
possibly
> even the safest RAID type there is.
>
> > I have been thinking some more. 18Gb drives are cheaper than 36 or 72Gb
> > drives. I don't know if I can get the money for this, but how would the
> > following setup sound?
> >
> > Two 18Gb (15.000rpm) disks in RAID1 array for Operating System + WAL.
> > Four 18Gb (15.000rpm) disks in RAID5 array for data.
> >
>
> Our own testing has shown that a 6 disk RAID-10 array is faster than what
you
> describe. Of course, this is very much dependant on how much
INSERT/UPDATES
> you generate (which taxes your WAL more), so your mileage may vary.
>
> > For the same amount of money, I could also get:
> >
> > Two 36Gb (10.000rpm) disks in RAID1 array for Operating System + WAL.
> > Five/Six 36Gb (10.000rpm) disks in RAID5 array for data.
> >
>
> It is said that a higher RPM is particularly useful for a WAL disk. So you
> might consider using two 18GB 15K rpm drives for a RAID-1 WAL disk (+OS
and
> swap), and using 36GB 10K rpm disks in a RAID-5 array if you need that
> diskspace.
>
> > Which would be the best of the above? The one with four 15k-rpm disks or
the
> > one with five/six 10k-rpm disks?
> > Would these configs be better than all disks in one huge RAID5 array?
There
> > are so many possible configs with RAID...
> >
>
> 15K rpm disks are significantly faster than 10K rpm disks. If your only
> concern is performance, buy 15K rpm disks. If you want more diskspace for
your
> money, fall back to larger 10K rpm disks.
>
> I personally think seperate WAL disks are vastly overrated, since they
haven't
> shown a big performance gain in our own tests. But as I have said, this is
> extremely dependant on the type of load you generate, so only your own
tests
> can tell you what you should do in this respect.
>
> About RAID types: the fastest RAID type by far is RAID-10. However, this
will
> cost you a lot of useable diskspace, so it isn't for everyone. You need at
> least 4 disks for a RAID-10 array. RAID-5 is a nice compromise if you want
as
> much useable diskspace as possible and still want to be redundant. RAID-1
is
> very useful for small (2-disk) arrays.
>
> If you have the time and are settled on buying 6 disks, I'd test the
following
> scenarios:
> - 6-disk RAID-10 array (should perform best)
> - 4-disk RAID-10 array containing data, 2-disk RAID-1 array for WAL, OS,
etc
> - 4-disk RAID-5 array containing d

Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Ron Johnson
On Mon, 2003-07-21 at 04:33, Shridhar Daithankar wrote:
> Hi Alexander ,
> 
> On 21 Jul 2003 at 11:23, Alexander Priem wrote:
[snip]
> > I use ext3 filesystem, which probably is not the best performer, is it?
> 
> No. You also need to check ext2, reiser and XFS. There is no agreement between 
> users as in what works best. You need to benchmark and decide.

According to Jeremy Allison of SAMBA: "They used ext3, which is one
of the slowest filesystems on Linux. In a real comparative test, you
would use XFS."
http://www.linuxworld.com/story/32673.htm

-- 
Ron Johnson, Jr.     Home: [EMAIL PROTECTED]
Jefferson, LA  USA

"I'm not a vegetarian because I love animals, I'm a vegetarian
 because I hate vegetables!"
                                 -- unknown



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Vincent van Leeuwen
On 2003-07-22 09:04:42 +0200, Alexander Priem wrote:
> Hi all,
> 
> Vincent, You said that using RAID1, you don't have real redundancy. But
> RAID1 is mirroring, right? So if one of the two disks should fail, there
> should be no data lost, right?
> 

Right. But the proposal was a single disk for WAL, without redundancy, and I
argued that wasn't really safe. RAID1 by itself is extremely safe, possibly
even the safest RAID type there is.

> I have been thinking some more. 18Gb drives are cheaper than 36 or 72Gb
> drives. I don't know if I can get the money for this, but how would the
> following setup sound?
> 
> Two 18Gb (15.000rpm) disks in RAID1 array for Operating System + WAL.
> Four 18Gb (15.000rpm) disks in RAID5 array for data.
> 

Our own testing has shown that a 6 disk RAID-10 array is faster than what you
describe. Of course, this is very much dependent on how many INSERTs/UPDATEs
you generate (which tax your WAL more), so your mileage may vary.

> For the same amount of money, I could also get:
> 
> Two 36Gb (10.000rpm) disks in RAID1 array for Operating System + WAL.
> Five/Six 36Gb (10.000rpm) disks in RAID5 array for data.
> 

It is said that a higher RPM is particularly useful for a WAL disk. So you
might consider using two 18GB 15K rpm drives for a RAID-1 WAL disk (+OS and
swap), and using 36GB 10K rpm disks in a RAID-5 array if you need that
diskspace.
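
(If the WAL does end up on its own array, the usual trick in this era is a
symlink. A sketch with placeholder paths, assuming PGDATA is set and the
postmaster is stopped first:)

    pg_ctl stop
    mv $PGDATA/pg_xlog /mnt/wal/pg_xlog       # move WAL to the fast array
    ln -s /mnt/wal/pg_xlog $PGDATA/pg_xlog    # point the cluster back at it
    pg_ctl start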

> Which would be the best of the above? The one with four 15k-rpm disks or the
> one with five/six 10k-rpm disks?
> Would these configs be better than all disks in one huge RAID5 array? There
> are so many possible configs with RAID...
> 

15K rpm disks are significantly faster than 10K rpm disks. If your only
concern is performance, buy 15K rpm disks. If you want more diskspace for your
money, fall back to larger 10K rpm disks.

I personally think separate WAL disks are vastly overrated, since they haven't
shown a big performance gain in our own tests. But as I have said, this is
extremely dependent on the type of load you generate, so only your own tests
can tell you what you should do in this respect.

About RAID types: the fastest RAID type by far is RAID-10. However, this will
cost you a lot of usable diskspace, so it isn't for everyone. You need at
least 4 disks for a RAID-10 array. RAID-5 is a nice compromise if you want as
much usable diskspace as possible and still want to be redundant. RAID-1 is
very useful for small (2-disk) arrays.

If you have the time and are settled on buying 6 disks, I'd test the following
scenarios:
- 6-disk RAID-10 array (should perform best)
- 4-disk RAID-10 array containing data, 2-disk RAID-1 array for WAL, OS, etc
- 4-disk RAID-5 array containing data, 2-disk RAID-1 array for WAL, OS, etc
- 6-disk RAID-5 array (will probably perform worst)
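
(A rough way to compare those scenarios once PostgreSQL is running on each
array; pgbench ships in contrib, and the scale and client counts here are
arbitrary examples, not recommendations:)

    createdb bench
    pgbench -i -s 10 bench         # initialize a ~160 MB test database
    pgbench -c 10 -t 1000 bench    # 10 clients, 1000 transactions each

    # raw sequential write throughput of an array (placeholder path):
    time dd if=/dev/zero of=/mnt/array/testfile bs=8k count=131072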


Hope this helps.

Vincent van Leeuwen
Media Design - http://www.mediadesign.nl/

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Dual Xeon + HW RAID question

2003-07-22 Thread Mindaugas Riauba

> > I missed your orig. post, but AFAIK multiprocessing kernels will handle
> > HT CPUs as 2 CPUs each. Thus, our dual Xeon 2.4 is recognized as 4 Xeon
> > 2.4 CPUs.
> >
> > This way, I don't think HT would improve any single query (afaik no
> > postgres process uses more than one cpu), but overall multi-query
> > performance has to improve.
>
> When you use hyperthreading, each virtual cpu runs at 70% of a full CPU,
> so hyperthreading could be slower than non-hyperthreading.  On a fully
> loaded dual cpu system, you are looking at 2.8 cpu's (0.70 * 4), while
> if it isn't loaded, you are looking at slowing down if you are only
> using 1 or 2 cpu's.

  Virtual CPUs are not running at 70% of real CPUs :). A slowdown will happen
if the scheduler runs two processes on the same real CPU. And I read that
there are patches for the Linux kernel to fix that; sooner rather than later
they will appear in Linus' kernel.
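
(A quick way to see how many logical CPUs the kernel enumerates; each
physical HT-enabled Xeon should show up twice:)

    grep -c ^processor /proc/cpuinfo    # a dual Xeon with HT reports 4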

  Mindaugas


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Alexander Priem
Hi all,

Vincent, you said that using RAID1, you don't have real redundancy. But
RAID1 is mirroring, right? So if one of the two disks should fail, there
should be no data lost, right?

I have been thinking some more. 18Gb drives are cheaper than 36 or 72Gb
drives. I don't know if I can get the money for this, but how would the
following setup sound?

Two 18Gb (15.000rpm) disks in RAID1 array for Operating System + WAL.
Four 18Gb (15.000rpm) disks in RAID5 array for data.

For the same amount of money, I could also get:

Two 36Gb (10.000rpm) disks in RAID1 array for Operating System + WAL.
Five/Six 36Gb (10.000rpm) disks in RAID5 array for data.

Which would be the best of the above? The one with four 15k-rpm disks or the
one with five/six 10k-rpm disks?
Would these configs be better than all disks in one huge RAID5 array? There
are so many possible configs with RAID...

Kind regards,
Alexander Priem.



- Original Message -
From: "Vincent van Leeuwen" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 21, 2003 6:28 PM
Subject: Re: [PERFORM] Tuning PostgreSQL


> On 2003-07-21 09:06:10 -0700, Josh Berkus wrote:
> > Alexander,
> >
> > > Hmmm. Seems to me that this setup would be better than one RAID5 with
> > > three 36Gb disks, wouldn't you think so? With one RAID5 array, I would
> > > still have the data and the WAL on one volume...
> >
> > Definitely.  As I've said, my experience with RAID5 is that with less
> > than 5 disks, it performs around 40% of a single scsi disk for large
> > read-write operation on Postgres.
> >
> > If you have only 3 disks, I'd advocate one disk for WAL and one RAID 1
> > array for the database.
> >
>
> In this setup your database is still screwed if a single disk (the WAL
> disk) stops working. You'll have to revert to your last backup if this
> happens. The RAID-1 redundancy on your data disks buys you almost nothing:
> marginally better performance and no real redundancy should a single disk
> fail.
>
> I'd use RAID-5 if you absolutely cannot use more disks, but I would use
> RAID-10 or two RAID-1 partitions if you can afford to use 4 disks.
>
> Vincent van Leeuwen
> Media Design - http://www.mediadesign.nl/
>
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org