[PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Ireneusz Pluta

Hello all,

I sent a similar post to a FreeBSD group, but thought I might try here too.

I am completing a box for a PostgreSQL server on FreeBSD. When selecting a RAID controller I decided to go
with the 3ware SE9650-16, following the good opinions about 3ware controllers found in the FreeBSD and
PostgreSQL groups.


However, my dealer suggests that I not go with 3ware and take a Promise SuperTrak EX16350 instead. This
suggestion has no technical basis; it comes mainly from the limited availability of 16-port 3ware
controllers on the local market and the immediate availability of the Promise.


Is it technically a good idea to take the Promise instead of the 3ware, or should I
insist on the 3ware and wait for it?


Thank you

Ireneusz Pluta




Re: [PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Merlin Moncure

On 3/20/07, Ireneusz Pluta [EMAIL PROTECTED] wrote:

> Is it technically a good idea to take the Promise instead of the 3ware, or
> should I insist on the 3ware and wait for it?



Promise RAID controllers are famous for being software-based, with all
the real work being done in the driver.  Without doing the research,
this may or may not be the case with this particular controller.
Another issue with cheap RAID controllers is that performance may not
be as good as software RAID... in fact it may be worse.  Look for
benchmarks on the web and be skeptical.

merlin



Re: [PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Magnus Hagander
On Tue, Mar 20, 2007 at 10:18:45AM -0400, Merlin Moncure wrote:
> On 3/20/07, Ireneusz Pluta [EMAIL PROTECTED] wrote:
> > Is it technically a good idea to take the Promise instead of the 3ware,
> > or should I insist on the 3ware and wait for it?
>
> Promise RAID controllers are famous for being software-based, with all
> the real work being done in the driver. [...] Look for
> benchmarks on the web and be skeptical.

A Promise RAID is the only hardware RAID I've ever had eat an entire
array for me... Granted, this was one of those external arrays with SCSI
to the host, but it has certainly turned me away from Promise. Probably
not related to the controller in question, just their general quality
level.

//Magnus




[PERFORM] how small to split a table?

2007-03-20 Thread Vivek Khera
I've got one logging table with over 330 million rows, storing 6
months' worth of data.  It consists of two integers and a 4-character
string.  The primary key is the two integers, and there is
an additional index on the second integer.


I'm planning to use inheritance to split the table into a bunch of  
smaller ones by using a modulo function on one of the integers on  
which we scan often.


My question is how small to make each inherited piece?  If I do  
modulo 10, then each sub-table will be between 32 and 34 million rows  
today based on current distribution.


If I expect traffic to double over the next year (thus
doubling my logs), what would you recommend?
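
For concreteness, here is a rough sketch of the layout I have in mind (the column names below are placeholders, and note that indexes and primary keys are not inherited, so each child needs its own):

CREATE TABLE log_master (
    key1  integer NOT NULL,
    key2  integer NOT NULL,
    code  char(4) NOT NULL,
    PRIMARY KEY (key1, key2)
);

-- one child per modulo bucket; the CHECK constraint is what lets
-- constraint exclusion skip irrelevant children later
CREATE TABLE log_part_0 (
    CHECK (key2 % 10 = 0)
) INHERITS (log_master);
ALTER TABLE log_part_0 ADD PRIMARY KEY (key1, key2);
CREATE INDEX log_part_0_key2_idx ON log_part_0 (key2);
-- ... and likewise for buckets 1 through 9 ...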







Re: [PERFORM] how small to split a table?

2007-03-20 Thread Heiko W.Rupp


> My question is how small to make each inherited piece?  If I do
> modulo 10, then each sub-table will be between 32 and 34 million
> rows today based on current distribution.


You might try this with various sizes.
I did some testing lately and found that insert performance, even when
inserting into only one partition through the master table, roughly
halved with 4 partitions, and insert time went up about 50% with 2
partitions.

Please note: this is not representative in any way!

So while it might be cool in your case to have e.g. one partition per
month, this might slow inserts down too much, so that a different number
of partitions could be better. The same applies to queries as well (here
perhaps in the other direction).
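
If queries do go through the master table, make sure constraint exclusion is turned on so the planner can skip partitions whose CHECK constraints rule them out. A quick check (assuming modulo CHECK constraints like those sketched earlier in the thread):

SET constraint_exclusion = on;    -- off by default in 8.1/8.2

EXPLAIN SELECT count(*)
FROM log_master
WHERE key2 % 10 = 3;   -- the predicate must match the CHECK expression exactly
-- the plan should scan only log_part_3 (plus the empty master table)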

--
Heiko W.Rupp
[EMAIL PROTECTED], http://www.dpunkt.de/buch/3-89864-429-4.html






Re: [PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Joshua D. Drake

> Is it technically a good idea to take the Promise instead of the 3ware,
> or should I insist on the 3ware and wait for it?

Use 3ware; they are proven to provide decent RAID controllers for
SATA/PATA. Promise, on the other hand... not so much.

Joshua D. Drake

 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] how small to split a table?

2007-03-20 Thread Vivek Khera


On Mar 20, 2007, at 11:20 AM, Heiko W.Rupp wrote:


> ... insert performance, even when inserting into only one partition
> through the master table, roughly halved with 4 partitions, and insert
> time went up about 50% with 2 partitions.
>
> Please note: this is not representative in any way!


I fully intend to build knowledge of the partitions into the insert
part of the logging.  Only the queries that join against the current
big table would use the master name.  Everything else can be trained
to go directly to the proper subtable.


Thanks for your note.  It implies to me that I'm making the right choice
to build that knowledge into the system.
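
Routing in the insert path then just means computing the bucket in the application and addressing the child table directly (same placeholder names as earlier in the thread):

-- for a row whose key2 is 123:  123 % 10 = 3, so insert straight into that child
INSERT INTO log_part_3 (key1, key2, code) VALUES (42, 123, 'ABCD');

-- queries that need the whole data set keep going through the master
SELECT count(*) FROM log_master WHERE key1 = 42;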







Re: [PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Dave Cramer


On 20-Mar-07, at 9:23 AM, Ireneusz Pluta wrote:


> Is it technically a good idea to take the Promise instead of the 3ware,
> or should I insist on the 3ware and wait for it?


The reality is that most dealers have no idea what is good for a
database application. It is likely that this card is better for him
somehow (more margin, easier to get, etc.).


I'd stick with 3ware, Areca, or LSI. And even then I'd check the card when
I got it, to make sure it lives up to its reputation.
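
A crude way to check write throughput from psql once the card shows up might look like the following (just a smoke test, not a real benchmark, and the table name is made up):

\timing
-- write a few hundred MB and force it out to disk (CHECKPOINT needs superuser);
-- compare the timings against a known-good controller or plain software RAID
CREATE TABLE raid_scratch AS
    SELECT g AS id, md5(g::text) AS payload
    FROM generate_series(1, 5000000) AS g;
CHECKPOINT;
DROP TABLE raid_scratch;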

Dave







Re: [PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Dave Cramer


On 20-Mar-07, at 1:53 PM, Benjamin Arai wrote:

> This is a little biased, but I would stay away from Areca only
> because they have fans on the card.  At some point down the line
> that card is going to die.  When it does, there is really no telling
> what it will do to your data.  I personally use 3ware cards; they
> work well, but I have had one die before (1/10).


Well, they are also the only one of the bunch that I am aware of that
will sell you 1G of cache. Plus, if you use battery backup, sooner or
later you have to replace the batteries. I use Areca all the time
and I've never had a fan die, but I admit it is a point of failure.


Dave




Re: [PERFORM] SATA RAID: Promise vs. 3ware

2007-03-20 Thread Ron

At 02:08 PM 3/20/2007, Dave Cramer wrote:


On 20-Mar-07, at 1:53 PM, Benjamin Arai wrote:

> This is a little biased, but I would stay away from Areca only
> because they have fans on the card.  At some point down the line
> that card is going to die.


Ummm, "what" fan?  The Intel IOP341 (AKA 81341) based ARC-12xx cards
are what people are most likely going to want to buy at this point,
and they are fanless:

http://www.areca.us/support/photo_gallery.htm

The lore is that:
- 3ware is best at random IO and Areca is best at streaming IO.
  OLTP = 3ware, OLAP = Areca.
- Stay away from Adaptec or Promise for any mission-critical role.
- LSI is a mixed bag.


> Well, they are also the only one of the bunch that I am aware of
> that will sell you 1G of cache.


Actually, it's up to 2GB of BB cache...  2GB DDR2 SDRAMs are cheap 
and easy to get now.  I've actually been agitating for Areca to 
support 4GB of RAM.



> Plus, if you use battery backup, sooner or later you have to replace
> the batteries. I use Areca all the time and I've never had a fan
> die, but I admit it is a point of failure.


I've had the whole card die (a massive cooling failure in the NOC led to
...), but never any component on the card.  OTOH, I'm conservative
about how much heat per unit area I'm willing to allow in or
near my DB servers.


Cheers,
Ron 





[PERFORM] Horrible trigger performance after upgrade 8.0.12 -> 8.2.3

2007-03-20 Thread Joseph S
After upgrading to 8.2.3, INSERTs and UPDATEs on one of my tables became
incredibly slow.  I traced the problem to one of my triggers that calls
one of my own functions (which is marked IMMUTABLE).  If I inline the
function body instead of calling it, the runtime for my test update drops
from 10261.234 ms to 564.094 ms, and the time spent running the trigger
itself drops from 9749.910 ms to 99.504 ms.


BTW, does it make any sense to bother marking trigger functions as STABLE
or IMMUTABLE?




Re: [PERFORM] Horrible trigger performance after upgrade 8.0.12 -> 8.2.3

2007-03-20 Thread Tom Lane
Joseph S jks@selectacast.net writes:
> After upgrading to 8.2.3, INSERTs and UPDATEs on one of my tables became
> incredibly slow.  I traced the problem to one of my triggers that calls
> one of my own functions (which is marked IMMUTABLE).  If I inline the
> function body instead of calling it, the runtime for my test update drops
> from 10261.234 ms to 564.094 ms, and the time spent running the trigger
> itself drops from 9749.910 ms to 99.504 ms.

With no more details than that, I don't see how you expect any useful
comments.  Let's see the code.  Also, what PG version are you comparing to?

> BTW, does it make any sense to bother marking trigger functions as STABLE
> or IMMUTABLE?

No, the trigger mechanisms don't pay any attention to that.  I can
hardly conceive of a useful trigger that wouldn't be VOLATILE anyway,
since side effects are more or less the point.
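
To illustrate with made-up names (not your schema): the volatility label matters for ordinary functions the planner might want to pre-evaluate, while the trigger function itself just stays VOLATILE, which is the default anyway:

-- a pure helper: safe to mark IMMUTABLE
CREATE FUNCTION normalize_code(text) RETURNS text AS $$
    SELECT upper(btrim($1));
$$ LANGUAGE sql IMMUTABLE;

-- the trigger function has side effects and stays VOLATILE (the default)
CREATE FUNCTION fixup_code() RETURNS trigger AS $$
BEGIN
    NEW.code := normalize_code(NEW.code);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_fixup
    BEFORE INSERT OR UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE fixup_code();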

regards, tom lane



[PERFORM] Determining server load from client

2007-03-20 Thread Dan Harris
I've found that it would be helpful to be able to tell how busy my 
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before 
pounding it with some OLAP-type queries.  Specifically, I have a 
multi-threaded client program that needs to run several thousand 
sequential queries.  I broke it into threads to take advantage of the 
multi-core architecture of the server hardware.  It would be very nice 
if I could check the load of the server at certain intervals to throttle 
the number of concurrent queries and mitigate load problems when other 
processes might already be inducing a significant load.


I have seen some other nice back-end things exposed through PG functions 
( e.g. database size on disk ) and wondered if there was anything 
applicable to this.  Even if it can't return the load average proper, is 
there anything else in the pg_* tables that might give me a clue how 
busy the server is for a period of time?


I've thought about allowing an ssh login without a passphrase to log in
and capture it.  But the client process is running as the apache user,
and giving the apache user a shell login to the DB box does not seem like
a smart idea for obvious security reasons...


So far, that's all I can come up with, other than a dedicated socket 
server daemon on the DB machine to do it.


Any creative ideas are welcomed :)

Thanks

-Dan



Re: [PERFORM] Determining server load from client

2007-03-20 Thread Craig A. James

Dan Harris wrote:
> I've found that it would be helpful to be able to tell how busy my
> dedicated PG server is ...
>
> I have seen some other nice back-end things exposed through PG functions
> ( e.g. database size on disk ) and wondered if there was anything
> applicable to this.


I'd write a simple PL/Perl function to do this.  You can use operating-system
calls to find out the system's load.  But note that you need untrusted Perl
(plperlu) to do this, so you can only do it on a system where you trust every
application that connects to your database.  Something like this:

create or replace function get_stats()
 returns text as '
 open(STAT, "/proc/stat");
 my @stats = <STAT>;
 close STAT;
 return join("", @stats);
' language plperlu;

See http://www.postgresql.org/docs/8.1/interactive/plperl-trusted.html
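
Along the same lines, /proc/loadavg gives the load averages directly; here is an untested variant, plus how you would call either one:

create or replace function get_loadavg()
 returns text as '
 # first three fields are the 1/5/15-minute load averages
 open(LOADAVG, "/proc/loadavg");
 my $line = <LOADAVG>;
 close LOADAVG;
 return $line;
' language plperlu;

select get_loadavg();
select get_stats();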

Craig



Re: [PERFORM] Determining server load from client

2007-03-20 Thread Joe Healy

(forgot to send to list)
Dan Harris wrote:
> It would be very nice if I could check the load of the server at certain
> intervals to throttle the number of concurrent queries and mitigate load
> problems when other processes might already be inducing a significant load.




I have installed munin (http://munin.projects.linpro.no/) on a few
systems. This lets you look at graphs of system resources, load, etc. I
have also added Python scripts that run sample queries to let me know if
performance or index size is changing dramatically. I have attached an
example script.




Hope that helps,



Joe




#! /usr/bin/python
# Munin plugin reporting PostgreSQL index sizes (in MB) for the larger indexes.
import psycopg
import sys

def fixName(name):
    # munin field names need to be short; truncate to 19 characters
    return name[:19]

if len(sys.argv) > 1 and sys.argv[1] == "config":
    # munin calls the plugin with "config" to ask for the graph definition
    print "graph_title Postgresql Index Sizes"
    print "graph_vlabel Mb"

    con = psycopg.connect("host=xxx user=xxx dbname=xxx password=xxx")
    cur = con.cursor()

    cur.execute("select relname, relpages from pg_class where relowner > 10 and relkind='i' and relpages > 256 order by reltuples desc;")

    results = cur.fetchall()
    for name, pages in results:
        print "%s.label %s" % (fixName(name), name)

else:
    # normal invocation: print the current value for each index
    con = psycopg.connect("host=xxx user=xxx dbname=xxx password=xxx")
    cur = con.cursor()

    cur.execute("select relname, relpages from pg_class where relowner > 10 and relkind='i' and relpages > 256 order by reltuples desc;")

    results = cur.fetchall()

    for name, pages in results:
        # relpages is in 8 kB blocks; convert to MB
        print "%s.value %.2f" % (name[:19], pages*8.0/1024.0)




Re: [PERFORM] Determining server load from client

2007-03-20 Thread Dan Harris

Dan Harris wrote:
> I've found that it would be helpful to be able to tell how busy my
> dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
> pounding it with some OLAP-type queries.

..snip

Thank you all for your great ideas!  I'm going to try the perl function 
as that seems like a very elegant way of doing it.


-Dan



Re: [PERFORM] Determining server load from client

2007-03-20 Thread Jim Buttafuoco
Dan

Use the following plperlu function

create or replace function LoadAVG()
returns record
as
$$
use Sys::Statistics::Linux::LoadAVG;
my $lxs = new Sys::Statistics::Linux::LoadAVG;
my $stats = $lxs->get;
return $stats;
$$
language plperlu;


select * from LoadAVG() as (avg_1 float, avg_5 float, avg_15 float);

The Sys::Statistics::Linux module has all kinds of info (from the /proc
file system).

Jim

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dan Harris
Sent: Tuesday, March 20, 2007 8:48 PM
To: PostgreSQL Performance
Subject: [PERFORM] Determining server load from client

I've found that it would be helpful to be able to tell how busy my
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
pounding it with some OLAP-type queries. [...]


