Re: [PERFORM] big database performance

2008-01-10 Thread Jared Mauch
On Thu, Jan 10, 2008 at 12:08:39PM +0100, Stephane Bailliez wrote:
> Jared Mauch wrote:
>>  I do large databases in Pg, like 300GB/day of new data.
>
> That's impressive.  Would it be possible to have details on your hardware, 
> schema and configuration and type of usage ?
>
> I'm sure there's something to learn in there for a lot of people (or at 
> least for me)

http://archives.postgresql.org/pgsql-performance/2007-12/msg00372.php

http://archives.postgresql.org/pgsql-performance/2006-05/msg00444.php

The hardware specs are kinda boring since it's not
I/O bound, so you could get the same disk performance out of
some EIDE 7200 rpm disks (which I have done for testing).

The current setup is a 4xOpteron 8218 (dual core) w/ 16G ram.
I have roughly 12TB usable disk space on the sysem connected via some
SATA <-> FC thing our systems folks got us.  Problem I have is the linear
cpu speed isn't enough and there would be challenges splitting the
workload across multiple cpus.  All my major reporting is done via
pg_dump and I'm pondering what would happen if I just removed Pg
from the equation for the major reporting tasks entirely.  I may see
much better performance without the database [in my way].  I've not
done that as some types of data access would need to be significantly
redone and I don't want to spend the time on that...

- Jared

-- 
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;  | http://puck.nether.net/~jared/  My statements are only mine.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] big database performance

2008-01-10 Thread Jared Mauch
On Thu, Jan 10, 2008 at 10:57:46AM +0200, Adrian Moisey wrote:
> What sort of information do you need from me ?

Ratio of read vs write operations (select vs insert/copy).

average number of indicies per table

average table size.  (analyze verbose  if you want to get
into more details).

What is the process doing (eg: in top, is it just on the CPU or
is it blocking for I/O?).

I/O information, from iostat -d (You may need to build an iostat
binary for Linux, the source is out there, i can give you a pointer if
you need it).

>>  Is your problem with performance database reads? writes? (insert/copy?)
>> How many indicies do you have?
>
> I think the problem is related to load.  Everything is slow because there 
> are way too many connections.  So everything is making everything else 
> slow.  Not much detail, is it?
>
> We have 345 indicies on the db.

If the tables are heavily indexed this could easily slow down
insert performance.  Taking a large dataset and adding a second
index, postgres doesn't use threads to create the two indicies on
different cpus/cores in parallel.  This could represent some of your
performance difference.  If you're doing a lot of write operations
and fewer read, perhaps the cost of an index isn't worth it in the
cpu time spent creating it vs the amount of time for a seq scan.

- Jared

-- 
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;  | http://puck.nether.net/~jared/  My statements are only mine.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] big database performance

2008-01-10 Thread Stephane Bailliez

Jared Mauch wrote:

I do large databases in Pg, like 300GB/day of new data.
That's impressive.  Would it be possible to have details on your 
hardware, schema and configuration and type of usage ?


I'm sure there's something to learn in there for a lot of people (or at 
least for me)


Cheers,

-- stephane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] big database performance

2008-01-10 Thread Adrian Moisey

Hi


I do large databases in Pg, like 300GB/day of new data.  Need a lot
more data on what you're having issues with.


That is big!

What sort of information do you need from me ?

	Is your problem with performance database reads? 
writes? (insert/copy?)  How many indicies do you have?


I think the problem is related to load.  Everything is slow because 
there are way too many connections.  So everything is making everything 
else slow.  Not much detail, is it?


We have 345 indicies on the db.

--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] big database performance

2008-01-10 Thread Adrian Moisey

Hi


Also, we're running the db on ext3 with noatime.   Should I look at
changing or getting rid of journaling ?


No (unless you like really long fsck times). data=writeback is safe with 
PostgreSQL, though.


I tested that on a dev box, and I didn't notice a difference when using 
pgbench


--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] big database performance

2008-01-09 Thread Simon Riggs
On Wed, 2008-01-09 at 10:18 +0200, Adrian Moisey wrote:

> We recently converted to postgres (from mssql) and we're having 
> performance issues.

I think you need to say more about what the performance issues actually
are, otherwise everybody will just speculate you to death.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] big database performance

2008-01-09 Thread Greg Smith

On Wed, 9 Jan 2008, Guillaume Smet wrote:


On Jan 9, 2008 9:27 AM, Joshua D. Drake <[EMAIL PROTECTED]> wrote:

wal_sync_method = open_sync


Do you recommend it in every situation or just because data are on a
SAN? Do you have any numbers/real cases explaining this choice.


Sync writes are faster on Linux in every case I've ever tried, compared to 
the default config that does a write followed by a sync.  With regular 
discs they're just a little faster.  On some SAN configurations, they're 
enormously faster, because the SANs are often optimized to handle 
syncronous writes far more efficiently than write/sync ones.  This is 
mainly because Oracle does its writes that way, so if you want good Oracle 
performance you have to handle sync writes well.


I have something on this topic I keep meaning to publish, but I got 
spooked about the potential to have silent problems or crashes when using 
open_sync due to a Linux kernel issue reported here:


http://archives.postgresql.org/pgsql-hackers/2007-10/msg01310.php

Certainly with that report floating out there I'd only recommend open_sync 
to people who are putting plenty of time into testing their database is 
robust under load with that configuration before deploying it; I sure 
wouldn't just make that changes on a production system just to see if it's 
faster.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] big database performance

2008-01-09 Thread Jared Mauch
On Wed, Jan 09, 2008 at 12:27:33AM -0800, Joshua D. Drake wrote:
> Adrian Moisey wrote:
>> Hi
>>
>> We recently converted to postgres (from mssql) and we're having 
>> performance issues.  Not all the issues are related to postgres, but we're 
>> trying to sort everything out.

Hi,

I do large databases in Pg, like 300GB/day of new data.  Need a lot
more data on what you're having issues with.

Is your problem with performance database reads? 
writes? (insert/copy?)  How many indicies do you have?

- jared

-- 
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;  | http://puck.nether.net/~jared/  My statements are only mine.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] big database performance

2008-01-09 Thread Guillaume Smet
Hi Joshua,

On Jan 9, 2008 9:27 AM, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> wal_sync_method = open_sync

Do you recommend it in every situation or just because data are on a
SAN? Do you have any numbers/real cases explaining this choice.

Thanks.

--
Guillaume

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] big database performance

2008-01-09 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wed, 9 Jan 2008 08:16:48 -0800
Alan Hodgson <[EMAIL PROTECTED]> wrote:

> On Wednesday 09 January 2008, Adrian Moisey
> <[EMAIL PROTECTED]> wrote:
> >
> > Also, we're running the db on ext3 with noatime.   Should I look at
> > changing or getting rid of journaling ?
> 
> No (unless you like really long fsck times). data=writeback is safe
> with PostgreSQL, though.
> 

Except :)... for pg_xlog. If you have pg_xlog on a different partition,
feel free to run ext2 for it.

Joshua D. Drake

> 
> ---(end of
> broadcast)--- TIP 7: You can help support the
> PostgreSQL project by donating at
> 
> http://www.postgresql.org/about/donate
> 


- -- 
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ 
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHhPtyATb/zqfZUUQRAk32AKCTvJPBCvHtb4JWMu7+xwxQZdA/ZQCgn3K2
pCmcUXAiAibLkTgEwGVXPyQ=
=H2bK
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] big database performance

2008-01-09 Thread Alan Hodgson
On Wednesday 09 January 2008, Adrian Moisey <[EMAIL PROTECTED]> 
wrote:
>
> Also, we're running the db on ext3 with noatime.   Should I look at
> changing or getting rid of journaling ?

No (unless you like really long fsck times). data=writeback is safe with 
PostgreSQL, though.


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] big database performance

2008-01-09 Thread Adrian Moisey

Hi

We recently converted to postgres (from mssql) and we're having 
performance issues.  Not all the issues are related to postgres, but 
we're trying to sort everything out.


The server is running ubuntu Gutsy with the database stored on a IBM 
SAN.  It is a Dell box with dual quad core 2.00GHz Xeons and 8GB RAM.



The database is about 71GB in size.

I've looked at the postgres config files and we've tweaked as much as 
our knowledge allows.


Can someone shed some light on the settings I should use ?


Umpf that isn't quite enough info :) but assuming you are running 8.2.x:


Sorry :/  Yes, we are running 8.2.x

Start with 1GB shared_buffers (you may be able to go hire), 4MB 
work_mem, wal_sync_method = open_sync, checkpoint_segments = 30, 
default_statistics_target = 150, effective_cache_size = 6GB .


Our shared_buffers is 1GB.
work_mem is 32MB
I changed wal_sync_method to open_sync (which helped a ton!)

Can someone please explain effective_cache_size.  what cache does it 
want to know about?  Linux cache?


Also, we're running the db on ext3 with noatime.   Should I look at 
changing or getting rid of journaling ?




64bit Gutsy?


Yes


How is the SAN connected?


fibre


What does mpstat 5 (3 iterations) say?
Even better what does sar -A say over a 24 hour period?


I'll get these for you

--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] big database performance

2008-01-09 Thread Frank Habermann

Hi,

what segment size do you use for the san partition? This could also be a 
bottle neck for db servers.


Frank

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] big database performance

2008-01-09 Thread Joshua D. Drake

Adrian Moisey wrote:

Hi

We recently converted to postgres (from mssql) and we're having 
performance issues.  Not all the issues are related to postgres, but 
we're trying to sort everything out.


The server is running ubuntu Gutsy with the database stored on a IBM 
SAN.  It is a Dell box with dual quad core 2.00GHz Xeons and 8GB RAM.



The database is about 71GB in size.

I've looked at the postgres config files and we've tweaked as much as 
our knowledge allows.


Can someone shed some light on the settings I should use ?


Umpf that isn't quite enough info :) but assuming you are running 8.2.x:

Start with 1GB shared_buffers (you may be able to go hire), 4MB 
work_mem, wal_sync_method = open_sync, checkpoint_segments = 30, 
default_statistics_target = 150, effective_cache_size = 6GB .



Restart, VACUUM ANALYZE VERBOSE, post back last 4 lines of output.

Other good items to know:

64bit Gutsy?
How is the SAN connected?
What does mpstat 5 (3 iterations) say?
Even better what does sar -A say over a 24 hour period?

Sincerely,

Joshua D. Drake







Thanks in advance



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[PERFORM] big database performance

2008-01-09 Thread Adrian Moisey

Hi

We recently converted to postgres (from mssql) and we're having 
performance issues.  Not all the issues are related to postgres, but 
we're trying to sort everything out.


The server is running ubuntu Gutsy with the database stored on a IBM 
SAN.  It is a Dell box with dual quad core 2.00GHz Xeons and 8GB RAM.



The database is about 71GB in size.

I've looked at the postgres config files and we've tweaked as much as 
our knowledge allows.


Can someone shed some light on the settings I should use ?


Thanks in advance
--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match