Re: [GENERAL] Storage location of temporary files

2008-11-05 Thread Scott Marlowe
On Wed, Nov 5, 2008 at 8:22 PM, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> "Scott Marlowe" <[EMAIL PROTECTED]> writes:
>
>> 2008/11/5 Christian Schröder <[EMAIL PROTECTED]>:
>>> Tomasz Ostrowski wrote:

 This is wrong. RAID5 is slower than RAID1.
 You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
 even faster but unreliable.

>>>
>>> I did not find a clear statement about this. I agree that RAID10 would be
>>> better than RAID5, but in some situations RAID5 at least seems to be faster
>>> than RAID1.
>>
>> For certain read heavy loads RAID-5 will beat RAID-1 handily.  After
>> all, from a read only perspective, a healthy RAID-5 with n disks is
>> equal to a healthy RAID-0 with n-1 disks.
>
> Uhm, and for a read-heavy load a RAID-1 or RAID 1+0 array with n disks is
> equal to a healthy RAID-0 with n disks.

Don't know what testing you've done, but very very few RAID-1 /
RAID-10 setups can equal a RAID-0 setup of the same number of disks.

> RAID-5 should never beat any combination of RAID-0 and RAID-1 with the same
> number of drives at read performance. It's advantage is that you get more
> capacity.

Of course it won't beat a RAID-0 with the same number, but a good
controller is just one disk behind a RAID-0 and RAID-1 controllers
usually don't aggregate RAID-1 reads, but do allow multiple readers to
hit different disks for better concurrent access.  But that's not what
I was talking about.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-11-05 Thread Gregory Stark

"Scott Marlowe" <[EMAIL PROTECTED]> writes:

> 2008/11/5 Christian Schröder <[EMAIL PROTECTED]>:
>> Tomasz Ostrowski wrote:
>>>
>>> This is wrong. RAID5 is slower than RAID1.
>>> You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
>>> even faster but unreliable.
>>>
>>
>> I did not find a clear statement about this. I agree that RAID10 would be
>> better than RAID5, but in some situations RAID5 at least seems to be faster
>> than RAID1.
>
> For certain read heavy loads RAID-5 will beat RAID-1 handily.  After
> all, from a read only perspective, a healthy RAID-5 with n disks is
> equal to a healthy RAID-0 with n-1 disks.  

Uhm, and for a read-heavy load a RAID-1 or RAID 1+0 array with n disks is
equal to a healthy RAID-0 with n disks.

RAID-5 should never beat any combination of RAID-0 and RAID-1 with the same
number of drives at read performance. It's advantage is that you get more
capacity.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-11-05 Thread Scott Marlowe
2008/11/5 Christian Schröder <[EMAIL PROTECTED]>:
> Tomasz Ostrowski wrote:
>>
>> This is wrong. RAID5 is slower than RAID1.
>> You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
>> even faster but unreliable.
>>
>
> I did not find a clear statement about this. I agree that RAID10 would be
> better than RAID5, but in some situations RAID5 at least seems to be faster
> than RAID1.

For certain read heavy loads RAID-5 will beat RAID-1 handily.  After
all, from a read only perspective, a healthy RAID-5 with n disks is
equal to a healthy RAID-0 with n-1 disks.  However, writes are much
more costly on RAID-5

> If I have 5 disks available, how should I use them to get best performance
> without the risk of severe data loss? If I use 4 of the disks to build a
> RAID10 then I will have only 1 remaining drive, e.g. to put the pgsql_tmp
> directories there. In this scenario I would not have the WAL on a separate
> disk.

How you use your disks depends very much on your RAID controller and
your workload.  On a fast controller with battery backed cache, I'd
normally create a 4 disk RAID-10 with one disk as a hot spare.  If you
don't have a battery backed caching controller, then you'd probably be
better off with two 2 disk RAID-1 arrays and the 5th drive as a hot
spare.

Note that a 4 disk RAID-10 or 2 2 disk RAID-1  with a hot spare is
much more reliable than a 5 disk RAID-5.

> Or should I use 3 disks to build a RAID5, 1 disk for tempspace and 1 disk
> for WAL? How important is data integrity for the WAL? If the WAL disk fails,
> can this corrupt my data? Or would I just lose the data after the last
> checkpoint?

If your data is important, then you will put it on redundant arrays of
independent disks (RAID).  WAL is as important as any other part of
the db.  It should be on a mirror set at a minimum.

> Or maybe I should use 2 disks as RAID1 for the database, 2 disks as RAID1
> for the WAL and the remaining disk for the tempspace?

That could work too.  But I like having a hot spare in case something dies.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-11-05 Thread Martijn van Oosterhout
On Wed, Nov 05, 2008 at 08:13:10AM +0100, Christian Schröder wrote:
> Tomasz Ostrowski wrote:
> >This is wrong. RAID5 is slower than RAID1.
> >You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
> >even faster but unreliable.
> >  
> I did not find a clear statement about this. I agree that RAID10 would 
> be better than RAID5, but in some situations RAID5 at least seems to be 
> faster than RAID1.

The basic problem is that RAID5 has a checksum disk. So if you update a
block, you need to update the checksum. No matter how you do it you
need to read one or more of the parallel blocks. Clever disk
controllers can reduce the cost, but not eliminate it. None of RAID 0, 1
or 10 have this problem.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [GENERAL] Storage location of temporary files

2008-11-05 Thread Tomasz Ostrowski
On 2008-11-05 08:13, Christian Schröder wrote:

> If I have 5 disks available, how should I use them to get best 
> performance without the risk of severe data loss?

What percentage of your usage are writes? What do you need the most:
high throughput or minimal latency?

> How important is data integrity for the WAL? If the WAL 
> disk fails, can this corrupt my data? Or would I just lose the data 
> after the last checkpoint?

It is important. With corrupted WAL IMHO your database will not get up.

> Or maybe I should use 2 disks as RAID1 for the database, 2 disks as 
> RAID1 for the WAL and the remaining disk for the tempspace?

I'd go for 3 disks as RAID1 for database, OS and tempspace, 2 disks as
RAID1 for WAL. The reasoning is that on a dedicated server OS will
rarely write or read. A commit will only sync WAL, and checkpoints can
be smoothed so they will not cripple reads. Reads by separate clients
will not wait for each other, as they can use 3 disks concurrently, so
read latency will be very good.

If you do not have a disk controller with battery backed cache remember
to disable hardware write cache on all disks
("hdparm -W 0 /dev/sd[a,b,c,d]" for ATA on Linux). And use smartd for
periodic checking health of drives (for example in /etc/smartd.conf for
ATA on Linux):
/dev/sda -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03)
/dev/sdb -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03)
/dev/sdc -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03)
/dev/sdd -H -C -U -l selftest -m root -s (S/../.././01|L/../../7/03)

Regards
Tometzky
-- 
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
  Winnie the Pooh

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-11-04 Thread Christian Schröder

Tomasz Ostrowski wrote:

This is wrong. RAID5 is slower than RAID1.
You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
even faster but unreliable.
  
I did not find a clear statement about this. I agree that RAID10 would 
be better than RAID5, but in some situations RAID5 at least seems to be 
faster than RAID1.


If I have 5 disks available, how should I use them to get best 
performance without the risk of severe data loss? If I use 4 of the 
disks to build a RAID10 then I will have only 1 remaining drive, e.g. to 
put the pgsql_tmp directories there. In this scenario I would not have 
the WAL on a separate disk.
Or should I use 3 disks to build a RAID5, 1 disk for tempspace and 1 
disk for WAL? How important is data integrity for the WAL? If the WAL 
disk fails, can this corrupt my data? Or would I just lose the data 
after the last checkpoint?
Or maybe I should use 2 disks as RAID1 for the database, 2 disks as 
RAID1 for the WAL and the remaining disk for the tempspace?


Regards,
  Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-11-04 Thread Tomasz Ostrowski
On 2008-10-31 09:01, Christian Schröder wrote:

> We will now move the database to a raid5 
> (which should be faster than the raid1)

This is wrong. RAID5 is slower than RAID1.
You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
even faster but unreliable.

Regards
Tometzky
-- 
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
  Winnie the Pooh

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-10-31 Thread Aaron
I too have used a symlink for some time (years) to put temp onto
dedicated disks without any problems.  I am not sure if 8.3 is
different but I symlink the directory: base/pgsql_tmp


Aaron Thul
http://www.chasingnuts.com



On Fri, Oct 31, 2008 at 8:11 AM, Sam Mason <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 31, 2008 at 09:01:29AM +0100, Christian Schrrrder wrote:
>> So I would like
>> to use a faster disk for these temporary files, too, but I could not
>> find where the temporary files are located. Is there a separate
>> directory? I have found a "pgsql_tmp" directory inside of the database
>> directories ("base//pgsql_tmp"). Is this what I'm looking for?
>
> Yes, I believe it's accepted practice to replace this directory with a
> symlink.  I've done this before and it's behaved as I'd expect.
>
>
>  Sam
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-10-31 Thread Sam Mason
On Fri, Oct 31, 2008 at 09:01:29AM +0100, Christian Schrrrder wrote:
> So I would like 
> to use a faster disk for these temporary files, too, but I could not 
> find where the temporary files are located. Is there a separate 
> directory? I have found a "pgsql_tmp" directory inside of the database 
> directories ("base//pgsql_tmp"). Is this what I'm looking for?

Yes, I believe it's accepted practice to replace this directory with a
symlink.  I've done this before and it's behaved as I'd expect.


  Sam

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Storage location of temporary files

2008-10-31 Thread Christian Schröder

Christian Schröder wrote:
So I would like to use a faster disk for these temporary files, too, 
but I could not find where the temporary files are located. Is there a 
separate directory? I have found a "pgsql_tmp" directory inside of the 
database directories ("base//pgsql_tmp"). Is this what I'm 
looking for?

Just one addition: I have found the following in the 8.3 docs:

   temp_tablespaces (string)
   This variable specifies tablespace(s) in which to create temporary
   objects (temp tables and indexes on temp tables) when a CREATE
   command does not explicitly specify a tablespace. Temporary files
   for purposes such as sorting large data sets are also created in
   these tablespace(s).

So my problem seems to have been addressed in the 8.3 release. Maybe we 
can upgrade our database, but until that happens I will need another 
solution, so my question remains ...


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Storage location of temporary files

2008-10-31 Thread Christian Schröder

Hi list,
I want to optimize the performance of our PostgreSQL 8.2 server. Up to 
now the server has a raid1 where the whole database is located 
(including tha WAL files). We will now move the database to a raid5 
(which should be faster than the raid1) and will also move the WAL to a 
separate disk (which should, according to the docs, also increase the 
performance).
But I see the temporary files as another important performance 
bottleneck. From the docs (chapter 17.4.1):


   work_mem (integer)
   Specifies the amount of memory to be used by internal sort
   operations and hash tables before switching to temporary disk files.
   [...]

We have rather complex queries and as I far as I see from the disk usage 
patterns the system makes use of temporary disk files. So I would like 
to use a faster disk for these temporary files, too, but I could not 
find where the temporary files are located. Is there a separate 
directory? I have found a "pgsql_tmp" directory inside of the database 
directories ("base//pgsql_tmp"). Is this what I'm looking for?

Thanks for your help!

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general