Re: [PERFORM] RAID controllers etc... was: PostgreSQL data on aNAS device ?

2003-10-23 Thread Alexander Priem
I have been searching (www.lsil.com) for this megaraid_2 driver you
mentioned.

What kind of MegaRaid card does the Perc4/Di match? Elite1600? Elite1650?

I picked Elite1600 and the latest driver I found was version 2.05.00. Is
this one OK for RedHat 9? The README file present only mentions RedHat8...

Kind regards,
Alexander.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] RedHat Enterprise Linux ES 3 ?!?!

2003-10-23 Thread Alexander Priem
Hi guys,

This basically continues the other thread about the PERC4 RAID controller,
but since it is a bit off-topic I thought to start another thread. Thanks
for all your help so far   :)

Earlier today I read about the newly released RedHat Enterprise Linux ES
version 3. This version should include out-of-the-box megaraid_2 drivers, so
it would support the Dell PERC4/Di RAID controller.

However, it is very much more expensive than RedHat Linux 9. RH Linux 9 is
free and the Enterpise ES edition will cost between 400 and several 1.000's
of dollars, depending on the support you want to go with it.

Do any of you guys have experience with the previous version of Enterprise
Linux (that would be version 2.1) or even better, are any of you already
using version 3?

Would you recommend this over RedHat Linux 9? I think that with RH Linux 9
it would be easier to get all the latest versions of components I need (RPMs
for PostgreSQL, Apache, Samba etc.), while my guess would be that Enterprise
Linux would be more difficult to upgrade...

Also, I cannot find any list of packages included in Enterprise Linux 2.1 /
3. Does anyone know if PostgreSQL is included and if so, what version?

Kind regards,
Alexander Priem.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] RAID controllers etc... was: PostgreSQL data on a NAS device ?

2003-10-22 Thread Alexander Priem
So I guess the PERC4/Di RAID controller is pretty good. It seems that
RedHat9 supports it out-of-the-box (driver 1.18f), but I gather from the
sites mentioned before that upgrading this driver to 1.18i would be
better...


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-21 Thread Alexander Priem
The machine is going to be used for a pretty large database (well over 100
tables with some of them containing over a million records from the start,
number of tables and records will grow (much?) larger in the future). This
database is going to be used by a pretty large number of employees. The
number of concurrent users will vary between 1 - 100 or so, depending on the
time of day etc. This will be a database containing client and supplier data
as well as product descriptions and prices/ingredients/labels/brands etc.
Database use will include lots of SELECTS but also lots of INSERTS/UPDATES,
i.e. the database will be pretty active during bussiness hours...

I think you (Scott and Will) are right when you say that NAS devices are not
ideal for this kind of thing. I have been thinking about the hardware
configuration for this machine for some time now (and had a lot of hints
through this list already) and decided to go for a SCSI RAID config after
all. The extra costs will be worth it   :)

The machine I have in mind now is like this :

Dell PowerEdge 1750 machine with Intel Xeon CPU at 3 GHz and 4 GB of RAM.
This machine will contain a PERC4/Di RAID controller with 128MB of battery
backed cache memory. The O/S and logfiles will be placed on a RAID-1 setup
of two 36Gb SCSI-U320 drives (15.000rpm). Database data will be placed on a
Dell PowerVault 220S rack-module containing six 36Gb SCSI-U320 drives
(15.000rpm) in a RAID-10 setup. This PowerVault will be connected to the DB
server via a SCSI cable...

This machine will be a bit more expensive than I thought at first (it's
going to be about EUR 14.000, but that's including 3 years of on-site
support from Dell (24x7, 4-hour response) and peripherals like UPS etc...

Do you think this machine wil be OK for this task?

Thanks for your help so far   :)

Kind regards,
Alexander Priem.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-21 Thread Alexander Priem
I have considered Opteron, yes. But I think there are too many
uncertainties, like :

* It's a new CPU that has not proven itself yet.
* I don't think I can buy directly from IBM (according to their site), so
how about support (24x7) ? This will be very important to our client.
* I need to install and configure a 64bit Linux flavour which I don't know
(yet)

Any suggestions about the usability of the system I described before?

Here is the description again:

Dell PowerEdge 1750 machine with Intel Xeon CPU at 3 GHz and 4 GB of RAM.
This machine will contain a PERC4/Di RAID controller with 128MB of battery
backed cache memory. The O/S and logfiles will be placed on a RAID-1 setup
of two 36Gb SCSI-U320 drives (15.000rpm). Database data will be placed on a
Dell PowerVault 220S rack-module containing six 36Gb SCSI-U320 drives
(15.000rpm) in a RAID-10 setup. This PowerVault will be connected to the DB
server via a SCSI cable...

I have never worked with a XEON CPU before. Does anyone know how it performs
running PostgreSQL 7.3.4 / 7.4 on RedHat 9 ? Is it faster than a Pentium 4?
I believe the main difference is cache memory, right? Aside from cache mem,
it's basically a Pentium 4, or am I wrong?

Kind regards,
Alexander.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] PostgreSQL data on a NAS device ?

2003-10-20 Thread Alexander Priem
Hi all,

Does anyone have any experience with putting PostgreSQL data on a NAS
device?

I am asking this because a NAS device is much cheaper to set up than a
couple of SCSI disks. I would like to use a relatively cheap NAS device
which uses four IDE drives (7.200 rpm), like the Dell PowerVault 725N. The
disks themselves would be much slower than SCSI disks, I know, but a NAS
device can be equipped with 3 Gb of memory, so this would make a very large
disk cache, right? If this NAS would be dedicated only to PostgreSQL, would
this be slower/faster than a SCSI RAID-10 setup of 6 disks? It would be much
cheaper...

Any advice on this would be appreciated   :)

Kind regards,
Alexander Priem.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-20 Thread Alexander Priem
Thanks for your reply, Jeff.

If we are going to use a NAS device for storage, then it will be attached
through a gigabit ethernet connection. Fiber will not be an option, since
that would negate the savings we can make by using an IDE NAS device instead
of SCSI-RAID, fiber's pretty expensive, right?

Using a NAS device (that is used only by PostgreSQL, so it's dedicated) with
3Gb of RAM and four 7200 rpm IDE harddisks, connected using a gigabit
ethernet connection to the PostgreSQL server, do you think it will be a
match for a SCSI-RAID config using 4 or 6 15000rpm disks (RAID-10) through a
SCSI-RAID controller having 128mb of writeback cache (battery-backed)?

The SCSI-RAID config would be a lot more expensive. I can't purchase both
configs and test which one wil be faster, but if the NAS solution would be
(almost) as fast as the SCSI-RAID solution, it would be cheaper and easier
to maintain...

About clustering: I know this can't be done by hooking multiple postmasters
to one and the same NAS. This would result in data corruption, i've read...

Kind regards,
Alexander.


- Original Message -
From: Jeff [EMAIL PROTECTED]
To: Alexander Priem [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 20, 2003 2:20 PM
Subject: Re: [PERFORM] PostgreSQL data on a NAS device ?


 On Mon, 20 Oct 2003 09:12:35 +0200
 Alexander Priem [EMAIL PROTECTED] wrote:

  I am asking this because a NAS device is much cheaper to set up than a
  couple of SCSI disks. I would like to use a relatively cheap NAS
  device which uses four IDE drives (7.200 rpm), like the Dell
  PowerVault 725N. The disks themselves would be much slower than SCSI
  disks, I know, but a NAS device can be equipped with 3 Gb of memory,
  so this would make a very large disk cache, right? If this NAS would
  be dedicated only to PostgreSQL, would this be slower/faster than a
  SCSI RAID-10 setup of 6 disks? It would be much cheaper...
 

 The big concern would be the network connection, unless you are going
 fiber.  You need to use _AT LEAST_ gigabit. _at least_.If you do
 go that route it'd be interesting to see bonnie results.  And the
 other thing - remember that just because you are running NAS doesn't
 mean you can attach another machine running postgres and have a
 cluster.  (See archives for more info about this).

 I suppose it all boils down to your budget (I usually get to work with
 a budget of $0). And I mentioned this in another post- If you don't mind
 refurb disks(or slightly used) check out ebay - you can get scsi disks
 by the truckload for cheap.


 --
 Jeff Trout [EMAIL PROTECTED]
 http://www.jefftrout.com/
 http://www.stuarthamm.net/


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-20 Thread Alexander Priem
Even better than the four-disk NAS I mentioned earlier is the following:

Promise UltraTrak RM8000. This is a so-called SCSI-to-IDE RAID system.
Basically it's a RAID setup of eight IDE disks, using a hardware RAID
engine, that's connected to (in this case) the PostgreSQL server via a SCSI
Ultra160 interface (!). So the server won't know any better than that
there's a SCSI disk attached, but in reality it's a IDE RAID setup. It
supports RAID levels 0, 1, 0+1, 5, 50 and JBOD and supports hot-swapping.

Such a NAS config would cost around EUR 3700 (ex. VAT), using 8x40 Gb IDE
disks (7200rpm).

A SCSI RAID-10 setup using 6x18Gb (15000rpm) disks would cost around EUR
6000 (ex. VAT) so it's a big difference...

Does anyone have experience with this NAS device or other SCSI-to-IDE RAID
systems? Are they OK in terms of performance and reliability?

Kind regards,
Alexander.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] Many fields in one table or many tables?

2003-09-18 Thread Alexander Priem
Hi guys,

I am in the process of creating a database design in which LOTS of data need
to be modelled.

For instance, I need to store data about products. Every product has LOTS of
properties, well over a hundred.

So I'm wondering. What's the best approach here, performance wise? Just
create one Product table with well over a hundred columns? Or would it be
better to divide this over more tables and link them together via ID's? I
could for instance create tables Product, PriceInfo, Logistics, StorageInfo,
PackagingInfo and link them together via the same ID. This would be easier
to document (try to visualize a 100+ column table in a document!), but would
it impact performance? I tihnk maybe it would impact Select performance, but
Updating of products would maybe speed up a little...

All info about a product is unique for this product so records in PriceInfo,
Logistics, StorageInfo, PackagingInfo tables would map one to one to records
in the Product table.

Do any of you know if and how PostgreSQL would prefer one approach over the
other?

Thanks in advance,
Alexander Priem.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Indexing question

2003-08-29 Thread Alexander Priem
So if I understand correctly I could ditch the 'deleted' field entirely and
use just the 'deleteddate' field. This 'deleteddate' field would be NULL by
default. It would contain a date value if the record is considered
'deleted'.

The index would be 'create index a on tablename(deleteddate) where
deleteddate is null'.

I could then access 'current' records with a view like 'create view x_view
as select * from tablename where deleteddate is null'.

Is that correct? This would be the best performing solution for this kind of
thing, I think (theoretically at least)?

Kind regards,
Alexander Priem.



- Original Message -
From: Tomasz Myrta [EMAIL PROTECTED]
To: Alexander Priem [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, August 29, 2003 9:03 AM
Subject: Re: [PERFORM] Indexing question


  Hi all,
 
  I have some tables (which can get pretty large) in which I want to
  record 'current' data as well as 'historical' data. This table has
  fields 'deleted' and 'deleteddate' (among other fields, of course). The
  field 'deleted' is false be default. Every record that I want to delete
  gets the value true for 'deleted' and 'deleteddate' is set to the date
  of deletion.
 
  Since these tables are used a lot by queries that only use 'current'
  data, I have created a view with a where clause 'Where not deleted'.
  Also, I have indexed field 'deleted'.

 cut
 I think the best choice for your case is using conditional indexes. It
 should be much better than indexing 'deleted' field. I don't know on
 which exactly fields you have to create this index - you have to check
 it by yourself - what do you have in where clause?

 Example:
 create index some_index on your_table(id_field) where not deleted;


 Regards,
 Tomasz Myrta


 ---(end of broadcast)---
 TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Indexing question

2003-08-29 Thread Alexander Priem
Well, the intention is to hold every record that ever existed in the table.
Therefore, records do not get deleted, but they get a date in the
deleteddate field. This way, we can track what changes were made to the
table(s).

So if a record gets 'deleted', the field 'deleted' is set to today's date.
If a record gets 'updated', a new record is made containing the new data,
and the old record is marked as 'deleted'.

So the percentage of 'deleted' records will grow with time, if you
understand what I mean.



- Original Message -
From: Tom Lane [EMAIL PROTECTED]
To: Alexander Priem [EMAIL PROTECTED]
Cc: Tomasz Myrta [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Friday, August 29, 2003 4:00 PM
Subject: Re: [PERFORM] Indexing question


 Alexander Priem [EMAIL PROTECTED] writes:
  Does anyone know whether it is bad practise to have two indexes on the
  primary key of a table? (one 'primary key' index and one partial index)

 It's a little unusual, but if you get enough performance boost from it
 to justify the maintenance cost of the extra index, then I can't see
 anything wrong with it.

 The if is worth checking though.  I missed the start of this thread,
 but what percentage of your rows do you expect to have null deleteddate?
 Unless it's a pretty small percentage, I'm unconvinced that the extra
 indexes will be worth their cost.

 regards, tom lane


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Tuning PostgreSQL

2003-07-22 Thread Alexander Priem
Wow, I never figured how many different RAID configurations one could think
of   :)

After reading lots of material, forums and of course, this mailing-list, I
think I am going for a RAID5 configuration of 6 disks (18Gb, 15.000 rpm
each), one of those six disks will be a 'hot spare'. I will just put the OS,
the WAL and the data one one volume. RAID10 is way to expensive   :)

If I understand correctly, this will give great read-performance, but less
write-performance. But since this server will be equipped with an embedded
RAID controller featuring 128Mb of battery-backed cache, I figure that this
controller will negate that (at least somewhat). I will need to find out
whether this cache can be configured so that it will ONLY cache WRITES, not
READS

Also because of this battery backed cache controller, I will go for the ext2
file system, mounted with 'noatime'. I will use a UPS, so I don't think I
need the journaling of ext3. XFS is not natively supported by RedHat and I
will go for the easy way here   :)

1 Gb of RAM should be enough, I think. That is about the only point that
almost everyone agrees on   :)   Do you think ECC is very important? The
server I have in mind does not support it. Another one does, but is is about
1.000 euros more expensive   :(

One CPU should also be enough.

As for postgresql.conf settings, I think I will start with the following :

max_connections = 128
superuser_reserved_connections = 1
shared_buffers = 8192
max_fsm_relations = 1000
max_fsm_pages = 10
wal_buffers = 32
sort_mem = 2048
vacuum_mem = 32768
effective_cache_size = 28672 (this one I'm not sure about, maybe this one
needs to be higher)
random_page_cost = 2
geq0_threshold = 20

This pretty much sums it up. What do you think about this config? It may not
be the fastest, but a server like this will cost about 4750 euros, and that
is including an Intel Xeon 2.4GHz cpu, redundant power supply, WITHOUT the
UPS. Seems very reasonable to me...

Kind regards,
Alexander Priem.



- Original Message -
From: Vincent van Leeuwen [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, July 22, 2003 11:40 AM
Subject: Re: [PERFORM] Tuning PostgreSQL


 On 2003-07-22 09:04:42 +0200, Alexander Priem wrote:
  Hi all,
 
  Vincent, You said that using RAID1, you don't have real redundancy. But
  RAID1 is mirroring, right? So if one of the two disks should fail, there
  should be no data lost, right?
 

 Right. But the proposal was a single disk for WAL, without redundancy, and
I
 argued that wasn't really safe. RAID1 by itself is extremely safe,
possibly
 even the safest RAID type there is.

  I have been thinking some more. 18Gb drives are cheaper than 36 or 72Gb
  drives. I don't know if I can get the money for this, but how would the
  following setup sound?
 
  Two 18Gb (15.000rpm) disks in RAID1 array for Operating System + WAL.
  Four 18Gb (15.000rpm) disks in RAID5 array for data.
 

 Our own testing has shown that a 6 disk RAID-10 array is faster than what
you
 describe. Of course, this is very much dependant on how much
INSERT/UPDATES
 you generate (which taxes your WAL more), so your mileage may vary.

  For the same amount of money, I could also get:
 
  Two 36Gb (10.000rpm) disks in RAID1 array for Operating System + WAL.
  Five/Six 36Gb (10.000rpm) disks in RAID5 array for data.
 

 It is said that a higher RPM is particularly useful for a WAL disk. So you
 might consider using two 18GB 15K rpm drives for a RAID-1 WAL disk (+OS
and
 swap), and using 36GB 10K rpm disks in a RAID-5 array if you need that
 diskspace.

  Which would be the best of the above? The one with four 15k-rpm disks or
the
  one with five/six 10k-rpm disks?
  Would these configs be better than all disks in one huge RAID5 array?
There
  are so many possible configs with RAID...
 

 15K rpm disks are significantly faster than 10K rpm disks. If your only
 concern is performance, buy 15K rpm disks. If you want more diskspace for
your
 money, fall back to larger 10K rpm disks.

 I personally think seperate WAL disks are vastly overrated, since they
haven't
 shown a big performance gain in our own tests. But as I have said, this is
 extremely dependant on the type of load you generate, so only your own
tests
 can tell you what you should do in this respect.

 About RAID types: the fastest RAID type by far is RAID-10. However, this
will
 cost you a lot of useable diskspace, so it isn't for everyone. You need at
 least 4 disks for a RAID-10 array. RAID-5 is a nice compromise if you want
as
 much useable diskspace as possible and still want to be redundant. RAID-1
is
 very useful for small (2-disk) arrays.

 If you have the time and are settled on buying 6 disks, I'd test the
following
 scenarios:
 - 6-disk RAID-10 array (should perform best)
 - 4-disk RAID-10 array containing data, 2-disk RAID-1 array for WAL, OS,
etc
 - 4-disk RAID-5 array containing data, 2-disk RAID-1 array for WAL, OS,
etc
 - 6-disk RAID-5 array (will probably perform

[PERFORM] Tuning PostgreSQL

2003-07-21 Thread Alexander Priem



Hi guys,

I am new to PostgreSQL and have done some 
"extensive" research already. If you could give me some advice/confirmation, I 
would be really grateful.

I am going to build a PostgreSQL database server 
for a client. This database will contain many tables (over 100, 
maybemore), with some of them containing over 1 million records pretty 
soon. This database will be accessed via a desktop application (Windows) and a 
web application (PHP/Apache). There may beover 100 people accessing the 
database atany giventime (selecting, inserting and updating), but I 
don't think it will be a lot more than 100 at any given time. Most of the time, 
it will be less.

What I am thinking about is buying a server with 
the following specifications:

* 1 or 2 Intel Xeon processors (2.4 
GHz).
* 2 Gigabytes of RAM (DDR/ECC).
* Three 36Gb SCSI160 disks (10.000rpm) in a RAID-5 
config, giving 72Gb storage space (right?). The RAID-5 controller has 
a(hardware) cache of 128Mb.
* 100Mbit ethernet.

I will run RedHat Linux 9 (kernel 2.40) with 
PostgreSQL 7.3.3 on this server.

What would you think of this hardware config? Would 
it do? Of would 4Gb RAM be a lot better? What do you think about the need for 
two Xeon procs?

Finally, I have some questions about 
postgresql.conf (who doesnt?). After some research, I think I will go for the 
following settings initially. Maybe benchmarking will lead to other values, but 
I think these settings will be a fine starting point :

shared_buffers = 6000(kernel.shmmax = 
6000)
sort_mem = 4096
max_connections = 150
vacuum_mem = 65536

What do you think of these settings? Do you have 
any other hints for optimizing PostgreSQL

Many many thanks in advance 
:)

Kind regards,Alexander PriemCICT SolutionsEmail: [EMAIL PROTECTED]Internet: www.cict.nl


Re: [PERFORM] Tuning PostgreSQL

2003-07-21 Thread Alexander Priem
Thanks, I will look at the site you sent me and purchase some hardware. Then
I will run some benchmarks.

Kind regards,
Alexander.

- Original Message -
From: Shridhar Daithankar [EMAIL PROTECTED]
To: Alexander Priem [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, July 21, 2003 11:33 AM
Subject: Re: [PERFORM] Tuning PostgreSQL


 Hi Alexander ,

 On 21 Jul 2003 at 11:23, Alexander Priem wrote:
  So the memory settings I specified are pretty much OK?

 As of now yes, You need to test with these settings and make sure that
they
 perform as per your requirement. That tweaking will always be there...

  What would be good guidelines for setting effective_cache_size, noatime
?

 I suggest you look at
 http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html.

 That should help you.

  I use ext3 filesystem, which probably is not the best performer, is it?

 No. You also need to check ext2, reiser and XFS. There is no agreement
between
 users as in what works best. You need to benchmark and decide.

  I will set the WAL on a separate drive. What do I need to change in the
conf
  files to achive this?

 No. You need to shutdown postgresql server process and symlink WAL and
clog
 directories in postgresql database cluster to another place. That should
do it.

 HTH

 Bye
  Shridhar

 --
 Meade's Maxim: Always remember that you are absolutely unique, just like
everyone else.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Tuning PostgreSQL

2003-07-21 Thread Alexander Priem
Thanks, i'll look further into these mount setting.

I was just thinking, the server will have a (RAID) controller containing
128Mb of battery-backed cache memory. This would really speed up inserts to
the disk and would prevent data loss in case of a power-down also.

What would you guys think of not using RAID5 in that case, but just a really
fast 15.000 rpm SCSI-320 disk?

Kind regards,
Alexander.


- Original Message -
From: Shridhar Daithankar [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, July 21, 2003 2:05 PM
Subject: Re: [PERFORM] Tuning PostgreSQL


 On 21 Jul 2003 at 13:45, Alexander Priem wrote:

  So where can I set the noatime  data=writeback variables? They are not
  PostgreSQL settings, but rather Linux settings, right? Where can I find
  these?

 These are typicaly set in /etc/fstab.conf. These are mount settings. man
mount
 for more details.

 The second setting data=writeback is ext3 specific, IIRC.

 HTH

 Bye
  Shridhar

 --
 History tends to exaggerate. -- Col. Green, The Savage Curtain, stardate
 5906.4


 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster