Re: [PERFORM] RAID controllers etc... was: PostgreSQL data on a NAS device ?
I have been searching (www.lsil.com) for the megaraid_2 driver you mentioned. Which MegaRAID card does the PERC4/Di correspond to? The Elite1600? The Elite1650? I picked the Elite1600 and the latest driver I found there was version 2.05.00. Is this one OK for RedHat 9? The README file only mentions RedHat 8...

Kind regards, Alexander.

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] RedHat Enterprise Linux ES 3 ?!?!
Hi guys,

This basically continues the other thread about the PERC4 RAID controller, but since it is a bit off-topic I thought I'd start another thread. Thanks for all your help so far :)

Earlier today I read about the newly released RedHat Enterprise Linux ES version 3. This version should include out-of-the-box megaraid_2 drivers, so it would support the Dell PERC4/Di RAID controller. However, it is much more expensive than RedHat Linux 9. RH Linux 9 is free, while the Enterprise ES edition will cost between 400 and several thousands of dollars, depending on the support you want with it.

Do any of you have experience with the previous version of Enterprise Linux (that would be version 2.1) or, even better, are any of you already using version 3? Would you recommend it over RedHat Linux 9? I think that with RH Linux 9 it would be easier to get all the latest versions of the components I need (RPMs for PostgreSQL, Apache, Samba etc.), while my guess is that Enterprise Linux would be more difficult to upgrade...

Also, I cannot find any list of packages included in Enterprise Linux 2.1 / 3. Does anyone know if PostgreSQL is included and, if so, what version?

Kind regards, Alexander Priem.
Re: [PERFORM] RAID controllers etc... was: PostgreSQL data on a NAS device ?
So I guess the PERC4/Di RAID controller is pretty good. It seems that RedHat 9 supports it out-of-the-box (driver 1.18f), but I gather from the sites mentioned before that upgrading this driver to 1.18i would be better...
Re: [PERFORM] PostgreSQL data on a NAS device ?
The machine is going to be used for a pretty large database (well over 100 tables, with some of them containing over a million records from the start; the number of tables and records will grow (much?) larger in the future). This database is going to be used by a pretty large number of employees. The number of concurrent users will vary between 1 and 100 or so, depending on the time of day etc. It will contain client and supplier data as well as product descriptions, prices, ingredients, labels, brands etc. Database use will include lots of SELECTs, but also lots of INSERTs/UPDATEs, i.e. the database will be pretty active during business hours...

I think you (Scott and Will) are right when you say that NAS devices are not ideal for this kind of thing. I have been thinking about the hardware configuration for this machine for some time now (and have had a lot of hints through this list already) and decided to go for a SCSI RAID config after all. The extra costs will be worth it :)

The machine I have in mind now is like this: a Dell PowerEdge 1750 with an Intel Xeon CPU at 3 GHz and 4 GB of RAM. This machine will contain a PERC4/Di RAID controller with 128 MB of battery-backed cache memory. The OS and log files will be placed on a RAID-1 setup of two 36 GB SCSI-U320 drives (15,000 rpm). Database data will be placed on a Dell PowerVault 220S rack module containing six 36 GB SCSI-U320 drives (15,000 rpm) in a RAID-10 setup. This PowerVault will be connected to the DB server via a SCSI cable...

This machine will be a bit more expensive than I thought at first (it's going to be about EUR 14.000, but that's including 3 years of on-site support from Dell (24x7, 4-hour response) and peripherals like a UPS etc.). Do you think this machine will be OK for this task?

Thanks for your help so far :) Kind regards, Alexander Priem.
Re: [PERFORM] PostgreSQL data on a NAS device ?
I have considered Opteron, yes. But I think there are too many uncertainties, like:

* It's a new CPU that has not proven itself yet.
* I don't think I can buy directly from IBM (according to their site), so what about (24x7) support? This will be very important to our client.
* I would need to install and configure a 64-bit Linux flavour which I don't know (yet).

Any suggestions about the usability of the system I described before? Here is the description again: a Dell PowerEdge 1750 with an Intel Xeon CPU at 3 GHz and 4 GB of RAM. This machine will contain a PERC4/Di RAID controller with 128 MB of battery-backed cache memory. The OS and log files will be placed on a RAID-1 setup of two 36 GB SCSI-U320 drives (15,000 rpm). Database data will be placed on a Dell PowerVault 220S rack module containing six 36 GB SCSI-U320 drives (15,000 rpm) in a RAID-10 setup. This PowerVault will be connected to the DB server via a SCSI cable...

I have never worked with a Xeon CPU before. Does anyone know how it performs running PostgreSQL 7.3.4 / 7.4 on RedHat 9? Is it faster than a Pentium 4? I believe the main difference is cache memory, right? Aside from the cache, it's basically a Pentium 4, or am I wrong?

Kind regards, Alexander.
[PERFORM] PostgreSQL data on a NAS device ?
Hi all,

Does anyone have any experience with putting PostgreSQL data on a NAS device?

I am asking this because a NAS device is much cheaper to set up than a couple of SCSI disks. I would like to use a relatively cheap NAS device which uses four IDE drives (7,200 rpm), like the Dell PowerVault 725N. The disks themselves would be much slower than SCSI disks, I know, but a NAS device can be equipped with 3 GB of memory, so this would make a very large disk cache, right? If this NAS were dedicated only to PostgreSQL, would it be slower or faster than a SCSI RAID-10 setup of 6 disks? It would be much cheaper...

Any advice on this would be appreciated :)

Kind regards, Alexander Priem.
Re: [PERFORM] PostgreSQL data on a NAS device ?
Thanks for your reply, Jeff.

If we are going to use a NAS device for storage, it will be attached through a gigabit ethernet connection. Fibre will not be an option, since that would negate the savings we make by using an IDE NAS device instead of SCSI RAID; fibre's pretty expensive, right?

Using a NAS device (dedicated to PostgreSQL) with 3 GB of RAM and four 7,200 rpm IDE hard disks, connected via gigabit ethernet to the PostgreSQL server: do you think it will be a match for a SCSI RAID config using 4 or 6 15,000 rpm disks (RAID-10) behind a SCSI RAID controller with 128 MB of battery-backed writeback cache? The SCSI RAID config would be a lot more expensive. I can't purchase both configs and test which one will be faster, but if the NAS solution were (almost) as fast as the SCSI RAID solution, it would be cheaper and easier to maintain...

About clustering: I know this can't be done by hooking multiple postmasters up to one and the same NAS. This would result in data corruption, I've read...

Kind regards, Alexander.

- Original Message - From: Jeff [EMAIL PROTECTED] To: Alexander Priem [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Monday, October 20, 2003 2:20 PM Subject: Re: [PERFORM] PostgreSQL data on a NAS device ?

On Mon, 20 Oct 2003 09:12:35 +0200 Alexander Priem [EMAIL PROTECTED] wrote: I am asking this because a NAS device is much cheaper to set up than a couple of SCSI disks. I would like to use a relatively cheap NAS device which uses four IDE drives (7.200 rpm), like the Dell PowerVault 725N. The disks themselves would be much slower than SCSI disks, I know, but a NAS device can be equipped with 3 Gb of memory, so this would make a very large disk cache, right? If this NAS would be dedicated only to PostgreSQL, would this be slower/faster than a SCSI RAID-10 setup of 6 disks? It would be much cheaper...

The big concern would be the network connection, unless you are going fiber.
You need to use _AT LEAST_ gigabit. _at least_. If you do go that route it'd be interesting to see bonnie results. And the other thing - remember that just because you are running NAS doesn't mean you can attach another machine running postgres and have a cluster. (See the archives for more info about this.) I suppose it all boils down to your budget (I usually get to work with a budget of $0). And I mentioned this in another post - if you don't mind refurb (or slightly used) disks, check out eBay - you can get SCSI disks by the truckload for cheap.

-- Jeff Trout [EMAIL PROTECTED] http://www.jefftrout.com/ http://www.stuarthamm.net/
Re: [PERFORM] PostgreSQL data on a NAS device ?
Even better than the four-disk NAS I mentioned earlier is the following: the Promise UltraTrak RM8000. This is a so-called SCSI-to-IDE RAID system. Basically it's a RAID setup of eight IDE disks, using a hardware RAID engine, that's connected to (in this case) the PostgreSQL server via a SCSI Ultra160 interface (!). So the server won't know any better than that there's a SCSI disk attached, but in reality it's an IDE RAID setup. It supports RAID levels 0, 1, 0+1, 5, 50 and JBOD, and it supports hot-swapping.

Such a config would cost around EUR 3700 (ex. VAT), using 8x40 GB IDE disks (7,200 rpm). A SCSI RAID-10 setup using 6x18 GB (15,000 rpm) disks would cost around EUR 6000 (ex. VAT), so it's a big difference...

Does anyone have experience with this device or other SCSI-to-IDE RAID systems? Are they OK in terms of performance and reliability?

Kind regards, Alexander.
[PERFORM] Many fields in one table or many tables?
Hi guys,

I am in the process of creating a database design in which LOTS of data need to be modelled. For instance, I need to store data about products. Every product has LOTS of properties, well over a hundred.

So I'm wondering: what's the best approach here, performance-wise? Just create one Product table with well over a hundred columns? Or would it be better to divide this over more tables and link them together via IDs? I could for instance create tables Product, PriceInfo, Logistics, StorageInfo and PackagingInfo and link them together via the same ID. This would be easier to document (try to visualize a 100+ column table in a document!), but would it impact performance? I think maybe it would hurt SELECT performance, but updating of products might speed up a little... All info about a product is unique to that product, so records in the PriceInfo, Logistics, StorageInfo and PackagingInfo tables would map one-to-one to records in the Product table.

Do any of you know if and how PostgreSQL would prefer one approach over the other?

Thanks in advance, Alexander Priem.
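To make the split-table idea concrete, this is roughly what I have in mind (a sketch only; all table and column names here are invented). Each side table reuses the product id as both its primary key and a foreign key, which enforces the one-to-one mapping, and reassembling the wide row then costs a join:

```sql
-- Sketch of the split-table variant (names are invented).
CREATE TABLE product (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE priceinfo (
    product_id integer PRIMARY KEY REFERENCES product (id),
    listprice  numeric(10,2),
    currency   char(3)
);

-- Reassembling the "wide" row costs a join:
SELECT p.name, pi.listprice
FROM product p
JOIN priceinfo pi ON pi.product_id = p.id;
```

With the single-table variant, the SELECT needs no join but every UPDATE rewrites the whole wide row.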
Re: [PERFORM] Indexing question
So if I understand correctly, I could ditch the 'deleted' field entirely and use just the 'deleteddate' field. This 'deleteddate' field would be NULL by default and would contain a date value if the record is considered 'deleted'. The index would be 'create index a on tablename(deleteddate) where deleteddate is null'. I could then access 'current' records with a view like 'create view x_view as select * from tablename where deleteddate is null'.

Is that correct? This would be the best-performing solution for this kind of thing, I think (theoretically at least)?

Kind regards, Alexander Priem.

- Original Message - From: Tomasz Myrta [EMAIL PROTECTED] To: Alexander Priem [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Friday, August 29, 2003 9:03 AM Subject: Re: [PERFORM] Indexing question

Hi all, I have some tables (which can get pretty large) in which I want to record 'current' data as well as 'historical' data. This table has fields 'deleted' and 'deleteddate' (among other fields, of course). The field 'deleted' is false by default. Every record that I want to delete gets the value true for 'deleted', and 'deleteddate' is set to the date of deletion. Since these tables are used a lot by queries that only use 'current' data, I have created a view with a where clause 'where not deleted'. Also, I have indexed the field 'deleted'.

cut

I think the best choice for your case is using conditional (partial) indexes. It should be much better than indexing the 'deleted' field. I don't know exactly which fields you should create this index on - you have to check that yourself - what do you have in the where clause?

Example: create index some_index on your_table(id_field) where not deleted;

Regards, Tomasz Myrta
Re: [PERFORM] Indexing question
Well, the intention is to hold every record that ever existed in the table. Therefore, records do not get deleted, but they get a date in the 'deleteddate' field. This way, we can track what changes were made to the table(s). So if a record gets 'deleted', the field 'deleteddate' is set to today's date. If a record gets 'updated', a new record is made containing the new data, and the old record is marked as 'deleted'. So the percentage of 'deleted' records will grow with time, if you understand what I mean.

- Original Message - From: Tom Lane [EMAIL PROTECTED] To: Alexander Priem [EMAIL PROTECTED] Cc: Tomasz Myrta [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Friday, August 29, 2003 4:00 PM Subject: Re: [PERFORM] Indexing question

Alexander Priem [EMAIL PROTECTED] writes: Does anyone know whether it is bad practise to have two indexes on the primary key of a table? (one 'primary key' index and one partial index)

It's a little unusual, but if you get enough performance boost from it to justify the maintenance cost of the extra index, then I can't see anything wrong with it. The 'if' is worth checking though. I missed the start of this thread, but what percentage of your rows do you expect to have a null deleteddate? Unless it's a pretty small percentage, I'm unconvinced that the extra indexes will be worth their cost.

regards, tom lane
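To make the versioning scheme above concrete, it would look something like this (a sketch only; table and column names and the example id are invented):

```sql
-- Versioning sketch (invented names): deleteddate IS NULL marks the
-- current version of a record.
CREATE TABLE product_hist (
    rev         serial PRIMARY KEY,
    product_id  integer NOT NULL,
    name        text,
    deleteddate date          -- NULL = current version
);

-- Partial index covering only the current versions:
CREATE INDEX product_hist_current
    ON product_hist (product_id) WHERE deleteddate IS NULL;

-- An "update" closes the current version and inserts the new one:
BEGIN;
UPDATE product_hist SET deleteddate = CURRENT_DATE
 WHERE product_id = 42 AND deleteddate IS NULL;
INSERT INTO product_hist (product_id, name) VALUES (42, 'new name');
COMMIT;
```

Since the partial index only contains rows with a NULL deleteddate, its size stays proportional to the number of current records, not to the ever-growing history.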
Re: [PERFORM] Tuning PostgreSQL
Wow, I never figured how many different RAID configurations one could think of :)

After reading lots of material, forums and, of course, this mailing list, I think I am going for a RAID-5 configuration of 6 disks (18 GB, 15,000 rpm each), one of which will be a hot spare. I will just put the OS, the WAL and the data on one volume. RAID-10 is way too expensive :)

If I understand correctly, this will give great read performance but less write performance. But since this server will be equipped with an embedded RAID controller featuring 128 MB of battery-backed cache, I figure that the controller will negate that (at least somewhat). I will need to find out whether this cache can be configured so that it will ONLY cache WRITES, not READS.

Also because of this battery-backed cache controller, I will go for the ext2 file system, mounted with 'noatime'. I will use a UPS, so I don't think I need the journaling of ext3. XFS is not natively supported by RedHat and I will go for the easy way here :)

1 GB of RAM should be enough, I think. That is about the only point that almost everyone agrees on :) Do you think ECC is very important? The server I have in mind does not support it. Another one does, but it is about 1.000 euros more expensive :( One CPU should also be enough.

As for postgresql.conf settings, I think I will start with the following:

max_connections = 128
superuser_reserved_connections = 1
shared_buffers = 8192
max_fsm_relations = 1000
max_fsm_pages = 10
wal_buffers = 32
sort_mem = 2048
vacuum_mem = 32768
effective_cache_size = 28672 (this one I'm not sure about, maybe it needs to be higher)
random_page_cost = 2
geqo_threshold = 20

This pretty much sums it up. What do you think about this config? It may not be the fastest, but a server like this will cost about 4750 euros, and that is including an Intel Xeon 2.4 GHz CPU and a redundant power supply, WITHOUT the UPS. Seems very reasonable to me...

Kind regards, Alexander Priem.
- Original Message - From: Vincent van Leeuwen [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Tuesday, July 22, 2003 11:40 AM Subject: Re: [PERFORM] Tuning PostgreSQL

On 2003-07-22 09:04:42 +0200, Alexander Priem wrote: Hi all, Vincent, you said that using RAID-1, you don't have real redundancy. But RAID-1 is mirroring, right? So if one of the two disks should fail, there should be no data lost, right?

Right. But the proposal was a single disk for WAL, without redundancy, and I argued that wasn't really safe. RAID-1 by itself is extremely safe, possibly even the safest RAID type there is.

I have been thinking some more. 18 GB drives are cheaper than 36 or 72 GB drives. I don't know if I can get the money for this, but how would the following setup sound? Two 18 GB (15,000 rpm) disks in a RAID-1 array for operating system + WAL. Four 18 GB (15,000 rpm) disks in a RAID-5 array for data.

Our own testing has shown that a 6-disk RAID-10 array is faster than what you describe. Of course, this is very much dependent on how many INSERTs/UPDATEs you generate (which taxes your WAL more), so your mileage may vary.

For the same amount of money, I could also get: two 36 GB (10,000 rpm) disks in a RAID-1 array for operating system + WAL, and five/six 36 GB (10,000 rpm) disks in a RAID-5 array for data.

It is said that a higher RPM is particularly useful for a WAL disk. So you might consider using two 18 GB 15K rpm drives for a RAID-1 WAL disk (+ OS and swap), and using 36 GB 10K rpm disks in a RAID-5 array if you need that disk space.

Which would be the best of the above? The one with four 15K rpm disks or the one with five/six 10K rpm disks? Would these configs be better than all disks in one huge RAID-5 array? There are so many possible configs with RAID...

15K rpm disks are significantly faster than 10K rpm disks. If your only concern is performance, buy 15K rpm disks. If you want more disk space for your money, fall back to larger 10K rpm disks.
I personally think separate WAL disks are vastly overrated, since they haven't shown a big performance gain in our own tests. But as I have said, this is extremely dependent on the type of load you generate, so only your own tests can tell you what you should do in this respect.

About RAID types: the fastest RAID type by far is RAID-10. However, this will cost you a lot of usable disk space, so it isn't for everyone. You need at least 4 disks for a RAID-10 array. RAID-5 is a nice compromise if you want as much usable disk space as possible and still want to be redundant. RAID-1 is very useful for small (2-disk) arrays.

If you have the time and are settled on buying 6 disks, I'd test the following scenarios:
- 6-disk RAID-10 array (should perform best)
- 4-disk RAID-10 array containing data, 2-disk RAID-1 array for WAL, OS, etc.
- 4-disk RAID-5 array containing data, 2-disk RAID-1 array for WAL, OS, etc.
- 6-disk RAID-5 array (will probably perform
[PERFORM] Tuning PostgreSQL
Hi guys,

I am new to PostgreSQL and have done some "extensive" research already. If you could give me some advice/confirmation, I would be really grateful.

I am going to build a PostgreSQL database server for a client. This database will contain many tables (over 100, maybe more), with some of them containing over 1 million records pretty soon. The database will be accessed via a desktop application (Windows) and a web application (PHP/Apache). There may be over 100 people accessing the database at any given time (selecting, inserting and updating), but I don't think it will be a lot more than 100 at any given time. Most of the time, it will be less.

What I am thinking about is buying a server with the following specifications:

* 1 or 2 Intel Xeon processors (2.4 GHz).
* 2 gigabytes of RAM (DDR/ECC).
* Three 36 GB SCSI-160 disks (10,000 rpm) in a RAID-5 config, giving 72 GB of storage space (right?). The RAID-5 controller has a (hardware) cache of 128 MB.
* 100 Mbit ethernet.

I will run RedHat Linux 9 (kernel 2.4.20) with PostgreSQL 7.3.3 on this server. What would you think of this hardware config? Would it do? Or would 4 GB of RAM be a lot better? What do you think about the need for two Xeon processors?

Finally, I have some questions about postgresql.conf (who doesn't?). After some research, I think I will go for the following settings initially. Maybe benchmarking will lead to other values, but I think these settings will be a fine starting point:

shared_buffers = 6000 (kernel.shmmax = 6000)
sort_mem = 4096
max_connections = 150
vacuum_mem = 65536

What do you think of these settings? Do you have any other hints for optimizing PostgreSQL?

Many, many thanks in advance :)

Kind regards, Alexander Priem, CICT Solutions
Email: [EMAIL PROTECTED]
Internet: www.cict.nl
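One thing to double-check on the shared_buffers/shmmax pairing above: if I understand the documentation correctly, shared_buffers counts 8 kB pages while kernel.shmmax is specified in bytes, so the two can't literally both be 6000. A quick back-of-the-envelope:

```shell
# shared_buffers is counted in 8 kB pages; kernel.shmmax is in bytes.
# So 6000 buffers need a shared memory segment of at least:
echo $((6000 * 8192))   # 49152000 bytes, i.e. roughly 47 MB (plus overhead)
```

So kernel.shmmax would need to be set to at least that many bytes before the postmaster can start with shared_buffers = 6000.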
Re: [PERFORM] Tuning PostgreSQL
Thanks, I will look at the site you sent me and purchase some hardware. Then I will run some benchmarks.

Kind regards, Alexander.

- Original Message - From: Shridhar Daithankar [EMAIL PROTECTED] To: Alexander Priem [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Monday, July 21, 2003 11:33 AM Subject: Re: [PERFORM] Tuning PostgreSQL

Hi Alexander,

On 21 Jul 2003 at 11:23, Alexander Priem wrote: So the memory settings I specified are pretty much OK?

As of now, yes. You need to test with these settings and make sure that they perform as per your requirements. That tweaking will always be there...

What would be good guidelines for setting effective_cache_size, noatime?

I suggest you look at http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html. That should help you.

I use the ext3 filesystem, which probably is not the best performer, is it?

No. You also need to check ext2, reiser and XFS. There is no agreement between users as to what works best. You need to benchmark and decide.

I will set the WAL on a separate drive. What do I need to change in the conf files to achieve this?

Nothing. You need to shut down the postgresql server process and symlink the WAL and clog directories in the postgresql database cluster to another place. That should do it.

HTH. Bye, Shridhar

-- Meade's Maxim: Always remember that you are absolutely unique, just like everyone else.
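For the record, the symlink trick Shridhar describes would look something like this. Scratch directories are used here just to show the mechanics; on a real (7.3-era) cluster you would stop the postmaster first and operate on $PGDATA/pg_xlog:

```shell
# Demonstrating the WAL-relocation mechanics on scratch directories.
# On a real cluster: stop the postmaster, then do this to $PGDATA/pg_xlog.
PGDATA=$(mktemp -d)          # stand-in for the database cluster directory
WALDISK=$(mktemp -d)         # stand-in for the separate WAL drive
mkdir -p "$PGDATA/pg_xlog"

mv "$PGDATA/pg_xlog" "$WALDISK/pg_xlog"     # move WAL onto the other drive
ln -s "$WALDISK/pg_xlog" "$PGDATA/pg_xlog"  # leave a symlink behind

readlink "$PGDATA/pg_xlog"   # the server follows this symlink transparently
```

The same pattern applies to the clog directory Shridhar mentions.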
Re: [PERFORM] Tuning PostgreSQL
Thanks, I'll look further into these mount settings.

I was just thinking: the server will have a (RAID) controller containing 128 MB of battery-backed cache memory. This would really speed up inserts to the disk and would also prevent data loss in case of a power-down. What would you guys think of not using RAID-5 in that case, but just a really fast 15,000 rpm SCSI-320 disk?

Kind regards, Alexander.

- Original Message - From: Shridhar Daithankar [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Monday, July 21, 2003 2:05 PM Subject: Re: [PERFORM] Tuning PostgreSQL

On 21 Jul 2003 at 13:45, Alexander Priem wrote: So where can I set the noatime and data=writeback variables? They are not PostgreSQL settings, but rather Linux settings, right? Where can I find these?

These are typically set in /etc/fstab. They are mount settings; see 'man mount' for more details. The second setting, data=writeback, is ext3-specific, IIRC.

HTH. Bye, Shridhar

-- History tends to exaggerate. -- Col. Green, The Savage Curtain, stardate 5906.4
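For reference, the kind of /etc/fstab entry Shridhar means would look roughly like this (the device name and mount point are invented for the example; data=writeback is ext3-only and takes effect at mount time):

```
/dev/sda3   /var/lib/pgsql   ext3   defaults,noatime,data=writeback   1 2
```

After editing fstab, the noatime option can also be applied to an already-mounted filesystem with a remount (see 'man mount').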