Re: [PERFORM] good pc but bad performance, why?
hello!

Thanks, you are right. I used postmaster -o "-F" to start my PG, and performance improved greatly.

Best regards,
huang yaqin

> huang yaqin <[EMAIL PROTECTED]> writes:
>> I have some questions about using PostgreSQL 7.4.1 on Red Hat Advanced Server 2.1.
>> I use an IBM x335 as the server; it has 4 CPUs and 1G of RAM, but I get very bad
>> performance: I can only do about 50 inserts per second. That is even worse than
>> my PC (PIII 800, 256M RAM). Can anyone give me some advice?
>
> If the cheap machine appears to be able to commit more transactions
> per second than the better one, it's very likely because the cheap
> machine has a disk that lies about write completion. Is the server
> using SCSI disks by any chance? To a first approximation, IDE drives
> lie by default, SCSI drives don't.
>
> regards, tom lane

Regards!

huang yaqin
[EMAIL PROTECTED]
2004-04-07
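(For reference: the -F switch used above disables fsync. A minimal sketch of
the equivalent postgresql.conf setting in 7.4 -- with the caveat that running
with fsync off risks database corruption if the machine crashes or loses
power mid-write:

    # postgresql.conf
    fsync = false    # skip the synchronous WAL flush at commit; fast but unsafe

A battery-backed write cache on the controller gives much of the same speedup
without the crash-safety risk.)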
Re: [PERFORM] possible improvement between G4 and G5
Aaron,

> I do consulting, so they're all over the place and tend to be complex. Very
> few fit in RAM, but still are very buffered. These are almost all backed
> with very high end I/O subsystems, with dozens of spindles with battery
> backed up writethrough cache and gigs of buffers, which may be why I worry
> so much about CPU. I have had this issue with multiple servers.

Aha, I think this is the difference. I never seem to be able to get my clients to fork out for adequate disk support. They are always running off single or double SCSI RAID in the host server; not the sort of setup you have.

> What my CPU tends to be doing is a combination of general processing,
> complex SQL processing: nested loops and sorting and hashing and triggers
> and SPs.

I haven't noticed SPs to be particularly CPU-hoggish, more RAM.

> I'm curious about you having flat CPU, which is not my experience. Are your
> apps mature and stable?

Well, "flat" was a bit of an exaggeration ... there are spikes ... but average CPU load is < 30%. I think the difference is that your clients listen to you about disk access. Mine are all too liable to purchase a quad-Xeon machine but with an Adaptec RAID-5 card with 4 drives, and *then* call me and ask for advice. As a result, most intensive operations don't tend to swamp the CPU because they are waiting for disk.

I have noticed the limitations on RAM for 64 vs. 32, as I find it easier to convince a client to get 8GB RAM than four-channel RAID with 12 drives, mostly because the former is cheaper. Linux 2.4 + Bigmem just doesn't cut it for making effective use of > 3GB of RAM.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] possible improvement between G4 and G5
----- Original Message -----
From: "Josh Berkus" <[EMAIL PROTECTED]>
To: "Aaron Werman" <[EMAIL PROTECTED]>; "Qing Zhao" <[EMAIL PROTECTED]>; "Tom Lane" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 06, 2004 2:52 PM
Subject: Re: [PERFORM] possible improvement between G4 and G5

> Aaron,
>
> > I'm surprised by this thought. I tend to hit CPU bottlenecks more often
> > than I/O ones. In most applications, db I/O is a combination of buffer
> > misses and logging, which are both reasonably constrained.
>
> Not my experience at all. In fact, the only times I've seen modern
> platforms max out the CPU was when:
> a) I had bad queries with bad plans, or
> b) I had reporting queries that did a lot of calculation for display
> (think OLAP).
>
> Otherwise, on the numerous servers I administrate, RAM spikes, and I/O
> bottlenecks, but the CPU stays almost flat.
>
> Of course, most of my apps are large databases (i.e. too big for RAM)
> with a heavy transaction-processing component.
>
> What kind of applications are you running?

I do consulting, so they're all over the place and tend to be complex. Very few fit in RAM, but still are very buffered. These are almost all backed with very high end I/O subsystems, with dozens of spindles with battery backed up writethrough cache and gigs of buffers, which may be why I worry so much about CPU. I have had this issue with multiple servers.

Consider an analysis db with 10G of data. Of that, 98% of the access is read and only 2% write (that is normal for almost anything that is not order entry, even transaction processing with thorough cross validation). Almost all the queries access 10%, or 1G, of the data. The reads traverse b-trees averaging ~3 levels, with the first 2 levels certainly cached and the last one often cached. Virtually all the I/O activity is logical reads against buffer.

A system with 100 transactions which on average access 200 rows does 98% of 200 rows x 100 transactions x 3 logical I/Os per read = 58,800 logical reads, of which maybe a hundred are actually physical reads. It also does 2% of 200 rows x 100 transactions x (1 table logical I/O and say 2 index logical writes) per write = 1,200 logical writes to log, of which there are 100 transaction commit synch writes, and in reality fewer than that because of queuing against logs (there are also 1,200 logical writes deferred to checkpoint, of which there are likely to be only 40 physical writes because of page overlaps).

Transaction processing is a spectrum between activity logging and database-centric design. The former, where actions are stored in the database, is totally I/O bound, with the engine acting as a thin layer of logical-to-physical mapping. Database-centric processing makes the engine a functional server of discrete actions - and is a big CPU hog.

What my CPU tends to be doing is a combination of general processing and complex SQL processing: nested loops and sorting and hashing and triggers and SPs. I'm curious about you having flat CPU, which is not my experience. Are your apps mature and stable?

/Aaron
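(The arithmetic in that example can be verified directly; a quick sanity
check in psql:

    -- 200 rows x 100 transactions x 3 logical I/Os each,
    -- split 98% reads / 2% writes
    SELECT 0.98 * (200 * 100 * 3) AS logical_reads,
           0.02 * (200 * 100 * 3) AS logical_writes;
    -- logical_reads = 58800, logical_writes = 1200

matching the 58,800 and 1,200 figures above.)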
[PERFORM] plan problem
I am trying to find an efficient way to draw a random sample from a complex query. I also want it to be easy to use within my application, so I've defined a view that encapsulates the query. The id from the "driving" table is exposed, and I run a query like:

    select * from stats_record_view
    where id in (select id from driver_stats
                 order by random()
                 limit 3);

driver_stats.id is unique; it is the primary key. The problem I'm having is that neither the ORDER BY nor the LIMIT changes the uniqueness of that column, but the planner doesn't know that. It does a HashAggregate to make sure the results are unique. It thinks that 200 rows will come out of that operation, and 200 rows is small enough that it thinks a Nested Loop is the best way to proceed from there.

I can post more of the query plan, but I don't think it would be very helpful. I'm considering just making a sample table and creating an analogous view around that. I'd like to keep this as simple as possible, though.

Ken
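(One possible workaround, sketched on the same view and table names as above;
whether it actually produces a better plan depends on the statistics, so
treat it as a hypothesis to check with EXPLAIN ANALYZE rather than a known
fix:

    -- join against the sample as a subselect in FROM, so the planner
    -- does not need a HashAggregate to de-duplicate an IN list
    SELECT v.*
    FROM stats_record_view v,
         (SELECT id FROM driver_stats ORDER BY random() LIMIT 3) AS sample
    WHERE v.id = sample.id;

Since driver_stats.id is the primary key, the subselect can never produce
duplicates, so the join returns the same rows as the IN form.)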
Re: [PERFORM] good pc but bad performance, why?
huang yaqin wrote:

> hello,
>
> I have some questions about using PostgreSQL 7.4.1 on Red Hat Advanced Server 2.1.
> I use an IBM x335 as the server; it has 4 CPUs and 1G of RAM, but I get very bad
> performance.

This is most likely a dual-processor Xeon machine with HT, because the x335 is limited to two physical CPUs.

> I can only do about 50 inserts per second. That is even worse than my PC
> (PIII 800, 256M RAM). Can anyone give me some advice?

Any chance you are using the onboard MPT-Fusion "RAID" controller with a RAID 1? We have seen absolutely horrible performance from these controllers here. Using them as a normal SCSI controller with software RAID on top fixed this for us ...

stefan
Re: [PERFORM] possible improvement between G4 and G5
Aaron,

> I'm surprised by this thought. I tend to hit CPU bottlenecks more often
> than I/O ones. In most applications, db I/O is a combination of buffer
> misses and logging, which are both reasonably constrained.

Not my experience at all. In fact, the only times I've seen modern platforms max out the CPU was when:

a) I had bad queries with bad plans, or
b) I had reporting queries that did a lot of calculation for display (think OLAP).

Otherwise, on the numerous servers I administrate, RAM spikes, and I/O bottlenecks, but the CPU stays almost flat.

Of course, most of my apps are large databases (i.e. too big for RAM) with a heavy transaction-processing component.

What kind of applications are you running?

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] possible improvement between G4 and G5
----- Original Message -----
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "Qing Zhao" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 06, 2004 1:47 AM
Subject: Re: [PERFORM] possible improvement between G4 and G5

> Qing Zhao <[EMAIL PROTECTED]> writes:
> > We have got a G5 64-bit processor to replace an old G4 32-bit
> > processor. Given everything else equal, should we see a big
> > improvement on PG's performance?
>
> Nope. Database performance typically depends on disk performance first,
> and RAM size second.

I'm surprised by this thought. I tend to hit CPU bottlenecks more often than I/O ones. In most applications, db I/O is a combination of buffer misses and logging, which are both reasonably constrained. RAM size seems to me to be the best way to improve performance, then CPU, which is needed to perform the in-memory searching, locking, versioning, and processing, and finally I/O (this is not the case in small I/O subsystems - if you have fewer than a dozen drives, you're easily I/O bound). I/O is often the thing I tune first, because I can do it in place without buying hardware.

Conceptually, an RDBMS converts slow random I/O into in-memory processing and sequential logging writes. If successful, it should reduce the I/O overhead.

/Aaron
Re: [PERFORM] good pc but bad performance, why?
huang yaqin <[EMAIL PROTECTED]> writes:
> I have some questions about using PostgreSQL 7.4.1 on Red Hat Advanced Server 2.1.
> I use an IBM x335 as the server; it has 4 CPUs and 1G of RAM, but I get very bad
> performance: I can only do about 50 inserts per second. That is even worse than
> my PC (PIII 800, 256M RAM). Can anyone give me some advice?

If the cheap machine appears to be able to commit more transactions per second than the better one, it's very likely because the cheap machine has a disk that lies about write completion. Is the server using SCSI disks by any chance? To a first approximation, IDE drives lie by default, SCSI drives don't.

regards, tom lane
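(A rough way to test for a lying disk is to time single-row commits, since
each one should have to wait for a physical write. A sketch -- the table and
database names here are hypothetical:

    -- one-time setup
    CREATE TABLE commit_test (id integer);

    $ time for i in `seq 1 100`; do
        psql -d test -c "INSERT INTO commit_test VALUES ($i);"
      done

Each psql invocation commits its own transaction. A drive that honestly waits
for the platter can sustain at most about one commit per revolution -- around
120/sec on a 7200 rpm disk -- while a drive with write caching enabled will
report rates far above that.)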
Re: [PERFORM] possible improvement between G4 and G5
On Tue, Apr 06, 2004 at 01:47:22AM -0400, Tom Lane wrote:
> Qing Zhao <[EMAIL PROTECTED]> writes:
> > We have got a G5 64-bit processor to replace an old G4 32-bit
> > processor. Given everything else equal, should we see a big
> > improvement on PG's performance?
>
> Nope. Database performance typically depends on disk performance first,
> and RAM size second. A 64-bit processor might help by allowing you to
> install more RAM, but you didn't say that you had.

Memory bandwidth is a consideration too, so you might see some performance improvements on a G5. We recently debated between Xeons and Opterons in a new PGSQL server and a little poking around on the lists indicated that the Opterons did perform better, presumably due to the increased memory bandwidth.

Incidentally, this is why you need about 2x the CPUs on Sun hardware vs RS6000 hardware for database stuff (and that gets expensive if you're paying per CPU!).

--
Jim C. Nasby, Database Consultant
[EMAIL PROTECTED]
Member: Triangle Fraternity, Sports Car Club of America

Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
[PERFORM] Weird issues
This is what I got: two servers, one Debian, one Fedora.

Debian: dual 3GHz, 1 gig RAM, IDE, PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.4

Fedora: dual 3GHz, 1 gig RAM, SCSI, PostgreSQL 7.3.4-RH on i386-redhat-linux-gnu, compiled by GCC i386-redhat-linux-gcc (GCC) 3.3.2 20031022 (Red Hat Linux 3.3.2-1)

Both have the same databases, both have had VACUUM FULL run on them, and both are doing the same query:

    Select * from vpopmail;

vpopmail is a view, defined as follows:

                 View "vpopmail"
      Column   |          Type          | Modifiers
    -----------+------------------------+-----------
     pw_name   | character varying(32)  |
     pw_domain | character varying(64)  |
     pw_passwd | character varying      |
     pw_uid    | integer                |
     pw_gid    | integer                |
     pw_gecos  | character varying      |
     pw_dir    | character varying(160) |
     pw_shell  | character varying(20)  |

    View definition:
    SELECT ea.email_name AS pw_name,
           ea.domain AS pw_domain,
           get_pwd(u.username, '127.0.0.1'::"varchar", '101'::"varchar", 'MD5'::"varchar") AS pw_passwd,
           0 AS pw_uid,
           0 AS pw_gid,
           ''::"varchar" AS pw_gecos,
           ei.directory AS pw_dir,
           ei.quota AS pw_shell
    FROM email_addresses ea, email_info ei, users u, user_resources ur
    WHERE (ea.user_resource_id = ei.user_resource_id)
      AND (get_pwd(u.username, '127.0.0.1'::"varchar", '101'::"varchar", 'MD5'::"varchar") IS NOT NULL)
      AND (ur.id = ei.user_resource_id)
      AND (u.id = ur.user_id)
      AND (NOT (EXISTS (SELECT forwarding.email_id FROM forwarding
                        WHERE forwarding.email_id = ea.id)));

Both are set to the same buffers and everything. The execution times:

    Debian: Total runtime: 35594.81 msec
    Fedora: Total runtime: 2279869.08 msec

Huge difference, as you can see. Here are the plans.

Debian:

    user_acl=# explain analyze SELECT count(*) from vpopmail;
    NOTICE: QUERY PLAN:

    Aggregate (cost=438231.94..438231.94 rows=1 width=20) (actual time=35594.67..35594.67 rows=1 loops=1)
      -> Hash Join (cost=434592.51..438142.51 rows=35774 width=20) (actual time=34319.24..35537.11 rows=70613 loops=1)
         -> Seq Scan on email_info ei (cost=0.00..1721.40 rows=71640 width=4) (actual time=0.04..95.13 rows=71689 loops=1)
         -> Hash (cost=434328.07..434328.07 rows=35776 width=16) (actual time=34319.00..34319.00 rows=0 loops=1)
            -> Hash Join (cost=430582.53..434328.07 rows=35776 width=16) (actual time=2372.45..34207.21 rows=70613 loops=1)
               -> Seq Scan on users u (cost=0.00..1938.51 rows=71283 width=4) (actual time=0.81..30119.58 rows=70809 loops=1)
               -> Hash (cost=430333.64..430333.64 rows=35956 width=12) (actual time=2371.51..2371.51 rows=0 loops=1)
                  -> Hash Join (cost=2425.62..430333.64 rows=35956 width=12) (actual time=176.73..2271.14 rows=71470 loops=1)
                     -> Seq Scan on email_addresses ea (cost=0.00..426393.25 rows=35956 width=4) (actual time=0.06..627.49 rows=71473 loops=1)
                        SubPlan
                          -> Index Scan using forwarding_idx on forwarding (cost=0.00..5.88 rows=1 width=4) (actual time=0.00..0.00 rows=0 loops=71960)
                     -> Hash (cost=1148.37..1148.37 rows=71637 width=8) (actual time=176.38..176.38 rows=0 loops=1)
                        -> Seq Scan on user_resources ur (cost=0.00..1148.37 rows=71637 width=8) (actual time=0.03..82.21 rows=71686 loops=1)
    Total runtime: 35594.81 msec
    EXPLAIN

And for Fedora:

    Aggregate (cost=416775.52..416775.52 rows=1 width=20) (actual time=2279868.57..2279868.58 rows=1 loops=1)
      -> Hash Join (cost=413853.79..416686.09 rows=35772 width=20) (actual time=2279271.26..2279803.91 rows=70841 loops=1)
         Hash Cond: ("outer".user_resource_id = "inner".id)
         -> Seq Scan on email_info ei (cost=0.00..1666.07 rows=71907 width=4) (actual time=8.12..171.10 rows=71907 loops=1)
         -> Hash (cost=413764.36..413764.36 rows=35772 width=16) (actual time=2279263.03..2279263.03 rows=0 loops=1)
            -> Hash Join (cost=410712.87..413764.36 rows=35772 width=16) (actual time=993.90..2279008.72 rows=70841 loops=1)
               Hash Cond: ("outer".id = "inner".user_id)
               -> Seq Scan on users u (cost=0.00..1888.85 rows=71548 width=4) (actual time=18.38..2277152.51 rows=71028 loops=1)
                  Filter: (get_pwd(username, '127.0.0.1'::character varying, '101'::character varying, 'MD5'::character varying) IS NOT NULL)
               -> Hash (cost=410622.99..410622.99 rows=35952 width=12) (actual time=975.40..975.40 rows=0 loops=1)
                  -> Hash Join (cost=408346.51..410622.99 rows=35952 width=12) (actual time=507.52..905.91 rows=71697 loops=1)
                     Hash Cond: ("outer"
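(The post is cut off in the archive at that point, but note where the Fedora
time goes: the seq scan on users accounts for ~2,277 of the 2,280 seconds,
and its filter calls get_pwd() once per row. A quick way to test whether the
function itself is the problem -- a sketch using only names that appear in
the post above:

    EXPLAIN ANALYZE
    SELECT get_pwd(username, '127.0.0.1'::varchar, '101'::varchar, 'MD5'::varchar)
    FROM users;

If that alone accounts for the Fedora runtime, the difference between the two
boxes lies in whatever get_pwd() does, not in the join plan.)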
Re: [PERFORM] performance comparison postgresql/ms-sql server
Heiko Kehlenbrink wrote:

> i want to convince people to use postgresql instead of ms-sql server, so i
> set up a kind of comparison: insert data / select data from postgresql /
> ms-sql server
[...]
> do you have any hints like compiler flags and so on to get the answering
> time from postgresql equal to ms-sql?
> (btw, both dbms were running on exactly the same hardware)
>
> i use suse 8.1
> postgresql 7.2 compiled from the rpms, for using postgis, but that is
> another story...
> 1.5 gig ram
> 1.8 ghz intel cpu
>
> every help welcome

SuSE 8.1 comes with a 2.4 series kernel, I suppose. Many have witnessed a speed increase when using a 2.6 series kernel, so you might consider this too, besides the newer PostgreSQL version already suggested. 2.6 has some scheduling options that are not enabled by default but may enhance database performance (http://story.news.yahoo.com/news?tmpl=story&cid=75&e=2&u=/nf/20040405/tc_nf/23603).

Kaarel
Re: [PERFORM] performance comparison postgresql/ms-sql server
Heiko Kehlenbrink wrote:

>> Hmm... I would suggest if you are testing, you should try 7.4.2. 7.4 has
>> some good optimisation for hash aggregates, though I am not sure if it
>> applies to averaging.
>
> would be the last option till we are running other applications on that
> 7.2 system

I can understand...

>> Also try forcing a seq. scan by turning off index scan. I guess an index
>> scan for so many rows is not exactly a good thing, even if the tuple size
>> is pretty small.
>
> a sequential scan gives me the following result:
>
> [EMAIL PROTECTED]:~> time psql -d test -c 'explain analyse select avg(dist)
> from massive2 where dist > 100*sqrt(3.0)::float8 and dist <
> 150*sqrt(3.0)::float8 ;'
> NOTICE: QUERY PLAN:
>
> Aggregate (cost=1193714.43..1193714.43 rows=1 width=8) (actual
> time=166718.54..166718.54 rows=1 loops=1)
>   -> Seq Scan on massive2 (cost=0.00..1192478.00 rows=494573 width=8)
>      (actual time=3233.22..165576.40 rows=49 loops=1)
>
> Total runtime: 166733.73 msec

Certainly bad and not an option. I can't think of anything offhand to speed this up.

Shridhar
[PERFORM] Back to Linux 2.6 kernel thoughts...
I seem to remember discussion of the anticipatory vs. deadline scheduler in 2.6. Here is what Andrew Morton (I think) says:

"The deadline scheduler has two additional scheduling queues that were not available to the 2.4 IO scheduler. The two new queues are a FIFO read queue and a FIFO write queue. This new multi-queue method allows for greater interactivity by giving the read requests a better deadline than write requests, thus ensuring that applications rarely will be delayed by read requests. Deadline scheduling is best suited for database servers and high disk performance systems. Morton has experienced up to 15 percent increases on database loads while using deadline scheduling."

http://story.news.yahoo.com/news?tmpl=story&cid=75&e=2&u=/nf/20040405/tc_nf/23603

Nothing very in-depth in the story.

--
Richard Huxton
Archonet Ltd
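(Not from the article, but for anyone who wants to try it: on 2.6 kernels the
I/O scheduler can be selected at boot with the elevator= parameter. A sketch
of a GRUB kernel line -- the kernel version and root device here are
placeholders for your own:

    kernel /vmlinuz-2.6.5 root=/dev/sda1 elevator=deadline

After rebooting, a database workload can then be benchmarked against the
default anticipatory scheduler to see whether the claimed gain materialises.)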
Re: [PERFORM] good pc but bad performance, why?
huang yaqin wrote:
> hello,
>
> I have some questions about using PostgreSQL 7.4.1 on Red Hat Advanced Server 2.1.
> I use an IBM x335 as the server; it has 4 CPUs and 1G of RAM, but I get very bad
> performance: I can only do about 50 inserts per second. That is even worse than
> my PC (PIII 800, 256M RAM). Can anyone give me some advice?

Have you referenced this document?
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
Re: [PERFORM] good pc but bad performance, why?
On Tuesday 06 April 2004 09:01, huang yaqin wrote:
> hello,
>
> I have some questions about using PostgreSQL 7.4.1 on Red Hat Advanced Server 2.1.
> I use an IBM x335 as the server; it has 4 CPUs and 1G of RAM, but I get very bad
> performance: I can only do about 50 inserts per second. That is even worse than
> my PC (PIII 800, 256M RAM). Can anyone give me some advice?

How have you tuned your postgresql.conf file? What disk systems do you have? What does vmstat/iostat show as the bottleneck in the system?

--
Richard Huxton
Archonet Ltd
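(For anyone unfamiliar with the tools mentioned, typical invocations look
like this -- a sketch only; exact column names vary by platform, and iostat
with -x needs the sysstat package on Linux:

    $ vmstat 1      # one-second samples; watch bi/bo (blocks in/out) and CPU idle
    $ iostat -x 1   # extended per-device stats; sustained high utilisation points at the disk

If the CPU sits mostly idle while the disk stays busy during the insert run,
the bottleneck is I/O, which fits the 50-inserts-per-second symptom.)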
[PERFORM] good pc but bad performance, why?
hello,

I have some questions about using PostgreSQL 7.4.1 on Red Hat Advanced Server 2.1. I use an IBM x335 as the server; it has 4 CPUs and 1G of RAM, but I get very bad performance: I can only do about 50 inserts per second. That is even worse than my PC (PIII 800, 256M RAM). Can anyone give me some advice?

huang yaqin
[EMAIL PROTECTED]
2004-04-06
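(About 50 inserts per second is the classic signature of one commit per disk
revolution. If each insert does not need its own transaction, batching them
is the usual first thing to try -- a minimal sketch with a hypothetical table
name:

    BEGIN;
    INSERT INTO mytable VALUES (1, 'first row');
    INSERT INTO mytable VALUES (2, 'second row');
    -- ... many more inserts ...
    COMMIT;

All the rows then share a single synchronous WAL flush at COMMIT instead of
paying for one per row.)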
Re: [PERFORM] performance comparison postgresql/ms-sql server
Heiko Kehlenbrink wrote:

> [EMAIL PROTECTED]:~> psql -d test -c 'explain analyse select avg(dist) from
> massive2 where dist > (100*sqrt(3.0))::float8 and dist <
> (150*sqrt(3.0))::float8;'
> NOTICE: QUERY PLAN:
>
> Aggregate (cost=14884.61..14884.61 rows=1 width=8) (actual
> time=3133.24..3133.24 rows=1 loops=1)
>   -> Index Scan using massive2_dist on massive2 (cost=0.00..13648.17
>      rows=494573 width=8) (actual time=0.11..2061.38 rows=49 loops=1)
>
> Total runtime: 3133.79 msec
> EXPLAIN
>
> seems to me that most time was needed for the index scanning...

Hmm... I would suggest if you are testing, you should try 7.4.2. 7.4 has some good optimisation for hash aggregates, though I am not sure if it applies to averaging.

Also try forcing a seq. scan by turning off index scan. I guess an index scan for so many rows is not exactly a good thing, even if the tuple size is pretty small.

Shridhar
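(The planner override Shridhar suggests can be done per session; a sketch
against the query from the post, using the enable_indexscan run-time setting
available in 7.2:

    SET enable_indexscan TO off;
    EXPLAIN ANALYZE
      SELECT avg(dist) FROM massive2
      WHERE dist > (100*sqrt(3.0))::float8
        AND dist < (150*sqrt(3.0))::float8;
    SET enable_indexscan TO on;

Note this only biases the planner away from index scans by inflating their
estimated cost; it does not forbid them outright.)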