Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.
Matthew Schumacher [EMAIL PROTECTED] writes:
> After playing with various indexes and what not I simply am unable to
> make this procedure perform any better. Perhaps someone on the list can
> spot the bottleneck and reveal why this procedure isn't performing that
> well or ways to make it better.

There's not anything obviously wrong with that procedure --- all of the updates are on primary keys, so one would expect reasonably efficient query plans to get chosen. Perhaps it'd be worth the trouble to build the server with profiling enabled and get a gprof trace to see where the time is going.

regards, tom lane
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote:
> I put the rest of the schema up at
> http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone
> needs to see it too.

Do you have sample data too?

--
Karim Nassar
Collaborative Computing Lab of NAU
Office: (928) 523 5868 -=- Mobile: (928) 699 9221
http://ccl.cens.nau.edu/~kan4
[PERFORM] Finding bottleneck
Hello,

we recently upgraded our dual Xeon Dell to a brand new Sun v40z with 4 opterons, 16GB of memory and MegaRAID with enough disks. OS is Debian Sarge amd64, PostgreSQL is 8.0.3. Size of database is something like 80GB and our website performs about 600 selects and several updates/inserts a second.

The v40z performs somewhat better than our old Dell, but mostly due to the increased amount of memory. The problem is... there seems to be plenty of free CPU available and almost no IO-wait, but CPU-bound queries seem to linger for some reason. The problem appears very clearly during checkpointing: queries accumulate, and when checkpointing is over there can be something like 400 queries running, but over 50% of CPU is just idling.

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff    cache   si   so   bi    bo    in    cs us sy id wa
 3  1      0 494008 159492 14107180    0    0  919  3164  3176 13031 29 12 52  8
 5  3      0 477508 159508 14118452    0    0 1071  4479  3474 13298 27 13 47 13
 0  0      0 463604 159532 14128832    0    0  922  2903  3352 12627 29 11 52  8
 3  1      0 442260 159616 14141668    0    0 1208  3153  3357 13163 28 12 52  9

An example of a lingering query (there's usually several of these or similar):

SELECT u.uid, u.nick, u.name, u.showname, i.status, i.stamp, i.image_id,
       i.info, i.t_width, i.t_height
  FROM users u
       INNER JOIN image i ON i.uid = u.uid
       INNER JOIN user_online uo ON u.uid = uo.uid
 WHERE u.city_id = 5 AND i.status = 'd' AND u.status = 'a'
 ORDER BY city_id, upper(u.nick)
 LIMIT (40 + 1) OFFSET 320

Tables involved contain no more than 4 million rows. They are constantly accessed and should fit nicely in cache, but the database is just slow for some unknown reason. Any ideas?

--8<-- Relevant rows from postgresql.conf --8<--

shared_buffers = 15000          # min 16, at least max_connections*2, 8KB each
work_mem = 1536                 # min 64, size in KB
maintenance_work_mem = 32768    # min 1024, size in KB
max_fsm_pages = 1000000         # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 5000        # min 100, ~50 bytes each
vacuum_cost_delay = 15          # 0-1000 milliseconds
vacuum_cost_limit = 120         # 0-10000 credits
bgwriter_percent = 2            # 0-100% of dirty buffers in each round
fsync = true                    # turns forced synchronization on or off
                                # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 128               # min 4, 8KB each
commit_delay = 80000            # range 0-100000, in microseconds
commit_siblings = 10            # range 1-1000
checkpoint_segments = 200       # in logfile segments, min 1, 16MB each
checkpoint_timeout = 1800       # range 30-3600, in seconds
effective_cache_size = 1000000  # typically 8KB each
random_page_cost = 1.8          # units are one sequential page fetch cost
default_statistics_target = 150 # range 1-1000
stats_start_collector = true
stats_command_string = true

    |\__/|
    ( oo )    Kari Lavikka - [EMAIL PROTECTED] - (050) 380 3808
__ooO(  )Ooo_____________________________________________________
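[A side note on the lingering query above: since it filters on city_id and sorts on upper(u.nick), an expression index matching that ORDER BY can sometimes let the planner satisfy ORDER BY ... LIMIT without a full sort. A minimal sketch, assuming the users table as shown -- whether it actually helps here depends on the joins, so treat it as an experiment rather than a fix:]

    -- sketch only: an index matching the WHERE + ORDER BY of the query above
    CREATE INDEX users_city_uppernick_idx ON users (city_id, upper(nick));
    ANALYZE users;
    -- then re-check the plan:
    EXPLAIN ANALYZE
    SELECT u.uid, u.nick
      FROM users u
     WHERE u.city_id = 5 AND u.status = 'a'
     ORDER BY city_id, upper(u.nick)
     LIMIT (40 + 1) OFFSET 320;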
Re: [PERFORM] Finding bottleneck
Hi,

On Thu, 28 Jul 2005, Kari Lavikka wrote:

> --8<-- Relevant rows from postgresql.conf --8<--
>
> shared_buffers = 15000          # min 16, at least max_connections*2, 8KB each
> work_mem = 1536                 # min 64, size in KB

As an aside, I'd increase work_mem -- but it doesn't sound like that is your problem.

> wal_buffers = 128               # min 4, 8KB each

Some benchmarking results out today suggest that wal_buffers = 1024 or even 2048 could greatly assist you.

> commit_delay = 80000            # range 0-100000, in microseconds
> commit_siblings = 10            # range 1-1000

This may explain the fact that you've got backed up queries and idle CPU -- I'm not certain though. What does disabling commit_delay do to your situation?

Gavin
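[For what it's worth, commit_delay was an ordinary run-time parameter in this era of PostgreSQL, so Gavin's experiment should be possible per session without touching postgresql.conf (worth verifying on 8.0.3); wal_buffers, by contrast, is fixed at server start. A minimal sketch:]

    -- try a session with group commit disabled, then re-run the workload
    SET commit_delay = 0;
    SHOW commit_delay;
    -- wal_buffers cannot be changed with SET; edit postgresql.conf
    -- (e.g. wal_buffers = 1024) and restart to test the suggestion above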
Re: [PERFORM] Left joining against two empty tables makes a query SLOW
On 7/28/05, Chris Travers [EMAIL PROTECTED] wrote:

> Hi all;
>
> I have a customer who currently uses an application which had become
> slow. After doing some digging, I found the slow query:
>
> SELECT c.accno, c.description, c.link, c.category, ac.project_id,
>        p.projectnumber, a.department_id, d.description AS department
>   FROM chart c
>   JOIN acc_trans ac ON (ac.chart_id = c.id)
>   JOIN ar a ON (a.id = ac.trans_id)
>   LEFT JOIN project p ON (ac.project_id = p.id)
>   LEFT JOIN department d ON (d.id = a.department_id)
>  WHERE a.customer_id = 11373
>    AND a.id IN (SELECT max(id) FROM ar WHERE customer_id = 11373);
>
> (reformatted for readability)
>
> This is taking 10 seconds to run. Interestingly, both the project and
> department tables are blank, and if I omit them, the query becomes:
>
> SELECT c.accno, c.description, c.link, c.category, ac.project_id
>   FROM chart c
>   JOIN acc_trans ac ON (ac.chart_id = c.id)
>   JOIN ar a ON (a.id = ac.trans_id)
>  WHERE a.customer_id = 11373
>    AND a.id IN (SELECT max(id) FROM ar WHERE customer_id = 11373);
>
> This takes 139ms -- 1% of the previous query.
>
> The plan for the long query is:
>
> QUERY PLAN
> Hash IN Join (cost=87337.25..106344.93 rows=41 width=118) (actual time=7615.843..9850.209 rows=10 loops=1)
>   Hash Cond: ("outer".trans_id = "inner".max)
>   -> Merge Right Join (cost=86620.57..100889.85 rows=947598 width=126) (actual time=7408.830..9200.435 rows=177769 loops=1)
>        Merge Cond: ("outer".id = "inner".department_id)
>        -> Index Scan using department_id_key on department d (cost=0.00..52.66 rows=1060 width=36) (actual time=0.090..0.090 rows=0 loops=1)

vacuum & reindex the department and project table, as the planner expects there are 1060 rows but it actually returns nothing.

>        -> Sort (cost=86620.57..87067.55 rows=178792 width=94) (actual time=7408.709..7925.843 rows=177769 loops=1)
>             Sort Key: a.department_id
>             -> Merge Right Join (cost=45871.18..46952.83 rows=178792 width=94) (actual time=4962.122..6671.319 rows=177769 loops=1)
>                  Merge Cond: ("outer".id = "inner".project_id)
>                  -> Index Scan using project_id_key on project p (cost=0.00..49.80 rows=800 width=36) (actual time=0.007..0.007 rows=0 loops=1)
>                  -> Sort (cost=45871.18..46318.16 rows=178792 width=62) (actual time=4962.084..5475.636 rows=177769 loops=1)
>                       Sort Key: ac.project_id
>                       -> Hash Join (cost=821.20..13193.43 rows=178792 width=62) (actual time=174.905..4295.685 rows=177769 loops=1)
>                            Hash Cond: ("outer".chart_id = "inner".id)
>                            -> Hash Join (cost=817.66..10508.02 rows=178791 width=20) (actual time=173.952..2840.824 rows=177769 loops=1)
>                                 Hash Cond: ("outer".trans_id = "inner".id)
>                                 -> Seq Scan on acc_trans ac (cost=0.00..3304.38 rows=181538 width=12) (actual time=0.062..537.753 rows=181322 loops=1)
>                                 -> Hash (cost=659.55..659.55 rows=22844 width=8) (actual time=173.625..173.625 rows=0 loops=1)
>                                      -> Seq Scan on ar a (cost=0.00..659.55 rows=22844 width=8) (actual time=0.022..101.828 rows=22844 loops=1)
>                                           Filter: (customer_id = 11373)
>                            -> Hash (cost=3.23..3.23 rows=123 width=50) (actual time=0.915..0.915 rows=0 loops=1)
>                                 -> Seq Scan on chart c (cost=0.00..3.23 rows=123 width=50) (actual time=0.013..0.528 rows=123 loops=1)
>   -> Hash (cost=716.67..716.67 rows=1 width=4) (actual time=129.037..129.037 rows=0 loops=1)
>        -> Subquery Scan IN_subquery (cost=716.66..716.67 rows=1 width=4) (actual time=129.017..129.025 rows=1 loops=1)
>             -> Aggregate (cost=716.66..716.66 rows=1 width=4) (actual time=129.008..129.011 rows=1 loops=1)
>                  -> Seq Scan on ar (cost=0.00..659.55 rows=22844 width=4) (actual time=0.020..73.266 rows=22844 loops=1)
>                       Filter: (customer_id = 11373)
> Total runtime: 9954.133 ms
> (28 rows)
>
> The shorter query's plan is:
>
> QUERY PLAN
> Hash Join (cost=728.42..732.96 rows=8 width=50) (actual time=130.908..131.593 rows=10 loops=1)
>   Hash Cond: ("outer".id = "inner".chart_id)
>   -> Seq Scan on chart c (cost=0.00..3.23 rows=123 width=50) (actual time=0.006..0.361 rows=123 loops=1)
>   -> Hash (cost=728.40..728.40 rows=8 width=8) (actual time=130.841..130.841 rows=0 loops=1)
>        -> Nested Loop (cost=716.67..728.40 rows=8 width=8) (actual time=130.692..130.805 rows=10 loops=1)
>             -> Nested Loop (cost=716.67..720.89 rows=1 width=8) (actual time=130.626..130.639 rows=1 loops=1)
>                  -> HashAggregate (cost=716.67..716.67 rows=1 width=4) (actual time=130.484..130.487 rows=1 loops=1)
>                       -> Subquery Scan IN_subquery (cost=716.66..716.67 rows=1 width=4) (actual time=130.455..130.464 rows=1 loops=1)
>                            -> Aggregate (cost=716.66..716.66 rows=1 width=4) (actual time=130.445..130.448 rows=1 loops=1)
>                                 -> Seq Scan on ar (cost=0.00..659.55 rows=22844 width=4) (actual time=0.020..74.174 rows=22844 loops=1)
>                                      Filter: (customer_id = 11373)
>                  -> Index Scan using ar_id_key on ar a (cost=0.00..4.20 rows=1 width=4) (actual time=0.122..0.125 rows=1 loops=1)
>                       Index Cond: (a.id = "outer".max)
>                       Filter: (customer_id = 11373)
>             -> Index Scan using acc_trans_trans_id_key on acc_trans ac (cost=0.00..7.41 rows=8 width=12) (actual
Re: [PERFORM] Finding bottleneck
> effective_cache_size = 1000000  # typically 8KB each

I have this setting on postgresql 7.4.8 on FreeBSD with 4 GB RAM:

effective_cache_size = 27462

So even though your machine runs Debian and you have four times as much RAM as mine, your effective_cache_size is 36 times larger. You could try lowering this setting.

regards
Claus
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.
> I'm not sure how much this has been discussed on the list, but wasn't
> able to find anything relevant in the archives. The new Spamassassin is
> due out pretty soon. They are currently testing 3.1.0pre4. One of the
> things I hope to get out of this release is bayes word stats moved to a
> real RDBMS. They have separated the mysql BayesStore module from the
> PgSQL one so now postgres can use its own queries. I loaded all of this
> stuff up on a test server and am finding that the bayes put performance
> is really not good enough for any real amount of mail load. The
> performance problem seems to be when the bayes module is
> inserting/updating. This is now handled by the token_put procedure.

1. You need high performance client side timing (sub 1 millisecond). On win32 use QueryPerformanceCounter.

2. One by one, convert queries inside your routine into dynamic versions. That is, use execute 'query string'.

3. Identify the problem. Something somewhere is not using the index. Because of the way the planner works you have to do this sometimes.

Merlin
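[As an illustration of point 2: inside a plpgsql function, a statement becomes "dynamic" by building the string and EXECUTE-ing it, which forces a fresh plan with the actual values visible instead of a generic parameterized plan. The table/column/variable names below (bayes_token, spam_count, in_token) are only an assumed fragment of the bayes schema under discussion, not SpamAssassin's confirmed code; the doubled quotes are as required inside a pre-8.0 single-quoted function body:]

    -- static form inside the function: planned once, parameters as unknowns
    UPDATE bayes_token
       SET spam_count = spam_count + 1
     WHERE token = in_token;

    -- dynamic form: EXECUTE re-plans with the literal value visible
    EXECUTE ''UPDATE bayes_token SET spam_count = spam_count + 1 ''
         || ''WHERE token = '' || quote_literal(in_token);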
Re: [PERFORM] Finding bottleneck
Kari Lavikka wrote:

> shared_buffers = 15000

You can play around with this one, but in my experience it doesn't make much difference anymore (it used to).

> work_mem = 1536                 # min 64, size in KB

This seems low. Are you sure you are not getting sorts swapped to disk?

> fsync = true                    # turns forced synchronization on or off

Does turning this off make a difference? This would help narrow down where the problem is.

> commit_delay = 80000            # range 0-100000, in microseconds

Hm! How did you arrive at this number? Try setting it to zero and comparing.

> stats_start_collector = true
> stats_command_string = true

With a high query load you may want to consider turning these off. On win32, I've had some problems with the stats collector under high load conditions. Not on unix, but it's something to look at. Just turn off stats for a while and see if it helps.

Good luck! Your hardware should be more than adequate.

Merlin
[PERFORM] Fwd: Help with view performance problem
Does anyone have any suggestions on this? I did not get any response from the admin list.

Thanks,
Chris

-- Forwarded message --
From: Chris Hoover [EMAIL PROTECTED]
Date: Jul 27, 2005 12:29 PM
Subject: Re: Help with view performance problem
To: pgsql-admin@postgresql.org

I did some more testing, and ran the explain analyze on the problem. In my session I did a set enable_hashjoin = false and then ran the analyze. This caused it to use the indexes as I have been expecting it to do.

Now, how can I get it to use the indexes w/o manipulating the environment? What makes postgresql want to sequentially scan and use a hash join?

thanks,
Chris

explain analyze with enable_hashjoin = false:

prob_db=# explain analyze select * from clm_com;

QUERY PLAN
----------------------------------------------------------------------------
Subquery Scan clm_com (cost=1057975.45..1169021.26 rows=126910 width=366) (actual time=142307.99..225997.22 rows=1268649 loops=1)
  -> Unique (cost=1057975.45..1169021.26 rows=126910 width=366) (actual time=142307.96..206082.30 rows=1268649 loops=1)
       -> Sort (cost=1057975.45..1061148.19 rows=1269095 width=366) (actual time=142307.95..156019.01 rows=1268649 loops=1)
            Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr, clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city, clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd, clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix, clmcom1.pat_zip, clmcom1.payto_addr_1, clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry, clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state, clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org, clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd, clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id, clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt, clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1, clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
            -> Merge Join (cost=0.00..565541.46 rows=1269095 width=366) (actual time=464.89..130638.06 rows=1268649 loops=1)
                 Merge Cond: ("outer".inv_nbr = "inner".inv_nbr)
                 Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
                 -> Index Scan using clmcom1_inv_nbr_iview_idx on clmcom1 (cost=0.00..380534.32 rows=1269095 width=270) (actual time=0.27..82159.37 rows=1268649 loops=1)
                 -> Index Scan using clmcom2_inv_nbr_iview_idx on clmcom2 (cost=0.00..159636.25 rows=1271198 width=96) (actual time=464.56..21774.02 rows=1494019 loops=1)
Total runtime: 227369.39 msec
(10 rows)

On 7/27/05, Chris Hoover [EMAIL PROTECTED] wrote:
> I am having a problem with a view on one of my db's. This view is
> trying to sequentially scan the 2 tables it is accessing. However, when
> I explain the view on most of my other db's (all have the same
> schemas), it is using the indexes. Can anyone please help me understand
> why postgres is choosing to sequentially scan both tables? Both tables
> in the view have a primary key defined on inv_nbr, inv_qfr. Vacuum and
> analyze have been run on the tables in question to try and make sure
> stats are up to date.
>
> Thanks,
> Chris
>
> PG - 7.3.4
> RH 2.1
>
> Here is the view definition:
>
> SELECT DISTINCT clmcom1.inv_nbr AS inventory_number,
>        clmcom1.inv_qfr AS inventory_qualifier,
>        clmcom1.pat_addr_1 AS patient_address_1,
>        clmcom1.pat_addr_2 AS patient_address_2,
>        clmcom1.pat_city AS patient_city,
>        clmcom1.pat_cntry AS patient_country,
>        clmcom1.pat_dob AS patient_date_of_birth,
>        clmcom1.pat_gender_cd AS patient_gender_code,
>        clmcom1.pat_info_pregnancy_ind AS pregnancy_ind,
>        clmcom1.pat_state AS patient_state,
>        clmcom1.pat_suffix AS patient_suffix,
>        clmcom1.pat_zip AS patient_zip_code,
>        clmcom1.payto_addr_1 AS payto_address_1,
>        clmcom1.payto_addr_2 AS
Re: [PERFORM] Fwd: Help with view performance problem
On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote:

> I did some more testing, and ran the explain analyze on the problem. In
> my session I did a set enable_hashjoin = false and then ran the
> analyze. This caused it to use the indexes as I have been expecting it
> to do.
>
> Now, how can I get it to use the indexes w/o manipulating the
> environment? What makes postgresql want to sequentially scan and use a
> hash join?

I had something similar to this happen recently. The planner was choosing a merge join and seq scan because my 'random_page_cost' was set too high. I had it at 3, and ended up settling at 1.8 to get it to correctly use my indices. Once that change was in place, the planner did the 'right' thing for me.

Not sure if this will help you, but it sounds similar.

-Dan
[PERFORM] Unable to explain DB error
Postgres V7.3.9-2.

While executing a query in psql, the following error was generated:

vsa=# select * from vsa.dtbl_logged_event_20050318 where id=2689472;
PANIC: open of /vsa/db/pg_clog/0FC0 failed: No such file or directory
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!#

I checked in the /vsa/db/pg_clog directory, and the files have monotonically increasing filenames starting with 0000. The most recent names are:

-rw-------  1 postgres postgres 262144 Jul 25 21:39 04CA
-rw-------  1 postgres postgres 262144 Jul 26 01:10 04CB
-rw-------  1 postgres postgres 262144 Jul 26 05:39 04CC
-rw-------  1 postgres postgres 262144 Jul 28 00:01 04CD
-rw-------  1 postgres postgres 237568 Jul 28 11:31 04CE

Any idea why Postgres would be looking for a clog file named 0FC0 when the most recent filename is 04CE? Any help and suggestions for recovery are appreciated.

--- Steve
___________________________________
Steven Rosenstein
IT Architect/Developer | IBM Virtual Server Administration
Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001
Text Messaging: 6463456978 @ mobile.mycingular.com
Email: srosenst @ us.ibm.com

"Learn from the mistakes of others because you can't live long enough to make them all yourself." -- Eleanor Roosevelt
Re: [PERFORM] Left joining against two empty tables makes a query
Gnanavel S wrote:

> vacuum & reindex the department and project table, as the planner
> expects there are 1060 rows but it actually returns nothing.

I guess I should have mentioned that I have been vacuuming and reindexing at least once a week, and I did so just before running this test. Normally I do:

vacuum analyze;
reindex database ;

Secondly, the project table has *never* had anything in it. So where are these numbers coming from?

Best Wishes,
Chris Travers
Metatron Technology Consulting
Re: [PERFORM] Fwd: Help with view performance problem
I'm already running at 1.5. It looks like if I drop the random_page_cost to 1.39, it starts using the indexes. Are there any unseen issues with dropping the random_page_cost this low?

Thanks,
Chris

On 7/28/05, Dan Harris [EMAIL PROTECTED] wrote:
> I had something similar to this happen recently. The planner was
> choosing a merge join and seq scan because my 'random_page_cost' was
> set too high. I had it at 3, and ended up settling at 1.8 to get it to
> correctly use my indices. Once that change was in place, the planner
> did the 'right' thing for me.
>
> Not sure if this will help you, but it sounds similar.
>
> -Dan
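[One low-risk way to evaluate a value like 1.39 is per session before changing it globally, since random_page_cost is settable with SET. A minimal sketch:]

    SET random_page_cost = 1.39;   -- affects only this session
    EXPLAIN ANALYZE SELECT * FROM clm_com;
    -- compare plans and timings, then go back to the configured value:
    RESET random_page_cost;

[The usual caveat: values near 1 model a mostly-cached database, and pushing the setting too low globally can make the planner pick index scans for other queries where a sequential scan is genuinely cheaper.]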
Re: [PERFORM] Left joining against two empty tables makes a query
Chris Travers [EMAIL PROTECTED] writes:
> Secondly, the project table has *never* had anything in it. So where
> are these numbers coming from?

The planner is designed to assume a certain minimum size (10 pages) when it sees that a table is of zero physical length. The reason for this is that there are lots of scenarios where a plan created just after a table is first created will continue to be used while the table is filled, and if we optimized on the assumption of zero size we would produce plans that seriously suck once the table gets big. Assuming a few thousand rows keeps us out of the worst problems of this type.

(If we had an infrastructure for regenerating cached plans then we could fix this more directly, by replanning whenever the table size changes too much. We don't yet, but I hope there will be something by 8.2.)

You might try going ahead and actually putting a row or two into projects; vacuuming that will change the state to where the planner will believe the small size. (If you aren't ever planning to have anything in projects, why have the table at all?)

regards, tom lane
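[Spelled out, Tom's workaround looks something like the sketch below. The column lists are assumptions based on the join conditions in the query (p.id, d.id); NOT NULL constraints on the real tables may require more values:]

    -- placeholder rows so that VACUUM records a believable (small) size
    INSERT INTO project (id) VALUES (-1);     -- pick ids that cannot
    INSERT INTO department (id) VALUES (-1);  -- collide with real data
    VACUUM ANALYZE project;
    VACUUM ANALYZE department;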
Re: [PERFORM] Unable to explain DB error
Steven Rosenstein [EMAIL PROTECTED] writes:
> Any idea why Postgres would be looking for a clog file named 0FC0 when
> the most recent filename is 04CE?

Corrupt data --- specifically a bad transaction number in a tuple header. (In practice, this is the first field looked at in which we can readily detect an error, so you tend to see this symptom for any serious data corruption situation. The actual fault may well be something like a corrupt page header causing the code to follow tuple pointers that point to garbage.)

See the PG list archives for past discussions of dealing with corrupt data. pgsql-performance is pretty off-topic for this.

BTW, PG 7.4 and up handle this sort of thing much more gracefully ... they can't resurrect corrupt data of course, but they tend not to panic.

regards, tom lane
Re: [PERFORM] Finding bottleneck
On 7/28/05 2:21 AM, Kari Lavikka [EMAIL PROTECTED] wrote:

There's a new profiling tool called oprofile:

  http://oprofile.sourceforge.net/download/

that can be run without instrumenting the binaries beforehand. To actually find out what the code is doing during these stalls, oprofile can show you in which routines the CPU is spending time when you start/stop the profiling.

As an alternative to the guess-change parameters-repeat approach, this is the most direct way to find the exact nature of the problem.

- Luke
[PERFORM] Two queries are better than one?
I ran into a situation today maintaining someone else's code where the sum time running 2 queries seems to be faster than 1. The original code was split into two queries. I thought about joining them, but considering the intelligence of my predecessor, I wanted to test it.

The question is, which technique is really faster? Is there some hidden setup cost I don't see with explain analyze?

Postgres 7.4.7, Redhat AES 3

Each query individually:

test=> explain analyze
test-> select * from order WHERE ord_batch='343B' AND ord_id='12-645';

QUERY PLAN
----------------------------------------------------------------------------
Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.063..0.066 rows=1 loops=1)
  Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
Total runtime: 0.172 ms
(3 rows)

test=> explain analyze
test-> select cli_name from client where cli_code='1837';

QUERY PLAN
----------------------------------------------------------------------------
Index Scan using client_pkey on client (cost=0.00..5.98 rows=2 width=39) (actual time=0.043..0.047 rows=1 loops=1)
  Index Cond: (cli_code = '1837'::bpchar)
Total runtime: 0.112 ms
(3 rows)

Joined:

test=> explain analyze
test-> SELECT cli_name, order.*
test->   FROM order
test->   JOIN client ON (ord_client = cli_code)
test->  WHERE ord_batch='343B' AND ord_id='12-645';

QUERY PLAN
----------------------------------------------------------------------------
Nested Loop (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1)
  -> Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1 loops=1)
       Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
  -> Index Scan using client_pkey on client (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1 loops=1)
       Index Cond: ("outer".ord_client = client.cli_code)
Total runtime: 0.328 ms
(6 rows)

--
Karim Nassar [EMAIL PROTECTED]
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
On Fri, 22 Jul 2005 12:28:43 -0700, Luke Lonergan [EMAIL PROTECTED] wrote:

> Joshua,
>
> On 7/22/05 10:11 AM, Joshua D. Drake [EMAIL PROTECTED] wrote:
>
>> The database server is a PE (Power Edge) 6600
>>
>> Database Server IO:
>>
>> [EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda
>> /dev/sda:
>>  Timing buffer-cache reads: 1888 MB in 2.00 seconds = 944.00 MB/sec
>>  Timing buffered disk reads: 32 MB in 3.06 seconds = 10.46 MB/sec
>>
>> Second Database Server IO:
>>
>> [EMAIL PROTECTED] root]# /sbin/hdparm -tT /dev/sda
>> /dev/sda:
>>  Timing buffer-cache reads: 1816 MB in 2.00 seconds = 908.00 MB/sec
>>  Timing buffered disk reads: 26 MB in 3.11 seconds = 8.36 MB/sec
>
> Can you post the "time dd if=/dev/zero of=bigfile bs=8k count=500000"
> results? Also do the reverse (read the file) with "time dd if=bigfile
> of=/dev/null bs=8k".
>
> I think you are observing what we've known for a while: hardware RAID
> is horribly slow. We've not found a hardware RAID adapter of this class
> yet that shows reasonable read or write performance. The Adaptec 2400R
> or the LSI or others have terrible internal I/O compared to raw SCSI
> with software RAID, and even the CPU usage is higher on these cards
> while doing slower I/O than linux SW RAID.
>
> Notably -- we've found that the 3Ware RAID controller does a better job
> than the low end SCSI RAID at HW RAID support, and also exports JBOD at
> high speeds. If you export JBOD on the low end SCSI RAID adapters, the
> performance is also very poor, though generally faster than using HW
> RAID.

Are there any recommendations for Qlogic controllers on Linux, scsi or fiber channel? I might be able to get my hands on some. I have pci-x slots for AMD, Itanium, or POWER5 if the architecture makes a difference.

Mark
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
> Can you post the "time dd if=/dev/zero of=bigfile bs=8k count=500000"
> results? Also do the reverse (read the file) with "time dd if=bigfile
> of=/dev/null bs=8k".

I didn't see this come across before... here ya go:

time dd if=/dev/zero of=bigfile bs=8k count=500000
500000+0 records in
500000+0 records out

real    1m52.738s
user    0m0.310s
sys     0m36.590s

time dd if=bigfile of=/dev/null bs=8k
500000+0 records in
500000+0 records out

real    4m38.742s
user    0m0.320s
sys     0m27.870s

FYI on your hardware raid comment... I easily get 50 megs a second on my 3ware controllers and faster on my LSI SATA controllers.

Sincerely,
Joshua D. Drake

--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
Karim Nassar wrote:
> On Wed, 2005-07-27 at 14:35 -0800, Matthew Schumacher wrote:
>> I put the rest of the schema up at
>> http://www.aptalaska.net/~matt.s/bayes/bayes_pg.sql in case someone
>> needs to see it too.
>
> Do you have sample data too?

Ok, I finally got some test data together so that others can test without installing SA.

The schema and test dataset is over at
http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz

I have a pretty fast machine with a tuned postgres, and it takes it about 2 minutes 30 seconds to load the test data. Since the test data is the bayes information on 616 spam messages, that comes out to about 250ms per message. While that is doable, it does add quite a bit of overhead to the email system.

Perhaps this is as fast as I can expect it to go; if that's the case I may have to look at mysql, but I really don't want to do that.

I will be working on some other benchmarks, and reading through exactly how bayes works, but at least there is some data to play with.

schu
Re: [PERFORM] Two queries are better than one?
Karim Nassar wrote:
> I ran into a situation today maintaining someone else's code where the
> sum time running 2 queries seems to be faster than 1. The original code
> was split into two queries. I thought about joining them, but
> considering the intelligence of my predecessor, I wanted to test it.
>
> The question is, which technique is really faster? Is there some hidden
> setup cost I don't see with explain analyze?

Yes, the time it takes your user code to parse the result, and create the new query. :)

It does seem like you are taking an extra 0.1ms for the combined query, but that means you don't have another round trip to the database. So that would mean one less context switch, and you don't need to know what the cli_code is before you can get the cli_name.

I would guess the overhead is the time for postgres to parse out the text, place another index query, and then combine the rows. It seems like this shouldn't take 0.1ms, but then again, that isn't very long.

Also, did you run it *lots* of times to make sure that this isn't just noise?

John
=:->
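[One way to run it lots of times is a small server-side plpgsql timing loop. A sketch only: it excludes client round trips, which is part of what differs between the one-query and two-query approaches, so timing from the application is still the better test. The quoted identifier "order" is an assumption, since ORDER is a reserved word; timeofday() is used because, unlike now(), it advances within a transaction:]

    CREATE OR REPLACE FUNCTION bench_combined(integer) RETURNS interval AS '
    DECLARE
        t0 timestamptz := timeofday()::timestamptz;
        i  integer;
    BEGIN
        FOR i IN 1 .. $1 LOOP
            PERFORM cli_name
               FROM "order" JOIN client ON (ord_client = cli_code)
              WHERE ord_batch = ''343B'' AND ord_id = ''12-645'';
        END LOOP;
        RETURN timeofday()::timestamptz - t0;
    END;
    ' LANGUAGE plpgsql;

    SELECT bench_combined(1000);  -- total server-side time for 1000 runs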
Re: [PERFORM] Two queries are better than one?
On Thu, Jul 28, 2005 at 04:04:25PM -0700, Karim Nassar wrote:
> I ran into a situation today maintaining someone else's code where the
> sum time running 2 queries seems to be faster than 1. The original code
> was split into two queries. I thought about joining them, but
> considering the intelligence of my predecessor, I wanted to test it.
>
> The question is, which technique is really faster? Is there some hidden
> setup cost I don't see with explain analyze?

To see which technique will be faster in your application, time the application code. The queries you show are taking fractions of a millisecond; the communications overhead of executing two queries might make that technique significantly slower than just the server execution time that EXPLAIN ANALYZE shows.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
>> Can you post the "time dd if=/dev/zero of=bigfile bs=8k count=500000"
>> results? Also do the reverse (read the file) with "time dd if=bigfile
>> of=/dev/null bs=8k".
>
> I didn't see this come across before... here ya go:
>
> time dd if=/dev/zero of=bigfile bs=8k count=500000
> 500000+0 records in
> 500000+0 records out
>
> real    1m52.738s
> user    0m0.310s
> sys     0m36.590s

So, that's 35MB/s (~4 GB in 113 s), or 1/2 of a single disk drive.

> time dd if=bigfile of=/dev/null bs=8k
> 500000+0 records in
> 500000+0 records out
>
> real    4m38.742s
> user    0m0.320s
> sys     0m27.870s

And that's 14MB/s (~4 GB in 279 s), or 1/4 of a single disk drive.

> FYI on your hardware raid comment... I easily get 50 megs a second on
> my 3ware controllers and faster on my LSI SATA controllers.

Then you are almost getting one disk worth of bandwidth. By comparison, we get this using Linux software RAID on Xeon or Opteron:

$ time dd if=/dev/zero of=bigfile bs=8k count=500000
500000+0 records in
500000+0 records out

real    0m26.927s
user    0m0.074s
sys     0m8.769s

$ time dd if=bigfile of=/dev/null bs=8k
500000+0 records in
500000+0 records out

real    0m28.190s
user    0m0.039s
sys     0m8.349s

with less CPU usage than HW SCSI RAID controllers.

- Luke
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
On Thu, 28 Jul 2005, Matthew Schumacher wrote:

> Ok, I finally got some test data together so that others can test
> without installing SA.
>
> The schema and test dataset is over at
> http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz
>
> I have a pretty fast machine with a tuned postgres, and it takes it
> about 2 minutes 30 seconds to load the test data. Since the test data
> is the bayes information on 616 spam messages, that comes out to about
> 250ms per message. While that is doable, it does add quite a bit of
> overhead to the email system.

I had a look at your data -- thanks.

I have a question though: put_token() is invoked 120596 times in your benchmark... for 616 messages. That's nearly 200 queries (not even counting the 1-8 (??) inside the function itself) per message. Something doesn't seem right there....

Gavin
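[If the ~200 calls per message are the cost, one direction to test is batching: pass all of a message's tokens to a single function call. Purely a hypothetical sketch of the idea -- the table and column names (bayes_token: id, token, spam_count, ham_count) are assumptions about the schema under discussion, not the module's actual API:]

    CREATE OR REPLACE FUNCTION put_tokens(integer, text[], integer, integer)
    RETURNS void AS '
    DECLARE
        i integer;
    BEGIN
        -- walk the token array: one function call per message, not per token
        FOR i IN array_lower($2, 1) .. array_upper($2, 1) LOOP
            UPDATE bayes_token
               SET spam_count = spam_count + $3,
                   ham_count  = ham_count  + $4
             WHERE id = $1 AND token = $2[i];
            IF NOT FOUND THEN
                INSERT INTO bayes_token (id, token, spam_count, ham_count)
                VALUES ($1, $2[i], $3, $4);
            END IF;
        END LOOP;
        RETURN;
    END;
    ' LANGUAGE plpgsql;

    -- usage: one round trip per message instead of ~200
    SELECT put_tokens(1, ARRAY['tok1', 'tok2', 'tok3'], 1, 0);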
Re: [PERFORM] [PATCHES] COPY FROM performance improvements
Mark,

On 7/28/05 4:43 PM, Mark Wong [EMAIL PROTECTED] wrote:
> Are there any recommendations for Qlogic controllers on Linux, scsi or
> fiber channel? I might be able to get my hands on some. I have pci-x
> slots for AMD, Itanium, or POWER5 if the architecture makes a
> difference.

I don't have a recommendation for a particular one, it's been too long (1998) since I've used one with Linux. However, I'd like to see a comparison between Emulex and Qlogic and a winner chosen. We've had some apparent driver issues with a client running Emulex on Linux, even using many different versions of the kernel.

- Luke
Re: [PERFORM] Left joining against two empty tables makes a query SLOW
On 7/28/05, Chris Travers [EMAIL PROTECTED] wrote:

> Gnanavel S wrote:
>> vacuum & reindex the department and project table, as the planner
>> expects there are 1060 rows but it actually returns nothing.
>
> I guess I should have mentioned that I have been vacuuming and
> reindexing at least once a week, and I did so just before running this
> test. Normally I do:
>
> vacuum analyze;
> reindex database ;

Reindex the tables separately.

> Secondly, the project table has *never* had anything in it. So where
> are these numbers coming from?

pg_statistics

--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.
Re: [PERFORM] Two queries are better than one?
On 7/29/05, Karim Nassar [EMAIL PROTECTED] wrote:

> I ran into a situation today maintaining someone else's code where the
> sum time running 2 queries seems to be faster than 1. The original code
> was split into two queries. I thought about joining them, but
> considering the intelligence of my predecessor, I wanted to test it.
>
> The question is, which technique is really faster? Is there some hidden
> setup cost I don't see with explain analyze?
>
> Joined:
>
> test=> explain analyze
> test-> SELECT cli_name, order.*
> test->   FROM order
> test->   JOIN client ON (ord_client = cli_code)
> test->  WHERE ord_batch='343B' AND ord_id='12-645';

where is the cli_code condition in the above query?

--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.
Re: [PERFORM] Left joining against two empty tables makes a query
Gnanavel S wrote:
> Reindex the tables separately.

Reindexing should not affect this problem, anyway.

-Neil
Re: [PERFORM] Two queries are better than one?
On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote:
>> Joined:
>>
>> test=> explain analyze
>> test-> SELECT cli_name, order.*
>> test->   FROM order
>> test->   JOIN client ON (ord_client = cli_code)
>> test->  WHERE ord_batch='343B' AND ord_id='12-645';
>
> where is the cli_code condition in the above query?

I don't understand the question. ord_client is the client code, and cli_code is the client code, for their respective tables. batch/id is unique, so there is only one record from order, and only one client to associate.

Clearer?

--
Karim Nassar [EMAIL PROTECTED]
Re: [PERFORM] Two queries are better than one?
On 7/29/05, Karim Nassar [EMAIL PROTECTED] wrote:
> On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote:
>>> where is the cli_code condition in the above query?
>
> I don't understand the question. ord_client is the client code, and
> cli_code is the client code, for their respective tables. batch/id is
> unique, so there is only one record from order, and only one client to
> associate.
>
> Clearer?

Ok. The reason might be that comparing with a literal value (the previous case) is cheaper than comparing with a column (which has to be evaluated). But in the previous case, getting the cli_code in the application and then executing the second query in the db will be time consuming, since it includes the IPC cost.

--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
Gavin Sherry wrote:
> I had a look at your data -- thanks.
>
> I have a question though: put_token() is invoked 120596 times in your
> benchmark... for 616 messages. That's nearly 200 queries (not even
> counting the 1-8 (??) inside the function itself) per message.
> Something doesn't seem right there....
>
> Gavin

I am pretty sure that's right, because it is doing word statistics on email messages.

I need to spend some time studying the code; I just haven't found time yet.

Would it be safe to say that there aren't any glaring performance penalties other than the sheer volume of queries?

Thanks,
schu
Re: [PERFORM] Left joining against two empty tables makes a query
>> Secondly, the project table has *never* had anything in it. So where
>> are these numbers coming from?
>
> pg_statistics

I very much doubt that. I was unable to locate any rows in pg_statistic where the pg_class.oid for either table matched any row's starelid.

Tom's argument that this is behavior by design makes sense. I assumed that something like that had to be going on; otherwise there would be nowhere for the numbers to come from. I.e. if there never were any rows in the table, and yet pg_statistic showed 1060 rows, we would have bigger problems than a bad query plan.

I hope, however, that eventually tables which are truly empty can be treated intelligently in left joins. Otherwise this limits the usefulness of out-of-the-box solutions which may have functionality that we don't use; such solutions can then kill database performance quite easily.

Chris Travers
Metatron Technology Consulting
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
On Thu, 2005-07-28 at 16:13 -0800, Matthew Schumacher wrote:
> Ok, I finally got some test data together so that others can test
> without installing SA.
>
> The schema and test dataset is over at
> http://www.aptalaska.net/~matt.s/bayes/bayesBenchmark.tar.gz
>
> I have a pretty fast machine with a tuned postgres, and it takes it
> about 2 minutes 30 seconds to load the test data. Since the test data
> is the bayes information on 616 spam messages, that comes out to about
> 250ms per message. While that is doable, it does add quite a bit of
> overhead to the email system.

On my laptop this takes:

real    1m33.758s
user    0m4.285s
sys     0m1.181s

One interesting effect is that the data in bayes_vars has a huge number of updates and needs vacuum _frequently_. After the run, a vacuum full compacts it down from 461 pages to 1 page.

Regards,
Andrew.

-------------------------------------------------------------------------
Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/    PHYS: Level 2, 150-154 Willis St
DDI: +64(4)803-2201    MOB: +64(272)DEBIAN    OFFICE: +64(4)499-2267
-------------------------------------------------------------------------
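[The bloat Andrew describes can be watched from pg_class, whose relpages and reltuples columns are refreshed by VACUUM. A minimal sketch:]

    -- check the table's size as last recorded by VACUUM/ANALYZE
    VACUUM ANALYZE bayes_vars;
    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname = 'bayes_vars';

    -- after a benchmark run, VACUUM FULL reclaims the dead-update space
    -- (461 pages down to 1 in the test above)
    VACUUM FULL bayes_vars;
    SELECT relpages FROM pg_class WHERE relname = 'bayes_vars';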