Re: [PERFORM] need to speed up query

2008-05-06 Thread PFC
i've had to write queries to get trail balance values out of the GL transaction table and i'm not happy with its performance The table has 76K rows growing about 1000 rows per working day so the performance is not that great it takes about 20 to 30 seconds to get all the records for the

[PERFORM] plan difference between set-returning function with ROWS within IN() and a plain join

2008-05-06 Thread Frank van Vugt
L.S. I'm noticing a difference in planning between a join and an in() clause, before trying to create an independent test-case, I'd like to know if there's an obvious reason why this would be happening: = the relatively simple PLPGSQL si_credit_tree() function has 'ROWS 5' in it's definition

Re: [PERFORM] plan difference between set-returning function with ROWS within IN() and a plain join

2008-05-06 Thread PFC
On Tue, 06 May 2008 10:21:43 +0200, Frank van Vugt [EMAIL PROTECTED] wrote: L.S. I'm noticing a difference in planning between a join and an in() clause, before trying to create an independent test-case, I'd like to know if there's an obvious reason why this would be happening: Is the

[PERFORM] Seqscan problem

2008-05-06 Thread Vlad Arkhipov
I've just discovered a problem with quite simple query. It's really confusing me. Postgresql 8.3.1, random_page_cost=1.1. All tables were analyzed before query. EXPLAIN ANALYZE SELECT i.c, d.r FROM i JOIN d ON d.cr = i.c WHERE i.dd between '2007-08-01' and '2007-08-30' Hash Join

Re: [PERFORM] RAID 10 Benchmark with different I/O schedulers (was: Performance increase with elevator=deadline)

2008-05-06 Thread Jeff
On May 5, 2008, at 7:33 PM, Craig James wrote: I had the opportunity to do more testing on another new server to see whether the kernel's I/O scheduling makes any difference. Conclusion: On a battery-backed RAID 10 system, the kernel's I/O scheduling algorithm has no effect. This makes

Re: [PERFORM] need to speed up query

2008-05-06 Thread Justin
PFC wrote: i've had to write queries to get trail balance values out of the GL transaction table and i'm not happy with its performance The table has 76K rows growing about 1000 rows per working day so the performance is not that great it takes about 20 to 30 seconds to get all the

Re: [PERFORM] plan difference between set-returning function with ROWS within IN() and a plain join

2008-05-06 Thread Tom Lane
Frank van Vugt [EMAIL PROTECTED] writes: db=# explain analyse select sum(base_total_val) from sales_invoice where id in (select id from si_credit_tree(8057)); Did you check whether this query even gives the right answer? The EXPLAIN output shows that 21703 rows of

Re: [PERFORM] Seqscan problem

2008-05-06 Thread Tom Lane
Vlad Arkhipov [EMAIL PROTECTED] writes: I've just discovered a problem with quite simple query. It's really confusing me. Postgresql 8.3.1, random_page_cost=1.1. All tables were analyzed before query. What have you got effective_cache_size set to? regards, tom lane

Re: [PERFORM] plan difference between set-returning function with ROWS within IN() and a plain join

2008-05-06 Thread Frank van Vugt
db=# explain analyse select sum(base_total_val) from sales_invoice where id in (select id from si_credit_tree(8057)); Did you check whether this query even gives the right answer? You knew the right answer to that already ;) I think you forgot the alias foo(id) in the

[PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Antoine Baudoux
Hello, I have a query that runs for hours when joining 4 tables but takes milliseconds when joining one MORE table to the query. I have One big table, t_event (8 million rows) and 4 small tables (t_network,t_system,t_service, t_interface, all 1000 rows). This query takes a few

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Guillaume Smet
Antoine, On Tue, May 6, 2008 at 5:03 PM, Antoine Baudoux [EMAIL PROTECTED] wrote: Limit (cost=23981.18..23981.18 rows=1 width=977) - Sort (cost=23981.18..23981.18 rows=1 width=977) Sort Key: this_.c_date Can you please provide the EXPLAIN ANALYZE output instead of EXPLAIN?

[PERFORM] What constitutes a complex query

2008-05-06 Thread Justin
This falls under the stupid question and i'm just curious what other people think what makes a query complex? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] What constitutes a complex query

2008-05-06 Thread Craig James
Justin wrote: This falls under the stupid question and i'm just curious what other people think what makes a query complex? There are two kinds: 1. Hard for Postgres to get the answer. 2. Hard for a person to comprehend. Which do you mean? Craig -- Sent via pgsql-performance mailing list

Re: [PERFORM] need to speed up query

2008-05-06 Thread PFC
What is a period ? Is it a month, or something more custom ? Can periods overlap ? No periods can never overlap. If the periods did you would be in violation of many tax laws around the world. Plus it you would not know how much money you are making or losing. I was wondering if

Re: [PERFORM] What constitutes a complex query

2008-05-06 Thread Scott Marlowe
On Tue, May 6, 2008 at 9:45 AM, Justin [EMAIL PROTECTED] wrote: This falls under the stupid question and i'm just curious what other people think what makes a query complex? Well, as mentioned, there's two kinds. some that look big and ugly are actually just shovelling data with no fancy

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Antoine Baudoux
Here is the explain analyse for the first query, the other is still running... explain analyse select * from t_Event event inner join t_Service service on event.service_id=service.id inner join t_System system on service.system_id=system.id inner join t_Interface interface on

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Shaun Thomas
On Tue, 2008-05-06 at 16:03 +0100, Antoine Baudoux wrote: My understanding is that in the first case the sort is done after all the table joins and filtering, but in the second case ALL the rows in t_event are scanned and sorted before the join. You've actually run into a problem that's

Re: [PERFORM] need to speed up query

2008-05-06 Thread Shaun Thomas
On Tue, 2008-05-06 at 03:01 +0100, Justin wrote: i've had to write queries to get trail balance values out of the GL transaction table and i'm not happy with its performance Go ahead and give this a try: SELECT p.period_id, p.period_start, p.period_end, a.accnt_id, a.accnt_number,

Re: [PERFORM] What constitutes a complex query

2008-05-06 Thread Steve Atkins
On May 6, 2008, at 8:45 AM, Justin wrote: This falls under the stupid question and i'm just curious what other people think what makes a query complex? If I know in advance exactly how the planner will plan the query (and be right), it's a simple query. Otherwise it's a complex query.

Re: [PERFORM] What constitutes a complex query

2008-05-06 Thread Richard Broersma
On Tue, May 6, 2008 at 9:41 AM, Scott Marlowe [EMAIL PROTECTED] wrote: I'd say that the use of correlated subqueries qualifies a query as complicated. Joining on non-usual pk-fk stuff. the more you're mashing one set of data against another, and the odder the way you have to do it, the more

[PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Dennis Muhlestein
Right now, we have a few servers that host our databases. None of them are redundant. Each hosts databases for one or more applications. Things work reasonably well but I'm worried about the availability of some of the sites. Our hardware is 3-4 years old at this point and I'm not naive to

Re: [PERFORM] need to speed up query

2008-05-06 Thread Justin
it worked it had couple missing parts but it worked and ran in 3.3 seconds. *Thanks for this * i need to review the result and balance it to my results as the Accountant already went through and balanced some accounts by hand to verify my results begin quote You might want to consider a

Re: [PERFORM] What constitutes a complex query

2008-05-06 Thread Justin
Craig James wrote: Justin wrote: This falls under the stupid question and i'm just curious what other people think what makes a query complex? There are two kinds: 1. Hard for Postgres to get the answer. this one 2. Hard for a person to comprehend. Which do you mean? Craig -- Sent

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Antoine Baudoux
Thanks a lot for your answer, there are some points I didnt understand On May 6, 2008, at 6:43 PM, Shaun Thomas wrote: The second query says Awesome! Only one network... I can just search the index of t_event backwards for this small result set! Shouldnt It be the opposite?

Re: [PERFORM] What constitutes a complex query

2008-05-06 Thread Scott Marlowe
On Tue, May 6, 2008 at 11:23 AM, Justin [EMAIL PROTECTED] wrote: Craig James wrote: Justin wrote: This falls under the stupid question and i'm just curious what other people think what makes a query complex? There are two kinds: 1. Hard for Postgres to get the answer.

Re: [PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Greg Smith
On Tue, 6 May 2008, Dennis Muhlestein wrote: First, I'd replace are sata hard drives with a scsi controller and two scsi hard drives that run raid 0 (probably running the OS and logs on the original sata drive). RAID0 on two disks makes a disk failure that will wipe out the database twice

Re: [PERFORM] need to speed up query

2008-05-06 Thread Justin
PFC wrote: What is a period ? Is it a month, or something more custom ? Can periods overlap ? No periods can never overlap. If the periods did you would be in violation of many tax laws around the world. Plus it you would not know how much money you are making or losing. I

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Tom Lane
Shaun Thomas [EMAIL PROTECTED] writes: I'm not sure what causes this, but the problem with indexes is that they're not necessarily in the order you want unless you also cluster them, so a backwards index scan is almost always the wrong answer. Whether the scan is forwards or backwards has

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Heikki Linnakangas
Antoine Baudoux wrote: Here is the explain analyse for the first query, the other is still running... explain analyse select * from t_Event event inner join t_Service service on event.service_id=service.id inner join t_System system on service.system_id=system.id inner join t_Interface

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Heikki Linnakangas
Antoine Baudoux wrote: Here is the explain analyse for the first query, the other is still running... explain analyse select * from t_Event event inner join t_Service service on event.service_id=service.id inner join t_System system on service.system_id=system.id inner join t_Interface

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Shaun Thomas
On Tue, 2008-05-06 at 18:59 +0100, Tom Lane wrote: Whether the scan is forwards or backwards has nothing to do with it. The planner is using the index ordering to avoid having to do a full-table scan and sort. Oh, I know that. I just noticed that when this happened to us, more often than

Re: [PERFORM] multiple joins + Order by + LIMIT query performance issue

2008-05-06 Thread Shaun Thomas
On Tue, 2008-05-06 at 18:24 +0100, Antoine Baudoux wrote: Isnt the planner fooled by the index on the sorting column? If I remove the index the query runs OK. In your case, for whatever reason, the stats say doing the index scan on the sorted column will give you the results faster. That

Re: [PERFORM] RAID 10 Benchmark with different I/O schedulers

2008-05-06 Thread Craig James
Greg Smith wrote: On Mon, 5 May 2008, Craig James wrote: pgbench -i -s 20 -U test That's way too low to expect you'll see a difference in I/O schedulers. A scale of 20 is giving you a 320MB database, you can fit the whole thing in RAM and almost all of it on your controller cache. What's

Re: [PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Dennis Muhlestein
Greg Smith wrote: On Tue, 6 May 2008, Dennis Muhlestein wrote: RAID0 on two disks makes a disk failure that will wipe out the database twice as likely. If you goal is better reliability, you want some sort of RAID1, which you can do with two disks. That should increase read throughput a

Re: [PERFORM] RAID 10 Benchmark with different I/O schedulers

2008-05-06 Thread Greg Smith
On Tue, 6 May 2008, Craig James wrote: I only did two runs of each, which took about 24 minutes. Like the first round of tests, the noise in the measurements (about 10%) exceeds the difference between scheduler-algorithm performance, except that anticipatory seems to be measurably slower.

Re: [PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Greg Smith
On Tue, 6 May 2008, Dennis Muhlestein wrote: I was planning on pgpool being the cushion between the raid0 failure probability and my need for redundancy. This way, I get protection against not only disks, but cpu, memory, network cards,motherboards etc.Is this not a reasonable approach?

Re: [PERFORM] RAID 10 Benchmark with different I/O schedulers

2008-05-06 Thread Craig James
Greg Smith wrote: On Tue, 6 May 2008, Craig James wrote: I only did two runs of each, which took about 24 minutes. Like the first round of tests, the noise in the measurements (about 10%) exceeds the difference between scheduler-algorithm performance, except that anticipatory seems to be

[PERFORM] pgfouine - commit details?

2008-05-06 Thread Josh Cole
We are using pgfouine to try and optimize our database at this time. Is there a way to have pgfouine show examples or breakout commits? Queries that took up the most time Rank Total durationTimes executed Av. duration (s) Query 1 26m54s 222,305

Re: [PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Dennis Muhlestein
Greg Smith wrote: On Tue, 6 May 2008, Dennis Muhlestein wrote: Since disks are by far the most likely thing to fail, I think it would be bad planning to switch to a design that doubles the chance of a disk failure taking out the server just because you're adding some server-level

Re: [PERFORM] pgfouine - commit details?

2008-05-06 Thread Guillaume Smet
Josh, On Tue, May 6, 2008 at 11:10 PM, Josh Cole [EMAIL PROTECTED] wrote: We are using pgfouine to try and optimize our database at this time. Is there a way to have pgfouine show examples or breakout commits? I hesitated before not implementing this idea. The problem is that you often don't

Re: [PERFORM] Seqscan problem

2008-05-06 Thread Vlad Arkhipov
Tom Lane writes: Vlad Arkhipov [EMAIL PROTECTED] writes: I've just discovered a problem with quite simple query. It's really confusing me. Postgresql 8.3.1, random_page_cost=1.1. All tables were analyzed before query. What have you got effective_cache_size set to?

Re: [PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Scott Marlowe
On Tue, May 6, 2008 at 3:39 PM, Dennis Muhlestein [EMAIL PROTECTED] wrote: Those are good points. So you'd go ahead and add the pgpool in front (or another redundancy approach, but then use raid1,5 or perhaps 10 on each server? That's what I'd do. specificall RAID10 for small to medium

Re: [PERFORM] Possible Redundancy/Performance Solution

2008-05-06 Thread Greg Smith
On Tue, 6 May 2008, Dennis Muhlestein wrote: Those are good points. So you'd go ahead and add the pgpool in front (or another redundancy approach, but then use raid1,5 or perhaps 10 on each server? Right. I don't advise using the fact that you've got some sort of replication going as an

Re: [PERFORM] pgfouine - commit details?

2008-05-06 Thread Josh Cole
We are shipping the postgres.log to a remote syslog repository to take the I/O burden off our postgresql server. As such if we set log_min_duration_statement to 0 this allow us to get more detailed information about our commits using pgfouine...correct? -- Josh