Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Simon Riggs
On Tue, 2005-05-10 at 11:03 +0100, Alex Stapleton wrote: So, when/is PG meant to be getting a decent partitioning system? ISTM that your question seems to confuse where code comes from. Without meaning to pick on you, or reply rudely, I'd like to explore that question. Perhaps it should be a

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread David Roussel
For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf They have 2.6 Million active users, posting 200 new blog entries per minute, plus many comments and countless page views.

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Alex Stapleton
On 11 May 2005, at 08:57, David Roussel wrote: For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf I have implemented similar systems in the past, it's a pretty good technique,

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Alex Stapleton
On 11 May 2005, at 09:50, Alex Stapleton wrote: On 11 May 2005, at 08:57, David Roussel wrote: For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf I have implemented similar

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Christopher Kings-Lynne
Acceptable Answers to 'So, when/is PG meant to be getting a decent partitioning system?': 1. Person X is working on it I believe. 2. It's on the list, but nobody has done anything about it yet 3. Your welcome to take a stab at it, I expect the community would support your efforts

Re: [PERFORM] Query tuning help

2005-05-11 Thread Ulrich Wisser
Hi Dan, I tried to understand your query, but I couldn't get my understanding of the query and your description in sync. Why do you use sub selects? Wouldn't a simple recordtext like '%RED%' do the trick too? You combine all your where conditions with and. To me this looks like you get only

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Tom Lane
Mischa Sandberg [EMAIL PROTECTED] writes: So, simplicity dictates something like: table pg_remote(schemaname text, connectby text, remoteschema text) Previous discussion of this sort of thing concluded that we wanted to follow the SQL-MED standard. regards, tom lane

Re: [PERFORM] Prefetch

2005-05-11 Thread Rod Taylor
On Wed, 2005-05-11 at 12:53 +0800, Christopher Kings-Lynne wrote: Another trick you can use with large data sets like this when you want results back in seconds is to have regularly updated tables that aggregate the data along each column normally aggregated against the main data set.

Re: [PERFORM] Optimizer wrongly picks Nested Loop Left Join

2005-05-11 Thread John A Meinel
Edin Kadribasic wrote: Hi, I have a query that is giving the optimizer (and me) great headache. When its in the good mood the optimizer chooses Hash Left Join and the query executes in 13ms or so, but sometimes (more and more often) it chooses Nested Loop Left Join and the execution time goes up

Re: [PERFORM] Optimizer wrongly picks Nested Loop Left Join

2005-05-11 Thread Tom Lane
Edin Kadribasic [EMAIL PROTECTED] writes: I have a query that is giving the optimizer (and me) great headache. The main problem seems to be that the rowcount estimates for axp_temp_order_match and axp_dayschedule are way off: - Index Scan using axp_temp_order_match_idx1 on

Re: [PERFORM] Prefetch

2005-05-11 Thread Bricklen Anderson
Christopher Kings-Lynne wrote: Another trick you can use with large data sets like this when you want results back in seconds is to have regularly updated tables that aggregate the data along each column normally aggregated against the main data set. Maybe some bright person will prove me wrong

Re: [PERFORM] [GENERAL] Hash index vs. b-tree index (PostgreSQL

2005-05-11 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes: Greg Stark wrote: What if the hash index stored *only* the hash code? Attached is a WIP patch that implements this. Performance? I'm posting mainly because I wasn't sure what to do to avoid false positives in the case of hash collisions. In the hash

Re: [PERFORM] [GENERAL] Hash index vs. b-tree index (PostgreSQL

2005-05-11 Thread Greg Stark
Neil Conway [EMAIL PROTECTED] writes: I'm posting mainly because I wasn't sure what to do to avoid false positives in the case of hash collisions. In the hash AM code it is somewhat awkward to fetch the pointed-to heap tuple and recheck the scankey.[1] I just did the first thing that came

Re: [PERFORM] [GENERAL] Hash index vs. b-tree index (PostgreSQL

2005-05-11 Thread Neil Conway
Tom Lane wrote: Performance? I'll run some benchmarks tomorrow, as it's rather late in my time zone. If anyone wants to post some benchmark results, they are welcome to. I disagree completely with the idea of forcing this behavior for all datatypes. It could only be sensible for fairly wide

Re: [PERFORM] Optimizer wrongly picks Nested Loop Left Join

2005-05-11 Thread Tom Lane
John A Meinel [EMAIL PROTECTED] writes: Unfortunately, because Hash Join doesn't report the number of rows (rows=0 always), it's hard to tell how good the estimator is. This is only a cosmetic problem because you can just look at the number of rows actually emitted by the Hash node's child;

Re: [PERFORM] Sort and index

2005-05-11 Thread Manfred Koizar
On Sun, 24 Apr 2005 17:01:46 -0500, Jim C. Nasby [EMAIL PROTECTED] wrote: Feel free to propose better cost equations. I did. More than once. estimated index scan cost for (project_id, id, date) is 0.00..100117429.34 while the estimate for work_units is 0.00..103168408.62; almost no

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Josh Berkus
David, It's interesting that the solution livejournal have arrived at is quite similar in ways to the way google is set up. Yes, although again, they're using memcached as pseudo-clustering software, and as a result are limited to what fits in RAM (RAM on 27 machines, but it's still RAM).

[PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Guillaume Smet
Hi, We have some performances problem on a particular query. We reproduced the problem on a 7.4.5 and on a 7.4.7 server. * we load the dump in a new database * query: it's fast ( 1ms) * VACUUM FULL ANALYZE; * query: it's really slow (130ms) and it's another plan * set enable_seqscan=off; * query:

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Josh Berkus
Guillaume, We reproduced the problem on a 7.4.5 and on a 7.4.7 server. * we load the dump in a new database * query: it's fast ( 1ms) * VACUUM FULL ANALYZE; * query: it's really slow (130ms) and it's another plan * set enable_seqscan=off; * query: it's fast ( 1ms) : it uses the best plan

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes: - Merge Right Join (cost=8.92..9.26 rows=1 width=529) (actual time=129.100..129.103 rows=1 loops=1) Merge Cond: (outer.object_id = inner.parent_application_id) - Index Scan using acs_objects_object_id_p_hhkb1 on

Re: [PERFORM] Prefetch

2005-05-11 Thread Mischa Sandberg
Quoting Christopher Kings-Lynne [EMAIL PROTECTED]: Another trick you can use with large data sets like this when you want results back in seconds is to have regularly updated tables that aggregate the data along each column normally aggregated against the main data set. Maybe some

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Guillaume Smet
Tom, So, the usual questions: have these two tables been ANALYZEd lately? Yes, of course. As I wrote in my previous mail, here is how I reproduce the problem: - we load the dump in a new database (to be sure, there is no problem on an index or something like that) - query: it's fast ( 1ms) -

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes: If so, can we see the pg_stats rows for the object_id and parent_application_id columns? See attached file. Well, those stats certainly appear to justify the planner's belief that the indexscan needn't run very far: the one value of

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Guillaume Smet
Well, those stats certainly appear to justify the planner's belief that the indexscan needn't run very far: the one value of parent_application_id is 1031 and this is below the smallest value of object_id seen by analyze. Yes, it seems rather logical but why does it cost so much if it should

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Simon Riggs
On Wed, 2005-05-11 at 17:13 +0800, Christopher Kings-Lynne wrote: Alex Stapleton wrote Be more helpful, and less arrogant please. Simon told you all the reasons clearly and politely. Thanks Chris for your comments. PostgreSQL can always do with one more developer and my sole intent was

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Tom Lane
Ah-ha, I can replicate the problem. This example uses tenk1 from the regression database, which has a column unique2 containing just the integers 0... regression=# create table t1(f1 int); CREATE TABLE regression=# insert into t1 values(5); INSERT 154632 1 regression=# insert into t1

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Guillaume Smet
Josh, Tom, Thanks for your explanations. In the meantime it seems like the quickest answer for Guillaume might be to try to avoid keeping any NULLs in parent_application_id. I can't do that as the majority of the applications don't have any parent one. Moreover, we use a third party application

Re: [PERFORM] Sort and index

2005-05-11 Thread Jim C. Nasby
First, I've got some updated numbers up at http://stats.distributed.net/~decibel/ timing2.log shows that the planner actually under-estimates an index scan by several orders of magnitude. Granted, random_page_cost is set to an unrealistic 1.1 (otherwise I can't force the index scan), but that

Federated PG servers -- Was: Re: [PERFORM] [GENERAL] Hash index vs. b-tree index (PostgreSQL

2005-05-11 Thread Mischa Sandberg
Was curious why you pointed out SQL-MED as a SQL-standard approach to federated servers. Always thought of it as covering access to non-SQL data, the way the lo_* interface works; as opposed to meshing compatible (to say nothing of identical) SQL servers. Just checked Jim Melton's last word on

Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Mischa Sandberg
Quoting Guillaume Smet [EMAIL PROTECTED]: Hi, We have some performances problem on a particular query. ... I have to say it, this was the best laid-out set of details behind a problem I've ever seen on this list; I'm going to try live up to it, the next time I have a problem of my own.

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Jim C. Nasby
On Wed, May 11, 2005 at 08:57:57AM +0100, David Roussel wrote: For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf They have 2.6 Million active users, posting 200 new blog

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread PFC
However, memcached (and for us, pg_memcached) is an excellent way to improve horizontal scalability by taking disposable data (like session information) out of the database and putting it in protected RAM. So, what is the advantage of such a system versus, say, a sticky sessions system

Re: [PERFORM] Optimizer wrongly picks Nested Loop Left Join

2005-05-11 Thread Edin Kadribasic
From: Tom Lane [EMAIL PROTECTED] Edin Kadribasic [EMAIL PROTECTED] writes: I have a query that is giving the optimizer (and me) great headache. The main problem seems to be that the rowcount estimates for axp_temp_order_match and axp_dayschedule are way off: - Index Scan using

Re: [PERFORM] BLOB's bypassing the OS Filesystem for better Image loading speed?

2005-05-11 Thread Enrico Weigelt
* [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Hi, My next queststion is dedicated to blobs in my Webapplication (using Tomcat 5 and JDBC integrated a the J2EE Appserver JBoss). Filesystems with many Filesystem Objects can slow down the Performance at opening and reading Data. As others

Re: [PERFORM] Intel SRCS16 SATA raid?

2005-05-11 Thread Bruce Momjian
[EMAIL PROTECTED] wrote: Greg, I posted this link under a different thread (the $7k server thread). It is a very good read on why SCSI is better for servers than ATA. I didn't note bias, though it is from a drive manufacturer. YMMV. There is an interesting, though dated appendix on