Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
On Mon, 10 Jan 2005 12:46:01 -0500, Alex Turner [EMAIL PROTECTED] wrote:
> You sir are correct! You can't use perl in MS-SQL or Oracle ;).
Can you benefit from the luminous power of Visual Basic as a PL in MSSQL? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
> The .NET Runtime will be a part of the next MS SQL Server engine. You will be able to have C# as a PL in the database engine with the next version of MSSQL. That certainly will be something to think about.
Ah, well, if it's C# (or even VB.NET) then it's serious! I thought postgres had pl/java?
Re: [PERFORM] Low Performance for big hospital server ..
> Decreasing sort_mem too much [8196] makes the performance much slower, so I use sort_mem = 16384 and leave effective_cache_size at the same value. The result is quite better, but I should wait for tomorrow morning [official hours] to see the end result.
You could also profile your queries to see where those big sorts come from, and maybe add some indexes to try to replace sorts by index-scans-in-order, which use no temporary memory. Can you give an example of your queries which make use of big sorts like this?
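As a sketch of the index-scan-in-order idea (the table and column names here are hypothetical, not from the original post):

```sql
-- Hypothetical example: suppose EXPLAIN ANALYZE shows a big Sort node in
--   SELECT * FROM admissions WHERE ward_id = 3 ORDER BY admit_date;
-- An index matching the WHERE clause plus the ORDER BY lets the planner
-- return rows already in order, using no sort_mem at all:
CREATE INDEX admissions_ward_date_idx ON admissions (ward_id, admit_date);
```

After creating such an index, re-run EXPLAIN ANALYZE; if it helped, the Sort node disappears in favor of an index scan.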
Re: [PERFORM] LIMIT causes SEQSCAN in subselect
> The fact that the estimator knows that the LIMIT is pointless because there are fewer rows in the subselect than the LIMIT will return is not something we want to count on; sometimes the estimator has inaccurate information. The UNIQUE index makes this more certain, except that I'm not sure that the planner distinguishes between actual UNIQUE indexes and columns which are estimated unique (per the pg_stats). And I think you can see in your case that there's quite a difference between a column we're CERTAIN is unique, versus a column we THINK is unique.
I think a UNIQUE constraint can permit several 'different' NULL values... better say UNIQUE NOT NULL?
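A quick demonstration of the NULL caveat (table name hypothetical):

```sql
-- UNIQUE alone does not reject duplicate NULLs: two NULLs never compare equal
CREATE TABLE t (x integer UNIQUE);
INSERT INTO t VALUES (NULL);
INSERT INTO t VALUES (NULL);   -- succeeds: the unique index ignores NULLs
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (1);      -- ERROR: duplicate key violates unique constraint
```

So for a column the planner could safely treat as truly unique, declare it UNIQUE NOT NULL.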
Re: [PERFORM] Using LIMIT changes index used by planner
On Mon, 13 Dec 2004 17:43:07 -0500, Tom Lane [EMAIL PROTECTED] wrote:
> Sven Willenberger [EMAIL PROTECTED] writes:
> explain analyze select storelocation,order_number from custacct where referrer = 1365 and orderdate between '2004-12-07' and '2004-12-07 12:00:00' order by custacctid limit 10;
Why not create an index on (referrer, orderdate)?
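A sketch of the suggested index (assuming the column names from the query above; the index name is made up):

```sql
-- equality column first, then the range column, matching the WHERE clause:
CREATE INDEX custacct_referrer_orderdate_idx ON custacct (referrer, orderdate);
```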
Re: [PERFORM] Caching of Queries
> I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection).
You don't use persistent connections??? Your problem might simply be the connection time overhead (also including a few TCP roundtrips).
Re: [PERFORM] Poor Query
How many rows do the following queries return:

select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678'

select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.routingNumber = '12345678'

Can you post EXPLAIN ANALYZE for these two queries? Regards.
Re: [PERFORM] Poor Query
Just wanted to know the selectivity of the accountnumber and routingNumber columns. I should have written: how many rows do the following queries return — one or a few at most, or a lot?

select userID from bankaccount WHERE accountnumber = '12345678'
select userID from bankaccount WHERE routingNumber = '12345678'

Can you post EXPLAIN ANALYZE for these two queries? Regards.

> Thanks for the quick reply. It should usually return just one account for that user, so it's only one record. Actually the userid column doesn't exist on the bankaccount table; it exists only on the user table and is joined via the bankaccountid column, so if I ran this query separately I wouldn't be able to run it.
Re: [PERFORM] Poor Query
> Just one, a user can have only one bankaccount.
Ah well, in that case: this is your query:

select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches
from Users u
where 1=1
AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678')
AND exists (select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.routingNumber = '12345678')
order by UserID desc limit 500

What it does is scan all users, and for each user, test if it has the accountnumber or the routingNumber you seek. You're reversing the problem: you should first look for accountnumber and routingNumber, THEN look for the user:

SELECT * FROM Users WHERE bankaccountID IN (SELECT bankaccountID FROM bankaccount WHERE accountnumber = '12345678' OR/AND routingNumber = '12345678')

or:

SELECT * FROM Users WHERE userID IN (SELECT userID FROM bankaccount WHERE accountnumber = '12345678' OR/AND routingNumber = '12345678')

There is something very strange in your query: it seems that bankaccount and Users both have a UserID column and a bankaccountID column. Is this normal? It looks denormalized to me...
Re: [PERFORM] Poor Query
Your suffering comes from the "where ba.bankaccountID = u.bankaccountID" in the subselect. It means postgres has to run the subselect once for each row in Users. You want the subselect to run only once, and return one (or more?) bankaccountid's, then fetch the users from Users. Just remove the "where ba.bankaccountID = u.bankaccountID"!

select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches
from Users u
where bankaccountid in (select bankaccountid from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678' and ba.routingNumber = '12345678')
order by UserID desc limit 500

New version:

select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches
from Users u
where bankaccountid in (select bankaccountid from bankaccount ba WHERE ba.accountnumber = '12345678' and ba.routingNumber = '12345678')

You could also do this:

select u.* from Users u, bankaccount ba
where u.bankaccountid = ba.bankaccountid and ba.accountnumber = '12345678' and ba.routingNumber = '12345678'
Re: [PERFORM] Data type to use for primary key
> All, Well, you should still escape any strings you're getting from a web page so you can ensure you're not subject to a SQL insert attack, even if you're expecting integers. Thanks, Peter Darley
Well, your framework should do this for you: "integer" is specified in your database object class description, %d appears in your generated queries (or you put it in your hand-written queries) => if the parameter is not an integer, an exception is thrown, then caught, then an error page is displayed... Or, just casting to int should throw an exception... Forms should be validated, but for hidden parameters in links it is OK imho to display an error page if they are incorrect; after all, if the user edits the GET or POST parameters, well...
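A minimal illustration of the cast idea (the values are purely illustrative): an explicit integer cast makes the database itself reject non-numeric input, so a tampered parameter raises an error instead of being interpolated into the query.

```sql
SELECT CAST('42' AS integer);                     -- ok: 42
SELECT CAST('42; DROP TABLE users' AS integer);   -- ERROR: invalid input syntax for type integer
```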
Re: [PERFORM] memcached and PostgreSQL
> While an exception, this is a very real possibility in day to day operations. The absence of any feedback or balancing mechanism between the database and cache makes it impossible to know that they are in sync, and even a small error percentage multiplied over time will lead to an ever increasing likelihood of error.
Sure, but there are applications where it does not matter, and these applications are often loading the database... think about displaying forum posts, product lists in a web store, and especially category trees, top N queries... for all these, it does not matter if the data is a bit stale. For instance, a very popular forum will be cached, which is very important. In this case I think it is acceptable if a new post does not appear instantly. Of course, when inserting or updating data in the database, the primary keys and other important data should be fetched from the database and not the cache, which supposes a bit of application logic (for instance, in a forum, the display page should query the cache, but the post message page should query the database directly). Memcache can also save the database from update-heavy tasks like user session management. In that case sessions can be stored entirely in memory. ON COMMIT triggers would be very useful.
> More dangerous is that this discrepancy will NOT always be apparent, because without active verification of the correctness of the cache, we will not know about any errors unless the error grows to an obvious point. The errors may cause material damage long before they become obvious. This is a common failure pattern with caches.
This is why it would be dangerous to fetch referential integrity data from the cache... this fits your banking example for instance.
Re: [PERFORM] Data type to use for primary key
> What is the common approach? Should I use directly the product_code as my ID, or use a sequential number for speed? (I did the same for the company_id; this is a 'serial' and not the short name of the customer.) I just don't know what is usually done.
Use a serial:
- you can change product_code for a product easily
- you can pass integers around more easily, in web forms for instance; you don't have to ask "should I escape this string?"
- it's faster
- it uses less space
- if one day you must manage products from another source whose product_codes overlap yours, you won't have problems
- you can generate them uniquely and easily with a sequence
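A sketch of the suggested layout (column and table names are hypothetical):

```sql
CREATE TABLE product (
    product_id   serial PRIMARY KEY,    -- internal, immutable surrogate key
    product_code text UNIQUE NOT NULL,  -- external code, free to change later
    name         text NOT NULL
);

-- other tables reference the serial, not the code:
CREATE TABLE order_line (
    product_id integer NOT NULL REFERENCES product (product_id),
    quantity   integer NOT NULL
);
```

With this layout, renaming a product_code is a single-row UPDATE; nothing referencing product_id has to change.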
Re: [PERFORM] scalability issues on win32
> Test platform: Pentium 4 3.06 GHz/HT, 10k SATA Raptor, 1Gb memory, Windows XP Pro SP2/Redhat Fedora 3 (64 bit results coming soon)
Could you please add information about...
- filesystems?
- windows configured as network server or as desktop box?
- virtual memory? In my experience you MUST deactivate virtual memory on a Windows box to avoid catastrophic competition between virtual memory and disk cache
- respective pgsql configurations (buffers...) identical?
- explain analyze for the two, identical?
- client on same machine or via network (100Mb? 1G?)
- size of the data set involved in the query
- first query time after boot (with nothing in the cache), and times for the next disk-cached runs?
- are the N users doing the same query or exercising different parts of the dataset?
You don't do any writes in your test, do you? Just big SELECTs?
Re: [PERFORM] Index usage for sorted query
Instead of:

WHERE cd='ca' ORDER BY l_postcode;

write:

WHERE cd='ca' ORDER BY cd, l_postcode;

You have a multicolumn index, so you should specify a multicolumn sort exactly the same as your index, and the planner will get it.
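For instance (the table and index names here are made up), with an index like:

```sql
CREATE INDEX tab_cd_postcode_idx ON tab (cd, l_postcode);

-- this ORDER BY lists the columns in the same order as the index, so the
-- planner can satisfy both the filter and the sort with one index scan:
SELECT * FROM tab WHERE cd = 'ca' ORDER BY cd, l_postcode;
```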
Re: [PERFORM] vacuum analyze slows sql query
> Let's say for a second that you manage to trick it into using an index scan, and then you actually call the function with one of the values that returns 1,000s of rows. Probably it will take 10-100 times longer than if it used a seq scan.
> I don't know if it matters (I suspect that it does) but I am using LIMIT 1 in the sub-query/stored function. All I need is one single row meeting any of the criteria laid out in the stored procedure to establish an offer_id is pending.
So, in your case, with the LIMIT the index scan will always be fast, and the seq scan will be catastrophic, because you don't need to retrieve all the rows, just one. (IMHO the planner screws up these LIMIT clauses because it expects the data to be randomly distributed in the first page while in real life it's not.) You could use EXISTS to test the existence of a subquery (after all, that's its purpose), or you could spray in a little ORDER BY: when

SELECT ... FROM table WHERE stuff=value LIMIT 1

obstinately uses a seq scan, write

SELECT ... FROM table WHERE stuff=value ORDER BY stuff LIMIT 1

and the ORDER BY will make the planner think "I could use the index to order"...
Re: [PERFORM] Restricting Postgres
Myself, I like a small Apache with few modules serving static files (no dynamic content, no db connections), and with a mod_proxy on a special path directed to another Apache which generates the dynamic pages (few processes, persistent connections...). You get the best of both: static files do not hog DB connections, and the second Apache sends generated pages very fast to the first, which then trickles them down to the clients.
> Case in point: A first time visitor hits your home page. A dynamic page is generated (in about 1 second) and served (taking 2 more seconds) which contains links to 20 additional
The gain from an accelerator is actually even more than that, as it takes essentially zero seconds for Apache to return the generated content (which in the case of a message board could be quite large) to Squid, which can then feed it slowly to the user, leaving Apache free again to generate another page. When serving dialup users large dynamic pages this can be a _huge_ gain. I think Martin's pages (dimly recalling another thread) take a pretty long time to generate though, so he may not see quite such a significant gain.
Re: [PERFORM] Restricting Postgres
On Thu, 4 Nov 2004 18:20:18 -, Matt Clark [EMAIL PROTECTED] wrote:
> Correct, the 75% of all hits are on a script that can take anywhere from a few seconds to half an hour to complete. The script essentially auto-flushes to the browser so they get new information as it arrives, creating the illusion of on-demand generation.
Er, do you mean that:
1- You have a query that runs for half an hour and you spoon feed the results to the client? (argh)
2- Your script looks for new data every few seconds, sends a packet, then sleeps, and loops?
If it's 2, I have a ready-made solution for you, just ask.
Re: [PERFORM] Restricting Postgres
> I'm guessing (2) - PG doesn't give the results of a query in a stream.
In 1- I was thinking about a cursor... but I think his problem is more like 2-. In that case one can either code a special purpose server or use the following hack: in your webpage, include an iframe with a JavaScript to refresh it every five seconds. The iframe fetches a page from the server which brings in the new data in the form of generated JavaScript which writes in the parent window. Thus, you get a very short request every 5 seconds to fetch new data, and it is displayed in the client's window very naturally.
I've used this technique for another application and find it very cool. It's for selection lists: often you'll see a list of things to be checked or not, which makes a big form that people forget to submit. Thus I've replaced the checkboxes with clickable zones which trigger the loading of a page in a hidden iframe, which makes the appropriate modifications in the database and updates the HTML in the parent page, changing texts here and there... it feels a bit like it's not a webpage but rather a standard GUI. Very neat. Changes are recorded without needing a submit button... I should write a framework to make that easy to do. I did not use a frame because frames suck, but iframes are convenient. Yeah, it does not work with Lynx... it needs JavaScript... but it works well.
Re: [PERFORM] Restricting Postgres
> check this marvelous piece of 5 minutes of work: http://boutiquenumerique.com/test/iframe_feed.html
> Yup. If you go the JS route then you can do even better by using JS to load data into JS objects in the background and manipulate the page content directly, no need for even an iframe. Ignore the dullards who have JS turned off - it's essential for modern web apps, and refusing JS conflicts absolutely with proper semantic markup. http://developer.apple.com/internet/webcontent/xmlhttpreq.html is a good starting point.
Didn't know this existed! Very, very cool. I have to check this out more in depth. A note though: you'll have to turn off HTTP persistent connections in your server (not in your proxy) or you're back to square one.
> It's clear that this discussion has moved way away from PG! Although in the context of DB backed web apps I guess it remains a bit on-topic...
I find it very on-topic as:
- it's a way to help this guy solve his pg problem, which was in fact a design problem
- it's the future of database driven web apps (no more reloading the whole page!)
I think in the future there will be a good bit of presentation logic in the client...
Re: [PERFORM] preloading indexes
Uh, you can always load a table in cache by doing a seq scan on it... like "select count(1) from table" or something... this doesn't work for indexes of course, but you can always look in the system catalogs, find the filename for the index, then just open() it from an external program and read it without caring for the data... it'll save you the seeks in the index... of course you'll have problems with file permissions etc., not to mention security, locking, etc., etc., etc. Is that worth the trouble?
On Wed, 3 Nov 2004 14:35:28 -0500, Andrew Sullivan [EMAIL PROTECTED] wrote:
> On Wed, Nov 03, 2004 at 12:12:43PM -0700, [EMAIL PROTECTED] wrote:
>> That's correct - I'd like to be able to keep particular indexes in RAM available all the time
> If these are queries that run frequently, then the relevant cache will probably remain populated[1]. If they _don't_ run frequently, why do you want to force the memory to be used to optimise something that is uncommon? But in any case, there's no mechanism to do this. A
> [1] there are in fact limits on the caching: if your data set is larger than memory, for instance, there's no way it will all stay cached. Also, VACUUM does nasty things to the cache. It is hoped that nastiness is fixed in 8.0.
Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?
Reiser4?
On Thu, 21 Oct 2004 08:58:01 +0100, Matt Clark [EMAIL PROTECTED] wrote:
> I suppose I'm just idly wondering really. Clearly it's against PG philosophy to build an FS or direct IO management into PG, but now it's so relatively easy to plug filesystems into the main open-source OSes, it struck me that there might be some useful changes to, say, XFS or ext3, that could be made that would help PG out. I'm thinking along the lines of an FS that's aware of PG's strategies and requirements and therefore optimised to make those activities as efficient as possible - possibly even being aware of PG's disk layout and treating files differently on that basis. Not being an FS guru I'm not really clear on whether this would help much (enough to be worth it anyway) or not - any thoughts? And if there were useful gains to be had, would it need a whole new FS or could an existing one be modified? So there might be (as I said, I'm not an FS guru...):
> * great append performance for the WAL?
> * optimised scattered writes for checkpointing?
> * knowledge that FSYNC is being used for preserving ordering a lot of the time, rather than requiring actual writes to disk (so long as the writes eventually happen in order...)?
> Matt Clark, Ymogen Ltd, P: 0845 130 4531, W: https://ymogen.net/, M: 0774 870 1584
Re: [PERFORM] execute cursor fetch
I just discovered this: http://www.postgresql.org/docs/7.4/static/jdbc-query.html#AEN24298
On Tue, 12 Oct 2004 04:43:43 -0700 (PDT), my ho [EMAIL PROTECTED] wrote:
> Hi, if anyone can help pls, I have a question about the execution of cursor create/fetch/move, in particular about disk cost. When a cursor is created, is the whole table (with the required columns) put into memory? Otherwise how does it work (in terms of disk reads and transfers)? After the user issues a move/fetch command, how does postgres speed up the query compared to a normal selection? Thanks a lot, regards, MT Ho
Re: [PERFORM] sequential scan on select distinct
> The really tricky part is that a DISTINCT ON needs to know about a first() aggregate. And to make optimal use of indexes, a last() aggregate as well. And ideally the planner/executor needs to know something is magic about first()/last() (and potentially min()/max() at some point) and that they don't need the complete set of tuples to calculate their results.
I'm going to be accused of hand-waving again, but please pardon me, I'm enthusiastic, and I like to propose new ideas; you can kick me if you don't like them or if I put out too much uninformed bull!

Idea: the aggregate accumulation function could have a way to say "stop! I've had enough of these values! Get on with the next item in the GROUP BY clause!" I don't know how, or if, the planner could use this (guess: no) or the index scan could use this (guess: no), but it would at least save the function calls. I'd guess this idea is quite useless.

Aggregates could have an additional attribute saying how many values they will need ('max_rows' maybe). This would prevent the creation of magic aggregates for max() (which is a kind of special-casing) and keep it generic (so users can create magic aggregates like this). Aggregates already consist of a bunch of functions (start, accumulate, return result), so this could be just another element in this set. This information would be known ahead of time and could influence the query plans too. I'm going to wave my hand and say "not too much planning cost", because I guess the aggregate details are fetched during planning, so fetching one more attribute would not take that long... For instance first() would have max_rows=1, and users could code a "first N" accumulator-in-array which would have max_rows=N... This does not solve the problem of min() and max(), which need max_rows=1 only if the result is sorted... hum... maybe another attribute like max_rows_sorted = 1 for max() and -1 for min(), meaning "first 1" or "last 1" (or "first N" or "last N")...

According to the ORDER BY clause it would be known that the "first N" of an "order by ... asc" is the same as the "last N" of an "order by ... desc"???
Re: [PERFORM] integer[] indexing.
Disclaimer: brainless proposition.

(SELECT * FROM table WHERE icount(ids) = 1 AND ids[1] = 33)
UNION ALL
(SELECT * FROM table WHERE icount(ids) > 1 AND ids @> '{33}');
Re: [PERFORM] sequential scan on select distinct
> Hashing is at least as fast, if not faster. regards, tom lane
Probably quite a bit faster if the dataset is not huge... UniqueSort would be useful for GROUP BY x ORDER BY x though.
Re: [PERFORM] sequential scan on select distinct
> I don't really think it would be a useful plan anyway. What *would* be useful is to support HashAggregate as an implementation alternative for DISTINCT --- currently I believe we only consider that for GROUP BY. The DISTINCT planning code is fairly old and crufty and hasn't been redesigned lately. regards, tom lane
I see this as a minor annoyance only, because I can write GROUP BY instead of DISTINCT and get the speed boost. It probably annoys people trying to port applications to postgres though, forcing them to rewrite their queries.

* SELECT DISTINCT: 21442.296 ms (by default, uses an index scan); disabling index scan => Sort + Unique: 14512.105 ms
* GROUP BY: 1793.651 ms using HashAggregate
* skip index scan by function: 13.833 ms

The HashAggregate speed boost is good, but rather pathetic compared to a skip index scan; it's still worth having if updating the DISTINCT code is easy. Note that it would also benefit UNION queries, which apparently use DISTINCT internally and currently produce this:

explain analyze select number from ((select number from dummy) union (select number from dummy)) as foo;

 Subquery Scan foo  (cost=287087.62..317087.62 rows=200 width=4) (actual time=33068.776..35575.330 rows=255 loops=1)
   ->  Unique  (cost=287087.62..297087.62 rows=200 width=4) (actual time=33068.763..35574.126 rows=255 loops=1)
         ->  Sort  (cost=287087.62..292087.62 rows=2000000 width=4) (actual time=33068.757..34639.180 rows=2000000 loops=1)
               Sort Key: number
               ->  Append  (cost=0.00..49804.00 rows=2000000 width=4) (actual time=0.055..7412.551 rows=2000000 loops=1)
                     ->  Subquery Scan "*SELECT* 1"  (cost=0.00..24902.00 rows=1000000 width=4) (actual time=0.054..3104.165 rows=1000000 loops=1)
                           ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4) (actual time=0.051..1792.348 rows=1000000 loops=1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=0.00..24902.00 rows=1000000 width=4) (actual time=0.048..3034.462 rows=1000000 loops=1)
                           ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4) (actual time=0.044..1718.682 rows=1000000 loops=1)
 Total runtime: 36265.662 ms

But it could instead do this:

explain analyze select number from ((select number from dummy) union all (select number from dummy)) as foo group by number;

 HashAggregate  (cost=74804.00..74804.00 rows=200 width=4) (actual time=10753.648..10753.890 rows=255 loops=1)
   ->  Subquery Scan foo  (cost=0.00..69804.00 rows=2000000 width=4) (actual time=0.059..8992.084 rows=2000000 loops=1)
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4) (actual time=0.055..6688.639 rows=2000000 loops=1)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..24902.00 rows=1000000 width=4) (actual time=0.054..2749.708 rows=1000000 loops=1)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4) (actual time=0.052..1640.427 rows=1000000 loops=1)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..24902.00 rows=1000000 width=4) (actual time=0.038..2751.916 rows=1000000 loops=1)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4) (actual time=0.034..1637.818 rows=1000000 loops=1)
 Total runtime: 10754.120 ms

A 3x speedup, but still a good thing to have. When I LIMIT the two subqueries to 100k rows instead of a million, the times are about equal. When I LIMIT one of the subqueries to 100k and leave the other at 1M:

UNION ALL: 17949.609 ms
UNION + GROUP BY: 6130.417 ms

Still some performance to be gained... Of course it can't use a skip index scan on a subquery, but I could instead: I know it's pretty stupid to use the same table twice, but it's just an example. However, if you think about table partitions and views, a "select distinct number" from a view having multiple partitions would yield this type of query, and table partitioning seems like a hot subject lately.

Let's create a dummy example view:

create view dummy_view as (select * from dummy) union all (select * from dummy);

explain analyze select number from dummy_view group by number;

 HashAggregate  (cost=74804.00..74804.00 rows=200 width=4) (actual time=10206.456..10206.713 rows=255 loops=1)
   ->  Subquery Scan dummy_view  (cost=0.00..69804.00 rows=2000000 width=4) (actual time=0.060..8431.776 rows=2000000 loops=1)
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=8) (actual time=0.055..6122.125 rows=2000000 loops=1)
Re: [PERFORM] sequential scan on select distinct
You could try: explain analyze select land from customer_dim group by land; It will be a lot faster, but I can't make it use the index on my machine... Example:

create table dummy as (select id, id%255 as number from a large table with 1M rows);

So we have a table with 256 (0-255) distinct number values.

explain analyze select distinct number from dummy;

 Unique  (cost=69.83..74.83 rows=200 width=4) (actual time=13160.490..14414.004 rows=255 loops=1)
   ->  Sort  (cost=69.83..72.33 rows=1000 width=4) (actual time=13160.483..13955.792 rows=1000000 loops=1)
         Sort Key: number
         ->  Seq Scan on dummy  (cost=0.00..20.00 rows=1000 width=4) (actual time=0.052..1759.145 rows=1000000 loops=1)
 Total runtime: 14442.872 ms

=> Horribly slow, because it has to sort 1M rows for the Unique.

explain analyze select number from dummy group by number;

 HashAggregate  (cost=22.50..22.50 rows=200 width=4) (actual time=1875.214..1875.459 rows=255 loops=1)
   ->  Seq Scan on dummy  (cost=0.00..20.00 rows=1000 width=4) (actual time=0.107..1021.014 rows=1000000 loops=1)
 Total runtime: 1875.646 ms

=> A lot faster, because it HashAggregates instead of sorting (but still seq scans).

Now: create index dummy_idx on dummy(number); Let's try again.

explain analyze select distinct number from dummy;

 Unique  (cost=0.00..35301.00 rows=200 width=4) (actual time=0.165..21781.732 rows=255 loops=1)
   ->  Index Scan using dummy_idx on dummy  (cost=0.00..32801.00 rows=1000000 width=4) (actual time=0.162..21154.752 rows=1000000 loops=1)
 Total runtime: 21782.270 ms

=> Index scans the whole table. Argh. I should have ANALYZEd.

explain analyze select number from dummy group by number;

 HashAggregate  (cost=17402.00..17402.00 rows=200 width=4) (actual time=1788.425..1788.668 rows=255 loops=1)
   ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4) (actual time=0.048..960.063 rows=1000000 loops=1)
 Total runtime: 1788.855 ms

=> Still the same...
Let's make a function: the function starts at the lowest number and advances to the next number in the index until they are all exhausted.

CREATE OR REPLACE FUNCTION sel_distinct()
    RETURNS SETOF INTEGER
    LANGUAGE plpgsql
AS '
DECLARE
    pos INTEGER;
BEGIN
    SELECT INTO pos number FROM dummy ORDER BY number ASC LIMIT 1;
    IF NOT FOUND THEN
        RAISE NOTICE ''no records.'';
        RETURN;
    END IF;
    LOOP
        RETURN NEXT pos;
        SELECT INTO pos number FROM dummy WHERE number > pos ORDER BY number ASC LIMIT 1;
        IF NOT FOUND THEN
            RETURN;
        END IF;
    END LOOP;
END;
';

explain analyze select * from sel_distinct();

 Function Scan on sel_distinct  (cost=0.00..12.50 rows=1000 width=4) (actual time=215.472..215.696 rows=255 loops=1)
 Total runtime: 215.839 ms

That's better! Why not use DESC instead of ASC?

CREATE OR REPLACE FUNCTION sel_distinct()
    RETURNS SETOF INTEGER
    LANGUAGE plpgsql
AS '
DECLARE
    pos INTEGER;
BEGIN
    SELECT INTO pos number FROM dummy ORDER BY number DESC LIMIT 1;
    IF NOT FOUND THEN
        RAISE NOTICE ''no records.'';
        RETURN;
    END IF;
    LOOP
        RETURN NEXT pos;
        SELECT INTO pos number FROM dummy WHERE number < pos ORDER BY number DESC LIMIT 1;
        IF NOT FOUND THEN
            RETURN;
        END IF;
    END LOOP;
END;
';

explain analyze select * from sel_distinct();

 Function Scan on sel_distinct  (cost=0.00..12.50 rows=1000 width=4) (actual time=13.500..13.713 rows=255 loops=1)
 Total runtime: 13.857 ms

Hum hum! Again, a lot better! Index scan backwards seems a lot faster than index scan forwards. Why, I don't know, but here you go, from 15 seconds to 14 milliseconds... I don't know WHY (oh why) postgres does not use this kind of strategy when distinct'ing an indexed field... Anybody got an idea?
Re: [PERFORM] sequential scan on select distinct
There are even three questions here :

- given that 'SELECT DISTINCT field FROM table' is exactly the same as 'SELECT field FROM table GROUP BY field', postgres could transform the first into the second and avoid itself a (potentially killer) sort. On my example the table was not too large but on a very large table, sorting all the values and then distinct'ing them does not look too appealing.

Currently Postgres does Sort+Unique, but there could be a DistinctSort instead of a Sort, that is a thing that sorts and removes the duplicates at the same time. Not much more complicated to code than a sort, and much faster in this case. Or there could be a DistinctHash, which would be similar or rather identical to a HashAggregate and would again skip the sort. It would (as a bonus) speed up queries like UNION (not ALL), that kind of thing. For example :

explain (select number from dummy) union (select number from dummy);
 Unique  (cost=287087.62..297087.62 rows=200 width=4)
   ->  Sort  (cost=287087.62..292087.62 rows=2000000 width=4)
         Sort Key: number
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)

This is scary ! I can rewrite it as such (and the planner could, too) :

explain select * from ((select number from dummy) union all (select number from dummy)) as foo group by number;
 HashAggregate  (cost=74804.00..74804.00 rows=200 width=4)
   ->  Subquery Scan foo  (cost=0.00..69804.00 rows=2000000 width=4)
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)

which avoids a large sort...
However there must be cases in which performing a sort is faster, like when there are a lot of distinct values and the HashAggregate becomes huge too.

> Well there are two questions here. Why given the current plans available does postgres choose a sequential scan instead of an index scan.

Well, because it needs to get all the rows in the table in order. In this case seq scan+sort is about twice as fast as index scan. Interestingly, once I ANALYZEd the table, postgres chooses to index-scan, which is slower.

> And why isn't there this kind of skip index scan available.

It would be really nice to have a skip index scan available. I have another idea, let's call it the indexed sequential scan : when pg knows there are a lot of rows to access, it will ignore the index and seqscan. This is because index access is very random, thus slow. However postgres could implement an indexed sequential scan where :
- the page numbers for the matching rows are looked up in the index (this is fast as an index has good locality)
- the page numbers are grouped so we have a list of pages with one and only one instance of each page number
- the list is then sorted so we have page numbers in-order
- the pages are loaded in sorted order (doing a kind of partial sequential scan) which would be faster than reading them randomly.
Other ideas later...

> Postgres chooses a sequential scan with a sort (or hash aggregate) over an index scan because it expects it to be faster. Sequential scans are much faster than random access scans of indexes, plus index scans need to read many more blocks. If you're finding the index scan to be just as fast as sequential scans you might consider lowering random_page_cost closer to 1.0. But note that you may be getting fooled by a testing methodology where more things are cached than would be in production.

> Why isn't a skip index scan plan available? Well, nobody's written the code yet.
It would be part of the same code needed to get an index scan used for:

  select y, min(x) from bar group by y

And possibly also related to the TODO item:

  Use index to restrict rows returned by multi-key index when used with non-consecutive keys to reduce heap accesses. For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3 = 9, spin through the index checking for col1 and col3 matches, rather than just col1.

Note that the optimizer would have to make a judgement call based on the expected number of distinct values. If you had much more than 256 distinct values then your plpgsql function wouldn't have performed well at all. ---(end of broadcast)---
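The "advance past the current value" loop that the plpgsql function above implements can be sketched outside the database. A minimal Python illustration (pure lists standing in for the btree; `bisect_right` plays the role of the `WHERE number > pos ... LIMIT 1` probe):

```python
import bisect

def skip_distinct(sorted_values):
    """Emulate a 'skip index scan': after emitting a value, jump
    directly past all its duplicates instead of reading every entry."""
    out = []
    i = 0
    n = len(sorted_values)
    while i < n:
        v = sorted_values[i]
        out.append(v)
        # jump to the first position strictly greater than v
        i = bisect.bisect_right(sorted_values, v, i)
    return out

# Analogue of the 'dummy' table: id % 255 gives values 0..254
data = sorted(i % 255 for i in range(10000))
assert skip_distinct(data) == list(range(255))
```

With many duplicates per value, the number of probes is proportional to the number of distinct values rather than the number of rows, which is exactly why the function above beats the full sort.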
Re: [PERFORM] Caching of Queries
pgpool (which makes some rather questionable claims IMO); any decent web application language/environment will support connection pooling.

That's why it should not be tied to something specific like pgpool. If you want performance, which is the case here, usually you have a webserver serving static files, and an application server serving dynamic pages. This is not necessarily a huge application server ; it can be as simple as an Apache instance serving static files, with a special path mod_proxy'ed to another instance of Apache acting as an application server. IMHO this is a nice way to do it, because you have a lightweight static file server which can spawn many processes without using precious resources like memory and postgres connections, and a specialized server which has far fewer processes, each one larger, holding a db connection, etc. The connections are permanent, of course, so there is no connection overhead. The proxy has an extra advantage : it buffers the data from the app server and sends it back slowly to the client, so the app server can very quickly process the next request instead of hogging a db connection while the html is slowly trickled back to the client. IMHO the standard PHP way of doing things (just one server) is wrong because every server process, even if it's serving static files, hogs a connection and thus needs an extra layer for pooling. Thus, I see query result caching as a way to push further architectures which are already optimized for performance, not as a band-aid for poor design solutions like the one-Apache-server-with-pooling setup.
Now, a proposition : Here is where we are now, a typical slow query :

PREPARE myquery(text,integer)
EXECUTE myquery('john',2)

My proposition :

PREPARE myquery(text,integer)
    PLANNED USING ('john',2)
    CACHED IF $1 IS NOT NULL AND $2 IS NOT NULL
    DEPENDS ON $1, $2
    MAXIMUM CACHE TIME '5 minute'::interval
    MINIMUM CACHE TIME '1 minute'::interval
    MAXIMUM CACHE SIZE 200
    AS SELECT count(*) as number FROM mytable WHERE myname=$1 AND myfield=$2;
EXECUTE myquery('john',2)

Explanations :

--- PLANNED USING ('john',2)
Tells the planner to compute the stored query plan using the given parameters. This is independent from caching but could be a nice feature as it would avoid the possibility of storing a bad query plan.

--- CACHED IF $1 IS NOT NULL AND $2 IS NOT NULL
Specifies that the result is to be cached. There is an optional condition (here, IF ...) telling postgres when it should cache, or not cache. It could be useful to avoid wasting cache space.

--- DEPENDS ON $1, $2
Defines the cache key. I don't know if this is useful, as the query parameters make a pretty obvious cache key so why repeat them. It could be used to add other data as a cache key, like :
DEPENDS ON (SELECT somefunction($1))
Also a syntax for specifying which tables should be watched for updates, and which should be ignored, could be interesting.

--- MAXIMUM CACHE TIME '5 minute'::interval
Pretty obvious.

--- MINIMUM CACHE TIME '1 minute'::interval
This query is a count and I want a fast but imprecise count. Thus, I specify a minimum cache time of 1 minute, meaning that the result will stay in the cache even if the tables change. This is dangerous, so I'd suggest the following :

MINIMUM CACHE TIME CASE WHEN result.number > 10 THEN '1 minute'::interval ELSE '5 second'::interval END

Thus the cache time is an expression ; it is evaluated after performing the query. There needs to be a way to access the 'count' result, which I called 'result.number' because of the SELECT count(*) as number.
The result could also be used in the CACHE IF. The idea here is that the count will vary over time, but we accept some imprecision to gain speed. Who cares if there are 225 or 227 messages in a forum thread counter anyway ? However, if there are 2 messages, first, caching the query is less necessary because it's fast, and second, a variation in the count will be much easier to spot ; thus we specify a shorter cache duration for small counts and a longer duration for large counts. For queries returning result sets, this is not usable of course, but a special feature for speeding up count() queries would be welcome !

--- MAXIMUM CACHE SIZE 200
Pretty obvious. Size in bytes.

For queries returning several rows, MIN/MAX on result rows could be useful also :
MAXIMUM RESULT ROWS nnn
Or maybe :
CACHE IF (select count(*) from result) < nnn

Thinking about it, using prepared queries seems a bad idea ; maybe the cache should act on the result of functions. This would force the application programmers to put the queries they want to optimize in
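The MIN/MAX cache-time semantics proposed above can be modeled in a few lines of Python. This is only a toy model of the (hypothetical, not implemented) feature: results are keyed by query parameters, kept at least `min_ttl` seconds even if a table update invalidates them, and at most `max_ttl` seconds. All names here are invented for illustration:

```python
import time

class QueryCache:
    """Toy model of the proposal: MINIMUM/MAXIMUM CACHE TIME with
    invalidation on table updates.  Not a real PostgreSQL feature."""
    def __init__(self, min_ttl=60.0, max_ttl=300.0):
        self.min_ttl, self.max_ttl = min_ttl, max_ttl
        self.store = {}       # params -> (result, stored_at)
        self.dirty = set()    # params invalidated by a table update

    def get(self, params, run_query, now=None):
        now = time.time() if now is None else now
        hit = self.store.get(params)
        if hit is not None:
            result, stored_at = hit
            age = now - stored_at
            # stale only if past max_ttl, or dirty AND past min_ttl
            fresh = age < self.max_ttl and (
                params not in self.dirty or age < self.min_ttl)
            if fresh:
                return result
        result = run_query(*params)
        self.store[params] = (result, now)
        self.dirty.discard(params)
        return result

    def invalidate(self, params):
        """Called when a watched table changes."""
        self.dirty.add(params)
```

Note how a write to the table (`invalidate`) does not immediately evict the entry: within MINIMUM CACHE TIME the slightly stale count is still served, which is exactly the imprecision-for-speed trade described above.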
Re: [PERFORM] Caching of Queries
1) The materialized data is available in 3 different forms; a list, a detail view, and a spreadsheet. Each form has somewhat different columns and different rules about ordering, which would likely confuse an SQC planner. In this implementation, all 3 forms are able to share the same cache.

See my proposal to cache function results. You can create a cached function and :

SELECT your rows FROM cached_function(parameters) WHERE ... ORDER BY ... GROUP BY ...

will only fetch the function result from the cache, and then the only additional costs are the ORDER and GROUP BY... the query parsing is very simple, it's just a select, and a cached function scan. I think caching can be made much more powerful if it is made usable like this. I mean, not only cache a query and its result, but being able to use cached queries internally like this and manipulating them adds value to the cached data and allows storing less data in the cache because duplicates are avoided. Thus we could use cached results in CHECK() conditions, inside plsql functions, anywhere... ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Caching of Queries
If it was in pgpool or something similar, I could devote a separate machine just for caching results leaving the db server untouched. BUT you would be limited to caching complete queries. There is a more efficient strategy... ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] index not used when using function
Maybe add an ORDER BY artist to force a GroupAggregate ?

> Hi all, a small question: I've got this table songs and an index on column artist. Since there's about one distinct artist for every 10 rows, it would be nice if it could use this index when counting artists. It doesn't however:
>
> lyrics=> EXPLAIN ANALYZE SELECT count(DISTINCT artist) FROM songs;
>  Aggregate  (cost=31961.26..31961.26 rows=1 width=14) (actual time=808.863..808.864 rows=1 loops=1)
>    ->  Seq Scan on songs  (cost=0.00..31950.41 rows=4341 width=14) (actual time=26.801..607.172 rows=25207 loops=1)
>  Total runtime: 809.106 ms
>
> Even with enable_seqscan set to off, it just can't seem to use the index. The same query without the count() works just fine:
>
> lyrics=> EXPLAIN ANALYZE SELECT DISTINCT artist FROM songs;
>  Unique  (cost=0.00..10814.96 rows=828 width=14) (actual time=0.029..132.903 rows=3280 loops=1)
>    ->  Index Scan using songs_artist_key on songs  (cost=0.00..10804.11 rows=4341 width=14) (actual time=0.027..103.448 rows=25207 loops=1)
>  Total runtime: 135.697 ms
>
> Of course I can just take the number of rows from the latter query, but I'm still wondering why it can't use indexes with functions. Thanks

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables
Performance hint :

For static data, do not normalize too much. For instance if you have a row which can be linked to several other rows, you can do this :

create table parents (
    id serial primary key,
    values...
);

create table children (
    id serial primary key,
    parent_id integer references parents(id),
    slave_value integer
);

Or you can do this, using an array :

create table everything (
    id serial primary key,
    children_values integer[],
    values...
);

Pros : No joins. Getting the list of children_values from table everything is just a select. On an application with several million rows, a query lasting 150 ms with a join takes 30 ms with an array. You can build the arrays from normalized tables by using an aggregate function. You can index the array elements with a GiST index...

Cons : No joins, thus your queries are a little bit limited ; problems if the array is too long ; ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
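The "build the arrays from normalized tables" step amounts to grouping child values under their parent, the same shape an SQL aggregate over the children table would produce. A small Python sketch (invented sample data, dicts standing in for the two table layouts):

```python
from itertools import groupby
from operator import itemgetter

# Normalized form: one (parent_id, slave_value) row per child,
# as in the parents/children tables above.
children = [(1, 10), (1, 11), (2, 20), (3, 30), (3, 31), (3, 32)]

# Denormalized form: one row per parent carrying an array of values,
# like the children_values integer[] column of table 'everything'.
everything = {
    parent: [v for _, v in rows]
    for parent, rows in groupby(sorted(children), key=itemgetter(0))
}

assert everything == {1: [10, 11], 2: [20], 3: [30, 31, 32]}
```

Reading all of a parent's values then becomes a single lookup instead of a join, which is the whole point of the trick.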
Re: [PERFORM] Large # of rows in query extremely slow, not using
> I have a table with ~8 million rows and I am executing a query which should return about ~800,000 rows. The problem is that as soon as I execute the query it absolutely kills my machine and begins swapping for 5 or 6 minutes before it begins returning results. Is postgres trying to load the whole query into memory before returning anything? Also, why would it choose not to use the index? It is properly estimating the # of rows returned. If I set enable_seqscan to off it is just as slow.

1) EXPLAIN ANALYZE. Note the time it takes. It should not swap, just read data from the disk (and not kill the machine).

2) Run the query in your software. Note the time it takes. Watch RAM usage. If it's vastly longer and you're swimming in virtual memory, postgres is not the culprit... rather use a cursor to fetch the huge resultset bit by bit.

Tell us what you find ? Regards. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
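The "fetch a huge resultset bit by bit" advice can be sketched without a live database: the pattern is simply to drain the rows chunk by chunk instead of materializing them all at once. A minimal Python illustration, where `make_cursor` is an invented stand-in for `fetchmany()` on a server-side (DECLARE'd) cursor:

```python
def fetch_in_chunks(fetchmany, size=1000):
    """Drain a large result set chunk by chunk instead of loading it
    whole into RAM.  `fetchmany` is any callable returning up to
    `size` rows, and an empty list once the set is exhausted."""
    while True:
        rows = fetchmany(size)
        if not rows:
            return
        for row in rows:
            yield row

# Fake cursor over 2500 "rows", standing in for a server-side cursor.
def make_cursor(data):
    state = {"pos": 0}
    def fetchmany(n):
        chunk = data[state["pos"]:state["pos"] + n]
        state["pos"] += n
        return chunk
    return fetchmany

rows = list(fetch_in_chunks(make_cursor(list(range(2500))), size=1000))
assert rows == list(range(2500))   # all rows seen, at most 1000 in memory at once
```

With a real driver the `fetchmany` callable would come from a named cursor, so only one chunk of the 800,000 rows ever sits in client memory.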
Re: [PERFORM] Help with extracting large volumes of records across related tables
There's a very simple solution using cursors. As an example :

create table categories ( id serial primary key, name text );
create table items ( id serial primary key, cat_id integer references categories(id), name text );
create index items_cat_idx on items( cat_id );

insert stuff...

select * from categories;
 id |   name
----+----------
  1 | tools
  2 | supplies
  3 | food
(3 rows)

select * from items;
 id | cat_id |     name
----+--------+--------------
  1 |      1 | hammer
  2 |      1 | screwdriver
  3 |      2 | nails
  4 |      2 | screws
  5 |      1 | wrench
  6 |      2 | bolts
  7 |      2 | cement
  8 |      3 | beer
  9 |      3 | burgers
 10 |      3 | french fries
(10 rows)

Now (supposing you use Python) you use the extremely simple sample program below :

import psycopg
db = psycopg.connect("host=localhost dbname=rencontres user=rencontres password=.")

# Simple. Let's make some cursors.
cursor = db.cursor()
cursor.execute( "BEGIN;" )
cursor.execute( "declare cat_cursor no scroll cursor without hold for select * from categories order by id for read only;" )
cursor.execute( "declare items_cursor no scroll cursor without hold for select * from items order by cat_id for read only;" )

# set up some generators
def qcursor( cursor, psql_cursor_name ):
    while True:
        cursor.execute( "fetch 2 from %s;" % psql_cursor_name )
        if not cursor.rowcount:
            break
        # print "%s fetched %d rows." % (psql_cursor_name, cursor.rowcount)
        for row in cursor.dictfetchall():
            yield row
    print "%s exhausted." % psql_cursor_name

# use the generators
categories = qcursor( cursor, "cat_cursor" )
items = qcursor( cursor, "items_cursor" )

current_item = items.next()
for cat in categories:
    print "Category :", cat
    # if no items (or all items in this category are done) skip to next category
    if cat['id'] < current_item['cat_id']:
        continue
    # case of items without category (should not happen) : skip them
    while cat['id'] > current_item['cat_id']:
        current_item = items.next()
    while current_item['cat_id'] == cat['id']:
        print "\t", current_item
        current_item = items.next()

It produces the following output :

Category : {'id': 1, 'name': 'tools'}
        {'cat_id': 1, 'id': 1, 'name': 'hammer'}
        {'cat_id': 1, 'id': 2, 'name': 'screwdriver'}
        {'cat_id': 1, 'id': 5, 'name': 'wrench'}
Category : {'id': 2, 'name': 'supplies'}
        {'cat_id': 2, 'id': 3, 'name': 'nails'}
        {'cat_id': 2, 'id': 4, 'name': 'screws'}
        {'cat_id': 2, 'id': 6, 'name': 'bolts'}
        {'cat_id': 2, 'id': 7, 'name': 'cement'}
Category : {'id': 3, 'name': 'food'}
        {'cat_id': 3, 'id': 8, 'name': 'beer'}
        {'cat_id': 3, 'id': 9, 'name': 'burgers'}
        {'cat_id': 3, 'id': 10, 'name': 'french fries'}

This simple code, with fetch 1000 instead of fetch 2, dumps a database of several million rows, where each category generally contains 1 but often 2-4 items, at about 10,000 items/s. Satisfied ? ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Help with extracting large volumes of records across related tables
Thanks for the thanks ! Generally, when grouping stuff together, it is a good idea to have two sorted lists, and to scan them simultaneously. I have already used this solution several times outside of Postgres, and it worked very well (it was with Berkeley DB and there were 3 lists to scan in order). The fact that Python can very easily virtualize these lists using generators makes it possible to do it without consuming too much memory. Pierre-Frederic, Paul, Thanks for your fast response (especially for the python code and performance figure) - I'll chase this up as a solution - looks most promising! Cheers, Damien ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
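The two-sorted-lists technique described above can be sketched without a database at all; here is a minimal, self-contained Python version, with plain lists standing in for the two server-side cursors (sample data invented):

```python
def merge_children(categories, items):
    """One simultaneous pass over two pre-sorted streams, as in the
    cursor-based program above: categories sorted by id, items sorted
    by cat_id.  Returns {category: [item names]}."""
    items = iter(items)
    item = next(items, None)
    grouped = {}
    for cat in categories:
        grouped[cat] = []
        # items whose category is already behind us (orphans): skip
        while item is not None and item[0] < cat:
            item = next(items, None)
        # items belonging to the current category
        while item is not None and item[0] == cat:
            grouped[cat].append(item[1])
            item = next(items, None)
    return grouped

cats = [1, 2, 3]
rows = [(1, "hammer"), (1, "wrench"), (2, "nails"), (3, "beer")]
assert merge_children(cats, rows) == {1: ["hammer", "wrench"],
                                      2: ["nails"], 3: ["beer"]}
```

Each stream is consumed exactly once and only the current row of each is held in memory, which is what makes the approach scale to millions of rows.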
[PERFORM] Question on Byte Sizes
Hello,

* I need information on the size of pg ARRAY[]'s : I did not find any info in the docs on this. How many bytes does an array take on disk ? Is there a difference between an array of fixed-size elements like integers, and an array of variable-length elements like text ? Is there a pointer table ? Or are the elements packed together ? Is there any advantage in using a smallint[] over an integer[] regarding size ? Does a smallint[] with 2 elements really take 12 bytes ?

* On alignment : The docs say fields are aligned on 4-byte boundaries. Does this mean that several consecutive smallint fields will take 4 bytes each ? What about several consecutive char fields ? 4 bytes each too ? I ask this because I'll have a lot of columns with small values to store in a table, and would like it to be small and to fit in the cache.

Thanks for any info. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] The usual sequential scan, but with LIMIT !
OK, thanks a lot for your explanations. Knowing how the planner thinks makes it pretty logical. Thank you.

Now another question... I have a table of records representing forum posts with a primary key (id), a topic_id, a timestamp, and other fields which I won't detail. I want to split them into pages (like forums usually do), with N posts per page. In that case :

SELECT * FROM table WHERE topic_id=... ORDER BY post_timestamp asc LIMIT N OFFSET N*page;

Also it's almost the same to order by id rather than post_timestamp (id being a serial) :

SELECT * FROM table WHERE topic_id=... ORDER BY id asc LIMIT N OFFSET N*page;

This query runs slower and slower as the OFFSET grows, which is a problem because the most accessed page in a forum is the last one. So, for the last page, I tried :

SELECT * FROM table WHERE topic_id=... ORDER BY id desc LIMIT N;

But this does not use the index at all (seq scan + sort + limit). My solution is simple : build an index on (-id), or on (some date)-post_timestamp, then :

SELECT * FROM table WHERE topic_id=... ORDER BY (-id) asc LIMIT N;

Then the last page is the fastest one, but it always shows N posts. That's not a problem, so I guess I'll use that. I don't like forums which show 1 post on the last page because the number of posts modulo N is 1. I may store the number of posts in a forum (updated by a trigger) to avoid costly COUNT queries to count the pages, so I could use ORDER BY id for the first half of the pages, and ORDER BY (-id) for the rest, so it will always be fastest. I could even create a pages table to store the id of the first post on each page and then :

SELECT * FROM table WHERE topic_id=... AND id > id_of_first_post_in_page ORDER BY id asc LIMIT N;

then all pages would be equally fast. Or, I could cache the query results for all pages but the last one.

Finally, the question : having a multiple-field btree, it is not harder to scan it in desc order than in asc order. So why does not Postgres do it ?
Here is a btree example :

 topic_id | id
----------+----
        1 |  1
        1 | 10
        2 |  2
        2 |  5
        2 | 17
        3 |  4
        3 |  6

Suppose I SELECT WHERE topic_id=2 ORDER BY topic_id ASC, id ASC. Postgres simply finds the first row with topic_id=2 and goes from there.

Suppose I SELECT WHERE topic_id=2 ORDER BY topic_id ASC, id DESC. Postgres does a seq scan, but it could think a bit more and start at the first index node which has topic_id > 2 (simple to find in a btree), then go backwards in the index. This can be generalized to any combination of (asc, desc).

I made some more experiments, and saw Postgres does an 'Index Scan' when ORDER BY clauses are all ASC, and an 'Index Scan Backwards' when all ORDER BY are DESC. However, it does not handle a combination of ASC and DESC. What do you think of this ?

On Mon, 06 Sep 2004 12:40:41 -0400, Tom Lane [EMAIL PROTECTED] wrote:

> Pierre-Frédéric Caillaud [EMAIL PROTECTED] writes:
>> Now, if I LIMIT the query to 10 rows, the index should be used all the time, because it will always return few rows... well, it doesn't !
>
> Not at all. From the planner's point of view, the LIMIT is going to reduce the cost by about a factor of 10/1403, since the underlying plan step will only be run partway through. That's not going to change the decision about which underlying plan step is cheapest: 10/1403 of a cheaper plan is still always less than 10/1403 of a more expensive plan. Later, you note that LIMIT with ORDER BY does affect the plan choice --- that's because in that situation one plan alternative has a much higher startup cost than the other (namely the cost of a sort step). A small LIMIT can allow the fast-startup plan to be chosen even though it would be estimated to be the loser if run to completion.
>
> regards, tom lane

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
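The pages-table idea from the previous message (keep the id of the first post of each page, then filter with `id > ...` instead of OFFSET) can be illustrated in a few lines of Python. A toy model with a sorted list standing in for the index (sample ids invented):

```python
import bisect

# post ids of one topic, in ascending index order
ids = list(range(1, 1001))
N = 10

def page_by_offset(page):
    """LIMIT N OFFSET N*page : the scan must step over every earlier row,
    so cost grows with the page number."""
    return ids[N * page : N * page + N]

def page_by_keyset(last_seen_id):
    """WHERE id > last_seen ORDER BY id LIMIT N : jump straight to the
    page via the index, constant cost for any page."""
    start = bisect.bisect_right(ids, last_seen_id)
    return ids[start : start + N]

# both strategies return the same page
assert page_by_offset(3) == page_by_keyset(30) == list(range(31, 41))
```

In the simulation `bisect_right` is the analogue of the btree descent: it lands on the first id past the boundary without touching the rows before it, which is exactly why the last page stays fast.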
Re: [PERFORM] The usual sequential scan, but with LIMIT !
Yes, you're right as usual. I had not thought about playing with ORDER BY on a field which has only one value in the result set. If you write it as SELECT WHERE topic_id=2 ORDER BY topic_id DESC,id DESC. then an index on (topic_id, id) will work fine. The mixed ASC/DESC ordering is not compatible with the index. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] The usual sequential scan, but with LIMIT !
Hello, I have this table :

CREATE TABLE apparts (
    id          SERIAL  NOT NULL PRIMARY KEY,
    price       FLOAT   NOT NULL,
    surface     INTEGER NOT NULL,
    price_sq    FLOAT   NOT NULL,
    rooms       INTEGER NULL,
    vente       BOOL    NOT NULL,
    category    TEXT    NOT NULL,
    zipcode     INTEGER NOT NULL,
    departement INTEGER NOT NULL
) WITHOUT OIDS;

There is a BTREE index on 'departement'. The table fits in RAM. When I want to SELECT according to my indexed field, postgres chooses a sequential scan unless the number of rows to be returned is very, very small :

apparts=> explain analyze select * from apparts where departement=42;
 Seq Scan on apparts  (cost=0.00..853.12 rows=1403 width=47) (actual time=5.094..52.026 rows=1516 loops=1)
   Filter: (departement = 42)
 Total runtime: 52.634 ms

OK, it returns 1516 rows, so maybe the seq scan is right.

apparts=> SET enable_seqscan = 0;
apparts=> explain analyze select * from apparts where departement=42;
 Index Scan using apparts_dept on apparts  (cost=0.00..1514.59 rows=1403 width=47) (actual time=0.045..2.770 rows=1516 loops=1)
   Index Cond: (departement = 42)
 Total runtime: 3.404 ms

Um, 15 times faster... Index scan is chosen only when there are few rows. With other values for 'departement' where there are few rows, the index is used automatically. This is logical, even if I should adjust the page costs. I wish I could tell postgres this table will fit in RAM and be accessed often, so for this table the page seek cost should be very low. Everything is vacuum full analyze'd.

Now, if I LIMIT the query to 10 rows, the index should be used all the time, because it will always return few rows... well, it doesn't !
apparts=> SET enable_seqscan = 1;
apparts=> explain analyze select * from apparts where departement=42 LIMIT 10;
 Limit  (cost=0.00..6.08 rows=10 width=47) (actual time=5.003..5.023 rows=10 loops=1)
   ->  Seq Scan on apparts  (cost=0.00..853.12 rows=1403 width=47) (actual time=4.998..5.013 rows=10 loops=1)
         Filter: (departement = 42)
 Total runtime: 5.107 ms

Now, let's try :

apparts=> SET enable_seqscan = 0;
apparts=> explain analyze select * from apparts where departement=42 LIMIT 10;
 Limit  (cost=0.00..10.80 rows=10 width=47) (actual time=0.047..0.072 rows=10 loops=1)
   ->  Index Scan using apparts_dept on apparts  (cost=0.00..1514.59 rows=1403 width=47) (actual time=0.044..0.061 rows=10 loops=1)
         Index Cond: (departement = 42)
 Total runtime: 0.157 ms

So, by itself, Postgres will select a very bad query plan (32x slower) on a query which would be executed very fast using indexes. If I use OFFSET + LIMIT, it only gets worse because the seq scan has to scan more rows :

apparts=> SET enable_seqscan = 1;
apparts=> explain analyze select * from apparts where departement=42 LIMIT 10 OFFSET 85;
 Limit  (cost=51.69..57.77 rows=10 width=47) (actual time=10.224..10.246 rows=10 loops=1)
   ->  Seq Scan on apparts  (cost=0.00..853.12 rows=1403 width=47) (actual time=5.254..10.200 rows=95 loops=1)
         Filter: (departement = 42)
 Total runtime: 10.326 ms

apparts=> SET enable_seqscan = 1;
apparts=> explain analyze select * from apparts where departement=42 LIMIT 10 OFFSET 1000;
 Limit  (cost=608.07..614.15 rows=10 width=47) (actual time=43.993..44.047 rows=10 loops=1)
   ->  Seq Scan on apparts  (cost=0.00..853.12 rows=1403 width=47) (actual time=5.328..43.791 rows=1010 loops=1)
         Filter: (departement = 42)
 Total runtime: 44.128 ms

apparts=> SET enable_seqscan = 0;
apparts=> explain analyze select * from apparts where departement=42 LIMIT 10 OFFSET 1000;
 Limit  (cost=1079.54..1090.33 rows=10 width=47) (actual time=2.147..2.170 rows=10 loops=1)
   ->  Index Scan using apparts_dept on apparts  (cost=0.00..1514.59 rows=1403 width=47) (actual time=0.044..1.860 rows=1010 loops=1)
         Index Cond: (departement = 42)
 Total runtime: 2.259 ms

Why is it that way ? The planner should use the LIMIT values when planning the query, should it not ? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] fsync vs open_sync
Were you upset by my message ? I'll try to clarify.

> I understood from your email that you are a Windows hater

Well, no, not really. I use Windows every day and it has its strengths. I still don't think the average (non-geek) person can really use Linux as a desktop OS. The problem I have with Windows is that I think it could be made much faster, without too much effort (mainly some tweaking in the disk IO field), but Microsoft doesn't do it. Why ? I can't understand this.

> in Linux. You can write thousands of files in one second and the HDD is still idle... then when it decides to flush it all goes to disk in one burst. You can not trust your data in this.

That's why I mentioned that it did not relate to database-type performance. If the computer crashes while writing these files, some may be partially written, some not at all, some okay... the only certainty is about filesystem integrity. But it's exactly the same on all journaling filesystems (including NTFS). Thus, with equal reliability, the faster wins. Maybe, with Reiser4, we will see real filesystem transactions and maybe this will translate into higher postgres performance... I've had my computers shut down violently by power failures and no reiserfs problems so far. NTFS is very crash-proof too.

> My windows machine bluescreens twice a day and still no data loss ;)
> If you have the BSOD twice a day then you have a broken driver or broken HW. CPU overclocked ?

I think this machine has crap hardware. In fact this example was to emphasize the reliability of NTFS : it is indeed remarkable that no data loss occurs even on such a crap machine. I know Windows has got quite reliable now. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] fsync vs open_sync
> There is also the fact that NTFS is a very slow filesystem, and Linux is a lot better than Windows for everything disk, caching and IO related. Try to copy some files in NTFS and in ReiserFS...
> I'm not so sure I would agree with such a blanket generalization. I find NTFS to be very fast, my main complaint is fragmentation issues... I bet NTFS is better than ext3 at most things (I do agree with you about the cache, though).

Ok, you were right. I made some tests and NTFS is just not very good in the general case. I've seen some benchmarks for Reiser4 that are just amazing.

As a matter of fact I was again amazed today. I was looking into a way to cache database queries for a website (not yet) written in Python. The purpose was to cache long queries like those used to render forum pages (which is the typical slow query, selecting from a big table where records are rather random and LIMIT is used to cut the result in pages). I wanted to save a serialized (python pickled) representation of the data to disk to avoid reissuing the query every time. In the end it took about 1 ms to load or save the data for a page with 40 posts... then I wondered, how much does it take just to read or write the file ?

ReiserFS 3.6, Athlon XP 2.5G+, 512Mb DDR400, 7200 RPM IDE drive with 8MB cache. This would be considered a very underpowered server...

22 KB files, 1000 of them :
open(), read(), close() : 10,000 files/s
open(), write(), close() : 4,000 files/s

This is quite far from database FS activity, but it's still amazing, although the disk doesn't even get used. Which is what I like in Linux. You can write thousands of files in one second and the HDD is still idle... then when it decides to flush it all goes to disk in one burst.
I did make benchmarks some time ago and found that what sets Linux apart from Windows in terms of filesystems is :

- very high performance filesystems like ReiserFS. This is the obvious part ; although with a huge amount of data in small files accessed randomly, ReiserFS is faster but not 10x, maybe something like 2x NTFS. I trust Reiser4 to offer better performance, but not right now. Also ReiserFS lacks a defragmenter, and it gets slower after 1-2 years (compared to 1-2 weeks with NTFS this is still not that bad, but I'd like to defragment and I can't). Reiser4 will fix that apparently with a background defragger etc.

- caching. Linux disk caching is amazing. When copying a large file to the same disk on Windows, the drive head swaps a lot, like the OS can't decide between reading and writing. Linux, on the other hand, reads and writes in large chunks and loses a lot less time seeking. Even when reading two files at the same time, Linux reads ahead in large chunks (very little performance loss) whereas Windows seeks a lot. The read-ahead and write-back thus make it a lot faster than 2x NTFS for everyday tasks like copying files, backing up, making archives, grepping, serving files, etc...

My windows box was able to saturate a 100Mbps ethernet while serving one large FTP file on the LAN (not that impressive, it's only 10 MB/s hey!). However, when several simultaneous clients were trying to download different files which were not in the disk cache, all hell broke loose : lots of seeking, and bandwidth dropped to 30 Mbits/s. Not enough read-ahead... The Linux box, serving FTP with half the RAM (256 Mb), had no problem pushing the 100 Mbits/s with something like 10 simultaneous connections. The amusing part is that I could not use the Windows box to test it because it would choke at such a high IO concurrency (writing 10 MBytes/s to several files at once, my god).
Of course the files which had been downloaded to the Windows box were cut into as many fragments as the number of disk seeks during the download... several hundred fragments each... my god...

What amazes me is that it must just be some parameter somewhere, and the Microsoft guys probably could have easily changed the read-ahead thresholds and time between seeks when in a multitasking environment, but they didn't. Why? Thus people are forced to buy 10,000 RPM SCSI drives for their LAN servers when an IDE RAID, used with Linux, could push nearly a gigabit...

For databases this is different, as we're concerned about large files and fsync() times... but it seems ReiserFS still wins over ext3, so...

About NTFS vs. ext3: ext3 dies if you put a lot of files in the same directory. It's fast but still outperformed by Reiser.

I saw XFS fry eight 7-harddisk RAID bays. The computer was rebooted with the reset button a few times because a faulty SCSI cable in the eighth RAID bay was making it hang. The other 7 bays had no problem. When it went back up, all the bays were in mayhem. xfs_repair just vomited over itself and we got plenty of files with random data in them. Fortunately there was a catalog
Re: [PERFORM] Why does a simple query not use an obvious index?
Another primary key trick: if you insert records with a serial primary key, and rarely delete them or update the timestamp, you can use the primary key to compute an approximate number of rows.

a := SELECT pkey FROM table WHERE timestamp > threshold ORDER BY timestamp ASC LIMIT 1;
b := SELECT pkey FROM table ORDER BY pkey DESC LIMIT 1;

(b-a) is an approximate count. Performance is great because you only fetch two rows. An index scan is guaranteed (LIMIT 1). On the downside, you get an approximation, and this only works for tables where the timestamp is a date of INSERT (timestamp correlated with pkey), not when the timestamp is a date of UPDATE (uncorrelated with pkey).
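A minimal sketch of the trick in Python, using sqlite3 as a stand-in for Postgres (table and column names are invented; sqlite's INTEGER PRIMARY KEY plays the role of the serial, and in Postgres an index on the timestamp column would make the first query an index scan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE posts (
    pkey INTEGER PRIMARY KEY AUTOINCREMENT,
    created INTEGER NOT NULL)""")
# Insert rows whose timestamp grows with the primary key,
# as happens when 'created' is the INSERT date.
conn.executemany("INSERT INTO posts (created) VALUES (?)",
                 [(t,) for t in range(100)])

threshold = 60
# First pkey past the threshold (LIMIT 1 fetches a single row).
(a,) = conn.execute(
    "SELECT pkey FROM posts WHERE created > ? "
    "ORDER BY created ASC LIMIT 1", (threshold,)).fetchone()
# Highest pkey in the table.
(b,) = conn.execute(
    "SELECT pkey FROM posts ORDER BY pkey DESC LIMIT 1").fetchone()

approx = b - a  # the approximate count, two rows fetched in total
exact = conn.execute("SELECT count(*) FROM posts WHERE created > ?",
                     (threshold,)).fetchone()[0]
assert abs(approx - exact) <= 1
```

The approximation degrades as rows are deleted or timestamps updated, which is exactly the caveat stated above.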
Re: [PERFORM] seqscan instead of index scan
create index t_idx on t((c+d)); select * from t where c+d > 0; Why not: select ((select * from t where c > 0::bigint) UNION (select * from t where d > 0::bigint)) group by whatever; or something?
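The expression-index half of the exchange can be demonstrated with sqlite3 (which also supports indexes on expressions); the table and data here are hypothetical, and the point is only that a query written exactly as `c+d > 0` can use an index built on `(c+d)` instead of scanning the table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c INTEGER, d INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, -i + (i % 7)) for i in range(1000)])

# Expression index on c+d, as in the original suggestion.
conn.execute("CREATE INDEX t_idx ON t((c+d))")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE c+d > 0").fetchall()
# The plan should mention the expression index rather than a full scan.
assert any("t_idx" in str(row[-1]) for row in plan)
```

Note that the UNION rewrite proposed in the reply is only equivalent to `c+d > 0` under extra assumptions about the data (e.g. non-negative values), which is probably why the poster hedged with "or something".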
Re: [PERFORM] Postgres does not utilize indexes. Why?
test where id = 5; A few times I added 100,000 records; casting the 5 to int8 made it use the index.
[PERFORM] Reiser4
ReiserFS 4 is (will be) a filesystem that implements transactions. Are there any plans for a future PostgreSQL version to support a special fsync method for Reiser4, which would use the filesystem's transaction engine instead of an old kludge like fsync(), with the possibility of vastly enhanced performance? Is there also a possibility to tell Postgres: I don't care if I lose 30 seconds of transactions on this table if the power goes out; I just want to be sure it's still ACID et al. compliant, but you can fsync less often and thus be faster (with the possibility of setting that on a per-table basis)? Thanks.
Re: [PERFORM] Hardware upgrade for a high-traffic database
We're currently running Postgres 7.4.1 on FreeBSD 5.2, a dual Xeon 2.4, 2 GB ECC, 3ware Serial ATA RAID 5 w/ 4 disks (SLOW!!). A cheap solution while you look for another server: try to use something other than RAID 5. You have 4 disks, so you could use a striping+mirroring RAID, which would boost performance. You can switch with minimal downtime (copy files to another HDD, change RAID parameters, copy again...), maybe 1 hour? If your hardware supports it, of course. And tell us how it works!
Re: [PERFORM] NUMERIC x VARCHAR
Numeric won't store that: (+33) 4 01 23 45 67

On Wed, 11 Aug 2004 02:42:33 -0300, Er Galvão Abbott [EMAIL PROTECTED] wrote: Greetings. I have a question regarding the performance of certain datatypes: I have a field where I will store my clients' phone numbers. I know that this field will never exceed 15 characters, and I will store only numbers here (no dashes, dots, etc...), so I was wondering: which type is faster, NUMERIC(15,0) or VARCHAR(15)? Are there any storage differences between them? TIA,
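A quick illustration in Python of why a numeric type is a poor fit for phone numbers (the values are made up): leading zeros and a '+' prefix do not survive a round trip through an integer, independently of any speed question.

```python
# A phone number entered with a leading zero is not a number
# in the arithmetic sense: the zero is silently dropped.
raw = "0401234567"
as_int = int(raw)
assert str(as_int) == "401234567"
assert str(as_int) != raw

# An international form like "(+33) 4 01 23 45 67" cannot be
# parsed as an integer at all.
try:
    int("(+33) 4 01 23 45 67")
    parsed = True
except ValueError:
    parsed = False
assert parsed is False
```

The same loss happens when such strings are coerced into NUMERIC in the database, which is the point of the one-line reply above.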
Re: [PERFORM] Performance Bottleneck
Apache processes running for 30 minutes? My advice: use frames and JavaScript! In your webpage, you have two frames: content and refresh. content starts empty (say, just a title on top of the page). refresh is refreshed every five seconds from a script on your server. This script generates JavaScript which document.write()'s new entries into the content frame, thus adding new records to it. The refreshing thus uses a new request every 5 seconds, which terminates very fast and does not hog an Apache process. Turn the keepalive timeout down.
Re: [PERFORM] Performance Bottleneck
The queries themselves are simple, normally drawing information from one table with few conditions, or in the most complex cases using joins on two tables or subqueries. These behave very well and always have; the problem is that these queries take place in rather large amounts due to the dumb nature of the scripts themselves.

Hum, maybe this dumb thing is where to look? I'm no expert, but I have had the same situation with a very dumb PHP application, namely osCommerce, which averaged about 140 (!) queries on a page! I added some traces to the queries, and some logging, only to see that the programmers did something like (pseudo-code):

for id in id_list:
    select stuff from database where id=id

Geee... I replaced it by:

select stuff from database where id in (id_list)

And this saved about 20 requests... The code was peppered with queries like that. In the end it went from 140 queries to about 20, which is still way too much IMHO, but I couldn't go lower without an extensive rewrite.

If you have a script making many selects, it's worth grouping them, even using stored procedures. For instance, using the classical tree-in-a-table to store a tree of product categories:

create table categories ( id serial primary key, parent_id integer references categories(id), etc. );

You basically have these choices in order to display the tree:
- select for parent_id=0 (root), then for each element select its children, and so on; OR
- make a stored procedure which does that: at least 3x faster and a lot less CPU overhead; OR
- (if you have, say, 50 rows in the table, which was my case) select the entire table and build your tree in the script. It was a little bit faster than the stored procedure.

Could you give an example of your dumb scripts? It's good to optimize a database, but it's even better to remove useless queries...
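The loop-to-IN rewrite described above, sketched with Python and sqlite3 (table and column names are invented): one query with IN replaces a round trip per id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(i, "product %d" % i) for i in range(100)])

id_list = [3, 17, 42]

# The slow pattern: one query (and one round trip) per id.
slow = [conn.execute("SELECT name FROM products WHERE id = ?",
                     (i,)).fetchone()[0]
        for i in id_list]

# The grouped version: a single query with IN.
placeholders = ",".join("?" * len(id_list))
fast = [row[0] for row in conn.execute(
    "SELECT name FROM products WHERE id IN (%s) ORDER BY id"
    % placeholders, id_list)]

assert slow == fast == ["product 3", "product 17", "product 42"]
```

With a remote database server the savings are dominated by the eliminated network round trips, which is why collapsing 140 queries to 20 makes such a difference on page time.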
Re: [PERFORM] Tuning queries on large database
not so bad for Oracle. What about for PG? Regarding how the data is stored: I agree with the datatype issue (smallint, bigint, integer... add a constraint...). Also, the order of the records in the database is very important. As you seem to have a very large static population in your table, you should insert it ordered by your favourite selection index (looks like it's poste). Also, you have a lot of static data which pollutes your table. Why not create two tables, one for the current year and one for all the past years, and use a view to present a merged view?
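The split-plus-view arrangement can be sketched with sqlite3 (table names and columns are hypothetical; in Postgres the view would be created the same way over the two tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_current (poste TEXT, amount INTEGER)")
conn.execute("CREATE TABLE sales_archive (poste TEXT, amount INTEGER)")
conn.execute("INSERT INTO sales_archive VALUES ('A', 100), ('B', 200)")
conn.execute("INSERT INTO sales_current VALUES ('A', 10)")

# The view presents the small hot table and the big static one
# as a single relation, so existing queries keep working.
conn.execute("""CREATE VIEW sales AS
    SELECT * FROM sales_current
    UNION ALL
    SELECT * FROM sales_archive""")

total = conn.execute("SELECT sum(amount) FROM sales").fetchone()[0]
assert total == 310
```

Queries that only touch the current year then scan the small table, while the static archive stays out of the way.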
Re: [PERFORM] Tuning queries on large database
You often make sums. Why not use separate tables to cache these sums by month, by poste, by whatever? A rule on insert on the big table updates the cache tables.
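A sketch of the cached-sums idea with sqlite3; since sqlite has no CREATE RULE, a trigger plays the role the rule plays in Postgres, and all names here are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (poste TEXT, amount INTEGER)")
conn.execute("""CREATE TABLE sums_by_poste (
    poste TEXT PRIMARY KEY, total INTEGER NOT NULL)""")

# Keep the cache table up to date on every insert into the big table.
conn.execute("""CREATE TRIGGER cache_sum AFTER INSERT ON entries
BEGIN
    INSERT OR IGNORE INTO sums_by_poste VALUES (NEW.poste, 0);
    UPDATE sums_by_poste SET total = total + NEW.amount
     WHERE poste = NEW.poste;
END""")

conn.execute("INSERT INTO entries VALUES ('A', 5)")
conn.execute("INSERT INTO entries VALUES ('A', 7)")
conn.execute("INSERT INTO entries VALUES ('B', 3)")

sums = dict(conn.execute("SELECT poste, total FROM sums_by_poste"))
assert sums == {"A": 12, "B": 3}
```

Reports then read the tiny cache table instead of summing the big one; the cost is a small constant overhead on each insert.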
Re: [PERFORM] arrays and indexes
SELECT * from content where 42 = ANY (authors); Postgres does have a way to do what you ask, though. It involves GiST indexes and the operators from the contrib/intarray directory of the Postgres source. I have tried these indexes, and the performance was very good. It can be faster (in fact much faster) than a join with an additional table, because there is no join. The SQL array syntax is a pain, though.