Re: [PERFORM] File system choice for Red Hat systems
Mark Kirkwood wrote: Yeah, RHEL6 seems like the version we would prefer - unfortunately the time frame is the next few days.

Awesome - thanks for the quick reply! The RHEL6 beta is out, I'm running it, and I expect a straightforward upgrade path to the final release--I think I can just keep grabbing updated packages. Depending on how long your transition from test into production is, you might want to consider a similar move: put RHEL6 onto something right now in nearly complete form and just slip in updates as it moves toward the official release. It's already better than RHEL5 at many things, even as a beta. The 2.6.18 kernel in particular is looking painfully old nowadays.

-- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] performance of temporary vs. regular tables
As promised, I did a tiny benchmark - basically, 8 empty tables are filled with 100k rows each within 8 transactions (somewhat typical for my application). The test machine has 4 cores, 64G RAM and RAID1 10k drives for data.

# INSERTs into a TEMPORARY table:
[joac...@testsrv scaling]$ time pb query -d scaling_qry_1.xml

real    3m18.242s
user    1m59.074s
sys     1m51.001s

# INSERTs into a standard table:
[joac...@testsrv scaling]$ time pb query -d scaling_qry_1.xml

real    3m35.090s
user    2m5.295s
sys     2m2.307s

Thus, there is a slight hit of about 10% (which may even be within measurement variation) - your mileage will vary.

Usually WAL causes a much larger performance hit than this. Since the following command:

CREATE TABLE tmp AS SELECT n FROM generate_series(1,1000000) AS n

which inserts 1M rows takes 1.6 seconds on my desktop, your 800k-row INSERT taking more than 3 minutes is a bit suspicious unless:
- you got huge fields that need TOASTing; in this case TOAST compression will eat a lot of CPU and you're benchmarking TOAST, not the rest of the system
- you got some non-indexed foreign key
- some other reason?
[PERFORM] Overusing 1 CPU
Hallo all

I have a strange problem here. I have a pgsql database running on Intel hardware here. It has 8 cores hyperthreaded, so you see 16 CPUs. This box is basically idle at the moment as it is still in testing, yet top shows high usage on just 1 of the cores. mpstat gives the below. As you can see, only CPU 1 is very busy; the rest are idle.

Thanx
Mozzi

13:02:19  CPU   %usr  %nice  %sys  %iowait  %irq  %soft  %steal  %guest  %idle
13:02:21  all   4.70   0.00  0.41     1.57  0.00   0.00    0.00    0.00  93.32
13:02:21    0   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21    1  72.68   0.00  5.37    21.46  0.00   0.49    0.00    0.00   0.00
13:02:21    2   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21    3   0.00   0.00  0.51     0.00  0.00   0.00    0.00    0.00  99.49
13:02:21    4   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21    5   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21    6   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21    7   0.00   0.00  0.36     0.00  0.00   0.00    0.00    0.00  99.64
13:02:21    8   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21    9   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21   10   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21   11   0.00   0.00  0.00     1.00  0.00   0.00    0.00    0.00  99.00
13:02:21   12   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
13:02:21   13   0.00   0.00  0.00     2.00  0.00   0.00    0.00    0.00  98.00
13:02:21   14   0.00   0.00  0.51     0.00  0.00   0.00    0.00    0.00  99.49
13:02:21   15   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00

Average:  CPU   %usr  %nice  %sys  %iowait  %irq  %soft  %steal  %guest  %idle
Average:  all   4.66   0.00  0.43     1.46  0.00   0.04    0.00    0.00  93.41
Average:    0   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:    1  72.27   0.00  5.47    21.58  0.00   0.59    0.00    0.00   0.10
Average:    2   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:    3   0.10   0.00  0.50     0.00  0.00   0.00    0.00    0.00  99.40
Average:    4   0.10   0.00  0.10     0.00  0.00   0.00    0.00    0.00  99.80
Average:    5   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:    6   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:    7   0.00   0.00  0.10     0.60  0.00   0.00    0.00    0.00  99.30
Average:    8   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:    9   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:   10   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:   11   0.00   0.00  0.00     0.20  0.00   0.00    0.00    0.00  99.80
Average:   12   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Average:   13   0.00   0.00  0.10     0.40  0.00   0.00    0.00    0.00  99.50
Average:   14   0.00   0.00  0.50     0.00  0.00   0.00    0.00    0.00  99.50
Average:   15   0.00   0.00  0.00     0.00  0.00   0.00    0.00    0.00 100.00
Re: [PERFORM] Overusing 1 CPU
On Wed, 2 Jun 2010, Mozzi wrote: This box is basically adle @ the moment as it is still in testing yet top shows high usage on just 1 of the cores. First port of call: What process is using the CPU? Run top on a fairly wide terminal and use the c button to show the full command line. Matthew -- Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. - Kernighan
Re: [PERFORM] Overusing 1 CPU
Hi Thanx mate Create Index seems to be the culprit. Is it normal to just use 1 cpu tho? Mozzi On Wed, 2010-06-02 at 12:24 +0100, Matthew Wakeling wrote: On Wed, 2 Jun 2010, Mozzi wrote: This box is basically adle @ the moment as it is still in testing yet top shows high usage on just 1 of the cores. First port of call: What process is using the CPU? Run top on a fairly wide terminal and use the c button to show the full command line. Matthew
Re: [PERFORM] Overusing 1 CPU
In response to Mozzi: Hi Thanx mate Create Index seems to be the culprit. Is it normal to just use 1 cpu tho? If you have only one client, yes. If you have more than one active connection, every connection will use one CPU. In your case: create index can use only one CPU. Regards, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: - Header) GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
Re: [PERFORM] Overusing 1 CPU
Mozzi, * Mozzi (mozzi.g...@gmail.com) wrote: Thanx mate Create Index seems to be the culprit. Is it normal to just use 1 cpu tho? Yes, PG can only use 1 CPU for a given query or connection. You'll start to see the other CPUs going when you have more than one connection to the database. If you're building a lot of indexes then you probably want to split up the statements into multiple connections and run them in parallel. Thanks, Stephen
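A minimal sketch of the parallel approach Stephen describes, assuming `psql` can reach the database; the table, index, and database names are made up for illustration. Each CREATE INDEX runs over its own connection, so each build can occupy its own CPU:

```shell
# Hypothetical sketch: build several indexes in parallel, one connection each.
psql -d mydb -c "CREATE INDEX orders_customer_idx ON orders (customer_id);" &
psql -d mydb -c "CREATE INDEX orders_created_idx  ON orders (created_at);" &
psql -d mydb -c "CREATE INDEX items_order_idx     ON items (order_id);" &
wait   # block until all background index builds have finished
```

Note that two CREATE INDEX statements on the *same* table will still contend for I/O and locks, so this helps most when the indexes are spread over different tables.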
Re: [PERFORM] performance of temporary vs. regular tables
Am 02.06.2010 12:03, schrieb Pierre C: Usually WAL causes a much larger performance hit than this. Since the following command : CREATE TABLE tmp AS SELECT n FROM generate_series(1,100) AS n which inserts 1M rows takes 1.6 seconds on my desktop, your 800k rows INSERT taking more than 3 minutes is a bit suspicious unless : - you got huge fields that need TOASTing ; in this case TOAST compression will eat a lot of CPU and you're benchmarking TOAST, not the rest of the system - you got some non-indexed foreign key - some other reason ? Yes, the other reason is that I am not issuing a single SQL command, but import data from plain ASCII files through the Python-based framework into the database. The difference between your measurement and my measurement is the upper potential of improvement for my system (which has, on the other hand, the advantage of being a bit more powerful and flexible than a single SQL statement ;-) ) Joachim
Re: [PERFORM] Overusing 1 CPU
On Wednesday 02 June 2010 13:37:37 Mozzi wrote: Hi Thanx mate Create Index seems to be the culprit. Is it normal to just use 1 cpu tho? If it is a single-threaded process, then yes. And a Create index on a single table will probably be single-threaded. If you now start a create index on a different table, a different CPU should be used for that. Mozzi On Wed, 2010-06-02 at 12:24 +0100, Matthew Wakeling wrote: On Wed, 2 Jun 2010, Mozzi wrote: This box is basically adle @ the moment as it is still in testing yet top shows high usage on just 1 of the cores. First port of call: What process is using the CPU? Run top on a fairly wide terminal and use the c button to show the full command line. Matthew
Re: [PERFORM] File system choice for Red Hat systems
On Tuesday 01 June 2010, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: I'm helping set up a Red Hat 5.5 system for Postgres. I was going to recommend xfs for the filesystem - however it seems that xfs is supported as a technology preview layered product for 5.5. This apparently means that the xfs tools are only available via special channels. What are Red Hat using people choosing for a good performing filesystem? I've run PostgreSQL on XFS on CentOS for years. It works well. Make sure you have a good battery-backed RAID controller under it (true for all filesystems). -- No animals were harmed in the recording of this episode. We tried but that damn monkey was just too fast.
Re: [PERFORM] File system choice for Red Hat systems
You can try Scientific Linux 5.x; it is based on CentOS plus XFS and some other software for HPC. It has had XFS for years. --- On Wed, 6/2/10, Alan Hodgson ahodg...@simkin.ca wrote: I've run PostgreSQL on XFS on CentOS for years. It works well. Make sure you have a good battery-backed RAID controller under it (true for all filesystems).
Re: [PERFORM] Autovacuum in postgres.
Thanks for the reply.. I am using postgres 8.01 and since it runs on a client box, I can't upgrade it. I've set the auto vacuum nap time to 3600 seconds. On Thu, May 27, 2010 at 8:03 PM, Bruce Momjian br...@momjian.us wrote: venu madhav wrote: Hi All, In my application we are using postgres which runs on an embedded box. I have configured autovacuum to run once for every one hour. It has 5 different databases in it. When I saw the log messages, I found that it is running autovacuum on one database every hour. As a result, on my database autovacuum is run once in 5 hours. Is there any way to make it run it every hour. What settings did you change to make it run every hour? Also, it will only vacuum tables that need vacuuming. What version of Postgres are you using? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com
[PERFORM] Re: [BUGS] Query causing explosion of temp space with join involving partitioning
I made a brute force check and indeed, for one of the parameters the query was switching to sequential scans (or bitmap scans with a condition on survey_pk=16 only if sequential scans were off). After a closer look at the plan cardinalities I thought it would be worthwhile to increase the histogram size, and I set statistics on sources(srcid) to 1000 from the default 10. It fixed the plan! The sources table was around 100M rows, so skewness in this range must have been looking odd to the planner. Thank you for the hints! Best Regards, Krzysztof On May 27, 2010, at 6:41 PM, Tom Lane wrote: Krzysztof Nienartowicz krzysztof.nienartow...@unige.ch writes: Logs of the system running queries are not utterly clear, so chasing the parameters for the explosive query is not that simple (shared logs between multiple threads), but from what I see there is no difference between them and the plan looks like (without removal of irrelevant parameters this time, most of them are float8, but also bytea) [ nestloop with inner index scans over the inherited table ] Well, that type of plan isn't going to consume much memory or disk space. What I suspect is happening is that sometimes, depending on the specific parameter values called out in the query, the planner is switching to another plan type that does consume lots of space (probably via sort or hash temp files). The most obvious guess is that that will happen when the range limits on srcid get far enough apart to make a nestloop not look cheap. You could try experimenting with EXPLAIN and different constant values to see what you get. regards, tom lane
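For reference, the statistics-target change described above can be sketched like this (the table and column names are taken from the post; the exact DDL is an assumption about the schema):

```sql
-- Raise the histogram size for sources.srcid from the default 10 to 1000,
-- then re-analyze so the planner actually sees the richer statistics.
ALTER TABLE sources ALTER COLUMN srcid SET STATISTICS 1000;
ANALYZE sources;
```

The larger histogram gives the planner finer-grained selectivity estimates for range conditions on srcid, which is what steered it back to the nestloop plan here.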
[PERFORM] how to force hashaggregate plan?
Hi, I have two similar queries that calculate group-by summaries over a huge table (74.6 mil rows). The only difference between the two queries is the number of columns the group by is performed on. This difference causes two different plans which vary greatly in performance. Postgres is 8.4.4 on Linux 64-bit. work_mem is 4GB for both queries and effective_cache_size = 30GB (server has 72GB RAM). Both queries spend 100% of their time on CPU (data is all in buffer cache or OS cache). My questions are: 1) Is there a way to force a plan that uses hashaggregate for the second query? 2) I am not trying to achieve any particular execution time for the query, but I noticed that when disk sort kicks in (and that happens eventually once the dataset is large enough) the query drastically slows down, even if there is no physical IO going on. I wonder if it's possible to have predictable performance rather than a sudden drop. 3) Why does the hashAggregate plan use so much less memory (work_mem) than the plan with groupAggregate/sort? The hashAggregate plan for Query1 works even with work_mem='2GB'; the second plan decides to use disk sort even with work_mem='4GB'. Why is sort so memory greedy? Are there any plans to address the sorting memory efficiency issues? Thank you!
Query1:

explain analyze
SELECT
    month_code,
    short_code,
    gateway_carrier_id,
    mp_code,
    partner_id,
    master_company_id,
    ad_id,
    sc_name_id,
    sc_sports_league_id,
    sc_sports_alert_type,
    al_widget_id,
    keyword_id,
    cp_id,
    sum(coalesce(message_count,0)),          -- message_cnt
    sum(coalesce(message_sellable_count,0)), -- message_sellable_cnt
    sum(coalesce(ad_cost_sum,0)),            -- ad_cost_sum
    NULL::int4, --count(distinct device_number), -- unique_user_cnt
    NULL::int4, --count(distinct case when message_sellable_count > 0 then device_number end), -- unique_user_sellable_cnt
    NULL,       -- unique_user_first_time_cnt
    1,          -- ALL
    CURRENT_TIMESTAMP
from staging.agg_phones_monthly_snapshot
group by
    month_code,
    short_code,
    gateway_carrier_id,
    mp_code,
    partner_id,
    master_company_id,
    ad_id,
    sc_name_id,
    sc_sports_league_id,
    sc_sports_alert_type,
    al_widget_id,
    keyword_id,
    cp_id;

QUERY PLAN
HashAggregate (cost=5065227.32..5214455.48 rows=7461408 width=64) (actual time=183289.883..185213.565 rows=2240716 loops=1)
  -> Append (cost=0.00..2080664.40 rows=74614073 width=64) (actual time=0.030..58952.749 rows=74614237 loops=1)
       -> Seq Scan on agg_phones_monthly (cost=0.00..11.50 rows=1 width=102) (actual time=0.002..0.002 rows=0 loops=1)
            Filter: (month_code = '2010M04'::bpchar)
       -> Seq Scan on agg_phones_monthly_2010m04 agg_phones_monthly (cost=0.00..2080652.90 rows=74614072 width=64) (actual time=0.027..42713.387 rows=74614237 loops=1)
            Filter: (month_code = '2010M04'::bpchar)
Total runtime: 185519.997 ms
(7 rows)

Time: 185684.396 ms

Query2:

explain analyze
SELECT
    month_code,
    gateway_carrier_id,
    sum(coalesce(message_count,0)),          -- message_cnt
    sum(coalesce(message_sellable_count,0)), -- message_sellable_cnt
    sum(coalesce(ad_cost_sum,0)),            -- ad_cost_sum
    count(distinct device_number), -- unique_user_cnt
    count(distinct case when message_sellable_count > 0 then device_number end), -- unique_user_sellable_cnt
    NULL,       -- unique_user_first_time_cnt
    15,         -- CARRIER
    CURRENT_TIMESTAMP
from staging.agg_phones_monthly_snapshot
group by
    month_code,
    gateway_carrier_id;

QUERY PLAN
[PERFORM] How to insert a bulk of data with unique-violations very fast
Hello, I have a set of unique data of about 150,000,000 rows. Regularly I get a list of data which contains many times more rows than the already stored set, often around 2,000,000,000 rows. Within these rows are many duplicates, and often the whole set of already stored data. I want to store just every entry which is not already stored, and I do not want to store duplicates.

Example:
Already stored set: a,b,c
Given set: a,b,a,c,d,a,c,d,b
Expected set after import: a,b,c,d

I am now looking for a faster way to do the import. At the moment I import the new data with COPY into a table 'import'. Then I remove the duplicates and insert every row which is not already known. After that, import is truncated. Is there a faster way? Should I just insert every row and ignore it if the unique constraint fails?

Here is the simplified table schema; in real life it's with partitions:

test=# \d urls
Tabelle »public.urls«
 Spalte |   Typ   | Attribute
--------+---------+--------------------------------------------------------
 url_id | integer | not null default nextval('urls_url_id_seq'::regclass)
 url    | text    | not null
Indexe:
 »urls_url« UNIQUE, btree (url)
 »urls_url_id« btree (url_id)

Thanks for every hint or advice! :)

Greetings from Germany, Torsten

-- http://www.dddbl.de - a database layer that abstracts working with 8 different database systems, separates queries from applications, and can automatically evaluate query results.
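The workflow described above might be sketched as follows (a hedged illustration using the simplified urls schema from the post; the COPY source path is hypothetical):

```sql
-- Load the raw feed into the staging table.
COPY import (url) FROM '/tmp/new_urls.txt';

-- Insert only rows that are neither duplicated within the feed
-- (DISTINCT) nor already present in urls (NOT EXISTS).
INSERT INTO urls (url)
SELECT DISTINCT i.url
FROM import i
WHERE NOT EXISTS (SELECT 1 FROM urls u WHERE u.url = i.url);

-- Reset the staging table for the next batch.
TRUNCATE import;
```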
[PERFORM] Re: [PERFORM] About Tom Lane's Xeon CS test case
Tom, Thank you for your reply! I am encountering a context-switch storm problem. We collected the pg_locks data when the context-switch rate was over 200K/sec, and we found that the CS value correlates with the count of ExclusiveLocks. I don't know how to reproduce the problem in testing to collect evidence for upgrading PostgreSQL, so I want to redo your test. Thank you! Best regards, Ray Huang

----- Original Message -----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: 2010-05-27 22:10
To: 黄永卫
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] About Tom Lane's Xeon CS test case

黄永卫 yongwei.hu...@gmail.com writes: My postgres version: 8.1.3; You do realize that version was obsoleted four years ago last week? If you're encountering multiprocessor performance problems you really need to get onto 8.3.x or later. regards, tom lane
[PERFORM] PgAdmin iii - Explain.
I need a tutorial to use and understand the graphical EXPLAIN feature of pgAdmin III. Do you have one? Thanks, Jeres.
Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?
Sorry, Alvaro. I was contemplating using a GIN or GiST index as a way of optimizing the query. Instead, I found that re-inserting the data in order of station ID (the primary look-up column) and then CLUSTER'ing on the station ID, taken date, and category index increased the speed by an order of magnitude. I might be able to drop the station/taken/category index in favour of the simple station index and CLUSTER on that, instead (saving plenty of disk space). Either way, it's fast right now so I'm not keen to try and make it much faster. Dave
Re: [PERFORM] Optimize date query for large child tables: GiST or GIN?
Hi, Hmm, that's nice, though I cannot but wonder whether the exclusive lock required by CLUSTER is going to be a problem in the long run. Not an issue; the inserts are one-time (or very rare; at most: once a year). Hm, keep in mind that if the station clause alone is not selective enough, scanning it may be too expensive. The current three column The seven child tables (split on category ID) have the following indexes: 1. Primary key (unique ID, sequence) 2. Station ID (table data is physically inserted by station ID order) 3. Station ID, Date, and Category ID (this index is CLUSTER'ed) I agree that the last index is probably all that is necessary. 99% of the searches use the station ID, date, and category. I don't think PostgreSQL necessarily uses that last index, though. Dave
Re: [PERFORM] requested shared memory size overflows size_t
Hi, Sorry to revive an old thread, but I have had this error whilst trying to configure my 32-bit build of postgres to run on a 64-bit Windows Server 2008 machine with 96GB of RAM (that I would very much like to use with postgres). I am getting:

2010-06-02 11:34:09 BST FATAL: requested shared memory size overflows size_t
2010-06-02 11:41:01 BST FATAL: could not create shared memory segment: 8
2010-06-02 11:41:01 BST DETAIL: Failed system call was MapViewOfFileEx.

which makes a lot of sense since I was setting shared_buffers (and effective_cache_size) to values like 60GB. Is it possible to get postgres to make use of the available 96GB RAM on a Windows 32-bit build? Otherwise, how can I get it to work? I'm guessing my options are:

- Use the 64-bit Linux build (not a viable option for me - unless from a VM - in which case, recommendations?) or
- Configure Windows and postgres properly (preferred option - but I don't know what needs to be done here, or if I'm testing properly using Resource Monitor)

Thanks, Tom
Re: [PERFORM] How to insert a bulk of data with unique-violations very fast
On Tue, Jun 1, 2010 at 9:03 AM, Torsten Zühlsdorff f...@meisterderspiele.de wrote: Hello, i have a set of unique data which about 150.000.000 rows. Regullary i get a list of data, which contains multiple times of rows than the already stored one. Often around 2.000.000.000 rows. Within this rows are many duplicates and often the set of already stored data. I want to store just every entry, which is not within the already stored one. Also i do not want to store duplicates. Example: The standard method in pgsql is to load the data into a temp table then insert where not exists in old table.
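A sketch of that standard method, assuming the urls table from the original post (the temp-table name and COPY source are illustrative):

```sql
BEGIN;
-- Stage the feed in a session-local temp table; dropped at commit.
CREATE TEMP TABLE import_tmp (url text) ON COMMIT DROP;
COPY import_tmp (url) FROM '/tmp/new_urls.txt';

-- Anti-join: keep only URLs not already stored, de-duplicated.
INSERT INTO urls (url)
SELECT DISTINCT t.url
FROM import_tmp t
LEFT JOIN urls u ON u.url = t.url
WHERE u.url IS NULL;
COMMIT;
```

Because the temp table is not WAL-logged, the staging step itself is cheap; the anti-join then does one pass against the unique index instead of trapping per-row constraint violations.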
Re: [PERFORM] Autovacuum in postgres.
On Thu, May 27, 2010 at 9:01 AM, venu madhav venutaurus...@gmail.com wrote: Thanks for the reply.. I am using postgres 8.01 and since it runs on a client box, I can't upgrade it. I've set the auto vacuum nap time to 3600 seconds. You've pretty much made autovac run every 5 hours with that setting. What was wrong with the original settings? Just wondering what problem you were / are trying to solve here.
Re: [PERFORM] requested shared memory size overflows size_t
Tom Wilcox hungry...@googlemail.com wrote: Is it possible to get postgres to make use of the available 96GB RAM on a Windows 32-bit build? I would try setting shared_memory to somewhere between 200MB and 1GB and set effective_cache_size = 90GB or so. The default behavior of Windows was to use otherwise idle RAM for disk caching, last I checked, anyway. -Kevin
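Kevin's suggestion as a postgresql.conf fragment (the values are the ones proposed above; treat them as starting points rather than tuned settings):

```
# 32-bit build: keep shared_buffers modest and let Windows cache the rest.
shared_buffers = 512MB          # somewhere between the suggested 200MB and 1GB
effective_cache_size = 90GB     # planner hint only; allocates no memory itself
```

effective_cache_size does not reserve RAM; it just tells the planner how much of the database it may assume is cached, so a large value is safe even on a 32-bit build.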
Re: [PERFORM] requested shared memory size overflows size_t
* Kevin Grittner (kevin.gritt...@wicourts.gov) wrote: Tom Wilcox hungry...@googlemail.com wrote: Is it possible to get postgres to make use of the available 96GB RAM on a Windows 32-bit build? I would try setting shared_memory to somewhere between 200MB and 1GB and set effective_cache_size = 90GB or so. The default behavior of Windows was to use otherwise idle RAM for disk caching, last I checked, anyway. Sure, but as explained on -general already, all that RAM will only ever get used for disk caching. It won't be able to be used for sorts or hash aggs or any other PG operations (PG would use at most 4GB minus shared_buffers, or so). Thanks, Stephen
Re: [PERFORM] Certain query eating up all free memory (out of memory error)
On Mon, May 24, 2010 at 12:50 PM, Łukasz Dejneka l.dejn...@gmail.com wrote: Hi group, I could really use your help with this one. I don't have all the details right now (I can provide more descriptions tomorrow and logs if needed), but maybe this will be enough: I have written a PG (8.3.8) module, which uses Flex Lexical Analyser. It takes text from database field and finds matches for defined rules. It returns a set of two text fields (value found and value type). When I run query like this: SELECT * FROM flex_me(SELECT some_text FROM some_table WHERE id = 1); It works perfectly fine. Memory never reaches more than 1% (usually its below 0.5% of system mem). But when I run query like this: SELECT flex_me(some_text_field) FROM some_table WHERE id = 1; Memory usage goes through the roof, and if the result is over about 10k matches (rows) it eats up all memory and I get out of memory error. I'm not sure exactly what's happening in your particular case, but there is some known suckage in this area. http://archives.postgresql.org/pgsql-hackers/2010-05/msg00230.php http://archives.postgresql.org/pgsql-hackers/2010-05/msg00395.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [PERFORM] File system choice for Red Hat systems
On 03/06/10 02:53, Alan Hodgson wrote: On Tuesday 01 June 2010, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: I'm helping set up a Red Hat 5.5 system for Postgres. I was going to recommend xfs for the filesystem - however it seems that xfs is supported as a technology preview layered product for 5.5. This apparently means that the xfs tools are only available via special channels. What are Red Hat using people choosing for a good performing filesystem? I've run PostgreSQL on XFS on CentOS for years. It works well. Make sure you have a good battery-backed RAID controller under it (true for all filesystems). Thanks - yes, left to myself I would consider using CentOS instead. However, the OS choice is prescribed in this case, I believe. Cheers Mark
[PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
hi, I have a problem space where the main goal is to search backward in time for events. Time can go back very far into the past, and so the table can get quite large. However, the vast majority of queries are all satisfied by relatively recent data. I have an index on the row creation date and I would like almost all of my queries to have a query plan looking something like:

Limit ...
  -> Index Scan Backward using server_timestamp_idx on events (cost=0.00..623055.91 rows=8695 width=177) ...

However, PostgreSQL frequently tries to do a full table scan. Often what controls whether a scan is performed or not is dependent on the size of the LIMIT and how detailed the WHERE clause is. In practice, the scan is always the wrong answer for my use cases (where always is defined to be 99.9%). Some examples:

(1) A sample query that devolves to a full table scan

EXPLAIN SELECT events.id, events.client_duration, events.message, events.created_by, events.source, events.type, events.event, events.environment, events.server_timestamp, events.session_id, events.reference, events.client_uuid FROM events WHERE client_uuid ~* E'^foo bar so what' ORDER BY server_timestamp DESC LIMIT 20;

QUERY PLAN (BAD!)
Limit (cost=363278.56..363278.61 rows=20 width=177)
  -> Sort (cost=363278.56..363278.62 rows=24 width=177)
       Sort Key: server_timestamp
       -> Seq Scan on events (cost=0.00..363278.01 rows=24 width=177)
            Filter: (client_uuid ~* '^foo bar so what'::text)

(2) Making the query faster by making the string match LESS specific (odd, seems like it should be MORE)

EXPLAIN SELECT events.id, events.client_duration, events.message, events.created_by, events.source, events.type, events.event, events.environment, events.server_timestamp, events.session_id, events.reference, events.client_uuid FROM events WHERE client_uuid ~* E'^foo' ORDER BY server_timestamp DESC LIMIT 20;

QUERY PLAN (GOOD!)
Limit (cost=0.00..1433.14 rows=20 width=177)
  -> Index Scan Backward using server_timestamp_idx on events (cost=0.00..623055.91 rows=8695 width=177)
       Filter: (client_uuid ~* '^foo'::text)

(3) Alternatively making the query faster by using a smaller limit

EXPLAIN SELECT events.id, events.client_duration, events.message, events.created_by, events.source, events.type, events.event, events.environment, events.server_timestamp, events.session_id, events.reference, events.client_uuid FROM events WHERE client_uuid ~* E'^foo bar so what' ORDER BY server_timestamp DESC LIMIT 10;

QUERY PLAN (GOOD!)
Limit (cost=0.00..259606.63 rows=10 width=177)
  -> Index Scan Backward using server_timestamp_idx on events (cost=0.00..623055.91 rows=24 width=177)
       Filter: (client_uuid ~* '^foo bar so what'::text)

I find myself wishing I could just put a SQL HINT on the query to force the index to be used, but I understand that HINTs are considered harmful and are therefore not provided for PostgreSQL, so what is the recommended way to solve this?

thank you very much
Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
Jori Jovanovich <j...@dimensiology.com> wrote:
> what is the recommended way to solve this?

The recommended way is to adjust your costing configuration to better reflect your environment.

What version of PostgreSQL is this? What do you have set in your postgresql.conf file? What does the hardware look like? How big is the active (frequently referenced) portion of your database?

-Kevin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
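For reference, the costing configuration Kevin mentions lives in postgresql.conf. A hedged sketch of the settings usually involved in the seqscan-vs-indexscan tradeoff (the values are illustrative, not recommendations for the OP's unknown hardware):

```
# postgresql.conf -- cost settings the planner uses to weigh an index scan
# against a sequential scan (illustrative values only)
seq_page_cost = 1.0          # cost of a sequentially fetched page
random_page_cost = 2.0       # lower this if the active data set is mostly cached
effective_cache_size = 4GB   # planner's estimate of OS cache + shared_buffers
```

Lowering random_page_cost relative to seq_page_cost makes the index scan look cheaper to the planner; it allocates nothing and only changes plan choices.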
Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
2010/6/2 Jori Jovanovich <j...@dimensiology.com>:
> hi, I have a problem space where the main goal is to search backward in
> time for events. Time can go back very far into the past, and so the
> table can get quite large. However, the vast majority of queries are all
> satisfied by relatively recent data. I have an index on the row creation
> date and I would like almost all of my queries to have a query plan
> looking something like:
[CUT]

Do you have autovacuum running? Have you tried updating statistics?

regards
Szymon Guz
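For the statistics question, a minimal sketch (using the events table from the original post):

```sql
-- Refresh planner statistics for the table by hand
ANALYZE VERBOSE events;

-- Check when autovacuum/autoanalyze last touched it
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'events';
```

If last_autoanalyze is NULL or very old, the row estimates the planner is working from may be stale.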
Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
Kevin Grittner <kevin.gritt...@wicourts.gov> writes:
> Jori Jovanovich <j...@dimensiology.com> wrote:
>> what is the recommended way to solve this?
> The recommended way is to adjust your costing configuration to better
> reflect your environment.

Actually, it's probably not the costs so much as the row estimates. For instance, that first query was estimated to select 20 out of a possible 24 rows. If 24 is indeed the right number of matches, then the planner is right and the OP is wrong: the indexscan is going to have to traverse almost all of the table, and therefore it will be a lot slower than seqscan + sort. Now, if the real number of matches is a lot more than that, then the indexscan would make sense, because it could be expected to get stopped by the LIMIT before it has to traverse too much of the table. So the true problem is to get the rowcount estimate to line up with reality.

Unfortunately the estimates for ~* are typically not very good. If you could convert that to plain ~ (case sensitive), it'd probably work better. Also, if this isn't a particularly modern version of Postgres, a newer version might do a bit better with the estimate.

regards, tom lane
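One way to act on the case-sensitivity suggestion: fold the case into the column and use a plain, anchored ~ match. A sketch under assumptions not stated in the thread (the lower() expression index is hypothetical, not something the OP reported having; text_pattern_ops lets a btree index serve left-anchored pattern matches regardless of locale):

```sql
-- Hypothetical expression index supporting anchored, case-folded matches
CREATE INDEX events_client_uuid_lower_idx
    ON events (lower(client_uuid) text_pattern_ops);

-- Case-sensitive ~ on the lowered value replaces ~* on the raw column,
-- giving the planner better selectivity estimates for the filter
SELECT id, server_timestamp
FROM events
WHERE lower(client_uuid) ~ '^foo bar so what'
ORDER BY server_timestamp DESC
LIMIT 20;
```

Whether the planner then prefers this index or the backward scan on server_timestamp_idx still depends on its row estimates, but anchored ~ estimates are generally less pessimistic than ~* ones.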
[PERFORM] Weird XFS WAL problem
I'm testing/tuning a new midsize server and ran into an inexplicable problem. With a RAID10 array, when I move the WAL to a separate RAID1 array, TPS drops from over 1200 to less than 90! I've checked everything and can't find a reason. Here are the details.

8 cores (2x4 Intel Nehalem 2 GHz)
12 GB memory
12 x 7200 SATA 500 GB disks
3WARE 9650SE-12ML RAID controller with BBU
 2 disks: RAID1 500GB ext4 blocksize=4096
 8 disks: RAID10 2TB, stripe size 64K, blocksize=4096 (ext4 or xfs - see below)
 2 disks: hot swap
Ubuntu 10.04 LTS (Lucid)

With xfs or ext4 on the RAID10 I got decent bonnie++ and pgbench results (this one is for xfs):

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
argon        24064M 70491  99 288158  25 129918  16 65296  97 428210  23 558.9   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 23283  81 +++++ +++ 13775  56 20143  74 +++++ +++ 15152  54
argon,24064M,70491,99,288158,25,129918,16,65296,97,428210,23,558.9,1,16,23283,81,+++++,+++,13775,56,20143,74,+++++,+++,15152,54

pgbench -i -s 100 -U test
pgbench -c 10 -t 1 -U test
scaling factor: 100
query mode: simple
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 1046.104635 (including connections establishing)
tps = 1046.337276 (excluding connections establishing)

Now the mystery: I moved the pg_xlog directory to a RAID1 array (same 3WARE controller, two more SATA 7200 disks), ran the same tests, and ...

tps = 82.325446 (including connections establishing)
tps = 82.326874 (excluding connections establishing)

I thought I'd made a mistake, like maybe I'd moved the whole database to the RAID1 array, but I checked and double-checked. I even watched the lights blink: the WAL was definitely on the RAID1 and the rest of Postgres on the RAID10. So I moved the WAL back to the RAID10 array, and performance jumped right back up to the 1200 TPS range.

Next I checked the RAID1 itself:

dd if=/dev/zero of=./bigfile bs=8192 count=200

which yielded 98.8 MB/sec - not bad. bonnie++ on the RAID1 pair showed good performance too:

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
argon        24064M 68601  99 110057  18 46534   6 59883  90 123053   7 471.3   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
argon,24064M,68601,99,110057,18,46534,6,59883,90,123053,7,471.3,1,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

So ... does anyone have any idea at all how TPS drops to below 90 when I move the WAL to a separate RAID1 disk? Does this make any sense at all? It's repeatable. It happens for both ext4 and xfs. It's weird.

You can even watch the disk lights and see it: the RAID10 disks are on almost constantly when the WAL is on the RAID10, but when you move the WAL over to the RAID1, its lights are dim and flicker a lot, like it's barely getting any data, and the RAID10 disks' lights barely go on at all.

Thanks,
Craig
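As an aside, the dd figure above can be misleading: without forcing a sync, dd largely measures the page cache rather than the drive. A hedged sketch of a write test that flushes before reporting (the file size here is tiny for illustration; a meaningful run should exceed RAM):

```shell
# Sequential-write sketch: conv=fdatasync makes dd flush the data to disk
# before reporting a rate, so the number reflects the drive rather than
# the page cache. 8192 * 1024 = 8 MiB written, then cleaned up.
dd if=/dev/zero of=./bigfile bs=8192 count=1024 conv=fdatasync
rm -f ./bigfile
```

The same caveat applies to any dd-based throughput comparison between the RAID1 and RAID10 arrays.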
Re: [PERFORM] Weird XFS WAL problem
On 03/06/10 11:30, Craig James wrote:
> I'm testing/tuning a new midsize server and ran into an inexplicable
> problem. With a RAID10 array, when I move the WAL to a separate RAID1
> array, TPS drops from over 1200 to less than 90! I've checked everything
> and can't find a reason.

Are the 2 new RAID1 disks the same make and model as the 12 RAID10 ones?

Also, are barriers *on* on the RAID1 mount and off on the RAID10 one?

Cheers
Mark
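Barrier behaviour is set per mount, so the two arrays can easily differ. A sketch of what the /etc/fstab entries might look like (device names and mount points are hypothetical); with a battery-backed controller cache barriers are commonly disabled, but without one, disabling them risks corruption on power loss:

```
# /etc/fstab (hypothetical devices and mount points)
# ext4: barriers enabled (the default) on the WAL array
/dev/sdb1  /pg_wal   ext4  defaults,barrier=1  0 2
# xfs: barriers disabled -- only safe behind a BBU controller cache
/dev/sdc1  /pg_data  xfs   defaults,nobarrier  0 2
```

Checking /proc/mounts for the options actually in effect on each array would answer Mark's question directly.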
Re: [PERFORM] requested shared memory size overflows size_t
Tom,

A 32-bit build can reference at most 4 GB - certainly not 60 GB. Also, Windows doesn't do well with large shared buffer sizes anyway. Try setting shared_buffers to 2 GB and let the OS file system cache handle the rest. Your other option, of course, is a nice 64-bit Linux variant, which won't have this problem at all.

Good luck!

Bob Lunney

--- On Wed, 6/2/10, Tom Wilcox <hungry...@googlemail.com> wrote:

> From: Tom Wilcox <hungry...@googlemail.com>
> Subject: Re: [PERFORM] requested shared memory size overflows size_t
> To: pgsql-performance@postgresql.org
> Date: Wednesday, June 2, 2010, 6:58 AM
>
> Hi,
>
> Sorry to revive an old thread, but I have had this error whilst trying
> to configure my 32-bit build of postgres to run on a 64-bit Windows
> Server 2008 machine with 96GB of RAM (that I would very much like to use
> with postgres). I am getting:
>
> 2010-06-02 11:34:09 BST FATAL: requested shared memory size overflows size_t
> 2010-06-02 11:41:01 BST FATAL: could not create shared memory segment: 8
> 2010-06-02 11:41:01 BST DETAIL: Failed system call was MapViewOfFileEx.
>
> which makes a lot of sense, since I was setting shared_buffers (and
> effective_cache_size) to values like 60GB.
>
> Is it possible to get postgres to make use of the available 96GB RAM on
> a Windows 32-bit build? Otherwise, how can I get it to work? I'm
> guessing my options are:
>
> - Use the 64-bit Linux build (not a viable option for me - unless from a
>   VM - in which case, recommendations?), or
> - Configure Windows and postgres properly (preferred option - but I
>   don't know what needs to be done here, or whether I'm testing properly
>   using Resource Monitor)
>
> Thanks,
> Tom
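The overflow itself is simple arithmetic: a 32-bit size_t tops out just under 4 GiB, so a 60 GB shared_buffers request cannot even be represented. As a quick illustration (runnable in any SQL session):

```sql
-- 32-bit size_t maximum is 2^32 - 1 bytes (about 4 GiB);
-- a 60 GB request is roughly 15x larger, hence the FATAL error
SELECT 60::bigint * 1024*1024*1024          AS requested_bytes,
       4294967295::bigint                   AS size_t_max_32bit,
       60::bigint * 1024*1024*1024 > 4294967295 AS overflows;
```

A 64-bit build has a 64-bit size_t, which is why the problem disappears there.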
Re: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
Jori,

What is the PostgreSQL version / shared_buffers / work_mem / effective_cache_size / default_statistics_target? Are the statistics for the table up to date? (Run ANALYZE VERBOSE tablename to update them.) Table and index structure would be nice to know, too.

If all else fails you can set enable_seqscan = off for the session, but that is a Big Hammer for what is probably a smaller problem.

Bob Lunney

--- On Wed, 6/2/10, Jori Jovanovich <j...@dimensiology.com> wrote:

> From: Jori Jovanovich <j...@dimensiology.com>
> Subject: [PERFORM] SELECT ignoring index even though ORDER BY and LIMIT present
> To: pgsql-performance@postgresql.org
> Date: Wednesday, June 2, 2010, 4:28 PM
>
> hi, I have a problem space where the main goal is to search backward in
> time for events. Time can go back very far into the past, and so the
> table can get quite large. However, the vast majority of queries are all
> satisfied by relatively recent data. I have an index on the row creation
> date and I would like almost all of my queries to have a query plan
> looking something like:
[CUT]
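As a diagnostic (rather than a production fix), the enable_seqscan Big Hammer mentioned above can at least confirm what the index plan would cost. A session-local sketch using the query from the original post:

```sql
-- Session-local only: forbid seqscans so EXPLAIN shows the index plan
SET enable_seqscan = off;

EXPLAIN SELECT id
FROM events
WHERE client_uuid ~* E'^foo bar so what'
ORDER BY server_timestamp DESC
LIMIT 20;

-- Restore the default before doing anything else in the session
RESET enable_seqscan;
```

Comparing the two EXPLAIN costs against the actual runtimes shows whether the planner's estimates, not its cost model, are the real problem.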
Re: [PERFORM] requested shared memory size overflows size_t
On Wed, Jun 02, 2010 at 11:58:47AM +0100, Tom Wilcox wrote:
> Hi,
>
> Sorry to revive an old thread but I have had this error whilst trying to
> configure my 32-bit build of postgres to run on a 64-bit Windows Server
> 2008 machine with 96GB of RAM (that I would very much like to use with
> postgres). I am getting:
>
> 2010-06-02 11:34:09 BST FATAL: requested shared memory size overflows size_t
> 2010-06-02 11:41:01 BST FATAL: could not create shared memory segment: 8
> 2010-06-02 11:41:01 BST DETAIL: Failed system call was MapViewOfFileEx.
>
> which makes a lot of sense since I was setting shared_buffers (and
> effective_cache_size) to values like 60GB..

I realize other answers have already been given on this thread; I figured I'd just refer to the manual, which says, "The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB." [1]

[1] http://www.postgresql.org/docs/8.4/static/runtime-config-resource.html

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
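Putting that guidance into postgresql.conf terms, a hedged sketch (values are illustrative): keep shared_buffers within the documented Windows range and describe the large OS cache to the planner instead, since effective_cache_size is only an estimate and allocates no memory:

```
# postgresql.conf on Windows (illustrative values, not tuned recommendations)
shared_buffers = 512MB        # upper end of the documented useful range
effective_cache_size = 60GB   # planner hint only -- allocates nothing, so it
                              # can safely reflect the 96GB machine's OS cache
```

This sidesteps the size_t limit entirely, because only shared_buffers maps actual shared memory.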