Re: [PERFORM] Postmaster using only 4-5% CPU
At 18.44 21/03/2006, Scott Marlowe wrote:
> Here's what's happening. On the fast machine, you are almost certainly
> using IDE drives.

Oh yes, the fast machine has IDE drives, you got it ;)

> Meanwhile, back in the jungle... The machine with IDE drives operates
> differently. Most, if not all, IDE drives, when told by the OS to fsync(),
> tell the OS immediately that the fsync() call has completed and that the
> data is written to the drive. Shortly thereafter, the drive actually
> commences writing the data out. When it gets a chance.

I really didn't know about this behaviour of IDE drives. I was stracing the
postmaster while investigating the problem and noticed many fsync syscalls
(one after each INSERT). I was investigating it, but I couldn't explain why
SCSI was slower. You helped me a lot ;) tnx

> For PostgreSQL, the way IDE drives operate is dangerous. Write data out,
> call fsync(), get an immediate return, mark the data as committed, move on
> to the next operation. Then the operator trips over the power cord, the
> power conditioner explodes, the power supply dies, a brownout reboots the
> machine, etc., and when the machine comes up, PostgreSQL politely informs
> you that your database is corrupt, and you come to the pgsql-general group
> asking how to get your database back online. Very bad.

Yes, it sounds very bad... what about SATA drives? I have heard about command
queueing in SATA, but I don't know if the kernel handles it properly.

> Try wrapping the inserts in the SQL file in begin; / commit; statements,
> like so:
>
> begin;
> insert into table ...
> (100,000 inserts here)
> insert into table ...
> commit;
>
> and it should fly.

Oh yes, with the inserts wrapped in a transaction the import times are as
follows:

- SCSI: 35 secs
- IDE: 50 secs

> When a good friend of mine first started using PostgreSQL, he was a total
> MySQL bigot. He was importing a 10,000 row dataset, and made a smartassed
> remark after 10 minutes about how it would have imported in minutes on
> MySQL.
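The per-INSERT fsync calls seen in the strace can be pictured from user space. A minimal sketch in Python (file path invented for illustration); note that on a drive with write-back caching enabled, as many IDE drives shipped, fsync() may return once the data reaches the drive's cache rather than the platters, which is exactly the risk described above:

```python
import os

def durable_write(path: str, data: bytes) -> None:
    """Write data and ask the OS to flush it through to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        os.write(fd, data)
        os.fsync(fd)  # one fsync per INSERT is what the strace showed
    finally:
        os.close(fd)

# One "commit" worth of data; on an honest drive, fsync() does not
# return until this has hit the platters.
durable_write("/tmp/wal_test", b"commit record")
```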
> It was a test database, so I had him stop the import, delete all the
> imported rows, and wrap the whole import inside begin; and commit;. The
> import took about 20 seconds or so. ;)
>
> Now, for the interesting test: run the import on both machines, with the
> begin; / commit; pairs around it. Halfway through the import, pull the
> power cord, and see which one comes back up. Don't do this to servers with
> data you like, only test machines, obviously. For an even more interesting
> test, do this with MySQL, Oracle, DB2, etc.

I will surely run a test like this ;) Tnx a lot again for your help.

Regards

Edoardo Serra

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not match
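The one-transaction-per-INSERT versus one-transaction-per-batch contrast is easy to demonstrate outside PostgreSQL. A sketch using Python's stdlib sqlite3 (not PostgreSQL, but the commit-per-statement cost is the same idea; database path invented):

```python
import sqlite3
import time

def import_rows(db_path: str, n: int, batched: bool) -> tuple[float, int]:
    """Insert n rows, either autocommitting each INSERT or wrapping
    the whole import in a single begin; / commit; pair."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # autocommit mode
    conn.execute("DROP TABLE IF EXISTS t")
    conn.execute("CREATE TABLE t (v INTEGER)")
    start = time.perf_counter()
    if batched:
        conn.execute("BEGIN")          # the begin; ... commit; wrapper
    for i in range(n):
        conn.execute("INSERT INTO t VALUES (?)", (i,))
    if batched:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start
    (count,) = conn.execute("SELECT count(*) FROM t").fetchone()
    conn.close()
    return elapsed, count
```

On a file-backed database the unbatched run pays one transaction flush per row, so it is typically far slower, mirroring the 10-minute vs 20-second anecdote above.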
Re: [PERFORM] Indexes with descending date columns
On Fri, 2006-03-17 at 08:25, [EMAIL PROTECTED] wrote:
>> I have a performance problem when traversing a table in index order with
>> multiple columns, including a date column in reverse date order. Below
>> follows a simplified description of the table, the index, and the
>> associated query:
>>
>> \d prcdedit
>>  prcdedit_prcd | character(20)               |
>>  prcdedit_date | timestamp without time zone |
>> Indexes:
>>     prcdedit_idx btree (prcdedit_prcd, prcdedit_date)
>
> Depending on how you use the table, there are three possible solutions.
>
> First, if it makes sense in the domain, using an ORDER BY where _both_
> columns are used descending will make PG search the index in reverse, and
> it will be just as fast as when both are searched in the default ascending
> order.
>
> Second possibility: create a dummy column whose value depends on the
> negative of prcdedit_date, e.g. -extract(epoch from prcdedit_date), keep
> the dummy column in sync with the original column using triggers, and
> rewrite your queries to use ORDER BY prcdedit_prcd, dummy_column.
>
> Third: create an index on a function which sorts in the order you want,
> and then always sort using the function index (you could use the
> -extract(epoch ...) gimmick for that, among other possibilities).
>
> HTH.

All good input - thanks. However, before I start messing with my stuff, which
I know will be complex, some questions to any of the developers on the list:

i)  Is it feasible to extend index creation to support descending columns?
    This is supported on other commercial and non-commercial databases, but I
    do not know if it is an SQL standard.

ii) If no to i), is it feasible to extend PostgreSQL to allow traversing an
    index in column-descending and column-ascending order - assuming an ORDER
    BY on more than one column with the column order not all in the same
    direction, and suitable indexes existing... if that makes sense.

-- Regards Theo
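The dummy-column trick in the second suggestion can be sanity-checked outside the database. A minimal sketch in Python (column names from the thread, sample data invented): negating the epoch turns a mixed ASC/DESC ordering into a single all-ascending sort key, which is exactly what a plain multi-column btree index can serve.

```python
from datetime import datetime, timezone

# Sample (prcdedit_prcd, prcdedit_date) rows -- invented data.
rows = [
    ("ACC00000001", datetime(2006, 3, 1, tzinfo=timezone.utc)),
    ("ACC00000001", datetime(2006, 3, 17, tzinfo=timezone.utc)),
    ("ACC00000002", datetime(2006, 2, 28, tzinfo=timezone.utc)),
]

# ORDER BY prcdedit_prcd ASC, prcdedit_date DESC becomes a single
# ascending sort once the date is replaced by its negated epoch --
# the same idea as the -extract(epoch from prcdedit_date) dummy column.
ordered = sorted(rows, key=lambda r: (r[0], -r[1].timestamp()))
```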
[PERFORM] Problem with query, server totally unresponsive
Hello,

I have a big problem with one of my databases. When I run my query, after a
few minutes the postmaster shows 99% mem in top, and the server becomes
totally unresponsive. I get this message when I try to cancel the query:

server closed the connection unexpectedly
This probably means the server terminated abnormally before or while
processing the request.
The connection to the server was lost. Attempting reset: Failed.

This works fine on a different machine with the same database settings and
about 30% fewer records. The other machine is running PostgreSQL 8.0.3; the
troubled one is running 8.1.2.

Any help is greatly appreciated! Thanks

The machine has 2x Intel dual-core processors (3GHz) and 2 GB of RAM.

#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 8192                  # min 16 or max_connections*2, 8KB each
#temp_buffers = 1000                   # min 100, 8KB each
#max_prepared_transactions = 5         # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 4096                        # min 64, size in KB
maintenance_work_mem = 262144          # min 1024, size in KB
#max_stack_depth = 2048                # min 100, size in KB

My query:

SELECT r.id AS id,
       max(r.name) AS name,
       max(companyid) AS companyid,
       max(extract(epoch from r.updated)) AS r_updated,
       hydra.join(co.value) AS contacts,
       hydra.join(ad.postalsite) AS postalsites
  FROM records r
  LEFT OUTER JOIN contacts co ON (r.id = co.record AND co.type IN (1,11,101,3))
  LEFT OUTER JOIN addresses ad ON (r.id = ad.record)
 WHERE r.original IS NULL
 GROUP BY r.id;

The hydra.join function -- aggregates a column to an array:

DROP FUNCTION hydra.join_aggregate(text, text) CASCADE;
DROP FUNCTION hydra.join_aggregate_to_array(text);

CREATE FUNCTION hydra.join_aggregate(text, text) RETURNS text
  AS 'select $1 || ''|'' || $2'
  LANGUAGE sql IMMUTABLE STRICT;

CREATE FUNCTION hydra.join_aggregate_to_array(text) RETURNS text[]
  AS 'SELECT string_to_array($1, ''|'')'
  LANGUAGE sql IMMUTABLE STRICT;

CREATE AGGREGATE hydra.join (
  BASETYPE = text,
  SFUNC = hydra.join_aggregate,
  STYPE = text,
  FINALFUNC = hydra.join_aggregate_to_array
);

Tables:
 records:   757278 rows
 contacts:  2256253 rows
 addresses: 741536 rows

Explain:
                                   QUERY PLAN
--------------------------------------------------------------------------------
 GroupAggregate  (cost=636575.63..738618.40 rows=757278 width=75)
   ->  Merge Left Join  (cost=636575.63..694469.65 rows=1681120 width=75)
         Merge Cond: ("outer".id = "inner".record)
         ->  Merge Left Join  (cost=523248.93..552247.54 rows=1681120 width=63)
               Merge Cond: ("outer".id = "inner".record)
               ->  Sort  (cost=164044.73..165937.93 rows=757278 width=48)
                     Sort Key: r.id
                     ->  Seq Scan on records r  (cost=0.00..19134.78 rows=757278 width=48)
                           Filter: (original IS NULL)
               ->  Sort  (cost=359204.20..363407.00 rows=1681120 width=19)
                     Sort Key: co.record
                     ->  Seq Scan on contacts co  (cost=0.00..73438.06 rows=1681120 width=19)
                           Filter: ((type = 1) OR (type = 11) OR (type = 101) OR (type = 3))
         ->  Sort  (cost=113326.70..115180.54 rows=741536 width=16)
               Sort Key: ad.record
               ->  Seq Scan on addresses ad  (cost=0.00..20801.36 rows=741536 width=16)
(16 rows)
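For reference, the aggregate's transition logic can be modelled in plain Python. This is a sketch of the same fold, not the server-side implementation; it also hints at a cost issue, since every transition step builds a new, longer string, so very large groups get expensive:

```python
def join_sfunc(state: str, value: str) -> str:
    # Mirrors the SFUNC: select $1 || '|' || $2
    return state + "|" + value

def join_final(state: str) -> list[str]:
    # Mirrors the FINALFUNC: SELECT string_to_array($1, '|')
    return state.split("|")

def hydra_join(values) -> "list[str] | None":
    """Fold values the way the aggregate does (STRICT skips NULL/None)."""
    state = None
    for v in values:
        if v is None:
            continue                       # STRICT transition skips NULLs
        state = v if state is None else join_sfunc(state, v)
    return None if state is None else join_final(state)
```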
se_companies=# \d records;
                          Table "public.records"
     Column      |           Type           |            Modifiers
-----------------+--------------------------+----------------------------------
 id              | integer                  | not null default nextval('records_id_seq'::regclass)
 companyid       | character varying(16)    | default ''::character varying
 categories      | integer[]                |
 nace            | integer[]                |
 name            | character varying(255)   | default ''::character varying
 updated         | timestamp with time zone | default ('now'::text)::timestamp(6) with time zone
 updater         | integer                  |
 owner           | integer                  |
 loaner          | integer                  |
 info            | text                     |
 original        | integer                  |
 active          | boolean                  | default true
 categoryquality | integer                  | not null default 0
 searchwords     | character varying(128)[] |
 priority        | integer                  |
 categorized
Re: [PERFORM] Indexes with descending date columns
Theo Kramer wrote:
> All good input - thanks. However, before I start messing with my stuff,
> which I know will be complex, some questions to any of the developers on
> the list:
>
> i) Is it feasible to extend index creation to support descending columns?
>    This is supported on other commercial and non-commercial databases, but
>    I do not know if it is an SQL standard.

This can be done. You need to create an operator class which specifies the
reverse sort order (i.e. reverses the operators), and then use it in the new
index.

--
Alvaro Herrera                           http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
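The effect of "reversing the operators" can be illustrated with an ordinary three-way comparison function. A minimal sketch in Python (not PostgreSQL's opclass machinery, just the idea):

```python
from functools import cmp_to_key

def cmp_asc(a, b):
    # Default btree-style three-way comparison: -1, 0, or 1.
    return (a > b) - (a < b)

def cmp_desc(a, b):
    # The reverse-sort "opclass": the same comparison with the operands
    # swapped, i.e. < and > exchanged.
    return cmp_asc(b, a)

ascending = sorted([3, 1, 2], key=cmp_to_key(cmp_asc))
descending = sorted([3, 1, 2], key=cmp_to_key(cmp_desc))
```

An index built with the reversed operator class stores keys in what the rest of the system sees as ascending order, so an ORDER BY on that column descending can be served by a plain forward index scan.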
Re: [PERFORM] Scaling up PostgreSQL in Multiple CPU / Dual Core Powered Servers
[EMAIL PROTECTED] (Jojo Paderes) wrote:
> I'd like to know if the latest PostgreSQL release can scale up by utilizing
> multiple CPUs or dual-core CPUs to speed up SQL execution. I have already
> done some research on the PostgreSQL mailing archives and only found old
> threads dating back to 2000. A lot of things have improved in PostgreSQL,
> and hopefully support for multiple CPUs or dual cores is already provided.

If you submit multiple concurrent queries, they can be concurrently processed
on separate CPUs; that has long been supported, and people have been using
SMP systems to this end for years.

--
let name=cbbrowne and tld=gmail.com in String.concat "@" [name;tld];;
http://cbbrowne.com/info/spreadsheets.html
"In other words -- and this is the rock solid principle on which the whole of
the Corporation's Galaxy-wide success is founded -- their fundamental design
flaws are completely hidden by their superficial design flaws." -- HHGTG
[PERFORM] Problem with query, forget previous message
It seems the problem was the custom aggregate function not being able to
handle thousands of rows.
Re: [PERFORM] Scaling up PostgreSQL in Multiple CPU / Dual Core
On Thu, 23 Mar 2006 14:19:24 +0800, Jojo Paderes [EMAIL PROTECTED] wrote:
> I'd like to know if the latest PostgreSQL release can scale up by utilizing
> multiple CPUs or dual-core CPUs to speed up SQL execution. I have already
> done some research on the PostgreSQL mailing archives and only found old
> threads dating back to 2000. A lot of things have improved in PostgreSQL,
> and hopefully support for multiple CPUs or dual cores is already provided.

Yes, PostgreSQL can take advantage of multiple CPUs and cores, and has been
able to for quite some time.

-
  Frank Wiles [EMAIL PROTECTED]
  http://www.wiles.org
-
Re: [PERFORM] Scaling up PostgreSQL in Multiple CPU / Dual Core
On Thu, 2006-03-23 at 00:19, Jojo Paderes wrote:
> I'd like to know if the latest PostgreSQL release can scale up by utilizing
> multiple CPUs or dual-core CPUs to speed up SQL execution. I have already
> done some research on the PostgreSQL mailing archives and only found old
> threads dating back to 2000. A lot of things have improved in PostgreSQL,
> and hopefully support for multiple CPUs or dual cores is already provided.

Can a single query be split up into parts and run on separate processors at
the same time? No.

Can multiple incoming queries be run on different processors for better
performance? Yes.

Has someone been working on the problem of splitting a query into pieces and
running it on multiple CPUs / multiple machines? Yes, Bizgress has done that.
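The "multiple concurrent queries on multiple CPUs" model amounts to one worker per connection. A rough sketch with Python threads standing in for backends (an illustration only; PostgreSQL actually forks one server process per connection, and the OS schedules those processes across the available CPUs):

```python
from concurrent.futures import ThreadPoolExecutor

def run_query(query_id: int) -> tuple[int, int]:
    # Stand-in for one backend executing one client's query.
    return query_id, sum(i * i for i in range(10_000))

# Three "connections" issuing queries at the same time; the scheduler is
# free to place each worker on a different CPU. No single query is split,
# but the overall workload uses all processors.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(pool.map(run_query, [1, 2, 3]))
```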
Re: [PERFORM] Scaling up PostgreSQL in Multiple CPU / Dual Core
> Has someone been working on the problem of splitting a query into pieces
> and running it on multiple CPUs / multiple machines? Yes, Bizgress has
> done that.

I believe that is limited to Bizgress MPP, yes?
Re: [PERFORM] Scaling up PostgreSQL in Multiple CPU / Dual Core
On Thu, 2006-03-23 at 10:43, Joshua D. Drake wrote:
>> Has someone been working on the problem of splitting a query into pieces
>> and running it on multiple CPUs / multiple machines? Yes, Bizgress has
>> done that.
>
> I believe that is limited to Bizgress MPP, yes?

Yep. I hope that someday it will be released to the PostgreSQL global
development group for inclusion. Or at least parts of it.
Re: [PERFORM] Scaling up PostgreSQL in Multiple CPU / Dual Core
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Scott Marlowe)
wrote:
> On Thu, 2006-03-23 at 10:43, Joshua D. Drake wrote:
>>> Has someone been working on the problem of splitting a query into pieces
>>> and running it on multiple CPUs / multiple machines? Yes, Bizgress has
>>> done that.
>>
>> I believe that is limited to Bizgress MPP, yes?
>
> Yep. I hope that someday it will be released to the PostgreSQL global
> development group for inclusion. Or at least parts of it.

Question: does Bizgress MPP use threading for this concurrency, or forking?
If it does so via forking, that's more portable and less dependent on the
specific complexities of threading implementations (which amount to
non-portability ;-)).

Most times Jan comes to town, we spend a few minutes musing about the
"splitting queries across threads" problem and dismiss it again; if there's
the beginning of a split across processes, that's decidedly neat :-).

--
output = reverse("moc.liamg @ enworbbc")
http://linuxfinances.info/info/internet.html
"Why do we put suits in a garment bag, and put garments in a suitcase?"
Re: [PERFORM] Indexes with descending date columns
On Thu, 2006-03-23 at 16:16, Alvaro Herrera wrote:
> Theo Kramer wrote:
>> All good input - thanks. However, before I start messing with my stuff,
>> which I know will be complex, some questions to any of the developers on
>> the list:
>>
>> i) Is it feasible to extend index creation to support descending columns?
>>    This is supported on other commercial and non-commercial databases,
>>    but I do not know if it is an SQL standard.
>
> This can be done. You need to create an operator class which specifies the
> reverse sort order (i.e. reverses the operators), and then use it in the
> new index.

Hmmm, would that then result in the following syntax being valid?

  create index my_idx on my_table (c1, c2 desc, c3, c4 desc);

where my_table is defined as

  create table my_table (
    c1 text,
    c2 timestamp,
    c3 integer,
    c4 integer
  );

If so, I would appreciate any pointers on where to start on this - already
fumbling my way through "Interfacing Extensions To Indexes" in the manual...

-- Regards Theo
Re: [PERFORM] Indexes with descending date columns
Theo Kramer [EMAIL PROTECTED] writes:
> If so, I would appreciate any pointers on where to start on this - already
> fumbling my way through "Interfacing Extensions To Indexes" in the
> manual...

Search the PG list archives for discussions of reverse-sort opclasses. It's
really pretty trivial, once you've created a negated btree comparison
function for the datatype.

This is the sort of thing that we are almost, but not quite, ready to put
into the standard distribution. The issues that are bugging me have to do
with whether NULLs sort low or high --- right now, if you make a reverse-sort
opclass, it will effectively sort NULLs low instead of high, and that has
some unpleasant consequences because the rest of the system isn't prepared
for variance on this point (in particular, I'm afraid it could break
mergejoins). I'd like to see us make NULLs-low vs NULLs-high a defined
property of opclasses, and deal with the fallout from that; then we could put
reverse-sort opclasses for all the standard datatypes into the regular
distribution.

regards, tom lane
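The NULL-placement side effect Tom describes can be reproduced with a toy comparator. A sketch in Python with None standing in for NULL (assuming, per the message above, that the default ascending btree order sorts NULLs high):

```python
from functools import cmp_to_key

def btree_cmp(a, b):
    # Three-way comparison with NULLs (None) sorting high, as in the
    # default ascending btree order.
    if a is None and b is None:
        return 0
    if a is None:
        return 1    # NULL sorts after every non-NULL value
    if b is None:
        return -1
    return (a > b) - (a < b)

def negated_cmp(a, b):
    # The negated btree comparison function of a reverse-sort opclass.
    return -btree_cmp(a, b)

asc = sorted([3, None, 1], key=cmp_to_key(btree_cmp))     # NULL sorts last
desc = sorted([3, None, 1], key=cmp_to_key(negated_cmp))  # NULL sorts first
```

Negating the comparator flips everything, including the NULLs: they move from sorting high to sorting low, which is exactly the inconsistency that worries the rest of the system.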