Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 02:57:47PM -0700, Peter Geoghegan wrote:
> On Tue, Oct 24, 2017 at 1:11 PM, Justin Pryzby <pry...@telsasoft.com> wrote:
> > ..which I gather just verifies that the index is corrupt, not sure if there's
> > anything else to do with it?  Note, we've already removed the duplicate rows.
>
> Yes, the index itself is definitely corrupt -- this failed before the
> new "heapallindexed" check even started. Though it looks like that
> would have failed too, if you got that far, since the index points to
> a row that does not contain the same data. (I only know this because
> you dumped the heap tuple and the index tuple.)

I think you must have compared these:

On Tue, Oct 24, 2017 at 03:11:44PM -0500, Justin Pryzby wrote:
> ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));
>
> itemoffset | 48
> ctid       | (1,37)
> itemlen    | 32
> nulls      | f
> vars       | t
> data       | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00
...
> itemoffset | 37
> ctid       | (0,97)
> itemlen    | 24
> nulls      | f
> vars       | t
> data       | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00

..but note those are both items in sites_idx (48 and 37, which I seem to have
pasted out of order)..

I included both because I'm not confident I know what the "index tid=(1,37)"
referred to, but I gather it means the item at offset=37 (and not the item
with ctid=(1,37)).

| [pryzbyj@database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid, heapallindexed=>True)"
| ERROR:  high key invariant violated for index "sites_idx"
| DETAIL:  Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.
ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
 lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----+----------+-------+-------+-------+---------+----------+---------+-----------
 0/0 |        0 |     0 |   872 |  1696 |    8176 |     8192 |       4 |         0

Here is its heap page:

ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 0)) WHERE lp=97;
 lp | lp_off | lp_flags | lp_len | t_xmin |  t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid | t_data
----+--------+----------+--------+--------+----------+----------+--------+-------------+------------+--------+----------+-------+----------------------------------------
 97 |    968 |        1 |     52 |  21269 | 33567444 |        0 | (3,27) |        8204 |       2307 |     32 | 11101001 |       | \x70001b4352434c4d542d43454d5330030303

Which I see ends with 0303 vs ..

t_infomask=2307=2048+256+3 =>
#define HEAP_HASNULL            0x0001  /* has null attribute(s) */
#define HEAP_HASVARWIDTH        0x0002  /* has variable-width attribute(s) */
#define HEAP_XMIN_COMMITTED     0x0100  /* t_xmin committed */
#define HEAP_XMAX_INVALID       0x0800  /* t_xmax invalid/aborted */

t_infomask2=8204 => 8192+12 =>
#define HEAP_KEYS_UPDATED       0x2000  /* tuple was updated and key cols modified, or tuple deleted */

Maybe this is relevant?

ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 3)) WHERE lp=27;
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------
 27 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |       |

Justin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
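[Editor's note: the infomask decompositions above can be checked mechanically. This is a small throwaway sketch (not part of any PostgreSQL tooling) using only the flag values quoted from htup_details.h above; the low 11 bits of t_infomask2 (HEAP_NATTS_MASK, 0x07FF) are the attribute count.]

```python
# Decode the heap tuple infomask bits quoted above (values from
# src/include/access/htup_details.h; only the flags relevant here).
HEAP_INFOMASK_FLAGS = {
    0x0001: "HEAP_HASNULL",
    0x0002: "HEAP_HASVARWIDTH",
    0x0100: "HEAP_XMIN_COMMITTED",
    0x0800: "HEAP_XMAX_INVALID",
}
HEAP_INFOMASK2_FLAGS = {
    0x2000: "HEAP_KEYS_UPDATED",
}
HEAP_NATTS_MASK = 0x07FF  # low 11 bits of t_infomask2 = number of attributes

def decode_infomask(mask, flags):
    """Return the names of the flag bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(flags.items()) if mask & bit]

# t_infomask=2307 -> HASNULL, HASVARWIDTH, XMIN_COMMITTED, XMAX_INVALID
print(decode_infomask(2307, HEAP_INFOMASK_FLAGS))
# t_infomask2=8204 -> KEYS_UPDATED, and 8204 & 0x7FF = 12 attributes
print(decode_infomask(8204, HEAP_INFOMASK2_FLAGS), 8204 & HEAP_NATTS_MASK)
```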
Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 02:57:47PM -0700, Peter Geoghegan wrote:
> On Tue, Oct 24, 2017 at 1:11 PM, Justin Pryzby <pry...@telsasoft.com> wrote:
> > ..which I gather just verifies that the index is corrupt, not sure if there's
> > anything else to do with it?  Note, we've already removed the duplicate rows.
>
> Maybe you could try verifying a different index on the same table with
> "heapallindexed", too. Perhaps that would fail in a more interesting
> way.

The only other index seems fine:

[pryzbyj@database ~]$ psql --port 5678 ts -txc "SELECT bt_index_parent_check('sites_pkey'::regclass::oid, heapallindexed=>True)"
bt_index_parent_check |

[pryzbyj@database ~]$ psql --port 5678 ts -txc "SELECT bt_index_check('sites_pkey'::regclass::oid, heapallindexed=>True)"
bt_index_check |

> You're using LVM snapshots -- I hope that you're aware that they're
> not guaranteed to be consistent across logical volumes. There are a
> few different ways that they could cause corruption like this if you
> weren't careful. (In general, I wouldn't recommend using LVM snapshots
> as any kind of backup solution.)

Right, I asked a coworker to create the snapshot before trying to fix the
immediate problem, as a forensic measure.  We have postgres on just one LVM LV.

Justin
Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 01:48:55PM -0500, Kenneth Marshall wrote:
> I just dealt with a similar problem with pg_repack and a PostgreSQL 9.5 DB,
> the exact same error.  It seemed to be caused by a tuple visibility issue that
> allowed the "working" unique index to be built, even though a duplicate row
> existed.  Then the next pg_repack would fail with the error you got.

FTR, I was able to run the repack script several times without issue, hitting
that table each time.

Justin
Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 12:31:49PM -0700, Peter Geoghegan wrote:
> On Tue, Oct 24, 2017 at 11:48 AM, Kenneth Marshall <k...@rice.edu> wrote:
> >> Really ?  pg_repack "found" and was victim to the duplicate keys, and rolled
> >> back its work.  The CSV logs clearly show that our application INSERTed rows
> >> which are duplicates.
> >>
> >> [pryzbyj@database ~]$ rpm -qa pg_repack10
> >> pg_repack10-1.4.2-1.rhel6.x86_64
> >>
> >> Justin
> >
> > Hi Justin,
> >
> > I just dealt with a similar problem with pg_repack and a PostgreSQL 9.5 DB,
> > the exact same error.  It seemed to be caused by a tuple visibility issue that
> > allowed the "working" unique index to be built, even though a duplicate row
> > existed.  Then the next pg_repack would fail with the error you got.  In our
> > case I needed the locality of reference to keep the DB performance acceptable
> > and it was not a critical issue if there was a duplicate.  We would remove the
> > duplicates if we had a failure.  We never had a problem with the NO pg_repack
> > scenarios.
>
> A new, enhanced version of the corruption detection tool amcheck is
> now available, and has both apt + yum packages available:
>
> https://github.com/petergeoghegan/amcheck
>
> Unlike the version in Postgres 10, this enhanced version (V1.2) has
> "heapallindexed" verification, which is really what you want here. If
> you install it, and run it against the unique index in question (with
> "heapallindexed" verification), that could help. It might provide a
> more useful diagnostic message.
>
> This is very new, so do let me know how you get on if you try it out.
I started an instance connected to a copy of the LVM snapshot I saved:

[pryzbyj@database ~]$ sudo -u postgres /usr/pgsql-10/bin/postmaster -c port=5678 -D /mnt/10/data

[pryzbyj@database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid, heapallindexed=>True)"
ERROR:  high key invariant violated for index "sites_idx"
DETAIL:  Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.

[pryzbyj@database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_parent_check('sites_idx'::regclass::oid, heapallindexed=>True)"
ERROR:  high key invariant violated for index "sites_idx"
DETAIL:  Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.

ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
lsn       | 0/0
checksum  | 0
flags     | 0
lower     | 872
upper     | 1696
special   | 8176
pagesize  | 8192
version   | 4
prune_xid | 0

ts=# SELECT * FROM page_header(get_raw_page('sites', 0));
lsn       | 1FB/AC5A4908
checksum  | 0
flags     | 5
lower     | 436
upper     | 464
special   | 8192
pagesize  | 8192
version   | 4
prune_xid | 0

ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));

itemoffset | 48
ctid       | (1,37)
itemlen    | 32
nulls      | f
vars       | t
data       | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00

itemoffset | 37
ctid       | (0,97)
itemlen    | 24
nulls      | f
vars       | t
data       | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00

..which I gather just verifies that the index is corrupt, not sure if there's
anything else to do with it?  Note, we've already removed the duplicate rows.

Justin
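[Editor's note: the "data" columns above are the raw index-tuple attribute bytes. As a sanity check they can be decoded by hand; this is a throwaway helper (not part of amcheck or pageinspect), assuming little-endian 1-byte "short varlena" headers, where an odd header byte encodes the total length (header >> 1) including the header itself, and trailing 00 bytes are alignment padding.]

```python
# Rough decoder for the pageinspect "data" hex dumps shown above.
def decode_short_varlenas(hexdump):
    raw = bytes.fromhex(hexdump.replace(" ", ""))
    out, i = [], 0
    while i < len(raw):
        hdr = raw[i]
        if hdr == 0x00:                  # alignment padding
            i += 1
            continue
        if hdr & 0x01 == 0:
            raise ValueError("4-byte varlena header: not handled in this sketch")
        total = hdr >> 1                 # total length, including the header byte
        if total == 0:
            raise ValueError("0x01 marks a TOAST pointer: not handled here")
        out.append(raw[i + 1:i + total].decode("ascii"))
        i += total
    return out

# index item 48: decodes to 'CRCLMT-CEMS0' and '1111'
print(decode_short_varlenas(
    "1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00"))
# index item 37: decodes to 'CRCLMT-CEMS0' and an empty string (0x03 header)
print(decode_short_varlenas(
    "1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00"))
```

This would explain the differing itemlen (32 vs 24): the two tuples share the first attribute but differ in the second.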
Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 01:27:14PM -0500, Kenneth Marshall wrote:
> On Tue, Oct 24, 2017 at 01:14:53PM -0500, Justin Pryzby wrote:
> > Note:
> > I run a script which does various combinations of ANALYZE/VACUUM (FULL/ANALYZE)
> > following the upgrade, and a script runs nightly with REINDEX and pg_repack
> > (and a couple of CLUSTER), so you should assume that any combination of those
> > maintenance commands has been run.
> >
> > In our reindex/repack log I found the first error due to duplicates:
> > Tue Oct 24 01:27:53 MDT 2017: sites: sites_idx(repack non-partitioned)...
> > WARNING: Error creating index "public"."index_61764": ERROR:  could not create unique index "index_61764"
> > DETAIL:  Key (site_office, site_location)=(CRCLMT-DOEMS0, 1120) is duplicated.
> > WARNING: Skipping index swapping for "sites", since no new indexes built
> > WARNING: repack failed for "sites_idx"
> > reindex: warning, dropping invalid/unswapped index: index_61764
>
> Hi Justin,
>
> This sounds like a pg_repack bug and not a PostgreSQL bug.  What version are
> you running?

Really ?  pg_repack "found" and was victim to the duplicate keys, and rolled
back its work.  The CSV logs clearly show that our application INSERTed rows
which are duplicates.

[pryzbyj@database ~]$ rpm -qa pg_repack10
pg_repack10-1.4.2-1.rhel6.x86_64

Justin
[HACKERS] unique index violation after pg_upgrade to PG10
x "index_61764"
DETAIL:  Key (site_office, site_location)=(CRCLMT-DOEMS0, 1120) is duplicated.
WARNING: Skipping index swapping for "sites", since no new indexes built
WARNING: repack failed for "sites_idx"
reindex: warning, dropping invalid/unswapped index: index_61764

postgres=# SELECT * FROM postgres_log WHERE session_id='59eeeb79.4bd5' AND error_severity='ERROR' ORDER BY 1 DESC LIMIT 33;
log_time           | 2017-10-24 01:27:53.545-06
user_name          | postgres
database           | ts
pid                | 19413
connection_from    | [local]
session_id         | 59eeeb79.4bd5
session_line       | 10
command_tag        | CREATE INDEX
session_start_time | 2017-10-24 01:27:53-06
error_severity     | ERROR
message            | could not create unique index "index_61764"
detail             | Key (site_office, site_location)=(CRCLMT-DOEMS0, 1120) is duplicated.
query              | CREATE UNIQUE INDEX CONCURRENTLY index_61764 ON sites USING btree (site_office, site_location) TABLESPACE pg_default
application_name   | pg_repack

Disclosure: I see a storage error from 2 days ago (pre-upgrade).  This is a
4-drive RAID5 with spare.  I can't see that we're missing a spare, so I don't
know what else to do besides consider this message to be spurious..

Oct 22 21:28:16 midrivers-ubuntu kernel: [1429369.825577] hpsa :05:00.0: scsi 4:0:0:0 Aborting command 8f84f2ba7c00Tag:0x:01f0 CDBLen: 6 CDB: 0x1201... SN: 0x0 BEING SENT
Oct 22 21:28:16 midrivers-ubuntu kernel: [1429369.825583] hpsa :05:00.0: scsi 4:0:0:0: Aborting command RAID HP P410i controller SSDSmartPathCap- En- Exp=1
Oct 22 21:28:16 midrivers-ubuntu kernel: [1429369.898930] hpsa :05:00.0: CDB 120180006000 was aborted with status 0x0
Oct 22 21:28:17 midrivers-ubuntu kernel: [1429371.269963] hpsa :05:00.0: invalid command: LUN: CDB:f001
Oct 22 21:28:17 midrivers-ubuntu kernel: [1429371.269966] hpsa :05:00.0: probably means device no longer present
Oct 22 21:28:17 midrivers-ubuntu kernel: [1429371.269968] hpsa :05:00.0: scsi 4:0:0:0 Aborting command 8f84f2ba7c00Tag:0x:01f0 CDBLen: 6 CDB: 0x1201... SN: 0x0 SENT, FAILED
Oct 22 21:28:17 midrivers-ubuntu kernel: [1429371.269974] hpsa :05:00.0: scsi 4:0:0:0: FAILED to abort command RAID HP P410i controller SSDSmartPathCap- En- Exp=1

sites is perhaps our most central table and I would expect issues there to be
quickly apparent (however this was at least initially a silent failure).

 public | sites | table | telsasoft | 80 kB |

Justin
[HACKERS] per-session errors after interrupting CLUSTER pg_attrdef
This was briefly scary but seems to have been limited to my psql session (no
other errors logged).  Issue with catcache (?)

I realized that the backup job I'd kicked off was precluding the CLUSTER from
running, but that CLUSTER was still holding lock and stalling everything else
under the sun.

psql (10.0, server 9.6.5)
...
ts=# CLUSTER pg_attribute USING pg_attribute_relid_attnum_index ;
^CCancel request sent
ERROR:  canceling statement due to user request
ts=# \df qci_add*
ERROR:  could not read block 8 in file "base/16400/999225102": read only 0 of 8192 bytes
ts=# \dt+ pg_att
ts=# \dt+ pg_attrdef
ERROR:  could not read block 8 in file "base/16400/999225102": read only 0 of 8192 bytes
ts=# ^C
ts=# \q

postgres=# SELECT session_line ln, user_name, error_severity sev, left(message,66), left(query,66) FROM postgres_log_2017_10_19_1700 WHERE session_id='59e93953.20c9' ORDER BY 1,2 DESC;
 ln | user_name | sev   | left                                                               | left
----+-----------+-------+--------------------------------------------------------------------+--------------------------------------------------------------------
  1 | pryzbyj   | LOG   | statement: CLUSTER pg_attribute USING pg_attribute_relid_attnum_in |
  2 | pryzbyj   | ERROR | canceling statement due to user request                            | CLUSTER pg_attribute USING pg_attribute_relid_attnum_index ;
  3 | pryzbyj   | LOG   | statement: SELECT n.nspname as "Schema",                          +|
    |           |       |   p.proname as "Name",                                            +|
  4 | pryzbyj   | ERROR | could not read block 8 in file "base/16400/999225102": read only 0 | SELECT n.nspname as "Schema",                                     +
    |           |       |                                                                    |   p.proname as "Name",                                            +
    |           |       |                                                                    | pg_catalog.
  5 | pryzbyj   | LOG   | statement: SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalo |
  6 | pryzbyj   | ERROR | could not read block 1 in file "base/16400/999225102": read only 0 | SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class
  7 | pryzbyj   | LOG   | statement: SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalo |
  8 | pryzbyj   | ERROR | could not read block 1 in file "base/16400/999225102": read only 0 | SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class
  9 | pryzbyj   | LOG   | statement: SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalo |
 10 | pryzbyj   | ERROR | could not read block 1 in file "base/16400/999225102": read only 0 | SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class
 11 | pryzbyj   | LOG   | statement: SELECT n.nspname as "Schema",                          +|
    |           |       |   c.relname as "Name",                                            +|
 12 | pryzbyj   | ERROR | could not read block 8 in file "base/16400/999225102": read only 0 | SELECT n.nspname as "Schema",                                     +
    |           |       |                                                                    |   c.relname as "Name",                                            +
    |           |       |                                                                    | CASE c.relk
(12 rows)
Re: [HACKERS] [GENERAL] huge RAM use in multi-command ALTER of table heirarchy
Hi,

I just ran into this again in another context (see original discussion, quoted
below).

Some time ago, while initially introducing a non-default stats target, I set
our non-filtering columns to "STATISTICS 10", lower than default, to minimize
the size of pg_statistic, which (at least at one point) I perceived to have
become bloated and to be causing issues (partially due to having an excessive
number of "daily" granularity partitions, a problem I've since mitigated).

The large number of columns with non-default stats target was (I think)
causing pg_dump --section=pre-data to take 10+ minutes, which makes pg_upgrade
more disruptive than necessary, so now I'm going back and fixing it.

[pryzbyj@database ~]$ time sed '/SET STATISTICS 10;$/!d; s//SET STATISTICS -1;/' /srv/cdrperfbackup/ts/2017-10-17/pg_dump-section\=pre-data |psql -1q ts
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
connection to server was lost

[pryzbyj@database ~]$ dmesg |tail -n2
Out of memory: Kill process 6725 (postmaster) score 550 or sacrifice child
Killed process 6725, UID 26, (postmaster) total-vm:13544792kB, anon-rss:8977764kB, file-rss:8kB

So I'm hoping to encourage someone to commit the change contemplated earlier.
Thanks in advance.

Justin

On Tue, Jul 18, 2017 at 07:26:30PM -0400, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > I've seen this before while doing SET STATISTICS on a larger number of columns
> > using xargs, but just came up while doing ADD of a large number of columns.
> > Seems to be roughly linear in number of children but superlinear WRT columns.
> > I think having to do with catalog update / cache invalidation with many
> > ALTERs*children*columns?
>
> I poked into this a bit.
> The operation is necessarily roughly O(N^2) in
> the number of columns, because we rebuild the affected table's relcache
> entry after each elementary ADD COLUMN operation, and one of the principal
> components of that cost is reading all the pg_attribute entries.  However,
> that should only be a time cost not a space cost.  Eventually I traced the
> O(N^2) space consumption to RememberToFreeTupleDescAtEOX, which seems to
> have been introduced in Simon's commit e5550d5fe, and which strikes me as
> a kluge of the first magnitude.  Unless I am missing something, that
> function's design concept can fairly be characterized as "let's leak
> memory like there's no tomorrow, on the off chance that somebody somewhere
> is ignoring basic coding rules".
>
> I tried ripping that out, and the peak space consumption of your example
> (with 20 child tables and 1600 columns) decreased from ~3GB to ~200MB.
> Moreover, the system still passes make check-world, so it's not clear
> to me what excuse this code has to live.
>
> It's probably a bit late in the v10 cycle to be taking any risks in
> this area, but I'd vote for ripping out RememberToFreeTupleDescAtEOX
> as soon as the v11 cycle opens, unless someone can show an example
> of non-broken coding that requires it.  (And if so, there ought to
> be a regression test incorporating that.)
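[Editor's note: since the leaked tuple descriptors are only released at end of transaction ("AtEOX"), a workaround consistent with the analysis above is to stop running every generated ALTER in one big transaction (psql -1) and instead issue them in many small transactions, bounding per-transaction memory. A sketch of chunking the generated statements; the batch size of 100 is an arbitrary illustrative choice, not a tuned value.]

```python
# Sketch: group generated ALTER ... SET STATISTICS statements into
# small batches, each intended to run in its own transaction (its own
# psql invocation) rather than a single "psql -1" transaction.
BATCH = 100  # arbitrary; any bound keeps per-transaction memory in check

def chunk(statements, size=BATCH):
    """Yield successive lists of at most `size` statements."""
    for i in range(0, len(statements), size):
        yield statements[i:i + size]

statements = [
    'ALTER TABLE t%d ALTER c SET STATISTICS -1;' % n for n in range(250)
]
for batch in chunk(statements):
    # each batch would be piped to its own "psql ts" invocation,
    # e.g. subprocess.run(["psql", "ts"], input="\n".join(batch), text=True)
    print(len(batch), "statements in this transaction")
```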
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Wed, Oct 18, 2017 at 07:22:27PM +0200, Alvaro Herrera wrote:
> Do you still have those core dumps?  If so, would you please verify the
> database that autovacuum was running in?  Just open each with gdb (using
> the original postgres binary, not the one you just installed) and do
> "print MyDatabaseId".

[pryzbyj@database ~]$ gdb ccpp-2017-10-16-23:21:22-31543/coredump -ex 'p MyDatabaseId' -ex q 2>/dev/null |tail -5
Core was generated by `postgres: autovacuum worker process   gtt '.
Program terminated with signal 11, Segmentation fault.
#0  index_close (relation=0x324647603246466, lockmode=1) at indexam.c:178
178             LockRelId       relid = relation->rd_lockInfo.lockRelId;
$1 = 16400

[pryzbyj@database ~]$ gdb ccpp-2017-10-14-18:05:35-26500/coredump -ex 'p MyDatabaseId' -ex q 2>/dev/null |tail -5
Core was generated by `postgres: autovacuum worker process   gtt '.
Program terminated with signal 11, Segmentation fault.
#0  pfree (pointer=0x298c740) at mcxt.c:954
954             (*context->methods->free_p) (context, pointer);
$1 = 16400

gtt=# SELECT oid,datname FROM pg_database;
 13456 | template0
 16400 | gtt
 13457 | postgres
     1 | template1

The gtt DB is where the (only) BRIN indices are (not sure what to conclude
from that?)

Justin
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Wed, Oct 18, 2017 at 06:54:09PM +0200, Alvaro Herrera wrote:
> Justin Pryzby wrote:
>
> > No crashes in ~28hr.  It occurs to me that it's a weaker test due to not
> > preserving most compilation options.
>
> And the previous code crashes in 45 minutes?  That's solid enough for
> me; I'll clean up the patch and push in the next few days.  I think what
> you have now should be sufficient for the time being for your production
> system.

No - the crash happened 4 times since adding BRIN+autosummarize 6 days ago,
and in one instance occurred twice within 3 hours (while I was trying to query
logs for the preceding crash).

[pryzbyj@database ~]$ sudo grep -hE 'in postgres|Saved core' /var/log/messages*
Oct 13 17:22:45 database kernel: postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[40+692000]
Oct 13 17:22:47 database abrt[32387]: Saved core dump of pid 32127 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-13-17:22:47-32127 (15040512 bytes)
Oct 14 18:05:35 database kernel: postmaster[26500] general protection ip:84a177 sp:7ffd9b349b88 error:0 in postgres[40+692000]
Oct 14 18:05:35 database abrt[27564]: Saved core dump of pid 26500 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-14-18:05:35-26500 (24137728 bytes)
Oct 16 23:21:22 database kernel: postmaster[31543] general protection ip:4bd467 sp:7ffe08a94890 error:0 in postgres[40+692000]
Oct 16 23:21:22 database abrt[570]: Saved core dump of pid 31543 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-16-23:21:22-31543 (25133056 bytes)
Oct 17 01:58:36 database kernel: postmaster[8646]: segfault at 8 ip 0084a177 sp 7ffe08a94a88 error 4 in postgres[40+692000]
Oct 17 01:58:38 database abrt[9192]: Saved core dump of pid 8646 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-17-01:58:38-8646 (7692288 bytes)

> > If I understand, our crash isn't explained by the avw_database test anyway (?)
>
> I don't see why you would think that -- I disagree.

No problem - apparently I read too far into Tom's thoughts regarding memory
context.  I'll continue running with the existing patch and come back if the
issue recurs.

Thanks

Justin
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Tue, Oct 17, 2017 at 09:07:40AM -0500, Justin Pryzby wrote:
> On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
> > Justin Pryzby <pry...@telsasoft.com> writes:
> > > On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> > >> Anyway, can give this patch a try?
> >
> > The trick in this sort of situation is to make sure you build binaries
> > that match your existing install in every way except having the added
> > patch, and maybe getting installed into a different directory.
>
> I'm familiar with that process; but, these are PG10 binaries from PGDG for
> centos6 x64.  I had to add symlinks for the postgis library, but otherwise it
> seems to be working fine (although I didn't preserve as many configure
> options as your message would suggest I should have).

On Tue, Oct 17, 2017 at 12:49:55PM -0400, Tom Lane wrote:
> So what I'm thinking is that you need an error during perform_work_item,
> and/or more than one work_item picked up in the calling loop, to make this
> bug manifest.  You would need to enter perform_work_item in a

..in our case probably due to interruption by LOCK TABLE, yes?

On Tue, Oct 17, 2017 at 12:49:55PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvhe...@alvh.no-ip.org> writes:
> > And I think that's because we're not
> > checking that the namespace OID is a valid value before calling
> > get_namespace_name on it.
>
> The part of your patch that adds a check on avw_database is clearly
> correct and necessary.  I'm thinking the change you propose in
> perform_work_item is just overcomplicating code that's okay as it
> stands.  We don't need to optimize for the schema-went-away case.

No crashes in ~28hr.  It occurs to me that it's a weaker test due to not
preserving most compilation options.

If I understand, our crash isn't explained by the avw_database test anyway (?)

Should I make clean and recompile with all non-prefix options and a minimal
patch (avw_database==MyDatabaseId || continue)?

Or recompile with existing options but no patch, to first verify the crash
occurs with a locally compiled binary?

Justin
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Tue, Oct 17, 2017 at 09:34:24AM -0400, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> >> Anyway, can give this patch a try?
>
> > I've only compiled postgres once before and this is a production environment
> > (although nothing so important that the crashes are a serious concern either).
>
> > Is it reasonable to wget the postgres tarball, apply the patch, and run the
> > compiled postgres binary from the source tree, without running make install or
> > similar ?  Otherwise, would it be good enough to copy the postgres binary to
> > /usr/pgsql-10/bin (and reinstall the binary package later) ?
>
> The trick in this sort of situation is to make sure you build binaries
> that match your existing install in every way except having the added
> patch, and maybe getting installed into a different directory.
>
> So: where did you get the existing binaries?  If it's from some vendor
> packaging system, what you should do is fetch the package source, add
> the patch to the probably-nonempty set of patches the vendor is applying,
> and rebuild your own custom package version.  If you haven't done that
> before, it's a good skill to acquire ...

I'm familiar with that process; but, these are PG10 binaries from PGDG for
centos6 x64.  I had to add symlinks for the postgis library, but otherwise it
seems to be working fine (although I didn't preserve as many configure options
as your message would suggest I should have).

Justin
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> Justin Pryzby wrote:
>
> > #1  0x006a52e9 in perform_work_item (workitem=0x7f8ad1f94824) at autovacuum.c:2676
> >         cur_datname = 0x298c740 "no 1 :vartype 1184 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 146} {CONST :consttype 1184 :consttypmod -1 :constcollid 0 :constlen 8 :constbyval true :constisnull fal"...
> >         cur_nspname = 0x298c728 "s ({VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 146} {CONST :consttype 1184 :consttypmod -1 :constcollid 0 :constlen 8 :constbyv"...
> >         cur_relname = 0x298cd68 "cdrs_eric_msc_sms_2017_10_14_startofcharge_idx"
> >         __func__ = "perform_work_item"
>
> cur_datname here seems corrupted -- it points halfway into cur_nspname,
> which is also a corrupt value.  And I think that's because we're not
> checking that the namespace OID is a valid value before calling
> get_namespace_name on it.  And I'm betting that these values are all not
> what we expect, because we're not checking that we're in the correct
> database before trying to execute the work item.  I don't quite
> understand how this results in an invalid string rather than just a
> NULL, as I would have expected.

I'm happy to try the patch, but in case it makes any difference, we have few
DBs/schemas:

postgres=# \dn
  List of schemas
  Name  |  Owner
--------+----------
 public | postgres
(1 row)

postgres=# \l+
                                                            List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+-------------------------------------------
 gtt       | pryzbyj  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 2215 GB | pg_default |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 76 GB   | pg_default | default administrative connection database
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 7625 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            |
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | postgres=CTc/postgres+| 7485 kB | pg_default | default template for new databases
           |          |          |             |             | =c/postgres           |         |            |
(4 rows)

Note that db:postgres is of substantial size since I set log_statement=all and
log_destination=stderr,csv, and import all CSVs into a table hierarchy in
db:postgres.

There are 3 BRIN indices per (daily child) table in db:gtt (and none in
postgres, although that's probably a good idea for the timestamp column once
all our customers are upgraded to PG10).

Justin
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Tue, Oct 17, 2017 at 12:59:16PM +0200, Alvaro Herrera wrote:
> Anyway, can give this patch a try?

I've only compiled postgres once before and this is a production environment
(although nothing so important that the crashes are a serious concern either).

Is it reasonable to wget the postgres tarball, apply the patch, and run the
compiled postgres binary from the source tree, without running make install or
similar ?  Otherwise, would it be good enough to copy the postgres binary to
/usr/pgsql-10/bin (and reinstall the binary package later) ?

Thanks,
Justin
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Sun, Oct 15, 2017 at 02:44:58PM +0200, Tomas Vondra wrote: > Thanks, but I'm not sure that'll help, at this point. We already know > what happened (corrupted memory), we don't know "how". And core files > are mostly just "snapshots" so are not very useful in answering that :-( Is there anything I should be saving for these or hints how else to debug? I just got a segfault while looking for logs from an segfault from 2hr earlier...but it was a duplicate from the first stacktrace... < 2017-10-16 23:21:23.172 -04 >LOG: server process (PID 31543) was terminated by signal 11: Segmentation fault < 2017-10-16 23:21:23.172 -04 >DETAIL: Failed process was running: autovacuum: BRIN summarize public.gtt 162175 Oct 16 23:21:22 database kernel: postmaster[31543] general protection ip:4bd467 sp:7ffe08a94890 error:0 in postgres[40+692000] ... Oct 17 01:58:36 database kernel: postmaster[8646]: segfault at 8 ip 0084a177 sp 7ffe08a94a88 error 4 in postgres[40+692000] Oct 17 01:58:38 database abrt[9192]: Saved core dump of pid 8646 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-17-01:58:38-8646 (7692288 bytes) Oct 17 01:58:38 database abrtd: Directory 'ccpp-2017-10-17-01:58:38-8646' creation detected Oct 17 01:58:38 database abrt[9192]: Saved core dump of pid 8646 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-17-01:58:38-8646 (7692288 bytes) Oct 17 01:58:38 database abrtd: Directory 'ccpp-2017-10-17-01:58:38-8646' creation detected Oct 17 01:59:18 database abrtd: Generating core_backtrace Oct 17 01:59:18 database abrtd: Duplicate: core backtrace Oct 17 01:59:18 database abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2017-10-14-18:05:35-26500 Oct 17 01:59:18 database abrtd: Deleting problem directory ccpp-2017-10-17-01:58:38-8646 (dup of ccpp-2017-10-14-18:05:35-26500) postgres=# SELECT * FROM postgres_log_2017_10_16_2300 WHERE pid=31543; -[ RECORD 1 ]--+- log_time | 2017-10-16 23:21:22.021-04 pid| 31543 session_id | 59e57677.7b37 session_line | 2 
session_start_time     | 2017-10-16 23:18:15-04
virtual_transaction_id | 22/54376
transaction_id         | 0
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | automatic analyze of table "gtt.public.cdrs_eric_egsnpdprecord_2017_10_15"
-[ RECORD 2 ]----------+-
log_time               | 2017-10-16 23:21:22.021-04
pid                    | 31543
session_id             | 59e57677.7b37
session_line           | 3
session_start_time     | 2017-10-16 23:18:15-04
virtual_transaction_id | 22/54377
transaction_id         | 0
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | processing work entry for relation "gtt.public.cdrs_eric_msc_voice_2017_10_16_startofcharge_idx"
-[ RECORD 3 ]----------+-
log_time               | 2017-10-16 23:21:21.003-04
pid                    | 31543
session_id             | 59e57677.7b37
session_line           | 1
session_start_time     | 2017-10-16 23:18:15-04
virtual_transaction_id | 22/54374
transaction_id         | 0
error_severity         | ERROR
sql_state_code         | 57014
message                | canceling autovacuum task
context                | automatic analyze of table "gtt.public.cdrs_eric_ggsnpdprecord_2017_10_15"

Note, the most recent LOCK TABLE I could find before 23:21:22 was actually ~10min earlier, at 2017-10-16 23:12:16.519-04.

Core was generated by `postgres: autovacuum worker process gtt '.
Program terminated with signal 11, Segmentation fault.
#0 index_close (relation=0x324647603246466, lockmode=1) at indexam.c:178
178     LockRelId relid = relation->rd_lockInfo.lockRelId;
Missing separate debuginfos, use: debuginfo-install audit-libs-2.4.5-6.el6.x86_64 cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libgcc-4.4.7-18.el6.x86_64 libicu-4.2.1-14.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libstdc++-4.4.7-18.el6.x86_64 nspr-4.13.1-1.el6.x86_64 nss-3.28.4-1.el6_9.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 nss-util-3.28.4-1.el6_9.x86_64 openldap-2.4.40-16.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 pam-1.1.1-24.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) p relation
$1 = (struct RelationData *) 0x324647603246466
(gdb) p relation->rd_lockInfo
Cannot access memory at address 0x3246476032464aa
(gdb) bt
#0 index_close (relation=0x324647603246466, lockmode=1) at indexam.c:178
#1 0x004bc3c2 in systable_endscan (sysscan=0x145b3c8) at
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Sat, Oct 14, 2017 at 08:56:56PM -0500, Justin Pryzby wrote:
> On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
> > > Also notice the vacuum process was interrupted, same as yesterday (think
> > > goodness for full logs). Our INSERT script is using python
> > > multiprocessing.pool() with "maxtasksperchild=1", which I think means we load
> > > one file and then exit the subprocess, and pool() creates a new subproc, which
> > > starts a new PG session and transaction. Which explains why autovacuum starts
> > > processing the table only to be immediately interrupted.
>
> > On Sun, Oct 15, 2017 at 01:57:14AM +0200, Tomas Vondra wrote:
> > I don't follow. Why does it explain that autovacuum gets canceled? I
> > mean, merely opening a new connection/session should not cancel
> > autovacuum. That requires a command that requires table-level lock
> > conflicting with autovacuum (so e.g. explicit LOCK command, DDL, ...).
>
> I was thinking that INSERT would do it, but I gather you're right about
> autovacuum. Let me get back to you about this..

I confirmed that we're taking an explicit lock before creating new child tables (as I recall, to avoid errors in the logs shortly after midnight when multiple subprocesses see data for the new date for the first time):

2017-10-15 12:52:50.499-04 | 59e3925e.6951 | statement: LOCK TABLE cdrs_huawei_sgsnPDPRecord IN SHARE UPDATE EXCLUSIVE MODE

Probably we can improve that with LOCK TABLE ONLY.

Justin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
> > Also notice the vacuum process was interrupted, same as yesterday (think
> > goodness for full logs). Our INSERT script is using python
> > multiprocessing.pool() with "maxtasksperchild=1", which I think means we load
> > one file and then exit the subprocess, and pool() creates a new subproc, which
> > starts a new PG session and transaction. Which explains why autovacuum starts
> > processing the table only to be immediately interrupted.

On Sun, Oct 15, 2017 at 01:57:14AM +0200, Tomas Vondra wrote:
> I don't follow. Why does it explain that autovacuum gets canceled? I
> mean, merely opening a new connection/session should not cancel
> autovacuum. That requires a command that requires table-level lock
> conflicting with autovacuum (so e.g. explicit LOCK command, DDL, ...).

I was thinking that INSERT would do it, but I gather you're right about autovacuum. Let me get back to you about this..

> > Due to a .."behavioral deficiency" in the loader for those tables, the crashed
> > backend causes the loader to get stuck, so the tables should be untouched since
> > the crash, should it be desirable to inspect them.
>
> It's a bit difficult to guess what went wrong from this backtrace. For
> me gdb typically prints a bunch of lines immediately before the frames,
> explaining what went wrong - not sure why it's missing here.

Do you mean this ?

...
Loaded symbols for /lib64/libnss_files-2.12.so
Core was generated by `postgres: autovacuum worker process gtt '.
Program terminated with signal 11, Segmentation fault.
#0 pfree (pointer=0x298c740) at mcxt.c:954
954     (*context->methods->free_p) (context, pointer);

> Perhaps some of those pointers are bogus, the memory was already pfree-d
> or something like that. You'll have to poke around and try dereferencing
> the pointers to find what works and what does not.
>
> For example what do these gdb commands do in the #0 frame?
> > (gdb) p *(MemoryContext)context

(gdb) p *(MemoryContext)context
Cannot access memory at address 0x7474617261763a20

> (gdb) p *GetMemoryChunkContext(pointer)

(gdb) p *GetMemoryChunkContext(pointer)
No symbol "GetMemoryChunkContext" in current context.

I had to do this since it's apparently inlined/a macro:

(gdb) p *(MemoryContext *) (((char *) pointer) - sizeof(void *))
$8 = (MemoryContext) 0x7474617261763a20

I uploaded the corefile: http://telsasoft.com/tmp/coredump-postgres-autovacuum-brin-summarize.gz

Justin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
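One incidental observation about the gdb session above (my inference, but easy to check): the bogus context pointer 0x7474617261763a20 is itself printable ASCII. Read in x86-64 little-endian byte order it spells " :varatt" — a fragment of exactly the kind of serialized node tree (":varattno ...") that cur_datname points into in the later backtrace, which suggests the chunk header was overwritten by node-tree text:

```python
# Decode the bogus MemoryContext pointer from the gdb session above.
# On little-endian x86-64 the low-order byte is stored first, so emit
# the bytes in little-endian order to recover how they sat in memory.
addr = 0x7474617261763A20
as_memory = addr.to_bytes(8, "little")
print(repr(as_memory.decode("ascii")))
```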
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
> I don't have any reason to believe there's memory issue on the server, So I
> suppose this is just a "heads up" to early adopters until/in case it happens
> again and I can at least provide a stack trace.

I'm back; find stacktrace below.

> Today I see:
> < 2017-10-13 17:22:47.839 -04 >LOG: server process (PID 32127) was
> terminated by signal 11: Segmentation fault
> < 2017-10-13 17:22:47.839 -04 >DETAIL: Failed process was running:
> autovacuum: BRIN summarize public.gtt 747263

Is it a coincidence the server failed within 45m of yesterday's failure ?

postmaster[26500] general protection ip:84a177 sp:7ffd9b349b88 error:0 in postgres[40+692000]
< 2017-10-14 18:05:36.432 -04 >DETAIL: Failed process was running: autovacuum: BRIN summarize public.gtt 41087

> It looks like this table was being inserted into simultaneously by a python
> program using multiprocessing. It looks like each subprocess was INSERTing
> into several tables, each of which has one BRIN index on timestamp column.

I should add: These are insert-only child tables in a hierarchy (not PG10 partitions), being inserted into directly (not via trigger/rule).

Also notice the vacuum process was interrupted, same as yesterday (thank goodness for full logs). Our INSERT script is using python multiprocessing.pool() with "maxtasksperchild=1", which I think means we load one file and then exit the subprocess, and pool() creates a new subproc, which starts a new PG session and transaction. Which explains why autovacuum starts processing the table only to be immediately interrupted.
postgres=# SELECT * FROM postgres_log_2017_10_14_1800 WHERE pid=26500 ORDER BY log_time DESC LIMIT 9; log_time | 2017-10-14 18:05:34.132-04 pid| 26500 session_id | 59e289b4.6784 session_line | 2 session_start_time | 2017-10-14 18:03:32-04 error_severity | ERROR sql_state_code | 57014 message| canceling autovacuum task context| processing work entry for relation "gtt.public.cdrs_eric_ggsnpdprecord_2017_10_14_recordopeningtime_idx" ---+- log_time | 2017-10-14 18:05:32.925-04 pid| 26500 session_id | 59e289b4.6784 session_line | 1 session_start_time | 2017-10-14 18:03:32-04 error_severity | ERROR sql_state_code | 57014 message| canceling autovacuum task context| automatic analyze of table "gtt.public.cdrs_eric_egsnpdprecord_2017_10_14" gtt=# \dt+ *record_2017_10_14 public | cdrs_eric_egsnpdprecord_2017_10_14 | table | gtt | 1642 MB | public | cdrs_eric_ggsnpdprecord_2017_10_14 | table | gtt | 492 MB | gtt=# \di+ *_2017_10_14*_recordopeningtime_idx public | cdrs_eric_egsnpdprecord_2017_10_14_recordopeningtime_idx | index | gtt | cdrs_eric_egsnpdprecord_2017_10_14 | 72 kB | public | cdrs_eric_ggsnpdprecord_2017_10_14_recordopeningtime_idx | index | gtt | cdrs_eric_ggsnpdprecord_2017_10_14 | 48 kB | Due to a .."behavioral deficiency" in the loader for those tables, the crashed backend causes the loader to get stuck, so the tables should be untouched since the crash, should it be desirable to inspect them. #0 pfree (pointer=0x298c740) at mcxt.c:954 context = 0x7474617261763a20 #1 0x006a52e9 in perform_work_item (workitem=0x7f8ad1f94824) at autovacuum.c:2676 cur_datname = 0x298c740 "no 1 :vartype 1184 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 146} {CONST :consttype 1184 :consttypmod -1 :constcollid 0 :constlen 8 :constbyval true :constisnull fal"... 
cur_nspname = 0x298c728 "s ({VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 146} {CONST :consttype 1184 :consttypmod -1 :constcollid 0 :constlen 8 :constbyv"... cur_relname = 0x298cd68 "cdrs_eric_msc_sms_2017_10_14_startofcharge_idx" __func__ = "perform_work_item" #2 0x006a6fd9 in do_autovacuum () at autovacuum.c:2533 workitem = 0x7f8ad1f94824 classRel = 0x7f89c26d0e58 tuple = relScan = dbForm = table_oids = orphan_oids = 0x0 ctl = {num_partitions = 0, ssize = 0, dsize = 0, max_dsize = 0, ffactor = 0, keysize = 4, entrysize = 80, hash = 0, match = 0, keycopy = 0, alloc = 0, hcxt = 0x0, hctl = 0x0} table_toast_map = 0x29c8188 cell = 0x0 shared = 0x298ce18 dbentry = 0x298d0a0 bstrategy = 0x2a61c18 key = {sk_flags = 0, sk_attno = 16, sk_strategy = 3, sk_subtype = 0, sk_colla
[HACKERS] SIGSEGV in BRIN autosummarize
I upgraded one of our customers to PG10 Tuesday night, and Wednesday replaced a BTREE index with a BRIN index (WITH autosummarize). Today I see:

< 2017-10-13 17:22:47.839 -04 >LOG: server process (PID 32127) was terminated by signal 11: Segmentation fault
< 2017-10-13 17:22:47.839 -04 >DETAIL: Failed process was running: autovacuum: BRIN summarize public.gtt 747263
postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[40+692000]

[pryzbyj@database ~]$ rpm -qa postgresql10
postgresql10-10.0-1PGDG.rhel6.x86_64

Oct 13 17:22:45 database kernel: postmaster[32127] general protection ip:4bd467 sp:7ffd9b349990 error:0 in postgres[40+692000]
Oct 13 17:22:47 database abrtd: Directory 'ccpp-2017-10-13-17:22:47-32127' creation detected
Oct 13 17:22:47 database abrt[32387]: Saved core dump of pid 32127 (/usr/pgsql-10/bin/postgres) to /var/spool/abrt/ccpp-2017-10-13-17:22:47-32127 (15040512 bytes)

..unfortunately:

Oct 13 17:22:47 database abrtd: Package 'postgresql10-server' isn't signed with proper key
Oct 13 17:22:47 database abrtd: 'post-create' on '/var/spool/abrt/ccpp-2017-10-13-17:22:47-32127' exited with 1
Oct 13 17:22:47 database abrtd: DELETING PROBLEM DIRECTORY '/var/spool/abrt/ccpp-2017-10-13-17:22:47-32127'

postgres=# SELECT * FROM bak_postgres_log_2017_10_13_1700 WHERE pid=32127 ORDER BY log_time DESC LIMIT 9;
-[ RECORD 1 ]------+-
log_time           | 2017-10-13 17:22:45.56-04
pid                | 32127
session_id         | 59e12e67.7d7f
session_line       | 2
command_tag        |
session_start_time | 2017-10-13 17:21:43-04
error_severity     | ERROR
sql_state_code     | 57014
message            | canceling autovacuum task
context            | processing work entry for relation "gtt.public.cdrs_eric_egsnpdprecord_2017_10_13_recordopeningtime_idx"
-[ RECORD 2 ]------+-
log_time           | 2017-10-13 17:22:44.557-04
pid                | 32127
session_id         | 59e12e67.7d7f
session_line       | 1
session_start_time | 2017-10-13 17:21:43-04
error_severity     | ERROR
sql_state_code     | 57014
message            | canceling autovacuum task
context            | automatic analyze of table
"gtt.public.cdrs_huawei_sgsnpdprecord_2017_10_13"
Time: 375.552 ms

It looks like this table was being inserted into simultaneously by a python program using multiprocessing. It looks like each subprocess was INSERTing into several tables, each of which has one BRIN index on a timestamp column.

gtt=# \dt+ cdrs_eric_egsnpdprecord_2017_10_13
public | cdrs_eric_egsnpdprecord_2017_10_13 | table | gtt | 5841 MB |

gtt=# \di+ cdrs_eric_egsnpdprecord_2017_10_13_recordopeningtime_idx
public | cdrs_eric_egsnpdprecord_2017_10_13_recordopeningtime_idx | index | gtt | cdrs_eric_egsnpdprecord_2017_10_13 | 136 kB |

I don't have any reason to believe there's a memory issue on the server, so I suppose this is just a "heads up" to early adopters until/in case it happens again and I can at least provide a stack trace.

Justin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory fields from getrusage()
On Sat, Sep 02, 2017 at 02:00:44PM +1200, Thomas Munro wrote: > On Sat, Sep 2, 2017 at 7:46 AM, Peter Eisentraut > <peter.eisentr...@2ndquadrant.com> wrote: > > On 6/15/17 10:58, Justin Pryzby wrote: > >> On Thu, Jun 15, 2017 at 10:29:21AM -0400, Robert Haas wrote: > >>> On Wed, Jun 14, 2017 at 6:28 PM, Justin Pryzby <pry...@telsasoft.com> > >>> wrote: > >>>> On Tue, Jun 13, 2017 at 12:16:00PM -0400, Robert Haas wrote: > >>>>> It might be worth adding platform-specific code for common platforms. > >>>> > >>>> All I care (which linux happily/happens to support) is maxrss; I was > >>>> probably > >>>> originally interested in this while digging into an issue with hash agg. > >>> > >>> I don't think it needs to go in a separate file. I'd just patch > >>> ShowUsage(). > > > > I have committed a patch that shows maxrss, with /1024 adjustment for > > macOS. That should cover all platforms that I could find.(*) > > Apparently ru_maxrss is in *pages* on Solaris-derived systems: > > https://illumos.org/man/3c/getrusage ..but note that that: "The ru_maxrss, ru_ixrss, ru_idrss, and ru_isrss members of the rusage structure are set to 0 in this implementation." Same here: https://docs.oracle.com/cd/E23823_01/html/816-5168/getrusage-3c.html ..and earlier solaris docs don't seem to mention getrusage at all (?) posix docs say: http://pubs.opengroup.org/onlinepubs/009695399/functions/getrusage.html |CHANGE HISTORY |First released in Issue 4, Version 2. ..which I gather is SUSv1 c. 1995 Curiously, GNU time reported maxrss too high by a factor of 4 (and that was present in early centos6.X) https://bugzilla.redhat.com/show_bug.cgi?id=702826 Justin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible issue with expanded object infrastructure on Postgres 9.6.1
On Tue, Aug 8, 2017 at 8:17 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Peter Geoghegan <p...@bowt.ie> writes: > > On Thu, Jan 19, 2017 at 5:45 PM, Peter Geoghegan <p...@heroku.com> wrote: > >> A customer is on 9.6.1, and complains of a segfault observed at least > >> 3 times. > > ... > > For the sake of the archives: this now looks very much like the issue > > that Tom just fixed with commit > > 9bf4068cc321a4d44ac54089ab651a49d89bb567. > > Yeah, particularly seeing that $customer noted that some of the > columns involved were UUIDs: > > https://www.postgresql.org/message-id/CAOxz3fqK9Y0GntL8MDoeZ > jy2ot_6lx1yvhay6bd1vykup-i...@mail.gmail.com > > Good to have gotten to the bottom of that one. Too bad it just > missed the train for 9.6.4. > > regards, tom lane > $customer, here. I just want to thank everyone involved for getting to the bottom of this and for your support. Even if it missed the 9.6.4 release, I'm very grateful for your help. We haven't had much of an issue since disabling parallel workers so nothing is harmed by waiting a little longer. Thanks, Justin
[HACKERS] PG10b2: column-list UPDATE syntax fails with single column
In testing our application with PG10beta2, I discovered that our use of UPDATE broke, apparently by this commit |commit 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd |Author: Tom Lane <t...@sss.pgh.pa.us> |Date: Tue Nov 22 15:19:57 2016 -0500 | |Improve handling of "UPDATE ... SET (column_list) = row_constructor". ERROR ON QUERY: UPDATE eric_enodeb_cell_metrics SET (pmActiveDrbDlSum)=(%s) WHERE sect_id=%s AND subnetwork_id=%s AND start_time=%s AND site_id=%s AND interval_seconds=%s AND nedn=%s AND mecontext=%s AND EUtranCellxDD=%s AND EUtranCell=%s AND subnetwork=%s AND device_id=%s -- Query Error: ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression This still works for multiple columns but in pg10b2 no longer works for single column update with "column_list" syntax. ts=# begin; UPDATE eric_enodeb_201707 SET (pmhdelayvarbest50pct,pmlicconnecteduserstimecong)=(0,0) ; BEGIN UPDATE 3 ts=# begin; UPDATE eric_enodeb_201707 SET (pmhdelayvarbest50pct)=(0) ; BEGIN ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression It may be that our use was wrong (?) or unintuitive (I'm in the middle of changing it), but wondered if it was intentional or otherwise if the release notes should mention that old syntax is no longer accepted. Either way - thanks in advance. Justin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] tupconvert.c API change in v10 release notes
FYI, I happened across this commit comment: 3f902354b08ac788600f0ae54fcbfc1d4e3ea765 | So, let's accept the removal of the guarantee about | the output tuple's rowtype marking, recognizing that this is a API change | that could conceivably break third-party callers of tupconvert.c. (So, | let's remember to mention it in the v10 release notes.) ..but couldn't see that the commit or change is so referenced. Justin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory fields from getrusage()
On Thu, Jun 15, 2017 at 10:29:21AM -0400, Robert Haas wrote:
> On Wed, Jun 14, 2017 at 6:28 PM, Justin Pryzby <pry...@telsasoft.com> wrote:
> > On Tue, Jun 13, 2017 at 12:16:00PM -0400, Robert Haas wrote:
> >> It might be worth adding platform-specific code for common platforms.
> >
> > All I care (which linux happily/happens to support) is maxrss; I was probably
> > originally interested in this while digging into an issue with hash agg.
>
> I don't think it needs to go in a separate file. I'd just patch ShowUsage().

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f99dd0a..7f57a84 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4467,6 +4467,21 @@ ShowUsage(const char *title)
 			 r.ru_nvcsw - Save_r.ru_nvcsw, r.ru_nivcsw - Save_r.ru_nivcsw,
 			 r.ru_nvcsw, r.ru_nivcsw);
+
+#if defined(__linux__)
+	appendStringInfo(&str,
+			 "!\t%ld max resident (kB)\n",
+			 r.ru_maxrss);
+#elif defined(BSD)
+	appendStringInfo(&str,
+			 "!\t%ld max resident, %ld shared, %ld unshared data, %ld unshared stack (kB)\n",
+			 r.ru_maxrss, r.ru_ixrss, r.ru_idrss, r.ru_isrss);
+#elif defined(__darwin__)
+	appendStringInfo(&str,
+			 "!\t%ld max resident, %ld shared, %ld unshared data, %ld unshared stack (kB)\n",
+			 r.ru_maxrss/1024, r.ru_ixrss/1024, r.ru_idrss/1024, r.ru_isrss/1024);
+#endif   /* __linux__ */
+
 #endif   /* HAVE_GETRUSAGE */

 	/* remove trailing newline */

Comments ? Testing or suggestions on !linux would be useful.
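As a quick way to see what a given platform actually fills in, Python's resource module wraps the same getrusage() call, so the fields the patch prints can be eyeballed without rebuilding the server (assumption: units follow the platform, kilobytes for ru_maxrss on Linux but bytes on macOS, and the shared/unshared fields are typically 0 on Linux):

```python
# Inspect the struct rusage memory fields discussed above via Python's
# thin wrapper around getrusage().  On Linux ru_maxrss is in kB and
# ru_ixrss/ru_idrss/ru_isrss are unmaintained (reported as 0).
import resource

ru = resource.getrusage(resource.RUSAGE_SELF)
print("max resident:", ru.ru_maxrss)
print("shared:", ru.ru_ixrss,
      " unshared data:", ru.ru_idrss,
      " unshared stack:", ru.ru_isrss)
```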
Justin

Add memory fields (sometimes) available from getrusage()

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory fields from getrusage()
On Tue, Jun 13, 2017 at 12:16:00PM -0400, Robert Haas wrote:
> It might be worth adding platform-specific code for common platforms.

All I care about (which linux happily/happens to support) is maxrss; I was probably originally interested in this while digging into an issue with hash agg. I think it's fine to show zeros for unsupported fields; that's what getrusage(2) and time(1) do after all.

pryzbyj@pryzbyj:~$ sh -c 'command time -v find /dev' 2>&1 >/dev/null |grep -Fw 0
User time (seconds): 0.00
System time (seconds): 0.00
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0

> it would be a good idea to install code specific to Linux that
> displays all and only those values that are meaningful on Linux, and
> (less importantly) similarly for macOS. Linux is such a common
> platform that reporting bogus zero values and omitting other fields
> that are actually meaningful does not seem like a very good plan.

That has the issue that it varies not just by OS but also by OS version. For example PG already shows context switches and FS in/out puts, but they're nonzero only since linux 2.6 (yes, 2.4 is ancient and unsupported but still).

ru_nvcsw (since Linux 2.6)
ru_inblock (since Linux 2.6.22)

..and other fields are "currently unused", but maybe supported in the past or future(?)

ru_ixrss (unmaintained) This field is currently unused on Linux.

Are you thinking of something like this, maybe hidden away in a separate file somewhere?
#if defined(__linux__) || defined(BSD)
	appendStringInfo(&str,
			 "!\t%ld max resident, %ld shared, %ld unshared data, %ld unshared stack (kB)\n",
			 r.ru_maxrss, r.ru_ixrss, r.ru_idrss, r.ru_isrss);
#elif defined(__darwin__)
	appendStringInfo(&str,
			 "!\t%ld max resident, %ld shared, %ld unshared data, %ld unshared stack (kB)\n",
			 r.ru_maxrss/1024, r.ru_ixrss/1024, r.ru_idrss/1024, r.ru_isrss/1024);
#endif   /* __linux__ */

Justin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] memory fields from getrusage()
I'm interested in exposing the output of the remaining (memory) fields from getrusage().

postgres=# SET log_parser_stats='on';
postgres=# SELECT c.oid::regclass, usagecount FROM pg_buffercache b JOIN pg_class c USING (relfilenode) WHERE usagecount=1 ;
LOG: PARSER STATISTICS
DETAIL: ! system usage stats:
! 0.000197 elapsed 0.000119 user 0.79 system sec
! [0.011572 user 0.007714 sys total]
! 0/0 [0/264] filesystem blocks in/out
! 0/39 [0/989] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [2/2] voluntary/involuntary context switches
! 3920 MAX RESIDENT, 0 SHARED, 0 UNSHARED DATA, 0 UNSHARED STACK (Kb)

Before that can work for some platforms, it looks like rusagestub.h needs to have any desired fields added, and ./src/port/getrusage.c should memset(0) in the non-windows case before adding any reliance on the rest of the structure.

This comment from ~1996 says: https://doxygen.postgresql.org/postgres_8c_source.html

4421 * the only stats we don't show here are for memory usage -- i can't
4422 * figure out how to interpret the relevant fields in the rusage struct,
4423 * and they change names across o/s platforms, anyway. if you can figure
4424 * out what the entries mean, you can somehow extract resident set size,
4425 * shared text size, and unshared data and stack sizes.

..is that really (still) the case for supported platforms? I'm hoping that in 2017 one can just call getrusage() if autoconf says it's okay ??

Justin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Possible issue with expanded object infrastructure on Postgres 9.6.1
> > It would help to know the data types of the columns involved in this > query; but just eyeballing it, it doesn't look like it involves any > array operations, so it's pretty hard to believe that the expanded-object > code could have gotten invoked intentionally. (The mere presence of > an array column somewhere in the vicinity would not do that; you'd > need to invoke an array-ish operation, or at least pass the array into > a plpgsql function.) > If I had to bet on the basis of this much info, I would bet that the > parallel-query infrastructure is dropping the ball somewhere and > transmitting a corrupted datum that accidentally looks like it is > an expanded-object reference. > If $customer wants a quick fix, I'd suggest seeing whether disabling > parallel query makes the problem go away. That might be a good first > step anyway, just to narrow down where the problem lies. > regards, tom lane Hi Tom, My name is Justin, and I am $customer as it were. As Peter explained, we haven't seen the segfaults anymore since disabling parallel queries. This works as a quick fix and is much appreciated! If you would still like to get to the bottom of this, I am willing to help out with more information as needed. My knowledge of PG internals is extremely limited so I don't know how much help I can be, but we'd like to see this resolved beyond the quick fix, or at least understand why it happened. album_photo_assignments.id, album_photo_assignments.album_id, album_photo_assignments.photo_id and albums.id are all UUID columns. albums.deleted_at is a timestamp. Thanks so much for your time, Justin Workman
[HACKERS] smallint out of range EXECUTEing prepared statement
Is this expected behavior ? ts=# SELECT * FROM t WHERE site_id=32768 LIMIT 1; (0 rows) ts=# PREPARE x AS SELECT * FROM t WHERE site_id=$1 LIMIT 1; PREPARE ts=# EXECUTE x(32768); ERROR: smallint out of range ts=# PREPARE y AS SELECT * FROM t WHERE site_id::int=$1 LIMIT 1; PREPARE ts=# EXECUTE y(32768); (0 rows) Note, we also sometimes get small/int out of range when SELECTing from a view, and we end up as a workaround putting a ::big/int cast into the view or multiplying by 1. Thanks, Justin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
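My guess at what is happening here (hedged — the parameter-typing detail is inferred from the error, not from the source): in the PREPARE, $1 is resolved to the column's type, smallint, so EXECUTE x(32768) must first coerce 32768 to int16, which overflows before the comparison ever runs; the unprepared literal instead promotes the comparison to integer, which is also why the site_id::int workaround avoids the error. The int16 boundary itself is easy to demonstrate outside SQL:

```python
# The smallint (int16) range boundary behind the error above:
# 32767 is the largest value a signed 16-bit integer can hold, so
# packing 32768 fails, just as casting it to smallint does.
import struct

assert struct.pack(">h", 32767) == b"\x7f\xff"   # fits
try:
    struct.pack(">h", 32768)                     # out of range
    overflowed = False
except struct.error:
    overflowed = True
print("32768 overflows int16:", overflowed)
```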
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 03:35:34PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > On Tue, Jan 03, 2017 at 03:18:15PM -0500, Tom Lane wrote:
> >> I'm wondering if this represents some sort of out-of-sync condition
> >> between the table and its child tables. We can't actually tell from
> >> this trace which table is being processed. Could you try, from this
> >> breakpoint,
> >>
> >> f 3
> >> p oldrel->rd_rel->relname
>
> > (gdb) p oldrel->rd_rel->relname
> > $1 = {data = "eric_umts_rnc_utrancell_201701", '\000' }
>
> Right. So I bet that if you check the attnum of pmsumpacketlatency_000 in
> eric_umts_rnc_utrancell_metrics, you'll find it's different from that in
> eric_umts_rnc_utrancell_201701, and that the attribute having that attnum
> in eric_umts_rnc_utrancell_201701 has type smallint not int.

I think that's consistent with your understanding:

ts=# SELECT attrelid::regclass, attname, attnum, atttypid FROM pg_attribute
     WHERE attrelid::regclass::text~'eric_umts_rnc_utrancell_(metrics|201701)$'
     AND (attname='pmsumpacketlatency_000' OR attnum IN (367,424) )
     ORDER BY 1,2;
eric_umts_rnc_utrancell_metrics | pmsamplespshsadchrabestablish | 367 | 21
eric_umts_rnc_utrancell_metrics | pmsumpacketlatency_000        | 424 | 23
eric_umts_rnc_utrancell_201701  | pmsumpacketlatency_000        | 367 | 23
eric_umts_rnc_utrancell_201701  | pmulupswitchsuccessmedium     | 424 | 21

Justin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 03:18:15PM -0500, Tom Lane wrote: > Justin Pryzby <pry...@telsasoft.com> writes: > > (gdb) bt > > #3 0x0059d5ce in ATRewriteTable (tab=, > > OIDNewHeap=, lockmode=) at > > tablecmds.c:4152 > > I'm wondering if this represents some sort of out-of-sync condition > between the table and its child tables. We can't actually tell from > this trace which table is being processed. Could you try, from this > breakpoint, > > f 3 > p oldrel->rd_rel->relname (gdb) p oldrel->rd_rel->relname $1 = {data = "eric_umts_rnc_utrancell_201701", '\000' } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 02:50:21PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:
> >> 2. Even better would be a stack trace for the call to errfinish,
> >> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> Thanks, but we need the whole call stack, or at least the first dozen or
> so levels.  "bt" in gdb would do.

#0  errfinish (dummy=0) at elog.c:414
#1  0x006dd39f in exec_simple_query (query_string=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;") at postgres.c:932
#2  0x006dec8c in PostgresMain (argc=, argv=, dbname=0x1f65d98 "ts", username=) at postgres.c:4070
#3  0x0067f2c5 in BackendRun (argc=, argv=) at postmaster.c:4270
#4  BackendStartup (argc=, argv=) at postmaster.c:3944
#5  ServerLoop (argc=, argv=) at postmaster.c:1701
#6  PostmasterMain (argc=, argv=) at postmaster.c:1309
#7  0x00607658 in main (argc=3, argv=0x1f3a4f0) at main.c:228

(gdb) bt
#0  errfinish (dummy=0) at elog.c:414
#1  0x005d0e30 in ExecEvalScalarVar (exprstate=, econtext=, isNull=, isDone=) at execQual.c:655
#2  0x005d0c3c in ExecMakeFunctionResultNoSets (fcache=0x21f18a0, econtext=0x2199e80, isNull=0x21e90ee "", isDone=) at execQual.c:2015
#3  0x0059d5ce in ATRewriteTable (tab=, OIDNewHeap=, lockmode=) at tablecmds.c:4152
#4  0x005a92fc in ATRewriteTables (parsetree=0x1f63b20, rel=, cmds=, recurse=, lockmode=) at tablecmds.c:3858
#5  ATController (parsetree=0x1f63b20, rel=, cmds=, recurse=, lockmode=) at tablecmds.c:3104
#6  0x006e25e6 in ProcessUtilitySlow (parsetree=0x1fc6f78, queryString=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=, completionTag=0x7fff8b9d3a90 "") at utility.c:1085
#7  0x006e2a70 in standard_ProcessUtility (parsetree=0x1fc6f78, queryString=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at utility.c:907
#8  0x006df2cc in PortalRunUtility (portal=0x1fff2e0, utilityStmt=0x1fc6f78, isTopLevel=1 '\001', setHoldSnapshot=, dest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at pquery.c:1193
#9  0x006e01cb in PortalRunMulti (portal=0x1fff2e0, isTopLevel=1 '\001', setHoldSnapshot=0 '\000', dest=0x1fc72b8, altdest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at pquery.c:1349
#10 0x006e0934 in PortalRun (portal=0x1fff2e0, count=9223372036854775807, isTopLevel=1 '\001', dest=0x1fc72b8, altdest=0x1fc72b8, completionTag=0x7fff8b9d3a90 "") at pquery.c:815
#11 0x006dd5b1 in exec_simple_query (query_string=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;") at postgres.c:1094
#12 0x006dec8c in PostgresMain (argc=, argv=, dbname=0x1f65d98 "ts", username=) at postgres.c:4070
#13 0x0067f2c5 in BackendRun (argc=, argv=) at postmaster.c:4270
#14 BackendStartup (argc=, argv=) at postmaster.c:3944
#15 ServerLoop (argc=, argv=) at postmaster.c:1701
#16 PostmasterMain (argc=, argv=) at postmaster.c:1309
#17 0x00607658 in main (argc=3, argv=0x1f3a4f0) at main.c:228

> > I'll send the rest of \d if you really want but:
>
> Well, we don't know what we're looking for, so assuming that there's
> nothing of interest there is probably bad.

Attached

Justin

alter-wrong-type-dplus.gz
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:
> 3. It's pretty hard to see how you'd reach any of these places for an
> ALTER COLUMN TYPE on a simple table.  Has the table got rules, triggers,
> default values?  Could we see "\d+" output for it?

I really meant to do \d+..

              Table "public.eric_umts_rnc_utrancell_metrics"
       Column       |           Type           | Modifiers | Storage  | Stats target | Description
--------------------+--------------------------+-----------+----------+--------------+-------------
 sect_id            | integer                  | not null  | plain    | 400          |
 start_time         | timestamp with time zone | not null  | plain    | 400          |
 site_id            | integer                  | not null  | plain    | 400          |
 interval_seconds   | smallint                 | not null  | plain    | 200          |
 utrancell          | text                     | not null  | extended | 200          |
 nedn               | text                     | not null  | extended | 200          |
 rnc_id             | integer                  | not null  | plain    | 400          |
 device_id          | integer                  | not null  | plain    | 200          |
 pmcelldowntimeauto | smallint                 |           | plain    | 10           |
 pmcelldowntimeman  | smallint                 |           | plain    | 10           |
[...]

Justin
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 02:32:36PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > I can cause the error at will on the existing table,
>
> That's good news, at least.
>
> 1. Please trigger it with "\set VERBOSITY verbose" enabled, so we can see
> the exact source location --- there are a couple of instances of that
> text.

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR:  42804: attribute 424 has wrong type
DETAIL:  Table has type smallint, but query expects integer.
LOCATION:  ExecEvalScalarVar, execQual.c:660

> 2. Even better would be a stack trace for the call to errfinish,
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

#1  0x006dd39f in exec_simple_query (query_string=0x1fc5fb0 "begin;") at postgres.c:932
        dest = DestRemote
        oldcontext = 0x1f3b100
        parsetree_list = 0x1fc69f0
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str = "\360:\235\213\377\177\000\000`<\235\213\377\177\000\000\260_\374\001", '\000'
        __func__ = "exec_simple_query"

and then

#1  0x006dd39f in exec_simple_query (query_string=0x1fc5fb0 "ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;") at postgres.c:932
        dest = DestRemote
        oldcontext = 0x1f3b100
        parsetree_list = 0x1fc6fc8
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str = "\360:\235\213\377\177\000\000`<\235\213\377\177\000\000\260_\374\001", '\000'
        __func__ = "exec_simple_query"

then

#1  0x005d0e30 in ExecEvalScalarVar (exprstate=, econtext=, isNull=, isDone=) at execQual.c:655
        attnum = 424
        __func__ = "ExecEvalScalarVar"

> 3. It's pretty hard to see how you'd reach any of these places for an
> ALTER COLUMN TYPE on a simple table.  Has the table got rules, triggers,
> default values?  Could we see "\d+" output for it?

Triggers and defaults, yes.

 sect_id                  | integer                  | not null
 start_time               | timestamp with time zone | not null
 site_id                  | integer                  | not null
 interval_seconds         | smallint                 | not null
 utrancell                | text                     | not null
 nedn                     | text                     | not null
 rnc_id                   | integer                  | not null
 device_id                | integer                  | not null
 pmcelldowntimeauto       | smallint                 |
 pmcelldowntimeman        | smallint                 |
 pmchswitchattemptfachura | smallint                 |
 pmchswitchattempturafach | smallint                 |
...

Triggers:
    eric_umts_rnc_utrancell_insert_trigger BEFORE INSERT ON eric_umts_rnc_utrancell_metrics FOR EACH ROW EXECUTE PROCEDURE eric_umts_rnc_utrancell_insert_function()
Number of child tables: 3 (Use \d+ to list them.)

I'll send the rest of \d if you really want but:

ts=# SELECT COUNT(1) FROM pg_attribute WHERE attrelid='eric_umts_rnc_utrancell_metrics'::regclass;
count | 1116

Justin
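Given the 1116-attribute parent above and the per-child ALTER history described elsewhere in this thread, one way to narrow down which attributes disagree would be to join pg_attribute across pg_inherits. This is a hypothetical diagnostic sketch, not something actually run in the thread; only the table name is taken from the discussion:

```sql
-- Hypothetical diagnostic: list columns whose declared type differs
-- between the parent table and any still-inheriting child.
SELECT p.attnum,
       p.attname,
       p.atttypid::regtype  AS parent_type,
       i.inhrelid::regclass AS child,
       c.atttypid::regtype  AS child_type
FROM pg_attribute p
JOIN pg_inherits i ON i.inhparent = p.attrelid
JOIN pg_attribute c ON c.attrelid = i.inhrelid AND c.attname = p.attname
WHERE p.attrelid = 'eric_umts_rnc_utrancell_metrics'::regclass
  AND p.attnum > 0
  AND NOT p.attisdropped
  AND c.atttypid <> p.atttypid;
```

A nonempty result would show a parent/child type disagreement of the kind the error message suggests; an empty result would point elsewhere (e.g. stale plans or catalog corruption).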
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 01:40:50PM -0500, Robert Haas wrote:
> On Tue, Jan 3, 2017 at 11:59 AM, Justin Pryzby <pry...@telsasoft.com> wrote:
> > On Tue, Jan 03, 2017 at 11:45:33AM -0500, Robert Haas wrote:
> >> > ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
> >> > BEGIN
> >> > DROP VIEW
> >> > ERROR:  attribute 424 has wrong type
> >> > DETAIL:  Table has type smallint, but query expects integer.
> >> > ts=#
> >
> > I can cause the error at will on the existing table, but I wouldn't know
> > how to reproduce the problem on a new table/database.  I'm guessing it has
> > something
>
> Just for kicks, could you try running pg_catcheck on the affected system?
>
> https://github.com/EnterpriseDB/pg_catcheck

Neat, I hadn't heard of it before ;)

The version in PGDG has the "amkeytype" issue, so I compiled.  I got this:

[pryzbyj@database pg_catcheck]$ ./pg_catcheck ts
notice: pg_shdepend row has invalid classid "2613": not a system catalog OID
row identity: dbid="16402" classid="2613" objid="1086583699" objsubid="0" refclassid="1260" refobjid="16384" deptype="o"
notice: pg_shdepend row has invalid classid "2613": not a system catalog OID
row identity: dbid="16402" classid="2613" objid="1086583701" objsubid="0" refclassid="1260" refobjid="16384" deptype="o"
[...]
notice: pg_depend row has invalid objid "1124153791": no matching entry in pg_class
row identity: classid="1259" objid="1124153791" objsubid="0" refclassid="1259" refobjid="1064197368" refobjsubid="1" deptype="a"
progress: done (294 inconsistencies, 0 warnings, 0 errors)

..those are the only two problem oids:

[pryzbyj@database pg_catcheck]$ time ./pg_catcheck ts 2>&1 |grep -Evw '2613|1259'
progress: done (264 inconsistencies, 0 warnings, 0 errors)
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Tue, Jan 03, 2017 at 11:45:33AM -0500, Robert Haas wrote:
> > ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
> > BEGIN
> > DROP VIEW
> > ERROR:  attribute 424 has wrong type
> > DETAIL:  Table has type smallint, but query expects integer.
> > ts=#
> >
> > ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT;
> > BEGIN
> > DROP VIEW
> > ALTER TABLE
> > ts=#
> >
> > Is it useful to send something from pg_attribute, or other clues ??
>
> So, are these errors reproducible?  Like, if you create a brand new

I can cause the error at will on the existing table, but I wouldn't know how
to reproduce the problem on a new table/database.  I'm guessing it has
something to do with dropped columns or historic alters (which I mentioned are
typically done separately on child tables vs their parent).

Since it's happened 3 times now on this table, but not others on this
database, I would guess it's a "data issue", possibly related to pg_upgrades.
IOW it may be impossible to get into this state from a fresh initdb on a
current version.

I considered that perhaps it only affected our oldest tables, and would stop
happening once they were dropped, but note this ALTER is only of a parent and
its 3 most recent children.  So only the empty parent could be described as
"old".

Justin
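For reference, the maintenance pattern described in this thread (historic partitions detached and altered individually, separately from the parent) can be sketched roughly as follows. These statements are illustrative only; partition and column names are borrowed from the examples quoted earlier:

```sql
-- Illustrative sketch of the partition-maintenance pattern described:
-- detach a historic partition, then ALTER it on its own, outside any
-- transaction, to avoid the disk usage of rewriting parent+children at once.
ALTER TABLE eric_umts_rnc_utrancell_201508
    NO INHERIT eric_umts_rnc_utrancell_metrics;

ALTER TABLE eric_umts_rnc_utrancell_201508
    ALTER COLUMN pmsumpacketlatency_000 TYPE bigint;

-- The parent and its few most recent (still-inheriting) children are then
-- altered together in one recursive pass:
ALTER TABLE eric_umts_rnc_utrancell_metrics
    ALTER COLUMN pmsumpacketlatency_000 TYPE bigint
    USING pmsumpacketlatency_000::bigint;
```

The relevant detail for the bug report is that, under this pattern, parent and children accumulate different dropped-column and alteration histories, so their attnum layouts can drift apart.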
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
On Wed, Oct 12, 2016 at 10:25:05AM -0500, Justin Pryzby wrote:
> > I don't have a clear recollection how I solved this in July; possibly by
> > restoring the (historic, partition) table from backup.
> >
> > Last week and again just now (both under 9.6), a colleague found that he
> > was able to avoid the error by ALTER TYPE without USING.
> >
> > Note that we ALTER TABLE .. NO INHERIT the partitions for all but the most
> > recent 2 months before ALTERing them (or the parent).  The "ALTER NO INHERIT"
> > and the ALTER TYPE of historic partitions are done outside of a transaction
> > in order to avoid large additional disk use otherwise incurred when ALTERing
> > a parent with many or large children (the sum of the size of the children).

Here are DETAILs for a 2nd such error which has shown up today:

(EricssonUtranXmlParser): Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR:  attribute 424 has wrong type
DETAIL:  Table has type smallint, but query expects integer.
(EricssonUtranXmlParser): Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR:  attribute 361 has wrong type
DETAIL:  Table has type integer, but query expects smallint.

Also, note both alters really do work without "USING":

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT USING PMSUMPACKETLATENCY_000::BIGINT;
BEGIN
DROP VIEW
ERROR:  attribute 424 has wrong type
DETAIL:  Table has type smallint, but query expects integer.
ts=#

ts=# begin; drop view umts_eric_ch_switch_view, eric_umts_rnc_utrancell_view, umts_eric_cell_integrity_view; ALTER TABLE eric_umts_rnc_utrancell_metrics ALTER COLUMN PMSUMPACKETLATENCY_000 TYPE BIGINT;
BEGIN
DROP VIEW
ALTER TABLE
ts=#

Is it useful to send something from pg_attribute, or other clues ??
Re: [HACKERS] ALTER TABLE .. ALTER COLUMN .. ERROR: attribute .. has wrong type
I originally sent this to pgsql-general some months ago, but it appears it was
never delivered (perhaps I wasn't properly subscribed?).

Failed to alter table eric_umts_rnc_utrancell_metrics: ERROR:  attribute 361 has wrong type
DETAIL:  Table has type integer, but query expects smallint.

We've seen this at least 4 times now, on PG 9.5 and 9.6; 3 of those times are
for the above table.  Any ideas what I can do to either reproduce it or
otherwise avoid it?

On Wed, Oct 12, 2016 at 10:25:05AM -0500, Justin Pryzby wrote:
> We've seen this happen at least once on a 9.5 server, and twice on (the same)
> server since its upgrade last week to 9.6:
>
> ALTER TABLE t ALTER column TYPE says: "ERROR: attribute 81 has wrong type".
>
> Just now under 9.6
> DETAIL:  Table has type integer, but query expects smallint
> ...
> ts=# SELECT attnum, atttypid, attrelid::regclass FROM pg_attribute WHERE attname='pmnopagingattemptutranrejected' ORDER BY 1 DESC,2,3;
>  attnum | atttypid |            attrelid
> --------+----------+---------------------------------
>     193 |       21 | eric_umts_rnc_utrancell_metrics
>     193 |       21 | eric_umts_rnc_utrancell_201508
>     179 |       21 | eric_umts_rnc_utrancell_201509
>     179 |       21 | eric_umts_rnc_utrancell_201510
>     179 |       21 | eric_umts_rnc_utrancell_201511
>     179 |       21 | eric_umts_rnc_utrancell_201602
> [...]
>     179 |       21 | eric_umts_rnc_utrancell_201610
>     179 |       21 | eric_umts_rnc_utrancell_201611
> (17 rows)
>
> Last week (same server, same table, still 9.6):
> DETAIL:  Table has type real, but query expects smallint
>
> In July (different server) under 9.5
> DETAIL:  Table has type real, but query expects smallint
> ...
> SELECT atttypid, attnum, attrelid::regclass FROM pg_attribute WHERE attname='c_84150886'
>  atttypid | attnum |          attrelid
> ----------+--------+-----------------------------
>        21 |    200 | huawei_msc_trunkgrp_201605
>        21 |    200 | huawei_msc_trunkgrp_201604
>        21 |    200 | huawei_msc_trunkgrp_201603
>        21 |    200 | huawei_msc_trunkgrp_201602
>        21 |    200 | huawei_msc_trunkgrp_201512
>        21 |    200 | huawei_msc_trunkgrp_201511
>        21 |    200 | huawei_msc_trunkgrp_201510
>        21 |    200 | huawei_msc_trunkgrp_201508
>        21 |    200 | huawei_msc_trunkgrp_201507
>        21 |    200 | huawei_msc_trunkgrp_201506
>        21 |    200 | huawei_msc_trunkgrp_201505
>        21 |    200 | huawei_msc_trunkgrp_201607
>        21 |    200 | huawei_msc_trunkgrp_201606
>        21 |    200 | huawei_msc_trunkgrp_201608
>        21 |    201 | huawei_msc_trunkgrp_metrics
>        21 |    200 | huawei_msc_trunkgrp_201509
>        21 |    200 | huawei_msc_trunkgrp_201601
> (17 rows)
>
> I don't have a clear recollection how I solved this in July; possibly by
> restoring the (historic, partition) table from backup.
>
> Last week and again just now (both under 9.6), a colleague found that he was
> able to avoid the error by ALTER TYPE without USING.
>
> Note that we ALTER TABLE .. NO INHERIT the partitions for all but the most
> recent 2 months before ALTERing them (or the parent).  The "ALTER NO INHERIT"
> and the ALTER TYPE of historic partitions are done outside of a transaction in
> order to avoid large additional disk use otherwise incurred when ALTERing a
> parent with many or large children (the sum of the size of the children).
Re: [HACKERS] Odd behavior with domains
I was the one that reported that on twitter.  I have a more detailed message
on the general list that I sent before subscribing, which probably needs to be
moderated (or if it went to /dev/null let me know).

On Thu, Jun 23, 2016 at 11:01 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> "Joshua D. Drake" <j...@commandprompt.com> writes:
> > So this came across my twitter feed:
> > https://pbs.twimg.com/media/ClqIJtmXEAA5IGt.png
>
> public.text can exist in parallel with pg_catalog.text.

It just doesn't seem right to me to be able to do:

CREATE DOMAIN int AS varchar(50);

Justin
Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0
On 13 May 2016, at 21:42, Josh berkus <j...@agliodbs.com> wrote: > On 05/13/2016 01:04 PM, Joshua D. Drake wrote: >> On 05/13/2016 12:03 PM, Josh berkus wrote: >>> On 05/13/2016 11:48 AM, Robert Haas wrote: >>>> On Fri, May 13, 2016 at 12:12 PM, Joshua D. Drake >>>> <j...@commandprompt.com> wrote: >> >>> Anyway, all of this is a moot point, because nobody has the power to >>> tell the various companies what to do. We're just lucky that everyone >>> is still committed to writing stuff which adds to PostgreSQL. >> >> Lucky? No. We earned it. We earned it through years and years of hard >> work. Should we be thankful? Absolutely. Should we be grateful that we >> have such a powerful and engaged commercial contribution base? 100%. > > Lucky. Sure there was work and personal integrity involved, but like > any success story, there was luck. > > But we've also been fortunate in not spawning hostile-but-popular forks > by people who left the project, and that none of the companies who > created hostile forks were very successful with them, and that nobody > has seriously tried using lawyers to control/ruin the project. > > And, most importantly, we've been lucky that a lot of competing projects > have self-immolated instead of being successful and brain-draining our > contributors (MySQL, ANTS, MonetDB, etc.) Oracle buying MySQL (via Sun) seems to have helped things along pretty well too. :) + Justin -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0
On 12 Apr 2016, at 17:23, Merlin Moncure <mmonc...@gmail.com> wrote: > On Mon, Apr 11, 2016 at 11:39 AM, Justin Clift <jus...@postgresql.org> wrote: >> Moving over a conversation from the pgsql-advocacy mailing list. In it >> Simon (CC'd) raised the issue of potentially creating a >> backwards-compatibility >> breaking release at some point in the future, to deal with things that >> might have no other solution (my wording). >> >> Relevant part of that thread there for reference: >> >> >> http://www.postgresql.org/message-id/CANP8+jLtk1NtaJyXc=hAqX=0k+ku4zfavgvbkfs+_sor9he...@mail.gmail.com >> >> Simon included a short starter list of potentials which might be in >> that category: >> >> * SQL compliant identifiers >> * Remove RULEs >> * Change recovery.conf >> * Change block headers >> * Retire template0, template1 >> * Optimise FSM >> * Add heap metapage >> * Alter tuple headers >> et al >> >> This still is better placed on -hackers though, so lets have the >> conversation here to figure out if a "backwards compatibility breaking" >> release really is needed or not. > > A couple of points here: > *) I don't think having a version number that starts with 10 instead > of 9 magically fixes backwards compatibility problems and I think > that's a dangerous precedent to set unless we're willing to fork > development and support version 9 indefinitely including major release > versions. > > *) Compatibility issues at the SQL level have to be taken much more > seriously than other things (like internal layouts or .conf issues). > > *) We need to do an honest cost benefit analysis before breaking > things. Code refactors placed on your users puts an enormous cost > that is often underestimated. I have some fairly specific examples of > the costs related to the text cast removal for example. It's not a > pretty picture. Yeah. Moving the discussion here was more to determine which items really would need a backwards compatible break. eg no other approach can be found. 
Seems I started it off badly, as no-one's yet jumped in to discuss the
initial points. :(

+ Justin

--
"My grandfather once told me that there are two kinds of people: those who
work and those who take the credit. He told me to try to be in the first
group; there was less competition there."
- Indira Gandhi
Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0
On 12 Apr 2016, at 14:12, Yury Zhuravlev <u.zhurav...@postgrespro.ru> wrote: > Justin Clift wrote: >> Simon included a short starter list of potentials which might be in >> that category: >> >> * SQL compliant identifiers >> * Remove RULEs >> * Change recovery.conf >> * Change block headers >> * Retire template0, template1 >> * Optimise FSM >> * Add heap metapage >> * Alter tuple headers >> et al > > + CMake build I think. > > Now I can build: > * postgres > * bin/* programs > * pl/* languages > * contrib/* (with cmake PGXS analogue) > > Can run regression and isolation tests for postgres/pl* and all contrib > modules. > There is still a lot of work but I hope everything will turn out. Also it > would be good to get help. > > Thanks. > > PS https://github.com/stalkerg/postgres_cmake If/when PostgreSQL can be built and tested with CMake... why would the resulting code + database files + network protocol (etc) not be compatible with previous versions? :) + Justin -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0
Moving over a conversation from the pgsql-advocacy mailing list. In it Simon (CC'd) raised the issue of potentially creating a backwards-compatibility breaking release at some point in the future, to deal with things that might have no other solution (my wording). Relevant part of that thread there for reference: http://www.postgresql.org/message-id/CANP8+jLtk1NtaJyXc=hAqX=0k+ku4zfavgvbkfs+_sor9he...@mail.gmail.com Simon included a short starter list of potentials which might be in that category: * SQL compliant identifiers * Remove RULEs * Change recovery.conf * Change block headers * Retire template0, template1 * Optimise FSM * Add heap metapage * Alter tuple headers et al This still is better placed on -hackers though, so lets have the conversation here to figure out if a "backwards compatibility breaking" release really is needed or not. Hopefully we can get it all done without giving users a reason to consider switching. ;) Regards and best wishes, Justin Clift -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg9.4 relpages of child tables
On Wed, Mar 18, 2015 at 12:11:22PM -0400, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > I believe there's been a behavior change, and not sure if it's deliberate.
> > I don't think there's a negative consequence for our production use, but
> > it confused me while summing relpages for analysis purposes, as our 9.4
> > customers behaved differently.
>
> I suspect that you're getting confused because autovacuum kicked in on the

It seems you're right.  I was additionally confused because the autovacuum had
processed the most recent tables (triggered by insertion; partitioned by
date), but had not processed some of last month's tables (which I was querying
for relpages, since it's a complete month), since this DB was upgraded to
pg9.4 last month.

Thanks,
Justin
[HACKERS] pg9.4 relpages of child tables
I believe there's been a behavior change, and I'm not sure if it's deliberate.
I don't think there's a negative consequence for our production use, but it
confused me while summing relpages for analysis purposes, as our 9.4 customers
behaved differently.

Documentation indicates that in pg9.0, ANALYZE of a parent table included
statistics of its children.

Under both pg9.3 and 9.4, this returns no stats rows after the parent table is
analyzed, and returns stats if the child is analyzed:

SELECT * FROM pg_statistic WHERE starelid='.._2014_01'::regclass

However, in pg9.4, the child's pg_class.relpages is 0 (apparently, 1 for
indices) even after the parent is analyzed (and is an approximate number of
pages if the child is analyzed).

On pg93:
pryzbyj=# create table t (i int);
pryzbyj=# create table t2 (like t) inherits(t);
pryzbyj=# insert into t2(SELECT generate_series(1,10) ORDER BY RANDOM());
pryzbyj=# select relpages from pg_class where relname='t2';
= 0
pryzbyj=# analyze t;
pryzbyj=# select relpages from pg_class where relname='t2';
= 885

On pg94:
ts=# create table t (i int);
ts=# create table t2 (like t) inherits(t);
ts=# insert into t2(SELECT generate_series(1,10) ORDER BY RANDOM());
ts=# select relpages from pg_class where relname='t2';
= 0
ts=# analyze t;
ts=# select relpages from pg_class where relname='t2'; -- this changed
= 0
ts=# analyze t2;
ts=# select relpages from pg_class where relname='t2';
= 443

Is that a deliberate change, and if so, is there any documentation of it?  I'd
prefer to avoid analyzing all our child tables, as all queries hit the
parents, which include statistics on the children.

Thanks,
Justin
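If the 9.4 behavior is intentional, a workaround sketch would be to ANALYZE each child explicitly, driven off pg_inherits. This is hypothetical (nothing in the thread runs it), and DO blocks with format() require 9.1+:

```sql
-- Sketch: ANALYZE every child of parent table "t" individually, since
-- analyzing the parent alone no longer populates the children's relpages.
DO $$
DECLARE
    child regclass;
BEGIN
    FOR child IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 't'::regclass
    LOOP
        EXECUTE format('ANALYZE %s', child);
    END LOOP;
END
$$;
```

That is exactly the per-child work the question hopes to avoid, but it restores accurate relpages sums until the behavior change is explained or documented.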
[HACKERS] ruby connect
I'm trying to connect ruby to postgres on ubuntu, and the only link I found
that has the library is down.  Does anyone have the postgres library for ruby?
Or can you direct me to it?

Justin Darby Carrera
Database Programmer/Analyst
MED-Department of Medical Social Sciences
j-carr...@northwestern.edu
312-695-8994
Re: [HACKERS] windows doesn't notice backend death
Tom Lane wrote:
> Have M$ really been stupid enough to make an external kill look just like
> an exit() call?
>
> regards, tom lane

Kind of :-(  Everything I have read seems to indicate the Task Manager calls
TerminateProcess() in kernel32.dll and passes a 1, setting the exitcode to 1.
I have not found anything clearly stating that, yet logic points to that
being what it's doing.
http://msdn.microsoft.com/en-us/library/ms686714(VS.85).aspx

Would it not be easy to set the normal exit code to something other than 1 to
see the difference?  ExitProcess():
http://msdn.microsoft.com/en-us/library/ms682658(VS.85).aspx

This article may be helpful:
http://msdn.microsoft.com/en-us/library/ms686722(VS.85).aspx

BTW, tested on Windows 2003 SP1; this problem shows up in PostgreSQL 8.3.3.
Re: [HACKERS] windows doesn't notice backend death
Andrew Dunstan wrote:
> justin wrote:
> > Would it not be easy to set the normal exit code to something other than
> > 1 to see the difference?  ExitProcess()
>
> Not really, as Tom showed later this is an example of a more general
> problem.  I think his solution of detecting when backends have cleaned up
> nicely and when they have not is the right way to go.
>
> cheers
> andrew

Stupid thought: why can some clueless code set the exit status to a crashed
status at all?  Would it not be more prudent to remove that ability?
Re: [HACKERS] [GENERAL] string_to_array with empty input
Steve Crawford wrote:
> Did I miss the exciting conclusion or did this drift silently off radar?

It was pretty well split between the options; tabled for another time.
Re: [HACKERS] [GENERAL] string_to_array with empty input
Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
> > On Tue, Mar 31, 2009 at 10:44 AM, Greg Stark <st...@enterprisedb.com> wrote:
> >> On Tue, Mar 31, 2009 at 3:42 PM, Sam Mason <s...@samason.me.uk> wrote:
> >>> string_to_array('',',')::INT[]  =  invalid input syntax for integer: ""
> >> Oof. That's a good point.
> > +1.  I find this argument much more compelling than anything else that's
> > been offered up so far.
>
> Yeah.  It seems to me that if you consider only the case where the array
> elements are text, there's a weak preference for considering '' to be a
> single empty string; but as soon as you think about any other datatype,
> there's a strong preference to consider it a zero-element list.  So I too
> have come around to favor the latter interpretation.  Do we have any
> remaining holdouts?
>
> regards, tom lane

I'm still a holdout.  We are taking a string and putting it into an array
based on a delimiter.  That is very simple and straightforward.  Yet many
argue that if we want to cast this into another data type, the function
should deal with it in limited cases.

string_to_array('',',')::INT[] works as proposed, but
string_to_array(',,,', ',')::INT[] fails, and
string_to_array('1,2,,4', ',')::INT[] fails.

I'm trying to understand the difference between an empty string and a string
with many blank entries between the delimiters.  Consider ',,' = '' once the
delimiter is removed, yet seven zero-length entries were passed.  How is that
going to be handled?  In one case it works, yet in other cases it fails.
This is inconsistent behavior.

Unless all zero-length strings are removed or are treated as NULLs, I can't
see how casting to another type is going to work.  And if zero-length strings
are treated as NULLs, this creates the idea that zero-length strings are
equal to NULLs.

The input is a string and the output is text[]; casting to another data type
is error prone and should be handled by the programmer.
Re: [HACKERS] [GENERAL] string_to_array with empty input
If someone can show me a real-world example where this logic simplifies the
code and has more uses, I'll bite; I just presently can't see how this works
better.
Re: [HACKERS] [GENERAL] string_to_array with empty input
Martin Gainty wrote:
> Split strings into array elements using provided delimiter:
> string_to_array('xx~^~yy~^~zz', '~^~')  output: {xx,yy,zz}
> http://www.postgresql.org/docs/8.3/interactive/functions-array.html

Sorry, that's not the question I'm asking.  We are debating whether it makes
sense to change the output in certain cases.  I'm for not returning NULLs or
a zero-element array.  I'm asking how the other behavior is better, by way of
a real-world example.  I don't see the plus side at the moment.
Re: [HACKERS] [GENERAL] string_to_array with empty input
Sam Mason wrote:
> I'd expect 3, 2, 1 and 1.  That's also a disingenuous example; what would
> you expect back from:
>
>   select count_elements(string_to_array('butter,,milk',','))
>
> I think the semantics you want is what you'd get from:
>
>   array_filter_blanks(string_to_array($1,$2))
>
> where I defined array_filter_blanks in my previous post.

I agree the function should not be changing values passed.
Stripping/dropping empty strings is changing what was passed into the
function instead of breaking it into an array.
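For readers following along: the post that defined array_filter_blanks is not quoted in this thread, so the following is only a guess at its semantics, kept deliberately minimal:

```sql
-- One possible definition of array_filter_blanks (hypothetical; the
-- earlier post defining it is not quoted here): drop empty-string
-- elements, returning an empty array rather than NULL when none remain.
CREATE FUNCTION array_filter_blanks(text[]) RETURNS text[] AS $$
    SELECT COALESCE(array_agg(x), '{}')
    FROM unnest($1) AS x
    WHERE x <> ''
$$ LANGUAGE sql IMMUTABLE;

-- e.g. array_filter_blanks(string_to_array('butter,,milk', ','))
-- would yield {butter,milk}
```

This makes the disagreement concrete: filtering blanks is a separate, explicit choice layered on top of string_to_array, rather than behavior baked into the split itself.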
Re: [HACKERS] [GENERAL] string_to_array with empty input
Sam Mason wrote: string_to_array('',',')::INT[] = invalid input syntax for integer: Which you don't get at the moment; although you do currently get it in other common cases such as: string_to_array('1,',',')::INT[] If you want backwards compatible behaviour you could always bung a NULLIF in there: string_to_array(NULLIF('',''),',')::INT[] = NULL

But consider that this also fails: select string_to_array('1, , 3', ',')::int[] = ERROR: invalid input syntax for integer: yet this works: select string_to_array('1, 2, 3',',')::int[]
Re: [HACKERS] [GENERAL] string_to_array with empty input
Greg Stark wrote: On Tue, Mar 31, 2009 at 5:48 PM, justin jus...@emproshunts.com wrote: But consider that this also fails: select string_to_array('1, , 3', ',')::int[] = ERROR: invalid input syntax for integer: " " yet this works: select string_to_array('1, 2, 3',',')::int[] Sure, and the analogous pair string_to_array(' ',',')::int[] and string_to_array('1 ',',')::int[] behave similarly. The point is that '' could represent no items or one empty string. We get to pick which one, and in any use case where the string was a list of objects it's almost certainly intended to be an empty list. And databases are almost always processing lists of things. I think the only use case where you want it to be a singleton list of an empty string is when you're doing string parsing, such as building a lexer or something like that, which isn't a typical use for SQL code.

I disagree. Casting a string to something else can be very error prone to begin with. Having string_to_array() deal with that possibility is out of its scope, IMHO. Consider this: I have intelligent part numbers that need to be split apart to simplify searching and doing math with. string_to_array(' F-2500-50 ', '-')::int[] still fails with an error, as expected. What is the difference between ' ' and 'F'? So before doing anything, a test needs to be done to verify the contents, so they can be cast to something else.
Re: [HACKERS] [GENERAL] string_to_array with empty input
This thread being cross-posted has made it a bit confusing.

Greg Stark wrote: Nobody has ever suggested filtering out empty elements or dealing specially with spaces or anything else like that. If you're talking about that then you've missed the original question. "Does anyone want to argue for keeping it the same? Or perhaps argue that a zero-element array is a more sensible result than a one-element array with one empty string? (It doesn't seem like it to me, but maybe somebody thinks so.)"

That means to me dropping empty strings or removing values that don't make sense. Then the argument begins about which values make sense to drop: just zero-length strings, or also strings with a million white spaces?

One last try. If there's a column called "shopping list" which is a comma-separated list of items to buy in the store, and I store '' in it, how many items do you think that means you should go shopping for? Do you think that's one item that should be rejected because it's an empty string? Or do you think that's an empty list with zero items listed?

It can't be rejected; it's an empty shopping list, although a worthless one.

What would it look like if it was a zero-length list? You can ask what it would look like if it was a shopping list of one item called ''. I agree both are theoretically consistent, but one is actually useful in 99% of use cases; the other is only useful in unusual cases.

I'm still confused about which one you want here.
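The two interpretations being debated can be modeled with a short Python sketch (an illustration only, not PostgreSQL's implementation; the function names are made up for this example):

```python
def split_as_one_empty_string(s, delim):
    # Model of the "one-element array holding ''" interpretation:
    # a plain split, where '' yields a single empty element.
    return s.split(delim)

def split_as_empty_list(s, delim):
    # Model of the proposed "zero-element list" interpretation:
    # '' is treated as an empty list of items.
    return [] if s == "" else s.split(delim)

# The empty input is where the two disagree; casting [''] to integers
# fails, while casting [] succeeds trivially.
print(split_as_one_empty_string("", ","))  # ['']
print(split_as_empty_list("", ","))        # []
print(split_as_empty_list("1,2,3", ","))   # ['1', '2', '3']
```

Note that neither model helps with the poster's other examples (',,,', '1,2,,4'), which contain interior empty elements either way; that is the inconsistency being argued about.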
[HACKERS] math error or rounding problem Money type
I believe I have found a math bug/rounding problem with the Money type when it's used with SUM()... PostgreSQL 8.3.1

--- Background on the Problem

We have a gl_trans table with 92,000 rows, with one column containing the positive and negative entries. In order to make this gl_trans table make more sense, and to group the accounts into correct debits and credits along with the type of account, a view was created that does the grouping and sorting. To further make things easier, the view casted the results into the Money type, just to make the select statements that call the view shorter. All looked great for several weeks, till all of a sudden the summed values for all accounts went out by 0.01. I needed to confirm this was a rounding problem and not a GL entry that was bad. (If we had a bad entry, this would scream that we have a far bigger problem, where the application allowed a GL entry to be committed that was out of balance.) To confirm that every entry made has an equal and opposite entry, the select statement below was created. The gltrans_sequence column is an integer key that groups General Ledger entries together, so all the sides of a specific entry can be found.

select * from
  (select gltrans_sequence, sum(gltrans_amount) as Neg
   from gltrans where gltrans_amount < 0 group by gltrans_sequence) as neg,
  (select gltrans_sequence, sum(gltrans_amount) as pos
   from gltrans where gltrans_amount > 0 group by gltrans_sequence) as pos
where neg.gltrans_sequence = pos.gltrans_sequence
  and neg.neg <> pos.pos*-1

This returns no records, as expected... Now, armed with the knowledge that no entry was bad, I suspected it had to be the money data type.
So I added explicit castings:

select * from
  (select gltrans_sequence, sum(gltrans_amount::text::money) as Neg
   from gltrans where gltrans_amount < 0 group by gltrans_sequence) as neg,
  (select gltrans_sequence, sum(gltrans_amount::text::money) as pos
   from gltrans where gltrans_amount > 0 group by gltrans_sequence) as pos
where neg.gltrans_sequence = pos.gltrans_sequence
  and neg.neg <> pos.pos*-1

select * from
  (select gltrans_sequence, sum(gltrans_amount::text::money) as Neg
   from gltrans where gltrans_amount < 0 group by gltrans_sequence) as neg,
  (select gltrans_sequence, sum(gltrans_amount::text::money) as pos
   from gltrans where gltrans_amount > 0 group by gltrans_sequence) as pos
where neg.gltrans_sequence = pos.gltrans_sequence
  and neg.neg::text::money <> pos.pos::text::money*-1

select * from
  (select gltrans_sequence, sum(gltrans_amount) as Neg
   from gltrans where gltrans_amount < 0 group by gltrans_sequence) as neg,
  (select gltrans_sequence, sum(gltrans_amount) as pos
   from gltrans where gltrans_amount > 0 group by gltrans_sequence) as pos
where neg.gltrans_sequence = pos.gltrans_sequence
  and neg.neg::text::money <> pos.pos::text::money*-1

--- Nothing resulted in showing an entry that was out of balance.

--- Identifying the problem ---

So I turned my attention to the view, which casted the numeric type to Money.
The view is called trailbalance. The bad select statement that creates the view:

SELECT p.period_id, p.period_start, p.period_end, a.accnt_id, a.accnt_number, a.accnt_descrip, p.period_yearperiod_id, a.accnt_type,
  SUM(CASE WHEN g.gltrans_date < p.period_start THEN g.gltrans_amount ELSE 0.0 END)::text::money AS beginbalance,
  SUM(CASE WHEN g.gltrans_date <= p.period_end AND g.gltrans_date >= p.period_start AND g.gltrans_amount <= 0::numeric THEN g.gltrans_amount ELSE 0.0 END)::text::money AS negative,
  SUM(CASE WHEN g.gltrans_date <= p.period_end AND g.gltrans_date >= p.period_start AND g.gltrans_amount >= 0::numeric THEN g.gltrans_amount ELSE 0.0 END)::text::money AS positive,
  SUM(CASE WHEN g.gltrans_date <= p.period_end AND g.gltrans_date >= p.period_start THEN g.gltrans_amount ELSE 0.0 END)::text::money AS difference,
  SUM(CASE WHEN g.gltrans_date <= p.period_end THEN g.gltrans_amount ELSE 0.0 END)::text::money AS endbalance
FROM period p
  CROSS JOIN accnt a
  LEFT JOIN gltrans g ON (g.gltrans_accnt_id = a.accnt_id AND g.gltrans_posted = true)
WHERE p.period_id = 58
GROUP BY p.period_id, p.period_start, p.period_end, a.accnt_id, a.accnt_number, a.accnt_descrip, p.period_yearperiod_id, a.accnt_type
ORDER BY p.period_id, a.accnt_number;

---End Select

The query that calls this view:

Select sum(beginBalance) as beginbalance, sum(negative) as debit, sum(positive) as credit, sum(difference) as difference, sum(endbalance) as endbalance from trailbalance
Re: [HACKERS] math error or rounding problem Money type
Tom Lane wrote: Justin [EMAIL PROTECTED] writes: I believe I have found a math bug/rounding problem with the Money type when it's used with SUM()... PostgreSQL 8.3.1 You do know that money only stores two fractional digits? regards, tom lane

Yes. The question is why the two sides are not equal anymore. Take this:

Select '2', round(sum(beginBalance),6) as beginbalance, round(sum(negative),6) as debit, round(sum(positive),6) as credit, round(sum(difference),6) as difference, round(sum(endbalance),6) as endbalance from trailbalance
union
Select '1', sum(round(beginBalance,6)) as beginbalance, sum(round(negative,6)) as debit, sum(round(positive,6)) as credit, sum(round(difference,6)) as difference, sum(round(endbalance,6)) as endbalance from trailbalance

 1 | -0.06 | -11250546.743752 | 11250546.743752 | 0.00 | -0.07
 2 |  0.00 | -11250546.743752 | 11250546.743752 | 0.00 |  0.00

At the very least this shows a clear lesson: when rounding, do it after the sum function is called, not before.
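The discrepancy above is the classic round-before-sum versus round-after-sum problem: casting each element to money truncates it to two fractional digits before summing, while summing the numeric values and rounding once keeps the fractions. A small Python sketch with hypothetical amounts (not the poster's actual ledger data) shows the drift:

```python
from decimal import Decimal, ROUND_HALF_UP

# Fourteen hypothetical half-cent ledger amounts.
amounts = [Decimal("0.005")] * 14

def round2(x):
    # Round to two fractional digits, as a money-style type would store.
    return x.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

sum_then_round = round2(sum(amounts))             # 0.070 rounded -> 0.07
round_then_sum = sum(round2(a) for a in amounts)  # 14 x 0.01     -> 0.14

print(sum_then_round, round_then_sum)
```

The two totals differ by 0.07 here; with 92,000 rows of real amounts, an accumulated penny of drift, as the poster observed, is unsurprising.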
Re: [HACKERS] [PERFORM] Memory question on win32 systems
Sabbiolina wrote: On Thu, May 29, 2008 at 2:11 AM, Douglas McNaught [EMAIL PROTECTED] wrote: On Wed, May 28, 2008 at 7:05 PM, Sabbiolina [EMAIL PROTECTED] wrote: Hello, in my particular case I need to configure Postgres to handle only a few concurrent connections, but I need it to be blazingly fast, so I need it to cache everything possible. I've changed the config file and multiplied all memory-related values by 10; still Postgres uses less than 50 MB of my RAM. How are you measuring this? Task Manager. I have 4 Gigs of RAM; how do I force Postgres to use a higher part of that memory in order to cache more indexes, queries and so on? Post the settings values you're using and people will be better able to help you. -Doug

config:

port = 5432                      # (change requires restart)
max_connections = 50             # (change requires restart)
shared_buffers = 320MB           # min 128kB or max_connections*16kB
temp_buffers = 80MB              # min 800kB
max_prepared_transactions = 10   # can be 0 or more
work_mem = 10MB                  # min 64kB
maintenance_work_mem = 160MB     # min 1MB
max_stack_depth = 20MB           # min 100kB
max_fsm_pages = 2048000          # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1            # min 100, ~70 bytes each
shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'   # (change requires restart)
wal_buffers = 640kB              # min 32kB
log_destination = 'stderr'       # Valid values are combinations of
logging_collector = on           # Enable capturing of stderr and csvlog
log_line_prefix = '%t '          # special values:
datestyle = 'iso, dmy'
lc_messages = 'Italian_Italy.1252'    # locale for system error message
lc_monetary = 'Italian_Italy.1252'    # locale for monetary formatting
lc_numeric = 'Italian_Italy.1252'     # locale for number formatting
lc_time = 'Italian_Italy.1252'        # locale for time formatting
default_text_search_config = 'pg_catalog.italian'

You need to use Performance Monitor and Process Explorer to track down caching and performance on Windows machines.
PostgreSQL does not work like MS SQL, where at startup it grabs a large chunk of memory that it uses to keep the last X number of queries, along with the indexes and tables, in memory, pushing things out only when new, completely different queries need the memory. To my understanding, PostgreSQL only caches queries and results in memory for that specific connection. So when that connection is closed, those cached results are cleared out. So cached indexes and queries are for that connection only. I hope my understanding is correct.
Re: [HACKERS] [PERFORM] Memory question on win32 systems
Merlin Moncure wrote: On Thu, May 29, 2008 at 10:19 AM, Justin [EMAIL PROTECTED] wrote: To my understanding Postgresql only caches queries and results in memory for that specific connection. So when that connection is closed those cached results are cleared out. So cached indexes and queries are for that connection only. I hope my understanding is correct. Completely wrong. They are called 'shared' buffers for a reason. Also you are missing the point of the O/S file cache which lies under that. If you have a computer with 4GB of addressable RAM, all its memory is used for caching at all times, period, less what the operating system needs or what is used by running programs or temporary demands (sorting, etc). Also, PostgreSQL doesn't as a rule cache 'results and queries'. The shared_buffers setting reserves memory for PostgreSQL's internal cache (plus some housekeeping things like locks), which can be faster than the O/S cache because it is more tightly integrated with the backend. However, a page fault to disk matters much more in performance terms than the performance differences between shared buffers and the O/S cache. merlin

This is not meant to be argumentative; I'm trying to clarify what is going on, because documentation from different sources seems to be in conflict, or confusing.

From PostgreSQL, Second Edition: "Shared_Buffers: This cache is shared by all clients connected to a single cluster. Disk I/O (and cache I/O) is performed in 8KB chunks. The shared_buffers parameter determines how many 8KB buffers will be created in the shared cache."

The explanation here is different: http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html It refers to queries, which to me means the SQL command and the result sets (but that is wrong). We have sort_mem aka work_mem.

From PostgreSQL, Second Edition: when PostgreSQL processes a query, it transforms the query from string form into an execution plan.
An execution plan is a sequence of operations that must be performed in order to satisfy the query. (This is specific to the client connection, and when the client is closed out this is cleared, to my thinking.)

From the PostgreSQL documentation: work_mem (integer): Specifies the amount of memory to be used by internal sort operations and hash tables before switching to temporary disk files. The value defaults to one megabyte (1MB). Note that for a complex query, several sort or hash operations might be running in parallel; each one will be allowed to use as much memory as this value specifies before it starts to put data into temporary files. Also, several running sessions could be doing such operations concurrently, so the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.

Quoting you: "Also, postgresql doesn't as a rule cache 'results and queries'." Then what is the purpose of shared buffers if nothing is being reused? Is it only used to keep track of locks, changes, and what is being spooled to the kernel? I'm confused trying to figure out how the caches are being used and how data moves through the PostgreSQL backend. Many have stated that a small database can fit completely in the caches; how does that really work?
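The work_mem caveat quoted above is a simple multiplication; a back-of-envelope sketch using the settings from the config earlier in the thread (sorts_per_query is an assumed figure, not something from the config) shows how the worst-case footprint can dwarf any single setting:

```python
MB = 1024 * 1024

# Values taken from the posted config.
shared_buffers = 320 * MB
work_mem = 10 * MB
max_connections = 50

# Assumption for illustration: each query runs a few sorts/hashes in parallel.
sorts_per_query = 3

# Worst case: every connection busy, every query using its full sort budget.
worst_case = shared_buffers + max_connections * sorts_per_query * work_mem
print(worst_case // MB, "MB")  # 320 + 50*3*10 = 1820 MB
```

This is why work_mem is per-operation rather than a global cap: the documentation's warning that "the total memory used could be many times the value of work_mem" is literal arithmetic.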
[HACKERS] transaction support in pl/pgsql
I have a request: is there any intention of expanding transaction support in PL/pgSQL? This would simplify GUI/client code, which needs to know whether a function failed in order to roll back the transaction. Right now I have to pass back some kind of error message that I look for, to see if the transaction failed, to decide whether to run the commit or rollback command. I know about RAISE EXCEPTION within sub-blocks, but I want a way to roll back to the starting transaction.
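The client-side bookkeeping the request is trying to eliminate can be sketched as follows, assuming a DB-API-style connection object exposing commit()/rollback() (as psycopg2's does); the helper name run_in_transaction is hypothetical, not an existing API:

```python
def run_in_transaction(conn, work):
    """Run work(conn); commit on success, roll back on any error.

    This is the decide-commit-or-rollback logic that currently has to
    live in the GUI/client rather than inside the PL/pgSQL function.
    """
    try:
        result = work(conn)
    except Exception:
        conn.rollback()
        raise
    conn.commit()
    return result
```

With transaction control inside the procedural language, the function itself could roll back its own work, and this wrapper (and the error-message sniffing the poster describes) would be unnecessary.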
[HACKERS] creating new aggregate function
Need help and direction creating new aggregate functions. We need to add more average functions for both scientific and financial purposes: RMS, for electrical measurement purposes; Mode, for both electrical and financial; Weighted average, for financial purposes; Generalized mean, Geometric mean, and Harmonic mean, for electrical measurement purposes. What would be the best way to create these new functions? ---(end of broadcast)--- TIP 6: explain analyze is your friend
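The math behind several of the requested means is straightforward; a Python sketch of the formulas is below (the PostgreSQL route would be CREATE AGGREGATE with a state transition function, which is not shown here; these standalone functions are only an illustration of the computations):

```python
import math

def rms(xs):
    # Root mean square: sqrt of the mean of the squares.
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def geometric_mean(xs):
    # n-th root of the product, computed via logs for numeric stability.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # n divided by the sum of reciprocals.
    return len(xs) / sum(1.0 / x for x in xs)

def weighted_average(xs, ws):
    # Sum of value*weight divided by the total weight.
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)
```

For example, rms([3, 4]) is sqrt(12.5), geometric_mean([2, 8]) is 4, and harmonic_mean([2, 6]) is 3. In an actual aggregate, each would be split into a per-row transition step and a final function.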
[HACKERS] Ad Hoc Indexes
Question: We have recently moved to PostgreSQL as we just migrated to OpenMfg. I really like the database, and PL/pgSQL is very similar in layout to the FoxPro language, so learning has been pretty easy. Now for my question: it does not appear that PostgreSQL has an ad hoc index ability, where the query planner will create an in-memory index based on the SELECT, UPDATE, INSERT, or DELETE commands. I got pretty lazy with MS SQL and FoxPro, using ad hoc indexes to speed up complex queries where I should have created an index but never did. Are there any plans to add the ability for PostgreSQL to create ad hoc indexes when it makes sense? Example: I was creating a new report that presently processes only around 120,000 records in the 7 tables in the query, but the query took between 18 and 19 seconds to run. I analyzed the query, added 2 indexes, and dropped the processing time to 93 milliseconds. This query will be run 3 to 5 times a day, so it makes sense to create indexes. But with reports that run maybe once a week or once a month, it does not make sense to create indexes; this is where I think ad hoc indexes are nice to have, because to my understanding maintaining indexes can get pretty expensive. What's the likelihood of ad hoc indexes being added to PostgreSQL? thanks
Re: [HACKERS] Ad Hoc Indexes
The idea of ad hoc indexes is to speed up loop scans. To prove the idea, I created a sql file in pgAdmin that creates the indexes on the fly, runs the query, then drops the indexes. Without the indexes it takes 18 to 19 seconds to run the query; creating the indexes and doing the query takes 400 milliseconds. I wish I could do an EXPLAIN on it with CREATE INDEX in the query, but I can't; it errors out. So I reran the query with the indexes already in place, which drops the query time to 191 milliseconds. Creating and deleting the indexes on the fly improves performance almost 50 times. I think creating ad hoc indexes on the fly in memory makes sense. I imagine it would be even faster if the index stayed in memory.

Tom Lane wrote: Justin [EMAIL PROTECTED] writes: Is there any plans in the future to add the ability for PostgreSQL to create Ad Hoc indexes if it makes sense. No, I'm not aware of anyone contemplating such a thing. I can hardly imagine a situation where building an index for a single query is actually a win. Maybe those DBMSes you mention were using this as a substitute for having decent join mechanisms, or something?
regards, tom lane

Sort  (cost=4091.92..4091.93 rows=1 width=83) (actual time=22003.849..22004.041 rows=458 loops=1)
  Sort Key: to_char((coitem.coitem_scheddate)::timestamp with time zone, 'Month Dy DD '::text), cohead.cohead_number, coitem.coitem_linenumber
  ->  Nested Loop Left Join  (cost=0.00..4091.91 rows=1 width=83) (actual time=14.477..21998.119 rows=458 loops=1)
        ->  Nested Loop  (cost=0.00..191.54 rows=1 width=75) (actual time=0.132..32.066 rows=458 loops=1)
              ->  Nested Loop  (cost=0.00..191.22 rows=1 width=44) (actual time=0.121..25.253 rows=458 loops=1)
                    ->  Nested Loop  (cost=0.00..182.94 rows=1 width=40) (actual time=0.111..18.476 rows=458 loops=1)
                          ->  Nested Loop  (cost=0.00..182.53 rows=1 width=30) (actual time=0.100..11.662 rows=458 loops=1)
                                ->  Index Scan using coitem_status_key on coitem  (cost=0.00..174.25 rows=1 width=30) (actual time=0.084..2.364 rows=458 loops=1)
                                      Index Cond: (coitem_status = 'O'::bpchar)
                                      Filter: ((coitem_scheddate >= '2008-01-01'::date) AND (coitem_scheddate <= '2008-12-31'::date))
                                ->  Index Scan using itemsite_pkey on itemsite  (cost=0.00..8.27 rows=1 width=8) (actual time=0.012..0.014 rows=1 loops=458)
                                      Index Cond: (coitem.coitem_itemsite_id = itemsite.itemsite_id)
                          ->  Index Scan using item_pkey on item  (cost=0.00..0.40 rows=1 width=18) (actual time=0.008..0.010 rows=1 loops=458)
                                Index Cond: (itemsite.itemsite_item_id = item.item_id)
                    ->  Index Scan using cohead_pkey on cohead  (cost=0.00..8.27 rows=1 width=12) (actual time=0.008..0.010 rows=1 loops=458)
                          Index Cond: (cohead.cohead_id = coitem.coitem_cohead_id)
              ->  Index Scan using cust_pkey on custinfo  (cost=0.00..0.31 rows=1 width=39) (actual time=0.007..0.009 rows=1 loops=458)
                    Index Cond: (cohead.cohead_cust_id = custinfo.cust_id)
        ->  Index Scan using wo_pkey on wo  (cost=0.00..8.27 rows=1 width=12) (actual time=0.007..0.011 rows=1 loops=458)
              Index Cond: (wo.wo_id = coitem.coitem_order_id)
        SubPlan
          ->  Nested Loop  (cost=958.43..1975.22 rows=1 width=36) (actual time=22.607..22.955 rows=1 loops=419)
                Join Filter: (wrkcnt.wrkcnt_id = wooper.wooper_wrkcnt_id)
                InitPlan
                  ->  Aggregate  (cost=958.42..958.43 rows=1 width=4) (actual time=13.870..13.870 rows=1 loops=419)
                        ->  Seq Scan on wooper  (cost=0.00..958.41 rows=1 width=4) (actual time=13.453..13.856 rows=1 loops=419)
                              Filter: (wooper_rncomplete AND (wooper_wo_id = $0))
                ->  Seq Scan on wooper  (cost=0.00..1013.10 rows=1 width=8) (actual time=22.547..22.821 rows=1 loops=419)
                      Filter: ((wooper_seqnumber = $1) AND (wooper_wo_id = $0))
                ->  Seq Scan on wrkcnt  (cost=0.00..2.75 rows=75 width=36) (actual time=0.010..0.052 rows=75 loops=419)
          ->  Aggregate  (cost=958.42..958.43 rows=1 width=4) (actual time=14.033..14.034 rows=1 loops=419)
                ->  Seq Scan on wooper  (cost=0.00..958.41 rows=1 width=4) (actual time=13.600..14.011 rows=1 loops=419)
                      Filter: (wooper_rncomplete AND (wooper_wo_id = $0))
          ->  Aggregate  (cost=958.42..958.43 rows=1 width=4) (actual time=14.058..14.059 rows=1 loops=458)
                ->  Seq Scan on wooper  (cost=0.00..958.41 rows=1 width=4) (actual time=13.671..14.043 rows=1 loops=458)
                      Filter: (wooper_rncomplete AND (wooper_wo_id = $0
Re: [HACKERS] Ad Hoc Indexes
Then why are the estimates so far off? If the estimates were correct, would it improve the performance that much? Vacuum is set to run automatically, so the stats stay updated. The total record count for all the tables put together is around 120,000; the query returns only 458 records, which is correct. If I am correct in my understanding, the reason the index improved the query so much is that the wooper table gets hit hard, because it appears in 3 separate nested queries. So taking only the 458 records returned from the parent query, times 3, gives 1,375 table scans going through 21,873 records, for a total of 30,075,375 records being processed on a table with no index. Looked at that way, PostgreSQL did remarkably well processing the query in 18 to 20 seconds. The idea behind ad hoc indexes is that one-shot or rarely used queries, which would require numerous indexes to run in a decent time, can be run in a fraction of the time. This also saves processing time across the entire system, where creating indexes for all the possible queries is impractical. This does not take away the need for indexes, but speeds up ad-hoc queries created from a website or other business analysis tool that someone might create.

Tom Lane wrote: Justin [EMAIL PROTECTED] writes: The idea of ad hoc indexes is speed up loop scans. To prove my idea i created a sql file in PGAdmin that creates the indexes on the fly then runs the query then drops the indexes. Without the indexes it takes 18 to 19 seconds to run the query. To create the index and do the query takes 400 milliseconds. The example you show doesn't convince me of much of anything, because the estimated rowcounts are so far off. I think you're basically dealing with an estimation failure and it's pure luck that the extra index fixes it. regards, tom lane
Re: [HACKERS] Ad Hoc Indexes
Oops, damn spell checker: "really" in my previous message should be "rarely". Sorry. Justin
Re: [HACKERS] Ad Hoc Indexes
Yes, that improved the select statement a lot: from 18 to 20 seconds down to 6 seconds, a 3-fold improvement with no indexes. Thanks. Now I tested your new and improved select statement with the indexes; its query time was 90 to 140 milliseconds, compared to the original select statement's 94 to 120 milliseconds. I tested both select statements 5 times to get an average, and they both hover around 110 to 120. I attached the results from the explain that used the indexes and one with no index, although in both cases, with and without indexes, the estimate still failed to return the correct number by a lot.

Stephen Denne wrote: Justin wrote: Then why are the estimates so far off? Perhaps a really high correlation between those records where coitem_status = 'O' and those records in your date range. The planner estimates 1 row when restricting by both restrictions, whereas there were 458. The 1 row was probably a result of a small % for status='O' multiplied by a small % for the date range. If estimates were correct, would it improve the performance that much? Possibly, but a better performance gain might be obtained by rewriting the query, changing the case expression to something along the lines of:

coalesce((
  select wrkcnt_code || ' Operation Completed :' || cast(wooper_seqnumber as text)
  from wooper
  inner join wrkcnt on wrkcnt_id = wooper_wrkcnt_id
  where wooper_rncomplete = true
    and wooper_wo_id = coitem_order_id
  order by wooper_seqnumber desc
  limit 1
), 'No Operation Completed') as LastFinshedOp

regards, Stephen Denne.

Sort  (cost=1161.97..1161.98 rows=1 width=83) (actual time=6458.955..6459.166 rows=458 loops=1)
  Sort Key: to_char((coitem.coitem_scheddate)::timestamp with time zone, 'Month Dy DD '::text), cohead.cohead_number, coitem.coitem_linenumber
  ->  Nested Loop Left Join  (cost=0.00..1161.96 rows=1 width=83) (actual time=14.638..6453.589 rows=458 loops=1)
        ->  Nested Loop  (cost=0.00..191.54 rows=1 width=75) (actual time=0.121..31.942 rows=458 loops=1)
              ->  Nested Loop  (cost=0.00..191.22 rows=1 width=44) (actual time=0.111..24.863 rows=458 loops=1)
                    ->  Nested Loop  (cost=0.00..182.94 rows=1 width=40) (actual time=0.102..18.215 rows=458 loops=1)
                          ->  Nested Loop  (cost=0.00..182.53 rows=1 width=30) (actual time=0.091..11.299 rows=458 loops=1)
                                ->  Index Scan using coitem_status_key on coitem  (cost=0.00..174.25 rows=1 width=30) (actual time=0.077..2.314 rows=458 loops=1)
                                      Index Cond: (coitem_status = 'O'::bpchar)
                                      Filter: ((coitem_scheddate >= '2008-01-01'::date) AND (coitem_scheddate <= '2008-12-31'::date))
                                ->  Index Scan using itemsite_pkey on itemsite  (cost=0.00..8.27 rows=1 width=8) (actual time=0.012..0.014 rows=1 loops=458)
                                      Index Cond: (coitem.coitem_itemsite_id = itemsite.itemsite_id)
                          ->  Index Scan using item_pkey on item  (cost=0.00..0.40 rows=1 width=18) (actual time=0.008..0.010 rows=1 loops=458)
                                Index Cond: (itemsite.itemsite_item_id = item.item_id)
                    ->  Index Scan using cohead_pkey on cohead  (cost=0.00..8.27 rows=1 width=12) (actual time=0.008..0.010 rows=1 loops=458)
                          Index Cond: (cohead.cohead_id = coitem.coitem_cohead_id)
              ->  Index Scan using cust_pkey on custinfo  (cost=0.00..0.31 rows=1 width=39) (actual time=0.007..0.010 rows=1 loops=458)
                    Index Cond: (cohead.cohead_cust_id = custinfo.cust_id)
        ->  Index Scan using wo_pkey on wo  (cost=0.00..8.27 rows=1 width=12) (actual time=0.007..0.011 rows=1 loops=458)
              Index Cond: (wo.wo_id = coitem.coitem_order_id)
        SubPlan
          ->  Limit  (cost=962.12..962.12 rows=1 width=36) (actual time=13.977..13.978 rows=1 loops=458)
                ->  Sort  (cost=962.12..962.12 rows=1 width=36) (actual time=13.970..13.970 rows=1 loops=458)
                      Sort Key
Re: [HACKERS] Ad Hoc Indexes
there are certainly problems with this idea. Figuring out the cost to either create an index or just do table scan would be pain but not impossible. The hit to index a table in memory i don't think would be that bad compare to do 100's to thousands of loops with table scans. I see more options for the query planner in config file to handle ad-hoc indexes Now to Toms points. The point of Ad Hoc index is they're only alive for the period time the query is running and only live in memory. Once the query completes they die horrible deaths These temporay indexes will not do a table lock or row lock ever it only needs to view the record to create this index. So the same problems that affects 100 table scans would affect create temp index. I think this would help reduce concurrence sense the table is not being scanned thousands of times The idea of creating one time use indexes has been around for long time and has showed to be a benefit when dealing with large data sets where queries is one time or rarely used and its to much labor and cost to figure out how to make it faster. This would also reduce IO disk activity allot if the table can't fit in memory but the index would fit because it relativity small in comparison. Tom Lane wrote: Stephen Denne [EMAIL PROTECTED] writes: The improvement wasn't to the part of the query that had the bad cost estimate, it was to the part that was being performed hundreds of times instead of the one time the planner estimated. The planner still thought it was only going to perform a sequential scan of your wooper table once. So even if there had been any Ad Hoc Index creation code that had been used to consider creating indexes as part of a plan cost estimate, it wouldn't have bothered creating any indexes on wooper. Right. And even more to the point, if it had gotten the estimate right and known that the subquery would have been repeated, that would have (ideally) prompted it to shift to a different plan structure. 
As Peter pointed out upthread, the existing hash join logic seems to be a pretty decent facsimile of an ad hoc index --- in fact, the hash table *is* a hash index for all intents and purposes. If you tilt your head at the right angle, a merge join could be seen as comparable to constructing a couple of ad-hoc btree indexes. Plus the join code is not burdened by any overhead that actual index code would be likely to have, such as niggling worries about crash-safety or concurrent access to the index. So in my mind the issue here is not why don't we have ad hoc indexes, it's why the planner didn't choose a more appropriate join method. It's clear that faulty rowcount estimation was one part of the answer. Now it may also be that there are some outright structural limitations involved due to the join arising from a sub-select --- I'm not sure that the planner *can* generate a bulk join plan from a query expressed like this one. But the TODO items that this suggests to me do not include implementing anything I'd call an ad-hoc index. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
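Tom's point that a hash join's hash table is effectively a throwaway index can be sketched outside the database. This is an illustrative Python sketch (not PostgreSQL's executor code): probing a dict built once is the moral equivalent of an ad hoc hash index, versus rescanning the "table" for every outer row.

```python
# Sketch: why a hash join behaves like an ad hoc index.
# `inner` plays the role of the table we would otherwise rescan.
inner = [(1, "red"), (2, "green"), (3, "blue")]
outer = [3, 1, 3, 2]

# Naive nested loop: one full scan of `inner` per outer row -- O(n*m).
nested_loop = [(k, v) for k in outer for (ik, v) in inner if ik == k]

# Hash join: build the hash table once (the "ad hoc index"),
# then probe it once per outer row -- O(n + m).
hash_index = {ik: v for (ik, v) in inner}
hash_join = [(k, hash_index[k]) for k in outer if k in hash_index]

assert nested_loop == hash_join
print(hash_join)
```

Both strategies produce the same join result; only the number of times the inner relation is visited differs, which is the crux of the thread's argument.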
Re: [HACKERS] [pgsql-www] pg_autovacuum is nice ... but ...
Bruce Momjian wrote: Should I add a TODO to warn if FSM values are too small? Is that doable? It sounds like it should be, and it would be a valuable pointer to people, so yep. Any idea who'd be interested in claiming it? Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Improvements to PostgreSQL
Suresh Tri wrote: snip All your suggestions are welcome. Please help us to implement these features. Our aim is to make postgresql enterprise level. Hi Suresh, From reading your post, I feel your team is approaching the goal of making PostgreSQL Enterprise Level in a non-optimal way. With the soon-to-be-released version 7.5 of PostgreSQL, the core database engine itself is already very good. This is not the area needing to be worked upon for the next level of Enterprise Functionality. Your team will likely have a lot more effect if they concentrate on what Enterprises really need that PostgreSQL is missing: + An SNMP agent to report on PostgreSQL's status and allow remote control of the PostgreSQL daemon. From an Oracle perspective, this would be the equivalent of Oracle Intelligent Agents, part of the core features of the Oracle Enterprise Manager (OEM). + Tools to allow control of PostgreSQL databases from one central place. Again, the same as OEM. + Starting and stopping the database + Managing Users + Backup and Recovery + Alerts and submitting jobs + etc Oracle does this by having a centralised information repository that a management GUI connects to, and having Oracle Intelligent Agents running on each server the database software is on. These Oracle Intelligent Agents keep the centralised repository aware of the status of the Oracle server software, perform actions on the Oracle servers as directed by the centralised repository (jobs running on there, instructions by the GUI, etc), and more. There's more to what the OEM GUI does, but that's a good start. + Something else that would be useful is a GUI tool to automatically set up PostgreSQL replication. The PostgreSQL Slony-I project would be the first one to look at, and probably equivalent to something like Oracle's Data Guard. They use a different approach, but the end result is having master and standby databases. Hope this is helpful. 
Regards and best wishes, Justin Clift Thanks, Suresh snip ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Improvements to PostgreSQL
Patrick Welche wrote: snip Is there more to remote control than setting GUC variables? Tell me more! Sure: + starting/restarting/stopping the database server software + the normal DBA-type work - creating/altering/dropping databases, users, functions, languages, permissions (etc) + Remote backup and recovery + Submitting jobs to run remotely on the server. i.e. reindexing or vacuuming scripts Remote Monitoring: + Alerts for specified events. i.e. The database server is getting near to capacity in its filesystem(s), or there have been too many invalid PG authorisation attempts, or there are connections getting rejected because the max_connections parameter isn't high enough Groups + Defining arbitrary groups of servers for the above to speed things up when working with many servers Roles + Having multiple administrators with different permissions (role based is generally good), all communicating through the centralised info repository so things don't get out of sync (possibly) + loading additional PG packages. i.e. rolling out oid2name or pgbench (or other PG utils) to servers. Could be viewed as something that should be done with the OS packaging mechanism(s) instead. Any of the PG GUIs (I generally use pgAdmin) could likely be extended to do all of this in a nice, user-friendly way. As an aside, SNMP is important in enterprise settings as it allows PG to be plugged into the monitoring capabilities of enterprise management frameworks. i.e. Concord's eHealth, and probably Tivoli, OpenView, etc Hope that's useful. Regards and best wishes, Justin Clift Cheers, Patrick ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[HACKERS] pg_dump - option for just dumping sequences?
Hi all, Came across a scenario today where I want to backup the latest values of sequences in the database, so they can be restored along with the data from a COPY command. Looked at pg_dump's output options (for 7.4/7.5) and there doesn't seem to be an easy way for just dumping sequence definitions (with the latest sequence numbers). Thought about using egrep too, but had difficulties there too. Does anyone else think it would be a useful thing adding an option to pg_dump for outputting just the sequences? Or perhaps there's an easy way of doing this I just haven't thought of? :-) Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] pg_dump - option for just dumping sequences?
Christopher Kings-Lynne wrote: Does anyone else think it would be a useful thing adding an option to pg_dump for outputting just the sequences? Or perhaps there's an easy way of doing this I just haven't thought of? Can't you just grep the output for pg_catalog.setval and SET SESSION AUTHORIZATION? Arrrgh, yep. Thanks Chris, I must be in space cadet mode today. :( Regards and best wishes, Justin Clift Chris ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [pgsql-www] Problems logging into CVS server
Marc G. Fournier wrote: Damn ... I'll have to look at it ... we had a hacker get in through the way anoncvs was set up, so I set a passwd on it in /etc/passwd (but didn't touch the anoncvs setup itself) ... will play with it tonight and see if I can figure out how to do a more secure anon-cvs ;( I have to be missing something in the config *sigh* Um, that sounds worrying. Was the activity of the hacker anything that would affect PG code, or access to anything sensitive (account passwords, etc)? Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: Release planning (was: Re: [HACKERS] Status report)
Marc G. Fournier wrote: snip As Jan points out, its the 'small features that are done' that we've been griping about having to wait for, not the big ones which we know aren't done ... snip Hmmm... so we do things slightly differently than previously... This upcoming version could be PG version 8.0, We continue with bugfixes on 7.4.x, That still leaves 7.5.x, 7.6.x (etc if needed), for releasing new versions of PG without the big features. Kind of like an in-between thing, whilst waiting for the major features in the major releases? That would mean we'd have: Version 8.0 as our next main release, Version 9.0 being the version after that with the next big features in it. Version 8.x being version 8 plus smaller features, prior to 9.0 Version 8.x.x being version 8.x plus bug fixes. Sounds like it could get hairy if we're not careful, but I reckon the PG Community is mature enough to make the right calls where needed. :) Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Status report
Bruce Momjian wrote: If you get full control of PostgreSQL, you can dictate what will happen. Until then, I will follow the community consensus, which may or may not match your opinion. Um, let's take the time to get the features in, otherwise we'll be waiting another year (roughly) to get PITR and others out to end users. :) Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Loadable Oracle Personality: WAS LinuxTag wrapup thread
Simon Riggs wrote: snip External tool is one thing, but the loadable personality seems like a very good idea and worth discussing further. Would an interesting, and maybe slightly different way of viewing a loadable personality, be as a set of rules that can be applied to parser input before the parser actually gets it... and massages input SQL into something for the parser to understand. I'm hugely generalising here of course, but you know how we have a PostgreSQL Rules system that rewrites queries before handing them to the query planner... well, would it be possible/practical to potentially have a Rules system that rewrites incoming SQL before it gets given to the normal parser. Might get complicated though... we'd need a pre-parser or something. However, having a generalised system for doing this may make it far easier to provide personalities. i.e. load a set of Oracle 8i rules, load a set of Oracle 9i rules, load a set of DB2 x, rules, etc. :) Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] LinuxTag wrapup
Mario Weilguni wrote: Because their SQL queries always seem to need a target object to select from. i.e. SELECT NEXTVAL.foo isn't valid for Oracle 8/9. snip It has been a long time since I've used Oracle, but shouldn't it be select foo.nextval from dual? Yep, that sounds better. It's been a couple of months since I was writing SQL in Oracle. Previous contract. :) Regards and best wishes, Justin Clift Regards, Mario Weilguni ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] LinuxTag wrapup
Andreas Pflug wrote: snip That's true, it's the question how much can be offered without too much effort. I'm not too deep in oracle stuff, what comes to my mind is - outer join syntax (parser thing) - sequences usage (parser too) - maybe stored procedure call, with a wrapper to convert output parameters to a composite return value. There's also their FROM DUAL workaround (in common usage) as well. i.e. SELECT NEXTVAL.foo FROM DUAL; Because their SQL queries always seem to need a target object to select from. i.e. SELECT NEXTVAL.foo isn't valid for Oracle 8/9. Regards and best wishes, Justin Clift There's certainly no point supporting any weird ddl command, so there's still porting work to be done when migrating. Regards, Andreas ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
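The kind of pre-parser rewriting discussed in the loadable-personality thread can be sketched with a couple of regexes. This is a toy illustration only, not a proposal for how a real parser hook would work; the two rules handle exactly the `foo.nextval` and `FROM DUAL` idioms mentioned above:

```python
import re

# Toy "Oracle personality": rewrite incoming SQL before the parser sees it.
# Rule 1: seq.nextval        -> nextval('seq')
# Rule 2: trailing FROM DUAL -> dropped (PostgreSQL needs no dummy table)
RULES = [
    (re.compile(r"\b(\w+)\.nextval\b", re.IGNORECASE), r"nextval('\1')"),
    (re.compile(r"\s+FROM\s+DUAL\b", re.IGNORECASE), ""),
]

def apply_personality(sql):
    for pattern, replacement in RULES:
        sql = pattern.sub(replacement, sql)
    return sql

print(apply_personality("SELECT foo.nextval FROM DUAL;"))
# -> SELECT nextval('foo');
```

A real implementation would need a proper grammar rather than regexes (string literals and comments break this immediately), which is exactly the "we'd need a pre-parser" complication raised in the thread.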
Re: [HACKERS] Adding column comment to information_schema.columns
Christopher Kings-Lynne wrote: Well, if we add them (and they would be very useful I reckon) should we ensure there's an obvious PG naming thing happening? Why are they useful? If you want PG-specific stuff then use the PG-specific catalogs!!! My take on this is that it's a LOT easier for people who don't know the internals of the PG catalogs to query the information schema, as things in the information schema are generally explicitly named. Much easier for non-experts, and most people don't want to invest the time to become experts. Regards and best wishes, Justin Clift Chris ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Adding column comment to information_schema.columns
Christopher Kings-Lynne wrote: snip Anyone who's writing queries that are examining the schema of the database is by definition not a newbie... By newbie here, I mean someone who's a PG newbie but has a reasonable understanding of databases (i.e. Oracle, etc); such a person would generally find the information_schema much easier to locate and use information in, compared to having to learn the PG internals. There's a whole lot of difference between the skill level needed to query the information_schema and find out things like table and column names, vs looking into pg_namespace, pg_class and pg_attribute plus understanding the specific info there to work out table and column names. I reckon that having information pre-prepared in views like those in information_schema is of course going to be easier for people than raw information in our internal catalogs. Do you get where I'm coming from with this? Regards and best wishes, Justin Clift Chris ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Adding column comment to information_schema.columns
Christopher Kings-Lynne wrote: There's a whole lot of difference between the skill level needed to query the information_schema and find out things like table and column names, vs looking into pg_namespace, pg_class and pg_attribute plus understanding the specific info there to work out table and column names. I reckon that having information pre-prepared in views like those in information_schema is of course going to be easier for people than raw information our internal catalogs. Do you get where I'm coming from with this? Yes, but I disagree. Your opinion is as an experienced user anyway, and you're just putting words in novice mouths... That I directly disagree with. I'm putting forth the viewpoint of the people I work with here, who aren't PG experienced. They're Oracle experienced. We've never had someone complain about querying stuff like that. For example, why do you need the comments on columns? The comment on columns addition to the constraint_column_usage view was a suggestion for our particular environment, where it's easier for some of the Perl programmers to have one view that shows them all of the needed info. I'm not super caring either if we add this stuff or not to PG, it was just a suggestion from the trying to be helpful POV. However, saying that people who aren't experienced with PG can easily (i.e. time efficiently) figure out how to query table and column names from PG by going through the pg_catalog stuff in comparison to things like information_schema.* is just not right. One other benefit of having more stuff in information_schema.* is that the stuff there is easier to look at and figure out what it is. With the view definitions that are provided to things like psql and pgAdmin when people look at an information_schema view, it provides them a way of figuring out where in the internal tables stuff is if they want to look for it. i.e. 
they can find a column in information_schema.constraint_column_usage and go gee where is that in the real PostgreSQL tables? Then look at the code that generates it and so on. :) Regards and best wishes, Justin Clift Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
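The usability gap Justin describes — a pre-prepared, explicitly named view versus joining raw catalogs — can be shown in miniature with SQLite's catalog. This is an analogy only, not PostgreSQL code: `columns_view` is a hypothetical stand-in for information_schema.columns, and `pragma_table_info` plays the part of the raw pg_class/pg_attribute machinery.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sites (id integer, name text)")

# The "raw catalog" route: you must know the internal metadata interface
# and which positional field holds the column name.
raw = [row[1] for row in con.execute("PRAGMA table_info(sites)")]

# The "information_schema" route: a friendly view (hypothetical here),
# prepared once, with explicitly named columns anyone can SELECT from.
con.execute("""
    CREATE VIEW columns_view AS
    SELECT 'sites' AS table_name, name AS column_name
    FROM pragma_table_info('sites')
""")
friendly = [r[1] for r in con.execute(
    "SELECT table_name, column_name FROM columns_view")]

assert raw == friendly == ["id", "name"]
```

Both routes yield the same answer; the point of the thread is that the second one asks far less of the reader up front.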
Re: [HACKERS] Adding column comment to information_schema.columns
Tom Lane wrote: Justin Clift [EMAIL PROTECTED] writes: Not sure how worthwhile others will find this small patch (to CVS HEAD), but we found it useful. It adds the column comments to the information_schema.columns view. This question has been touched on before, but I guess it's time to face it fair and square: is it reasonable for an SQL implementation to add implementation-specific columns to an information_schema view? One could certainly argue that the entire point of information_schema is to be *standard*, not more, not less. OTOH I do not know if adding an extra column is likely to break anyone's application. Comments? Well, I suppose it reduces application portability if anyone starts relying on it. ? Regards and best wishes, Justin Clift regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Bug with view definitions?
Hi guys, Not sure if this is a known issue or not, but I think I may have found a bug with the way view definitions are shown... at least in psql. Using 7.5 development CVS (as of a few hours ago) or even 7.4.3, if I connect using its version of psql to a database (of the same version), then use psql to view the information_schema.constraint_column_usage view, it gives me this definition: ***
mydb=# \d information_schema.constraint_column_usage
    View "information_schema.constraint_column_usage"
       Column       |               Type                | Modifiers
--------------------+-----------------------------------+-----------
 table_catalog      | information_schema.sql_identifier |
 table_schema       | information_schema.sql_identifier |
 table_name         | information_schema.sql_identifier |
 column_name        | information_schema.sql_identifier |
 constraint_catalog | information_schema.sql_identifier |
 constraint_schema  | information_schema.sql_identifier |
 constraint_name    | information_schema.sql_identifier |
View definition: SELECT current_database()::information_schema.sql_identifier AS table_catalog, x.tblschema::information_schema.sql_identifier AS table_schema, x.tblname::information_schema.sql_identifier AS table_name, x.colname::information_schema.sql_identifier AS column_name, current_database()::information_schema.sql_identifier AS constraint_catalog, x.cstrschema::information_schema.sql_identifier AS constraint_schema, x.cstrname::information_schema.sql_identifier AS constraint_name FROM ( SELECT DISTINCT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname FROM pg_namespace nr, pg_class r, pg_attribute a, pg_depend d, pg_namespace nc, pg_constraint c WHERE nr.oid = r.relnamespace AND r.oid = a.attrelid AND d.refclassid = 'pg_class'::regclass::oid AND d.refobjid = r.oid AND d.refobjsubid = a.attnum AND d.classid = 'pg_constraint'::regclass::oid AND d.objid = c.oid AND c.connamespace = nc.oid AND c.contype = 'c'::char AND r.relkind = 'r'::char AND NOT a.attisdropped ORDER BY nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname UNION ALL SELECT 
nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname FROM pg_namespace nr, pg_class r, pg_attribute a, pg_namespace nc, pg_constraint c, information_schema._pg_keypositions() pos(n) WHERE nr.oid = r.relnamespace AND r.oid = a.attrelid AND nc.oid = c.connamespace AND CASE WHEN c.contype = 'f'::char THEN r.oid = c.confrelid AND c.confkey[pos.n] = a.attnum ELSE r.oid = c.conrelid AND c.conkey[pos.n] = a.attnum END AND NOT a.attisdropped AND (c.contype = 'p'::char OR c.contype = 'u'::char OR c.contype = 'f'::char) AND r.relkind = 'r'::char) x(tblschema, tblname, tblowner, colname, cstrschema, cstrname), pg_user u WHERE x.tblowner = u.usesysid AND u.usename = current_user(); mydb=# *** However, when I use this definition (cut-n-paste style to avoid mistakes) to create the view anew (even with a different name, etc), then it gives me an error: *** mydb=# \e ERROR: parse error at or near ALL at character 1105 ERROR: parse error at or near ALL at character 1105 LINE 6: UNION ALL ^ mydb=# *** I haven't come across this before, and am having the same problem with pgAdmin3 as well, as it supplies the exact same definition of the view. I think I'm doing everything right here, could this be a bug with PG? Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Bug with view definitions?
Dennis Bjorklund wrote: snip I've still not checked any code. I don't even know what part of pg it is that produces that bad SQL. The view itself works, so it must be the pretty printer that is broken (wherever that is hidden away in the code). Thanks Dennis. So, it's definitely a bug then. I wasn't sure if it was PG or me. :) Um, as I'm not up to the task of fixing it, is this something you or someone else would be interested in? Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Adding column comment to information_schema.columns
Christopher Kings-Lynne wrote: There is a huge difference between adhering to a standard and limiting yourself to a standard. The real question is whether PostgreSQL's goal is to support SQL standards, or whether PostgreSQL's goal is to give PostgreSQL users a useful set of tools. There are literally _hundreds_ of fields we could add to the information_schema. Either we add them all or we add none of them. Well, if we add them (and they would be very useful I reckon) should we ensure there's an obvious PG naming thing happening? i.e. pg_column_comment or similar? Maybe not pg_ but you know what I mean. :-) Regards and best wishes, Justin Clift Chris ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Bug with view definitions?
Tom Lane wrote: snip Actually, if you look at the source code (information_schema.sql) there is no ORDER BY in it, only a DISTINCT. The ORDER BY gets added by the parser to help implement the DISTINCT. Sooner or later we should look at suppressing the added ORDER BY when displaying the view. If someone fixes this can we make sure it goes into 7.4.4 as well (if it's not a drastic code change)? It's not a data corrupting bug but it's stopping view definitions from working as advertised which is bad if you're used to being able to rely on them. :-/ For now, I'll personally use the pg_dump version of the query, or maybe see if the one in backend/catalog/information_schema.sql can be run directly. :) Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Adding column comment to information_schema.columns
Greg Sabino Mullane wrote: snip If there is that much clamor for this, why not make a new schema, such as pginformation_schema? People could then tweak the views to their heart's content, while keeping 100% compliance. Doesn't sound very neat. If we add a pginformation_schema, then it'd probably contain all of the existing information_schema... plus more. Duplication? I guess we could just leverage off the existing information_schema views: i.e. CREATE VIEW pg_information_schema.some_view AS SELECT * FROM information_schema.some_view (then add extra bits). But it still doesn't sound very neat. ? Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Adding column comment to information_schema.columns
Hi all, Not sure how worthwhile others will find this small patch (to CVS HEAD), but we found it useful. It adds the column comments to the information_schema.columns view. Hope it's useful. :-) Regards and best wishes, Justin Clift

*** information_schema.sql.orig	2004-07-01 11:59:26.0 +1000
--- information_schema.sql	2004-07-01 12:33:01.0 +1000
***************
*** 442,448 ****
           CAST(null AS cardinal_number) AS maximum_cardinality,
           CAST(a.attnum AS sql_identifier) AS dtd_identifier,
!          CAST('NO' AS character_data) AS is_self_referencing

      FROM (pg_attribute a LEFT JOIN pg_attrdef ad ON attrelid = adrelid AND attnum = adnum),
           pg_class c, pg_namespace nc, pg_user u,
--- 442,450 ----
           CAST(null AS cardinal_number) AS maximum_cardinality,
           CAST(a.attnum AS sql_identifier) AS dtd_identifier,
!          CAST('NO' AS character_data) AS is_self_referencing,
!
!          col_description(a.attrelid, a.attnum) AS column_comment

      FROM (pg_attribute a LEFT JOIN pg_attrdef ad ON attrelid = adrelid AND attnum = adnum),
           pg_class c, pg_namespace nc, pg_user u,

---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] [Fwd: Re: Interesting thought....]
Hi guys, Wasn't sure if this is a valid idea, but Bruce's response made me think it was worth forwarding here to see if anyone had/has considered it before. As everyone is aware, I'm not up for coding it, but am mentioning it Just In Case it's deemed worthy of adding to the TODO list in case someone wants to pick up. :-) Regards and best wishes, Justin Clift Original Message Subject: Re: Interesting thought Date: Mon, 21 Jul 2003 14:13:49 -0400 (EDT) From: Bruce Momjian [EMAIL PROTECTED] To: Justin Clift [EMAIL PROTECTED] This would fix all our inheritance problems, and allow sequences to span multiple tables with an index constraint --- Justin Clift wrote: Hi Bruce, Here's an interesting thought Multi-table indexes... like multi-column, but instead of specifying columns in one table, they specify columns in several tables. That leads into thoughts about multi-table primary and secondary keys and multi-table foreign keys too. Wonder if it'd be useful in the real world? I reckon so. :-) Regards and best wishes, Justin Clift -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Two weeks to feature freeze
The Hermit Hacker wrote: On Thu, 19 Jun 2003, Andrew Dunstan wrote: Maybe a better strategy would be to get a release out soon but not wait 6 months for another release which would contain the Win32 port and the PITR stuff (assuming those aren't done in time for this release). Just a thought. And definitely in agreement here ... I'd rather see a shortened dev cycle prompted by a big feature being added, than delaying a release because "oh oh, I need another few weeks" that draws out when something unexpected happens :( Yep, this makes sense. Looks like it'll be PostgreSQL 7.4 with all the present improvements, but without PITR and Win32. Then, in a few months (hopefully less than 3) we'll have PostgreSQL 8.0, with both of those major features in it (and whatever other enhancements have been added). The only thing that makes me wince is that we have a protocol change at the PostgreSQL 7.4 release instead of 8.0. It kind of doesn't sound right, having a protocol change in the 7 series, when we have an 8 series coming up soon after. Oh well, so it's not perfect... ;-) Regards and best wishes, Justin Clift snip ---(end of broadcast)--- TIP 8: explain analyze is your friend
[HACKERS] [Fwd: [GENERAL] [CYGWIN] Problems compiling PostgreSQL 7.3.3-1 underWin98]
Hi everyone, Can anyone assist Diogo here? He's not some random user, he's our official Portuguese translator and helps us out a lot. Sounds like he really needs a hand. *please* Regards and best wishes, Justin Clift Original Message Subject: [GENERAL] [CYGWIN] Problems compiling PostgreSQL 7.3.3-1 under Win98 Date: Wed, 18 Jun 2003 16:51:55 -0300 From: Diogo de Oliveira Biazus [EMAIL PROTECTED] To: [EMAIL PROTECTED] [EMAIL PROTECTED] Hi everybody, I'm having some problems compiling PostgreSQL 7.3.3-1 under Win98: when gcc is compiling pg_sema.c in the src/backend/port directory, gcc exits with an error. There are lots of warnings about undeclared stuff (like IpcSemaphoreGetValue) and a message at the end saying: Storage size of 'sops' isn't known I'm compiling because I need the libpostgres.a file to build some extensions, is there any other way of doing that? I've already posted this message in pgsql-cygwin, sorry for the cross-post. Please, somebody help me. Thanks in advance, -- Diogo de Oliveira Biazus [EMAIL PROTECTED] Ikono Sistemas e Automação http://www.ikono.com.br ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Translation
Hi, Bingo, count yourself in. :) Is your preference for translation of website related stuff (i.e. http://advocacy.postgresql.org), or existing manuals and documentation, or error messages in the code itself, or ...? (up to you) Translating some of the web materials into Slovak would open up a completely new community to PostgreSQL, if you wouldn't mind doing that. :-) Regards and best wishes, Justin Clift P.M wrote: Hi, Ok so if i can't help you in the code, maybe i can help you translating installation comments into several languages ? I know, french, slovak, spanish.. ;-) __ Do you Yahoo!? SBC Yahoo! DSL - Now only $29.95 per month! http://sbc.yahoo.com ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Compiling Win32
Hi Jan, Do you have an ETA for this? :-) Regards and best wishes, Justin Clift Bruce Momjian wrote: I have not tested the Win32 compile in a few weeks, so it is possible it is broken at this point. It will not run because we need exec() handling that Jan is working on, and signal stuff. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Compiling Win32
Bruce Momjian wrote: Justin Clift wrote: Hi Jan, Do you have an ETA for this? I was going to report the Win32 status this week. As you know, I had to leave Win32 to catch up on email. The two big items left are exec() handling and signals. If I can get those in by July 1, I can continue cleanup during beta and perhaps have a patch that people can test for Win32 later in the summer. Ok, how does that leave our status for the next release of PostgreSQL? :-) Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Compiling Win32
Bruce Momjian wrote: Justin Clift wrote: Bruce Momjian wrote: Justin Clift wrote: Hi Jan, Do you have an ETA for this? I was going to report the Win32 status this week. As you know, I had to leave Win32 to catch up on email. The two big items left are exec() handling and signals. If I can get those in by July 1, I can continue cleanup during beta and perhaps have a patch that people can test for Win32 later in the summer. Ok, how does that leave our status for the next release of PostgreSQL? Wish I knew --- I am realizing that I have trouble doing major development and keeping up with patches/email. Heh heh heh. That's what it was like when I started doing the Techdocs site. Had to shelve my plans for doing coding, purely because 16 hours a day still didn't leave any time for it. :( Worth looking into delegating some of your workload? Regards and best wishes, Justin Clift ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] PostgreSQL under Windows
Hi guys,

Although the Proof of Concept build works, it does have a few drawbacks:

+ It only works on English (US, at least) installations, because it's
  hard-coded to install into C:\Program Files\PostgreSQL. On non-English
  installations, C:\Program Files is named differently, which causes
  things to fail.
+ It's missing the pager program less and the gzip libraries, so using
  pg_dump and psql have problems.
+ It's based on an old version of PostgreSQL (7.3.1, from memory).

All that being said, it should be ok to use, but don't run it in a
production environment. Development and testing environments should be
fine. :-)

Regards and best wishes,

Justin Clift

Robert Treat wrote:
> On Wed, 2003-06-11 at 08:03, Bruno Wolff III wrote:
>> On Tue, Jun 10, 2003 at 22:34:04 -0700, P.M [EMAIL PROTECTED] wrote:
>>> I was thinking that PostgreSQL could help me reduce the cost of such
>>> software, but I would like to know: what is the status of the
>>> PostgreSQL version under Windows? I know that some of you are working
>>> on an installer version of PostgreSQL for Windows, and I would like
>>> to know whether a beta version already exists.
>>
>> There will be a beta native windows port available in about 3 weeks.
>> It is currently possible to run postgresql on windows using cygwin.
>
> If you don't want to wait and aren't big on cygwin, you can get a proof
> of concept build at
> http://sourceforge.net/project/showfiles.php?group_id=9764&release_id=136623
> It's not supported by anyone and I can't even say if it will work for
> you, but it has worked for some in the past and might be a good way to
> get your feet wet. Once you get up and running, be sure to come back
> and help us beta test! :-)
>
> Robert Treat
Re: [HACKERS] Proposal to Re-Order Postgresql.Conf, part II
Tom Lane wrote:
> Josh Berkus [EMAIL PROTECTED] writes:
>> wal_debug is seldom used outside of Postgresql source development or
>> unusual system failures, and should therefore go last.
>
> BTW, it occurs to me that wal_debug is one of the hacker-only variables
> that probably ought not be documented at all. I cannot imagine any use
> for it for the average DBA.
>
> regards, tom lane

Um, not documenting it is probably not a good move for us; however,
putting it at the end in a section marked "Developer Focused" or
something similar would probably send the right mix of messages, i.e.
hands off + not a performance tweak, etc. :-)

Regards and best wishes,

Justin Clift

--
My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there.
- Indira Gandhi
Re: [HACKERS] Wrong version of jdbc in 7.3.3 rpms
Barry Lind wrote:
> Does anyone know why apparently the 7.3beta1 version of the jdbc
> drivers is what is included in the 7.3.3 rpms?

No idea. Just updated the PostgreSQL Release Process document, though,
in case anyone (Marc) ever decides they're going to use it:

http://advocacy.postgresql.org/documents/ReleaseProcess

Regards and best wishes,

Justin Clift

> --Barry
>
> -------- Original Message --------
> Subject: Re: [JDBC] Official JDBC driver release ?
> Date: Thu, 05 Jun 2003 08:14:40 +0200
> From: Thomas Kellerer [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
> Barry Lind schrieb:
>>> I'm a bit puzzled about the versions of the JDBC driver floating
>>> around. I initially downloaded the release for 7.3 from
>>> jdbc.postgresql.org. Now I have seen that the JDBC driver which is
>>> included e.g. in the RPMs for 7.3.3 is a bit older and has a
>>> different name (pg73b1jdbc3.jar vs. pg73jdbc3.jar).
>>
>> The pg73b1jdbc3.jar file is very old (it is the 7.3 beta 1 version).
>> What RPMs are you using? You should contact whoever produced those
>> RPMs to get them built with the current version. The 'official'
>> version is the source code that is tagged with the 7.3.3 freeze label
>> (which is the version that is currently posted on the
>> jdbc.postgresql.org web site).
>>
>> --Barry
>
> Barry, thanks for the answer. The pg73b1jdbc3.jar file is contained in
> all the 7.3.3 rpms available on the ftp mirrors... (ok, not necessarily
> all, but I checked about 3 or 4 different mirrors). I don't know who
> builds the rpms on the mirror sites available from www.postgresql.org.
>
> Cheers,
> Thomas
Re: [HACKERS] 7.3.2 make failed on AIX4.3 using native c compiler
Hi John,

Marco Pratesi [EMAIL PROTECTED] wrote a step-by-step guide for compiling
PostgreSQL on AIX a while ago:

http://techdocs.postgresql.org/guides/CompilingForAIX

Hope that helps. :-)

Regards and best wishes,

Justin Clift

John Liu wrote:
> I configured and built 7.3.2 on the box that works, then tried to
> install it on the other one, which had failed. After the install,
> trying to start postmaster gives (tail -f postmaster.log):
>
>   exec(): 0509-036 Cannot load program /emrxdbs/pgsql/bin/postmaster
>   because of the following errors:
>   Dependent module /usr/local/lib/libz.a(shr.o) could not be loaded.
>   Member shr.o is not found in archive
>
> The libz.a files on the two boxes differ:
>
> working one:
>   ls -all /usr/local/lib/libz.a
>   -rw-r--r-- 1 root system 77308 Mar 20 2000 /usr/local/lib/libz.a
>
> failed one:
>   ls -all /usr/local/lib/libz.a
>   -rwxr-xr-x 1 root system 83699 Feb 19 2001 /usr/local/lib/libz.a
>
> johnl

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of John Liu
> Sent: Monday, March 31, 2003 11:24 AM
> To: Bruce Momjian
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] 7.3.2 make failed on AIX4.3 using native c compiler
>
> Hi, Bruce,
>
> I've tried on two AIX 4.3.3 boxes; both are at the same
> oslevel=4330-09 and have the same compiler version:
>
>   lslpp -l | grep -i xlc
>   xlC.aix43.rte      4.0.2.1  COMMITTED  C Set ++ Runtime for AIX 4.3
>   xlC.cpp            4.3.0.1  COMMITTED  C for AIX Preprocessor
>   xlC.msg.en_US.cpp  4.3.0.1  COMMITTED  C for AIX Preprocessor
>   xlC.msg.en_US.rte  4.0.2.0  COMMITTED  C Set ++ Runtime
>   xlC.rte            4.0.2.0  COMMITTED  C Set ++ Runtime
>
> On one box make works; on the other it fails. I'm trying to figure out
> what makes the difference.
>
> johnl

> -----Original Message-----
> From: Bruce Momjian [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 31, 2003 11:15 AM
> To: John Liu
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] 7.3.2 make failed on AIX4.3 using native c compiler
>
> I know we have other AIX users using PostgreSQL. What compiler version
> is that?
John Liu wrote:
> make[4]: Leaving directory `/emrxdbs/postgresql-7.3.2/src/backend/parser'
> cc -O2 -qmaxmem=16384 -qsrcmsg -qlonglong -I../../../src/interfaces/libpq
>    -I../../../src/include -I/usr/local/include
>    -DBINDIR=\"/emrxdbs/pgsql/bin\" -c -o pg_dump.o pg_dump.c
>
> (the same warning repeats at pg_dump.c lines 2681, 2777, 2795, 3121,
> 3226, 3515 and 3882:)
>
>   1506-280 (W) Function argument assignment between types "const
>   unsigned char*(*)[]" and "unsigned char*(*)[]" is not allowed.
>
> cc -O2 -qmaxmem=16384 -qsrcmsg -qlonglong (same includes as above)
>    -c -o common.o common.c
> cc -O2 -qmaxmem=16384 -qsrcmsg -qlonglong (same includes as above)
>    -c -o pg_backup_archiver.o pg_backup_archiver.c
>
>   590 | ArchiveEntry(Archive *AHX, char *oid, char *tag, ...
>   1506-343 (S) Redeclaration of ArchiveEntry differs from previous
>   declaration on line 135 of pg_backup.h.
>   1506-377 (I) The type "unsigned char*(*)[]" of parameter 7 differs
>   from the previous type "const unsigned char*(*)[]".
>
> make[3]: *** [pg_backup_archiver.o] Error 1
> make[3]: Leaving directory `/emrxdbs/postgresql-7.3.2/src/bin/pg_dump'
> make[2
Re: [HACKERS] Numbering of the next release: 8.0 vs 7.4
mlw wrote:
> <snip>
> So, if the decision is to go with an 8.0, what would you guys say to
> having a roll call about stuff that is possible and practical, and
> really designing PostgreSQL 8.0 as something fundamentally newer than
> 7.x? 8.0 could get the project some hype. It has been 7.x for so many
> years.

Sounds great. I just don't want it to take _ages_ to accomplish. :)

Regards and best wishes,

Justin Clift