Re: [HACKERS] Retain dynamic shared memory segments for postmaster lifetime
Hello, I've managed to reconstruct a Windows build environment and tried to run the previous patch.

- The DSM implementation is divided into a generic part (dsm.c) and a platform-dependent part (dsm_impl.c). This dsm_keep_segment puts WIN32-specific code directly into dsm.c. I suppose it'd be better to define DSM_OP_KEEP_SEGMENT and call dsm_impl_op from dsm_keep_segment, or something similar.

- Repeated calls to dsm_keep_segment, even from different backends, create as many new (orphan) handles as the number of calls. Simply invoking this function in some extension intending to stick segments might result in a large number of orphan handles. Something to curb that situation would be needed.

I think the right way to fix the above two comments is as suggested by Robert.

Fine. I have no objection to that approach.

- Global/PostgreSQL.%u is the same literal constant as the one occurring in dsm_impl_windows. It should be defined as a constant (or a macro).

- dsm_impl_windows uses errcode_for_dynamic_shared_memory() for ereport, and it finally falls down to errcode_for_file_access(). I think that is preferable, maybe.

Okay, I will take care of these in a new version after your verification on Windows.

I apologize in advance for probably silly questions, but I have two problems. The server was crashed by dsm_demo_read in my test environment, but unfortunately the cause is still unclear to me.

| LOG: server process (PID 19440) was terminated by exception 0xC0000005
| DETAIL: Failed process was running: select dsm_demo_read(4294967297);
| HINT: See C include file ntstatus.h for a description of the hexadecimal value.
| LOG: terminating any other active server processes

0xC0000005 is ACCESS_VIOLATION. The crash occurred at aset.c:853:

| /* Try to allocate it */
| block = (AllocBlock) malloc(blksize);

where blksize is 55011304... It would be easier to investigate further if I could step into functions in the dynamically loaded module, but the VC2008 IDE skips over the function body :-( Do you have any idea how I can step into there?
My environment is VC2008 Express / Win7-64. Step-into bounces back at the line of the function definition.

| PG_FUNCTION_INFO_V1(dsm_demo_create);

In contrast, dsm_demo_create doesn't crash and seems to return a sane value.

| =# select dsm_demo_create('hoge', 100);
|  dsm_demo_create
| -----------------
|       4294967297

===

And the other problem is perhaps not an issue of this module, but it happened for me.

| =# create extension dsm_demo;
| ERROR: could not find function dsm_demo_create in file c:/pgsql/lib/dsm_demo.dll

After some investigation, I found that the generated dll file exports the functions with the following names.

| ...\Debug> dumpbin /EXPORTS dsm_demo.dll
| Microsoft (R) COFF/PE Dumper Version 9.00.21022.08
| Copyright (C) Microsoft Corporation. All rights reserved.
(snipped)
| 1 0 0001100A Pg_magic_func = @ILT+5(_Pg_magic_func)
| 2 1 0008 pg_finfo_dsm_demo_create = @ILT+275(_pg_finfo_dsm_demo_create)
| 3 2 000110AF pg_finfo_dsm_demo_read = @ILT+170(_pg_finfo_dsm_demo_read)

Explicitly designating the function name in 'CREATE FUNCTION' in dsm_demo--1.0.sql stops the complaint. I may have done something wrong in setting up the build environment for this dll, but I have no idea what would cause this kind of trouble..

| CREATE FUNCTION dsm_demo_create(pg_catalog.text, pg_catalog.int4 default 0)
| RETURNS pg_catalog.int8 STRICT
| AS 'MODULE_PATHNAME', 'pg_finfo_dsm_demo_create'
| LANGUAGE C;

Do you have any idea about this?

regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
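The refactoring suggested in the review above (a DSM_OP_KEEP_SEGMENT operation dispatched through dsm_impl_op, plus something to curb the orphan handles created by repeated calls) can be sketched in outline. This is a hypothetical Python model of the proposed C structure, not the actual PostgreSQL code: the op name comes from the review comment, and the `kept` flag is an assumed mechanism for making the call idempotent.

```python
from enum import Enum, auto

class DsmOp(Enum):
    """Operations the generic layer (dsm.c) hands to the platform layer."""
    CREATE = auto()
    ATTACH = auto()
    DETACH = auto()
    KEEP_SEGMENT = auto()   # hypothetical new op suggested in the review

class DsmSegment:
    def __init__(self, handle):
        self.handle = handle
        self.kept = False     # guards against duplicating handles on repeat calls

def dsm_impl_op(op, seg):
    """Platform-dependent layer (models dsm_impl.c); returns True on success."""
    if op is DsmOp.KEEP_SEGMENT:
        if seg.kept:          # already pinned: do not create another handle
            return True
        # in the real Windows code this would be where DuplicateHandle()
        # hands the segment to a long-lived process
        seg.kept = True
        return True
    raise NotImplementedError(op)

def dsm_keep_segment(seg):
    """Generic layer (models dsm.c): no WIN32-specific code here, it only
    dispatches to the platform layer."""
    return dsm_impl_op(DsmOp.KEEP_SEGMENT, seg)
```

With this split, repeated calls from any backend are no-ops after the first, addressing the orphan-handle concern.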
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Fri, Jan 31, 2014 at 12:33 AM, Amit Kapila amit.kapil...@gmail.com wrote:
On Thu, Jan 30, 2014 at 12:23 PM, Amit Kapila amit.kapil...@gmail.com wrote:
On Wed, Jan 29, 2014 at 8:13 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:

After basic verification of back-to-pglz-like-delta-encoding-1, I will take the data with both patches and report the same.

I have corrected the problems reported in back-to-pglz-like-delta-encoding-1, removed hindex from pgrb_delta_encoding_v6, and attached new versions of both patches.

I/O Reduction Data
------------------
Non-default settings:
autovacuum = off
checkpoint_segments = 256
checkpoint_timeout = 15min

Observations
1. With both patches, WAL reduction is similar, i.e. ~37% for "one short and one long field, no change" and 12% for "hundred tiny fields, half nulled".
2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for the best case, "one short and one long field, no change".
3. With pgrb_delta_encoding_v7, there is approximately 8~9% overhead for cases where there is no match.
4. With pgrb_delta_encoding_v7, there is approximately 15~18% overhead for the "hundred tiny fields, half nulled" case.
5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar, except for "hundred tiny fields, half nulled", where the CPU overhead is much higher.

The case ("hundred tiny fields, half nulled") where CPU overhead is visible is due to repetitive data; if we take some random or different data, it will not be there.
To verify this theory, I have added one new test which is almost similar to "hundred tiny fields, half nulled"; the difference is that it has a non-repetitive string. The results are as below:

Unpatched
---------
testname | wal_generated | duration
---------+---------------+---------
nine short and one long field, thirty percent change | 698912496 | 12.1819660663605
nine short and one long field, thirty percent change | 698906048 | 11.9409539699554
nine short and one long field, thirty percent change | 698910904 | 11.9367880821228

Patch pgrb_delta_encoding_v7
----------------------------
testname | wal_generated | duration
---------+---------------+---------
nine short and one long field, thirty percent change | 559840840 | 11.6027710437775
nine short and one long field, thirty percent change | 559829440 | 11.8239741325378
nine short and one long field, thirty percent change | 560141352 | 11.6789472103119

Patch back-to-pglz-like-delta-encoding-2
----------------------------------------
testname | wal_generated | duration
---------+---------------+---------
nine short and one long field, thirty percent change | 544391432 | 12.3666560649872
nine short and one long field, thirty percent change | 544378616 | 11.8833730220795
nine short and one long field, thirty percent change | 544376888 | 11.9487581253052
(3 rows)

The basic idea of the new test is that some part of the tuple is unchanged and the other part is changed; here the unchanged part contains a random string rather than a repetitive set of chars. The new test is added with the other tests in the attached file.

Observation
-----------
LZ-like delta encoding has more WAL reduction, and chunk-wise encoding has slightly better CPU usage, but overall both are almost similar. I think the main reason for the overhead is that we store the last offset of matching data at the front of the history bucket, so during a match it has to traverse back many times to find the longest possible match; in the real world it won't be the case that most history entries contain the same hash index, so it should not have an effect.
If we want to improve CPU usage for cases like "hundred tiny fields, half nulled" (which I think is not important), forming the history table by traversing from the end rather than the beginning could serve the purpose. I have not tried it, but I think it can certainly help.

Do you think the overall data is acceptable?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

wal-update-testsuite.sh
Description: Bourne shell script
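The history-table behaviour discussed above — the latest offset stored at the front of each bucket, so a lookup may walk the chain to find the longest possible match, which is exactly what hurts on repetitive data — can be modelled with a small sketch. This is an illustrative Python model of chunk-wise delta encoding, not the pgrb patch itself; the chunk size and output format are arbitrary assumptions.

```python
CHUNK = 4

def build_history(old, from_end=False):
    """Hash table: chunk -> list of offsets. Inserting at the front models
    keeping the latest offset first in each bucket; from_end=True models the
    suggested variant of building the table from the end of the source."""
    hist = {}
    rng = range(len(old) - CHUNK, -1, -1) if from_end else range(len(old) - CHUNK + 1)
    for i in rng:
        hist.setdefault(old[i:i + CHUNK], []).insert(0, i)
    return hist

def delta_encode(old, new):
    """Emit ('copy', off, len) for matches against old, else ('lit', byte)."""
    hist = build_history(old)
    out, i = [], 0
    while i < len(new):
        best_off, best_len = -1, 0
        # walk the whole bucket looking for the longest match; on highly
        # repetitive data these buckets are long, which is the CPU overhead
        for off in hist.get(new[i:i + CHUNK], []):
            l = 0
            while i + l < len(new) and off + l < len(old) and old[off + l] == new[i + l]:
                l += 1
            if l > best_len:
                best_off, best_len = off, l
        if best_len >= CHUNK:
            out.append(('copy', best_off, best_len))
            i += best_len
        else:
            out.append(('lit', new[i]))
            i += 1
    return out
```

On random data each bucket holds few offsets, so the inner loop is cheap — matching the observation that the overhead mostly disappears with non-repetitive strings.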
Re: [HACKERS] UNION ALL on partitioned tables won't use indices.
Hello,

No hurry. Thanks. The attached two patches are rebased onto the current 9.4dev HEAD, and make check at the topmost directory and in src/test/isolation passes without error. One bug was found and fixed along the way. It was an assertion failure caused by a probably unexpected type conversion introduced by collapse_appendrels(), which leads an implicit whole-row cast from some valid reltype to an invalid reltype (0) into adjust_appendrel_attrs_mutator().

What query demonstrated this bug in the previous type 2/3 patches? I would still like to know the answer to the above question.

Ouch! Sorry, but I couldn't reproduce the bug just now; please wait for a while.

It's true that each AppendRelInfo is initially created that way, but they need not remain so simple. You can assume ctx->child_appinfo->translated_vars contains only these Var nodes, but parent_appinfo->translated_vars may contain arbitrary expressions. (I think the pullup_replace_vars() call at prepjointree.c:954 is where an AppendRelInfo can get non-Var expressions.)

That in itself is not a problem. My patch doesn't replace the parent's sole top-level Var with the corresponding node in the child; it replaces any Var node in the parent's expressions in translated_vars with the corresponding node in the child. So expressions in the FROM clauses of a union's operand queries can adequately modify expressions made in prepjointree. A query like the following returns the correct result with this patch. As I mentioned before, I think the stuff other than Var pullup would be done in adjust_appendrel_attrs separately.

| SELECT (a + 1) as x, (b * 3) as y FROM p1
| UNION ALL
| SELECT (a * 2), (b / 5) FROM p2 ORDER BY x LIMIT 10;

So all we should do to collapse nested appendrels is simply connect each RTE directly to the root (most distant ancestor?) RTE in the relationship, resolving Vars as above, as seen in the patched expand_inherited_rtentry.

# If translated_vars are always generated in the way shown above,
# using a tree walker might be of no use..
This can be done apart from all the other stuff compensating for translation skew done in adjust_appendrel_attrs. I believe they are orthogonal.

Here is a test case that provokes an assertion failure under the patch; I believe the cause is oversimplification in transvars_merge_mutator():

create table parent (a int, b int);
create table child () inherits (parent);
select parent from parent union all select parent from parent;

Wow. OK, I'm pretty convinced that you're right. I'll dig into it.

While poking at that test case, I noticed that the following test emits three rows on HEAD and wrongly emits four rows with the patch:

create table parent (a int, b int);
create table child () inherits (parent);
insert into parent values (1,10);
insert into child values (2,20);
select a, b from parent union all select a, b from child;

Mmm. I had the same result. Please let me have a bit more time.

regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] [bug fix] psql \copy doesn't end if backend is killed
On 20 December 2013 19:43, MauMau wrote:

[Problem]
If the backend is terminated with SIGKILL while psql is running "\copy table_name from file_name", the \copy never ends. I expected \copy to be cancelled because the corresponding server process vanished.

[Cause]
psql could not get out of the loop below in handleCopyIn():

while (res = PQgetResult(conn), PQresultStatus(res) == PGRES_COPY_IN)
{
    OK = false;
    PQclear(res);
    PQputCopyEnd(pset.db, _("trying to exit copy mode"));
}

1. The patch applied to git head successfully.
2. The problem can occur in some scenarios, and the fix looks fine to me.
3. For testing, I think it's not possible to directly generate such a scenario, so I have verified it by debugging with the steps explained below.
   1. Make pqsecure_write return fewer bytes (by updating the result while debugging in gdb in pqSendSome; also make sure that the remaining bytes, conn->outCount - sent, are >= 8192, so that in the next step pqPutMsgEnd, called from PQputCopyEnd, goes on to flush the data).
   2. Then kill the backend process before it calls pqReadData.

The scenario is reproduced without the patch, and after applying the patch the issue is resolved. Is there any direct scenario by which it can be reproduced?

Regards,
Dilip
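A minimal model of why the loop above spins forever, and the shape of a fix, under my own assumption (not a quote of the actual patch) that the cure is to leave the loop once the connection is known to be bad or the exit message can no longer be sent. The class and function names are invented for illustration; the real code is C in psql's copy.c.

```python
PGRES_COPY_IN = "PGRES_COPY_IN"
CONNECTION_BAD = "CONNECTION_BAD"

class DeadConn:
    """Models a connection whose backend was SIGKILLed: the client-side COPY
    state never clears, so PQgetResult keeps reporting COPY_IN forever."""
    status = CONNECTION_BAD

    def get_result_status(self):
        return PGRES_COPY_IN      # stuck: backend is gone, state never changes

    def put_copy_end(self, msg):
        return -1                 # the send fails; the old loop ignored this

def drain_copy_in(conn, max_iters=1000):
    """Sketch of a fixed loop: bail out when PQputCopyEnd fails or the
    connection is bad, instead of spinning forever. Returns iterations run."""
    iters = 0
    while conn.get_result_status() == PGRES_COPY_IN:
        iters += 1
        if conn.put_copy_end("trying to exit copy mode") < 0 or conn.status == CONNECTION_BAD:
            break                 # without this check, this loop never ends
        if iters >= max_iters:    # safety net for the model only
            break
    return iters
```

Against a dead connection the model exits on the first iteration, whereas the original loop's only exit condition (a non-COPY_IN result) can never become true.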
Re: [HACKERS] WIP patch (v2) for updatable security barrier views
On 01/31/2014 05:09 PM, Dean Rasheed wrote:

I don't like this fix --- you appear to be adding another RTE to the rangetable (one not in the FROM list) and applying the rowmarks to it, which seems wrong because you're not locking the right set of rows. This is reflected in the change to the regression test output where, in one of the tests, the ctids for the table to update are no longer coming from the same table. I think a better approach is to push down the rowmark into the subquery so that any locking required applies to the pushed-down RTE --- see the attached patch.

Thank you. I wasn't able to detect any changes in behaviour caused by the original change other than the table alias change due to the additional RTE. The new RTE referred to the same Relation, and there didn't seem to be any problems caused by that. Nonetheless, your fix seems a lot cleaner, especially in the target-view case. I've updated the commitfest app to show this patch.

Anyway, please test whether this works with your RLS code.

It does. Thank you. I'm still working on the other issues I outlined yesterday, but that's for the other thread.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication
On 2014-01-30 12:27:43 -0500, Robert Haas wrote:

Nope, but I think this patch is broken. It looks to me like it's conflating the process offset in the BackendStatus array with its backendId, which does not seem like a good idea even if it happens to work at present.

Hm. I don't see how that's going to be broken without major surgery in pgstat.c. The whole thing seems to rely on being able to index BackendStatusArray with MyBackendId?

And the way BackendIdGetProc() is used looks unsafe, too: the contents might no longer be valid by the time we read them. I suspect we should have a new accessor function that takes a backend ID and copies the xid and xmin to pointers provided by the client while holding the lock. I also note that the docs seem to need some copy-editing.

It certainly needs to be documented as racy, but I don't see a big problem with being racy here. We assume in lots of places that writing/reading xids is atomic, and we don't even hold exclusive locks while writing... (And yes, that means that the xid and xmin don't necessarily belong to each other.) That said, encapsulating that racy access in an accessor function does sound like a good plan.

Greetings,

Andres Freund
-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
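The accessor suggested above — copy xid and xmin out to the caller while holding the lock, so at least the pair is mutually consistent — might look like this in outline. This is a Python model with a mutex standing in for whatever synchronization the real pgstat code uses; the names are hypothetical, not PostgreSQL's.

```python
import threading

class BackendStatus:
    """Models one per-backend status slot; xid and xmin can be rewritten
    concurrently by the owning backend."""
    def __init__(self):
        self.lock = threading.Lock()
        self.xid = 0
        self.xmin = 0

    def update(self, xid, xmin):
        # the owning backend publishes a consistent pair under the lock
        with self.lock:
            self.xid, self.xmin = xid, xmin

def backend_get_xid_xmin(slot):
    """Accessor copying both values out while holding the lock, so a reader
    never observes an xid from one update paired with an xmin from another
    (the raw field reads would be racy in exactly that way)."""
    with slot.lock:
        return slot.xid, slot.xmin
```

Whether the copy is worth a lock at all is the point debated above; the sketch only shows what the encapsulated accessor would look like.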
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
Hi,

On 2014-01-28 21:27:29 -0800, Peter Geoghegan wrote:
On Fri, Nov 15, 2013 at 11:47 AM, Andres Freund and...@2ndquadrant.com wrote:

1) I've added an abstracted atomic ops implementation. Needs a fair amount of work, also submitted as a separate CF entry. (Patch 1 & 2)

Commit 220b34331f77effdb46798ddd7cca0cffc1b2858 caused bitrot when applying 0002-Very-basic-atomic-ops-implementation.patch. Please rebase.

I've pushed a rebased version of the patchset to http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git branch "rwlock contention". 220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small problem; ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA.

I plan to split the atomics patch into smaller chunks before reposting. Imo the "Convert the PGPROC->lwWaitLink list into a dlist instead of open coding it" patch is worth applying independently from the rest of the series; it simplifies the code and fixes a bug...

Greetings,

Andres Freund
-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] pg_basebackup: progress report max once per second
On Tue, Jan 21, 2014 at 7:17 AM, Oskari Saarenmaa o...@ohmu.fi wrote:
18.11.2013 07:53, Sawada Masahiko wrote:
On 13 Nov 2013, at 20:51, Mika Eloranta m...@ohmu.fi wrote:

Prevent excessive progress reporting that can grow to gigabytes of output with large databases.

I got an error with the following scenario.

$ initdb -D data -E UTF8 --no-locale
/* set the replication parameters */
$ pg_basebackup -D 2data
Floating point exception
LOG: could not send data to client: Broken pipe
ERROR: base backup could not send data, aborting backup
FATAL: connection to client lost

Attached is a rebased patch with a fix for the division-by-zero error plus some code style issues. I also moved the patch to the current commitfest.

Thank you for updating the patch! I have done a quick review of it. It compiled without errors, and the patch works fine. I have one question: what is the purpose of the progress_report() calls you added at the end of ReceiveUnpackTarFile() and ReceiveTarFile()? I could not understand the necessity of this code. The same goes for the force argument of progress_report(). If you would like to use the 'force' option, I think you should add a comment about it to the source code.

Regards,

---
Sawada Masahiko
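One plausible reading of the force argument being asked about: progress_report() throttles output to at most once per second, and force bypasses the throttle so a final call at the end of the transfer always prints the last line even if less than a second has passed. This is my guess at the intent, sketched in Python; it is not the patch's actual C code, and it also folds in a guard for the division-by-zero error mentioned above.

```python
import time

class ProgressReporter:
    """Rate-limited progress reporting: at most one line per second,
    except that force=True always reports."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.last = float("-inf")   # so the very first report always prints
        self.lines = []

    def report(self, done, total, force=False):
        now = self.clock()
        if not force and now - self.last < 1.0:
            return False            # throttled: nothing printed
        self.last = now
        pct = 0 if total == 0 else done * 100 // total  # guard against /0
        self.lines.append(f"{done}/{total} ({pct}%)")
        return True
```

Under this reading, a forced call after the receive loop guarantees the 100% line appears, which would explain the calls added at the end of the tar-receiving functions.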
Re: [HACKERS] jsonb and nested hstore
Hmm, neither me, nor Teodor have experience and knowledge with populate_record() and moreover hstore here is virgin and we don't know the right behaviour, so I think we better take it from jsonb, once Andrew realize it. Andrew ? On Fri, Jan 31, 2014 at 4:52 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/30/2014 07:21 PM, Merlin Moncure wrote: Something seems off: postgres=# create type z as (a int, b int[]); CREATE TYPE postgres=# create type y as (a int, b z[]); CREATE TYPE postgres=# create type x as (a int, b y[]); CREATE TYPE -- test a complicated construction postgres=# select row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x; row - (1,{(1,\\{(1,{1,2})}\\)}) postgres=# select hstore(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x); hstore -- a=1, b={\(1,\\\{\\(1,\\{1,2}\\)\\}\\\)\} here, the output escaping has leaked into the internal array structures. istm we should have a json expressing the internal structure. What has this to do with json at all? It's clearly a failure in the hstore() function. It does (weirdly) map back however: postgres=# select populate_record(null::x, hstore(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)); populate_record - (1,{(1,\\{(1,{1,2})}\\)}) OTOH, if I go via json route: postgres=# select row_to_json(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x); row_to_json --- {a:1,b:[{a:1,b:[{a:1,b:[1,2]}]}]} so far, so good. let's push to hstore: postgres=# select row_to_json(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore; row_to_json --- a=1, b=[{a=1, b=[{a=1, b=[1, 2]}]}] this ISTM is the 'right' behavior. but what if we bring it back to record object? postgres=# select populate_record(null::x, row_to_json(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore); ERROR: malformed array literal: {{a=1, b={{a=1, b={1, 2} yikes. 
The situation as I read it is that (notwithstanding my comments upthread) there is no clean way to slide rowtypes to/from hstore and jsonb while preserving structure.

IMO, the above query should work, and the populate_record function above should return the internally structured row object, not the text-escaped version. And this is a failure in populate_record().

I think we possibly need to say that handling of nested composites and arrays is an area that needs further work. OTOH, the refusal of json_populate_record() and json_populate_recordset() to handle these in 9.3 has not generated a flood of complaints, so I don't think it's a tragedy, just a limitation, which should be documented if it's not already. (And of course hstore hasn't handled nested anything before now.) Meanwhile, maybe Teodor can fix the two hstore bugs shown here.

cheers

andrew
Re: Fwd: [HACKERS] Proposal: variant of regclass
There are duplicate oids in pg_proc.h:

make[3]: Entering directory `/tmp/git-pg/src/backend/catalog'
cd ../../../src/include/catalog && '/usr/bin/X11/perl' ./duplicate_oids
3180
3195
3196
3197

- There is a whitespace diff in the regoperatorin and regprocedurein() definitions.

Other than this, there are no more issues from my side. I have only checked the part of the patch that was modified as per *my* review comments. I will leave the rest of the review for the other reviewer.

On 28 January 2014 14:08, Yugo Nagata nag...@sraoss.co.jp wrote:

Hi Amit and Marti,

I revised the patch. Could you please review it? The fixes include:
- Fix the *_guts() function definitions to return Oid instead of Datum
- Fix the to_regproc() definition in pg_proc.h
- Fix some indentation
- Add a regression test
- Fix to use missing_ok instead of raiseError
- Merge parseTypeString
- Remove *_guts() and *MissingOk() and merge each into one function, for parseTypeString and typenameTypeIdAndMod
- Fix to not raise an error even when the schema name doesn't exist

The last point is new compared to the previous patch. Previously, specifying a wrong schema name raised an error like:

=# SELECT to_regproc('ng_catalog.now');
ERROR: schema "ng_catalog" does not exist

On Fri, 24 Jan 2014 12:35:27 +0900 Yugo Nagata nag...@sraoss.co.jp wrote:
On Thu, 23 Jan 2014 13:19:37 +0200 Marti Raudsepp ma...@juffo.org wrote:

Resending to Tatsuo Ishii and Yugo Nagata; your email server was having problems yesterday:

Thanks for resending!

This is the mail system at host sraigw2.sra.co.jp.
yug...@sranhm.sra.co.jp: mail for srasce.sra.co.jp loops back to myself
t-is...@sra.co.jp: mail for srasce.sra.co.jp loops back to myself

---------- Forwarded message ----------
From: Marti Raudsepp ma...@juffo.org
Date: Thu, Jan 23, 2014 at 3:39 AM
Subject: Re: [HACKERS] Proposal: variant of regclass
To: Yugo Nagata nag...@sraoss.co.jp
Cc: Tatsuo Ishii is...@postgresql.org, pgsql-hackers pgsql-hackers@postgresql.org, Vik Fearing vik.fear...@dalibo.com, Robert Haas robertmh...@gmail.com, Tom Lane t...@sss.pgh.pa.us, Pavel Golub pa...@gf.microolap.com, Pavel Golub pa...@microolap.com, Andres Freund and...@2ndquadrant.com, Pavel Stěhule pavel.steh...@gmail.com

On Wed, Jan 22, 2014 at 1:44 PM, Yugo Nagata nag...@sraoss.co.jp wrote:
On Wed, 22 Jan 2014 20:04:12 +0900 (JST) Tatsuo Ishii is...@postgresql.org wrote:

parseTypeString() is called by some other functions, and I avoided influencing them by modifying the definition, since this should raise errors in most cases. The same reason applies to the other *MissingOk functions in parse_type.c. Is it better to rewrite the definitions of these functions and all their callers?

Yes, for parseTypeString certainly. There have been many refactorings like that in the past, and all of them use this pattern.

OK. I'll rewrite the definition and their callers.

typenameTypeIdAndMod is less clear since the code paths differ so much; maybe keep 2 versions (merging back to 1 function is OK too, but in any case you don't need 3).

I'll also fix this either way, to not use typenameTypeIdAndMod_guts.

typenameTypeIdAndModMissingOk(...)
{
    Type tup = LookupTypeName(pstate, typeName, typmod_p);

    if (tup == NULL || !((Form_pg_type) GETSTRUCT(tup))->typisdefined)
        *typeid_p = InvalidOid;
    else
        *typeid_p = HeapTupleGetOid(tup);

    if (tup)
        ReleaseSysCache(tup);
}

typenameTypeIdAndMod(...)
{
    Type tup = typenameType(pstate, typeName, typmod_p);

    *typeid_p = HeapTupleGetOid(tup);
    ReleaseSysCache(tup);
}

Also, there's no need for an else here:

if (raiseError)
    ereport(ERROR, ...);
else
    return InvalidOid;

Regards,
Marti

-- 
Yugo Nagata nag...@sraoss.co.jp

-- 
Yugo Nagata nag...@sraoss.co.jp
Re: [HACKERS] [bug fix or improvement?] Correctly place DLLs for ECPG apps in bin folder
Hi MauMau,

I haven't completely tested all aspects of the submitted patch yet, but I would like to share my findings so far. Looking at the patch, I feel there is room for improvement: instead of moving the related dlls from the lib directory to the bin directory late in the installation process, copy these files directly from the release/debug directory earlier (PFA patch).

It seems unfortunate that Windows doesn't have RPATH or similar logic implemented in the OS. Alternate methods seem inappropriate; the only feasible option seems to be placing the dependency dlls in the same directory as the executable. At first one may think of creating a symbolic link with a relative path in the bin directory, e.g. libpq.dll -> ..\lib\libpq.dll, but unfortunately a normal user requires special permissions to create symbolic links; otherwise it could have helped. There are also the SetDllDirectory and AddDllDirectory functions, which affect only subsequent calls to LoadLibrary and LoadLibraryEx.

I will look into this patch further and let you know my further findings. Thanks.

Regards,
Muhammad Asif Naeem

On Wed, Dec 4, 2013 at 5:07 PM, MauMau maumau...@gmail.com wrote:
From: MauMau maumau...@gmail.com

In addition, I'll remove libpq.dll from the lib folder unless somebody objects. Currently, libpq.dll is placed in both bin and lib. I guess libpq.dll was left in lib because it was considered necessary for ECPG DLLs.

The attached patch also removes libpq.dll from the lib folder. I don't mind whether this patch or yesterday's one is adopted. I'll add this to the 2014-1 commitfest this weekend if the patch is not committed before then.

Regards
MauMau

Win_lib_bin.patch
Description: Binary data
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On Sun, Jan 26, 2014 at 5:45 PM, Andres Freund and...@2ndquadrant.com wrote:

We're also seeing log entries about "wal contains reference to invalid pages", but these errors seem only vaguely correlated. Sometimes we get the errors but the tables don't grow noticeably, and sometimes we don't get the errors and the tables are much larger.

Uhm. I am a bit confused. You see those in the standby's log? At !debug log levels? That'd imply that the standby is dead and needs to be recloned, no? How do you continue after that?

So in chatting with Heikki last night we came up with a scenario where this check is insufficient. If you have multiple checkpoints during the base backup then there will be restartpoints during recovery. If the reference to the invalid page is before the restartpoint, then after a crash and coming back up, recovery will go forward fine.

Fixing this check doesn't look trivial. I'm inclined to say we should suppress any restartpoints while there are references to invalid pages in the hash. The problem with this is that it will prevent trimming the xlog during recovery. It seems frightening that most days recovery will take little extra space, but if you happen to have a drop table or truncate during the base backup then your recovery might require a lot of extra space.

The alternative of spilling the hash table to disk at every restartpoint seems kind of hokey. Then we need to worry about fsyncing this file, cleaning it up, dealing with the file after crashes, etc.

-- 
greg
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 11:09:14 +0000, Greg Stark wrote:
On Sun, Jan 26, 2014 at 5:45 PM, Andres Freund and...@2ndquadrant.com wrote:

We're also seeing log entries about "wal contains reference to invalid pages", but these errors seem only vaguely correlated. Sometimes we get the errors but the tables don't grow noticeably, and sometimes we don't get the errors and the tables are much larger.

Uhm. I am a bit confused. You see those in the standby's log? At !debug log levels? That'd imply that the standby is dead and needs to be recloned, no? How do you continue after that?

So in chatting with Heikki last night we came up with a scenario where this check is insufficient.

But that seems unrelated to the issue at hand, right?

If you have multiple checkpoints during the base backup then there will be restartpoints during recovery. If the reference to the invalid page is before the restartpoint, then after a crash and coming back up, recovery will go forward fine.

We don't perform restartpoints if there are invalid pages registered. Check the XLogHaveInvalidPages() call in xlog.c.

Greetings,

Andres Freund
-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 11:09:14 +0000, Greg Stark wrote:
On Sun, Jan 26, 2014 at 5:45 PM, Andres Freund and...@2ndquadrant.com wrote:

We're also seeing log entries about "wal contains reference to invalid pages", but these errors seem only vaguely correlated. Sometimes we get the errors but the tables don't grow noticeably, and sometimes we don't get the errors and the tables are much larger.

Uhm. I am a bit confused. You see those in the standby's log? At !debug log levels? That'd imply that the standby is dead and needs to be recloned, no? How do you continue after that?

So in chatting with Heikki last night we came up with a scenario where this check is insufficient.

The slightly more likely explanation for transient errors is that you hit the vacuum bug (061b079f89800929a863a692b952207cadf15886). That only took effect if HS had already assembled a snapshot, which can make such an error vanish after a restart...

Greetings,

Andres Freund
-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On Fri, Jan 31, 2014 at 11:26 AM, Andres Freund and...@2ndquadrant.com wrote:

The slightly more likely explanation for transient errors is that you hit the vacuum bug (061b079f89800929a863a692b952207cadf15886). That only took effect if HS had already assembled a snapshot, which can make such an error vanish after a restart...

Which one? There seem to be several.

So this seems like it's more likely to be a symptom of whatever is causing the table to grow than a cause? That is, there's some bug causing the standby to extend the btree dramatically, resulting in lots of uninitialized pages, and touching those pages triggers this bug. But this doesn't explain why the btree is being extended, I don't think.

-- 
greg
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 11:46:09 +, Greg Stark wrote: On Fri, Jan 31, 2014 at 11:26 AM, Andres Freund and...@2ndquadrant.com wrote: The slightly more likely explanation for transient errors is that you hit the vacuum bug (061b079f89800929a863a692b952207cadf15886). That had only taken effect if HS has already assembled a snapshot, which can make such an error vanish after a restart... Which one, there seem to be several So this seems like it's more likely to be a symptom of whatever is causing the table to grow than a cause? That is, there's some bug causing the standby to extend the btree dramatically resulting in lots of uninitialized pages and touching those pages triggers this bug. But this doesn't explain why the btree is being extended I don't think. I don't think anything we've talked about so far is likely to explain the issue. I don't have time atm to look closer, but what I'd do is try to look if there are any pages with valid LSNs on the standby in the bloated area... That then might give you a hint where to look. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup and pg_stat_tmp directory
On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote: Hi, The files in the pg_stat_tmp directory don't need to be backed up because they are basically reset at archive recovery. So I think it's worth changing pg_basebackup so that it skips any files in the pg_stat_tmp directory. Thoughts? I think this is a good idea, but can't it also avoid PGSTAT_STAT_PERMANENT_TMPFILE along with the temp files in pg_stat_tmp? All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE refers to just the global one. You want to exclude based on PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc stats_temp_directory if it's in PGDATA). The attached patch changes basebackup.c so that it skips all files in both pg_stat_tmp and stats_temp_directory. Even when a user sets stats_temp_directory to a directory other than pg_stat_tmp, we need to skip the files in pg_stat_tmp, because, per a recent change to pg_stat_statements, the external query file is always created there. Regards, -- Fujii Masao

*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***************
*** 68,73 ****
--- 68,74 ----
  #include "parser/analyze.h"
  #include "parser/parsetree.h"
  #include "parser/scanner.h"
+ #include "pgstat.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
  #include "storage/spin.h"
***************
*** 89,95 ****
  PG_MODULE_MAGIC;

   * race conditions. Besides, we only expect modest, infrequent I/O for query
   * strings, so placing the file on a faster filesystem is not compelling.
   */
! #define PGSS_TEXT_FILE	"pg_stat_tmp/pgss_query_texts.stat"

  /* Magic number identifying the stats file format */
  static const uint32 PGSS_FILE_HEADER = 0x20140125;
--- 90,96 ----
   * race conditions. Besides, we only expect modest, infrequent I/O for query
   * strings, so placing the file on a faster filesystem is not compelling.
   */
! #define PGSS_TEXT_FILE	PG_STAT_TMP_DIR "/pgss_query_texts.stat"

  /* Magic number identifying the stats file format */
  static const uint32 PGSS_FILE_HEADER = 0x20140125;
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 25,30 ****
--- 25,31 ----
  #include "libpq/pqformat.h"
  #include "miscadmin.h"
  #include "nodes/pg_list.h"
+ #include "pgstat.h"
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
  #include "replication/walsender_private.h"
***************
*** 63,68 ****
--- 64,72 ----
  static int	compareWalFileNames(const void *a, const void *b);

  /* Was the backup currently in-progress initiated in recovery mode? */
  static bool backup_started_in_recovery = false;

+ /* Relative path of temporary statistics directory */
+ static char *statrelpath = NULL;
+
  /*
   * Size of each block sent into the tar stream for larger files.
   */
***************
*** 111,116 **** perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
--- 115,132 ----
  			labelfile);
  	SendXlogRecPtrResult(startptr, starttli);

+ 	/*
+ 	 * Calculate the relative path of temporary statistics directory
+ 	 * in order to skip the files which are located in that directory later.
+ 	 */
+ 	if (is_absolute_path(pgstat_stat_directory) &&
+ 		strncmp(pgstat_stat_directory, DataDir, datadirpathlen) == 0)
+ 		statrelpath = psprintf("./%s", pgstat_stat_directory + datadirpathlen + 1);
+ 	else if (strncmp(pgstat_stat_directory, "./", 2) != 0)
+ 		statrelpath = psprintf("./%s", pgstat_stat_directory);
+ 	else
+ 		statrelpath = pgstat_stat_directory;
+
  	PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, (Datum) 0);
  	{
  		List	   *tablespaces = NIL;
***************
*** 838,844 **** sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
  				sizeof(PG_AUTOCONF_FILENAME) + 4) == 0)
  			continue;

-
  		/*
  		 * If there's a backup_label file, it belongs to a backup started by
  		 * the user with pg_start_backup(). It is *not* correct for this
--- 854,859 ----
***************
*** 888,893 **** sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
--- 903,922 ----
  		}

  		/*
+ 		 * Skip temporary statistics files. PG_STAT_TMP_DIR must be skipped
+ 		 * even when stats_temp_directory is set because PGSS_TEXT_FILE is
+ 		 * always created there.
+ 		 */
+ 		if ((statrelpath != NULL && strcmp(pathbuf, statrelpath) == 0) ||
+ 			strncmp(de->d_name, PG_STAT_TMP_DIR, strlen(PG_STAT_TMP_DIR)) == 0)
+ 		{
+ 			if (!sizeonly)
+ 				_tarWriteHeader(pathbuf + basepathlen + 1, NULL, statbuf);
+ 			size += 512;
+ 			continue;
+ 		}
+
+ 		/*
  		 * We can skip pg_xlog, the WAL segments need to be fetched from the
  		 * WAL archive anyway. But include it as an empty directory anyway, so
  		 * we get permissions right.
*** a/src/backend/utils/misc/guc.c ---
Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication
Hi, I suspect we should have a new accessor function that takes a backend ID and copies the xid and xmin to pointers provided by the client while holding the lock. What do you think about the approach the attached patch implements? I'm not really sure if this is what you had in mind, especially if this is the right lock. I also note that the docs seem to need some copy-editing:

+       <entry>The current <xref linkend="ddl-system-columns">xmin value.</xref></entry>

Can you elaborate? Best regards, -- Christian Kruse http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services pgp70zYmW2633.pgp Description: PGP signature
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
2014-01-31 Peter Geoghegan p...@heroku.com: On Thu, Jan 30, 2014 at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: In reality, actual applications could hardly be further from the perfectly uniform distribution of distinct queries presented here. Yeah, I made the same point in different words. I think any realistic comparison of this code to what we had before needs to measure a workload with a more plausible query frequency distribution. Even though that distribution just doesn't square with anybody's reality, you can still increase the pg_stat_statements.max setting to 10k and the problem goes away at little cost (a lower setting is better, but a setting high enough to cache everything is best). But you're not going to have terribly much use for pg_stat_statements anyway; if you really do experience churn at that rate with 5,000 possible entries, the module is ipso facto useless, and should be disabled.

I ran an extra test of your patch and mine with pg_stat_statements.max set to 10k, otherwise on the same settings and servers. They are faster than the past results.

method   | try1  | try2  | try3
peter 3  | 6.769 | 6.784 | 6.785
method 5 | 6.770 | 6.774 | 6.761

I think the most significant overhead in pg_stat_statements is the cost of the delete and insert when updating the hash table, not the LWLocks. If LWLocks were the biggest overhead, we would see it with pgbench -S, because that runs a single select pattern, which is the worst lock-conflict case. But we don't see such a result. I'm not sure about dynahash.c, but we can see the hash-collision handling in this code. IMHO it might be heavy, because it has to walk the collision list and compare entries until it finds one that doesn't conflict. And the past results show that your patch's weakest point is that the cost of deleting the oldest statement and inserting a new one is very high, as you know. It amplifies the update (delete and insert) cost on the pg_stat_statements table. So you proposed a new default max setting of 10k.
But that is not an essential solution, because the old pg_stat_statements also performs well there. And when we set max=10k with your patch and want to get only the 1000 most-used queries from pg_stat_statements, we have to run an ORDER BY query with LIMIT 1000. The sort cost is relatively high, so the monitoring query will be slow and expensive. With the old code we can simply set pg_stat_statements.max=1000, and performance is not bad. That would be the best setting for getting information about the 1000 most-used queries. That's all my assumption. Sorry for running only a few extra tests; I had no time in my office today. If you like, I can run a 1/N-distribution pgbench test next week; I'll modify my perl script a little to create multiple sql files with various sleep times. Regards, -- Mitsumasa KONDO NTT Open Source Software Center
Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory
From: Christian Kruse christ...@2ndquadrant.com personally I really dislike constructs like you used: Thanks for reviewing the patch. Fixed. I'll add this revised patch to the CommitFest entry soon. Regards MauMau config_dir_win_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup and pg_stat_tmp directory
2014-01-31 Fujii Masao masao.fu...@gmail.com On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote: Hi, The files in pg_stat_tmp directory don't need to be backed up because they are basically reset at the archive recovery. So I think it's worth changing pg_basebackup so that it skips any files in pg_stat_tmp directory. Thought? I think this is good idea, but can't it also avoid PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in pg_stat_tmp All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE refers to just the global one. You want to exclude based on PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc stats_temp_directory if it's in PGDATA. Attached patch changes basebackup.c so that it skips all files in both pg_stat_tmp and stats_temp_directory. Even when a user sets stats_temp_directory to the directory other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because, per recent change of pg_stat_statements, the external query file is always created there. +1. And, I'd like to also skip pg_log directory because security reason. If you have time and get community agreed, could you create these patch after committed your patch? I don't want to bother you. Regards, -- Mitsumasa KONDO NTT Open Source Software Center
Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog
On Thu, Jan 30, 2014 at 9:37 AM, Haribabu Kommi kommi.harib...@gmail.com wrote: On Tue, Jan 28, 2014 at 1:17 PM, Peter Eisentraut wrote: On 11/30/13, 6:59 AM, Haribabu kommi wrote: To detect whether the provided data and xlog directories are the same or not, I reused the existing make_absolute_path() code as follows. I note that initdb does not detect whether the data and xlog directories are the same. I think there is no point in addressing this only in pg_basebackup. If we want to forbid it, it should be done in initdb foremost. Thanks for pointing it out. If the following approach is fine for identifying identical directories then I will do the same for initdb also. I'm feeling the same way as Peter. And ISTM that it would need a lot of changes to the source even though its benefit is small. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup and pg_stat_tmp directory
On Fri, Jan 31, 2014 at 10:18 PM, Mitsumasa KONDO kondo.mitsum...@gmail.com wrote: 2014-01-31 Fujii Masao masao.fu...@gmail.com On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote: Hi, The files in pg_stat_tmp directory don't need to be backed up because they are basically reset at the archive recovery. So I think it's worth changing pg_basebackup so that it skips any files in pg_stat_tmp directory. Thought? I think this is good idea, but can't it also avoid PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in pg_stat_tmp All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE refers to just the global one. You want to exclude based on PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc stats_temp_directory if it's in PGDATA. Attached patch changes basebackup.c so that it skips all files in both pg_stat_tmp and stats_temp_directory. Even when a user sets stats_temp_directory to the directory other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because, per recent change of pg_stat_statements, the external query file is always created there. +1. And, I'd like to also skip pg_log directory because security reason. Yeah, I was thinking that, too. I'm not sure whether including log files in backup really increases the security risk, though. There are already very important data, i.e., database, in backups. Anyway, since the amount of log files can be very large and they are not essential for recovery, it's worth considering whether to exclude them. OTOH, I'm sure that some users prefer current behavior for some reasons. So I think that it's better to expose the pg_basebackup option specifying whether log files are included in backups or not. 
Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing currentTransactionWALVolume
On Wed, Jan 15, 2014 at 2:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Short patch to expose a function GetCurrentTransactionWALVolume() that gives the total number of bytes written to WAL by current transaction. Could you tell me the use case of this function? ISTM that it's less useful in the real world because it reports the information of WAL generated only in my own current transaction. For example, the monitoring tool cannot see that information because it's the different session from the target. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov obartu...@gmail.com wrote: Hmm, neither me nor Teodor have experience and knowledge with populate_record(), and moreover hstore here is virgin territory and we don't know the right behaviour, so I think we'd better take it from jsonb, once Andrew realizes it. Andrew? Andrew Gierth wrote the current implementation of hstore populate_record IIRC. Unfortunately the plan for jsonb was to borrow hstore's (I don't think hstore can use the jsonb implementation because you'd be taking away the ability to handle internally nested structures it currently has). Of my two complaints upthread, the second one, not being able to populate from an internally well-formed structure, is by far the more serious one I think. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing currentTransactionWALVolume
I'm sending you my review comments. 2014-01-15 Simon Riggs si...@2ndquadrant.com: Short patch to expose a function GetCurrentTransactionWALVolume() that gives the total number of bytes written to WAL by current transaction. * It's a simple and good feature. It is useful for system management and for forecasting the server spec (especially disk volume) the system needs. * Overhead is negligible as far as my tests show. * Compile and unit tests are no problem. User interface to this information discussed on separate thread, so that we don't check the baby out with the bathwater when people discuss UI pros and cons. Did you get good opinions in the other thread? I'd like to have SQL for seeing the WAL volume and SQL for resetting the WAL volume counter. Your patch seems to initialize the value only at server startup now. If we had a reset function, we could see database activity for each hour of the day from WAL volumes. Right now we can only see the number of transactions and database volumes. I'd like to see more detailed activity from the WAL volume for each minute or hour. It might be good for performance improvement by hackers, too. Regards, -- Mitsumasa KONDO NTT Open Source Software Center
Re: [HACKERS] updated emacs configuration
On Thu, Jan 30, 2014 at 11:06 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Jan 30, 2014 at 03:36:48PM -0500, Bruce Momjian wrote: On Thu, Jan 30, 2014 at 03:32:27PM -0500, Robert Haas wrote: Or this:

- mp_int_copy(a, b);  /* ok: 0 <= r < b */
- mp_int_copy(q, a);  /* ok: q <= a */
+ mp_int_copy(a, b);		/* ok: 0 <= r < b */
+ mp_int_copy(q, a);		/* ok: q <= a */

I do agree with you some of the changes-after-periods look like improvements; what I meant to say is that there is an awful lot of churn in this patch that is unrelated to that particular point. Let me work on an entab patch that does replacements in comments only after periods and post the results. I should be done in an hour. OK, eight hours later, I have the results for only removing tabs after periods in comments:

http://momjian.us/expire/entab_comment.v2.cdiff
http://momjian.us/expire/entab_comment.v2.pdiff
http://momjian.us/expire/entab_comment.v2.udiff

The diff line counts are:

64356 entab_comment.v2.cdiff
17327 entab_comment.v2.pdiff
35453 entab_comment.v2.udiff

It represents 3903 lines changed. That looks loads better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add force option to dropdb
$ createdb -U postgres hoge $ psql -d hoge -U postgres hoge=# create table test (col text); hoge=# insert into test select repeat(chr(code),1) from generate_series(1,10) code; Execute dropdb -k while the client is inserting many tuples into database $ dropdb -k hoge 2014-01-29 23:10:49 JST FATAL: terminating connection due to administrator command 2014-01-29 23:10:49 JST STATEMENT: insert into test select repeat(chr(code),1) from generate_series(1,200) code; 2014-01-29 23:10:54 JST ERROR: database hoge is being accessed by other users 2014-01-29 23:10:54 JST DETAIL: There is 1 other session using the database. 2014-01-29 23:10:54 JST STATEMENT: DROP DATABASE hoge; 2014-01-29 23:10:54 JST ERROR: syntax error at or near hoge at character 41 2014-01-29 23:10:54 JST STATEMENT: UPDATE pg_database SET datconnlimit = e hoge is being accessed by other users WHERE datname= 'hoge'; dropdb: database removal failed: ERROR: syntax error at or near hoge LINE 1: UPDATE pg_database SET datconnlimit = e hoge is being acce...                         ^ hoge=# \l               List of databases  Name  | Owner | Encoding | Collate | Ctype | Access privileges ---+--+--+-+---+--- hoge   | postgres | UTF8  | C   | C  | postgres | postgres | UTF8  | C   | C  | template0 | postgres | UTF8  | C   | C  | =c/postgres     +      |     |     |    |   | postgres=CTc/postgres template1 | postgres | UTF8  | C   | C  | =c/postgres     +      |     |     |    |   | postgres=CTc/postgres hoge database is not dropped yet. Is this the bug? or not?  It is a bug, sorry for doubling your work. I have updated the patch. Regards On Wednesday, January 29, 2014 8:50 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jan 29, 2014 at 4:56 AM, salah jubeh s_ju...@yahoo.com wrote: I'm not particularly in favor of implementing this as client-side functionality, because then it's only available to people who use that particular client. 
Simon Riggs proposed a server-side option to the DROP DATABASE command some time ago, and I think that might be the way to go. Could you please direct me -if possible- to the thread. I think implementing it on the client side gives the user some visibility and control. Initially, I wanted to force-drop the database; then I changed it to kill connections. I think the change in the client tool is simple and covers the main reason for not being able to drop a database. I think killing client connections is one of the FAQs. OTOH, having an option like DROP DATABASE [IF EXISTS, FORCE] database is more crisp. However, what does force mean? Many options exist, such as killing the connections gently, waiting for connections to terminate, or killing connections immediately. Also, as Alvaro Herrera mentioned, DROP OWNED BY and/or REASSIGN OWNED BY might hinder the force option -an example here would be nice-. So, for quick wins, I prefer the kill option on the client side; but for a mature solution, certainly the backend is the way to proceed.
http://www.postgresql.org/message-id/1296552979.1779.8622.camel@ebony -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/src/bin/scripts/dropdb.c b/src/bin/scripts/dropdb.c
index fa6ea3e..9041caa 100644
--- a/src/bin/scripts/dropdb.c
+++ b/src/bin/scripts/dropdb.c
@@ -32,6 +32,7 @@ main(int argc, char *argv[])
 		{"echo", no_argument, NULL, 'e'},
 		{"interactive", no_argument, NULL, 'i'},
 		{"if-exists", no_argument, &if_exists, 1},
+		{"kill", no_argument, NULL, 'k'},
 		{"maintenance-db", required_argument, NULL, 2},
 		{NULL, 0, NULL, 0}
 	};
@@ -48,6 +49,8 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	bool		echo = false;
 	bool		interactive = false;
+	bool		kill = false;
+	char	   *database_conn_limit = "-1";
 
 	PQExpBufferData sql;
@@ -59,7 +62,7 @@ main(int argc, char *argv[])
 	handle_help_version_opts(argc, argv, "dropdb", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "h:p:U:wWeik", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -84,6 +87,9 @@ main(int argc, char *argv[])
 			case 'i':
 				interactive = true;
 				break;
+			case 'k':
+				kill = true;
+				break;
 			case 0:
 				/* this covers the long options */
 				break;
@@ -121,8 +127,6 @@ main(int argc, char *argv[])
 
 	initPQExpBuffer(&sql);
 
-	appendPQExpBuffer(&sql, "DROP DATABASE %s%s;\n",
-					  (if_exists ? "IF EXISTS " : ""), fmtId(dbname));
 
 	/* Avoid trying to drop postgres db while we are connected to it. */
 	if (maintenance_db == NULL && strcmp(dbname, "postgres") == 0)
@@ -131,6 +135,51 @@ main(int argc, char *argv[])
 	conn = connectMaintenanceDatabase(maintenance_db, host,
Re: [HACKERS] Prohibit row-security + inheritance in 9.4?
* Craig Ringer (cr...@2ndquadrant.com) wrote: On 01/31/2014 09:01 AM, Stephen Frost wrote: I don't see where this follows at all- clearly, you already get a subset of rows from the child than if you queried the parent because there are other children. Er, what? I don't see what you're saying here. You were argueing that people may be confused by the parent and the child returning different sets of rows. I was pointing out that this is already the case. Currently, when you query the parent, you see rows from other children (superset). You don't ever _not_ see rows from the child that you would see when querying the child directly. Right- but you don't see rows from *other* children when querying the child either. I was merely trying to point out that we don't make children and parents be synonyms of each other. It's not a very strong argument when it comes to this discussion, but I thought it helped illustrate that anyone using inheiritance-as-PG-does-it already has a fair bit of reading to do to be able to understand what's going on. We also don't have any mechanism to automatically pick the right child when you insert into the parent either (of course, that can be done with triggers but that's not automatic :). Basically, it requires a non-trivial amount of effort and understanding to use these features. Is there a case which can't be implemented if the two are independent as I am describing? There are cases which can NOT be implemented if we force the two paths to be handled identically but I feel the approach where we keep them independently managed is flexible to allow the other cases if people want them. The only case prevented is one where access to the child via the parent shows rows that the parent's row-security qual would hide, because the child's qual doesn't. It makes absolutely zero sense, in my head anyway, to have rows returned when querying the parent which should NOT be returned based on the quals of the parent. 
Personally I think that's ugly anyway; I don't want to support that, and have only been looking at it because it'd solve the consistency issues. Good, though I would characterize it as bug or wrong more than ugly. :) Since the user can achieve this with: SELECT ... FROM ONLY parent UNION ALL SELECT ... FROM ONLY child1 They could do it with a more complex inheiritance tree also, no? As in, with multiple levels or multiple parents where they could redefine the quals to be used? Don't think they'd have to resort to views if they really wanted this (which I tend to doubt many would...). I think it's fine to just apply the parent qual to all children. Agreed. :) So what we're talking about here (conveniently, exactly what's currently impemented) is to: I do like that. :) Apply the policy of the relation actually named in the query before inheritance expansion. If the relation has children expanded during planning, allow the parent policy to be copied to those children. The children are _not_ checked for their own row-security policies when the appendrel is created, and any child policies are _not_ applied. Right. That's consistent with how security barrier views (and views in general) work, and it means that the policy on a relation is applied consistently to all rows, including rows from child relations. As we discussed, it's also consistent with relation-level GRANTs for access to relations. The trade-off is that it creates inconsistent views of the contents of the data depending on whether you query via a parent, or query a child directly, and it means that child policies are ignored when querying via a parent relation. Right, let's make sure the documentation lays this out as clearly as we can make it and perhaps also draw those parallels to how views are handled with the GRANT system. Since that's what I've already written, I'm happy with that ;-) Excellent. ;) Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_basebackup and pg_stat_tmp directory
2014-01-31 Fujii Masao masao.fu...@gmail.com: On Fri, Jan 31, 2014 at 10:18 PM, Mitsumasa KONDO kondo.mitsum...@gmail.com wrote: 2014-01-31 Fujii Masao masao.fu...@gmail.com On Tue, Jan 28, 2014 at 5:51 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Jan 28, 2014 at 6:11 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Jan 28, 2014 at 9:26 AM, Fujii Masao masao.fu...@gmail.com wrote: Hi, The files in pg_stat_tmp directory don't need to be backed up because they are basically reset at the archive recovery. So I think it's worth changing pg_basebackup so that it skips any files in pg_stat_tmp directory. Thought? I think this is good idea, but can't it also avoid PGSTAT_STAT_PERMANENT_TMPFILE along with temp files in pg_stat_tmp All stats files should be excluded. IIRC the PGSTAT_STAT_PERMANENT_TMPFILE refers to just the global one. You want to exclude based on PGSTAT_STAT_PERMANENT_DIRECTORY (and of course based on the guc stats_temp_directory if it's in PGDATA. Attached patch changes basebackup.c so that it skips all files in both pg_stat_tmp and stats_temp_directory. Even when a user sets stats_temp_directory to the directory other than pg_stat_tmp, we need to skip the files in pg_stat_tmp. Because, per recent change of pg_stat_statements, the external query file is always created there. +1. And, I'd like to also skip pg_log directory because security reason. Yeah, I was thinking that, too. I'm not sure whether including log files in backup really increases the security risk, though. There are already very important data, i.e., database, in backups. Anyway, since the amount of log files can be very large and they are not essential for recovery, it's worth considering whether to exclude them. OTOH, I'm sure that some users prefer current behavior for some reasons. So I think that it's better to expose the pg_basebackup option specifying whether log files are included in backups or not. I'm with you. Thanks a lot ! 
Regards, -- Mitsumsasa KONDO NTT Open Source Software Center
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
1261982.53 is entirely nuls. I think that's true for most if not all of the intervening files, still investigating. The 54th segment is nul up to offset 1f0c, after which it has valid looking blocks:

# hexdump 1261982.54 | head -100
000
*
1f0c 0ea1 8988 0063 0006 04d8 0cf0

However when I grep xlogdump for any records mentioning this block I get nothing. In fact the largest block I find in the xlog is 3646630:

# grep 'tid 3646631/' 1261982 | wc -l
0
# grep 'tid 3646630/' 1261982 | wc -l
177

Looking at the block above it looks like the LSN is EA100638988, which I find in the logs, but it's a btree insert on a different btree:

[cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072
[cur:EA1/638988, xid:1418089147, rmid:11(Btree), len/tot_len:18/5894, info:8, prev:EA1/637140] insert_leaf: s/d/r:1663/16385/1364767 tid 2746914/219
[cur:EA1/638988, xid:1418089147, rmid:11(Btree), len/tot_len:18/5894, info:8, prev:EA1/637140] bkpblock[1]: s/d/r:1663/16385/1364767 blk:2746914 hole_off/len:1180/2372
[cur:EA1/63A0A8, xid:1418089147, rmid:1(Transaction), len/tot_len:32/64, info:0, prev:EA1/638988] d/s:16385/1663 commit at 2014-01-21 05:41:11 UTC

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing currentTransactionWALVolume
On 31 January 2014 13:56, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Jan 15, 2014 at 2:12 AM, Simon Riggs si...@2ndquadrant.com wrote: Short patch to expose a function GetCurrentTransactionWALVolume() that gives the total number of bytes written to WAL by current transaction. Could you tell me the use case of this function? ISTM that it's less useful in the real world because it reports the information of WAL generated only in my own current transaction. For example, the monitoring tool cannot see that information because it's the different session from the target. It's already possible to work out how much total WAL has been generated. Monitoring tools may access that. What this allows is user/plugin code to see how much WAL has been generated in this transaction and take different user defined actions. Those actions would be specific to the transaction, so this is more about production control applications than overall system monitoring. There are many possible ways to use that knowledge. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 14:39:47 +, Greg Stark wrote: 1261982.53 is entirely nuls. I think that's true for most if not all of the intervening files, still investigating. The 54th segment is nul up to offset 1f0c after which it has valid looking blocks: It'd be interesting to dump the page header for that using pageinspect. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 01/31/2014 08:57 AM, Merlin Moncure wrote: On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov obartu...@gmail.com wrote: Hmm, neither me, nor Teodor have experience and knowledge with populate_record() and moreover hstore here is virgin and we don't know the right behaviour, so I think we better take it from jsonb, once Andrew realizes it. Andrew ? Andrew Gierth wrote the current implementation of hstore populate_record IIRC. Unfortunately the plan for jsonb was to borrow hstore's (I don't think hstore can use the jsonb implementation because you'd be taking away the ability to handle internally nested structures it currently has). Of my two complaints upthread, the second one, not being able to populate from an internally well-formed structure, is by far the more serious one I think. Umm, I think at least one of us is seriously confused. I am going to look at dealing with these issues in a way that can be used by both - at least the populate_record case. As far as populate_record goes, there is a bit of an impedance mismatch, since json/hstore records are heterogeneous and one-dimensional, whereas sql arrays are homogeneous and multidimensional. Right now I am thinking I will deal with arrays up to two dimensions, because I can do that relatively simply, and after that throw in the towel. That will surely deal with 99.9% of use cases. Of course this would be documented. Anyway, let me see what I can do. If Andrew Gierth wants to have a look at fixing the hstore() side that might help speed things up. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
Hi, Amit san, I'm replying to your previous email. I wanted to reply to your latest mail below, but I removed it from my mailer by mistake. http://www.postgresql.org/message-id/CAA4eK1LAg6ndZdWLb5e=Ep5DzcE8KZU=JbmO+tFwySYHm2ja=q...@mail.gmail.com Do you know how I can reply to an email which was deleted locally? I thought I could download an old mail by clicking raw link and import it to the mailer. However, it requires username/password input, and it seems to be different from the one for editing CommitFest. I couldn't find how to authenticate myself. Anyway, the revised patch is attached. From: Amit Kapila amit.kapil...@gmail.com It gives the proper message, but even after error, the second message box it shows DLLInstall ... succeeded. I think the reason is that caller of function DllRegisterServer() doesn't check the return value. I see. Corrected by checking the return value of DllRegisterServer(). + char message[1024]; why you have kept message as a global buffer, can't we just declare locally inside the function? I made it a local variable. At first, I thought we might use it in other functions in the future. Okay, I think we can leave it and also remove it from other parts of patch. Although I found it is the right way, but Tom is not convinced with the idea, so lets keep the Default event source name handling as it is. OK, I changed the value of DEFAULT_EVENT_SOURCE to PostgreSQL. As suggested by Tom, please update documentation. Possibly there's room for a documentation patch reminding users to make sure that event_source is set appropriately before they turn on eventlog. I think right place to update this information is where we are explaining about setting of event log i.e at below link or may be if you find some other better place: http://www.postgresql.org/docs/devel/static/runtime-config-logging.html#GUC-LOG-DESTINATION Please let us make this a separate patch. I agree with you about the place in the manual. 
Regards MauMau pg_ctl_eventsrc_v6.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Jan 31, 2014 at 8:45 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/31/2014 08:57 AM, Merlin Moncure wrote: On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov obartu...@gmail.com wrote: Hmm, neither me, nor Teodor have experience and knowledge with populate_record() and moreover hstore here is virgin and we don't know the right behaviour, so I think we better take it from jsonb, once Andrew realize it. Andrew ? Andrew Gierth wrote the current implementation of htsore populate_record IIRC. Unfortunately the plan for jsonb was to borrow hstore's (I don't think hstore can use the jsonb implementation because you'd be taking away the ability to handle internally nested structures it currently has). Of my two complaints upthread, the second one, not being able to populate from and internally well formed structure, is by far the more serious one I think. Umm, I think at least one of us is seriously confused. I am going to look at dealing with these issues in a way that can be used by both - at least the populate_record case. As far as populate_record goes, there is a bit of an impedance mismatch, since json/hstore records are heterogenous and one-dimensional, whereas sql arrays are homogeneous and multidimensional. Right now I am thinking I will deal with arrays up to two dimensions, because I can do that relatively simply, and after that throw in the towel. That will surely deal with 99.9% of use cases. Of course this would be documented. Anyway, Let me see what I can do. If Andrew Gierth wants to have a look at fixing the hstore() side that might help speed things up. (ah, you beat me to it.) Disregard my statements above. It works. 
postgres=# select jsonb_populate_record(null::x, hstore(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb); jsonb_populate_record - (1,{(1,\\{(1,{1,2})}\\)}) merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On Fri, Jan 31, 2014 at 2:39 PM, Greg Stark st...@mit.edu wrote: [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072 [cur:EA1/638988, xid:1418089147, rmid:11(Btree), len/tot_len:18/5894, info:8, prev:EA1/637140] insert_leaf: s/d/r:1663/16385/1364767 tid 2746914/219 Actually wait, the immediate previous record is indeed on the right filenode. Is the LSN printed in xlogdump the LSN that would be in the pd_lsn or is the pd_lsn going to be from the following record? -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prohibit row-security + inheritance in 9.4?
* Yeb Havinga (y.t.havi...@mgrid.net) wrote: IMHO, there is another way to implement this, other than the procedure to override the child-rel-quals with the ones from the parent. At DDL time, synchronize quals on the parent with rls quals of the children. Isn't this also what happens with constraints? No, we're not going to do that. We don't do it for GRANT and I don't think it makes sense to do it here. If we wanted to make them the same then we'd throw out the ability to do any kind of changes or actions on the child and then we'd have actual partitioning. We don't have that though, we have inheritance. Then during expansion of the range table, no code is needed to ignore child rls quals and copy parent rels to child rels. This is what's already implemented and isn't a huge amount of code to begin with, so I don't see this as being an argument against having the flexibility. Also, the security policy applied would be invariant to the route through which the rows were accessed: You could also get this by simply only allowing access to the parent and not granting any privileges on the children. - directly to the child row: child rls quals and parent quals (by propagate at ddl) are applied. - through the parent: child rls quals and parent quals applied. If you want them to be the same then you can implement this yourself without having PG force it on you. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 14:59:21 +, Greg Stark wrote: On Fri, Jan 31, 2014 at 2:39 PM, Greg Stark st...@mit.edu wrote: [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072 [cur:EA1/638988, xid:1418089147, rmid:11(Btree), len/tot_len:18/5894, info:8, prev:EA1/637140] insert_leaf: s/d/r:1663/16385/1364767 tid 2746914/219 Actually wait, the immediate previous record is indeed on the right filenode. Is the LSN printed in xlogdump the LSN that would be in the pd_lsn or is the pd_lsn going to be from the following record? It points to the end of the record (i.e. the beginning of the next). It needs to, because otherwise XLogFlush()es on the pd_lsn wouldn't flush enough. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
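[Editorial sketch] Andres's rule — pd_lsn stores the position just *past* the WAL record that last modified the page, so that XLogFlush(pd_lsn) is guaranteed to flush the whole record — can be modelled with a toy sketch. All names below are invented for illustration, not PostgreSQL source; real WAL positions additionally account for record headers and page boundaries, which is why EA1/638988 sits slightly past the plain MAXALIGN'd end of the EA1/637140 record.

```python
# Toy model of the pd_lsn rule (invented names, not PostgreSQL source).
# A page's pd_lsn is the LSN just past the record that last modified it,
# so flushing WAL up to pd_lsn makes that record durable.

def end_lsn(record_start, total_len, align=8):
    """LSN of the first byte after a WAL record (its MAXALIGN'd end)."""
    return (record_start + total_len + align - 1) // align * align

def record_durable(record_start, total_len, flushed_to):
    """True iff the whole record is on disk after flushing WAL up to flushed_to."""
    return flushed_to >= end_lsn(record_start, total_len)

# The insert_leaf record at EA1/637140 with tot_len 6194:
start, tot_len = 0x637140, 6194
pd_lsn = end_lsn(start, tot_len)          # what a touched page would store

assert record_durable(start, tot_len, pd_lsn)
assert not record_durable(start, tot_len, start)  # the record's *start* LSN is not enough
```

If pd_lsn pointed at the record's start instead, an XLogFlush() on it would flush nothing of the record at all — which is the point Andres is making.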
Re: [HACKERS] [bug fix] psql \copy doesn't end if backend is killed
From: Dilip kumar dilip.ku...@huawei.com Is there any direct scenario by which it can be reproduce ? Thank you for reviewing and testing the patch. There is no other direct scenario. I reproduced the failure exactly like you suggested, because it was very difficult to reproduce the problem without using the debugger. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On Fri, Jan 31, 2014 at 3:08 PM, Andres Freund and...@2ndquadrant.com wrote: It points to the end of the record (i.e. the beginning of the next). It needs to, because otherwise XLogFlush()es on the pd_lsn wouldn't flush enough. Ah, in which case the relevant record is:

[cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] insert_leaf: s/d/r:1663/16385/1261982 tid 3634978/282
[cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072

It looks like get_raw_page() refuses to read past the end of relpages. I could make a clone of this database to allow experimenting with tweaking relpages but it may or may not reproduce the problem...

=# select pg_relation_size('data_pkey') / 1024 / 1024 / 1024;
 ?column?
----------
       23
(1 row)

=# select get_raw_page('data_pkey', 'main', 11073632);
ERROR:  block number 11073632 is out of range for relation "data_pkey"

d9de7pcqls4ib6=# select relpages from pg_class where relname = 'data_pkey';
 relpages
----------
  2889286
(1 row)

-- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 15:15:24 +, Greg Stark wrote: On Fri, Jan 31, 2014 at 3:08 PM, Andres Freund and...@2ndquadrant.com wrote: It points to the end of the record (i.e. the beginning of the next). It needs to, because otherwise XLogFlush()es on the pd_lsn wouldn't flush enough. Ah, in which case the relevant record is: [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] insert_leaf: s/d/r:1663/16385/1261982 tid 3634978/282 [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072 It looks like get_raw_page() refuses to read past the end of relpages. I could make a clone of this database to allow experimenting with tweaking relpages but it may or may not reproduce the problem... No, it uses smgrnblocks() to get the size. =# select get_raw_page('data_pkey', 'main', 11073632) ; ERROR: block number 11073632 is out of range for relation data_pkey Isn't the page 3634978? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
* Robert Haas (robertmh...@gmail.com) wrote: On Thu, Jan 30, 2014 at 5:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 30, 2014 at 11:04 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think this is totally misguided. Who's to say that some weird FDW might not pay attention to attstorage? I could imagine a file-based FDW using that to decide whether to compress columns, for instance. Admittedly, the chances of that aren't large, but it's pretty hard to argue that going out of our way to prevent it is a useful activity. I think that's a pretty tenuous position. There are already FDW-specific options sufficient to let a particular FDW store whatever kinds of options it likes; letting the user set options that were only ever intended to be applied to tables just because we can seems sort of dubious. I'm tempted by the idea of continuing to disallow SET STORAGE on an unvarnished foreign table, but allowing it on an inheritance hierarchy that contains at least one real table, with the semantics that we quietly ignore the foreign tables and apply the operation to the plain tables. [ shrug... ] By far the easiest implementation of that is just to apply the catalog change to all of them. According to your assumptions, it'll be a no-op on the foreign tables anyway. Well, there's some point to that, too, I suppose. What do others think? I agree that using the FDW-specific options is the right approach and disallowing those to be set on foreign tables makes sense. I don't particularly like the idea of applying changes during inheritance which we wouldn't allow the user to do directly. While it might be a no-op and no FDW might use it, it'd still be confusing. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On Fri, Jan 31, 2014 at 3:19 PM, Andres Freund and...@2ndquadrant.com wrote: =# select get_raw_page('data_pkey', 'main', 11073632) ; ERROR: block number 11073632 is out of range for relation data_pkey Isn't the page 3634978? The page in the record is. But the page on disk is in the 54th segment at offset 1F0C0000. So unless my arithmetic is wrong:

# bc -l
ibase=16
400 * 400 * 400 / 2000 * 54 + 1F0C0000 / 2000
11073632

-- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 15:21:35 +, Greg Stark wrote: On Fri, Jan 31, 2014 at 3:19 PM, Andres Freund and...@2ndquadrant.com wrote: =# select get_raw_page('data_pkey', 'main', 11073632) ; ERROR: block number 11073632 is out of range for relation data_pkey Isn't the page 3634978? The page in the record is. It'd be interesting to look at the referenced page using bt_page_items(). But the page on disk is in the 54th segment at offset 1F0C It's interesting that the smgr gets this wrong then (as also evidenced by the fact that relation_size does as well). Could you please do a ls -l path/to/relfilenode*? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 01/31/2014 09:53 AM, Merlin Moncure wrote: On Fri, Jan 31, 2014 at 8:45 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/31/2014 08:57 AM, Merlin Moncure wrote: On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov obartu...@gmail.com wrote: Hmm, neither me, nor Teodor have experience and knowledge with populate_record() and moreover hstore here is virgin and we don't know the right behaviour, so I think we better take it from jsonb, once Andrew realize it. Andrew ? Andrew Gierth wrote the current implementation of htsore populate_record IIRC. Unfortunately the plan for jsonb was to borrow hstore's (I don't think hstore can use the jsonb implementation because you'd be taking away the ability to handle internally nested structures it currently has). Of my two complaints upthread, the second one, not being able to populate from and internally well formed structure, is by far the more serious one I think. Umm, I think at least one of us is seriously confused. I am going to look at dealing with these issues in a way that can be used by both - at least the populate_record case. As far as populate_record goes, there is a bit of an impedance mismatch, since json/hstore records are heterogenous and one-dimensional, whereas sql arrays are homogeneous and multidimensional. Right now I am thinking I will deal with arrays up to two dimensions, because I can do that relatively simply, and after that throw in the towel. That will surely deal with 99.9% of use cases. Of course this would be documented. Anyway, Let me see what I can do. If Andrew Gierth wants to have a look at fixing the hstore() side that might help speed things up. (ah, you beat me to it.) Disregard my statements above. It works. 
postgres=# select jsonb_populate_record(null::x, hstore(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb); jsonb_populate_record - (1,{(1,\\{(1,{1,2})}\\)}) Actually, there is a workaround to the limitations of hstore(record): andrew=# select row_to_json(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore; row_to_json --- a=1, b=[{a=1, b=[{a=1, b=[1, 2]}]}] I think we could just document that for now, or possibly just use it inside hstore(record) if we encounter a nested composite or array. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prohibit row-security + inheritance in 9.4?
On 2014-01-31 16:05, Stephen Frost wrote: * Yeb Havinga (y.t.havi...@mgrid.net) wrote: IMHO, there is another way to implement this, other than the procedure to override the child-rel-quals with the ones from the parent. At DDL time, synchronize quals on the parent with rls quals of the children. Isn't this also what happens with constraints? No, we're not going to do that. We don't do it for GRANT and I don't think it makes sense to do it here. This reasoning could go either way. GRANT is on a complete set of rows. This is a restriction on the level of individual rows, and in that sense, it is more like a row-level CHECK constraint. If we wanted to make them the same then we'd throw out the ability to do any kind of changes or actions on the child and then we'd have actual partitioning. We don't have that though, we have inheritance. I fail to understand this, probably because I do not have a partition use case for inheritance, but rather an information model that is more ontology-like. The more specific children get down the inheritance tree, the more columns they get, and their requirements might differ completely in nature from their siblings, and make no sense at all as well when specified at the level of the parent (or even be impossible, since the parent does not have all the columns). Then during expansion of the range table, no code is needed to ignore child rls quals and copy parent rels to child rels. This is what's already implemented and isn't a huge amount of code to begin with, so I don't see this as being an argument against having the flexibility. It would seem to me that any additional logic that can be avoided during planning is a good thing. Also, the argument that something is already implemented does not itself make it good to commit. What do you mean with 'having the flexibility' and why is that good? 
Also, the security policy applied would be invariant to the route through which the rows were accessed: You could also get this by simply only allowing access to the parent and not granting any privileges on the children. That might work for partitioning, but not for use cases where children have more columns than their parents. - directly to the child row: child rls quals and parent quals (by propagate at ddl) are applied. - through the parent: child rls quals and parent quals applied. If you want them to be the same then you can implement this yourself without having PG force it on you. I suggested it as a solution for a requirement worded upthread as It makes absolutely zero sense, in my head anyway, to have rows returned when querying the parent which should NOT be returned based on the quals of the parent. without disregarding rls-quals on children. regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
Stephen Frost sfr...@snowman.net writes: I agree that using the FDW-specific options is the right approach and disallowing those to be set on foreign tables makes sense. I don't particularly like the idea of applying changes during inheritance which we wouldn't allow the user to do directly. While it might be a no-op and no FDW might use it, it'd still be confusing. If we don't allow it directly, why not? Seems rather arbitrary. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
Andres Freund and...@2ndquadrant.com writes: It's interesting that the smgr gets this wrong then (as also evidenced by the fact that relation_size does as well). Could you please do a ls -l path/to/relfilenode*? IIRC, smgrnblocks will stop as soon as it finds a segment that is not 1GB in size. Could you check the lengths of all segments for that relation? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
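[Editorial sketch] The behaviour Tom describes — the size walk counts segment files in order and stops at the first one that isn't a full 1 GB, so a full-length segment beyond a short or zero-length one is never counted — can be sketched as follows. This is only an illustration of the described rule with invented names, not the actual md.c code.

```python
# Sketch of an mdnblocks()-style size walk (invented names, not md.c).
SEG_BLOCKS = (1 << 30) // 8192   # 131072 blocks per 1 GB segment at BLCKSZ 8192

def nblocks(segment_sizes_in_blocks):
    """Total relation size in blocks, stopping at the first segment that
    is not full -- any segments after it are never counted."""
    total = 0
    for size in segment_sizes_in_blocks:
        total += size
        if size < SEG_BLOCKS:
            break                # a short segment ends the scan
    return total

# 53 full segments, then an empty one, then a segment holding real data:
sizes = [SEG_BLOCKS] * 53 + [0, 63585]
print(nblocks(sizes))   # 6946816 -- the 63585 blocks in the last file are invisible
```

This matches the symptom in the thread: both relation-size reporting and get_raw_page()'s range check would agree on the too-small size, while valid-looking pages sit in a later segment file on disk.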
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 10:33:16 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: It's interesting that the smgr gets this wrong then (as also evidenced by the fact that relation_size does as well). Could you please do a ls -l path/to/relfilenode*? IIRC, smgrnblocks will stop as soon as it finds a segment that is not 1GB in size. Could you check the lengths of all segments for that relation? Yea, that's what I am wondering about. I wanted the full list because there could be an entire file missing and it's interesting to see at which time they were last touched relative to each other... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
Greg Stark st...@mit.edu writes: On Fri, Jan 31, 2014 at 3:19 PM, Andres Freund and...@2ndquadrant.com wrote: Isn't the page 3634978? The page in the record is. But the page on disk is in the 54th segment at offset 1F0C0000. So unless my arithmetic is wrong:

bc -l
ibase=16
400 * 400 * 400 / 2000 * 54 + 1F0C0000 / 2000
11073632

At least two of us are confused. I get

# select ((2^30) * 54.0 + x'1F0C0000'::bit(32)::int) / 8192;
 ?column?
----------
  7141472
(1 row)

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
Sorry guys. I transposed two numbers when looking up the relation. data_pk wasn't the right index.

=# select (page_header(get_raw_page('index_data_id', 'main', 3020854))).*;
     lsn      | tli | flags | lower | upper | special | pagesize | version | prune_xid
--------------+-----+-------+-------+-------+---------+----------+---------+-----------
 CF0/2DD67BB8 |   5 |     0 |   792 |  5104 |    8176 |     8192 |       4 |         0
(1 row)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
Greg Stark st...@mit.edu writes: Sorry guys. I transposed two numbers when looking up the relation. data_pk wasn't the right index.

=# select (page_header(get_raw_page('index_data_id', 'main', 3020854))).*;
     lsn      | tli | flags | lower | upper | special | pagesize | version | prune_xid
--------------+-----+-------+-------+-------+---------+----------+---------+-----------
 CF0/2DD67BB8 |   5 |     0 |   792 |  5104 |    8176 |     8192 |       4 |         0
(1 row)

Clearly not the right page --- the LSN isn't what we saw in the hexdump. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add force option to dropdb
On Fri, Jan 31, 2014 at 9:09 AM, salah jubeh s_ju...@yahoo.com wrote:

$ createdb -U postgres hoge
$ psql -d hoge -U postgres
hoge=# create table test (col text);
hoge=# insert into test select repeat(chr(code),1) from generate_series(1,10) code;

Execute dropdb -k while the client is inserting many tuples into the database:

$ dropdb -k hoge
2014-01-29 23:10:49 JST FATAL:  terminating connection due to administrator command
2014-01-29 23:10:49 JST STATEMENT:  insert into test select repeat(chr(code),1) from generate_series(1,200) code;
2014-01-29 23:10:54 JST ERROR:  database hoge is being accessed by other users
2014-01-29 23:10:54 JST DETAIL:  There is 1 other session using the database.
2014-01-29 23:10:54 JST STATEMENT:  DROP DATABASE hoge;
2014-01-29 23:10:54 JST ERROR:  syntax error at or near hoge at character 41
2014-01-29 23:10:54 JST STATEMENT:  UPDATE pg_database SET datconnlimit = e hoge is being accessed by other users WHERE datname= 'hoge';
dropdb: database removal failed: ERROR:  syntax error at or near hoge
LINE 1: UPDATE pg_database SET datconnlimit = e hoge is being acce...
                                              ^
hoge=# \l
                              List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges
-----------+----------+----------+---------+-------+-----------------------
 hoge      | postgres | UTF8     | C       | C     |
 postgres  | postgres | UTF8     | C       | C     |
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres

hoge database is not dropped yet. Is this the bug? or not? It is a bug, sorry for doubling your work. I have updated the patch. In case it wasn't clear before, I think that a client-side hack like this has zero chance of being acceptable to the community, and we should mark this patch rejected. I'm not saying it couldn't be useful to someone, but the scripts are intended to be thin wrappers around the underlying database functionality, and I think this is straying too far from that core mission. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regression tests failing if not launched on db regression
On Fri, Jan 31, 2014 at 2:28 AM, Michael Paquier michael.paqu...@gmail.com wrote: I took a look at this with a view to committing it but on examination I'm not sure this is the best way to proceed. The proposed text documents that the tests should be run in a database called regression, but the larger documentation chapter of which this section is a part never explains how to run them anywhere else, so it feels a bit like telling a ten-year-old not to burn out the clutch. The bit about not changing enable_* probably belongs, if anywhere, in section 30.2, on test evaluation, rather than here. And what about the attached? I have moved all the content to 30.2, and added two paragraphs: one about the planner flags, the other about the database used. Regards, Well, it doesn't really address my first concern, which was that you talk about running the tests in a database named regression, but that's exactly what make check does and it's unclear how you would do anything else without modifying the source code. It's not the purpose of the documentation to tell you all the ways that you could break things if you patch the tree. I also don't want to document exactly which tests would fail if you did hack things like that; that documentation is likely to become outdated. I think the remaining points you raise are worth mentioning. I'm attaching a patch with my proposed rewording of your changes. I made the section about configuration parameters a bit more generic and adjusted the phrasing to sound more natural in English, and I moved your mention of the other issues around a bit. What do you think of this version? 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 2b95587..edb476a 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -125,7 +125,9 @@ gmake installcheck-parallel
 </screen>
  The tests will expect to contact the server at the local host and the
  default port number, unless directed otherwise by <envar>PGHOST</envar> and
- <envar>PGPORT</envar> environment variables.
+ <envar>PGPORT</envar> environment variables. The tests will be run in a
+ database named <literal>regression</>; any existing database by this name
+ will be dropped.
  </para>

  <para>
@@ -147,7 +149,9 @@ gmake installcheck
 </screen>
  The <filename>contrib</> modules must have been built and installed first.
  You can also do this in a subdirectory of <filename>contrib</> to run
- the tests for just one module.
+ the tests for just one module. Tests of <literal>contrib</> modules will
+ be run in a database named <literal>contrib_regression</>; any existing
+ database by this name will be dropped.
  </para>
 </sect2>
@@ -471,6 +475,18 @@ diff results/random.out expected/random.out
  not worry unless the random test fails repeatedly.
  </para>
 </sect2>
+
+ <sect2>
+  <title>Configuration Parameters</title>
+
+  <para>
+   When running the tests against an existing installation, some non-default
+   parameter settings could cause the tests to fail. For example, the
+   settings described in <xref linkend="runtime-config-query-enable">
+   could cause plan changes that would affect the results of tests which
+   use <command>EXPLAIN</>.
+  </para>
+ </sect2>
 </sect1>

<!-- We might want to move the following section into the developer's guide. -->

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On Fri, Jan 31, 2014 at 3:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:

    400 * 400 * 400 / 2000 * 54 + 1F0C / 2000
    11073632

Ooops, it's reading 54 in hex there.

    # select ((2^30) * 54.0 + 'x1F0C'::bit(32)::int) / 8192;
     ?column?
    ----------
      7141472

    ibase=16
    400 * 400 * 400 / 2000 * 36 + 1F0C / 2000
    7141472

So now that I've got the arithmetic right:

=# select (page_header(get_raw_page('index_data_id', 'main', 7141472))).* ;
    lsn     | tli | flags | lower | upper | special | pagesize | version | prune_xid
------------+-----+-------+-------+-------+---------+----------+---------+-----------
 EA1/638988 |   6 |     0 |  1240 |  3312 |    8176 |     8192 |       4 |         0

-- greg

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
On 2014-01-31 10:33:16 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: It's interesting that the smgr gets this wrong then (as also evidenced by the fact that relation_size does as well). Could you please do a ls -l path/to/relfilenode*? IIRC, smgrnblocks will stop as soon as it finds a segment that is not 1GB in size. Could you check the lengths of all segments for that relation? I am not sure that explains the issue, but I think the redo action for truncation is not safe across crashes. A XLOG_SMGR_TRUNCATE will just do a smgrtruncate() (and then mdtruncate) which will iterate over the segments starting at 0 till mdnblocks()/segment_size and *truncate* but not delete individual segment files that are not needed anymore, right? If we crash in the midst of that a new mdtruncate() will be issued, but it will get a shorter value back from mdnblocks(). Am I missing something? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
So just to summarize, this xlog record:

[cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] insert_leaf: s/d/r:1663/16385/1261982 tid 3634978/282
[cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072

Appears to have been written to this location:

d9de7pcqls4ib6=# select (page_header(get_raw_page('index_event_data_on_event_occurrence_id', 'main', 7141472))).* ;
    lsn     | tli | flags | lower | upper | special | pagesize | version | prune_xid
------------+-----+-------+-------+-------+---------+----------+---------+-----------
 EA1/638988 |   6 |     0 |  1240 |  3312 |    8176 |     8192 |       4 |         0
(1 row)

The correct location appears to have been written to by later records (a split where it's the leftsib, to be specific) so any forensic evidence is lost:

d9de7pcqls4ib6=# select (page_header(get_raw_page('index_event_data_on_event_occurrence_id', 'main', 3634978))).* ;
    lsn     | tli | flags | lower | upper | special | pagesize | version | prune_xid
------------+-----+-------+-------+-------+---------+----------+---------+-----------
 EA1/A90CF8 |   6 |     0 |   792 |  5104 |    8176 |     8192 |       4 |         0
(1 row)

But the record prior to that lsn is a split where the leftsib matches that record:

[cur:EA1/A8EDE0, xid:1418089314, rmid:11(Btree), len/tot_len:3502/7938, info:68, prev:EA1/A8D7A0] split_r: s/d/r:1663/16385/1261982 leftsib 3634978
[cur:EA1/A8EDE0, xid:1418089314, rmid:11(Btree), len/tot_len:3502/7938, info:68, prev:EA1/A8D7A0] bkpblock[2]: s/d/r:1663/16385/1261982 blk:1881360 hole_off/len:892/3812

Interestingly, there are a bunch of other records that also write to that block, but there are no other full page writes. That seems to imply that the index is currently missing the leaf inserted by the EA1/637140 record.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] updated emacs configuration
On Fri, Jan 31, 2014 at 08:58:21AM -0500, Robert Haas wrote:

    OK, eight hours later, I have the results for only removing tabs after periods in comments:

        http://momjian.us/expire/entab_comment.v2.cdiff
        http://momjian.us/expire/entab_comment.v2.pdiff
        http://momjian.us/expire/entab_comment.v2.udiff

    The diff line counts are:

        64356 entab_comment.v2.cdiff
        17327 entab_comment.v2.pdiff
        35453 entab_comment.v2.udiff

    It represents 3903 lines changed.

    That looks loads better.

OK. I have updated entab.c to support this new ability as -m. When should this be run against HEAD and the supported back branches? Probably when we run pgindent for 9.4. I will need to modify pgindent too, because this new option will be run inside pgindent in the future.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgindent behavior we could do without
On Thu, Jan 30, 2014 at 11:44:31PM -0500, Tom Lane wrote:

    Bruce Momjian br...@momjian.us writes:

        OK, seven hours later, I have fixed pg_bsd_indent to no longer insert blank lines above #elif/#else/#endif, and therefore removed the special case code from pgindent. You will need to download version 1.3 of pg_bsd_indent for this to work, and pgindent will complain if it doesn't find the right pg_bsd_indent version.

    Cool. Do we need to go and clean up any places where pgindent has suppressed blank lines above #elif/#else/#endif in the past? Not sure it's worth making a massive change for this, but I appreciate the fact that pgindent won't mess up such code in the future.

Yes, it is a shame pgindent has removed many proper empty lines in the past, and there is no way to re-add them without causing backpatching problems.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] Proposal: variant of regclass
On Fri, Jan 31, 2014 at 6:31 AM, Yugo Nagata nag...@sraoss.co.jp wrote:

    Hi Amit, Thanks for your review. I updated the patch. I fixed the oids and removed the whitespace.

This patch contains several whitespace-only hunks. Please revert them.

I don't like the changes to typenameTypeIdAndMod(). The code for the missing_ok case shouldn't look completely different from the code for the !missing_ok case. It would be cleaner to start by duplicating typenameType into typenameTypeIdAndMod and then adjusting it as needed to support the new argument. It might be better still to just change parseTypeString() to call LookupTypeName() directly, and add the necessary logic to handle missing_ok there. The advantage of that is that you wouldn't need to modify everybody who is calling typenameTypeIdAndMod(); there are a decent number of such callers here, and there might be third-party code calling that as well.

I think the changes this patch makes to OpernameGetCandidates() need a bit of further consideration. The new argument is called missing_ok, but OpernameGetCandidates() can already return an empty list. What that new argument does is tolerate a missing schema name. So we could call the argument missing_schema_ok, I guess, but I'm still not sure I like that. I don't have a better proposal right at the moment, though.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgindent behavior we could do without
On Fri, Jan 31, 2014 at 11:18:17AM -0500, Bruce Momjian wrote:

    Yes, it is a shame pgindent has removed many proper empty lines in the past, and there is no way to re-add them without causing backpatching problems.

FYI, the original BSD indent code that added the blank lines kind of made sense. If you defined a block of variables or a function, BSD indent wanted a blank line after that. When it saw a CPP directive, it knew that wasn't a blank line, so it forced one. In our coding, we often want CPP directives with no blank space, hence the problem. pg_bsd_indent 1.3 will no longer force a blank line when it sees a CPP directive, and pgindent will no longer remove those blank lines.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication
On Fri, Jan 31, 2014 at 4:40 AM, Andres Freund and...@2ndquadrant.com wrote:

    On 2014-01-30 12:27:43 -0500, Robert Haas wrote:

        Nope, but I think this patch is broken. It looks to me like it's conflating the process offset in the BackendStatus array with its backendId, which does not seem like a good idea even if it happens to work at present.

    Hm. I don't see how that's going to be broken without major surgery in pgstat.c. The whole thing seems to rely on being able to index BackendStatusArray with MyBackendId?

Oh, you're right. pgstat_initialize() sets it up that way. And the way BackendIdGetProc() is used looks unsafe, too: the contents might no longer be valid by the time we read them. I suspect we should have a new accessor function that takes a backend ID and copies the xid and xmin to pointers provided by the client while holding the lock.

        I also note that the docs seem to need some copy-editing:

    It certainly needs to be documented as racy, but I don't see a big problem with being racy here. We assume in lots of places that writing/reading xids is atomic, and we don't even hold exclusive locks while writing... (And yes, that means that the xid and xmin don't necessarily belong to each other.) That said, encapsulating that racy access into an accessor function does sound like a good plan.

Yep, shouldn't be hard to do.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication
On Fri, Jan 31, 2014 at 8:02 AM, Christian Kruse christ...@2ndquadrant.com wrote:

    what do you think about the approach the attached patch implements? I'm not really sure if this is what you had in mind, especially if this is the right lock.

The attached patch seems not to be attached, but the right lock to use would be the same one BackendIdGetProc() is using. I'd add a new function BackendIdGetTransactionIds or something like that.

I also note that the docs seem to need some copy-editing:

    + <entry>The current <xref linkend="ddl-system-columns">xmin value.</xref></entry>

The link shouldn't include the period, and probably should also not include the word value. I would make only the word xmin part of the link.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] small typo in src/backend/access/transam/xlog.c
On Mon, Jul 22, 2013 at 07:32:20PM -0400, Tom Lane wrote:

    Andres Freund and...@2ndquadrant.com writes:

        On 2013-07-22 15:55:46 -0400, Robert Haas wrote:

            And why is that?

        The comment above tells: while the lower half is the XOR of tv_sec and tv_usec.

    Yeah, the code doesn't match the comment; this mistake seems to be aboriginal.

        I don't think it really matters. The bitwise OR has the tendency to collect too many set bits, but ... who cares?

    This is making the value less unique than intended, so I think it's worth doing something about. However, it's also worth noting that the intended behavior (as described by the comment) was designed to allow for the possibility that uint64 is really only 32 bits --- a possibility we stopped supporting several versions ago. So rather than just quickly s/|/^/, maybe we should step back and think about whether we want to change the sysid generation algorithm altogether. We could for instance keep the high half as tv_sec, while making the low half be something like (tv_usec << 12) | (getpid() & 0xfff). This would restore the intended ability to reverse-engineer the exact creation time from the sysidentifier, and also add a little more uniqueness by way of the creating process's PID. (Note tv_usec must fit in 20 bits.)

Can someone make a change here so we can close the issue?

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLESPACE ... MOVE ALL TO ...
On Thu, Jan 30, 2014 at 8:47 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Jan 21, 2014 at 1:33 AM, Simon Riggs si...@2ndquadrant.com wrote: On 20 January 2014 17:00, Stephen Frost sfr...@snowman.net wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: What if you're a superuser and you want to move everybody's objects (perhaps in preparation for dropping the tablespace)? I think there's value in both the ALL and OWNED forms. A superuser is considered to 'own' all objects and so 'ALL' and 'OWNED' above would be the same when issued by a superuser, in the current implementation. Looking at DROP OWNED and REASSIGN OWNED, they operate at the more specific level of OWNED == relowner rather than if the role is considered an 'owner' of the object through role membership, as you are implying above. As such, I'll rework this to be more in-line with the existing OWNED BY semantics of REASSIGN OWNED BY and DROP OWNED BY, which means we'd have: ALTER TABLESPACE name MOVE [ ALL | OWNED [ BY reluser ] ] [ TABLES | INDEXES | MATERIALIZED VIEWS ] TO name opt_nowait eg: ALTER TABLESPACE tblspc1 MOVE ALL TO tblspc2; ALTER TABLESPACE tblspc1 MOVE OWNED TO tblspc2; ALTER TABLESPACE tblspc1 MOVE OWNED BY myrole TO tblspc2; ALTER TABLESPACE tblspc1 MOVE TABLES OWNED BY myrole TO tblspc2; ALTER TABLESPACE tblspc1 MOVE ALL OWNED BY myrole TO tblspc2; Sounds great, thanks. We should add the tab-completion for ALTER TABLESPACE MOVE? Attached does that. Committed. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add force option to dropdb
Hello Robert,

    but the scripts are intended to be thin wrappers around the underlying database functionality, and I think this is straying too far from that core mission.

I think you have a good point here.

Regards

On Friday, January 31, 2014 4:47 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Jan 31, 2014 at 9:09 AM, salah jubeh s_ju...@yahoo.com wrote:

    $ createdb -U postgres hoge
    $ psql -d hoge -U postgres
    hoge=# create table test (col text);
    hoge=# insert into test select repeat(chr(code),1) from generate_series(1,10) code;

    Execute dropdb -k while the client is inserting many tuples into the database:

    $ dropdb -k hoge
    2014-01-29 23:10:49 JST FATAL: terminating connection due to administrator command
    2014-01-29 23:10:49 JST STATEMENT: insert into test select repeat(chr(code),1) from generate_series(1,200) code;
    2014-01-29 23:10:54 JST ERROR: database "hoge" is being accessed by other users
    2014-01-29 23:10:54 JST DETAIL: There is 1 other session using the database.
    2014-01-29 23:10:54 JST STATEMENT: DROP DATABASE hoge;
    2014-01-29 23:10:54 JST ERROR: syntax error at or near "hoge"
    2014-01-29 23:10:54 JST STATEMENT: UPDATE pg_database SET datconnlimit = e hoge is being accessed by other users WHERE datname= 'hoge';
    dropdb: database removal failed: ERROR: syntax error at or near "hoge"
    LINE 1: UPDATE pg_database SET datconnlimit = e hoge is being acce...
                                                    ^

    hoge=# \l
                     List of databases
       Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges
    -----------+----------+----------+---------+-------+-----------------------
     hoge      | postgres | UTF8     | C       | C     |
     postgres  | postgres | UTF8     | C       | C     |
     template0 | postgres | UTF8     | C       | C     | =c/postgres          +
               |          |          |         |       | postgres=CTc/postgres
     template1 | postgres | UTF8     | C       | C     | =c/postgres          +
               |          |          |         |       | postgres=CTc/postgres

    The hoge database is not dropped yet. Is this a bug, or not?

It is a bug, sorry for doubling your work. I have updated the patch.
In case it wasn't clear before, I think that a client-side hack like this has zero chance of being acceptable to the community, and we should mark this patch rejected. I'm not saying it couldn't be useful to someone, but the scripts are intended to be thin wrappers around the underlying database functionality, and I think this is straying too far from that core mission. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [bug fix] PITR corrupts the database cluster
On Wed, Jul 24, 2013 at 10:57:14AM -0400, Tom Lane wrote: I wrote: The only thing here that really bothers me is that a crash during DROP DATABASE/TABLESPACE could leave us with a partially populated db/ts that's still accessible through the system catalogs. ... I guess one thing we could do is create a flag file, say dead.dont.use, in the database's default-tablespace directory, and make new backends check for that before being willing to start up in that database; then make sure that removal of that file is the last step in DROP DATABASE. After a bit of experimentation, it seems there's a pre-existing way that we could do this: just remove PG_VERSION from the database's default directory as the first filesystem action in DROP DATABASE. If we crash before committing, subsequent attempts to connect to that database would fail like this: $ psql bogus psql: FATAL: base/176774 is not a valid data directory DETAIL: File base/176774/PG_VERSION is missing. which is probably already good enough, though maybe we could add a HINT suggesting that the DB was incompletely dropped. To ensure this is replayed properly on slave servers, I'd be inclined to mechanize it by (1) changing remove_dbtablespaces to ensure that the DB's default tablespace is the first one dropped, and (2) incorporating remove-PG_VERSION-first into rmtree(). Where are we on this? Is there a TODO here? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] updated emacs configuration
Bruce Momjian br...@momjian.us writes:

    OK. I have updated entab.c to support this new ability as -m. When should this be run against HEAD and the supported back branches? Probably when we run pgindent for 9.4.

Yeah. The whole point is to keep the branches in sync for patching, so we need to do them all at the same time.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup and pg_stat_tmp directory
On Fri, Jan 31, 2014 at 8:40 AM, Fujii Masao masao.fu...@gmail.com wrote: Yeah, I was thinking that, too. I'm not sure whether including log files in backup really increases the security risk, though. There are already very important data, i.e., database, in backups. Anyway, since the amount of log files can be very large and they are not essential for recovery, it's worth considering whether to exclude them. OTOH, I'm sure that some users prefer current behavior for some reasons. So I think that it's better to expose the pg_basebackup option specifying whether log files are included in backups or not. I don't really see how this can work reliably; pg_log isn't a hard-coded name, but rather a GUC that users can leave set to that value or set to any other value they choose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Failure while inserting parent tuple to B-tree is not fun
On 01/30/2014 12:46 AM, Peter Geoghegan wrote: On Mon, Jan 27, 2014 at 10:54 AM, Peter Geoghegan p...@heroku.com wrote: On Mon, Jan 27, 2014 at 10:27 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I think I see some bugs in _bt_moveright(). If you examine _bt_finish_split() in detail, you'll see that it doesn't just drop the write buffer lock that the caller will have provided (per its comments) - it also drops the buffer pin. It calls _bt_insert_parent() last, which was previously only called last thing by _bt_insertonpg() (some of the time), and _bt_insertonpg() is indeed documented to drop both the lock and the pin. And if you look at _bt_moveright() again, you'll see that there is a tacit assumption that the buffer pin isn't dropped, or at least that opaque (the BTPageOpaque BT special page area pointer) continues to point to the same page held in the same buffer, even though the code doesn't set the opaque' pointer again and it may not point to a pinned buffer or even the appropriate buffer. Ditto page. So opaque may point to the wrong thing on subsequent iterations - you don't do anything with the return value of _bt_getbuf(), unlike all of the existing call sites. I believe you should store its return value, and get the held page and the special pointer on the page, and assign that to the opaque pointer before the next iteration (an iteration in which you very probably really do make progress not limited to completing a page split, the occurrence of which the _bt_moveright() loop gets stuck on). You know, do what is done in the non-split-handling case. It may not always be the same buffer as before, obviously. Yep, fixed. Can you explain what the fix was, please? Ping? I would like to hear some details here. I refactored the loop in _bt_moveright to, well, not have that bug anymore. The 'page' and 'opaque' pointers are now fetched at the beginning of the loop. Did I miss something? 
- Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE FOREIGN TABLE ( ... LIKE ... )
On 01/25/2014 06:25 AM, David Fetter wrote: I like this patch, but I don't like its implementation at all. First of all, the documentation doesn't compile: openjade:ref/create_foreign_table.sgml:124:17:E: end tag for LISTITEM omitted, but OMITTAG NO was specified openjade:ref/create_foreign_table.sgml:119:4: start tag was here Fixed. I fixed that, and then noticed that like_option is not explained like it is in CREATE TABLE. Also fixed. Then I got down to the description of the LIKE clause in both pages, and I noticed the last line of CREATE TABLE, which is Inapplicable options (e.g., INCLUDING INDEXES from a view) are ignored.. This is inconsistent with the behavior of this patch to throw errors for inapplicable options. Fixed. Please find attached the next rev :) This version looks committable to me, so I am marking it as such. -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Insert result does not match record count
On Wed, Jul 24, 2013 at 08:08:32PM +0200, Andres Freund wrote:

    On 2013-07-24 13:48:23 -0400, Tom Lane wrote:

        Vik Fearing vik.fear...@dalibo.com writes:

            Also worth mentioning is bug #7766. http://www.postgresql.org/message-id/e1tlli5-0007tr...@wrigleys.postgresql.org

        Yeah, did you read that whole thread? The real issue here is going to be whether client-side code falls over on wider-than-32-bit counts. We can fix the backend and be pretty sure that we've found all the relevant places inside it, but we'll just be exporting the issue. I did look at libpq and noted that it doesn't seem to have any internal problem, because it returns the count to callers as a string (!). But what do you think are the odds that callers are using code that won't overflow? I'd bet on finding atoi() or suchlike in a lot of callers. Even if they thought to use strtoul(), unsigned long is not necessarily 64 bits wide.

    Application code that relies on the values already has problems, though, since the returned values are pretty bogus now. Including the fact that it can return 0 as the number of modified rows, which is checked for more frequently than the actual number IME... So I think client code that uses simplistic stuff like atoi isn't worse off afterwards, since the values will be about as bogus. I am more worried about code that does range checks, like Java's string conversion routines... I think fixing this for 9.4 is fine, but due to the compat issues I think it's too late for 9.3.

Where are we on this? There was a posted patch, attached, but Vik Fearing said it was insufficient and he was working on a new one: http://www.postgresql.org/message-id/51eff67a.7020...@dalibo.com

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god.
+

*** a/src/backend/commands/createas.c
--- b/src/backend/commands/createas.c
*** 172,178 **** ExecCreateTableAs(CreateTableAsStmt *stmt, const char *queryString,
      /* save the rowcount if we're given a completionTag to fill */
      if (completionTag)
          snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                  "SELECT %u", queryDesc->estate->es_processed);

      /* and clean up */
      ExecutorFinish(queryDesc);
--- 172,178 ----
      /* save the rowcount if we're given a completionTag to fill */
      if (completionTag)
          snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                  "SELECT " UINT64_FORMAT, queryDesc->estate->es_processed);

      /* and clean up */
      ExecutorFinish(queryDesc);

*** a/src/backend/tcop/pquery.c
--- b/src/backend/tcop/pquery.c
*** 195,201 **** ProcessQuery(PlannedStmt *plan,
      {
          case CMD_SELECT:
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "SELECT %u", queryDesc->estate->es_processed);
              break;
          case CMD_INSERT:
              if (queryDesc->estate->es_processed == 1)
--- 195,201 ----
      {
          case CMD_SELECT:
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "SELECT " UINT64_FORMAT, queryDesc->estate->es_processed);
              break;
          case CMD_INSERT:
              if (queryDesc->estate->es_processed == 1)
*** 203,217 **** ProcessQuery(PlannedStmt *plan,
              else
                  lastOid = InvalidOid;
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "INSERT %u %u", lastOid, queryDesc->estate->es_processed);
              break;
          case CMD_UPDATE:
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "UPDATE %u", queryDesc->estate->es_processed);
              break;
          case CMD_DELETE:
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "DELETE %u", queryDesc->estate->es_processed);
              break;
          default:
              strcpy(completionTag, "???");
--- 203,217 ----
              else
                  lastOid = InvalidOid;
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "INSERT %u " UINT64_FORMAT, lastOid, queryDesc->estate->es_processed);
              break;
          case CMD_UPDATE:
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "UPDATE " UINT64_FORMAT, queryDesc->estate->es_processed);
              break;
          case CMD_DELETE:
              snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
!                      "DELETE " UINT64_FORMAT, queryDesc->estate->es_processed);
              break;
          default:
              strcpy(completionTag, "???");

*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*** 375,381 **** typedef struct EState
      List       *es_rowMarks;    /* List of ExecRowMarks */

!     uint32      es_processed;   /* # of tuples processed */
      Oid         es_lastoid;     /* last oid processed (by INSERT) */
      int         es_top_eflags;  /* eflags passed to ExecutorStart */
--- 375,381 ----
      List       *es_rowMarks;    /* List of ExecRowMarks */

!     uint64      es_processed;   /* # of tuples processed */
      Oid         es_lastoid;     /* last oid processed (by INSERT) */
      int         es_top_eflags;  /* eflags passed to ExecutorStart */

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] updated emacs configuration
On Fri, Jan 31, 2014 at 11:57:28AM -0500, Tom Lane wrote:

    Bruce Momjian br...@momjian.us writes:

        OK. I have updated entab.c to support this new ability as -m. When should this be run against HEAD and the supported back branches? Probably when we run pgindent for 9.4.

    Yeah. The whole point is to keep the branches in sync for patching, so we need to do them all at the same time.

Right. Technically you could run entab -m on all the trees without running pgindent, but I see no reason for double-churn and breaking peoples' trees twice.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] install libpq.dll in bin directory on Windows / Cygwin
On Thu, Jul 25, 2013 at 04:53:45PM -0400, Andrew Dunstan wrote: Jeff Janes asked me about this, and Bruce just tripped up on it. Usually on Windows it's necessary to have libpq.dll/cygpq.dll either in the PATH or in the same directory as client .exe files. The buildfarm client has for many years simply copied this dll from the installation lib to the installation bin directory after running make install. But I can't really see why we don't do that as part of make install anyway. I haven't tested but I think something like this patch would achieve this goal - it would fix something that's tripped a lot of people up over the years. Comments? If we do this, should it be backported? Andrew, where are we on this? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgindent wishlist item
While Bruce is working on pgindent, let me register a small wishlist item. It would be quite useful to be able to supply extra typedefs on the command line to supplement a typedefs file downloaded from the buildfarm or constructed however. A concrete example: in the code I have been recently working on, there are typedefs for Jsonb and JsonbValue. If I run pgindent as normal on the new code these items are not treated properly. What I had to do was take a special copy of the typedefs list and add those two items. If we could pass a list of extra typedefs to supplement the typedefs file that would be very useful. Then I could do something like: pgindent --typedef Jsonb --typedef JsonbValue src/backend/utils/adt/jsonfuncs.c without having to mangle a typedefs file. This would make using pgindent nicer to use during development, since any significant development is just about guaranteed to have some new typedefs the buildfarm can't have. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Insert result does not match record count
On 01/31/2014 06:19 PM, Bruce Momjian wrote: On Wed, Jul 24, 2013 at 08:08:32PM +0200, Andres Freund wrote: On 2013-07-24 13:48:23 -0400, Tom Lane wrote: Vik Fearing vik.fear...@dalibo.com writes: Also worth mentioning is bug #7766. http://www.postgresql.org/message-id/e1tlli5-0007tr...@wrigleys.postgresql.org Yeah, did you read that whole thread? The real issue here is going to be whether client-side code falls over on wider-than-32-bit counts. We can fix the backend and be pretty sure that we've found all the relevant places inside it, but we'll just be exporting the issue. I did look at libpq and noted that it doesn't seem to have any internal problem, because it returns the count to callers as a string (!). But what do you think are the odds that callers are using code that won't overflow? I'd bet on finding atoi() or suchlike in a lot of callers. Even if they thought to use strtoul(), unsigned long is not necessarily 64 bits wide. Application code that relies on the values already has problems though since the returned values are pretty bogus now. Including the fact that it can return 0 as the number of modified rows which is checked for more frequently than the actual number IME... So I think client code that uses simplistic stuff like atoi isn't worse off afterwards since the values will be about as bogus. I am more worried about code that does range checks like java's string conversion routines... I think fixing this for 9.4 is fine, but due to the compat issues I think it's to late for 9.3. Where are we on this? There was a posted patch, attached, but Vik Fearing said it was insufficent and he was working on a new one: http://www.postgresql.org/message-id/51eff67a.7020...@dalibo.com Unfortunately, I gave up on it as being over my head when I noticed I was changing the protocol itself. I should have notified the list so someone else could have taken over. 
-- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prohibit row-security + inheritance in 9.4?
* Yeb Havinga (yebhavi...@gmail.com) wrote: This reasoning could go either way. GRANT is on a complete set of rows. This is a restriction on the level of individual rows, and in that sense, it is more like a row-level CHECK constraint. Well, we certainly don't force CHECK constraints on children to apply to the parent. If you've got a trigger which is inserting into the child, that's a different story. If we wanted to make them the same then we'd throw out the ability to do any kind of changes or actions on the child and then we'd have actual partitioning. We don't have that though, we have inheritance. I fail to understand this, probably because I do not have a partition use case for inheritance, but rather an information model that is more ontology-like. The further down the inheritance tree the children get, the more columns they get, and their requirements might differ completely in nature from their siblings, and make no sense at all when specified at the level of the parent (or even be impossible, since the parent does not have all the columns). My point here is that you're making an argument for reducing what can be different between a parent and a child relation if you force permissions, etc. Then during expansion of the range table, no code is needed to ignore child rls quals and copy parent rels to child rels. This is what's already implemented and isn't a huge amount of code to begin with, so I don't see this as being an argument against having the flexibility. It would seem to me that any additional logic that can be avoided during planning is a good thing. Also, the argument that something is already implemented does not itself make it good to commit. We're already avoiding additional logic by *not* considering the child relation's quals when it's queried by the parent. What do you mean by 'having the flexibility' and why is that good? The flexibility to allow a user to control access to the child independently from the access for the parent. 
Also, the security policy applied would be invariant to the route through which the rows were accessed: You could also get this by simply only allowing access to the parent and not granting any privileges on the children. That might work for partitioning, but not for use cases where children have more columns than parents. You could still put whatever quals you want on the parent and the child independently to provide whatever security you want. The true inheritance case makes that more clear, imv, not less- the quals that you want may not make any sense when applied to the parent as the parent has fewer columns (perhaps it doesn't have the 'sensitive' columns, for example). - directly to the child row: child rls quals and parent quals (by propagation at DDL) are applied. - through the parent: child rls quals and parent quals applied. If you want them to be the same then you can implement this yourself without having PG force it on you. I suggested it as a solution for a requirement worded upthread as It makes absolutely zero sense, in my head anyway, to have rows returned when querying the parent which should NOT be returned based on the quals of the parent. without disregarding rls-quals on children. The point was to disregard the rls-quals on the children. There are ways we could make what you're suggesting work but trying to do it with DDL hacks wouldn't be the right answer anyway- consider what happens when you're setting up different quals on different children or when someone goes in and removes the quals on the parent directly. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pgindent wishlist item
Andrew Dunstan and...@dunslane.net writes: While Bruce is working on pgindent, let me register a small wishlist item. It would be quite useful to be able to supply extra typedefs on the command line to supplement a typedefs file downloaded from the buildfarm or constructed however. A concrete example: in the code I have been recently working on, there are typedefs for Jsonb and JsonbValue. If I run pgindent as normal on the new code these items are not treated properly. What I had to do was take a special copy of the typedefs list and add those two items. If we could pass a list of extra typedefs to supplement the typedefs file that would be very useful. Then I could do something like: pgindent --typedef Jsonb --typedef JsonbValue src/backend/utils/adt/jsonfuncs.c without having to mangle a typedefs file. Personally, I always just edit the downloaded file to add whatever typedefs the patch I'm working on adds. I wouldn't use a command line switch even if there was one, because then I'd have to remember which typedef names to add each time I run pgindent. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] install libpq.dll in bin directory on Windows / Cygwin
On 01/31/2014 12:25 PM, Bruce Momjian wrote: On Thu, Jul 25, 2013 at 04:53:45PM -0400, Andrew Dunstan wrote: Jeff Janes asked me about this, and Bruce just tripped up on it. Usually on Windows it's necessary to have libpq.dll/cygpq.dll either in the PATH or in the same directory as client .exe files. The buildfarm client has for many years simply copied this dll from the installation lib to the installation bin directory after running make install. But I can't really see why we don't do that as part of make install anyway. I haven't tested but I think something like this patch would achieve this goal - it would fix something that's tripped a lot of people up over the years. Comments? If we do this, should it be backported? Andrew, where are we on this? Hmm, looks like it got dropped. I think my original patch is probably about the right thing to do, and given that it's already done by the MSVC build it's a bug and should be backported, as Alvaro said in the original discussion. I'll get on that shortly - since the patch was untested I'll need to do a test first. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLESPACE ... MOVE ALL TO ...
* Fujii Masao (masao.fu...@gmail.com) wrote: We should add the tab-completion for ALTER TABLESPACE MOVE? Attached does that. Committed. Thanks! I had planned to get to it, but appreciate your handling of it. Stephen signature.asc Description: Digital signature
[HACKERS] bgworker crashed or not?
In 9.3 I noticed that the postmaster considers a bgworker crashed (and therefore tries to restart it) even if it has exited with a zero status code. I first thought about a patch like the one below, but then noticed that the postmaster.c:bgworker_quickdie() signal handler exits with 0 too (when there's no success). Do we need my patch, my patch + something for the handler, or no patch at all? // Antonin Houska (Tony) diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 0957e91..0313fd7 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -2791,11 +2814,7 @@ reaper(SIGNAL_ARGS) /* Was it one of our background workers? */ if (CleanupBackgroundWorker(pid, exitstatus)) - { - /* have it be restarted */ - HaveCrashedWorker = true; continue; - } /* * Else do standard backend child cleanup. @@ -2851,7 +2870,10 @@ CleanupBackgroundWorker(int pid, /* Delay restarting any bgworker that exits with a nonzero status. */ if (!EXIT_STATUS_0(exitstatus)) + { rw->rw_crashed_at = GetCurrentTimestamp(); + HaveCrashedWorker = true; + } else rw->rw_crashed_at = 0; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On Tue, Jan 28, 2014 at 6:39 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 28/01/14, Christian Kruse wrote: I have checked the revised patch. It looks fine to me except one minor code formatting issue. In elog.c, two tabs are missing in the definition of function errdetail_log_plural. Please run pgindent tool to check the same. I did, but this reformats various other locations in the file, too. Nevertheless I now ran pg_indent against it and removed the other parts. Attached you will find the corrected patch version. Also I would like to highlight one behavior here is that the process ID of the process trying to acquire the lock is also listed in the Request queue. E.g. session 1 with process id X: BEGIN; LOCK TABLE foo IN SHARE MODE; session 2 with process id Y: BEGIN; LOCK TABLE foo IN EXCLUSIVE MODE; On execution of LOCK in session-2, as part of the log it will display: DETAIL: Process holding the lock: X. Request queue: Y. Where Y is the process ID of the same process, which was trying to acquire the lock. This is on purpose due to the rewording of the message. In the first version the PID of the backend was missing. Thanks for the review! Now patch looks fine to me. I am marking this as Ready for Committer. When I tested the patch, I got some strange behavior. 1. I executed the SELECT FOR UPDATE query and kept waiting for a while in session 1. [session 1] PID=33660 BEGIN; SELECT * FROM t WHERE i = 1 FOR UPDATE; (Wait) 2. I executed the SELECT FOR UPDATE query again in session 2. [session 2] PID=33662 BEGIN; SELECT * FROM t WHERE i = 1 FOR UPDATE; (Waiting for the lock) 3. I got the following log message. This is OK. LOG: process 33662 still waiting for ShareLock on transaction 1011 after 1000.184 ms DETAIL: Process holding the lock: 33660. Request queue: 33662. 4. I executed the UPDATE query in session 3. [session 3] PID=33665 UPDATE t SET j = j + 1 WHERE i = 1; (Waiting for the lock) 5. Then, I got the following log message. 
This looks strange and confusing to me. LOG: process 33665 still waiting for ExclusiveLock on tuple (0,4) of relation 16384 of database 12310 after 1000.134 ms DETAIL: Process holding the lock: 33662. Request queue: 33665 This log message says that the process 33662 is holding the lock, but it's not true. The process holding the lock is 33660. The process 33662 should be in the request queue. Is this the intentional behavior? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
* Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: I agree that using the FDW-specific options is the right approach and disallowing those to be set on foreign tables makes sense. I don't particularly like the idea of applying changes during inheritance which we wouldn't allow the user to do directly. While it might be a no-op and no FDW might use it, it'd still be confusing. If we don't allow it directly, why not? Seems rather arbitrary. I'm apparently missing something here. From my perspective, they're either allowed or they're not and if they aren't allowed then they shouldn't be allowed to happen. I'd view this like a CHECK constraint- if it's a foreign table then it can't have a value for SET STORAGE. I don't see why it would be 'ok' to allow it to be set if we're setting it as part of inheritance but otherwise not allow it to be set. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pgindent wishlist item
On 2014-01-31 12:29:52 -0500, Andrew Dunstan wrote: While Bruce is working on pgindent If it's Christmas, let me wish for a not completely broken formatting of function typedefs. E.g. typedef ForeignScan *(*GetForeignPlan_function) (PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid, ForeignPath *best_path, List *tlist, List *scan_clauses); is ridiculous. It can't be convinced to add a newline at any helpful place afaik. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Regarding google summer of code
On 01/30/2014 07:23 PM, Anirudh wrote: Hello everyone, My name is Anirudh Subramanian and I am a graduate student in Computer Science. I would like to participate in Google Summer of Code and would like to contribute to postgresql. I am not familiar with the postgresql codebase yet but will surely get started in the near future. Right now as part of my coursework I am building [...] Is there any list of project ideas that have been decided for Google Summer of Code 2014 that I can look at? Not yet! However, there's a lot of reading to do ... you could start looking at the code now. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: libpq: Support TLS versions beyond TLSv1.
On Sat, Jan 25, 2014 at 12:25:30PM -0500, Tom Lane wrote: Alternatively, given that TLS has been around for a dozen years and openssl versions that old have not gotten security updates for a long time, why don't we just reject SSLv3 on the backend side too? I guess it's barely possible that somebody out there is using a non-libpq-based client that uses a non-TLS-capable SSL library, but surely anybody like that is overdue to move into the 21st century. An SSL library that old is probably riddled with security issues. Attached patch disables SSLv3 in backend. TLS is supported in OpenSSL since fork from SSLeay, in Java since 1.4.2, in Windows since XP. It's hard to imagine this causing any compatibility problems. -- marko diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index 43633e7..fc749f4 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -880,9 +880,9 @@ initialize_SSL(void) SSLerrmessage(; } - /* set up ephemeral DH keys, and disallow SSL v2 while at it */ + /* set up ephemeral DH keys, and disallow SSL v2/v3 while at it */ SSL_CTX_set_tmp_dh_callback(SSL_context, tmp_dh_cb); - SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_DH_USE | SSL_OP_NO_SSLv2); + SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_DH_USE | SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3); /* set up ephemeral ECDH keys */ initialize_ecdh(); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prohibit row-security + inheritance in 9.4?
On 2014-01-31 15:10, Stephen Frost wrote: * Craig Ringer (cr...@2ndquadrant.com) wrote: On 01/31/2014 09:01 AM, Stephen Frost wrote: The only case prevented is one where access to the child via the parent shows rows that the parent's row-security qual would hide, because the child's qual doesn't. It makes absolutely zero sense, in my head anyway, to have rows returned when querying the parent which should NOT be returned based on the quals of the parent. IMHO, there is another way to implement this, other than the procedure to override the child-rel-quals with the ones from the parent. At DDL time, synchronize quals on the parent with rls quals of the children. Isn't this also what happens with constraints? Then during expansion of the range table, no code is needed to ignore child rls quals and copy parent rels to child rels. Also, the security policy applied would be invariant to the route through which the rows were accessed: - directly to the child row: child rls quals and parent quals (by propagation at DDL) are applied. - through the parent: child rls quals and parent quals applied. regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgindent wishlist item
On Fri, Jan 31, 2014 at 12:44:22PM -0500, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: While Bruce is working on pgindent, let me register a small wishlist item. It would be quite useful to be able to supply extra typedefs on the command line to supplement a typedefs file downloaded from the buildfarm or constructed however. A concrete example: in the code I have been recently working on, there are typedefs for Jsonb and JsonbValue. If I run pgindent as normal on the new code these items are not treated properly. What I had to do was take a special copy of the typedefs list and add those two items. If we could pass a list of extra typedefs to supplement the typedefs file that would be very useful. Then I could do something like: pgindent --typedef Jsonb --typedef JsonbValue src/backend/utils/adt/jsonfuncs.c without having to mangle a typedefs file. Personally, I always just edit the downloaded file to add whatever typedefs the patch I'm working on adds. I wouldn't use a command line switch even if there was one, because then I'd have to remember which typedef names to add each time I run pgindent. Easily added, so done with the attached, applied patch. You use it like this: pgindent --list-of-typedefs 'abc def' -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent new file mode 100755 index 8e45b18..2de7a53 *** a/src/tools/pgindent/pgindent --- b/src/tools/pgindent/pgindent *** my $indent_opts = *** 22,31 # indent-dependant settings my $extra_opts = ""; ! my ($typedefs_file, $code_base, $excludes, $indent, $build); my %options = ( "typedefs=s" => \$typedefs_file, "code-base=s" => \$code_base, "excludes=s" => \$excludes, "indent=s" => \$indent, --- 22,32 # indent-dependant settings my $extra_opts = ""; ! 
my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build); my %options = ( "typedefs=s" => \$typedefs_file, + "list-of-typedefs=s" => \$typedef_str, "code-base=s" => \$code_base, "excludes=s" => \$excludes, "indent=s" => \$indent, *** sub load_typedefs *** 125,130 --- 126,138 || die "cannot open typedefs file \"$typedefs_file\": $!\n"; my @typedefs = <$typedefs_fh>; close($typedefs_fh); + if (defined($typedef_str)) + { + foreach my $typedef (split(m/[, \t\n]+/, $typedef_str)) + { + push(@typedefs, $typedef . "\n"); + } + } # remove certain entries @typedefs = -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgindent wishlist item
On Fri, Jan 31, 2014 at 07:15:05PM +0100, Andres Freund wrote: On 2014-01-31 12:29:52 -0500, Andrew Dunstan wrote: While Bruce is working on pgindent If it's Christmas, let me wish for a not completely broken formatting of function typedefs. E.g. typedef ForeignScan *(*GetForeignPlan_function) (PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid, ForeignPath *best_path, List *tlist, List *scan_clauses); is ridiculous. It can't be convinced to add a newline at any helpful place afaik. Uh, not sure how to help here. :-( -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prefix compression of B-Tree keys
On Fri, Jan 31, 2014 at 3:51 AM, Peter Geoghegan p...@heroku.com wrote: Now, I haven't checked if it's already done. Sorry if it is. I did muck around in the btree code a lot and don't remember any of this, but I do remember stuff that could be used to achieve it (specifically, all the index-only-scan machinery, which allows for implicit info). I think it would make sense to do it on leaf pages, where there is frequently a lot of redundancy. The reason that I myself wouldn't do it first is that offhand I think that it'd be harder to come up with a very generic infrastructure to make it work across diverse types, and it isn't that useful where there isn't much redundancy in prefixes. The B-Tree code doesn't really know anything about the type indexed, and that's a useful property. But it's still definitely a useful goal. Well, for multi-column it should be really simple. You already have all the necessary information to decide on the prefix (the common prefix columns of leftmost and rightmost keys). It's harder for text columns, since you must abstract out the common prefix check, append prefix, and remove prefix operations. That would probably require extending opclasses. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Insert result does not match record count
On Fri, Jan 31, 2014 at 06:34:27PM +0100, Vik Fearing wrote: Application code that relies on the values already has problems though since the returned values are pretty bogus now. Including the fact that it can return 0 as the number of modified rows which is checked for more frequently than the actual number IME... So I think client code that uses simplistic stuff like atoi isn't worse off afterwards since the values will be about as bogus. I am more worried about code that does range checks like java's string conversion routines... I think fixing this for 9.4 is fine, but due to the compat issues I think it's too late for 9.3. Where are we on this? There was a posted patch, attached, but Vik Fearing said it was insufficient and he was working on a new one: http://www.postgresql.org/message-id/51eff67a.7020...@dalibo.com Unfortunately, I gave up on it as being over my head when I noticed I was changing the protocol itself. I should have notified the list so someone else could have taken over. OK, so that brings up a good question. Can we change the protocol for this without causing major breakage? Tom seems to indicate that it can be done for 9.4, but I thought protocol breakage was a major issue. Are we really changing the wire protocol here, or just the type of string we can pass back to the interface? I know the libpq API we give to clients is a string so it is OK. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Fri, Jan 31, 2014 at 5:07 AM, Mitsumasa KONDO kondo.mitsum...@gmail.com wrote: And past results show that your patch's weakest point is that the cost of deleting the oldest statement and inserting a new statement is very high, as you know. No, there is no reason to imagine that entry_dealloc() is any slower, really. There will perhaps be some additional contention with shared lockers, but that isn't likely to be a major aspect. When the hash table is full, in reality at that point it's very unlikely that there will be two simultaneous sessions that need to create a new entry. As I said, on many of the systems I work with it takes weeks to see a new entry. This will be especially true now that the *.max default is 5,000. It accelerates the update (delete and insert) cost in the pg_stat_statements table. So you proposed a new default max value of 10k. But that is not an essential solution, because the old pg_stat_statements also performed well. I was merely pointing out that that would totally change the outcome of your very artificial test-case. Decreasing the number of statements to 5,000 would too. I don't think we should assign much weight to any test case where the large majority or all statistics are wrong afterwards, due to there being so much churn. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
Greg Stark st...@mit.edu writes: So just to summarize, this xlog record: [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] insert_leaf: s/d/r:1663/16385/1261982 tid 3634978/282 [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072 Appears to have been written to [ block 7141472 ] I've been staring at the code for a bit trying to guess how that could have happened. Since the WAL record has a backup block, btree_xlog_insert would have passed control to RestoreBackupBlock, which would call XLogReadBufferExtended with mode RBM_ZERO, so there would be no complaint about writing past the end of the relation. Now, you can imagine some very low-level error causing a write to go to the wrong page due to a seek problem or some such, but it's hard to credit that that would've resulted in creation of all the intervening segment files. Some level of our code had to have thought it was being told to extend the relation. However, on closer inspection I was a bit surprised to realize that there are two possible candidates for doing that! XLogReadBufferExtended will extend the relation, a block at a time, if told to write a page past the current nominal EOF. And in md.c, _mdfd_getseg will *also* extend the relation if we're InRecovery, even though it normally would not do so when called from mdwrite(). Given the behavior in XLogReadBufferExtended, I rather think that the InRecovery special case in _mdfd_getseg is dead code and should be removed. But for the purpose at hand, it's more interesting to try to confirm which of these code levels did the extension. I notice that _mdfd_getseg only bothers to write the last physical page of each segment, whereas XLogReadBufferExtended knows nothing of segments and will ploddingly write every page. 
So on a filesystem that supports holes in files, I'd expect that the added segments would be fully allocated if XLogReadBufferExtended did the deed, but they'd be quite small if _mdfd_getseg did so. The du results you started with suggest that the former is the case, but could you verify that the filesystem this is on supports holes and that du will report only the actually allocated space when there's a hole? Assuming that the extension was done in XLogReadBufferExtended, we are forced to the conclusion that XLogReadBufferExtended was passed a bad block number (viz 7141472); and it's pretty hard to see how that could happen. RestoreBackupBlock is just passing the value it got out of the WAL record. I thought about the idea that it was wrong about exactly where the BkpBlock struct was in the record, but that would presumably lead to garbage relnode and fork numbers not just a bad block number. So I'm still baffled ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [GENERAL] postgres FDW cost estimation options unrecognized in 9.3-beta1
On Fri, Jul 26, 2013 at 06:28:05PM -0400, Tom Lane wrote: Our documentation appears not to disclose this fine point, but a look at the SQL-MED standard says it's operating per spec. The standard also says that ADD is an error if the option is already defined, which is a bit more defensible, but still not exactly what I'd call user-friendly. And the error we issue for that case is pretty misleading too: regression=# ALTER SERVER cuda_db10 OPTIONS (use_remote_estimate 'true') ; ALTER SERVER regression=# ALTER SERVER cuda_db10 OPTIONS (use_remote_estimate 'false') ; ERROR: option use_remote_estimate provided more than once I think we could do with both more documentation, and better error messages for these cases. In the SET-where-you-should-use-ADD case, perhaps ERROR: option use_remote_estimate has not been set HINT: Use ADD not SET to define an option that wasn't already set. In the ADD-where-you-should-use-SET case, perhaps ERROR: option use_remote_estimate is already set HINT: Use SET not ADD to change an option's value. The provided more than once wording would be appropriate if the same option is specified more than once in the command text, but I'm not sure that it's worth the trouble to detect that case. Where are we on this? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Jan 31, 2014 at 9:26 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/31/2014 09:53 AM, Merlin Moncure wrote: On Fri, Jan 31, 2014 at 8:45 AM, Andrew Dunstan and...@dunslane.net wrote: On 01/31/2014 08:57 AM, Merlin Moncure wrote: On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov obartu...@gmail.com wrote: Hmm, neither me, nor Teodor have experience and knowledge with populate_record() and moreover hstore here is virgin and we don't know the right behaviour, so I think we better take it from jsonb, once Andrew realize it. Andrew ? Andrew Gierth wrote the current implementation of hstore populate_record IIRC. Unfortunately the plan for jsonb was to borrow hstore's (I don't think hstore can use the jsonb implementation because you'd be taking away the ability to handle internally nested structures it currently has). Of my two complaints upthread, the second one, not being able to populate from an internally well-formed structure, is by far the more serious one I think. Umm, I think at least one of us is seriously confused. I am going to look at dealing with these issues in a way that can be used by both - at least the populate_record case. As far as populate_record goes, there is a bit of an impedance mismatch, since json/hstore records are heterogenous and one-dimensional, whereas sql arrays are homogeneous and multidimensional. Right now I am thinking I will deal with arrays up to two dimensions, because I can do that relatively simply, and after that throw in the towel. That will surely deal with 99.9% of use cases. Of course this would be documented. Anyway, Let me see what I can do. If Andrew Gierth wants to have a look at fixing the hstore() side that might help speed things up. (ah, you beat me to it.) Disregard my statements above. It works. 
postgres=# select jsonb_populate_record(null::x, hstore(row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb); jsonb_populate_record - (1,{(1,\\{(1,{1,2})}\\)}) Actually, there is a workaround to the limitations of hstore(record): yeah I'm ok with hstore() function as it is. That also eliminates backwards compatibility concerns so things worked out. The only 'must fix' 9.4 facing issue I see on the table is to make sure the jsonb populate function is forward compatible with future expectations of behavior, which to me means zeroing in on the necessity of the as_text argument (but if you can expand coverage without jeopardizing 9.4 inclusion then great...). For my part I'm going to continue functionally testing the rest of the API (so far, a cursory look hasn't turned up anything else). I'm also signing up for some documentation refinements which will be done after you nail down these little bits but before the end of the 'fest. IMNSHO, formal code review needs to begin ASAP (salahaldin is the reviewer per the fest wiki) merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Recovery inconsistencies, standby much larger than primary
One thing I keep coming back to is a bad RAM chip setting a bit in the block number. But I just can't seem to get it to add up: the difference is not a power of two, it has happened on two different machines, and we don't see other weirdness on the machines. It seems like a strange coincidence that it would happen to the same variable twice and not to other variables. Unless there's some unrelated code writing through a wild pointer, possibly to a stack-allocated object that just happens to often be that variable? -- greg On 31 Jan 2014 20:21, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark st...@mit.edu writes: So just to summarize, this xlog record: [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] insert_leaf: s/d/r:1663/16385/1261982 tid 3634978/282 [cur:EA1/637140, xid:1418089147, rmid:11(Btree), len/tot_len:18/6194, info:8, prev:EA1/635290] bkpblock[1]: s/d/r:1663/16385/1261982 blk:3634978 hole_off/len:1240/2072 Appears to have been written to [ block 7141472 ] I've been staring at the code for a bit trying to guess how that could have happened. Since the WAL record has a backup block, btree_xlog_insert would have passed control to RestoreBackupBlock, which would call XLogReadBufferExtended with mode RBM_ZERO, so there would be no complaint about writing past the end of the relation. Now, you can imagine some very low-level error causing a write to go to the wrong page due to a seek problem or some such, but it's hard to credit that that would've resulted in creation of all the intervening segment files. Some level of our code had to have thought it was being told to extend the relation. However, on closer inspection I was a bit surprised to realize that there are two possible candidates for doing that! XLogReadBufferExtended will extend the relation, a block at a time, if told to write a page past the current nominal EOF.
And in md.c, _mdfd_getseg will *also* extend the relation if we're InRecovery, even though it normally would not do so when called from mdwrite(). Given the behavior in XLogReadBufferExtended, I rather think that the InRecovery special case in _mdfd_getseg is dead code and should be removed. But for the purpose at hand, it's more interesting to try to confirm which of these code levels did the extension. I notice that _mdfd_getseg only bothers to write the last physical page of each segment, whereas XLogReadBufferExtended knows nothing of segments and will ploddingly write every page. So on a filesystem that supports holes in files, I'd expect that the added segments would be fully allocated if XLogReadBufferExtended did the deed, but they'd be quite small if _mdfd_getseg did so. The du results you started with suggest that the former is the case, but could you verify that the filesystem this is on supports holes and that du will report only the actually allocated space when there's a hole? Assuming that the extension was done in XLogReadBufferExtended, we are forced to the conclusion that XLogReadBufferExtended was passed a bad block number (viz 7141472); and it's pretty hard to see how that could happen. RestoreBackupBlock is just passing the value it got out of the WAL record. I thought about the idea that it was wrong about exactly where the BkpBlock struct was in the record, but that would presumably lead to garbage relnode and fork numbers not just a bad block number. So I'm still baffled ... regards, tom lane
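Tom's request — verify that the filesystem supports holes and that du reports only the allocated space — can be checked directly. The following is a minimal sketch (mine, not from the thread), assuming a POSIX system where `st_blocks` is reported in 512-byte units, as on Linux: it writes a single real byte one gibibyte into a fresh file and compares the apparent size (what `ls -l` shows) with the allocated size (what `du` counts).

```python
import os
import tempfile

# Create a file consisting almost entirely of a hole: seek far past the
# start, then write one real byte at the end.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.seek(1024 * 1024 * 1024 - 1)  # 1 GiB minus one byte of hole
    f.write(b"\0")                   # a single allocated byte

st = os.stat(path)
apparent = st.st_size            # what ls -l reports: 1 GiB
allocated = st.st_blocks * 512   # what du counts: a few KB if holes work
print(f"apparent={apparent} allocated={allocated}")
os.remove(path)
```

On a hole-supporting filesystem the allocated size comes out a few kilobytes, so fully-allocated added segments in the du output would indeed point at XLogReadBufferExtended's page-at-a-time extension rather than _mdfd_getseg's write of only the last page.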
Re: [HACKERS] Regarding google summer of code
Thank you for your replies. I will get started. Cheers, Anirudh On Fri, Jan 31, 2014 at 1:17 PM, Josh Berkus j...@agliodbs.com wrote: On 01/30/2014 07:23 PM, Anirudh wrote: Hello everyone, My name is Anirudh Subramanian and I am a graduate student in Computer Science. I would like to participate in Google Summer of Code and would like to contribute to PostgreSQL. I am not familiar with the PostgreSQL codebase yet but will surely get started in the near future. Right now as part of my coursework I am building Is there any list of project ideas that have been decided for Google Summer of Code 2014 that I can look at? Not yet! However, there's a lot of reading to do ... you could start looking at the code now. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Peter Geoghegan p...@heroku.com writes: On Fri, Jan 31, 2014 at 5:07 AM, Mitsumasa KONDO kondo.mitsum...@gmail.com wrote: It increases the update (delete and insert) cost in the pg_stat_statements table. So you proposed a new default max value of 10k. But that is not an essential solution, because the old pg_stat_statements also performs well. I was merely pointing out that that would totally change the outcome of your very artificial test case. Decreasing the number of statements to 5,000 would too. I don't think we should assign much weight to any test case where the large majority or all of the statistics are wrong afterwards, due to there being so much churn. To expand a bit: (1) It's really not very interesting to consider pg_stat_statements' behavior when the table size is too small to avoid 100% turnover; that's not a regime where the module will provide useful functionality. (2) It's completely unfair to pretend that a given hashtable size has the same cost in old and new code; there's more than a 7X difference in the shared-memory space eaten per hashtable entry. That does have an impact on whether people can set the table size large enough to avoid churn. The theory behind this patch is that for the same shared-memory consumption you can make the table size enough larger to move the cache hit rate up significantly, and that that will more than compensate performance-wise for the increased time needed to make a new table entry. Now that theory is capable of being disproven, but an artificial example with a flat query frequency distribution isn't an interesting test case. We don't care about optimizing performance for such cases, because they have nothing to do with real-world usage. regards, tom lane
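Tom's point about flat versus skewed query distributions can be illustrated with a toy simulation. This is a rough sketch of my own, not pg_stat_statements' actual eviction logic (which discards a batch of the least-used entries with usage-count decay); it evicts a single least-used entry when the table is full, and compares the hit rate for a uniform workload against a Zipf-like one at the same table size. The `hit_rate` function and both distributions are illustrative assumptions.

```python
import random

def hit_rate(queries, table_size):
    """Fraction of statements found in a fixed-size table that evicts
    the least-used entry when full (a crude stand-in for the module's
    real dealloc policy)."""
    table = {}  # statement id -> usage count
    hits = 0
    for q in queries:
        if q in table:
            table[q] += 1
            hits += 1
        else:
            if len(table) >= table_size:
                victim = min(table, key=table.get)  # evict least-used
                del table[victim]
            table[q] = 1
    return hits / len(queries)

random.seed(1)
n_queries, n_distinct, table_size = 5000, 2000, 500
# Flat distribution: every statement equally likely, so constant churn.
flat = [random.randrange(n_distinct) for _ in range(n_queries)]
# Skewed, Zipf-like distribution: a few hot statements dominate,
# which is much closer to real-world workloads.
skewed = [int(random.paretovariate(1.2)) % n_distinct for _ in range(n_queries)]
print(f"flat: {hit_rate(flat, table_size):.2f}  "
      f"skewed: {hit_rate(skewed, table_size):.2f}")
```

Under the skewed workload the hit rate is far higher at the same table size, which is why a benchmark with a flat distribution says little about whether a larger table (bought with the per-entry memory savings) pays for the higher cost of creating new entries.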
Re: [HACKERS] [COMMITTERS] pgsql: libpq: Support TLS versions beyond TLSv1.
Marko Kreen mark...@gmail.com writes: On Sat, Jan 25, 2014 at 12:25:30PM -0500, Tom Lane wrote: Alternatively, given that TLS has been around for a dozen years and OpenSSL versions that old have not gotten security updates for a long time, why don't we just reject SSLv3 on the backend side too? Attached patch disables SSLv3 in the backend. TLS has been supported in OpenSSL since the fork from SSLeay, in Java since 1.4.2, and in Windows since XP. It's hard to imagine this causing any compatibility problems. I didn't hear anyone objecting to this idea, so I'll go ahead and commit this in HEAD. regards, tom lane