Re: [HACKERS] BRIN range operator class
Looking at patch 04, it seems to me that it would be better to have the OpcInfo struct carry the typecache struct rather than the type OID, so that we can avoid repeated typecache lookups in brin_deform_tuple; Here's the patch. Looks better to me. I will incorporate with this patch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On Tue, May 5, 2015 at 10:39 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, May 5, 2015 at 8:05 AM, Volker Aßmann volker.assm...@gmail.com wrote: Changing the password to something simple is immediately obvious as a security flaw for most people who may come across database configurations, but for the TRUST mode you actually need to know some background on why this is dangerous and when. I frankly find that a bit difficult to swallow. You think that everyone knows that bad passwords are a problem, but some people might not realize that an authentication method called trust might not be secure? I suppose that's possible, but I really think that if you install an *authentication method* whose name means *just trust the other guy to be telling the truth* without thinking about the consequences of that, it's hard to have a lot of sympathy for whatever happens afterwards. Well trust actually does not sound that dangerous in case you only take a quick glance at the documentation - trust PostgreSQL to do the right thing? - so this at least requires to actively read about this, while any person working in IT should have at least a rough understanding that weak passwords are bad... And trust is IMHO dangerous because even people who might know about the dangers may choose to temporarily turn it on (let me just finish this update and I will worry about the authentication settings later) and then forget to disable it again - aka it might actually be useful... Besides, your patch doesn't just disable trust. It also disables ident authentication, which in some network configurations could conceivably be more secure than password authentication. When applied to the local machine, ident actually means peer, which has an *excellent* chance of being more secure than password authentication. For that matter, even trust might be better than password. Anybody who can sniff the network traffic can read the password right off of the wire. So either way, your PostgreSQL server is gonna get hacked, but if you use password authentication, you might reveal a password that is also used to protect access to something else that used to be secure. You are right, ident is not the same ättack vector as trust, you most likely won't activate this by accident, but I think it is still a dangerous mode that would be reasonable to deactivate if it's not needed. Personally, if I were going to start disabling authentication methods at compile time, I'd start with password and md5. If you are not using SSL, and you use password or md5 authentication, you're basically saying, well, I'm OK with somebody reading all of the data that I'm sending and receiving over the wire, and I'm willing to take the risk that my passwords are easily crackable or can be read straight off the wire using wireshark, but to send your own queries you will have to make at least some minimal effort. If you need real security, that is not nearly good enough. If you don't need real security, why bother making people hassle with a password that's not providing any real protection? There are some valid answers to that question - e.g. if you are on a corporate WAN, you probably can't fire somebody for blundering into an unprotected resource, but if somebody goes to the trouble of cracking the password, even if it's weak, then you can probably nail them. For most users, though, I think password and md5 authentication serve mostly to give people the illusion that they've secured the server. The real security, if there is any, comes primarily from restricting incoming connections via listen_addresses and/or operating system mechanisms such as iptables and/or pg_hba.conf, and from requiring the use of SSL. Passwords are weak sauce. Yes, passwords can be as bad as trust authentication or basically any other method done implemented insecurely, so from my point of view the best solution would be to be able to selectively enable/disable all authentication methods to customize the package for specific environments. Trust is in my point of view just the most immediately obvious shoot yourself in the foot option and in my use case the thing that users are actually bound to try and get wrong. Please note that the patch does nothing by default, it just adds the option to disable trust/ident but leaves them on in the standard configuration. I do not want to disable trust by default for everyone, but it would be great to have the option to do this without having to patch (and thus test and verify) the PostgreSQL sources for each new release. I think this is a sufficiently general requirement to warrant including an option to disable this, as most hardening guides I have seen for PostgreSQL unconditionally require to disable trust authentication and disabling it in the code removes the need to check this in the runtime configuration. A final point to consider is: what happens when you lock yourself out of
Re: [HACKERS] BRIN range operator class
Can you please explain what is the purpose of patch 07? I'm not sure I understand; are we trying to avoid having to add pg_amproc entries for these operators and instead piggy-back on btree opclass definitions? Not too much in love with that idea; I see that there is less tedium in that the brin opclass definition is simpler. One disadvantage is a 3x increase in the number of syscache lookups to get the function you need, unless I'm reading things wrong. Maybe this is not performance critical. It doesn't use btree opclass definitions. It uses brin opclass pg_amop entries instead of duplicating them in pg_amproc. The pg_amproc.h header says: * The amproc table identifies support procedures associated with index * operator families and classes. These procedures can't be listed in pg_amop * since they are not the implementation of any indexable operator. In our case, these procedures can be listed in pg_amop as they are implementations of indexable operators. The more important change on this patch is to request procedures for the right data types. Minmax opclasses return wrong results without this patch. You can reproduce it with this query on the regression database: select * from brintest where timestampcol = '1979-01-29 11:05:09'::timestamptz; Anyway I tried applying it on isolation, and found that it fails the assertion that tests the union support proc in brininsert. That doesn't seem okay. I mean, it's okay not to run the test for the inclusion opclasses, but why does it now fail in minmax which was previously passing? Couldn't figure it out. The regression tests passed when I tried it on the current master. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
2015-05-06 0:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: I wrote: Pavel Stehule pavel.steh...@gmail.com writes: Significant slowdown is on following test: do $$ declare a int[] := '{}'; begin for i in 1..9 loop a := a || 10; end loop; end$$ language plpgsql; do $$ declare a numeric[] := '{}'; begin for i in 1..9 loop a := a || 10.1; end loop; end$$ language plpgsql; integer master 14sec x patched 55sec numeric master 43sec x patched 108sec It is probably worst case - and it is known plpgsql antipattern Yeah, I have not expended a great deal of effort on the array_append/ array_prepend/array_cat code paths. Still, in these plpgsql cases, we should in principle have gotten down from two array copies per loop to one, so it's disappointing to not have better results there, even granting that the new copy step is not just a byte-by-byte copy. Let me see if there's anything simple to be done about that. The attached updated patch reduces both of those do-loop tests to about 60 msec on my machine. It contains two improvements over the 1.1 patch: 1. There's a fast path for copying an expanded array to another expanded array when the element type is pass-by-value: we can just memcpy the Datum array instead of working element-by-element. In isolation, that change made the patch a little faster than 9.4 on your int-array case, though of course it doesn't help for the numeric-array case (and I do not see a way to avoid working element-by-element for pass-by-ref cases). 2. pl/pgsql now detects cases like a := a || x and allows the array a to be passed as a read-write pointer to array_append, so that array_append can modify expanded arrays in-place and avoid inessential data copying altogether. (The earlier patch had made array_append and array_prepend safe for this usage, but there wasn't actually any way to invoke them with read-write pointers.) I had speculated about doing this in my earliest discussion of this patch, but there was no code for it before. The key question for change #2 is how do we identify what is a safe top-level function that can be trusted not to corrupt the read-write value if it fails partway through. I did not have a good answer before, and I still don't; what this version of the patch does is to hard-wire array_append and array_prepend as the functions considered safe. Obviously that is crying out for improvement, but we can leave that question for later; at least now we have infrastructure that makes it possible to do it. Change #1 is actually not relevant to these example cases, because we don't copy any arrays within the loop given change #2. But I left it in because it's not much code and it will help for situations where change #2 doesn't apply. I can confirm this speedup - pretty nice. Multidimensional append is slower 2x .. but it is probably corner case declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[[i ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql; but this optimization doesn't work for code - that is semantically same like a || i; declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[i ]; end loop; raise notice '%', 'aa'; end$$ language plpgsql; So there is some to much sensible There are slowdown with MD arrays, but it is not typical use case, and the speedup is about 5-10x and faster - so I'll be very happy if this patch will be in 9.5 Regards Pavel regards, tom lane
[HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade
Hi I am working on preparation the migration from 9.2 to 9.4 pg_upgrade fails pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k' Performing Consistency Checks - Checking cluster versions ok The old cluster lacks some required control information: latest checkpoint oldest MultiXactId ? Regards Pavel
[HACKERS] Where are the detoast function called in select * from table_name case?
Hello, I've been reading the PostgreSQL code for weeks to figure out how TOAST works. I couldn't find where are the TOAST function called to detoast a tuple comes from a select query, for example, select * from table_name. Does anyone know this? Can you give me some help? And any help would save me a lot of time. Thanks, Rui Hai -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade
2015-05-06 15:15 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: Hi I am working on preparation the migration from 9.2 to 9.4 pg_upgrade fails pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k' Performing Consistency Checks - Checking cluster versions ok The old cluster lacks some required control information: latest checkpoint oldest MultiXactId ? Uh, this is certainly supposed to work. Maybe pg_controldata or pg_resetxlog changed output format and pg_upgrade doesn't know how to read it. It is tested on fresh 9.2.10 to 9.4.1 Regards Pavel -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade
Pavel Stehule wrote: Hi I am working on preparation the migration from 9.2 to 9.4 pg_upgrade fails pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k' Performing Consistency Checks - Checking cluster versions ok The old cluster lacks some required control information: latest checkpoint oldest MultiXactId ? Uh, this is certainly supposed to work. Maybe pg_controldata or pg_resetxlog changed output format and pg_upgrade doesn't know how to read it. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Where are the detoast function called in select * from table_name case?
Hi Depends on usage, but often times the detoasting is called from DatumGet* macros The values are detoasted only when it is required #define PG_GETARG_TEXT_PP(n) DatumGetTextPP(PG_GETARG_DATUM(n)) #define DatumGetTextPP(X) ((text *) PG_DETOAST_DATUM_PACKED(X)) #define PG_DETOAST_DATUM(datum) pg_detoast_datum((struct varlena *) DatumGetPointer(datum)) Regards Pavel Stehule 2015-05-06 15:00 GMT+02:00 Rui Hai Jiang ruihaiji...@msn.com: Hello, I've been reading the PostgreSQL code for weeks to figure out how TOAST works. I couldn't find where are the TOAST function called to detoast a tuple comes from a select query, for example, select * from table_name. Does anyone know this? Can you give me some help? And any help would save me a lot of time. Thanks, Rui Hai -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
Pavel Stehule pavel.steh...@gmail.com writes: Multidimensional append is slower 2x .. but it is probably corner case declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[[i ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql; Yeah, that's array_cat(), which I've not done anything with. I'm not really excited about adding code to it; I think use-cases like this one are probably too uncommon to justify more code. In any case we could go back and improve it later if there are enough complaints. Another way to look at it is that in this example, plpgsql's attempts to force the a array into expanded form are a mistake: we never get any benefit because array_cat() just wants it in flat form again, and delivers it in flat form. (It's likely that this is an unrealistic worst case: it's hard to imagine real array-using applications that never do any element-by-element access.) Possibly we could improve matters with a more refined heuristic about whether to force arrays to expanded form during assignments --- but I'm not sure what that would look like. plpgsql has very little direct knowledge of which operations will be applied to the array later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)
On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com wrote: Can someone look at this patch. It should fix bug #12845. The current tests for conversions are very minimal. I expanded them a bit for this bug. I think the binary search in the .map files should be removed but I leave that for another patch. Please add this patch to https://commitfest.postgresql.org/action/commitfest_view/open so we don't forget about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
Robert Haas wrote: I frankly find that a bit difficult to swallow. You think that everyone knows that bad passwords are a problem, but some people might not realize that an authentication method called trust might not be secure? Ultimately, what we offer to users is choice of a few options. Should we only offer options that we consider to be completely secure, and no others? If we were to follow that principle, we would completely disable non-SSL builds, and all auth methods other than, I dunno, GSSAPI and such. But we don't do that, because we trust that users will use whatever is most appropriate for them. I see this patch is, in a way, a mechanism to let system administrators choose at compile time what options are available to DBAs at setup time. This seems a reasonable thing to me. I don't necessarily agree with the patch as proposed. I would rather have a comma-separated list of methods, as in: --disable-auth=ident,peer which lets you choose what to disable without hardcoded choices. Due to the nature of autoconf, this might be too fiddly to implement, though, and if so I think the method proposed by this patch seems a reasonable compromise. I've seen configure in other programs offer options such as --disable-foo=list that lists acceptable values (or --disable-foo=help) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 2015-05-05 15:27:09 +0300, Heikki Linnakangas wrote: I'm a bit late to the party as I haven't paid much attention to the syntax before, but let me give some comments on this arbiter index inference thingie. To recap, there are three variants: A. INSERT ... ON CONFLICT DO NOTHING No arbiter is specified. This means that a conflict on any unique or exclusion constraint is not allowed (and will do nothing instead). This variant is only accepted for DO NOTHING. B. INSERT ... ON CONFLICT ON constraint name DO NOTHING/UPDATE In this variant, you explicitly specify the constraint by name. I do think it's a bit sad to not be able to specify unique indexes that aren't constraints. So I'd like to have a corresponding ON INDEX - which would be trivial. C. INSERT ... ON CONFLICT (index params) [WHERE expression] DO NOTHING/UPDATE This specifies an index (or indexes, in the corner case that there are several identical ones), by listing the columns/expressions and the predicate for a partial index. The list of columns and WHERE match the syntax for CREATE INDEX. That's pretty good overall. A few questions: 1. Why is the variant without specifying an index or constraint not allowed with DO UPDATE? I agree it might not make much sense, but then again, it might. If we're afraid that it's too unsafe to be the default if you don't specify any constraint, how about allowing it with a more verbose ON CONFLICT ON ANY CONSTRAINT syntax? I think that'd be useful. Peter seems to be against it on pureness grounds when we argued against it before, but I know that I'd wished for it before. 2. Why can't you specify multiple constraints, even though we implicitly allow any with the first variant? Yea. Finally, a couple of suggestions. It would be pretty handy to allow: INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE Not sure if that really has that big of a use case, but it'd also be simple. Also, I wonder if we should change the B syntax to be: INSERT ... ON CONFLICT ON *CONSTRAINT* constraint name DO NOTHING/UPDATE Oh yes. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
2015-05-06 15:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: Multidimensional append is slower 2x .. but it is probably corner case declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[[i ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql; Yeah, that's array_cat(), which I've not done anything with. I'm not really excited about adding code to it; I think use-cases like this one are probably too uncommon to justify more code. In any case we could go back and improve it later if there are enough complaints. Another way to look at it is that in this example, plpgsql's attempts to force the a array into expanded form are a mistake: we never get any benefit because array_cat() just wants it in flat form again, and delivers it in flat form. (It's likely that this is an unrealistic worst case: it's hard to imagine real array-using applications that never do any element-by-element access.) Possibly we could improve matters with a more refined heuristic about whether to force arrays to expanded form during assignments --- but I'm not sure what that would look like. plpgsql has very little direct knowledge of which operations will be applied to the array later. Isn't better to push information about possible target to function? array_cat(a, b, result) { if (undef(result)) return a || b; if (b == result) array_extend(result, a); return result; else if (a == result) array_extend(result, b); return result; else return a || b; } It can be used for arrays, for strings? On second hand it decrease readability related functions :( (but not all functions should to support this optimization). Regards Pavel regards, tom lane
Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)
Robert Haas wrote: On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com wrote: Can someone look at this patch. It should fix bug #12845. The current tests for conversions are very minimal. I expanded them a bit for this bug. I think the binary search in the .map files should be removed but I leave that for another patch. Please add this patch to https://commitfest.postgresql.org/action/commitfest_view/open so we don't forget about it. If we think this is a bug fix, we should add it to the open items list, https://wiki.postgresql.org/wiki/Open_Items -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Seq Scan
On Wed, May 6, 2015 at 7:55 AM, Amit Kapila amit.kapil...@gmail.com wrote: - I believe the separation of concerns between ExecFunnel() and ExecEndFunnel() is not quite right. If the scan is shut down before it runs to completion (e.g. because of LIMIT), then I think we'll call ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path. I think you probably need to create a static subroutine that is called both as soon as TupIsNull(slot) and also from ExecEndFunnel(), in each case cleaning up whatever resources remain. Right, will fix as per suggestion. I observed one issue while working on this review comment. When we try to destroy the parallel setup via ExecEndNode (as due to Limit Node, it could not destroy after consuming all tuples), it waits for parallel workers to finish (WaitForParallelWorkersToFinish()) and parallel workers are waiting for master backend to signal them as their queue is full. I think in such a case master backend needs to inform workers either when the scan is discontinued due to limit node or while waiting for parallel workers to finish. Isn't this why TupleQueueFunnelShutdown() calls shm_mq_detach()? That's supposed to unstick the workers; any impending or future writes will just return SHM_MQ_DETACHED without waiting. That's technically true, but the incremental work involved in supporting a new worker is extremely small compare with worker startup times. I'm guessing that the setup cost is going to be on the order of hundred-thousands or millions and and the startup cost is going to be on the order of tens or ones. Can we safely estimate the cost of restoring parallel state (GUC's, combo CID, transaction state, snapshot, etc.) in each worker as a setup cost? There could be some work like restoration of locks (acquire all or relevant locks at start of parallel worker, if we follow your proposed design and even if we don't follow that there could be some similar substantial work) which could be substantial and we need to do the same for each worker. If you think restoration of parallel state in each worker is a pretty small work, then what you say makes sense to me. Well, all the workers restore that state in parallel, so adding it up across all workers doesn't really make sense. But anyway, no, I don't think that's a big cost. I think the big cost is going to the operating system overhead of process creation. The new process will incur lots of page faults as it populates its address space and dirties pages marked copy-on-write. That's where I expect most of the expense to be. And I actually hope you *can't* present some contrary evidence. Because if you can, then that might mean that we need to cost every possible path from 0 up to N workers and let the costing machinery decide which one is better. Not necesarally, we can follow a rule that number of workers that need to be used for any parallel statement are equal to degree of parallelism (parallel_seqscan_degree) as set by user. I think we need to do some split up of number workers when there are multiple parallel operations in single statement (like sort and parallel scan). Yeah. I'm hoping we will be able to use the same pool of workers for multiple operations, but I realize that's a feature we haven't designed yet. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Seq Scan
On Tue, Apr 28, 2015 at 5:37 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Apr 24, 2015 at 8:32 AM, Amit Kapila amit.kapil...@gmail.com wrote: - I believe the separation of concerns between ExecFunnel() and ExecEndFunnel() is not quite right. If the scan is shut down before it runs to completion (e.g. because of LIMIT), then I think we'll call ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path. I think you probably need to create a static subroutine that is called both as soon as TupIsNull(slot) and also from ExecEndFunnel(), in each case cleaning up whatever resources remain. Right, will fix as per suggestion. I observed one issue while working on this review comment. When we try to destroy the parallel setup via ExecEndNode (as due to Limit Node, it could not destroy after consuming all tuples), it waits for parallel workers to finish (WaitForParallelWorkersToFinish()) and parallel workers are waiting for master backend to signal them as their queue is full. I think in such a case master backend needs to inform workers either when the scan is discontinued due to limit node or while waiting for parallel workers to finish. - I don't think you need both setup cost and startup cost. Starting up more workers isn't particularly more expensive than starting up fewer of them, because most of the overhead is in waiting for them to actually start, and the number of workers is reasonable, then they're all be doing that in parallel with each other. I suggest removing parallel_startup_cost and keeping parallel_setup_cost. There is some work (like creation of shm queues, launching of workers) which is done proportional to number of workers during setup time. I have kept 2 parameters to distinguish such work. I think you have a point that start of some or all workers could be parallel, but I feel that still is a work proportinal to number of workers. For future parallel operations also such a parameter could be useful where we need to setup IPC between workers or some other stuff where work is proportional to workers. That's technically true, but the incremental work involved in supporting a new worker is extremely small compare with worker startup times. I'm guessing that the setup cost is going to be on the order of hundred-thousands or millions and and the startup cost is going to be on the order of tens or ones. Can we safely estimate the cost of restoring parallel state (GUC's, combo CID, transaction state, snapshot, etc.) in each worker as a setup cost? There could be some work like restoration of locks (acquire all or relevant locks at start of parallel worker, if we follow your proposed design and even if we don't follow that there could be some similar substantial work) which could be substantial and we need to do the same for each worker. If you think restoration of parallel state in each worker is a pretty small work, then what you say makes sense to me. And I actually hope you *can't* present some contrary evidence. Because if you can, then that might mean that we need to cost every possible path from 0 up to N workers and let the costing machinery decide which one is better. Not necesarally, we can follow a rule that number of workers that need to be used for any parallel statement are equal to degree of parallelism (parallel_seqscan_degree) as set by user. I think we need to do some split up of number workers when there are multiple parallel operations in single statement (like sort and parallel scan). With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Where are the detoast function called in select * from table_name case?
On 05/06/2015 04:00 PM, Rui Hai Jiang wrote: I've been reading the PostgreSQL code for weeks to figure out how TOAST works. I couldn't find where are the TOAST function called to detoast a tuple comes from a select query, for example, select * from table_name. Does anyone know this? Can you give me some help? And any help would save me a lot of time. The tuple isn't detoasted immediately when it's read. Individual datums of the tuple are detoasted lazily, when needed, in whatever function it's passed to. If you just do select * from table, the first function it's passed to is the datatype's output function, so that detoasts it. For example, if it's a text column, the toasted Datum is passed to textout(), which calls text_to_cstring(). text_to_cstring() calls pg_detoast_datum_packed(), which finally detoasts it. You can launch a debugger on the backend process, and put a breakpoint on e.g. heap_tuple_untoast_attr() to see it in action. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
Pavel Stehule pavel.steh...@gmail.com writes: 2015-05-06 15:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: Another way to look at it is that in this example, plpgsql's attempts to force the a array into expanded form are a mistake: we never get any benefit because array_cat() just wants it in flat form again, and delivers it in flat form. (It's likely that this is an unrealistic worst case: it's hard to imagine real array-using applications that never do any element-by-element access.) Possibly we could improve matters with a more refined heuristic about whether to force arrays to expanded form during assignments --- but I'm not sure what that would look like. plpgsql has very little direct knowledge of which operations will be applied to the array later. Isn't better to push information about possible target to function? I don't think that would solve the problem. For example, one of the cases I worry about is a function that does read-only examination of an array argument; consider something like create function sum_squares(a numeric[]) returns numeric as $$ declare s numeric := 0; begin for i in array_lower(a, 1) .. array_upper(a, 1) loop s := s + a[i] * a[i]; end loop; return s; end; $$ language plpgsql strict immutable; array_get_element() is not in a position here to force expansion of the array variable, so unless plpgsql itself does something we're not going to get a performance win (unless the argument happens to be already expanded on arrival). I'm inclined to think that we need to add information to pg_type about whether a type supports expansion (and how to convert to expanded form if so). In the patch as it stands, plpgsql just has hard-wired knowledge that it can call expand_array() on arrays that it's putting into function local variables. I'd be okay with shipping 9.5 like that, but pretty soon we'll want a solution that extension data types can use too. More generally, it'd be nice if the mechanism could be more flexible than always force variables of this type to expanded form. But I don't see how to teach plpgsql itself how to decide that intelligently, let alone how we might design an API that lets some third-party data type decide it at arm's length from plpgsql ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On 05/06/2015 02:13 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: (Personally I think there's a very good case for completely ripping out RFC1413 ident auth. I've not seen it used in a great long while, and it's always been a security risk.) FWIW, I agree with that --- or at least making it a not-built-by-default option. I have seen it in the last year, actually, but only once, which even for my personal pool represents 1% usage. So ... Probably the right time to make any such changes is at the same time we add the proposed more-secure-than-MD5 password option. +1 to kill off ident when we replace MD5, since users will need to be beaten over the head about changes to auth methods anyway. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On 5/6/15 12:56 PM, Peter Eisentraut wrote: I think this is a sufficiently general requirement to warrant including an option to disable this, as most hardening guides I have seen for PostgreSQL unconditionally require to disable trust authentication and disabling it in the code removes the need to check this in the runtime configuration. I think people would be interested in well-thought out, generalized hardening facilities. But that would likely include other things than just disabling an authentication method or two. And we can't be adding a new compile-time option as we add each one. We need a more general approach. Yeah. I think one of the big use cases here is that many environments are OK with at least ident (if not trust) but only from the local machine. So you'd probably want to handle that somehow. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN range operator class
Alvaro Herrera alvhe...@2ndquadrant.com writes: Let's think together and try to find a reasonable way to get the union procedures tested regularly. It is pretty clear that having them run only when the race condition occurs is not acceptable; bugs go unnoticed. [ just a drive-by comment... ] Maybe you could set up a testing mode that forces the race condition to occur? Then you could test the calling code paths, not only the union procedures per se. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On 05/07/2015 01:32 AM, Jim Nasby wrote: On 5/6/15 12:56 PM, Peter Eisentraut wrote: I think this is a sufficiently general requirement to warrant including an option to disable this, as most hardening guides I have seen for PostgreSQL unconditionally require to disable trust authentication and disabling it in the code removes the need to check this in the runtime configuration. I think people would be interested in well-thought out, generalized hardening facilities. But that would likely include other things than just disabling an authentication method or two. And we can't be adding a new compile-time option as we add each one. We need a more general approach. Yeah. I think one of the big use cases here is that many environments are OK with at least ident (if not trust) but only from the local machine. So you'd probably want to handle that somehow. That's called 'peer', since 9.1. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 2015-05-06 13:05:16 -0700, Peter Geoghegan wrote: On Wed, May 6, 2015 at 7:53 AM, Andres Freund and...@anarazel.de wrote: In this variant, you explicitly specify the constraint by name. I do think it's a bit sad to not be able to specify unique indexes that aren't constraints. So I'd like to have a corresponding ON INDEX - which would be trivial. Then what's the point of having ON CONSTRAINT? That it supports exclusion constraints? I would care about the fact that you can't name a unique index with no constraint if there wasn't already a way of doing that with inference (I'm thinking in particular of partial indexes here, which never have constraints). But there is. So what's the problem? Personally I think a complex expression index is something many people will not want to specify every time. And since partial/expression indexes can't even have constraints... Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 05/06/2015 11:05 PM, Peter Geoghegan wrote: On Wed, May 6, 2015 at 7:53 AM, Andres Freund and...@anarazel.de wrote: In this variant, you explicitly specify the constraint by name. I do think it's a bit sad to not be able to specify unique indexes that aren't constraints. So I'd like to have a corresponding ON INDEX - which would be trivial. Then what's the point of having ON CONSTRAINT? The point of it working that way was we're not exposing the implementation detail of the index. While I happen to think that that's a distinction without a difference anyway, that certainly was the idea. Right, that's the idea. Indexes are just an implementation detail - conceivably you could have a constraint that's backed by some other mechanism. You should not embed implementation details like index names in your queries. Unfortunately you can't create a partial constraint - you'll have to create a partial index. I wish we would fix that directly, by allowing partial unique constraints. That said, I wouldn't necessarily be opposed to also having the syntax to name an index directly, as long as we had some notices in the docs to tell people to avoid it. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 05/07/2015 12:01 AM, Andres Freund wrote: On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote: I'll see about fixing that. It's not just a matter of creating another alias for the same rel, I'm afraid: foo.t is supposed to refer to the tuple that we attempted to insert, like it does without the ON CONFLICT. I'm not sure what you mean here? Sorry, forget about that. I was confused and mixed up EXCLUDED and TARGET. Looks like they really aren't very good names :-). - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-06 22:51:43 +0300, Heikki Linnakangas wrote: Yeah, I agree that DO NOTHING should not lock the rows. It might make sense to have a DO LOCK variant, which locks the rows, although I don't immediately see what the use case would be. If you want to do something more complicated with the row than what you can do in the UPDATE. To do that right now you either need to do the DO UPDATE SET ... WHERE false; and refetch the tuple which might not be easy, or do a DO UPDATE SET pkey = target.pkey which produces bloat. Consider e.g. you're applying incoming data and in case of conflict you want to call user defined function deciding which row should survive. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On Wed, May 6, 2015 at 1:48 PM, Heikki Linnakangas hlinn...@iki.fi wrote: TARGET is also very descriptive, because it situationally describes either the existing tuple actually present in the table, or (from a RETURNING clause) the final tuple present in the table post-UPDATE. We use the term target for that pervasively (in the docs and in the code). but I find that totally unconvincing. It's clear that TARGET refers to the table being upserted, but it's totally unclear on *which* version of the tuple it refers to. Then we're simply talking about 2 different things. My understanding is that it *is* the relation. And like UPDATE's RETURNING, it will be the same relation/alias but a different tuple here. Andres said this was a mutating tuple or something like that, and I suppose it is. But Vars are variables. Now, Andres (and now you) want to change it so that the TARGET alias becomes magical and expression-like, so that it really does refer to a tuple and not a relation (and so is closer to EXCLUDED.*). And you seem pretty set on that. That being the case, clearly TARGET is unsuitable for the reasons you state. I suppose that it doesn't much matter, but that's how I understood the situation all along. So I can see why you don't like TARGET in light of that. I would vote for EXISTING as an alternative, given that it's pretty clear that what is now TARGET.* will become a magic alias/expression thing. EXISTING is the EXISTING tuple, which goes well with EXCLUDED. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote: I'll see about fixing that. It's not just a matter of creating another alias for the same rel, I'm afraid: foo.t is supposed to refer to the tuple that we attempted to insert, like it does without the ON CONFLICT. I'm not sure what you mean here? But actually, I don't much like the target alias in the first place. We never really completed this discussion, everyone just got tired: Right. But that doesn't affect the it's not just a matter of ... bit above, right? Reading through this sub-thread, these spellings have been proposed: 1. TARGET and EXCLUDED 2. NEW and EXISTING 3. NEW and OLD 4. PROPOSED and EXISTING 5. CONFLICTING and EXISTING Did I miss any? Now, let me opine on these. How about 6. The tablename and EXCLUDED? Possibility with the ability to specify an AS for INSERT INTO foo AS whatever? From an implementation pov that'd be simple ;) NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the version after the UPDATE, and OLD to the version before. However, there's the serious problem that in a trigger function, OLD/NEW are already in use. How bad is that? At least in PL/pgSQL you can work around it by aliasing the variables, but it's a bit inconvenient. How often would INSERT .. ON CONFLICT DO UPDATE be used in a trigger? I personally think it's a killer. It'll be very annoying to understand mistaken usage of NEW/OLD in that case. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 05/07/2015 12:18 AM, Andres Freund wrote: On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote: Right, that's the idea. Indexes are just an implementation detail - I think that's a distinction just about no user out there cares about. Unfortunately you can't create a partial constraint - you'll have to create a partial index. I wish we would fix that directly, by allowing partial unique constraints. It's not just partial ones, it's also expression ones, right? True. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On Wed, May 6, 2015 at 7:53 AM, Andres Freund and...@anarazel.de wrote: In this variant, you explicitly specify the constraint by name. I do think it's a bit sad to not be able to specify unique indexes that aren't constraints. So I'd like to have a corresponding ON INDEX - which would be trivial. Then what's the point of having ON CONSTRAINT? The point of it working that way was we're not exposing the implementation detail of the index. While I happen to think that that's a distinction without a difference anyway, that certainly was the idea. I would care about the fact that you can't name a unique index with no constraint if there wasn't already a way of doing that with inference (I'm thinking in particular of partial indexes here, which never have constraints). But there is. So what's the problem? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On Wed, May 6, 2015 at 1:22 PM, Andres Freund and...@anarazel.de wrote: That it supports exclusion constraints? But so does just naming the index. I don't think it's significant that exclusion operators are in pg_constraint -- you could just as easily name the index, since that's all you ultimately end up with anyway. I would care about the fact that you can't name a unique index with no constraint if there wasn't already a way of doing that with inference (I'm thinking in particular of partial indexes here, which never have constraints). But there is. So what's the problem? Personally I think a complex expression index is something many people will not want to specify every time. And since partial/expression indexes can't even have constraints... The downsides seem severe. A CREATE INDEX CONCURRENTLY just broke your statement, because you didn't account for the new, equivalent unique index during inference, and now you have to drop the old index and break application code. Is that really worth introducing just to prevent app devs from writing the inference specification? The specification explicitly documents their intent, which seems like a good thing. Robert really disliked the idea of even naming the constraint, let alone the index. I'm trying to balance things, here. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN range operator class
I again have to refuse the notion that removing the assert-only block without any replacement is acceptable. I just spent a lot of time tracking down what turned out to be a bug in your patch 07: /* Adjust maximum, if B's max is greater than A's max */ - needsadj = FunctionCall2Coll(minmax_get_procinfo(bdesc, attno, - PROCNUM_GREATER), - colloid, col_b-bv_values[1], col_a-bv_values[1]); + frmg = minmax_get_strategy_procinfo(bdesc, attno, attr-atttypid, + BTGreaterStrategyNumber); + needsadj = FunctionCall2Coll(frmg, colloid, col_b-bv_values[0], + col_a-bv_values[0]); Note the removed lines use array index 1, while the added lines use array index 0. The only reason I noticed this is because I applied this patch without the others and saw the assertion fire; how would I have noticed the problem had I just removed it? Let's think together and try to find a reasonable way to get the union procedures tested regularly. It is pretty clear that having them run only when the race condition occurs is not acceptable; bugs go unnoticed. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On 05/06/2015 04:19 PM, Robert Haas wrote: On Wed, May 6, 2015 at 3:57 PM, Andrew Dunstan and...@dunslane.net wrote: I don't necessarily object to this idea, but I do think we need to ensure that we don't allow both trust and peer to be disabled (which means on Windows you would not be able to disable trust). Otherwise this becomes a footgun which would require the whole server to be stopped so you could connect in single user mode to correct certain mistakes, which are unfortunately all too common. Of course that's precisely what the OP wanted to do, which goes to my point that not everybody's going to want the same thing. If that is indeed the proposal, then I vote no. But he did say upthread: Single user sessions would work, but the peer authentication is also still available and should be the preferred method to reset passwords when trust is disabled, so this should not be an issue. (Personally I think there's a very good case for completely ripping out RFC1413 ident auth. I've not seen it used in a great long while, and it's always been a security risk.) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 2015-05-06 13:37:07 -0700, Peter Geoghegan wrote: On Wed, May 6, 2015 at 1:22 PM, Andres Freund and...@anarazel.de wrote: That it supports exclusion constraints? But so does just naming the index. I don't think it's significant that exclusion operators are in pg_constraint -- you could just as easily name the index, since that's all you ultimately end up with anyway. The index name for constraints is generated and not trivially safely guessable for a user. I would care about the fact that you can't name a unique index with no constraint if there wasn't already a way of doing that with inference (I'm thinking in particular of partial indexes here, which never have constraints). But there is. So what's the problem? Personally I think a complex expression index is something many people will not want to specify every time. And since partial/expression indexes can't even have constraints... The downsides seem severe. A CREATE INDEX CONCURRENTLY just broke your statement, because you didn't account for the new, equivalent unique index during inference, and now you have to drop the old index and break application code. Is that really worth introducing just to prevent app devs from writing the inference specification? The specification explicitly documents their intent, which seems like a good thing. I don't find that all that severe. It might just as well be the case that the new unique constraint isn't intended to be handled by ON CONFLICT. And having a inference specification that's longer than the rest of the statement surely isn't helpful. Robert really disliked the idea of even naming the constraint, let alone the index. I'm trying to balance things, here. I fail to see what doing something halfhearted helps. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
Andres pointed out on IM that the TARGET alias is a bit crummy. In particular, adding an ON CONFLICT DO UPDATE can make a RETURNING clause invalid, because we change the alias of the target rel: create table foo (id int4 primary key, t text); This works: postgres=# insert into foo (id, t) values (1, 'x') returning foo.t; t --- x (1 row) INSERT 0 1 Same statement with ON CONFLICT DO UPDATE fails: postgres=# insert into foo (id, t) values (1, 'x') on conflict (id) do update set t = 'x' returning foo.t; ERROR: invalid reference to FROM-clause entry for table foo LINE 1: ...'x') on conflict (id) do update set t = 'x' returning foo.t; ^ HINT: Perhaps you meant to reference the table alias target. I'll see about fixing that. It's not just a matter of creating another alias for the same rel, I'm afraid: foo.t is supposed to refer to the tuple that we attempted to insert, like it does without the ON CONFLICT. But actually, I don't much like the target alias in the first place. We never really completed this discussion, everyone just got tired: On 04/29/2015 10:13 PM, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan p...@heroku.com wrote: * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias TARGET.*). Those seem fine to me as well. There seem to be a few votes for NEW and OLD. That's what I proposed originally, and (surprise, surprise) I still like that better too. I was promoting NEW/OLD, until I realized that we'd end up having a problem in trigger functions because NEW/OLD are already defined there, unless you have a suggestion for how to improve on that? Reading through this sub-thread, these spellings have been proposed: 1. TARGET and EXCLUDED 2. NEW and EXISTING 3. NEW and OLD 4. PROPOSED and EXISTING 5. CONFLICTING and EXISTING Did I miss any? Now, let me opine on these. EXCLUDED seems fine to me. I don't see us using that term elsewhere, and it makes me think of exclusion constraints, but nevertheless I think it's pretty easy remember what it means. TARGET, however, is totally inscrutable. Peter argued earlier that: TARGET is also very descriptive, because it situationally describes either the existing tuple actually present in the table, or (from a RETURNING clause) the final tuple present in the table post-UPDATE. We use the term target for that pervasively (in the docs and in the code). but I find that totally unconvincing. It's clear that TARGET refers to the table being upserted, but it's totally unclear on *which* version of the tuple it refers to. NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the version after the UPDATE, and OLD to the version before. However, there's the serious problem that in a trigger function, OLD/NEW are already in use. How bad is that? At least in PL/pgSQL you can work around it by aliasing the variables, but it's a bit inconvenient. How often would INSERT .. ON CONFLICT DO UPDATE be used in a trigger? I don't have much to say about the rest. PROPOSED, EXISTING, CONFLICTING, they're all fairly descriptive, but long. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On Wed, May 6, 2015 at 2:01 PM, Andres Freund and...@anarazel.de wrote: How about 6. The tablename and EXCLUDED? Possibility with the ability to specify an AS for INSERT INTO foo AS whatever? From an implementation pov that'd be simple ;) That's what I wanted to do when I realized what Andres wanted to do with the TARGET alias. Clearly that would compel us to actually make the RETURNING clause buy into this alias, just as with a regular UPDATE. And not having the alias on the target also be magical seems like a good thing. Nothing can be broken by this scheme. No? NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the version after the UPDATE, and OLD to the version before. However, there's the serious problem that in a trigger function, OLD/NEW are already in use. How bad is that? At least in PL/pgSQL you can work around it by aliasing the variables, but it's a bit inconvenient. How often would INSERT .. ON CONFLICT DO UPDATE be used in a trigger? I personally think it's a killer. It'll be very annoying to understand mistaken usage of NEW/OLD in that case. +1 -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
Andrew Dunstan and...@dunslane.net writes: (Personally I think there's a very good case for completely ripping out RFC1413 ident auth. I've not seen it used in a great long while, and it's always been a security risk.) FWIW, I agree with that --- or at least making it a not-built-by-default option. Probably the right time to make any such changes is at the same time we add the proposed more-secure-than-MD5 password option. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote: Right, that's the idea. Indexes are just an implementation detail - I think that's a distinction just about no user out there cares about. Unfortunately you can't create a partial constraint - you'll have to create a partial index. I wish we would fix that directly, by allowing partial unique constraints. It's not just partial ones, it's also expression ones, right? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)
Robert Haas robertmh...@gmail.com writes: On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Maybe not, but at the very least we should consider getting it fixed in 9.5 rather than waiting a full development cycle. Same as in https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us I'm not saying we MUST include it in 9.5, but we should at least consider it. If we simply stash it in the open CF we guarantee that it will linger there for a year. Sure, if somebody has the time to put into it now, I'm fine with that. I'm afraid it won't be me, though: even if I had the time, I don't know enough about encodings. I concur that we should at least consider this patch for 9.5. I've added it to https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items I'm willing to look at it myself, whenever my non-copious spare time permits; but that won't be in the immediate future. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On Wed, May 6, 2015 at 3:57 PM, Andrew Dunstan and...@dunslane.net wrote: I don't necessarily object to this idea, but I do think we need to ensure that we don't allow both trust and peer to be disabled (which means on Windows you would not be able to disable trust). Otherwise this becomes a footgun which would require the whole server to be stopped so you could connect in single user mode to correct certain mistakes, which are unfortunately all too common. Of course that's precisely what the OP wanted to do, which goes to my point that not everybody's going to want the same thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT syntax issues
On Tue, May 5, 2015 at 5:27 AM, Heikki Linnakangas hlinn...@iki.fi wrote: To recap, there are three variants: A. INSERT ... ON CONFLICT DO NOTHING No arbiter is specified. This means that a conflict on any unique or exclusion constraint is not allowed (and will do nothing instead). This variant is only accepted for DO NOTHING. B. INSERT ... ON CONFLICT ON constraint name DO NOTHING/UPDATE In this variant, you explicitly specify the constraint by name. C. INSERT ... ON CONFLICT (index params) [WHERE expression] DO NOTHING/UPDATE This specifies an index (or indexes, in the corner case that there are several identical ones), by listing the columns/expressions and the predicate for a partial index. The list of columns and WHERE match the syntax for CREATE INDEX. I would just say that there are two variants, only one of which mandates the inference clause. But I'm nitpicking. That's pretty good overall. A few questions: Thanks. I'm glad that we are now able to cover really any conceivable use-case, while playing nice with every existing feature (now updatable views are supported with ON CONFLICT DO UPDATE -- and we're also going to be able to suppor subqueries in the UPDATE). We've been incredibly thorough. 1. Why is the variant without specifying an index or constraint not allowed with DO UPDATE? I agree it might not make much sense, but then again, it might. If we're afraid that it's too unsafe to be the default if you don't specify any constraint, how about allowing it with a more verbose ON CONFLICT ON ANY CONSTRAINT syntax? I think it's dangerous. It's basically wrong headed to omit any constraint for DO UPDATE. I put a lot of effort into covering every possible case with the inference clause, and I think it's pretty cool that we have something that's so flexible. I don't feel bad about forcing users to be explicit about what they want, because the only conceivable downside is that they'll have to do a little extra typing (if even that - it's probably going to be ORM-generated more often than not). The upside -- not having their query break unexpectedly one day, when a new constraint is added -- is huge. 2. Why can't you specify multiple constraints, even though we implicitly allow any with the first variant? It's just an escape hatch. I don't want to encourage its over use, and I want to keep the grammar simple. Finally, a couple of suggestions. It would be pretty handy to allow: INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE Is that really likely to be less verbose than naming the attributes directly? INSERT ... ON CONFLICT ON *CONSTRAINT* constraint name DO NOTHING/UPDATE That would allow the syntax can be expanded in the future to specify conflicts on other kind of things. The ON PRIMARY KEY syntax should be unambiguous with out, because PRIMARY is a reserved keyword, but for example, we might want to add ON UNIQUE INDEX index name later. Okay, we can do that. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On Tue, May 5, 2015 at 10:31 AM, Andres Freund and...@anarazel.de wrote: Another thing I'm wondering about is dealing with deferrable constraints/deferred indexes. a) Why does ExecCheckIndexConstraints() check for indisimmediate for *all* indexes and not just when it's an arbiter index? That seems needlessly restrictive. I guess it's a legacy of the time when it was all or nothing. But supporting that would involve further complicating the interface with ExecCheckIndexConstraints() so that we cared about the distinction between conflicts for one purpose and conflicts for another. It could be done, though. b) There's a doc addition to set_constraints.sgml + constraints are affected by this setting. Note that constraints + that are literalDEFERRED/literal cannot be used as arbiters by + the literalON CONFLICT/ clause that commandINSERT/ + supports. which I don't think makes sense: SET CONSTRAINTS will never change anything relevant for ON CONFLICT, right? You're right. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory
2015-05-06 18:54 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: 2015-05-06 15:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: Another way to look at it is that in this example, plpgsql's attempts to force the a array into expanded form are a mistake: we never get any benefit because array_cat() just wants it in flat form again, and delivers it in flat form. (It's likely that this is an unrealistic worst case: it's hard to imagine real array-using applications that never do any element-by-element access.) Possibly we could improve matters with a more refined heuristic about whether to force arrays to expanded form during assignments --- but I'm not sure what that would look like. plpgsql has very little direct knowledge of which operations will be applied to the array later. Isn't better to push information about possible target to function? I don't think that would solve the problem. For example, one of the cases I worry about is a function that does read-only examination of an array argument; consider something like create function sum_squares(a numeric[]) returns numeric as $$ declare s numeric := 0; begin for i in array_lower(a, 1) .. array_upper(a, 1) loop s := s + a[i] * a[i]; end loop; return s; end; $$ language plpgsql strict immutable; I remember this issue array_get_element() is not in a position here to force expansion of the array variable, so unless plpgsql itself does something we're not going to get a performance win (unless the argument happens to be already expanded on arrival). I'm inclined to think that we need to add information to pg_type about whether a type supports expansion (and how to convert to expanded form if so). In the patch as it stands, plpgsql just has hard-wired knowledge that it can call expand_array() on arrays that it's putting into function local variables. I'd be okay with shipping 9.5 like that, but pretty soon we'll want a solution that extension data types can use too. More generally, it'd be nice if the mechanism could be more flexible than always force variables of this type to expanded form. But I don't see how to teach plpgsql itself how to decide that intelligently, let alone how we might design an API that lets some third-party data type decide it at arm's length from plpgsql ... I agree - the core of work have to be elsewhere than in plpgsql. Some years ago there was a idea about toast cache. regards, tom lane
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 05/06/2015 10:47 PM, Peter Geoghegan wrote: On Wed, May 6, 2015 at 8:20 AM, Andres Freund and...@anarazel.de wrote: On 2015-05-05 15:00:56 -0700, Peter Geoghegan wrote: Locking the row is not nothing, though. If you want to lock the row, use an UPSERT with a tautologically false WHERE clause (like WHERE false). That's not the same. For one it breaks RETURNING which is a death knell, for another it's not exactly obvious. DO NOTHING already doesn't project non-inserted tuples, in a way that fits with the way we won't do that when a before trigger returns NULL. So I don't know what you mean about RETURNING behavior. It may not be all that obvious, but then the requirement that you mention isn't either. I really strongly feel that DO NOTHING should do nothing. For the pgloader use-case, which is what I have in mind with that variant, that could literally make the difference between dirtying an enormous number of buffers and dirtying only a few. This will *frequently* be the case. And it's not as if the idea of an INSERT IGNORE is new or in any way novel. As I mentioned, many systems have a comparable command. So, yes, DO NOTHING does very little - and that is its appeal. Supporting this behavior does not short change those who actually care about the existing tuple sticking around for the duration of their transaction - they have a way of doing that. If you want to document INSERT IGNORE as being primarily an ETL-orientated thing, that would make sense, but let's not hobble that use case. Yeah, I agree that DO NOTHING should not lock the rows. It might make sense to have a DO LOCK variant, which locks the rows, although I don't immediately see what the use case would be. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On Wed, May 6, 2015 at 8:20 AM, Andres Freund and...@anarazel.de wrote: On 2015-05-05 15:00:56 -0700, Peter Geoghegan wrote: Locking the row is not nothing, though. If you want to lock the row, use an UPSERT with a tautologically false WHERE clause (like WHERE false). That's not the same. For one it breaks RETURNING which is a death knell, for another it's not exactly obvious. DO NOTHING already doesn't project non-inserted tuples, in a way that fits with the way we won't do that when a before trigger returns NULL. So I don't know what you mean about RETURNING behavior. It may not be all that obvious, but then the requirement that you mention isn't either. I really strongly feel that DO NOTHING should do nothing. For the pgloader use-case, which is what I have in mind with that variant, that could literally make the difference between dirtying an enormous number of buffers and dirtying only a few. This will *frequently* be the case. And it's not as if the idea of an INSERT IGNORE is new or in any way novel. As I mentioned, many systems have a comparable command. So, yes, DO NOTHING does very little - and that is its appeal. Supporting this behavior does not short change those who actually care about the existing tuple sticking around for the duration of their transaction - they have a way of doing that. If you want to document INSERT IGNORE as being primarily an ETL-orientated thing, that would make sense, but let's not hobble that use case. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On 05/06/2015 10:47 AM, Alvaro Herrera wrote: I don't necessarily agree with the patch as proposed. I would rather have a comma-separated list of methods, as in: --disable-auth=ident,peer which lets you choose what to disable without hardcoded choices. Due to the nature of autoconf, this might be too fiddly to implement, though, and if so I think the method proposed by this patch seems a reasonable compromise. I've seen configure in other programs offer options such as --disable-foo=list that lists acceptable values (or --disable-foo=help) I don't necessarily object to this idea, but I do think we need to ensure that we don't allow both trust and peer to be disabled (which means on Windows you would not be able to disable trust). Otherwise this becomes a footgun which would require the whole server to be stopped so you could connect in single user mode to correct certain mistakes, which are unfortunately all too common. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)
On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: It's a behavior change, so I don't think we would consider a back-patch. Maybe not, but at the very least we should consider getting it fixed in 9.5 rather than waiting a full development cycle. Same as in https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us I'm not saying we MUST include it in 9.5, but we should at least consider it. If we simply stash it in the open CF we guarantee that it will linger there for a year. Sure, if somebody has the time to put into it now, I'm fine with that. I'm afraid it won't be me, though: even if I had the time, I don't know enough about encodings. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Disabling trust/ident authentication configure option
On 5/6/15 6:02 AM, Volker Aßmann wrote: Well trust actually does not sound that dangerous in case you only take a quick glance at the documentation - trust PostgreSQL to do the right thing? Hah, we could rename it to wideopen. Please note that the patch does nothing by default, it just adds the option to disable trust/ident but leaves them on in the standard configuration. I do not want to disable trust by default for everyone, but it would be great to have the option to do this without having to patch (and thus test and verify) the PostgreSQL sources for each new release. Any new compile-time option creates a nonlinear maintenance burden. We're going to need to test whether each option builds cleanly and works, also in combination with other options, and on several platforms. The authentication code is already littered with build-time dependencies and platform-specific code. So the it doesn't bother anyone argument doesn't quite work. Actually, in this particular case, you wouldn't even need a compile-time option. You could just make it a restart-only option. I think this is a sufficiently general requirement to warrant including an option to disable this, as most hardening guides I have seen for PostgreSQL unconditionally require to disable trust authentication and disabling it in the code removes the need to check this in the runtime configuration. I think people would be interested in well-thought out, generalized hardening facilities. But that would likely include other things than just disabling an authentication method or two. And we can't be adding a new compile-time option as we add each one. We need a more general approach. Single user sessions would work, but the peer authentication is also still available and should be the preferred method to reset passwords when trust is disabled, so this should not be an issue. peer authentication is unfortunately not quite portable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade
On Wed, May 6, 2015 at 6:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2015-05-06 15:15 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: Hi I am working on preparation the migration from 9.2 to 9.4 pg_upgrade fails pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k' Performing Consistency Checks - Checking cluster versions ok The old cluster lacks some required control information: latest checkpoint oldest MultiXactId ? Uh, this is certainly supposed to work. Maybe pg_controldata or pg_resetxlog changed output format and pg_upgrade doesn't know how to read it. It is tested on fresh 9.2.10 to 9.4.1 I've done that migration many times. Can you give the pg_config output for both installations, and the pg_controldata output for the 9.2.10? Thanks, Jeff
Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade
On Wed, May 6, 2015 at 10:26 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Wed, May 6, 2015 at 6:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2015-05-06 15:15 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: Hi I am working on preparation the migration from 9.2 to 9.4 pg_upgrade fails pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k' Performing Consistency Checks - Checking cluster versions ok The old cluster lacks some required control information: latest checkpoint oldest MultiXactId ? Uh, this is certainly supposed to work. Maybe pg_controldata or pg_resetxlog changed output format and pg_upgrade doesn't know how to read it. It is tested on fresh 9.2.10 to 9.4.1 I've done that migration many times. Can you give the pg_config output for both installations, and the pg_controldata output for the 9.2.10? Also, what is the full path to pg_upgrade? Cheers, Jeff
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 05/06/2015 09:51 PM, Heikki Linnakangas wrote: So, yes, DO NOTHING does very little - and that is its appeal. Supporting this behavior does not short change those who actually care about the existing tuple sticking around for the duration of their transaction - they have a way of doing that. If you want to document INSERT IGNORE as being primarily an ETL-orientated thing, that would make sense, but let's not hobble that use case. Yeah, I agree that DO NOTHING should not lock the rows. It might make sense to have a DO LOCK variant, which locks the rows, although I don't immediately see what the use case would be. It seems like a very useful feature to me, if you want to upsert something into a table with a serial column and get the value of the serial column in a RETURNING clause (but not update any fields if there is a conflict). Then I am pretty sure I want to lock the row. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)
Robert Haas wrote: On Wed, May 6, 2015 at 10:55 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas wrote: On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com wrote: Can someone look at this patch. It should fix bug #12845. The current tests for conversions are very minimal. I expanded them a bit for this bug. I think the binary search in the .map files should be removed but I leave that for another patch. Please add this patch to https://commitfest.postgresql.org/action/commitfest_view/open so we don't forget about it. If we think this is a bug fix, we should add it to the open items list, https://wiki.postgresql.org/wiki/Open_Items It's a behavior change, so I don't think we would consider a back-patch. Maybe not, but at the very least we should consider getting it fixed in 9.5 rather than waiting a full development cycle. Same as in https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us I'm not saying we MUST include it in 9.5, but we should at least consider it. If we simply stash it in the open CF we guarantee that it will linger there for a year. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)
On Wed, May 6, 2015 at 10:55 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas wrote: On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com wrote: Can someone look at this patch. It should fix bug #12845. The current tests for conversions are very minimal. I expanded them a bit for this bug. I think the binary search in the .map files should be removed but I leave that for another patch. Please add this patch to https://commitfest.postgresql.org/action/commitfest_view/open so we don't forget about it. If we think this is a bug fix, we should add it to the open items list, https://wiki.postgresql.org/wiki/Open_Items It's a behavior change, so I don't think we would consider a back-patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0
On 2015-05-05 15:00:56 -0700, Peter Geoghegan wrote: Locking the row is not nothing, though. If you want to lock the row, use an UPSERT with a tautologically false WHERE clause (like WHERE false). That's not the same. For one it breaks RETURNING which is a death knell, for another it's not exactly obvious. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers