Re: [HACKERS] BRIN range operator class

2015-05-06 Thread Emre Hasegeli
 Looking at patch 04, it seems to me that it would be better to have
 the OpcInfo struct carry the typecache struct rather than the type OID,
 so that we can avoid repeated typecache lookups in brin_deform_tuple;

 Here's the patch.

Looks better to me.  I will incorporate with this patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Volker Aßmann
On Tue, May 5, 2015 at 10:39 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, May 5, 2015 at 8:05 AM, Volker Aßmann volker.assm...@gmail.com
 wrote:
  Changing the password to something simple is immediately obvious as a
  security flaw for most people who may come across database
 configurations,
  but for the TRUST mode you actually need to know some background on why
 this
  is dangerous and when.

 I frankly find that a bit difficult to swallow.  You think that
 everyone knows that bad passwords are a problem, but some people might
 not realize that an authentication method called trust might not be
 secure?  I suppose that's possible, but I really think that if you
 install an *authentication method* whose name means *just trust the
 other guy to be telling the truth* without thinking about the
 consequences of that, it's hard to have a lot of sympathy for whatever
 happens afterwards.


Well trust actually does not sound that dangerous in case you only take a
quick glance at the documentation - trust PostgreSQL to do the right
thing? - so this at least requires to actively read about this, while any
person working in IT should have at least a rough understanding that weak
passwords are bad...

And trust is IMHO dangerous because even people who might know about the
dangers may choose to temporarily turn it on (let me just finish this
update and I will worry about the authentication settings later) and then
forget to disable it again - aka it might actually be useful...


 Besides, your patch doesn't just disable trust.  It also disables
 ident authentication, which in some network configurations could
 conceivably be more secure than password authentication.  When applied
 to the local machine, ident actually means peer, which has an
 *excellent* chance of being more secure than password authentication.
 For that matter, even trust might be better than password.
 Anybody who can sniff the network traffic can read the password right
 off of the wire.  So either way, your PostgreSQL server is gonna get
 hacked, but if you use password authentication, you might reveal a
 password that is also used to protect access to something else that
 used to be secure.


You are right, ident is not the same ättack vector as trust, you most
likely won't activate this by accident, but I think it is still a dangerous
mode that would be reasonable to deactivate if it's not needed.


 Personally, if I were going to start disabling authentication methods
 at compile time, I'd start with password and md5.  If you are not
 using SSL, and you use password or md5 authentication, you're
 basically saying, well, I'm OK with somebody reading all of the data
 that I'm sending and receiving over the wire, and I'm willing to take
 the risk that my passwords are easily crackable or can be read
 straight off the wire using wireshark, but to send your own queries
 you will have to make at least some minimal effort.  If you need real
 security, that is not nearly good enough.  If you don't need real
 security, why bother making people hassle with a password that's not
 providing any real protection?  There are some valid answers to that
 question - e.g. if you are on a corporate WAN, you probably can't fire
 somebody for blundering into an unprotected resource, but if somebody
 goes to the trouble of cracking the password, even if it's weak, then
 you can probably nail them.

 For most users, though, I think password and md5 authentication serve
 mostly to give people the illusion that they've secured the server.
 The real security, if there is any, comes primarily from restricting
 incoming connections via listen_addresses and/or operating system
 mechanisms such as iptables and/or pg_hba.conf, and from requiring the
 use of SSL.  Passwords are weak sauce.


Yes, passwords can be as bad as trust authentication or basically any other
method done implemented insecurely, so from my point of view the best
solution would be to be able to selectively enable/disable all
authentication methods to customize the package for specific environments.

Trust is in my point of view just the most immediately obvious shoot
yourself in the foot option and in my use case the thing that users are
actually bound to try and get wrong.

Please note that the patch does nothing by default, it just adds the option
to disable trust/ident but leaves them on in the standard configuration. I
do not want to disable trust by default for everyone, but it would be
great to have the option to do this without having to patch (and thus test
and verify) the PostgreSQL sources for each new release.

I think this is a sufficiently general requirement to warrant including an
option to disable this, as most hardening guides I have seen for PostgreSQL
unconditionally require to disable trust authentication and disabling it in
the code removes the need to check this in the runtime configuration.


 A final point to consider is: what happens when you lock yourself out
 of 

Re: [HACKERS] BRIN range operator class

2015-05-06 Thread Emre Hasegeli
 Can you please explain what is the purpose of patch 07?  I'm not sure I
 understand; are we trying to avoid having to add pg_amproc entries for
 these operators and instead piggy-back on btree opclass definitions?
 Not too much in love with that idea; I see that there is less tedium in
 that the brin opclass definition is simpler.  One disadvantage is a 3x
 increase in the number of syscache lookups to get the function you need,
 unless I'm reading things wrong.  Maybe this is not performance critical.

It doesn't use btree opclass definitions.  It uses brin opclass
pg_amop entries instead of duplicating them in pg_amproc.
The pg_amproc.h header says:

 * The amproc table identifies support procedures associated with index
 * operator families and classes.  These procedures can't be listed in pg_amop
 * since they are not the implementation of any indexable operator.

In our case, these procedures can be listed in pg_amop as they
are implementations of indexable operators.

The more important change on this patch is to request procedures for
the right data types.  Minmax opclasses return wrong results without
this patch.  You can reproduce it with this query on
the regression database:

select * from brintest where timestampcol = '1979-01-29 11:05:09'::timestamptz;

 Anyway I tried applying it on isolation, and found that it fails the
 assertion that tests the union support proc in brininsert.  That
 doesn't seem okay.  I mean, it's okay not to run the test for the
 inclusion opclasses, but why does it now fail in minmax which was
 previously passing?  Couldn't figure it out.

The regression tests passed when I tried it on the current master.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-05-06 Thread Pavel Stehule
2015-05-06 0:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 I wrote:
  Pavel Stehule pavel.steh...@gmail.com writes:
  Significant slowdown is on following test:

  do $$ declare a int[] := '{}'; begin for i in 1..9 loop a := a ||
 10;
  end loop; end$$ language plpgsql;
  do $$ declare a numeric[] := '{}'; begin for i in 1..9 loop a := a
 ||
  10.1; end loop; end$$ language plpgsql;

  integer master 14sec x patched 55sec
  numeric master 43sec x patched 108sec

  It is probably worst case - and it is known plpgsql antipattern

  Yeah, I have not expended a great deal of effort on the array_append/
  array_prepend/array_cat code paths.  Still, in these plpgsql cases,
  we should in principle have gotten down from two array copies per loop to
  one, so it's disappointing to not have better results there, even
 granting
  that the new copy step is not just a byte-by-byte copy.  Let me see if
  there's anything simple to be done about that.

 The attached updated patch reduces both of those do-loop tests to about
 60 msec on my machine.  It contains two improvements over the 1.1 patch:

 1. There's a fast path for copying an expanded array to another expanded
 array when the element type is pass-by-value: we can just memcpy the
 Datum array instead of working element-by-element.  In isolation, that
 change made the patch a little faster than 9.4 on your int-array case,
 though of course it doesn't help for the numeric-array case (and I do not
 see a way to avoid working element-by-element for pass-by-ref cases).

 2. pl/pgsql now detects cases like a := a || x and allows the array a
 to be passed as a read-write pointer to array_append, so that array_append
 can modify expanded arrays in-place and avoid inessential data copying
 altogether.  (The earlier patch had made array_append and array_prepend
 safe for this usage, but there wasn't actually any way to invoke them
 with read-write pointers.)  I had speculated about doing this in my
 earliest discussion of this patch, but there was no code for it before.

 The key question for change #2 is how do we identify what is a safe
 top-level function that can be trusted not to corrupt the read-write value
 if it fails partway through.  I did not have a good answer before, and
 I still don't; what this version of the patch does is to hard-wire
 array_append and array_prepend as the functions considered safe.
 Obviously that is crying out for improvement, but we can leave that
 question for later; at least now we have infrastructure that makes it
 possible to do it.

 Change #1 is actually not relevant to these example cases, because we
 don't copy any arrays within the loop given change #2.  But I left it in
 because it's not much code and it will help for situations where change #2
 doesn't apply.


I can confirm this speedup - pretty nice.

Multidimensional append is slower 2x .. but it is probably corner case

declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[[i
]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;

but this optimization doesn't work for code - that is semantically same
like a || i;

declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[i ];
end loop; raise notice '%', 'aa'; end$$ language plpgsql;

So there is some to much sensible

There are slowdown with MD arrays, but it is not typical use case, and the
speedup is about 5-10x and faster - so I'll be very happy if this patch
will be in 9.5

Regards

Pavel



 regards, tom lane




[HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade

2015-05-06 Thread Pavel Stehule
Hi

I am working on preparation the migration from 9.2 to 9.4

pg_upgrade fails

 pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d
/mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k'
Performing Consistency Checks
-
Checking cluster versions   ok
The old cluster lacks some required control information:
  latest checkpoint oldest MultiXactId

?

Regards

Pavel


[HACKERS] Where are the detoast function called in select * from table_name case?

2015-05-06 Thread Rui Hai Jiang

Hello,

I've been reading the PostgreSQL code for weeks to figure out how TOAST 
works.
I couldn't find where are the TOAST function called to detoast a tuple 
comes from a select query, for example, select * from table_name.
Does anyone know this? Can you give me some help? And any help would 
save me a lot of time.


Thanks,
Rui Hai



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade

2015-05-06 Thread Pavel Stehule
2015-05-06 15:15 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  Hi
 
  I am working on preparation the migration from 9.2 to 9.4
 
  pg_upgrade fails
 
   pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d
  /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k'
  Performing Consistency Checks
  -
  Checking cluster versions   ok
  The old cluster lacks some required control information:
latest checkpoint oldest MultiXactId
 
  ?

 Uh, this is certainly supposed to work.  Maybe pg_controldata or
 pg_resetxlog changed output format and pg_upgrade doesn't know how to
 read it.


It is tested on fresh 9.2.10 to 9.4.1

Regards

Pavel



 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services



Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade

2015-05-06 Thread Alvaro Herrera
Pavel Stehule wrote:
 Hi
 
 I am working on preparation the migration from 9.2 to 9.4
 
 pg_upgrade fails
 
  pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d
 /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k'
 Performing Consistency Checks
 -
 Checking cluster versions   ok
 The old cluster lacks some required control information:
   latest checkpoint oldest MultiXactId
 
 ?

Uh, this is certainly supposed to work.  Maybe pg_controldata or
pg_resetxlog changed output format and pg_upgrade doesn't know how to
read it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where are the detoast function called in select * from table_name case?

2015-05-06 Thread Pavel Stehule
Hi

Depends on usage, but often times the detoasting is called from
DatumGet* macros

The values are detoasted only when it is required

#define PG_GETARG_TEXT_PP(n)   DatumGetTextPP(PG_GETARG_DATUM(n))
#define DatumGetTextPP(X)  ((text *) PG_DETOAST_DATUM_PACKED(X))
#define PG_DETOAST_DATUM(datum)   pg_detoast_datum((struct varlena *)
DatumGetPointer(datum))

Regards

Pavel Stehule

2015-05-06 15:00 GMT+02:00 Rui Hai Jiang ruihaiji...@msn.com:

 Hello,

 I've been reading the PostgreSQL code for weeks to figure out how TOAST
 works.
 I couldn't find where are the TOAST function called to detoast a tuple
 comes from a select query, for example, select * from table_name.
 Does anyone know this? Can you give me some help? And any help would save
 me a lot of time.

 Thanks,
 Rui Hai



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-05-06 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 Multidimensional append is slower 2x .. but it is probably corner case

 declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[[i
 ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;

Yeah, that's array_cat(), which I've not done anything with.  I'm not
really excited about adding code to it; I think use-cases like this one
are probably too uncommon to justify more code.  In any case we could
go back and improve it later if there are enough complaints.

Another way to look at it is that in this example, plpgsql's attempts to
force the a array into expanded form are a mistake: we never get any
benefit because array_cat() just wants it in flat form again, and delivers
it in flat form.  (It's likely that this is an unrealistic worst case:
it's hard to imagine real array-using applications that never do any
element-by-element access.)  Possibly we could improve matters with a more
refined heuristic about whether to force arrays to expanded form during
assignments --- but I'm not sure what that would look like.  plpgsql has
very little direct knowledge of which operations will be applied to the
array later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Robert Haas
On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com wrote:
 Can someone look at this patch. It should fix bug #12845.

 The current tests for conversions are very minimal. I expanded them a
 bit for this bug.

 I think the binary search in the .map files should be removed but I
 leave that for another patch.

Please add this patch to
https://commitfest.postgresql.org/action/commitfest_view/open so we
don't forget about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Alvaro Herrera
Robert Haas wrote:

 I frankly find that a bit difficult to swallow.  You think that
 everyone knows that bad passwords are a problem, but some people might
 not realize that an authentication method called trust might not be
 secure?

Ultimately, what we offer to users is choice of a few options.  Should
we only offer options that we consider to be completely secure, and no
others?  If we were to follow that principle, we would completely
disable non-SSL builds, and all auth methods other than, I dunno, GSSAPI
and such.  But we don't do that, because we trust that users will use
whatever is most appropriate for them.  I see this patch is, in a way, a
mechanism to let system administrators choose at compile time what
options are available to DBAs at setup time.  This seems a reasonable
thing to me.

I don't necessarily agree with the patch as proposed.  I would rather
have a comma-separated list of methods, as in:

--disable-auth=ident,peer

which lets you choose what to disable without hardcoded choices.  Due to
the nature of autoconf, this might be too fiddly to implement, though,
and if so I think the method proposed by this patch seems a reasonable
compromise.  I've seen configure in other programs offer options such as
--disable-foo=list that lists acceptable values (or --disable-foo=help)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Andres Freund
On 2015-05-05 15:27:09 +0300, Heikki Linnakangas wrote:
 I'm a bit late to the party as I haven't paid much attention to the syntax
 before, but let me give some comments on this arbiter index inference
 thingie.
 
 
 To recap, there are three variants:
 
 A. INSERT ... ON CONFLICT DO NOTHING
 
 No arbiter is specified. This means that a conflict on any unique or
 exclusion constraint is not allowed (and will do nothing instead). This
 variant is only accepted for DO NOTHING.
 
 B. INSERT ... ON CONFLICT ON constraint name DO NOTHING/UPDATE
 
 In this variant, you explicitly specify the constraint by name.

I do think it's a bit sad to not be able to specify unique indexes that
aren't constraints. So I'd like to have a corresponding ON INDEX - which
would be trivial.

 C. INSERT ... ON CONFLICT (index params) [WHERE expression] DO
 NOTHING/UPDATE
 
 This specifies an index (or indexes, in the corner case that there are
 several identical ones), by listing the columns/expressions and the
 predicate for a partial index. The list of columns and WHERE match the
 syntax for CREATE INDEX.
 
 
 That's pretty good overall. A few questions:
 
 1. Why is the variant without specifying an index or constraint not allowed
 with DO UPDATE? I agree it might not make much sense, but then again, it
 might. If we're afraid that it's too unsafe to be the default if you don't
 specify any constraint, how about allowing it with a more verbose ON
 CONFLICT ON ANY CONSTRAINT syntax?

I think that'd be useful. Peter seems to be against it on pureness
grounds when we argued against it before, but I know that I'd wished for
it before.

 2. Why can't you specify multiple constraints, even though we implicitly
 allow any with the first variant?

Yea.

 Finally, a couple of suggestions. It would be pretty handy to allow:
 
 INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE

Not sure if that really has that big of a use case, but it'd also be
simple.

 Also, I wonder if we should change the B syntax to be:
 
 INSERT ... ON CONFLICT ON *CONSTRAINT* constraint name DO NOTHING/UPDATE

Oh yes.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-05-06 Thread Pavel Stehule
2015-05-06 15:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Pavel Stehule pavel.steh...@gmail.com writes:
  Multidimensional append is slower 2x .. but it is probably corner case

  declare a int[] := '{}'; begin for i in 1..9 loop a := a || ARRAY[[i
  ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;

 Yeah, that's array_cat(), which I've not done anything with.  I'm not
 really excited about adding code to it; I think use-cases like this one
 are probably too uncommon to justify more code.  In any case we could
 go back and improve it later if there are enough complaints.

 Another way to look at it is that in this example, plpgsql's attempts to
 force the a array into expanded form are a mistake: we never get any
 benefit because array_cat() just wants it in flat form again, and delivers
 it in flat form.  (It's likely that this is an unrealistic worst case:
 it's hard to imagine real array-using applications that never do any
 element-by-element access.)  Possibly we could improve matters with a more
 refined heuristic about whether to force arrays to expanded form during
 assignments --- but I'm not sure what that would look like.  plpgsql has
 very little direct knowledge of which operations will be applied to the
 array later.


Isn't better to push information about possible target to  function?

array_cat(a, b, result)
{
  if (undef(result))
 return a || b;

   if (b == result)
array_extend(result, a);
return result;
   else if (a == result)
array_extend(result, b);
return result;
   else
return a || b;
}

It can be used for arrays, for strings?

On second hand it decrease readability related functions :( (but not all
functions should to support this optimization).

Regards

Pavel


 regards, tom lane



Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Alvaro Herrera
Robert Haas wrote:
 On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com wrote:
  Can someone look at this patch. It should fix bug #12845.
 
  The current tests for conversions are very minimal. I expanded them a
  bit for this bug.
 
  I think the binary search in the .map files should be removed but I
  leave that for another patch.
 
 Please add this patch to
 https://commitfest.postgresql.org/action/commitfest_view/open so we
 don't forget about it.

If we think this is a bug fix, we should add it to the open items list,
https://wiki.postgresql.org/wiki/Open_Items

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-05-06 Thread Robert Haas
On Wed, May 6, 2015 at 7:55 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 - I believe the separation of concerns between ExecFunnel() and
 ExecEndFunnel() is not quite right.  If the scan is shut down before
 it runs to completion (e.g. because of LIMIT), then I think we'll call
 ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path.  I
 think you probably need to create a static subroutine that is called
 both as soon as TupIsNull(slot) and also from ExecEndFunnel(), in each
 case cleaning up whatever resources remain.

 Right, will fix as per suggestion.

 I observed one issue while working on this review comment.  When we
 try to destroy the parallel setup via ExecEndNode (as due to Limit
 Node, it could not destroy after consuming all tuples), it waits for
 parallel
 workers to finish (WaitForParallelWorkersToFinish()) and parallel workers
 are waiting for master backend to signal them as their queue is full.
 I think in such a case master backend needs to inform workers either when
 the scan is discontinued due to limit node or while waiting for parallel
 workers to finish.

Isn't this why TupleQueueFunnelShutdown() calls shm_mq_detach()?
That's supposed to unstick the workers; any impending or future writes
will just return SHM_MQ_DETACHED without waiting.

 That's technically true, but the incremental work involved in
 supporting a new worker is extremely small compare with worker startup
 times.  I'm guessing that the setup cost is going to be on the order
 of hundred-thousands or millions and and the startup cost is going to
 be on the order of tens or ones.

 Can we safely estimate the cost of restoring parallel state (GUC's,
 combo CID, transaction state, snapshot, etc.) in each worker as a setup
 cost?  There could be some work like restoration of locks (acquire all or
 relevant locks at start of parallel worker, if we follow your proposed
 design
 and even if we don't follow that there could be some similar substantial
 work)
 which could be substantial and we need to do the same for each worker.
 If you think restoration of parallel state in each worker is a pretty
 small work, then what you say makes sense to me.

Well, all the workers restore that state in parallel, so adding it up
across all workers doesn't really make sense.  But anyway, no, I don't
think that's a big cost.  I think the big cost is going to the
operating system overhead of process creation.  The new process will
incur lots of page faults as it populates its address space and
dirties pages marked copy-on-write.  That's where I expect most of the
expense to be.

 And I actually hope you *can't* present some contrary evidence.
 Because if you can, then that might mean that we need to cost every
 possible path from 0 up to N workers and let the costing machinery
 decide which one is better.

 Not necesarally, we can follow a rule that number of workers
 that need to be used for any parallel statement are equal to degree of
 parallelism (parallel_seqscan_degree) as set by user.  I think we
 need to do some split up of number workers when there are multiple
 parallel operations in single statement (like sort and parallel scan).

Yeah.  I'm hoping we will be able to use the same pool of workers for
multiple operations, but I realize that's a feature we haven't
designed yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-05-06 Thread Amit Kapila
On Tue, Apr 28, 2015 at 5:37 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Apr 24, 2015 at 8:32 AM, Amit Kapila amit.kapil...@gmail.com
wrote:

 - I believe the separation of concerns between ExecFunnel() and
 ExecEndFunnel() is not quite right.  If the scan is shut down before
 it runs to completion (e.g. because of LIMIT), then I think we'll call
 ExecEndFunnel() before ExecFunnel() hits the TupIsNull(slot) path.  I
 think you probably need to create a static subroutine that is called
 both as soon as TupIsNull(slot) and also from ExecEndFunnel(), in each
 case cleaning up whatever resources remain.


 Right, will fix as per suggestion.

I observed one issue while working on this review comment.  When we
try to destroy the parallel setup via ExecEndNode (as due to Limit
Node, it could not destroy after consuming all tuples), it waits for
parallel
workers to finish (WaitForParallelWorkersToFinish()) and parallel workers
are waiting for master backend to signal them as their queue is full.
I think in such a case master backend needs to inform workers either when
the scan is discontinued due to limit node or while waiting for parallel
workers to finish.


  - I don't think you need both setup cost and startup cost.  Starting
  up more workers isn't particularly more expensive than starting up
  fewer of them, because most of the overhead is in waiting for them to
  actually start, and the number of workers is reasonable, then they're
  all be doing that in parallel with each other.  I suggest removing
  parallel_startup_cost and keeping parallel_setup_cost.
 
  There is some work (like creation of shm queues, launching of workers)
  which is done proportional to number of workers during setup time. I
  have kept 2 parameters to distinguish such work.  I think you have a
  point that start of some or all workers could be parallel, but I feel
  that still is a work proportinal to number of workers.  For future
  parallel operations also such a parameter could be useful where we need
  to setup IPC between workers or some other stuff where work is
proportional
  to workers.

 That's technically true, but the incremental work involved in
 supporting a new worker is extremely small compare with worker startup
 times.  I'm guessing that the setup cost is going to be on the order
 of hundred-thousands or millions and and the startup cost is going to
 be on the order of tens or ones.


Can we safely estimate the cost of restoring parallel state (GUC's,
combo CID, transaction state, snapshot, etc.) in each worker as a setup
cost?  There could be some work like restoration of locks (acquire all or
relevant locks at start of parallel worker, if we follow your proposed
design
and even if we don't follow that there could be some similar substantial
work)
which could be substantial and we need to do the same for each worker.
If you think restoration of parallel state in each worker is a pretty
small work, then what you say makes sense to me.


 And I actually hope you *can't* present some contrary evidence.
 Because if you can, then that might mean that we need to cost every
 possible path from 0 up to N workers and let the costing machinery
 decide which one is better.

Not necesarally, we can follow a rule that number of workers
that need to be used for any parallel statement are equal to degree of
parallelism (parallel_seqscan_degree) as set by user.  I think we
need to do some split up of number workers when there are multiple
parallel operations in single statement (like sort and parallel scan).


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Where are the detoast function called in select * from table_name case?

2015-05-06 Thread Heikki Linnakangas

On 05/06/2015 04:00 PM, Rui Hai Jiang wrote:

I've been reading the PostgreSQL code for weeks to figure out how TOAST
works.
I couldn't find where are the TOAST function called to detoast a tuple
comes from a select query, for example, select * from table_name.
Does anyone know this? Can you give me some help? And any help would
save me a lot of time.


The tuple isn't detoasted immediately when it's read. Individual datums 
of the tuple are detoasted lazily, when needed, in whatever function 
it's passed to. If you just do select * from table, the first function 
it's passed to is the datatype's output function, so that detoasts it. 
For example, if it's a text column, the toasted Datum is passed to 
textout(), which calls text_to_cstring(). text_to_cstring() calls 
pg_detoast_datum_packed(), which finally detoasts it.


You can launch a debugger on the backend process, and put a breakpoint 
on e.g. heap_tuple_untoast_attr() to see it in action.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-05-06 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2015-05-06 15:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:
 Another way to look at it is that in this example, plpgsql's attempts to
 force the a array into expanded form are a mistake: we never get any
 benefit because array_cat() just wants it in flat form again, and delivers
 it in flat form.  (It's likely that this is an unrealistic worst case:
 it's hard to imagine real array-using applications that never do any
 element-by-element access.)  Possibly we could improve matters with a more
 refined heuristic about whether to force arrays to expanded form during
 assignments --- but I'm not sure what that would look like.  plpgsql has
 very little direct knowledge of which operations will be applied to the
 array later.

 Isn't better to push information about possible target to  function?

I don't think that would solve the problem.  For example, one of the cases
I worry about is a function that does read-only examination of an array
argument; consider something like

create function sum_squares(a numeric[]) returns numeric as $$
declare s numeric := 0;
begin
for i in array_lower(a, 1) .. array_upper(a, 1) loop
s := s + a[i] * a[i];
end loop;
return s;
end;
$$ language plpgsql strict immutable;

array_get_element() is not in a position here to force expansion of the
array variable, so unless plpgsql itself does something we're not going
to get a performance win (unless the argument happens to be already
expanded on arrival).

I'm inclined to think that we need to add information to pg_type about
whether a type supports expansion (and how to convert to expanded form
if so).  In the patch as it stands, plpgsql just has hard-wired knowledge
that it can call expand_array() on arrays that it's putting into function
local variables.  I'd be okay with shipping 9.5 like that, but pretty soon
we'll want a solution that extension data types can use too.

More generally, it'd be nice if the mechanism could be more flexible than
always force variables of this type to expanded form.  But I don't see
how to teach plpgsql itself how to decide that intelligently, let alone
how we might design an API that lets some third-party data type decide it
at arm's length from plpgsql ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Josh Berkus
On 05/06/2015 02:13 PM, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
 (Personally I think there's a very good case for completely ripping out 
 RFC1413 ident auth. I've not seen it used in a great long while, and 
 it's always been a security risk.)
 
 FWIW, I agree with that --- or at least making it a not-built-by-default
 option.

I have seen it in the last year, actually, but only once, which even for
my personal pool represents  1% usage.  So ...

 Probably the right time to make any such changes is at the same time
 we add the proposed more-secure-than-MD5 password option.

+1 to kill off ident when we replace MD5, since users will need to be
beaten over the head about changes to auth methods anyway.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Jim Nasby

On 5/6/15 12:56 PM, Peter Eisentraut wrote:

I think this is a sufficiently general requirement to warrant including
an option to disable this, as most hardening guides I have seen for
PostgreSQL unconditionally require to disable trust authentication and
disabling it in the code removes the need to check this in the runtime
configuration.

I think people would be interested in well-thought out, generalized
hardening facilities.  But that would likely include other things than
just disabling an authentication method or two.  And we can't be adding
a new compile-time option as we add each one.  We need a more general
approach.


Yeah. I think one of the big use cases here is that many environments 
are OK with at least ident (if not trust) but only from the local 
machine. So you'd probably want to handle that somehow.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN range operator class

2015-05-06 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Let's think together and try to find a reasonable way to get the union
 procedures tested regularly.  It is pretty clear that having them run
 only when the race condition occurs is not acceptable; bugs go
 unnoticed.

[ just a drive-by comment... ]  Maybe you could set up a testing mode
that forces the race condition to occur?  Then you could test the calling
code paths, not only the union procedures per se.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Heikki Linnakangas

On 05/07/2015 01:32 AM, Jim Nasby wrote:

On 5/6/15 12:56 PM, Peter Eisentraut wrote:

I think this is a sufficiently general requirement to warrant including

an option to disable this, as most hardening guides I have seen for
PostgreSQL unconditionally require to disable trust authentication and
disabling it in the code removes the need to check this in the runtime
configuration.

I think people would be interested in well-thought out, generalized
hardening facilities.  But that would likely include other things than
just disabling an authentication method or two.  And we can't be adding
a new compile-time option as we add each one.  We need a more general
approach.


Yeah. I think one of the big use cases here is that many environments
are OK with at least ident (if not trust) but only from the local
machine. So you'd probably want to handle that somehow.


That's called 'peer', since 9.1.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Andres Freund
On 2015-05-06 13:05:16 -0700, Peter Geoghegan wrote:
 On Wed, May 6, 2015 at 7:53 AM, Andres Freund and...@anarazel.de wrote:
  In this variant, you explicitly specify the constraint by name.
 
  I do think it's a bit sad to not be able to specify unique indexes that
  aren't constraints. So I'd like to have a corresponding ON INDEX - which
  would be trivial.
 
 Then what's the point of having ON CONSTRAINT?

That it supports exclusion constraints?

 I would care about the fact that you can't name a unique index with no
 constraint if there wasn't already a way of doing that with inference
 (I'm thinking in particular of partial indexes here, which never have
 constraints). But there is. So what's the problem?

Personally I think a complex expression index is something many people
will not want to specify every time. And since partial/expression
indexes can't even have constraints...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Heikki Linnakangas

On 05/06/2015 11:05 PM, Peter Geoghegan wrote:

On Wed, May 6, 2015 at 7:53 AM, Andres Freund and...@anarazel.de wrote:

In this variant, you explicitly specify the constraint by name.


I do think it's a bit sad to not be able to specify unique indexes that
aren't constraints. So I'd like to have a corresponding ON INDEX - which
would be trivial.


Then what's the point of having ON CONSTRAINT? The point of it working
that way was we're not exposing the implementation detail of the
index. While I happen to think that that's a distinction without a
difference anyway, that certainly was the idea.


Right, that's the idea. Indexes are just an implementation detail - 
conceivably you could have a constraint that's backed by some other 
mechanism. You should not embed implementation details like index names 
in your queries.


Unfortunately you can't create a partial constraint - you'll have to 
create a partial index. I wish we would fix that directly, by allowing 
partial unique constraints.


That said, I wouldn't necessarily be opposed to also having the syntax 
to name an index directly, as long as we had some notices in the docs to 
tell people to avoid it.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Heikki Linnakangas

On 05/07/2015 12:01 AM, Andres Freund wrote:

On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote:

I'll see about fixing that. It's not just a matter of creating another alias
for the same rel, I'm afraid: foo.t is supposed to refer to the tuple that
we attempted to insert, like it does without the ON CONFLICT.


I'm not sure what you mean here?


Sorry, forget about that. I was confused and mixed up EXCLUDED and 
TARGET. Looks like they really aren't very good names :-).


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0

2015-05-06 Thread Andres Freund
On 2015-05-06 22:51:43 +0300, Heikki Linnakangas wrote:
 Yeah, I agree that DO NOTHING should not lock the rows. It might make sense
 to have a DO LOCK variant, which locks the rows, although I don't
 immediately see what the use case would be.

If you want to do something more complicated with the row than what you
can do in the UPDATE. To do that right now you either need to do the DO
UPDATE SET ... WHERE false; and refetch the tuple which might not be
easy, or do a DO UPDATE SET pkey = target.pkey which produces
bloat. Consider e.g. you're applying incoming data and in case of
conflict you want to call user defined function deciding which row
should survive.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Peter Geoghegan
On Wed, May 6, 2015 at 1:48 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 TARGET is also very descriptive, because it situationally describes
 either the existing tuple actually present in the table, or (from a
 RETURNING clause) the final tuple present in the table post-UPDATE.
 We use the term target for that pervasively (in the docs and in the
 code).


 but I find that totally unconvincing. It's clear that TARGET refers to the
 table being upserted, but it's totally unclear on *which* version of the
 tuple it refers to.

Then we're simply talking about 2 different things. My understanding
is that it *is* the relation. And like UPDATE's RETURNING, it will be
the same relation/alias but a different tuple here. Andres said this
was a mutating tuple or something like that, and I suppose it is. But
Vars are variables.

Now, Andres (and now you) want to change it so that the TARGET alias
becomes magical and expression-like, so that it really does refer to a
tuple and not a relation (and so is closer to EXCLUDED.*). And you
seem pretty set on that. That being the case, clearly TARGET is
unsuitable for the reasons you state.

I suppose that it doesn't much matter, but that's how I understood the
situation all along. So I can see why you don't like TARGET in light
of that. I would vote for EXISTING as an alternative, given that it's
pretty clear that what is now TARGET.* will become a magic
alias/expression thing. EXISTING is the EXISTING tuple, which goes
well with EXCLUDED.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Andres Freund
On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote:
 I'll see about fixing that. It's not just a matter of creating another alias
 for the same rel, I'm afraid: foo.t is supposed to refer to the tuple that
 we attempted to insert, like it does without the ON CONFLICT.

I'm not sure what you mean here?

 But actually, I don't much like the target alias in the first place. We
 never really completed this discussion, everyone just got tired:

Right. But that doesn't affect the it's not just a matter of ... bit
above, right?

 Reading through this sub-thread, these spellings have been proposed:
 
 1. TARGET and EXCLUDED
 
 2. NEW and EXISTING
 
 3. NEW and OLD
 
 4. PROPOSED and EXISTING
 
 5. CONFLICTING and EXISTING
 
 Did I miss any? Now, let me opine on these.

How about
6. The tablename and EXCLUDED? Possibility with the ability to specify
   an AS for INSERT INTO foo AS whatever?

From an implementation pov that'd be simple ;)

 NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the
 version after the UPDATE, and OLD to the version before. However, there's
 the serious problem that in a trigger function, OLD/NEW are already in use.
 How bad is that? At least in PL/pgSQL you can work around it by aliasing the
 variables, but it's a bit inconvenient. How often would INSERT .. ON
 CONFLICT DO UPDATE be used in a trigger?

I personally think it's a killer. It'll be very annoying to understand
mistaken usage of NEW/OLD in that case.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Heikki Linnakangas

On 05/07/2015 12:18 AM, Andres Freund wrote:

On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote:

Right, that's the idea. Indexes are just an implementation detail -


I think that's a distinction just about no user out there cares about.


Unfortunately you can't create a partial constraint - you'll have to
create a partial index. I wish we would fix that directly, by allowing
partial unique constraints.


It's not just partial ones, it's also expression ones, right?


True.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Peter Geoghegan
On Wed, May 6, 2015 at 7:53 AM, Andres Freund and...@anarazel.de wrote:
 In this variant, you explicitly specify the constraint by name.

 I do think it's a bit sad to not be able to specify unique indexes that
 aren't constraints. So I'd like to have a corresponding ON INDEX - which
 would be trivial.

Then what's the point of having ON CONSTRAINT? The point of it working
that way was we're not exposing the implementation detail of the
index. While I happen to think that that's a distinction without a
difference anyway, that certainly was the idea.

I would care about the fact that you can't name a unique index with no
constraint if there wasn't already a way of doing that with inference
(I'm thinking in particular of partial indexes here, which never have
constraints). But there is. So what's the problem?
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Peter Geoghegan
On Wed, May 6, 2015 at 1:22 PM, Andres Freund and...@anarazel.de wrote:
 That it supports exclusion constraints?

But so does just naming the index. I don't think it's significant that
exclusion operators are in pg_constraint -- you could just as easily
name the index, since that's all you ultimately end up with anyway.

 I would care about the fact that you can't name a unique index with no
 constraint if there wasn't already a way of doing that with inference
 (I'm thinking in particular of partial indexes here, which never have
 constraints). But there is. So what's the problem?

 Personally I think a complex expression index is something many people
 will not want to specify every time. And since partial/expression
 indexes can't even have constraints...

The downsides seem severe.  A CREATE INDEX CONCURRENTLY just broke
your statement, because you didn't account for the new, equivalent
unique index during inference, and now you have to drop the old index
and break application code. Is that really worth introducing just to
prevent app devs from writing the inference specification? The
specification explicitly documents their intent, which seems like a
good thing.

Robert really disliked the idea of even naming the constraint, let
alone the index. I'm trying to balance things, here.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN range operator class

2015-05-06 Thread Alvaro Herrera
I again have to refuse the notion that removing the assert-only block
without any replacement is acceptable.  I just spent a lot of time
tracking down what turned out to be a bug in your patch 07:

/* Adjust maximum, if B's max is greater than A's max */
-   needsadj = FunctionCall2Coll(minmax_get_procinfo(bdesc, attno,
-   
 PROCNUM_GREATER),
- colloid, col_b-bv_values[1], 
col_a-bv_values[1]);
+   frmg = minmax_get_strategy_procinfo(bdesc, attno, attr-atttypid,
+   
BTGreaterStrategyNumber);
+   needsadj = FunctionCall2Coll(frmg, colloid, col_b-bv_values[0],
+
col_a-bv_values[0]);

Note the removed lines use array index 1, while the added lines use
array index 0.  The only reason I noticed this is because I applied this
patch without the others and saw the assertion fire; how would I have
noticed the problem had I just removed it?

Let's think together and try to find a reasonable way to get the union
procedures tested regularly.  It is pretty clear that having them run
only when the race condition occurs is not acceptable; bugs go
unnoticed.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Andrew Dunstan


On 05/06/2015 04:19 PM, Robert Haas wrote:

On Wed, May 6, 2015 at 3:57 PM, Andrew Dunstan and...@dunslane.net wrote:

I don't necessarily object to this idea, but I do think we need to ensure
that we don't allow both trust and peer to be disabled (which means on
Windows you would not be able to disable trust). Otherwise this becomes a
footgun which would require the whole server to be stopped so you could
connect in single user mode to correct certain mistakes, which are
unfortunately all too common.

Of course that's precisely what the OP wanted to do, which goes to my
point that not everybody's going to want the same thing.



If that is indeed the proposal, then I vote no.

But he did say upthread:

Single user sessions would work, but the peer authentication is also 
still available and should be the preferred method to reset passwords 
when trust is disabled, so this should not be an issue.


(Personally I think there's a very good case for completely ripping out 
RFC1413 ident auth. I've not seen it used in a great long while, and 
it's always been a security risk.)



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Andres Freund
On 2015-05-06 13:37:07 -0700, Peter Geoghegan wrote:
 On Wed, May 6, 2015 at 1:22 PM, Andres Freund and...@anarazel.de wrote:
  That it supports exclusion constraints?
 
 But so does just naming the index. I don't think it's significant that
 exclusion operators are in pg_constraint -- you could just as easily
 name the index, since that's all you ultimately end up with anyway.

The index name for constraints is generated and not trivially safely
guessable for a user.

  I would care about the fact that you can't name a unique index with no
  constraint if there wasn't already a way of doing that with inference
  (I'm thinking in particular of partial indexes here, which never have
  constraints). But there is. So what's the problem?
 
  Personally I think a complex expression index is something many people
  will not want to specify every time. And since partial/expression
  indexes can't even have constraints...
 
 The downsides seem severe.  A CREATE INDEX CONCURRENTLY just broke
 your statement, because you didn't account for the new, equivalent
 unique index during inference, and now you have to drop the old index
 and break application code. Is that really worth introducing just to
 prevent app devs from writing the inference specification? The
 specification explicitly documents their intent, which seems like a
 good thing.

I don't find that all that severe. It might just as well be the case
that the new unique constraint isn't intended to be handled by ON
CONFLICT. And having a inference specification that's longer than the
rest of the statement surely isn't helpful.

 Robert really disliked the idea of even naming the constraint, let
 alone the index. I'm trying to balance things, here.

I fail to see what doing something halfhearted helps.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Heikki Linnakangas
Andres pointed out on IM that the TARGET alias is a bit crummy. In 
particular, adding an ON CONFLICT DO UPDATE can make a RETURNING clause 
invalid, because we change the alias of the target rel:


create table foo (id int4 primary key, t text);

This works:

postgres=# insert into foo (id, t) values (1, 'x') returning foo.t;
 t
---
 x
(1 row)

INSERT 0 1

Same statement with ON CONFLICT DO UPDATE fails:

postgres=# insert into foo (id, t) values (1, 'x') on conflict (id) do 
update set t = 'x' returning foo.t;

ERROR:  invalid reference to FROM-clause entry for table foo
LINE 1: ...'x') on conflict (id) do update set t = 'x' returning foo.t;
 ^
HINT:  Perhaps you meant to reference the table alias target.

I'll see about fixing that. It's not just a matter of creating another 
alias for the same rel, I'm afraid: foo.t is supposed to refer to the 
tuple that we attempted to insert, like it does without the ON CONFLICT.


But actually, I don't much like the target alias in the first place. 
We never really completed this discussion, everyone just got tired:


On 04/29/2015 10:13 PM, Stephen Frost wrote:

* Robert Haas (robertmh...@gmail.com) wrote:

On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan p...@heroku.com wrote:

* Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
TARGET.*). Those seem fine to me as well.


There seem to be a few votes for NEW and OLD.  That's what I proposed
originally, and (surprise, surprise) I still like that better too.


I was promoting NEW/OLD, until I realized that we'd end up having a
problem in trigger functions because NEW/OLD are already defined there,
unless you have a suggestion for how to improve on that?


Reading through this sub-thread, these spellings have been proposed:

1. TARGET and EXCLUDED

2. NEW and EXISTING

3. NEW and OLD

4. PROPOSED and EXISTING

5. CONFLICTING and EXISTING

Did I miss any? Now, let me opine on these.

EXCLUDED seems fine to me. I don't see us using that term elsewhere, and 
it makes me think of exclusion constraints, but nevertheless I think 
it's pretty easy remember what it means. TARGET, however, is totally 
inscrutable. Peter argued earlier that:



TARGET is also very descriptive, because it situationally describes
either the existing tuple actually present in the table, or (from a
RETURNING clause) the final tuple present in the table post-UPDATE.
We use the term target for that pervasively (in the docs and in the
code).


but I find that totally unconvincing. It's clear that TARGET refers to 
the table being upserted, but it's totally unclear on *which* version of 
the tuple it refers to.


NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to 
the version after the UPDATE, and OLD to the version before. However, 
there's the serious problem that in a trigger function, OLD/NEW are 
already in use. How bad is that? At least in PL/pgSQL you can work 
around it by aliasing the variables, but it's a bit inconvenient. How 
often would INSERT .. ON CONFLICT DO UPDATE be used in a trigger?


I don't have much to say about the rest. PROPOSED, EXISTING, 
CONFLICTING, they're all fairly descriptive, but long.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Peter Geoghegan
On Wed, May 6, 2015 at 2:01 PM, Andres Freund and...@anarazel.de wrote:
 How about
 6. The tablename and EXCLUDED? Possibility with the ability to specify
an AS for INSERT INTO foo AS whatever?

 From an implementation pov that'd be simple ;)

That's what I wanted to do when I realized what Andres wanted to do
with the TARGET alias. Clearly that would compel us to actually make
the RETURNING clause buy into this alias, just as with a regular
UPDATE. And not having the alias on the target also be magical seems
like a good thing. Nothing can be broken by this scheme. No?

 NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the
 version after the UPDATE, and OLD to the version before. However, there's
 the serious problem that in a trigger function, OLD/NEW are already in use.
 How bad is that? At least in PL/pgSQL you can work around it by aliasing the
 variables, but it's a bit inconvenient. How often would INSERT .. ON
 CONFLICT DO UPDATE be used in a trigger?

 I personally think it's a killer. It'll be very annoying to understand
 mistaken usage of NEW/OLD in that case.

+1

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 (Personally I think there's a very good case for completely ripping out 
 RFC1413 ident auth. I've not seen it used in a great long while, and 
 it's always been a security risk.)

FWIW, I agree with that --- or at least making it a not-built-by-default
option.

Probably the right time to make any such changes is at the same time
we add the proposed more-secure-than-MD5 password option.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Andres Freund
On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote:
 Right, that's the idea. Indexes are just an implementation detail -

I think that's a distinction just about no user out there cares about.

 Unfortunately you can't create a partial constraint - you'll have to
 create a partial index. I wish we would fix that directly, by allowing
 partial unique constraints.

It's not just partial ones, it's also expression ones, right?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Maybe not, but at the very least we should consider getting it fixed in
 9.5 rather than waiting a full development cycle.  Same as in
 https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us
 I'm not saying we MUST include it in 9.5, but we should at least
 consider it.  If we simply stash it in the open CF we guarantee that it
 will linger there for a year.

 Sure, if somebody has the time to put into it now, I'm fine with that.
 I'm afraid it won't be me, though: even if I had the time, I don't
 know enough about encodings.

I concur that we should at least consider this patch for 9.5.  I've
added it to
https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

I'm willing to look at it myself, whenever my non-copious spare time
permits; but that won't be in the immediate future.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Robert Haas
On Wed, May 6, 2015 at 3:57 PM, Andrew Dunstan and...@dunslane.net wrote:
 I don't necessarily object to this idea, but I do think we need to ensure
 that we don't allow both trust and peer to be disabled (which means on
 Windows you would not be able to disable trust). Otherwise this becomes a
 footgun which would require the whole server to be stopped so you could
 connect in single user mode to correct certain mistakes, which are
 unfortunately all too common.

Of course that's precisely what the OP wanted to do, which goes to my
point that not everybody's going to want the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT syntax issues

2015-05-06 Thread Peter Geoghegan
On Tue, May 5, 2015 at 5:27 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 To recap, there are three variants:

 A. INSERT ... ON CONFLICT DO NOTHING

 No arbiter is specified. This means that a conflict on any unique or
 exclusion constraint is not allowed (and will do nothing instead). This
 variant is only accepted for DO NOTHING.

 B. INSERT ... ON CONFLICT ON constraint name DO NOTHING/UPDATE

 In this variant, you explicitly specify the constraint by name.

 C. INSERT ... ON CONFLICT (index params) [WHERE expression] DO
 NOTHING/UPDATE

 This specifies an index (or indexes, in the corner case that there are
 several identical ones), by listing the columns/expressions and the
 predicate for a partial index. The list of columns and WHERE match the
 syntax for CREATE INDEX.

I would just say that there are two variants, only one of which
mandates the inference clause. But I'm nitpicking.

 That's pretty good overall. A few questions:

Thanks. I'm glad that we are now able to cover really any conceivable
use-case, while playing nice with every existing feature (now
updatable views are supported with ON CONFLICT DO UPDATE -- and we're
also going to be able to suppor subqueries in the UPDATE). We've been
incredibly thorough.

 1. Why is the variant without specifying an index or constraint not allowed
 with DO UPDATE? I agree it might not make much sense, but then again, it
 might. If we're afraid that it's too unsafe to be the default if you don't
 specify any constraint, how about allowing it with a more verbose ON
 CONFLICT ON ANY CONSTRAINT syntax?

I think it's dangerous. It's basically wrong headed to omit any
constraint for DO UPDATE. I put a lot of effort into covering every
possible case with the inference clause, and I think it's pretty cool
that we have something that's so flexible. I don't feel bad about
forcing users to be explicit about what they want, because the only
conceivable downside is that they'll have to do a little extra typing
(if even that - it's probably going to be ORM-generated more often
than not). The upside -- not having their query break unexpectedly one
day, when a new constraint is added -- is huge.

 2. Why can't you specify multiple constraints, even though we implicitly
 allow any with the first variant?

It's just an escape hatch. I don't want to encourage its over use, and
I want to keep the grammar simple.

 Finally, a couple of suggestions. It would be pretty handy to allow:

 INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE

Is that really likely to be less verbose than naming the attributes directly?

 INSERT ... ON CONFLICT ON *CONSTRAINT* constraint name DO NOTHING/UPDATE

 That would allow the syntax can be expanded in the future to specify
 conflicts on other kind of things. The ON PRIMARY KEY syntax should be
 unambiguous with out, because PRIMARY is a reserved keyword, but for
 example, we might want to add ON UNIQUE INDEX index name later.

Okay, we can do that.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0

2015-05-06 Thread Peter Geoghegan
On Tue, May 5, 2015 at 10:31 AM, Andres Freund and...@anarazel.de wrote:
 Another thing I'm wondering about is dealing with deferrable
 constraints/deferred indexes.

 a) Why does ExecCheckIndexConstraints() check for indisimmediate for
*all* indexes and not just when it's an arbiter index? That seems
needlessly restrictive.

I guess it's a legacy of the time when it was all or nothing. But
supporting that would involve further complicating the interface with
ExecCheckIndexConstraints() so that we cared about the distinction
between conflicts for one purpose and conflicts for another. It could
be done, though.

 b) There's a doc addition to set_constraints.sgml
 +   constraints are affected by this setting.  Note that constraints
 +   that are literalDEFERRED/literal cannot be used as arbiters by
 +   the literalON CONFLICT/ clause that commandINSERT/
 +   supports.

 which I don't think makes sense: SET CONSTRAINTS will never change
 anything relevant for ON CONFLICT, right?

You're right.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-05-06 Thread Pavel Stehule
2015-05-06 18:54 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Pavel Stehule pavel.steh...@gmail.com writes:
  2015-05-06 15:50 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:
  Another way to look at it is that in this example, plpgsql's attempts to
  force the a array into expanded form are a mistake: we never get any
  benefit because array_cat() just wants it in flat form again, and
 delivers
  it in flat form.  (It's likely that this is an unrealistic worst case:
  it's hard to imagine real array-using applications that never do any
  element-by-element access.)  Possibly we could improve matters with a
 more
  refined heuristic about whether to force arrays to expanded form during
  assignments --- but I'm not sure what that would look like.  plpgsql has
  very little direct knowledge of which operations will be applied to the
  array later.

  Isn't better to push information about possible target to  function?

 I don't think that would solve the problem.  For example, one of the cases
 I worry about is a function that does read-only examination of an array
 argument; consider something like

 create function sum_squares(a numeric[]) returns numeric as $$
 declare s numeric := 0;
 begin
 for i in array_lower(a, 1) .. array_upper(a, 1) loop
 s := s + a[i] * a[i];
 end loop;
 return s;
 end;
 $$ language plpgsql strict immutable;


I remember this issue


 array_get_element() is not in a position here to force expansion of the
 array variable, so unless plpgsql itself does something we're not going
 to get a performance win (unless the argument happens to be already
 expanded on arrival).

 I'm inclined to think that we need to add information to pg_type about
 whether a type supports expansion (and how to convert to expanded form
 if so).  In the patch as it stands, plpgsql just has hard-wired knowledge
 that it can call expand_array() on arrays that it's putting into function
 local variables.  I'd be okay with shipping 9.5 like that, but pretty soon
 we'll want a solution that extension data types can use too.

 More generally, it'd be nice if the mechanism could be more flexible than
 always force variables of this type to expanded form.  But I don't see
 how to teach plpgsql itself how to decide that intelligently, let alone
 how we might design an API that lets some third-party data type decide it
 at arm's length from plpgsql ...


I agree - the core of work have to be elsewhere than in plpgsql. Some years
ago there was a idea about toast cache.




 regards, tom lane



Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0

2015-05-06 Thread Heikki Linnakangas

On 05/06/2015 10:47 PM, Peter Geoghegan wrote:

On Wed, May 6, 2015 at 8:20 AM, Andres Freund and...@anarazel.de wrote:

On 2015-05-05 15:00:56 -0700, Peter Geoghegan wrote:

Locking the row is not nothing, though. If you want to lock the row,
use an UPSERT with a tautologically false WHERE clause (like WHERE
false).


That's not the same. For one it breaks RETURNING which is a death
knell, for another it's not exactly obvious.


DO NOTHING already doesn't project non-inserted tuples, in a way that
fits with the way we won't do that when a before trigger returns NULL.
So I don't know what you mean about RETURNING behavior.

It may not be all that obvious, but then the requirement that you
mention isn't either. I really strongly feel that DO NOTHING should do
nothing. For the pgloader use-case, which is what I have in mind with
that variant, that could literally make the difference between
dirtying an enormous number of buffers and dirtying only a few. This
will *frequently* be the case. And it's not as if the idea of an
INSERT IGNORE is new or in any way novel. As I mentioned, many systems
have a comparable command.

So, yes, DO NOTHING does very little - and that is its appeal.
Supporting this behavior does not short change those who actually care
about the existing tuple sticking around for the duration of their
transaction - they have a way of doing that. If you want to document
INSERT IGNORE as being primarily an ETL-orientated thing, that would
make sense, but let's not hobble that use case.


Yeah, I agree that DO NOTHING should not lock the rows. It might make 
sense to have a DO LOCK variant, which locks the rows, although I don't 
immediately see what the use case would be.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0

2015-05-06 Thread Peter Geoghegan
On Wed, May 6, 2015 at 8:20 AM, Andres Freund and...@anarazel.de wrote:
 On 2015-05-05 15:00:56 -0700, Peter Geoghegan wrote:
 Locking the row is not nothing, though. If you want to lock the row,
 use an UPSERT with a tautologically false WHERE clause (like WHERE
 false).

 That's not the same. For one it breaks RETURNING which is a death
 knell, for another it's not exactly obvious.

DO NOTHING already doesn't project non-inserted tuples, in a way that
fits with the way we won't do that when a before trigger returns NULL.
So I don't know what you mean about RETURNING behavior.

It may not be all that obvious, but then the requirement that you
mention isn't either. I really strongly feel that DO NOTHING should do
nothing. For the pgloader use-case, which is what I have in mind with
that variant, that could literally make the difference between
dirtying an enormous number of buffers and dirtying only a few. This
will *frequently* be the case. And it's not as if the idea of an
INSERT IGNORE is new or in any way novel. As I mentioned, many systems
have a comparable command.

So, yes, DO NOTHING does very little - and that is its appeal.
Supporting this behavior does not short change those who actually care
about the existing tuple sticking around for the duration of their
transaction - they have a way of doing that. If you want to document
INSERT IGNORE as being primarily an ETL-orientated thing, that would
make sense, but let's not hobble that use case.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Andrew Dunstan


On 05/06/2015 10:47 AM, Alvaro Herrera wrote:



I don't necessarily agree with the patch as proposed.  I would rather
have a comma-separated list of methods, as in:

 --disable-auth=ident,peer

which lets you choose what to disable without hardcoded choices.  Due to
the nature of autoconf, this might be too fiddly to implement, though,
and if so I think the method proposed by this patch seems a reasonable
compromise.  I've seen configure in other programs offer options such as
--disable-foo=list that lists acceptable values (or --disable-foo=help)




I don't necessarily object to this idea, but I do think we need to 
ensure that we don't allow both trust and peer to be disabled (which 
means on Windows you would not be able to disable trust). Otherwise this 
becomes a footgun which would require the whole server to be stopped so 
you could connect in single user mode to correct certain mistakes, which 
are unfortunately all too common.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Robert Haas
On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 It's a behavior change, so I don't think we would consider a back-patch.

 Maybe not, but at the very least we should consider getting it fixed in
 9.5 rather than waiting a full development cycle.  Same as in
 https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us
 I'm not saying we MUST include it in 9.5, but we should at least
 consider it.  If we simply stash it in the open CF we guarantee that it
 will linger there for a year.

Sure, if somebody has the time to put into it now, I'm fine with that.
I'm afraid it won't be me, though: even if I had the time, I don't
know enough about encodings.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disabling trust/ident authentication configure option

2015-05-06 Thread Peter Eisentraut
On 5/6/15 6:02 AM, Volker Aßmann wrote:
 Well trust actually does not sound that dangerous in case you only
 take a quick glance at the documentation - trust PostgreSQL to do the
 right thing?

Hah, we could rename it to wideopen.

 Please note that the patch does nothing by default, it just adds the
 option to disable trust/ident but leaves them on in the standard
 configuration. I do not want to disable trust by default for everyone,
 but it would be great to have the option to do this without having to
 patch (and thus test and verify) the PostgreSQL sources for each new
 release.

Any new compile-time option creates a nonlinear maintenance burden.
We're going to need to test whether each option builds cleanly and
works, also in combination with other options, and on several platforms.
 The authentication code is already littered with build-time
dependencies and platform-specific code.  So the it doesn't bother
anyone argument doesn't quite work.

Actually, in this particular case, you wouldn't even need a compile-time
option.  You could just make it a restart-only option.

 I think this is a sufficiently general requirement to warrant including
 an option to disable this, as most hardening guides I have seen for
 PostgreSQL unconditionally require to disable trust authentication and
 disabling it in the code removes the need to check this in the runtime
 configuration.

I think people would be interested in well-thought out, generalized
hardening facilities.  But that would likely include other things than
just disabling an authentication method or two.  And we can't be adding
a new compile-time option as we add each one.  We need a more general
approach.

 Single user sessions would work, but the peer authentication is also
 still available and should be the preferred method to reset passwords
 when trust is disabled, so this should not be an issue.

peer authentication is unfortunately not quite portable.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade

2015-05-06 Thread Jeff Janes
On Wed, May 6, 2015 at 6:16 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:



 2015-05-06 15:15 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  Hi
 
  I am working on preparation the migration from 9.2 to 9.4
 
  pg_upgrade fails
 
   pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d
  /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k'
  Performing Consistency Checks
  -
  Checking cluster versions   ok
  The old cluster lacks some required control information:
latest checkpoint oldest MultiXactId
 
  ?

 Uh, this is certainly supposed to work.  Maybe pg_controldata or
 pg_resetxlog changed output format and pg_upgrade doesn't know how to
 read it.


 It is tested on fresh 9.2.10 to 9.4.1


I've done that migration many times.  Can you give the pg_config output for
both installations, and the pg_controldata output for the 9.2.10?

Thanks,

Jeff


Re: [HACKERS] is possible to upgrade from 9.2 to 9.4 with pg_upgrade

2015-05-06 Thread Jeff Janes
On Wed, May 6, 2015 at 10:26 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Wed, May 6, 2015 at 6:16 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:



 2015-05-06 15:15 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  Hi
 
  I am working on preparation the migration from 9.2 to 9.4
 
  pg_upgrade fails
 
   pg_upgrade -b /usr/lib64/pgsql/postgresql-9.2/bin -B /usr/bin/ -d
  /mnt/ebs/pgsql/data -D /mnt/ebs/pgsql/data94 -k'
  Performing Consistency Checks
  -
  Checking cluster versions   ok
  The old cluster lacks some required control information:
latest checkpoint oldest MultiXactId
 
  ?

 Uh, this is certainly supposed to work.  Maybe pg_controldata or
 pg_resetxlog changed output format and pg_upgrade doesn't know how to
 read it.


 It is tested on fresh 9.2.10 to 9.4.1


 I've done that migration many times.  Can you give the pg_config output
 for both installations, and the pg_controldata output for the 9.2.10?


Also, what is the full path to pg_upgrade?

Cheers,

Jeff


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0

2015-05-06 Thread Andreas Karlsson

On 05/06/2015 09:51 PM, Heikki Linnakangas wrote:

So, yes, DO NOTHING does very little - and that is its appeal.
Supporting this behavior does not short change those who actually care
about the existing tuple sticking around for the duration of their
transaction - they have a way of doing that. If you want to document
INSERT IGNORE as being primarily an ETL-orientated thing, that would
make sense, but let's not hobble that use case.


Yeah, I agree that DO NOTHING should not lock the rows. It might make
sense to have a DO LOCK variant, which locks the rows, although I don't
immediately see what the use case would be.


It seems like a very useful feature to me, if you want to upsert 
something into a table with a serial column and get the value of the 
serial column in a RETURNING clause (but not update any fields if there 
is a conflict). Then I am pretty sure I want to lock the row.


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Alvaro Herrera
Robert Haas wrote:
 On Wed, May 6, 2015 at 10:55 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Robert Haas wrote:
  On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com 
  wrote:
   Can someone look at this patch. It should fix bug #12845.
  
   The current tests for conversions are very minimal. I expanded them a
   bit for this bug.
  
   I think the binary search in the .map files should be removed but I
   leave that for another patch.
 
  Please add this patch to
  https://commitfest.postgresql.org/action/commitfest_view/open so we
  don't forget about it.
 
  If we think this is a bug fix, we should add it to the open items list,
  https://wiki.postgresql.org/wiki/Open_Items
 
 It's a behavior change, so I don't think we would consider a back-patch.

Maybe not, but at the very least we should consider getting it fixed in
9.5 rather than waiting a full development cycle.  Same as in
https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us
I'm not saying we MUST include it in 9.5, but we should at least
consider it.  If we simply stash it in the open CF we guarantee that it
will linger there for a year.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Robert Haas
On Wed, May 6, 2015 at 10:55 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis a.g.nienh...@gmail.com 
 wrote:
  Can someone look at this patch. It should fix bug #12845.
 
  The current tests for conversions are very minimal. I expanded them a
  bit for this bug.
 
  I think the binary search in the .map files should be removed but I
  leave that for another patch.

 Please add this patch to
 https://commitfest.postgresql.org/action/commitfest_view/open so we
 don't forget about it.

 If we think this is a bug fix, we should add it to the open items list,
 https://wiki.postgresql.org/wiki/Open_Items

It's a behavior change, so I don't think we would consider a back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE/IGNORE 4.0

2015-05-06 Thread Andres Freund
On 2015-05-05 15:00:56 -0700, Peter Geoghegan wrote:
 Locking the row is not nothing, though. If you want to lock the row,
 use an UPSERT with a tautologically false WHERE clause (like WHERE
 false).

That's not the same. For one it breaks RETURNING which is a death
knell, for another it's not exactly obvious.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers