Re: [HACKERS] GIN improvements part 1: additional information

2013-12-18 Thread Oleg Bartunov
Guys,

before digging deep into the art of the comp/decomp world I'd like to know
if you are familiar with the results of the
http://wwwconference.org/www2008/papers/pdf/p387-zhangA.pdf paper and
some newer research? Do we agree on what we really want? Basically,
there are three main features: size, compression speed, decompression
speed - we should take two :)

Should we design some sort of plugin interface that supports independent
storage on disk, so users can apply different techniques depending on
their data?

What I want to say is that we certainly can play with this very
challenging task, but we have limited time before 9.4 and we should
think in a positive direction.

Oleg

On Wed, Dec 18, 2013 at 6:50 PM, Heikki Linnakangas wrote:
> On 12/18/2013 01:45 PM, Alexander Korotkov wrote:
>>
>> On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas
>> wrote:
>>
>>> On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
>>>> 2) Storage would be easily extendable to hold additional information as
>>>> well.
>>>> Better compression shouldn't block more serious improvements.
>>>
>>> I'm not sure I agree with that. For all the cases where you don't care
>>> about additional information - which covers all existing users, for
>>> example - reducing disk size is pretty important. How are you planning
>>> to store the additional information, and how does using another
>>> encoding get in the way of that?
>>
>>
>> I planned to store additional information datums between the
>> varbyte-encoded TIDs. I expected it would be hard to do with PFOR.
>> However, I don't see significant problems in your implementation of
>> Simple-9 encoding. I'm going to dig deeper into your version of the
>> patch.
>
>
> Ok, thanks.
>
> I had another idea about the page format this morning. Instead of having the
> item-indexes at the end of the page, it would be more flexible to store a
> bunch of self-contained posting list "segments" on the page. So I propose
> that we get rid of the item-indexes, and instead store a bunch of
> independent posting lists on the page:
>
> typedef struct
> {
> ItemPointerData first;   /* first item in this segment (unpacked) */
> uint16  nwords;  /* number of words that follow */
> uint64  words[1];/* var length */
> } PostingListSegment;
>
> Each segment can be encoded and decoded independently. When searching for a
> particular item (like on insertion), you skip over segments where 'first' >
> the item you're searching for.
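>
> To illustrate, the lookup might go something like this (just a sketch;
> the helper functions for walking the segments are hypothetical):
>
>     PostingListSegment *seg = first_segment(page);
>
>     while (has_next_segment(page, seg) &&
>            ItemPointerCompare(&next_segment(seg)->first, &item) <= 0)
>         seg = next_segment(seg);
>
>     /* 'item', if present, can only be in 'seg'; decode just that one */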
>
> This format offers a lot more flexibility compared to the separate item
> indexes. First, we don't need to have another fixed sized area on the page,
> which simplifies the page format. Second, we can more easily re-encode only
> one segment on the page, on insertion or vacuum. The latter is particularly
> important with the Simple-9 encoding, which operates one word at a time
> rather than one item at a time; removing or inserting an item in the middle
> can require a complete re-encoding of everything that follows. Third, when a
> page is being inserted into and contains only uncompressed items, you don't
> waste any space for unused item indexes.
>
> While we're at it, I think we should use the above struct in the inline
> posting lists stored directly in entry tuples. That wastes a few bytes
> compared to the current approach in the patch (more alignment, and 'words'
> is redundant with the number of items stored on the tuple header), but it
> simplifies the functions handling these lists.
>
>
> - Heikki
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery

2013-12-18 Thread Fujii Masao
On Mon, Dec 16, 2013 at 9:22 PM, MauMau  wrote:
> Hi, Fujii san,
>
>> On Wed, Aug 7, 2013 at 7:03 AM, Fujii Masao  wrote:
>>>
>>> On second thought, probably we cannot remove the restored WAL files early
>>> because they might be required for fast promotion, which is a new feature
>>> in 9.3. In fast promotion, an end-of-recovery checkpoint is not executed.
>>> After the end of recovery, a normal online checkpoint starts. If the
>>> server crashes before such an online checkpoint completes, the server
>>> needs to replay again all the WAL files which it replayed before the
>>> promotion. Since this is crash recovery, all those WAL files need to
>>> exist in the pg_xlog directory. So if we remove the restored WAL files
>>> from pg_xlog early, such a crash recovery might fail.
>>>
>>> So even if cascade replication is disabled, if standby_mode = on, i.e.,
>>> fast promotion can be performed, we cannot remove the restored WAL files
>>> early.
>
>
> Following Fujii-san's advice, I've made the attached patch.

Thanks for the patch!

! if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

Even when standby_mode is not enabled, we can use cascade replication and
it needs the accumulated WAL files. So I think that AllowCascadeReplication()
should be added into this condition.
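
That is, something like this (an untested sketch):

    if (source == XLOG_FROM_ARCHIVE &&
        (StandbyModeRequested || AllowCascadeReplication()))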

!   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
!   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
!   if (restoredFromArchive)

Don't we need to check !StandbyModeRequested and !AllowCascadeReplication()
here?
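
In other words (again, an untested sketch):

    if (restoredFromArchive &&
        !StandbyModeRequested && !AllowCascadeReplication())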

!   /*
!    * If the latest segment is not archival, but there's still a
!    * RECOVERYXLOG laying about, get rid of it.
!    */
!   unlink(recoveryPath);   /* ignore any error */

A similar line exists lower down in exitArchiveRecovery(), so ISTM that
you can refactor that.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Tatsuo Ishii
> I found that the psql tab-completion for ALTER SYSTEM SET has not been
> implemented yet.
> Attached patch does that. Barring any objections, I will commit this patch.

Good catch!

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-12-18 Thread Haribabu kommi
On 19 December 2013 05:31 Bruce Momjian wrote:
> On Wed, Dec 11, 2013 at 10:22:32AM +, Haribabu kommi wrote:
> > The make_absolute_path() function moving to port is changed in a
> > similar way as Bruce Momjian's approach. psprintf is used to store the
> > error string which occurred in the function. But psprintf is not used
> > for storing the absolute path, because it gives problems in freeing
> > the allocated memory in SelectConfigFiles: the same memory is
> > allocated in a different code branch from guc_malloc.
> >
> > After adding the make_absolute_path() function with the psprintf stuff
> > in path.c, it gives a linking problem when compiling ecpg. I am not
> > able to find the problem, so I added another file abspath.c in port
> > which contains these two functions.
> 
> What errors are you seeing?

If I move the make_absolute_path() function from abspath.c to path.c,
I get the following linking errors while compiling ecpg:

../../../../src/port/libpgport.a(path.o): In function `make_absolute_path':
/home/hari/postgres/src/port/path.c:795: undefined reference to `psprintf'
/home/hari/postgres/src/port/path.c:809: undefined reference to `psprintf'
/home/hari/postgres/src/port/path.c:818: undefined reference to `psprintf'
/home/hari/postgres/src/port/path.c:830: undefined reference to `psprintf'
collect2: ld returned 1 exit status
make[4]: *** [ecpg] Error 1
make[3]: *** [all-preproc-recurse] Error 2
make[2]: *** [all-ecpg-recurse] Error 2
make[1]: *** [all-interfaces-recurse] Error 2
make: *** [all-src-recurse] Error 2


Regards,
Hari babu.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Fujii Masao
On Thu, Dec 19, 2013 at 12:08 PM, Amit Kapila  wrote:
> On Wed, Dec 18, 2013 at 8:25 PM, Tatsuo Ishii  wrote:
>>>> Is there any reason for the function to return int, as it always returns
>>>> 0 or 1? Maybe returning bool is better?
>>>
>>
>> I have committed your patches. Thanks.
>
> Thank you very much.

I found that the psql tab-completion for ALTER SYSTEM SET has not been
implemented yet.
Attached patch does that. Barring any objections, I will commit this patch.

Regards,

-- 
Fujii Masao
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***
*** 541,546  static const SchemaQuery Query_for_list_of_matviews = {
--- 541,552 
  "SELECT pg_catalog.quote_ident(nspname) FROM pg_catalog.pg_namespace "\
  " WHERE substring(pg_catalog.quote_ident(nspname),1,%d)='%s'"
  
+ #define Query_for_list_of_alter_system_set_vars \
+ "SELECT name FROM "\
+ " (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings "\
+ "  WHERE context != 'internal') ss "\
+ " WHERE substring(name,1,%d)='%s'"
+ 
  #define Query_for_list_of_set_vars \
  "SELECT name FROM "\
  " (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings "\
***
*** 930,936  psql_completion(char *text, int start, int end)
  		{"AGGREGATE", "COLLATION", "CONVERSION", "DATABASE", "DEFAULT PRIVILEGES", "DOMAIN",
  			"EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", "FUNCTION",
  			"GROUP", "INDEX", "LANGUAGE", "LARGE OBJECT", "MATERIALIZED VIEW", "OPERATOR",
! 			"ROLE", "RULE", "SCHEMA", "SERVER", "SEQUENCE", "TABLE",
  			"TABLESPACE", "TEXT SEARCH", "TRIGGER", "TYPE",
  		"USER", "USER MAPPING FOR", "VIEW", NULL};
  
--- 936,942 
  		{"AGGREGATE", "COLLATION", "CONVERSION", "DATABASE", "DEFAULT PRIVILEGES", "DOMAIN",
  			"EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", "FUNCTION",
  			"GROUP", "INDEX", "LANGUAGE", "LARGE OBJECT", "MATERIALIZED VIEW", "OPERATOR",
! 			 "ROLE", "RULE", "SCHEMA", "SERVER", "SEQUENCE", "SYSTEM SET", "TABLE",
  			"TABLESPACE", "TEXT SEARCH", "TRIGGER", "TYPE",
  		"USER", "USER MAPPING FOR", "VIEW", NULL};
  
***
*** 1263,1268  psql_completion(char *text, int start, int end)
--- 1269,1279 
  
  		COMPLETE_WITH_LIST(list_ALTER_SERVER);
  	}
+ 	/* ALTER SYSTEM SET  */
+ 	else if (pg_strcasecmp(prev3_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "SYSTEM") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "SET") == 0)
+ 		COMPLETE_WITH_QUERY(Query_for_list_of_alter_system_set_vars);
  	/* ALTER VIEW  */
  	else if (pg_strcasecmp(prev3_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev2_wd, "VIEW") == 0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-18 Thread Peter Geoghegan
On Thu, Dec 12, 2013 at 4:18 PM, Peter Geoghegan  wrote:
> Both of these revisions have identical ad-hoc test cases included as
> new files - see testcase.sh and upsert.sql. My patch doesn't have any
> unique constraint violations, and has pretty consistent performance,
> while yours has many unique constraint violations. I'd like to hear
> your thoughts on the testcase, and the design implications.

I withdraw the test-case. Both approaches behave similarly if you look
for long enough, and that's okay.

I also think that changes to HeapTupleSatisfiesUpdate() are made
unnecessary by recent bug fixes to that function. The test case
previously described [1] that broke that is no longer reproducible, at
least so far.

Do you think that we need to throw a serialization failure within
ExecLockHeapTupleForUpdateSpec() iff heap_lock_tuple() returns
HeapTupleInvisible and IsolationUsesXactSnapshot()? Also, I'm having a
hard time figuring out a good choke point to catch MVCC snapshots
availing of our special visibility rule where they should not due to
IsolationUsesXactSnapshot(). It seems sufficient to continue to assume
that Postgres won't attempt to lock any tid invisible under
conventional MVCC rules in the first place, except within
ExecLockHeapTupleForUpdateSpec(), but what do we actually do within
ExecLockHeapTupleForUpdateSpec()? I'm thinking of a new tqual.c
routine concerning the tuple being in the future that we re-check when
IsolationUsesXactSnapshot(). That's not very modular, though. Maybe
we'd go through heapam.c.
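
Concretely, the kind of check I have in mind (a rough, hypothetical
sketch, not actual patch code) would be, after heap_lock_tuple()
returns 'test':

    if (test == HeapTupleInvisible && IsolationUsesXactSnapshot())
        ereport(ERROR,
                (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                 errmsg("could not serialize access due to concurrent update")));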

I think it doesn't matter that what now constitute MVCC snapshots
(with the new, special "reach into the future" rule) have that new
rule, for the purposes of higher isolation levels, because we'll have
a serialization failure within ExecLockHeapTupleForUpdateSpec() before
this is allowed to become a problem. In order for the new rule to be
relevant, we'd have to be the Xact to lock in the first place, and as
an xact in non-read-committed mode, we'd be sure to call the new
tqual.c "in the future" routine or whatever. Only upserters can lock a
row in the future, so it is the job of upserters to care about this
special case.

Incidentally, I tried to rebase recently, and saw some shift/reduce
conflicts due to 1b4f7f93b4693858cb983af3cd557f6097dab67b, "Allow
empty target list in SELECT". The fix for that is not immediately
obvious.

So I think we should proceed with the non-conclusive-check-first
approach (if only on pragmatic grounds), but even now I'm not really
sure. I think there might be unprincipled deadlocking should
ExecInsertIndexTuples() fail to be completely consistent about its
ordering of insertion - the use of dirty snapshots (including as part
of conventional !UNIQUE_CHECK_PARTIAL unique index enforcement) plays
a part in this risk. Roughly speaking, heap_delete() doesn't render
the tuple immediately invisible to some-other-xact's dirty snapshot
[2], and I think that could have unpleasant interactions, even if it
is also beneficial in some ways. Our old, dead tuples from previous
attempts stick around, and function as "value locks" to everyone else,
since for example _bt_check_unique() cares about visibility having
merely been affected, which is grounds for blocking. More
counter-intuitive still, we go ahead with "value locking" (i.e. btree
UNIQUE_CHECK_PARTIAL tuple insertion originating from the main
speculative ExecInsertIndexTuples() call) even though we already know
that we will delete the corresponding heap row (which, as noted, still
satisfies HeapTupleSatisfiesDirty() and so is value-lock-like).

Empirically, retrying because ExecInsertIndexTuples() returns some
recheckIndexes occurs infrequently, so maybe that makes all of this
okay. Or maybe it happens infrequently *because* we don't give up on
insertion when it looks like the current iteration is futile. Maybe
just inserting into every unique index, and then blocking on an xid
within ExecCheckIndexConstraints(), works out fairly and performs
reasonably in all common cases. It's pretty damn subtle, though, and I
worry about the worst case performance, and basic correctness issues
for these reasons. The fact that deferred unique indexes also use
UNIQUE_CHECK_PARTIAL is cold comfort -- that only ever has to throw
an error on conflict, and only once. We haven't "earned the right" to
lock *all* values in all unique indexes, but kind of do so anyway in
the event of an "insertion conflicted after pre-check".

Another concern that bears reiterating is: I think making the
lock-for-update case work for exclusion constraints is a lot of
additional complexity for a very small return.

Do you think it's worth optimizing ExecInsertIndexTuples() to avoid
futile non-unique/exclusion constrained index tuple insertion?

[1] 
http://www.postgresql.org/message-id/CAM3SWZS2--GOvUmYA2ks_aNyfesb0_H6T95_k8+wyx7Pi=c...@mail.gmail.com

[2] 
https://github.com/postgres/postgres/blob/94b899b829657332bda856ac3f06153d09077bd1/src/backend/utils/

Re: [HACKERS] Logging WAL when updating hintbit

2013-12-18 Thread Amit Kapila
On Wed, Dec 18, 2013 at 11:30 AM, Michael Paquier wrote:
> On Wed, Dec 18, 2013 at 11:22 AM, Amit Kapila  wrote:
>> On Fri, Dec 13, 2013 at 7:57 PM, Heikki Linnakangas
>>  wrote:
>>> Thanks, committed with some minor changes:
>>
>> Should this patch in CF app be moved to Committed Patches or is there
>> something left for this patch?
> Nothing has been forgotten for this patch. It can be marked as committed.

Thanks for confirmation, I have marked it as Committed.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Craig Ringer
On 12/18/2013 11:14 PM, Robert Haas wrote:

> To be clear, I wasn't advocating for a declarative approach; I think
> predicates are fine.  There are usability issues to worry about either
> way, and my concern is that we address those.  A declarative approach
> would certainly be valuable in that, for those people for whom it is
> sufficiently flexible, it's probably quite a lot easier than writing
> predicates.  But I fear that some people will want a lot more
> generality than a label-based system can accommodate.

Having spent some time reading the HL7 spec, I now tend to agree that
labels alone are not going to be sufficient. There are security
decisions made based on:

- Security labels

- User/group/role memberships, both for accessor and target entity owner
(down to the row level)

- Black/white list ACLs, with inheritance and deny rules.

- One-off or narrow authorizations. "I permit my GP to examine my mental
health record details, but only over the last year, and the
authorization stands only for today."

- Authorization assertions. "I declare that the patient told me it is OK
to access these, let me see them."

- "Break glass" access. "This is an emergency. Give me access and I'll
deal with the consequences later."


So while security labels are an important part of the picture I'm forced
to agree that they are not sufficient, and that a generalized row
security mechanism actually is necessary. We don't have the time or
resources to build all these sorts of things individually, and if we did
it'd still be too inflexible for somebody.

In the end, sometimes I guess there's no replacement for "WHERE
call_some_procedure()".


In particular, labels are totally orthogonal to entity identity in the
data model, and being able to make row access decisions based on
information already in the data model is important. FK relationships to
owning entities and relationships between entities must be usable for
security access policy decisions.

So: I'm withdrawing the proposal to go straight to label security; I
concede that it's insufficiently expressive to meet all possible needs,
and we don't have the time to build anything declarative and
user-friendly that would be.

I do want to make sure that whatever we include this time around does
not create problems for a future label security approach, but that's
kind of required by the desire to add SEPostgreSQL RLS down the track
anyway.

Given that: What are your specific usability concerns about the current
approach proposed in KaiGai's patch?

My main worry was that it requires the user to build everything
manually, and is potentially error prone as a result. To address that we
can build convenience features (label security, ACL types and operators,
etc) on top of the same infrastructure later - it doesn't have to go in
right away.

So putting that aside, the concerns I have are:

- The current RLS design is restricted to one predicate per table with
no contextual control over when that predicate applies. That means we
can't implement anything like "policy groups" or overlapping sets of
predicates, anything like that has to be built by the user. We don't
need multiple predicates to start with but I want to make sure there's
room for them later, particularly so that different apps / extensions
that use row-security don't have to fight with each other.

- You can't declare a predicate then apply it to a bunch of tables with
consistent security columns. Updating/changing predicates will be a
pain. We might be able to get around that by recommending that people
use an inlineable SQL function to declare their predicates, but
"inlineable" can be hard to pin down sometimes. If not, we need
something akin to GRANT ... ALL TABLES IN SCHEMA ... for row security,
and ALTER DEFAULT PRIVILEGES ... too.

- It's too hard to see what tables have row-security and what impact it
has. Needs psql improvements.

- There's no way to control whether or not a client can see the
row-security policy definition.


The other one I want to deal with is secure session variables, but
that's mostly a performance optimisation we can add later.

What's your list?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clang's -Wmissing-variable-declarations shows some shoddy programming

2013-12-18 Thread Bruce Momjian
On Sat, Dec 14, 2013 at 04:52:28PM +0100, Andres Freund wrote:
> Hi,
> 
> Compiling postgres with said option in CFLAGS really gives an astounding
> number of warnings. Except for some bison/flex-generated ones, none of
> them looks acceptable to me.
> Most are just file-local variables with a missing static, and easy to
> fix. Several others are actually shared variables, where people simply
> haven't bothered to add the variable to a header - some with comments
> declaring that fact, others adding longer comments about it.
> 
> I've attached the output of such a compilation run for those without
> clang.

Now that pg_upgrade has stabilized, I think it is time to centralize all
the pg_upgrade_support control variables in a single C include file that
can be used by the backend and by pg_upgrade_support.  This will
eliminate the compiler warnings too.

The attached patch accomplishes this.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/contrib/pg_upgrade_support/pg_upgrade_support.c b/contrib/pg_upgrade_support/pg_upgrade_support.c
new file mode 100644
index 99e64c4..6e30deb
*** a/contrib/pg_upgrade_support/pg_upgrade_support.c
--- b/contrib/pg_upgrade_support/pg_upgrade_support.c
***
*** 11,16 
--- 11,17 
  
  #include "postgres.h"
  
+ #include "catalog/binary_upgrade.h"
  #include "catalog/namespace.h"
  #include "catalog/pg_type.h"
  #include "commands/extension.h"
***
*** 24,40 
  PG_MODULE_MAGIC;
  #endif
  
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_type_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_array_pg_type_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_type_oid;
- 
- extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
- 
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
- 
  Datum		set_next_pg_type_oid(PG_FUNCTION_ARGS);
  Datum		set_next_array_pg_type_oid(PG_FUNCTION_ARGS);
  Datum		set_next_toast_pg_type_oid(PG_FUNCTION_ARGS);
--- 25,30 
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
new file mode 100644
index 6f2e142..032a20e
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***
*** 34,39 
--- 34,40 
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
new file mode 100644
index aa31429..7ad9720
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***
*** 30,35 
--- 30,36 
  #include "access/visibilitymap.h"
  #include "access/xact.h"
  #include "bootstrap/bootstrap.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
new file mode 100644
index 35899b4..23d2a41
*** a/src/backend/catalog/pg_enum.c
--- b/src/backend/catalog/pg_enum.c
***
*** 17,22 
--- 17,23 
  #include "access/heapam.h"
  #include "access/htup_details.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/pg_enum.h"
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
new file mode 100644
index 23ac3dd..634915b
*** a/src/backend/catalog/pg_type.c
--- b/src/backend/catalog/pg_type.c
***
*** 17,22 
--- 17,23 
  #include "access/heapam.h"
  #include "access/htup_details.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/dependency.h"
  #include "catalog/indexing.h"
  #include "catalog/objectaccess.h"
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
new file mode 100644
index 385d64d..f58e434
*** a/src/backend/catalog/toasting.c
--- b/src/backend/catalog/toasting.c
***
*** 16,21 
--- 16,22 
  
  #include "access/tuptoaster.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
  #include "catalog/index.h"
***
*** 31,38 
  #include "utils/syscache.h"
  
  /* Potentially set by contrib/pg_upgrade_support functions */
- extern Oid	binary_upgrade_next_toast_pg_class_oid;
- 
  Oid			binary_upgrade_next_toast_pg_type_oid = InvalidOid;
  
  static bool create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
--- 32,37 
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c

Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Amit Kapila
On Wed, Dec 18, 2013 at 8:25 PM, Tatsuo Ishii  wrote:
>>> Is there any reason for the function to return int, as it always returns
>>> 0 or 1? Maybe returning bool is better?
>>
>
> I have committed your patches. Thanks.

Thank you very much.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] preserving forensic information when we freeze

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 5:54 PM, Andres Freund  wrote:
>>   if (frz->frzflags & XLH_FREEZE_XVAC)
>> + {
>>   HeapTupleHeaderSetXvac(tuple, FrozenTransactionId);
>> + /* If we somehow haven't hinted the tuple previously, do it now. */
>> + HeapTupleHeaderSetXminCommitted(tuple);
>> + }
>
> What's the reasoning behind adding HeapTupleHeaderSetXminCommitted()
> here?

I'm just copying the existing logic.  See the final stanza of
heap_prepare_freeze_tuple.

> [ snip ]

Will fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Autoconf 2.69 update

2013-12-18 Thread Peter Eisentraut
On Thu, 2013-11-14 at 22:00 -0500, Peter Eisentraut wrote:
> I'm proposing that we upgrade our Autoconf to 2.69

This has been done.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 6:07 PM, Cédric Villemain  wrote:
> In the case of effective_io_concurrency, however, this may not work as well as
> expected. IIRC it is used to prefetch heap blocks; hopefully the requested
> blocks are contiguous, but if there are too many holes it is enough to fill the
> ring very quickly (with the current max value of effective_io_concurrency).

Yeah, we'd need to figure out how big the ring would need to be for
reasonable values of effective_io_concurrency.

>> When the prefetch process starts up, it services requests from the
>> queue by reading the requested blocks (or block ranges).  When the
>> queue is empty, it sleeps.  If it receives no requests for some period
>> of time, it unregisters itself and exits.  This is sort of a souped-up
>> version of the hibernation facility we already have for some auxiliary
>> processes, in that we don't just make the process sleep for a longer
>> period of time but actually get rid of it altogether.
>
> I'm just a bit skeptical about the starting time: the backend will ReadBuffer
> very soon after requesting the prefetch...

Yeah, absolutely.  The first backend that needs a prefetch probably
isn't going to get it in time.  I think that's OK though.  Once the
background process is started, response times will be quicker...
although possibly still not quick enough.  We'd need to benchmark this
to determine how quickly the background process can actually service
requests.  Does anybody have a good self-contained test case that
showcases the benefits of prefetching?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Andreas Karlsson

On 12/18/2013 01:02 PM, Alexander Korotkov wrote:

My idea for a solution was to modify tuplesort to allow storing the
already sorted keys in either memtuples or the sort result file, but
setting a field so it does not sort the already sorted tuples
again. This would allow the rescan to work as it used to, but I am
unsure how clean or ugly this code would be. Was this something you
considered?


I'm not sure. I believe the best answer depends on the particular
parameters: how much memory we have for the sort, how expensive the
underlying node is and how it performs rescan, and how big the groups in
the partial sort are.


Yes, if one does not need a rescan your solution will use less memory 
and about the same amount of CPU (if the tuplesort does not spill to 
disk), while if we keep all the already sorted tuples in the tuplesort, 
rescans will be cheap but more memory will be used, with an increased 
chance of spilling to disk.


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Alvaro Herrera
Robert Haas wrote:

> I'm not inclined to wait for the next CommitFest to commit this,
> because it's a very simple patch and has already had a lot more field
> testing than most patches get before they're committed.  And it's just
> a contrib module, so the damage it can do if there is in fact a bug is
> pretty limited.  All that having been said, any review is appreciated.

Looks nice.

Some really minor things I noticed while skimming are that you have some
weird indentation using spaces in some ereport() calls; there's an extra
call to RelationOpenSmgr() in read mode; and the copyright year is 2012.
Please use &mdash; in sgml instead of plain dashes, and please avoid the
"!strcmp()" idiom.

I didn't actually try it out ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Cédric Villemain
On Wednesday, December 18, 2013 at 18:40:09, Robert Haas wrote:
> Now that we have dynamic background workers, I've been thinking that
> it might be possible to write a background worker to do asynchronous
> prefetch on systems where we don't have OS support.  We could store a
> small ring buffer in shared memory, say big enough for 1k entries.
> Each entry would consist of a relfilenode, a starting block number,
> and a block count.  We'd also store a flag indicating whether the
> prefetch worker has been registered with the postmaster, and a pointer
> to the PGPROC of any running worker. When a process wants to do a
> prefetch, it locks the buffer, adds its prefetch request to the queue
> (overwriting the oldest existing request if the queue is already
> full), and checks the flag.  If the flag is not set, it also registers
> the background worker.  Then, it releases the lock and sets the latch
> of any running worker (whose PGPROC it remembered before releasing the
> lock).

Good idea.
If the list is full, it is probably because the system is busy; I suppose 
that in such a case some alternative behavior could be interesting. 
Perhaps flush a part of the ring. The oldest entries are the least 
interesting; we're talking about prefetching, after all.
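
For reference, the queue entries Robert describes might look something
like this (a hypothetical sketch, all names invented):

typedef struct PrefetchRequest
{
	RelFileNode	rnode;		/* relation to read from */
	BlockNumber	blkno;		/* starting block number */
	int			nblocks;	/* length of the block range */
} PrefetchRequest;

typedef struct PrefetchQueue
{
	slock_t		mutex;			/* protects everything below */
	bool		worker_registered;	/* registered with postmaster? */
	PGPROC	   *worker_proc;	/* latch to set, if a worker is running */
	uint32		next;			/* next slot to fill or overwrite */
	PrefetchRequest requests[1024];	/* the ring itself */
} PrefetchQueue;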

In the case of effective_io_concurrency, however, this may not work as well as 
expected. IIRC it is used to prefetch heap blocks; hopefully the requested 
blocks are contiguous, but if there are too many holes it is enough to fill the 
ring very quickly (with the current max value of effective_io_concurrency).

> When the prefetch process starts up, it services requests from the
> queue by reading the requested blocks (or block ranges).  When the
> queue is empty, it sleeps.  If it receives no requests for some period
> of time, it unregisters itself and exits.  This is sort of a souped-up
> version of the hibernation facility we already have for some auxiliary
> processes, in that we don't just make the process sleep for a longer
> period of time but actually get rid of it altogether.

I'm just a bit skeptical about the starting time: the backend will ReadBuffer 
very soon after requesting the prefetch...

> All of this might be overkill; we could also do it with a permanent
> auxiliary process.  But it's sort of a shame to run an extra process
> for a facility that might never get used, or might be used only
> rarely.  And I'm wary of cluttering things up with a thicket of
> auxiliary processes each of which caters only to a very specific, very
> narrow situation.  Anyway, just thinking out loud here.

For Windows, see the PrefetchVirtualMemory() function.

I really wonder if such a bgworker can improve prefetching on !windows too, 
if a ring insert is faster than a posix_fadvise call.

If this is true, then effective_io_concurrency can be revisited. Maybe Greg 
Stark already did some benchmark of that...

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-12-18 Thread Bruce Momjian
On Wed, Dec 11, 2013 at 10:22:32AM +, Haribabu kommi wrote:
> The make_absolute_path() function moving to port is changed in a similar
> way as Bruce Momjian's approach. psprintf is used to store the error
> string which occurred in the function. But psprintf is not used for
> storing the absolute path, because it gives problems in freeing the
> allocated memory in SelectConfigFiles: the same memory is allocated in a
> different code branch from guc_malloc.
>
> After adding the make_absolute_path() function with the psprintf stuff
> in path.c, it gives a linking problem when compiling ecpg. I am not able
> to find the problem, so I added another file abspath.c in port which
> contains these two functions.

What errors are you seeing?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Josh Berkus  wrote:
> On 12/18/2013 11:26 AM, Jim Nasby wrote:

>> Another possibility is to allow for two different types of
>> assertions, one based on SSI and one based on locking.
>
> The locking version would have to pretty much lock on a table
> basis (or even a whole-database basis) every time an assertion
> executed, no?

As far as I can see, if SSI is *not* used, there needs to be a
mutually exclusive lock taken from somewhere inside the COMMIT code
until the transaction is complete -- effectively serializing
assertion processing for transactions which could affect a given
assertion.  Locking on tables would, as previously suggested, be
very prone to deadlocks on the heavyweight locks.  Locking on the
assertions in a predictable order seems more promising, especially
if there could be some way to only do that if the transaction
really might have done something which could affect the truth of
the assertion.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 12/19/13, 12:01 AM, David Johnston wrote:

Marko Tiikkaja-4 wrote

On 2013-12-18 22:32, Andrew Dunstan wrote:

You're not really free to assume it - you'll need an exception handler
for the other-than-1 case, or your code might blow up.

This seems to be codifying a bad pattern, which should be using
array_lower() and array_upper() instead.


That's the entire point -- I *want* my code to blow up.  If someone
passes a multi-dimensional array to a function that assumes its input is
one-dimensional and its indexes start from 1, I want it to be obvious
that the caller did something wrong.  Now I either copy-paste lines and
lines of code to always test for the weird cases or my code breaks in
subtle ways.

This is no different from an Assert() somewhere -- if the caller breaks
the documented interface, it's his problem, not mine.  And I don't want
to waste my time coding around the fact that this simple thing is so
hard to do in PG.


1) Why can't we just make the second argument of the current function
optional, defaulting to 1?


That still does the wrong thing for the empty array, multidimensional 
arrays and arrays that don't start from index 1.



2) How about providing a function that returns the "1-dim/lower=1" input
array, or raises an exception if the input array does not conform?


CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray AS
$$
begin
  if array_length(arr, 1) is null then return arr; end if;  -- empty
  if array_ndims(arr) > 1 then
    raise exception 'array must be one-dimensional';
  end if;
  if array_lower(arr, 1) <> 1 then
    raise exception 'array lower bound must be 1';
  end if;
  return arr;
end;
$$ language plpgsql;


With this, I would still have to do 
COALESCE(array_length(array_normal($1), 1), 0).  That's pretty stupid 
for the most common use case of arrays, don't you think?



I can also see wanting 1-dimensional enforced without having to require the
lower-bound to be 1 so maybe a separate function for that.


I really don't see the point.  How often have you ever created an 
array that doesn't have a lower bound of 1 on purpose?  What good did 
it serve you?



Usage:

SELECT array_length(array_normal(input_array))

I could see this being especially useful for a domain and/or column
constraint definition and also allowing for a textbook case of separation of
concerns.


What would array_length() in this case be?  With what you suggested 
above, you would still get NULL for an empty array.



I am torn, but mostly opposed, to making an array_length(anyarray) function
with these limitations enforced - especially if other similar functions are
not created at the same time.  I fully agree that array_length(anyarray)
should be a valid call without requiring the user to specify ", 1" by rote.


I'm specifically asking for something that is different from 
array_length(anyarray, int), because I personally think it's too full of 
caveats.



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Bruce Momjian
On Wed, Dec 18, 2013 at 03:41:24PM -0500, Robert Haas wrote:
> On the other hand, there's not much value in adding monitoring
> features that are going to materially harm performance, and a lot of
> the monitoring features that get proposed die on the vine for exactly
> that reason.  I think the root of the problem is that our stats
> infrastructure is a streaming pile of crap.  A number of people have

"streaming"?  I can't imagine what that looks like.  ;-)

I think the larger point is that network is only one of many things we
need to address, so this needs a holistic approach that looks at all
needs and creates infrastructure to address it.
 
-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assertion failure in base backup code path

2013-12-18 Thread Andres Freund
Hi Magnus,

It looks to me like the path to do_pg_start_backup() outside of a
transaction context comes from your initial commit of the base backup
facility.
The problem is that you're not allowed to do anything leading to a
syscache/catcache lookup in those contexts.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] preserving forensic information when we freeze

2013-12-18 Thread Andres Freund
Hi,

On 2013-12-17 16:00:14 -0500, Robert Haas wrote:
> @@ -5874,19 +5858,27 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid,
>  void
>  heap_execute_freeze_tuple(HeapTupleHeader tuple, xl_heap_freeze_tuple *frz)
>  {
> + tuple->t_infomask = frz->t_infomask;
> + tuple->t_infomask2 = frz->t_infomask2;
> +
>   if (frz->frzflags & XLH_FREEZE_XMIN)
> - HeapTupleHeaderSetXmin(tuple, FrozenTransactionId);
> + HeapTupleHeaderSetXminFrozen(tuple);
>  
>   HeapTupleHeaderSetXmax(tuple, frz->xmax);
>  
>   if (frz->frzflags & XLH_FREEZE_XVAC)
> + {
>   HeapTupleHeaderSetXvac(tuple, FrozenTransactionId);
> + /* If we somehow haven't hinted the tuple previously, do it now. */
> + HeapTupleHeaderSetXminCommitted(tuple);
> + }

What's the reasoning behind adding HeapTupleHeaderSetXminCommitted()
here?

> @@ -823,14 +823,14 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
>  * NB: Like with per-tuple hint bits, we can't set the
>  * PD_ALL_VISIBLE flag if the inserter committed
>  * asynchronously. See SetHintBits for more info. Check
> -  * that the HEAP_XMIN_COMMITTED hint bit is set because of
> -  * that.
> +  * that the tuple is hinted xmin-committed hint bit because
> +  * of that.
>  */

Looks like there's a "hint bit" too much here.

> @@ -65,6 +65,9 @@ manage to be a conflict it would merely mean that one bit-update would
>  be lost and need to be done again later.  These four bits are only hints
>  (they cache the results of transaction status lookups in pg_clog), so no
>  great harm is done if they get reset to zero by conflicting updates.
> +Note, however, that a tuple is frozen by setting both HEAP_XMIN_INVALID
> +and HEAP_XMIN_COMMITTED; this is a critical update and accordingly requires
> +an exclusive buffer lock.

I think it'd be appropriate to mention that this needs to be preserved
via WAL logging; as worded it sounds like it's sufficient to set a hint
bit without persistence guarantees.

(not sure if I already wrote this, but whatever)

Looking good.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread David Johnston
Marko Tiikkaja-4 wrote
> On 2013-12-18 22:32, Andrew Dunstan wrote:
>> You're not really free to assume it - you'll need an exception handler
>> for the other-than-1 case, or your code might blow up.
>>
>> This seems to be codifying a bad pattern, which should be using
>> array_lower() and array_upper() instead.
> 
> That's the entire point -- I *want* my code to blow up.  If someone 
> passes a multi-dimensional array to a function that assumes its input is 
> one-dimensional and its indexes start from 1, I want it to be obvious 
> that the caller did something wrong.  Now I either copy-paste lines and 
> lines of code to always test for the weird cases or my code breaks in 
> subtle ways.
> 
> This is no different from an Assert() somewhere -- if the caller breaks 
> the documented interface, it's his problem, not mine.  And I don't want 
> to waste my time coding around the fact that this simple thing is so 
> hard to do in PG.

1) Why can't we just make the second argument of the current function
optional, defaulting to 1?
2) How about providing a function that returns the "1-dim/lower=1" input
array, or raises an exception if the input array does not conform?


CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray AS
$$
begin
  if array_length(arr, 1) is null then return arr; end if;  -- empty
  if array_ndims(arr) > 1 then
    raise exception 'array must be one-dimensional';
  end if;
  if array_lower(arr, 1) <> 1 then
    raise exception 'array lower bound must be 1';
  end if;
  return arr;
end;
$$ language plpgsql;

I can also see wanting 1-dimensional enforced without having to require the
lower-bound to be 1 so maybe a separate function for that.

Usage:

SELECT array_length(array_normal(input_array))

I could see this being especially useful for a domain and/or column
constraint definition and also allowing for a textbook case of separation of
concerns.

I am torn, but mostly opposed, to making an array_length(anyarray) function
with these limitations enforced - especially if other similar functions are
not created at the same time.  I fully agree that array_length(anyarray)
should be a valid call without requiring the user to specify ", 1" by rote.

Tangential Question:

Is there any way to define a non-1-based array without using array-literal
syntax but by using ARRAY[1,2,3] syntax?


David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/array-length-anyarray-tp5783950p5783972.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] shared memory message queues

2013-12-18 Thread Andres Freund
On 2013-12-18 15:23:23 -0500, Robert Haas wrote:
> It sounds like most people who have looked at this stuff are broadly
> happy with it, so I'd like to push on toward commit soon, but it'd be
> helpful, Andres, if you could review the comment additions to
> shm-mq-v2.patch and see whether those address your concerns.

FWIW, I haven't looked carefully enough at the patch to consider myself
having reviewed it. I am not saying that it's not ready for committer,
just that I don't know whether it is.

I will have a look at the comment improvements, and if you don't beat me
to it, give the patch a read-over.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Applying your patch plus adding -fno-omit-frame-pointer, I got 54526.90
notpm.

The profile (part) below:
# Samples: 610K of event 'cycles'
# Event count (approx.): 6686532056450
#
# Overhead   Command  Shared Object  Symbol
# ........  ........  .............  ................................
#
     4.08%  postgres  postgres       [.] hash_search_with_hash_value
            |
            --- hash_search_with_hash_value
               |
                --2.87%-- BufTableLookup
                          |
                           --2.86%-- ReadBuffer_common
                                     ReadBufferExtended
                                     |
                                     |--0.86%-- index_fetch_heap
                                     |          |
                                     |           --0.62%-- index_getnext
                                     |                     IndexNext
                                     |                     ExecScan
                                     |                     ExecProcNode
                                     |
                                     |--0.71%-- BitmapHeapNext
                                     |          ExecScan
                                     |          ExecProcNode
                                     |
                                      --0.57%-- _bt_relandgetbuf
                                                |
                                                 --0.57%-- _bt_search

     3.75%  postgres  postgres       [.] heap_hot_search_buffer
            |
            --- heap_hot_search_buffer
               |
               |--1.92%-- BitmapHeapNext
               |          ExecScan
               |          ExecProcNode
               |          |
               |           --1.12%-- standard_ExecutorRun
               |                     _SPI_execute_plan
               |                     SPI_execute_plan
               |                     |
               |                      --0.70%-- payment
               |                                ExecMakeFunctionResult
               |                                ExecProject
               |                                ExecResult
               |                                ExecProcNode
               |                                standard_ExecutorRun
               |                                PortalRunSelect
               |                                PortalRun
               |                                PostgresMain
               |                                ServerLoop
               |                                PostmasterMain
               |                                main
               |                                __libc_start_main
               |
                --1.74%-- index_fetch_heap
                          |
                           --1.46%-- index_getnext
                                     |
                                      --1.45%-- IndexNext
                                                ExecScan
                                                ExecProcNode
                                                |
                                                 --0.96%-- standard_ExecutorRun
                                                           _SPI_execute_plan
                                                           SPI_execute_plan
                                                           |
                                                            --0.50%-- new_order
                                                                      ExecMakeFunctionResult
                                                                      ExecProject
                                                                      ExecResult
                                                                      ExecProcNode
                                                                      standard_ExecutorRun
                                                                      PortalRunSelect
                                                                      PortalRunFetch
                                                                      PerformPortalFetch
                                                                      standard_ProcessUtility
                                                                      PortalRunUtility
                                                                      FillPortalStore
                                                                      PortalRun
                                                                      PostgresMain
                                                                      ServerLoop
                                                                      PostmasterMain
                                                                      main
                                                                      __libc_start_main

     3.15%  postgres  postgres       [.] LWLockAcquire
            |
            --- LWLockAcquire
               |
                --1.65%-- ReadBuffer_common
                          ReadBufferExtended

     2.74%  postgres  postgres       [.] PinBuffer
            |
            --- PinBuffer
               |
                --2.72%-- ReadBuffer_common
                          ReadBufferExtended
                          |
                          |--0.71%-- index_fetch_heap
                          |          |
                          |           --0.51%-- index_getnext
                          |                     IndexNext
                          |                     ExecScan
                          |                     ExecProcNode
                          |
                          |--0.67%-- BitmapHeapNext
                          |          ExecScan
                          |          ExecProcNode
                          |
                           --0.60%

Re: [HACKERS] Extension Templates S03E11

2013-12-18 Thread Greg Stark
Yeah I think this whole discussion is happening at the wrong level. The
problem you're having, despite appearances, is not that people disagree
about the best way to accomplish your goals.

The problem is that not everyone is convinced your goals are a good idea.
Either they just don't understand the goals or they do understand them but
don't agree that they're a good idea.

Personally I'm in the former category and would welcome a detailed
explanation of the goals of the feature and what use cases those goals
enable.

I think Tom is in the latter category and needs a very good argument for why
those goals are important enough to outweigh the downsides.

I don't think loading shared libraries from RAM or a temp download
directory is a *complete* show stopper the way Tom says but it would
require a pretty compelling use case to make it worth the difficulties it
would cause.

-- 
greg


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 2013-12-18 22:32, Andrew Dunstan wrote:

You're not really free to assume it - you'll need an exception handler
for the other-than-1 case, or your code might blow up.

This seems to be codifying a bad pattern, which should be using
array_lower() and array_upper() instead.


That's the entire point -- I *want* my code to blow up.  If someone 
passes a multi-dimensional array to a function that assumes its input is 
one-dimensional and its indexes start from 1, I want it to be obvious 
that the caller did something wrong.  Now I either copy-paste lines and 
lines of code to always test for the weird cases or my code breaks in 
subtle ways.


This is no different from an Assert() somewhere -- if the caller breaks 
the documented interface, it's his problem, not mine.  And I don't want 
to waste my time coding around the fact that this simple thing is so 
hard to do in PG.




Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 04:19 PM, Marko Tiikkaja wrote:

On 2013-12-18 22:13, Andrew Dunstan wrote:

On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Attached is a patch to add support for array_length(anyarray), which
only works for one-dimensional arrays, returns 0 for empty arrays and
complains if the array's lower bound isn't 1.  In other words, does
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?


Because then you're free to assume that the first element is [1] and 
the last one is [array_length()].  Which is what 99% of the code using 
array_length(anyarray, int) does anyway.






You're not really free to assume it - you'll need an exception handler 
for the other-than-1 case, or your code might blow up.


This seems to be codifying a bad pattern, which should be using 
array_lower() and array_upper() instead.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 2013-12-18 22:19, I wrote:

On 2013-12-18 22:13, Andrew Dunstan wrote:

On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Attached is a patch to add support for array_length(anyarray), which
only works for one-dimensional arrays, returns 0 for empty arrays and
complains if the array's lower bound isn't 1.  In other words, does
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?


Because then you're free to assume that the first element is [1] and the
last one is [array_length()].  Which is what 99% of the code using
array_length(anyarray, int) does anyway.


Just to clarify, I mean that array_lower(.., 1) must be 1.  Whatever 
that's called.  "The lower bound of the only dimension of the array"?



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 2013-12-18 22:13, Andrew Dunstan wrote:

On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Attached is a patch to add support for array_length(anyarray), which
only works for one-dimensional arrays, returns 0 for empty arrays and
complains if the array's lower bound isn't 1.  In other words, does
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?


Because then you're free to assume that the first element is [1] and the 
last one is [array_length()].  Which is what 99% of the code using 
array_length(anyarray, int) does anyway.



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Josh Berkus
On 12/18/2013 11:26 AM, Jim Nasby wrote:
> The flip-side is that now you can get serialization failures, and I
> think there's a ton of software that has no clue how to deal with that.
> So now you don't get to use assertions at all unless you re-engineer
> your application (but see below).

Well, the software will need to deal with an Assertion failure, which I
doubt it's prepared to do right now either.

>> This is consistent with how we treat the interaction of constraints and
>> triggers; under some circumstances, we allow triggers to violate CHECK
>> and FK constraints.
> 
> We do? Under what circumstances?

AFTER triggers are allowed to ignore constraints sometimes.  For
example, if you have a tree table with an FK to other rows in the same
table, and you have an AFTER trigger on it, the AFTER trigger is allowed
to violate the self-FK.  That's the one I ran across, but I vaguely
remember other cases, and there's some documentation on this in the main
docs, in the discussion of the order in which triggers are applied.
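
For illustration, the shape of that case is roughly this (hypothetical
names, a sketch rather than a tested reproduction):

CREATE TABLE tree (
    id     int PRIMARY KEY,
    parent int REFERENCES tree (id)
);

CREATE FUNCTION tree_touch() RETURNS trigger AS $$
BEGIN
    -- FK checks are themselves implemented as AFTER ROW triggers, so
    -- whether this user trigger runs before or after the self-FK check
    -- depends on trigger ordering; that's where the window opens.
    RETURN NULL;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER tree_after
    AFTER INSERT OR UPDATE ON tree
    FOR EACH ROW EXECUTE PROCEDURE tree_touch();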

> Another possibility is to allow for two different types of assertions,
> one based on SSI and one based on locking.

The locking version would have to pretty much lock on a table basis (or
even a whole-database basis) every time an assertion is executed, no?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 04:09 PM, Heikki Linnakangas wrote:

On 12/18/2013 11:04 PM, Andrew Dunstan wrote:


On 12/18/2013 02:45 PM, Andres Freund wrote:

On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow directly specifying a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.


Umm, that's really a major limitation in utility.


The query can be "SELECT is_my_assertion_true()", and the function can 
do anything.





OK, but isn't that what Andres is suggesting we reject?

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Hi,

Attached is a patch to add support for array_length(anyarray), which 
only works for one-dimensional arrays, returns 0 for empty arrays and 
complains if the array's lower bound isn't 1.  In other words, does 
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 11:04 PM, Andrew Dunstan wrote:


On 12/18/2013 02:45 PM, Andres Freund wrote:

On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow directly specifying a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.


Umm, that's really a major limitation in utility.


The query can be "SELECT is_my_assertion_true()", and the function can 
do anything.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Dec 18, 2013 at 8:47 AM, Stephen Frost  wrote:
> > Agreed.  My other thought on this is that there's a lot to be said for
> > having everything you need available through one tool- kinda like how
> > Emacs users rarely go outside of it.. :)  And then there's also the
> > consideration that DBAs may not have access to the host system at all,
> > or not to the level needed to do similar analysis there.
> 
> I completely agree with this, and yet I still think we should reject
> the patch, because I think the overhead is going to be intolerable.

That's a fair point and I'm fine with rejecting it on the grounds that
the overhead is too much.  Hopefully that encourages the author to go
back and review Tom's comments and consider how the overhead could be
reduced or eliminated.  We absolutely need better monitoring and I have
had many of the same strace-involving conversations.  perf is nearly out
of the question as it's often not even installed and can be terribly
risky (I once had to get a prod box hard-reset after running perf on it
for mere moments because it never came back enough to let us do a clean
restart).

> I think the root of the problem is that our stats
> infrastructure is a steaming pile of crap.

+1

Thanks,

Stephen




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 02:45 PM, Andres Freund wrote:

On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow directly specifying a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.



Umm, that's really a major limitation in utility. We need to come up
with a better answer; this one would essentially hobble the facility.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 8:47 AM, Stephen Frost  wrote:
> Agreed.  My other thought on this is that there's a lot to be said for
> having everything you need available through one tool- kinda like how
> Emacs users rarely go outside of it.. :)  And then there's also the
> consideration that DBAs may not have access to the host system at all,
> or not to the level needed to do similar analysis there.

I completely agree with this, and yet I still think we should reject
the patch, because I think the overhead is going to be intolerable.

Now, the fact is, the monitoring facilities we have in PostgreSQL
today are not nearly good enough.  Other products do better.  I cringe
every time I tell someone to attach strace to a long-running autovac
process to find out what block number it's currently on, so we can
estimate when it will finish; or every time we need data about lwlock
contention and the only way to get it is to use perf, or recompile
with LWLOCK_STATS defined.  These are not fun conversations to have
with customers who are in production.

On the other hand, there's not much value in adding monitoring
features that are going to materially harm performance, and a lot of
the monitoring features that get proposed die on the vine for exactly
that reason.  I think the root of the problem is that our stats
infrastructure is a steaming pile of crap.  A number of people have
worked diligently to improve it and that work has not been fruitless,
but the current situation is still not very good.  In many ways, this
situation reminds me of the situation with EXPLAIN a few years ago.
People kept proposing useful extensions to EXPLAIN which we did not
adopt because they required creating (and perhaps reserving) far too
many keywords.  Now that we have the extensible options syntax,
EXPLAIN has options for COSTS, BUFFERS, TIMING, and FORMAT, all of
which have proven to be worth their weight in code, at least IMHO.

I am really not sure what a better infrastructure for stats collection
should look like, but I know that until we get one, a lot of
monitoring patches that would be really nice to have are going to get
shot down because of concerns about performance, and specifically
stats file bloat.  Fixing that problem figures to be unglamorous, but
I'll buy whoever does it a beer (or another beverage of your choice).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Jim Nasby  wrote:
> On 12/18/13, 1:42 PM, Kevin Grittner wrote:
>> Jim Nasby  wrote:
>>
>>> This is another case where it would be very useful to restrict
>>> what relations a transaction (or in this case, a subtransaction)
>>> can access. If we had the ability to make that restriction then
>>> we could force assertions that aren't plain SQL to explicitly
>>> specify what tables the assert is going to hit, and if the assert
>>> tries to do something different then we throw an error.
>>>
>>> The ability to restrict object access within a transaction would
>>> also benefit VACUUM and possibly the Changeset stuff.
>>
>> I'm pretty sure that SSI could also optimize based on that,
>> although there are probably about 10 other optimizations that would
>> be bigger gains before getting to that.
>
> Any ideas how hard this would be?

If we had a list to check against, I think it would be possible to
do this during parse analysis and AcquireRewriteLocks().  (One or
the other happens before query rewrite.)  The hard part seems to me
to be defining a sane way to get the list.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

Hi,

Attached is a patch to add support for array_length(anyarray), which 
only works for one-dimensional arrays, returns 0 for empty arrays and 
complains if the array's lower bound isn't 1.  In other words, does the 
right thing when used with the arrays people use 99% of the time.


I'll add this to the next commit fest, but feel free to discuss it 
before that.




Regards,
Marko Tiikkaja
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
***
*** 338,343  SELECT array_length(schedule, 1) FROM sal_emp WHERE name = 'Carol';
--- 338,354 
  2
  (1 row)
  
+ 
+  The single-argument overload of array_length can be used
+  to get the length of a one-dimensional array:
+ 
+ SELECT array_length(schedule) FROM sal_emp WHERE name = 'Carol';
+ 
+  array_length
+ --
+ 2
+ (1 row)
+ 
   
   
  
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 11096,11101  SELECT NULLIF(value, '(none)') ...
--- 11096,2 
 
  
   
+   array_length(anyarray)
+  
+ 
+ int
+ returns the length of the array (array must be one-dimensional)
+ array_length(array[1,2,3])
+ 3
+
+
+ 
+  
array_lower(anyarray, int)
   
  
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
***
*** 1740,1745  array_length(PG_FUNCTION_ARGS)
--- 1740,1779 
  }
  
  /*
+  * array_length_single:
+  *Returns the length of a single-dimensional array.  The array must be
+  *single-dimensional or empty and its lower bound must be 1.
+  */
+ Datum
+ array_length_single(PG_FUNCTION_ARGS)
+ {
+   ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
+   int result;
+   int*lb;
+   int*dimv;
+ 
+   /* empty array */
+   if (ARR_NDIM(v) == 0)
+   PG_RETURN_INT32(0);
+ 
+   if (ARR_NDIM(v) != 1)
+   ereport(ERROR, 
+   (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+errmsg("input array is not 
single-dimensional")));
+ 
+   lb = ARR_LBOUND(v);
+   if (lb[0] != 1)
+   ereport(ERROR,
+   (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+errmsg("lower bound of array is not 1")));
+ 
+   dimv = ARR_DIMS(v);
+   result = dimv[0];
+   PG_RETURN_INT32(result);
+ }
+ 
+ 
+ /*
   * array_ref :
   *  This routine takes an array pointer and a subscript array and returns
   *  the referenced item as a Datum.  Note that for a pass-by-reference
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***
*** 840,845  DATA(insert OID = 2092 (  array_upper PGNSP PGUID 12 1 0 0 0 f f f f t f i 2
--- 840,847 
  DESCR("array upper dimension");
DATA(insert OID = 2176 (  array_length   PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 23 "2277 23" _null_ _null_ _null_ _null_ array_length _null_ _null_ _null_ ));
DESCR("array length");
+ DATA(insert OID = 3179 (  array_length   PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 23 "2277" _null_ _null_ _null_ _null_ array_length_single _null_ _null_ _null_ ));
+ DESCR("array length");
DATA(insert OID = 378 (  array_append   PGNSP PGUID 12 1 0 0 0 f f f f f f i 2 0 2277 "2277 2283" _null_ _null_ _null_ _null_ array_push _null_ _null_ _null_ ));
DESCR("append element onto end of array");
DATA(insert OID = 379 (  array_prepend   PGNSP PGUID 12 1 0 0 0 f f f f f f i 2 0 2277 "2283 2277" _null_ _null_ _null_ _null_ array_push _null_ _null_ _null_ ));
*** a/src/include/utils/array.h
--- b/src/include/utils/array.h
***
*** 204,209  extern Datum array_dims(PG_FUNCTION_ARGS);
--- 204,210 
  extern Datum array_lower(PG_FUNCTION_ARGS);
  extern Datum array_upper(PG_FUNCTION_ARGS);
  extern Datum array_length(PG_FUNCTION_ARGS);
+ extern Datum array_length_single(PG_FUNCTION_ARGS);
  extern Datum array_larger(PG_FUNCTION_ARGS);
  extern Datum array_smaller(PG_FUNCTION_ARGS);
  extern Datum generate_subscripts(PG_FUNCTION_ARGS);
*** a/src/test/regress/expected/arrays.out
--- b/src/test/regress/expected/arrays.out
***
*** 1455,1460  select array_length(array[[1,2,3], [4,5,6]], 3);
--- 1455,1482 
   
  (1 row)
  
+ select array_length(NULL::int[]);
+  array_length 
+ --
+  
+ (1 row)
+ 
+ select array_length(array[1,2,3]);
+  array_length 
+ --
+ 3
+ (1 row)
+ 
+ select array_length('{}'::int[]);
+  array_length 
+ --
+ 0
+ (1 row)
+ 
+ select array_length('[2:4]={5,6,7}'::int[]);
+ ERROR:  lower bound of array is not 1
+ select array_length('{{1,2}}'::int[]);
+ ERROR:  input array is not single-dimensional
  select array_agg(uni

Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Hi,

Applied your patch (but not using -fno-omit-frame-pointer). It seems to
recover the perf loss: 55659.72 notpm.
FWIW, the profile is below. I will do a run with the flag..
Samples: 598K of event 'cycles', Event count (approx.): 6568957160059


+   4.03%   postgres  postgres          [.] hash_search_with_hash_value
+   3.74%   postgres  postgres          [.] heap_hot_search_buffer
+   3.17%   postgres  postgres          [.] LWLockAcquire
+   2.92%   postgres  postgres          [.] _bt_compare
+   2.64%   postgres  postgres          [.] PinBuffer
+   2.62%   postgres  postgres          [.] AllocSetAlloc
+   2.34%   postgres  postgres          [.] XLogInsert
+   2.27%   postgres  postgres          [.] SearchCatCache
+   1.81%   postgres  postgres          [.] HeapTupleSatisfiesMVCC
+   1.52%   postgres  postgres          [.] ExecInitExpr
+   1.45%   postgres  postgres          [.] heap_page_prune_opt
+   1.41%    swapper  [kernel.kallsyms] [k] intel_idle
+   1.30%   postgres  postgres          [.] LWLockRelease
+   1.12%   postgres  postgres          [.] heapgetpage
+   0.96%   postgres  libc-2.17.so      [.] _int_malloc
+   0.86%   postgres  libc-2.17.so      [.] __memcpy_ssse3_back
+   0.84%   postgres  postgres          [.] hash_any
+   0.81%   postgres  postgres          [.] MemoryContextAllocZeroAligned
+   0.76%   postgres  postgres          [.] XidInMVCCSnapshot
+   0.74%   postgres  postgres          [.] _bt_checkkeys
+   0.70%   postgres  postgres          [.] fmgr_info_cxt_security
+   0.70%   postgres  postgres          [.] FunctionCall2Coll
+   0.66%   postgres  libc-2.17.so      [.] __strncpy_sse2_unaligned
+   0.63%   postgres  postgres          [.] tbm_iterate
+   0.58%   postgres  postgres          [.] base_yyparse
+   0.58%   postgres  libc-2.17.so      [.] __printf_fp
+   0.55%   postgres  postgres          [.] palloc
+   0.51%   postgres  postgres          [.] TransactionIdPrecedes
+   0.51%   postgres  postgres          [.] slot_deform_tuple
+   0.50%   postgres  postgres          [.] ReadBuffer_common


Re: [HACKERS] SQL objects UNITs

2013-12-18 Thread Jim Nasby

On 12/18/13, 4:22 AM, Dimitri Fontaine wrote:

 ALTER UNIT name SET SCHEMA ;


FWIW, with the "units" that we've developed we use schemas to differentiate between 
public objects and "internal" (private or protected) objects. So single-schema stuff 
becomes a PITA. Of course, since extensions already work that way I suppose that ship has sailed, 
but I thought I'd mention it.

For those who are curious... we make the distinction of public/protected/private via schemas 
because we don't want general users to need to wade through that stuff when looking at objects. So 
the convention we settled on is that public objects go in one schema, protected objects go in a 
schema of the same name that's prepended with "_", and private objects are in the 
protected schema but also prepend "_" to their names. IE:

CREATE SCHEMA awesome_feature;
CREATE VIEW awesome_feature.have_some_data

CREATE SCHEMA _awesome_feature; -- Protected / private stuff
CREATE VIEW _awesome_feature.stuff_for_database_code_to_see_but_not_users
CREATE FUNCTION 
_awesome_feature._do_not_run_this_function_anywhere_outside_of_awesome_feature()
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Hi,

On Wed, Dec 18, 2013 at 3:03 PM, Andres Freund wrote:
>
> That looks like a postgres compiled without
> -fno-omit-frame-pointer. Without that, hierarchical profiles are
> meaningless. Very new perfs can also do it using dwarf, but it's
> unusably slow...

Let me complete the current test without the flag (to serve as a checkpoint
as well). I will do a run with the flag after.

Thanks,
Dong


Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Andres Freund
Hi,

On 2013-12-18 14:59:45 -0500, Dong Ye wrote:
> ~20 minutes per run with each binary.
> Trying your patch now..
> You are right, I used -g in perf record. But what I reported was flat (meant
> as a start).
> 
> Expand GetMultiXactIdMembers:
> 
>  3.82% postgres  postgres  [.]
> GetMultiXactIdMembers

That looks like a postgres compiled without
-fno-omit-frame-pointer. Without that, hierarchical profiles are
meaningless. Very new perfs can also do it using dwarf, but it's
unusably slow...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
~20 minutes per run with each binary.
Trying your patch now..
You are right, I used -g in perf record. But what I reported was flat (meant
as a start).

Expand GetMultiXactIdMembers:

 3.82% postgres  postgres          [.] GetMultiXactIdMembers
   |
   |--9.09%-- GetMultiXactIdMembers
   |
   |--0.84%-- 0x48fb894853f58948
   |  |
   |  |--0.74%-- 0x4296e0004296c
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.64%-- 0x52f8d00052f8d
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.64%-- 0xf6ce8000f6ce8
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.62%-- 0x41de300041de1
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.61%-- 0xf2c77000f2c71
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.60%-- 0x3127700031275
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.59%-- 0x10c98b0010c987
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.59%-- 0x31df31df0
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.59%-- 0xbefbd000befbd
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0xfe97c000fe976
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x82501000824f9
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x3a4410003a43c
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x3b0cf0003b0c3
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x5325f0005325b
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x7b6b80007b6b8
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0x52e9b00052e9b
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0xf3d45000f3d40
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0x27afd00027afa
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0x3244d0003244d
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0x53e0d00053e06
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0xb64c6000b64bc
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0x423f1000423ef
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0xc18f2000c18ed
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0x6bdcf0006bdcd
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xc6d25000c6d25
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xf6534000f6534
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0x10bba80010bba0
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xb5a76000b5a6e
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0x2d3c10002d3b5
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xcc095000cc095
   |  |  GetMultiXactIdMembers
   |  |
 

Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Jim Nasby

On 12/18/13, 1:42 PM, Kevin Grittner wrote:

Jim Nasby  wrote:


This is another case where it would be very useful to restrict
what relations a transaction (or in this case, a subtransaction)
can access. If we had the ability to make that restriction then
we could force assertions that aren't plain SQL to explicitly
specify what tables the assert is going to hit, and if the assert
tries to do something different then we throw an error.

The ability to restrict object access within a transaction would
also benefit VACUUM and possibly the Changeset stuff.


I'm pretty sure that SSI could also optimize based on that,
although there are probably about 10 other optimizations that would
be bigger gains before getting to that.


Any ideas how hard this would be? My thought is that we might be able to 
perform this check in the functions that do catalog lookups, but I'm worried 
that that wouldn't allow us to support subtransaction checks (which we'd need 
for assertions), and it runs the risk of long-lasting object references 
spanning the transaction (or subtransaction) and thereby thwarting the check.

Another option would be in heap accessor functions, but that means we could 
only restrict access to tables. For assertions, it would be nice to also 
disallow access to functions with side effects that could violate the
assertion (like dblink).
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > It would only force serialization for transactions that modify tables
> > covered by the assert, that doesn't seem too bad. Anything covered by an
> > assert shouldn't be modified frequently, otherwise you'll run into major
> > performance problems.
> 
> Well, as presented there is no way (for the system) to tell which tables
> are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow directly specifying a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Jim Nasby  wrote:

> This is another case where it would be very useful to restrict
> what relations a transaction (or in this case, a subtransaction)
> can access. If we had the ability to make that restriction then
> we could force assertions that aren't plain SQL to explicitly
> specify what tables the assert is going to hit, and if the assert
> tries to do something different then we throw an error.
>
> The ability to restrict object access within a transaction would
> also benefit VACUUM and possibly the Changeset stuff.

I'm pretty sure that SSI could also optimize based on that,
although there are probably about 10 other optimizations that would
be bigger gains before getting to that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Alvaro Herrera
Andres Freund wrote:
> On 2013-12-18 13:44:15 -0300, Alvaro Herrera wrote:
> > Heikki Linnakangas wrote:
> > 
> > > Ah, I see. You don't need to block anyone else from modifying the
> > > table, you just need to block anyone else from committing a
> > > transaction that had modified the table. So the locks shouldn't
> > > interfere with regular table locks. A ShareUpdateExclusiveLock on
> > > the assertion should do it.
> > 
> > Causing serialization of transaction commit just because a single
> > assertion exists in the database seems too much of a hit, so looking for
> > optimization opportunities seems appropriate.
> 
> It would only force serialization for transactions that modify tables
> covered by the assert, that doesn't seem too bad. Anything covered by an
> assert shouldn't be modified frequently, otherwise you'll run into major
> performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Jim Nasby

On 12/18/13, 10:44 AM, Alvaro Herrera wrote:

It might prove useful to note that any given assertion involves tables
A, B, C.  If a transaction doesn't modify any of those tables then
there's no need to run the assertion when the transaction commits,
skipping acquisition of the assertion lock.

Probably this can only be implemented for SQL-language assertions, but
even so it might be worth considering.  (Assertions in any other
language would be checked by every transaction.)


This is another case where it would be very useful to restrict what relations a 
transaction (or in this case, a subtransaction) can access. If we had the
ability to make that restriction then we could force assertions that aren't 
plain SQL to explicitly specify what tables the assert is going to hit, and if 
the assert tries to do something different then we throw an error.

The ability to restrict object access within a transaction would also benefit 
VACUUM and possibly the Changeset stuff. From 
http://www.postgresql.org/message-id/52acaafd.6060...@nasby.net:

"This would be useful when you have some tables that see very frequent 
updates/deletes in a database that also has to support long-running transactions that 
don't hit those tables. You'd explicitly limit the tables your long-running transaction 
will touch and that way vacuum can ignore the long-running XID when calculating minimum 
tuple age for the heavy-hit tables."
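
A purely hypothetical sketch of what such a declaration might look like
(no such syntax exists today; all names are invented):

BEGIN;
-- Declare up front the only relations this transaction may touch; any
-- access outside the list would raise an error, and vacuum could then
-- ignore this XID when computing cutoffs for the heavily-hit tables.
SET TRANSACTION ALLOWED RELATIONS (report_data, report_results);
-- ... long-running work ...
COMMIT;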
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Jim Nasby

On 12/18/13, 11:57 AM, Josh Berkus wrote:

On 12/18/2013 08:44 AM, Alvaro Herrera wrote:

Another thought: at the initial run of the assertion, note which tables
it locked, and record this as an OID array in the catalog row for the
assertion; consider running the assertion only when those tables are
touched.  This doesn't work if the assertion code locks some tables when
run under certain conditions and other tables under different
conditions.  But then this can be checked too: if an assertion lists in
its catalog row that it involves tables A, B, C and then, under
different conditions, it tries to acquire lock on table D, have the
whole thing fail indicating that the assertion is misdeclared.


This sounds like you're re-inventing SSI.

SERIALIZABLE mode *exists* in order to be able to enforce constraints
which potentially involve more than one transaction.  "Balance can never
go below 0", for example. The whole reason we have this really cool and
unique SSI mode is so that we can do such things without killing
performance.  These sorts of requirements are ideally suited to
Assertions, so it's logically consistent to require Serializable mode in
order to use Assertions.


The flip-side is that now you can get serialization failures, and I think 
there's a ton of software that has no clue how to deal with that. So now you 
don't get to use assertions at all unless you re-engineer your application (but 
see below).

Now, if Postgres could re-attempt serialization failures, maybe that would 
become a non-issue... (though there are probably a lot of dangers in doing
that...)
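
To make the failure mode concrete, a minimal sketch (hypothetical table;
the exact error text varies, but the SQLSTATE is 40001):

BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;
-- If a concurrent serializable transaction got there first, this can
-- fail with:
--   ERROR:  could not serialize access due to concurrent update
-- and the application is expected to re-run the whole transaction.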


This is consistent with how we treat the interaction of constraints and
triggers; under some circumstances, we allow triggers to violate CHECK
and FK constraints.


We do? Under what circumstances?


Alternately, we add a GUC assertion_serializable_mode, which can be
"off", "warn" or "error".  If it's set to "error", and the user triggers
an assertion while in READ COMMITTED mode, an exception occurs.  If it's
set to "off", then assertions are disabled, in order to deal with buggy
assertions.

Now, it would be even better if we could prevent users from switching
transaction mode, but that's a MUCH bigger and  more complicated patch.


Another possibility is to allow for two different types of assertions, one 
based on SSI and one based on locking.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 13:44:15 -0300, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> 
> > Ah, I see. You don't need to block anyone else from modifying the
> > table, you just need to block anyone else from committing a
> > transaction that had modified the table. So the locks shouldn't
> > interfere with regular table locks. A ShareUpdateExclusiveLock on
> > the assertion should do it.
> 
> Causing serialization of transaction commit just because a single
> assertion exists in the database seems too much of a hit, so looking for
> optimization opportunities seems appropriate.

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Andres Freund
Hello,

On 2013-12-18 10:24:56 -0800, Dong Ye wrote:
> It seems that 0ac5ad5134f2769ccbaefec73844f8504c4d6182 is the culprit
> commit.

How long does a run take to verify the problem? Could you retry with the
patch attached to
http://www.postgresql.org/message-id/20131201114514.gg18...@alap2.anarazel.de
? Based on the theory that it creates many superfluous multixacts.

> Flat perf profiles of two such runs look like:

Those aren't really flat profiles tho ;)

> 0ac:
> 
> Samples: 706K of event 'cycles', Event count (approx.): 6690377376522
> 
> 
> +   3.82% postgres  postgres          [.] GetMultiXactIdMembers

Could you expand that one some levels, so we see the callers?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Hi,

 

We recently observed a ~15% performance regression with dbt2 on PG 9.3.

We narrowed it down by testing master between the 9.2 and 9.3 cuts.

It seems that 0ac5ad5134f2769ccbaefec73844f8504c4d6182 is the culprit
commit.

We did several runs and perf profiling comparing it against its parent
(f925c79b9f36c54b67053ade5ad225a75b8dc803).

Also tested a 12/16 commit on the master
(3b97e6823b949624afdc3ce4c92b29a80429715f) once; it performed similarly to
0ac..

 

Regards,

Dong

 

 

Results:

f92: 53k-56k'ish notpm
0ac: 47k-48k'ish notpm

Server SUT:

HP ML350 G6
Two Xeon E5520 (4c/p, 8 cores total, hyper-threading disabled)
12GB DRAM
HP P410i RAID controller (256MB battery-backed cache)
- three 10k-rpm SAS: /
- three 10k-rpm SAS: /pgdata
- one 15k-rpm SAS: /pgxlog
- ext4 (rw,relatime,data=ordered) on all mounts.

Fedora 19 (3.11.10-200.fc19.x86_64)

max_connections=100
shared_buffers=8192MB
effective_cache_size=10GB
temp_buffers=8186kB
work_mem=4093kB
maintenance_work_mem=399MB
wal_buffers=-1
checkpoint_segments=300
checkpoint_completion_target=0.9
logging_collector=on
log_timezone=UTC
datestyle='iso, mdy'
lc_messages=C
lc_monetary=C
lc_numeric=C
lc_time=C
default_text_search_config='pg_catalog.english'
listen_addresses='*'
log_destination=csvlog
log_directory=pg_log
log_filename='pg-%a'
log_rotation_age=1440
log_truncate_on_rotation=on

Client and workload:

Dell 390. Two core. Direct connect with the Server SUT.
dbt2 (ToT)
40 warehouse
8 terminals, 8 connections
zero think/key time
12-min run

Flat perf profiles of two such runs look like:

f92:

Samples: 608K of event 'cycles', Event count (approx.): 6679607097416


+   4.04%   postgres  postgres          [.] heap_hot_search_buffer
+   3.63%   postgres  postgres          [.] AllocSetAlloc
+   3.37%   postgres  postgres          [.] hash_search_with_hash_value
+   2.85%   postgres  postgres          [.] _bt_compare
+   2.67%   postgres  postgres          [.] SearchCatCache
+   2.46%   postgres  postgres          [.] LWLockAcquire
+   2.16%   postgres  postgres          [.] XLogInsert
+   2.08%   postgres  postgres          [.] PinBuffer
+   1.32%   postgres  postgres          [.] ExecInitExpr
+   1.31%   postgres  libc-2.17.so      [.] _int_malloc
+   1.29%    swapper  [kernel.kallsyms] [k] intel_idle
+   1.23%   postgres  postgres          [.] MemoryContextAllocZeroAligned
+   1.13%   postgres  postgres          [.] heap_page_prune_opt
+   1.06%   postgres  libc-2.17.so      [.] __memcpy_ssse3_back
+   1.02%   postgres  postgres          [.] LWLockRelease
+   0.94%   postgres  postgres          [.] copyObject
+   0.89%   postgres  postgres          [.] fmgr_info_cxt_security
+   0.82%   postgres  postgres          [.] _bt_checkkeys
+   0.81%   postgres  postgres          [.] hash_any
+   0.73%   postgres  postgres          [.] FunctionCall2Coll
+   0.69%   postgres  libc-2.17.so      [.] __strncpy_sse2_unaligned
+   0.67%   postgres  postgres          [.] HeapTupleSatisfiesMVCC
+   0.66%   postgres  postgres          [.] MemoryContextAlloc
+   0.65%   postgres  postgres          [.] expression_tree_walker
+   0.59%   postgres  postgres          [.] check_stack_depth
+   0.57%   postgres  libc-2.17.so      [.] __printf_fp
+   0.56%   postgres  libc-2.17.so      [.] _int_free
+   0.52%   postgres  postgres          [.] base_yyparse

 

0ac:

Samples: 706K of event 'cycles', Event count (approx.): 6690377376522


+   3.82% postgres  postgres          [.] GetMultiXactIdMembers
+   3.43% postgres  postgres          [.] LWLockAcquire
+   3.31% postgres  postgres          [.] hash_search_with_hash_value
+   3.09% postgres  postgres          [.] heap_hot_search_buffer
+   3.00% postgres  postgres          [.] AllocSetAlloc
+   2.56% postgres  postgres          [.] _bt_compare
+   2.19% postgres  postgres          [.] PinBuffer
+   2.13% postgres  postgres          [.] SearchCatCache
+   1.99% postgres  postgres          [.] XLogInsert

Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Josh Berkus
On 12/18/2013 08:44 AM, Alvaro Herrera wrote:
> Another thought: at the initial run of the assertion, note which tables
> it locked, and record this as an OID array in the catalog row for the
> assertion; consider running the assertion only when those tables are
> touched.  This doesn't work if the assertion code locks some tables when
> run under certain conditions and other tables under different
> conditions.  But then this can be checked too: if an assertion lists in
> its catalog row that it involves tables A, B, C and then, under
> different conditions, it tries to acquire lock on table D, have the
> whole thing fail indicating that the assertion is misdeclared.

This sounds like you're re-inventing SSI.

SERIALIZABLE mode *exists* in order to be able to enforce constraints
which potentially involve more than one transaction.  "Balance can never
go below 0", for example. The whole reason we have this really cool and
unique SSI mode is so that we can do such things without killing
performance.  These sorts of requirements are ideally suited to
Assertions, so it's logically consistent to require Serializable mode in
order to use Assertions.
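
For instance, the kind of multi-transaction invariant I mean can be
maintained with an ordinary trigger, provided every writer runs at
SERIALIZABLE (a hand-written sketch with hypothetical names):

CREATE TABLE account (
    customer int NOT NULL,
    delta    numeric NOT NULL
);

CREATE FUNCTION check_balance() RETURNS trigger AS $$
BEGIN
    -- A CHECK constraint can't see other rows; this trigger can, and
    -- SSI closes the race between concurrent writers.
    IF (SELECT sum(delta) FROM account
        WHERE customer = NEW.customer) < 0 THEN
        RAISE EXCEPTION 'balance below 0 for customer %', NEW.customer;
    END IF;
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER account_balance
    AFTER INSERT OR UPDATE ON account
    FOR EACH ROW EXECUTE PROCEDURE check_balance();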

I'm leaning towards the alternative that Assertions require SERIALIZABLE
mode, and throw a WARNING at the user and the log every time we create,
modify, or trigger an assertion while not in SERIALIZABLE mode.   And
beyond, that, we don't guarantee the integrity of Assertions if people
choose to run in READ COMMITTED anyway.

This is consistent with how we treat the interaction of constraints and
triggers; under some circumstances, we allow triggers to violate CHECK
and FK constraints.

Alternately, we add a GUC assertion_serializable_mode, which can be
"off", "warn" or "error".  If it's set to "error", and the user triggers
an assertion while in READ COMMITTED mode, an exception occurs.  If it's
set to "off", then assertions are disabled, in order to deal with buggy
assertions.

Now, it would be even better if we could prevent users from switching
transaction mode, but that's a MUCH bigger and  more complicated patch.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 7:05 PM, Cédric Villemain  wrote:
> On Tuesday, December 17, 2013 at 21:14:44, Josh Berkus wrote:
>> On 12/17/2013 06:34 AM, Robert Haas wrote:
>> > On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila 
> wrote:
>> >> I have used pg_prewarm during some of work related to Buffer Management
>> >> and other performance related work. It is quite useful utility.
>> >> +1 for reviving this patch for 9.4
>> >
>> > Any other votes?
>>
>> I still support this patch (as I did originally), and don't think that
>> the overlap with pgFincore is of any consequence.  pgFincore does more
>> than pg_prewarm ever will, but it's also platform-specific, so it still
>> makes sense for both to exist.
>
> Just for information, pgFincore is NOT limited to Linux (the most interesting
> part, the memory snapshot, also works on BSD-based kernels with the mincore()
> syscall).
> As with the PostgreSQL effective_io_concurrency (and pg_prewarm), it just
> doesn't work when posix_fadvise is not available.

This is a fair point, and I should not have implied that it was
Linux-only.  What I really meant was "does not support Windows",
because that's a really big part of our user base.  However, I
shouldn't have phrased it in a way that slights BSD and other UNIX
variants.

Now that we have dynamic background workers, I've been thinking that
it might be possible to write a background worker to do asynchronous
prefetch on systems where we don't have OS support.  We could store a
small ring buffer in shared memory, say big enough for 1k entries.
Each entry would consist of a relfilenode, a starting block number,
and a block count.  We'd also store a flag indicating whether the
prefetch worker has been registered with the postmaster, and a pointer
to the PGPROC of any running worker. When a process wants to do a
prefetch, it locks the buffer, adds its prefetch request to the queue
(overwriting the oldest existing request if the queue is already
full), and checks the flag.  If the flag is not set, it also registers
the background worker.  Then, it releases the lock and sets the latch
of any running worker (whose PGPROC it remembered before releasing the
lock).

When the prefetch process starts up, it services requests from the
queue by reading the requested blocks (or block ranges).  When the
queue is empty, it sleeps.  If it receives no requests for some period
of time, it unregisters itself and exits.  This is sort of a souped-up
version of the hibernation facility we already have for some auxiliary
processes, in that we don't just make the process sleep for a longer
period of time but actually get rid of it altogether.

All of this might be overkill; we could also do it with a permanent
auxiliary process.  But it's sort of a shame to run an extra process
for a facility that might never get used, or might be used only
rarely.  And I'm wary of cluttering things up with a thicket of
auxiliary processes each of which caters only to a very specific, very
narrow situation.  Anyway, just thinking out loud here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dynamic Shared Memory stuff

2013-12-18 Thread Robert Haas
On Tue, Dec 10, 2013 at 6:26 PM, Tom Lane  wrote:
> Noah Misch  writes:
>> On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:
>>> Let's not add more cases like that, if we can avoid it.
>
>> Only if we can avoid it for a modicum of effort and feature compromise.
>> You're asking for PostgreSQL to reshape its use of persistent resources so 
>> you
>> can throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a
>> memory leak.  That use case, not PostgreSQL, has the defect here.
>
> The larger point is that such a shutdown process has never in the history
> of Postgres been successful at removing shared-memory (or semaphore)
> resources.  I do not feel a need to put a higher recoverability standard
> onto the DSM code than what we've had for fifteen years with SysV shmem.
>
> But, by the same token, it shouldn't be any *less* recoverable.  In this
> context that means that starting a new postmaster should recover the
> resources, at least 90% of the time (100% if we still have the old
> postmaster lockfile).  I think the idea of keeping enough info in the
> SysV segment to permit recovery of DSM resources is a good one.  Then,
> any case where the existing logic is able to free the old SysV segment
> is guaranteed to also work for DSM.

So I'm taking a look at this.  There doesn't appear to be anything
intrinsically intractable here, but it seems important to time the
order of operations so as to minimize the chances that things fail in
awkward places.  The point where we remove the old shared memory
segment from the previous postmaster invocation is here:

/*
 * The segment appears to be from a dead Postgres process, or from a
 * previous cycle of life in this same process.  Zap it, if possible.
 * This probably shouldn't fail, but if it does, assume the segment
 * belongs to someone else after all, and continue quietly.
 */
shmdt(memAddress);
if (shmctl(shmid, IPC_RMID, NULL) < 0)
continue;

My first thought was to remember the control segment ID from the
header just before the shmdt() there, and then later when the DSM
module is starting, do cleanup.  But that doesn't seem like a good
idea, because then if startup fails after we remove the old shared
memory segment and before we get to the DSM initialization code, we'll
have lost the information on what control segment needs to be cleaned
up.  A subsequent startup attempt won't see the old shm again, because
it's already gone.  I'm fairly sure that this would be a net reduction
in reliability vs. the status quo.

So now what I'm thinking is that we ought to actually perform the DSM
cleanup before detaching the old segment and trying to remove it.
That shouldn't fail, but if it does, or if we get killed before
completing it, the next run will hopefully find the same old shm and
finish the cleanup.  But that kind of flies in the face of the comment
above: if we perform DSM cleanup and then discover that the segment
wasn't ours after all, that means we just stomped all over somebody
else's stuff.  That's not too good. But trying to remove the segment
first and then perform the cleanup creates a window where, if we get
killed, the next restart will have lost information about how to
finish cleaning up.  So it seems that some kind of logic tweak is
needed here, but I'm not sure exactly what.  As I see it, the options
are:

1. Make failure to remove the shared memory segment we thought was
ours an error.  This will definitely show up any problems, but only
after we've burned down some other processes's dynamic shared memory
segments.  The most likely outcome is creating failure-to-start
problems that don't exist today.

2. Make it a warning.  We'll still burn down somebody else's DSMs, but
at least we'll still start ourselves.  Sadly, "WARNING: You have just
done a bad thing.  It's too late to fix it.  Sorry!" is not very
appealing.

3. Remove the old shared memory segment first, then perform the
cleanup immediately afterwards.  If we get killed before completing
the cleanup, we'll leak the un-cleaned-up stuff.  Decide that's OK and
move on.

4. Adopt some sort of belt-and-suspenders approach, keeping the state
file we have now and backstopping it with this mechanism, so that we
really only need this to work when $PGDATA has been blown away and
recreated.  This seems pretty inelegant, and I'm not sure who it'll
benefit other than those (few?) people who kill -9 the postmaster (or
it segfaults or otherwise dies without running the code to remove
shared memory segments) and then remove $PGDATA and then re-initdb and
then expect cleanup to find the old DSMs.

5. Give up on this approach.  We could keep what we have now, or make
the DSM control segment land at a well-known address as we do for the
main segment.

What do people prefer?

The other thing I'm a bit squidgy about is injecting more code that
runs before we get the new shared memory segment up and running.

Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Cédric Villemain
On Tuesday, December 17, 2013 at 21:14:44, Josh Berkus wrote:
> On 12/17/2013 06:34 AM, Robert Haas wrote:
> > On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila  
wrote:
> >> I have used pg_prewarm during some of work related to Buffer Management
> >> and other performance related work. It is quite useful utility.
> >> +1 for reviving this patch for 9.4
> > 
> > Any other votes?
> 
> I still support this patch (as I did originally), and don't think that
> the overlap with pgFincore is of any consequence.  pgFincore does more
> than pg_prewarm ever will, but it's also platform-specific, so it still
> makes sense for both to exist.

Just for information, pgFincore is NOT limited to Linux (the most interesting
part, the memory snapshot, also works on BSD-based kernels with the mincore()
syscall).
As with the PostgreSQL effective_io_concurrency (and pg_prewarm), it just
doesn't work when posix_fadvise is not available.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Cédric Villemain
On Tuesday, December 17, 2013 at 17:45:51, Robert Haas wrote:
> On Tue, Dec 17, 2013 at 11:02 AM, Jim Nasby  wrote:
> > On 12/17/13, 8:34 AM, Robert Haas wrote:
> >> On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila 
> >> 
> >> wrote:
> >>> I have used pg_prewarm during some of work related to Buffer Management
> >>> and
> >>> other performance related work. It is quite useful utility.
> >>> +1 for reviving this patch for 9.4
> >> 
> >> Any other votes?
> > 
> > We've had to manually code something that runs EXPLAIN ANALYZE SELECT *
> > from a bunch of tables to warm our caches after a restart, but there's
> > numerous flaws to that approach obviously.
> > 
> > Unfortunately, what we really need to warm isn't the PG buffers, it's the
> > FS cache, which I suspect this won't help. But I still see where just
> > pg_buffers would be useful for a lot of folks, so +1.
> 
> It'll do either one.  For the FS cache, on Linux, you can also use
> pgfincore.

On Linux and *BSD (including OS X), like what's in PostgreSQL. Only Windows is
out of scope so far, and there is a solution for Windows too; there is just no
demand for it from pgFincore users.

Maybe you can add the Windows support in PostgreSQL now?

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training




Re: [HACKERS] RFC: Async query processing

2013-12-18 Thread Florian Weimer

On 11/04/2013 02:51 AM, Claudio Freire wrote:

On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer  wrote:

I would like to add truly asynchronous query processing to libpq, enabling
command pipelining.  The idea is to allow applications to auto-tune to
the bandwidth-delay product and reduce the number of context switches when
running against a local server.

...

If the application is not interested in intermediate query results, it would
use something like this:

...

If there is no need to exit from the loop early (say, because errors are
expected to be extremely rare), the PQgetResultNoWait call can be left out.


It doesn't seem wise to me to make such a distinction. It sounds like
you're oversimplifying, and that's why you need "modes", to overcome
the evidently restrictive limits of the simplified interface, and that
it would only be a matter of (a short) time before some other limitation
requires some other mode.


I need modes because I want to avoid unbounded buffering, which means that 
result data has to be consumed in the order queries are issued.



   PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
   bool more_data;
   do {
  more_data = ...;
  if (more_data) {
int ret = PQsendQueryParams(conn,
  "INSERT ... RETURNING ...", ...);
if (ret == 0) {
  // handle low-level error
}
  }
  // Consume all pending results.
  while (1) {
PGresult *res;
if (more_data) {
  res = PQgetResultNoWait(conn);
} else {
  res = PQgetResult(conn);
}


Somehow, that code looks backwards. I mean, really backwards. Wouldn't
that be !more_data?


No, if more data is available to transfer to the server, the no-wait 
variant has to be used to avoid a needless synchronization with the server.



In any case, pipelining like that, without a clear distinction, in the
wire protocol, of which results pertain to which query, could be a
recipe for trouble when subtle bugs, either in lib usage or
implementation, mistakenly treat one query's result as another's.


We already use pipelining in libpq (see pqFlush, PQsendQueryGuts and 
pqParseInput3), the server is supposed to support it, and there is a 
lack of a clear tit-for-tat response mechanism anyway because of 
NOTIFY/LISTEN and the way certain errors are reported.



Instead of buffering the results, we could buffer the encoded command
messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams would
not block when it cannot send the (complete) command message, but store it in
the connection object so that the subsequent PQgetResultNoWait and
PQgetResult would send it.  This might work better with single-tuple result
mode.  We cannot avoid buffering either multiple queries or multiple
responses if we want to utilize the link bandwidth, or we'd risk deadlocks.


This is a non-solution. Such an implementation, at least as described,
would remove neither network latency nor context switches; it
would be a pure API change with no externally visible behavior
change.


Ugh, why?


An effective solution must include multi-command packets. Without
knowing the wire protocol in detail, something like:

PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: SELECT  blah
BIND: args
EXECUTE with FETCH ALL

All in one packet, would be efficient and error-free (IMO).


No, because this doesn't scale automatically with the bandwidth-delay 
product.  It also requires that the client buffers queries and their 
parameters even though the network has to do that anyway.


In any case, I don't want to change the wire protocol, I just want to 
enable libpq clients to use more of its capabilities.


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 reference constraint regression

2013-12-18 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Andres Freund wrote:
> 
> > I have to say, it makes me really uncomfortable to take such
> shortcuts. In other branches we are doing liveness checks with
> > MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
> > likely is that this won't cause breakage somewhere down the line because
> > somebody doesn't know of that subtlety?
> 
> I came up with the attached last night, which should do the right thing
> in both cases.

That one had a silly bug, which I fixed and pushed now.

Thanks for the feedback,

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Alvaro Herrera
Heikki Linnakangas wrote:

> Ah, I see. You don't need to block anyone else from modifying the
> table, you just need to block anyone else from committing a
> transaction that had modified the table. So the locks shouldn't
> interfere with regular table locks. A ShareUpdateExclusiveLock on
> the assertion should do it.

Causing serialization of transaction commit just because a single
assertion exists in the database seems too much of a hit, so looking for
optimization opportunities seems appropriate.  Here are some ideas for
brainstorming.

It might prove useful to note that any given assertion involves tables
A, B, C.  If a transaction doesn't modify any of those tables then
there's no need to run the assertion when the transaction commits,
skipping acquisition of the assertion lock.

Probably this can only be implemented for SQL-language assertions, but
even so it might be worth considering.  (Assertions in any other
language would be checked by every transaction.)

Another thought: at the initial run of the assertion, note which tables
it locked, and record this as an OID array in the catalog row for the
assertion; consider running the assertion only when those tables are
touched.  This doesn't work if the assertion code locks some tables when
run under certain conditions and other tables under different
conditions.  But then this can be checked too: if an assertion lists in
its catalog row that it involves tables A, B, C and then, under
different conditions, it tries to acquire lock on table D, have the
whole thing fail indicating that the assertion is misdeclared.
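
To make that concrete, here is a hypothetical sketch of such a check
(every name in it, including the recorded OID array and the function
itself, is invented; only the ereport machinery and the error code are
real PostgreSQL APIs):

#include "postgres.h"

/*
 * Hypothetical sketch of the misdeclaration check described above.
 * 'recorded' stands for the OID array noted in the assertion's catalog
 * row at its initial run; a lock request on any other relation while
 * running the assertion fails the whole thing.
 */
static void
assertion_check_relation(Oid assertOid, Oid relOid,
                         const Oid *recorded, int nrecorded)
{
    int         i;

    for (i = 0; i < nrecorded; i++)
    {
        if (recorded[i] == relOid)
            return;             /* table was seen at the initial run */
    }

    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("assertion %u is misdeclared: it acquired a lock on "
                    "relation %u, which is not listed in its catalog row",
                    assertOid, relOid)));
}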

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Josh Berkus  wrote:

> Serializable or not, *what* do we lock for assertions?  It's not
> rows.  Tables?  Which tables?  What if the assertion is an
> interpreted language function?  Does the SSI reference counter
> really take care of all of this?

The simple answer is that, without adding any additional blocking,
SSI guarantees that the behavior of running any set of concurrent
serializable transactions is consistent with some serial
(one-at-a-time) execution of those transactions.  If the assertion
is run as part of the transaction, it is automatically handled,
regardless of the issues you are asking about.

The longer answer is in the README and the papers it references:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/lmgr/README-SSI;hb=master

For practical examples of how it works, this Wiki page includes
examples of maintaining a multi-table invariant using triggers:

http://wiki.postgresql.org/wiki/SSI

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Logging WAL when updating hintbit

2013-12-18 Thread Sawada Masahiko
2013/12/14 0:14 "Tom Lane" :
>
> Heikki Linnakangas  writes:
> > I'm not totally satisfied with the name of the GUC, wal_log_hintbits.
>
> Me either; at the very least, it's short an underscore: wal_log_hint_bits
> would be more readable.  But how about just "wal_log_hints"?
>

+1 too.
It's readable.

Masahiko Sawada


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Heikki Linnakangas  wrote:
>On 12/18/2013 01:39 PM, Andres Freund wrote:
>> On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:
>>> Here's another idea that doesn't involve SSI:
>>>
>>> At COMMIT, take a new snapshot and check that the assertion still passes
>>> with that snapshot. Now, there's a race condition, if another transaction is
>>> committing at the same time, and performs the same check concurrently. That
>>> race condition can be eliminated by holding an exclusive lock while running
>>> the assertion, until commit, effectively allowing the assertion to be
>>> checked by only one transaction at a time.
>>>
>>> I think that would work, and would be simple, although it wouldn't scale too
>>> well.
>>
>> It probably would also be very prone to deadlocks.
>
> Hmm, since this would happen at commit, you would know all the
> assertions that need to be checked at that point. You could check them
> e.g in Oid order to avoid deadlocks.

So you would actually serialize all COMMITs, or at least all which
had done anything which could affect the truth of an assertion? 
That seems like it would work, but I suspect that it would be worse
for performance than SSI in many workloads, and better than SSI in
other workloads.  Maybe a GUC to determine which strategy is used?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.3 reference constraint regression

2013-12-18 Thread Alvaro Herrera
Andres Freund wrote:

> I have to say, it makes me really uncomfortable to take such
> shortcuts. In other branches we are doing liveness checks with
> MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
> likely is it that this won't cause breakage somewhere down the line because
> somebody doesn't know of that subtlety?

I came up with the attached last night, which should do the right thing
in both cases.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
***
*** 1275,1280  retry:
--- 1275,1313 
  }
  
  /*
+  * MultiXactHasRunningRemoteMembers
+  * 		Does the given multixact have still-live members from
+  * 		transactions other than our own?
+  */
+ bool
+ MultiXactHasRunningRemoteMembers(MultiXactId multi)
+ {
+ 	MultiXactMember *members;
+ 	int			nmembers;
+ 	int			i;
+ 
+ 	nmembers = GetMultiXactIdMembers(multi, &members, true);
+ 	if (nmembers <= 0)
+ 		return false;
+ 
+ 	for (i = 0; i < nmembers; i++)
+ 	{
+ 		/* not interested in our own members */
+ 		if (TransactionIdIsCurrentTransactionId(members[i].xid))
+ 			continue;
+ 
+ 		if (TransactionIdIsInProgress(members[i].xid))
+ 		{
+ 			pfree(members);
+ 			return true;
+ 		}
+ 	}
+ 
+ 	pfree(members);
+ 	return false;
+ }
+ 
+ /*
   * mxactMemberComparator
   *		qsort comparison function for MultiXactMember
   *
*** a/src/backend/utils/time/tqual.c
--- b/src/backend/utils/time/tqual.c
***
*** 686,693  HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
  			if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
  return HeapTupleMayBeUpdated;
  
! 			if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))	/* not deleter */
! return HeapTupleMayBeUpdated;
  
  			if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
  			{
--- 686,721 
  			if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
  return HeapTupleMayBeUpdated;
  
! 			if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
! 			{
! TransactionId	xmax;
! 
! xmax = HeapTupleHeaderGetRawXmax(tuple);
! 
! /*
!  * Careful here: even though this tuple was created by our own
!  * transaction, it might be locked by other transactions, if
!  * the original version was key-share locked when we updated
!  * it.
!  */
! 
! if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
! {
! 	if (MultiXactHasRunningRemoteMembers(xmax))
! 		return HeapTupleBeingUpdated;
! 	else
! 		return HeapTupleMayBeUpdated;
! }
! 
! /* if locker is gone, all's well */
! if (!TransactionIdIsInProgress(xmax))
! 	return HeapTupleMayBeUpdated;
! 
! if (!TransactionIdIsCurrentTransactionId(xmax))
! 	return HeapTupleBeingUpdated;
! else
! 	return HeapTupleMayBeUpdated;
! 			}
  
  			if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
  			{
***
*** 700,706  HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
--- 728,738 
  
  /* updating subtransaction must have aborted */
  if (!TransactionIdIsCurrentTransactionId(xmax))
+ {
+ 	if (MultiXactHasRunningRemoteMembers(xmax))
+ 		return HeapTupleBeingUpdated;
  	return HeapTupleMayBeUpdated;
+ }
  else
  {
  	if (HeapTupleHeaderGetCmax(tuple) >= curcid)
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
***
*** 89,94  extern bool MultiXactIdIsRunning(MultiXactId multi);
--- 89,95 
  extern void MultiXactIdSetOldestMember(void);
  extern int GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **xids,
  	  bool allow_old);
+ extern bool MultiXactHasRunningRemoteMembers(MultiXactId multi);
  extern bool MultiXactIdPrecedes(MultiXactId multi1, MultiXactId multi2);
  extern bool MultiXactIdPrecedesOrEquals(MultiXactId multi1,
  			MultiXactId multi2);



Re: [HACKERS] Logging WAL when updating hintbit

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 10:22 PM, Amit Kapila  wrote:
>> Me either; at the very least, it's short an underscore: wal_log_hint_bits
>> would be more readable.  But how about just "wal_log_hints"?
>
> +1 for wal_log_hints, it sounds better.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 3:30 AM, Craig Ringer  wrote:
> In my view the proposed patch doesn't offer a significant improvement in
> declarative security, beyond what we can get by just adding update
> support to s.b. views and using search_path to control whether a user
> sees the view or the base table.
>
> It's a lot like Oracle Virtual Private Database (VPD): A set of low
> level building blocks you can build your own flexible row security model
> with. One where you have to very carefully write security-sensitive
> predicates and define all your security model tables, etc yourself.
>
> That's why I'm now of the opinion that we should make it possible to
> achieve the same thing with s.b. views and the search_path (by adding
> update support)... then build a declarative row-security system that
> doesn't require the user fiddling with delicate queries and sensitive
> scripts on top of that.

To be clear, I wasn't advocating for a declarative approach; I think
predicates are fine.  There are usability issues to worry about either
way, and my concern is that we address those.  A declarative approach
would certainly be valuable in that, for those people for whom it is
sufficiently flexible, it's probably quite a lot easier than writing
predicates.  But I fear that some people will want a lot more
generality than a label-based system can accommodate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 1:27 PM, Simon Riggs  wrote:
> Not sure I'd say required, but its certainly desirable to have
> updateable security barrier views in themselves. And it comes across
> to me as a cleaner and potentially more performant way of doing the
> security checks for RLS.

Yes, that's how I'm thinking of it.  It's required in the sense that
if we don't do it as a separate patch, we'll need to fold many of
changes into the RLS patch, which IMV is not desirable.  We'd end up
with more complexity and less functionality with no real upside that I
can see.

But I think we are basically saying the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 9:03 PM, KONDO Mitsumasa
 wrote:
> (2013/12/18 5:33), Robert Haas wrote:
>> Sounds like it might be worth dusting the patch off again...
>
> I'd like to request you to add an all_index option and a usage_count option.
> When the all_index option is selected, all indexes become rewarmed even when
> the user doesn't input a relation name. And the usage_count option sets
> usage_count in shared_buffers, so useful buffers will remain longer and not
> be thrown out easily. I think these are easy to implement and useful. So
> please, if you like.

Prewarming indexes is useful, but I don't think we need to complicate
the API for that.  With the version I just posted, you can simply do
something like this:

SELECT pg_prewarm(indexrelid) FROM pg_index WHERE indrelid =
'pgbench_accounts'::regclass;

I seriously doubt whether being able to set the usage count is actually useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)

2013-12-18 Thread Alvaro Herrera
Stephen Frost escribió:
> * Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:

> > Basically with building `UNIT` we realise with hindsight that we failed to
> > build a proper `EXTENSION` system, and we send that message to our users.
> 
> Little difficult to draw conclusions about what our 'hindsight' will
> look like.

I haven't been keeping very close attention to this, but I fail to see
why extensions are so much of a failure.  Surely we can invent a new
"kind" of extensions, ones whose contents specifically are dumped by
pg_dump.  Regular extensions, the kind we have today, still wouldn't,
but we could have a flag, say "CREATE EXTENSION ... (WITH DUMP)" or
something.  That way you don't have to come up with UNIT at all (or
whatever).  A whole new set of catalogs just to fix up a minor issue
with extensions sounds a bit too much to me; we can just add this new
thing on top of the existing infrastructure.

I didn't much like the WITH UNIT/END UNIT thingy.  What's wrong with
CREATE foo; ALTER EXTENSION ADD foo?  There's a bit of a problem that if 
you create the object and die before being able to add it to the
extension, it would linger unreferenced; but that's easily fixable by
doing the creation in a transaction, I think.  (Alternatively, we could
have a single command that creates the extension and the contained
objects in one fell swoop, similar to how CREATE SCHEMA can do it; but
I'm not sure that's all that much better, and from a grammar POV it
probably sucks.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 12:35 PM, Jeff Janes  wrote:
> Since it doesn't use directIO, you can't warm the PG buffers without also
> warming FS cache as a side effect.  That is why I like 'buffer' as the
> default--if the data fits in shared_buffers, it warms those, otherwise it at
> least warms the FS.  If you want to only warm the FS cache, you can use
> either the 'prefetch' or 'read' modes instead.

All right, here is an updated patch.  I swapped the second and third
arguments, because I think overriding the prewarm mode will be a lot
more common than overriding the relation fork.  I also added defaults,
so you can do this:

SELECT pg_prewarm('pgbench_accounts');

Or this:

SELECT pg_prewarm('pgbench_accounts', 'read');

I also fixed some oversights in the error checks.

I'm not inclined to wait for the next CommitFest to commit this,
because it's a very simple patch and has already had a lot more field
testing than most patches get before they're committed.  And it's just
a contrib module, so the damage it can do if there is in fact a bug is
pretty limited.  All that having been said, any review is appreciated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/Makefile b/contrib/Makefile
index 8a2a937..dd2683b 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -32,6 +32,7 @@ SUBDIRS = \
 		pg_archivecleanup \
 		pg_buffercache	\
 		pg_freespacemap \
+		pg_prewarm	\
 		pg_standby	\
 		pg_stat_statements \
 		pg_test_fsync	\
diff --git a/contrib/pg_prewarm/Makefile b/contrib/pg_prewarm/Makefile
new file mode 100644
index 000..176a29a
--- /dev/null
+++ b/contrib/pg_prewarm/Makefile
@@ -0,0 +1,18 @@
+# contrib/pg_prewarm/Makefile
+
+MODULE_big = pg_prewarm
+OBJS = pg_prewarm.o
+
+EXTENSION = pg_prewarm
+DATA = pg_prewarm--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_prewarm
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_prewarm/pg_prewarm--1.0.sql b/contrib/pg_prewarm/pg_prewarm--1.0.sql
new file mode 100644
index 000..2bec776
--- /dev/null
+++ b/contrib/pg_prewarm/pg_prewarm--1.0.sql
@@ -0,0 +1,14 @@
+/* contrib/pg_prewarm/pg_prewarm--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_prewarm" to load this file. \quit
+
+-- Register the function.
+CREATE FUNCTION pg_prewarm(regclass,
+		   mode text default 'buffer',
+		   fork text default 'main',
+		   first_block int8 default null,
+		   last_block int8 default null)
+RETURNS int8
+AS 'MODULE_PATHNAME', 'pg_prewarm'
+LANGUAGE C;
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
new file mode 100644
index 000..10317f3
--- /dev/null
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -0,0 +1,205 @@
+/*-
+ *
+ * pg_prewarm.c
+ *		  prewarming utilities
+ *
+ * Copyright (c) 2010-2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		  contrib/pg_prewarm/pg_prewarm.c
+ *
+ *-
+ */
+#include "postgres.h"
+
+#include 
+#include 
+
+#include "access/heapam.h"
+#include "catalog/catalog.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/bufmgr.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/lsyscache.h"
+#include "utils/rel.h"
+
+PG_MODULE_MAGIC;
+
+extern Datum pg_prewarm(PG_FUNCTION_ARGS);
+
+PG_FUNCTION_INFO_V1(pg_prewarm);
+
+typedef enum
+{
+	PREWARM_PREFETCH,
+	PREWARM_READ,
+	PREWARM_BUFFER
+} PrewarmType;
+
+static char blockbuffer[BLCKSZ];
+
+/*
+ * pg_prewarm(regclass, mode text, fork text,
+ *			  first_block int8, last_block int8)
+ *
+ * The first argument is the relation to be prewarmed; the second controls
+ * how prewarming is done; legal options are 'prefetch', 'read', and 'buffer'.
+ * The third is the name of the relation fork to be prewarmed.  The fourth
+ * and fifth arguments specify the first and last block to be prewarmed.
+ * If the fourth argument is NULL, it will be taken as 0; if the fifth argument
+ * is NULL, it will be taken as the number of blocks in the relation.  The
+ * return value is the number of blocks successfully prewarmed.
+ */
+Datum
+pg_prewarm(PG_FUNCTION_ARGS)
+{
+	Oid		relOid;
+	text   *forkName;
+	text   *type;
+	int64	first_block;
+	int64	last_block;
+	int64	nblocks;
+	int64	blocks_done = 0;
+	int64		block;
+	Relation	rel;
+	ForkNumber	forkNumber;
+	char   *forkString;
+	char   *ttype;
+	PrewarmType	ptype;
+	AclResult	aclresult;
+
+	/* Basic sanity checking. */
+	if (PG_ARGISNULL(0))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relation cannot be null")));
+	relOid = PG_GETARG_OID(0);
+	if (PG_A

Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Tatsuo Ishii
>> Is there any reason for the function to return int, as it always returns
>> 0 or 1? Maybe returning bool is better?
> 
> No, return type should be bool, I have changed the same in attached patch.

Confirmed.

>> 2) initdb.c
>>
>> +   strcpy(tempautobuf, "# Do not edit this file manually! \n");
>> +   autoconflines[0] = pg_strdup(tempautobuf);
>> +   strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM 
>> command. \n");
>> +   autoconflines[1] = pg_strdup(tempautobuf);
>>
>> Is there any reason to use "tempautobuf" here? I think we can simply change 
>> to this:
>>
>> +   autoconflines[0] = pg_strdup("# Do not edit this file manually! \n");
>> +   autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER 
>> SYSTEM command. \n");
> 
> You are right, I have changed code as per your suggestion.

Confirmed.

>> 3) initdb.c
>>
>> It seems the memory allocated for autoconflines[0] and
>> autoconflines[1] by pg_strdup is never freed.
> 
> I think, it gets freed in writefile() in below code.

Oh, I see. Sorry for noise.

I have committed your patches. Thanks.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] GIN improvements part 1: additional information

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 01:45 PM, Alexander Korotkov wrote:

On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas 
wrote:



On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
  2) Storage would be easily extendable to hold additional information as
well. Better compression shouldn't block more serious improvements.



I'm not sure I agree with that. For all the cases where you don't care
about additional information - which covers all existing users for example
- reducing disk size is pretty important. How are you planning to store the
additional information, and how does using another encoding get in the way
of that?


I was planning to store additional-information datums between
varbyte-encoded TIDs. I expected it would be hard to do with PFOR.
However, I don't see significant problems in your implementation of Simple9
encoding. I'm going to dig deeper into your version of the patch.


Ok, thanks.

I had another idea about the page format this morning. Instead of having 
the item-indexes at the end of the page, it would be more flexible to 
store a bunch of self-contained posting list "segments" on the page. So 
I propose that we get rid of the item-indexes, and instead store a bunch 
of independent posting lists on the page:


typedef struct
{
ItemPointerData first;   /* first item in this segment (unpacked) */
uint16  nwords;  /* number of words that follow */
uint64  words[1];/* var length */
} PostingListSegment;

Each segment can be encoded and decoded independently. When searching 
for a particular item (like on insertion), you skip over segments where 
'first' > the item you're searching for.
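
To illustrate the skipping, a minimal sketch of a segment search, assuming
the segments are stored back-to-back on the page (the helper name and the
exact layout are illustrative, not from the patch):

/*
 * Illustrative only: walk back-to-back PostingListSegments in the range
 * [start, end) and return the last segment whose 'first' item does not
 * exceed the searched item; that is the only segment that can contain it.
 */
static PostingListSegment *
find_segment(Pointer start, Pointer end, ItemPointer searched)
{
    Pointer     ptr = start;
    PostingListSegment *candidate = NULL;

    while (ptr < end)
    {
        PostingListSegment *seg = (PostingListSegment *) ptr;

        /* segments are ordered by 'first'; stop once we have gone past */
        if (ItemPointerCompare(&seg->first, searched) > 0)
            break;

        candidate = seg;
        ptr += offsetof(PostingListSegment, words) +
            seg->nwords * sizeof(uint64);
    }

    return candidate;           /* NULL if searched < first segment */
}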


This format offers a lot more flexibility compared to the separate item 
indexes. First, we don't need to have another fixed sized area on the 
page, which simplifies the page format. Second, we can more easily 
re-encode only one segment on the page, on insertion or vacuum. The 
latter is particularly important with the Simple-9 encoding, which 
operates one word at a time rather than one item at a time; removing or 
inserting an item in the middle can require a complete re-encoding of 
everything that follows. Third, when a page is being inserted into and 
contains only uncompressed items, you don't waste any space for unused 
item indexes.
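
As background on the encoding being discussed: classic Simple-9 packs a
variable number of equal-width codes into each 32-bit word behind a 4-bit
selector. The variant mentioned above works on 64-bit words with
hand-picked selectors, but the principle is the same; this sketch shows
the classic 32-bit decoder, not the patch's actual code.

/*
 * Classic Simple-9: a 4-bit selector in the top bits of each 32-bit word
 * says how many equal-width codes the remaining 28 payload bits hold.
 * Assumes a valid selector in the range 0..8.
 */
static const struct
{
    int         n;              /* codes per word */
    int         bits;           /* width of each code */
} simple9_modes[] = {
    {28, 1}, {14, 2}, {9, 3}, {7, 4}, {5, 5}, {4, 7}, {3, 9}, {2, 14}, {1, 28}
};

/* Decode one word into 'out' (room for 28 values); returns the count. */
static int
simple9_decode(uint32 word, uint32 *out)
{
    int         selector = word >> 28;
    int         n = simple9_modes[selector].n;
    int         bits = simple9_modes[selector].bits;
    uint32      mask = (1u << bits) - 1;
    int         i;

    for (i = 0; i < n; i++)
        out[i] = (word >> (i * bits)) & mask;

    return n;
}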


While we're at it, I think we should use the above struct in the inline 
posting lists stored directly in entry tuples. That wastes a few bytes 
compared to the current approach in the patch (more alignment, and 
'words' is redundant with the number of items stored on the tuple 
header), but it simplifies the functions handling these lists.


- Heikki




Re: SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)

2013-12-18 Thread Stephen Frost
* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
> Here's my attempt:
> 
> # Inline Extension, Extension Templates
> 
> The problem with *Inline Extension* is the dump and restore policy. The
> contents of an extension are not to be found in a `pg_dump` script, ever.

You keep coming back to this and I think you're taking too narrow a view
to the comments made on the prior threads.  No, we don't really want
extensions which have .sql files out on disk somewhere as part of
them to be dumped out through pg_dump because then it becomes unclear
which set of scripts should be used during restore.  What we're talking
about here is intended to not have that issue by using a different
namespace, a flag, something which identifies these extensions as being
defined through the catalog instead.

> # The new thing™
> 
> A set of SQL objects that can be managed wholesale, with a version string
> attached to it. Objects are part of `pg_dump` output, and the whole set can
> be relocatable.

I'd like to see more than just a single version string included and I
think that'd be beneficial for extensions too.

> Name:
[...]
> I'll pick UNIT here.

We can figure that later.

> Commands:
> 
> CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...];
> 
> WITH UNIT name;
>   
> END UNIT name;

Interesting approach- I had considered something similar by having a
'fake' schema created into which you built up the 'UNIT'.  The reason I
was thinking schema instead of begin/end style commands, as you have
above, is because of questions around transactions.  Do you think the
syntax you have here would require the definition to be all inside of a
single transaction?  Do we feel that would even be an issue or perhaps
that it *should* be done that way?  I don't currently have any strong
feelings one way or the other on this and I'm curious what others
think.

> The `UPDATE TO` command only sets a new version string.

So, one of the things I had been wondering about is if we could provide
a 'diff' tool.  Using your 'WITH UNIT' syntax above, an author might
need to only write their initial implementation, build up a 'UNIT' based
on it, then adjust that implementation with another 'WITH UNIT' clause
and then ask PG for the differences.  It's not clear if we could make
that work but there is definitely a set of desirable capabilities out
there, which some other databases have, around automated upgrade script
building and doing schema differences.

> # Implementation details
> # Event Trigger support

Not sure we're really ready to get into these yet.

> The main drawback is that rather than building on extensions, both in a
> technical way and in building user trust, we are basically going to
> deprecate extensions entirely, giving them a new name and an incompatible way
> to manage them.

I don't see this as ending up deprecating extensions, even if we build a
new thing with a new name.  I would argue that properly supported
extensions, such as those in contrib and the other 'main' ones, like
PostGIS, and others that have any external dependencies (eg: FDWs) would
almost certainly continue as extensions and would be packaged through
the normal OS packaging systems.  While you plan to use the event
trigger mechanism to build something on top of this which tries to act
like extensions-but-not, I think that's an extremely narrow and limited
use-case that very few people would have any interest in or use.

> Basically with building `UNIT` we realise with hindsight that we failed to
> build a proper `EXTENSION` system, and we send that message to our users.

Little difficult to draw conclusions about what our 'hindsight' will
look like.

Thanks,

Stephen




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 06:00 AM, Heikki Linnakangas wrote:



If you don't force everything to run in SSI mode, then you have to 
somehow figure out what parts do need to run in SSI mode to enforce 
the assertion. For example, if the assertion refers to tables A and B, 
perhaps you can get away without predicate locks on table C?






But the assertion might simply run a function. For non-trivial cases 
that's what I would expect people to do.


cheers

andrew





Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 03:35 AM, Tatsuo Ishii wrote:

3) initdb.c

It seems the memory allocated for autoconflines[0] and
autoconflines[1] by pg_strdup is never freed.

(I think there's similar problem with "conflines" as well, though it
was not introduced by the patch).




Why would we care? initdb doesn't run for very long, and the memory will 
be freed when it exits, usually within a few seconds. My recollection 
from back when I originally rewrote initdb in C was that cleaning up the 
memory leaks wasn't worth the trouble.


cheers

andrew




Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Stephen Frost
* Craig Ringer (cr...@2ndquadrant.com) wrote:
> On 12/12/2013 02:51 AM, Tom Lane wrote:
> > The thing that I'm wondering is why the database would be the right place
> > to be measuring it at all.  If you've got a network usage problem,
> > aggregate usage across everything on the server is probably what you
> > need to be worried about, and PG can't tell you that.
> 
> I suspect this feature would be useful for when you want to try to drill
> down and figure out what's having network issues - specifically, to
> associate network behaviour with individual queries, individual users,
> application_name, etc.
> 
> One sometimes faces the same issue with I/O: I know PostgreSQL is doing
> lots of I/O, but what exactly is causing the I/O? Especially if you
> can't catch it at the time it happens, it can be quite tricky to go from
> "there's lots of I/O" to "this query changed from using synchronized
> seqscans to doing an index-only scan that's hammering the cache".

Agreed.  My other thought on this is that there's a lot to be said for
having everything you need available through one tool- kinda like how
Emacs users rarely go outside of it.. :)  And then there's also the
consideration that DBAs may not have access to the host system at all,
or not to the level needed to do similar analysis there.

Thanks,

Stephen




Re: [HACKERS] sepgsql: label regression test failed

2013-12-18 Thread Sergey Muraviov
# semodule -l | grep sepgsql
sepgsql-regtest 1.07

Full list of modules is in attachment.


2013/12/18 Kohei KaiGai 

> Could you show me semodule -l on your environment?
> I believe security policy has not been changed between F19 and F20...
>
> Thanks,
>
> 2013/12/18 Sergey Muraviov :
> > Hi
> >
> > I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20
> and
> > met with a label regression test failure.
> >
> > PS
> > I've got some warning during build process.
> >
> > --
> > Best regards,
> > Sergey Muraviov
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers
> >
>
>
>
> --
> KaiGai Kohei 
>



-- 
Best regards,
Sergey Muraviov


modules
Description: Binary data



Re: [HACKERS] sepgsql: label regression test failed

2013-12-18 Thread Kohei KaiGai
Could you show me semodule -l on your environment?
I believe security policy has not been changed between F19 and F20...

Thanks,

2013/12/18 Sergey Muraviov :
> Hi
>
> I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20 and
> met with a label regression test failure.
>
> PS
> I've got some warning during build process.
>
> --
> Best regards,
> Sergey Muraviov
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
KaiGai Kohei 




[HACKERS] sepgsql: label regression test failed

2013-12-18 Thread Sergey Muraviov
Hi

I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20
and met with a label regression test failure.

PS
I've got some warning during build process.

-- 
Best regards,
Sergey Muraviov


regression.out
Description: Binary data


regression.diffs
Description: Binary data


warnings
Description: Binary data



Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Amit Kapila
On Wed, Dec 18, 2013 at 2:05 PM, Tatsuo Ishii  wrote:
> Hi,
>
> I have looked into this because it's marked as "ready for committer".
   Thank you.
> It looks like working as advertised. Great! However I have noticed a
> few minor issues.
>
> 1) validate_conf_option
>
> +/*
> + * Validates configuration parameter and value, by calling check hook 
> functions
> + * depending on record's vartype. It validates if the parameter
> + * value given is in range of expected predefined value for that parameter.
> + *
> + * freemem - true indicates memory for newval and newextra will be
> + *  freed in this function, false indicates it will be 
> freed
> + *  by caller.
> + * Return value:
> + * 1: the value is valid
> + * 0: the name or value is invalid
> + */
> +int
> +validate_conf_option(struct config_generic * record, const char *name,
> +const char *value, GucSource source, 
> int elevel,
> +bool freemem, void *newval, void 
> **newextra)
>
> Is there any reason for the function to return int, as it always returns
> 0 or 1? Maybe returning bool is better?

No, return type should be bool, I have changed the same in attached patch.

> 2) initdb.c
>
> +   strcpy(tempautobuf, "# Do not edit this file manually! \n");
> +   autoconflines[0] = pg_strdup(tempautobuf);
> +   strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM 
> command. \n");
> +   autoconflines[1] = pg_strdup(tempautobuf);
>
> Is there any reason to use "tempautobuf" here? I think we can simply change 
> to this:
>
> +   autoconflines[0] = pg_strdup("# Do not edit this file manually! \n");
> +   autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER 
> SYSTEM command. \n");

You are right, I have changed code as per your suggestion.

> 3) initdb.c
>
> It seems the memory allocated for autoconflines[0] and
> autoconflines[1] by pg_strdup is never freed.

I think it gets freed in writefile(), in the code below.

for (line = lines; *line != NULL; line++)
{
if (fputs(*line, out_file) < 0)
{
fprintf(stderr, _("%s: could not write file \"%s\": %s\n"),
progname, path, strerror(errno));
exit_nicely();
}
free(*line);
}


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


alter_system_v13.patch
Description: Binary data



Re: [HACKERS] commit fest 2013-11 final report

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 7:14 PM, Peter Eisentraut  wrote:
> On 12/17/13, 10:19 AM, Tom Lane wrote:
>> Perhaps we should just move all the Needs Review and RFC patches forward
>> to the next fest, so we don't forget about them?
>
> This was done the last few times, but it has caused some controversy.
> One problem was that a number of patches arrived in this commit fest
> without either the author or the reviewers knowing about it, which
> caused the already somewhat stale patch to become completely abandoned.
>
> I think what I'll do is send an email to each of the affected patch
> threads describing the situation.  But I'd like someone involved in the
> patch, either author or reviewer, to make the final call about moving
> the patch forward.

+1.  And thanks for your work on this CommitFest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Alexander Korotkov
On Sat, Dec 14, 2013 at 11:47 PM, Andreas Karlsson wrote:

> On 12/14/2013 10:59 AM, Alexander Korotkov wrote:
>
>> This patch allows to use index for order-by if order-by clause and index
>> has non-empty common prefix. So, index gives right ordering for first n
>> order-by columns. In order to provide right order for rest m columns,
>> sort node is inserted. This sort node sorts groups of tuples where
>> values of first n order-by columns are equal.
>>
>
> I recently looked at the same problem. I see that you solved the
> rescanning problem by simply forcing the sort to be redone on
> ExecReScanSort if you have done a partial sort.
>

Naturally, I'm not sure I solved it at all :) I just got a version of the
patch working for very limited use-cases.


> My idea for a solution was to modify tuplesort to allow storing the
> already sorted keys in either memtuples or the sort result file, but
> setting a field so it does not sort the already sorted tuples again. This
> would allow the rescan to work as it used to, but I am unsure how clean or
> ugly this code would be. Was this something you considered?


I'm not sure. I believe the best answer depends on particular parameters:
how much memory we have for the sort, how expensive the underlying node is
and how it performs rescans, and how big the groups in the partial sort are.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 01:50 PM, Andres Freund wrote:

On 2013-12-18 13:46:59 +0200, Heikki Linnakangas wrote:

On 12/18/2013 01:39 PM, Andres Freund wrote:

On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:

Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes
with that snapshot.
I think that would work, and would be simple, although it wouldn't scale too
well.


It probably would also be very prone to deadlocks.


Hmm, since this would happen at commit, you would know all the assertions
that need to be checked at that point. You could check them e.g in Oid order
to avoid deadlocks.


I think real problem are the lock upgrades, because eventual DML will
have acquired less heavy locks.


Ah, I see. You don't need to block anyone else from modifying the table, 
you just need to block anyone else from committing a transaction that 
had modified the table. So the locks shouldn't interfere with regular 
table locks. A ShareUpdateExclusiveLock on the assertion should do it.
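
Roughly, the commit sequence being sketched here could look something like
the following (names such as AssertionRelationId and run_assertion_check
are invented for illustration; only the snapshot and lock calls are real
PostgreSQL APIs):

/*
 * Very rough sketch of the commit-time re-check idea: lock the relevant
 * assertions in Oid order, then re-run each check with a fresh snapshot.
 */
static void
recheck_assertions_at_commit(List *assertion_oids)  /* assumed Oid-sorted */
{
    ListCell   *lc;

    foreach(lc, assertion_oids)
    {
        Oid         assertOid = lfirst_oid(lc);

        /* serialize committers that touched the same assertion */
        LockSharedObject(AssertionRelationId, assertOid, 0,
                         ShareUpdateExclusiveLock);

        PushActiveSnapshot(GetLatestSnapshot());
        if (!run_assertion_check(assertOid))    /* hypothetical helper */
            ereport(ERROR,
                    (errcode(ERRCODE_CHECK_VIOLATION),
                     errmsg("assertion %u violated at commit", assertOid)));
        PopActiveSnapshot();
    }
    /* the locks are held until the transaction ends */
}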


- Heikki




Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Alexander Korotkov
On Sat, Dec 14, 2013 at 7:04 PM, Andres Freund wrote:

> Hi,
>
> > > >  Limit  (cost=69214.06..69214.08 rows=10 width=16) (actual
> > > > time=0.097..0.099 rows=10 loops=1)
> > > >->  Sort  (cost=69214.06..71714.06 rows=100 width=16) (actual
> > > > time=0.096..0.097 rows=10 loops=1)
> > > >  Sort Key: v1, v2
> > > >  Sort Method: top-N heapsort  Memory: 25kB
> > > >  ->  Index Scan using test_v1_idx on test
>  (cost=0.42..47604.42
> > > > rows=100 width=16) (actual time=0.017..0.066 rows=56 loops=1)
> > > >  Total runtime: 0.125 ms
> > > > (6 rows)
> > >
> > > Is that actually all that beneficial when sorting with a bog standard
> > > qsort() since that doesn't generally benefit from data being
> pre-sorted?
> > > I think we might need to switch to a different algorithm to really
> > > benefit from mostly pre-sorted input.
> > >
> >
> > In this patch I don't do a full sort of the dataset. For instance, the
> > index returns data ordered by the first column and we need to order it
> > also by the second column.
>
> Ah, that makes sense.
>
> > But I don't think we should expect pre-sorted values of the second column
> > inside a group.
>
> Yes, if you do it that way, there doesn't seem to any need to assume
> that any more than we usually do.
>
> I think you should make the explain output reflect the fact that we're
> assuming v1 is presorted and just sorting v2. I'd be happy enough with:
> Sort Key: v1, v2
> Partial Sort: v2
> or even just
> "Partial Sort Key: [v1,] v2"
> but I am sure others disagree.
>

Sure, I just haven't changed the explain output yet. It should look like
what you propose.

> > > *partial-knn-1.patch*
>
> > > Rechecking from the heap means adding a sort node though, which I don't
> > > see here? Or am I misunderstanding something?
>
> > KNN-GiST contains an RB-tree of scanned items. In this patch the item is
> > rechecked inside GiST and reinserted into the same RB-tree. It appears to
> > be a much easier implementation for a PoC and also looks very efficient.
> > I'm not sure what is actually the right design for it. This is what I'd
> > like to discuss.
>
> I don't have enough clue about gist to say whether it's the right design,
> but it doesn't look wrong to my eyes. It'd probably be useful to export
> the knowledge that we are rechecking and how often that happens to the
> outside.
> While I didn't really look into the patch, I noticed in passing that you
> pass an all_dead variable to heap_hot_search_buffer without using the
> result - just pass NULL instead; that performs a bit less work.


Useful notice, thanks.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Alexander Korotkov
On Sat, Dec 14, 2013 at 6:39 PM, Martijn van Oosterhout
wrote:

> On Sat, Dec 14, 2013 at 06:21:18PM +0400, Alexander Korotkov wrote:
> > > Is that actually all that beneficial when sorting with a bog standard
> > > qsort() since that doesn't generally benefit from data being
> pre-sorted?
> > > I think we might need to switch to a different algorithm to really
> > > benefit from mostly pre-sorted input.
> > >
> >
> > In this patch I don't do a full sort of the dataset. For instance, the
> > index returns data ordered by the first column and we need to order it
> > also by the second column. Then this node sorts groups (assumed to be
> > small) where values of the first column are the same, by the value of the
> > second column. And with a LIMIT clause only the required number of such
> > groups will be processed. But I don't think we should expect pre-sorted
> > values of the second column inside a group.
>
> Nice. I imagine this would be mostly beneficial for fast-start plans,
> since you no longer need to sort the whole table prior to returning the
> first tuple.
>
> Reduced memory usage might be a factor, especially for large sorts
> where you otherwise might need to spool to disk.
>
> You can now use an index on (a) to improve sorting for (a,b).
>
> Cost of sorting n groups of size l goes from O(nl log nl) to just O(nl
> log l), useful for large n.
>

Agree. Your reasoning looks correct.


> Minor comments:
>
> I find cmpTuple a bad name. That's what it's doing but perhaps
> cmpSkipColumns would be clearer.
>
> I think it's worthwhile adding a separate path for the skipCols = 0
> case, to avoid extra copies.
>

Thanks. I'll take care of that.
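
For readers following the thread, a sketch of what such a skip-columns
group-boundary test could look like. All names are invented, and
datumIsEqual is a simplification (it only checks binary equality); a real
version would go through the sort keys' comparison functions.

/*
 * Illustrative group-boundary test for partial sort: two tuples belong
 * to the same group iff they agree on the first skipCols sort columns.
 */
static bool
same_sort_group(TupleTableSlot *a, TupleTableSlot *b,
                int skipCols, const AttrNumber *sortColIdx,
                const bool *col_byval, const int16 *col_typlen)
{
    int         i;

    for (i = 0; i < skipCols; i++)
    {
        bool        anull,
                    bnull;
        Datum       da = slot_getattr(a, sortColIdx[i], &anull);
        Datum       db = slot_getattr(b, sortColIdx[i], &bnull);

        if (anull != bnull)
            return false;
        if (anull)
            continue;           /* both NULL: treat as equal */
        if (!datumIsEqual(da, db, col_byval[i], col_typlen[i]))
            return false;
    }

    return true;                /* prefix matches: same group */
}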

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 13:46:59 +0200, Heikki Linnakangas wrote:
> On 12/18/2013 01:39 PM, Andres Freund wrote:
> >On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:
> >>Here's another idea that doesn't involve SSI:
> >>
> >>At COMMIT, take a new snapshot and check that the assertion still passes
> >>with that snapshot.
> >>I think that would work, and would be simple, although it wouldn't scale too
> >>well.
> >
> >It probably would also be very prone to deadlocks.
> 
> Hmm, since this would happen at commit, you would know all the assertions
> that need to be checked at that point. You could check them e.g in Oid order
> to avoid deadlocks.

I think real problem are the lock upgrades, because eventual DML will
have acquired less heavy locks.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] hstore ng index for <@ ?

2013-12-18 Thread Alexander Korotkov
Hi!

On Wed, Dec 18, 2013 at 2:35 PM, Kaare Rasmussen  wrote:

> In many ways the new hstore (and perhaps json) format looks very exciting.
> But will there ever be (GIN/GIST) index support for <@ ?


It looks not hard to do with GiST. For GIN I don't have promising ideas:
likely we can't use GIN effectively for such kinds of queries.
I believe it's not implemented because it wasn't requested yet. Could you
share your use-case?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 01:39 PM, Andres Freund wrote:

On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:

Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes
with that snapshot. Now, there's a race condition, if another transaction is
committing at the same time, and performs the same check concurrently. That
race condition can be eliminated by holding an exclusive lock while running
the assertion, until commit, effectively allowing the assertion to be
checked by only one transaction at a time.

I think that would work, and would be simple, although it wouldn't scale too
well.


It probably would also be very prone to deadlocks.


Hmm, since this would happen at commit, you would know all the 
assertions that need to be checked at that point. You could check them 
e.g in Oid order to avoid deadlocks.


- Heikki




Re: [HACKERS] GIN improvements part 1: additional information

2013-12-18 Thread Alexander Korotkov
On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas  wrote:

> On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
>
>> On Mon, Dec 16, 2013 at 3:30 PM, Heikki Linnakangas
>> <hlinnakan...@vmware.com> wrote:
>>
>>
>>  On 12/12/2013 06:44 PM, Alexander Korotkov wrote:
>>>
>>>   When values are packed into small groups, we have to either insert
>>>   inefficiently encoded value or re-encode whole right part of values.
>>>
>>> It would probably be simplest to store newly inserted items uncompressed,
>>> in a separate area in the page. For example, grow the list of
>>> uncompressed
>>> items downwards from pg_upper, and the compressed items upwards from
>>> pg_lower. When the page fills up, re-encode the whole page.
>>>
>>
> I hacked together an implementation of a variant of Simple9, to see how it
> performs. Insertions are handled per the above scheme.
>
> In a limited pg_trgm test case I've been using a lot for this, this
> reduces the index size about 20%, compared to varbyte encoding. It might be
> possible to squeeze it a bit more, I handcrafted the "selectors" in the
> encoding algorithm to suite our needs, but I don't actually have a good
> idea of how to choose them optimally. Also, the encoding can encode 0
> values, but we never need to do that, so you could take advantage of that
> to pack items tighter.
>
> Compression and decompression speed seems to be about the same.
>
> Patch attached if you want to play with it. WAL replay is still broken,
> and there are probably bugs.
>
>
>  Good idea. But:
>> 1) We'll still need item indexes in the end of page for fast scan.
>>
>
> Sure.
>
>
>>  2) Storage would be easily extendable to hold additional information as
>> well.
>> Better compression shouldn't block more serious improvements.
>>
>
> I'm not sure I agree with that. For all the cases where you don't care
> about additional information - which covers all existing users for example
> - reducing disk size is pretty important. How are you planning to store the
> additional information, and how does using another encoding gets in the way
> of that?


I was planning to store additional-information datums between
varbyte-encoded TIDs. I expected it would be hard to do with PFOR.
However, I don't see significant problems in your implementation of Simple9
encoding. I'm going to dig deeper into your version of the patch.
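
For context, the varbyte encoding mentioned here stores an integer in
7-bit groups with a continuation bit, and posting lists apply it to the
deltas between successive sorted TIDs; the gaps between encoded TIDs are
where the extra datums would go. A generic sketch of the coding itself:

/*
 * Generic varbyte (varint) coding of the kind discussed here: 7 payload
 * bits per byte, high bit set when more bytes follow.
 */
static int
varbyte_encode(uint64 val, unsigned char *buf)
{
    int         len = 0;

    do
    {
        unsigned char b = val & 0x7F;

        val >>= 7;
        if (val != 0)
            b |= 0x80;          /* continuation bit */
        buf[len++] = b;
    } while (val != 0);

    return len;                 /* 1..10 bytes for a 64-bit value */
}

static uint64
varbyte_decode(const unsigned char **ptr)
{
    uint64      val = 0;
    int         shift = 0;
    unsigned char b;

    do
    {
        b = *(*ptr)++;
        val |= (uint64) (b & 0x7F) << shift;
        shift += 7;
    } while (b & 0x80);

    return val;
}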

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Example query causing param_info to be set in plain rel path

2013-12-18 Thread Ashutosh Bapat
I got an example where paths for a plain rel require param_info, i.e. plain
rel scans need to take care of lateral references. Here's the
example from PG regression

explain verbose select v.* from
  (int8_tbl x left join (select q1,(select coalesce(q2,0)) q2 from
int8_tbl) y on x.q2 = y.q1)
  left join int4_tbl z on z.f1 = x.q2,
  lateral (select x.q1,y.q1 from dual union all select x.q2,y.q2 from dual)
v(vx,vy);

There is note in create_scan_plan(), which says,
 324  * If it's a parameterized otherrel, there might be lateral
references
 325  * in the tlist, which need to be replaced with Params.  This
cannot
 326  * happen for regular baserels, though.  Note
use_physical_tlist()
 327  * always fails for otherrels, so we don't need to check this
above.
 328  */

Although it doesn't say why this cannot happen for regular baserels.

So, we do have a testcase which tests this code path as well.


On Mon, Oct 28, 2013 at 12:30 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> No, adding OFFSET there too didn't give the expected result. The lateral
> was handled in subquery and passed as param to the underlying table scan.
>
> I am particularly interested in tables (unlike functions or subqueries)
> since the table scans are shipped to the datanodes and I wanted to test
> the effect of lateral in such cases. OTOH, functions involving access to the
> tables or subqueries are initiated on the coordinators, where lateral gets
> executed in the same way as in PostgreSQL.
>
> If it's so hard to come up with an example query which would cause
> lateral_relids to be set in RelOptInfo of a table, then it's very likely
> that the relevant code is untested in PostgreSQL.
>
>
> On Fri, Oct 25, 2013 at 7:11 PM, Tom Lane  wrote:
>
>> Ashutosh Bapat  writes:
>> > In order to test various cases of LATERAL join in Postgres-XC, I am
>> trying
>> > to find a query where RelOptInof->lateral_relids would get set for plain
>> > base relations.
>>
>> I think you need a lateral reference in a function or VALUES FROM-item.
>> As you say, plain sub-selects are likely to get flattened.  (Possibly
>> if you stuck in a flattening fence such as OFFSET 0, you could get the
>> case to happen with a sub-select FROM item, but I'm too lazy to check.)
>>
>> regards, tom lane
>>
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:
> Here's another idea that doesn't involve SSI:
> 
> At COMMIT, take a new snapshot and check that the assertion still passes
> with that snapshot. Now, there's a race condition, if another transaction is
> committing at the same time, and performs the same check concurrently. That
> race condition can be eliminated by holding an exclusive lock while running
> the assertion, until commit, effectively allowing the assertion to be
> checked by only one transaction at a time.
> 
> I think that would work, and would be simple, although it wouldn't scale too
> well.

It probably would also be very prone to deadlocks.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 02:59 AM, Josh Berkus wrote:

On 12/17/2013 01:42 PM, Kevin Grittner wrote:

It works fine as long as you set default_transaction_isolation =
'serializable' and never override that.  :-)  Of course, it sure
would be nice to have a way to prohibit overrides, but that's
another issue.

Otherwise it is hard to see how to make it work in a general way
without a mutually exclusive lock mode on the table for the
duration of any transaction which modifies the table.


Serializable or not, *what* do we lock for assertions?  It's not rows.
Tables?  Which tables?  What if the assertion is an interpreted language
function?  Does the SSI reference counter really take care of all of this?


Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes 
with that snapshot. Now, there's a race condition, if another 
transaction is committing at the same time, and performs the same check 
concurrently. That race condition can be eliminated by holding an 
exclusive lock while running the assertion, until commit, effectively 
allowing the assertion to be checked by only one transaction at a time.


I think that would work, and would be simple, although it wouldn't scale 
too well.


- Heikki




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 02:59 AM, Josh Berkus wrote:

On 12/17/2013 01:42 PM, Kevin Grittner wrote:

Josh Berkus  wrote:

Going back over this patch, I haven't seen any further discussion of the
point Heikki raises above, which seems like a bit of a showstopper.

Heikki, did you have specific ideas on how to solve this?  Right now my
mind boggles.


It works fine as long as you set default_transaction_isolation =
'serializable' and never override that.  :-)  Of course, it sure
would be nice to have a way to prohibit overrides, but that's
another issue.

Otherwise it is hard to see how to make it work in a general way
without a mutually exclusive lock mode on the table for the
duration of any transaction which modifies the table.


Serializable or not, *what* do we lock for assertions?  It's not rows.
Tables?  Which tables?  What if the assertion is an interpreted language
function?  Does the SSI reference counter really take care of all of this?


SSI does make everything rosy, as long as all the transactions 
participate in it. The open questions revolve around what happens if a 
transaction is not running in SSI mode.


If you force everyone to run in SSI as soon as you create even a single 
assertion in your database, that's kind of crappy for performance. Also, 
if one user creates an assertion, it becomes a funny kind of a partial 
denial of service attack to other users, if you force other user's to 
also run in SSI mode.


If you don't force everything to run in SSI mode, then you have to 
somehow figure out what parts do need to run in SSI mode to enforce the 
assertion. For example, if the assertion refers to tables A and B, perhaps 
you can get away without predicate locks on table C?


- Heikki




Re: [HACKERS] Extension Templates S03E11

2013-12-18 Thread Dimitri Fontaine
Tom Lane  writes:
> I keep telling you this, and it keeps not sinking in.

How can you say that? I've been spending a couple of years on designing
and implementing and arguing for a complete feature set where dealing
with modules is avoided at all costs.

The problem we have now is that I'm being told that the current feature
is rejected if it includes anything about modules, and not interesting
enough if it's not dealing with modules.

I tried my best to make it so that nothing in-core changes wrt modules,
yet finding out-of-core solutions to still cope with that. It's a
failure, ok.

I think we need a conclusion on this thread: Extension specs are frozen.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[HACKERS] hstore ng index for <@ ?

2013-12-18 Thread Kaare Rasmussen

Hi

In many ways the new hstore (and perhaps json) format looks very 
exciting. But will there ever be (GIN/GIST) index support for <@ ?
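
For example, the kind of query I have in mind, assuming a table docs 
with an hstore column attrs (contained-by, as opposed to the already 
indexable @>):

    CREATE INDEX docs_attrs_idx ON docs USING gin (attrs);
    SELECT * FROM docs WHERE attrs <@ 'a=>1, b=>2'::hstore;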






SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)

2013-12-18 Thread Dimitri Fontaine
Simon Riggs  writes:
> On 17 December 2013 23:42, Tom Lane  wrote:
>>> We aim to have the simplest implementation that meets the stated need
>>> and reasonable extrapolations of that. Text in a catalog table is the
>>> simplest implementation. That is not a reason to reject it, especially
>>> when we aren't suggesting a viable alternative.
>>
>> The first part of this assertion is debatable, and the claim that no
>> viable alternative has been suggested is outright wrong.

With due respect, it's only wrong when you buy into implementing
something new rather than improving extensions.

> Sounds like we have a way forward for this feature then, just not with
> the current patch.
>
> Can someone attempt to summarise the way forward, with any caveats and
> necessary restrictions? It would save further column inches of debate.

Here's my attempt:

# Inline Extension, Extension Templates

The problem with *Inline Extension* is the dump and restore policy: the
contents of an extension are never to be found in a `pg_dump` script.

The problem with *Extension Templates* is that we store the extension
scripts (plain text blobs) in the catalogs, where we already have the full
SQL objects and tools (such as `pg_dump` and `pg_depend`) to manipulate and
introspect them.

# The new thing™

A set of SQL objects that can be managed wholesale, with a version string
attached. The objects are part of `pg_dump` output, and the whole set can
be relocatable.

Name:

  - not `PACKAGE`, Oracle
  - not `MODULE`, that's already the name of a .so file
  - not `SYSTEM`, already something else
  - `BUNDLE`
  - `LIBRARY`
  - `UNIT`

I'll pick UNIT here.
  
Commands:

CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...];

WITH UNIT name;
  
END UNIT name;

ALTER UNIT name OWNER TO <role>;
ALTER UNIT name ADD <object>;
ALTER UNIT name DROP <object>;
ALTER UNIT name SET SCHEMA <schema>;
ALTER UNIT name UPDATE TO <version string>;
ALTER UNIT name SET [ NOT ] RELOCATABLE;
ALTER UNIT name REQUIRE a, b, c;

COMMENT ON UNIT name IS '';

DROP UNIT name [ CASCADE ];

The `UPDATE TO` command only sets a new version string.

# Implementation details

We need a new `pg_unit` catalog that looks almost exactly like
`pg_extension`, except for the `extconfig` and `extcondition` fields.

We need a way to `recordDependencyOnCurrentUnit()`, so another pair of
static variables `creating_unit` and `CurrentUnitObject`. Every command
that creates objects must be made aware of the new `UNIT` concept,
including `CREATE EXTENSION`.

The `pg_dump` dependencies have to be set so that all the objects are
restored independently first, as they are today, and only then do we issue
`CREATE UNIT` and a bunch of `ALTER UNIT ADD` commands, one per object.

# Event Trigger support

Event Triggers are to be provided for all the `UNIT` commands.

# Life with Extensions and Units

PostgreSQL now includes two different ways to package SQL objects, with
about the same feature set. The only difference is the `pg_restore`
behavior: *Extensions* are re-created from external resources, *Units* are
re-created from what's in the dump.

The smarts of `ALTER EXTENSION ... UPDATE` are not available when dealing
with *UNITs*, leaving the user or the client scripts to handle that
entirely on their own.

In principle, a client can prepare a SQL script from a PGXN distribution and
apply it surrounded by `WITH UNIT` and `END UNIT` commands.
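
For instance, a sketch only, using the command names proposed above and
the `pair` extension from PGXN as the payload:

    CREATE UNIT pair SCHEMA pair NOT RELOCATABLE;
    WITH UNIT pair;
        CREATE TYPE pair AS (k text, v text);
        CREATE FUNCTION pair(text, text) RETURNS pair
            LANGUAGE sql AS 'SELECT ROW($1, $2)::pair';
    END UNIT pair;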

Upgrade scripts, once identified, can be run as straight SQL, adding a
simple `ALTER UNIT ... UPDATE TO ...` command before the `COMMIT` at the
end of the script. Identifying the upgrade script(s) may require
implementing current Extension update smarts into whatever client side
program is going to be built to support installing from PGXN etc.
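
That is, upgrading the `pair` unit to 1.1 would be driven entirely from
the client side, along these lines (again a sketch):

    BEGIN;
    ALTER TYPE pair ADD ATTRIBUTE note text;   -- the upgrade script proper
    ALTER UNIT pair UPDATE TO '1.1';
    COMMIT;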

# Conclusion

The main advantage of the `UNIT` proposal is that it copes very well with
relations and other usual schema objects, as the data are preserved at
`pg_restore` time.

A `UNIT` can also entirely replace an `EXTENSION`, including when it needs a
*module*, provided that the *module* is made available on the server's file
system before creating the functions in `LANGUAGE C` that depend on it.

It is possible to build a *UNIT distribution network* where client
software drives the installation of SQL objects within a UNIT; this
client software needs to include UNIT update smarts too. It's also
possible to build that software as a set of Event Triggers on the
`CREATE UNIT` and `ALTER UNIT UPDATE TO` commands, as sketched below.
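
A sketch of that, assuming the proposed commands fire ordinary event
triggers with tags such as 'CREATE UNIT':

    CREATE FUNCTION report_unit_command() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        RAISE NOTICE 'unit command: %', tg_tag;
    END;
    $$;

    CREATE EVENT TRIGGER track_units ON ddl_command_end
        WHEN TAG IN ('CREATE UNIT', 'ALTER UNIT')
        EXECUTE PROCEDURE report_unit_command();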

# Analysis

The main drawback is that rather than building on extensions, both
technically and in terms of user trust, we are basically going to
deprecate extensions entirely, giving them a new name and an incompatible
way to manage them.

Only *contribs* are going to be shipped as extensions, as they are basically
the only known extensions following the same delivery rules as the
PostgreSQL core p

Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Craig Ringer
On 12/12/2013 02:51 AM, Tom Lane wrote:
> The thing that I'm wondering is why the database would be the right place
> to be measuring it at all.  If you've got a network usage problem,
> aggregate usage across everything on the server is probably what you
> need to be worried about, and PG can't tell you that.

I suspect this feature would be useful for when you want to try to drill
down and figure out what's having network issues - specifically, to
associate network behaviour with individual queries, individual users,
application_name, etc.

One sometimes faces the same issue with I/O: I know PostgreSQL is doing
lots of I/O, but what exactly is causing the I/O? Especially if you
can't catch it at the time it happens, it can be quite tricky to go from
"there's lots of I/O" to "this query changed from using synchronized
seqscans to doing an index-only scan that's hammering the cache".
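
For example, if the counters were exposed through a per-backend view
(call it pg_stat_network here, purely as a strawman; the name and the
columns are not taken from the WIP patch), the drill-down could be as
simple as:

    SELECT usename, application_name, bytes_sent, bytes_received
      FROM pg_stat_network
     ORDER BY bytes_sent DESC
     LIMIT 10;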

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Problem with displaying "wide" tables in psql

2013-12-18 Thread Sergey Muraviov
Hello

2013/12/18 Sameer Thakur 

> On Wed, Dec 11, 2013 at 11:13 PM, Sergey Muraviov
>  wrote:
> > Hi.
> >
> > I've improved the patch.
> > It works in expanded mode when either format option is set to wrapped
> (\pset
> > format wrapped), or we have no pager, or pager doesn't chop long lines
> (so
> > you can still use the trick).
> > Target output width is taken from either columns option (\pset columns
> 70),
> > or environment variable $COLUMNS, or terminal size.
> > And it's also compatible with any border style (\pset border 0|1|2).
> >
> > Here are some examples:
> >
> > postgres=# \x 1
> > postgres=# \pset format wrapped
> > postgres=# \pset border 0
> > postgres=# select * from wide_table;
> > * Record 1
> > value afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf  sadf sa df
> > sadfsadfa
> >   sd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf
> > * Record 2
> > value afadsafasd fasdf asdfasd
> >
> > postgres=# \pset border 1
> > postgres=# \pset columns 70
> > postgres=# select * from wide_table;
> > -[ RECORD 1 ]-
> > value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf  sadf sa
> >   | df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f
> >   |  sadf sad fadsf
> > -[ RECORD 2 ]-
> > value | afadsafasd fasdf asdfasd
> >
> > postgres=# \pset border 2
> > postgres=# \pset columns 60
> > postgres=# select * from wide_table;
> > +-[ RECORD 1 ]-+
> > | value | afadsafasd fasdf asdfasd fsad fas df sadf sad f  |
> > |   | sadf  sadf sa df sadfsadfasd fsad fsa df sadf as |
> > |   | d fa sfd sadfsadf asdf sad f sadf sad fadsf  |
> > +-[ RECORD 2 ]-+
> > | value | afadsafasd fasdf asdfasd |
> > +---+--+
> >
> > Regards,
> > Sergey
> >
>
> The patch applies and compiles cleanly. I tried the following:
> \pset format wrapped
> \pset columns 70.
> In non-expanded mode, select * from wide_table works fine, but
> select * from pg_stats has display problems. Is it that pg_stats
> can be viewed easily only in expanded mode, i.e. if displayed columns
> are wrapped, is there no way to view results in non-expanded mode?
> regards
> Sameer
>

The problem with non-expanded mode is that all column headers have to be
displayed on one line; otherwise it is difficult to match values to their
columns. And I have no idea how to solve this problem.

-- 
Best regards,
Sergey Muraviov


Re: [HACKERS] Proposal: variant of regclass

2013-12-18 Thread Tatsuo Ishii
>> For the pgpool-II use case, I'm happy to follow you because pgpool-II
>> always does a grammatical check (using the raw parser) on a query first,
>> so such syntax problems will be pointed out before we ever reach the
>> point of calling toregclass.
>>
>> One concern is that other users may expect toregclass to detect such
>> syntax problems. Anybody?
> 
> It seems fine to me if the new function ignores the specific error of
> "relation does not exist" while continuing to throw other errors.

Thanks. Here is the revised conceptual patch; an example of the intended
behavior follows. I'm going to post a concrete patch and register it for
the 2014-01 Commit Fest.
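
To illustrate (a sketch of what the patch below aims at):

    SELECT toregclass('no_such_table');   -- returns "-" (InvalidOid), no error
    SELECT 'no_such_table'::regclass;     -- ERROR: relation "no_such_table" does not exist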

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index c24a2c1..0406c30 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -45,6 +45,7 @@ static char *format_operator_internal(Oid operator_oid, bool force_qualify);
 static char *format_procedure_internal(Oid procedure_oid, bool force_qualify);
 static void parseNameAndArgTypes(const char *string, bool allowNone,
 	 List **names, int *nargs, Oid *argtypes);
+static Datum regclass_guts(char *class_name_or_oid, bool raiseError);
 
 
 /*
@@ -804,21 +805,50 @@ Datum
 regclassin(PG_FUNCTION_ARGS)
 {
 	char	   *class_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = regclass_guts(class_name_or_oid, true);
+	PG_RETURN_OID(result);
+}
+
+/*
+ * toregclass		- converts "classname" to class OID
+ *
+ * Difference from regclassin is that this does not throw an error if the
+ * classname does not exist.
+ * Note: this is not an I/O function.
+ */
+Datum
+toregclass(PG_FUNCTION_ARGS)
+{
+	char	   *class_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = regclass_guts(class_name_or_oid, false);
+	PG_RETURN_OID(result);
+}
+
+/*
+ * Guts of regclassin and toregclass.
+ * If raiseError is false, returns InvalidOid upon error.
+ */
+static Datum regclass_guts(char *class_name_or_oid, bool raiseError)
+{
 	Oid			result = InvalidOid;
 	List	   *names;
 
 	/* '-' ? */
 	if (strcmp(class_name_or_oid, "-") == 0)
-		PG_RETURN_OID(InvalidOid);
+		return result;
 
 	/* Numeric OID? */
 	if (class_name_or_oid[0] >= '0' &&
 		class_name_or_oid[0] <= '9' &&
 		strspn(class_name_or_oid, "0123456789") == strlen(class_name_or_oid))
 	{
 		result = DatumGetObjectId(DirectFunctionCall1(oidin,
 		CStringGetDatum(class_name_or_oid)));
-		PG_RETURN_OID(result);
+		return result;
 	}
 
 	/* Else it's a name, possibly schema-qualified */
@@ -848,16 +878,19 @@ regclassin(PG_FUNCTION_ARGS)
 		if (HeapTupleIsValid(tuple = systable_getnext(sysscan)))
 			result = HeapTupleGetOid(tuple);
-		else
+		else if (raiseError)
 			ereport(ERROR,
 	(errcode(ERRCODE_UNDEFINED_TABLE),
 			   errmsg("relation \"%s\" does not exist", class_name_or_oid)));
+		/* with raiseError false, result stays InvalidOid and we fall
+		 * through, so the scan and relation are still closed below */
 
 		/* We assume there can be only one match */
 
 		systable_endscan(sysscan);
 		heap_close(hdesc, AccessShareLock);
 
-		PG_RETURN_OID(result);
+		return result;
 	}
 
 	/*
@@ -865,11 +898,16 @@ regclassin(PG_FUNCTION_ARGS)
 	 * pg_class entries in the current search path.
 	 */
 	names = stringToQualifiedNameList(class_name_or_oid);
+	if (names == NIL)
+		return InvalidOid;
 
 	/* We might not even have permissions on this relation; don't lock it. */
-	result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+	if (raiseError)
+		result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+	else
+		result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, true);
 
-	PG_RETURN_OID(result);
+	return result;
 }
 
 /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0117500..472ccad 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3304,13 +3304,14 @@ DATA(insert OID = 2218 (  regclassin		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2
 DESCR("I/O");
 DATA(insert OID = 2219 (  regclassout		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2275 "2205" _null_ _null_ _null_ _null_ regclassout _null_ _null_ _null_ ));
 DESCR("I/O");
+DATA(insert OID = 3179 (  toregclass		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2205 "2275" _null_ _null_ _null_ _null_ toregclass _null_ _null_ _null_ ));
+DESCR("convert classname to regclass");
 DATA(insert OID = 2220 (  regtypein			PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2206 "2275" _null_ _null_ _null_ _null_ regtypein _null_ _null_ _null_ ));
 DESCR("I/O");
 DATA(insert OID = 2221 (  regtypeout		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2275 "2206" _null_ _null_ _null_ _null_ regtypeout _null_ _null_ _null_ ));
 DESCR(
