Re: [HACKERS] REVIEW: patch: remove redundant code from pl_exec.c

2011-01-19 Thread Pavel Stehule
2011/1/19 Stephen Frost :
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I think we should reject this one.
>
> Works for me.
>
>> Using a switch there is a bit problematic since in some cases you want
>> to use "break" to exit the loop.  We could replace such breaks by gotos,
>> but that would be another strike against the argument that you're making
>> things more readable.  I think the switch in exec_stmt_loop is only
>> workable because it has no cleanup to do, so it can just "return" in
>> places where a loop break would otherwise be needed.  In short: if you
>> want to make these all look alike, better to go the other way.
>
> Ah, yeah, good point.  We do use gotos elsewhere for this reason, might
> consider revisiting those also, if we're trying to do a 'clean-up' patch.
> In any case, I'll mark this as rejected.

ok, I don't think the current code is readable, but I can't do better right now.

Thank you for review.

Regards

Pavel Stehule

>
>        Thanks!
>
>                Stephen
>



Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Heikki Linnakangas

On 20.01.2011 03:05, Kevin Grittner wrote:

If we don't do something like this, do we just provide REPEATABLE
READ on the standby as the strictest level of transaction isolation?
If so, do we generate an error on a request for SERIALIZABLE, warn
and provide degraded behavior, or just quietly give them REPEATABLE
READ behavior?


+1 for generating an error.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] REVIEW: EXPLAIN and nfiltered

2011-01-19 Thread Itagaki Takahiro
On Thu, Jan 20, 2011 at 12:16, Stephen Frost  wrote:
> This patch looked good, in general, to me.  I added a few documentation
> updates and a comment, but it's a very straight-forward patch as far as
> I can tell.  Passes all regressions and my additional testing.

Looks good and useful to me, too.

We need to adjust a bit more documentation. The patch changes all
EXPLAIN ANALYZE output. When I grepped the docs for "loops=", I found
that EXPLAIN ANALYZE output also appears in perform.sgml and
auto-explain.sgml, in addition to explain.sgml.

It would be good to have documentation for the "nfiltered" parameter. The
best place would be near the description of "loops" on the "Using EXPLAIN"
page:

http://developer.postgresql.org/pgdocs/postgres/using-explain.html

-- 
Itagaki Takahiro



Re: [HACKERS] Snapshot synchronization, again...

2011-01-19 Thread Noah Misch
On Wed, Jan 19, 2011 at 12:12:39AM -0500, Joachim Wieland wrote:
> Noah, thank you for this excellent review. I have adopted almost all
> of your suggestions (except for the documentation, which is still
> missing). Please check the updated & attached patch for details.

Great.  I do get an assertion failure with this sequence:

BEGIN; SELECT pg_export_snapshot(); ROLLBACK;
SELECT 1;

TRAP: FailedAssertion("!(RegisteredSnapshots > 0)", File: "snapmgr.c", Line: 430)

Perhaps some cleanup is missing from a ROLLBACK path?

> 
> On Fri, Jan 14, 2011 at 10:13 PM, Noah Misch  wrote:
> > "" is a valid md5 message digest.  To avoid always accepting
> > a snapshot with that digest, we would need to separately store a flag
> > indicating whether a given syncSnapshotChksums member is currently
> > valid.  Maybe that hole is acceptable, though.
> 
> In the end I decided to store md5 checksums as printable characters in
> shmem. We now need a few more bytes for each checksum but as soon as a
> byte is NUL we know that it is not a valid checksum.

Just to clarify, I was envisioning something like:

typedef struct { bool valid; char value[16]; } snapshotChksum;

I don't object to the way you've done it, but someone else might not like the
extra marshalling between text and binary.  Your call.
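
For the archives, a minimal sketch of the two layouts being compared
(hypothetical; neither is verbatim from the patch):

typedef struct
{
    bool    valid;      /* explicit validity flag */
    char    value[16];  /* raw 16-byte md5 digest */
} snapshotChksumBinary;

/*
 * versus printable storage: 32 hex digits plus a terminating NUL,
 * where value[0] == '\0' marks a slot holding no valid checksum
 */
typedef char snapshotChksumText[33];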

> >> +      * Instead we must check to not go forward (we might have opened a cursor
> >> +      * in this transaction and still have its snapshot registered)
> >> +      */
> >
> > I'm thinking this case should yield an ERROR.  Otherwise, our net result
> > would be to silently adopt a snapshot that is neither our old self nor
> > the target.  Maybe there's a use case for that, but none comes to my mind.
> 
> This can happen when you do:
> 
> DECLARE foo CURSOR FOR SELECT * FROM bar;
> import snapshot...
> FETCH 1 FROM foo;

I see now.  You set snapshot->xmin unconditionally, but never move MyProc->xmin
forward, only backward.  Yes; that seems just right.
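
In other words, something like this sketch of the invariant (hypothetical
code, not quoted from the patch):

    /* adopt the imported xmin only if it is older than our current one */
    if (TransactionIdPrecedes(snapshot->xmin, MyProc->xmin))
        MyProc->xmin = snapshot->xmin;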

> > I guess the use case for pg_import_snapshot from READ COMMITTED would be
> > something like "DO $$BEGIN PERFORM pg_import_snapshot('...'); stuff; END$$;".
> > First off, would that work ("stuff" using the imported snapshot)?  If it does
> > work, we should take the precedent of SET LOCAL and issue no warning.  If it
> > doesn't work, then we should document that the snapshot does take effect until
> > the next command and probably keep this warning or something similar.
> 
> No, this will also give you a new snapshot for every query, no matter
> if it is executed within or outside of a DO-Block.

You're right.  Then consider "VALUES (pg_import_snapshot('...'), (SELECT
count(*) from t))" at READ COMMITTED.  It works roughly as I'd guess; the
subquery uses the imported snapshot.  If I flip the two expressions and do
"VALUES ((SELECT count(*) from t), pg_import_snapshot('...'))", the subquery
uses the normal snapshot.  That makes sense, but I can't really see a use case
for it.  A warning, like your code has today, seems appropriate.
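
For the archives, the two orderings discussed (pg_import_snapshot is the
function added by the patch; its argument is elided here):

    -- subquery evaluated after the import, so it sees the imported snapshot
    VALUES (pg_import_snapshot('...'), (SELECT count(*) FROM t));

    -- subquery evaluated first, so it sees the normal snapshot
    VALUES ((SELECT count(*) FROM t), pg_import_snapshot('...'));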

> > Is it valid to scribble directly on snapshots like this?
> 
> I figured that previously executed code still has references pointing
> to the snapshots and so we cannot just push a new snapshot on top but
> really need to change the memory where they are pointing to.

Okay.  Had no special reason to believe otherwise, just didn't know.

> I am also adding a test script that shows the difference of READ
> COMMITTED and SERIALIZABLE in an importing transaction in a DO-Block.
> It's based on the script you sent.

Thanks; that was handy.  One thing I noticed is that the second "SELECT * FROM
kidseen" yields zero rows instead of yielding "ERROR:  relation "kidseen" does
not exist".  I changed things around per the attached "test-drop.pl", such that
the table is already gone in the ordinary snapshot, but still visible to the
imported snapshot.  Note how the pg_class row is visible, but an actual attempt
to query the table fails.  Must some kind of syscache invalidation follow the
snapshot alteration to make this behave normally?

General tests involving only DML seem to do the right thing.  Subtransaction
handling looks sound.  Patch runs cleanly according to Valgrind.

> So thanks again and I'm looking forward to your next review... :-)

Hope this one helps, too.

> diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
> index 8b36df4..29c9426 100644

> + bytea *
> + ExportSnapshot(Snapshot snapshot)
> + {
> + int bufsize = 1024;
> + int bufsize_filled = 0; /* doesn't include NUL byte */
> + int bufsize_left = bufsize - bufsize_filled;
> + char   *buf = (char *) palloc(bufsize);
> + int i;
> + TransactionId  *children;
> + int nchildren;
> + 
> + /* In a subtransact

Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Simone Aiken

> I seem to recall some muttering about teaching genbki to extract such
> comments from the SGML sources or perhaps the C header files.  I tend to
> agree though that it would be a lot more work than it's worth.  And as you
> say, pg_description entries aren't free.

I know I can't do all of the work, since any submission requires review
and so on, but it is worth it to me provided it does no harm to the codebase.

So the only outstanding question is the impact of increased size.

In my experience, size increases related to documentation are almost always
worth it, so I'm prejudiced right out of the gate.  I was wondering whether
every pg_ table gets copied out to every database; if there is already a
mechanism for not replicating all of them, we could use views or rewrite
rules to merge a single copy of the catalog comments, kept in a separate
table, with each deployed database's pg_description entries.

If all catalog descriptions were handled this way it would actually decrease
the size of a deployed database ( by 210K? ) by absorbing the
pg_descriptions that are currently being duplicated.   Since users shouldn't
be messing with them anyway and they are purely for humans to refer to - not
computers to calculate explain plans with -  there shouldn't be anything
inherently wrong with moving static descriptions out of user space.  In
theory at least.  


-Simone Aiken






Re: [HACKERS] ALTER TYPE 1: recheck index-based constraints

2011-01-19 Thread Noah Misch
On Wed, Jan 19, 2011 at 11:50:12PM -0500, Robert Haas wrote:
> On Wed, Jan 12, 2011 at 8:56 AM, Robert Haas  wrote:
> > On Wed, Jan 12, 2011 at 8:14 AM, Noah Misch  wrote:
> >> Something like the attached?
> >
> > I haven't really analyzed in this detail, but yes, approximately
> > something sorta like that.
> 
> [assessment of current uses] So I think the logic is correct and not overly
> complex.

Sounds correct to me.

> I think what might make sense is - instead of adding another Boolean
> argument, change the heap_rebuilt argument to int flags, and define
> REINDEX_CHECK_CONSTRAINTS and REINDEX_SUPPRESS_INDEX_USE as OR-able
> flags.  I think that's more clear about what's actually going on than
> heap_rebuilt/tuples_changed.

There are two distinct questions here:

(1) Should reindex_relation receive boolean facts from its callers by way of two
boolean arguments, or by way of one flags vector?

The former seems best when you want every caller to definitely think about which
answer is right, and the latter when that's not so necessary.  (For a very long
list of options, the flags might be chosen on other grounds.)  As framed, I'd
lean toward keeping distinct arguments, as these are important questions.

However, suppose we inverted both flags, say REINDEX_SKIP_CONSTRAINT_CHECKS and
REINDEX_ALLOW_OLD_INDEX_USE.  Then, flags = 0 can hurt performance but not
correctness.  That's looking like a win.
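
Concretely, a sketch of that inverted scheme (the names are proposals, not
code from the patch):

    #define REINDEX_SKIP_CONSTRAINT_CHECKS  0x01  /* caller knows constraints still hold */
    #define REINDEX_ALLOW_OLD_INDEX_USE     0x02  /* old indexes stay usable during rebuild */

    /*
     * flags = 0 revalidates constraints and suppresses index use:
     * possibly slower than necessary, but never incorrect.
     */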


(2) Should reindex_relation frame its boolean arguments in terms of what the
caller did (heap_rebuilt, tuples_changed), or in terms of what reindex_relation
will be doing (check_constraints, suppress_index_use)?

The former should be the default approach, but it requires that we be able to
frame good names that effectively convey an abstraction.  Prospective callers
must know what to send just by looking at the name and reading the header
comment.  When no prospective name is that expressive and you'll end up reading
the reindex_relation code to see how they play out, then it's better to go with
the latter instead.  A bad abstraction is worse than none at all.

I agree that both heap_rebuilt and tuples_changed are bad abstractions.
TRUNCATE is lying about the former, and the latter, as you say, is never really
correct.  column_values_changed, perhaps.  tuples_could_now_violate_constraints
would be correct, but that's just a bad spelling for REINDEX_CHECK_CONSTRAINTS.
I guess the equivalent long-winded improvement for heap_rebuilt would be
indexes_still_valid_for_snapshotnow.  Eh.

So yes, let's adopt callee-action-focused names like you propose.


Overall, I'd vote for a flags parameter with negative senses like I noted above.
My second preference would be for two boolean parameters, check_constraints and
suppress_index_use.  Not really a big deal to me, though.  (I feel a bit silly
writing this email.)  What do you think?

Thanks,
nm



Re: [HACKERS] sepgsql contrib module

2011-01-19 Thread KaiGai Kohei
(2011/01/20 13:01), Robert Haas wrote:
> 2011/1/19 KaiGai Kohei:
>>>   And how about adding a
>>> ProcessUtility_hook to trap evil non-DML statements that some
>>> nefarious user might issues?
>>>
>> It seems reasonable to me as long as the number of controlled commands
>> is limited. For example, the LOAD command may be a candidate for being
>> controlled without exceptions.
>> However, it will be tough work if the plug-in tries to parse and
>> analyze supplied utility commands by itself.
> 
> I think the key is to either accept or reject the command based on
> very simple criteria - decide based only on the command type, and
> ignore its parameters.
> 
I can understand this idea; however, it is hard to implement such
criteria, because SELinux describes all rules as a relationship
between a client and an object using their labels, so we cannot know
which actions (typically represented by the command tag) are allowed or
denied without resolving the object names.

>> I uploaded my draft here.
>>   http://wiki.postgresql.org/wiki/SEPostgreSQL_Documentation
>>
>> If reasonable, I'll move them into *.sgml style.
> 
> I have yet to review that, but will try to get to it before too much
> more time goes by.
> 
OK, I'll try to translate it into *.sgml format.

>> I may want to simplify the step to installation using an installer
>> script.
> 
> OK, but let's get this nailed down as soon as possible.  Tempus fugit.
> 
I'd like to give higher priority to the ProcessUtility_hook rather
than to the installation script.

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] psql: Add \dL to show languages

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 11:19 PM, Josh Kupershmidt  wrote:
> On Wed, Jan 19, 2011 at 9:09 PM, Robert Haas  wrote:
>> This patch doesn't seem terribly consistent to me - we show the name
>> of the call handler and the name of the validator, but for the inline
>> handler we just indicate whether there is one or not.  That seems like
>> something that we should make consistent, and my vote is to show the
>> name in all cases.
>
> OK, changed. I've shuffled the columns slightly so that handlers and
> Validator columns are next to each other.
>
>> Also, I'm wondering whether the System Language column should be renamed to
>> Internal Language, for consistency with the terminology used here:
>>
>> http://www.postgresql.org/docs/current/static/catalog-pg-language.html
>
> Fixed, updated patch attached. Though, reading the description of
> lanispl on that page, I wonder if "user-defined language" correctly
> describes plpgsql these days, which comes installed by default.

OK, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ALTER TYPE 1: recheck index-based constraints

2011-01-19 Thread Robert Haas
On Wed, Jan 12, 2011 at 8:56 AM, Robert Haas  wrote:
> On Wed, Jan 12, 2011 at 8:14 AM, Noah Misch  wrote:
>> Something like the attached?
>
> I haven't really analyzed in this detail, but yes, approximately
> something sorta like that.

I looked this over some more tonight.  The name "tuples_changed" is
giving me some angst, because if we rewrote the heap... the tuples
changed.  I think what you intend this to indicate whether the tuples
have changed in a semantic sense, ignoring CTIDs and so on.  But it's
not even quite that, either, because ExecuteTruncate() is passing
false, and the set of tuples has probably changed in that case.  It
seems that the cases here are:

1. When ALTER TABLE causes a rewrite, we set heap_rebuilt = true and
tuples_changed = true.  This causes constraints to be revalidated and
suppresses use of indexes while the rebuild is in progress.
2. When VACUUM FULL or CLUSTER causes a rewrite, we set heap_rebuilt =
true and tuples_changed = false.  This avoids constraint revalidation
but still suppresses use of indexes while the rebuild is in progress.
3. When TRUNCATE or REINDEX is invoked, we set heap_rebuilt = false
and tuples_changed = false.  Here we neither revalidate constraints
nor suppress use of indexes while the rebuild is in progress.

The first question I asked myself is whether the above is actually
correct, and whether it's possible to simplify back to two cases.  So:
The REINDEX case should definitely avoid suppressing the use of the
old index while the new one is rebuilt; I'm not really sure it matters
what TRUNCATE does, since we're going to be operating on a non-system
catalog on which we hold AccessExclusiveLock; the VACUUM FULL/CLUSTER
case certainly needs to suppress uses of indexes, because it can be
used on system catalogs, which may need to be used during the rebuild
itself.  Thus #2 and #3 must be distinct.  #1 must be distinct from #2
both for performance reasons and to prevent deadlocks when using
VACUUM FULL or CLUSTER on a system catalog (ALTER TABLE can't be so
used, so it's safe for it to not worry about this problem).  So I
think the logic is correct and not overly complex.

I think what might make sense is - instead of adding another Boolean
argument, change the heap_rebuilt argument to int flags, and define
REINDEX_CHECK_CONSTRAINTS and REINDEX_SUPPRESS_INDEX_USE as OR-able
flags.  I think that's more clear about what's actually going on than
heap_rebuilt/tuples_changed.
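
To spell that out with a hypothetical sketch (flag names as proposed; the
other arguments to reindex_relation stay unchanged):

    #define REINDEX_CHECK_CONSTRAINTS   0x01
    #define REINDEX_SUPPRESS_INDEX_USE  0x02

    /*
     * case 1, ALTER TABLE rewrite:   REINDEX_CHECK_CONSTRAINTS | REINDEX_SUPPRESS_INDEX_USE
     * case 2, VACUUM FULL / CLUSTER: REINDEX_SUPPRESS_INDEX_USE
     * case 3, TRUNCATE and REINDEX:  0
     */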

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Jeff Davis
On Wed, 2011-01-19 at 19:05 -0600, Kevin Grittner wrote:
> If we don't do something like this, do we just provide REPEATABLE
> READ on the standby as the strictest level of transaction isolation?
> If so, do we generate an error on a request for SERIALIZABLE, warn
> and provide degraded behavior, or just quietly give them REPEATABLE
> READ behavior?
>  
> Thoughts?

Hopefully there is a better option available. We don't want to silently
give wrong results.

Maybe we should bring back the compatibility GUC? It could throw an
error unless the user sets the compatibility GUC to turn "serializable"
into "repeatable read".

Regards,
Jeff Davis




Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-19 Thread Fujii Masao
On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander  wrote:
>> Great. Thanks for the quick update!
>>
>> Here are another comments:

Here are comments against the documents. The other code looks good.


It would be helpful to document what needs to be set to allow a pg_basebackup
connection: not only the REPLICATION privilege, but also max_wal_senders and
pg_hba.conf.
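
For instance, a minimal sketch of the settings involved (values are
illustrative only):

    # postgresql.conf on the server
    max_wal_senders = 2

    # pg_hba.conf: allow replication connections for the backup role
    host  replication  postgres  192.168.0.0/24  md5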


+ 
+  Options

Can we list the descriptions of the options in the same order as
"pg_basebackup --help" does?

It would also be helpful to document that the target directory must be
specified and must be empty.


+  
+   The backup will include all files in the data directory and tablespaces,
+   including the configuration files and any additional files placed in the
+   directory by third parties. Only regular files and directories are allowed
+   in the data directory, no symbolic links or special device files.

The latter sentence means that the backup of the database cluster
created by initdb -X is not supported? Because the symlink to the
actual WAL directory is included in it.

OTOH, I found the following source code comments:

+ * Receive a tar format stream from the connection to the server, and unpack
+ * the contents of it into a directory. Only files, directories and
+ * symlinks are supported, no other kinds of special files.

This says that symlinks are supported. Which is correct? Are symlinks
supported only in tar format?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] psql: Add \dL to show languages

2011-01-19 Thread Josh Kupershmidt
On Wed, Jan 19, 2011 at 9:09 PM, Robert Haas  wrote:
> This patch doesn't seem terribly consistent to me - we show the name
> of the call handler and the name of the validator, but for the inline
> handler we just indicate whether there is one or not.  That seems like
> something that we should make consistent, and my vote is to show the
> name in all cases.

OK, changed. I've shuffled the columns slightly so that handlers and
Validator columns are next to each other.

> Also, I'm wondering whether the System Language column should be renamed to
> Internal Language, for consistency with the terminology used here:
>
> http://www.postgresql.org/docs/current/static/catalog-pg-language.html

Fixed, updated patch attached. Though, reading the description of
lanispl on that page, I wonder if "user-defined language" correctly
describes plpgsql these days, which comes installed by default.

Josh
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 5f61561..30d4507 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** testdb=>
*** 1249,1254 
--- 1249,1269 
  

  
+   
+ \dL[S+] [ pattern ]
+ 
+ 
+ Lists all procedural languages. If pattern
+ is specified, only languages whose names match the pattern are listed.
+ By default, only user-created languages
+ are shown; supply the S modifier to include system
+ objects. If + is appended to the command name, each
+ language is listed with its call handler, validator, access privileges,
+ and whether it is a system object.
+ 
+ 
+   
  

  \dn[S+] [ pattern ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 962c13c..301dc11 100644
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** exec_command(const char *cmd,
*** 416,421 
--- 416,424 
  			case 'l':
  success = do_lo_list();
  break;
+ 			case 'L':
+ success = listLanguages(pattern, show_verbose, show_system);
+ break;
  			case 'n':
  success = listSchemas(pattern, show_verbose, show_system);
  break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 205190f..e1db4c0 100644
*** a/src/bin/psql/describe.c
--- b/src/bin/psql/describe.c
*** listTables(const char *tabtypes, const c
*** 2566,2571 
--- 2566,2638 
  }
  
  
+ /* \dL
+ *
+ * Describes Languages.
+ */
+ bool
+ listLanguages(const char *pattern, bool verbose, bool showSystem)
+ {
+ 	PQExpBufferData buf;
+ 	PGresult *res;
+ 	printQueryOpt myopt = pset.popt;
+ 
+ 	initPQExpBuffer(&buf);
+ 
+ 	printfPQExpBuffer(&buf,
+ 	  "SELECT l.lanname AS \"%s\",\n",
+ 	  gettext_noop("Name"));
+ 	if (pset.sversion >= 80300)
+ 			appendPQExpBuffer(&buf,
+ 			  "   pg_catalog.pg_get_userbyid(l.lanowner) as \"%s\",\n",
+ 			  gettext_noop("Owner"));
+ 
+ 	appendPQExpBuffer(&buf,
+ 	  "   l.lanpltrusted AS \"%s\"",
+ 	  gettext_noop("Trusted"));
+ 
+ 	if (verbose)
+ 	{
+ 			appendPQExpBuffer(&buf,
+ 			  ",\n   NOT l.lanispl AS \"%s\",\n"
+ 			  "   l.lanplcallfoid::regprocedure AS \"%s\",\n"
+ 			  "   l.lanvalidator::regprocedure AS \"%s\",\n   ",
+ 			  gettext_noop("Internal Language"),
+ 			  gettext_noop("Call Handler"),
+ 			  gettext_noop("Validator"));
			if (pset.sversion >= 90000)
+ appendPQExpBuffer(&buf, "l.laninline::regprocedure AS \"%s\",\n   ",
+   gettext_noop("Inline Handler"));
+ 			printACLColumn(&buf, "l.lanacl");
+ 	}
+ 
+ 	appendPQExpBuffer(&buf,
+ 	  "\nFROM pg_catalog.pg_language l\n");
+ 
+ 	processSQLNamePattern(pset.db, &buf, pattern, false, false,
+ 		  NULL, "l.lanname", NULL, NULL);
+ 
+ 	if (!showSystem && !pattern)
+ 		appendPQExpBuffer(&buf, "WHERE lanplcallfoid != 0\n");
+ 
+ 	appendPQExpBuffer(&buf, "ORDER BY 1;");
+ 
+ 	res = PSQLexec(buf.data, false);
+ 	termPQExpBuffer(&buf);
+ 	if (!res)
+ 		return false;
+ 
+ 	myopt.nullPrint = NULL;
+ 	myopt.title = _("List of languages");
+ 	myopt.translate_header = true;
+ 
+ 	printQuery(res, &myopt, pset.queryFout, pset.logfile);
+ 
+ 	PQclear(res);
+ 	return true;
+ }
+ 
+ 
  /*
   * \dD
   *
diff --git a/src/bin/psql/describe.h b/src/bin/psql/describe.h
index 2029ef8..4e80bcf 100644
*** a/src/bin/psql/describe.h
--- b/src/bin/psql/describe.h
*** extern bool listUserMappings(const char
*** 84,88 
--- 84,90 
  /* \det */
  extern bool listForeignTables(const char *pattern, bool verbose);
  
+ /* \dL */
+ extern bool listLanguages(const char *pattern, bool verbose, bool showSystem);
  
  #endif   /* DESCRIBE_H */
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 96c85a2..bd5c4b7 100644
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** slashUsage(unsigned short int pager)
*** 211,216 
--- 211,217 
  	fprintf(output, _("  \\dg[+]  [PATTERN

Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Nathan Boley
On Wed, Jan 19, 2011 at 6:35 PM, Florian Pflug  wrote:
> On Jan20, 2011, at 02:41 , Nathan Boley wrote:
 If you think about it, it's a bit ridiculous to look at the whole table
 *just* to "estimate" ndistinct - if we go that far why don't we just
 store the full distribution and be done with it?
>>>
>>> - the best you could do is to average the
>>> individual probabilities which gives ... well, 1/ndistinct.
>>
>> That is certainly *not* the best you could do in every case. The mean
>> is only the best estimate in L2, which is definitely not the case
>> here.
>
> No, not in every case. But on average it comes out best, no?

In the sense of minimizing the average plan cost over all values?
Definitely not. Consider the trivial case where a bitmap scan costs the
same as a sequential scan plus epsilon.

>
>> Consider a table with 10K values, 9,990 of which are 1 and the rest of
>> which are 2, 3, ..., 10, versus a table that has the same 10 distinct
>> values evenly distributed. For a simple equality query, in the first
>> case, a bitmap scan might be best. In the second case, a sequential
>> scan would always be best.
>
> True. But selectivity estimates alone don't show the difference there.

Yet the full distribution would - supporting my argument that even in
cases where we don't know a specific value, the full distribution is
more informative.
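
A quick SQL sketch of that example (hypothetical tables, just to make the
numbers concrete):

    -- 9,991 rows of 1 plus one row each of 2..10: ndistinct = 10
    CREATE TABLE skewed AS
        SELECT CASE WHEN g > 9 THEN 1 ELSE g + 1 END AS v
        FROM generate_series(1, 10000) g;

    -- each of 1..10 appears 1,000 times: ndistinct = 10
    CREATE TABLE uniform AS
        SELECT (g % 10) + 1 AS v
        FROM generate_series(1, 10000) g;

    -- 1/ndistinct predicts ~1,000 matches for "v = 2" in both tables,
    -- but the true counts are 1 and 1,000 respectively.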

>
> Also, in my personal experience this isn't really the area we've got
> problems now. The problem cases for me always were queries with a rather
> large number of joins (7 to 10 or so) plus rather complex search
> conditions. The join order, not the access strategy, then becomes the
> deciding factor in plan performance. And the join order *is* driven purely
> off the selectivity estimates, unlike the access strategy which even today
> takes other factors into account (like clusteredness, I believe)
>

I think I'm losing you. Why would ndistinct be completely sufficient for
estimating the optimal join order?

>> This is precisely the point I was trying to make in my email: the loss
>> function is very complicated. Improving the ndistinct estimator could
>> significantly improve the estimates of ndistinct ( in the quadratic
>> loss sense ) while only marginally improving the plans.
>
>
> The point of this exercise isn't really to improve the ndistinct estimates
> for single columns. Rather, we'd like to get ndistinct estimates for
> groups of columns because that allows us to use the uniform bayesian
> approach to multi-column selectivity estimation that Tomas found literature
> on. Which on the whole seems like it *does* have a chance of greatly
> improving the plans in some cases. We could, of course, estimate multi-column
> ndistinct the same way we estimate single-column ndistincts, but quite a few
> people raised concerns that this wouldn't work due to the large error our
> current ndistinct estimates have.

Sure. But estimating multi-column ndistinct is surely not the only way
to approach this problem.

The comment I made, which you objected to, was that it's silly to look
at the whole table and then throw away all information *except*
ndistinct. If we really think that looking at the whole table is the
best approach, then shouldn't we be able to get better statistics? Is
this really an open question?

-Nathan



Re: [HACKERS] Include WAL in base backup

2011-01-19 Thread Stephen Frost
Magnus,

* Stephen Frost (sfr...@snowman.net) wrote:
> mkay, I'm not going to try to make this ready for committer, but will
> provide my comments on it overall.  Bit difficult to review when someone
> else is reviewing the base patch too. :/

I went ahead and marked it as 'waiting on author', since I'd like
feedback on my comments, but I'll try to make time in the next few days
to apply the two patches and test it out, unless I hear otherwise.

Thanks,

Stephen




Re: [HACKERS] Include WAL in base backup

2011-01-19 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> For now, you need to set wal_keep_segments to make it work properly,
> but if you do the idea is that the tar file/stream generated in the
> base backup will include all the required WAL files.

Is there some reason to not ERROR outright if we're asked to provide WAL
and wal_keep_segments isn't set..?  I'd rather do that than only ERROR
when a particular WAL is missing..  That could lead to transient backup
errors that an inexperienced sysadmin or DBA might miss or ignore.
They'll notice if it doesn't work the first time they try it and spits
out a hint about wal_keep_segments.
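
Something along these lines, perhaps (a sketch only; "includewal" is a
hypothetical name for the patch's flag):

    if (includewal && wal_keep_segments <= 0)
        ereport(ERROR,
                (errmsg("including WAL in the base backup requires wal_keep_segments to be set")));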

> That means that
> you can start a postmaster directly against the directory extracted
> from the tarball, and you no longer need to set up archive logging to
> get a backup.

Like that part.

> I've got some refactoring I want to do around the
> SendBackupDirectory() function after this, but a review of the
> functionality first would be good. And obviously, documentation is
> still necessary.

mkay, I'm not going to try to make this ready for committer, but will
provide my comments on it overall.  Bit difficult to review when someone
else is reviewing the base patch too. :/

Here goes:

- I'm not a huge fan of the whole 'closetar' option, that feels really
  rather wrong to me.  Why not just open it and close it in
  perform_base_backup(), unconditionally?

- I wonder if you're not getting to a level where you should be using a
  struct to pass the relevant information to perform_base_backup()
  instead of adding more arguments on..  That's going to get unwieldy at
  some point.

- Why not initialize logid and logseg like so?:

int logid = startptr.xlogid;
int logseg = startptr.xrecoff / XLogSegSize;

  Then use those in your elog?  Seems cleaner to me.

- A #define right in the middle of a while loop...?  Really?

- The grammar changes strike me as..  odd.  Typically, you would have an
  'option' production that you can then have a list of and then let each
  option be whatever the OR'd set of options is.  Wouldn't the current
  grammar require that you put the options in a specific order?  That'd
  blow.

@@ -687,7 +690,7 @@ BaseBackup()
 * once since it can be relocated, and it will be checked 
before we do
 * anything anyway.
 */
-   if (basedir != NULL && i > 0)
+   if (basedir != NULL && !PQgetisnull(res, i, 1))
verify_dir_is_empty_or_create(PQgetvalue(res, i, 1));
}

- Should the 'i > 0' conditional above still be there..?

So, that's my review from just reading the source code and the thread..
Hope it's useful, sorry it's not more. :/

Thanks,

Stephen




Re: [HACKERS] REVIEW: EXPLAIN and nfiltered

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 10:16 PM, Stephen Frost  wrote:
> This patch looked good, in general, to me.  I added a few documentation
> updates and a comment, but it's a very straight-forward patch as far as
> I can tell.  Passes all regressions and my additional testing.

I have not looked at the code for this patch at all as yet, but just
as a general user comment - I really, really want this.  It's one of
about, uh, two pieces of information that the EXPLAIN output doesn't
give you that is incredibly important for troubleshooting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sepgsql contrib module

2011-01-19 Thread Robert Haas
2011/1/19 KaiGai Kohei :
>>  And how about adding a
>> ProcessUtility_hook to trap evil non-DML statements that some
>> nefarious user might issues?
>>
> It seems reasonable to me as long as the number of controlled commands
> is limited. For example, the LOAD command may be a candidate for being
> controlled without exceptions.
> However, it will be tough work if the plug-in tries to parse and
> analyze supplied utility commands by itself.

I think the key is to either accept or reject the command based on
very simple criteria - decide based only on the command type, and
ignore its parameters.

> I uploaded my draft here.
>  http://wiki.postgresql.org/wiki/SEPostgreSQL_Documentation
>
> If reasonable, I'll move them into *.sgml style.

I have yet to review that, but will try to get to it before too much
more time goes by.

> I may want to simplify the step to installation using an installer
> script.

OK, but let's get this nailed down as soon as possible.  Tempus fugit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sepgsql contrib module

2011-01-19 Thread KaiGai Kohei
(2011/01/20 12:10), Robert Haas wrote:
> 2011/1/5 KaiGai Kohei:
>> How about feasibility to merge this 4KL chunks during the rest of
>> 45 days? I think we should decide this general direction at first.
> 
> I read through this code tonight and it looks pretty straightforward.
> I don't see much reason not to accept this more or less as-is.  I'm a
> bit suspicious of this line:
> 
>  { "translation",SEPG_PROCESS__TRANSITION },
> 
> I can't help wondering based on the rest of the table whether you
> intend to have the same word on that line twice, but you don't.  On a
> related note, would it make sense to pare down this table to the
> entries that are actually used at the moment?

Sorry for the confusion.
It was my typo; it should be fixed as:
{ "transition",SEPG_PROCESS__TRANSITION },

This permission shall be checked when the security label of a client is
switched from X to Y, typically on execution of a trusted procedure.
Also, I missed a check in sepgsql_needs_fmgr_hook(); we don't want to
allow inlining of the function when the permission is lacking.

I'll fix them soon.

>  And how about adding a
> ProcessUtility_hook to trap evil non-DML statements that some
> nefarious user might issues?
> 
It seems reasonable to me as long as the number of controlled commands
is limited. For example, the LOAD command may be a candidate for being
controlled without exceptions.
However, it will be tough work if the plug-in tries to parse and
analyze supplied utility commands by itself.

> One other problem is that you need to work on your whitespace a bit.
> I believe in a few places you have a mixture of tabs and spaces.  More
> seriously, pgindent is going to completely mangle things like this:
> 
> /*
>   * sepgsql_mode
>   *
>   * SEPGSQL_MODE_DISABLED: Disabled on runtime
>   * SEPGSQL_MODE_DEFAULT : Same as system settings
>   * SEPGSQL_MODE_PERMISSIVE  : Always permissive mode
>   * SEPGSQL_MODE_INTERNAL: Same as SEPGSQL_MODE_PERMISSIVE,
>   *except for no audit prints
>   */
> 
> You have to write it with a line of dashes on the first and last
> lines, if you don't want it reformatted as a paragraph.  It might be
> worth actually running pgindent over contrib/selinux and then check
> over the results.
> 
OK, I'll try running pgindent to confirm the effect of the reformatting.

> Finally, we need to work on the documentation.
> 
I uploaded my draft here.
  http://wiki.postgresql.org/wiki/SEPostgreSQL_Documentation

If reasonable, I'll move them into *.sgml style.

I may want to simplify the step to installation using an installer
script.

> But overall, it looks pretty good, IMHO.
> 
Thanks for your review, in spite of the large patch.
-- 
KaiGai Kohei 



[HACKERS] REVIEW: EXPLAIN and nfiltered

2011-01-19 Thread Stephen Frost
Greetings,

On 2011-01-15 11:37 PM +0200, Marko Tiikkaja wrote:
> On 2010-11-18 5:45 PM +0200, Marko Tiikkaja wrote:
> > Here's a patch for showing in EXPLAIN ANALYZE the number of rows a plan
> > qual filtered from a node's input.
> 
> Rebased against master.

This patch looked good, in general, to me.  I added a few documentation
updates and a comment, but it's a very straight-forward patch as far as
I can tell.  Passes all regressions and my additional testing.

commit fac899f7967ce74e14a90af9ca24e1a1f5a580e7
Author: Stephen Frost 
Date:   Wed Jan 19 22:14:54 2011 -0500

Fix < & > in docs to be &lt; &gt;, as required.

commit 5fcdb75a646912b8b273703caf33dadb80122e1c
Author: Stephen Frost 
Date:   Wed Jan 19 22:05:05 2011 -0500

Update documentation for EXPLAIN ANALYZE/nfiltered

This patch updates some documentation around EXPLAIN ANALYZE, whose
output has been changed by the patch which added nfiltered to it.

Also added a comment in the only place that seemed to need one.

commit 9ebb0108a217c2d3b7f815d1d902d6bdcc276104
Author: Stephen Frost 
Date:   Wed Jan 19 21:33:28 2011 -0500

Add nfiltered in EXPLAIN ANALYZE

This patch adds the number of rows a plan qual filtered from a node's
input to the EXPLAIN ANALYZE output.

Patch by: Marko Tiikkaja

Thanks,

Stephen
*** a/doc/src/sgml/ref/explain.sgml
--- b/doc/src/sgml/ref/explain.sgml
***
*** 335,348  EXPLAIN (COSTS FALSE) SELECT * FROM foo WHERE i = 4;
 function:
  
  
! EXPLAIN SELECT sum(i) FROM foo WHERE i < 10;
! 
!  QUERY PLAN
! -
!  Aggregate  (cost=23.93..23.93 rows=1 width=4)
!->  Index Scan using fi on foo  (cost=0.00..23.92 rows=6 width=4)
!  Index Cond: (i < 10)
! (3 rows)
  

  
--- 335,352 
 function:
  
  
! CREATE TABLE test (id integer primary key, bar integer, foo integer);
! 
! EXPLAIN SELECT sum(foo) FROM test WHERE id < 10;
! 
!QUERY PLAN
! 
!  Aggregate  (cost=28.97..28.98 rows=1 width=4)
!->  Bitmap Heap Scan on test  (cost=9.26..27.35 rows=647 width=4)
!  Recheck Cond: (id < 10)
!  ->  Bitmap Index Scan on test_pkey  (cost=0.00..9.10 rows=647 width=0)
!Index Cond: (id < 10)
! (5 rows)
  

  
***
*** 351,369  EXPLAIN SELECT sum(i) FROM foo WHERE i < 10;
 display the execution plan for a prepared query:
  
  
  PREPARE query(int, int) AS SELECT sum(bar) FROM test
  WHERE id > $1 AND id < $2
  GROUP BY foo;
  
  EXPLAIN ANALYZE EXECUTE query(100, 200);
  
!QUERY PLAN
! -
!  HashAggregate  (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 loops=1)
!->  Index Scan using test_pkey on test  (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 loops=1)
!  Index Cond: ((id > $1) AND (id < $2))
!  Total runtime: 0.851 ms
! (4 rows)
  

  
--- 355,377 
 display the execution plan for a prepared query:
  
  
+ CREATE TABLE test (id integer primary key, bar integer, foo integer);
+ 
  PREPARE query(int, int) AS SELECT sum(bar) FROM test
  WHERE id > $1 AND id < $2
  GROUP BY foo;
  
  EXPLAIN ANALYZE EXECUTE query(100, 200);
  
!  QUERY PLAN
! 
!  HashAggregate  (cost=14.98..15.01 rows=2 width=8) (actual time=0.045..0.045 rows=0 filtered=0 loops=1)
!->  Bitmap Heap Scan on test  (cost=4.35..14.93 rows=10 width=8) (actual time=0.041..0.041 rows=0 filtered=0 loops=1)
!  Recheck Cond: ((id > $1) AND (id < $2))
!  ->  Bitmap Index Scan on test_pkey  (cost=0.00..4.35 rows=10 width=0) (actual time=0.035..0.035 rows=0 filtered=0 loops=1)
!Index Cond: ((id > $1) AND (id < $2))
!  Total runtime: 0.118 ms
! (6 rows)
  

  
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 975,992  ExplainNode(PlanState *planstate, List *ancestors,
  		double		startup_sec = 1000.0 * planstate->instrument->startup / nloops;
  		double		total_sec = 1000.0 * planstate->instrument->total / nloops;
  		double		rows = planstate->instrument->ntuples / nloops;
  
  		if (es->format == EXPLAIN_FORMAT_TEXT)
  		{
  			appendStringInfo(es->str,
! 			 " (actual time=%.3f..%.3f rows=%.0f loops=%.0f)",
! 			 startup_sec, total_sec, rows, nloops);
  		}
  		else
  		{
  			ExplainPropertyFloat("Actual Startup Time", startu

Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 9:44 PM, Simon Riggs  wrote:
> Noah's patch is trivial, as are the changes to make mine work fully.

I dispute that.  In particular:

+   /*
+* Exchange table contents
+*
+* Swap heaps, toast tables, toast indexes
+* all forks
+* all indexes
+*
+* Checks:
+* * table definitions must match
+* * constraints must match
+* * indexes need not match
+* * outbound FKs don't need to match
+* * inbound FKs will be set to not validated
+*
+* No Trigger behaviour
+*
+* How is it WAL logged? By locks and underlying catalog updates
+*/

That's another way of saying "the patch is not anywhere close to being done".

> Neither can be achieved without sensible review.

I think Noah posted a very nice review.

> This topic delivers important functionality. I think it's more important
> than simply who gets the credit.

This is not about credit.  I like credit as much as the next guy, but
this is about the fact that there was a deadline for this CommitFest,
and that deadline is now in the past, and this patch is not in a state
to be reviewed.  The CommitFest deadline is not a deadline by which
you much post something; it's a deadline by which you much post
something that is reasonably close to being committable, or at least
reviewable.  That's obviously not the case here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sepgsql contrib module

2011-01-19 Thread Robert Haas
2011/1/5 KaiGai Kohei :
> How about feasibility to merge this 4KL chunks during the rest of
> 45 days? I think we should decide this general direction at first.

I read through this code tonight and it looks pretty straightforward.
I don't see much reason not to accept this more or less as-is.  I'm a
bit suspicious of this line:

{ "translation",SEPG_PROCESS__TRANSITION },

I can't help wondering based on the rest of the table whether you
intend to have the same word on that line twice, but you don't.  On a
related note, would it make sense to pare down this table to the
entries that are actually used at the moment?  And how about adding a
ProcessUtility_hook to trap evil non-DML statements that some
nefarious user might issues?

One other problem is that you need to work on your whitespace a bit.
I believe in a few places you have a mixture of tabs and spaces.  More
seriously, pgindent is going to completely mangle things like this:

/*
 * sepgsql_mode
 *
 * SEPGSQL_MODE_DISABLED: Disabled on runtime
 * SEPGSQL_MODE_DEFAULT : Same as system settings
 * SEPGSQL_MODE_PERMISSIVE  : Always permissive mode
 * SEPGSQL_MODE_INTERNAL: Same as SEPGSQL_MODE_PERMISSIVE,
 *except for no audit prints
 */

You have to write it with a line of dashes on the first and last
lines, if you don't want it reformatted as a paragraph.  It might be
worth actually running pgindent over contrib/selinux and then check
over the results.
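
For example, the usual pgindent-safe layout looks like this (a sketch, not
a quote from the patch):

    /*-------------------------------------------------------------------
     * sepgsql_mode
     *
     * SEPGSQL_MODE_DISABLED   : Disabled at runtime
     * SEPGSQL_MODE_DEFAULT    : Same as system settings
     *-------------------------------------------------------------------
     */

pgindent leaves the inner lines alone because of the dash lines at the top
and bottom.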

Finally, we need to work on the documentation.

But overall, it looks pretty good, IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Kevin Grittner
Simon Riggs  wrote:
 
> I gave you a quick response to let you know that HS need not be a
> blocker, for this release. If you are saying you have knowingly
> ignored a requirement for a whole year, then I am shocked. How
> exactly did you think this would ever be committed?
 
I was asked not to discuss this effort on list for most of that time,
and while it was on the Wiki page, I just lost track of it -- not
maliciously or intentionally.  I really apologize.  By the time the
9.0 release was out and it was deemed OK for me to discuss things, I
started getting feedback on problems which needed response, and I got
into the mode of reacting to that rather than ticking through my
issues list.
 
-Kevin



Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
> I gave you a quick response to let you know that HS need not be a
> blocker, for this release. If you are saying you have knowingly ignored
> a requirement for a whole year, then I am shocked. How exactly did you
> think this would ever be committed?

Erm, to be perfectly honest, I think the answer is probably "I was
busy.", and "no one provided any feedback on *how* to deal with it."
Given the amount of work that Kevin's put into this patch (which has
been beyond impressive, imv), I have a hard time finding fault with
him not getting time to implement a solution for Hot Standby for this.

As you say, it's not a blocker, I agree completely with that, regardless
of when it was identified as an issue.  What we're talking about is
right now, and right now is too late to fix it for HS, and to be
perfectly frank, fixing it for HS isn't required or even a terribly
important factor in if it should be committed.

I'll refrain from casting stones about issues brought up nearly a year
ago on certain other patches which are apparently not going to include
what I, at least, consider extremely important to PG acceptance by
others.

Thanks,

Stephen




Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Kevin Grittner
Robert Haas  wrote:
> Kevin Grittner  wrote:
>> I agree it's pretty late in the cycle, but I'm going through all
>> the loose ends and found this one -- which has been hanging out on
>> the Wiki page as an R&D item for over a full year without
>> discussion.  :-(  If we provide the snapshots (which we can safely
>> and easily do), can someone else who knows what they're doing with
>> WAL and HS get the rest of it safely into the release? That seems
>> to me to be the only way it can still happen for 9.1.
> 
> I think it's way too late to be embarking on what will probably
> turn out to be a reasonably complex and possibly controversial new
> development arc. I don't have a strong position on what we should
> do instead, but let's NOT do that.
 
If that can't reasonably be done for 9.1, well, my next sentence was:
 
>> If not, I agree this can be 9.2 material.
 
It'd be sweet if it could still happen 9.1, but hardly a shock if it
can't.  I didn't want to presume to make the call.
 
Like I said at the start, the alternative is to decide how noisy we
want to be about providing snapshot isolation on hot standbys when
SERIALIZABLE is requested, and figuring out where to document it.
 
-Kevin



Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Simon Riggs
On Wed, 2011-01-19 at 19:34 -0600, Kevin Grittner wrote:
 
> I agree it's pretty late in the cycle, but I'm going through all the
> loose ends and found this one -- which has been hanging out on the
> Wiki page as an R&D item for over a full year without discussion. 
> :-(  If we provide the snapshots (which we can safely and easily
> do), can someone else who knows what they're doing with WAL and HS
> get the rest of it safely into the release?  That seems to me to be
> the only way it can still happen for 9.1.

I gave you a quick response to let you know that HS need not be a
blocker, for this release. If you are saying you have knowingly ignored
a requirement for a whole year, then I am shocked. How exactly did you
think this would ever be committed?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] REVIEW: "writable CTEs" - doc patch

2011-01-19 Thread Peter Geoghegan
I think that a major goal of the DocBook format is that it separates
content from presentation, so whatever tool is used to render that
content as HTML for .org isn't necessarily publicly available.

-- 
Regards,
Peter Geoghegan



Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Simon Riggs
On Wed, 2011-01-19 at 21:01 -0500, Robert Haas wrote:
> On Wed, Jan 19, 2011 at 8:57 PM, Noah Misch  wrote:
> > I think Simon was referring to the proof-of-concept sketch I had included 
> > with
> > my review.
> 
> I think it's a bit late to be turning proofs-of-concept into code at
> this point, no matter who came up with them.

Noah's patch is trivial, as are the changes to make mine work fully.
Neither can be achieved without sensible review.

This topic delivers important functionality. I think it's more important
than simply who gets the credit.

I'm not sure yet where to go, but we still have viable options for this
release.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 9:35 PM, Florian Pflug  wrote:
> Also, in my personal experience this isn't really the area we've got
> problems now. The problem cases for me always were queries with a rather
> large number of joins (7 to 10 or so) plus rather complex search
> conditions. The join order, not the access strategy, then becomes the
> deciding factor in plan performance.

This certainly matches my experience (except sometimes with more joins).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 5:13 PM, Tomas Vondra  wrote:
>>> Regarding the crash scenario - if the commit fails, just throw away the
>>> local estimator copy, it's not needed. I'm not sure how to take care of
>>> the case when commit succeeds and the write of the merged estimator
>>> fails, but I think it might be possible to write the estimator to xlog
>>> or something like that. So it would be replayed during recovery etc. Or
>>> is it a stupid idea?
>>
>> It's not stupid, in the sense that that is what you'd need to do if
>> you want to avoid ever having to rescan the table.  But it is another
>> thing that I think is going to be way too expensive.
>
> Way too expensive? All you need to put into the logfile is a copy of the
> estimator, which is a few kBs. How is that 'way too expensive'?

At this point, this is all a matter of religion, right?  Neither of us
has a working implementation we've benchmarked.  But yes, I believe
you're going to find that implementing some kind of streaming
estimator is going to impose a... 6%
performance penalty, even after you've optimized the living daylights
out of it.  Now you might say... big deal, it improves my problem
queries by 100x.  OK, but if you could get the same benefit by doing
an occasional full table scan during off hours, you could have the
same performance with a *0%* performance penalty.  Even better, the
code changes would be confined to ANALYZE rather than spread out all
over the system, which has positive implications for robustness and
likelihood of commit.

I'm not trying to argue you out of working on this.  It's obviously
your time to spend, and if works better than I think it will, great!
I'm merely offering you an opinion on what will probably happen if you
go this route - namely, it'll carry an unpalatable run-time penalty.
That opinion may be worth no more than what you paid for it, but there
you have it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Florian Pflug
On Jan20, 2011, at 02:41 , Nathan Boley wrote:
>>> If you think about it, it's a bit ridiculous to look at the whole table
>>> *just* to "estimate" ndistinct - if we go that far why don't we just
>>> store the full distribution and be done with it?
>> 
>> - the best you could do is to average the
>> individual probabilities which gives ... well, 1/ndistinct.
> 
> That is certainly *not* the best you could do in every case. The mean
> is only the best estimate in L2, which is definitely not the case
> here.

No, not in every case. But on average it comes out best, no?

> Consider a table with 10K values, 9,990 of which are 1 and the rest of
> which are 2, 3, ..., 10, versus a table that has the same 10 distinct
> values evenly distributed. For a simple equality query, in the first
> case, a bitmap scan might be best. In the second case, a sequential
> scan would always be best.

True. But selectivity estimates alone don't show the difference there.

Also, in my personal experience this isn't really the area we've got
problems now. The problem cases for me always were queries with a rather
large number of joins (7 to 10 or so) plus rather complex search
conditions. The join order, not the access strategy, then becomes the
deciding factor in plan performance. And the join order *is* driven purely
off the selectivity estimates, unlike the access strategy which even today
takes other factors into account (like clusteredness, I believe)

> This is precisely the point I was trying to make in my email: the loss
> function is very complicated. Improving the ndistinct estimator could
> significantly improve the estimates of ndistinct ( in the quadratic
> loss sense ) while only marginally improving the plans.


The point of this exercise isn't really to improve the ndistinct estimates
for single columns. Rather, we'd like to get ndistinct estimates for
groups of columns because that allows us to use the uniform bayesian
approach to multi-column selectivity estimation that Tomas found literature
on. Which on the whole seems like it *does* have a chance of greatly
improving the plans in some cases. We could, of course, estimate multi-column
ndistinct the same way we estimate single-column ndistincts, but quite a few
people raised concerns that this wouldn't work due to the large error our
current ndistinct estimates have. 

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: "writable CTEs" - doc patch

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 9:13 PM, Stephen Frost  wrote:
> Greetings,
>
> * Peter Geoghegan (peter.geoghega...@gmail.com) wrote:
>> I do this all the time. Anyway, I intend for this doc patch to be
>> backported to 8.4 as a bugfix, which is part of the reason why it
>> isn't invasive - it's just a clarification. Clearly if it makes sense
>> for 9.1, it makes just as much sense for 9.0 and 8.4.
>
> I agree with the patch, in general, as well as the recommendation to
> back-port it.  I reviewed it and didn't find any issues (though I
> couldn't figure out the right magic things to install to actually build
> the docs.. :( ).  The only minor change I made was to capitalize Common
> Table Expressions (having it as an acronym w/o capitalizing the full
> name seemed odd to me..).
>
> Updated patch attached.  Marking as ready for committer.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/python refactoring

2011-01-19 Thread Jan Urbański
On 20/01/11 01:26, Jan Urbański wrote:
> On 19/01/11 10:57, Jan Urbański wrote:
>> On 18/01/11 23:22, Peter Eisentraut wrote:
>>> #2: It looks like this loses some information/formatting in the error
>>> message.  Should we keep the pointing arrow there?
> 
>>>  CONTEXT:  PL/Python function "sql_syntax_error"
>>> -ERROR:  syntax error at or near "syntax"
>>> -LINE 1: syntax error
>>> -^
>>> -QUERY:  syntax error
>>> +ERROR:  PL/Python: plpy.SPIError: syntax error at or near "syntax"
>>>  CONTEXT:  PL/Python function "sql_syntax_error"
> 
>> Yes, the message is less informative, because the error is reported by
>> PL/Python, by wrapping the SPI message. I guess I could try to extract
>> more info from the caught ErrorData and put it in the new error that
>> PL/Python throws.
> 
> All right, I found a way to shoehorn the extra information into
> SPIException, I'll post a new patch series with what's left of the
> general refactoring patch soon.

Here's an updated patch series for PL/Python refactoring. It was 16
patches at first, 8 are committed, 1 got dropped, so we're down to 7.

Cheers,
Jan
>From af0a83a4ff0356fe5b172b6c60692953d7e01bf0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jan=20Urba=C5=84ski?= 
Date: Sun, 2 Jan 2011 11:42:54 +0100
Subject: [PATCH 7/7] Do not prefix error messages with the string "PL/Python: ".

It is redundant, given the error context. Also, call pg_verifymbstr
outside of the try/catch block, because it can lead to erroneously
reporting a Python error when it is in fact a Postgres error.
---
 src/pl/plpython/expected/plpython_do.out  |2 +-
 src/pl/plpython/expected/plpython_error.out   |   20 ++--
 src/pl/plpython/expected/plpython_test.out|2 +-
 src/pl/plpython/expected/plpython_types.out   |2 +-
 src/pl/plpython/expected/plpython_types_3.out |2 +-
 src/pl/plpython/plpython.c|   13 ++---
 6 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/pl/plpython/expected/plpython_do.out b/src/pl/plpython/expected/plpython_do.out
index 9de261a..a21b088 100644
--- a/src/pl/plpython/expected/plpython_do.out
+++ b/src/pl/plpython/expected/plpython_do.out
@@ -2,5 +2,5 @@ DO $$ plpy.notice("This is plpythonu.") $$ LANGUAGE plpythonu;
 NOTICE:  This is plpythonu.
 CONTEXT:  PL/Python anonymous code block
 DO $$ nonsense $$ LANGUAGE plpythonu;
-ERROR:  PL/Python: NameError: global name 'nonsense' is not defined
+ERROR:  NameError: global name 'nonsense' is not defined
 CONTEXT:  PL/Python anonymous code block
diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out
index 87225f2..ea4a54c 100644
--- a/src/pl/plpython/expected/plpython_error.out
+++ b/src/pl/plpython/expected/plpython_error.out
@@ -8,9 +8,9 @@ CREATE FUNCTION sql_syntax_error() RETURNS text
 'plpy.execute("syntax error")'
 LANGUAGE plpythonu;
 SELECT sql_syntax_error();
-WARNING:  PL/Python: plpy.SPIError: unrecognized error in PLy_spi_execute_query
+WARNING:  plpy.SPIError: unrecognized error in PLy_spi_execute_query
 CONTEXT:  PL/Python function "sql_syntax_error"
-ERROR:  PL/Python: plpy.SPIError: syntax error at or near "syntax"
+ERROR:  plpy.SPIError: syntax error at or near "syntax"
 LINE 1: syntax error
 ^
 QUERY:  syntax error
@@ -22,7 +22,7 @@ CREATE FUNCTION exception_index_invalid(text) RETURNS text
 'return args[1]'
 	LANGUAGE plpythonu;
 SELECT exception_index_invalid('test');
-ERROR:  PL/Python: IndexError: list index out of range
+ERROR:  IndexError: list index out of range
 CONTEXT:  PL/Python function "exception_index_invalid"
 /* check handling of nested exceptions
  */
@@ -32,9 +32,9 @@ CREATE FUNCTION exception_index_invalid_nested() RETURNS text
 return rv[0]'
 	LANGUAGE plpythonu;
 SELECT exception_index_invalid_nested();
-WARNING:  PL/Python: plpy.SPIError: unrecognized error in PLy_spi_execute_query
+WARNING:  plpy.SPIError: unrecognized error in PLy_spi_execute_query
 CONTEXT:  PL/Python function "exception_index_invalid_nested"
-ERROR:  PL/Python: plpy.SPIError: function test5(unknown) does not exist
+ERROR:  plpy.SPIError: function test5(unknown) does not exist
 LINE 1: SELECT test5('foo')
^
 HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
@@ -54,9 +54,9 @@ return None
 '
 	LANGUAGE plpythonu;
 SELECT invalid_type_uncaught('rick');
-WARNING:  PL/Python: plpy.SPIError: unrecognized error in PLy_spi_prepare
+WARNING:  plpy.SPIError: unrecognized error in PLy_spi_prepare
 CONTEXT:  PL/Python function "invalid_type_uncaught"
-ERROR:  PL/Python: plpy.SPIError: type "test" does not exist
+ERROR:  plpy.SPIError: type "test" does not exist
 CONTEXT:  PL/Python function "invalid_type_uncaught"
 /* for what it's worth catch the exception generated by
  * the typo, and return None
@@ -77,7 +77,7 @@ return None
 '
 	LANGUAGE plpythonu;
 SELECT invalid_type_caught('rick');
-WARNING:  PL/Python: plp

[HACKERS] REVIEW: "writable CTEs" - doc patch

2011-01-19 Thread Stephen Frost
Greetings,

* Peter Geoghegan (peter.geoghega...@gmail.com) wrote:
> I do this all the time. Anyway, I intend for this doc patch to be
> backported to 8.4 as a bugfix, which is part of the reason why it
> isn't invasive - it's just a clarification. Clearly if it makes sense
> for 9.1, it makes just as much sense for 9.0 and 8.4.

I agree with the patch, in general, as well as the recommendation to
back-port it.  I reviewed it and didn't find any issues (though I
couldn't figure out the right magic things to install to actually build
the docs.. :( ).  The only minor change I made was to capitalize Common
Table Expressions (having it as an acronym w/o capitalizing the full
name seemed odd to me..).

Updated patch attached.  Marking as ready for committer.

commit 91e9e9285752c9fbe0c222708a10a301731594c8
Author: Stephen Frost 
Date:   Wed Jan 19 20:56:44 2011 -0500

Update WITH documentation to capitalize acronym

Common Table Expressions, being a 'proper' name and having an
acronym associated with them, really should be capitalized.  This
patch makes that change in the WITH documentation.

commit 9e4565cc97b81fd6b3f96d8e346bcb7ee6e8878e
Author: Stephen Frost 
Date:   Wed Jan 19 20:54:47 2011 -0500

Add CTE as an acryonym and clarify WITH docs

This patch adds CTE to the list of acronyms and then updates the
WITH documentation to note that WITH queries are also known as
CTEs.

Patch by Peter Geoghegan

Thanks,

Stephen
*** a/doc/src/sgml/acronyms.sgml
--- b/doc/src/sgml/acronyms.sgml
***
*** 99,104 
--- 99,113 
 
  
 
+ CTE
+ 
+  
+   Common Table Expression
+  
+ 
+
+ 
+
  CVE
  
   
*** a/doc/src/sgml/queries.sgml
--- b/doc/src/sgml/queries.sgml
***
*** 1525,1531  SELECT select_list FROM table_expression
  
  
   
!   WITH Queries
  

 WITH
--- 1525,1531 
  
  
   
!   WITH Queries (Common Table Expressions)
  

 WITH
***
*** 1539,1545  SELECT select_list FROM table_expression
  

 WITH provides a way to write subqueries for use in a larger
!query.  The subqueries can be thought of as defining
 temporary tables that exist just for this query.  One use of this feature
 is to break down complicated queries into simpler parts.  An example is:
  
--- 1539,1546 
  

 WITH provides a way to write subqueries for use in a larger
!query.  The subqueries, which are often referred to as Common Table
!Expressions or CTEs, can be thought of as defining
 temporary tables that exist just for this query.  One use of this feature
 is to break down complicated queries into simpler parts.  An example is:
  


signature.asc
Description: Digital signature


Re: [HACKERS] Use of O_DIRECT only for open_* sync options

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 1:53 PM, Bruce Momjian  wrote:
> Is there a reason we only use O_DIRECT with open_* sync options?
> xlogdefs.h says:
>
> /*
>  *  Because O_DIRECT bypasses the kernel buffers, and because we never
>  *  read those buffers except during crash recovery, it is a win to use
>  *  it in all cases where we sync on each write().  We could allow O_DIRECT
>  *  with fsync(), but because skipping the kernel buffer forces writes out
>  *  quickly, it seems best just to use it for O_SYNC.  It is hard to imagine
>  *  how fsync() could be a win for O_DIRECT compared to O_SYNC and O_DIRECT.
>  *  Also, O_DIRECT is never enough to force data to the drives, it merely
>  *  tries to bypass the kernel cache, so we still need O_SYNC or fsync().
>  */
>
> This seems wrong because fsync() can win if there are two writes before
> the sync call.

Well, the comment does say "...in all cases where we sync on each
write()".  But that's certainly not true of WAL, so I dunno.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: Add \dL to show languages

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 5:47 PM, Andreas Karlsson  wrote:
> The patch looks alright now so I will mark it as ready for committer
> now.

This patch doesn't seem terribly consistent to me - we show the name
of the call handler and the name of the validator, but for the inline
handler we just indicate whether there is one or not.  That seems like
something that we should make consistent, and my vote is to show the
name in all cases.

Also, I'm wondering whether the System Language column be renamed to
Internal Language, for consistency with the terminology used here:

http://www.postgresql.org/docs/current/static/catalog-pg-language.html

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-19 Thread Fujii Masao
On Thu, Jan 20, 2011 at 10:53 AM, Tom Lane  wrote:
>> I'm not sure why that's the right solution. Why do you think that we should
>> not create the tablespace under the $PGDATA directory? I'm not surprised
>> that people mounts the filesystem on $PGDATA/mnt and creates the
>> tablespace on it.
>
> No?  Usually, having a mount point in a non-root-owned directory is
> considered a Bad Thing.

Hmm.. but ISTM we can have a root-owned mount point in $PGDATA
and create a tablespace there.

$ su -
# mkdir $PGDATA/mnt
# mount -t tmpfs tmpfs $PGDATA/mnt
# exit
$ mkdir $PGDATA/mnt/tblspcdir
$ psql
=# CREATE TABLESPACE tblspc LOCATION '$PGDATA/mnt/tblspcdir';
CREATE TABLESPACE

Am I missing something?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Nathan Boley
>
> Not true. You're missing the goal of this effort - to get ndistinct
> estimate for combination of multiple columns. Which is usually
> pathological in case of dependent columns.



> Again, don't think about a single column (although even in that case
> there are known fail cases). Think about multiple columns and the number
> of distinct combinations. With attribute value independence assumption,
> this is usually significantly underestimated. And that's a problem as it
> often leads to an index scan instead of sequential scan (or something
> like that).

My point was that, in the case of single columns, we've done pretty
well despite the poor ndistinct estimates. I suspect the same will be
true in the case of multiple columns; good heuristics will trump
minimax estimators.

>> ( as I've advocated for before ) this means parameterizing the
>> distribution of values ( ie, find that the values are roughly uniform
>> ), maybe this means estimating the error of our statistics and using
>> the most robust rather than the best plan, or maybe it means something
>> totally different. But: ( and the literature seems to support me )
>
> Which literature? I've read an awful lot of papers on this topic lately,
> and I don't remember anything like that. If there's an interesting
> paper, I'd like to read it. Yes, all the papers state estimating a
> ndistinct is a particularly tricky task, but that's somehow expected?

It is definitely expected - non-parametric minimax results are always
*very* weak. The papers that you've cited seem to confirm this;
bounding ndistinct estimation error is tricky in the general case (
even with very simple loss functions, which we do not have ). The same
is true for other non-parametric methods. Think about kernel density
estimation: how many data points do you need to estimate the density
of a normal distribution well? What about if you *know* that the data
is normal ( or even that it comes from a big family? ).

> No, I'm not trying to do this just to improve the ndistinct estimate.
> The goal is to improve the cardinality estimate in case of dependent
> columns, and one of the papers depends on good ndistinct estimates.
>
> And actually it does not depend on ndistinct for the columns only, it
> depends on ndistinct estimates for the combination of columns. So
> improving the ndistinct estimates for columns is just a necessary first
> step (and only if it works reasonably well, we can do the next step).

I think that any approach which depends on precise estimates of
ndistinct is not practical.

I am very happy that you've spent so much time on this, and I'm sorry
if my previous email came off as combative. My point was only that
simple heuristics have served us well in the past and, before we go to
the effort of new, complicated schemes, we should see how well similar
heuristics work in the multiple column case. I am worried that if the
initial plan is too complicated then nothing will happen and, even if
something does happen, it will be tough to get it committed ( check
the archives for cross column stat threads - there are a lot ).

Best,
Nathan Boley

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Dan Ports
Kevin's suggestion seems eminently reasonable to me and probably the
best approach one can do for SSI and hot standby. Pulling it off in
time for 9.1 would be a stretch; 9.2 seems quite doable.

It's worth noting that one way or another, the semantics of
SERIALIZABLE transactions on hot standby replicas could be surprising
to some. There's no getting around this; serializability in distributed
systems is just a hard problem in general. Either we go with Kevin's
suggestion of treating SERIALIZABLE transactions as DEFERRABLE (whether
now or for 9.2), causing them to have to use an older snapshot or block
until an acceptable snapshot becomes available; or we require them to
be downgraded to REPEATABLE READ either implicitly or explicitly.

Now, neither of these is as alarming as they might sound, given that
replication lag is a fact of life for hot standby systems and
REPEATABLE READ is exactly the same as the current (9.0) SERIALIZABLE
behavior. But it's definitely something that should be addressed in
documentation.

Dan

-- 
Dan R. K. Ports  MIT CSAILhttp://drkp.net/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 8:57 PM, Noah Misch  wrote:
> I think Simon was referring to the proof-of-concept sketch I had included with
> my review.

I think it's a bit late to be turning proofs-of-concept into code at
this point, no matter who came up with them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 8:34 PM, Kevin Grittner
 wrote:
> I agree it's pretty late in the cycle, but I'm going through all the
> loose ends and found this one -- which has been hanging out on the
> Wiki page as an R&D item for over a full year without discussion.
> :-(  If we provide the snapshots (which we can safely and easily
> do), can someone else who knows what they're doing with WAL and HS
> get the rest of it safely into the release?  That seems to me to be
> the only way it can still happen for 9.1.

I think it's way too late to be embarking on what will probably turn
out to be a reasonably complex and possibly controversial new
development arc.  I don't have a strong position on what we should do
instead, but let's NOT do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: WIP: plpgsql - foreach in

2011-01-19 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Jan 19, 2011 at 6:04 PM, Stephen Frost  wrote:
> > I'm going to mark this returned to author with feedback.
> 
> That implies you don't think it should be considered further for this
> CommitFest.  Perhaps you mean Waiting on Author?

I did, actually, and that's what I actually marked it as in the CF.
Sorry for any confusion.  When I went to mark it in CF, I realized my
mistake.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Noah Misch
On Wed, Jan 19, 2011 at 08:55:22PM -0500, Robert Haas wrote:
> On Wed, Jan 19, 2011 at 7:57 PM, Simon Riggs  wrote:
> > On Wed, 2011-01-19 at 17:46 -0500, Noah Misch wrote:
> >
> >> I'll go ahead and mark the patch Returned with Feedback.
> >
> > My understanding of the meaning of that is polite rejection. If you do
> > that there is no further author comment and we move to July 2011. That
> > then also rejects your own patch with what you say is an alternative
> > implementation...
> >
> > Is that what you wish? That isn't what I wish, either way. I suggest you
> > mark it Waiting on Author, so we can discuss it further.
> 
> Simon,
> 
> I have no idea what you're talking about here.  It is entirely fitting
> and appropriate to reject a patch the guts of which have not been
> written four days into the final CommitFest.  Doing so does not
> somehow reject Noah's patches, which stand or fall on their own
> merits.

I think Simon was referring to the proof-of-concept sketch I had included with
my review.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: WIP: plpgsql - foreach in

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 6:04 PM, Stephen Frost  wrote:
> I'm going to mark this returned to author with feedback.

That implies you don't think it should be considered further for this
CommitFest.  Perhaps you mean Waiting on Author?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 7:57 PM, Simon Riggs  wrote:
> On Wed, 2011-01-19 at 17:46 -0500, Noah Misch wrote:
>
>> I'll go ahead and mark the patch Returned with Feedback.
>
> My understanding of the meaning of that is polite rejection. If you do
> that there is no further author comment and we move to July 2011. That
> then also rejects your own patch with what you say is an alternative
> implementation...
>
> Is that what you wish? That isn't what I wish, either way. I suggest you
> mark it Waiting on Author, so we can discuss it further.

Simon,

I have no idea what you're talking about here.  It is entirely fitting
and appropriate to reject a patch the guts of which have not been
written four days into the final CommitFest.  Doing so does not
somehow reject Noah's patches, which stand or fall on their own
merits.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-19 Thread Tom Lane
Fujii Masao  writes:
> On Thu, Jan 20, 2011 at 2:21 AM, Tom Lane  wrote:
>> Fujii Masao  writes:
>>> What I'm worried about is the case where a tablespace is created
>>> under the $PGDATA directory.

>> What would be the sense of that?  If you're concerned about whether the
>> code handles it correctly, maybe the right solution is to add code to
>> CREATE TABLESPACE to disallow it.

> I'm not sure why that's the right solution. Why do you think that we should
> not create the tablespace under the $PGDATA directory? I'm not surprised
> that people mount the filesystem on $PGDATA/mnt and create the
> tablespace on it.

No?  Usually, having a mount point in a non-root-owned directory is
considered a Bad Thing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Noah Misch
On Thu, Jan 20, 2011 at 12:57:23AM +, Simon Riggs wrote:
> On Wed, 2011-01-19 at 17:46 -0500, Noah Misch wrote:
> 
> > I'll go ahead and mark the patch Returned with Feedback.
> 
> My understanding of the meaning of that is polite rejection. If you do
> that there is no further author comment and we move to July 2011. That
> then also rejects your own patch with what you say is an alternative
> implementation...
> 
> Is that what you wish? That isn't what I wish, either way. I suggest you
> mark it Waiting on Author, so we can discuss it further.

Before I consider my wishes too carefully, I've moved the patch back to Waiting
on Author, for the reason that it seems wrong to force it elsewhere today as
long as the author (you) would like it there.  Not that I have some kind of
authority over patch disposition in any case.

I had put it straight to RWF because it seemed clearly not intended to be
applied.  No political statement or harm intended, and maybe that determination
was not even correct.

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-19 Thread Fujii Masao
On Thu, Jan 20, 2011 at 2:21 AM, Tom Lane  wrote:
> Fujii Masao  writes:
>> What I'm worried about is the case where a tablespace is created
>> under the $PGDATA directory.
>
> What would be the sense of that?  If you're concerned about whether the
> code handles it correctly, maybe the right solution is to add code to
> CREATE TABLESPACE to disallow it.

I'm not sure why that's the right solution. Why do you think that we should
not create the tablespace under the $PGDATA directory? I'm not surprised
that people mount the filesystem on $PGDATA/mnt and create the
tablespace on it.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Nathan Boley
>> If you think about it, it's a bit ridiculous to look at the whole table
>> *just* to "estimate" ndistinct - if we go that far why dont we just
>> store the full distribution and be done with it?
>
> - the best you could do is to average the
> individual probabilities which gives ... well, 1/ndistinct.
>

That is certainly *not* the best you could do in every case. The mean
is only the best estimate in L2, which is definitely not the case
here.

Consider a table with 10K values, 9,990 of which are 1 and the rest of
which are 2, 3, ..., 10, versus a table that has the same 10 distinct
values evenly distributed. For a simple equality query, in the first
case, a bitmap scan might be best. In the second case, a sequential
scan would always be best.
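
To put numbers on that (taking roughly 10K rows, 9,990 + 9 as above): in
the skewed table $P(v{=}1) = 9990/9999 \approx 0.999$ while
$P(v{=}k) = 1/9999 \approx 0.0001$ for $k = 2, \dots, 10$; in the uniform
table every value has $P(v{=}k) = 0.1$. An estimator that only knows
$d = 10$ assigns $1/d = 0.1$ to every one of these cases.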

This is precisely the point I was trying to make in my email: the loss
function is very complicated. Improving the ndistinct estimator could
significantly improve the estimates of ndistinct ( in the quadratic
loss sense ) while only marginally improving the plans.

-Nathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Kevin Grittner
Simon Riggs  wrote:
 
> In this release? Maybe? In later releases? Yes.
> 
> If it blocks your excellent contribution in this release, then
> from me, "no". If you can achieve this in this release, yes.
> However, if this is difficult or complex, then I would rather say
> "not yet" quickly now, than spend months working out the
> weirdnesses and possibly still get them wrong.
 
We already have a mechanism for generating a good snapshot, the hard
part (for me at least) would be to get that snapshot over to the hot
standby and have it use the latest one on a request for a
serializable transaction.  I have no experience with WAL file
output, and don't know what it would take for hot standby to use it
as I describe.
 
I agree it's pretty late in the cycle, but I'm going through all the
loose ends and found this one -- which has been hanging out on the
Wiki page as an R&D item for over a full year without discussion. 
:-(  If we provide the snapshots (which we can safely and easily
do), can someone else who knows what they're doing with WAL and HS
get the rest of it safely into the release?  That seems to me to be
the only way it can still happen for 9.1.
 
If not, I agree this can be 9.2 material.  We just have to decide
how to document it and answer the questions near the bottom of my
initial post of the thread.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup for streaming base backups

2011-01-19 Thread Fujii Masao
On Wed, Jan 19, 2011 at 9:37 PM, Magnus Hagander  wrote:
>> The "fast or slow" seems to lead users to always choose "fast". Instead,
>> what about "fast or smooth", "fast or spread" or "immediate or delayed"?
>
> Hmm. "fast or spread" seems reasonable to me. And I want to use "fast"
> for the fast version, because that's what we call it when you use
> pg_start_backup(). I'll go change it to spread for now  - it's the one
> I can find used in the docs.

Fair enough.

>> What if pg_basebackup receives a signal while doing a backup?
>> For example, users might do Ctrl-C to cancel the long-running backup.
>> We should define a signal handler and send a Terminate message
>> to the server to cancel the backup?
>
> Nah, we'll just disconnect and it'll deal with things that way. Just
> like we do with e.g. pg_dump. I don't see the need to complicate it
> with that.

Okay.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI and Hot Standby

2011-01-19 Thread Simon Riggs
On Wed, 2011-01-19 at 19:05 -0600, Kevin Grittner wrote:

> Here's an issue for feedback from the community -- do we want to
> support truly serializable transactions on hot standby machines?

In this release? Maybe? In later releases? Yes.

If it blocks your excellent contribution in this release, then from me,
"no". If you can achieve this in this release, yes. However, if this is
difficult or complex, then I would rather say "not yet" quickly now,
than spend months working out the weirdnesses and possibly still get
them wrong.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SSI and Hot Standby

2011-01-19 Thread Kevin Grittner
Here's an issue for feedback from the community -- do we want to
support truly serializable transactions on hot standby machines?
 
The best way Dan and I have been able to think to do this is to
build on the SERIALIZABLE READ ONLY DEFERRABLE behavior.  We are
able to obtain a snapshot and then check to see if it is at a place
in the transaction processing where it would be guaranteed to be
serializable without participating in predicate locking, rw-conflict
detection, etc.  If it's not, we block until a READ WRITE
transaction completes, and then check again.  Repeat.  We may reach
a point where we determine that the snapshot can't work, and we get
a new one and start over.  Due to the somewhat complex rules for
this, you are likely to see a safe snapshot fairly quickly even in a
mix which always has short-lived READ WRITE transactions running,
although a single long-running READ WRITE transaction can block
things until it completes.
 
The idea is that whenever we see a valid snapshot which would yield
a truly serializable view of the data for a READ ONLY transaction,
we add a WAL record with that snapshot information.  Of course, we
might want some limit of how often they are sent, to avoid WAL
bloat.  A hot standby could just keep the most recently received of
these and use it when a SERIALIZABLE transaction is requested. 
Perhaps DEFERRABLE in this context could mean that it waits for the
*next* one and uses it, to assure "freshness".
 
Actually, we could try to get tricky to avoid sending a complete
snapshot by having two WAL messages with no payload -- one would
mean "the snapshot you would get now is being tested for
serializability".  If it failed reach that state we would send
another when we started working a new snapshot.  The other type of
message would mean "the snapshot you built when we last told you we
were starting to test one is good."  I *think* that can work, and it
may require less WAL space.
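
Very roughly, the standby side of those two messages might look like
this -- every name below is invented for illustration; it's a sketch of
the idea, not working backend code:

    typedef struct SnapshotData *Snapshot;      /* stand-in type */

    extern Snapshot build_local_snapshot(void); /* hypothetical helper */

    #define XLOG_SNAP_TESTING   0x10    /* "now testing a snapshot" */
    #define XLOG_SNAP_SAFE      0x20    /* "that snapshot proved safe" */

    static enum { SNAP_NONE, SNAP_PENDING } snap_state = SNAP_NONE;
    static Snapshot candidate_snap;     /* built when "testing" arrives */
    static Snapshot latest_safe_snap;   /* given to new SERIALIZABLE xacts */

    static void
    redo_safe_snapshot_record(unsigned char info)
    {
        switch (info)
        {
            case XLOG_SNAP_TESTING:
                /* Build the snapshot this standby would hand out right
                 * now; it corresponds to the one the master is testing. */
                candidate_snap = build_local_snapshot();
                snap_state = SNAP_PENDING;
                break;

            case XLOG_SNAP_SAFE:
                /* The snapshot built at the "testing" point is safe;
                 * new SERIALIZABLE transactions can use it. */
                if (snap_state == SNAP_PENDING)
                    latest_safe_snap = candidate_snap;
                snap_state = SNAP_NONE;
                break;
        }
    }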
 
If we don't do something like this, do we just provide REPEATABLE
READ on the standby as the strictest level of transaction isolation?
If so, do we generate an error on a request for SERIALIZABLE, warn
and provide degraded behavior, or just quietly give them REPEATABLE
READ behavior?
 
Thoughts?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Simon Riggs
On Wed, 2011-01-19 at 17:46 -0500, Noah Misch wrote:

> I'll go ahead and mark the patch Returned with Feedback.

My understanding of the meaning of that is polite rejection. If you do
that there is no further author comment and we move to July 2011. That
then also rejects your own patch with what you say is an alternative
implementation...

Is that what you wish? That isn't what I wish, either way. I suggest you
mark it Waiting on Author, so we can discuss it further.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/python refactoring

2011-01-19 Thread Jan Urbański
On 19/01/11 10:57, Jan Urbański wrote:
> On 18/01/11 23:22, Peter Eisentraut wrote:
>> #2: It looks like this loses some information/formatting in the error
>> message.  Should we keep the pointing arrow there?

>>  CONTEXT:  PL/Python function "sql_syntax_error"
>> -ERROR:  syntax error at or near "syntax"
>> -LINE 1: syntax error
>> -^
>> -QUERY:  syntax error
>> +ERROR:  PL/Python: plpy.SPIError: syntax error at or near "syntax"
>>  CONTEXT:  PL/Python function "sql_syntax_error"

> Yes, the message is less informative, because the error is reported by
> PL/Python, by wrapping the SPI message. I guess I could try to extract
> more info from the caught ErrorData and put it in the new error that
> PL/Python throws.

All right, I found a way to shoehorn the extra information into
SPIException, I'll post a new patch series with what's left of the
general refactoring patch soon.

Shortly after, I'll post updated patches for doing SPI in subxacts,
explicit subxact starting and generated SPI exceptions, as they will
surely be broken by these changes.

Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Florian Pflug
On Jan19, 2011, at 23:44 , Nathan Boley wrote:
> If you think about it, it's a bit ridiculous to look at the whole table
> *just* to "estimate" ndistinct - if we go that far why dont we just
> store the full distribution and be done with it?

The crucial point that you're missing here is that ndistinct provides an
estimate even if you *don't* have a specific value to search for at hand.
This is way more common than you may think, it e.g. happens every you time
PREPARE are statement with parameters. Even knowing the full distribution
has no advantage in this case - the best you could do is to average the
individual probabilities which gives ... well, 1/ndistinct.
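
(To spell the arithmetic out: if the $d$ distinct values have relative
frequencies $f_1, \dots, f_d$ with $\sum_{i=1}^{d} f_i = 1$, then averaging
the equality selectivities over the distinct values gives

    \frac{1}{d} \sum_{i=1}^{d} f_i = \frac{1}{d} = \frac{1}{\text{ndistinct}}

no matter how skewed the individual $f_i$ are.)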

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Tomas Vondra
Dne 19.1.2011 23:44, Nathan Boley napsal(a):
> 1) The distribution of values in a table is rarely pathological, and
> usually follows one of several common patterns. ( IOW, we have good
> heuristics )

Not true. You're missing the goal of this effort - to get ndistinct
estimate for combination of multiple columns. Which is usually
pathological in case of dependent columns. Which is exactly the case
when the user will explicitly enable this feature to get better
estimates (and thus better plans).

> 2) The choice of plan is fairly robust to mis-estimation of ndistinct,
> because there are only a few things the executer can choose. It
> doesn't usually matter if a value composes 5% or 20%  ( or 99% ) of
> the table, we still usually need to scan the entire table.

Again, don't think about a single column (although even in that case
there are known fail cases). Think about multiple columns and the number
of distinct combinations. With attribute value independence assumption,
this is usually significantly underestimated. And that's a problem as it
often leads to an index scan instead of sequential scan (or something
like that).

> Again, in my *very* humble opinion, If the ultimate goal is to improve
> the planner, we should try to cut down on the number of cases in which
> a poor estimate of ndistinct gives a really bad plan instead of
> chasing after marginal gains from a better ndistinct estimator. Maybe

What is a marginal gain? The ultimate goal is to build cross-column
stats, which in case of dependent columns usually results is poor plans.
How is fixing that a marginal gain?

Yes, it may turn out it's not worth it, but it's a bit early to say that
without an implementation and some hard data.

> ( as I've advocated for before ) this means parameterizing the
> distribution of values ( ie, find that the values are roughly uniform
> ), maybe this means estimating the error of our statistics and using
> the most robust rather than the best plan, or maybe it means something
> totally different. But: ( and the literature seems to support me )

Which literature? I've read an awful lot of papers on this topic lately,
and I don't remember anything like that. If there's an interesting
paper, I'd like to read it. Yes, all the papers state estimating a
ndistinct is a particularly tricky task, but that's somehow expected?

I've even checked how other databases do this - e.g. Oracle does it very
similarly to what I proposed (the user has to enable the feature, it has
to be rebuilt from time to time, etc.). I'm not saying we should do
everything like Oracle, but they are not clueless idiots.

> pounding away at the ndistinct estimator seems like a dead end. If you
> think about it, it's a bit ridiculous to look at the whole table
> *just* to "estimate" ndistinct - if we go that far why dont we just
> store the full distribution and be done with it?

No, I'm not trying to do this just to improve the ndistinct estimate.
The goal is to improve the cardinality estimate in case of dependent
columns, and one of the papers depends on good ndistinct estimates.

And actually it does not depend on ndistinct for the columns only, it
depends on ndistinct estimates for the combination of columns. So
improving the ndistinct estimates for columns is just a necessary first
step (and only if it works reasonably well, we can do the next step).

regards
Tomas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] REVIEW: WIP: plpgsql - foreach in

2011-01-19 Thread Stephen Frost
Greetings,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
> attached patch contains a implementation of iteration over a array:

I've gone through this patch and, in general, it looks pretty reasonable
to me.  There's a number of places where I think additional comments
would be good and maybe some variable name improvments.  Also, my
changes should be reviewed to make sure they make sense.

Attached is a patch against master which includes my changes, and a
patch against Pavel's patch, so he can more easily see my changes and
include them if he'd like.

I'm going to mark this returned to author with feedback.

commit 30295015739930e68c33b29da4f7ef535bc293ea
Author: Stephen Frost 
Date:   Wed Jan 19 17:58:24 2011 -0500

Clean up foreach-in-array PL/PgSQL code/comments

Minor clean-up of the PL/PgSQL foreach-in-array patch, includes
some white-space cleanup, grammar fixes, additional errhint where
it makes sense, etc.

Also added a number of 'XXX' comments asking for clarification
and additional comments on what's happening in the code.

commit f1a02fe3a8fa84217dae32d5ba74e9764c77431c
Author: Stephen Frost 
Date:   Wed Jan 19 15:11:53 2011 -0500

PL/PgSQL - Add iterate-over-array support

This patch adds support for iterating over an array in PL/PgSQL.

Patch Author: Pavel Stehule

Thanks,

Stephen
*** a/doc/src/sgml/plpgsql.sgml
--- b/doc/src/sgml/plpgsql.sgml
***
*** 2238,2243  END LOOP  label ;
--- 2238,2268 
  
 
  
+
+ Looping Through Array
+ 
+  <

Re: [HACKERS] psql: Add \dL to show languages

2011-01-19 Thread Andreas Karlsson
On Tue, 2011-01-18 at 19:34 -0500, Josh Kupershmidt wrote:
> Got that now too. I lost my ~/.emacs file recently, which is mostly
> why I'm making whitespace mistakes. Rebuilding slowly though;
> (setq-default show-trailing-whitespace t) is what I needed.

Aha, I see.

> I left the "Call Handler" and "Validator" columns in the verbose
> output since I haven't heard otherwise.
> 
> Josh

I do not really care either way. The columns are not terribly useful but
not pointless either.

The patch looks alright now so I will mark it as ready for committer
now.

Andreas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE ... REPLACE WITH

2011-01-19 Thread Noah Misch
Hi Simon,

I'm reviewing this patch for CommitFest 2011-01.

On Sat, Jan 15, 2011 at 10:02:03PM +, Simon Riggs wrote:
> On Tue, 2010-12-14 at 19:48 +, Simon Riggs wrote:
> > REPLACE TABLE ying WITH yang
> Patch. Needs work.

First, I'd like to note that the thread for this patch had *four* "me-too"
responses to the use case.  That's extremely unusual; the subject is definitely
compelling to people.  It addresses the bad behavior of natural attempts to
atomically swap two tables in the namespace:

psql -c "CREATE TABLE t AS VALUES ('old'); CREATE TABLE new_t AS VALUES 
('new')"
psql -c 'SELECT pg_sleep(2) FROM t' & # block the ALTER or DROP briefly
sleep 1   # 
give prev time to take AccessShareLock

# Do it this way, and the next SELECT gets data from the old table.
#psql -c 'ALTER TABLE t RENAME TO old_t; ALTER TABLE new_t RENAME TO t' 
&
# Do it this way, and get: ERROR:  could not open relation with OID 
41380
psql -c 'DROP TABLE t; ALTER TABLE new_t RENAME TO t' &

psql -c 'SELECT * FROM t'   # I get 'old' or an error, 
never 'new'.
psql -c 'DROP TABLE IF EXISTS t, old_t, new_t'

by letting you do this instead:

psql -c "CREATE TABLE t AS VALUES ('old'); CREATE TABLE new_t AS VALUES 
('new')"
psql -c 'SELECT pg_sleep(2) FROM t' & # block the ALTER or DROP briefly
sleep 1   # 
give prev time to take AccessShareLock

psql -c 'EXCHANGE TABLE new_t TO t &

psql -c 'SELECT * FROM t'   # I get 'new', finally!
psql -c 'DROP TABLE IF EXISTS t, new_t'

I find Heikki's (4d07c6ec.2030...@enterprisedb.com) suggestion from the thread
interesting: can we just make the first example work?  Even granting that the
second syntax may be a useful addition, the existing behavior of the first
example is surely worthless, even actively harmful.  I tossed together a
proof-of-concept patch, attached, that makes the first example DTRT.  Do you see
any value in going down that road?

As for your patch itself:

> + /*
> +  * Exchange table contents
> +  *
> +  * Swap heaps, toast tables, toast indexes
> +  * all forks
> +  * all indexes

For indexes, would you basically swap yin<->yang in pg_index.indrelid, update
pg_class.relnamespace as needed, and check for naming conflicts (when yin and
yang differ in schema)?  Is there more?

> +  *
> +  * Checks:
> +  * * table definitions must match

Is there a particular reason to require this, or is it just a simplification to
avoid updating things to match?

> +  * * constraints must match

Wouldn't CHECK constraints have no need to match?

> +  * * indexes need not match
> +  * * outbound FKs don't need to match
> +  * * inbound FKs will be set to not validated

We would need to ensure that inbound FOREIGN KEY constraints still have indexes
suitable to implement them.  The easiest thing there might be to internally drop
and recreate the constraint, so we get all that verification.

> +  *
> +  * No Trigger behaviour
> +  *
> +  * How is it WAL logged? By locks and underlying catalog updates
> +  */

I see that the meat of the patch is yet to be written, so this ended up as more
of a design review based on your in-patch comments.  Hopefully it's of some
value.  I'll go ahead and mark the patch Returned with Feedback.

Thanks,
nm
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 970,995  relation_openrv(const RangeVar *relation, LOCKMODE lockmode)
  {
Oid relOid;
  
!   /*
!* Check for shared-cache-inval messages before trying to open the
!* relation.  This is needed to cover the case where the name 
identifies a
!* rel that has been dropped and recreated since the start of our
!* transaction: if we don't flush the old syscache entry then we'll 
latch
!* onto that entry and suffer an error when we do RelationIdGetRelation.
!* Note that relation_open does not need to do this, since a relation's
!* OID never changes.
!*
!* We skip this if asked for NoLock, on the assumption that the caller 
has
!* already ensured some appropriate lock is held.
!*/
!   if (lockmode != NoLock)
!   AcceptInvalidationMessages();
! 
!   /* Look up the appropriate relation using namespace search */
!   relOid = RangeVarGetRelid(relation, false);
  
/* Let relation_open do the rest */
!   return relation_open(relOid, lockmode);
  }
  
  /* 
--- 970,980 
  {
Oid relOid;
  
!   /* Look up and lock the appropriate relation using namespace search */
!   relOid = RangeVarLockRelid(relation, lockmode, false);
  
/* Let rel

Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Nathan Boley
On Wed, Jan 19, 2011 at 2:13 PM, Tomas Vondra  wrote:
> Dne 19.1.2011 20:25, Robert Haas napsal(a):
>> On Tue, Jan 18, 2011 at 5:16 PM, Tomas Vondra  wrote:
>>> Yes, I was aware of this problem (amount of memory consumed with lots of
>>> updates), and I kind of hoped someone will come up with a reasonable
>>> solution.
>>
>> As far as I can see, periodically sampling some or all of the table is
>> really the only thing that has a chance of working OK.  You could try
>> to track changes only when they're not too big, but I think that's
>> still going to have nasty performance consequences.
>
> Yes, sampling all the table is the only way to get reliable ndistinct
> estimates.

IMHO continuing to focus on worst case results is a dead end. Yes, for
some distributions, ndistinct is very hard to estimate well. However,
let us not forget that the current ndistinct estimator is very bad but
the postgres planner is, on the whole, very good.  As far as I can see
this is due to two facts:

1) The distribution of values in a table is rarely pathological, and
usually follows one of several common patterns. ( IOW, we have good
heuristics )

2) The choice of plan is fairly robust to mis-estimation of ndistinct,
because there are only a few things the executer can choose. It
doesn't usually matter if a value composes 5% or 20%  ( or 99% ) of
the table, we still usually need to scan the entire table.

Again, in my *very* humble opinion, If the ultimate goal is to improve
the planner, we should try to cut down on the number of cases in which
a poor estimate of ndistinct gives a really bad plan instead of
chasing after marginal gains from a better ndistinct estimator. Maybe
( as I've advocated for before ) this means parameterizing the
distribution of values ( ie, find that the values are roughly uniform
), maybe this means estimating the error of our statistics and using
the most robust rather than the best plan, or maybe it means something
totally different. But: ( and the literature seems to support me )
pounding away at the ndistinct estimator seems like a dead end. If you
think about it, it's a bit ridiculous to look at the whole table
*just* to "estimate" ndistinct - if we go that far why dont we just
store the full distribution and be done with it?

Best,
Nathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jan 19, 2011 at 3:10 PM, Tom Lane  wrote:
>> Well, on my machine pg_description is about 210K (per database) as of
>> HEAD.  90% of its contents are pg_proc entries, though I have no good
>> fix on how much of that is for internal-use-only functions.  A very
>> rough estimate from counting pg_proc and pg_operator entries suggests
>> that the answer might be "about a third".  So if we do what was said in
>> the above-cited thread, ie move existing comments to pg_operator and
>> add boilerplate ones to pg_proc, we probably would pay <100K for it.

> I guess that's not enormously expensive, but it's not insignificant
> either.  On my machine, a template database is 5.5MB.

The implementation I was thinking about was to have initdb run a SQL
command that would do something like

INSERT INTO pg_description
  SELECT oprcode, 'pg_proc'::regclass, 0, 'implementation of ' || oprname
  FROM pg_operator
  WHERE NOT EXISTS (SELECT 1 FROM pg_description d
                    WHERE d.objoid = oprcode
                      AND d.classoid = 'pg_proc'::regclass)

So it would be minimal work to either provide or omit the boilerplate
descriptions.  I think we can postpone the decision till we have a
closer fix on the number of entries we're talking about.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Tomas Vondra
Dne 19.1.2011 20:46, Tom Lane napsal(a):
> Robert Haas  writes:
>> ... I guess I'm just saying I'd think really, really hard
>> before abandoning the idea of periodic sampling.  You have to get an
>> awful lot of benefit out of those cross-column stats to make it worth
>> paying a run-time cost for them.
> 
> I've been trying to not discourage Tomas from blue-sky speculation,
> but I have to say I agree with Robert that the chance of any usable
> result from this approach is very very small.  Feel free to do some
> experimentation to prove us wrong --- but don't put a lot of effort
> into it before that.
> 
>   regards, tom lane

Discourage? Not really - I never expected this to be a simple thing
to implement. And yes, it might turn out to be a dead end eventually.

OTOH the approach "it seems really expensive so let's not try to build
it" is a certain dead end, so I'm not going to surrender due to such
speculations (although those are valid concerns and I'll have to address
them in the future).

regards
Tomas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Tomas Vondra
Dne 19.1.2011 20:25, Robert Haas napsal(a):
> On Tue, Jan 18, 2011 at 5:16 PM, Tomas Vondra  wrote:
>> Yes, I was aware of this problem (amount of memory consumed with lots of
>> updates), and I kind of hoped someone will come up with a reasonable
>> solution.
> 
> As far as I can see, periodically sampling some or all of the table is
> really the only thing that has a chance of working OK.  You could try
> to track changes only when they're not too big, but I think that's
> still going to have nasty performance consequences.

Yes, sampling all the table is the only way to get reliable ndistinct
estimates. I'm not sure what you mean by 'tracking changes' - as I've
mentioned in the previous post, this might be solved by updating a local
copy, which requires constant space (a few kB, see the previous post).
Is that acceptable? I don't know; that's up to the user if he wants to
pay this price.

>> So the algorithm would be something like this:
>>
>> 1. create copy for all distinct estimators influenced by the INSERT
>> 2. update the local copy
>> 3. commit and merge the local copies back into the original estimator
> 
> Well, maybe.  But DELETEs seem like a problem.  And it's still adding
> foreground work to every query, which I just have a hard time
> believing is ever going to work out

Yes, deletes are difficult to handle. My idea was to compute something
like ((tuples changed + tuples deleted) / tuples total), and indicate
that a rebuild of the estimator is needed if this coefficient changes
too much - e.g. log a message or something like that.
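
For what it's worth, that check is cheap -- something like this (all
names invented; just a sketch of the coefficient described above):

    #include <stdbool.h>

    typedef struct
    {
        double  tuples_total;
        double  tuples_changed;
        double  tuples_deleted;
    } EstimatorStats;

    static bool
    estimator_needs_rebuild(const EstimatorStats *s, double threshold)
    {
        double  churn;

        if (s->tuples_total <= 0)
            return true;        /* nothing tracked yet: rebuild */

        churn = (s->tuples_changed + s->tuples_deleted) / s->tuples_total;
        return churn > threshold;   /* e.g. warn when churn > 0.2 */
    }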

>> Regarding the crash scenario - if the commit fails, just throw away the
>> local estimator copy, it's not needed. I'm not sure how to take care of
>> the case when commit succeeds and the write of the merged estimator
>> fails, but I think it might be possible to write the estimator to xlog
>> or something like that. So it would be replayed during recovery etc. Or
>> is it a stupid idea?
> 
> It's not stupid, in the sense that that is what you'd need to do if
> you want to avoid ever having to rescan the table.  But it is another
> thing that I think is going to be way too expensive.

Way too expensive? All you need to put into the logfile is a copy of the
estimator, which is a few kBs. How is that 'way too expensive'? Sure, it
might be expensive when the user does a lot of small changes in separate
transactions, that's true, but I guess we could minimize the amount of
data written to the xlog by doing a diff of the estimators or something
like that.

>> Yes, as I've mentioned above, the multi-column stats are the reason why
>> I started working on this. And yes, it really should work like this:
>>
>> 1. user realizes there's something wrong as the plans are bad
>> 2. after analysis, the user realizes the reason is an imprecise
>>   estimate due to dependence between columns
>> 3. the user enables cross-column stats on the columns and checks
>>   if it actually solved the issue
>> 4. if the cost outweights the gains, the user drops the stats
>>
>> Does that sound reasonable?
> 
> Yes.  The only caveat I'm trying to insert is that it's hard for me to
> see how the proposed approach could ever be cheap enough to be a win.

IMHO the question is not 'How expensive is that?' but rather 'How
expensive is it compared to the gains?' If the user gets much better
estimates and a then a much better plan, then this may be a completely
acceptable price.

> If we need some kind of more expensive kind of ANALYZE that scans the
> whole table, and it's optional, sure, why not?  But that's a one-time
> hit.  You run it, it sucks, it's over, and now your queries work.
> Odds are good you may never need to touch it again.  Now, if we can
> convince ourselves that multi-column stats are likely to require
> constant updating, then maybe there would be more to recommend the
> stream-based approach, although even then it seems dreadfully complex
> and expensive to me.  But I bet these things are pretty static.  If

No, the multi-column statistics do not require constant updating. There
are cases where a sampling is perfectly fine, although you may need a
bit larger sample. Generally if you can use a multi-dimensional
histogram, you don't need to scan the whole table.

But the multi-dimensional histograms are not applicable to some cases.
Especially to the ZIP code fail case, that was repeatedly discussed. So
in case of discrete data, we need a different approach - and the
solution I've proposed is based on using ndistinct estimates to get the
estimates (actually it's based on one of the papers listed on wiki).

> and expensive to me.  But I bet these things are pretty static.  If
> the city and state tend to imply the zip code when there are 10K rows
> in the table, they probably still will when there are 100K rows in the
> table.  If users with org_id = 1 have a higher karma score on average

OK, how will you measure the "implicativeness"?

We have discussed this in another thread.

Re: [HACKERS] pl/python refactoring

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 2:59 PM, Peter Eisentraut  wrote:
> On ons, 2011-01-19 at 00:52 -0300, Alvaro Herrera wrote:
>> Excerpts from Peter Eisentraut's message of mar ene 18 19:22:50 -0300 2011:
>>
>> > #16: This is probably pointless because pgindent will reformat this.
>>
>> pgindent used to remove useless braces around single-statement blocks,
>> but this behavior was removed years ago because it'd break formatting
>> around PG_TRY blocks.
>
> Good to know.  Committed then.

I cracked up upon reading your commit message.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 4:18 PM, Tom Lane  wrote:
> One idea that I think we discussed was to tie cache entries to the
> memory context they were demanded in, and mark them unused at the next
> context reset/delete.  That way they'd be considered unused at the same
> points where the current implementation would certainly have discarded
> the value.  This isn't perfect (because of pfree) but might be good
> enough.

Yeah, I was thinking that's probably what would have to be done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Tom Lane
"Kevin Grittner"  writes:
> Tom Lane  wrote:
>> If we could solve the refcounting problem I think this would be a
>> very significant win.
 
> Instead of trying to keep a refcount, how about just evicting from
> the buffer as needed based on LRU?

Well, unless you know for certain that an item is no longer used, you
can't evict it.  There are ways you could finesse that --- for instance,
you might try to only flush between statements, when certainly no user
functions are running --- but the problem is that the cache is likely
to contain some large values that you can't adopt such a laissez faire
approach with, or you'll run out of memory.

One idea that I think we discussed was to tie cache entries to the
memory context they were demanded in, and mark them unused at the next
context reset/delete.  That way they'd be considered unused at the same
points where the current implementation would certainly have discarded
the value.  This isn't perfect (because of pfree) but might be good
enough.
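
Very roughly, and purely as a sketch (the reset callback below is
hypothetical --- no such hook exists today):

    typedef struct CachedValue
    {
        Datum   value;     /* the cached detoasted copy */
        bool    in_use;    /* cleared when the demanding context resets */
    } CachedValue;

    /* hypothetical callback, fired at context reset/delete */
    static void
    cache_entry_mark_unused(void *arg)
    {
        CachedValue *cv = (CachedValue *) arg;

        /* no caller can still hold a pointer into this entry now */
        cv->in_use = false;
    }

The cache would register such a callback against the memory context
that was current when the value was demanded.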

regards, tom lane



Re: [HACKERS] Couple document fixes

2011-01-19 Thread Thom Brown
On 19 January 2011 21:10, Tom Lane  wrote:
> Thom Brown  writes:
>> I've attached a couple minor fixes to the docs.  One relating to
>> SECURITY LABEL and the other for pg_class.relpersistence
>
> Applied, thanks.

Cheers Mr Lane.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] Couple document fixes

2011-01-19 Thread Tom Lane
Thom Brown  writes:
> I've attached a couple minor fixes to the docs.  One relating to
> SECURITY LABEL and the other for pg_class.relpersistence

Applied, thanks.

regards, tom lane



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Kevin Grittner
Tom Lane  wrote:
 
> If we could solve the refcounting problem I think this would be a
> very significant win.
 
Instead of trying to keep a refcount, how about just evicting from
the buffer as needed based on LRU?
 
-Kevin



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> I do remember that discussion.  Aside from the problem you mention, it
> also seems that maintaining the hash table and doing lookups into it
> would have some intrinsic cost.

Well, sure, but it's still far cheaper than going out to the toast table
(which will require multiple hashtable lookups, not to mention I/O and
possible lock blocking).  If we could solve the refcounting problem I
think this would be a very significant win.

regards, tom lane



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Tom Lane
Dimitri Fontaine  writes:
> For the GIN indexes, we have 2 methods for building the index and 3
> others to search it to solve the query.  You're proposing that the 2
> former methods would be in the opfamily and the 3 later in the opclass.

Actually the other way around.  An opclass is the subset of an opfamily
that is tightly bound to an index.  The "build" methods have to be
associatable with an index, so they're part of the index's opclass.
The "query" methods could be loose in the opfamily.

> So we would want the planner to know that in the GIN case an index built
> with any opclass of a given opfamily can help answer a query that would
> need any opclass of the opfamily.  Right?

The planner's not the problem here --- what's missing is the rule for
the index AM to look up the right support functions to call at runtime.

The trick is to associate the proper query support methods with any
given query operator (which'd also be loose in the family, probably).
The existing schema for pg_amop and pg_amproc is built on the assumption
that the amoplefttype/amoprighttype are sufficient for making this
association; but that seems to fall down if we would like to allow
contrib modules to add new query operators that coincidentally take the
same input types as an existing opfamily member.

regards, tom lane



Re: [HACKERS] REVIEW: patch: remove redundant code from pl_exec.c

2011-01-19 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I think we should reject this one.

Works for me.

> Using a switch there is a bit problematic since in some cases you want
> to use "break" to exit the loop.  We could replace such breaks by gotos,
> but that would be another strike against the argument that you're making
> things more readable.  I think the switch in exec_stmt_loop is only
> workable because it has no cleanup to do, so it can just "return" in
> places where a loop break would otherwise be needed.  In short: if you
> want to make these all look alike, better to go the other way.

Ah, yeah, good point.  We do use gotos elsewhere for this reason, might
consider revisiting those also, if we're trying to do a 'clean-up' patch.
In any case, I'll mark this as rejected.

Thanks!

Stephen




Re: [HACKERS] REVIEW: patch: remove redundant code from pl_exec.c

2011-01-19 Thread Tom Lane
Stephen Frost  writes:
> While I can certainly appreciate wanting to remove redundant code, I
> don't think this change actually improves the situation.  The problem is
> more than just that we might want to make a change to 'while' but not
> 'for', it's also that it makes it very difficult to follow the code
> flow.

That was my reaction as well; and I was also concerned that this could
have a non-negligible performance price.  (At the very least it's adding
an additional function call per loop execution, and there could also be
a penalty from forcing "rc" to be in memory rather than a register.)

I think we should reject this one.

> In the end, I'd recommend cleaning up the handling of the exec_stmts()
> return code so that all of these pieces follow the same style and look
> similar (I'd go with the switch-based approach and remove the if/else
> branches).  That'll make it easier for anyone coming along later who
> does end up needing to change all three.

Using a switch there is a bit problematic since in some cases you want
to use "break" to exit the loop.  We could replace such breaks by gotos,
but that would be another strike against the argument that you're making
things more readable.  I think the switch in exec_stmt_loop is only
workable because it has no cleanup to do, so it can just "return" in
places where a loop break would otherwise be needed.  In short: if you
want to make these all look alike, better to go the other way.

regards, tom lane



Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 3:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Jan 19, 2011 at 2:26 PM, Tom Lane  wrote:
>>> Which brings up another point though. I have a personal TODO item to
>>> make the comments for operator support functions more consistent:
>>> http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us
>>> Should we consider removing those comments altogether, instead?
>
>> I could go either way on that.  Most of those comments are pretty
>> short, aren't they?  How much storage are they really costing us?
>
> Well, on my machine pg_description is about 210K (per database) as of
> HEAD.  90% of its contents are pg_proc entries, though I have no good
> fix on how much of that is for internal-use-only functions.  A very
> rough estimate from counting pg_proc and pg_operator entries suggests
> that the answer might be "about a third".  So if we do what was said in
> the above-cited thread, ie move existing comments to pg_operator and
> add boilerplate ones to pg_proc, we probably would pay <100K for it.

I guess that's not enormously expensive, but it's not insignificant
either.  On my machine, a template database is 5.5MB.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 2:58 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane  wrote:
>>> Robert Haas  writes:
 Yeah.  Many-times-repeated detoasting is really bad, and this is not
 the only place in the backend where we have this problem.  :-(
>
>>> Yeah, there's been some discussion of a more general solution, and I
>>> think I even had a trial patch at one point (which turned out not to
>>> work terribly well, but maybe somebody will have a better idea someday).
>
>> I'm pretty doubtful that there's going to be a general solution to
>> this problem - I think it's going to require gradual refactoring of
>> problem spots.
>
> Do you remember the previous discussion?  One idea that was on the table
> was to make the TOAST code maintain a cache of detoasted values, which
> could be indexed by the toast pointer OIDs (toast rel OID + value OID),
> and then PG_DETOAST_DATUM might give back a pointer into the cache
> instead of a fresh value.  In principle that could be done in a fairly
> centralized way.  The hard part is to know when a cache entry is not
> actively referenced anymore ...

I do remember that discussion.  Aside from the problem you mention, it
also seems that maintaining the hash table and doing lookups into it
would have some intrinsic cost.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jan 19, 2011 at 2:26 PM, Tom Lane  wrote:
>> Which brings up another point though. I have a personal TODO item to
>> make the comments for operator support functions more consistent:
>> http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us
>> Should we consider removing those comments altogether, instead?

> I could go either way on that.  Most of those comments are pretty
> short, aren't they?  How much storage are they really costing us?

Well, on my machine pg_description is about 210K (per database) as of
HEAD.  90% of its contents are pg_proc entries, though I have no good
fix on how much of that is for internal-use-only functions.  A very
rough estimate from counting pg_proc and pg_operator entries suggests
that the answer might be "about a third".  So if we do what was said in
the above-cited thread, ie move existing comments to pg_operator and
add boilerplate ones to pg_proc, we probably would pay <100K for it.

regards, tom lane



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Dimitri Fontaine
Tom Lane  writes:
> I think you missed the point: right now, to use both the core and
> intarray operators on an integer[] column, you have to create *two*
> GIN indexes, which will have exactly identical contents.  I'm looking
> for a way to let intarray extend the core opfamily definition so that
> one index can serve.

That I think I understood, but then I mixed opfamily and opclasses
badly.  Let's try again.

For the GIN indexes, we have 2 methods for building the index and 3
others for searching it to solve the query.  You're proposing that the
former 2 methods would be in the opfamily and the latter 3 in the opclass.

We'd like to be able to use the same index (whose build depends on the
opfamily) to solve different kinds of queries, for which we can use
different traversal and search algorithms - that's the opclass.

So we would want the planner to know that in the GIN case an index built
with any opclass of a given opfamily can help answer a query that would
need any opclass of the opfamily.  Right?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] pl/python refactoring

2011-01-19 Thread Peter Eisentraut
On Wed, 2011-01-19 at 00:52 -0300, Alvaro Herrera wrote:
> Excerpts from Peter Eisentraut's message of Tue Jan 18 19:22:50 -0300 2011:
> 
> > #16: This is probably pointless because pgindent will reformat this.
> 
> pgindent used to remove useless braces around single-statement blocks,
> but this behavior was removed years ago because it'd break formatting
> around PG_TRY blocks.

Good to know.  Committed then.




Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Yeah.  Many-times-repeated detoasting is really bad, and this is not
>>> the only place in the backend where we have this problem.  :-(

>> Yeah, there's been some discussion of a more general solution, and I
>> think I even had a trial patch at one point (which turned out not to
>> work terribly well, but maybe somebody will have a better idea someday).

> I'm pretty doubtful that there's going to be a general solution to
> this problem - I think it's going to require gradual refactoring of
> problem spots.

Do you remember the previous discussion?  One idea that was on the table
was to make the TOAST code maintain a cache of detoasted values, which
could be indexed by the toast pointer OIDs (toast rel OID + value OID),
and then PG_DETOAST_DATUM might give back a pointer into the cache
instead of a fresh value.  In principle that could be done in a fairly
centralized way.  The hard part is to know when a cache entry is not
actively referenced anymore ...
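
In sketch form, the lookup side might look about like this (only
HTAB/hash_search are existing APIs below; the rest is invented for
illustration):

    typedef struct ToastCacheKey
    {
        Oid     toastrelid;    /* OID of the toast relation */
        Oid     valueid;       /* OID of the value within it */
    } ToastCacheKey;

    typedef struct ToastCacheEntry
    {
        ToastCacheKey key;            /* hash key, must be first */
        struct varlena *detoasted;    /* cached detoasted copy */
        int     refcount;             /* the unsolved part: who decrements it? */
    } ToastCacheEntry;

    static struct varlena *
    toast_cache_lookup(HTAB *cache, Oid toastrelid, Oid valueid)
    {
        ToastCacheKey key;
        ToastCacheEntry *entry;
        bool    found;

        key.toastrelid = toastrelid;
        key.valueid = valueid;
        entry = (ToastCacheEntry *) hash_search(cache, &key, HASH_FIND, &found);
        if (!found)
            return NULL;        /* caller detoasts normally, then inserts */
        entry->refcount++;      /* ... and here is the hard question */
        return entry->detoasted;
    }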

regards, tom lane



Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> ... I guess I'm just saying I'd think really, really hard
> before abandoning the idea of periodic sampling.  You have to get an
> awful lot of benefit out of those cross-column stats to make it worth
> paying a run-time cost for them.

I've been trying to not discourage Tomas from blue-sky speculation,
but I have to say I agree with Robert that the chance of any usable
result from this approach is very very small.  Feel free to do some
experimentation to prove us wrong --- but don't put a lot of effort
into it before that.

regards, tom lane



Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 2:26 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jan 18, 2011 at 6:49 PM, Simone Aiken
>>  wrote:
>>> Pages like this one have column comments for the system tables:
>>>
>>> http://www.psql.it/manuale/8.3/catalog-pg-attribute.html
>
>> Oh, I see.  I don't think we want to go there.  We'd need some kind of
>> system for keeping the two places in sync.
>
> I seem to recall some muttering about teaching genbki to extract such
> comments from the SGML sources or perhaps the C header files.  I tend to
> agree though that it would be a lot more work than it's worth.  And as
> you say, pg_description entries aren't free.
>
> Which brings up another point though.  I have a personal TODO item to
> make the comments for operator support functions more consistent:
> http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us
> Should we consider removing those comments altogether, instead?

I could go either way on that.  Most of those comments are pretty
short, aren't they?  How much storage are they really costing us?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] REVIEW: patch: remove redundant code from pl_exec.c

2011-01-19 Thread Stephen Frost
Greetings,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
> This patch remove redundant rows from PL/pgSQL executor (-89 lines).

While I can certainly appreciate wanting to remove redundant code, I
don't think this change actually improves the situation.  The problem is
more than just that we might want to make a change to 'while' but not
'for', it's also that it makes it very difficult to follow the code
flow.

If you could have found a way to make it work for all calls to
exec_stmts(), it might be a bit better, but you can't without kludging
it up further.  If you could, then it might be possible to move some of
this logic *into* exec_stmts(), but I don't see that being reasonable
either.

In the end, I'd recommend cleaning up the handling of the exec_stmts()
return code so that all of these pieces follow the same style and look
similar (I'd go with the switch-based approach and remove the if/else
branches).  That'll make it easier for anyone coming along later who
does end up needing to change all three.

> Doesn't change a functionality.

I'm not convinced of this either, to be honest.  In exec_stmt_while(),
there is:

for(;;)
{
[...]
if (isnull || !value)
break;

rc = exec_stmts(estate, stmt->body);
[...]
}
return PLPGSQL_RC_OK;

You replaced the last return with:

return rc;

Which could mean returning an uninitialized rc if the above break;
happens.
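
The minimal guard, shown here only for illustration, is to initialize
the variable so that the early-break path keeps returning OK:

    int     rc = PLPGSQL_RC_OK;    /* an early "break" then returns OK */

    for (;;)
    {
        /* ... evaluate condition, break when false or NULL ... */
        rc = exec_stmts(estate, stmt->body);
        /* ... handle rc ... */
    }
    return rc;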

In the end, I guess it's up to the committers on how they feel about
this, so here's an updated patch which fixes the above issue and
improves the comments/grammar.  Passes all regressions and works in my
limited testing.

commit e6639d83db5b301e184bf158b1591007a3f79526
Author: Stephen Frost 
Date:   Wed Jan 19 14:28:36 2011 -0500

Improve comments in pl_exec.c wrt can_leave_loop()

This patch improves some of the comments around can_leave_loop(), and
fixes a couple of fall-through cases to make sure they behave the same
as before the code de-duplication patch that introduced
can_leave_loop().

commit f8262adcba662164dbc24d289cb036b3f0aee582
Author: Stephen Frost 
Date:   Wed Jan 19 14:26:27 2011 -0500

remove redundant code from pl_exec.c

This patch removes redundant handling of exec_stmts()'s return code
by creating a new function to be called after exec_stmts() is run.
This new function will then handle the return code, update it if
necessary, and return if the loop should continue or not.

Patch by Pavel Stehule.

Thanks,

Stephen
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***
*** 204,209  static Portal exec_dynquery_with_params(PLpgSQL_execstate *estate,
--- 204,211 
  		  PLpgSQL_expr *dynquery, List *params,
  		  const char *portalname, int cursorOptions);
  
+ static bool can_leave_loop(PLpgSQL_execstate *estate,
+ 						   PLpgSQL_any_loop *stmt, int *rc);
  
  /* --
   * plpgsql_exec_function	Called by the call handler for
***
*** 1566,1571  exec_stmt_case(PLpgSQL_execstate *estate, PLpgSQL_stmt_case *stmt)
--- 1568,1650 
  	return exec_stmts(estate, stmt->else_stmts);
  }
  
+ /*
+  * can_leave_loop
+  *
+  * Check the result of exec_stmts for the various exec_stmt_loop
+  * functions (unconditional loop, while loop, for-over-integer loop,
+  * for-over-portal loop).
+  *
+  * Returns true when the outer cycle should be left, otherwise false.
+  * Will also update the return code (rc) as necessary.
+  */
+ static bool
+ can_leave_loop(PLpgSQL_execstate *estate, PLpgSQL_any_loop *stmt, int *rc)
+ {
+ 	/*
+ 	 * Check which return code exec_stmts() returned and handle it
+ 	 * accordingly.
+ 	 */
+ 	switch (*rc)
+ 	{
+ 		case PLPGSQL_RC_OK:
+ 			/* Just continue the outer loop on PLPGSQL_RC_OK */
+ 			return false;
+ 
+ 		case PLPGSQL_RC_RETURN:
+ 			/* Time to break out of the outer loop */
+ 			return true;
+ 
+ 		case PLPGSQL_RC_EXIT:
+ 			if (estate->exitlabel == NULL)
+ 			{
+ 				/* unlabelled exit, finish the current loop */
+ 				*rc = PLPGSQL_RC_OK;
+ 			}
+ 			else if (stmt->label != NULL && strcmp(stmt->label, estate->exitlabel) == 0)
+ 			{
+ 				/* labelled exit, matches the current stmt's label */
+ 				estate->exitlabel = NULL;
+ 				*rc = PLPGSQL_RC_OK;
+ 			}
+ 
+ 			/*
+ 			 * otherwise, this is a labelled exit that does not match the
+ 			 * current statement's label, if any: return RC_EXIT so that the
+ 			 * EXIT continues to propagate up the stack.
+ 			 */
+ 			return true;
+ 
+ 		case PLPGSQL_RC_CONTINUE:
+ 			if (estate->exitlabel == NULL)
+ 			{
+ 				/* anonymous continue, so re-run the loop */
+ 				*rc = PLPGSQL_RC_OK;
+ 			}
+ 			else if (stmt->label != NULL &&
+ 					 strcmp(stmt->label, estate->exitlabel) == 0)
+ 			{
+ 				/* label matches named continue, so re-run loop */
+ 				estate->exitlabel = NULL;
+ 				*rc = PLPGSQL_RC_OK;
+ 	

Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Alvaro Herrera
Excerpts from Robert Haas's message of Wed Jan 19 15:25:00 -0300 2011:

> Oh, I see.  I don't think we want to go there.  We'd need some kind of
> system for keeping the two places in sync.

Maybe autogenerate both the .sgml and the postgres.description files
from a single source.

> And there'd be no easy way
> to upgrade the in-database descriptions when we upgraded to a newer
> minor release, supposing they'd changed in the meantime.

I wouldn't worry about this issue.  We don't do many catalog changes in
minor releases anyway.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jan 18, 2011 at 6:49 PM, Simone Aiken
>  wrote:
>> Pages like this one have column comments for the system tables:
>> 
>> http://www.psql.it/manuale/8.3/catalog-pg-attribute.html

> Oh, I see.  I don't think we want to go there.  We'd need some kind of
> system for keeping the two places in sync.

I seem to recall some muttering about teaching genbki to extract such
comments from the SGML sources or perhaps the C header files.  I tend to
agree though that it would be a lot more work than it's worth.  And as
you say, pg_description entries aren't free.

Which brings up another point though.  I have a personal TODO item to
make the comments for operator support functions more consistent:
http://archives.postgresql.org/message-id/21407.1287157...@sss.pgh.pa.us
Should we consider removing those comments altogether, instead?

regards, tom lane



Re: [HACKERS] estimating # of distinct values

2011-01-19 Thread Robert Haas
On Tue, Jan 18, 2011 at 5:16 PM, Tomas Vondra  wrote:
> Yes, I was aware of this problem (amount of memory consumed with lots of
> updates), and I kind of hoped someone will come up with a reasonable
> solution.

As far as I can see, periodically sampling some or all of the table is
really the only thing that has a chance of working OK.  You could try
to track changes only when they're not too big, but I think that's
still going to have nasty performance consequences.

> I was thinking about it and I believe at least one of the algorithms
> (the "adaptive sampling") might handle "merging" i.e. the backends would
> not need to store the list of values, just a private copy of the
> estimator and update it. And at the end (after commit), the estimator
> would be merged back into the actual one.
>
> So the algorithm would be something like this:
>
> 1. create copy for all distinct estimators influenced by the INSERT
> 2. update the local copy
> 3. commit and merge the local copies back into the original estimator

Well, maybe.  But DELETEs seem like a problem.  And it's still adding
foreground work to every query, which I just have a hard time
believing is ever going to work out.

> Regarding the crash scenario - if the commit fails, just throw away the
> local estimator copy, it's not needed. I'm not sure how to take care of
> the case when commit succeeds and the write of the merged estimator
> fails, but I think it might be possible to write the estimator to xlog
> or something like that. So it would be replayed during recovery etc. Or
> is it a stupid idea?

It's not stupid, in the sense that that is what you'd need to do if
you want to avoid ever having to rescan the table.  But it is another
thing that I think is going to be way too expensive.

> Yes, as I've mentioned above, the multi-column stats are the reason why
> I started working on this. And yes, it really should work like this:
>
> 1. user realizes there's something wrong as the plans are bad
> 2. after analysis, the user realizes the reason is an imprecise
>   estimate due to dependence between columns
> 3. the user enables cross-column stats on the columns and checks
>   if it actually solved the issue
> 4. if the cost outweights the gains, the user drops the stats
>
> Does that sound reasonable?

Yes.  The only caveat I'm trying to insert is that it's hard for me to
see how the proposed approach could ever be cheap enough to be a win.
If we need some kind of more expensive kind of ANALYZE that scans the
whole table, and it's optional, sure, why not?  But that's a one-time
hit.  You run it, it sucks, it's over, and now your queries work.
Odds are good you may never need to touch it again.  Now, if we can
convince ourselves that multi-column stats are likely to require
constant updating, then maybe there would be more to recommend the
stream-based approach, although even then it seems dreadfully complex
and expensive to me.  But I bet these things are pretty static.  If
the city and state tend to imply the zip code when there are 10K rows
in the table, they probably still will when there are 100K rows in the
table.  If users with org_id = 1 have a higher karma score on average
than users with org_id != 1, that'll probably still be true when there
are more users in both classes.  Whether the database can understand
that without rescanning the table depends on the data representation,
of course, but I guess I'm just saying I'd think really, really hard
before abandoning the idea of periodic sampling.  You have to get an
awful lot of benefit out of those cross-column stats to make it worth
paying a run-time cost for them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jan 19, 2011 at 1:33 PM, Tom Lane  wrote:
>> AFAICS that means integrating contrib/intarray into core.  Independently
>> of whether that's a good idea or not, PG is supposed to be an extensible
>> system, so it would be nice to have a solution that supported add-on
>> extensions.

> Yeah, I'm just wondering if it's worth the effort, especially in view
> of a rather large patch queue we seem to have outstanding at the
> moment.

Oh, maybe we're not on the same page here: I wasn't really proposing
to do this right now, it's more of a TODO item.

Offhand the only reason to do it now would be if we settled on something
that required a layout change in pg_amop/pg_amproc.  Since we already
have one such change in 9.1, getting the additional change done in the
same release would be valuable to reduce the number of distinct cases
for pg_dump and other clients to support.

regards, tom lane



Re: [HACKERS] Couple document fixes

2011-01-19 Thread Kevin Grittner
Thom Brown  wrote:
 
> relkind in the same table is the same type, but isn't displayed as
> "char" in the docs, and the same applies to many other system
> tables.
> They would need changing too then.
> 
> Examples are:
> 
> pg_type.typtype
> pg_proc.provolatile
> pg_attribute.attstorage
 
That's a good point.  Consistency would trump getting a single entry
right, for sure.  I wonder, though, whether we shouldn't
consistently distinguish them.  For one thing, I've seen multiple
posts where people were reporting "bugs" because of having confused
char with "char".
 
FWIW, \d shows:
 
   Table "pg_catalog.pg_class"
 Column  |   Type| Modifiers
-+---+---
 relname | name  | not null
 relnamespace| oid   | not null
 reltype | oid   | not null
 reloftype   | oid   | not null
 relowner| oid   | not null
 relam   | oid   | not null
 relfilenode | oid   | not null
 reltablespace   | oid   | not null
 relpages| integer   | not null
 reltuples   | real  | not null
 reltoastrelid   | oid   | not null
 reltoastidxid   | oid   | not null
 relhasindex | boolean   | not null
 relisshared | boolean   | not null
 relpersistence  | "char"| not null
 relkind | "char"| not null
 relnatts| smallint  | not null
 relchecks   | smallint  | not null
 relhasoids  | boolean   | not null
 relhaspkey  | boolean   | not null
 relhasexclusion | boolean   | not null
 relhasrules | boolean   | not null
 relhastriggers  | boolean   | not null
 relhassubclass  | boolean   | not null
 relfrozenxid| xid   | not null
 relacl  | aclitem[] |
 reloptions  | text[]|
Indexes:
"pg_class_oid_index" UNIQUE, btree (oid)
"pg_class_relname_nsp_index" UNIQUE, btree (relname,
relnamespace)
 
Currently we don't seem to distinguish them in very many places in
the docs:
 
$ find -name '*.sgml' | xargs egrep -n '\"char\"'
./doc/src/sgml/textsearch.sgml:1271:setweight(vector tsvector, weight "char") returns tsvector
./doc/src/sgml/datatype.sgml:1116:length might change in a future release. The type "char"
./doc/src/sgml/datatype.sgml:1134:   "char"
./doc/src/sgml/release-old.sgml:4406:Add routines for single-byte "char" type(Thomas)
./doc/src/sgml/release-old.sgml:4747:Make "char" type a synonym for "char(1)" (actually implemented as bpchar)(Thomas)
./doc/src/sgml/xfunc.sgml:1794: "char"
./doc/src/sgml/release-8.0.sgml:3389:  "char" data type have been removed.
./doc/src/sgml/release-8.0.sgml:4460:"char" data type have been removed.
./doc/src/sgml/release-8.0.sgml:4466:to do arithmetic on a "char" column, you can cast it to
./doc/src/sgml/func.sgml:8462: setweight(tsvector, "char")
./doc/src/sgml/btree-gin.sgml:17:  oid, money, "char",
 
-Kevin



[HACKERS] Use of O_DIRECT only for open_* sync options

2011-01-19 Thread Bruce Momjian
Is there a reason we only use O_DIRECT with open_* sync options?
xlogdefs.h says:

/*
 *  Because O_DIRECT bypasses the kernel buffers, and because we never
 *  read those buffers except during crash recovery, it is a win to use
 *  it in all cases where we sync on each write().  We could allow O_DIRECT
 *  with fsync(), but because skipping the kernel buffer forces writes out
 *  quickly, it seems best just to use it for O_SYNC.  It is hard to imagine
 *  how fsync() could be a win for O_DIRECT compared to O_SYNC and O_DIRECT.
 *  Also, O_DIRECT is never enough to force data to the drives, it merely
 *  tries to bypass the kernel cache, so we still need O_SYNC or fsync().
 */

This seems wrong because fsync() can win if there are two writes before
the sync call.  Can kernels not issue fsync() if the write was O_DIRECT?
If that is the cause, we should document it.
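
To illustrate the batching point (plain C, not PostgreSQL code, error
handling omitted):

    #include <fcntl.h>
    #include <unistd.h>

    static void
    write_two_records(int fd, const void *r1, size_t n1,
                      const void *r2, size_t n2)
    {
        /*
         * If fd was opened with O_SYNC, each write() below implies its
         * own flush.  Without it, the single fsync() can cover both
         * writes at once --- potentially one disk operation instead of
         * two.
         */
        (void) write(fd, r1, n1);
        (void) write(fd, r2, n2);
        (void) fsync(fd);
    }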

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Tom Lane
Noah Misch  writes:
> On Wed, Jan 19, 2011 at 12:10:16PM -0500, Tom Lane wrote:
>> In the meantime, the proposal at hand seems like a bit of a stop-gap,
>> which is why I'd prefer to see something with a very minimal code
>> footprint.  Detoast at assignment would likely need only a few lines
>> of code added in a single place.

> What qualifies a patch as stop-gap scale?  Pavel's patch is ~60 lines.

Yeah, but they're spread out all over plpgsql, and seem likely to
metastasize to other places --- the additional field that needs to be
initialized is the main culprit.  I'd like a one-spot patch that will
be easy to remove when/if it's no longer needed.

> If adding PLpgSQL_var.should_be_detoasted is your main pain point, testing
> VARATT_IS_EXTENDED there might be the least-harmful way to avoid it.

I thought about that too, but adding an additional set of tests into
exec_eval_datum isn't free --- that's a hot spot for plpgsql execution.
Doing it in exec_assign_value would be significantly cheaper, first
because it's reasonable to assume that assignments are less frequent
than reads, and second because there's already a test there to detect
pass-by-ref datatypes, as well as a datumCopy() step that could be
skipped altogether when we detoast.
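
Roughly like this, sketched against exec_assign_value (untested, and
the variable names are approximate):

    if (!var->datatype->typbyval && !*isNull &&
        VARATT_IS_EXTENDED(DatumGetPointer(value)))
    {
        /*
         * Detoast once at assignment time.  heap_tuple_untoast_attr
         * already returns a freshly palloc'd copy, so the datumCopy()
         * step can be skipped for this case.
         */
        value = PointerGetDatum(heap_tuple_untoast_attr(
                                    (struct varlena *) DatumGetPointer(value)));
    }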

regards, tom lane



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 1:33 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Jan 19, 2011 at 12:29 PM, Tom Lane  wrote:
>>> I think you missed the point: right now, to use both the core and
>>> intarray operators on an integer[] column, you have to create *two*
>>> GIN indexes, which will have exactly identical contents. I'm looking
>>> for a way to let intarray extend the core opfamily definition so that
>>> one index can serve.
>
>> Maybe this is a dumb question, but why not just put whatever stuff
>> intarray[] adds directly into the core opfamily?
>
> AFAICS that means integrating contrib/intarray into core.  Independently
> of whether that's a good idea or not, PG is supposed to be an extensible
> system, so it would be nice to have a solution that supported add-on
> extensions.

Yeah, I'm just wondering if it's worth the effort, especially in view
of a rather large patch queue we seem to have outstanding at the
moment.

> The subtext here is that GIN, unlike the other index AMs, uses a
> representation that seems pretty amenable to supporting a wide variety
> of query types with a single index.  contrib/intarray's "query_int"
> operators are not at all like the subset-inclusion-testing operators
> that the core opclass supports, and it's not very hard to think of
> additional cases that could be of interest to somebody (example: find
> all arrays that contain some/all entries within a given integer range).
> I think we're going to come up against similar situations over and over
> until we find a solution.

Interesting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jan 19, 2011 at 12:29 PM, Tom Lane  wrote:
>> I think you missed the point: right now, to use both the core and
>> intarray operators on an integer[] column, you have to create *two*
>> GIN indexes, which will have exactly identical contents. I'm looking
>> for a way to let intarray extend the core opfamily definition so that
>> one index can serve.

> Maybe this is a dumb question, but why not just put whatever stuff
> intarray[] adds directly into the core opfamily?

AFAICS that means integrating contrib/intarray into core.  Independently
of whether that's a good idea or not, PG is supposed to be an extensible
system, so it would be nice to have a solution that supported add-on
extensions.

The subtext here is that GIN, unlike the other index AMs, uses a
representation that seems pretty amenable to supporting a wide variety
of query types with a single index.  contrib/intarray's "query_int"
operators are not at all like the subset-inclusion-testing operators
that the core opclass supports, and it's not very hard to think of
additional cases that could be of interest to somebody (example: find
all arrays that contain some/all entries within a given integer range).
I think we're going to come up against similar situations over and over
until we find a solution.

regards, tom lane



Re: [HACKERS] Couple document fixes

2011-01-19 Thread Thom Brown
On 19 January 2011 18:11, Kevin Grittner  wrote:
> Thom Brown  wrote:
>
>> I've attached a couple minor fixes to the docs.  One relating to
>> SECURITY LABEL and the other for pg_class.relpersistence
>
> relpersistence should be "char", not char.
> Oddly enough, there is a difference.
>
> -Kevin

relkind in the same table is the same type, but isn't displayed as
"char" in the docs, and the same applies to many other system tables.
They would need changing too then.

Examples are:

pg_type.typtype
pg_proc.provolatile
pg_attribute.attstorage

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] ToDo List Item - System Table Index Clustering

2011-01-19 Thread Robert Haas
On Tue, Jan 18, 2011 at 6:49 PM, Simone Aiken
 wrote:
> Pages like this one have column comments for the system tables:
>
> http://www.psql.it/manuale/8.3/catalog-pg-attribute.html

Oh, I see.  I don't think we want to go there.  We'd need some kind of
system for keeping the two places in sync.  And there'd be no easy way
to upgrade the in-database descriptions when we upgraded to a newer
minor release, supposing they'd changed in the meantime.  And some of
the descriptions are quite long, so they wouldn't fit nicely in the
amount of space you typically have available when you run \d+.  And it
would enlarge the size of an empty database by however much was
required to store all those comments, which could be an issue for
PostgreSQL instances that have many small databases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Couple document fixes

2011-01-19 Thread Kevin Grittner
Thom Brown  wrote:
 
> I've attached a couple minor fixes to the docs.  One relating to
> SECURITY LABEL and the other for pg_class.relpersistence
 
relpersistence should be "char", not char.
Oddly enough, there is a difference.
 
-Kevin



[HACKERS] Couple document fixes

2011-01-19 Thread Thom Brown
Hi,

I've attached a couple minor fixes to the docs.  One relating to
SECURITY LABEL and the other for pg_class.relpersistence

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


doc_fixes.patch
Description: Binary data



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Noah Misch
On Wed, Jan 19, 2011 at 12:10:16PM -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule  
> > wrote:
> >> opinion isn't strong in this topic. One or twenty useless detoastings
> >> aren't really significant in almost all use cases (the problem is
> >> thousands of detoastings).
> 
> > Yeah.  Many-times-repeated detoasting is really bad, and this is not
> > the only place in the backend where we have this problem.  :-(
> 
> Yeah, there's been some discussion of a more general solution, and I
> think I even had a trial patch at one point (which turned out not to
> work terribly well, but maybe somebody will have a better idea someday).
> In the meantime, the proposal at hand seems like a bit of a stop-gap,
> which is why I'd prefer to see something with a very minimal code
> footprint.  Detoast at assignment would likely need only a few lines
> of code added in a single place.

What qualifies a patch as stop-gap scale?  Pavel's patch is ~60 lines.

If adding PLpgSQL_var.should_be_detoasted is your main pain point, testing
VARATT_IS_EXTENDED there might be the least-harmful way to avoid it.  Saving a
few more lines by moving the work to exec_assign_value probably does not justify
the associated performance regressions Pavel has cited.

nm



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 12:29 PM, Tom Lane  wrote:
> Dimitri Fontaine  writes:
>> Tom Lane  writes:
>>> Oh, wait a minute: there's a bad restriction there, namely that a
>>> contrib module could only add "loose" operators that had different
>>> declared input types from the ones known to the core opclass.
>
>> I would have though that such contrib would then need to offer their own
>> opfamily and opclasses, and users would have to use the specific opclass
>> manually like they do e.g. for text_pattern_ops.  Can't it work that way?
>
> I think you missed the point: right now, to use both the core and
> intarray operators on an integer[] column, you have to create *two*
> GIN indexes, which will have exactly identical contents.  I'm looking
> for a way to let intarray extend the core opfamily definition so that
> one index can serve.

Maybe this is a dumb question, but why not just put whatever stuff
intarray[] adds directly into the core opfamily?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule  
>> wrote:
>>> opinion isn't strong in this topic. One or twenty useless detoastings
>>> aren't really significant in almost all use cases (the problem is
>>> thousands of detoastings).
>
>> Yeah.  Many-times-repeated detoasting is really bad, and this is not
>> the only place in the backend where we have this problem.  :-(
>
> Yeah, there's been some discussion of a more general solution, and I
> think I even had a trial patch at one point (which turned out not to
> work terribly well, but maybe somebody will have a better idea someday).

I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.

> In the meantime, the proposal at hand seems like a bit of a stop-gap,
> which is why I'd prefer to see something with a very minimal code
> footprint.  Detoast at assignment would likely need only a few lines
> of code added in a single place.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Extending opfamilies for GIN indexes

2011-01-19 Thread Tom Lane
Dimitri Fontaine  writes:
> Tom Lane  writes:
>> Oh, wait a minute: there's a bad restriction there, namely that a
>> contrib module could only add "loose" operators that had different
>> declared input types from the ones known to the core opclass.

> I would have though that such contrib would then need to offer their own
> opfamily and opclasses, and users would have to use the specific opclass
> manually like they do e.g. for text_pattern_ops.  Can't it work that way?

I think you missed the point: right now, to use both the core and
intarray operators on an integer[] column, you have to create *two*
GIN indexes, which will have exactly identical contents.  I'm looking
for a way to let intarray extend the core opfamily definition so that
one index can serve.

regards, tom lane



Re: [HACKERS] limiting hint bit I/O

2011-01-19 Thread Robert Haas
On Wed, Jan 19, 2011 at 11:52 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> ... So what we
>> want to do is write a percentage of them, in a way that guarantees
>> that they'll all eventually get written if people continue to access
>> the same data.
>
> The word "guarantee" seems quite inappropriate here, since as far as I
> can see this approach provides no such guarantee --- even after many
> cycles you'd never be really certain all the bits were set.
>
> What I asked for upthread was that we continue to have some
> deterministic, practical way to force all hint bits in a table to be
> set.  This is not *remotely* responding to that request.  It's still not
> deterministic, and even if it were, vacuuming a large table 20 times
> isn't a very practical solution.

I get the impression you haven't spent as much time reading my email
as I spent writing it.  Perhaps I'm wrong, but in any case the code
doesn't do what you're suggesting.  In the most recently posted
version of this patch, which is v2, if VACUUM hits a page that is
hint-bit-dirty, it always writes it.  Full stop.  The "20 times" bit
applies to a SELECT * FROM table, which is a rather different case.

As I write this, I realize that there is a small fly in the ointment
here, which is that neither VACUUM nor SELECT force out all the pages
they modify to disk.  So there is some small amount of remaining
nondeterminism, even if you VACUUM, because VACUUM will leave the last
few pages it dirties in shared_buffers, and whether those hint bits
hit the disk will depend on a decision made at the time they're
evicted, not at the time they were dirtied.  Possibly I could fix that
by making SetBufferCommitInfoNeedsSave() set BM_DIRTY during vacuum
and BM_HINT_BITS at other times.  That would nail the lid shut pretty
tight.
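
Hand-waving, that might look about like this (the in-vacuum test is a
placeholder, and I'm ignoring local buffers, usage counts, and the rest
of what the real function does):

    void
    SetBufferCommitInfoNeedsSave(Buffer buffer)
    {
        volatile BufferDesc *bufHdr = &BufferDescriptors[buffer - 1];

        LockBufHdr(bufHdr);
        if (in_vacuum)                         /* placeholder condition */
            bufHdr->flags |= BM_DIRTY;         /* must be written at eviction */
        else
            bufHdr->flags |= BM_HINT_BITS;     /* eviction may skip the write */
        UnlockBufHdr(bufHdr);
    }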

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



  1   2   >