Re: [HACKERS] Minor optimizations in lazy_scan_heap

2012-12-07 Thread Pavan Deolasee
On Tue, Dec 4, 2012 at 11:02 PM, Robert Haas  wrote:

> On Mon, Dec 3, 2012 at 1:23 AM, Pavan Deolasee  wrote:
> > Comments?  Does anyone think any/all of the above is useful?
>
> I doubt that any of these things make enough difference to be worth
> bothering with,


You're right. These are not big-ticket optimisations, but I still felt they
were worth doing because tiny bits add up over time, and the code may become
a little simpler. The benchmarks don't show anything interesting, though.
The time taken to scan 100K+ bits is sub-second, so even with the attached
patch the numbers did not show any noticeable difference. It might be worth
trying with a table of 1M or 10M data blocks, but I don't have such hardware
to test on.

The patch itself can be improved further; in particular, we could optimise
the loop to test 32 bits at a time instead of the 8 it tests currently. Not
sure it's worth it, though. A sketch of the idea is below.
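
To illustrate the idea, here is a minimal standalone sketch (not the patch
itself; vm_test_range in the attachment works against the visibility map
proper, and the helper name here is made up):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative only: return true if all of map[0..nbits-1] are set.
 * Tests 32 bits per iteration instead of one; the patch's
 * 8-bits-at-a-time version is the same idea with byte compares.
 */
static bool
all_bits_set(const uint8_t *map, int nbits)
{
	int			i = 0;

	/* whole 32-bit words first */
	for (; i + 32 <= nbits; i += 32)
	{
		uint32_t	word;

		memcpy(&word, map + i / 8, sizeof(word));	/* avoids alignment traps */
		if (word != 0xFFFFFFFF)
			return false;
	}

	/* unaligned tail, one bit at a time */
	for (; i < nbits; i++)
	{
		if ((map[i / 8] & (1 << (i % 8))) == 0)
			return false;
	}
	return true;
}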

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


vm_test_range-v2.patch
Description: Binary data



Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-12-07 Thread Tom Lane
Amit kapila  writes:
> On Friday, December 07, 2012 7:43 PM Muhammad Usama wrote:
>> Although I am wondering why you are disallowing an absolute path to the file.
>> Any particular reason?

> The reason to disallow absolute paths is that we need to test on multiple
> platforms, and we want to keep the scope a little smaller.

This argument seems to me to be completely nuts.  What's wrong with an
absolute path?  Wouldn't you have to go out of your way to disallow it?

regards, tom lane




Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-12-07 Thread Amit kapila
Hi Muhammad,

On Friday, December 07, 2012 7:43 PM Muhammad Usama wrote:
>Hi Amit

On Mon, Dec 3, 2012 at 6:56 PM, Amit kapila  wrote:
>> I think we should expect the provided path to be relative to the current
>> directory, or consider it relative to either the data directory or the CWD,
>> because normally we expect a path to be relative to the CWD when a program
>> asks for a path on the command line.

> Please find the attached patch to make the path relative to the CWD and to
> check that the path is under the data directory.

> Works well now.

Thank you for verification.

> Although I am wondering why you are disallowing an absolute path to the file. Any
> particular reason?

The reason to disallow absolute paths is that we need to test on multiple
platforms and to keep the scope a little smaller.
I thought we could allow absolute paths in the future. A sketch of the
relative-path check being discussed is below.
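
For illustration only -- a rough standalone sketch (the helper name and the
approach are my assumptions, not the patch's code) of validating that a
user-supplied path resolves to somewhere under the data directory:

#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return true if userpath, resolved against the CWD, lies under datadir.
 * Note: a plain prefix test would wrongly accept "/datadir2" for "/datadir";
 * requiring a '/' (or end of string) right after the prefix avoids that.
 */
static bool
path_is_under_datadir(const char *userpath, const char *datadir)
{
	char		user_resolved[PATH_MAX];
	char		data_resolved[PATH_MAX];
	size_t		len;

	if (realpath(userpath, user_resolved) == NULL ||
		realpath(datadir, data_resolved) == NULL)
		return false;			/* nonexistent path, permission error, ... */

	len = strlen(data_resolved);
	return strncmp(user_resolved, data_resolved, len) == 0 &&
		(user_resolved[len] == '/' || user_resolved[len] == '\0');
}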



> I also had a point similar to the one made by Alvaro: allow all the segments of the
> relation for a given relation file name, or add another option to do the
> same. But if everybody is fine with leaving it for the future, I do not
> have any further concerns with the patch. It is good from my side.

In my opinion we can extend the utility in the future for both of the points
suggested below:
1. Allow absolute paths in the file path.
2. Allow getting the max LSN for all relation segments.

If you are also okay with that, then we can proceed, and the committer can
share his opinion as well.

Thank you for reviewing the patch.

With Regards,
Amit Kapila.




Re: [HACKERS] Serious problem: media recovery fails after system or PostgreSQL crash

2012-12-07 Thread Jeff Janes
On Thu, Dec 6, 2012 at 3:52 PM, Tomas Vondra  wrote:
> Hi,
>
> On 6.12.2012 23:45, MauMau wrote:
>> From: "Tom Lane" 
>>> Well, that's unfortunate, but it's not clear that automatic recovery is
>>> possible.  The only way out of it would be if an undamaged copy of the
>>> segment was in pg_xlog/ ... but if I recall the logic correctly, we'd
>>> not even be trying to fetch from the archive if we had a local copy.
>>
>> No, PG will try to fetch the WAL file from pg_xlog when it cannot get it
>> from the archive.  XLogFileReadAnyTLI() does that.  Also, the PG manual
>> contains the following description:
>>
>> http://www.postgresql.org/docs/9.1/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
>>
>>
>> WAL segments that cannot be found in the archive will be sought in
>> pg_xlog/; this allows use of recent un-archived segments. However,
>> segments that are available from the archive will be used in preference
>> to files in pg_xlog/.
>
> So why don't you use an archive command that does not create such
> incomplete files? I mean something like this:
>
> archive_command = 'cp %p /arch/%f.tmp && mv /arch/%f.tmp /arch/%f'
>
> Until the file is renamed, it's considered 'incomplete'.

Wouldn't having the incomplete file be preferable to having none of it at all?

It seems to me you need considerable expertise to figure out how to do
optimal recovery (i.e., losing the fewest transactions) in this
situation, and that that expertise cannot be automated.  Do you trust
a partial file from a good hard drive, or a complete file from a
partially melted pg_xlog?
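
For reference, the write-then-rename trick in the quoted archive_command can
be sketched in plain C (a hypothetical helper, not PostgreSQL code; the
fsync() before rename() is the part that keeps a torn file from surviving a
crash of the archive host):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Copy src to dst by writing dst.tmp, fsyncing it, then renaming it into
 * place.  Readers see either the complete file or no file at all.
 */
static int
archive_file_atomically(const char *src, const char *dst)
{
	char		tmp[1024];
	char		buf[8192];
	ssize_t		n;
	int			in,
				out;

	snprintf(tmp, sizeof(tmp), "%s.tmp", dst);
	if ((in = open(src, O_RDONLY)) < 0)
		return -1;
	if ((out = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600)) < 0)
	{
		close(in);
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0)
		if (write(out, buf, n) != n)
			goto fail;
	if (n < 0 || fsync(out) != 0)	/* flush data before publishing */
		goto fail;
	close(in);
	close(out);
	return rename(tmp, dst);	/* atomic within one filesystem */

fail:
	close(in);
	close(out);
	unlink(tmp);
	return -1;
}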

Cheers,

Jeff




Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-07 Thread Stephen Frost
Jeff,

* Jeff Davis (pg...@j-davis.com) wrote:
> Most of your concerns seem to be related to freezing, and I'm mostly
> interested in HEAP_XMIN_COMMITTED optimizations. So I think we're
> talking past each other.

My concern is about this patch/capability as a whole.  I agree that the
hint-bit setting may be fine by itself; I'm not terribly concerned with
that.  Now, if you (and others) aren't concerned about the overall
behavior being introduced here, that's fine, but it was my
understanding from previous discussions that making tuples visible to
all transactions, even those started before the committing transaction
with isolation levels stricter than READ COMMITTED, wasn't 'ok'.

Now, what I've honestly been hoping for on this thread was for someone
to speak up, point out why I'm wrong about this concern, and explain
how this patch addresses that issue.  If that's been done, I've missed
it...

Thanks,

Stephen




Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-07 Thread Jeff Davis
On Thu, 2012-12-06 at 22:34 -0500, Stephen Frost wrote:
> * Jeff Davis (pg...@j-davis.com) wrote:
> > That is documented in the committed patch -- it's a trade, basically
> > saying that you lose isolation but avoid extra writes. It seems
> > reasonable that the user gets this behavior if specifically requested.
> 
> Strictly speaking, it could actually be two different users...

That's a good point. Somewhat like SERIALIZABLE, unless everyone is on
board, it doesn't really work.

> > In the simple approach that only affects loads in the same transaction
> > as the create, this is not an issue. The only issue there is for other
> > reads in the same transaction. I think you already know that, but I am
> > repeating for clarity.
> 
> Actually, no, I'm not convinced of that.  If a separate transaction
> starts before the create/insert, and is still open when the create/insert
> transaction commits, wouldn't that separate transaction see rows in the
> newly created table?

Not if all you do is set HEAP_XMIN_COMMITTED. Committed <> visible.
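
To make the distinction concrete, here is a condensed paraphrase of the xmin
half of HeapTupleSatisfiesMVCC (simplified from
src/backend/utils/time/tqual.c, not the verbatim code, and only meaningful
in backend context). The hint bit merely short-circuits the commit-status
lookup; the snapshot check still decides visibility:

/* Condensed sketch, not the real function body. */
static bool
xmin_visible(HeapTupleHeader tuple, Snapshot snapshot)
{
	TransactionId xmin = HeapTupleHeaderGetXmin(tuple);

	if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
	{
		/* No hint bit: consult transaction status (pg_clog etc.). */
		if (!TransactionIdDidCommit(xmin))
			return false;
	}

	/*
	 * Even a committed xmin is invisible if it is still "in progress"
	 * according to our snapshot, i.e. the inserting transaction committed
	 * after the snapshot was taken -- hence committed <> visible.  Only
	 * actual freezing (xmin = FrozenTransactionId) bypasses this test.
	 */
	if (XidInMVCCSnapshot(xmin, snapshot))
		return false;

	return true;
}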

> That's more-or-less the specific issue I'm bringing up as a potential
> concern.  If it's just a FrozenXID stored in the heap and there isn't
> anything telling the 2nd transaction to not consider this table that
> exists through SnapshotNow, how are we going to know to not look at the
> tuples?  Or do we accept that it's "ok" for those to be visible?

I am talking about HEAP_XMIN_COMMITTED. I think it's understood that
freezing has much stricter requirements for safety in various situations
because it loses information.

> How I wish we could fix the table visibility and not have to worry about
> this issue at all, which would also remove the need for some special
> keyword to be used to tell us it's 'ok' to use this optimization...

I think we need to be clear about which one we're discussing:
HEAP_XMIN_COMMITTED or FrozenXID. I believe discussing them together has
caused a significant amount of confusion in this thread.

Most of your concerns seem to be related to freezing, and I'm mostly
interested in HEAP_XMIN_COMMITTED optimizations. So I think we're
talking past each other.

Regards,
Jeff Davis





Re: [HACKERS] ALTER TABLE ... NOREWRITE option

2012-12-07 Thread Dimitri Fontaine
Robert Haas  writes:
> is needed there, or we may need to move some things from the exec
> phase to the prep phase to make it all work out.  I think it's totally
> doable, but it's not going to be a 50-line patch.

If you want to work on it, please be my guest :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 10:38:39PM +0100, Andres Freund wrote:
> On 2012-12-07 16:30:36 -0500, Bruce Momjian wrote:
> > On Fri, Dec  7, 2012 at 04:21:48PM -0500, Tom Lane wrote:
> > > Andres Freund  writes:
> > > > On 2012-12-07 13:59:41 -0500, Tom Lane wrote:
> > > >> indisvalid should be sufficient.  If you try to test more than that
> > > >> you're going to make the code more version-specific, without actually
> > > >> buying much.
> > >
> > > > Doesn't the check need to be at least indisvalid && indisready? Given
> > > > that 9.2 represents !indislive as indisvalid && !indisready?
> > >
> > > Um, good point.  It's annoying that we had to do it like that ...
> >
> > So, does this affect pg_upgrade?  Which PG versions?
> 
> Only 9.2 :(. Before that there was no DROP INDEX CONCURRENTLY and in 9.3
> there's an actual indislive field and indisready is always set to false
> there if indislive is false.
> 
> But I see no problem using !indisvalid || !indisready as the condition
> in all (supported) versions.

OK, updated patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index bccceb1..196b8d0
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** static void check_is_super_user(ClusterI
*** 20,25 
--- 20,26 
  static void check_for_prepared_transactions(ClusterInfo *cluster);
  static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
  static void check_for_reg_data_type_usage(ClusterInfo *cluster);
+ static void check_for_invalid_indexes(ClusterInfo *cluster);
  static void get_bin_version(ClusterInfo *cluster);
  static char *get_canonical_locale_name(int category, const char *locale);
  
*** check_and_dump_old_cluster(bool live_che
*** 97,102 
--- 98,104 
  	check_is_super_user(&old_cluster);
  	check_for_prepared_transactions(&old_cluster);
  	check_for_reg_data_type_usage(&old_cluster);
+ 	check_for_invalid_indexes(&old_cluster);
  	check_for_isn_and_int8_passing_mismatch(&old_cluster);
  
  	/* old = PG 8.3 checks? */
*** check_for_reg_data_type_usage(ClusterInf
*** 920,925 
--- 922,1016 
  			   "%s\n\n", output_path);
  	}
  	else
+ 		check_ok();
+ }
+ 
+ 
+ /*
+  * check_for_invalid_indexes()
+  *
+  *	CREATE INDEX CONCURRENTLY can create invalid indexes if the index build
+  *	fails.  These are dumped as valid indexes by pg_dump, but the
+  *	underlying files are still invalid indexes.  This checks to make sure
+  *	no invalid indexes exist, either failed index builds or concurrent
+  *	indexes in the process of being created.
+  */
+ static void
+ check_for_invalid_indexes(ClusterInfo *cluster)
+ {
+ 	int			dbnum;
+ 	FILE	   *script = NULL;
+ 	bool		found = false;
+ 	char		output_path[MAXPGPATH];
+ 
+ 	prep_status("Checking for invalid indexes from concurrent index builds");
+ 
+ 	snprintf(output_path, sizeof(output_path), "invalid_indexes.txt");
+ 
+ 	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ 	{
+ 		PGresult   *res;
+ 		bool		db_used = false;
+ 		int			ntups;
+ 		int			rowno;
+ 		int			i_nspname,
+ 					i_relname;
+ 		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+ 		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+ 
+ 		res = executeQueryOrDie(conn,
+ 								"SELECT n.nspname, c.relname "
+ 								"FROM	pg_catalog.pg_class c, "
+ 								"		pg_catalog.pg_namespace n, "
+ 								"		pg_catalog.pg_index i "
+ 								"WHERE	(i.indisvalid = false OR "
+ 								"		 i.indisready = false) AND "
+ 								"		i.indexrelid = c.oid AND "
+ 								"		c.relnamespace = n.oid AND "
+ 								/* we do not migrate these, so skip them */
+ 								"		n.nspname != 'pg_catalog' AND "
+ 								"		n.nspname != 'information_schema' AND "
+ 								/* indexes do not have toast tables */
+ 								"		n.nspname != 'pg_toast'");
+ 
+ 		ntups = PQntuples(res);
+ 		i_nspname = PQfnumber(res, "nspname");
+ 		i_relname = PQfnumber(res, "relname");
+ 		for (rowno = 0; rowno < ntups; rowno++)
+ 		{
+ 			found = true;
+ 			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+ 				pg_log(PG_FATAL, "Could not open file \"%s\": %s\n",
+ 					   output_path, getErrorText(errno));
+ 			if (!db_used)
+ 			{
+ 				fprintf(script, "Database: %s\n", active_db->db_name);
+ 				db_used = true;
+ 			}
+ 			fprintf(script, "  %s.%s\n",
+ 					PQgetvalue(res, rowno, i_nspname),
+ 					PQgetvalue(res, rowno, i_relname));
+ 		}
+ 
+ 		PQclear(res);
+ 
+ 		PQfinish(conn);
+ 	}
+ 
+ 	if (script)
+ 		fclose(script);
+ 
+ 	if (found)
+ 	{
+ 		pg_log(PG_REPORT, "fatal\n");
+ 		pg_log(PG_FATAL,
+ 			   "Your installation contains invalid indexes due to failed or\n"
+ 			   "currently running CREATE INDEX CONCURRENTLY operations.  You\n"
+ 			   "cannot upgrade until these indexes are valid or removed.  A\n"
+ 			   "list of the problem indexes is in the file:\n"
+ 			   "    %s\n\n", output_path);
+ 	}
+ 	else
+ 		check_ok();
+ }

Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 16:30:36 -0500, Bruce Momjian wrote:
> On Fri, Dec  7, 2012 at 04:21:48PM -0500, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2012-12-07 13:59:41 -0500, Tom Lane wrote:
> > >> indisvalid should be sufficient.  If you try to test more than that
> > >> you're going to make the code more version-specific, without actually
> > >> buying much.
> >
> > > Doesn't the check need to be at least indisvalid && indisready? Given
> > > that 9.2 represents !indislive as indisvalid && !indisready?
> >
> > Um, good point.  It's annoying that we had to do it like that ...
>
> So, does this affect pg_upgrade?  Which PG versions?

Only 9.2 :(. Before that there was no DROP INDEX CONCURRENTLY and in 9.3
there's an actual indislive field and indisready is always set to false
there if indislive is false.

But I see no problem using !indisvalid || !indisready as the condition
in all (supported) versions.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Tom Lane
Bruce Momjian  writes:
> On Fri, Dec  7, 2012 at 04:21:48PM -0500, Tom Lane wrote:
>> Andres Freund  writes:
>>> Doesn't the check need to be at least indisvalid && indisready? Given
>>> that 9.2 represents !indislive as indisvalid && !indisready?

>> Um, good point.  It's annoying that we had to do it like that ...

> So, does this affect pg_upgrade?  Which PG versions?

I think you can just insist on indisvalid and indisready both being
true.  That's okay in all releases back to 8.3.

regards, tom lane




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 04:21:48PM -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2012-12-07 13:59:41 -0500, Tom Lane wrote:
> >> indisvalid should be sufficient.  If you try to test more than that
> >> you're going to make the code more version-specific, without actually
> >> buying much.
> 
> > Doesn't the check need to be at least indisvalid && indisready? Given
> > that 9.2 represents !indislive as indisvalid && !indisready?
> 
> Um, good point.  It's annoying that we had to do it like that ...

So, does this affect pg_upgrade?  Which PG versions?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Tom Lane
Andres Freund  writes:
> On 2012-12-07 13:59:41 -0500, Tom Lane wrote:
>> indisvalid should be sufficient.  If you try to test more than that
>> you're going to make the code more version-specific, without actually
>> buying much.

> Doesn't the check need to be at least indisvalid && indisready? Given
> that 9.2 represents !indislive as indisvalid && !indisready?

Um, good point.  It's annoying that we had to do it like that ...

regards, tom lane




Re: [HACKERS] User control over psql error stream

2012-12-07 Thread Karl O. Pinc
On 11/21/2012 11:33:18 PM, Karl O. Pinc wrote:
> On 11/21/2012 01:41:56 PM, Peter Eisentraut wrote:
> > On 11/15/12 3:53 PM, Karl O. Pinc wrote:
> > > This patch gives the end user control over psql's
> > > error stream.  This allows a single psql session
> > > to use \o to send both errors and table output
> > > to multiple files.  Useful when capturing test output, etc.
> > 
> > What does this do that cannot in practice be achieved using shell
> > redirection operators?
> 
> To look at it from another angle, you need it for the
> same reason you need \o -- to redirect output
> to multiple places from within a single psql process.
> \o redirects stdout, but I'm interested in
> redirecting stderr.

There is an interaction here with \set EXIT_ON_ERROR
that may need additional documentation.  As the patch stands,
when \set EXIT_ON_ERROR is used, the error stream is sent
to stderr in addition to the \pset estream setting.
At least, I think this is what is happening, and the
behavior seems correct to me.

Before proceeding further I would like to wait
for feedback on whether a \e command,
per the above and previous discussion, is preferable to the
existing patch's \pset estream.

Regards,


Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein





Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 11:57:34AM -0500, Bruce Momjian wrote:
> On Fri, Dec  7, 2012 at 11:46:51AM -0500, Tom Lane wrote:
> > Bruce Momjian  writes:
> > > On Fri, Dec  7, 2012 at 10:29:22AM -0500, Tom Lane wrote:
> > >> On balance I am coming around to support the "just throw an error if
> > >> there are any invalid indexes" position.  Adding extra complication in
> > >> pg_dump and pg_upgrade to handle ignoring them doesn't seem like a good
> > >> idea --- for one thing, it will evidently weaken the strength of the
> > >> same-number-of-relations cross-check.
> > 
> > > The check would remain the same --- the change would be to prevent
> > > invalid indexes from being considered on both the old and new servers.
> > 
> > But that weakens the check.  For instance, if you had seven invalid
> > indexes in one cluster and eight in the other, you wouldn't notice.
> 
> That is true, though the assumption is that invalid indexes are
> insignificant.  It would be a new case where actual non-system-table
> _files_ were not transferred.
> 
> Seems most people want the error so I will start working on a patch.

Patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index bccceb1..832f22a
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** static void check_is_super_user(ClusterI
*** 20,25 
--- 20,26 
  static void check_for_prepared_transactions(ClusterInfo *cluster);
  static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster);
  static void check_for_reg_data_type_usage(ClusterInfo *cluster);
+ static void check_for_invalid_indexes(ClusterInfo *cluster);
  static void get_bin_version(ClusterInfo *cluster);
  static char *get_canonical_locale_name(int category, const char *locale);
  
*** check_and_dump_old_cluster(bool live_che
*** 97,102 
--- 98,104 
  	check_is_super_user(&old_cluster);
  	check_for_prepared_transactions(&old_cluster);
  	check_for_reg_data_type_usage(&old_cluster);
+ 	check_for_invalid_indexes(&old_cluster);
  	check_for_isn_and_int8_passing_mismatch(&old_cluster);
  
  	/* old = PG 8.3 checks? */
*** check_for_reg_data_type_usage(ClusterInf
*** 920,925 
--- 922,1015 
  			   "%s\n\n", output_path);
  	}
  	else
+ 		check_ok();
+ }
+ 
+ 
+ /*
+  * check_for_invalid_indexes()
+  *
+  *	CREATE INDEX CONCURRENTLY can create invalid indexes if the index build
+  *	fails.  These are dumped as valid indexes by pg_dump, but the
+  *	underlying files are still invalid indexes.  This checks to make sure
+  *	no invalid indexes exist, either failed index builds or concurrent
+  *	indexes in the process of being created.
+  */
+ static void
+ check_for_invalid_indexes(ClusterInfo *cluster)
+ {
+ 	int			dbnum;
+ 	FILE	   *script = NULL;
+ 	bool		found = false;
+ 	char		output_path[MAXPGPATH];
+ 
+ 	prep_status("Checking for invalid indexes from concurrent index builds");
+ 
+ 	snprintf(output_path, sizeof(output_path), "invalid_indexes.txt");
+ 
+ 	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ 	{
+ 		PGresult   *res;
+ 		bool		db_used = false;
+ 		int			ntups;
+ 		int			rowno;
+ 		int			i_nspname,
+ 					i_relname;
+ 		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+ 		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+ 
+ 		res = executeQueryOrDie(conn,
+ 								"SELECT n.nspname, c.relname "
+ 								"FROM	pg_catalog.pg_class c, "
+ 								"		pg_catalog.pg_namespace n, "
+ 								"		pg_catalog.pg_index i "
+ 								"WHERE	i.indisvalid = false AND "
+ 								"		i.indexrelid = c.oid AND "
+ 								"		c.relnamespace = n.oid AND "
+ 								/* we do not migrate these, so skip them */
+ 								"		n.nspname != 'pg_catalog' AND "
+ 								"		n.nspname != 'information_schema' AND "
+ 								/* indexes do not have toast tables */
+ 								"		n.nspname != 'pg_toast'");
+ 
+ 		ntups = PQntuples(res);
+ 		i_nspname = PQfnumber(res, "nspname");
+ 		i_relname = PQfnumber(res, "relname");
+ 		for (rowno = 0; rowno < ntups; rowno++)
+ 		{
+ 			found = true;
+ 			if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+ 				pg_log(PG_FATAL, "Could not open file \"%s\": %s\n",
+ 					   output_path, getErrorText(errno));
+ 			if (!db_used)
+ 			{
+ 				fprintf(script, "Database: %s\n", active_db->db_name);
+ 				db_used = true;
+ 			}
+ 			fprintf(script, "  %s.%s\n",
+ 					PQgetvalue(res, rowno, i_nspname),
+ 					PQgetvalue(res, rowno, i_relname));
+ 		}
+ 
+ 		PQclear(res);
+ 
+ 		PQfinish(conn);
+ 	}
+ 
+ 	if (script)
+ 		fclose(script);
+ 
+ 	if (found)
+ 	{
+ 		pg_log(PG_REPORT, "fatal\n");
+ 		pg_log(PG_FATAL,
+ 			   "Your installation contains invalid indexes due to failed or\n"
+ 			   "currently running CREATE INDEX CONCURRENTLY operations.  You\n"
+ 			   "cannot upgrade until these indexes are valid or removed.  A\n"
+ 			   "list of the problem indexes is in the file:\n"
+ 			   "    %s\n\n", output_path);
+ 	}
+ 	else
+ 		check_ok();
+ }

Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 13:59:41 -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Fri, Dec  7, 2012 at 07:49:14PM +0100, Andres Freund wrote:
> >> !indislive indexes are created during DROP INDEX CONCURRENTLY. That's a
> >> different case than CREATE INDEX CONCURRENTLY. Accessing their
> >> definition is actually problematic because it can vanish while you're
> >> examining it, which could cause errors both in the backend and in pg_dump.
>
> > Is that something pg_upgrade needs to worry about too?  Is
> > pg_index.indisvalid the only thing pg_upgrade needs to check?
>
> indisvalid should be sufficient.  If you try to test more than that
> you're going to make the code more version-specific, without actually
> buying much.

Doesn't the check need to be at least indisvalid && indisready? Given
that 9.2 represents !indislive as indisvalid && !indisready?

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 13:54:44 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > !indislive indexes are created during DROP INDEX CONCURRENTLY. That's a
> > different case than CREATE INDEX CONCURRENTLY. Accessing their
> > definition is actually problematic because it can vanish while you're
> > examining it, which could cause errors both in the backend and in pg_dump.
>
> That's true of any index, not just !indislive ones.  If you're doing DDL
> during a pg_dump, and that makes it fail, you get to keep both pieces.

Uhm. pg_dump is going to lock the tables at the start, so I don't really see
anything dangerous happening otherwise. Once it has done so, newly
started concurrent CREATE/DROPs won't finish, and normal ones can't even
start.

What I am thinking about is parts of the dead index's pg_attribute
entries vanishing while pg_get_indexdef() is running. That would
probably result in some scary error messages.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Tom Lane
Bruce Momjian  writes:
> On Fri, Dec  7, 2012 at 07:49:14PM +0100, Andres Freund wrote:
>> !indislive indexes are created during DROP INDEX CONCURRENTLY. That's a
>> different case than CREATE INDEX CONCURRENTLY. Accessing their
>> definition is actually problematic because it can vanish while you're
>> examining it, which could cause errors both in the backend and in pg_dump.

> Is that something pg_upgrade needs to worry about too?  Is
> pg_index.indisvalid the only thing pg_upgrade needs to check?

indisvalid should be sufficient.  If you try to test more than that
you're going to make the code more version-specific, without actually
buying much.

regards, tom lane




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 07:49:14PM +0100, Andres Freund wrote:
> On 2012-12-07 10:44:53 -0800, Josh Berkus wrote:
> >
> > > I wonder though if we shouldn't ignore !indislive indexes in pg_dump
> > > (and the respective bw-compat hack).
> 
> Obviously I wanted to ask whether we *should* ignore them in the future.
> 
> > Quite likely we shouldn't.  However, that is why it wasn't considered a
> > problem.
> 
> !indislive indexes are created during DROP INDEX CONCURRENTLY. That's a
> different case than CREATE INDEX CONCURRENTLY. Accessing their
> definition is actually problematic because it can vanish while you're
> examining it, which could cause errors both in the backend and in pg_dump.

Is that something pg_upgrade needs to worry about too?  Is
pg_index.indisvalid the only thing pg_upgrade needs to check?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Tom Lane
Andres Freund  writes:
> !indislive indexes are created during DROP INDEX CONCURRENTLY. That's a
> different case than CREATE INDEX CONCURRENTLY. Accessing their
> definition is actually problematic because it can vanish while you're
> examining it, which could cause errors both in the backend and in pg_dump.

That's true of any index, not just !indislive ones.  If you're doing DDL
during a pg_dump, and that makes it fail, you get to keep both pieces.

regards, tom lane




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 10:44:53 -0800, Josh Berkus wrote:
>
> > I wonder though if we shouldn't ignore !indislive indexes in pg_dump
> > (and the respective bw-compat hack).

Obviously I wanted to ask whether we *should* ignore them in the future.

> Quite likely we shouldn't.  However, that is why it wasn't considered a
> problem.

!indislive indexes are created during DROP INDEX CONCURRENTLY. That's a
different case than CREATE INDEX CONCURRENTLY. Accessing their
definition is actually problematic because it can vanish while you're
examining it, which could cause errors both in the backend and in pg_dump.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Josh Berkus

> I wonder though if we shouldn't ignore !indislive indexes in pg_dump
> (and the respective bw-compat hack).

Quite likely we shouldn't.  However, that is why it wasn't considered a
problem.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Review of Row Level Security

2012-12-07 Thread Simon Riggs
On 5 December 2012 11:16, Kohei KaiGai  wrote:

>> Oracle defaults to putting VPD on all event types: INSERT, UPDATE,
>> DELETE, SELECT. ISTM we should be doing the same, not just say "we can
>> add an INSERT trigger if you want".
>>
>> Adding a trigger just begs the question as to why we are bothering in
>> the first place, since this functionality could already be added by
>> INSERT, UPDATE or DELETE triggers, if they are a full replacement for
>> this feature. The only answer is "ease of use"
>>
>> We can easily add syntax like this
>>
>> [ROW SECURITY CHECK ( <expression> ) [ON [ ALL | INSERT, UPDATE, DELETE, SELECT [, ...] ]]]
>>
>> with the default being "ALL"
>>
> I think it is flaw of Oracle. :-)

Agreed

> In the case where a user can define a leaky function, it becomes possible to
> leak the contents of invisible rows at the time the executor fetches the
> rows, prior to the modification stage, even if we allow configuring
> individual row-security policies for SELECT and DELETE or UPDATE commands.
> My preference is one policy on a particular table for all the commands.

Yes, only one security policy allowed.

The question is whether we should offer the option to enforce it on a subset
of command types.

That isn't something I can see a need for myself.


>> * psql \d support needed
>>
> Are you suggesting printing out the full qualifiers of the row-security
> policy, or a mark to indicate whether row security is configured or not?

One of those options, yes

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review of Row Level Security

2012-12-07 Thread Simon Riggs
On 5 December 2012 11:16, Kohei KaiGai  wrote:

>> * TRUNCATE works, and allows you to remove all rows of a table, even
>> ones you can't see to run a DELETE on. Er...
>>
> It was my oversight. My preference is to rewrite the TRUNCATE command
> as a DELETE statement when a row-security policy is active on
> the target table.
> In this case, a NOTICE message may be helpful so that users do not assume
> the table is always empty after the command.

I think the default must be to throw an ERROR, since part of the
contract with TRUNCATE is that it is fast and removes storage.


>> * Docs need work, but thats OK.
>>
> I'd like some help from native English speakers.

I'll help with that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 10:22:12 -0800, Josh Berkus wrote:
>
> > Yes, I thought of not dumping it.  The problem is that we don't delete
> > the index when it fails, so I assumed we didn't want to lose the index
> > creation information.  I need to understand why we did that.  Why do we
> > have pg_dump dump the index then?
>
> Because pg_restore will recreate the index from scratch, which is
> presumably what users want most of the time.  So this issue doesn't
> exist outside of pg_upgrade.

I wonder though if we shouldn't ignore !indislive indexes in pg_dump
(and the respective bw-compat hack).

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Josh Berkus

> Yes, I thought of not dumping it.  The problem is that we don't delete
> the index when it fails, so I assumed we didn't want to lose the index
> creation information.  I need to understand why we did that.  Why do we
> have pg_dump dump the index then?

Because pg_restore will recreate the index from scratch, which is
presumably what users want most of the time.  So this issue doesn't
exist outside of pg_upgrade.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-07 Thread Simon Riggs
On 7 December 2012 17:19, Andres Freund  wrote:
> On 2012-12-07 12:01:52 -0500, Tom Lane wrote:
>> Simon Riggs  writes:
>> > On 7 December 2012 12:37, Michael Paquier  wrote:
>> >> - There is still a problem with toast indexes. If the concurrent reindex of
>> >> a toast index fails for one reason or another, pg_relation will finish with
>> >> invalid toast index entries. I am still wondering about how to clean up
>> >> that. Any ideas?
>>
>> > Build another toast index, rather than reindexing the existing one,
>> > then just use the new oid.
>
> That's easier said than done in the first place. toast_save_datum()
> explicitly opens/modifies the one index it needs and updates it.

Well, yeah, I know what I'm saying: it would need to maintain two indexes
for a while.

The point is to use the same trick we do manually now, which works
fine for normal indexes and can be made to work for toast indexes
also.

>> Um, I don't think you can swap in a new toast index OID without taking
>> exclusive lock on the parent table at some point.
>
> The whole swapping issue isn't solved satisfyingly as whole yet :(.
>
> If we just swap the index relfilenodes in the pg_index entries itself,
> we wouldn't need to modify the main table's pg_class at all.

yes

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proof of concept: auto updatable views [Review of Patch]

2012-12-07 Thread Tom Lane
[ finally getting back to this patch, after many distractions ]

Dean Rasheed  writes:
> On 8 November 2012 21:13, Tom Lane  wrote:
>> I think the most reasonable backwards-compatibility argument is that we
>> shouldn't change the behavior if there are either INSTEAD rules or
>> INSTEAD triggers.  Otherwise we may be disturbing carefully constructed
>> behavior (and no, I don't buy that "throw an error" couldn't be what the
>> user intended).

> The current behaviour, if there is only a conditional instead rule, is
> to throw an error whether or not that condition is satisfied. It's
> hard to imagine that's an error the user intended.

Well, the current definition is that a rule set with a conditional
DO INSTEAD rule is incomplete and needs to be fixed.  I don't see
anything particularly wrong with that, and I remain hesitant to silently
redefine the behavior of existing rule sets.

> However, given the niche nature of conditional instead rules, it
> doesn't seem so bad to say that auto-updatable views don't support
> them at the moment, so long as backwards compatibility is maintained
> in the table and trigger-updatable view cases. So I think the current
> behaviour to maintain is, for a relation with only a conditional
> instead rule:

> if the relation is a table:
> if the condition is satisfied: fire the rule action
> else: modify the table
> else if the relation is a view with triggers:
> if the condition is satisfied: fire the rule action
> else: modify the view using the triggers
> else:
> throw an error unconditionally

> That's backwards compatible and easy to document - views with
> conditional instead rules are not auto-updatable. If anyone cared
> enough about it, or could come up with a realistic use case, we could
> always add support for that case in the future.

But if there's an unconditional INSTEAD rule, we aren't going to apply
the transformation either, so it seems to me this comes out to the
same thing I said above: any kind of INSTEAD rule disables application
of the auto-update transformation.

regards, tom lane




Re: [HACKERS] Fix for pg_upgrade status display

2012-12-07 Thread Bruce Momjian
On Wed, Dec  5, 2012 at 10:04:53PM -0500, Bruce Momjian wrote:
> Pg_upgrade displays file names during copy and database names during
> dump/restore.  Andrew Dunstan identified three bugs:
> 
> *  long file names were being truncated to 60 _leading_ characters, which
> often do not change for long file names
> 
> *  file names were truncated to 60 characters in log files
> 
> *  carriage returns were being output to log files
> 
> The attached patch fixes these --- it prints 60 _trailing_ characters to
> the status display, and full path names without carriage returns to log
> files.

Patch applied.   It also suppresses status output to the log file unless
verbose mode is used.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-07 Thread Andres Freund
On 2012-12-07 12:01:52 -0500, Tom Lane wrote:
> Simon Riggs  writes:
> > On 7 December 2012 12:37, Michael Paquier  wrote:
> >> - There is still a problem with toast indexes. If the concurrent reindex of
> >> a toast index fails for one reason or another, pg_relation will finish with
> >> invalid toast index entries. I am still wondering about how to clean up
> >> that. Any ideas?
>
> > Build another toast index, rather than reindexing the existing one,
> > then just use the new oid.

That's easier said than done in the first place. toast_save_datum()
explicitly opens/modifies the one index it needs and updates it.

> Um, I don't think you can swap in a new toast index OID without taking
> exclusive lock on the parent table at some point.

The whole swapping issue isn't solved satisfyingly as whole yet :(.

If we just swap the index relfilenodes in the pg_index entries itself,
we wouldn't need to modify the main table's pg_class at all.

> One sticking point is the need to update pg_class.reltoastidxid.  I
> wonder how badly we need that field though --- could we get rid of it
> and treat toast-table indexes just the same as normal ones?  (Whatever
> code is looking at the field could perhaps instead rely on
> RelationGetIndexList.)

We could probably just set Relation->rd_toastidx when building the
relcache entry for the toast table, so it doesn't have to search the
whole index list all the time. Not that the list would be too big, but...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-07 Thread Tom Lane
Simon Riggs  writes:
> On 7 December 2012 12:37, Michael Paquier  wrote:
>> - There is still a problem with toast indexes. If the concurrent reindex of
> >> a toast index fails for one reason or another, pg_relation will finish with
>> invalid toast index entries. I am still wondering about how to clean up
>> that. Any ideas?

> Build another toast index, rather than reindexing the existing one,
> then just use the new oid.

Um, I don't think you can swap in a new toast index OID without taking
exclusive lock on the parent table at some point.

One sticking point is the need to update pg_class.reltoastidxid.  I
wonder how badly we need that field though --- could we get rid of it
and treat toast-table indexes just the same as normal ones?  (Whatever
code is looking at the field could perhaps instead rely on
RelationGetIndexList.)
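
A rough sketch of that suggestion (a hypothetical helper, not existing
code; it relies on toast tables having exactly one index today, and only
makes sense in backend context):

/*
 * Hypothetical replacement for reading pg_class.reltoastidxid: look the
 * index up via the toast table's index list instead.
 */
static Oid
toast_index_oid(Relation toastrel)
{
	List	   *indexlist = RelationGetIndexList(toastrel);
	Oid			result;

	/* toast tables currently have exactly one index */
	Assert(list_length(indexlist) == 1);
	result = linitial_oid(indexlist);
	list_free(indexlist);
	return result;
}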

regards, tom lane




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 11:46:51AM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Fri, Dec  7, 2012 at 10:29:22AM -0500, Tom Lane wrote:
> >> On balance I am coming around to support the "just throw an error if
> >> there are any invalid indexes" position.  Adding extra complication in
> >> pg_dump and pg_upgrade to handle ignoring them doesn't seem like a good
> >> idea --- for one thing, it will evidently weaken the strength of the
> >> same-number-of-relations cross-check.
> 
> > The check would remain the same --- the change would be to prevent
> > invalid indexes from being considered on both the old and new servers.
> 
> But that weakens the check.  For instance, if you had seven invalid
> indexes in one cluster and eight in the other, you wouldn't notice.

That is true, though the assumption is that invalid indexes are
insignificant.  It would be a new case where actual non-system-table
_files_ were not transferred.

Seems most people want the error so I will start working on a patch.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Tom Lane
Bruce Momjian  writes:
> On Fri, Dec  7, 2012 at 10:29:22AM -0500, Tom Lane wrote:
>> On balance I am coming around to support the "just throw an error if
>> there are any invalid indexes" position.  Adding extra complication in
>> pg_dump and pg_upgrade to handle ignoring them doesn't seem like a good
>> idea --- for one thing, it will evidently weaken the strength of the
>> same-number-of-relations cross-check.

> The check would remain the same --- the change would be to prevent
> invalid indexes from being considered on both the old and new servers.

But that weakens the check.  For instance, if you had seven invalid
indexes in one cluster and eight in the other, you wouldn't notice.

regards, tom lane




Re: [HACKERS] ALTER TABLE ... NOREWRITE option

2012-12-07 Thread Robert Haas
On Thu, Dec 6, 2012 at 3:34 PM, Dimitri Fontaine  wrote:
> I think we need to be solving that problem when we implement new firing
> points for event trigger. The 'table_rewrite' event needs to fire at a
> time when the code can cope with it. That's the main difficulty in
> adding events in that system, asserting their code location safety.

Agreed.

> They need to fire before catalogs are modified, or after, not in
> between, I agree with that. I don't see other ways of implementing that
> than carefully placing the call to user code in the backend's code.

Also agreed.

> The log insert needs to happen either before or after the rewrite, in
> terms of catalog state. I don't see any magic bullet here.

And again agreed.

I think in this case we need to work out before actually doing
anything whether a rewrite will occur, and then remember that
decision.  If the decision is yes, then we call the trigger.  After
calling the trigger, we need to revalidate that it hasn't invalidated
any critical assumptions upon which we're relying for the sanity of
the system (e.g. the table hasn't been altered or dropped).  Assuming
all is well, we then proceed to do the actual operation, basing the
decision as to whether or not to rewrite on the remembered state.

I think ALTER TABLE actually has a lot of this machinery already in
the form of a distinction between "prep" and "exec".  However, there
are some cases where the prep work has been folded into the execute
phase (or maybe vice versa).  So there may be some code cleanup that
is needed there, or we may need to move some things from the exec
phase to the prep phase to make it all work out.  I think it's totally
doable, but it's not going to be a 50-line patch.
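
A hypothetical sketch of that ordering (none of these helper names exist in
PostgreSQL; they only illustrate decide-up-front, fire, revalidate, execute):

#include "postgres.h"
#include "nodes/parsenodes.h"
#include "utils/syscache.h"

static void
AlterTableFiringRewriteTrigger(Oid relid, AlterTableStmt *stmt)
{
	bool		will_rewrite;

	/* prep phase: decide up front, and remember the decision */
	will_rewrite = AlterTableRequiresRewrite(stmt);		/* assumed helper */

	if (will_rewrite)
		FireTableRewriteEventTriggers(relid);			/* assumed helper */

	/* the trigger ran arbitrary user code; revalidate our assumptions */
	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
		ereport(ERROR,
				(errmsg("table was dropped by event trigger")));

	/* exec phase: act on the remembered decision, not a fresh one */
	AlterTableExecute(stmt, will_rewrite);				/* assumed helper */
}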

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)

2012-12-07 Thread Simon Riggs
On 4 December 2012 20:38, Jeff Davis  wrote:
> Even if it is, I would
> think that a sufficiently narrow case -- such as CTAS outside of a
> transaction block -- would be safe

CTAS would be safe to do that with. CLUSTER and VACUUM FULL are already done.

Outside of a transaction block would be automatic.

Inside a transaction block we could use a parameter called
initially_frozen. This would act the same as FREEZE on COPY. The parameter
would not be recorded in the reloptions field in pg_class, since it
is a one-time option and has no meaning after the end of the xact.


> along with some slightly broader
> cases (like BEGIN; CREATE TABLE; INSERT/COPY).

Historically we've chosen not to allow bulk insert options on INSERT,
so that is a separate discussion.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 10:29:22AM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Fri, Dec  7, 2012 at 03:32:51PM +0100, Andres Freund wrote:
> >> ISTM that anybody who does DDL during or after pg_upgrade --check
> >> deserves any pain.
> >> 
> >> So throwing an error in both seems perfectly fine for me.
> 
> I agree with Andres on this.
> 
> > I am just saying that this makes the --check report more likely to false
> > failures than currently configured.
> 
> It's not a "false" failure.  If you were to force-shutdown the system at
> that instant (not allowing C.I.C. to complete) then do a pg_upgrade, it
> would fail.  So what's wrong with telling the user so?

True.

> If you want, you could have the error message include some weasel
> wording about how this might be only a transient state, but frankly
> I think that's more likely to be confusing than helpful.  As Andres
> says, nobody should expect that it's sensible to do "pg_upgrade --check"
> while any DDL is actively executing, so you'd be warning about a state
> that more than likely isn't reality anyway.

Well, if we fail, we want to give the user a solution for fixing it, and
that solution is going to be different depending on whether the index is
in-process or from long ago.  Those index names are going to be dumped
into a file for the user to review.

It will require weasel wording because pg_upgrade will have to say that
the indexes are either in the process of being concurrently created, or
are the result of a failed concurrent index creation in the past.  Users
will either have to wait for the index creation to complete or delete
the invalid index.

> On balance I am coming around to support the "just throw an error if
> there are any invalid indexes" position.  Adding extra complication in
> pg_dump and pg_upgrade to handle ignoring them doesn't seem like a good
> idea --- for one thing, it will evidently weaken the strength of the
> same-number-of-relations cross-check.

The check would remain the same --- the change would be to prevent
invalid indexes from being considered on both the old and new servers.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-07 Thread Andres Freund
On 2012-12-07 21:37:06 +0900, Michael Paquier wrote:
> Hi all,
>
> Long time this thread has not been updated...
> Please find attached the version 3 of the patch for support of REINDEX
> CONCURRENTLY.
> The code has been realigned with master up to commit da07a1e (6th December).
>
> Here are the things modified:
> - Improve code to use index_set_state_flag introduced by Tom in commit
> 3c84046
> - One transaction is used for each index swap (N transactions if N indexes
> reindexed at the same time)
> - Fixed a bug to drop the old indexes concurrently at the end of process
>
> The index swap is managed by switching the names of the new and old indexes
> using RenameRelationInternal several times. This API takes an exclusive
> lock on the relation that is renamed until the end of the transaction
> managing the swap. This has been discussed in this thread and other
> threads, but it is important to mention it for people who have not read the
> patch.

Won't working like this cause problems when dependencies on that
index exist, e.g. an index-based constraint?

As you have an access exclusive lock, you should be able to just switch
the relfilenodes of both and concurrently drop the *_cci index with the
old relfilenode afterwards; that would preserve the index states.

Right now I think clearing checkxmin is all you would need other than
that. We know we don't need it in the concurrent context.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Switching timeline over streaming replication

2012-12-07 Thread Heikki Linnakangas

On 06.12.2012 15:39, Amit Kapila wrote:
> On Thursday, December 06, 2012 12:53 AM Heikki Linnakangas wrote:
>> On 05.12.2012 14:32, Amit Kapila wrote:
>>> On Tuesday, December 04, 2012 10:01 PM Heikki Linnakangas wrote:
>>>> After some diversions to fix bugs and refactor existing code, I've
>>>> committed a couple of small parts of this patch, which just add some
>>>> sanity checks to notice incorrect PITR scenarios. Here's a new
>>>> version of the main patch based on current HEAD.
>>>
>>> After testing with the new patch, the following problems are observed.
>>>
>>> Defect - 1:
>>>
>>>   1. Start primary A.
>>>   2. Start standby B following A.
>>>   3. Start cascade standby C following B.
>>>   4. Start another standby D following C.
>>>   5. Promote standby B.
>>>   6. After the successful timeline switch in cascade standbys C & D, stop D.
>>>   7. Restart D; startup is successful and it connects to standby C.
>>>   8. Stop C.
>>>   9. Restart C; startup fails.
>>
>> Ok, the error I get in that scenario is:
>>
>> C 2012-12-05 19:55:43.840 EET 9283 FATAL:  requested timeline 2 does not contain minimum recovery point 0/3023F08 on timeline 1
>> C 2012-12-05 19:55:43.841 EET 9282 LOG:  startup process (PID 9283) exited with exit code 1
>> C 2012-12-05 19:55:43.841 EET 9282 LOG:  aborting startup due to startup process failure
>>
>> That mismatch causes the error. I'd like to fix this by always treating
>> the checkpoint record to be part of the new timeline. That feels more
>> correct. The most straightforward way to implement that would be to peek
>> at the xlog record before updating replayEndRecPtr and replayEndTLI. If
>> it's a checkpoint record that changes the TLI, set replayEndTLI to the new
>> timeline before calling the redo function. But it's a bit of a
>> modularity violation to peek into the record like that.
>>
>> Or we could just revert the sanity check at the beginning of recovery that
>> throws the "requested timeline 2 does not contain minimum recovery point
>> 0/3023F08 on timeline 1" error. The error I added to the redo of the
>> checkpoint record that says "unexpected timeline ID %u in checkpoint
>> record, before reaching minimum recovery point %X/%X on timeline %u"
>> checks basically the same thing, but at a later stage. However, the way
>> minRecoveryPointTLI is updated still seems wrong to me, so I'd like to
>> fix that.
>>
>> I'm thinking of something like the attached (with some more comments
>> before committing). Thoughts?
>
> This has fixed the problem reported.
> However, I am not able to think of any problem that would arise if we
> removed the "requested timeline 2 does not contain minimum recovery point
> 0/3023F08 on timeline 1" check at the beginning of recovery and just
> updated replayEndTLI with ThisTimeLineID?


Well, it seems wrong for the control file to contain a situation like this:

pg_control version number:            932
Catalog version number:               201211281
Database system identifier:           5819228770976387006
Database cluster state:               shut down in recovery
pg_control last modified:             pe  7. joulukuuta 2012 17.39.57
Latest checkpoint location:           0/3023EA8
Prior checkpoint location:            0/260
Latest checkpoint's REDO location:    0/3023EA8
Latest checkpoint's REDO WAL file:    00020003
Latest checkpoint's TimeLineID:       2
...
Time of latest checkpoint:            pe  7. joulukuuta 2012 17.39.49
Min recovery ending location:         0/3023F08
Min recovery ending loc's timeline:   1

Note the latest checkpoint location and its TimelineID, and compare them 
with the min recovery ending location. The min recovery ending location 
is ahead of latest checkpoint's location; the min recovery ending 
location actually points to the end of the checkpoint record. But how 
come the min recovery ending location's timeline is 1, while the
checkpoint record's timeline is 2?


Now maybe that would happen to work if remove the sanity check, but it 
still seems horribly confusing. I'm afraid that discrepancy will come 
back to haunt us later if we leave it like that. So I'd like to fix that.


Mulling over this some more, I propose the attached patch. With the
patch, we peek into the checkpoint record, and actually perform the 
timeline switch (by changing ThisTimeLineID) before replaying it. That 
way the checkpoint record is really considered to be on the new timeline 
for all purposes. At the moment, the only difference that makes in 
practice is that we set replayEndTLI, and thus minRecoveryPointTLI, to 
the new TLI, but it feels logically more correct to do it that way.
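
In pseudo-code, the idea is something like this (a sketch to illustrate
only; the attached patch has the real code, with proper sanity checks):

	/*
	 * Sketch: peek at the record before updating
	 * replayEndRecPtr/replayEndTLI in the main redo loop.
	 */
	if (record->xl_rmid == RM_XLOG_ID &&
		(record->xl_info & ~XLR_INFO_MASK) == XLOG_CHECKPOINT_SHUTDOWN)
	{
		CheckPoint	checkPoint;

		memcpy(&checkPoint, XLogRecGetData(record), sizeof(CheckPoint));
		if (checkPoint.ThisTimeLineID != ThisTimeLineID)
		{
			/*
			 * Switch to the new timeline before replaying the record, so
			 * that the checkpoint record itself counts as part of the new
			 * timeline, and replayEndTLI (and thus minRecoveryPointTLI)
			 * pick up the new TLI.
			 */
			ThisTimeLineID = checkPoint.ThisTimeLineID;
		}
	}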


- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2618c8d..9bd7f03 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -605,6 +605,7 @@ static void SetLatestXTime(TimestampTz xtime);
 static void SetCurrentChunkStartTime(TimestampTz xtime);
 static void CheckRequiredParameterValues(void);
 static void XLogReportParameters(void);
+static void checkTimeLineSwitch(XLogRecPtr lsn, TimeLine

Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Tom Lane
Bruce Momjian  writes:
> On Fri, Dec  7, 2012 at 03:32:51PM +0100, Andres Freund wrote:
>> ISTM that anybody who does DDL during or after pg_upgrade --check
>> deserves any pain.
>> 
>> So throwing an error in both seems perfectly fine for me.

I agree with Andres on this.

> I am just saying that this makes the --check report more likely to show
> false failures than it currently does.

It's not a "false" failure.  If you were to force-shutdown the system at
that instant (not allowing C.I.C. to complete) then do a pg_upgrade, it
would fail.  So what's wrong with telling the user so?

If you want, you could have the error message include some weasel
wording about how this might be only a transient state, but frankly
I think that's more likely to be confusing than helpful.  As Andres
says, nobody should expect that it's sensible to do "pg_upgrade --check"
while any DDL is actively executing, so you'd be warning about a state
that more than likely isn't reality anyway.


On balance I am coming around to support the "just throw an error if
there are any invalid indexes" position.  Adding extra complication in
pg_dump and pg_upgrade to handle ignoring them doesn't seem like a good
idea --- for one thing, it will evidently weaken the strength of the
same-number-of-relations cross-check.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 03:32:51PM +0100, Andres Freund wrote:
> > Well, it is a CONCURRENT index creation, so locking would be minimal.
> 
> I wouldn't call a ShareUpdateExclusiveLock minimal...
> 
> > Do we want pg_upgrade to be groveling over the lock view to look for
> > locks?  I don't think so.
> 
> ISTM that anybody who does DDL during or after pg_upgrade --check
> deserves any pain.
> 
> So throwing an error in both seems perfectly fine for me.

Well, most of the current checks relate to already-created objects. 
To fail for in-process concurrent index creation is to fail for an
intermediate state --- index creation in process, but might complete
before we do the actual upgrade.  Or it might not be an intermediate
state.

I am just saying that this makes the --check report more likely to show
false failures than it currently does.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 09:27:39 -0500, Bruce Momjian wrote:
> On Fri, Dec  7, 2012 at 03:24:46PM +0100, Andres Freund wrote:
> > On 2012-12-07 09:21:58 -0500, Bruce Momjian wrote:
> > > On Thu, Dec  6, 2012 at 10:27:21PM -0500, Stephen Frost wrote:
> > > > * Bruce Momjian (br...@momjian.us) wrote:
> > > > > On Thu, Dec  6, 2012 at 09:45:11PM -0500, Stephen Frost wrote:
> > > > > > Or preserve it as-is.  I don't really like the 'make them fix it'
> > > > > > option, as a user could run into that in the middle of a planned 
> > > > > > upgrade
> > > > > > that had been tested and never had that come up.
> > > > >
> > > > > They would get the warning during pg_upgrade --check, of course.
> > > >
> > > > Sure, if they happened to have a concurrent index creation going when
> > > > they ran the check...  But what if they didn't and it only happened to
> > > > happen during the actual pg_upgrade?  I'm still not thrilled with this
> > > > idea of making the user have to abort in the middle to address something
> > > > that, really, isn't a big deal to just preserve and deal with later...
> > >
> > > If a concurrent index creation was happening during the check,
> > > pg_upgrade --check would fail.  I don't think there is any indication if
> > > the index is failed, or in process.
> >
> > There should be a lock on the table + index if the creation is in
> > progress.
>
> Well, it is a CONCURRENT index creation, so locking would be minimal.

I wouldn't call a ShareUpdateExclusiveLock minimal...

> Do we want pg_upgrade to be groveling over the lock view to look for
> locks?  I don't think so.

ISTM that anybody who does DDL during or after pg_upgrade --check
deserves any pain.

So throwing an error in both seems perfectly fine for me.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 03:24:46PM +0100, Andres Freund wrote:
> On 2012-12-07 09:21:58 -0500, Bruce Momjian wrote:
> > On Thu, Dec  6, 2012 at 10:27:21PM -0500, Stephen Frost wrote:
> > > * Bruce Momjian (br...@momjian.us) wrote:
> > > > On Thu, Dec  6, 2012 at 09:45:11PM -0500, Stephen Frost wrote:
> > > > > Or preserve it as-is.  I don't really like the 'make them fix it'
> > > > > option, as a user could run into that in the middle of a planned 
> > > > > upgrade
> > > > > that had been tested and never had that come up.
> > > >
> > > > They would get the warning during pg_upgrade --check, of course.
> > >
> > > Sure, if they happened to have a concurrent index creation going when
> > > they ran the check...  But what if they didn't and it only happened to
> > > happen during the actual pg_upgrade?  I'm still not thrilled with this
> > > idea of making the user have to abort in the middle to address something
> > > that, really, isn't a big deal to just preserve and deal with later...
> >
> > If a concurrent index creation was happening during the check,
> > pg_upgrade --check would fail.  I don't think there is any indication if
> > the index is failed, or in process.
> 
> There should be a lock on the table + index if the creation is in
> progress.

Well, it is a CONCURRENT index creation, so locking would be minimal. 
Do we want pg_upgrade to be groveling over the lock view to look for
locks?  I don't think so.

> > That is a good argument for _not_ throwing an error because index
> > creation is more of an intermediate state.
> 
> Uhm. If pg_upgrade is actually running it's definitely not an
> intermediate state anymore...

It would be running pg_upgrade --check, which can be run while the old
server is running.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for hint bit i/o mitigation

2012-12-07 Thread Hari Babu
On Thu, Dec 6, 2012 at 8:52 PM, Merlin Moncure  wrote:
>Thanks for that -- that's fairly comprehensive I'd say.  I'm quite
>interested in that benchmarking framework as well.  Do you need help
>setting up the scripts?

Presently I am testing with pgbench's custom query option, taking I/O and
VM statistics in parallel.

I have written the test script for case 1 as follows:
 
./psql -f bench_test_1_init.sql postgres 
iostat -t 1 -d > hint.test1.iostat.reading_3.txt & 
vmstat -n 1 > hint.test1.vmstat.reading_3.txt & 
./pgbench -f bench_test_1.sql -T 300 -c 8 -j 8 -n postgres 
killall -s SIGINT iostat 
killall -s SIGINT vmstat

Where the sql files are as follows:

-- bench_test_1.sql
select count(*) from  bench where f1 is not null;

--bench_test_1_init.sql

drop table if exists bench;
create table bench(f0 int primary key, f1 char(50));
insert into bench values (generate_series(1, 10), 'a');
insert into bench values (generate_series(11, 20), 'a');
...
insert into bench values (generate_series(981, 990), 'a');
insert into bench values (generate_series(991, 1000), 'a');
checkpoint;

I will provide the test results later.

Any suggestions/comments?

Regards,
Hari babu.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Andres Freund
On 2012-12-07 09:21:58 -0500, Bruce Momjian wrote:
> On Thu, Dec  6, 2012 at 10:27:21PM -0500, Stephen Frost wrote:
> > * Bruce Momjian (br...@momjian.us) wrote:
> > > On Thu, Dec  6, 2012 at 09:45:11PM -0500, Stephen Frost wrote:
> > > > Or preserve it as-is.  I don't really like the 'make them fix it'
> > > > option, as a user could run into that in the middle of a planned upgrade
> > > > that had been tested and never had that come up.
> > >
> > > They would get the warning during pg_upgrade --check, of course.
> >
> > Sure, if they happened to have a concurrent index creation going when
> > they ran the check...  But what if they didn't and it only happened to
> > happen during the actual pg_upgrade?  I'm still not thrilled with this
> > idea of making the user have to abort in the middle to address something
> > that, really, isn't a big deal to just preserve and deal with later...
>
> If a concurrent index creation was happening during the check,
> pg_upgrade --check would fail.  I don't think there is any indication if
> the index is failed, or in process.

There should be a lock on the table + index if the creation is in
progress.

> That is a good argument for _not_ throwing an error because index
> creation is more of an intermediate state.

Uhm. If pg_upgrade is actually running it's definitely not an
intermediate state anymore...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Serious problem: media recovery fails after system or PostgreSQL crash

2012-12-07 Thread MauMau

Tomas wrote:

So why don't you use an archive command that does not create such
incomplete files? I mean something like this:

archive_command = 'cp %p /arch/%f.tmp && mv /arch/%f.tmp /arch/%f'

Until the file is renamed, it's considered 'incomplete'.



That is a good idea.  This would surely solve my problem.  Don't you think 
this must be described in the PG manual?


Now there are two solutions given.  I think it would be better to implement 
both of these.


1. Tomas's idea
Document the following method in the PG manual:

archive_command = 'cp %p /arch/%f.tmp && mv /arch/%f.tmp /arch/%f'

[Merits]
No incomplete file will remain in the end.  A successful archival 
overwrites the incomplete file created during the previous, interrupted archival.


[Demerits]
Administrators or developers of existing systems or products using 
PostgreSQL are not likely to notice this change in the PG manual, so 
they might still face the media recovery failure.



2. My idea
PG continues media recovery when it encounters a partially filled WAL 
segment file.  It ereport(LOG)'s the unexpected size and tries to fetch the 
WAL file from pg_xlog/.  This is logically natural.  If the archived WAL 
file is smaller than the expected size, it (almost always) means that 
archive processing was interrupted and the unarchived WAL file is in 
pg_xlog/.  So PG should instead fetch the same WAL file from pg_xlog/ and 
continue recovery.  PG will then try to fetch the subsequent WAL files from 
the archive again, but will probably not find them there and will fetch 
them from pg_xlog/.  This behavior complies with the description in the PG 
manual.


I mean just changing FATAL to LOG in the code in RestoreArchivedFile():

--
	if (StandbyMode && stat_buf.st_size < expectedSize)
		elevel = DEBUG1;
	else
		elevel = FATAL;
	ereport(elevel,
			(errmsg("archive file \"%s\" has wrong size: %lu instead of %lu",
					xlogfname,
					(unsigned long) stat_buf.st_size,
					(unsigned long) expectedSize)));
	return false;
--
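
That is, the fix would be just this (a sketch of the changed lines only):

	if (StandbyMode && stat_buf.st_size < expectedSize)
		elevel = DEBUG1;
	else
		elevel = LOG;	/* was FATAL: just log the wrong size; the function
						 * still returns false, so recovery falls back to
						 * reading the same file from pg_xlog/ */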

[Merits]
The administrator or developer of existing systems or products using 
PostgreSQL will benefit from this fix.


[Demerits]
The incomplete archive file(s) might be left forever unless the DBA deletes 
them.  That occurs if pg_xlog/ is lost, in which case the incomplete 
archive files will never be overwritten.



Could you give me your opinions on what to do?  I'm willing to submit these fixes.


Regards
MauMau





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Fri, Dec  7, 2012 at 09:16:56AM +, Simon Riggs wrote:
> On 7 December 2012 04:02, Alvaro Herrera  wrote:
> > Andrew Dunstan wrote:
> >>
> >> On 12/06/2012 09:23 PM, Bruce Momjian wrote:
> >
> >> >As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
> >> >stop creating creating it in the new cluster, and not transfer the index
> >> >files.
> >>
> >> So we'll lose the index definition and leave some files behind? This
> >> sounds a bit messy to say the least.
> >
> > I find it hard to get excited about this being a real problem.  If the
> > index has been kept invalid, how come the definition is so valuable?
> 
> Agreed.
> 
> I don't see the problem... just say "rebuild any invalid indexes
> before you run pg_upgrade".
> 
> I don't think pg_upgrade should take the responsibility of fixing
> everything broken before you run an upgrade. Making that rod for our
> backs looks pretty bad here and could get worse if other situations
> come to light.
> 
> Maybe it should have a mode where it detects that it would fail if you
> attempted the upgrade...

That's what pg_upgrade --check does, but see my email about in-process
concurrent index builds also causing a failure.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Thu, Dec  6, 2012 at 10:27:21PM -0500, Stephen Frost wrote:
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Thu, Dec  6, 2012 at 09:45:11PM -0500, Stephen Frost wrote:
> > > Or preserve it as-is.  I don't really like the 'make them fix it'
> > > option, as a user could run into that in the middle of a planned upgrade
> > > that had been tested and never had that come up.
> > 
> > They would get the warning during pg_upgrade --check, of course.
> 
> Sure, if they happened to have a concurrent index creation going when
> they ran the check...  But what if they didn't and it only happened to
> happen during the actual pg_upgrade?  I'm still not thrilled with this
> idea of making the user have to abort in the middle to address something
> that, really, isn't a big deal to just preserve and deal with later...

If a concurrent index creation was happening during the check,
pg_upgrade --check would fail.  I don't think there is any indication if
the index is failed, or in process.

That is a good argument for _not_ throwing an error because index
creation is more of an intermediate state.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-12-07 Thread Muhammad Usama
Hi Amit

On Mon, Dec 3, 2012 at 6:56 PM, Amit kapila  wrote:

> > I think we should expect provided path to be relative to current
> > directory, or may consider it to be relative to either one of Data or CWD.
> > Because normally we expect path to be relative to CWD if some program is
> > asking for path in command line.
>
> Please find the attached patch to make the path relative to CWD and check
> if the path is under the data directory.
>

Works well now. Although I am wondering why you are disallowing the absolute
path of a file. Any particular reason?

>
> Now the only point raised by Alvaro and you for this patch which is not
> yet addressed is:
>
> > Hmm.  I think I'd expect that if I give pg_computemaxlsn a number, it
> > should consider that it is a relfilenode, and so it should get a list of
> > all segments for all forks of it.  So if I give "12345" it should get
> > 12345, 12345.1 and so on, and also 12345_vm 12345_vm.1 and so on.
> > However, if what I give it is a path, i.e. it contains a slash, I think
> > it should only consider the specific file mentioned.  In that light, I'm
> > not sure that command line options chosen are the best set.
> I am just not sure whether we should handle this functionality, and if
> we have to, what the better way is to provide it to the user.
> Shall we provide a new option -r or something for it?
>
> Opinions/Suggestions?
>
> IMHO, such functionality can easily be extended in the future.
> However, I have no problem implementing such functionality if you are of
> the opinion that this is basic and should go with the first version of the
> feature.
>
>

I also had a similar point made by Alvaro to allow all the segments of the
relation for a given relation file name, or add another option to do the
same. But if everybody is fine with leaving it for the future, I do not
have any further concerns with the patch. It is good from my side.


> With Regards,
> Amit Kapila.

Thanks
Muhammad Usama


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Alvaro Herrera
Bruce Momjian wrote:
> On Thu, Dec  6, 2012 at 09:23:14PM -0500, Bruce Momjian wrote:
> > On Thu, Dec  6, 2012 at 09:10:21PM -0500, Tom Lane wrote:
> > > Bruce Momjian  writes:
> > > > On Thu, Dec  6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
> > > >> Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
> > > >> failed.  It's not because we want to do that, it's an implementation
> > > >> restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
> > > 
> > > > Well, what is the logic that pg_dump dumps it then, even in
> > > > non-binary-upgrade mode?
> > > 
> > > Actually, I was thinking about proposing exactly that.  Ideally the
> > > system should totally ignore an invalid index (we just fixed some bugs
> > > in that line already).  So it would be perfectly consistent for pg_dump
> > > to ignore it too, with or without --binary-upgrade.
> > > 
> > > One possible spanner in the works for pg_upgrade is that this would mean
> > > there can be relation files in the database directories that it should
> > > ignore (not transfer over).  Dunno if that takes any logic changes.
> > 
> > As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
> > stop creating it in the new cluster, and not transfer the index
> > files.
> 
> Sorry, I was wrong about this.  We would need to modify pg_dump to skip
> invalid indexes (perhaps only for --binary-upgrade),

Invalid indexes only persist in the database because the system cannot
drop them automatically.  I see no reason to have pg_dump include them
in its output, with or without --binary-upgrade.

> and pg_upgrade would also need to be modified to skip such indexes.
> This is necessary because, as a safety check, pg_upgrade requires
> there to be an exact match of relations between old and new clusters.

Right.  This shouldn't be too hard, though; just make sure to exclude
invalid indexes in the pg_upgrade query that obtains relations from the
old cluster.  Since pg_dump would not include them in the dump, the
numbers should match.
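
Something along these lines in pg_upgrade's get_rel_infos() query, say
(a sketch only -- the real query in info.c is more involved, and this is
not the actual source):

	/*
	 * Sketch: when collecting relations from the old cluster, skip indexes
	 * whose pg_index.indisvalid is false, so the counts match a dump that
	 * omits them.  (Abbreviated; not the actual pg_upgrade query.)
	 */
	snprintf(query, sizeof(query),
			 "SELECT c.oid, n.nspname, c.relname "
			 "FROM pg_catalog.pg_class c "
			 "     JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid "
			 "     LEFT JOIN pg_catalog.pg_index i ON c.oid = i.indexrelid "
			 "WHERE c.relkind IN ('r', 'i', 'S', 't') "
			 "  AND (i.indexrelid IS NULL OR i.indisvalid)");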

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Bruce Momjian
On Thu, Dec  6, 2012 at 09:23:14PM -0500, Bruce Momjian wrote:
> On Thu, Dec  6, 2012 at 09:10:21PM -0500, Tom Lane wrote:
> > Bruce Momjian  writes:
> > > On Thu, Dec  6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
> > >> Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
> > >> failed.  It's not because we want to do that, it's an implementation
> > >> restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
> > 
> > > Well, what is the logic that pg_dump dumps it then, even in
> > > non-binary-upgrade mode?
> > 
> > Actually, I was thinking about proposing exactly that.  Ideally the
> > system should totally ignore an invalid index (we just fixed some bugs
> > in that line already).  So it would be perfectly consistent for pg_dump
> > to ignore it too, with or without --binary-upgrade.
> > 
> > One possible spanner in the works for pg_upgrade is that this would mean
> > there can be relation files in the database directories that it should
> > ignore (not transfer over).  Dunno if that takes any logic changes.
> 
> As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
stop creating it in the new cluster, and not transfer the index
> files.

Sorry, I was wrong about this.  We would need to modify pg_dump to skip
invalid indexes (perhaps only for --binary-upgrade), and pg_upgrade
would also need to be modified to skip such indexes.  This is necessary
because, as a safety check, pg_upgrade requires there to be an exact
match of relations between old and new clusters.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-12-07 Thread Simon Riggs
On 7 December 2012 12:37, Michael Paquier  wrote:

> There are still two things that are missing in this patch, but I would like
> to have more feedback before moving forward:
> - REINDEX CONCURRENTLY needs tests in src/test/isolation

Yes, it needs those

> - There is still a problem with toast indexes. If the concurrent reindex of
> a toast index fails for one reason or another, pg_class will finish with
> invalid toast index entries. I am still wondering about how to clean up
> that. Any ideas?

Build another toast index, rather than reindexing the existing one,
then just use the new oid.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Setting visibility map in VACUUM's second phase

2012-12-07 Thread Pavan Deolasee
On Fri, Dec 7, 2012 at 9:21 AM, Pavan Deolasee  wrote:
> On Thu, Dec 6, 2012 at 11:31 PM, Tom Lane  wrote:
>
>>
>> I think taking a second whack at setting the visibility bit is a fine
>> idea, but let's drop all the rest of this premature optimization.
>>
>
> Fair enough. I thought about doing it that way but was worried that an
> additional page scan will raise eyebrows. While it does not affect the
> common case because we would have done that scan anyways in the
> subsequent vacuum, but in the worst case where most of the pages do not
> remain all-visible by the time we come back to the second phase of
> vacuum, this additional line pointer scan will add some overhead. With
> a couple of pluses for the approach, I won't mind doing it this way
> though.
>

Revised patch attached. This time much less invasive. I added a new
function heap_page_is_all_visible() to check if the given page is
all-visible and also return the visibility cutoff xid. A couple of
things:

- We use the same OldestXmin computed in the first phase for
visibility checks. We can potentially recompute this at the start of
the second phase, but I wasn't convinced that's needed or even correct.

- It will be a good idea to consolidate the page scanning in a single
place to avoid code duplication, but maybe we will do that some other
day.
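
The new function is essentially the following (a simplified sketch; the
attached patch is authoritative):

	static bool
	heap_page_is_all_visible(Buffer buf, TransactionId OldestXmin,
							 TransactionId *visibility_cutoff_xid)
	{
		Page		page = BufferGetPage(buf);
		OffsetNumber offnum,
					maxoff = PageGetMaxOffsetNumber(page);

		*visibility_cutoff_xid = InvalidTransactionId;

		for (offnum = FirstOffsetNumber; offnum <= maxoff;
			 offnum = OffsetNumberNext(offnum))
		{
			ItemId		itemid = PageGetItemId(page, offnum);
			HeapTupleData tuple;
			TransactionId xmin;

			/* unused, dead and redirect line pointers don't block it */
			if (!ItemIdIsUsed(itemid) || ItemIdIsDead(itemid) ||
				ItemIdIsRedirected(itemid))
				continue;

			tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
			tuple.t_len = ItemIdGetLength(itemid);

			/* anything but a plain live tuple => not all-visible */
			if (HeapTupleSatisfiesVacuum(tuple.t_data, OldestXmin, buf) !=
				HEAPTUPLE_LIVE)
				return false;

			/* the inserter must have committed before OldestXmin... */
			xmin = HeapTupleHeaderGetXmin(tuple.t_data);
			if (!TransactionIdPrecedes(xmin, OldestXmin))
				return false;

			/* ...and the newest such xmin becomes the visibility cutoff */
			if (TransactionIdFollows(xmin, *visibility_cutoff_xid))
				*visibility_cutoff_xid = xmin;
		}

		return true;
	}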

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


vacuum-secondphase-setvm-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -DCLOBBER_CACHE_ALWAYS shows COPY FREEZE regression problem

2012-12-07 Thread Simon Riggs
On 7 December 2012 00:13, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> On a new buildfarm member friarbird, configured with
>> -DCLOBBER_CACHE_ALWAYS:
>
>>BEGIN;
>>TRUNCATE vistest;
>>COPY vistest FROM stdin CSV FREEZE;
>> + NOTICE:  FREEZE option specified but pre-conditions not met
>
> The notice is coming out because the relcache is dropping the value of
> rd_newRelfilenodeSubid as a result of cache flushes.  The COPY FREEZE
> code is aware of this risk, commenting
>
>  * As mentioned in comments in utils/rel.h, the in-same-transaction test
>  * is not completely reliable, since in rare cases rd_createSubid or
>  * rd_newRelfilenodeSubid can be cleared before the end of the 
> transaction.
>  * However this is OK since at worst we will fail to make the 
> optimization.
>
> but I'd say this seriously throws into question whether it should be
> emitting a notice.  That seems to tie into the discussion a little bit
> ago about whether the FREEZE option is a command or a hint.  Throwing a
> notice seems like a pretty schizophrenic choice: it's not an error but
> you're in the user's face about it anyway.  In any case, if the option
> is unreliable (and with this implementation it certainly is), we can
> *not* treat its failure as an error, so it's not a command.

Hmm, yes, it's pretty schizophrenic, but that comes from my attempt to
offer something useful to the user in line with Robert's wish for the
hint to not be silently ignored. If I had overridden that and made it
truly silent as I had proposed, the clobber error would not have
happened.

I guess my mind must have been aware of the above, but it regrettably
wasn't consciously there. The above text was there from before, not
from the recent patch.

I'll remove the message now so tests pass, since that was my original
intention, but others may wish to suggest other ways forward.

> TBH I think that COPY FREEZE should not have been committed yet;
> it doesn't seem to be fully baked.

I think so too, in hindsight, but at the time it seemed a simple patch
that followed review and recent on-list discussion. My mistake was
tinkering with the patch before commit, which I then revoked.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.3] writable foreign tables

2012-12-07 Thread Albe Laurenz
Kohei KaiGai wrote:
> The attached patch is revised version.
> 
> One most difference from the previous version is, it constructed
> PoC features on Hanada-san's latest postgres-fdw.v5.patch.
> Yes, it looks to me to be working fine on the RDBMS backend also.

Cool.

> Even though the filename of this patch contains the phrase "poc",
> I think it may be time to consider adoption into core of the
> interface portion.
> (Of course, postgres_fdw is still works in progress.)

Ok, I'll try to review it with regard to that.

> Here is a few operation examples.
[...]
> postgres=# INSERT INTO tbl VALUES (7,'ggg'),(8,'hhh');
> INSERT 0 2

Weird, that fails for me.

CREATE TABLE test(
   id integer PRIMARY KEY,
   val text NOT NULL
);

CREATE FOREIGN TABLE rtest(
   id integer not null,
   val text not null
) SERVER loopback OPTIONS (nspname 'laurenz', relname 'test');

test=> INSERT INTO test(id, val) VALUES (1, 'one');
INSERT 0 1
test=> INSERT INTO rtest(id, val) VALUES (2, 'two');
The connection to the server was lost. Attempting reset: Failed.
!>

Here is the stack trace:
317             RangeTblEntry  *rte = root->simple_rte_array[rtindex];
#0  0x00231cb9 in deparseInsertSql (buf=0xbfffafb0, root=0x9be3614, rtindex=1) at deparse.c:317
#1  0x0023018c in postgresPlanForeignModify (root=0x9be3614, plan=0x9be3278, resultRelaion=1, subplan=0x9be3bec) at postgres_fdw.c:1538
#2  0x082a3ac2 in make_modifytable (root=0x9be3614, operation=CMD_INSERT, canSetTag=1 '\001', resultRelations=0x9be3c7c, subplans=0x9be3c4c, returningLists=0x0, rowMarks=0x0, epqParam=0) at createplan.c:4787
#3  0x082a7ada in subquery_planner (glob=0x9be357c, parse=0x9be3304, parent_root=0x0, hasRecursion=0 '\0', tuple_fraction=0, subroot=0xbfffb0ec) at planner.c:574
#4  0x082a71dc in standard_planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) at planner.c:200
#5  0x082a707b in planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) at planner.c:129
#6  0x0832c14c in pg_plan_query (querytree=0x9be3304, cursorOptions=0, boundParams=0x0) at postgres.c:753
#7  0x0832c1ec in pg_plan_queries (querytrees=0x9be3a98, cursorOptions=0, boundParams=0x0) at postgres.c:812
#8  0x0832c46e in exec_simple_query (query_string=0x9be267c "INSERT INTO rtest(id, val) VALUES (2, 'two');") at postgres.c:977

(gdb) print root->simple_rte_array
$1 = (RangeTblEntry **) 0x0

The bug I reported in my last review does not seem to be fixed, either.
The symptoms are different now (with the definitions from above):

test=> UPDATE rtest SET val='new' FROM rtest t2 WHERE rtest.id=t2.id AND t2.val LIKE '%e';
TRAP: FailedAssertion("!(baserel->relid == var->varno)", File: "foreign.c", Line: 601)
The connection to the server was lost. Attempting reset: Failed.
!>

The same happens for DELETE ... USING.

A different failure happens if I join with a local table:

test=> UPDATE rtest SET val='new' FROM test t2 WHERE rtest.id=t2.id AND t2.val = 'nonexist';
TRAP: FailedAssertion("!((((const Node*)(fscan))->type) == T_ForeignScan)", File: "postgres_fdw.c", Line: 1526)
The connection to the server was lost. Attempting reset: Failed.
!>

I gave up testing the functionality after that.

> So, let's back to the main topic of this patch.
> According to the upthread discussion, I renamed the interface to inform
> expected width of result set as follows:
> 
> +typedef AttrNumber (*GetForeignRelWidth_function) (PlannerInfo *root,
> +  RelOptInfo *baserel,
> +  Relation foreignrel,
> +  bool inhparent,
> +  List *targetList);
> 
> It informs the core how many slots for regular and pseudo columns shall
> be acquired. If it is identical to the number of attributes in the foreign
> table definition, it also means this scan does not use any pseudo columns.
> A typical use case of a pseudo column is "rowid", to move an identifier of
> a remote row from the scan stage to the modify stage. It is the
> responsibility of the FDW driver to ensure "rowid" is unique on the remote
> side; my enhancement on postgres_fdw uses ctid.
> 
> get_pseudo_rowid_column() is a utility function that picks up the attribute
> number of the pseudo "rowid" column if the query rewriter injected one at a
> previous stage. If the FDW does not support any other pseudo column
> features, the value to be returned is just the return value of this function.

Thanks, I think this will make the FDW writer's work easier.
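
If I understand correctly, a driver that supports no pseudo columns other
than the injected "rowid" could implement the callback roughly like this
(my sketch only; I'm guessing at get_pseudo_rowid_column's exact argument
list):

	static AttrNumber
	myGetForeignRelWidth(PlannerInfo *root, RelOptInfo *baserel,
						 Relation foreignrel, bool inhparent,
						 List *targetList)
	{
		/*
		 * The utility function reports the slot count to acquire,
		 * accounting for a pseudo "rowid" column if the rewriter
		 * injected one.  (Hypothetical argument list.)
		 */
		return get_pseudo_rowid_column(baserel, targetList);
	}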

> Other relevant APIs are as follows:
> 
> +typedef List *(*PlanForeignModify_function) (PlannerInfo *root,
> +ModifyTable *plan,
> +Index resultRelation,
> +Plan *subplan);
> +
> I newly added this handler on construction of ModifyTable structure.
> Because INSERT command does not have scan stage directly co

Re: [HACKERS] pg_upgrade problem with invalid indexes

2012-12-07 Thread Simon Riggs
On 7 December 2012 04:02, Alvaro Herrera  wrote:
> Andrew Dunstan wrote:
>>
>> On 12/06/2012 09:23 PM, Bruce Momjian wrote:
>
>> >As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
>> >stop creating it in the new cluster, and not transfer the index
>> >files.
>>
>> So we'll lose the index definition and leave some files behind? This
>> sounds a bit messy to say the least.
>
> I find it hard to get excited about this being a real problem.  If the
> index has been kept invalid, how come the definition is so valuable?

Agreed.

I don't see the problem... just say "rebuild any invalid indexes
before you run pg_upgrade".

I don't think pg_upgrade should take the responsibility of fixing
everything broken before you run an upgrade. Making that rod for our
backs looks pretty bad here and could get worse if other situations
come to light.

Maybe it should have a mode where it detects that it would fail if you
attempted the upgrade...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2012-12-07 Thread Kyotaro HORIGUCHI
Hello, I looked into the patch and have some comments.

Given the limited time for this rather big patch, please excuse
that these comments are only on a part of it. Others will follow
in a few days.


 heaptuple.c

nocachegetattr(_with_len):

- att_getlength must do strlen in the worst case, or VARSIZE_ANY,
  which is heavier than doing one comparison, so I recommend
  adding 'if (len)' as a guard around this, and passing NULL
  instead of &len to nocachegetattr_with_len in nocachegetattr.
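
  For example (a sketch; att_getlength and the variable names follow
  the patch, so they are only approximated here):

	/* In nocachegetattr_with_len(): compute the length only on request */
	if (len != NULL)
		*len = att_getlength(att[attnum]->attlen, tp + off);

	/* ...and nocachegetattr() itself just passes NULL for len */
	return nocachegetattr_with_len(tuple, attnum, tupleDesc, NULL);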

heap_attr_get_length_and_check_equals:

- Size seems to be used conventionally as the type for memory
  object lengths, so it might be better to use Size instead of
  int32 as the type of the *tup[12]_attr_len parameters.

- This function always returns false for attrnum <= 0, i.e. for
  whole-tuple or system attribute comparisons, regardless of the
  real result, which is a bit different from what the name
  suggests. If you need to keep this optimization, the function
  should have a name more specific to its purpose.

heap_delta_encode:

- Some misleading variable names (like match_not_found), some
  repetitions of similar code fragments (att_align_pointer,
  pglz_out_tag), misleadingly slight differences in the meanings
  of similarly named variables (old_off and new_off and the
  similar pairs), and a bit tricky use of pglz_out_add and
  pglz_out_tag with length = 0.

  These are welcome to be revised for better readability.

 heapam.c

fastgetattr_with_len

- Missing left paren in line 867 ('nocachegetattr_with_len(tup)...')

- Missing enclosing paren in heapam.c:879 (around len; style only)

- Allowing len = NULL would be good for better performance, as in
  nocachegetattr.


fastgetattr

- I suppose that the coding convention here is that a macro and
  the alternative C code are expected to look similar. fastgetattr
  looks quite different from the corresponding macro.

...


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers