Re: [HACKERS] CREATE INDEX CONCURRENTLY?

2015-12-30 Thread Tim Kane
This just hit us today... Admittedly on an old cluster still running 9.2,
though I can't see any mention of it being addressed since.

Any chance of getting this onto the to-do list?
On Sat, 1 Nov 2014 at 07:45, Simon Riggs  wrote:

> On 31 October 2014 17:46, Michael Banck  wrote:
>
> > I wonder whether that is pilot error (fair enough), or whether something
> > could be done about this?
>
> When originally written the constraints were tighter, but have since
> been relaxed.
>
> Even so, a CIC waits until all snapshots that can see it have gone. So
> what you observe is correct and known.
>
>
> Can it be changed? Maybe.
>
> CREATE INDEX gets around the wait by using indcheckxmin to see whether
> the row is usable. So the command completes, even if the index is not
> usable by all current sessions.
>
> We perform the wait in a completely different way for CIC, for this
> reason (in comments)
>
>   We also need not set indcheckxmin during a concurrent index build,
>   because we won't set indisvalid true until all transactions that care
>   about the broken HOT chains are gone.
>
> Reading that again, I can't see why we do it that way. If CREATE INDEX
> can exit once the index is built, so could CONCURRENTLY.
>
> ISTM that we could turn indcheckxmin into an Xid, not a boolean:
>    For CREATE INDEX, set indcheckxmin = the xid of the creating transaction
>    For CREATE INDEX CONCURRENTLY, set indcheckxmin = the xid of the
> completing transaction
>
> The apparent reason it does this is that the Xmin value used currently
> is the Xmin of the index row. The index row is inserted prior to the
> index being valid so that technique cannot work. So I am suggesting
> for CIC that we use the xid of the transaction that completes the
> index, not the xid that originally created the index row. Plus handle
> the difference between valid and not.
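For readers following along, the flags in question are visible in the pg_index
catalog; a quick sanity-check query (a sketch, the index name is invented):

```sql
-- Inspect the validity flags discussed above for a given index.
SELECT c.relname       AS index_name,
       i.indisvalid,   -- false until a CIC-built index is usable by all sessions
       i.indisready,   -- false until the index accepts new inserts
       i.indcheckxmin  -- true if usability depends on the pg_index row's xmin
FROM   pg_index i
JOIN   pg_class c ON c.oid = i.indexrelid
WHERE  c.relname = 'my_index';   -- hypothetical index name
```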
>
> --
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] postgres_fdw foreign keys with default sequence

2015-02-17 Thread Tim Kane
Slight typo on my local host example there.  s/clone/local/
More like the below:


CREATE FOREIGN TABLE IF NOT EXISTS live.devices (
  device_id bigint NOT NULL
);

CREATE MATERIALIZED VIEW local.devices AS
  SELECT * FROM live.devices;

CREATE TABLE test_table (
  device_id bigint REFERENCES local.devices (device_id)
);

ERROR:  referenced relation "devices" is not a table

On Tue, Feb 17, 2015 at 1:08 PM, Tim Kane tim.k...@gmail.com wrote:

 Hi all,

 Not sure if this has been reported already, it seems to be a variation on
 this thread:

 http://www.postgresql.org/message-id/20130515151059.go4...@tamriel.snowman.net


 One minor difference: in my scenario, the source table field is defined
 as BIGINT (not serial), though it does have a default nextval on a
 sequence, so ultimately it carries the same dependency.

 The primary difference (IMHO) is that I am actually foreign keying on a
 local materialised view of the fdw'ed foreign table.



 On the foreign host:
             Table live.devices
    Column   |  Type  |                        Modifiers
  -----------+--------+-----------------------------------------------------------
   device_id | bigint | not null default nextval('devices_id_sequence'::regclass)


 On the local host:


 CREATE FOREIGN TABLE IF NOT EXISTS live.devices (
   device_id bigint NOT NULL
 );

 CREATE MATERIALIZED VIEW local.devices AS
   SELECT * FROM live.devices;

 CREATE TABLE test_table (
   device_id bigint REFERENCES clone.devices (device_id)
 );

 ERROR:  referenced relation "devices" is not a table



 Though this is a similar scenario to the previous thread, I would have
 expected foreign keying from a materialised view to behave independently of
 the FDW, as if from a regular local table.

 FYI, I'm running postgresql 9.3.4

 Cheers,

 Tim





[HACKERS] postgres_fdw foreign keys with default sequence

2015-02-17 Thread Tim Kane
Hi all,

Not sure if this has been reported already, it seems to be a variation on
this thread:

http://www.postgresql.org/message-id/20130515151059.go4...@tamriel.snowman.net


One minor difference: in my scenario, the source table field is defined
as BIGINT (not serial), though it does have a default nextval on a
sequence, so ultimately it carries the same dependency.

The primary difference (IMHO) is that I am actually foreign keying on a
local materialised view of the fdw'ed foreign table.



On the foreign host:
            Table live.devices
   Column   |  Type  |                        Modifiers
 -----------+--------+-----------------------------------------------------------
  device_id | bigint | not null default nextval('devices_id_sequence'::regclass)


On the local host:


CREATE FOREIGN TABLE IF NOT EXISTS live.devices (
  device_id bigint NOT NULL
);

CREATE MATERIALIZED VIEW local.devices AS
  SELECT * FROM live.devices;

CREATE TABLE test_table (
  device_id bigint REFERENCES clone.devices (device_id)
);

ERROR:  referenced relation "devices" is not a table



Though this is a similar scenario to the previous thread, I would have
expected foreign keying from a materialised view to behave independently of
the FDW, as if from a regular local table.
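For what it's worth, the restriction reproduces without any FDW involvement at
all, since foreign keys may only reference tables; a minimal sketch with
invented names:

```sql
CREATE TABLE parent (id bigint PRIMARY KEY);

CREATE MATERIALIZED VIEW parent_mv AS
  SELECT id FROM parent;

-- Foreign keys may only reference tables, not (materialized) views:
CREATE TABLE child (
  parent_id bigint REFERENCES parent_mv (id)
);
-- ERROR:  referenced relation "parent_mv" is not a table
```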

FYI, I'm running postgresql 9.3.4

Cheers,

Tim


Re: [HACKERS] [GENERAL] Migrating from 9.2.4 to 9.3.0 with XML DOCTYPE

2014-06-04 Thread Tim Kane
 
 
 From:  Tom Lane t...@sss.pgh.pa.us
 
 Hm, can you restore it into 9.2 either?
 
 AFAICS, pg_dump has absolutely no idea that it should be worried about the
 value of xmloption, despite the fact that that setting affects what is
 considered valid XML data.  What's worse, even if it were attempting to do
 something about xmloption, I don't see how it could deal with a case like
 this where you have different values inside the same database that require
 two different settings in order to parse.
 
 This isn't a 9.3.x bug, it's an aboriginal misdesign of the XML datatype.
 Not sure what we can do about it at this point.  Perhaps we could invent
 a document_or_content setting that would tell xml_in to accept either
 case?  And then have pg_dump force that setting to be used during restore?


This sounds reasonable. My use case is purely as a document store, with the
ability to perform XML parse functions against it. As such, I’m not concerned
whether it’s a document or content, which is why we have both types recorded
against that field.

For the minute, I’m getting around the restore problem by mangling the dump
such that the table is created using the text type rather than xml.  This at
least gets the data onto a 9.3 cluster, even if it’s cosmetically
represented as text instead of xml.  I can worry about the document vs
content problem at a later stage.
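For anyone in the same position, the clean-up can be sketched roughly as
follows (table and column names are invented; note that, as discussed above, a
mix of document and content values cannot be cast back under a single
xmloption setting):

```sql
-- With the data restored as text, count which rows parse as a full
-- document vs. as content fragments:
SELECT count(*) FROM docs WHERE xml_is_well_formed_document(body);
SELECT count(*) FROM docs WHERE xml_is_well_formed_content(body);

-- If every row satisfies one setting, the text column can be cast back;
-- the cast obeys the current xmloption setting:
SET xmloption TO document;
ALTER TABLE docs ALTER COLUMN body TYPE xml USING body::xml;
```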

 
 PS: BTW, I agree with the advice expressed by David J: under no
 circumstances put any data you care about on 9.3.0.  That release
 was rather a disaster from a quality-control standpoint :-(
 But that's unrelated to your XML issue.


Ack. Thanks for the info. I’ll push the upgrade-path agenda a little harder.









Re: [HACKERS] removing old ports and architectures

2013-10-18 Thread Tim Kane
Just to be pedantic, commit message shows
support for Tru64 ended in 201.

I think you mean 2012.





On 18/10/2013 13:41, Robert Haas robertmh...@gmail.com wrote:

On Thu, Oct 17, 2013 at 5:41 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 10/17/13 12:45 PM, Robert Haas wrote:
 The attached patch, which I propose to apply relatively soon if nobody
 objects, removes the IRIX port.

 +1

Done.  And here's a patch for removing the alpha architecture and
Tru64 UNIX (aka OSF/1) which runs on that architecture, per discussion
upthread.  Barring objections, I'll apply this next week.

On a related note, I think we should update the paragraph in
installation.sgml that begins "In general, PostgreSQL can be expected
to work on these CPU architectures".  Any architecture that doesn't
have a buildfarm animal should be relegated to the second sentence,
which reads "Code support exists for ... but these architectures are
not known to have been tested recently".  Similarly, I think the
following paragraph should be revised so that only operating systems
for which we have current buildfarm support are considered fully
supported.  Others should be relegated to a sentence later in the
paragraph that says something like "code support exists but not tested
recently" or "expected to work but not tested regularly".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [HACKERS] Suggestion for concurrent index creation using a single full scan operation

2013-07-24 Thread Tim Kane
Wow.. thanks guys, really appreciate the detailed analysis.

Tim


On Wed, Jul 24, 2013 at 4:08 AM, Noah Misch n...@leadboat.com wrote:

 On Tue, Jul 23, 2013 at 01:06:26PM +0100, Tim Kane wrote:
  I haven't given this a lot of thought, but it struck me that when
  rebuilding tables (be it for a restore process, or some other operational
  activity) - there is more often than not a need to build an index or two,
  sometimes many indexes, against the same relation.
 
  It strikes me that in order to build just one index, we probably need to
  perform a full table scan (in a lot of cases).  If we are building
  multiple indexes sequentially against that same table, then we're probably
  performing multiple sequential scans in succession, once for each index.

 Check.

  Could we architect a mechanism that allowed multiple index creation
  statements to execute concurrently, with all of their inputs fed directly
  from a single sequential scan against the full relation?
 
  From a language construct point of view, this may not be trivial to
  implement for raw/interactive SQL - but possibly this is a candidate for
  the custom format restore?

 As Greg Stark mentioned, pg_restore can already issue index build commands in
 parallel.  Where applicable, that's probably superior to having one backend
 build multiple indexes during a single heap scan.  Index builds are
 CPU-intensive, and the pg_restore approach takes advantage of additional CPU
 cores in addition to possibly saving I/O.
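(To illustrate the pg_restore approach mentioned above, a sketch with invented
database and file names:)

```shell
# Dump in custom format, then restore with four parallel jobs; pg_restore
# schedules independent steps, including index builds, across the workers.
pg_dump -Fc -f mydb.dump mydb
pg_restore -j 4 -d mydb_restored mydb.dump
```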

 However, the pg_restore method is not applicable if you want CREATE INDEX
 CONCURRENTLY, and it's not applicable for implicit index building such as
 happens for ALTER TABLE rewrites and for VACUUM FULL.  Backend-managed
 concurrent index builds could shine there.

  I presume this would substantially increase the memory overhead required to
  build those indexes, though the performance gains may be advantageous.

 The multi-index-build should respect maintenance_work_mem overall.  Avoiding
 cases where that makes concurrent builds slower than sequential builds is a
 key challenge for such a project:

 - If the index builds each fit in maintenance_work_mem when run sequentially
   and some spill to disk when run concurrently, expect concurrency to lose.
 - If the heap is small enough to stay in cache from one index build to the
   next, performing the builds concurrently is probably a wash or a loss.
 - Concurrency should help when a wide-row table large enough to exhaust OS
   cache has narrow indexes that all fit in maintenance_work_mem.  I don't
   know whether concurrency would help for a huge-table scenario where the
   indexes do overspill maintenance_work_mem.  You would have N indexes'
   worth of external merge files competing for disk bandwidth; that could
   cancel out heap I/O savings.

 Overall, it's easy to end up with a loss.  We could punt by having an
 index_build_concurrency GUC, much like pg_restore relies on the user to
 discover a good -j value.  But if finding cases where concurrency helps is
 too hard, leaving the GUC at one would become the standard advice.

  Apologies in advance if this is not the correct forum for suggestions..

 It's the right forum.

 Thanks,
 nm

 --
 Noah Misch
 EnterpriseDB http://www.enterprisedb.com



[HACKERS] Suggestion for concurrent index creation using a single full scan operation

2013-07-23 Thread Tim Kane
Hi all,

I haven't given this a lot of thought, but it struck me that when
rebuilding tables (be it for a restore process, or some other operational
activity) - there is more often than not a need to build an index or two,
sometimes many indexes, against the same relation.

It strikes me that in order to build just one index, we probably need to
perform a full table scan (in a lot of cases).   If we are building
multiple indexes sequentially against that same table, then we're probably
performing multiple sequential scans in succession, once for each index.

Could we architect a mechanism that allowed multiple index creation
statements to execute concurrently, with all of their inputs fed directly
from a single sequential scan against the full relation?

From a language construct point of view, this may not be trivial to
implement for raw/interactive SQL - but possibly this is a candidate for
the custom format restore?

I presume this would substantially increase the memory overhead required to
build those indexes, though the performance gains may be advantageous.

Feel free to shoot holes through this :)

Apologies in advance if this is not the correct forum for suggestions..