Re: [HACKERS] Graph datatype addition

2013-04-30 Thread Atri Sharma
 Usually though, you'd be interested in large graphs which include
 information for lots of records (e.g., nodes are individual users,
 or products, or whatever). A graph datatype is not well suited for
 that, because it'd store each graph as a single value, and updating
 the graph would mean rewriting that whole value. If you're e.g. doing
 social network analysis, and each new edge between two users requires
 you to pull the whole graph from disk, update it, and write it back,
 you'll probably hit problems once you reach a few hundred users or
 so… Which really isn't a lot for that kind of application.

Yes, I agree. Hence, I suggested keeping the nodes and links separate.
So, if you need to add a new edge, you just need to update the
adjacency lists.

What I think will work is more of a 'logical' graph, i.e. consider a
tuple in some table. To add it to a new/existing graph as a node, we
just need to add an adjacency list for it. For each edge that connects
the tuple to some other node, we add the corresponding entry in the
list.

Essentially, the idea is, again, to separate the nodes and links;
hence, the actual node is accessed pretty infrequently compared to
the links. Since they are separate, there is no need to pull out the
entire graph to update a single edge. This was the fundamental
problem with my HStore-based design.
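
As a rough sketch of what that separation looks like in plain SQL
(hypothetical names; the datatype would manage this below the SQL
level):

    CREATE TABLE graph_node (
        node_id bigserial PRIMARY KEY,
        label   text                    -- or a pointer to the real tuple
    );

    CREATE TABLE graph_edge (
        from_node bigint NOT NULL REFERENCES graph_node,
        to_node   bigint NOT NULL REFERENCES graph_node,
        PRIMARY KEY (from_node, to_node)
    );

    INSERT INTO graph_node (label) VALUES ('a'), ('b');
    -- Adding an edge touches only the adjacency side; the node tuples
    -- themselves are never rewritten:
    INSERT INTO graph_edge VALUES (1, 2);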

Regards,

Atri


--
Regards,

Atri
l'apprenant




Re: [HACKERS] [COMMITTERS] pgsql: Add sql_drop event for event triggers

2013-04-30 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Applied with some further hacking.

Thanks!

 Hmm, that leads me to wonder exactly how extensively the regression
 tests test event triggers, because it sure looked to me like there
 were multiple bugs left in this version.

The first versions of the event triggers patch series used to include a
large regression test file with some level of coverage for each and
every supported command. It was done this way because there was some
command-specific code at the time.

It was apparently not testing the right things and was judged to be too
many tests for no clear benefit, so it was thrown away. Then, with the
never-ending refactoring of the patch series, no specific regression
tests ever found their way back in.

It's not clear to me exactly which tests to add; I will find some time
to read the current code in more detail and figure out what's easy
enough to cover and not already covered.
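
A minimal regression-style check could look like this (just a sketch):

    CREATE FUNCTION log_ddl() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        RAISE NOTICE 'caught % for %', TG_EVENT, TG_TAG;
    END $$;

    CREATE EVENT TRIGGER log_ddl_start ON ddl_command_start
        EXECUTE PROCEDURE log_ddl();

    CREATE TABLE evt_smoke_test (i int);  -- should emit the NOTICE
    DROP TABLE evt_smoke_test;            -- and again here

    DROP EVENT TRIGGER log_ddl_start;
    DROP FUNCTION log_ddl();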

 Finally, I've been surprised to find out that those cases are only
 triggering for ddl_command_start (and not ddl_command_end); I think
 that's a bug we should be fixing:

 Agreed, I fixed it.

Thanks!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Graph datatype addition

2013-04-30 Thread Atri Sharma
 Have you considered maybe the ltree datatype?

 To me all described sounds solvable in a pure SQL way (+ the ltree
 datatype to help with indexes and performance, as a materialised path
 avoids the recursive query every time...)

This may build over the existing solutions itself. I will have to check that.

 Though it would be nice to see something new that would simplify the tasks...

Exactly. A specialized data type with built-in support for common
operations would be helpful, IMHO.
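
For context, the materialised-path approach with ltree looks roughly
like this (a sketch over a hypothetical table):

    CREATE EXTENSION ltree;

    CREATE TABLE tree (path ltree);
    CREATE INDEX tree_path_idx ON tree USING gist (path);
    INSERT INTO tree VALUES ('root'), ('root.a'), ('root.a.b'), ('root.c');

    -- All descendants of root.a, with no recursive query:
    SELECT path FROM tree WHERE path <@ 'root.a';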


Regards,

Atri


--
Regards,

Atri
l'apprenant




Re: [HACKERS] Graph datatype addition

2013-04-30 Thread Atri Sharma
 I believe it would be best to largely separate the questions of storage and
 access. Partly because of Florian's concern that you'd frequently want only
 one representation of the whole graph, but also because the actual storage
 interface does NOT have to be user friendly if we have a good access layer.
 In particular, if rows had a low overhead, we'd probably just store graphs
 that way. That's obviously not the case in PG, so is there some kind of
 hybrid approach we could use? Perhaps sections of a graph could be stored
 with one piece of MVCC overhead per section?

Yes, I agree. We can probably store some (persistent) data in a
table, and the 'hot' parts of the datatype (like adjacency lists) could
be stored locally in a data structure which allows fast and low
overhead accesses and updates.

I completely agree with separating storage and access layers.
Supporting Florian's concern, I would also agree with your point of
abstraction. Users should not concern themselves with backend storage
details.

 That's why I think separating access from storage is going to be very
 important; if we do that up-front, we can change the storage latter as we
 get real experience with this.

+1.

 Second, we should consider how much the access layer should build on WITH
 RECURSIVE and the like. Being able to detect specific use patterns of
 CTE/WITH RECURSIVE seems like it could add a lot of value; but I also worry
 that it's way too magical to be practical.

I thought of a frequent-access-pattern mining tool recently. We can
store the recent accesses and apply a pattern-mining algorithm to that
dataset (I was thinking of the FP-Growth algorithm). But again, it has
its own set of issues and is a project in itself.
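
For reference, the kind of traversal such a layer would sit on top of
is the plain WITH RECURSIVE pattern (a sketch over a hypothetical edge
table):

    WITH RECURSIVE reachable(node) AS (
        VALUES (1::bigint)              -- start node
      UNION                             -- UNION (not ALL) stops on cycles
        SELECT e.to_node
        FROM graph_edge e
        JOIN reachable r ON e.from_node = r.node
    )
    SELECT node FROM reachable;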

Regards,

Atri


--
Regards,

Atri
l'apprenant




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Andrew Dunstan


On 04/29/2013 11:47 PM, Joel Jacobson wrote:

Note also that minor releases can readily fix bugs in C-language functions,
but we have no infrastructure to update sql-language functions after initdb.
That flexibility is unfortunate to lose, particularly for something that
pg_dump depends on.  Now, the right thing is probably to design a mechanism
for applying simple catalog updates in concert with a minor release.  In the
mean time, its absence puts the sql PL at a nontrivial disadvantage here.

What do you mean by infrastructure? Isn't it as simple as CREATE
OR REPLACE FUNCTION? As long as the interfaces of the pg_get_*def
functions don't change, I cannot see how simply replacing the existing
functions in a minor release upgrade could do any harm.
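
I.e., run something along these lines against each database during the
minor upgrade (a sketch with a hypothetical function; the real ones
would live in pg_catalog):

    -- Re-issuing the definition replaces the body while keeping the
    -- interface, and hence anything depending on it, intact:
    CREATE OR REPLACE FUNCTION pg_get_widget_def(oid)
    RETURNS text LANGUAGE sql STABLE
    AS $$ SELECT 'corrected definition for oid ' || $1::text $$;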




Minor releases are supposed not to require any such operations. You 
should normally be able to drop the binaries in place and restart. For C 
language functions that is indeed all you have to do, but that's not the 
case for SQL language functions, where the definition is contained in 
the catalogs, not the binary.


If all you want is SQL language functions, there is nothing to stop you 
from writing them and publishing them today as an extension.


cheers

andrew




Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Simon Riggs
On 30 April 2013 06:57, Simon Riggs si...@2ndquadrant.com wrote:

 I'm about to light up the build farm with a trial commit of the
 compiler instructions stuff.

Amazingly that seemed to work.

ISTM that we also need this patch to put memory barriers in place
otherwise the code might be rearranged.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


checksum_memory_barrier.v1.patch
Description: Binary data



Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Simon Riggs
On 9 April 2013 08:36, Jeff Davis pg...@j-davis.com wrote:

 1. I believe that the issue I brought up at the end of this email:

 http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025

 is a real issue. In lazy_vacuum_page(), the following sequence can
 happen when checksums are on:

a. PageSetAllVisible
b. Pass heap page to visibilitymap_set
c. visibilitymap_set logs the heap page and sets the LSN
d. MarkBufferDirty

 If a checkpoint happens between (c) and (d), then we have a problem. The
 fix is easy: just mark the heap buffer dirty first. There's another call
 site that looks like a potential problem, but I don't think it is. I
 simplified the code there to make it (hopefully) more clearly correct.


Applied

 2. A cleanup patch to pass the buffer_std flag down through
 MarkBufferDirtyHint. This is a matter of preference and purely cosmetic,
 so it might not be wanted. The reason I thought it was useful is that a
 future caller that sets a hint on a non-standard buffer might easily
 miss the assumption that we have a standard buffer.

Skipped that for now. Do we really need it? Can you set a hint on a
non-standard buffer?

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 If you assume that people are going to modify files while the backend
 is running, nothing we do anywhere is safe.

I agree that it's a bad idea and that people who do such things deserve
what they get, but that doesn't mean it won't happen when people realize
they can do it.

And, really, it's all fun and games until someone writes a tool to do
it...

Thanks,

Stephen




Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Andres Freund
On 2013-04-30 11:55:29 +0100, Simon Riggs wrote:
 ISTM that we also need this patch to put memory barriers in place
 otherwise the code might be rearranged.
 
 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

 --- a/src/backend/storage/page/bufpage.c
 +++ b/src/backend/storage/page/bufpage.c
 @@ -960,11 +960,14 @@ PageCalcChecksum16(Page page, BlockNumber blkno)
* Save pd_checksum and set it to zero, so that the checksum calculation
* isn't affected by the checksum stored on the page. We do this to
* allow optimization of the checksum calculation on the whole block
 -  * in one go.
 +  * in one go. Memory barriers are required to avoid rearrangement here.
*/
   save_checksum = phdr->pd_checksum;
 + pg_memory_barrier();
   phdr->pd_checksum = 0;
 + pg_memory_barrier();
   checksum = checksum_block(page, BLCKSZ);
 + pg_memory_barrier();
   phdr->pd_checksum = save_checksum;
  
   /* mix in the block number to detect transposed pages */

Why? I am not sure which rearrangement you're fearing? In all cases
where there is danger of concurrent write access to the page we should
already be working on a copy?
Also, if we need a memory barrier I can only see a point in the 2nd
one. The first and third shouldn't ever be able to change anything since
we are only writing to local memory?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Ants Aasma
On Tue, Apr 30, 2013 at 1:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 30 April 2013 06:57, Simon Riggs si...@2ndquadrant.com wrote:

 I'm about to light up the build farm with a trial commit of the
 compiler instructions stuff.

 Amazingly that seemed to work.

Thanks for committing. Sorry about missing the .h file from the patch.
The two commits look good to me.

I can confirm that compiling with CFLAGS="-O2 -march=native" will
vectorize the committed code on GCC 4.7.

I also checked the situation on clang. clang-3.2 isn't able to
vectorize the loop even with vectorization options. I will check what
is stopping it. If any volunteer has a working build setup with ICC or
MSVC and is willing to run a couple of test compiles, I think we can
achieve vectorization there too.

 ISTM that we also need this patch to put memory barriers in place
 otherwise the code might be rearranged.

The compiler and CPU both have to preserve correctness when
rearranging code, so I don't think we care about it here. It might
matter if these routines could be called concurrently by multiple
backends for a single buffer, but in that case memory barriers won't
be enough, we'd need full exclusion.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kgri...@ymail.com wrote:

 The hysteria and FUD about using the currently-supported technique
 for unlogged tables to implement unlogged matviews are very
 discouraging.  And the notion that we would release a matview
 feature which allowed false results (in the form of treating a
 never-populated matview as a legal empty matview) is truly scary to
 me.

 I think that labeling other people's concerns as hysteria and FUD is
 not something which will advance the debate.

The point of that language is that charged language which has
nothing to do with reality is already being thrown around; by all
means let's get back to constructive discussion.  If there is some
particular problem someone sees, I don't think it has been
expressed yet, which makes it impossible to address, unless you
count the assertion that *if* we had chosen a zero-length heap to
represent a table with valid data we *would* have a problem?

Let's look at the corner this supposedly paints us into.  If a
later major release creates a better mechanism, current pg_dump and
load will already use it, based on the way matviews are created
empty and REFRESHed by pg_dump.  Worst case, we need to modify the
behavior of pg_dump running with the switch used by pg_upgrade to
use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
syntax is chosen) -- a command we would probably want at that point
anyway.  I'm not seeing cause for panic here.
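
The dump output is shaped roughly like this (sketch), with the REFRESH
emitted only for matviews that were populated in the source:

    CREATE MATERIALIZED VIEW mv AS SELECT 1 AS x WITH NO DATA;
    -- ... other objects ...
    REFRESH MATERIALIZED VIEW mv;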

What is a real problem or risk with using this mechanism until we
engineer something better?  What problems with converting to a
later major release does anyone see?

 If we can't tell the difference between those two things, I
 don't think we should be releasing the feature.
 
 So, speaking of hysteria...

Yeah, I know you don't get it, but as a DBA I would never have
allowed a feature which could quietly generate false results to be
used -- which is exactly what you have without a way to
differentiate these cases.  If it comes down to a choice between a
performance tweak like unlogged matviews and an issue of
correctness of results, I will pick correctness.  Now, as I've
already said, this tweak is significant (I expect it will make
matviews useful in roughly twice as many cases), but it is just a
performance tweak.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Joel Jacobson
On Tue, Apr 30, 2013 at 11:22 AM, Andrew Dunstan and...@dunslane.net wrote:
 Minor releases are supposed not to require any such operations. You should
 normally be able to drop the binaries in place and restart. For C language
 functions that is indeed all you have to do, but that's not the case for SQL
 language functions, where the definition is contained in the catalogs, not
 the binary.

Minor releases don't even require pg_dump/pg_restore, so I really
don't see the issue.

Nothing in the versioning policy* appears to be in conflict with fixing bugs in
pg_catalog sql-language functions. Maybe it's an unwritten rule that minor
versions mustn't affect pg_catalog. I thought the rule was to require no
relational data modifications of pg_catalog between minor versions, but simply
modifying or adding functions, without breaking their interfaces, is not a
modification of pg_catalog as I see it, since the tables and columns would be
untouched. Old code couldn't be affected by new functions, or by the inner
details of existing pg_catalog functions, as long as they don't change the
interface, i.e., the IN/OUT parameters.

I would actually want to go a bit further and propose moving the sql queries
within the else if (fout->remoteVersion >= ...) { ... } statements for _old_
versions as well, and putting them in sql-language functions. If you then used
a future version of pg_dump to dump an old server, it would install those
sql-language functions in the old version's pg_catalog, so they could be
called by the new pg_dump.

That way we could get rid of these kinds of sql queries for _all_ versions,
and not just for future versions.

These functions would need to adhere to exactly the same interface,
where the OUT params must fit perfectly with the args passed to ArchiveEntry().
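
A sketch of the shape I have in mind (hypothetical function; the real
ones would return exactly the fields ArchiveEntry() needs):

    CREATE FUNCTION pg_dump_get_languages(
        OUT lanname name, OUT lanpltrusted boolean)
    RETURNS SETOF record LANGUAGE sql STABLE
    AS $$ SELECT lanname, lanpltrusted FROM pg_language $$;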

The existing pg_get_*def() functions are not comprehensive enough, as they
still require a lot of sql code to extract various other necessary fields.

*) http://www.postgresql.org/support/versioning/

 If all you want is SQL language functions, there is nothing to stop you from
 writing them and publishing them today as an extension.

Yes, maybe I will, as a proof-of-concept and a test of how complex or simple
it would be, and of how many lines of code pg_dump.c could be reduced by.




Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Simon Riggs
On 30 April 2013 12:23, Ants Aasma a...@cybertec.at wrote:

 ISTM that we also need this patch to put memory barriers in place
 otherwise the code might be rearranged.

 The compiler and CPU both have to preserve correctness when
 rearranging code, so I don't think we care about it here. It might
 matter if these routine could be called concurrently by multiple
 backends for a single buffer, but in that case memory barriers won't
 be enough, we'd need full exclusion.

Certainly happy not to commit anything else...

Thanks to Ants and Andres.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Andrew Dunstan


On 04/30/2013 07:34 AM, Joel Jacobson wrote:

On Tue, Apr 30, 2013 at 11:22 AM, Andrew Dunstan and...@dunslane.net wrote:

Minor releases are supposed not to require any such operations. You should
normally be able to drop the binaries in place and restart. For C language
functions that is indeed all you have to do, but that's not the case for SQL
language functions, where the definition is contained in the catalogs, not
the binary.

Minor releases don't even require pg_dump/pg_restore, so I really
don't see the issue.



You have already been told what the issue is. There is no provision for 
updating the catalogs. Our users expect to be able to drop a minor 
upgrade binary in place and have nothing more to do. It's what they are 
used to and I at least have no interest in changing that. If you change 
the definition of provided SQL language functions you would need to 
apply those changes to each database in each cluster. Otherwise, a user 
will think they are on version X which has a fix for function Y and in 
fact they won't have the fix. That's a recipe for utter confusion.


All the existing pg_catalog SQL functions are one-liners, apart from 
ts_debug().



If all you want is SQL language functions, there is nothing to stop you from
writing them and publishing them today as an extension.

Yes, maybe I will, as a proof-of-concept and test of how complex or simple it
would be, and how many lines of code pg_dump.c could be reduced with.




pg_dump does bulk operations in many cases, and one way we could make it 
faster would be to increase that, not reduce it (see comments where we 
get table attributes in pg_dump.c, for example). Providing singleton 
operations like this will not help it at all.


If your aim is to be able to replace all the code pg_dump runs by these 
singleton operations then I think it's almost certainly doomed to failure.


Having undertaken some of the exercise, I can also assure you that 
writing pg_get_table_def() in a single pure SQL statement will be  
challenging.


There is a case for having functions like these, but probably not for 
use by pg_dump, and I suspect they would best be done in C.


cheers

andrew





Re: [HACKERS] Graph datatype addition

2013-04-30 Thread Atri Sharma
 1.  We created JSON and XML types as ways of storing data that has a robust
 validation system.

 They're still, in a sense, just plain old text, but it's plain old text
 that the user can be certain satisfies the respective rules for
 representations.

Yes, although I feel that we can use JSON to port a graph's data out
of the database.


 I suspect that the best answer is NOT one where a graph is represented as a
 value in a table; that has the implication that modifying The Graph requires
 altering a single tuple, and that seems likely to become a horrible
 bottleneck.  I'm suspicious that using HSTORE leads down that path, where
 you'll have a database that has a table with just one tuple in it, and where
 it's nearly impossible to alter that tuple.

I agree, hence, I have dropped the idea of building it completely over HStore.

 I'm having a hard time thinking about what it looks like to have a graph as
 table except to effectively compose the graph as a set of nodes, one tuple
 per node, and I'm not sure that a new data type has anything to contribute
 to that.

I can think of three points:

1) We can maintain pointers to tuples which are nodes in a graph
instead of copying over the actual data. This idea is based on the
assumption that node data access will be very specific, i.e. the
actual data of a node is accessed rarely; for most analyses we are
more interested in the linkage.

2) In continuation of the above point, the more frequently accessed
areas are the link details, or in my current plan, the adjacency
lists. These could be maintained in a data structure by the data type,
with low overhead for updates, traversals and insertions.

3) The data type shall have built-in support for common operations on
graphs. The user can call them directly and is hence saved the
overhead of designing his own data structures and functions.


 The one place where I *could* see a special type having a contribution is
 for there to be a data type that can contain an arbitrary number of links.
 That means you have one tuple per node, and, instead of needing a tuple for
 each link between nodes, you have one attribute indicating *all* the links.
 (And interesting is for that one attribute to be usable for foreign key
 purposes.)  That has a hard time scaling in cases where nodes are
 over-connected, which is, broadly speaking, an acceptable sort of scenario.

Yes, that one attribute can be the adjacency set of adjacency lists.
For efficient access, we can store pointers to adjacency lists in hash
maps, with node identifiers as keys (maybe).
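
To picture that single links attribute, a rough sketch using a plain
array column (the real representation could well differ):

    CREATE TABLE graph_node_adj (
        node_id   bigint PRIMARY KEY,
        neighbors bigint[]          -- all outgoing links in one attribute
    );
    CREATE INDEX ON graph_node_adj USING gin (neighbors);

    -- Who links to node 42?
    SELECT node_id FROM graph_node_adj
    WHERE neighbors @> ARRAY[42::bigint];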


Regards,

Atri

--
Regards,

Atri
l'apprenant




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Robert Haas
On Tue, Apr 30, 2013 at 6:58 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 9 April 2013 08:36, Jeff Davis pg...@j-davis.com wrote:

 1. I believe that the issue I brought up at the end of this email:

 http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025

 is a real issue. In lazy_vacuum_page(), the following sequence can
 happen when checksums are on:

a. PageSetAllVisible
b. Pass heap page to visibilitymap_set
c. visibilitymap_set logs the heap page and sets the LSN
d. MarkBufferDirty

 If a checkpoint happens between (c) and (d), then we have a problem. The
 fix is easy: just mark the heap buffer dirty first. There's another call
 site that looks like a potential problem, but I don't think it is. I
 simplified the code there to make it (hopefully) more clearly correct.

 Applied

Uh, wait a minute.  I think this is completely wrong.  The buffer is
LOCKED for this entire sequence of operations.  For a checkpoint to
happen, it's got to write every buffer, which it will not be able to
do for so long as the buffer is locked.

The effect of the change to lazy_scan_heap is to force the buffer to
be written even if we're only updating the visibility map page.
That's a bad idea and should be reverted.  The change to
lazy_vacuum_page is harmless, but the comment is wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Graph datatype addition

2013-04-30 Thread Atri Sharma
 actually, I was going to suggest that Atri take a look at that;
 pgRouting is currently developing v2.0, which will rewrite some
 parts (including some of the algorithms).

 Maybe this datatype could fit better as part of pgRouting

Thanks, I will have a look at pgRouting. Although, on a quick look, I
feel that the purposes of pgRouting and the proposed data type are
different, even though they support similar functions.

One good idea could be to refer to pgRouting's functions when
implementing similar functions for the proposed datatype.

Regards,

Atri



--
Regards,

Atri
l'apprenant




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Stephen Frost
Kevin,

* Kevin Grittner (kgri...@ymail.com) wrote:
 If there is some
 particular problem someone sees, I don't think it has been
 expressed yet, which makes it impossible to address, unless you
 count the assertion that *if* we had chosen a zero-length heap to
 represent a table with valid data we *would* have a problem?

The problem which people see is that this approach isn't a good idea and
will lead down a road to ugly hacks to make things work correctly.  It's
not a question about code correctness; it's a question about the design.
Those are two very different things but both need to be considered.

 I'm not seeing cause for panic here.

Speaking to overcharged words, I don't see any 'panic' here but rather a
strong disagreement between individuals over this design.  Actually, I'm
not even sure there is disagreement about there being a better design
for this; it sounds like everyone agrees that there is.  The question
boils down to: do we ship it with an inferior design, or hold off and do
it right the first time?

 What is a real problem or risk with using this mechanism until we
 engineer something better?  What problems with converting to a
 later major release does anyone see?

Various examples have been articulated and you've worked through
specific high-level designs for how to deal with those, which is great,
but design isn't about just those things which you can foresee and
predict; it's about ensuring flexibility is there for those things that
you don't predict.

  If we can't tell the difference between those two things, I
  don't think we should be releasing the feature.
  
  So, speaking of hysteria...
 
 Yeah, I know you don't get it, but as a DBA I would never have
 allowed a feature which could quietly generate false results to be
 used -- which is exactly what you have without a way to
 differentiate these cases.  

It also happens to be what you can end up having with unlogged tables
already..  We ran into exactly that issue and decided that we'd be
better off not using them (for now anyway) even for things which really
should be just transient data sets.

 If it comes down to a choice between a
 performance tweak like unlogged matviews and an issue of
 correctness of results, I will pick correctness.  Now, as I've
 already said, this tweak is significant (I expect it will make
 matviews useful in roughly twice as many cases), but it is just a
 performance tweak.

And, I think anyway, I agree w/ you (Kevin) here: we should really have
a way to tell the difference between no-data-in-query-result and
never-populated.  I'd like more options than just a big ERROR result
happening when it's never been populated, but that's a different
discussion.

Thanks,

Stephen




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Simon Riggs
On 30 April 2013 13:34, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Apr 30, 2013 at 6:58 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 9 April 2013 08:36, Jeff Davis pg...@j-davis.com wrote:

 1. I believe that the issue I brought up at the end of this email:

 http://www.postgresql.org/message-id/1365035537.7580.380.camel@sussancws0025

 is a real issue. In lazy_vacuum_page(), the following sequence can
 happen when checksums are on:

a. PageSetAllVisible
b. Pass heap page to visibilitymap_set
c. visibilitymap_set logs the heap page and sets the LSN
d. MarkBufferDirty

 If a checkpoint happens between (c) and (d), then we have a problem. The
 fix is easy: just mark the heap buffer dirty first. There's another call
 site that looks like a potential problem, but I don't think it is. I
 simplified the code there to make it (hopefully) more clearly correct.

 Applied

 Uh, wait a minute.  I think this is completely wrong.

Thanks for your review.

 The buffer is
 LOCKED for this entire sequence of operations.  For a checkpoint to
 happen, it's got to write every buffer, which it will not be able to
 do for so long as the buffer is locked.

I was following the logic we use for WAL: mark the buffer dirty, then write WAL.

The important thing is for us to signal that the buffer should be
written, which we do by marking it dirty. The actual writing of the
buffer comes later, often minutes later.

 The effect of the change to lazy_scan_heap is to force the buffer to
 be written even if we're only updating the visibility map page.
 That's a bad idea and should be reverted.

OK, agreed. Will wait for this discussion to end before acting though,
so I'm sure exactly what you mean.

 The change to
 lazy_vacuum_page is harmless, but the comment is wrong.

In what way, wrong? What should it say?

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 It may be that the functions Joel proposes are worth having for other
 tools to use, but I'm not in favor of making pg_dump use them.

I care very little about migrating the pg_dump functionality into
server-side functions and a great deal about *having* such functionality
available server-side.  We've been around this time and time again:
there are use-cases beyond pg_dump for being able to get the definition
of an object.  Having to first figure out and then replicate what pg_dump
(or psql) does is no trivial feat.
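
We do have pieces of this today, e.g.:

    -- Works for views...
    SELECT pg_get_viewdef('pg_roles'::regclass, true);

...but there's no pg_get_tabledef() or the like to round the family
out, hence this thread.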

In short, I think I agree w/ Tom here (though I contend that there *are*
such use-cases, not that 'it may be'.. :).  Let's drop the discussion
about changing pg_dump and/or psql and instead simply go implement these
functions.  We can revisit the pg_dump discussion 5 or 10 years down the
road, after we've seen how well these functions work and what uses they
are put to outside of our (relatively small) world.

Thanks,

Stephen




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Mon, Apr 29, 2013 at 7:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  One of the things that we frequently recommend when doing
  upgrades is that you do the dump with the newer version's pg_dump, so
  as to get the benefits of any bug fixes that are in it.  The more
  dump functionality is on the server side, the less opportunity we have
  to repair things that way.
 
 But why wouldn't we be able to fix the version in the server, if it
 turns out to be buggy?  I suppose we wouldn't fix bugs discovered
 after EOL, but I'm not sure that's a sufficient objection.

There are other things beyond bugs here..  Changes in reserved keywords
are actually the biggest reason, IME, to use the newer pg_dump when
you're trying to move to a newer PG version.  I don't think we'd want to
either go to quoting everything (yuck), or having a point release
suddenly change what gets quoted and what doesn't in a pg_dump..

Thanks,

Stephen




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Stephen Frost
* Joel Jacobson (j...@trustly.com) wrote:
 On Tue, Apr 30, 2013 at 12:58 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  That alone would probably be sufficient reason why we would never allow
  pg_dump to depend on any such thing (not that I see a compelling
  argument for it to do so anyway...).
 
 It would be better to find a way to update sql-language functions in
 minor upgrades, instead of shutting that door entirely for all future
 implementation ideas involving sql-language functions in the
 pg_catalog.

Go for it? :)  I don't think you'll find much disagreement here, but
it's no trivial thing to do either..  Perhaps some kind of one-time
startup script that gets run?  Or maybe an internal identifier that
tracks the version of the catalog and runs a script which updates it on
first start when there are changes to the catalog?  But what about any
dependencies which exist?  Are we confident enough in the existing
infrastructure for 'create-or-replace' that, if we don't change the API,
it'll just replace the contents and we won't hurt user objects?

  The long and the short of it here is that there isn't any very good
  reason to migrate any of the existing pg_dump-side functionality into
  server-side functions, and especially not server-side functions that
  aren't in C.  One of the things that we frequently recommend when doing
  upgrades is that you do the dump with the newer version's pg_dump, so
  as to get the benefits of any bug fixes that are in it.  The more
  dump functionality is on the server side, the less opportunity we have
  to repair things that way.
 
 There are two very good reasons in particular of doing exactly that:

[... good reasons to have pg_dump-like capability in the server ...]

There's a difference between 'migrate' and 'implement' that I'm not sure
you saw..  I agree w/ Tom that there's probably no really good reason to
*remove* the existing code in pg_dump.  Adding similar code to the
server is quite a different thing and, as Tom commented later, might
be useful for other tools to use.

 b) Reduce the amount of hard-coded sql queries in pg_dump.c. 

Unless you're hacking on pg_dump quite a bit (which, if you are, you
probably need to understand/see the queries that it's making anyway), I
don't see why this is actually a problem..?  I'm thinking that you've
simply had to be in the guts of pg_dump enough, as some others of us
have also, that you grow tired of having to extract out the useful bits
from it.  If those bits are replicated as clean functions in the
backend, you won't need to be groking around in pg_dump anymore.. :)

 Won't
 help for previous versions, but perhaps the sql queries to support
 older versions can be hidden in some separately included file, at least
 so you don't have to see them; that would make sense since the sql for
 the old versions won't change and won't need the same amount of active
 development as the code for the latest version. It makes no sense to
 keep the sql for all versions in the same place.

I'd be all for a rework of pg_dump which more cleanly sets up the SQL
for each server version- feel free to work on such. :)

  It may be that the functions Joel proposes are worth having for other
  tools to use, but I'm not in favor of making pg_dump use them.
 
 I don't agree with you as I don't think it's a very hard problem to
 deal with updates of sql-language functions in the pg_catalog schema.

The above doesn't really seem like a response to the piece quoted, so
I'll respond to just what you wrote: I agree.  Let's figure out a way to
update sql-language functions.  Once we have that, it'll certainly open
up the door to having more of them.  I don't think we can punt on this
as a "well, we'll do it when we need it"; we've needed it in the past
(as I recall, we've had to tell people in the release notes to hack the
catalog for some issue or another at least once in most major revs..)
and it'd be great to have the problem solved.

 If we fix that first, this and a whole range of problems, where the
 solution involves taking in-data from pg_catalog, and producing
 out-data based on that, becomes super easy to achieve by writing
 simple sql-language functions with direct and simple access to the
 pg_catalog, instead of writing C functions with inline hard-coded sql.

Sounds fantastic.

Thanks,

Stephen




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Joel Jacobson
On Tue, Apr 30, 2013 at 2:12 PM, Stephen Frost sfr...@snowman.net wrote:
 Go for it? :)  I don't think you'll find much disagreement here, but
 it's no trivial thing to do either..  Perhaps some kind of one-time
 startup script that gets run?  Or maybe an internal identifier that

 The above doesn't really seem like a response to the piece quoted, so
 I'll respond to just what you wrote- I agree.  Let's figure out a way to
 update sql-language functions.  Once we have that, it'll certainly open
 up the door to having more of them.  I don't think we can punt on this
 as a "well, we'll do it when we need it"; we've needed it in the past
 (as I recall, we've had to tell people in the release notes to hack the
 catalog for some issue or another at least once in most major revs..)
 and it'd be great to have the problem solved.

 Sounds fantastic.

Thanks for the motivational speech! :) I'll start working on it today!




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Stephen Frost
* Joel Jacobson (j...@trustly.com) wrote:
 Thanks for the motivational speech! :) I'll start working on it today!

Great, but you should really come up with an initial design and get
feedback on it before you start coding up something. :)  I've outlined a
few very, very high-level ideas about what could be done, but there's a
*lot* of details that would need to be worked out here.  You might also
ask some of the packagers (eg: Martin Pitt w/ Ubuntu/Debian) about how
they've handled point-upgrades which have required catalog hackery.

Thanks,

Stephen




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Andres Freund
Could we please stop the ad-hominem stuff from all sides? We want to
solve the issue, not make it bigger.

On 2013-04-30 04:29:26 -0700, Kevin Grittner wrote:
 Let's look at the corner this supposedly paints us into.  If a
 later major release creates a better mechanism, current pg_dump and
 load will already use it, based on the way matviews are created
 empty and REFRESHed by pg_dump.  Worst case, we need to modify the
 behavior of pg_dump running with the switch used by pg_upgrade to
 use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
 syntax is chosen) -- a command we would probably want at that point
 anyway.  I'm not seeing cause for panic here.

I don't think panic is appropriate either, but I think there are some
valid concerns around this.

1) vacuum can truncate the table to an empty relation already if there is
   no data returned by the view's query which then leads to the wrong
   scannability state.

   S1: CREATE MATERIALIZED VIEW matview_empty AS SELECT false WHERE random() < -1;
   S2: SELECT * FROM matview_empty; -- works
   S1: VACUUM matview_empty;
   S2: SELECT * FROM matview_empty; -- still works
   S3: SELECT * FROM matview_empty; -- errors out

   So we need to make vacuum skip cleaning out the last page. Once we
   get incrementally updated matviews there are more situations to get
   into this than just a query not returning anything.
   I remember this being discussed somewhere already, but couldn't find
   it quickly enough.

   Imo this is quite an indicator that the idea of using the filesize is
   just plain wrong. Adding logic to vacuum not to truncate data because
   it's a flag for matview scannability is quite the layer violation and
   a sign of a bad design decision. Such a hack has already been added
   to copy_heap_data(), which, while not as bad, shows that it is hard
   to find all the places where we need to add it.

2) Since we don't have a metapage to represent scannability in 9.3, we
   cannot easily use one in 9.4+ without pg_upgrade emptying all
   matviews, unless we rely only on the catalogs, which we currently
   cannot. This will either slow down development or make users
   unhappy. Alternatively we can add yet another fork, but that has its
   price (quite a few more open files during normal operation, slower
   CREATE DATABASE).

   This is actually an argument for not releasing matviews without such
   an external state. Going from disk-based state to catalog is easy,
   the other way round: not so much.

3) Using the filesize as a flag will make other stuff like preallocating
   on-disk data in bigger chunks and related things more complicated.

4) doing the check for scannability in the executor imo is a rather bad
   idea. Note that we e.g. don't see an error about a matview which
   won't be scanned because of constraint exclusion or one-time
   filters.

   S1: CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false WITH NO DATA;
   S1: SELECT * FROM matview_unit_false;

You can get there in far more reasonable cases than WHERE false.

5) I have to agree with Kevin that the scannability is an important thing
   to track though.

   a) we cannot remove support for WITH NO DATA because of pg_dump order
      and unlogged relations. So even without unlogged relations the
      problem exists, although it's easier to represent.
   b) Just giving wrong responses is bad [tm]. Outdated data is
      something completely different (it has existed in that state in
      the (near) past) from giving an empty response (which might never
      have been a visible state, and more likely than not was never so
      in any reasonably near past). Consider an application behind a
      pooler suddenly getting an empty response from a SELECT * FROM
      unlogged_matview; it won't notice anything without an unscannable
      state since it probably won't even notice the database restart.

Not sure what the consequence of this is. The most reasonable solution
seems to be to introduce a metapage somewhere *now*, which sucks, but ...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Andres Freund
On 2013-04-30 05:14:15 +0100, Joel Jacobson wrote:
 On Tue, Apr 30, 2013 at 12:58 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Noah Misch n...@leadboat.com writes:
  Note also that minor releases can readily fix bugs in C-language functions,
  but we have no infrastructure to update sql-language functions after 
  initdb.
  That flexibility is unfortunate to lose, particularly for something that
  pg_dump depends on.
 
  That alone would probably be sufficient reason why we would never allow
  pg_dump to depend on any such thing (not that I see a compelling
  argument for it to do so anyway...).
 
 It would be better to find a way to update sql-language functions in
 minor upgrades, instead of shutting that door entirely for all future
 implementation ideas involving sql-language functions in the
 pg_catalog.

I'd be very careful with jumping on this task. I am pretty sure it's a
very good way to get very, very frustrated if you don't present a widely
accepted design beforehand. Doing this correctly is *far far* from easy.

Just a little collection of problems:
* You need to connect to all databases, not just one. There's no
  infrastructure for this.
* You need to do the update *before* allowing any external
  connections. Otherwise the feature won't be useful to fix actual
  problems. Again, there is no infrastructure for this.
* You need to do it in a way that a) doesn't slow down normal startup,
  and b) doesn't break if the update has so far been applied to only
  some of the databases.


FWIW I really, really doubt you can do all the functions referred to
upthread sensibly and correctly from SQL functions. The infrastructure
to do this just isn't there.

Besides, I really don't buy this helping pg_dump. I think the far more
realistic course here is to put the parts that are required by pg_dump
and by such functions into some common library that can then *also* be
used by such backend functions. But that's a humongous task that,
besides deep technical knowledge in lots of areas, requires quite a bit
of politics.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Robert Haas
On Tue, Apr 30, 2013 at 7:29 AM, Kevin Grittner kgri...@ymail.com wrote:
 Let's look at the corner this supposedly paints us into.  If a
 later major release creates a better mechanism, current pg_dump and
 load will already use it, based on the way matviews are created
 empty and REFRESHed by pg_dump.  Worst case, we need to modify the
 behavior of pg_dump running with the switch used by pg_upgrade to
 use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
 syntax is chosen) -- a command we would probably want at that point
 anyway.  I'm not seeing cause for panic here.

 What is a real problem or risk with using this mechanism until we
 engineer something better?  What problems with converting to a
 later major release does anyone see?

Well, it's a pg_upgrade hazard, if nothing else, isn't it?

 Yeah, I know you don't get it, but as a DBA I would never have
 allowed a feature which could quietly generate false results to be
 used -- which is exactly what you have without a way to
 differentiate these cases.  If it comes down to a choice between a
 performance tweak like unlogged matviews and an issue of
 correctness of results, I will pick correctness.  Now, as I've
 already said, this tweak is significant (I expect it will make
 matviews useful in roughly twice as many cases), but it is just a
 performance tweak.

Sure, I wouldn't allow that either.  My point is that I feel that
could be engineered around in user space.  If you have a materialized
view which could legitimately be empty (there are many for which that
won't be an issue), then you can either arrange the view definition so
that it isn't (e.g. by including a dummy record that clients can look
for), or you can include a sentinel unlogged table that will contain a
row if and only if materialized views have been refreshed since the
last crash.  In the examples I can think of,
indefinitely-stale-but-valid-at-some-point wouldn't be very good
either, so I would anticipate needing to do some engineering around
relative freshness anyway - e.g. keeping a side table that lists the
last-refreshed-time for each matview.  Or maybe I wouldn't care
about tracking elapsed time per se, but instead want to track
freshness relative to updates - e.g. set things up so that anyone who
performs an action that would invalidate a row in the materialized
view would also update a row someplace that would allow me to identify
the row as stale.  In either case, handling the case where the view is
altogether invalid doesn't seem to add a whole lot of additional
complexity.
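
For instance, the sentinel could be as simple as (a sketch, hypothetical
names):

    CREATE UNLOGGED TABLE matview_sentinel (refreshed_at timestamptz);

    -- The refresh job, in one transaction:
    BEGIN;
    REFRESH MATERIALIZED VIEW my_matview;
    TRUNCATE matview_sentinel;
    INSERT INTO matview_sentinel VALUES (now());
    COMMIT;

    -- Clients trust the matview only if the sentinel row survived;
    -- a crash empties unlogged tables, taking the row with it:
    SELECT EXISTS (SELECT 1 FROM matview_sentinel) AS matviews_valid;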

Now, I can imagine cases where it does.  For example, suppose you have
a cron job (which you trust to work) to refresh your materialized
views every night.  Well, that means that you'll never be more than 24
hours stale - but if any of those materialized views are unlogged,
that also means that you could have no data at all for up to 24 hours
following a crash.  Not great, because now you need some logic to
handle just that one case that wouldn't be necessary if the DB did it
for you.  But I just think it's a judgement call how serious one
thinks that scenario is, vs. any other scenario where a boolean isn't
adequate either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-04-30 04:29:26 -0700, Kevin Grittner wrote:

 Let's look at the corner this supposedly paints us into.  If a
 later major release creates a better mechanism, current pg_dump and
 load will already use it, based on the way matviews are created
 empty and REFRESHed by pg_dump.  Worst case, we need to modify the
 behavior of pg_dump running with the switch used by pg_upgrade to
 use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
 syntax is chosen) -- a command we would probably want at that point
 anyway.  I'm not seeing cause for panic here.

 I don't think panic is appropriate either, but I think there are some
 valid concerns around this.

 1) vacuum can truncate the table to an empty relation already if there is
   no data returned by the view's query which then leads to the wrong
   scannability state.

   S1: CREATE MATERIALIZED VIEW matview_empty AS SELECT false WHERE random() < -1;
   S2: SELECT * FROM matview_empty; -- works
   S1: VACUUM matview_empty;
   S2: SELECT * FROM matview_empty; -- still works
   S3: SELECT * FROM matview_empty; -- errors out

   So we need to make vacuum skip cleaning out the last page. Once we
   get incrementally updated matviews there are more situations to get
   into this than just a query not returning anything.
   I remember this being discussed somewhere already, but couldn't find
   it quickly enough.

Yeah, I posted a short patch earlier on this thread that fixes
that.  Nobody has commented on it, and so far I have not committed
anything related to this without posting details and giving ample
opportunity for anyone to comment.  If nobody objects, I can push
that, and this issue is gone.

   Imo this is quite an indicator that the idea of using the filesize is
   just plain wrong. Adding logic to vacuum not to truncate data because
   it's a flag for matview scannability is quite the layer violation and
   a sign of a bad design decision. Such a hack has already been added
   to copy_heap_data(), which, while not as bad, shows that it is hard
   to find all the places where we need to add it.

I agree, but if you review the thread I started with a flag in
pg_class, I tried using the special area of the first page of the
heap, and finally wound up with the current technique based on
several people reviewing the patch and offering opinions and
reaching an apparent consensus that this was the least evil way to
do it given current infrastructure for unlogged tables.  This
really is a result of trying to preserve *correct* behavior while
catering to those emphasizing how important the performance of
unlogged matviews is.

 2) Since we don't have a metapage to represent scannability in 9.3, we
   cannot easily use one in 9.4+ without pg_upgrade emptying all
   matviews, unless we rely only on the catalogs, which we currently
   cannot.

I am not following this argument at all.  Doesn't pg_upgrade use
pg_dump to create the tables and matviews WITH NO DATA and take
later action for ones which are populated in the source?  It would
not be hard at all to move to a new release which used a different
technique for tracking populated tables by changing what pg_dump
does for populated tables with the switch pg_upgrade uses.

   This will either slow down development or make users
   unhappy. Alternatively we can add yet another fork, but that has its
   price (quite a bit more open files during normal operation, slower
   CREATE DATABASE).

   This is actually an argument for not releasing matviews without such
   an external state. Going from disk-based state to catalog is easy,
   the other way round: not so much.

I am not seeing this at all.  Given my comment above about how this
works for pg_upgrade and pg_dump, can you explain how this is a
problem?

 3) Using the filesize as a flag will make other stuff like preallocating
   on-disk data in bigger chunks and related things more complicated.

Why?  You don't need to preallocate for a non-populated matview,
since its heap will be replaced on REFRESH.  You would not *want*
to preallocate a lot of space for something guaranteed to remain at
zero length until deleted.

 4) doing the check for scannability in the executor imo is a rather bad
   idea. Note that we e.g. don't see an error about a matview which
   won't be scanned because of constraint exclusion or one-time
   filters.

   S1: CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false WITH NO DATA;
   S1: SELECT * FROM matview_unit_false;

 You can get there in far more reasonable cases than WHERE false.

I am a little concerned about that, but the case you show doesn't demonstrate a 
problem:

test=# CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false WITH NO DATA;
SELECT 0
test=# SELECT * FROM matview_unit_false;
ERROR:  materialized view "matview_unit_false" has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.

The view was defined WITH NO DATA 

Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kgri...@ymail.com wrote:

 Let's look at the corner this supposedly paints us into.  If a
 later major release creates a better mechanism, current pg_dump and
 load will already use it, based on the way matviews are created
 empty and REFRESHed by pg_dump.  Worst case, we need to modify the
 behavior of pg_dump running with the switch used by pg_upgrade to
 use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
 syntax is chosen) -- a command we would probably want at that point
 anyway.  I'm not seeing cause for panic here.

 What is a real problem or risk with using this mechanism until we
 engineer something better?  What problems with converting to a
 later major release does anyone see?

 Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so.  What do you see as a problem?

 Sure, I wouldn't allow that either.  My point is that I feel that
 could be engineered around in user space.  If you have a materialized
 view which could legitimately be empty (there are many for which that
 won't be an issue), then you can either arrange the view definition so
 that it isn't (e.g. by including a dummy record that clients can look
 for), or you can include a sentinel unlogged table that will contain a
 row if and only if materialized views have been refreshed since the
 last crash.  In the examples I can think of,
 indefinitely-stale-but-valid-at-some-point wouldn't be very good
 either, so I would anticipate needing to do some engineering around
 relative freshness anyway - e.g. keeping a side table that lists the
 last-refreshed-time for each matview.  Or maybe I wouldn't care
 about tracking elapsed time per se, but instead want to track
 freshness relative to updates - e.g. set things up so that anyone who
 performs an action that would invalidate a row in the materialized
 view would also update a row someplace that would allow me to identify
 the row as stale.  In either case, handling the case where the view is
 altogether invalid doesn't seem to add a whole lot of additional
 complexity.

 Now, I can imagine cases where it does.  For example, suppose you have
 a cron job (which you trust to work) to refresh your materialized
 views every night.  Well, that means that you'll never be more than 24
 hours stale - but if any of those materialized views are unlogged,
 that also means that you could have no data at all for up to 24 hours
 following a crash.  Not great, because now you need some logic to
 handle just that one case that wouldn't be necessary if the DB did it
 for you.  But I just think it's a judgement call how serious one
 thinks that scenario is, vs. any other scenario where a boolean isn't
 adequate either.

Staleness is a completely different issue, in my view, from
quietly returning results that are not, and never were, accurate. 
Sure we need to implement more refined scannability tests than
whether valid data from *some* point in time is present.  But that
should always be *part* of the scannability testing, and without it
I don't feel we have a feature of a quality suitable for delivery.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 If all you want is SQL language functions, there is nothing to stop you 
 from writing them and publishing them today as an extension.

It's worth noting also that we actually *have* infrastructure for
updating extensions without initdb, unlike the initial contents of
pg_proc.  So this approach is more attractive than it might seem
on its face, assuming you are going to do this as SQL functions.
(I share the doubts expressed elsewhere as to how feasible that
actually is.)
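
That is, the extension upgrade-script mechanism.  A minimal sketch,
using a hypothetical extension name:

    -- getdefs--1.0--1.1.sql ships the corrected function bodies as
    -- CREATE OR REPLACE FUNCTION statements; a minor release then
    -- only needs:
    ALTER EXTENSION getdefs UPDATE TO '1.1';

which works against an existing cluster, without initdb.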

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 ISTM that we also need this patch to put memory barriers in place
 otherwise the code might be rearranged.

This is simply silly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Simon Riggs
On 30 April 2013 15:51, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 ISTM that we also need this patch to put memory barriers in place
 otherwise the code might be rearranged.

 This is simply silly.

You crack me up sometimes. Yes, it is; I seem to be having a bad day for thinkos.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Back branches vs. gcc 4.8.0

2013-04-30 Thread Greg Stark
On Fri, Apr 5, 2013 at 11:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Since gcc 4.8 is going to be on a lot of people's machines pretty soon,
 I think we need to do something to prevent it from breaking 8.4.x and
 9.0.x.  It looks like our choices are (1) teach configure to enable
 -fno-aggressive-loop-optimizations if the compiler recognizes it,
 or (2) back-port commit 8137f2c32322c624e0431fac1621e8e9315202f9.

 I'm a bit leaning towards (1), mainly because I'm not excited about

I'm confused. I would have described (1) as entering an arms race.
Each new optimization related to arrays and structs would need a new
flag. Whereas (2) makes the code into the kind of common, traditional
code that gcc is going to need to tolerate for the foreseeable future.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Andres Freund
On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  1) vacuum can truncate the table to an empty relation already if there is
    no data returned by the view's query, which then leads to the wrong
    scannability state.
 
    S1: CREATE MATERIALIZED VIEW matview_empty AS
          SELECT false WHERE random() > -1;
    S2: SELECT * FROM matview_empty; -- works
    S1: VACUUM matview_empty;
    S2: SELECT * FROM matview_empty; -- still works
    S3: SELECT * FROM matview_empty; -- errors out
 
    So we need to make vacuum skip cleaning out the last page. Once we
    get incrementally updated matviews there are more situations to get
    into this than just a query not returning anything.
    I remember this being discussed somewhere already, but couldn't find
    it quickly enough.
 
 Yeah, I posted a short patch earlier on this thread that fixes
 that.  Nobody has commented on it, and so far I have not committed
 anything related to this without posting details and giving ample
 opportunity for anyone to comment.  If nobody objects, I can push
 that, and this issue is gone.

Well, this bug is gone, but the multiple layer violations aren't.

    Imo this is quite an indicator that the idea of using the filesize is
    just plain wrong. Adding logic to vacuum not to truncate data because
    it's a flag for matview scannability is quite the layer violation and
    a sign for a bad design decision. Such a hack has already been added
    to copy_heap_data(); while not as bad, it shows that it is hard to find
    all the places where we need to add it.
 
 I agree, but if you review the thread: I started with a flag in
 pg_class, then tried using the special area of the first page of the
 heap, and finally wound up with the current technique based on
 several people reviewing the patch and offering opinions and
 reaching an apparent consensus that this was the least evil way to
 do it given current infrastructure for unlogged tables.  This
 really is a result of trying to preserve *correct* behavior while
 catering to those emphasizing how important the performance of
 unlogged matviews is.

Imo it now uses the worst of both worlds...

  2) Since we don't have a metapage to represent scannability in 9.3 we
    cannot easily use one in 9.4+ without pg_upgrade emptying all
    matviews unless we only rely on the catalogs which we currently
    cannot.

 I am not following this argument at all.  Doesn't pg_upgrade use
 pg_dump to create the tables and matviews WITH NO DATA and take
 later action for ones which are populated in the source?  It would
 not be hard at all to move to a new release which used a different
 technique for tracking populated tables by changing what pg_dump
 does for populated tables with the switch pg_upgrade uses.

What I am thinking about is a 100GB materialized view which has been
filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
metapage yet and we want to add one we would need to rewrite the whole
100GB which seems like a rather bad idea. Or how are you proposing to
add the metapage? You cannot add it to the end or somesuch.

 I am not seeing this at all.  Given my comment above about how this
 works for pg_upgrade and pg_dump, can you explain how this is a
 problem?

Well, that only works if there is a cheap way to add the new notation to
existing matview heaps which I don't see.

  3) Using the filesize as a flag will make other stuff like preallocating
    on-disk data in bigger chunks and related things more complicated.
 
 Why?  You don't need to preallocate for a non-populated matview,
 since its heap will be replaced on REFRESH.  You would not *want*
 to preallocate a lot of space for something guaranteed to remain at
 zero length until deleted.

But we would likely also want to optimize reducing the filesize in the
same chunks, because otherwise you would end up with even worse
fragmentation. And I am not talking about an unscannable relation but
about one which is currently empty (e.g. SELECT ... WHERE false).

  4) doing the check for scannability in the executor imo is a rather bad
    idea. Note that we e.g. don't see an error about a matview which
    won't be scanned because of constraint exclusion or one-time
    filters.
 
    S1: CREATE MATERIALIZED VIEW matview_unit_false AS
          SELECT false WHERE false WITH NO DATA;
    S1: SELECT * FROM matview_unit_false;
 
  You can get there in far more reasonable cases than WHERE false.
 
 I am a little concerned about that, but the case you show doesn't demonstrate 
 a problem:

 The view was defined WITH NO DATA and is behaving correctly for
 that case.  When populated (with zero rows) it behaves correctly:

Ah. Tom already fixed the problem in
5194024d72f33fb209e10f9ab0ada7cc67df45b7. I was working in a branch
based on 3ccae48f44d993351e1f881761bd6c556ebd6638 and it failed there.

 I would be interested to see whether anyone can come up with an
 actual mis-behavior 

Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Andres Freund
On 2013-04-30 15:57:02 +0200, Andres Freund wrote:
 On 2013-04-30 05:14:15 +0100, Joel Jacobson wrote:
  On Tue, Apr 30, 2013 at 12:58 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   Noah Misch n...@leadboat.com writes:
   Note also that minor releases can readily fix bugs in C-language 
   functions,
   but we have no infrastructure to update sql-language functions after 
   initdb.
   That flexibility is unfortunate to lose, particularly for something that
   pg_dump depends on.
  
   That alone would probably be sufficient reason why we would never allow
   pg_dump to depend on any such thing (not that I see a compelling
   argument for it to do so anyway...).
  
  It would be better to find a way to update sql-language functions in
  minor upgrades, instead of shutting that door entirely for all future
  implementation ideas involving sql-language functions in the
  pg_catalog.
 
 I'd be very careful with jumping on this task. I am pretty sure it's a
 very good way to get very, very frustrated if you don't present a widely
 accepted design beforehand. Doing this correctly is *far far* from easy.
 
 Just a little collection of problems:
 * You need to connect to all databases, not just one. There's no
   infrastructure for this.
 * You need to do the update *before* allowing any external
   connections. Otherwise the feature won't be useful to fix actual
   problems. Again, there is no infrastructure for this.
 * You need to do it in a way that a) doesn't slow down normal startup b)
    doesn't break if the update has only been applied to some of the
    databases.

Another rather fundamental problem:

This obviously cannot be done directly on a standby. So either we cannot
rely on those updates having been performed or you need to update the
standby in lockstep with the primary. Neither seems acceptable.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 2) Since we don't have a metapage to represent scannability in 9.3
    we cannot easily use one in 9.4+ without pg_upgrade emptying all
    matviews unless we only rely on the catalogs which we currently
    cannot.

 I am not following this argument at all.  Doesn't pg_upgrade use
 pg_dump to create the tables and matviews WITH NO DATA and take
 later action for ones which are populated in the source?  It would
 not be hard at all to move to a new release which used a different
 technique for tracking populated tables by changing what pg_dump
 does for populated tables with the switch pg_upgrade uses.

 What I am thinking about is a 100GB materialized view which has been
 filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
 metapage yet and we want to add one we would need to rewrite the whole
 100GB which seems like a rather bad idea. Or how are you proposing to
 add the metapage? You cannot add it to the end or somesuch.

Oh, you are suggesting prepending a metapage to existing matviews
(and tables?)?  So to check whether a view has been populated you
not only look at the directory but open the file and read a page?
Now I follow why you think this would be an issue.  I'm not sure I
think that is the best solution, though.  In what way would it be
better than adding info to pg_class or some other system table? 
Why would this be important for unlogged matviews but not unlogged
tables?

 I am not seeing this at all.  Given my comment above about how this
 works for pg_upgrade and pg_dump, can you explain how this is a
 problem?

 Well, that only works if there is a cheap way to add the new notation to
 existing matview heaps which I don't see.

We could perhaps reserve some space in the special area of the
first page, if we can agree on a generous enough amount of space.

 3) Using the filesize as a flag will make other stuff like
    preallocating on-disk data in bigger chunks and related
    things more complicated.

 Why?  You don't need to preallocate for a non-populated matview,
 since its heap will be replaced on REFRESH.  You would not *want*
 to preallocate a lot of space for something guaranteed to remain at
 zero length until deleted.

 But we would likely also want to optimize reducing the filesize in the
 same chunks because you otherwise would end up with even worse
 fragmentation. And I am not talking about an unscannable relation but
 about one which is currently empty (e.g. SELECT ... WHERE false).

Yes, if we get to both incremental updates *and* preallocations
before we develop a better technique for this, a special case would
be needed for the case where a matview was incrementally updated to
zero rows.

 Not sure what the consequence of this is. The most reasonable solution
 seems to be to introduce a metapage somewhere *now* which sucks, but
 ...

 That seems far riskier to me than using the current
 lame-but-functional approach now and improving it in a subsequent
 release.

 Why? We have been bitten by the lack of such metapages several times now
 and I don't think we have been bitten by their presence in relation types that
 have them by now.

Like I said, months ago I had a version which used the special area
of the first page of a matview heap, but was convinced to change
that.  I could probably be convinced to change back.  :-)  I don't
know whether you see a problem with using the special area like that for
the metadata you envision.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Robert Haas
On Tue, Apr 30, 2013 at 10:40 AM, Kevin Grittner kgri...@ymail.com wrote:
 What is a real problem or risk with using this mechanism until we
 engineer something better?  What problems with converting to a
 later major release does anyone see?

 Well, it's a pg_upgrade hazard, if nothing else, isn't it?

 I don't think so.  What do you see as a problem?

pg_upgrade only handles changes in catalog state, not on-disk
representation.  If the on-disk representation of a non-scannable
view might change in a future release, it's a pg_upgrade hazard.

 Sure, I wouldn't allow that either.  My point is that I feel that
 could be engineered around in user space.  If you have a materialized
 view which could legitimately be empty (there are many for which that
 won't be an issue), then you can either arrange the view definition so
 that it isn't (e.g. by including a dummy record that clients can look
 for), or you can include a sentinel unlogged table that will contain a
 row if and only if materialized views have been refreshed since the
 last crash.  In the examples I can think of,
 indefinitely-stale-but-valid-at-some-point wouldn't be very good
 either, so I would anticipate needing to do some engineering around
 relative freshness anyway - e.g. keeping a side table that lists the
 last-refreshed-time for each matview.  Or maybe I wouldn't care
 about tracking elapsed time per se, but instead want to track
 freshness relative to updates - e.g. set things up so that anyone who
 performs an action that would invalidate a row in the materialized
 view would also update a row someplace that would allow me to identify
 the row as stale.  In either case, handling the case where the view is
 altogether invalid doesn't seem to add a whole lot of additional
 complexity.

 Now, I can imagine cases where it does.  For example, suppose you have
 a cron job (which you trust to work) to refresh your materialized
 views every night.  Well, that means that you'll never be more than 24
 hours stale - but if any of those materialized views are unlogged,
 that also means that you could have no data at all for up to 24 hours
 following a crash.  Not great, because now you need some logic to
 handle just that one case that wouldn't be necessary if the DB did it
 for you.  But I just think it's a judgement call how serious one
 thinks that scenario is, vs. any other scenario where a boolean isn't
 adequate either.

 Staleness is a completely different issue, in my view, from
 quietly returning results that are not, and never were, accurate.
 Sure we need to implement more refined scannability tests than
 whether valid data from *some* point in time is present.  But that
 should always be *part* of the scannability testing, and without it
 I don't feel we have a feature of a quality suitable for delivery.

I don't really see these as being all that separate.  The user is
going to want to know whether the data is usable or not.  The answer
to that question depends on the user's business rules, and those could
be anything.  The current system implements the following business
rule: The data can be used if the matview has ever been populated.
But there are lots of other possibilities:

- The data can be used if the matview is fully up-to-date.
- The data can be used if the matview is not out of date by more than
a certain amount of time.
- The data can be used if the matview is out of date with respect to
one of its base tables, but not if it is out of date with respect to
another of its base tables.  For example, maybe you're OK with using
an order-summary view if new orders have arrived (but not if the view
is more than a day out of date); but not if any customers have been
renamed.
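
A rough user-space sketch of the side-table idea quoted above (all
names are illustrative):

    CREATE TABLE matview_refresh_log (
        matview   text PRIMARY KEY,  -- one row per matview
        refreshed timestamptz NOT NULL
    );

    -- Refresh and record the time in a single transaction.
    BEGIN;
    REFRESH MATERIALIZED VIEW order_summary;
    UPDATE matview_refresh_log SET refreshed = now()
     WHERE matview = 'order_summary';
    COMMIT;

    -- Query B's business rule: usable if refreshed within a day.
    SELECT refreshed > now() - interval '1 day' AS usable
      FROM matview_refresh_log
     WHERE matview = 'order_summary';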

Not only can the business rules vary, but they can vary between
queries.  Query A might need an exact answer, hence can only use the
matview when it's up to date.  Query B against the very same matview
might only need data that's up to date within some time threshold, and
query C might well want to just use whatever data exists.   I'm not at
all clear that it's even feasible to solve all of these problems
inside the database; my suspicion is that at least some of these
things are going to end up being things that have to be handled at
least partially in user-space, while others will be handled in the DB
through mechanisms not yet designed.

I am even willing to go so far as to say that I am unconvinced that
everyone will want a just-truncated materialized view to be
automatically set to non-scannable.  Suppose you have an unlogged view
that summarizes a real-time data stream - all SMTP traps received in
the last minute summarized by the node from which they were received.
When the server reboots unexpectedly, the raw data is lost along with
the summary.  But for the scannability flag, we'd read the summary as
empty, which would be right.  Adding the scannability flag forces the
user to add application logic to treat the error as equivalent to an
empty 

Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Andres Freund
On 2013-04-30 08:35:32 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:
  Andres Freund and...@2ndquadrant.com wrote:
 
  2) Since we don't have a metapage to represent scannability in 9.3
     we cannot easily use one in 9.4+ without pg_upgrade emptying all
     matviews unless we only rely on the catalogs which we currently
     cannot.
 
  I am not following this argument at all.  Doesn't pg_upgrade use
  pg_dump to create the tables and matviews WITH NO DATA and take
  later action for ones which are populated in the source?  It would
  not be hard at all to move to a new release which used a different
  technique for tracking populated tables by changing what pg_dump
  does for populated tables with the switch pg_upgrade uses.
 
  What I am thinking about is a 100GB materialized view which has been
  filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
  metapage yet and we want to add one we would need to rewrite the whole
  100GB which seems like a rather bad idea. Or how are you proposing to
  add the metapage? You cannot add it to the end or somesuch.
 
 Oh, you are suggesting prepending a metapage to existing matviews
 (and tables?)?  So to check whether a view has been populated you
 not only look at the directory but open the file and read a page?
 Now I follow why you think this would be an issue.  I'm not sure I
 think that is the best solution, though.  In what way would it be
 better than adding info to pg_class or some other system table?

You can write/read it without having a database opened. Like in the
startup process.

Then you can abstract away the knowledge about that into
PageIsMetapage(relation, blockno) or such, which can then be used by
vacuum, heapam et al in an extensible fashion.

This is far from a fully working solution though - you still have issues
with BEGIN; REFRESH ...; ROLLBACK; if you do it naively. AFAIR that was
what made Tom protest against this.
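
That is, the naive failure mode would be:

    BEGIN;
    REFRESH MATERIALIZED VIEW mv;  -- metapage on disk now says "populated"
    ROLLBACK;                      -- ...but the transaction never committed

if the metapage write is not under transactional control, the matview
is left looking populated even though the REFRESH rolled back.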


 Why would this be important for unlogged matviews but not unlogged
 tables?

Unlogged tables basically have some very raw version of this - the _init
fork. But yes, a more generic version would have been nice.

  I am not seeing this at all.  Given my comment above about how this
  works for pg_upgrade and pg_dump, can you explain how this is a
  problem?
 
  Well, that only works if there is a cheap way to add the new notation to
  existing matview heaps which I don't see.
 
 We could perhaps reserve some space in the special area of the
 first page, if we can agree on a generous enough amount of space.

If we do it we should just use a whole page, no point in being too
cautious there imo.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Greg Smith
I re-ran the benchmark that's had me most worried against the committed 
code and things look good so far.  I've been keeping quiet because my 
tests recently have all agreed with what Ants already described.  This 
is more a confirmation summary than new data.


The problem case has been Jeff's test 2: worst-case overhead for 
calculating checksums while reading data from the OS cache.  I wrapped 
that into a test harness and gave results similar to Jeff's at 
http://www.postgresql.org/message-id/5133d732.4090...@2ndquadrant.com

based on the originally proposed Fletcher-16 checksum.

I made some system improvements since then such that the absolute 
runtime improved for most of the tests I'm running.  But the percentage 
changes didn't seem off enough to bother re-running the Fletcher tests 
again.  Details are in the attached spreadsheet; to summarize:


-The original Fletcher-16 code slowed this test case down 24 to 32%, 
depending on whether you look at the average of 3 runs or the median.


-The initial checksum commit with the truncated WAL CRC was almost an 
order of magnitude worse:  146% to 224% slowdown.  The test case that 
took ~830ms was taking as much as 2652ms with that method.  I'm still 
not sure why the first run of this test is always so much faster than 
the second and third.  But since it happens so often I think it's fair 
to consider that worst case really important.


-Committed FNV-1a implementation is now slightly better than Fletcher-16 
speed-wise:  19 to 27% slowdown.


-Slicing-by-8 CRC I didn't test because once I'd fully come around to 
agree with Ants's position it didn't seem likely to be useful.  I don't 
want to lose track of that idea though, it might be the right path for a 
future implementation with 32 bit checksums.


Since the >=25% slowdown on this test with Fletcher-16 turned into more 
like a 2% drop on more mixed workloads, I'd expect we're back to where 
that's again the case with the new FNV-1a.  I plan to step back to 
looking at more of those cases, but it will take a few days at least to 
start sorting that out.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


ChecksumMethods.xls
Description: MS-Excel spreadsheet

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Josh Berkus
Robert,

 - The data can be used if the matview is fully up-to-date.
 - The data can be used if the matview is not out of date by more than
 a certain amount of time.
 - The data can be used if the matview is out of date with respect to
 one of its base tables, but not if it is out of date with respect to
 another of its base tables.  For example, maybe you're OK with using
 an order-summary view if new orders have arrived (but not if the view
 is more than a day out of date); but not if any customers have been
 renamed.

We discussed this around the beginning of CF4.  For some reason (which I
didn't quite understand at the time), the consensus was that creating a
matview last updated timestamp was not implementable for 9.3 and would
need to wait for 9.4.

Again, all of this points to additional columns, or at least relopts, in
pg_class.  Why is that off the table?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:

 We discussed this around the beginning of CF4.  For some reason
 (which I didn't quite understand at the time), the consensus was
 that creating a "matview last updated" timestamp was not
 implementable for 9.3 and would need to wait for 9.4.

The reason was that the start of CF4 was deemed too late in the
development cycle to be trying to design what that should look
like.  No sooner had you suggested that one column than someone
suggested two others which might also be useful, and it seemed to
me that it would be hard to get it right before we had a better
sense of what incremental maintenance based on the view declaration
would look like.  So, as I recall it, the consensus developed
around the simplest possible test, which from the user perspective
was hidden behind a function, should be used for this release.

 Again, all of this points to additional columns, or at least
 relopts, in pg_class.  Why is that off the table?

That was deemed to be incompatible with unlogged matviews, which
some didn't want to give up in this initial release.

Basically, what this patch aims at is more or less what some other
databases had in their initial releases of materialized views 10 to
20 years ago.  Other products have built on those foundations with
each major release.  I was hoping we could do the same.  We are not
going to reach parity on this with any other major database product
in one release, or probably even two or three.

I didn't dig back through all the threads to confirm these
recollections, so take them with a grain of salt.  The one thing I
am sure of is that they are a rehash of discussions which go back
to before the start of CF3.  Only with more tone and urgency.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] materialized view scannability in other DBs

2013-04-30 Thread Robert Haas
On Tue, Apr 30, 2013 at 10:40 AM, Kevin Grittner kgri...@ymail.com wrote:
 Staleness is a completely different issue, in my view, from
 quietly returning results that are not, and never were, accurate.
 Sure we need to implement more refined scannability tests than
 whether valid data from *some* point in time is present.  But that
 should always be *part* of the scannability testing, and without it
 I don't feel we have a feature of a quality suitable for delivery.

I did a little more research on this.  The Oracle equivalent of WITH
NO DATA is apparently BUILD DEFERRED.  And as far as I can see, if you
specify BUILD DEFERRED when creating the view, then it just shows up
as having no rows:

http://dbataj.blogspot.com/2007/11/materialized-view-for-data-warehouse.html

The system declines to use such a view for query rewrite, but you can
still select from it without an error.   See also Pro Oracle Database
11g Administration by Darl Kuhn, page 378, which confirms that this is
the case and that it will simply return 0 rows.
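
Expressed in PostgreSQL syntax, the difference amounts to this
(hypothetical side-by-side):

    CREATE MATERIALIZED VIEW mv AS SELECT 1 AS x WITH NO DATA;
    SELECT * FROM mv;
    -- Oracle (BUILD DEFERRED):  returns 0 rows, no error
    -- PostgreSQL 9.3:  ERROR:  materialized view "mv" has not been populated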

On the other hand, in SQL Anywhere and DB2, it seems that it works the
way you've implemented it:

http://dcx.sybase.com/1201/en/saerrors/errm1077.html
http://www.ibm.com/developerworks/data/library/techarticle/dm-0509melnyk/

The question doesn't seem to arise for Microsoft SQL Server or MySQL.
The former supports "indexed views" rather than materialized views,
but you can't build them deferred, so the question of what data you'd
see if you could doesn't arise.  The latter doesn't seem to support
materialized views at all.

I have to admit that I'm slightly surprised to find that both DB2 and
Sybase have this concept, so maybe this is not as much of a random
wart as I had been thinking.  However, the fact that Oracle has
(FWICT) had materialized views since 8i (1999), and that they have not
felt compelled to add a flag of this type, suggests to me that the
feature can't be considered mandatory for a minimal implementation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Robert Haas
On Tue, Apr 30, 2013 at 9:02 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Mon, Apr 29, 2013 at 7:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  One of the things that we frequently recommend when doing
  upgrades is that you do the dump with the newer version's pg_dump, so
  as to get the benefits of any bug fixes that are in it.  The more
  dump functionality is on the server side, the less opportunity we have
  to repair things that way.

 But why wouldn't we be able to fix the version in the server, if it
 turns out to be buggy?  I suppose we wouldn't fix bugs discovered
 after EOL, but I'm not sure that's a sufficient objection.

 There are other things beyond bugs here..  Changes in reserved keywords
 is actually the biggest reason, ime, to use the newer pg_dump when
 you're trying to move to a newer PG version.

Oh.  Good point.
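
A hypothetical example of that hazard: LATERAL became a reserved word
in 9.3, so an identifier that was legal unquoted before needs quoting
in the newer version's output.

    -- Accepted by 9.2, where LATERAL is not reserved:
    CREATE TABLE t (lateral integer);

    -- A 9.2 pg_dump emits the column unquoted, which a 9.3 server
    -- rejects; a 9.3 pg_dump emits it quoted:
    CREATE TABLE t ("lateral" integer);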

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] materialized view scannability in other DBs

2013-04-30 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:

 Oracle has (FWICT) had materialized views since 8i (1999)

8i was when they expanded beyond requiring a manual refresh, and
changed the name of the feature from "snapshot" to "materialized
view".  I'm not sure how long they had snapshots before 8i.

 Microsoft SQL Server ... you can't build them deferred, so the
 question of what data you'd see if you could doesn't arise.

 in SQL Anywhere and DB2, it seems that it works the way you've
 implemented it

Sybase ASE, which is probably a better comparison than Sybase SQL
Anywhere, has the concept, too:

http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc36272.1572/html/commands/commands31.htm

See the "populate | nonpopulate" section.  If not populated, you
can't enable it for operations.

 MySQL ... doesn't seem to support materialized views at all.

Correct.

 the fact that Oracle has [...] not felt compelled to add a flag
 of this type, suggests to me that the feature can't be considered
 mandatory for a minimal implementation.

It seems to me pretty fundamental to have a way to avoid quietly
generating completely bogus results, whether or not one other
vendor has decided it doesn't matter.  It's not like they are
completely without the concept of freshness (or, as they seem to
express it, staleness).  If you build with DEFERRED, that property
of the matview is set to UNUSABLE; but in their world that doesn't
mean it's unusable by direct reference -- only for automatic query
rewrites.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 Ah. Tom already fixed the problem in
 5194024d72f33fb209e10f9ab0ada7cc67df45b7. I was working in a
 branch based on 3ccae48f44d993351e1f881761bd6c556ebd6638 and it
 failed there.

Since previous regression tests missed this bug, I've added the
test you posted, to make sure it doesn't come back unnoticed.

Thanks for the test case!

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 1) vacuum can truncate the table to an empty relation already
    if there is no data returned by the view's query which then
    leads to the wrong scannability state.

   So we need to make vacuum skip cleaning out the last page.
   Once we get incrementally updated matviews there are more
   situations to get into this than just a query not returning
   anything.

 Yeah, I posted a short patch earlier on this thread that fixes
 that.  Nobody has commented on it, and so far I have not
 committed anything related to this without posting details and
 giving ample opportunity for anyone to comment.  If nobody
 objects, I can push that, and this issue is gone.

 Well, this bug is gone, but the multiple layer violations aren't.

Attached is a patch with a regression test, in case we don't obviate
the need for it by tracking the populated status in a different way or
allowing unpopulated matviews to be used in queries.  I'll hold off
on pushing it until we decide.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***
*** 230,236  lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
--- 230,242 
  	 *
  	 * Don't even think about it unless we have a shot at releasing a goodly
  	 * number of pages.  Otherwise, the time taken isn't worth it.
+ 	 *
+ 	 * Leave a populated materialized view with at least one page.
  	 */
+ 	if (onerel->rd_rel->relkind == RELKIND_MATVIEW &&
+ 		vacrelstats->nonempty_pages == 0)
+ 		vacrelstats->nonempty_pages = 1;
+ 
  	possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
  	if (possibly_freeable > 0 &&
  		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***
*** 430,432  SELECT * FROM matview_unit_false;
--- 430,452 
  (0 rows)
  
  DROP MATERIALIZED VIEW matview_unit_false;
+ -- test that vacuum does not make empty matview look unpopulated
+ CREATE TABLE hoge (i int);
+ INSERT INTO hoge VALUES (generate_series(1,10));
+ CREATE MATERIALIZED VIEW hogeview AS SELECT * FROM hoge WHERE i % 2 = 0;
+ CREATE INDEX hogeviewidx ON hogeview (i);
+ DELETE FROM hoge;
+ REFRESH MATERIALIZED VIEW hogeview;
+ SELECT * FROM hogeview WHERE i < 10;
+  i 
+ ---
+ (0 rows)
+ 
+ VACUUM ANALYZE;
+ SELECT * FROM hogeview WHERE i < 10;
+  i 
+ ---
+ (0 rows)
+ 
+ DROP TABLE hoge CASCADE;
+ NOTICE:  drop cascades to materialized view hogeview
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***
*** 136,138  SELECT * FROM matview_unit_false;
--- 136,150 
  REFRESH MATERIALIZED VIEW matview_unit_false;
  SELECT * FROM matview_unit_false;
  DROP MATERIALIZED VIEW matview_unit_false;
+ 
+ -- test that vacuum does not make empty matview look unpopulated
+ CREATE TABLE hoge (i int);
+ INSERT INTO hoge VALUES (generate_series(1,10));
+ CREATE MATERIALIZED VIEW hogeview AS SELECT * FROM hoge WHERE i % 2 = 0;
+ CREATE INDEX hogeviewidx ON hogeview (i);
+ DELETE FROM hoge;
+ REFRESH MATERIALIZED VIEW hogeview;
+ SELECT * FROM hogeview WHERE i < 10;
+ VACUUM ANALYZE;
+ SELECT * FROM hogeview WHERE i < 10;
+ DROP TABLE hoge CASCADE;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Josh Berkus
Kevin,

 The reason was that the start of CF4 was deemed too late in the
 development cycle to be trying to design what that should look
 like.  No sooner had you suggested that one column than someone
 suggested two others which might also be useful, and it seemed to

Yeah, I'm just pointing out that we *already had* this discussion, so
there isn't any point in having it again.

 That was deemed to be incompatible with unlogged matviews, which
 some didn't want to give up in this initial release.

Huh?  Unlogged tables don't go in pg_class?

 Basically, what this patch aims at is more or less what some other
 databases had in their initial releases of materialized views 10 to
 20 years ago.  Other products have built on those foundations with
 each major release.  I was hoping we could do the same.  We are not
 going to reach parity on this with any other major database product
 in one release, or probably even two or three.

Yep.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:

 That was deemed to be incompatible with unlogged matviews, which
 some didn't want to give up in this initial release.

 Huh?  Unlogged tables don't go in pg_class?

Sorry if I wasn't clear.  I try to do incremental development --
get one part working and then go on to the next.  I had stored the
populated state in pg_class, although under what, in retrospect,
was a bad name -- I had a bool called relisvalid.  It should have
been relispopulated or a bit in a flag byte.  Anyway, as I
proceeded to implement the unlogged feature I discovered that the
heap is truncated at recovery time based on the init fork, before
the server is at the point where we can access the system tables. 
So then I tried to keep the pg_class state but populate it when an
unlogged matview was opened, based on using the special space in
the first page of the init fork, updating pg_class if that was
found to indicate a truncated heap.  I was roundly criticized for
keeping this state in two places -- both the heap and pg_class,
and urged to keep it only in the heap, because only keeping it in
pg_class would not allow the proper recovery for an unlogged
matview.  So I responded to that feedback with the current
implementation.

Clearly we would need to change how we do recovery of unlogged
relations to both allow unlogged matviews and keep the populated
state in pg_class.  I don't think we can really make the "two
places" technique work, where the recovery state of the unlogged
matview is used to trigger a pg_class change.  For one thing, it
was really messy -- far more so than current code.  For another
thing, the matview would show as good until it was first opened,
which was nasty.

Kinda verbose, but I hope that makes it more clear.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Martijn van Oosterhout
On Tue, Apr 30, 2013 at 01:05:30PM -0400, Greg Smith wrote:
 I re-ran the benchmark that's had me most worried against the
 committed code and things look good so far.  I've been keeping quiet
 because my tests recently have all agreed with what Ants already
 described.  This is more a confirmation summary than new data.

I came across this today: Data Integrity Extensions, basically a
standard for having an application calculate a checksum of a block and
submit it together with the block so that the disk can verify that
the block it is writing matches what the application sent.

It appears SCSI has standardised on a CRC-16 checksum with polynomial
0x18bb7.

http://www.t10.org/ftp/t10/document.03/03-290r0.pdf
https://oss.oracle.com/~mkp/docs/dix-draft.pdf

Not directly relevant to PostgreSQL now, but possibly in the future.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-30 Thread Jeff Davis
On Tue, 2013-04-30 at 08:34 -0400, Robert Haas wrote:
 Uh, wait a minute.  I think this is completely wrong.  The buffer is
 LOCKED for this entire sequence of operations.  For a checkpoint to
 happen, it's got to write every buffer, which it will not be able to
 do for so long as the buffer is locked.

I went back and forth on this, so you could be right, but here was my
reasoning:

I was worried because SyncOneBuffer checks whether it needs writing
without taking a content lock, so the exclusive lock doesn't help. That
makes sense, because you don't want a checkpoint to have to get a
content lock on every buffer in the buffer pool. But it also means we
need to follow the rules laid out in transam/README and dirty the pages
before writing WAL.

 The effect of the change to lazy_scan_heap is to force the buffer to
 be written even if we're only updating the visibility map page.
 That's a bad idea and should be reverted.

The only time the VM and the data page are out of sync during vacuum is
after a crash, right? If that's the case, I didn't think it was a big
deal to dirty one extra page (should be extremely rare). Am I missing
something?

The reason I removed that special case was just code
complexity/readability. I tried preserving the previous behavior, and
it's not so bad, but it seemed unnecessarily ugly for the benefit of a
rare case.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Incomplete description of pg_start_backup?

2013-04-30 Thread Dmitry Koterov
I think that at
http://www.postgresql.org/docs/current/static/functions-admin.html and
http://www.postgresql.org/docs/current/static/continuous-archiving.html two
important points on how pg_start_backup() works are missing:

1. After pg_start_backup() and until pg_stop_backup() VACUUM is denied (e.g.
autovacuum is turned off), so new data is always appended to data
files and is never written into the middle of them. This allows archiving
the data directory using any copying tools (rsync, tar, cp etc.). If you
forget to call pg_stop_backup() by accident, data files will grow forever.
So pg_start_backup() switches the database to an append-only mode which is
safe to back up without stopping (data files temporarily become
append-only; WAL logs are always append-only).

2. After pg_start_backup() and till pg_stop_backup() full_page_writes is
forced to be ON.

BTW, are these points fully correct? If yes, it would be great to update the
documentation, because on Google there are a lot of questions about how
exactly backup with pg_start_backup() works and why cp, tar etc. are safe
after pg_start_backup(), but no clear answers. If not, could you please give
a couple of comments on all of this?


Re: [HACKERS] The missing pg_get_*def functions

2013-04-30 Thread Noah Misch
On Tue, Apr 30, 2013 at 04:47:58AM +0100, Joel Jacobson wrote:
 On Tue, Apr 30, 2013 at 12:46 AM, Noah Misch n...@leadboat.com wrote:
  Those existing functions give a mostly-SnapshotNow picture of their objects,
  but an sql-language implementation would give a normally-snapshotted 
  picture.
 
 I assume "normally" is better than "mostly"?

Inconsistent snapshot usage (mostly-anything) is bad, especially within the
confines of a single function.  pg_get_viewdef() grabs pg_rewrite.ev_action
using an MVCC snapshot, then interprets it relative to SnapshotNow.  That's a
paradox waiting to happen.  (Granted, the same could be said for *any* use of
SnapshotNow in the absence of locking.)

Whether all-normally-snapshotted (all-MVCC) beats all-SnapshotNow is less
clear-cut, but I tentatively think it would.

  That status quo is perhaps more an implementation accident than a designed
  behavior.  Before proliferating functions like this, we should pick a 
  snapshot
  policy and stick to it.  See the block comment at the top of pg_dump.c.
 
 I didn't think there would be any reason to migrate the existing
 functions from C to SQL, but this snapshot problem seems like a good
 motive to do it. If they would all be written in SQL, the snapshot
 problem would be solved, right?

Nominally yes, but not because of difficulty using a normal MVCC snapshot from
C.  It's just that the sql PL uses nothing but normal MVCC snapshots.  So,
this isn't a sound reason to translate C to SQL.  In any case, I can't fathom
a prudent 100% sql implementation of pg_get_viewdef(), which needs to deparse
arbitrary queries.

  Note also that minor releases can readily fix bugs in C-language functions,
  but we have no infrastructure to update sql-language functions after initdb.
  That flexibility is unfortunate to lose, particularly for something that
  pg_dump depends on.  Now, the right thing is probably to design a mechanism
  for applying simple catalog updates in concert with a minor release.  In the
  mean time, its absence puts the sql PL at a nontrivial disadvantage here.
 
 What do you mean by "infrastructure"? Isn't it as simple as CREATE
 OR REPLACE FUNCTION? As long as the interface the pg_get_*def
 functions don't change, I cannot see how simply replacing the existing
 functions in a minor release upgrade could do any harm.

Stephen described the sort of infrastructure I had in mind.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Greg Smith

On 4/30/13 5:26 PM, Martijn van Oosterhout wrote:

I came across this today: Data Integrity Extensions, basically a
standard for having an application calculate a checksum of a block and
submit it together with the block so that the disk can verify that
the block it is writing matches what the application sent.

It appears SCSI has standardised on a CRC-16 checksum with polynomial
0x18bb7 .


To be pedantic for a minute (for the first time *ever* on pgsql-hackers) 
it's not quite all of SCSI.  iSCSI has joined btrfs by settling on 
CRC-32C with the Castagnoli polynomial, as mentioned in that first 
reference.  CRC-32C is also the one with the SSE4.2 instructions to help 
too.  All the work around the T10/Data Integrity Field standard that's 
going on is nice.  I think it's going to leave a lot of PostgreSQL users 
behind though.  I'd bet a large sum of money that five years from now, 
there will still be more than 10X as many PostgreSQL servers on EC2 as 
on T10/DIF capable hardware.


I feel pretty good that this new FNV-1a implementation is a good 
trade-off spot that balances error detection and performance impact.  If 
you want a 16-bit checksum that seems ready for beta today, we can't do 
much better.  Fletcher-16 had too many detection holes, the WAL checksum 
was way too expensive.  Optimized FNV-1a is even better than unoptimized 
Fletcher-16 without as many detection issues.  Can't even complain about 
the code bloat for this part either--checksum.c is only 68 lines if you 
take out its documentation.


The WAL logging of hint bits is where the scary stuff to me for this 
feature has always been at.  My gut feel is that doing that needed to 
start being available as an option anyway.  Just this month we've had 
two customer issues pop up where we had to look for block differences 
between a master and a standby.  The security update forced some normal 
update stragglers to where they now have the 9.1.6 index corruption fix, 
and we're looking for cases where standby indexes might have been 
corrupted by it.  In this case the comparisons can just avoid anything 
but indexes, so hint bits are thankfully not involved.


But having false positives pop out of comparing a master and standby due 
to hint bits makes this sort of process much harder in general.  Being 
able to turn checksums on, and then compare more things between master 
and standby without expecting any block differences, that will make both 
routine quality auditing and forensics of broken clusters so much easier.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Noah Misch
Orthogonal to this thread, but:

On Tue, Apr 30, 2013 at 06:39:09PM -0400, Greg Smith wrote:
 The WAL logging of hint bits is where the scary stuff to me for this  
 feature has always been at.  My gut feel is that doing that needed to  
 start being available as an option anyway.  Just this month we've had  
 two customer issues pop up where we had to look for block differences  
 between a master and a standby.  The security update forced some normal  
 update stragglers to where they now have the 9.1.6 index corruption fix,  
 and we're looking for cases where standby indexes might have been  
 corrupted by it.  In this case the comparisons can just avoid anything  
 but indexes, so hint bits are thankfully not involved.

B-tree indexes have hints; see callers of ItemIdMarkDead().

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Incomplete description of pg_start_backup?

2013-04-30 Thread Jeff Janes
On Tue, Apr 30, 2013 at 3:24 PM, Dmitry Koterov dmi...@koterov.ru wrote:

 I think that at
 http://www.postgresql.org/docs/current/static/functions-admin.html and
 http://www.postgresql.org/docs/current/static/continuous-archiving.html two
 important points on how pg_start_backup() works are missing:

 1. After pg_start_backup() and until pg_stop_backup() VACUUM is denied
 (e.g. autovacuum is turned off), so new data is always appended to data
 files and is never written into the middle of them.


This is not the case.  Autovacuum continues to run during the backup.



 This allows archiving the data directory using any copying tools (rsync,
 tar, cp etc.). If you forget to call pg_stop_backup() by accident, data
 files will grow forever. So pg_start_backup() switches the database to an
 append-only mode which is safe to back up without stopping (data files
 temporarily become append-only; WAL logs are always append-only).


No, it doesn't work that way.  I don't know why appending would be any
safer than normal updates would be anyway.  WAL replay fixes up any
problems that might arise.


 2. After pg_start_backup() and till pg_stop_backup() full_page_writes is
 forced to be ON.


Effectively yes, this is documented in one of your links above (and is one
of the reasons vacuuming during the backup is not a problem)
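
For reference, the documented low-level sequence is simply (the label
is arbitrary):

    SELECT pg_start_backup('nightly');
    -- copy the data directory with rsync/tar/cp while the server,
    -- including autovacuum, keeps running normally
    SELECT pg_stop_backup();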

Cheers,

Jeff


Re: [HACKERS] Substituting Checksum Algorithm (was: Enabling Checksums)

2013-04-30 Thread Andres Freund
On 2013-04-30 18:39:09 -0400, Greg Smith wrote:
 The WAL logging of hint bits is where the scary stuff to me for this feature
 has always been at.  My gut feel is that doing that needed to start being
 available as an option anyway.  Just this month we've had two customer
 issues pop up where we had to look for block differences between a master
 and a standby.  The security update forced some normal update stragglers to
 where they now have the 9.1.6 index corruption fix, and we're looking for
 cases where standby indexes might have been corrupted by it.  In this case
 the comparisons can just avoid anything but indexes, so hint bits are
 thankfully not involved.
 
 But having false positives pop out of comparing a master and standby due to
 hint bits makes this sort of process much harder in general.  Being able to
 turn checksums on, and then compare more things between master and standby
 without expecting any block differences, that will make both routine quality
 auditing and forensics of broken clusters so much easier.

I don't think the current implementation helps you with that. We only
log the first hint bit set on a page after a checkpoint, so you will
still get inconsistent bits set after that. You might end up with
somewhat fewer inconsistencies, but not few enough to weed them out
manually.
c.f. MarkBufferDirtyHint() and XLogSaveBufferForHint().
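
For what it's worth, the kind of per-block comparison being described
can be sketched like this (assuming the pageinspect extension; the
relation name is a placeholder, and as noted above, hint-bit-only
differences will still show up as false positives):

CREATE EXTENSION IF NOT EXISTS pageinspect;
-- run the same query on master and standby, then diff the two outputs
SELECT blkno,
       md5(get_raw_page('my_index', blkno::int)) AS block_md5
FROM generate_series(0,
       pg_relation_size('my_index') / current_setting('block_size')::int - 1
     ) AS blkno;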

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Should pg_upgrade use --quote-all-identifiers?

2013-04-30 Thread Tom Lane
Seems like this might be a good idea to avoid the type of failure
exhibited in bug #8128.  We don't care too much about the readability
of the dump script created during an upgrade, so it's hard to see a
downside.
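
The sort of hazard full quoting avoids looks like this (illustrative
only, not the actual bug #8128 case): an identifier that happens to
match a keyword parses only when quoted.

CREATE TABLE "select" (x int);  -- fine, quoted
CREATE TABLE select (x int);    -- syntax error, unquoted reserved word
SELECT quote_ident('select');   -- returns "select": quoted because needed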

regards, tom lane




Re: [HACKERS] Should pg_upgrade use --quote-all-identifiers?

2013-04-30 Thread Greg Stark
On Wed, May 1, 2013 at 12:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Seems like this might be a good idea to avoid the type of failure
 exhibited in bug #8128.  We don't care too much about the readability
 of the dump script created during an upgrade, so it's hard to see a
 downside.

Huh. I thought you were talking about quoting identifiers in an SQL
dump. But you're not; you're talking about quoting identifiers in SQL
being sent to the server during the pg_dump process. Why did pg_dump
ever not quote all such identifiers?


-- 
greg




Re: [HACKERS] Should pg_upgrade use --quote-all-identifiers?

2013-04-30 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 Huh. I thought you were talking about quoting identifiers in an SQL
 dump. But you're not, you're talking about quoting identifiers in sql
 being sent to the server during the pg_dump process. Why did pg_dump
 ever not quote all such identifiers?

Well, readability of those commands is worth something too, but in any
case the short answer is that pg_dump has only one quote-an-identifier
function, not different ones for server commands and final output.

regards, tom lane




Re: [HACKERS] Remaining beta blockers

2013-04-30 Thread Greg Stark
On Tue, Apr 30, 2013 at 10:19 PM, Kevin Grittner kgri...@ymail.com wrote:
 Clearly we would need to change how we do recovery of unlogged
 relations to both allow unlogged matviews and keep the populated
 state in pg_class.  I don't think we can really make the two
 places technique work, where the recovery state of the unlogged
 matview is used to trigger a pg_class change.  For one thing, it
 was really messy -- far more so than current code.  For another
 thing, the matview would show as good until it was first opened,
 which was nasty.

Can I ask what the design goal of unlogged relations is? Are they just
an optimization so you can load lots of data without generating piles
of WAL? Or are they intended to generate zero WAL traffic, so that they
can be populated on a standby, for example, or outside a transaction
entirely?

Because if it's the former, then I don't see any problem with
generating WAL traffic to update the pg_class entry. If they have to
satisfy the latter, that gets a bit trickier.
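
Assuming the former, the appeal is easy to demonstrate (a sketch; the
table and row count are made up):

CREATE UNLOGGED TABLE staging (id int, payload text);
-- the row data generates no WAL, so bulk loads are cheap; the price is
-- that the table is truncated during crash recovery and is not
-- available on standbys
INSERT INTO staging SELECT g, 'row ' || g FROM generate_series(1, 100000) g;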


-- 
greg




Re: [HACKERS] Should pg_upgrade use --quote-all-identifiers?

2013-04-30 Thread Greg Stark
On Wed, May 1, 2013 at 2:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, readability of those commands is worth something too, but in any
 case the short answer is that pg_dump has only one quote-an-identifier
 function, not different ones for server commands and final output.

Well, for the final output, one reason it's nice not to quote is that
it makes it easier to use the SQL generated by --inserts on a
non-Postgres database. Mainly I'm thinking of the case-folding issue,
but some databases also use strange quoting rules.
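
Roughly what I mean, with illustrative --column-inserts-style output
(not verbatim pg_dump output); the unquoted form is far easier to feed
to another DBMS:

INSERT INTO "public"."users" ("id", "name") VALUES (1, 'alice');
INSERT INTO users (id, name) VALUES (1, 'alice');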

-- 
greg




Re: [HACKERS] [COMMITTERS] pgsql: Add regression test for bug fixed by recent refactoring.

2013-04-30 Thread Tom Lane
Kevin Grittner kgri...@postgresql.org writes:
 Add regression test for bug fixed by recent refactoring.
 Test case by Andres Freund for bug fixed by Tom Lane's refactoring
 in commit 5194024d72f33fb209e10f9ab0ada7cc67df45b7

Hm, that actually has got nothing much to do with matviews:

regression=# create view vv1 as select false;
CREATE VIEW
regression=# create view vv2 as select false where false;
CREATE VIEW
regression=# create user joe;
CREATE ROLE
regression=# \c - joe
You are now connected to database "regression" as user "joe".
regression=> select * from vv1;
ERROR:  permission denied for relation vv1
regression=> select * from vv2;
 bool 
------
(0 rows)

Of course the select from vv2 should fail as well, but it doesn't,
because vv2 is nowhere to be seen in the rangetable passed to the
executor.  I think the planner is probably trashing the rangetable
once it realizes that the query is completely dummy; but this is wrong
because we need to leave view rangetable entries behind for permissions
checks, even when they're unreferenced in the finished plan.

regards, tom lane

