Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Markus Wanner
On 11/16/2012 03:05 PM, Andres Freund wrote:
 I'd like to provide a comparison of the proposed change set format to
 the one used in Postgres-R.
 
 Uh, sorry to interrupt you right here, but thats not the proposed
 format ;)

Understood. Sorry, I didn't mean to imply that. It's pretty obvious to
me that this is more of a human readable format and that others,
including binary formats, can be implemented. I apologize for the bad
wording of a proposed format, which doesn't make that clear.

 The Postgres-R approach is independent of WAL and its format, where as
 the approach proposed here clearly is not. Either way, there is a
 certain overhead - however minimal it is - which the former adds to the
 transaction processing itself, while the later postpones it to a
 separate XLogReader process. If there's any noticeable difference, it
 might reduce latency in case of asynchronous replication, but can only
 increase latency in the synchronous case. As far as I understood Andres,
 it was easier to collect the additional meta data from within the
 separate process.
 
 There also is the point that if you do the processing inside heap_* you
 need to make sure the replication targeted data is safely received 
 fsynced away, in our case thats not necessary as WAL already provides
 crash safety, so should the replication connection break you can simply
 start from the location last confirmed as being safely sent.

In the case of Postgres-R, the safely received part isn't really
handled at the change set level at all. And regarding the fsync
guarantee: you can well use the WAL to provide that, without basing
change set generation on in. In that regard, Postgres-R is probably the
more general approach: you can run Postgres-R with WAL turned off
entirely - which may well make sense if you take into account the vast
amount of cloud resources available, which don't have a BBWC. Instead of
WAL, you can add more nodes at more different locations. And no, you
don't want your database to ever go down, anyway  :-)

 In summary, I'd say that Postgres-R is an approach specifically
 targeting and optimized for multi-master replication between Postgres
 nodes, where as the proposed patches are kept more general.
 
 One major aim definitely was optionally be able to replicate into just
 about any target system, so yes, I certainly agree.

I'm glad I got that correct ;-)

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Markus Wanner
On 11/16/2012 03:14 PM, Andres Freund wrote:
 Whats the data type of the COID in -R?

It's short for CommitOrderId, a 32bit global transaction identifier,
being wrapped-around, very much like TransactionIds are. (In that sense,
it's global, but unique only for a certain amount of time).

 In the patchset the output plugin has enough data to get the old xid and
 the new xid in the case of updates (not in the case of deletes, but
 thats a small bug and should be fixable with a single line of code), and
 it has enough information to extract the primary key without problems.

It's the xmin of the old tuple that Postgres-R needs to get the COID.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 05/14] Add a new relmapper.c function RelationMapFilenodeToOid that acts as a reverse of RelationMapOidToFilenode

2012-11-17 Thread Michael Paquier
On Fri, Nov 16, 2012 at 7:58 PM, Andres Freund and...@2ndquadrant.comwrote:

 Hi,

 On 2012-11-16 13:44:45 +0900, Michael Paquier wrote:
  This patch looks OK.
 
  I got 3 comments:
  1) Why changing the OID of pg_class_tblspc_relfilenode_index from 3171 to
  3455? It does not look necessary.

 Its a mismerge and should have happened in Add a new RELFILENODE
 syscache to fetch a pg_class entry via (reltablespace, relfilenode) but
 it seems I squashed the wrong two commits.
 I had originally used 3171 but that since got used up for lo_tell64...

  2) You should perhaps change the header of RelationMapFilenodeToOid so as
  not mentionning it as the opposite operation of RelationMapOidToFilenode
  but as an operation that looks for the OID of a relation based on its
  relfilenode. Both functions are opposite but independent.

 I described it as the opposite because RelationMapOidToFilenode is the
 relmappers stated goal and RelationMapFilenodeToOid is just some
 side-business.

  3) Both functions are doing similar operations. Could it be possible
  to wrap them in the same central function?

 I don't really see how without making both quite a bit more
 complicated. The amount of if's needed seems to be too large to me.

OK thanks for your answers.
As this patch only adds a new function and is not that much complicated, I
think there is no problem in committing it. The only thing to remove is the
diff in indexing.h. Could someone take care of that?
If other people have additional comments on the ability to perform a
relfileoid-reloid operation for cached maps, of course go ahead.
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Andres Freund
On 2012-11-16 17:19:23 -0800, Jeff Davis wrote:
 On Fri, 2012-11-16 at 16:09 +0100, Andres Freund wrote:
  As far as I understand the code the crash-safety aspects of the
  visibilitymap currently rely on on having the knowledge that ALL_VISIBLE
  has been cleared during a heap_(insert|update|delete). That allows
  management of the visibilitymap without it being xlogged itself which
  seems pretty important to me.

 It looks like the xlrec does contain a cleared all visible flag in it,
 and it uses that to clear the VM as well as PD_ALL_VISIBLE.

I think the point is that to check whether the visibilitymap bit needs
to be unset - and thus locked exlusively - no locks have to be acquired
but those heap_* already has. Given that in a the large amount of cases
ALL_VISIBLE does *not* need to be reset I think that this is a very
important property for concurrency purposes. If you consider the large
amount of pages that are covered by a single visibilitymap page we don't
want more locking in that path...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-17 Thread Amit Kapila
On Friday, November 16, 2012 7:52 PM Cédric Villemain wrote:
 Le vendredi 16 novembre 2012 15:08:30, Amit Kapila a écrit :
  On Thursday, November 15, 2012 8:18 PM Amit kapila wrote:
   On Wednesday, November 14, 2012 12:24 AM Robert Haas wrote:
   On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila
   amit.kap...@huawei.com
  
   wrote:
Uh, no, I don't think that's a good idea.  IMHO, what we should do
 is:
   
1. Read postgresql.conf.auto and remember all the settings we saw.
If
  
   we see something funky like an include directive, barf.
  
2. Forget the value we remembered for the particular setting being
  
   changed.  Instead, remember the user-supplied new value for that
   parameter.
  
3. Write a new postgresql.conf.auto based on the information
  
   remembered in steps 1 and 2.
  
   Attached patch contains implementaion for above concept.
   It can be changed to adapt the write file based on GUC variables as
   described by me yesterday or in some other better way.
  
   Currenty to remember and forget, I have used below concept:
   1. Read postgresql.auto.conf in memory.
   2. parse the GUC_file for exact loction of changed variable 3.
   update the changed variable in memory at location found in step-2 4.
   Write the postgresql.auto.conf
  
   Overall changes:
   1. include dir in postgresql.conf at time of initdb 2. new built-in
   function pg_change_conf to change configuration settings 3. write
   file as per logic described above.
  
   Some more things left are:
   1. Transactional capability to command, so that rename of .lock file
   to .auto.conf can be done at commit time.
 
  About transaction capability, I think it will be difficult to
  implement it in transaction block, because during Rollback to
  savepoint it is difficult to rollback (delete the file), as there is
  no track of changes w.r.t Savepoint.
 
 not a problem.
 consider that pseudo code:
 
 begin serializable;
 
 update pg_settings; -- or whatever the name of the object (can include
 creation of a table, etc...)
 
 savepoint...
 
 update pg_settings;
 
 rollback to savepoint;
 
 commit;  -- here the deferred trigger FOR STATEMENT on pg_settings is
 fired and is responsible to write/mv the/a file.
 
 Is there something obvious I'm not seeing ?

I think transaction handling is better with the way you are mentioning.

Here is what I am able to think about your idea:

1. have a system table pg_global_system_settings(key,value)
2. On SQL command execution, insert if the value doesn't exist or update if
already existing.
3. On commit, a deffered trigger will read from table and put all the rows
in a .auto flat file
4. In the deffered trigger, may be we need to use lock for writting to file,
so that 2 backends
   writting same time may not garbled the file. I am not sure if lock is
required or not?



Advantages of this approach:
1. transaction handling can be better.
2. updation of config value row can be easier 

Problem which needs to be thought
Sychronization between flat file .auto.conf and system table
   Case-1
   a. During commit, we write into file (deffered trigger execution)
before marking transaction as commit.
   b. after writting to file, any error or system crash, then table and file
will have different contents.
   Case-2
   a. During commit, we write into file (deffered trigger execution) after
marking transaction as commit.
   b. any error or system crash before write into file can lead to different
contents in table and flat file.

Resolution
May be during recovery, we can try to make table and file consistent, but it
can be tricky.
 


Comparison with Approach I have implemented
1. Because it will be done in serializable isolation, 2 users trying to
modify the same value, will get error.
   However in current approach, user will not get this error.
2. The lock time will be lesser in system table approach but don't think it
will matter because it is a rarely used
   command.

I think, some other people thoughts are also required to see if there is any

deeper design issue which I could not see in this approach and whether it
can clearly score over approach 
with which currently it is implemented(directly operate on a file).


Suggestions/Thoughts?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-17 Thread Amit Kapila
On Saturday, November 17, 2012 3:35 AM Dimitri Fontaine wrote:
 Tom Lane t...@sss.pgh.pa.us writes:
  Have you considered ALTER SYSTEM SET ... ?  We'd talked about that in
  the context of the other patch, but it seems to fit much more
 naturally
  with this one.  Or maybe ALTER GLOBAL SET or ALTER ALL SET.
 
 I would paint that one ALTER SYSTEM SET and the file based one ALTER
 CONFIGURATION SET. No new keyword were armed in that proposal.

One more could be to have built-in function

pg_change_config(level,key,value)

level - PG_NEW_CONNECTION
  - PG_SYTEM_LEVEL

Level will distinguish how and when the value will be used.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Hannu Krosing

On 11/16/2012 02:46 PM, Markus Wanner wrote:

Andres,

On 11/15/2012 01:27 AM, Andres Freund wrote:

In response to this you will soon find the 14 patches that currently
implement $subject.

Congratulations on that piece of work.


I'd like to provide a comparison of the proposed change set format to
the one used in Postgres-R. I hope for this comparison to shed some
light on the similarities and differences of the two projects. As the
author of Postgres-R, I'm obviously biased, but I try to be as neutral
as I can.

...

Let's compare by example:


table replication_example: INSERT: id[int4]:1 somedata[int4]:1 text[varchar]:1
table replication_example: UPDATE: id[int4]:1 somedata[int4]:-1 
text[varchar]:1
table replication_example: DELETE (pkey): id[int4]:1

In Postgres-R, the change sets for these same operations would carry the
following information, in a binary representation:


table replication_example: INSERT: VALUES (1, 1, '1')
table replication_example: UPDATE: PKEY(1) COID(77) MODS('nyn') VALUES(-1)
table replication_example: DELETE: PKEY(1) COID(78)
Is it possible to replicate UPDATEs and DELETEs without a primary key in 
PostgreSQL-R



Hannu


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Markus Wanner
On 11/17/2012 02:30 PM, Hannu Krosing wrote:
 Is it possible to replicate UPDATEs and DELETEs without a primary key in
 PostgreSQL-R

No. There must be some way to logically identify the tuple. Note,
though, that theoretically any (unconditional) unique key would suffice.
In practice, that usually doesn't matter, as you rarely have one or more
unique keys without a primary.

Also note that the underlying index is useful for remote application of
change sets (except perhaps for very small tables).

In some cases, for example for n:m linking tables, you need to add a
uniqueness key that spans all columns (as opposed to a simple index on
one of the columns that's usually required, anyway). I hope for
index-only scans eventually mitigating this issue.

Alternatively, I've been thinking about the ability to add a hidden
column, which can then be used as a PRIMARY KEY without breaking legacy
applications that rely on SELECT * not returning that primary key.

Are there other reasons to want tables without primary keys that I'm
missing?

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Parser - Query Analyser

2012-11-17 Thread Michael Giannakopoulos
Hello guys,

My name is Michail Giannakopoulos and I am a graduate student at University
of Toronto. I have no previous experience in developing a system like
postgreSQL before.

What I am trying to explore is if it is possible to extend postgreSQL in
order to accept queries of the form:

Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM
[database_name];

where att1, att2, att3 are attributes of the relation [database_name] while
output(out1, out2, out3) expresses the output that comes from 'function'
and the fields that this output should have are (out1, out2, out3). What I
mean is that this kind of query should return tuples that obay to the
'output(...)' schema and are produced by processing the original tuples of
'database_name' relation in attributes att1, att2 and att3.

From what I have seen I need to teak both the parser and the query analyser
in order to accept this form of query. However, I do not know how to tweak
these systems. Even worse I do not know where to begin from.

The manual of extending SQL did not helped me and currently I am debugging
postgreSQL source code. I would appreciate if you could provide me with
more resources and correct ways of how to hack inside postgreSQL files and
system. For example, where do I add a new parse node, where should I change
the the grammar.

Thank you very much for all your time and all your help,
Michail


Re: [HACKERS] foreign key locks

2012-11-17 Thread Andres Freund
On 2012-11-16 22:31:51 -0500, Noah Misch wrote:
 On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
  On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
   Andres is on the verge of convincing me that we need to support
   singleton FOR SHARE without multixacts due to performance concerns.
 
  I don't really see any arguments against doing so. We aren't in a that
  big shortage of flags and if we need more than available I think we can
  free some (e.g. XMAX/XMIN_INVALID).

 The patch currently leaves two unallocated bits.  Reclaiming currently-used
 bits means a binary compatibility break.  Until we plan out such a thing,
 reclaimable bits are not as handy as never-allocated bits.  I don't think we
 should lightly expend one of the final two.

Not sure what you mean with a binary compatibility break?
pg_upgrade'ability?

What I previously suggested somewhere was:

#define HEAP_XMAX_SHR_LOCK0x0010
#define HEAP_XMAX_EXCL_LOCK   0x0040
#define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
/*
 * Different from _LOCK_BITS because it doesn't include LOCK_ONLY
 */
#define HEAP_LOCK_MASK(HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)

#define HEAP_XMAX_IS_SHR_LOCKED(tup) \
(((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK)
#define HEAP_XMAX_IS_EXCL_LOCKED(tup) \
(((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK)
#define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \
(((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK)

It makes checking for locks sightly more more complicated, but its not
too bad...

   It
   would be useful for more people to chime in here: is FOR SHARE an
   important case to cater for?  I wonder if using FOR KEY SHARE (keep
   performance characteristics, but would need to revise application code)
   would satisfy Andres' users, for example.
 
  It definitely wouldn't help in the cases I have seen because the point
  is to protect against actual content changes of the rows, not just the
  keys.
  Note that you actually need to use explicit FOR SHARE/UPDATE for
  correctness purposes in many scenarios unless youre running in 9.1+
  serializable mode. And that cannot be used in some cases (try queuing
  for example) because the rollback rates would be excessive.

 I agree that tripling FOR SHARE cost is risky.  Where is the added cost
 concentrated?  Perchance that multiple belies optimization opportunities.

Good question, let me play a bit.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Hannu Krosing

On 11/17/2012 03:00 PM, Markus Wanner wrote:

On 11/17/2012 02:30 PM, Hannu Krosing wrote:

Is it possible to replicate UPDATEs and DELETEs without a primary key in
PostgreSQL-R

No. There must be some way to logically identify the tuple.
It can be done as selecting on _all_ attributes and updating/deleting 
just the first matching row


create cursor ...
select from t ... where t.* = ()
fetch one ...
delete where current of ...

This is on distant (round 3 or 4) roadmap for this work, just was 
interested

if you had found any better way of doing this :)

Hannu


Note,
though, that theoretically any (unconditional) unique key would suffice.
In practice, that usually doesn't matter, as you rarely have one or more
unique keys without a primary.

Also note that the underlying index is useful for remote application of
change sets (except perhaps for very small tables).

In some cases, for example for n:m linking tables, you need to add a
uniqueness key that spans all columns (as opposed to a simple index on
one of the columns that's usually required, anyway). I hope for
index-only scans eventually mitigating this issue.

Alternatively, I've been thinking about the ability to add a hidden
column, which can then be used as a PRIMARY KEY without breaking legacy
applications that rely on SELECT * not returning that primary key.

Are there other reasons to want tables without primary keys that I'm
missing?

Regards

Markus Wanner




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parser - Query Analyser

2012-11-17 Thread David Johnston
On Nov 17, 2012, at 9:18, Michael Giannakopoulos miccagi...@gmail.com wrote:

 Hello guys,
 
 My name is Michail Giannakopoulos and I am a graduate student at University 
 of Toronto. I have no previous experience in developing a system like 
 postgreSQL before.
 
 What I am trying to explore is if it is possible to extend postgreSQL in 
 order to accept queries of the form:
 
 Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM 
 [database_name];
 

Anything is possible but what you are trying to do makes little sense generally 
and would take a tremendous amount of work to be done in PostgreSQL.  The two 
main limitations are that you are creating a entirely new query language format 
and that the name of the database is constant and determined at the time of 
connection to the database.

From a practical perspective I do not believe it (as written exactly above) 
can done without breaking existing functionality and/or introducing 
ambiguities.

As I am not a PostgreSQL developer myself I cannot be of much more help but 
ISTM that providing more why and less what would get you better advice.  As to 
learning how to contribute to the project I will let others point you to the 
existing resources that are out there.  It would, however, probably help to 
explain what skills and background you already posses.

David J.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, put pg_temp into the documentation's index

2012-11-17 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Fri, 2012-09-28 at 11:10 -0500, Karl O. Pinc wrote:
 pg_temp-toindex.patch
 Puts pg_temp into the index of the docs.

 But there is no object called pg_temp.  It always pg_temp_
 something.  How should that be indexed?

We do replaceable/ in a lot of places, and that seems
serviceable enough, at least in output formats where the  can be
rendered differently from plain text.  I don't remember though whether
the sgml index infrastructure allows markup in an index item.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-17 Thread Noah Misch
On Sat, Nov 17, 2012 at 03:20:20PM +0100, Andres Freund wrote:
 On 2012-11-16 22:31:51 -0500, Noah Misch wrote:
  On Fri, Nov 16, 2012 at 05:31:12PM +0100, Andres Freund wrote:
   On 2012-11-16 13:17:47 -0300, Alvaro Herrera wrote:
Andres is on the verge of convincing me that we need to support
singleton FOR SHARE without multixacts due to performance concerns.
  
   I don't really see any arguments against doing so. We aren't in a that
   big shortage of flags and if we need more than available I think we can
   free some (e.g. XMAX/XMIN_INVALID).
 
  The patch currently leaves two unallocated bits.  Reclaiming currently-used
  bits means a binary compatibility break.  Until we plan out such a thing,
  reclaimable bits are not as handy as never-allocated bits.  I don't think we
  should lightly expend one of the final two.
 
 Not sure what you mean with a binary compatibility break?
 pg_upgrade'ability?

Yes.  If we decide HEAP_XMIN_INVALID isn't helping, we can stop adding it to
tuples anytime.  Old tuples may continue to carry the bit, with no particular
deadline for their demise.  To reuse that bit in the mean time, we'll need to
prove that no tuple writable by PostgreSQL 8.3+ will get an unacceptable
interpretation under the new meaning of the bit.  Alternately, build the
mechanism to prove that all such old bits are gone before using the bit in the
new way.  This keeps HEAP_MOVED_IN and HEAP_MOVED_OFF unavailable today.

 What I previously suggested somewhere was:
 
 #define HEAP_XMAX_SHR_LOCK0x0010
 #define HEAP_XMAX_EXCL_LOCK   0x0040
 #define HEAP_XMAX_KEYSHR_LOCK (HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
 /*
  * Different from _LOCK_BITS because it doesn't include LOCK_ONLY
  */
 #define HEAP_LOCK_MASK(HEAP_XMAX_SHR_LOCK|HEAP_XMAX_EXCL_LOCK)
 
 #define HEAP_XMAX_IS_SHR_LOCKED(tup) \
 (((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_SHR_LOCK)
 #define HEAP_XMAX_IS_EXCL_LOCKED(tup) \
 (((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_EXCL_LOCK)
 #define HEAP_XMAX_IS_KEYSHR_LOCKED(tup) \
 (((tup)-t_infomask  HEAP_LOCK_BITS) == HEAP_XMAX_KEYSHR_LOCK)
 
 It makes checking for locks sightly more more complicated, but its not
 too bad...

Agreed; that seems reasonable.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Hannu Krosing

On 11/17/2012 03:00 PM, Markus Wanner wrote:

On 11/17/2012 02:30 PM, Hannu Krosing wrote:

Is it possible to replicate UPDATEs and DELETEs without a primary key in
PostgreSQL-R

No. There must be some way to logically identify the tuple. Note,
though, that theoretically any (unconditional) unique key would suffice.
In practice, that usually doesn't matter, as you rarely have one or more
unique keys without a primary.

...

Are there other reasons to want tables without primary keys that I'm
missing?

Nope. The only place a table without a primary key would be needed is a 
log table, but as these are (supposed to be) INSERT-only this is not a 
problem for them.


Hannu


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parser - Query Analyser

2012-11-17 Thread Tom Lane
Michael Giannakopoulos miccagi...@gmail.com writes:
 What I am trying to explore is if it is possible to extend postgreSQL in
 order to accept queries of the form:

 Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM
 [database_name];

 where att1, att2, att3 are attributes of the relation [database_name] while
 output(out1, out2, out3) expresses the output that comes from 'function'
 and the fields that this output should have are (out1, out2, out3).

You're not being terribly clear about what you intend this to mean,
but the standard interpretation of AS is that it just provides a column
renaming and doesn't for instance change datatypes.  If that's what you
have in mind then it can be done today using AS in the FROM clause:

select * from foo() AS output(out1, out2, ...);

That doesn't allow passing data from a table to the function, but as of
HEAD we have LATERAL, so you could do

select output.* from tab, LATERAL foo(att1, att2) AS output(out1, out2, ...);

If you really insist on doing the renaming within a single composite
column in the SELECT output list then you're going to have a lot of
issues.  Column name aliases are normally only associated with RTEs
(FROM-list entries) and SELECT output columns.  Column names for columns
of a composite data type are properties of the type and so are out of
the reach of AS-renaming in the current system design.  I think you'd
have to cons up an anonymous record type and treat the AS as an implicit
cast to that type.  Seems like an awful lot of work in order to have a
nonstandard way to do something that can be done already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2012-11-17 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 You could make that same claim about plain views, but in point of
 fact the demand for making them work in COPY has been minimal.
 So I'm not convinced this is an essential first-cut feature.
 We can always add it later.

 Of course.  I just had the impression that we could support COPY FROM by
 *deleting* a couple lines from Kevin's patch, rather than it being extra
 work.

Even if it happens to be trivial in the current patch, it's an added
functional requirement that we might later regret having cavalierly
signed up for.  And, as noted upthread, relations that support only
one direction of COPY don't exist at the moment; that would be adding
an asymmetry that we might later regret, too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parser - Query Analyser

2012-11-17 Thread Любен Каравелов
- Цитат от Michael Giannakopoulos (miccagi...@gmail.com), на 17.11.2012 в 
16:18 -

 Hello guys,
 
 My name is Michail Giannakopoulos and I am a graduate student at University
 of Toronto. I have no previous experience in developing a system like
 postgreSQL before.
 
 What I am trying to explore is if it is possible to extend postgreSQL in
 order to accept queries of the form:
 
 Select function(att1, att2, att3) AS output(out1, out2, ..., outk) FROM
 [database_name];
 

Why invent non-standard syntax for something that you could do in SQL. 

You could try something like this:

SELECT
(m.f).f_out_name1 AS out1,
(m.f).f_out_name2 AS out2,
(m.f).f_out_name3 AS out3
FROM  (
SELECT f(att1,att2,att3)  FROM input_table_name
) AS m;

Best regards

--
Luben Karavelov

Re: [HACKERS] foreign key locks

2012-11-17 Thread Andres Freund
  I agree that tripling FOR SHARE cost is risky.  Where is the added cost
  concentrated?  Perchance that multiple belies optimization opportunities.

 Good question, let me play a bit.

Ok, I benchmarked around and from what I see there is no single easy
target.
The biggest culprits I could find are:
1. higher amount of XLogInsert calls per transaction (visible
in pgbench -t instead of -T mode while watching the WAL volume)
2. Memory allocations in GetMultiXactIdMembers
3. Memory allocations in mXactCachePut
 a) cache entry itself
 b) the cache context
4. More lwlocks acquisitions

We can possibly optimize a bit with 2) by using a static buffer for
common member sizes, but thats not going to buy us too much...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, put pg_temp into the documentation's index

2012-11-17 Thread Karl O. Pinc
On 11/17/2012 12:19:02 AM, Peter Eisentraut wrote:
 On Fri, 2012-09-28 at 11:10 -0500, Karl O. Pinc wrote:
  pg_temp-toindex.patch
  Puts pg_temp into the index of the docs.
 
 But there is no object called pg_temp.  It always pg_temp_
 something.  How should that be indexed?

My thought is not to index the db object; it isn't
particularly interesting to a user.  Instead what's
indexed is the token pg_temp, used when
setting search_path.   The utility of the token is
explained in several places in the docs.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v3 - comparison to Postgres-R change set format

2012-11-17 Thread Markus Wanner
Hannu,

On 11/17/2012 03:40 PM, Hannu Krosing wrote:
 On 11/17/2012 03:00 PM, Markus Wanner wrote:
 On 11/17/2012 02:30 PM, Hannu Krosing wrote:
 Is it possible to replicate UPDATEs and DELETEs without a primary key in
 PostgreSQL-R
 No. There must be some way to logically identify the tuple.
 It can be done as selecting on _all_ attributes and updating/deleting
 just the first matching row
 
 create cursor ...
 select from t ... where t.* = ()
 fetch one ...
 delete where current of ...

That doesn't sound like it could possibly work for Postgres-R. At least
not when there can be multiple rows with all the same attributes, i.e.
without a unique key constraint over all columns.

Otherwise, some nodes could detect two concurrent UPDATES as a conflict,
while other nodes select different rows and don't handle it as a conflict.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add big fat caution to pg_restore docs regards partial db restores

2012-11-17 Thread Karl O. Pinc
On 11/17/2012 12:27:14 AM, Peter Eisentraut wrote:
 On Sun, 2012-09-23 at 21:22 -0500, Karl O. Pinc wrote:
  Hi,
  
  Adds a caution to the pg_restore docs
  
  Against git master.
 
 I'm not sure what you are trying to get at here.  It's basically
 saying,
 if you make an incomplete database restore, you might get an
 incomplete
 database.  Is there any specific failure scenario that we should
 address?

Basically, no.  It's a reminder of all the various sorts
of inconsistencies that might arise from a partial restore,
not just referential integrity but other integrity constraints
that might be enforced by triggers or the application.
Possibly even manual procedural checks.
The idea is that such a list might direct the attention
of the person doing data recovery to overlooked
integrity issues.

I agree, there's no point in a generic warning.
The warning is only useful if it leads the reader
to do a better job of data recovery.

I would summarize slightly differently, an incomplete
restore can lead to an inconsistent database.

---

One criticism of this patch:

Suggestion for --truncate-tables to pg_restore
https://commitfest.postgresql.org/action/patch_view?id=944

was that because it allowed an incomplete restore
you might get an inconsistent database.  If that's
a problem, and at some level it is, then it's a
problem right now and the only
way to address the problem is to help
the data recovery person.  I thought some sort
of brief checklist of kinds of data integrity might
help, put someplace where it would be seen when needed.

Feel free to reject.  I sent in the patch to
try out the idea.  It's a bit crazy, but I didn't
think too crazy to share with the list.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] array exclusion constraint

2012-11-17 Thread Philip Taylor
CREATE TABLE foo (
   x CHAR(32) PRIMARY KEY,
   y CHAR(32) NOT NULL,
   EXCLUDE USING gist ((ARRAY[x, y]) WITH )
);

ERROR:  data type character[] has no default operator class for access method 
gist
HINT:  You must specify an operator class for the index or define a default 
operator class for the data type.

Neither gist nor gin work. You can do that with integers using the intarray 
extension module.
Could you please suggest me a clean way to achieve the same result (using the 
char data type, not integers)?
Probably I could create a specific operator class or some other workaround, but 
the intarray implementation looks a bit complex.
Someone so kind to point me in the right direction?

Of course the following produces the same error:

CREATE TABLE bar (
   x VARCHAR[] NOT NULL,
   EXCLUDE USING gist (x WITH )
);

Re: [HACKERS] foreign key locks

2012-11-17 Thread Noah Misch
On Sat, Nov 17, 2012 at 05:07:18PM +0100, Andres Freund wrote:
   I agree that tripling FOR SHARE cost is risky.  Where is the added cost
   concentrated?  Perchance that multiple belies optimization opportunities.
 
  Good question, let me play a bit.
 
 Ok, I benchmarked around and from what I see there is no single easy
 target.
 The biggest culprits I could find are:
 1. higher amount of XLogInsert calls per transaction (visible
 in pgbench -t instead of -T mode while watching the WAL volume)
 2. Memory allocations in GetMultiXactIdMembers
 3. Memory allocations in mXactCachePut
  a) cache entry itself
  b) the cache context
 4. More lwlocks acquisitions
 
 We can possibly optimize a bit with 2) by using a static buffer for
 common member sizes, but thats not going to buy us too much...

In that case, +1 for your proposal to prop up FOR SHARE.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-17 Thread Fujii Masao
On Fri, Nov 16, 2012 at 2:53 AM, Peter Eisentraut pete...@gmx.net wrote:
 Independent of the discussion of how to edit configuration files from
 SQL, I had another idea how many of the use cases for this could be handled.

 We already have the ability to store in pg_db_role_setting configuration
 settings for

 specific user, specific database
 specific user, any database
 any user, specific database

 The existing infrastructure would also support

 any user, any database (= all the time)

 All you'd need is to add

 ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING);

 in postinit.c, and have some SQL command to modify this setting.

 The only thing you couldn't handle that way are SIGHUP settings, but the
 often-cited use cases work_mem, logging, etc. would work.

 There would also be the advantage that pg_dumpall would save these settings.

 Thoughts?

In this approach, we cannot change the settings in the standby?
If yes, I don't like this approach.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Jeff Davis
On Sat, 2012-11-17 at 14:24 +0100, Andres Freund wrote:
 I think the point is that to check whether the visibilitymap bit needs
 to be unset

What's the problem with that? If you already have the VM buffer pinned
(which should be possible if we keep the VM buffer in a longer-lived
structure), then doing the test is almost as cheap as checking
PD_ALL_VISIBLE, because you don't need any locks.

So, the proposal is:
  1. Keep the VM buffer around in a longer-lived structure for scans and
nodeModifyTable.
  2. Replace all tests of PD_ALL_VISIBLE with tests directly against the
VM, hopefully using a buffer that we already have a pin on.

I haven't really dug into this yet, but I don't see any obvious problem.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-17 Thread Fujii Masao
On Sat, Nov 17, 2012 at 10:25 PM, Amit Kapila amit.kap...@huawei.com wrote:
 1. have a system table pg_global_system_settings(key,value)

Do we really need to store the settings in a system table?
Since WAL would be generated when storing the settings
in a system table, this approach seems to prevent us from
changing the settings in the standby.

 2. On SQL command execution, insert if the value doesn't exist or update if
 already existing.

This means that we should implement something like MERGE
command first?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 What's the problem with that? If you already have the VM buffer pinned
 (which should be possible if we keep the VM buffer in a longer-lived
 structure), then doing the test is almost as cheap as checking
 PD_ALL_VISIBLE, because you don't need any locks.

Really?  What about race conditions?  Specifically, I think what you
suggest is likely to be unreliable on machines with weak memory
ordering.  Consider possibility that someone else just changed the VM
bit.  Getting a lock ensures synchronization.  (Yeah, it's possible that
we could use some primitive cheaper than a lock ... but it's not going
to be free.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-17 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 Do we really need to store the settings in a system table?
 Since WAL would be generated when storing the settings
 in a system table, this approach seems to prevent us from
 changing the settings in the standby.

That's a really good point: if we try to move all GUCs into a system
table, there's no way for a standby to have different values; and for
some of them different values are *necessary*.

I think that shoots down this line of thought entirely.  Can we go
back to the plain write a file approach now?  I think a SET
PERSISTENT command that's disallowed in transaction blocks and just
writes the file immediately is perfectly sensible.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, put pg_temp into the documentation's index

2012-11-17 Thread Peter Eisentraut
On Sat, 2012-11-17 at 11:33 -0600, Karl O. Pinc wrote:
 On 11/17/2012 12:19:02 AM, Peter Eisentraut wrote:
  On Fri, 2012-09-28 at 11:10 -0500, Karl O. Pinc wrote:
   pg_temp-toindex.patch
   Puts pg_temp into the index of the docs.
  
  But there is no object called pg_temp.  It always pg_temp_
  something.  How should that be indexed?
 
 My thought is not to index the db object; it isn't
 particularly interesting to a user.  Instead what's
 indexed is the token pg_temp, used when
 setting search_path.   The utility of the token is
 explained in several places in the docs.

Actually, since this is the pg_temp alias for the search path, it is
appropriate.  So committed as is.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Simon Riggs
On 16 November 2012 19:58, Jeff Davis pg...@j-davis.com wrote:
 On Fri, 2012-11-16 at 11:58 -0500, Robert Haas wrote:
  Also, I am wondering about PD_ALL_VISIBLE. It was originally introduced
  in the visibility map patch, apparently as a way to know when to clear
  the VM bit when doing an update. It was then also used for scans, which
  showed a significant speedup. But I wonder: why not just use the
  visibilitymap directly from those places?

 Well, you'd have to look up, lock and pin the page to do that.  I
 suspect that overhead is pretty significant.  The benefit of noticing
 that the flag is set is that you need not call HeapTupleSatisfiesMVCC
 for each tuple on the page: checking one bit in the page header is a
 lot cheaper than calling that function for every tuple.  However, if
 you had to lock and pin a second page in order to check whether the
 page is all-visible, I suspect it wouldn't be a win; you'd probably be
 better off just doing the HeapTupleSatisfiesMVCC checks for each
 tuple.

 That's pretty easy to test. Here's what I got on a 10M record table
 (Some runs got some strangely high numbers around 1700ms, which I assume
 is because it's difficult to keep the data in shared buffers, so I took
 the lower numbers.):

   PD_ALL_VISIBLE:  661ms
   VM Lookup:   667ms
   Neither: 740ms

 Even if pinning the vm buffer were slow, we could keep the pin longer
 during a scan (it seems like the VM API is designed for that kind of a
 use case), so I don't think scans are a problem at all, even if there is
 a lot of concurrency.

The biggest problem with hint bits is SeqScans on a table that ends up
dirtying many pages. Repeated checks against clog and hint bit setting
are massive overheads for the user that hits that, plus it generates
an unexpected surge of database writes. Even without checksums that is
annoying.

ISTM that we should tune that specifically by performing a VM lookup
for next 32 pages (or more), so we reduce the lookups well below 1 per
page. That way the overhead of using the VM will be similar to using
the PD_ALL_VISIBLE. Also, if we pass through a flag to
HeapTupleSateisfies indicating we are not interested in setting hints
on a SeqScan then we can skip individual tuple hints also. If the
whole page becomes visible then we can set the VM.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, put pg_temp into the documentation's index

2012-11-17 Thread Karl O. Pinc
On 11/17/2012 05:10:12 PM, Peter Eisentraut wrote:
 On Sat, 2012-11-17 at 11:33 -0600, Karl O. Pinc wrote:

  
what's
  indexed is the token pg_temp, used when
  setting search_path. 

 Actually, since this is the pg_temp alias for the search path, it is
 appropriate.  So committed as is.

Thanks for the work on this and the other patches you've helped
me out with, and for the larger work on PG of course.
I would feel like I was cluttering the channel if I
sent a thanks each time but I do want to acknowledge
both your help and the work the other Postgres people
do.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Jeff Davis
On Sat, 2012-11-17 at 16:53 -0500, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
  What's the problem with that? If you already have the VM buffer pinned
  (which should be possible if we keep the VM buffer in a longer-lived
  structure), then doing the test is almost as cheap as checking
  PD_ALL_VISIBLE, because you don't need any locks.
 
 Really?  What about race conditions?  Specifically, I think what you
 suggest is likely to be unreliable on machines with weak memory
 ordering.  Consider possibility that someone else just changed the VM
 bit.  Getting a lock ensures synchronization.  (Yeah, it's possible that
 we could use some primitive cheaper than a lock ... but it's not going
 to be free.)

There's already a similar precedent in IndexOnlyNext, which calls
visibilitymap_test with no lock.

I am not authoritative on these kinds of lockless accesses, but it looks
like we can satisfy those memory barrier requirements in the places we
need to. 

Here is my analysis:

Process A (process that clears a VM bit for a data page):
  1. Acquires exclusive lock on data buffer
  2. Acquires exclusive lock on VM buffer
  3. clears VM bit
  4. Releases VM buffer lock
  5. Releases data buffer lock

Process B (process that tests the VM bit for the same data page):
  1. Acquires shared lock (if it's a scan doing a visibility test) or an
exclusive lock (if it's an I/U/D that wants to know whether to clear the
bit or not) on the data buffer.
  2. Tests bit using an already-pinned VM buffer.
  3. Releases data buffer lock.

Process A and B must be serialized, because A takes an exclusive lock on
the data buffer and B takes at least a shared lock on the data buffer.
The only dangerous case is when A happens right before B. So, the
question is: are there enough memory barriers between A-3 and B-2? And I
think the answer is yes. A-4 should act as a write barrier after
clearing the bit, and B-1 should act as a read barrier before reading
the bit.

Let me know if there is a flaw with this analysis.

If not, then I still agree with you that it's not as cheap as testing
PD_ALL_VISIBLE, but I am skeptical that memory-ordering constraints
we're imposing on the CPU are expensive enough to matter in these cases.
If you have a test case in mind that might exercise this, then I will
try to run it (although my workstation is only 4 cores, and the most I
can probably get access to is 16 cores).

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Jeff Davis
On Sat, 2012-11-17 at 19:35 -0500, Simon Riggs wrote:
 The biggest problem with hint bits is SeqScans on a table that ends up
 dirtying many pages. Repeated checks against clog and hint bit setting
 are massive overheads for the user that hits that, plus it generates
 an unexpected surge of database writes. Even without checksums that is
 annoying.

Yeah. I am nowhere close to a general solution for that, but I am
targeting the PD_ALL_VISIBLE hint for removal (which is one part of the
problem), and I think I am close to an approach with no measurable
downside.

 ISTM that we should tune that specifically by performing a VM lookup
 for next 32 pages (or more), so we reduce the lookups well below 1 per
 page. That way the overhead of using the VM will be similar to using
 the PD_ALL_VISIBLE.

That's another potential way to mitigate the effects during a scan, but
it does add a little complexity. Right now, it share locks a buffer, and
uses an array with one element for each tuple in the page. If
PD_ALL_VISIBLE is set, then it marks all of the tuples *currently
present* on the page as visible in the array, and then releases the
share lock. Then, when reading the page, if another tuple is added
(because we released the share lock and only have a pin), it doesn't
matter because it's already invisible according to the array.

With this approach, we'd need to keep a larger array to represent many
pages. And it sounds like we'd need to share lock the pages ahead, and
find out which items are currently present, in order to properly fill in
the array. Not quite sure what to do there, but would require some more
thought.

I'm inclined to avoid going down this path unless there is some
performance reason to do so. We can keep a VM buffer pinned and do some
lockless testing (similar to that in IndexOnlyNext; see my response to
Tom), which will hopefully be fast enough that we don't need anything
else.

  Also, if we pass through a flag to
 HeapTupleSateisfies indicating we are not interested in setting hints
 on a SeqScan then we can skip individual tuple hints also. If the
 whole page becomes visible then we can set the VM.

Hmm, that's an idea. Maybe we shouldn't bother setting the hints if it's
already all-visible in the VM? Something else to think about. 

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need so many hint bits?

2012-11-17 Thread Jeff Janes
On Fri, Nov 16, 2012 at 5:35 PM, Jeff Davis pg...@j-davis.com wrote:
 On Fri, 2012-11-16 at 17:04 -0800, Jeff Janes wrote:


 Your question prompts me to post something I had been wondering.
 Might it be worthwhile to break the PD_ALL_VISIBLE / vm equivalence?
 Should the vm bit get cleared by a HOT update?

 To clarify: are you saying that a hot update should clear the
 PD_ALL_VISIBLE bit, but not the VM bit?

Yes.


 And anyone can vacuum a block that has only had HOT updates, you don't
 need to be dedicated vacuum worker to do that.

 And obviously this would be incompatible with removing the
 PD_ALL_VISIBLE, unless we also wanted to eliminate the ability to
 short-cut hint bit checks.

 I'm still a little unclear on what the benefit is.

The benefit would be that index only scans would be more likely to not
need to visit the heap page, if it only had HOT updates done to it
since the last time it was all-visible.

Also some reduced vacuuming, but I don't know if that benefit would be
beneficial.

 It sounds like a slightly different kind of hint, so maybe we should
 just treat it as a completely different thing after removing
 PD_ALL_VISIBLE. If it's related to HOT updates, then the page will
 probably be dirty anyway, so that removes my primary complaint about
 PD_ALL_VISIBLE.

Right, but if the HOT-only bit was on the page itself, it would no
longer help out index-only-scans.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers