Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-18 Thread Heikki Linnakangas

On 02/16/2014 10:19 PM, Jim Nasby wrote:

On 1/24/14, 3:52 PM, Jaime Casanova wrote:

On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:


Is everyone else OK with this approach?  Updated patch attached.


Hi,

I started to look at this patch and i found that it fails an assertion
as soon as you run a VACUUM FULL after a lazy VACUUM even if those are
on unrelated relations. For example in an assert-enabled build with
the regression database run:

VACUUM customer;
[... insert here whatever commands you like or nothing at all ...]
VACUUM FULL tenk1;


Is anyone else confused/concerned that regression testing didn't pick this up?


I wouldn't expect that to be explicitly tested - it's pretty unexpected 
that they work independently but not when run one after another. But 
it's a bit surprising that we don't happen to do that combination in any 
of the tests by pure chance.



The vacuum.sql test does not test lazy vacuum at all, and I can't seem to find 
any other tests that test lazy vacuum either...


There are several lazy vacuums in the regression suite:

sql/alter_table.sql:vacuum analyze atacc1(a);
sql/alter_table.sql:vacuum analyze atacc1(pg.dropped.1);
sql/hs_standby_disallowed.sql:VACUUM hs2;
sql/indirect_toast.sql:VACUUM FREEZE toasttest;
sql/indirect_toast.sql:VACUUM FREEZE toasttest;
sql/matview.sql:VACUUM ANALYZE hogeview;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_add;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_sub;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_div;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_mul;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_sqrt;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_ln;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_log10;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_power_10_ln;
sql/numeric.sql:VACUUM ANALYZE num_exp_add;
sql/numeric.sql:VACUUM ANALYZE num_exp_sub;
sql/numeric.sql:VACUUM ANALYZE num_exp_div;
sql/numeric.sql:VACUUM ANALYZE num_exp_mul;
sql/numeric.sql:VACUUM ANALYZE num_exp_sqrt;
sql/numeric.sql:VACUUM ANALYZE num_exp_ln;
sql/numeric.sql:VACUUM ANALYZE num_exp_log10;
sql/numeric.sql:VACUUM ANALYZE num_exp_power_10_ln;
sql/sanity_check.sql:VACUUM;
sql/without_oid.sql:VACUUM ANALYZE wi;
sql/without_oid.sql:VACUUM ANALYZE wo;

Most of those commands are there to analyze, rather than vacuum, but 
lazy vacuum is definitely exercised by the regression tests. I agree 
it's quite surprising that vacuum.sql doesn't actually perform any lazy 
vacuums.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-16 Thread Andres Freund
On 2014-02-15 21:34:15 -0500, Bruce Momjian wrote:
 Thank you for the thorough review.  Unless someone else can complete
 this, I think it should be marked as returned with feedback.  I don't
 think I am going to learn enough to complete this during the
 commit-fest.

Agreed. Marked it as such.


 I guess there is always PG 9.5.

I sure hope so ;)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-16 Thread Jim Nasby

On 1/24/14, 3:52 PM, Jaime Casanova wrote:

On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:


Is everyone else OK with this approach?  Updated patch attached.


Hi,

I started to look at this patch and i found that it fails an assertion
as soon as you run a VACUUM FULL after a lazy VACUUM even if those are
on unrelated relations. For example in an assert-enabled build with
the regression database run:

VACUUM customer;
[... insert here whatever commands you like or nothing at all ...]
VACUUM FULL tenk1;


Is anyone else confused/concerned that regression testing didn't pick this up? 
The vacuum.sql test does not test lazy vacuum at all, and I can't seem to find 
any other tests that test lazy vacuum either...
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-15 Thread Andres Freund
Hi,

 *** end_heap_rewrite(RewriteState state)
 *** 281,286 
 --- 284,290 
   true);
   RelationOpenSmgr(state->rs_new_rel);
   
 + update_page_vm(state->rs_new_rel, state->rs_buffer, 
 state->rs_blockno);
   PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
   
   smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, 
 state->rs_blockno,
 *** raw_heap_insert(RewriteState state, Heap
 *** 633,638 
 --- 637,643 
*/
   RelationOpenSmgr(state->rs_new_rel);
   
 + update_page_vm(state->rs_new_rel, page, 
 state->rs_blockno);
   PageSetChecksumInplace(page, state->rs_blockno);
   
   smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,

I think those two cases should be combined.

 + static void
 + update_page_vm(Relation relation, Page page, BlockNumber blkno)
 + {
 + Buffer  vmbuffer = InvalidBuffer;
 + TransactionId visibility_cutoff_xid;
 + 
 + visibilitymap_pin(relation, blkno, vmbuffer);
 + Assert(BufferIsValid(vmbuffer));
 + 
 + if (!visibilitymap_test(relation, blkno, vmbuffer) &&
 + heap_page_is_all_visible(relation, InvalidBuffer, page,
 +  
 visibility_cutoff_xid))
 + {
 + PageSetAllVisible(page);
 + visibilitymap_set(relation, blkno, InvalidBuffer,
 +   InvalidXLogRecPtr, vmbuffer, 
 visibility_cutoff_xid);
 + }
 + ReleaseBuffer(vmbuffer);
 + }

How could visibilitymap_test() be true here?

 diff --git a/src/backend/access/heap/visibilitymap.c 
 b/src/backend/access/heap/visibilitymap.c
 new file mode 100644
 index 899ffac..3e7546b
 *** a/src/backend/access/heap/visibilitymap.c
 --- b/src/backend/access/heap/visibilitymap.c
 *** visibilitymap_set(Relation rel, BlockNum
 *** 257,263 
   #endif
   
   Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
 - Assert(InRecovery || BufferIsValid(heapBuf));
   
   /* Check that we have the right heap page pinned, if present */
 if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != heapBlk)
 --- 257,262 
 *** visibilitymap_set(Relation rel, BlockNum
 *** 278,284 
   map[mapByte] |= (1 << mapBit);
   MarkBufferDirty(vmBuf);
   
 ! if (RelationNeedsWAL(rel))
   {
   if (XLogRecPtrIsInvalid(recptr))
   {
 --- 277,283 
   map[mapByte] |= (1 << mapBit);
   MarkBufferDirty(vmBuf);
   
 ! if (RelationNeedsWAL(rel) && BufferIsValid(heapBuf))
   {
   if (XLogRecPtrIsInvalid(recptr))
   {

At the very least this needs to revise visibilitymap_set's comment about
the requirement of passing a buffer unless !InRecovery.

Also, how is this going to work with replication if you're explicitly
not WAL logging this?


 *** vac_cmp_itemptr(const void *left, const
 *** 1730,1743 
* transactions. Also return the visibility_cutoff_xid which is the highest
* xmin amongst the visible tuples.
*/
 ! static bool
 ! heap_page_is_all_visible(Relation rel, Buffer buf, TransactionId 
 *visibility_cutoff_xid)
   {
 - Pagepage = BufferGetPage(buf);
   OffsetNumber offnum,
   maxoff;
   boolall_visible = true;
   
   *visibility_cutoff_xid = InvalidTransactionId;
   
   /*
 --- 1728,1744 
* transactions. Also return the visibility_cutoff_xid which is the highest
* xmin amongst the visible tuples.
*/
 ! bool
 ! heap_page_is_all_visible(Relation rel, Buffer buf, Page page,
 !  TransactionId 
 *visibility_cutoff_xid)
   {
   OffsetNumber offnum,
   maxoff;
   boolall_visible = true;
   
 + if (BufferIsValid(buf))
 + page = BufferGetPage(buf);
 + 
   *visibility_cutoff_xid = InvalidTransactionId;
   
   /*
 *** heap_page_is_all_visible(Relation rel, B
 *** 1758,1764 
   if (!ItemIdIsUsed(itemid) || ItemIdIsRedirected(itemid))
   continue;
   
 ! ItemPointerSet(&(tuple.t_self), BufferGetBlockNumber(buf), 
 offnum);
   
   /*
* Dead line pointers can have index pointers pointing to them. 
 So
 --- 1759,1767 
   if (!ItemIdIsUsed(itemid) || ItemIdIsRedirected(itemid))
   continue;
   
 ! /* XXX use 0 or real offset? */
 ! ItemPointerSet(&(tuple.t_self), BufferIsValid(buf) ?
 !BufferGetBlockNumber(buf) : 0, 
 offnum);

How about passing in the page and block 

Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-15 Thread Bruce Momjian
On Sat, Feb 15, 2014 at 04:16:40PM +0100, Andres Freund wrote:
 Hi,
 
  *** end_heap_rewrite(RewriteState state)
  *** 281,286 
  --- 284,290 
  true);
  RelationOpenSmgr(state->rs_new_rel);

  +   update_page_vm(state->rs_new_rel, state->rs_buffer, 
  state->rs_blockno);
  PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);

  smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, 
  state->rs_blockno,
  *** raw_heap_insert(RewriteState state, Heap
  *** 633,638 
  --- 637,643 
   */
  RelationOpenSmgr(state->rs_new_rel);

  +   update_page_vm(state->rs_new_rel, page, 
  state->rs_blockno);
  PageSetChecksumInplace(page, state->rs_blockno);

  smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
 
 I think those two cases should be combined.

Uh, what I did was to pair the new update_page_vm with the existing
PageSetChecksumInplace calls, figuring that if we needed a checksum before we
wrote the page we would need an update_page_vm call too.  Is that incorrect?
It is that _last_ page write in the second instance.

  + static void
  + update_page_vm(Relation relation, Page page, BlockNumber blkno)
  + {
  +   Buffer  vmbuffer = InvalidBuffer;
  +   TransactionId visibility_cutoff_xid;
  + 
  +   visibilitymap_pin(relation, blkno, vmbuffer);
  +   Assert(BufferIsValid(vmbuffer));
  + 
  +   if (!visibilitymap_test(relation, blkno, vmbuffer) &&
  +   heap_page_is_all_visible(relation, InvalidBuffer, page,
  +
  visibility_cutoff_xid))
  +   {
  +   PageSetAllVisible(page);
  +   visibilitymap_set(relation, blkno, InvalidBuffer,
  + InvalidXLogRecPtr, vmbuffer, 
  visibility_cutoff_xid);
  +   }
  +   ReleaseBuffer(vmbuffer);
  + }
 
 How could visibilitymap_test() be true here?

Oh, you are right that I can only hit that once per page;  test removed.

  diff --git a/src/backend/access/heap/visibilitymap.c 
  b/src/backend/access/heap/visibilitymap.c
  new file mode 100644
  index 899ffac..3e7546b
  *** a/src/backend/access/heap/visibilitymap.c
  --- b/src/backend/access/heap/visibilitymap.c
  *** visibilitymap_set(Relation rel, BlockNum
  *** 257,263 
#endif

  Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
  -   Assert(InRecovery || BufferIsValid(heapBuf));

  /* Check that we have the right heap page pinned, if present */
  if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != heapBlk)
  --- 257,262 
  *** visibilitymap_set(Relation rel, BlockNum
  *** 278,284 
  map[mapByte] |= (1 << mapBit);
  MarkBufferDirty(vmBuf);

  !   if (RelationNeedsWAL(rel))
  {
  if (XLogRecPtrIsInvalid(recptr))
  {
  --- 277,283 
  map[mapByte] |= (1 << mapBit);
  MarkBufferDirty(vmBuf);

  !   if (RelationNeedsWAL(rel) && BufferIsValid(heapBuf))
  {
  if (XLogRecPtrIsInvalid(recptr))
  {
 
 At the very least this needs to revise visibilitymap_set's comment about
 the requirement of passing a buffer unless !InRecovery.

Oh, good point, comment block updated.

 Also, how is this going to work with replication if you're explicitly
 not WAL logging this?

Uh, I assumed that using the existing functions would handle this.  If
not, I don't know the answer.

  *** vac_cmp_itemptr(const void *left, const
  *** 1730,1743 
 * transactions. Also return the visibility_cutoff_xid which is the 
  highest
 * xmin amongst the visible tuples.
 */
  ! static bool
  ! heap_page_is_all_visible(Relation rel, Buffer buf, TransactionId 
  *visibility_cutoff_xid)
{
  -   Pagepage = BufferGetPage(buf);
  OffsetNumber offnum,
  maxoff;
  boolall_visible = true;

  *visibility_cutoff_xid = InvalidTransactionId;

  /*
  --- 1728,1744 
 * transactions. Also return the visibility_cutoff_xid which is the 
  highest
 * xmin amongst the visible tuples.
 */
  ! bool
  ! heap_page_is_all_visible(Relation rel, Buffer buf, Page page,
  !TransactionId 
  *visibility_cutoff_xid)
{
  OffsetNumber offnum,
  maxoff;
  boolall_visible = true;

  +   if (BufferIsValid(buf))
  +   page = BufferGetPage(buf);
  + 
  *visibility_cutoff_xid = InvalidTransactionId;

  /*
  *** heap_page_is_all_visible(Relation rel, B
  *** 1758,1764 
  if (!ItemIdIsUsed(itemid) || ItemIdIsRedirected(itemid))
  

Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-15 Thread Andres Freund
On 2014-02-15 12:50:14 -0500, Bruce Momjian wrote:

 On Sat, Feb 15, 2014 at 04:16:40PM +0100, Andres Freund wrote:
  Hi,
  
   *** end_heap_rewrite(RewriteState state)
   *** 281,286 
   --- 284,290 
 true);
 RelationOpenSmgr(state->rs_new_rel);
 
   + update_page_vm(state->rs_new_rel, state->rs_buffer, 
   state->rs_blockno);
 PageSetChecksumInplace(state->rs_buffer, 
   state->rs_blockno);
 
 smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, 
   state->rs_blockno,
   *** raw_heap_insert(RewriteState state, Heap
   *** 633,638 
   --- 637,643 
  */
 RelationOpenSmgr(state->rs_new_rel);
 
   + update_page_vm(state->rs_new_rel, page, 
   state->rs_blockno);
 PageSetChecksumInplace(page, state->rs_blockno);
 
 smgrextend(state->rs_new_rel->rd_smgr, 
   MAIN_FORKNUM,
  
  I think those two cases should be combined.
 
 Uh, what I did was to pair the new update_page_vm with the existing
 PageSetChecksumInplace calls, figuring that if we needed a checksum before we
 wrote the page we would need an update_page_vm call too.  Is that incorrect?
 It is that _last_ page write in the second instance.

What I mean is that there should be a new function handling the writeout
of a page.
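
A rough sketch of the kind of helper I mean (the name and exact shape are
only illustrative, nothing the patch currently has):

    /* hypothetical: one writeout path shared by both call sites */
    static void
    rewrite_write_page(RewriteState state, Page page)
    {
        RelationOpenSmgr(state->rs_new_rel);

        update_page_vm(state->rs_new_rel, page, state->rs_blockno);

        /* checksum last, after any hint-bit / PD_ALL_VISIBLE changes */
        PageSetChecksumInplace(page, state->rs_blockno);

        smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
                   state->rs_blockno, (char *) page, true);
    }

end_heap_rewrite() and raw_heap_insert() would then both call that instead
of repeating the sequence.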

   diff --git a/src/backend/access/heap/visibilitymap.c 
   b/src/backend/access/heap/visibilitymap.c
   new file mode 100644
   index 899ffac..3e7546b
   *** a/src/backend/access/heap/visibilitymap.c
   --- b/src/backend/access/heap/visibilitymap.c
   *** visibilitymap_set(Relation rel, BlockNum
   *** 257,263 
 #endif
 
 Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
   - Assert(InRecovery || BufferIsValid(heapBuf));
 
 /* Check that we have the right heap page pinned, if present */
 if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != 
   heapBlk)
   --- 257,262 
   *** visibilitymap_set(Relation rel, BlockNum
   *** 278,284 
 map[mapByte] |= (1 << mapBit);
 MarkBufferDirty(vmBuf);
 
   ! if (RelationNeedsWAL(rel))
 {
 if (XLogRecPtrIsInvalid(recptr))
 {
   --- 277,283 
 map[mapByte] |= (1 << mapBit);
 MarkBufferDirty(vmBuf);
 
   ! if (RelationNeedsWAL(rel) && BufferIsValid(heapBuf))
 {
 if (XLogRecPtrIsInvalid(recptr))
 {
  
  At the very least this needs to revise visibilitymap_set's comment about
  the requirement of passing a buffer unless !InRecovery.
 
 Oh, good point, comment block updated.
 
  Also, how is this going to work with replication if you're explicitly
  not WAL logging this?
 
 Uh, I assumed that using the existing functions would handle this.  If
 not, I don't know the answer.

Err. Read the block above again, where you added the
BufferIsValid(heapBuf) check. That's exactly the part doing the WAL
logging.

   *** a/src/backend/utils/time/tqual.c
   --- b/src/backend/utils/time/tqual.c
   *** static inline void
   *** 108,113 
   --- 108,117 
 SetHintBits(HeapTupleHeader tuple, Buffer buffer,
 uint16 infomask, TransactionId xid)
 {
   + /* we might not have a buffer if we are doing raw_heap_insert() 
   */
   + if (!BufferIsValid(buffer))
   + return;
   + 
 if (TransactionIdIsValid(xid))
 {
 /* NB: xid must be known committed here! */
  
  This looks seriously wrong to me.
  
  I think the whole idea of doing this in private memory is bad. The
  visibility routines aren't written for that, and the above is just one
  instance of that, and I am far from convinced it's the only one. So you
  either need to work out how to rely on the visibility checking done in
  cluster.c, or you need to modify rewriteheap.c to write via
  shared_buffers.
 
 I don't think we can change rewriteheap.c to use shared buffers --- that
 code was written to do things in private memory for performance reasons
 and I don't think setting hit bits justifies changing that.

I don't think that's necessarily contradictory. You'd need to use a
ringbuffer strategy for IO, like e.g. VACUUM does. But I admit it's not
an insignificant amount of work, and it might need some adjustments in
some places.
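
For illustration only (not a worked-out patch), going through shared
buffers with a vacuum-style ring buffer would look roughly like:

    /* once per table rewrite */
    BufferAccessStrategy bstrategy = GetAccessStrategy(BAS_VACUUM);

    /* per new heap page, instead of filling a palloc'd private page */
    Buffer  buf = ReadBufferExtended(state->rs_new_rel, MAIN_FORKNUM,
                                     P_NEW, RBM_NORMAL, bstrategy);

    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
    PageInit(BufferGetPage(buf), BufferGetPageSize(buf), 0);
    /* ... place tuples; hint bits and PD_ALL_VISIBLE behave normally ... */
    MarkBufferDirty(buf);
    UnlockReleaseBuffer(buf);

The visibility map and WAL code could then be handed a real buffer, which
is what they expect.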

 Can you be more specific about the cluster.c idea?  I see
 copy_heap_data() in cluster.c calling HeapTupleSatisfiesVacuum() with a
 'buf' just like the code I am working in.

Yes, it does. But in contrast to your patch it does so on the *origin*
buffer. Which is in shared memory.

Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-15 Thread Bruce Momjian
On Sat, Feb 15, 2014 at 07:08:40PM +0100, Andres Freund wrote:
  Can you be more specific about the cluster.c idea?  I see
  copy_heap_data() in cluster.c calling HeapTupleSatisfiesVacuum() with a
  'buf' just like the code I am working in.
 
 Yes, it does. But in contrast to your patch it does so on the *origin*
 buffer. Which is in shared memory.
 The idea would be to modify rewrite_heap_tuple() to get a new parameter
 informing it it about the tuple's visibility. That'd allow you to avoid
 doing any additional visibility checks.
 
 Unfortunately that would still not fix the problem that
 visibilitymap_set requires a real buffer to be passed in. If you're
 going that route, you'll need to make somewhat more sweeping changes. You'd
 need to add a new blockno parameter and a BufferIsValid() check to the
 right places in log_heap_visible(). Pretty ugly if you ask me...
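
If I understand the shape of that correctly, it would be something like
this (the new parameter and the per-page field are only placeholders):

    /* cluster.c, which already ran HeapTupleSatisfiesVacuum() against the
     * tuple in its original shared buffer, passes the verdict down: */
    rewrite_heap_tuple(rwstate, tuple, copiedTuple, tuple_all_visible);

    /* rewriteheap.c remembers it for the private page being filled, so
     * raw_heap_insert() never needs its own visibility check: */
    void
    rewrite_heap_tuple(RewriteState state, HeapTuple old_tuple,
                       HeapTuple new_tuple, bool all_visible)
    {
        if (!all_visible)
            state->rs_page_all_visible = false;   /* placeholder field */
        /* ... existing logic unchanged ... */
    }

But the visibilitymap_set()/log_heap_visible() part is still what worries
me.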

Thank you for the thorough review.  Unless someone else can complete
this, I think it should be marked as returned with feedback.  I don't
think I am going to learn enough to complete this during the
commit-fest.

Since learning of the limitations in setting vmmap bits for index-only
scans in October, I have been unable to improve VACUUM/CLUSTER, and
unable to improve autovacuum --- a double failure.  I guess there is
always PG 9.5.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-01-24 Thread Jaime Casanova
On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:

 Is everyone else OK with this approach?  Updated patch attached.


Hi,

I started to look at this patch and i found that it fails an assertion
as soon as you run a VACUUM FULL after a lazy VACUUM even if those are
on unrelated relations. For example in an assert-enabled build with
the regression database run:

VACUUM customer;
[... insert here whatever commands you like or nothing at all ...]
VACUUM FULL tenk1;

TRAP: FailedAssertion(!(InRecovery || ( ((void) ((bool) ((!
assert_enabled) || ! (!((heapBuf) <= NBuffers && (heapBuf) >=
-NLocBuffer)) || (ExceptionalCondition(!((heapBuf) <= NBuffers &&
(heapBuf) >= -NLocBuffer), (FailedAssertion), visibilitymap.c,
260), 0, (heapBuf) != 0 )), File: visibilitymap.c, Line: 260)
LOG:  server process (PID 25842) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: vacuum FULL customer;
LOG:  terminating any other active server processes

trace:
(gdb) bt
#0  0x7f9a3d00d475 in *__GI_raise (sig=optimized out) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x7f9a3d0106f0 in *__GI_abort () at abort.c:92
#2  0x00777597 in ExceptionalCondition (
conditionName=conditionName@entry=0x7cd3b8 !(InRecovery || (
((void) ((bool) ((! assert_enabled) || ! (!((heapBuf) <= NBuffers &&
(heapBuf) >= -NLocBuffer)) || (ExceptionalCondition(\!((heapBuf) <=
NBuffers && (heapBuf) >= -NLocBuffer)\, (\Fai...,
errorType=errorType@entry=0x7b0730 FailedAssertion,
fileName=fileName@entry=0x7cd105 visibilitymap.c,
lineNumber=lineNumber@entry=260) at assert.c:54
#3  0x004a7d99 in visibilitymap_set
(rel=rel@entry=0x7f9a3da56a00, heapBlk=heapBlk@entry=0,
heapBuf=heapBuf@entry=0, recptr=recptr@entry=0, vmBuf=220,
cutoff_xid=2) at visibilitymap.c:260
#4  0x004a33e5 in update_page_vm (relation=0x7f9a3da56a00,
page=page@entry=0x1868b18 , blkno=0) at rewriteheap.c:702
#5  0x004a3668 in raw_heap_insert
(state=state@entry=0x1849e98, tup=tup@entry=0x184f208) at
rewriteheap.c:641
#6  0x004a3b8b in rewrite_heap_tuple
(state=state@entry=0x1849e98, old_tuple=old_tuple@entry=0x1852a50,
new_tuple=new_tuple@entry=0x184f208)
at rewriteheap.c:433
#7  0x0055c373 in reform_and_rewrite_tuple
(tuple=tuple@entry=0x1852a50,
oldTupDesc=oldTupDesc@entry=0x7f9a3da4d350,
newTupDesc=newTupDesc@entry=0x7f9a3da599a8,
values=values@entry=0x1852f40, isnull=isnull@entry=0x184f920 ,
newRelHasOids=1 '\001',
rwstate=rwstate@entry=0x1849e98) at cluster.c:1670

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
Phone: +593 4 5107566 Cell: +593 987171157


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-01-24 Thread Bruce Momjian
On Fri, Jan 24, 2014 at 04:52:55PM -0500, Jaime Casanova wrote:
 On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:
 
  Is everyone else OK with this approach?  Updated patch attached.
 
 
 Hi,
 
 I started to look at this patch and i found that it fails an assertion
 as soon as you run a VACUUM FULL after a lazy VACUUM even if those are
 on unrelated relations. For example in an assert-enabled build with
 the regression database run:
 
 VACUUM customer;
 [... insert here whatever commands you like or nothing at all ...]
 VACUUM FULL customer;

Wow, thanks for the testing.  You are right that I had two bugs, both in
visibilitymap_set().  First, I made setting heapBuf optional, but forgot
to remove the Assert check now that it was optional.  Second, I passed
InvalidBlockNumber instead of InvalidBuffer when calling
visibilitymap_set().

Both are fixed in the attached patch.  Thanks for the report.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
new file mode 100644
index c34ab98..ff275fa
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
***
*** 107,112 
--- 107,114 
  #include "access/rewriteheap.h"
  #include "access/transam.h"
  #include "access/tuptoaster.h"
+ #include "access/visibilitymap.h"
+ #include "commands/vacuum.h"
  #include "storage/bufmgr.h"
  #include "storage/smgr.h"
  #include "utils/memutils.h"
*** typedef OldToNewMappingData *OldToNewMap
*** 172,177 
--- 174,180 
  
  /* prototypes for internal functions */
  static void raw_heap_insert(RewriteState state, HeapTuple tup);
+ static void update_page_vm(Relation relation, Page page, BlockNumber blkno);
  
  
  /*
*** end_heap_rewrite(RewriteState state)
*** 281,286 
--- 284,290 
  		true);
  		RelationOpenSmgr(state->rs_new_rel);
  
+ 		update_page_vm(state->rs_new_rel, state->rs_buffer, state->rs_blockno);
  		PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
  
  		smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno,
*** raw_heap_insert(RewriteState state, Heap
*** 633,638 
--- 637,643 
  			 */
  			RelationOpenSmgr(state->rs_new_rel);
  
+ 			update_page_vm(state->rs_new_rel, page, state->rs_blockno);
  			PageSetChecksumInplace(page, state->rs_blockno);
  
  			smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
*** raw_heap_insert(RewriteState state, Heap
*** 678,680 
--- 683,705 
  	if (heaptup != tup)
  		heap_freetuple(heaptup);
  }
+ 
+ static void
+ update_page_vm(Relation relation, Page page, BlockNumber blkno)
+ {
+ 	Buffer		vmbuffer = InvalidBuffer;
+ 	TransactionId visibility_cutoff_xid;
+ 
+ 	visibilitymap_pin(relation, blkno, vmbuffer);
+ 	Assert(BufferIsValid(vmbuffer));
+ 
+ 	if (!visibilitymap_test(relation, blkno, vmbuffer) &&
+ 		heap_page_is_all_visible(relation, InvalidBuffer, page,
+  visibility_cutoff_xid))
+ 	{
+ 		PageSetAllVisible(page);
+ 		visibilitymap_set(relation, blkno, InvalidBuffer,
+ 		  InvalidXLogRecPtr, vmbuffer, visibility_cutoff_xid);
+ 	}
+ 	ReleaseBuffer(vmbuffer);
+ }
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
new file mode 100644
index 899ffac..3e7546b
*** a/src/backend/access/heap/visibilitymap.c
--- b/src/backend/access/heap/visibilitymap.c
*** visibilitymap_set(Relation rel, BlockNum
*** 257,263 
  #endif
  
  	Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
- 	Assert(InRecovery || BufferIsValid(heapBuf));
  
  	/* Check that we have the right heap page pinned, if present */
  	if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != heapBlk)
--- 257,262 
*** visibilitymap_set(Relation rel, BlockNum
*** 278,284 
  		map[mapByte] |= (1 << mapBit);
  		MarkBufferDirty(vmBuf);
  
! 		if (RelationNeedsWAL(rel))
  		{
  			if (XLogRecPtrIsInvalid(recptr))
  			{
--- 277,283 
  		map[mapByte] |= (1 << mapBit);
  		MarkBufferDirty(vmBuf);
  
! 		if (RelationNeedsWAL(rel) && BufferIsValid(heapBuf))
  		{
  			if (XLogRecPtrIsInvalid(recptr))
  			{
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
new file mode 100644
index 75e5f15..8d84bef
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*** static void lazy_record_dead_tuple(LVRel
*** 152,159 
  	   ItemPointer itemptr);
  static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
  static int	vac_cmp_itemptr(const void *left, const void *right);
- static bool heap_page_is_all_visible(Relation rel, Buffer buf,
- 		 TransactionId *visibility_cutoff_xid);
  
  
  /*
--- 152,157 
*** lazy_vacuum_page(Relation onerel, BlockN
*** 1232,1238 
  	 * check if the page has become 

Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-01-15 Thread Simon Riggs
On 28 November 2013 22:17, Robert Haas robertmh...@gmail.com wrote:

 The fact that you've needed to modify SetHintBits() to make this work
 is pretty nasty.  I'm not exactly sure what to do about that, but it
 doesn't seem good.

That makes me feel bad also.

I think we should be looking for some special case routines rather
than piggybacking there.

It's wonderful to see you personally and directly addressing this concern,
and I'm very supportive of this solution for 9.4.

I wonder about a user option to let these routines wait until they are
the oldest snapshot, so they can then set every page as all-visible
without even bothering to check individual items. Or maybe a user
option to set them all-visible even without actually being the oldest,
as is possible with COPY FREEZE. A full lock on the table is a big
thing, so it's a shame to waste the opportunity to set everything visible
just because some long-running task is still executing somewhere (but
clearly not here, otherwise we wouldn't have the lock).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-12-03 Thread Bruce Momjian
On Thu, Nov 28, 2013 at 05:38:05PM -0500, Bruce Momjian wrote:
  I wonder if we ought to mark each page as all-visible in
  raw_heap_insert() when we first initialize it, and then clear the flag
  when we come across a tuple that isn't all-visible.  We could try to
  set hint bits on the tuple before placing it on the page, too, though
  I'm not sure of the details.
 
 I went with the per-page approach because I wanted to re-use the vacuum
 lazy function.  Is there some other code that does this already?  I am
 trying to avoid yet-another set of routines that would need to be
 maintained or could be buggy.  This hint bit setting is tricky.
 
 And thanks much for the review!

So, should I put this in the next commit fest?  I still have an unknown
about the buffer number to use here:

!   /* XXX use 0 or real offset? */
!   ItemPointerSet(&(tuple.t_self), BufferIsValid(buf) ?
!  BufferGetBlockNumber(buf) : 0, offnum);

Is everyone else OK with this approach?  Updated patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
new file mode 100644
index 951894c..44ae5d8
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
***
*** 107,112 
--- 107,114 
  #include "access/rewriteheap.h"
  #include "access/transam.h"
  #include "access/tuptoaster.h"
+ #include "access/visibilitymap.h"
+ #include "commands/vacuum.h"
  #include "storage/bufmgr.h"
  #include "storage/smgr.h"
  #include "utils/memutils.h"
*** typedef OldToNewMappingData *OldToNewMap
*** 172,177 
--- 174,180 
  
  /* prototypes for internal functions */
  static void raw_heap_insert(RewriteState state, HeapTuple tup);
+ static void update_page_vm(Relation relation, Page page, BlockNumber blkno);
  
  
  /*
*** end_heap_rewrite(RewriteState state)
*** 280,285 
--- 283,289 
  		state->rs_buffer);
  		RelationOpenSmgr(state->rs_new_rel);
  
+ 		update_page_vm(state->rs_new_rel, state->rs_buffer, state->rs_blockno);
  		PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
  
  		smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno,
*** raw_heap_insert(RewriteState state, Heap
*** 632,637 
--- 636,642 
  			 */
  			RelationOpenSmgr(state->rs_new_rel);
  
+ 			update_page_vm(state->rs_new_rel, page, state->rs_blockno);
  			PageSetChecksumInplace(page, state->rs_blockno);
  
  			smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
*** raw_heap_insert(RewriteState state, Heap
*** 677,679 
--- 682,704 
  	if (heaptup != tup)
  		heap_freetuple(heaptup);
  }
+ 
+ static void
+ update_page_vm(Relation relation, Page page, BlockNumber blkno)
+ {
+ 	Buffer		vmbuffer = InvalidBuffer;
+ 	TransactionId visibility_cutoff_xid;
+ 
+ 	visibilitymap_pin(relation, blkno, vmbuffer);
+ 	Assert(BufferIsValid(vmbuffer));
+ 
+ 	if (!visibilitymap_test(relation, blkno, vmbuffer) &&
+ 		heap_page_is_all_visible(relation, InvalidBuffer, page,
+  visibility_cutoff_xid))
+ 	{
+ 		PageSetAllVisible(page);
+ 		visibilitymap_set(relation, blkno, InvalidBlockNumber,
+ 		  InvalidXLogRecPtr, vmbuffer, visibility_cutoff_xid);
+ 	}
+ 	ReleaseBuffer(vmbuffer);
+ }
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
new file mode 100644
index 7f40d89..a42511b
*** a/src/backend/access/heap/visibilitymap.c
--- b/src/backend/access/heap/visibilitymap.c
*** visibilitymap_set(Relation rel, BlockNum
*** 278,284 
  		map[mapByte] |= (1 << mapBit);
  		MarkBufferDirty(vmBuf);
  
! 		if (RelationNeedsWAL(rel))
  		{
  			if (XLogRecPtrIsInvalid(recptr))
  			{
--- 278,284 
  		map[mapByte] |= (1 << mapBit);
  		MarkBufferDirty(vmBuf);
  
! 		if (RelationNeedsWAL(rel) && BufferIsValid(heapBuf))
  		{
  			if (XLogRecPtrIsInvalid(recptr))
  			{
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
new file mode 100644
index fe2d9e7..4f6578f
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*** static void lazy_record_dead_tuple(LVRel
*** 151,158 
  	   ItemPointer itemptr);
  static bool lazy_tid_reaped(ItemPointer itemptr, void *state);
  static int	vac_cmp_itemptr(const void *left, const void *right);
- static bool heap_page_is_all_visible(Relation rel, Buffer buf,
- 		 TransactionId *visibility_cutoff_xid);
  
  
  /*
--- 151,156 
*** lazy_vacuum_page(Relation onerel, BlockN
*** 1197,1203 
  	 * check if the page has become all-visible.
  	 */
  	if (!visibilitymap_test(onerel, blkno, vmbuffer) &&
! 		heap_page_is_all_visible(onerel, buffer, visibility_cutoff_xid))
  	{
  		Assert(BufferIsValid(*vmbuffer));
  		PageSetAllVisible(page);
--- 1195,1201 
  	 * 

Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-12-03 Thread Robert Haas
On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Nov 28, 2013 at 05:38:05PM -0500, Bruce Momjian wrote:
  I wonder if we ought to mark each page as all-visible in
  raw_heap_insert() when we first initialize it, and then clear the flag
  when we come across a tuple that isn't all-visible.  We could try to
  set hint bits on the tuple before placing it on the page, too, though
  I'm not sure of the details.

 I went with the per-page approach because I wanted to re-use the vacuum
 lazy function.  Is there some other code that does this already?  I am
 trying to avoid yet-another set of routines that would need to be
  maintained or could be buggy.  This hint bit setting is tricky.

 And thanks much for the review!

 So, should I put this in the next commit fest?

+1.

 I still have an unknown
 about the buffer number to use here:

 !   /* XXX use 0 or real offset? */
 !   ItemPointerSet(&(tuple.t_self), BufferIsValid(buf) ?
 !  BufferGetBlockNumber(buf) : 0, offnum);

 Is everyone else OK with this approach?  Updated patch attached.

I started looking at this a little more the other day but got bogged
down in other things before I got through it all.  I think we're going
to want to revise this so that it doesn't go through the functions
that currently assume they're always dealing with a shared buffer, but
I haven't figured out exactly what the most elegant way of doing that
is yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-12-03 Thread Bruce Momjian
On Tue, Dec  3, 2013 at 02:01:52PM -0500, Robert Haas wrote:
 On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:
  On Thu, Nov 28, 2013 at 05:38:05PM -0500, Bruce Momjian wrote:
   I wonder if we ought to mark each page as all-visible in
   raw_heap_insert() when we first initialize it, and then clear the flag
   when we come across a tuple that isn't all-visible.  We could try to
   set hint bits on the tuple before placing it on the page, too, though
   I'm not sure of the details.
 
  I went with the per-page approach because I wanted to re-use the vacuum
  lazy function.  Is there some other code that does this already?  I am
  trying to avoid yet-another set of routines that would need to be
  maintained or could be buggy.  This hint bit setting is tricky.
 
  And thanks much for the review!
 
  So, should I put this in the next commit fest?
 
 +1.

OK, I added it to the January commit fest.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-11-28 Thread Robert Haas
On Wed, Nov 27, 2013 at 4:33 PM, Bruce Momjian br...@momjian.us wrote:
 On Sat, Jan 12, 2013 at 02:14:03PM -0500, Kevin Grittner wrote:
 Amit Kapila wrote:
  On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:

  Surely VACUUM FULL should rebuild the visibility map, and make
  tuples in the new relation all-visible, no?

 Certainly it seems odd to me that VACUUM FULL leaves the table
 in a less-well maintained state in terms of visibility than a
 normal vacuum. VACUUM FULL should not need to be followed by
 another VACUUM.

  I think it cannot be made all-visible.

 I don't think all tuples in the relation are necessarily visible to
 all transactions, but the ones which are should probably be flagged
 that way.

 I have developed the attached proof-of-concept patch to fix the problem
 of having no visibility map after CLUSTER or VACUUM FULL.  I tested with
 these queries:

 CREATE TABLE test(x INT PRIMARY KEY);
 INSERT INTO test VALUES (1);
 VACUUM FULL test; -- or CLUSTER
 SELECT relfilenode FROM pg_class WHERE relname = 'test';
  relfilenode
 -
16399

 Then 'ls -l data/base/16384/16399*' to see the *_vm file.  I am not sure
 how to test that the vm contents are valid.

 This patch is fairly tricky because our CLUSTER/VACUUM FULL behavior
 does not do writes through the shared buffer cache, as outlined in this
 C comment block:

  * We can't use the normal heap_insert function to insert into the new
  * heap, because heap_insert overwrites the visibility information.
  * We use a special-purpose raw_heap_insert function instead, which
  * is optimized for bulk inserting a lot of tuples, knowing that we have
  * exclusive access to the heap.  raw_heap_insert builds new pages in
  * local storage.  When a page is full, or at the end of the process,
  * we insert it to WAL as a single record and then write it to disk
  * directly through smgr.  Note, however, that any data sent to the new
  * heap's TOAST table will go through the normal bufmgr.

 I originally tried to do this higher up in the stack but ran into
 problems because I couldn't access the new heap page so I had to do it
 at the non-shared-buffer page level.  I reused the lazy vacuum routines.

 I need to know this is the right approach, and need to know what things
 are wrong or missing.

The fact that you've needed to modify SetHintBits() to make this work
is pretty nasty.  I'm not exactly sure what to do about that, but it
doesn't seem good.

This patch also has the desirable but surprising consequence that hint
bits will be set as a side effect of update_page_vm(), which means
that you'd better do that BEFORE checksumming the page.

I wonder if we ought to mark each page as all-visible in
raw_heap_insert() when we first initialize it, and then clear the flag
when we come across a tuple that isn't all-visible.  We could try to
set hint bits on the tuple before placing it on the page, too, though
I'm not sure of the details.
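
In pseudo-C, roughly (the per-tuple test is hand-waved, and the other
names are the ones from the patch as posted):

    /* raw_heap_insert(), when starting a brand-new page: */
    PageInit(page, BLCKSZ, 0);
    PageSetAllVisible(page);            /* assume the best to start with */

    /* ... for each tuple placed on the page: */
    if (!tuple_is_visible_to_everyone)  /* hand-waved test */
        PageClearAllVisible(page);

    /* ... and only at page write-out time touch the map: */
    if (PageIsAllVisible(page))
        visibilitymap_set(state->rs_new_rel, state->rs_blockno, InvalidBuffer,
                          InvalidXLogRecPtr, vmbuffer, visibility_cutoff_xid);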

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-11-28 Thread Bruce Momjian
On Thu, Nov 28, 2013 at 05:17:07PM -0500, Robert Haas wrote:
  I need to know this is the right approach, and need to know what things
  are wrong or missing.
 
 The fact that you've needed to modify SetHintBits() to make this work
 is pretty nasty.  I'm not exactly sure what to do about that, but it
 doesn't seem good.

Hey, if that's the worst of the problems, I will be very happy.  There
is a sense that because we are not using the shared buffer cache and not
WAL logging, we have to skip some stuff.

 This patch also has the desirable but surprising consequence that hint
 bits will be set as a side effect of update_page_vm(), which means
 that you'd better do that BEFORE checksumming the page.

Yes, I already saw that and fixed it in my git tree.

 I wonder if we ought to mark each page as all-visible in
 raw_heap_insert() when we first initialize it, and then clear the flag
 when we come across a tuple that isn't all-visible.  We could try to
 set hint bits on the tuple before placing it on the page, too, though
 I'm not sure of the details.

I went with the per-page approach because I wanted to re-use the vacuum
lazy function.  Is there some other code that does this already?  I am
trying to avoid yet-another set of routines that would need to be
maintained or could be buggy.  This hint bit setting is tricky.

And thanks much for the review!

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-11-27 Thread Bruce Momjian
On Sat, Jan 12, 2013 at 02:14:03PM -0500, Kevin Grittner wrote:
 Amit Kapila wrote:
  On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
 
  Surely VACUUM FULL should rebuild the visibility map, and make
  tuples in the new relation all-visible, no?
 
 Certainly it seems odd to me that VACUUM FULL leaves the table
 in a less-well maintained state in terms of visibility than a
 normal vacuum. VACUUM FULL should not need to be followed by
 another VACUUM.
 
  I think it cannot be made all-visible.
 
 I don't think all tuples in the relation are necessarily visible to
 all transactions, but the ones which are should probably be flagged
 that way.

I have developed the attached proof-of-concept patch to fix the problem
of having no visibility map after CLUSTER or VACUUM FULL.  I tested with
these queries:

CREATE TABLE test(x INT PRIMARY KEY);
INSERT INTO test VALUES (1);
VACUUM FULL test; -- or CLUSTER
SELECT relfilenode FROM pg_class WHERE relname = 'test';
 relfilenode
-
   16399

Then 'ls -l data/base/16384/16399*' to see the *_vm file.  I am not sure
how to test that the vm contents are valid.

This patch is fairly tricky because our CLUSTER/VACUUM FULL behavior
does not do writes through the shared buffer cache, as outlined in this
C comment block:

 * We can't use the normal heap_insert function to insert into the new
 * heap, because heap_insert overwrites the visibility information.
 * We use a special-purpose raw_heap_insert function instead, which
 * is optimized for bulk inserting a lot of tuples, knowing that we have
 * exclusive access to the heap.  raw_heap_insert builds new pages in
 * local storage.  When a page is full, or at the end of the process,
 * we insert it to WAL as a single record and then write it to disk
 * directly through smgr.  Note, however, that any data sent to the new
 * heap's TOAST table will go through the normal bufmgr.

I originally tried to do this higher up in the stack but ran into
problems because I couldn't access the new heap page so I had to do it
at the non-shared-buffer page level.  I reused the lazy vacuum routines.

I need to know this is the right approach, and need to know what things
are wrong or missing.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
new file mode 100644
index 951894c..c01a6a8
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
***
*** 107,112 
--- 107,114 
  #include "access/rewriteheap.h"
  #include "access/transam.h"
  #include "access/tuptoaster.h"
+ #include "access/visibilitymap.h"
+ #include "commands/vacuum.h"
  #include "storage/bufmgr.h"
  #include "storage/smgr.h"
  #include "utils/memutils.h"
*** typedef OldToNewMappingData *OldToNewMap
*** 172,177 
--- 174,180 
  
  /* prototypes for internal functions */
  static void raw_heap_insert(RewriteState state, HeapTuple tup);
+ static void update_page_vm(Relation relation, Page page, BlockNumber blkno);
  
  
  /*
*** end_heap_rewrite(RewriteState state)
*** 281,286 
--- 284,290 
  		RelationOpenSmgr(state->rs_new_rel);
  
  		PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
+ 		update_page_vm(state->rs_new_rel, state->rs_buffer, state->rs_blockno);
  
  		smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno,
     (char *) state->rs_buffer, true);
*** raw_heap_insert(RewriteState state, Heap
*** 633,638 
--- 637,643 
  			RelationOpenSmgr(state->rs_new_rel);
  
  			PageSetChecksumInplace(page, state->rs_blockno);
+ 			update_page_vm(state->rs_new_rel, page, state->rs_blockno);
  
  			smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
  	   state->rs_blockno, (char *) page, true);
*** raw_heap_insert(RewriteState state, Heap
*** 677,679 
--- 682,704 
  	if (heaptup != tup)
  		heap_freetuple(heaptup);
  }
+ 
+ static void
+ update_page_vm(Relation relation, Page page, BlockNumber blkno)
+ {
+ 	Buffer		vmbuffer = InvalidBuffer;
+ 	TransactionId visibility_cutoff_xid;
+ 
+ 	visibilitymap_pin(relation, blkno, vmbuffer);
+ 	Assert(BufferIsValid(vmbuffer));
+ 
+ 	if (!visibilitymap_test(relation, blkno, vmbuffer) &&
+ 		heap_page_is_all_visible(relation, InvalidBuffer, page,
+  visibility_cutoff_xid))
+ 	{
+ 		PageSetAllVisible(page);
+ 		visibilitymap_set(relation, blkno, InvalidBlockNumber,
+ 		  InvalidXLogRecPtr, vmbuffer, visibility_cutoff_xid);
+ 	}
+ 	ReleaseBuffer(vmbuffer);
+ }
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
new file mode 100644
index 7f40d89..a42511b
*** a/src/backend/access/heap/visibilitymap.c
--- b/src/backend/access/heap/visibilitymap.c
*** 

Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-01-12 Thread Kevin Grittner
Amit Kapila wrote:
 On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:

 Surely VACUUM FULL should rebuild the visibility map, and make
 tuples in the new relation all-visible, no?

Certainly it seems odd to me that VACUUM FULL leaves the table
in a less-well maintained state in terms of visibility than a
normal vacuum. VACUUM FULL should not need to be followed by
another VACUUM.

 I think it cannot be made all-visible.

I don't think all tuples in the relation are necessarily visible to
all transactions, but the ones which are should probably be flagged
that way.

 What about a transaction in SSI mode that was started before the VACUUM
 FULL; should it see all tuples?

There are no differences between visibility rules for serializable
transactions (SSI) and repeatable read transactions. It should be
based on whether any snapshots exist which can still see the tuple.

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-01-10 Thread Amit Kapila
On Thursday, January 10, 2013 12:01 PM Pavan Deolasee wrote:
 On Thu, Jan 10, 2013 at 11:45 AM, Amit Kapila amit.kap...@huawei.com
 wrote:
  On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
 
 
  Surely VACUUM FULL should rebuild the visibility map, and make
 tuples
  in
  the new relation all-visible, no?
 
  I think it cannot be made all-visible.
  What about a transaction in SSI mode that was started before the VACUUM
 FULL; should it see all tuples?
 
 
 We can definitely do better than what we are doing today, and that
 should fix many use cases and rebuild the VM for a large part of the
 table if not all. More precisely, in cluster.c we can see what
 HeapTupleSatisfiesVacuum() returns for every tuple on a page. If there
 are only DEAD or LIVE tuples on a page, we can set the VM bit. We may
 need additional checks for LIVE tuples, similar to the ones in the
 vacuum code path. But it's certainly doable.

 Do we document this behavior or add a TODO item?
 

 Both?

IMO, we should do both.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-01-09 Thread Bruce Momjian
How do we want to handle the case where VACUUM FULL clears the
visibility map, causing loss of index-only scans?

http://archives.postgresql.org/pgsql-performance/2012-11/msg00317.php

Do we document this behavior or add a TODO item?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-01-09 Thread Josh Berkus
On 01/09/2013 03:07 PM, Bruce Momjian wrote:
 How do we want to handle the case where VACUUM FULL clears the
 visibility map, causing loss of index-only scans?
 
   http://archives.postgresql.org/pgsql-performance/2012-11/msg00317.php
 
 Do we document this behavior or add a TODO item?
 

Both?

Surely VACUUM FULL should rebuild the visibility map, and make tuples in
the new relation all-visible, no?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-01-09 Thread Amit Kapila
On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
 On 01/09/2013 03:07 PM, Bruce Momjian wrote:
  How do we want to handle the case where VACUUM FULL clears the
  visibility map, causing loss of index-only scans?
 
  http://archives.postgresql.org/pgsql-performance/2012-
 11/msg00317.php
 
  Do we document this behavior or add a TODO item?
 
 
 Both?
 
 Surely VACUUM FULL should rebuild the visibility map, and make tuples
 in
 the new relation all-visible, no?

I think it cannot be made all-visible.
What about a transaction in SSI mode that was started before the VACUUM FULL;
should it see all tuples?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2013-01-09 Thread Pavan Deolasee
On Thu, Jan 10, 2013 at 11:45 AM, Amit Kapila amit.kap...@huawei.com wrote:
 On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:


 Surely VACUUM FULL should rebuild the visibility map, and make tuples
 in
 the new relation all-visible, no?

 I think it cannot be made all-visible.
 What about a transaction in SSI mode that was started before the VACUUM FULL;
 should it see all tuples?


We can definitely do better than what we are doing today, and that
should fix many use cases and rebuild the VM for a large part of the
table if not all. More precisely, in cluster.c we can see what
HeapTupleSatisfiesVacuum() returns for every tuple on a page. If there
are only DEAD or LIVE tuples on a page, we can set the VM bit. We may
need additional checks for LIVE tuples, similar to the ones in the
vacuum code path. But it's certainly doable.
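
For the tuple-level classification, that could look roughly like this
(assuming the 9.3-era HeapTupleSatisfiesVacuum() signature; the
all_visible flag is only illustrative):

    /* for each tuple while scanning the old heap in copy_heap_data() */
    switch (HeapTupleSatisfiesVacuum(tuple->t_data, OldestXmin, buf))
    {
        case HEAPTUPLE_DEAD:
            /* not copied to the new heap, so it cannot hold back the bit */
            break;
        case HEAPTUPLE_LIVE:
            /* the same extra xmin test lazy vacuum applies to LIVE tuples */
            if (!TransactionIdPrecedes(HeapTupleHeaderGetXmin(tuple->t_data),
                                       OldestXmin))
                all_visible = false;
            break;
        default:
            /* RECENTLY_DEAD, INSERT_IN_PROGRESS, DELETE_IN_PROGRESS */
            all_visible = false;
            break;
    }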

I'd also suggested doing something similar for cases when a table is
created and written in the same transaction. But that's more invasive.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers