Re: [HACKERS] git: uh-oh

2010-08-28 Thread tomas

On Wed, Aug 25, 2010 at 12:35:53PM -0400, Robert Haas wrote:
 On Wed, Aug 25, 2010 at 12:02 PM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:

[...]

  I must say, it is refreshing to have users who actually care about their
  conversion, as opposed to the usual rabble who think that git-cvsimport
  is Just Fine :-)  I guess if the postgresql project didn't care about
  data integrity then we would all have to worry :-)
 
 I laughed when I read this - yeah, we're kind of paranoid about that.

Going a bit off-topic -- although I'm extremely strapped for time, I have
been following this thread with attention. The above reminds me why I
appreciate git; it's not just the technical parts, but the culture
behind it that convinces me.

Thanks for a very interesting thread
- -- tomás

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] refactoring comment.c

2010-08-28 Thread Martijn van Oosterhout
On Fri, Aug 27, 2010 at 09:35:55PM -0400, Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Didn't we inject some smarts so that the compiler would notice that
  elog(ERROR) doesn't return?
 
 No.  If you know a portable (as in works on every compiler) way
 to do that, we could talk.  If only some compilers understand it,
 we'll probably end up worse off --- the ones that don't understand it
 will still need things like these unreachable assignments, while the
 ones that do understand will start warning about unreachable code.

We've been here before:

http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg72113.html

The problem appears to be mainly avoiding double evaluation; otherwise
it's just some macro tweaking. On gcc you can avoid the double
evaluation too, but it's the other compilers that we also need to
support.

If we really wanted to we could arrange for GCC to throw an error if
the first argument to elog was non-constant, which would prevent bugs
creeping in due to double evaluation. That still won't help users of
other compilers though.
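For what it's worth, a minimal sketch of the idea under discussion (illustrative names only -- my_elog and my_elog_impl are not PostgreSQL's actual API): evaluate the level once into a local to dodge the double-evaluation problem, and give GCC an unreachability hint so callers need no dead assignments. __builtin_unreachable() is a GCC extension; other compilers just fall back to the plain call, exactly the worse-off case Tom describes:

```c
#include <stdio.h>
#include <stdlib.h>

#define MY_WARNING 19
#define MY_ERROR   20

static void my_elog_impl(int level, const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	if (level >= MY_ERROR)
		exit(1);		/* stand-in for longjmp back to the error handler */
}

#if defined(__GNUC__)
/* The level is evaluated exactly once into a local; the unreachable()
 * hint lets the compiler prune the "returns" path when the level is the
 * constant MY_ERROR. */
#define my_elog(level, msg) \
	do { \
		int elevel_ = (level); \
		my_elog_impl(elevel_, (msg)); \
		if (elevel_ >= MY_ERROR) \
			__builtin_unreachable(); \
	} while (0)
#else
#define my_elog(level, msg) my_elog_impl((level), (msg))
#endif

/* Without the hint, some compilers warn that "result" may be used
 * uninitialized here and force a dead "result = 0;" into the else arm. */
static int parse_flag(char c)
{
	int result;

	if (c == 'y')
		result = 1;
	else if (c == 'n')
		result = 0;
	else
		my_elog(MY_ERROR, "unexpected flag");
	return result;
}
```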

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle




Re: [HACKERS] Git conversion progress report and call for testing assistance

2010-08-28 Thread Magnus Hagander
On Fri, Aug 27, 2010 at 22:02, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Fri, Aug 27, 2010 at 21:30, Tom Lane t...@sss.pgh.pa.us wrote:
 OK, the list works now, but the commitdiff links in the messages do
 not.  Looks like they are pointing at the wrong repository.

 Well, actually, they point to the right one, but the mirroring off to
 that one is turned off. There are just too many different versions
 right now to keep that thing pointing the right way :-)

 Oh, OK.

 In any case, you asked for feedback on the message format, so here is
 some: I don't like having the commitdiff link first.  I think the log
 message is the most important thing and should be first.  Possible
 format (based on the example I just received):

Ok. I see the point of that - particularly for MUAs that show a short
preview and don't filter the link out of it.

Should we consider actually removing the Log Message header and just
put the message right at the start? If there's no link first, that
would be fairly obvious, I think..



 Log Message
 ---
 Another experimental commit.
 Undo hacking other people previously did on README.

 Branch
 --
 REL9_0_STABLE

 Details
 ---
 http://git.postgresql.org/gitweb?p=postgresql-migration.git;a=commitdiff;h=1add27be08978f20edb67ca5854f6701f0f6ea86

 Summary
 ---
 README |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)


 You could argue it either way about whether to put the commitdiff link
 before or after the change summary, but offhand I think before will be
 more useful.  If there are a lot of files touched, people won't want
 to have to scroll to the bottom to find the link.

I definitely think it should stay before - that way it will almost
always stay above the fold for readers. Keeps it very easy for those
who prefer to just click that link to get the full diff right away.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git conversion progress report and call for testing assistance

2010-08-28 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Should we consider actually removing the Log Message header and just
 put the message right at the start? If there's no link first, that
 would be fairly obvious, I think..

Either way (header or not) is OK by me.

regards, tom lane



Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-28 Thread Robert Haas
On Aug 28, 2010, at 8:34 AM, Alexander Korotkov aekorot...@gmail.com wrote:
 Here is the patch which adds levenshtein_less_equal function. I'm going to 
 add it to current commitfest.

Cool. Please submit some performance results comparing levenshtein in HEAD vs. 
levenshtein with this patch vs. levenshtein_less_equal. Perhaps the test cases 
we used previously would be a good place to start.

...Robert

Re: [HACKERS] [JDBC] Trouble with COPY IN

2010-08-28 Thread James William Pye
On Aug 9, 2010, at 11:49 AM, Kris Jurka wrote:
 Oh, duh.  It's a server side copy not going through the client at all. Here's 
 a hopefully final patch.

Trying it out... Works for me.

I understand the resistance to the patch, but it would be
quite nice to see this wart in the rear view. =\


Re: [HACKERS] [GENERAL] 3rd time is a charm.....right sibling is not next child crash.

2010-08-28 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Jeff Amiel's message of mar jun 08 09:26:25 -0400 2010:
 Jun  7 15:05:01 db-1 postgres[9334]: [ID 748848 local0.crit] [3989781-1] 
 2010-06-07 15:05:01.087 CDT9334PANIC:  right sibling 169 of block 168 is 
 not next child of 249 in index sl_seqlog_idx

 I've seen this problem (and others) in a high-load environment.  Not
 Slony related though.

 I wrote a small tool to check btree index files for consistency problems
 such as this one, by parsing pg_filedump output.  I've seen strange
 things such as index pointers pointing to pages that shouldn't have been
 pointed to; mismatching sibling pointers; and others.

I spent some more time today looking for possible causes of this (within
our code that is; the obvious elephant in the room is the possibility of
the disk storage system losing an update).  I didn't find anything, but
it did occur to me that there is a simple way to ameliorate this
problem: we could rearrange the code in _bt_pagedel() so it checks for
this case before entering its critical section.  Then, corruption of
this kind is at least only an ERROR not a PANIC.
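The shape of that rearrangement, sketched with made-up names (illustrative only, not the actual nbtree code): any check that can fail on corrupt data runs before the critical section, where a plain ERROR is still safe; once inside, a failure would have to be a PANIC because the pages are already half-modified.

```c
#include <stdio.h>

enum my_level { MY_ERROR = 1, MY_PANIC = 2 };

static enum my_level last_report;	/* severity of the most recent report */

static int report(enum my_level level, const char *msg)
{
	last_report = level;
	fprintf(stderr, "%s\n", msg);
	return -1;		/* stand-in for ereport's longjmp */
}

/* Returns 0 on success, -1 after reporting an error. */
static int delete_page(int rightsib, int expected_next_child)
{
	/*
	 * Sanity check moved in front of the critical section: if the index
	 * is corrupt we bail out with an ERROR while nothing has been
	 * modified yet.
	 */
	if (rightsib != expected_next_child)
		return report(MY_ERROR, "right sibling is not next child");

	/* START_CRIT_SECTION();  -- from here on, failures must be PANIC,
	 * since the on-disk pages are being changed and WAL-logged. */
	/* ... unlink the page and write the WAL record ... */
	/* END_CRIT_SECTION(); */
	return 0;
}
```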

Any objections to doing that?

regards, tom lane



Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-28 Thread Alexander Korotkov
Here is the patch which adds levenshtein_less_equal function. I'm going to
add it to current commitfest.


With best regards,
Alexander Korotkov.


On Tue, Aug 3, 2010 at 3:23 AM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Aug 2, 2010 at 5:07 PM, Alexander Korotkov aekorot...@gmail.com
 wrote:
  Now I think patch is as good as can be. :)

 OK, committed.

  I'm going to prepare less-or-equal function in same manner as this patch.

 Sounds good.  Since we're now more than half-way through this
 CommitFest and this patch has undergone quite a bit of change from
 what we started out with, I'd like to ask that you submit the new
 patch for the next CommitFest, to begin September 15th.

 https://commitfest.postgresql.org/action/commitfest_view/open

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company

*** a/contrib/fuzzystrmatch/fuzzystrmatch.c
--- b/contrib/fuzzystrmatch/fuzzystrmatch.c
***
*** 61,66  PG_MODULE_MAGIC;
--- 61,68 
   */
  extern Datum levenshtein_with_costs(PG_FUNCTION_ARGS);
  extern Datum levenshtein(PG_FUNCTION_ARGS);
+ extern Datum levenshtein_less_equal_with_costs(PG_FUNCTION_ARGS);
+ extern Datum levenshtein_less_equal(PG_FUNCTION_ARGS);
  extern Datum metaphone(PG_FUNCTION_ARGS);
  extern Datum soundex(PG_FUNCTION_ARGS);
  extern Datum difference(PG_FUNCTION_ARGS);
***
*** 92,98  soundex_code(char letter)
  #define MAX_LEVENSHTEIN_STRLEN		255
  
  static int levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c);
  
  
  /*
--- 94,100 
  #define MAX_LEVENSHTEIN_STRLEN		255
  
  static int levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c, int max_d);
  
  
  /*
***
*** 202,211  rest_of_char_same(const char *s1, const char *s2, int len)
   *		  between supplied strings. Generally
   *		  (1, 1, 1) penalty costs suffices common
   *		  cases, but your mileage may vary.
   */
  static int
  levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c)
  {
  	int			m,
  n,
--- 204,217 
   *		  between supplied strings. Generally
   *		  (1, 1, 1) penalty costs suffices common
   *		  cases, but your mileage may vary.
+  *		  Returns accurate value if max_d < 0 or
+  *		  actual distance is less than or equal to
+  *		  max_d; otherwise returns a value greater
+  *		  than max_d.
   */
  static int
  levenshtein_internal(text *s, text *t,
! 	 int ins_c, int del_c, int sub_c, int max_d)
  {
  	int			m,
  n,
***
*** 219,224  levenshtein_internal(text *s, text *t,
--- 225,233 
  	const char *s_data;
  	const char *t_data;
  	const char *y;
+ 	const char *prev_x = NULL;
+ 	int			curr_left = 0, curr_right = 0, prev_left, prev_right, d;
+ 	int			delta = 0, min_d = 0, theor_max_d;
  
  	/* Extract a pointer to the actual character data. */
  	s_data = VARDATA_ANY(s);
***
*** 238,243  levenshtein_internal(text *s, text *t,
--- 247,266 
  		return n * ins_c;
  	if (!n)
  		return m * del_c;
+ 	
+ 	/*
+ 	 * There is a theoretical maximum distance based on the string lengths. It
+ 	 * represents the case when no characters match. If max_d
+ 	 * reaches this value then we can use the accurate calculation of
+ 	 * distance, which is faster in this case.
+ 	 */
+ 	if (max_d >= 0)
+ 	{
+ 		theor_max_d = Min(m*del_c + n*ins_c, (m > n) ?
+ 			(n * sub_c + (m - n) * del_c):(m * sub_c + (n - m) * ins_c)) - 1;
+ 		if (max_d >= theor_max_d)
+ 			max_d = -1;
+ 	}
  
  	/*
  	 * For security concerns, restrict excessive CPU+RAM usage. (This
***
*** 251,256  levenshtein_internal(text *s, text *t,
--- 274,296 
  		MAX_LEVENSHTEIN_STRLEN)));
  
  	/*
+ 	 * We can find the minimal distance by the difference of string lengths.
+ 	 */
+ 	if (max_d >= 0)
+ 	{
+ 		delta = m - n;
+ 		if (delta > 0)
+ 			min_d = delta * del_c;
+ 		else if (delta < 0)
+ 			min_d = -delta * ins_c;
+ 		else
+ 			min_d = 0;
+ 
+ 		if (min_d > max_d)
+ 			return max_d + 1;
+ 	}
+ 
+ 	/*
  	 * In order to avoid calling pg_mblen() repeatedly on each character in s,
  	 * we cache all the lengths before starting the main loop -- but if all the
  	 * characters in both strings are single byte, then we skip this and use
***
*** 286,369  levenshtein_internal(text *s, text *t,
  	curr = prev + m;
  
  	/* Initialize the previous row to 0..cols */
! 	for (i = 0; i < m; i++)
! 		prev[i] = i * del_c;
  
  	/* Loop through rows of the notional array */
	for (y = t_data, j = 1; j < n; j++)
  	{
  		int		   *temp;
- 		const char *x = s_data;
  		int			y_char_len = n != t_bytes + 1 ? pg_mblen(y) : 1;
  
  		/*
! 		 * First cell must increment sequentially, as we're on the j'th row of
! 		 * the (m+1)x(n+1) array.
! 		 */
! 		curr[0] = j * ins_c;
! 
! 		/*
! 		 * This inner loop is critical to performance, so we include a
  		 * fast-path to handle the 

Re: [HACKERS] multibyte charater set in levenshtein function

2010-08-28 Thread Alexander Korotkov
SELECT SUM(levenshtein(a, 'foo')) from words;
SELECT SUM(levenshtein(a, 'Urbański')) FROM words;
SELECT SUM(levenshtein(a, 'ańs')) FROM words;
SELECT SUM(levenshtein(a, 'foo')) from words2;
SELECT SUM(levenshtein(a, 'дом')) FROM words2;
SELECT SUM(levenshtein(a, 'компьютер')) FROM words2;

Before this patch the results were:
67 98 63 102 108 177
And after the patch:
65 100 65 104 111 182
It is a bit slower, but I think that is the price for having less code
duplication.

Now test for levenshtein_less_equal performance.

test=# SELECT * FROM words WHERE levenshtein(a, 'extensize') <= 2;
 a
---
 expensive
 extension
 extensive
(3 rows)

Time: 98,083 ms

test=# SELECT * FROM words WHERE levenshtein_less_equal(a, 'extensize', 2)
<= 2;
 a
---
 expensive
 extension
 extensive
(3 rows)

Time: 48,069 ms

test=# SELECT * FROM words2 WHERE levenshtein(a, 'клубничный') <= 2;
 a

 кузничный
 кубичный
 клубочный
 клубничный
(4 rows)

Time: 197,499 ms

test=# SELECT * FROM words2 WHERE levenshtein_less_equal(a, 'клубничный', 3)
<= 2;
 a

 кузничный
 кубичный
 клубочный
 клубничный
(4 rows)

Time: 100,073 ms

It gives some speedup for searches on words with small distances.

test=# SELECT sum(levenshtein(a, 'extensize')) from words;
  sum

 835545
(1 row)

Time: 96,253 ms

test=# SELECT sum(levenshtein_less_equal(a, 'extensize', 20)) from words;
  sum

 835545
(1 row)

Time: 98,438 ms

With big distances it gives similar performance, because in this case a
similar branch of code is used.
In order to test it with longer words I created a table with random
combinations of words.

test=# create table phrases (a text primary key);
test=# insert into phrases select string_agg(y.a, ' ') from (select x.a,
row_number() over () from (select a from words order by random()) x) y group
by (y.row_number / 10);

test=# select * from phrases limit 10;

a
--
 hosiery's spermatozoa Haleakala disco greenness beehive paranoids atrophy
unfair
 Weinberg's all profanity's individualized bellowed perishables Romanov's
communicant's landlady's pauperized
 Ito seasoned jawbreakers you'll hurling's showcasing anecdota cauliflower
intellectualism facilitate
 infantryman's busheled designing lucid nutritionists Aesculapius's rifle
clefting exult Milosevic
 foresight lurking capitulation's pantsuits traumatizes moonlighting
lancer's Longfellow rising unclasps
 permutes centralest cavalryman Dwight granddaughter knight tallyho's sober
graduate locket's
 knucklehead courtliest sapphires baize coniferous emolument antarctic
Laocoon's deadens unseemly
 Tartars utopians Pekingese's Cartwright badge's Dollie's spryness Ajax's
Amber's bookkeeper
 spouting concordant Indus carbonation cinnamon's stockbrokers Evita's
Canaries Waldorf's rampage
 Amparo exertions Kuznetsk's divots humongous wolverine chugged laurels
Goliaths hazelnut
(10 rows)

test=# select * from phrases where levenshtein('nucklehead courtliest
sapphires be coniferous emolument antarctic Laocoon''s deadens unseemly', a)
<= 10;

a
--
 knucklehead courtliest sapphires baize coniferous emolument antarctic
Laocoon's deadens unseemly
(1 row)

Time: 581,850 ms

test=# select * from phrases where levenshtein_less_equal('nucklehead
courtliest   sapphires be coniferous emolument antarctic Laocoon''s deadens
unseemly', a, 10) <= 10;

a
--
 knucklehead courtliest sapphires baize coniferous emolument antarctic
Laocoon's deadens unseemly
(1 row)

Time: 22,876 ms

It gives great speedup for search with small distances on relatively long
strings.
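To make the early-exit idea concrete, here is a stand-alone sketch of a bounded Levenshtein distance (single-byte characters and unit costs only -- the real patch also handles multibyte text and configurable costs; lev_less_equal is an illustrative name, not the contrib function). It uses the two bounds discussed above: the length difference as a lower bound, and a per-row minimum that, once it exceeds max_d, proves the final distance must too:

```c
#include <string.h>

#define MAX_LEN 255		/* mirrors the spirit of MAX_LEVENSHTEIN_STRLEN */

/* Returns the exact distance if it is <= max_d, otherwise max_d + 1. */
static int lev_less_equal(const char *s, const char *t, int max_d)
{
	int m = (int) strlen(s);
	int n = (int) strlen(t);
	int prev[MAX_LEN + 1], curr[MAX_LEN + 1];
	int i, j;

	if (m > MAX_LEN || n > MAX_LEN)
		return -1;		/* out of scope for this sketch */

	/* Lower bound: the length difference alone costs that many edits. */
	if (m - n > max_d || n - m > max_d)
		return max_d + 1;

	for (i = 0; i <= m; i++)
		prev[i] = i;

	for (j = 1; j <= n; j++)
	{
		int row_min;

		curr[0] = j;
		row_min = curr[0];
		for (i = 1; i <= m; i++)
		{
			int del = prev[i] + 1;
			int ins = curr[i - 1] + 1;
			int sub = prev[i - 1] + (s[i - 1] != t[j - 1]);
			int best = del < ins ? del : ins;

			if (sub < best)
				best = sub;
			curr[i] = best;
			if (best < row_min)
				row_min = best;
		}
		/* Every cell already exceeds max_d, so the final distance must
		 * too: give up early instead of finishing the matrix. */
		if (row_min > max_d)
			return max_d + 1;
		memcpy(prev, curr, sizeof(int) * (m + 1));
	}
	return prev[m];
}
```

This is why a tight bound helps long strings so much: most rows of the matrix are never computed once the running minimum crosses the bound.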

test=# select sum(levenshtein('nucklehead courtliest   sapphires be
coniferous emolument antarctic Laocoon''s deadens unseemly', a)) from
phrases;
  sum

 794601
(1 row)

Time: 571,138 ms

test=# select sum(levenshtein_less_equal('nucklehead courtliest
sapphires be coniferous emolument antarctic Laocoon''s deadens unseemly', a,
100)) from phrases;
  sum

 794601
(1 row)

Time: 562,895 ms

Similar results appear for multi-byte strings.

test=# create table phrases2 (a text primary key);
test=# insert into phrases2 select string_agg(y.a, ' ') from (select x.a,
row_number() over () from (select a from words2 order by random()) x) y
group by (y.row_number / 10);
INSERT 0 14584


test=# select * from phrases2 limit 10;

a

---
 борис спиннинг растопочный цифра выдохнуть иридий гнёздышко омоновский
базовый
 пожжёте закосить насыщающий паратый продрых обеспложивание милливатт
бамбуковый отпекающий книжница
 приложиться разный устремляющийся отучающийся абрикосовка 

Re: Interruptible sleeps (was Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)

2010-08-28 Thread Greg Smith

Tom Lane wrote:

 Greg Smith g...@2ndquadrant.com writes:
  ...  The only clear case where this is
  always a win is when the system is totally idle.

 If you'll climb down off that horse for a moment: yeah, the idle case is
 *exactly* what they're complaining about.


I wasn't on a horse here--I saw the specific case they must be most 
annoyed with.  If it's possible to turn this around into a push model 
instead of how it works now, without tanking performance (and maybe 
even power use!) for the worst or even average case, that refactoring 
could end up outperforming the current model.  I'd like the background 
writer to never go to sleep at all, really, if enough work comes in to 
keep it always busy, and I think that might fall out nicely as a result 
of aiming for the other end--making it sleep deeper when it's not needed.


Just pointing out the very specific place where I suspect that will go 
wrong if it's not done carefully is the fsync absorption, because it's 
already broken enough that we're reworking it here.  PostgreSQL already 
ships a bad enough default configuration without deciding that yet another spot 
should be yielded to somebody else's priorities, unless that actually 
meets performance goals.  I think I can quantify what those should be 
with a test case for this part.



 I see this as just another facet of the argument about whether it's okay
 to have default settings that try to take over the entire machine.


Mostly agreed here.  So long as the result is automatic enough to not 
introduce yet another GUC set to the wrong thing for busy systems by 
default, this could turn out great.  The list of things you must change 
to get the database to work right for serious work is already far too 
long, and I dread the thought of putting yet another on there just for 
the sake of lower power use.  Another part of the plan for world 
domination is that Oracle DBAs install the database for a test and say 
"wow, that wasn't nearly as complicated as I'm used to and it performs 
well, that was nice."  That those people matter too is all I'm saying.  
I could easily file a bug from their perspective saying "Background 
writer is a lazy SOB in default config" that would be no less valid than 
the one being discussed here.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] [GENERAL] 3rd time is a charm.....right sibling is not next child crash.

2010-08-28 Thread Tom Lane
I wrote:
 it did occur to me that there is a simple way to ameliorate this
 problem: we could rearrange the code in _bt_pagedel() so it checks for
 this case before entering its critical section.  Then, corruption of
 this kind is at least only an ERROR not a PANIC.

In particular, I propose the attached patch, which gets rid of
unnecessary PANIC cases in _bt_split as well as _bt_pagedel;
all of these get reported from the field every now and then.
In passing this also makes the error messages out of _bt_split
a bit more detailed.

FWIW, I think the original rationale for PANIC here was so we could
capture a core dump for study; but since no one has ever yet cooperated
by providing such a dump, it seems like not panicking is a better plan.

Barring objections, I plan to back-patch this as far as it will
conveniently go (looks like 8.2 or 8.3 or so).

regards, tom lane

Index: src/backend/access/nbtree/nbtinsert.c
===
RCS file: /cvsroot/pgsql/src/backend/access/nbtree/nbtinsert.c,v
retrieving revision 1.178
diff -c -r1.178 nbtinsert.c
*** src/backend/access/nbtree/nbtinsert.c	28 Mar 2010 09:27:01 -	1.178
--- src/backend/access/nbtree/nbtinsert.c	28 Aug 2010 23:31:44 -
***
*** 74,82 
  static void _bt_checksplitloc(FindSplitData *state,
    OffsetNumber firstoldonright, bool newitemonleft,
    int dataitemstoleft, Size firstoldonrightsz);
! static void _bt_pgaddtup(Relation rel, Page page,
! 			 Size itemsize, IndexTuple itup,
! 			 OffsetNumber itup_off, const char *where);
  static bool _bt_isequal(TupleDesc itupdesc, Page page, OffsetNumber offnum,
  			int keysz, ScanKey scankey);
  static void _bt_vacuum_one_page(Relation rel, Buffer buffer, Relation heapRel);
--- 74,81 
  static void _bt_checksplitloc(FindSplitData *state,
    OffsetNumber firstoldonright, bool newitemonleft,
    int dataitemstoleft, Size firstoldonrightsz);
! static bool _bt_pgaddtup(Page page, Size itemsize, IndexTuple itup,
! 			 OffsetNumber itup_off);
  static bool _bt_isequal(TupleDesc itupdesc, Page page, OffsetNumber offnum,
  			int keysz, ScanKey scankey);
  static void _bt_vacuum_one_page(Relation rel, Buffer buffer, Relation heapRel);
***
*** 753,759 
  		/* Do the update.  No ereport(ERROR) until changes are logged */
  		START_CRIT_SECTION();
  
! 		_bt_pgaddtup(rel, page, itemsz, itup, newitemoff, "page");
  
  		MarkBufferDirty(buf);
  
--- 752,760 
  		/* Do the update.  No ereport(ERROR) until changes are logged */
  		START_CRIT_SECTION();
  
! 		if (!_bt_pgaddtup(page, itemsz, itup, newitemoff))
! 			elog(PANIC, "failed to add new item to block %u in index \"%s\"",
! 				 itup_blkno, RelationGetRelationName(rel));
  
  		MarkBufferDirty(buf);
  
***
*** 879,884 
--- 880,887 
  	Page		origpage;
  	Page		leftpage,
  rightpage;
+ 	BlockNumber origpagenumber,
+ rightpagenumber;
  	BTPageOpaque ropaque,
  lopaque,
  oopaque;
***
*** 894,904 
--- 897,923 
  	OffsetNumber i;
  	bool		isroot;
  
+ 	/* Acquire a new page to split into */
  	rbuf = _bt_getbuf(rel, P_NEW, BT_WRITE);
+ 
+ 	/*
+ 	 * origpage is the original page to be split.  leftpage is a temporary
+ 	 * buffer that receives the left-sibling data, which will be copied back
+ 	 * into origpage on success.  rightpage is the new page that receives
+ 	 * the right-sibling data.  If we fail before reaching the critical
+ 	 * section, origpage hasn't been modified and leftpage is only workspace.
+ 	 * In principle we shouldn't need to worry about rightpage either,
+ 	 * because it hasn't been linked into the btree page structure; but to
+ 	 * avoid leaving possibly-confusing junk behind, we are careful to reinit
+ 	 * rightpage to empty before throwing any error.
+ 	 */
  	origpage = BufferGetPage(buf);
  	leftpage = PageGetTempPage(origpage);
  	rightpage = BufferGetPage(rbuf);
  
+ 	origpagenumber = BufferGetBlockNumber(buf);
+ 	rightpagenumber = BufferGetBlockNumber(rbuf);
+ 
  	_bt_pageinit(leftpage, BufferGetPageSize(buf));
  	/* rightpage was already initialized by _bt_getbuf */
  
***
*** 923,930 
  	lopaque->btpo_flags &= ~(BTP_ROOT | BTP_SPLIT_END | BTP_HAS_GARBAGE);
  	ropaque->btpo_flags = lopaque->btpo_flags;
  	lopaque->btpo_prev = oopaque->btpo_prev;
! 	lopaque->btpo_next = BufferGetBlockNumber(rbuf);
! 	ropaque->btpo_prev = BufferGetBlockNumber(buf);
  	ropaque->btpo_next = oopaque->btpo_next;
  	lopaque->btpo.level = ropaque->btpo.level = oopaque->btpo.level;
  	/* Since we already have write-lock on both pages, ok to read cycleid */
--- 942,949 
  	lopaque->btpo_flags &= ~(BTP_ROOT | BTP_SPLIT_END | BTP_HAS_GARBAGE);
  	ropaque->btpo_flags = lopaque->btpo_flags;
  	lopaque->btpo_prev = oopaque->btpo_prev;
! 	lopaque->btpo_next = rightpagenumber;
! 	ropaque->btpo_prev = origpagenumber;
  	ropaque->btpo_next = oopaque->btpo_next;
  	lopaque->btpo.level =