John Naylor <john.nay...@2ndquadrant.com> wrote on Mon, Jul 29, 2019 at 11:49 AM:

> On Thu, Jul 25, 2019 at 10:21 PM Binguo Bao <djydew...@gmail.com> wrote:
>
> My goal for this stage of review was to understand more fully what the
> code is doing, and make it as simple and clear as possible, starting
> at the top level. In doing so, it looks like I found some additional
> performance gains. I haven't looked much yet at the TOAST fetching
> logic.
>
>
> 1). For every needle comparison, text_position_next_internal()
> calculates how much of the value is needed and passes that to
> detoast_iterate(), which then calculates if it has to do something or
> not. This is a bit hard to follow. There might also be a performance
> penalty -- the following is just a theory, but it sounds plausible:
> The CPU can probably correctly predict that detoast_iterate() will
> usually return the same value it did last time, but it still has to
> call the function and make sure, which I imagine is more expensive
> than advancing the needle. Ideally, we want to call the iterator only
> if we have to.
>
> In the attached patch (applies on top of your v5),
> text_position_next_internal() simply compares hptr to the detoast
> buffer limit, and calls detoast_iterate() until it can proceed. I
> think this is clearer.


Yes, I think this is a general scenario in which the caller keeps calling
detoast_iterate() until it has enough data, so I think this operation can
be extracted into a macro, as I did in patch v6. In the macro,
detoast_iterate() is called only when the data requested by the caller lies
at or beyond the buffer limit.
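
The pattern behind the macro can be sketched outside PostgreSQL as follows.
This is only a minimal illustration under stated assumptions: MockIter,
mock_iterate(), mock_ensure() and mock_read() are hypothetical stand-ins and
not part of the patch, and the 4-byte chunk size is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the patch's DetoastIterator; not the real struct. */
typedef struct MockIter
{
	char		buf[16];		/* destination buffer */
	const char *src;			/* pretend on-disk data */
	size_t		limit;			/* number of bytes produced so far */
	size_t		capacity;		/* total number of bytes to produce */
	int			calls;			/* how many times a chunk was fetched */
	bool		done;
} MockIter;

/* Produce the next chunk (at most 4 bytes, an arbitrary chunk size). */
static void
mock_iterate(MockIter *it)
{
	size_t		n = it->capacity - it->limit;

	if (n > 4)
		n = 4;
	memcpy(it->buf + it->limit, it->src + it->limit, n);
	it->limit += n;
	it->calls++;
	if (it->limit == it->capacity)
		it->done = true;
}

/* Fetch only while the byte at offset "need" has not been produced yet. */
static void
mock_ensure(MockIter *it, size_t need)
{
	assert(need < it->capacity);
	while (!it->done && need >= it->limit)
		mock_iterate(it);
}

/* Read one byte, pulling more data only if required. */
static char
mock_read(MockIter *it, size_t pos)
{
	mock_ensure(it, pos);
	return it->buf[pos];
}
```

Reading offsets that are already below the limit costs only the comparison;
the fetch routine runs again only when the reader crosses the limit.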

> (I'm not sure of the error handling, see #2.)
>
> In this scheme, the only reason to know length is to pass to
> pglz_decompress_iterate() in the case of in-line compression. As I
> alluded to in my first review, I don't think it's worth the complexity
> to handle that iteratively since the value is only a few kB. I made it
> so in-line datums are fully decompressed as in HEAD and removed struct
> members to match.


Sounds good. This simplifies the structure and logic of the detoast iterator
without any major impact on efficiency.


> I also noticed that no one updates or looks at
> "toast_iter.done" so I removed that as well.
>

toast_iter.done is now updated when the buffer limit reaches the buffer
capacity, so I added it back.


> Now pglz_decompress_iterate() doesn't need length at all. For testing
> I just set decompress_all = true and let the compiler optimize away
> the rest. I left finishing it for you if you agree with these changes.
>

Done.


> 2). detoast_iterate() and fetch_datum_iterate() return a value but we
> don't check it or do anything with it. Should we do something with it?
> It's also not yet clear if we should check the iterator state instead
> of return values. I've added some XXX comments as a reminder. We
> should also check the return value of pglz_decompress_iterate().
>

IMO, we need to provide users with a simple iterative interface.
Comparing the required data pointer with the buffer limit is an easy way
to do that, and the iterator's application scenarios are mostly read
operations. So I think there is no need to return a value; instead, the
iterator should throw an exception for wrong calls, such as when all the
data has been iterated but the user still calls the iterator.


>
> 3). Speaking of pglz_decompress_iterate(), I diff'd it with
> pglz_decompress(), and I have some questions on it:
>
> a).
> + srcend = (const unsigned char *) (source->limit == source->capacity
> ? source->limit : (source->limit - 4));
>
> What does the 4 here mean in this expression?


Since we fetch chunks one by one, if we made srcend equal to the source
buffer limit, then in the loop "while (sp < srcend && dp < destend)" sp could
exceed the source buffer limit and read unallocated bytes. Leaving a
four-byte margin prevents sp from exceeding the source buffer limit.
Once we have read all the chunks, we no longer need to guard against
crossing the boundary, so we just make srcend equal to the source buffer
limit. I've added comments explaining this in patch v6.
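
The arithmetic can be sketched as below. This is a rough model rather than
the patch code: it assumes, conservatively, that one control byte plus one
worst-case match tag (two tag bytes and one length-extension byte) may be
read after a single bounds check. The helper names, the offset-based
interface and the guard for very small limits are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CTRL_BYTES		1		/* one control byte */
#define MAX_ITEM_BYTES	3		/* 2-byte match tag + 1 length-extension byte */
#define SAFETY_MARGIN	4		/* the "4" in the srcend expression */

/* Offset at which decoding must stop for now, mirroring the srcend expression. */
static size_t
safe_srcend(size_t limit, size_t capacity)
{
	if (limit == capacity)
		return limit;			/* everything is fetched: decode to the end */
	/* Guard for tiny limits; this guard is an addition of the sketch. */
	return (limit > SAFETY_MARGIN) ? limit - SAFETY_MARGIN : 0;
}

/* End offset (exclusive) of the bytes read if a worst-case step starts at sp. */
static size_t
worst_case_read_end(size_t sp)
{
	return sp + CTRL_BYTES + MAX_ITEM_BYTES;
}

/* True if no start position below srcend can read past the limit. */
static bool
margin_is_safe(size_t limit, size_t capacity)
{
	size_t		srcend = safe_srcend(limit, capacity);
	size_t		sp;

	assert(limit <= capacity);
	for (sp = 0; sp < srcend; sp++)
	{
		if (worst_case_read_end(sp) > limit)
			return false;
	}
	return true;
}
```

With 100 of 200 bytes fetched, decoding stops at offset 96, so even a
worst-case step that starts at offset 95 ends at offset 99, inside the
fetched data.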



> Is it possible it's
> compensating for this bit in init_toast_buffer()?
>
> + buf->limit = VARDATA(buf->buf);
>
> It seems the initial limit should also depend on whether the datum is
> compressed, right? Can we just do this:
>
> + buf->limit = buf->position;
>

I'm afraid not. buf->position points to the data portion of the buffer, but
the beginning of the chunks we read may contain header information. For
example, for compressed data chunks, the first four bytes record the size of
the raw data, which means that the limit starts four bytes before the
position. This initialization doesn't cause errors, although the position is
less than the limit in other cases, because we always fetch chunks first and
then decompress them.
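
A small model of the starting offsets, using plain offsets instead of the
varlena macros. The two 4-byte sizes are assumptions about the varlena
layout, and SketchBuffer with its helpers is hypothetical, not the patch's
ToastBuffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HDR_BYTES		4		/* assumed size of the varlena length word */
#define RAWSIZE_BYTES	4		/* assumed size of the raw-size field */

/* Hypothetical offset-based model of the buffer pointers. */
typedef struct SketchBuffer
{
	size_t		position;		/* offset of the next byte to consume */
	size_t		limit;			/* offset of the next byte to produce */
	size_t		capacity;		/* offset of the end of the buffer */
} SketchBuffer;

/* Fetched bytes always land right after the length word. */
static void
sketch_init(SketchBuffer *b, size_t size, bool compressed)
{
	b->limit = HDR_BYTES;
	b->position = compressed ? HDR_BYTES + RAWSIZE_BYTES : HDR_BYTES;
	b->capacity = size;
}

/* Account for one fetched chunk of the given size. */
static void
sketch_fetch(SketchBuffer *b, size_t chunk)
{
	assert(b->limit + chunk <= b->capacity);
	b->limit += chunk;
}

/* The ordering the buffer is meant to satisfy once data has arrived. */
static bool
sketch_invariant_holds(const SketchBuffer *b)
{
	return b->position <= b->limit && b->limit <= b->capacity;
}
```

In this model the compressed case starts with the position beyond the limit,
and the ordering is restored as soon as the first chunk, which carries the
raw-size field, has been fetched.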


> b).
> - while (sp < srcend && dp < destend)
> ...
> + while (sp + 1 < srcend && dp < destend &&
> ...
>
> Why is it here "sp + 1"?
>

Ignore it. In patch v6 I set the inactive state of detoast_iter->ctrlc to 8
so that ctrl is parsed correctly every time.
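
The idea of keeping the control byte and its bit counter between calls can
be shown with a toy decoder. The format here is deliberately simplified and
is not pglz: every item consumes one input byte, and a set control bit means
"emit the byte twice". ToyState and toy_decode() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Saved decoder state; ctrlc == 8 means no control byte is pending. */
typedef struct ToyState
{
	unsigned char ctrl;			/* remaining bits of the current control byte */
	int			ctrlc;			/* how many of its 8 bits are already used */
} ToyState;

/*
 * Decode src[*sp .. srcend) into dst starting at offset dp and return the
 * new output offset.  The caller may stop and resume at any input offset.
 */
static size_t
toy_decode(const unsigned char *src, size_t *sp, size_t srcend,
		   unsigned char *dst, size_t dp, ToyState *st)
{
	assert(st->ctrlc >= 0 && st->ctrlc <= 8);

	while (*sp < srcend)
	{
		/* Read a new control byte only if the previous one is used up. */
		if (st->ctrlc == 8)
		{
			st->ctrl = src[(*sp)++];
			st->ctrlc = 0;
		}

		for (; st->ctrlc < 8 && *sp < srcend; st->ctrlc++)
		{
			unsigned char c = src[(*sp)++];

			dst[dp++] = c;
			if (st->ctrl & 1)
				dst[dp++] = c;	/* toy rule: a set bit doubles the byte */
			st->ctrl >>= 1;		/* advance to the next control bit */
		}
	}
	return dp;
}
```

Starting with ctrlc = 8 forces the first call to read a control byte, and a
call that stops in the middle of a control byte leaves ctrlc below 8, so the
next call continues with the saved bits instead of misreading a data byte as
a control byte.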


>
> 4. Note that varlena.c has a static state variable, and a cleanup
> function that currently does:
>
> static void
> text_position_cleanup(TextPositionState *state)
> {
> /* no cleanup needed */
> }
>
> It seems to me the detoast iterator could be embedded in this state
> variable, and then free-ing can happen here. That has a possible
> advantage that the iterator struct would be on the same cache line as
> the state data. That would also remove the need to pass "iter" as a
> parameter, since these functions already pass "state". I'm not sure if
> this would be good for other users of the iterator, so maybe we can
> hold off on that for now.
>

Good idea. I've implemented it in patch v6.


> 5. Would it be a good idea to add tests (not always practical), or
> more Assert()'s? You probably already know this, but as a reminder
> it's good to develop with asserts enabled, but never build with them
> for performance testing.
>

I've added more Assert()'s to check iterator state.


>
> I think that's enough for now. If you have any questions or
> counter-arguments, let me know. I've set the commitfest entry to
> waiting on author.
>
>
> --
> John Naylor                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

BTW, I found that iterators come in handy for json/jsonb operations that
find a field value or get array elements.
I will continue to optimize json/jsonb queries based on the detoast
iterator patch.

-- 
Best regards,
Binguo Bao
From f59bd661d49d9c84941616280c8e4c66d53b570f Mon Sep 17 00:00:00 2001
From: BBG <djydew...@gmail.com>
Date: Tue, 4 Jun 2019 22:56:42 +0800
Subject: [PATCH] de-TOASTing using a iterator

---
 src/backend/access/heap/tuptoaster.c | 447 +++++++++++++++++++++++++++++++++++
 src/backend/utils/adt/varlena.c      |  48 +++-
 src/include/access/tuptoaster.h      |  90 +++++++
 src/include/fmgr.h                   |   7 +
 4 files changed, 580 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 55d6e91..2f77f11 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -83,6 +83,13 @@ static int	toast_open_indexes(Relation toastrel,
 static void toast_close_indexes(Relation *toastidxs, int num_indexes,
 								LOCKMODE lock);
 static void init_toast_snapshot(Snapshot toast_snapshot);
+static FetchDatumIterator create_fetch_datum_iterator(struct varlena *attr);
+static bool free_fetch_datum_iterator(FetchDatumIterator iter);
+static void fetch_datum_iterate(FetchDatumIterator iter);
+static void init_toast_buffer(ToastBuffer *buf, int32 size, bool compressed);
+static bool free_toast_buffer(ToastBuffer *buf);
+static void pglz_decompress_iterate(ToastBuffer *source, ToastBuffer *dest,
+									DetoastIterator iter);
 
 
 /* ----------
@@ -347,6 +354,115 @@ heap_tuple_untoast_attr_slice(struct varlena *attr,
 
 
 /* ----------
+ * create_detoast_iterator -
+ *
+ * Initialize detoast iterator.
+ * ----------
+ */
+DetoastIterator create_detoast_iterator(struct varlena *attr) {
+	struct varatt_external toast_pointer;
+	DetoastIterator iterator = NULL;
+	if (VARATT_IS_EXTERNAL_ONDISK(attr))
+	{
+		/*
+		 * This is an externally stored datum --- create fetch datum iterator
+		 */
+		iterator = (DetoastIterator) palloc0(sizeof(DetoastIteratorData));
+		iterator->fetch_datum_iterator = create_fetch_datum_iterator(attr);
+		VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+		if (VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer))
+		{
+			/* If it's compressed, prepare buffer for raw data */
+			iterator->buf = (ToastBuffer *) palloc0(sizeof(ToastBuffer));
+			init_toast_buffer(iterator->buf, toast_pointer.va_rawsize, false);
+			iterator->ctrlc = 8;
+			iterator->compressed = true;
+		}
+		else
+		{
+			iterator->buf = iterator->fetch_datum_iterator->buf;
+			iterator->ctrlc = 8;
+			iterator->compressed = false;
+		}
+	}
+	else if (VARATT_IS_EXTERNAL_INDIRECT(attr))
+	{
+		/*
+		 * This is an indirect pointer --- dereference it
+		 */
+		struct varatt_indirect redirect;
+
+		VARATT_EXTERNAL_GET_POINTER(redirect, attr);
+		attr = (struct varlena *) redirect.pointer;
+
+		/* nested indirect Datums aren't allowed */
+		Assert(!VARATT_IS_EXTERNAL_INDIRECT(attr));
+
+		/* recurse in case value is still extended in some other way */
+		iterator = create_detoast_iterator(attr);
+
+	}
+	else if (VARATT_IS_COMPRESSED(attr))
+	{
+		/*
+		 * This is a compressed value inside of the main tuple
+		 * Skip the iterator and just decompress the whole thing.
+		 */
+		return NULL;
+	}
+
+	return iterator;
+}
+
+
+/* ----------
+ * free_detoast_iterator -
+ *
+ * Free the memory space occupied by the de-Toast iterator.
+ * ----------
+ */
+bool free_detoast_iterator(DetoastIterator iter) {
+	if (iter == NULL)
+	{
+		return false;
+	}
+	if (iter->buf != iter->fetch_datum_iterator->buf)
+	{
+		free_toast_buffer(iter->buf);
+	}
+	free_fetch_datum_iterator(iter->fetch_datum_iterator);
+	pfree(iter);
+	return true;
+}
+
+
+/* ----------
+ * detoast_iterate -
+ *
+ * Iterate through the toasted value referenced by iterator.
+ *
+ * As long as there is another slice in compression or external storage,
+ * detoast it into toast buffer in iterator.
+ * ----------
+ */
+extern void detoast_iterate(DetoastIterator iter)
+{
+	FetchDatumIterator fetch_iter;
+
+	Assert(iter != NULL && !iter->done);
+	fetch_iter = iter->fetch_datum_iterator;
+	fetch_datum_iterate(fetch_iter);
+
+	if (iter->compressed)
+		pglz_decompress_iterate(fetch_iter->buf, iter->buf, iter);
+
+	if (iter->buf->limit == iter->buf->capacity) {
+		iter->done = true;
+	}
+}
+
+
+/* ----------
  * toast_raw_datum_size -
  *
  *	Return the raw (detoasted) size of a varlena datum
@@ -2409,3 +2525,334 @@ init_toast_snapshot(Snapshot toast_snapshot)
 
 	InitToastSnapshot(*toast_snapshot, snapshot->lsn, snapshot->whenTaken);
 }
+
+
+/* ----------
+ * create_fetch_datum_iterator -
+ *
+ * Initialize fetch datum iterator.
+ * ----------
+ */
+static FetchDatumIterator
+create_fetch_datum_iterator(struct varlena *attr) {
+	int			validIndex;
+	FetchDatumIterator iterator;
+
+	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+		elog(ERROR, "create_fetch_datum_iterator shouldn't be called for non-ondisk datums");
+
+	iterator = (FetchDatumIterator) palloc0(sizeof(FetchDatumIteratorData));
+
+	/* Must copy to access aligned fields */
+	VARATT_EXTERNAL_GET_POINTER(iterator->toast_pointer, attr);
+
+	iterator->ressize = iterator->toast_pointer.va_extsize;
+	iterator->numchunks = ((iterator->ressize - 1) / TOAST_MAX_CHUNK_SIZE) + 1;
+
+	/*
+	 * Open the toast relation and its indexes
+	 */
+	iterator->toastrel = table_open(iterator->toast_pointer.va_toastrelid, AccessShareLock);
+
+	/* Look for the valid index of the toast relation */
+	validIndex = toast_open_indexes(iterator->toastrel,
+									AccessShareLock,
+									&iterator->toastidxs,
+									&iterator->num_indexes);
+
+	/*
+	 * Setup a scan key to fetch from the index by va_valueid
+	 */
+	ScanKeyInit(&iterator->toastkey,
+				(AttrNumber) 1,
+				BTEqualStrategyNumber, F_OIDEQ,
+				ObjectIdGetDatum(iterator->toast_pointer.va_valueid));
+
+	/*
+	 * Read the chunks by index
+	 *
+	 * Note that because the index is actually on (valueid, chunkidx) we will
+	 * see the chunks in chunkidx order, even though we didn't explicitly ask
+	 * for it.
+	 */
+
+	init_toast_snapshot(&iterator->SnapshotToast);
+	iterator->toastscan = systable_beginscan_ordered(iterator->toastrel, iterator->toastidxs[validIndex],
+										   &iterator->SnapshotToast, 1, &iterator->toastkey);
+
+	iterator->buf = (ToastBuffer *) palloc0(sizeof(ToastBuffer));
+	init_toast_buffer(iterator->buf, iterator->ressize + VARHDRSZ, VARATT_EXTERNAL_IS_COMPRESSED(iterator->toast_pointer));
+
+	iterator->nextidx = 0;
+	iterator->done = false;
+
+	return iterator;
+}
+
+static bool
+free_fetch_datum_iterator(FetchDatumIterator iter)
+{
+	if (iter == NULL)
+	{
+		return false;
+	}
+
+	if (!iter->done)
+	{
+		systable_endscan_ordered(iter->toastscan);
+		toast_close_indexes(iter->toastidxs, iter->num_indexes, AccessShareLock);
+		table_close(iter->toastrel, AccessShareLock);
+	}
+	free_toast_buffer(iter->buf);
+	pfree(iter);
+	return true;
+}
+
+/* ----------
+ * fetch_datum_iterate -
+ *
+ * Iterate through the toasted value referenced by iterator.
+ *
+ * As long as there is another chunk data in compression or external storage,
+ * fetch it into buffer in iterator.
+ * ----------
+ */
+static void
+fetch_datum_iterate(FetchDatumIterator iter) {
+	HeapTuple	ttup;
+	TupleDesc	toasttupDesc;
+	int32		residx;
+	Pointer		chunk;
+	bool		isnull;
+	char		*chunkdata;
+	int32		chunksize;
+
+	Assert(iter != NULL && !iter->done);
+
+	ttup = systable_getnext_ordered(iter->toastscan, ForwardScanDirection);
+	if (ttup == NULL)
+	{
+		/*
+		 * Final checks that we successfully fetched the datum
+		 */
+		if (iter->nextidx != iter->numchunks)
+			elog(ERROR, "missing chunk number %d for toast value %u in %s",
+				 iter->nextidx,
+				 iter->toast_pointer.va_valueid,
+				 RelationGetRelationName(iter->toastrel));
+
+		/*
+		 * End scan and close relations
+		 */
+		systable_endscan_ordered(iter->toastscan);
+		toast_close_indexes(iter->toastidxs, iter->num_indexes, AccessShareLock);
+		table_close(iter->toastrel, AccessShareLock);
+
+		iter->done = true;
+		return;
+	}
+
+	/*
+	 * Have a chunk, extract the sequence number and the data
+	 */
+	toasttupDesc = iter->toastrel->rd_att;
+	residx = DatumGetInt32(fastgetattr(ttup, 2, toasttupDesc, &isnull));
+	Assert(!isnull);
+	chunk = DatumGetPointer(fastgetattr(ttup, 3, toasttupDesc, &isnull));
+	Assert(!isnull);
+	if (!VARATT_IS_EXTENDED(chunk))
+	{
+		chunksize = VARSIZE(chunk) - VARHDRSZ;
+		chunkdata = VARDATA(chunk);
+	}
+	else if (VARATT_IS_SHORT(chunk))
+	{
+		/* could happen due to heap_form_tuple doing its thing */
+		chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT;
+		chunkdata = VARDATA_SHORT(chunk);
+	}
+	else
+	{
+		/* should never happen */
+		elog(ERROR, "found toasted toast chunk for toast value %u in %s",
+			 iter->toast_pointer.va_valueid,
+			 RelationGetRelationName(iter->toastrel));
+		chunksize = 0;		/* keep compiler quiet */
+		chunkdata = NULL;
+	}
+
+	/*
+	 * Some checks on the data we've found
+	 */
+	if (residx != iter->nextidx)
+		elog(ERROR, "unexpected chunk number %d (expected %d) for toast value %u in %s",
+			 residx, iter->nextidx,
+			 iter->toast_pointer.va_valueid,
+			 RelationGetRelationName(iter->toastrel));
+	if (residx < iter->numchunks - 1)
+	{
+		if (chunksize != TOAST_MAX_CHUNK_SIZE)
+			elog(ERROR, "unexpected chunk size %d (expected %d) in chunk %d of %d for toast value %u in %s",
+				 chunksize, (int) TOAST_MAX_CHUNK_SIZE,
+				 residx, iter->numchunks,
+				 iter->toast_pointer.va_valueid,
+				 RelationGetRelationName(iter->toastrel));
+	}
+	else if (residx == iter->numchunks - 1)
+	{
+		if ((residx * TOAST_MAX_CHUNK_SIZE + chunksize) != iter->ressize)
+			elog(ERROR, "unexpected chunk size %d (expected %d) in final chunk %d for toast value %u in %s",
+				 chunksize,
+				 (int) (iter->ressize - residx * TOAST_MAX_CHUNK_SIZE),
+				 residx,
+				 iter->toast_pointer.va_valueid,
+				 RelationGetRelationName(iter->toastrel));
+	}
+	else
+		elog(ERROR, "unexpected chunk number %d (out of range %d..%d) for toast value %u in %s",
+			 residx,
+			 0, iter->numchunks - 1,
+			 iter->toast_pointer.va_valueid,
+			 RelationGetRelationName(iter->toastrel));
+
+	/*
+	 * Copy the data into proper place in our iterator buffer
+	 */
+	memcpy(iter->buf->limit, chunkdata, chunksize);
+	iter->buf->limit += chunksize;
+
+	iter->nextidx++;
+}
+
+
+static void
+init_toast_buffer(ToastBuffer *buf, int32 size, bool compressed) {
+	buf->buf = (const char *) palloc0(size);
+	if (compressed) {
+		SET_VARSIZE_COMPRESSED(buf->buf, size);
+		buf->position = VARDATA_4B_C(buf->buf);
+	}
+	else
+	{
+		SET_VARSIZE(buf->buf, size);
+		buf->position = VARDATA_4B(buf->buf);
+	}
+	buf->limit = VARDATA(buf->buf);
+	buf->capacity = buf->buf + size;
+	buf->buf_size = size;
+}
+
+
+static bool
+free_toast_buffer(ToastBuffer *buf)
+{
+	if (buf == NULL)
+	{
+		return false;
+	}
+
+	pfree((void *)buf->buf);
+	pfree(buf);
+
+	return true;
+}
+
+
+/* ----------
+ * pglz_decompress_iterate -
+ *
+ *		Decompresses source into dest until the source is exhausted.
+ * ----------
+ */
+static void
+pglz_decompress_iterate(ToastBuffer *source, ToastBuffer *dest, DetoastIterator iter)
+{
+	const unsigned char *sp;
+	const unsigned char *srcend;
+	unsigned char *dp;
+	unsigned char *destend;
+
+	sp = (const unsigned char *) source->position;
+	/*
+	 * Provides a four-byte buffer to prevent sp from reading unallocated bytes.
+	 */
+	srcend = (const unsigned char *)
+		(source->limit == source->capacity ? source->limit : (source->limit - 4));
+	dp = (unsigned char *) dest->limit;
+	destend = (unsigned char *) dest->capacity;
+
+	while (sp < srcend && dp < destend)
+	{
+		/*
+		 * Read one control byte and process the next 8 items (or as many as
+		 * remain in the compressed input).
+		 */
+		unsigned char ctrl;
+		int			ctrlc;
+		if (iter->ctrlc < 8) {
+			ctrl = iter->ctrl;
+			ctrlc = iter->ctrlc;
+		}
+		else
+		{
+			ctrl = *sp++;
+			ctrlc = 0;
+		}
+
+
+		for (; ctrlc < 8 && sp < srcend && dp < destend; ctrlc++)
+		{
+
+			if (ctrl & 1)
+			{
+				/*
+				 * Otherwise it contains the match length minus 3 and the
+				 * upper 4 bits of the offset. The next following byte
+				 * contains the lower 8 bits of the offset. If the length is
+				 * coded as 18, another extension tag byte tells how much
+				 * longer the match really was (0-255).
+				 */
+				int32		len;
+				int32		off;
+
+				len = (sp[0] & 0x0f) + 3;
+				off = ((sp[0] & 0xf0) << 4) | sp[1];
+				sp += 2;
+				if (len == 18)
+					len += *sp++;
+
+				/*
+				 * Now we copy the bytes specified by the tag from OUTPUT to
+				 * OUTPUT. It is dangerous and platform dependent to use
+				 * memcpy() here, because the copied areas could overlap
+				 * extremely!
+				 */
+				len = Min(len, destend - dp);
+				while (len--)
+				{
+					*dp = dp[-off];
+					dp++;
+				}
+			}
+			else
+			{
+				/*
+				 * An unset control bit means LITERAL BYTE. So we just copy
+				 * one from INPUT to OUTPUT.
+				 */
+				*dp++ = *sp++;
+			}
+
+			/*
+			 * Advance the control bit
+			 */
+			ctrl >>= 1;
+		}
+
+		iter->ctrlc = ctrlc;
+		iter->ctrl = ctrl;
+	}
+
+	source->position = (char *) sp;
+	dest->limit = (char *) dp;
+}
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 0864838..2f3c7ac 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -56,6 +56,8 @@ typedef struct
 	int			len1;			/* string lengths in bytes */
 	int			len2;
 
+	DetoastIterator iter;
+
 	/* Skip table for Boyer-Moore-Horspool search algorithm: */
 	int			skiptablemask;	/* mask for ANDing with skiptable subscripts */
 	int			skiptable[256]; /* skip distance for given mismatched char */
@@ -122,8 +124,9 @@ static text *text_substring(Datum str,
 							int32 length,
 							bool length_not_specified);
 static text *text_overlay(text *t1, text *t2, int sp, int sl);
-static int	text_position(text *t1, text *t2, Oid collid);
-static void text_position_setup(text *t1, text *t2, Oid collid, TextPositionState *state);
+static int	text_position(text *t1, text *t2, Oid collid, DetoastIterator iter);
+static void text_position_setup(text *t1, text *t2, DetoastIterator iter,
+								Oid collid, TextPositionState *state);
 static bool text_position_next(TextPositionState *state);
 static char *text_position_next_internal(char *start_ptr, TextPositionState *state);
 static char *text_position_get_match_ptr(TextPositionState *state);
@@ -1092,10 +1095,20 @@ text_overlay(text *t1, text *t2, int sp, int sl)
 Datum
 textpos(PG_FUNCTION_ARGS)
 {
-	text	   *str = PG_GETARG_TEXT_PP(0);
+	text		*str;
+	DetoastIterator iter = create_detoast_iterator((struct varlena *)(DatumGetPointer(PG_GETARG_DATUM(0))));
 	text	   *search_str = PG_GETARG_TEXT_PP(1);
 
-	PG_RETURN_INT32((int32) text_position(str, search_str, PG_GET_COLLATION()));
+	if (iter != NULL)
+	{
+		str = (text *) iter->buf->buf;
+	}
+	else
+	{
+		str = PG_GETARG_TEXT_PP(0);
+	}
+
+	PG_RETURN_INT32((int32) text_position(str, search_str, PG_GET_COLLATION(), iter));
 }
 
 /*
@@ -1113,7 +1126,7 @@ textpos(PG_FUNCTION_ARGS)
  *	functions.
  */
 static int
-text_position(text *t1, text *t2, Oid collid)
+text_position(text *t1, text *t2, Oid collid, DetoastIterator iter)
 {
 	TextPositionState state;
 	int			result;
@@ -1121,7 +1134,7 @@ text_position(text *t1, text *t2, Oid collid)
 	if (VARSIZE_ANY_EXHDR(t1) < 1 || VARSIZE_ANY_EXHDR(t2) < 1)
 		return 0;
 
-	text_position_setup(t1, t2, collid, &state);
+	text_position_setup(t1, t2, iter, collid, &state);
 	if (!text_position_next(&state))
 		result = 0;
 	else
@@ -1130,7 +1143,6 @@ text_position(text *t1, text *t2, Oid collid)
 	return result;
 }
 
-
 /*
  * text_position_setup, text_position_next, text_position_cleanup -
  *	Component steps of text_position()
@@ -1148,7 +1160,7 @@ text_position(text *t1, text *t2, Oid collid)
  */
 
 static void
-text_position_setup(text *t1, text *t2, Oid collid, TextPositionState *state)
+text_position_setup(text *t1, text *t2, DetoastIterator iter, Oid collid, TextPositionState *state)
 {
 	int			len1 = VARSIZE_ANY_EXHDR(t1);
 	int			len2 = VARSIZE_ANY_EXHDR(t2);
@@ -1196,6 +1208,7 @@ text_position_setup(text *t1, text *t2, Oid collid, TextPositionState *state)
 	state->str2 = VARDATA_ANY(t2);
 	state->len1 = len1;
 	state->len2 = len2;
+	state->iter = iter;
 	state->last_match = NULL;
 	state->refpoint = state->str1;
 	state->refpos = 0;
@@ -1358,6 +1371,10 @@ text_position_next_internal(char *start_ptr, TextPositionState *state)
 		hptr = start_ptr;
 		while (hptr < haystack_end)
 		{
+			if (state->iter != NULL) {
+				PG_DETOAST_ITERATE(state->iter, hptr);
+			}
+
 			if (*hptr == nchar)
 				return (char *) hptr;
 			hptr++;
@@ -1375,6 +1392,11 @@ text_position_next_internal(char *start_ptr, TextPositionState *state)
 			const char *nptr;
 			const char *p;
 
+			if (state->iter != NULL)
+			{
+				PG_DETOAST_ITERATE(state->iter, hptr);
+			}
+
 			nptr = needle_last;
 			p = hptr;
 			while (*nptr == *p)
@@ -1438,7 +1460,9 @@ text_position_get_match_pos(TextPositionState *state)
 static void
 text_position_cleanup(TextPositionState *state)
 {
-	/* no cleanup needed */
+	if (state->iter != NULL) {
+		free_detoast_iterator(state->iter);
+	}
 }
 
 static void
@@ -4229,7 +4253,7 @@ replace_text(PG_FUNCTION_ARGS)
 		PG_RETURN_TEXT_P(src_text);
 	}
 
-	text_position_setup(src_text, from_sub_text, PG_GET_COLLATION(), &state);
+	text_position_setup(src_text, from_sub_text, NULL, PG_GET_COLLATION(), &state);
 
 	found = text_position_next(&state);
 
@@ -4590,7 +4614,7 @@ split_text(PG_FUNCTION_ARGS)
 			PG_RETURN_TEXT_P(cstring_to_text(""));
 	}
 
-	text_position_setup(inputstring, fldsep, PG_GET_COLLATION(), &state);
+	text_position_setup(inputstring, fldsep, NULL, PG_GET_COLLATION(), &state);
 
 	/* identify bounds of first field */
 	start_ptr = VARDATA_ANY(inputstring);
@@ -4754,7 +4778,7 @@ text_to_array_internal(PG_FUNCTION_ARGS)
 													 TEXTOID, -1, false, 'i'));
 		}
 
-		text_position_setup(inputstring, fldsep, PG_GET_COLLATION(), &state);
+		text_position_setup(inputstring, fldsep, NULL, PG_GET_COLLATION(), &state);
 
 		start_ptr = VARDATA_ANY(inputstring);
 
diff --git a/src/include/access/tuptoaster.h b/src/include/access/tuptoaster.h
index f0aea24..049782f 100644
--- a/src/include/access/tuptoaster.h
+++ b/src/include/access/tuptoaster.h
@@ -17,6 +17,96 @@
 #include "storage/lockdefs.h"
 #include "utils/relcache.h"
 
+#ifndef FRONTEND
+#include "access/genam.h"
+
+/*
+ * TOAST buffer is a producer consumer buffer.
+ *
+ *    +--+--+--+--+--+--+--+--+--+--+--+--+--+
+ *    |  |  |  |  |  |  |  |  |  |  |  |  |  |
+ *    +--+--+--+--+--+--+--+--+--+--+--+--+--+
+ *    ^           ^           ^              ^
+ *   buf      position      limit         capacity
+ *
+ * buf: points to the start of the buffer.
+ * position: points to the next char to be consumed.
+ * limit: points to the next char to be produced.
+ * capacity: points to the end of the buffer.
+ *
+ * Constraints that need to be satisfied:
+ * buf <= position <= limit <= capacity
+ */
+typedef struct ToastBuffer
+{
+	const char	*buf;
+	const char	*position;
+	char		*limit;
+	const char	*capacity;
+	int32		buf_size;
+} ToastBuffer;
+
+
+typedef struct FetchDatumIteratorData
+{
+	ToastBuffer	*buf;
+	Relation	toastrel;
+	Relation	*toastidxs;
+	SysScanDesc	toastscan;
+	ScanKeyData	toastkey;
+	SnapshotData			SnapshotToast;
+	struct varatt_external	toast_pointer;
+	int32		ressize;
+	int32		nextidx;
+	int32		numchunks;
+	int			num_indexes;
+	bool		done;
+}				FetchDatumIteratorData;
+
+typedef struct FetchDatumIteratorData *FetchDatumIterator;
+
+typedef struct DetoastIteratorData
+{
+	ToastBuffer 		*buf;
+	FetchDatumIterator	fetch_datum_iterator;
+	unsigned char		ctrl;
+	int					ctrlc;
+	bool				compressed;		/* toast value is compressed? */
+	bool				done;
+}			DetoastIteratorData;
+
+typedef struct DetoastIteratorData *DetoastIterator;
+
+/* ----------
+ * create_detoast_iterator -
+ *
+ * Initialize detoast iterator.
+ * ----------
+ */
+extern DetoastIterator create_detoast_iterator(struct varlena *attr);
+
+/* ----------
+ * free_detoast_iterator -
+ *
+ * Free the memory space occupied by the de-Toast iterator.
+ * ----------
+ */
+extern bool free_detoast_iterator(DetoastIterator iter);
+
+/* ----------
+ * detoast_iterate -
+ *
+ * Iterate through the toasted value referenced by iterator.
+ *
+ * As long as there is another slice in compression or external storage,
+ * detoast it into toast buffer in iterator.
+ * ----------
+ */
+extern void detoast_iterate(DetoastIterator iter);
+
+#endif
+
+
 /*
  * This enables de-toasting of index entries.  Needed until VACUUM is
  * smart enough to rebuild indexes from scratch.
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index 3ff0999..446c880 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -239,6 +239,13 @@ extern struct varlena *pg_detoast_datum_packed(struct varlena *datum);
 #define PG_DETOAST_DATUM_SLICE(datum,f,c) \
 		pg_detoast_datum_slice((struct varlena *) DatumGetPointer(datum), \
 		(int32) (f), (int32) (c))
+#define PG_DETOAST_ITERATE(iter, need)									\
+	do {																\
+		Assert(need >= iter->buf->buf && need <= iter->buf->capacity);	\
+		while (!iter->done && need >= iter->buf->limit) { 				\
+			detoast_iterate(iter);										\
+		}																\
+	} while (0)
 /* WARNING -- unaligned pointer */
 #define PG_DETOAST_DATUM_PACKED(datum) \
 	pg_detoast_datum_packed((struct varlena *) DatumGetPointer(datum))
-- 
2.7.4
