Hello,
Attached is a basic implementation of the TABLESAMPLE clause. It is an
SQL-standard clause, and a couple of people have tried to submit it
before, so I don't think I need to explain at length what it does:
basically, it returns a "random" sample of a table using a specified
sampling method.
I implemented both SYSTEM and BERNOULLI sampling as specified by the SQL
standard. SYSTEM sampling does block-level sampling using the same
algorithm as ANALYZE; BERNOULLI scans the whole table and picks tuples
randomly.
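To make the difference between the two methods concrete, here is a
minimal standalone sketch. It is not the patch code: toy_rng, toy_seed,
toy_fract, bernoulli_sample and system_sample_blocks are hypothetical
names, the small LCG only stands in for the patch's
sampler_random_fract(), and the block selection uses the naive
one-draw-per-block form of Knuth's Algorithm S rather than the optimized
loop in BlockSampler_Next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy generator; a stand-in for sampler_random_fract(). */
typedef struct { uint64_t state; } toy_rng;

static void toy_seed(toy_rng *r, uint32_t seed)
{
    r->state = seed;
}

/* Returns a value in the open interval (0, 1). */
static double toy_fract(toy_rng *r)
{
    uint32_t v;

    r->state = r->state * 6364136223846793005ULL + 1442695040888963407ULL;
    v = (uint32_t) (r->state >> 33);    /* top 31 bits */
    return ((double) v + 1.0) / ((double) 0x7FFFFFFF + 2.0);
}

/*
 * BERNOULLI-style: visit every row and keep each one independently
 * with probability frac (0.0 .. 1.0). Returns the number of rows kept.
 */
static size_t bernoulli_sample(toy_rng *r, size_t nrows, double frac,
                               unsigned char *keep)
{
    size_t kept = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        keep[i] = (toy_fract(r) < frac);
        kept += keep[i];
    }
    return kept;
}

/*
 * SYSTEM-style: choose exactly nsample of nblocks blocks. Block t is
 * selected with probability needed/remaining; a real scan would then
 * return every row of each selected block.
 */
static size_t system_sample_blocks(toy_rng *r, size_t nblocks,
                                   size_t nsample, unsigned char *keep)
{
    size_t chosen = 0;

    if (nsample > nblocks)
        nsample = nblocks;

    for (size_t t = 0; t < nblocks; t++)
    {
        size_t remaining = nblocks - t;
        size_t needed = nsample - chosen;

        keep[t] = (toy_fract(r) * (double) remaining < (double) needed);
        chosen += keep[t];
    }
    assert(chosen == nsample);
    return chosen;
}
```

Reseeding with the same value reproduces the same selection, which is
the behaviour REPEATABLE is meant to provide.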
There is an API for sampling methods, which at the moment consists of
five functions: init, nextblock, nexttuple, end and reset. I added a
catalog that maps each sampling method to the functions implementing
this API. The grammar creates a new TableSampleRange struct that I added
for sampling. The parser then uses the catalog to load information about
the sampling method into a TableSampleClause, which is then attached to
the RTE. The planner checks whether this parameter is present in the RTE
and, if it is, creates a plan with just one path: SampleScan. SampleScan
implements the standard executor API and calls the sampling method API
as needed.
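The way SampleScan drives a sampling method can be sketched outside of
PostgreSQL roughly as follows. This is only an illustration under
stated assumptions: ToySampleMethod, toy_sample_scan and the
deliberately non-random "prefix" method are hypothetical, plain function
pointers replace the fmgr calls, and the patch's reset function is
omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_INVALID_BLOCK  UINT32_MAX
#define TOY_INVALID_OFFSET 0        /* offsets are 1-based */

/* Hypothetical callback table mirroring init/nextblock/nexttuple/end. */
typedef struct ToySampleMethod
{
    void     *(*init) (uint32_t nblocks, double percent);
    uint32_t  (*nextblock) (void *state);
    uint16_t  (*nexttuple) (void *state, uint32_t blockno,
                            uint16_t maxoffset);
    void      (*end) (void *state);
} ToySampleMethod;

/* Toy method: return every tuple from the first percent% of blocks. */
typedef struct
{
    uint32_t limit;     /* number of blocks to visit */
    uint32_t next;      /* next block to hand out */
    uint16_t lt;        /* last tuple returned from current block */
} PrefixState;

static void *prefix_init(uint32_t nblocks, double percent)
{
    PrefixState *s = malloc(sizeof(PrefixState));

    assert(s != NULL);
    s->limit = (uint32_t) (nblocks * percent / 100.0);
    s->next = 0;
    s->lt = TOY_INVALID_OFFSET;
    return s;
}

static uint32_t prefix_nextblock(void *state)
{
    PrefixState *s = state;

    if (s->next >= s->limit)
        return TOY_INVALID_BLOCK;
    return s->next++;
}

static uint16_t prefix_nexttuple(void *state, uint32_t blockno,
                                 uint16_t maxoffset)
{
    PrefixState *s = state;
    uint16_t off = (uint16_t) (s->lt + 1);

    (void) blockno;                 /* unused by this toy method */
    if (off > maxoffset)
        off = TOY_INVALID_OFFSET;   /* ask the scan for the next block */
    s->lt = off;
    return off;
}

static void prefix_end(void *state)
{
    free(state);
}

static const ToySampleMethod toy_prefix_method = {
    prefix_init, prefix_nextblock, prefix_nexttuple, prefix_end
};

/* Driver loop in the spirit of SampleNext; counts returned tuples. */
static size_t toy_sample_scan(const ToySampleMethod *m, uint32_t nblocks,
                              uint16_t tuples_per_block, double percent)
{
    void    *state = m->init(nblocks, percent);
    size_t   returned = 0;
    uint32_t blockno;

    while ((blockno = m->nextblock(state)) != TOY_INVALID_BLOCK)
    {
        /* A real scan would read and lock the buffer for blockno here. */
        while (m->nexttuple(state, blockno, tuples_per_block)
               != TOY_INVALID_OFFSET)
            returned++;
    }
    m->end(state);
    return returned;
}
```

The point of the split is that the scan node owns buffers and
visibility checks, while the method only decides which block and which
offset come next.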
It is possible to write custom sampling methods. The sampling method
parameters are not limited to a single percentage as in the standard;
they are a dynamic list of expressions, which is checked against the
definition of the init function in a similar (although much simplified)
fashion to how function calls are checked.
Notable missing parts are:
- proper costing and returned row count estimation - given the dynamic
nature of the parameters, I think we'll need to let the sampling method
do this, so there will have to be one more function in the API.
- ruleutils support (it needs a bit of code in the
get_from_clause_item function)
- docs are sparse at the moment
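As a rough idea of what a per-method row estimate could compute for the
two built-in methods, here is a small sketch. The function names are
hypothetical and no such callback exists in the patch yet; the SYSTEM
block count follows the 1 + (int) (blocks * percent / 100) formula used
in tsm_system_init, clamped to the table size as BlockSampler does when
asked for more blocks than exist.

```c
#include <assert.h>

/* BERNOULLI: every row is kept independently with the same probability. */
static double toy_bernoulli_rows(double reltuples, double percent)
{
    return reltuples * percent / 100.0;
}

/* SYSTEM: number of blocks that would be sampled. */
static unsigned toy_system_blocks(unsigned nblocks, double percent)
{
    unsigned sample = 1 + (unsigned) (nblocks * percent / 100.0);

    assert(percent >= 0.0 && percent <= 100.0);
    return (sample > nblocks) ? nblocks : sample;
}

/* SYSTEM: sampled blocks times the average number of rows per block. */
static double toy_system_rows(double reltuples, unsigned nblocks,
                              double percent)
{
    if (nblocks == 0)
        return 0.0;
    return reltuples / nblocks * toy_system_blocks(nblocks, percent);
}
```

For custom methods with arbitrary parameters no such generic formula
exists, which is why the estimate would have to come from the method
itself.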
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 01d24a5..250ae29 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -49,7 +49,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
<phrase>where <replaceable class="parameter">from_item</replaceable> can be one of:</phrase>
- [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ TABLESAMPLE <replaceable class="parameter">sampling_method</replaceable> ( <replaceable class="parameter">argument</replaceable> [, ...] ) [ REPEATABLE ( <replaceable class="parameter">seed</replaceable> ) ] ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
[ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
<replaceable class="parameter">with_query_name</replaceable> [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] )
@@ -317,6 +317,38 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
</varlistentry>
<varlistentry>
+ <term>TABLESAMPLE <replaceable class="parameter">sampling_method</replaceable> ( <replaceable class="parameter">argument</replaceable> [, ...] ) [ REPEATABLE ( <replaceable class="parameter">seed</replaceable> ) ]</term>
+ <listitem>
+ <para>
+ The <literal>TABLESAMPLE</literal> clause after
+ <replaceable class="parameter">table_name</replaceable> indicates that
+ a <replaceable class="parameter">sampling_method</replaceable> should
+ be used to retrieve a subset of the rows in the table.
+ The <replaceable class="parameter">sampling_method</replaceable> can be
+ one of:
+ <itemizedlist>
+ <listitem>
+ <para><literal>SYSTEM</literal></para>
+ </listitem>
+ <listitem>
+ <para><literal>BERNOULLI</literal></para>
+ </listitem>
+ </itemizedlist>
+ Both of these sampling methods currently accept only a single argument,
+ which is the percentage (a floating-point value from 0 to 100) of the
+ rows to be returned.
+ The <literal>SYSTEM</literal> sampling method does block-level
+ sampling, with each block having the same chance of being selected, and
+ returns all rows from each selected block.
+ The <literal>BERNOULLI</literal> sampling method scans the whole table
+ and returns individual rows with equal probability.
+ The optional numeric parameter <literal>REPEATABLE</literal> is used
+ as the random seed for sampling.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><replaceable class="parameter">alias</replaceable></term>
<listitem>
<para>
diff --git a/src/backend/access/Makefile b/src/backend/access/Makefile
index 21721b4..595737c 100644
--- a/src/backend/access/Makefile
+++ b/src/backend/access/Makefile
@@ -8,6 +8,7 @@ subdir = src/backend/access
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-SUBDIRS = brin common gin gist hash heap index nbtree rmgrdesc spgist transam
+SUBDIRS = brin common gin gist hash heap index nbtree rmgrdesc spgist \
+ transam tsm
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/tsm/Makefile b/src/backend/access/tsm/Makefile
new file mode 100644
index 0000000..73bbbd7
--- /dev/null
+++ b/src/backend/access/tsm/Makefile
@@ -0,0 +1,17 @@
+#-------------------------------------------------------------------------
+#
+# Makefile--
+# Makefile for access/tsm
+#
+# IDENTIFICATION
+# src/backend/access/tsm/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/backend/access/tsm
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = tsm_system.o tsm_bernoulli.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/tsm/tsm_bernoulli.c b/src/backend/access/tsm/tsm_bernoulli.c
new file mode 100644
index 0000000..c273ca6
--- /dev/null
+++ b/src/backend/access/tsm/tsm_bernoulli.c
@@ -0,0 +1,135 @@
+/*-------------------------------------------------------------------------
+ *
+ * tsm_bernoulli.c
+ * interface routines for BERNOULLI table sample method
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/access/tsm/tsm_bernoulli.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+
+#include "access/tsm_bernoulli.h"
+
+#include "nodes/execnodes.h"
+#include "storage/bufmgr.h"
+#include "utils/sampling.h"
+
+
+/* Private state of the BERNOULLI sampling method */
+typedef struct
+{
+ long seed;
+ BlockNumber tblocks;
+ BlockNumber blockno;
+ float percent;
+ OffsetNumber lt; /* last tuple returned from current block */
+} BernoulliSamplerData;
+
+
+Datum
+tsm_bernoulli_init(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ long seed = PG_GETARG_UINT32(1);
+ float4 percent = PG_GETARG_FLOAT4(2);
+ Relation rel = scanstate->ss.ss_currentRelation;
+ BernoulliSamplerData *sampler;
+
+ if (percent < 0 || percent > 100)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("invalid sample size"),
+ errhint("Sample size must be a numeric value between 0 and 100 (inclusive).")));
+
+ sampler = palloc0(sizeof(BernoulliSamplerData));
+
+ /* Remember initial values for reinit */
+ sampler->seed = seed;
+ sampler->tblocks = RelationGetNumberOfBlocks(rel);
+ sampler->blockno = InvalidBlockNumber;
+ sampler->percent = percent / 100;
+ sampler->lt = InvalidOffsetNumber;
+
+ sampler_setseed(seed);
+
+ scanstate->tsmdata = (void *) sampler;
+
+ PG_RETURN_VOID();
+}
+
+Datum
+tsm_bernoulli_nextblock(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ BernoulliSamplerData *sampler = (BernoulliSamplerData *) scanstate->tsmdata;
+
+ if (sampler->blockno == InvalidBlockNumber)
+ sampler->blockno = 0;
+ else if (++sampler->blockno >= sampler->tblocks)
+ PG_RETURN_UINT32(InvalidBlockNumber);
+
+ PG_RETURN_UINT32(sampler->blockno);
+}
+
+Datum
+tsm_bernoulli_nexttuple(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ OffsetNumber maxoffset = PG_GETARG_UINT16(2);
+ BernoulliSamplerData *sampler = (BernoulliSamplerData *) scanstate->tsmdata;
+ OffsetNumber tupoffset = sampler->lt;
+ double percent = sampler->percent;
+
+ if (tupoffset == InvalidOffsetNumber)
+ tupoffset = FirstOffsetNumber;
+ else
+ tupoffset++;
+
+ /* Every tuple has percent chance of being returned */
+ while (sampler_random_fract() > percent)
+ {
+ tupoffset++;
+
+ if (tupoffset > maxoffset)
+ break;
+ }
+
+ if (tupoffset > maxoffset)
+ /* Tell SampleScan that we want next block. */
+ tupoffset = InvalidOffsetNumber;
+
+ sampler->lt = tupoffset;
+
+ PG_RETURN_UINT16(tupoffset);
+}
+
+Datum
+tsm_bernoulli_end(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+
+ pfree(scanstate->tsmdata);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+tsm_bernoulli_reset(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ BernoulliSamplerData *sampler = (BernoulliSamplerData *) scanstate->tsmdata;
+
+ sampler->blockno = InvalidBlockNumber;
+ sampler->lt = InvalidOffsetNumber;
+ sampler_setseed(sampler->seed);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/backend/access/tsm/tsm_system.c b/src/backend/access/tsm/tsm_system.c
new file mode 100644
index 0000000..5834078
--- /dev/null
+++ b/src/backend/access/tsm/tsm_system.c
@@ -0,0 +1,124 @@
+/*-------------------------------------------------------------------------
+ *
+ * tsm_system.c
+ * interface routines for system table sample method
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/access/tsm/tsm_system.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+
+#include "access/tsm_system.h"
+
+#include "nodes/execnodes.h"
+#include "storage/bufmgr.h"
+#include "utils/sampling.h"
+
+
+/* Data structure for Algorithm S from Knuth 3.4.2 */
+typedef struct
+{
+ BlockSamplerData bs;
+ long seed;
+ BlockNumber tblocks;
+ int samplesize;
+ OffsetNumber lt; /* last tuple returned from current block */
+} SystemSamplerData;
+
+
+Datum
+tsm_system_init(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ long seed = PG_GETARG_UINT32(1);
+ float4 percent = PG_GETARG_FLOAT4(2);
+ SystemSamplerData *sampler;
+
+ if (percent < 0 || percent > 100)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("invalid sample size"),
+ errhint("Sample size must be a numeric value between 0 and 100 (inclusive).")));
+
+ sampler = palloc0(sizeof(SystemSamplerData));
+
+ /* Remember initial values for reinit */
+ sampler->seed = seed;
+ sampler->tblocks = RelationGetNumberOfBlocks(scanstate->ss.ss_currentRelation);
+ sampler->samplesize = 1 + (int) (sampler->tblocks * (percent / 100.0));
+ sampler->lt = InvalidOffsetNumber;
+
+ sampler_setseed(seed);
+ BlockSampler_Init(&sampler->bs, sampler->tblocks, sampler->samplesize);
+
+ scanstate->tsmdata = (void *) sampler;
+
+ PG_RETURN_VOID();
+}
+
+Datum
+tsm_system_nextblock(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ SystemSamplerData *sampler = (SystemSamplerData *) scanstate->tsmdata;
+ BlockNumber blockno;
+
+ if (!BlockSampler_HasMore(&sampler->bs))
+ PG_RETURN_UINT32(InvalidBlockNumber);
+
+ blockno = BlockSampler_Next(&sampler->bs);
+
+ PG_RETURN_UINT32(blockno);
+}
+
+Datum
+tsm_system_nexttuple(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ OffsetNumber maxoffset = PG_GETARG_UINT16(2);
+ SystemSamplerData *sampler = (SystemSamplerData *) scanstate->tsmdata;
+ OffsetNumber tupoffset = sampler->lt;
+
+ if (tupoffset == InvalidOffsetNumber)
+ tupoffset = FirstOffsetNumber;
+ else
+ tupoffset++;
+
+ if (tupoffset > maxoffset)
+ tupoffset = InvalidOffsetNumber;
+
+ sampler->lt = tupoffset;
+
+ PG_RETURN_UINT16(tupoffset);
+}
+
+Datum
+tsm_system_end(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+
+ pfree(scanstate->tsmdata);
+
+ PG_RETURN_VOID();
+}
+
+Datum
+tsm_system_reset(PG_FUNCTION_ARGS)
+{
+ SampleScanState *scanstate = (SampleScanState *) PG_GETARG_POINTER(0);
+ SystemSamplerData *sampler = (SystemSamplerData *) scanstate->tsmdata;
+
+ sampler->lt = InvalidOffsetNumber;
+ sampler_setseed(sampler->seed);
+ BlockSampler_Init(&sampler->bs, sampler->tblocks, sampler->samplesize);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a403c64..5598244 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -39,7 +39,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
pg_ts_config.h pg_ts_config_map.h pg_ts_dict.h \
pg_ts_parser.h pg_ts_template.h pg_extension.h \
pg_foreign_data_wrapper.h pg_foreign_server.h pg_user_mapping.h \
- pg_foreign_table.h pg_policy.h \
+ pg_foreign_table.h pg_policy.h pg_tablesamplemethod.h \
pg_default_acl.h pg_seclabel.h pg_shseclabel.h pg_collation.h pg_range.h \
toasting.h indexing.h \
)
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 732ab22..4b011c7 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -50,23 +50,13 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_rusage.h"
+#include "utils/sampling.h"
#include "utils/sortsupport.h"
#include "utils/syscache.h"
#include "utils/timestamp.h"
#include "utils/tqual.h"
-/* Data structure for Algorithm S from Knuth 3.4.2 */
-typedef struct
-{
- BlockNumber N; /* number of blocks, known in advance */
- int n; /* desired sample size */
- BlockNumber t; /* current block number */
- int m; /* blocks selected so far */
-} BlockSamplerData;
-
-typedef BlockSamplerData *BlockSampler;
-
/* Per-index data for ANALYZE */
typedef struct AnlIndexData
{
@@ -88,10 +78,6 @@ static BufferAccessStrategy vac_strategy;
static void do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
AcquireSampleRowsFunc acquirefunc, BlockNumber relpages,
bool inh, bool in_outer_xact, int elevel);
-static void BlockSampler_Init(BlockSampler bs, BlockNumber nblocks,
- int samplesize);
-static bool BlockSampler_HasMore(BlockSampler bs);
-static BlockNumber BlockSampler_Next(BlockSampler bs);
static void compute_index_stats(Relation onerel, double totalrows,
AnlIndexData *indexdata, int nindexes,
HeapTuple *rows, int numrows,
@@ -947,94 +933,6 @@ examine_attribute(Relation onerel, int attnum, Node *index_expr)
}
/*
- * BlockSampler_Init -- prepare for random sampling of blocknumbers
- *
- * BlockSampler is used for stage one of our new two-stage tuple
- * sampling mechanism as discussed on pgsql-hackers 2004-04-02 (subject
- * "Large DB"). It selects a random sample of samplesize blocks out of
- * the nblocks blocks in the table. If the table has less than
- * samplesize blocks, all blocks are selected.
- *
- * Since we know the total number of blocks in advance, we can use the
- * straightforward Algorithm S from Knuth 3.4.2, rather than Vitter's
- * algorithm.
- */
-static void
-BlockSampler_Init(BlockSampler bs, BlockNumber nblocks, int samplesize)
-{
- bs->N = nblocks; /* measured table size */
-
- /*
- * If we decide to reduce samplesize for tables that have less or not much
- * more than samplesize blocks, here is the place to do it.
- */
- bs->n = samplesize;
- bs->t = 0; /* blocks scanned so far */
- bs->m = 0; /* blocks selected so far */
-}
-
-static bool
-BlockSampler_HasMore(BlockSampler bs)
-{
- return (bs->t < bs->N) && (bs->m < bs->n);
-}
-
-static BlockNumber
-BlockSampler_Next(BlockSampler bs)
-{
- BlockNumber K = bs->N - bs->t; /* remaining blocks */
- int k = bs->n - bs->m; /* blocks still to sample */
- double p; /* probability to skip block */
- double V; /* random */
-
- Assert(BlockSampler_HasMore(bs)); /* hence K > 0 and k > 0 */
-
- if ((BlockNumber) k >= K)
- {
- /* need all the rest */
- bs->m++;
- return bs->t++;
- }
-
- /*----------
- * It is not obvious that this code matches Knuth's Algorithm S.
- * Knuth says to skip the current block with probability 1 - k/K.
- * If we are to skip, we should advance t (hence decrease K), and
- * repeat the same probabilistic test for the next block. The naive
- * implementation thus requires an anl_random_fract() call for each block
- * number. But we can reduce this to one anl_random_fract() call per
- * selected block, by noting that each time the while-test succeeds,
- * we can reinterpret V as a uniform random number in the range 0 to p.
- * Therefore, instead of choosing a new V, we just adjust p to be
- * the appropriate fraction of its former value, and our next loop
- * makes the appropriate probabilistic test.
- *
- * We have initially K > k > 0. If the loop reduces K to equal k,
- * the next while-test must fail since p will become exactly zero
- * (we assume there will not be roundoff error in the division).
- * (Note: Knuth suggests a "<=" loop condition, but we use "<" just
- * to be doubly sure about roundoff error.) Therefore K cannot become
- * less than k, which means that we cannot fail to select enough blocks.
- *----------
- */
- V = anl_random_fract();
- p = 1.0 - (double) k / (double) K;
- while (V < p)
- {
- /* skip */
- bs->t++;
- K--; /* keep K == N - t */
-
- /* adjust p to be new cutoff point in reduced range */
- p *= 1.0 - (double) k / (double) K;
- }
-
- /* select */
- bs->m++;
- return bs->t++;
-}
-
-/*
* acquire_sample_rows -- acquire a random sample of rows from the table
*
* Selected rows are returned in the caller-allocated array rows[], which
@@ -1089,6 +987,8 @@ acquire_sample_rows(Relation onerel, int elevel,
/* Need a cutoff xmin for HeapTupleSatisfiesVacuum */
OldestXmin = GetOldestXmin(onerel, true);
+ /* Seed the sampler random number generator */
+ sampler_setseed(random());
/* Prepare for sampling block numbers */
BlockSampler_Init(&bs, totalblocks, targrows);
/* Prepare for sampling rows */
@@ -1249,7 +1149,7 @@ acquire_sample_rows(Relation onerel, int elevel,
* Found a suitable tuple, so save it, replacing one
* old tuple at random
*/
- int k = (int) (targrows * anl_random_fract());
+ int k = (int) (targrows * sampler_random_fract());
Assert(k >= 0 && k < targrows);
heap_freetuple(rows[k]);
@@ -1308,13 +1208,6 @@ acquire_sample_rows(Relation onerel, int elevel,
return numrows;
}
-/* Select a random value R uniformly distributed in (0 - 1) */
-double
-anl_random_fract(void)
-{
- return ((double) random() + 1) / ((double) MAX_RANDOM_VALUE + 2);
-}
-
/*
* These two routines embody Algorithm Z from "Random sampling with a
* reservoir" by Jeffrey S. Vitter, in ACM Trans. Math. Softw. 11, 1
@@ -1333,7 +1226,7 @@ double
anl_init_selection_state(int n)
{
/* Initial value of W (for use when Algorithm Z is first applied) */
- return exp(-log(anl_random_fract()) / n);
+ return exp(-log(sampler_random_fract()) / n);
}
double
@@ -1348,7 +1241,7 @@ anl_get_next_S(double t, int n, double *stateptr)
double V,
quot;
- V = anl_random_fract(); /* Generate V */
+ V = sampler_random_fract(); /* Generate V */
S = 0;
t += 1;
/* Note: "num" in Vitter's code is always equal to t - n */
@@ -1380,7 +1273,7 @@ anl_get_next_S(double t, int n, double *stateptr)
tmp;
/* Generate U and X */
- U = anl_random_fract();
+ U = sampler_random_fract();
X = t * (W - 1.0);
S = floor(X); /* S is tentatively set to floor(X) */
/* Test if U <= h(S)/cg(X) in the manner of (6.3) */
@@ -1409,7 +1302,7 @@ anl_get_next_S(double t, int n, double *stateptr)
y *= numer / denom;
denom -= 1;
}
- W = exp(-log(anl_random_fract()) / n); /* Generate W in advance */
+ W = exp(-log(sampler_random_fract()) / n); /* Generate W in advance */
if (exp(log(y) / n) <= (t + X) / t)
break;
}
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 332f04a..2b1186b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -725,6 +725,7 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
case T_WorkTableScan:
case T_ForeignScan:
case T_CustomScan:
+ case T_SampleScan:
*rels_used = bms_add_member(*rels_used,
((Scan *) plan)->scanrelid);
break;
@@ -951,6 +952,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
else
pname = sname;
break;
+ case T_SampleScan:
+ pname = sname = "Sample Scan";
+ break;
case T_Material:
pname = sname = "Materialize";
break;
@@ -1068,6 +1072,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_WorkTableScan:
case T_ForeignScan:
case T_CustomScan:
+ case T_SampleScan:
ExplainScanTarget((Scan *) plan, es);
break;
case T_IndexScan:
@@ -1320,6 +1325,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
case T_CteScan:
case T_WorkTableScan:
case T_SubqueryScan:
+ case T_SampleScan:
show_scan_qual(plan->qual, "Filter", planstate, ancestors, es);
if (plan->qual)
show_instrumentation_count("Rows Removed by Filter", 1,
@@ -2148,6 +2154,7 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
case T_TidScan:
case T_ForeignScan:
case T_CustomScan:
+ case T_SampleScan:
case T_ModifyTable:
/* Assert it's on a real relation */
Assert(rte->rtekind == RTE_RELATION);
diff --git a/src/backend/executor/Makefile b/src/backend/executor/Makefile
index af707b0..75f799c 100644
--- a/src/backend/executor/Makefile
+++ b/src/backend/executor/Makefile
@@ -21,7 +21,7 @@ OBJS = execAmi.o execCurrent.o execGrouping.o execJunk.o execMain.o \
nodeLimit.o nodeLockRows.o \
nodeMaterial.o nodeMergeAppend.o nodeMergejoin.o nodeModifyTable.o \
nodeNestloop.o nodeFunctionscan.o nodeRecursiveunion.o nodeResult.o \
- nodeSeqscan.o nodeSetOp.o nodeSort.o nodeUnique.o \
+ nodeSamplescan.o nodeSeqscan.o nodeSetOp.o nodeSort.o nodeUnique.o \
nodeValuesscan.o nodeCtescan.o nodeWorktablescan.o \
nodeGroup.o nodeSubplan.o nodeSubqueryscan.o nodeTidscan.o \
nodeForeignscan.o nodeWindowAgg.o tstoreReceiver.o spi.o
diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index 7027d7f..1826059 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -39,6 +39,7 @@
#include "executor/nodeNestloop.h"
#include "executor/nodeRecursiveunion.h"
#include "executor/nodeResult.h"
+#include "executor/nodeSamplescan.h"
#include "executor/nodeSeqscan.h"
#include "executor/nodeSetOp.h"
#include "executor/nodeSort.h"
@@ -155,6 +156,10 @@ ExecReScan(PlanState *node)
ExecReScanSeqScan((SeqScanState *) node);
break;
+ case T_SampleScanState:
+ ExecReScanSampleScan((SampleScanState *) node);
+ break;
+
case T_IndexScanState:
ExecReScanIndexScan((IndexScanState *) node);
break;
@@ -480,6 +485,9 @@ ExecSupportsBackwardScan(Plan *node)
}
return false;
+ case T_SampleScan:
+ return false;
+
case T_Material:
case T_Sort:
/* these don't evaluate tlist */
diff --git a/src/backend/executor/execCurrent.c b/src/backend/executor/execCurrent.c
index d5079ef..613f799 100644
--- a/src/backend/executor/execCurrent.c
+++ b/src/backend/executor/execCurrent.c
@@ -261,6 +261,7 @@ search_plan_tree(PlanState *node, Oid table_oid)
* Relation scan nodes can all be treated alike
*/
case T_SeqScanState:
+ case T_SampleScanState:
case T_IndexScanState:
case T_IndexOnlyScanState:
case T_BitmapHeapScanState:
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index e27c062..a1cba97 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -102,6 +102,7 @@
#include "executor/nodeNestloop.h"
#include "executor/nodeRecursiveunion.h"
#include "executor/nodeResult.h"
+#include "executor/nodeSamplescan.h"
#include "executor/nodeSeqscan.h"
#include "executor/nodeSetOp.h"
#include "executor/nodeSort.h"
@@ -190,6 +191,11 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
estate, eflags);
break;
+ case T_SampleScan:
+ result = (PlanState *) ExecInitSampleScan((SampleScan *) node,
+ estate, eflags);
+ break;
+
case T_IndexScan:
result = (PlanState *) ExecInitIndexScan((IndexScan *) node,
estate, eflags);
@@ -406,6 +412,10 @@ ExecProcNode(PlanState *node)
result = ExecSeqScan((SeqScanState *) node);
break;
+ case T_SampleScanState:
+ result = ExecSampleScan((SampleScanState *) node);
+ break;
+
case T_IndexScanState:
result = ExecIndexScan((IndexScanState *) node);
break;
@@ -644,6 +654,10 @@ ExecEndNode(PlanState *node)
ExecEndSeqScan((SeqScanState *) node);
break;
+ case T_SampleScanState:
+ ExecEndSampleScan((SampleScanState *) node);
+ break;
+
case T_IndexScanState:
ExecEndIndexScan((IndexScanState *) node);
break;
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
new file mode 100644
index 0000000..818cddd
--- /dev/null
+++ b/src/backend/executor/nodeSamplescan.c
@@ -0,0 +1,388 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeSamplescan.c
+ * Support routines for sample scans of relations (table sampling).
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/executor/nodeSamplescan.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "catalog/pg_tablesamplemethod.h"
+#include "executor/executor.h"
+#include "access/relscan.h"
+#include "executor/nodeSamplescan.h"
+#include "parser/parsetree.h"
+#include "storage/bufmgr.h"
+#include "utils/rel.h"
+#include "utils/syscache.h"
+#include "utils/tqual.h"
+
+static void InitScanRelation(SampleScanState *node, EState *estate, int eflags);
+static TupleTableSlot *SampleNext(SampleScanState *node);
+
+/*
+ * Initialize the sampling method - loads function info and
+ * calls the tsminit function.
+ *
+ * We need special handling for this because the tsminit function
+ * is allowed to take optional additional arguments.
+ */
+static void
+InitSamplingMethod(SampleScanState *scanstate, TableSampleClause *tablesample)
+{
+ FunctionCallInfoData fcinfo;
+ int i;
+ List *args = tablesample->args;
+ ListCell *arg;
+ ExprContext *econtext = scanstate->ss.ps.ps_ExprContext;
+
+ /* Load functions */
+ fmgr_info(tablesample->tsminit, &(scanstate->tsminit));
+ fmgr_info(tablesample->tsmnextblock, &(scanstate->tsmnextblock));
+ fmgr_info(tablesample->tsmnexttuple, &(scanstate->tsmnexttuple));
+ fmgr_info(tablesample->tsmend, &(scanstate->tsmend));
+ fmgr_info(tablesample->tsmreset, &(scanstate->tsmreset));
+
+ InitFunctionCallInfoData(fcinfo, &scanstate->tsminit,
+ list_length(args) + 1,
+ InvalidOid, NULL, NULL);
+
+ /* First arg is always SampleScanState */
+ fcinfo.arg[0] = PointerGetDatum(scanstate);
+ fcinfo.argnull[0] = false;
+
+ i = 1;
+ foreach(arg, args)
+ {
+ Expr *argexpr = (Expr *) lfirst(arg);
+ ExprState *argstate = ExecInitExpr(argexpr, (PlanState *) scanstate);
+
+ if (argstate == NULL)
+ {
+ fcinfo.argnull[i] = true;
+ fcinfo.arg[i] = (Datum) 0;
+ }
+ else
+ fcinfo.arg[i] = ExecEvalExpr(argstate, econtext,
+ &fcinfo.argnull[i], NULL);
+ i++;
+ }
+ Assert(i == fcinfo.nargs);
+
+ /* REPEATABLE was not specified */
+ if (fcinfo.argnull[1])
+ {
+ fcinfo.arg[1] = UInt32GetDatum(random());
+ fcinfo.argnull[1] = false;
+ }
+
+ (void) FunctionCallInvoke(&fcinfo);
+}
+
+
+/* ----------------------------------------------------------------
+ * Scan Support
+ * ----------------------------------------------------------------
+ */
+
+/* ----------------------------------------------------------------
+ * SampleNext
+ *
+ * This is a workhorse for ExecSampleScan
+ * ----------------------------------------------------------------
+ */
+static TupleTableSlot *
+SampleNext(SampleScanState *node)
+{
+ EState *estate;
+ TupleTableSlot *slot;
+ BlockNumber blockno = InvalidBlockNumber;
+ Snapshot snapshot;
+ Relation relation;
+ bool found = false;
+ bool retry = false;
+ OffsetNumber tupoffset, maxoffset;
+ Buffer buffer;
+ Page page;
+ HeapTuple tuple = &(node->tup);
+
+ /*
+ * get information from the estate and scan state
+ */
+ estate = node->ss.ps.state;
+ snapshot = estate->es_snapshot;
+ slot = node->ss.ss_ScanTupleSlot;
+ relation = node->ss.ss_currentRelation;
+ buffer = node->openbuffer;
+
+ if (BufferIsValid(buffer))
+ {
+ blockno = BufferGetBlockNumber(buffer);
+ page = BufferGetPage(buffer);
+ maxoffset = PageGetMaxOffsetNumber(page);
+ }
+
+ /*
+ * get the next tuple from the table
+ */
+ for (;;)
+ {
+ ItemId itemid;
+
+ /* Load next block if needed. */
+ if (!BufferIsValid(buffer))
+ {
+ blockno = DatumGetUInt32(FunctionCall2(&node->tsmnextblock,
+ PointerGetDatum(node),
+ BoolGetDatum(retry)));
+ /* No more blocks to fetch */
+ if (!BlockNumberIsValid(blockno))
+ break;
+
+ buffer = ReadBufferExtended(relation, MAIN_FORKNUM, blockno,
+ RBM_NORMAL, NULL);
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+
+ node->openbuffer = buffer;
+ page = BufferGetPage(buffer);
+ maxoffset = PageGetMaxOffsetNumber(page);
+ }
+
+ tupoffset = DatumGetUInt16(FunctionCall4(&node->tsmnexttuple,
+ PointerGetDatum(node),
+ UInt32GetDatum(blockno),
+ UInt16GetDatum(maxoffset),
+ BoolGetDatum(retry)));
+ /* Go to next block. */
+ if (!OffsetNumberIsValid(tupoffset))
+ {
+ UnlockReleaseBuffer(buffer);
+ node->openbuffer = buffer = InvalidBuffer;
+ continue;
+ }
+ retry = true;
+
+ /* Skip invalid tuple pointers. */
+ itemid = PageGetItemId(page, tupoffset);
+ if (!ItemIdIsNormal(itemid))
+ continue;
+
+ tuple->t_tableOid = RelationGetRelid(relation);
+ tuple->t_data = (HeapTupleHeader) PageGetItem(page, itemid);
+ tuple->t_len = ItemIdGetLength(itemid);
+ ItemPointerSet(&tuple->t_self, blockno, tupoffset);
+
+ /* Found visible tuple, return it. */
+ if (HeapTupleSatisfiesVisibility(tuple, snapshot, buffer))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (found)
+ ExecStoreTuple(tuple, /* tuple to store */
+ slot, /* slot to store in */
+ buffer, /* buffer associated with this tuple */
+ false); /* don't pfree this pointer */
+ else
+ ExecClearTuple(slot);
+
+ return slot;
+}
+
+/*
+ * SampleRecheck -- access method routine to recheck a tuple in EvalPlanQual
+ */
+static bool
+SampleRecheck(SampleScanState *node, TupleTableSlot *slot)
+{
+ /* No need to recheck for SampleScan */
+ return true;
+}
+
+/* ----------------------------------------------------------------
+ * ExecSampleScan(node)
+ *
+ * Scans the relation sequentially and returns the next qualifying
+ * tuple while calling the sampling method functions.
+ * We call the ExecScan() routine and pass it the appropriate
+ * access method functions.
+ * ----------------------------------------------------------------
+ */
+TupleTableSlot *
+ExecSampleScan(SampleScanState *node)
+{
+ return ExecScan((ScanState *) node,
+ (ExecScanAccessMtd) SampleNext,
+ (ExecScanRecheckMtd) SampleRecheck);
+}
+
+/* ----------------------------------------------------------------
+ * InitScanRelation
+ *
+ * Set up to access the scan relation.
+ * ----------------------------------------------------------------
+ */
+static void
+InitScanRelation(SampleScanState *node, EState *estate, int eflags)
+{
+ Relation currentRelation;
+
+ /*
+ * get the relation object id from the relid'th entry in the range table,
+ * open that relation and acquire appropriate lock on it.
+ */
+ currentRelation = ExecOpenScanRelation(estate,
+ ((SampleScan *) node->ss.ps.plan)->scanrelid,
+ eflags);
+
+ node->ss.ss_currentRelation = currentRelation;
+ node->ss.ss_currentScanDesc = NULL;
+
+ /* and report the scan tuple slot's rowtype */
+ ExecAssignScanType(&node->ss, RelationGetDescr(currentRelation));
+}
+
+
+/* ----------------------------------------------------------------
+ * ExecInitSampleScan
+ * ----------------------------------------------------------------
+ */
+SampleScanState *
+ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
+{
+ SampleScanState *scanstate;
+ RangeTblEntry *rte = rt_fetch(node->scanrelid,
+ estate->es_range_table);
+
+ /*
+ * Once upon a time it was possible to have an outerPlan of a SampleScan, but
+ * not any more.
+ */
+ Assert(outerPlan(node) == NULL);
+ Assert(innerPlan(node) == NULL);
+ Assert(rte->tablesample != NULL);
+
+ /*
+ * create state structure
+ */
+ scanstate = makeNode(SampleScanState);
+ scanstate->ss.ps.plan = (Plan *) node;
+ scanstate->ss.ps.state = estate;
+
+ /*
+ * Miscellaneous initialization
+ *
+ * create expression context for node
+ */
+ ExecAssignExprContext(estate, &scanstate->ss.ps);
+
+ /*
+ * initialize child expressions
+ */
+ scanstate->ss.ps.targetlist = (List *)
+ ExecInitExpr((Expr *) node->plan.targetlist,
+ (PlanState *) scanstate);
+ scanstate->ss.ps.qual = (List *)
+ ExecInitExpr((Expr *) node->plan.qual,
+ (PlanState *) scanstate);
+
+ /*
+ * tuple table initialization
+ */
+ ExecInitResultTupleSlot(estate, &scanstate->ss.ps);
+ ExecInitScanTupleSlot(estate, &scanstate->ss);
+
+ /*
+ * initialize scan relation
+ */
+ InitScanRelation(scanstate, estate, eflags);
+
+ scanstate->ss.ps.ps_TupFromTlist = false;
+
+ /*
+ * Initialize result tuple type and projection info.
+ */
+ ExecAssignResultTypeFromTL(&scanstate->ss.ps);
+ ExecAssignScanProjectionInfo(&scanstate->ss);
+
+ scanstate->openbuffer = InvalidBuffer;
+
+ InitSamplingMethod(scanstate, rte->tablesample);
+
+ return scanstate;
+}
+
+/* ----------------------------------------------------------------
+ * ExecEndSampleScan
+ *
+ * frees any storage allocated through C routines.
+ * ----------------------------------------------------------------
+ */
+void
+ExecEndSampleScan(SampleScanState *node)
+{
+ /*
+ * Tell the sampling function that we have finished the scan.
+ */
+ FunctionCall1(&node->tsmend, PointerGetDatum(node));
+
+ if (BufferIsValid(node->openbuffer))
+ {
+ UnlockReleaseBuffer(node->openbuffer);
+ node->openbuffer = InvalidBuffer;
+ }
+
+ /*
+ * Free the exprcontext
+ */
+ ExecFreeExprContext(&node->ss.ps);
+
+ /*
+ * clean out the tuple table
+ */
+ ExecClearTuple(node->ss.ps.ps_ResultTupleSlot);
+ ExecClearTuple(node->ss.ss_ScanTupleSlot);
+
+ /*
+ * close the heap relation.
+ */
+ ExecCloseScanRelation(node->ss.ss_currentRelation);
+}
+
+/* ----------------------------------------------------------------
+ * Join Support
+ * ----------------------------------------------------------------
+ */
+
+/* ----------------------------------------------------------------
+ * ExecReScanSampleScan
+ *
+ * Rescans the relation.
+ *
+ * ----------------------------------------------------------------
+ */
+void
+ExecReScanSampleScan(SampleScanState *node)
+{
+ if (BufferIsValid(node->openbuffer))
+ {
+ UnlockReleaseBuffer(node->openbuffer);
+ node->openbuffer = InvalidBuffer;
+ }
+
+ /*
+ * Tell sampling function to reset its state for rescan.
+ */
+ FunctionCall1(&node->tsmreset, PointerGetDatum(node));
+
+ ExecScanReScan(&node->ss);
+}
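To make the executor/sampling-method split above easier to follow, here is a minimal standalone sketch of the control flow: the scan asks the method for the next block, then repeatedly asks it for the next tuple offset within that block. Everything here is hypothetical (the `Sketch*` names, the plain function pointers instead of fmgr calls, and the deterministic "every stride-th tuple" method); it only illustrates the init/nextblock/nexttuple/end call order, not the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_INVALID (-1)

/* Hypothetical callback table standing in for the sampling method API. */
typedef struct SketchMethod
{
    void (*init) (void *state, int nblocks);
    int  (*nextblock) (void *state);    /* SKETCH_INVALID when no blocks left */
    int  (*nexttuple) (void *state, int block, int maxoffset);
                                        /* SKETCH_INVALID when block is done */
    void (*end) (void *state);
} SketchMethod;

/* Toy method: visit every block, return every stride-th tuple offset. */
typedef struct StrideState
{
    int nblocks;
    int block;
    int offset;
    int stride;     /* set by the caller before the scan starts */
} StrideState;

static void
stride_init(void *state, int nblocks)
{
    StrideState *s = (StrideState *) state;

    assert(s->stride > 0);
    s->nblocks = nblocks;
    s->block = -1;
    s->offset = 0;
}

static int
stride_nextblock(void *state)
{
    StrideState *s = (StrideState *) state;

    s->block++;
    s->offset = 0;
    return (s->block < s->nblocks) ? s->block : SKETCH_INVALID;
}

static int
stride_nexttuple(void *state, int block, int maxoffset)
{
    StrideState *s = (StrideState *) state;
    int          result;

    (void) block;               /* this toy method ignores the block number */
    if (s->offset >= maxoffset)
        return SKETCH_INVALID;
    result = s->offset;
    s->offset += s->stride;
    return result;
}

static void
stride_end(void *state)
{
    (void) state;               /* nothing to release in this sketch */
}

static const SketchMethod stride_method = {
    stride_init, stride_nextblock, stride_nexttuple, stride_end
};

/*
 * Driver loop: stores up to 'cap' sampled tuple ids (block * tuples_per_block
 * + offset) in 'out' and returns how many tuples the method selected.
 */
static size_t
sketch_scan(const SketchMethod *method, void *state, int nblocks,
            int tuples_per_block, int *out, size_t cap)
{
    size_t n = 0;
    int    block;

    method->init(state, nblocks);
    while ((block = method->nextblock(state)) != SKETCH_INVALID)
    {
        int off;

        while ((off = method->nexttuple(state, block, tuples_per_block))
               != SKETCH_INVALID)
        {
            if (n < cap)
                out[n] = block * tuples_per_block + off;
            n++;
        }
    }
    method->end(state);
    return n;
}
```

A BERNOULLI-like method would differ only in `nexttuple` (accept each offset with some probability), and a SYSTEM-like method only in `nextblock` (skip blocks); the driver loop stays the same.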
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6b1bf7b..47769d0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -628,6 +628,22 @@ _copyCustomScan(const CustomScan *from)
}
/*
+ * _copySampleScan
+ */
+static SampleScan *
+_copySampleScan(const SampleScan *from)
+{
+ SampleScan *newnode = makeNode(SampleScan);
+
+ /*
+ * copy node superclass fields
+ */
+ CopyScanFields((const Scan *) from, (Scan *) newnode);
+
+ return newnode;
+}
+
+/*
* CopyJoinFields
*
* This function copies the fields of the Join node. It is used by
@@ -2006,6 +2022,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_SCALAR_FIELD(rtekind);
COPY_SCALAR_FIELD(relid);
COPY_SCALAR_FIELD(relkind);
+ COPY_NODE_FIELD(tablesample);
COPY_NODE_FIELD(subquery);
COPY_SCALAR_FIELD(security_barrier);
COPY_SCALAR_FIELD(jointype);
@@ -2138,6 +2155,34 @@ _copyCommonTableExpr(const CommonTableExpr *from)
return newnode;
}
+static RangeTableSample *
+_copyRangeTableSample(const RangeTableSample *from)
+{
+ RangeTableSample *newnode = makeNode(RangeTableSample);
+
+ COPY_NODE_FIELD(relation);
+ COPY_STRING_FIELD(method);
+ COPY_NODE_FIELD(args);
+
+ return newnode;
+}
+
+static TableSampleClause *
+_copyTableSampleClause(const TableSampleClause *from)
+{
+ TableSampleClause *newnode = makeNode(TableSampleClause);
+
+ COPY_SCALAR_FIELD(tsmid);
+ COPY_SCALAR_FIELD(tsminit);
+ COPY_SCALAR_FIELD(tsmnextblock);
+ COPY_SCALAR_FIELD(tsmnexttuple);
+ COPY_SCALAR_FIELD(tsmend);
+ COPY_SCALAR_FIELD(tsmreset);
+ COPY_NODE_FIELD(args);
+
+ return newnode;
+}
+
static A_Expr *
_copyAExpr(const A_Expr *from)
{
@@ -4076,6 +4121,9 @@ copyObject(const void *from)
case T_CustomScan:
retval = _copyCustomScan(from);
break;
+ case T_SampleScan:
+ retval = _copySampleScan(from);
+ break;
case T_Join:
retval = _copyJoin(from);
break;
@@ -4724,6 +4772,12 @@ copyObject(const void *from)
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
+ case T_RangeTableSample:
+ retval = _copyRangeTableSample(from);
+ break;
+ case T_TableSampleClause:
+ retval = _copyTableSampleClause(from);
+ break;
case T_PrivGrantee:
retval = _copyPrivGrantee(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index d5db71d..9125b43 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2324,6 +2324,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
COMPARE_SCALAR_FIELD(rtekind);
COMPARE_SCALAR_FIELD(relid);
COMPARE_SCALAR_FIELD(relkind);
+ COMPARE_NODE_FIELD(tablesample);
COMPARE_NODE_FIELD(subquery);
COMPARE_SCALAR_FIELD(security_barrier);
COMPARE_SCALAR_FIELD(jointype);
@@ -2443,6 +2444,30 @@ _equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
}
static bool
+_equalRangeTableSample(const RangeTableSample *a, const RangeTableSample *b)
+{
+ COMPARE_NODE_FIELD(relation);
+ COMPARE_STRING_FIELD(method);
+ COMPARE_NODE_FIELD(args);
+
+ return true;
+}
+
+static bool
+_equalTableSampleClause(const TableSampleClause *a, const TableSampleClause *b)
+{
+ COMPARE_SCALAR_FIELD(tsmid);
+ COMPARE_SCALAR_FIELD(tsminit);
+ COMPARE_SCALAR_FIELD(tsmnextblock);
+ COMPARE_SCALAR_FIELD(tsmnexttuple);
+ COMPARE_SCALAR_FIELD(tsmend);
+ COMPARE_SCALAR_FIELD(tsmreset);
+ COMPARE_NODE_FIELD(args);
+
+ return true;
+}
+
+static bool
_equalXmlSerialize(const XmlSerialize *a, const XmlSerialize *b)
{
COMPARE_SCALAR_FIELD(xmloption);
@@ -3151,6 +3176,12 @@ equal(const void *a, const void *b)
case T_CommonTableExpr:
retval = _equalCommonTableExpr(a, b);
break;
+ case T_RangeTableSample:
+ retval = _equalRangeTableSample(a, b);
+ break;
+ case T_TableSampleClause:
+ retval = _equalTableSampleClause(a, b);
+ break;
case T_PrivGrantee:
retval = _equalPrivGrantee(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index ae857a0..557aa3a 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3209,6 +3209,16 @@ raw_expression_tree_walker(Node *node,
return walker(((WithClause *) node)->ctes, context);
case T_CommonTableExpr:
return walker(((CommonTableExpr *) node)->ctequery, context);
+ case T_RangeTableSample:
+ {
+ RangeTableSample *rts = (RangeTableSample *) node;
+
+ if (walker(rts->relation, context))
+ return true;
+ if (walker(rts->args, context))
+ return true;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(node));
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index edbd09f..be91e13 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -578,6 +578,14 @@ _outCustomScan(StringInfo str, const CustomScan *node)
}
static void
+_outSampleScan(StringInfo str, const SampleScan *node)
+{
+ WRITE_NODE_TYPE("SAMPLESCAN");
+
+ _outScanInfo(str, (const Scan *) node);
+}
+
+static void
_outJoin(StringInfo str, const Join *node)
{
WRITE_NODE_TYPE("JOIN");
@@ -2391,6 +2399,30 @@ _outCommonTableExpr(StringInfo str, const CommonTableExpr *node)
}
static void
+_outRangeTableSample(StringInfo str, const RangeTableSample *node)
+{
+ WRITE_NODE_TYPE("RANGETABLESAMPLE");
+
+ WRITE_NODE_FIELD(relation);
+ WRITE_STRING_FIELD(method);
+ WRITE_NODE_FIELD(args);
+}
+
+static void
+_outTableSampleClause(StringInfo str, const TableSampleClause *node)
+{
+ WRITE_NODE_TYPE("TABLESAMPLECLAUSE");
+
+ WRITE_OID_FIELD(tsmid);
+ WRITE_OID_FIELD(tsminit);
+ WRITE_OID_FIELD(tsmnextblock);
+ WRITE_OID_FIELD(tsmnexttuple);
+ WRITE_OID_FIELD(tsmend);
+ WRITE_OID_FIELD(tsmreset);
+ WRITE_NODE_FIELD(args);
+}
+
+static void
_outSetOperationStmt(StringInfo str, const SetOperationStmt *node)
{
WRITE_NODE_TYPE("SETOPERATIONSTMT");
@@ -2420,6 +2452,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
case RTE_RELATION:
WRITE_OID_FIELD(relid);
WRITE_CHAR_FIELD(relkind);
+ WRITE_NODE_FIELD(tablesample);
break;
case RTE_SUBQUERY:
WRITE_NODE_FIELD(subquery);
@@ -2887,6 +2920,9 @@ _outNode(StringInfo str, const void *obj)
case T_CustomScan:
_outCustomScan(str, obj);
break;
+ case T_SampleScan:
+ _outSampleScan(str, obj);
+ break;
case T_Join:
_outJoin(str, obj);
break;
@@ -3228,6 +3264,12 @@ _outNode(StringInfo str, const void *obj)
case T_CommonTableExpr:
_outCommonTableExpr(str, obj);
break;
+ case T_RangeTableSample:
+ _outRangeTableSample(str, obj);
+ break;
+ case T_TableSampleClause:
+ _outTableSampleClause(str, obj);
+ break;
case T_SetOperationStmt:
_outSetOperationStmt(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index a3efdd4..50aca4e 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -350,6 +350,40 @@ _readCommonTableExpr(void)
}
/*
+ * _readRangeTableSample
+ */
+static RangeTableSample *
+_readRangeTableSample(void)
+{
+ READ_LOCALS(RangeTableSample);
+
+ READ_NODE_FIELD(relation);
+ READ_STRING_FIELD(method);
+ READ_NODE_FIELD(args);
+
+ READ_DONE();
+}
+
+/*
+ * _readTableSampleClause
+ */
+static TableSampleClause *
+_readTableSampleClause(void)
+{
+ READ_LOCALS(TableSampleClause);
+
+ READ_OID_FIELD(tsmid);
+ READ_OID_FIELD(tsminit);
+ READ_OID_FIELD(tsmnextblock);
+ READ_OID_FIELD(tsmnexttuple);
+ READ_OID_FIELD(tsmend);
+ READ_OID_FIELD(tsmreset);
+ READ_NODE_FIELD(args);
+
+ READ_DONE();
+}
+
+/*
* _readSetOperationStmt
*/
static SetOperationStmt *
@@ -1216,6 +1250,7 @@ _readRangeTblEntry(void)
case RTE_RELATION:
READ_OID_FIELD(relid);
READ_CHAR_FIELD(relkind);
+ READ_NODE_FIELD(tablesample);
break;
case RTE_SUBQUERY:
READ_NODE_FIELD(subquery);
@@ -1311,6 +1346,10 @@ parseNodeString(void)
return_value = _readRowMarkClause();
else if (MATCH("COMMONTABLEEXPR", 15))
return_value = _readCommonTableExpr();
+ else if (MATCH("RANGETABLESAMPLE", 16))
+ return_value = _readRangeTableSample();
+ else if (MATCH("TABLESAMPLECLAUSE", 17))
+ return_value = _readTableSampleClause();
else if (MATCH("SETOPERATIONSTMT", 16))
return_value = _readSetOperationStmt();
else if (MATCH("ALIAS", 5))
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 449fdc3..53fa356 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -71,6 +71,8 @@ static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
+static void set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
+ RangeTblEntry *rte);
static void set_foreign_size(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static void set_foreign_pathlist(PlannerInfo *root, RelOptInfo *rel,
@@ -332,6 +334,11 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Foreign table */
set_foreign_pathlist(root, rel, rte);
}
+ else if (rte->tablesample != NULL)
+ {
+ /* Build sample scan on relation */
+ set_tablesample_rel_pathlist(root, rel, rte);
+ }
else
{
/* Plain relation */
@@ -418,6 +425,34 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
}
/*
+ * set_tablesample_rel_pathlist
+ * Build access paths for a sampled relation
+ *
+ * There is only one possible path - sampling scan
+ */
+static void
+set_tablesample_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
+{
+ Relids required_outer;
+
+ /*
+ * We don't support pushing join clauses into the quals of a samplescan, but
+ * it could still have required parameterization due to LATERAL refs in
+ * its tlist.
+ */
+ required_outer = rel->lateral_relids;
+
+ /* We only do sample scan if it was requested */
+ add_path(rel, create_samplescan_path(root, rel, required_outer));
+
+ /*
+ * There is only one plan to consider but we still need to set
+ * parameters for RelOptInfo.
+ */
+ set_cheapest(rel);
+}
+
+/*
* set_foreign_size
* Set size estimates for a foreign table RTE
*/
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 659daa2..615c3f5 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -219,6 +219,54 @@ cost_seqscan(Path *path, PlannerInfo *root,
}
/*
+ * cost_samplescan
+ * Determines and returns the cost of scanning a relation using sampling.
+ *
+ * 'baserel' is the relation to be scanned
+ * 'param_info' is the ParamPathInfo if this is a parameterized path, else NULL
+ */
+void
+cost_samplescan(Path *path, PlannerInfo *root,
+ RelOptInfo *baserel, ParamPathInfo *param_info)
+{
+ Cost startup_cost = 0;
+ Cost run_cost = 0;
+ double spc_sample_page_cost;
+ QualCost qpqual_cost;
+ Cost cpu_per_tuple;
+
+ /* Should only be applied to base relations */
+ Assert(baserel->relid > 0);
+ Assert(baserel->rtekind == RTE_RELATION);
+
+ /* Mark the path with the correct row estimate */
+ if (param_info)
+ path->rows = param_info->ppi_rows;
+ else
+ path->rows = baserel->rows;
+
+ /* fetch estimated page cost for tablespace containing table */
+ get_tablespace_page_costs(baserel->reltablespace,
+ NULL,
+ &spc_sample_page_cost);
+
+ /*
+ * disk costs
+ */
+ run_cost += spc_sample_page_cost * baserel->pages;
+
+ /* CPU costs */
+ get_restriction_qual_cost(root, baserel, param_info, &qpqual_cost);
+
+ startup_cost += qpqual_cost.startup;
+ cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
+ run_cost += cpu_per_tuple * baserel->tuples;
+
+ path->startup_cost = startup_cost;
+ path->total_cost = startup_cost + run_cost;
+}
+
+/*
* cost_index
* Determines and returns the cost of scanning a relation using an index.
*
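As a worked illustration of the arithmetic in cost_samplescan above, the sketch below reproduces the same formula with plain doubles. The struct and function names are hypothetical; the real function reads these values from RelOptInfo, the tablespace settings and the qual cost estimate. Note that the formula charges for every page and every tuple of the relation, independent of the sample size, which is the "proper costing" gap mentioned in the cover note.

```c
#include <assert.h>

/* Hypothetical, simplified inputs for the cost formula. */
typedef struct SketchCostInput
{
    double pages;           /* pages in the relation */
    double tuples;          /* tuples in the relation */
    double page_cost;       /* per-page fetch cost for the tablespace */
    double cpu_tuple_cost;  /* planner's per-tuple CPU cost */
    double qual_startup;    /* one-time cost of evaluating the quals */
    double qual_per_tuple;  /* per-tuple cost of evaluating the quals */
} SketchCostInput;

typedef struct SketchCost
{
    double startup;
    double total;
} SketchCost;

static SketchCost
sketch_samplescan_cost(const SketchCostInput *in)
{
    SketchCost c;
    double     run_cost;

    assert(in->pages >= 0 && in->tuples >= 0);

    /* disk cost: every page is charged, regardless of the sample size */
    run_cost = in->page_cost * in->pages;

    /* CPU cost: every tuple is charged as well */
    run_cost += (in->cpu_tuple_cost + in->qual_per_tuple) * in->tuples;

    c.startup = in->qual_startup;
    c.total = c.startup + run_cost;
    return c;
}
```

For example, 100 pages and 10000 tuples with a page cost of 1.0, a cpu_tuple_cost of 0.01 and a per-tuple qual cost of 0.0025 give 100 + 0.0125 * 10000 = 225, the same as scanning the whole table.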
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bf8dbe0..86dc6e1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -57,6 +57,8 @@ static Material *create_material_plan(PlannerInfo *root, MaterialPath *best_path
static Plan *create_unique_plan(PlannerInfo *root, UniquePath *best_path);
static SeqScan *create_seqscan_plan(PlannerInfo *root, Path *best_path,
List *tlist, List *scan_clauses);
+static SampleScan *create_samplescan_plan(PlannerInfo *root, Path *best_path,
+ List *tlist, List *scan_clauses);
static Scan *create_indexscan_plan(PlannerInfo *root, IndexPath *best_path,
List *tlist, List *scan_clauses, bool indexonly);
static BitmapHeapScan *create_bitmap_scan_plan(PlannerInfo *root,
@@ -99,6 +101,7 @@ static List *order_qual_clauses(PlannerInfo *root, List *clauses);
static void copy_path_costsize(Plan *dest, Path *src);
static void copy_plan_costsize(Plan *dest, Plan *src);
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
+static SampleScan *make_samplescan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
List *indexorderby, List *indexorderbyorig,
@@ -227,6 +230,7 @@ create_plan_recurse(PlannerInfo *root, Path *best_path)
switch (best_path->pathtype)
{
case T_SeqScan:
+ case T_SampleScan:
case T_IndexScan:
case T_IndexOnlyScan:
case T_BitmapHeapScan:
@@ -342,6 +346,13 @@ create_scan_plan(PlannerInfo *root, Path *best_path)
scan_clauses);
break;
+ case T_SampleScan:
+ plan = (Plan *) create_samplescan_plan(root,
+ best_path,
+ tlist,
+ scan_clauses);
+ break;
+
case T_IndexScan:
plan = (Plan *) create_indexscan_plan(root,
(IndexPath *) best_path,
@@ -545,6 +556,7 @@ disuse_physical_tlist(PlannerInfo *root, Plan *plan, Path *path)
switch (path->pathtype)
{
case T_SeqScan:
+ case T_SampleScan:
case T_IndexScan:
case T_IndexOnlyScan:
case T_BitmapHeapScan:
@@ -1132,6 +1144,45 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
}
/*
+ * create_samplescan_plan
+ * Returns a samplescan plan for the base relation scanned by 'best_path'
+ * with restriction clauses 'scan_clauses' and targetlist 'tlist'.
+ */
+static SampleScan *
+create_samplescan_plan(PlannerInfo *root, Path *best_path,
+ List *tlist, List *scan_clauses)
+{
+ SampleScan *scan_plan;
+ Index scan_relid = best_path->parent->relid;
+
+ /* it should be a base rel with tablesample clause... */
+ Assert(scan_relid > 0);
+ Assert(best_path->parent->rtekind == RTE_RELATION);
+ Assert(best_path->pathtype == T_SampleScan);
+
+ /* Sort clauses into best execution order */
+ scan_clauses = order_qual_clauses(root, scan_clauses);
+
+ /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
+ scan_clauses = extract_actual_clauses(scan_clauses, false);
+
+ /* Replace any outer-relation variables with nestloop params */
+ if (best_path->param_info)
+ {
+ scan_clauses = (List *)
+ replace_nestloop_params(root, (Node *) scan_clauses);
+ }
+
+ scan_plan = make_samplescan(tlist,
+ scan_clauses,
+ scan_relid);
+
+ copy_path_costsize(&scan_plan->plan, best_path);
+
+ return scan_plan;
+}
+
+/*
* create_indexscan_plan
* Returns an indexscan plan for the base relation scanned by 'best_path'
* with restriction clauses 'scan_clauses' and targetlist 'tlist'.
@@ -3317,6 +3368,24 @@ make_seqscan(List *qptlist,
return node;
}
+static SampleScan *
+make_samplescan(List *qptlist,
+ List *qpqual,
+ Index scanrelid)
+{
+ SampleScan *node = makeNode(SampleScan);
+ Plan *plan = &node->plan;
+
+ /* cost should be inserted by caller */
+ plan->targetlist = qptlist;
+ plan->qual = qpqual;
+ plan->lefttree = NULL;
+ plan->righttree = NULL;
+ node->scanrelid = scanrelid;
+
+ return node;
+}
+
static IndexScan *
make_indexscan(List *qptlist,
List *qpqual,
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 4d3fbca..0d78f27 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -446,6 +446,17 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
fix_scan_list(root, splan->plan.qual, rtoffset);
}
break;
+ case T_SampleScan:
+ {
+ SampleScan *splan = (SampleScan *) plan;
+
+ splan->scanrelid += rtoffset;
+ splan->plan.targetlist =
+ fix_scan_list(root, splan->plan.targetlist, rtoffset);
+ splan->plan.qual =
+ fix_scan_list(root, splan->plan.qual, rtoffset);
+ }
+ break;
case T_IndexScan:
{
IndexScan *splan = (IndexScan *) plan;
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 579d021..7da1a44 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2163,6 +2163,7 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
break;
case T_SeqScan:
+ case T_SampleScan:
context.paramids = bms_add_members(context.paramids, scan_params);
break;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 319e8b2..766d276 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -706,6 +706,26 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer)
}
/*
+ * create_samplescan_path
+ * Like seqscan but uses sampling function while scanning.
+ */
+Path *
+create_samplescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer)
+{
+ Path *pathnode = makeNode(Path);
+
+ pathnode->pathtype = T_SampleScan;
+ pathnode->parent = rel;
+ pathnode->param_info = get_baserel_parampathinfo(root, rel,
+ required_outer);
+ pathnode->pathkeys = NIL; /* samplescan has unordered result */
+
+ cost_samplescan(pathnode, root, rel, pathnode->param_info);
+
+ return pathnode;
+}
+
+/*
* create_index_path
* Creates a path node for an index scan.
*
@@ -1921,6 +1941,8 @@ reparameterize_path(PlannerInfo *root, Path *path,
case T_SubqueryScan:
return create_subqueryscan_path(root, rel, path->pathkeys,
required_outer);
+ case T_SampleScan:
+ return create_samplescan_path(root, rel, required_outer);
default:
break;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 4b5009b..87a797a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -447,6 +447,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <range> relation_expr
%type <range> relation_expr_opt_alias
%type <target> target_el single_set_clause set_target insert_column_item
+%type <node> relation_expr_tablesample tablesample_clause opt_repeatable_clause
%type <str> generic_option_name
%type <node> generic_option_arg
@@ -611,8 +612,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
STATEMENT STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P SUBSTRING
SYMMETRIC SYSID SYSTEM_P
- TABLE TABLES TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN TIME TIMESTAMP
- TO TRAILING TRANSACTION TREAT TRIGGER TRIM TRUE_P
+ TABLE TABLES TABLESAMPLE TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN
+ TIME TIMESTAMP TO TRAILING TRANSACTION TREAT TRIGGER TRIM TRUE_P
TRUNCATE TRUSTED TYPE_P TYPES_P
UNBOUNDED UNCOMMITTED UNENCRYPTED UNION UNIQUE UNKNOWN UNLISTEN UNLOGGED
@@ -10109,6 +10110,12 @@ table_ref: relation_expr opt_alias_clause
$1->alias = $2;
$$ = (Node *) $1;
}
+ | relation_expr_tablesample opt_alias_clause
+ {
+ RangeTableSample *n = (RangeTableSample *) $1;
+ n->relation->alias = $2;
+ $$ = (Node *) n;
+ }
| func_table func_alias_clause
{
RangeFunction *n = (RangeFunction *) $1;
@@ -10404,7 +10411,6 @@ relation_expr_list:
| relation_expr_list ',' relation_expr { $$ = lappend($1, $3); }
;
-
/*
* Given "UPDATE foo set set ...", we have to decide without looking any
* further ahead whether the first "set" is an alias or the UPDATE's SET
@@ -10434,6 +10440,30 @@ relation_expr_opt_alias: relation_expr %prec UMINUS
}
;
+
+relation_expr_tablesample: relation_expr tablesample_clause
+ {
+ RangeTableSample *n = (RangeTableSample *) $2;
+ n->relation = $1;
+ $$ = (Node *) n;
+ }
+ ;
+
+tablesample_clause:
+ TABLESAMPLE ColId '(' func_arg_list ')' opt_repeatable_clause
+ {
+ RangeTableSample *n = makeNode(RangeTableSample);
+ n->method = $2;
+ n->args = lcons($6, $4);
+ $$ = (Node *) n;
+ }
+ ;
+
+opt_repeatable_clause:
+ REPEATABLE '(' Iconst ')' { $$ = makeIntConst($3, @3); }
+ | /*EMPTY*/ { $$ = makeNullAConst(-1); }
+ ;
+
/*
* func_table represents a function invocation in a FROM list. It can be
* a plain function call, like "foo(...)", or a ROWS FROM expression with
@@ -13216,7 +13246,6 @@ unreserved_keyword:
| RELATIVE_P
| RELEASE
| RENAME
- | REPEATABLE
| REPLACE
| REPLICA
| RESET
@@ -13391,6 +13420,7 @@ type_func_name_keyword:
| OVERLAPS
| RIGHT
| SIMILAR
+ | TABLESAMPLE
| VERBOSE
;
@@ -13459,6 +13489,7 @@ reserved_keyword:
| PLACING
| PRIMARY
| REFERENCES
+ | REPEATABLE
| RETURNING
| SELECT
| SESSION_USER
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 4931dca..c246a9c 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/heapam.h"
+#include "access/htup_details.h"
#include "catalog/heap.h"
#include "catalog/pg_type.h"
#include "commands/defrem.h"
@@ -29,6 +30,7 @@
#include "parser/parse_coerce.h"
#include "parser/parse_collate.h"
#include "parser/parse_expr.h"
+#include "parser/parse_func.h"
#include "parser/parse_oper.h"
#include "parser/parse_relation.h"
#include "parser/parse_target.h"
@@ -36,6 +38,7 @@
#include "utils/guc.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
+#include "utils/syscache.h"
/* Convenience macro for the most common makeNamespaceItem() case */
@@ -413,6 +416,19 @@ transformJoinOnClause(ParseState *pstate, JoinExpr *j, List *namespace)
return result;
}
+static RangeTblEntry *
+transformTableSampleEntry(ParseState *pstate, RangeTableSample *r)
+{
+ RangeTblEntry *rte;
+ TableSampleClause *tablesample = NULL;
+
+ rte = transformTableEntry(pstate, r->relation);
+ tablesample = ParseTableSample(pstate, r->method, r->args);
+ rte->tablesample = tablesample;
+
+ return rte;
+}
+
/*
* transformTableEntry --- transform a RangeVar (simple relation reference)
*/
@@ -421,7 +437,7 @@ transformTableEntry(ParseState *pstate, RangeVar *r)
{
RangeTblEntry *rte;
- /* We need only build a range table entry */
+ /* We first need to build a range table entry */
rte = addRangeTableEntry(pstate, r, r->alias,
interpretInhOption(r->inhOpt), true);
@@ -1121,6 +1137,26 @@ transformFromClauseItem(ParseState *pstate, Node *n,
return (Node *) j;
}
+ else if (IsA(n, RangeTableSample))
+ {
+ /* Tablesample reference */
+ RangeTableSample *rv = (RangeTableSample *) n;
+ RangeTblRef *rtr;
+ RangeTblEntry *rte = NULL;
+ int rtindex;
+
+ rte = transformTableSampleEntry(pstate, rv);
+
+ /* assume new rte is at end */
+ rtindex = list_length(pstate->p_rtable);
+ Assert(rte == rt_fetch(rtindex, pstate->p_rtable));
+ *top_rte = rte;
+ *top_rti = rtindex;
+ *namespace = list_make1(makeDefaultNSItem(rte));
+ rtr = makeNode(RangeTblRef);
+ rtr->rtindex = rtindex;
+ return (Node *) rtr;
+ }
else
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(n));
return NULL; /* can't get here, keep compiler quiet */
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 9ebd3fd..77a28ac 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -18,6 +18,7 @@
#include "catalog/pg_aggregate.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_tablesamplemethod.h"
#include "funcapi.h"
#include "lib/stringinfo.h"
#include "nodes/makefuncs.h"
@@ -26,6 +27,7 @@
#include "parser/parse_clause.h"
#include "parser/parse_coerce.h"
#include "parser/parse_func.h"
+#include "parser/parse_expr.h"
#include "parser/parse_relation.h"
#include "parser/parse_target.h"
#include "parser/parse_type.h"
@@ -760,6 +762,104 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
}
+/*
+ * ParseTableSample
+ *
+ * Parse TABLESAMPLE clause and process the arguments
+ */
+TableSampleClause *
+ParseTableSample(ParseState *pstate, char *samplemethod, List *sampleargs)
+{
+ HeapTuple tuple;
+ Form_pg_tablesamplemethod tsm;
+ Form_pg_proc procform;
+ TableSampleClause *tablesample;
+ List *fargs;
+ ListCell *larg;
+ int nargs, pronargs;
+ Oid actual_arg_types[FUNC_MAX_ARGS];
+ Oid declared_arg_types[FUNC_MAX_ARGS];
+
+ /* Load the table sample method */
+ tuple = SearchSysCache1(TABLESAMPLEMETHODNAME, PointerGetDatum(samplemethod));
+ if (!HeapTupleIsValid(tuple))
+ ereport(ERROR,
+ (errcode(ERRCODE_UNDEFINED_OBJECT),
+ errmsg("table sampling method \"%s\" does not exist",
+ samplemethod)));
+
+ tablesample = makeNode(TableSampleClause);
+ tablesample->tsmid = HeapTupleGetOid(tuple);
+
+ tsm = (Form_pg_tablesamplemethod) GETSTRUCT(tuple);
+
+ tablesample->tsminit = tsm->tsminit;
+ tablesample->tsmnextblock = tsm->tsmnextblock;
+ tablesample->tsmnexttuple = tsm->tsmnexttuple;
+ tablesample->tsmend = tsm->tsmend;
+ tablesample->tsmreset = tsm->tsmreset;
+
+ ReleaseSysCache(tuple);
+
+ /* Load the table sample method's init procedure. */
+ tuple = SearchSysCache1(PROCOID,
+ ObjectIdGetDatum(tablesample->tsminit));
+
+ if (!HeapTupleIsValid(tuple)) /* should not happen */
+ elog(ERROR, "cache lookup failed for function %u",
+ tablesample->tsminit);
+
+ procform = (Form_pg_proc) GETSTRUCT(tuple);
+ pronargs = procform->pronargs;
+ Assert(pronargs >= 3);
+
+ /*
+ * First parameter is used to pass the SampleScanState,
+ * skip the processing for it here, just assert that it's the correct type.
+ */
+ Assert(procform->proargtypes.values[0] == INTERNALOID);
+ pronargs--;
+ memcpy(declared_arg_types, procform->proargtypes.values + 1,
+ pronargs * sizeof(Oid));
+
+ /* Now we are done with the catalog */
+ ReleaseSysCache(tuple);
+
+ /* Transform the list of arguments ... */
+ fargs = NIL;
+ nargs = 0;
+ foreach(larg, sampleargs)
+ {
+ Node *arg = transformExpr(pstate, (Node *) lfirst(larg), EXPR_KIND_FROM_FUNCTION);
+ Oid argtype = exprType(arg);
+
+ fargs = lappend(fargs, arg);
+
+ actual_arg_types[nargs++] = argtype;
+ }
+
+ /*
+ * Check if parameters are correct.
+ *
+ * XXX: can we do better at hinting here?
+ */
+ if (pronargs != nargs ||
+ !can_coerce_type(pronargs, actual_arg_types, declared_arg_types,
+ COERCION_IMPLICIT))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("wrong parameters for TABLESAMPLE method \"%s\"",
+ samplemethod)));
+
+ /* perform the necessary typecasting of arguments */
+ make_fn_arguments(pstate, fargs, actual_arg_types, declared_arg_types);
+
+ /* Pass the arguments down */
+ tablesample->args = fargs;
+
+ return tablesample;
+}
+
/* func_match_argtypes()
*
* Given a list of candidate functions (having the right name and number
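The argument check in ParseTableSample above can be summarized with a small standalone model: the init function's first declared argument (type internal) is skipped, and the remaining declared types are compared against the types of the expressions given in the TABLESAMPLE clause. This is only a sketch under simplifying assumptions: the names are hypothetical, and an exact type match stands in for the implicit-coercion test that can_coerce_type performs in the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int SketchOid;

/* Illustrative type OIDs used by this sketch. */
#define SKETCH_INTERNALOID 2281u
#define SKETCH_INT4OID     23u
#define SKETCH_FLOAT4OID   700u

/*
 * Return true if the user-supplied argument types match the init function's
 * declared argument types, ignoring the leading internal argument.
 */
static bool
sketch_check_sample_args(const SketchOid *declared, int ndeclared,
                         const SketchOid *actual, int nactual)
{
    int i;

    /* The first declared argument carries the scan state. */
    if (ndeclared < 1 || declared[0] != SKETCH_INTERNALOID)
        return false;
    declared++;
    ndeclared--;

    if (ndeclared != nactual)
        return false;

    /* Assumption: exact match replaces the real implicit-coercion check. */
    for (i = 0; i < nactual; i++)
    {
        if (declared[i] != actual[i])
            return false;
    }
    return true;
}
```

With a declared signature of (internal, int4, float4), the argument list (int4, float4) passes, while a list with a missing argument or with the two types swapped fails and would lead to the "wrong parameters for TABLESAMPLE method" error.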
diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c
index 94d951c..6832e0b 100644
--- a/src/backend/utils/cache/syscache.c
+++ b/src/backend/utils/cache/syscache.c
@@ -55,6 +55,7 @@
#include "catalog/pg_shdescription.h"
#include "catalog/pg_shseclabel.h"
#include "catalog/pg_statistic.h"
+#include "catalog/pg_tablesamplemethod.h"
#include "catalog/pg_tablespace.h"
#include "catalog/pg_ts_config.h"
#include "catalog/pg_ts_config_map.h"
@@ -642,6 +643,28 @@ static const struct cachedesc cacheinfo[] = {
},
128
},
+ {TableSampleMethodRelationId, /* TABLESAMPLEMETHODNAME */
+ TableSampleMethodNameIndexId,
+ 1,
+ {
+ Anum_pg_tablesamplemethod_tsmname,
+ 0,
+ 0,
+ 0,
+ },
+ 2
+ },
+ {TableSampleMethodRelationId, /* TABLESAMPLEMETHODOID */
+ TableSampleMethodOidIndexId,
+ 1,
+ {
+ ObjectIdAttributeNumber,
+ 0,
+ 0,
+ 0,
+ },
+ 2
+ },
{TableSpaceRelationId, /* TABLESPACEOID */
TablespaceOidIndexId,
1,
diff --git a/src/backend/utils/misc/Makefile b/src/backend/utils/misc/Makefile
index c7b745e..f311c74 100644
--- a/src/backend/utils/misc/Makefile
+++ b/src/backend/utils/misc/Makefile
@@ -15,7 +15,7 @@ include $(top_builddir)/src/Makefile.global
override CPPFLAGS := -I. -I$(srcdir) $(CPPFLAGS)
OBJS = guc.o help_config.o pg_rusage.o ps_status.o rbtree.o \
- superuser.o timeout.o tzparser.o
+ sampling.o superuser.o timeout.o tzparser.o
# This location might depend on the installation directories. Therefore
# we can't subsitute it into pg_config.h.
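The new sampling.c that follows hosts the block sampler shared by ANALYZE and TABLESAMPLE, which is based on Knuth's Algorithm S (selection sampling). For readers who want the idea in isolation, here is a standalone sketch of the naive form of that algorithm, drawing one random number per block. All names are hypothetical, and the toy linear congruential generator is an assumption made to keep the example self-contained; it is not the generator used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the block sampler state. */
typedef struct SketchSampler
{
    uint32_t N;     /* total number of blocks */
    uint32_t n;     /* desired sample size */
    uint32_t t;     /* blocks scanned so far */
    uint32_t m;     /* blocks selected so far */
    uint64_t rng;   /* state of the toy random generator */
} SketchSampler;

/* Toy LCG returning a fraction in [0, 1); an assumption of this sketch. */
static double
sketch_random_fract(SketchSampler *s)
{
    s->rng = s->rng * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double) (s->rng >> 11) / 9007199254740992.0;    /* 2^53 */
}

static void
sketch_init(SketchSampler *s, uint32_t nblocks, uint32_t samplesize,
            uint64_t seed)
{
    s->N = nblocks;
    s->n = samplesize;
    s->t = 0;
    s->m = 0;
    s->rng = seed;
}

static bool
sketch_has_more(const SketchSampler *s)
{
    return (s->t < s->N) && (s->m < s->n);
}

/* Return the next selected block number; results come in increasing order. */
static uint32_t
sketch_next(SketchSampler *s)
{
    assert(sketch_has_more(s));

    for (;;)
    {
        uint32_t K = s->N - s->t;   /* blocks remaining */
        uint32_t k = s->n - s->m;   /* blocks still to select */

        /* Select the current block with probability k/K, else skip it. */
        if (sketch_random_fract(s) * K < k)
        {
            s->m++;
            return s->t++;
        }
        s->t++;
    }
}
```

Because a block is always selected once k reaches K, the loop yields exactly min(n, N) distinct block numbers. The optimization described in the BlockSampler_Next comment replaces the per-block random draw with a single draw per selected block.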
diff --git a/src/backend/utils/misc/sampling.c b/src/backend/utils/misc/sampling.c
new file mode 100644
index 0000000..c07f01e
--- /dev/null
+++ b/src/backend/utils/misc/sampling.c
@@ -0,0 +1,131 @@
+/*-------------------------------------------------------------------------
+ *
+ * sampling.c
+ * Block sampling routines shared by ANALYZE and TABLESAMPLE.
+ *
+ * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ * src/backend/utils/misc/sampling.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include <math.h>
+
+#include "utils/sampling.h"
+
+static unsigned short _sampler_seed[3] = { 0x330e, 0xabcd, 0x1234 };
+
+/*
+ * BlockSampler_Init -- prepare for random sampling of blocknumbers
+ *
+ * BlockSampler is used for stage one of our new two-stage tuple
+ * sampling mechanism as discussed on pgsql-hackers 2004-04-02 (subject
+ * "Large DB"). It selects a random sample of samplesize blocks out of
+ * the nblocks blocks in the table. If the table has less than
+ * samplesize blocks, all blocks are selected.
+ *
+ * Since we know the total number of blocks in advance, we can use the
+ * straightforward Algorithm S from Knuth 3.4.2, rather than Vitter's
+ * algorithm.
+ */
+void
+BlockSampler_Init(BlockSampler bs, BlockNumber nblocks, int samplesize)
+{
+ bs->N = nblocks; /* measured table size */
+
+ /*
+ * If we decide to reduce samplesize for tables that have less or not much
+ * more than samplesize blocks, here is the place to do it.
+ */
+ bs->n = samplesize;
+ bs->t = 0; /* blocks scanned so far */
+ bs->m = 0; /* blocks selected so far */
+}
+
+bool
+BlockSampler_HasMore(BlockSampler bs)
+{
+ return (bs->t < bs->N) && (bs->m < bs->n);
+}
+
+BlockNumber
+BlockSampler_Next(BlockSampler bs)
+{
+ BlockNumber K = bs->N - bs->t; /* remaining blocks */
+ int k = bs->n - bs->m; /* blocks still to sample */
+ double p; /* probability to skip block */
+ double V; /* random */
+
+ Assert(BlockSampler_HasMore(bs)); /* hence K > 0 and k > 0 */
+
+ if ((BlockNumber) k >= K)
+ {
+ /* need all the rest */
+ bs->m++;
+ return bs->t++;
+ }
+
+ /*----------
+ * It is not obvious that this code matches Knuth's Algorithm S.
+ * Knuth says to skip the current block with probability 1 - k/K.
+ * If we are to skip, we should advance t (hence decrease K), and
+ * repeat the same probabilistic test for the next block. The naive
+ * implementation thus requires a sampler_random_fract() call for each
+ * block number. But we can reduce this to one sampler_random_fract()
+ * call per selected block, by noting that each time the while-test
+ * succeeds, we can reinterpret V as a uniform random number in the range
+ * 0 to p. Therefore, instead of choosing a new V, we just adjust p to be
+ * the appropriate fraction of its former value, and our next loop
+ * makes the appropriate probabilistic test.
+ *
+ * We have initially K > k > 0. If the loop reduces K to equal k,
+ * the next while-test must fail since p will become exactly zero
+ * (we assume there will not be roundoff error in the division).
+ * (Note: Knuth suggests a "<=" loop condition, but we use "<" just
+ * to be doubly sure about roundoff error.) Therefore K cannot become
+ * less than k, which means that we cannot fail to select enough blocks.
+ *----------
+ */
+ V = sampler_random_fract();
+ p = 1.0 - (double) k / (double) K;
+ while (V < p)
+ {
+ /* skip */
+ bs->t++;
+ K--; /* keep K == N - t */
+
+ /* adjust p to be new cutoff point in reduced range */
+ p *= 1.0 - (double) k / (double) K;
+ }
+
+ /* select */
+ bs->m++;
+ return bs->t++;
+}
+
+
+/*----------
+ * Random number generator used by sampling
+ *----------
+ */
+
+void
+sampler_setseed(long seed)
+{
+ _sampler_seed[0] = 0x330e;
+ _sampler_seed[1] = (unsigned short) seed;
+ _sampler_seed[2] = (unsigned short) (seed >> 16);
+}
+
+/* Select a random value R uniformly distributed in [0, 1) */
+double
+sampler_random_fract(void)
+{
+ return pg_erand48(_sampler_seed);
+}
diff --git a/src/include/access/tsm_bernoulli.h b/src/include/access/tsm_bernoulli.h
new file mode 100644
index 0000000..9488710
--- /dev/null
+++ b/src/include/access/tsm_bernoulli.h
@@ -0,0 +1,19 @@
+/*--------------------------------------------------------------------------
+ * tsm_bernoulli.h
+ * Header file for BERNOULLI table sampling method.
+ *
+ * Copyright (c) 2006-2014, PostgreSQL Global Development Group
+ *
+ * src/include/access/tsm_bernoulli.h
+ *--------------------------------------------------------------------------
+ */
+#ifndef TSM_BERNOULLI_H
+#define TSM_BERNOULLI_H
+
+extern Datum tsm_bernoulli_init(PG_FUNCTION_ARGS);
+extern Datum tsm_bernoulli_nextblock(PG_FUNCTION_ARGS);
+extern Datum tsm_bernoulli_nexttuple(PG_FUNCTION_ARGS);
+extern Datum tsm_bernoulli_end(PG_FUNCTION_ARGS);
+extern Datum tsm_bernoulli_reset(PG_FUNCTION_ARGS);
+
+#endif /* TSM_BERNOULLI_H */
diff --git a/src/include/access/tsm_system.h b/src/include/access/tsm_system.h
new file mode 100644
index 0000000..37253da
--- /dev/null
+++ b/src/include/access/tsm_system.h
@@ -0,0 +1,19 @@
+/*--------------------------------------------------------------------------
+ * tsm_system.h
+ * Header file for SYSTEM table sampling method.
+ *
+ * Copyright (c) 2006-2014, PostgreSQL Global Development Group
+ *
+ * src/include/access/tsm_system.h
+ *--------------------------------------------------------------------------
+ */
+#ifndef TSM_SYSTEM_H
+#define TSM_SYSTEM_H
+
+extern Datum tsm_system_init(PG_FUNCTION_ARGS);
+extern Datum tsm_system_nextblock(PG_FUNCTION_ARGS);
+extern Datum tsm_system_nexttuple(PG_FUNCTION_ARGS);
+extern Datum tsm_system_end(PG_FUNCTION_ARGS);
+extern Datum tsm_system_reset(PG_FUNCTION_ARGS);
+
+#endif /* TSM_SYSTEM_H */
diff --git a/src/include/catalog/indexing.h b/src/include/catalog/indexing.h
index bde1a84..d40cfe6 100644
--- a/src/include/catalog/indexing.h
+++ b/src/include/catalog/indexing.h
@@ -305,6 +305,11 @@ DECLARE_UNIQUE_INDEX(pg_policy_oid_index, 3257, on pg_policy using btree(oid oid
DECLARE_UNIQUE_INDEX(pg_policy_polrelid_polname_index, 3258, on pg_policy using btree(polrelid oid_ops, polname name_ops));
#define PolicyPolrelidPolnameIndexId 3258
+DECLARE_UNIQUE_INDEX(pg_tablesamplemethod_name_index, 3262, on pg_tablesamplemethod using btree(tsmname name_ops));
+#define TableSampleMethodNameIndexId 3262
+DECLARE_UNIQUE_INDEX(pg_tablesamplemethod_oid_index, 3263, on pg_tablesamplemethod using btree(oid oid_ops));
+#define TableSampleMethodOidIndexId 3263
+
/* last step of initialization script: build the indexes declared above */
BUILD_INDICES
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 910cfc6..fdd83bb 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5104,6 +5104,27 @@ DESCR("rank of hypothetical row without gaps");
DATA(insert OID = 3993 ( dense_rank_final PGNSP PGUID 12 1 0 2276 0 f f f f f f i 2 0 20 "2281 2276" "{2281,2276}" "{i,v}" _null_ _null_ hypothetical_dense_rank_final _null_ _null_ _null_ ));
DESCR("aggregate final function");
+DATA(insert OID = 3265 ( tsm_system_init PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 2278 "2281 23 700" _null_ _null_ _null_ _null_ tsm_system_init _null_ _null_ _null_ ));
+DESCR("tsm_system_init(internal)");
+DATA(insert OID = 3266 ( tsm_system_nextblock PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 23 "2281" _null_ _null_ _null_ _null_ tsm_system_nextblock _null_ _null_ _null_ ));
+DESCR("tsm_system_nextblock(internal)");
+DATA(insert OID = 3267 ( tsm_system_nexttuple PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 21 "2281" _null_ _null_ _null_ _null_ tsm_system_nexttuple _null_ _null_ _null_ ));
+DESCR("tsm_system_nexttuple(internal)");
+DATA(insert OID = 3268 ( tsm_system_end PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "2281" _null_ _null_ _null_ _null_ tsm_system_end _null_ _null_ _null_ ));
+DESCR("tsm_system_end(internal)");
+DATA(insert OID = 3269 ( tsm_system_reset PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "2281" _null_ _null_ _null_ _null_ tsm_system_reset _null_ _null_ _null_ ));
+DESCR("tsm_system_reset(internal)");
+
+DATA(insert OID = 3271 ( tsm_bernoulli_init PGNSP PGUID 12 1 0 0 0 f f f f t f v 3 0 2278 "2281 23 700" _null_ _null_ _null_ _null_ tsm_bernoulli_init _null_ _null_ _null_ ));
+DESCR("tsm_bernoulli_init(internal)");
+DATA(insert OID = 3272 ( tsm_bernoulli_nextblock PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 23 "2281" _null_ _null_ _null_ _null_ tsm_bernoulli_nextblock _null_ _null_ _null_ ));
+DESCR("tsm_bernoulli_nextblock(internal)");
+DATA(insert OID = 3273 ( tsm_bernoulli_nexttuple PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 21 "2281" _null_ _null_ _null_ _null_ tsm_bernoulli_nexttuple _null_ _null_ _null_ ));
+DESCR("tsm_bernoulli_nexttuple(internal)");
+DATA(insert OID = 3274 ( tsm_bernoulli_end PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "2281" _null_ _null_ _null_ _null_ tsm_bernoulli_end _null_ _null_ _null_ ));
+DESCR("tsm_bernoulli_end(internal)");
+DATA(insert OID = 3275 ( tsm_bernoulli_reset PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 2278 "2281" _null_ _null_ _null_ _null_ tsm_bernoulli_reset _null_ _null_ _null_ ));
+DESCR("tsm_bernoulli_reset(internal)");
/*
* Symbolic values for provolatile column: these indicate whether the result
diff --git a/src/include/catalog/pg_tablesamplemethod.h b/src/include/catalog/pg_tablesamplemethod.h
new file mode 100644
index 0000000..229d8d2
--- /dev/null
+++ b/src/include/catalog/pg_tablesamplemethod.h
@@ -0,0 +1,68 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_tablesamplemethod.h
+ * definition of the table sampling methods.
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/catalog/pg_tablesamplemethod.h
+ *
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_TABLESAMPLEMETHOD_H
+#define PG_TABLESAMPLEMETHOD_H
+
+#include "catalog/genbki.h"
+
+/* ----------------
+ * pg_tablesamplemethod definition. cpp turns this into
+ * typedef struct FormData_pg_tablesamplemethod
+ * ----------------
+ */
+#define TableSampleMethodRelationId 3261
+
+CATALOG(pg_tablesamplemethod,3261)
+{
+ NameData tsmname; /* tablesample method name */
+ regproc tsminit; /* init scan function */
+ regproc tsmnextblock; /* function returning next block to sample
+ or InvalidBlockNumber if finished */
+ regproc tsmnexttuple; /* function returning next tuple offset from current block
+ or InvalidOffsetNumber if end of the block was reached */
+ regproc tsmend; /* end scan function */
+ regproc tsmreset; /* reset state - used by rescan */
+} FormData_pg_tablesamplemethod;
+
+/* ----------------
+ * Form_pg_tablesamplemethod corresponds to a pointer to a tuple with
+ * the format of pg_tablesamplemethod relation.
+ * ----------------
+ */
+typedef FormData_pg_tablesamplemethod *Form_pg_tablesamplemethod;
+
+/* ----------------
+ * compiler constants for pg_tablesamplemethod
+ * ----------------
+ */
+#define Natts_pg_tablesamplemethod 6
+#define Anum_pg_tablesamplemethod_tsmname 1
+#define Anum_pg_tablesamplemethod_tsminit 2
+#define Anum_pg_tablesamplemethod_tsmnextblock 3
+#define Anum_pg_tablesamplemethod_tsmnexttuple 4
+#define Anum_pg_tablesamplemethod_tsmend 5
+#define Anum_pg_tablesamplemethod_tsmreset 6
+
+/* ----------------
+ * initial contents of pg_tablesamplemethod
+ * ----------------
+ */
+
+DATA(insert OID = 3264 ( system tsm_system_init tsm_system_nextblock tsm_system_nexttuple tsm_system_end tsm_system_reset ));
+DESCR("SYSTEM table sampling method");
+DATA(insert OID = 3270 ( bernoulli tsm_bernoulli_init tsm_bernoulli_nextblock tsm_bernoulli_nexttuple tsm_bernoulli_end tsm_bernoulli_reset ));
+DESCR("BERNOULLI table sampling method");
+
+#endif /* PG_TABLESAMPLEMETHOD_H */
diff --git a/src/include/executor/nodeSamplescan.h b/src/include/executor/nodeSamplescan.h
new file mode 100644
index 0000000..4b769da
--- /dev/null
+++ b/src/include/executor/nodeSamplescan.h
@@ -0,0 +1,24 @@
+/*-------------------------------------------------------------------------
+ *
+ * nodeSamplescan.h
+ *
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/executor/nodeSamplescan.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef NODESAMPLESCAN_H
+#define NODESAMPLESCAN_H
+
+#include "nodes/execnodes.h"
+
+extern SampleScanState *ExecInitSampleScan(SampleScan *node, EState *estate, int eflags);
+extern TupleTableSlot *ExecSampleScan(SampleScanState *node);
+extern void ExecEndSampleScan(SampleScanState *node);
+extern void ExecReScanSampleScan(SampleScanState *node);
+
+#endif /* NODESAMPLESCAN_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41b13b2..b7f3129 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1212,6 +1212,26 @@ typedef struct ScanState
typedef ScanState SeqScanState;
/*
+ * SampleScan
+ */
+typedef struct SampleScanState
+{
+ ScanState ss;
+
+ /* Sampling method functions. */
+ FmgrInfo tsminit;
+ FmgrInfo tsmnextblock;
+ FmgrInfo tsmnexttuple;
+ FmgrInfo tsmend;
+ FmgrInfo tsmreset;
+
+ Buffer openbuffer; /* currently open buffer */
+ HeapTupleData tup; /* last tuple */
+
+ void *tsmdata; /* for use by table sampling method */
+} SampleScanState;
+
+/*
* These structs store information about index quals that don't have simple
* constant right-hand sides. See comments for ExecIndexBuildScanKeys()
* for discussion.
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index bc71fea..cca592e 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -51,6 +51,7 @@ typedef enum NodeTag
T_BitmapOr,
T_Scan,
T_SeqScan,
+ T_SampleScan,
T_IndexScan,
T_IndexOnlyScan,
T_BitmapIndexScan,
@@ -97,6 +98,7 @@ typedef enum NodeTag
T_BitmapOrState,
T_ScanState,
T_SeqScanState,
+ T_SampleScanState,
T_IndexScanState,
T_IndexOnlyScanState,
T_BitmapIndexScanState,
@@ -413,6 +415,8 @@ typedef enum NodeTag
T_XmlSerialize,
T_WithClause,
T_CommonTableExpr,
+ T_RangeTableSample,
+ T_TableSampleClause,
/*
* TAGS FOR REPLICATION GRAMMAR PARSE NODES (replnodes.h)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 5eaa435..cc1dd40 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -307,6 +307,21 @@ typedef struct FuncCall
} FuncCall;
/*
+ * TableSampleClause - sampling method information
+ */
+typedef struct TableSampleClause
+{
+ NodeTag type;
+ Oid tsmid;
+ Oid tsminit;
+ Oid tsmnextblock;
+ Oid tsmnexttuple;
+ Oid tsmend;
+ Oid tsmreset;
+ List *args;
+} TableSampleClause;
+
+/*
* A_Star - '*' representing all columns of a table or compound field
*
* This can appear within ColumnRef.fields, A_Indirection.indirection, and
@@ -507,6 +522,20 @@ typedef struct RangeFunction
} RangeFunction;
/*
+ * RangeTableSample - represents <table> TABLESAMPLE <method> (<params>) REPEATABLE (<num>)
+ *
+ * We are more generic than SQL Standard so we pass generic function
+ * arguments to the sampling method.
+ */
+typedef struct RangeTableSample
+{
+ NodeTag type;
+ RangeVar *relation;
+ char *method; /* sampling method */
+ List *args; /* arguments for sampling method */
+} RangeTableSample;
+
+/*
* ColumnDef - column definition (used in various creates)
*
* If the column has a default value, we may have the value expression
@@ -751,6 +780,7 @@ typedef struct RangeTblEntry
*/
Oid relid; /* OID of the relation */
char relkind; /* relation kind (see pg_class.relkind) */
+ TableSampleClause *tablesample; /* sampling method and parameters */
/*
* Fields valid for a subquery RTE (else NULL):
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 48203a0..8427b44 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -278,6 +278,12 @@ typedef struct Scan
typedef Scan SeqScan;
/* ----------------
+ * table sample scan node
+ * ----------------
+ */
+typedef Scan SampleScan;
+
+/* ----------------
* index scan node
*
* indexqualorig is an implicitly-ANDed list of index qual expressions, each
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 75e2afb..889c61c 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -68,6 +68,8 @@ extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
double index_pages, PlannerInfo *root);
extern void cost_seqscan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
ParamPathInfo *param_info);
+extern void cost_samplescan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
+ ParamPathInfo *param_info);
extern void cost_index(IndexPath *path, PlannerInfo *root,
double loop_count);
extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 26b17f5..b96f903 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -32,6 +32,8 @@ extern bool add_path_precheck(RelOptInfo *parent_rel,
extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
Relids required_outer);
+extern Path *create_samplescan_path(PlannerInfo *root, RelOptInfo *rel,
+ Relids required_outer);
extern IndexPath *create_index_path(PlannerInfo *root,
IndexOptInfo *index,
List *indexclauses,
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index e14dc9a..e565082 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -312,7 +312,7 @@ PG_KEYWORD("reindex", REINDEX, UNRESERVED_KEYWORD)
PG_KEYWORD("relative", RELATIVE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("release", RELEASE, UNRESERVED_KEYWORD)
PG_KEYWORD("rename", RENAME, UNRESERVED_KEYWORD)
-PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
+PG_KEYWORD("repeatable", REPEATABLE, RESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
@@ -366,6 +366,7 @@ PG_KEYWORD("sysid", SYSID, UNRESERVED_KEYWORD)
PG_KEYWORD("system", SYSTEM_P, UNRESERVED_KEYWORD)
PG_KEYWORD("table", TABLE, RESERVED_KEYWORD)
PG_KEYWORD("tables", TABLES, UNRESERVED_KEYWORD)
+PG_KEYWORD("tablesample", TABLESAMPLE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("tablespace", TABLESPACE, UNRESERVED_KEYWORD)
PG_KEYWORD("temp", TEMP, UNRESERVED_KEYWORD)
PG_KEYWORD("template", TEMPLATE, UNRESERVED_KEYWORD)
diff --git a/src/include/parser/parse_func.h b/src/include/parser/parse_func.h
index 4423bc0..0202bf5 100644
--- a/src/include/parser/parse_func.h
+++ b/src/include/parser/parse_func.h
@@ -33,6 +33,9 @@ typedef enum
extern Node *ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
FuncCall *fn, int location);
+extern TableSampleClause *ParseTableSample(ParseState *pstate,
+ char *samplemethod, List *args);
+
extern FuncDetailCode func_get_detail(List *funcname,
List *fargs, List *fargnames,
int nargs, Oid *argtypes,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 48ebf59..1ba06b6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -63,7 +63,6 @@ typedef struct RelationAmInfo
FmgrInfo amcanreturn;
} RelationAmInfo;
-
/*
* Here are the contents of a relation cache entry.
*/
diff --git a/src/include/utils/sampling.h b/src/include/utils/sampling.h
new file mode 100644
index 0000000..607f75f
--- /dev/null
+++ b/src/include/utils/sampling.h
@@ -0,0 +1,43 @@
+/*-------------------------------------------------------------------------
+ *
+ * sampling.h
+ * definitions for sampling functions
+ *
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/sampling.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SAMPLING_H
+#define SAMPLING_H
+
+#include "storage/bufmgr.h"
+
+/* Data structure for Algorithm S from Knuth 3.4.2 */
+typedef struct
+{
+ BlockNumber N; /* number of blocks, known in advance */
+ int n; /* desired sample size */
+ BlockNumber t; /* current block number */
+ int m; /* blocks selected so far */
+} BlockSamplerData;
+
+typedef BlockSamplerData *BlockSampler;
+
+extern void BlockSampler_Init(BlockSampler bs, BlockNumber nblocks,
+ int samplesize);
+extern bool BlockSampler_HasMore(BlockSampler bs);
+extern BlockNumber BlockSampler_Next(BlockSampler bs);
+
+/* Vitter reservoir sampling functions */
+extern double vitter_init_selection_state(int n);
+extern double vitter_get_next_S(double t, int n, double *stateptr);
+
+/* Random generator */
+extern void sampler_setseed(long seed);
+extern double sampler_random_fract(void);
+
+#endif /* SAMPLING_H */
diff --git a/src/include/utils/syscache.h b/src/include/utils/syscache.h
index f97229f..29244c7 100644
--- a/src/include/utils/syscache.h
+++ b/src/include/utils/syscache.h
@@ -79,6 +79,8 @@ enum SysCacheIdentifier
RELOID,
RULERELNAME,
STATRELATTINH,
+ TABLESAMPLEMETHODNAME,
+ TABLESAMPLEMETHODOID,
TABLESPACEOID,
TSCONFIGMAP,
TSCONFIGNAMENSP,
diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index c7be273..970d4da 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/test/regress/expected/sanity_check.out
@@ -127,6 +127,7 @@ pg_shdepend|t
pg_shdescription|t
pg_shseclabel|t
pg_statistic|t
+pg_tablesamplemethod|t
pg_tablespace|t
pg_trigger|t
pg_ts_config|t
diff --git a/src/test/regress/expected/tablesample.out b/src/test/regress/expected/tablesample.out
new file mode 100644
index 0000000..79ed140
--- /dev/null
+++ b/src/test/regress/expected/tablesample.out
@@ -0,0 +1,68 @@
+CREATE TABLE test_tablesample (id INT, name text) WITH (fillfactor=10); -- force smaller pages so we don't have to load too much data to get multiple pages
+INSERT INTO test_tablesample SELECT i, repeat(i::text, 200) FROM generate_series(0, 9) s(i) ORDER BY i;
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (50) REPEATABLE (10);
+ id
+----
+ 0
+ 1
+ 2
+ 3
+ 4
+ 5
+ 9
+(7 rows)
+
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (100.0/11) REPEATABLE (9999);
+ id
+----
+ 6
+ 7
+ 8
+(3 rows)
+
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (100) REPEATABLE (10);
+ id
+----
+ 0
+ 1
+ 2
+ 3
+ 4
+ 5
+ 6
+ 7
+ 8
+ 9
+(10 rows)
+
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (50) REPEATABLE (100);
+ id
+----
+ 0
+ 1
+ 2
+ 6
+ 7
+ 8
+ 9
+(7 rows)
+
+SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (50) REPEATABLE (100);
+ id
+----
+ 0
+ 1
+ 3
+ 4
+ 5
+ 9
+(6 rows)
+
+SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (5.5) REPEATABLE (1);
+ id
+----
+ 0
+ 5
+(2 rows)
+
+DROP TABLE test_tablesample;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 62cc198..cf789dc 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -83,7 +83,7 @@ test: select_into select_distinct select_distinct_on select_implicit select_havi
# ----------
# Another group of parallel tests
# ----------
-test: brin gin gist spgist privileges security_label collate matview lock replica_identity rowsecurity
+test: brin gin gist spgist privileges security_label collate matview lock replica_identity rowsecurity tablesample
# ----------
# Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 07fc827..852fed9 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -151,3 +151,4 @@ test: with
test: xml
test: event_trigger
test: stats
+test: tablesample
diff --git a/src/test/regress/sql/tablesample.sql b/src/test/regress/sql/tablesample.sql
new file mode 100644
index 0000000..5f6e828
--- /dev/null
+++ b/src/test/regress/sql/tablesample.sql
@@ -0,0 +1,12 @@
+CREATE TABLE test_tablesample (id INT, name text) WITH (fillfactor=10); -- force smaller pages so we don't have to load too much data to get multiple pages
+
+INSERT INTO test_tablesample SELECT i, repeat(i::text, 200) FROM generate_series(0, 9) s(i) ORDER BY i;
+
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (50) REPEATABLE (10);
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (100.0/11) REPEATABLE (9999);
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (100) REPEATABLE (10);
+SELECT id FROM test_tablesample TABLESAMPLE SYSTEM (50) REPEATABLE (100);
+SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (50) REPEATABLE (100);
+SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (5.5) REPEATABLE (1);
+
+DROP TABLE test_tablesample;
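
For anyone reviewing the block sampler without a build tree at hand, here is a minimal standalone sketch of the same skip loop (Knuth's Algorithm S, one random draw per selected block). It is not part of the patch: DemoSampler, demo_setseed, demo_random_fract, demo_next and demo_sample are hypothetical names, and the generator is a plain 48-bit LCG standing in for pg_erand48, so the block numbers it picks will not match what the backend would return.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for BlockSamplerData; not the patch's struct. */
typedef struct
{
    uint32_t N;   /* total number of blocks, known in advance */
    int      n;   /* desired sample size */
    uint32_t t;   /* next block number to consider */
    int      m;   /* blocks selected so far */
} DemoSampler;

/* Assumption: a simple 48-bit LCG replaces pg_erand48 for portability. */
static uint64_t demo_state = 0x1234ABCD330EULL;

static void
demo_setseed(long seed)
{
    /* low 16 bits fixed, seed in the upper 32 bits of the 48-bit state */
    demo_state = (((uint64_t) seed & 0xFFFFFFFFULL) << 16) | 0x330EULL;
}

/* Returns a value in [0, 1). */
static double
demo_random_fract(void)
{
    demo_state = (demo_state * 0x5DEECE66DULL + 0xBULL) & ((1ULL << 48) - 1);
    return (double) demo_state / (double) (1ULL << 48);
}

/* Caller must ensure t < N and m < n, as BlockSampler_HasMore does. */
static uint32_t
demo_next(DemoSampler *bs)
{
    uint32_t K = bs->N - bs->t;     /* remaining blocks */
    int      k = bs->n - bs->m;     /* blocks still to sample */
    double   p;
    double   V;

    if ((uint32_t) k >= K)
    {
        /* need all the rest */
        bs->m++;
        return bs->t++;
    }

    /* One draw V; p shrinks to the cumulative skip probability. */
    V = demo_random_fract();
    p = 1.0 - (double) k / (double) K;
    while (V < p)
    {
        bs->t++;                    /* skip this block */
        K--;                        /* keep K == N - t */
        p *= 1.0 - (double) k / (double) K;
    }

    bs->m++;
    return bs->t++;
}

/* Fill out[] with the sampled block numbers and return how many were taken. */
static int
demo_sample(uint32_t nblocks, int samplesize, long seed, uint32_t *out)
{
    DemoSampler bs = { nblocks, samplesize, 0, 0 };
    int         count = 0;

    demo_setseed(seed);
    while (bs.t < bs.N && bs.m < bs.n)
        out[count++] = demo_next(&bs);
    return count;
}
```

Calling demo_sample(1000, 8, 42, buf) with a buffer of at least 8 entries should fill it with 8 strictly increasing block numbers below 1000, and the same seed should give the same list again, which is the property REPEATABLE relies on. When the table has fewer blocks than the sample size, every block is returned in order.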
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)