Table AM modifications to accept column projection lists

2020-11-13 Thread Soumyadeep Chakraborty
Hello,

This patch introduces a set of changes to the table AM APIs, making them
accept a column projection list. That helps columnar table AMs, so that
they don't need to fetch all columns from disk, but only the ones
actually needed.

The set of changes in this patch is not exhaustive -
there are many more opportunities that are discussed in the TODO section
below. Before digging deeper, we want to elicit early feedback on the
API changes and the column extraction logic.

TableAM APIs that have been modified are:

1. Sequential scan APIs
2. Index scan APIs
3. API to lock and return a row
4. API to fetch a single row

We have seen performance benefits in Zedstore for many of the optimized
operations [0]. This patch is extracted from the larger patch shared in
[0].


Building the column projection set:

In terms of building the column projection set necessary for each of
these APIs, this patch builds off of the scanCols patch [1], which
Ashwin and Melanie had started earlier. As noted in [1], there are cases
where the scanCols set is not representative of the columns to be
projected. For instance, in a DELETE .. RETURNING query, there is
typically a sequential scan and a separate invocation of
tuple_fetch_row_version() in order to satisfy the RETURNING clause (see
ExecDelete()). So for a query such as:

  DELETE from foo WHERE i < 100 && j < 1000 RETURNING k, l;

We need to pass the set (i, j) to the scan and (k, l) to the
tuple_fetch_row_version() invocation. This is why we had to introduce
the returningCols field.

In the same spirit, separate column projection sets are computed for any
operations that involve an EPQ check (INSERT, DELETE, UPDATE, row-level
locking etc), the columns involved in an ON CONFLICT UPDATE etc.

Recognizing and collecting these sets of columns is done at various
stages: analyze and rewrite, planner and executor - depending on the
type of operation for which the subset of columns is calculated. The
column bitmaps are stored in different places as well - such as the ones
for scans and RETURNING are stored in RangeTblEntry, whereas the set of
columns for ON CONFLICT UPDATE are stored in OnConflictSetState.


Table AM API changes:

The changes made to the table AM API, introducing the column projection
set, come in different flavors. We would like feedback on what style
we need to converge to or if we should use different styles depending
on the situation.

- A new function variant that takes a column projection list, such as:

  TableScanDesc (*scan_begin) (Relation rel,
Snapshot snapshot,
int nkeys, struct ScanKeyData *key,
ParallelTableScanDesc pscan,
uint32 flags);
->

  TableScanDesc (*scan_begin_with_column_projection)(Relation relation,
 Snapshot snapshot,
 int nkeys, struct ScanKeyData *key,
 ParallelTableScanDesc parallel_scan,
 uint32 flags,
 Bitmapset *project_columns);

- Modifying the existing function to take a column projection list, such
as:

  TM_Result (*tuple_lock) (Relation rel,
 ItemPointer tid,
 Snapshot snapshot,
 TupleTableSlot *slot,
 CommandId cid,
 LockTupleMode mode,
 LockWaitPolicy wait_policy,
 uint8 flags,
 TM_FailureData *tmfd);

->

  TM_Result (*tuple_lock) (Relation rel,
 ItemPointer tid,
 Snapshot snapshot,
 TupleTableSlot *slot,
 CommandId cid,
 LockTupleMode mode,
 LockWaitPolicy wait_policy,
 uint8 flags,
 TM_FailureData *tmfd,
 Bitmapset *project_cols);

- A new function index_fetch_set_column_projection() to be called after
index_beginscan() to set the column projection set, which will be used
later by index_getnext_slot().

  void (*index_fetch_set_column_projection) (struct IndexFetchTableData *data,
 Bitmapset *project_columns);

The set of columns expected by the new/modified functions is represented
as a Bitmapset of attnums for a specific base relation. An empty/NULL
bitmap signals to the AM that no data columns are needed. A bitmap
containing the single element 0 indicates that we want all data columns
to be fetched.

The bitmaps do not include system columns.

Additionally, the TupleTableSlots populated by functions such
as table_scan_getnextslot(), need to be densely filled upto the highest
numbered column in the projection list (any column not in the projection
list should be populated with NULL). This is due to the implicit
assumptions of the slot_get_***() APIs.


TODOs:

- Explore opportunities to push the column extraction logic to the
planner or pre-planner stages from the executor stage (like scanCols and
returningCols), or at least elevate the column extraction logic to be
done once per executor run instead of once per tuple.

- As was requested in [1], we should guard column projection set
extraction logic

Re: Table AM modifications to accept column projection lists

2021-03-01 Thread Justin Pryzby
On Thu, Dec 31, 2020 at 01:02:24PM -0800, Soumyadeep Chakraborty wrote:
> Hey Masahiko,
> 
> I added it to the Jan CF (https://commitfest.postgresql.org/31/2922/).
> 
> PFA a rebased version against latest head.

Thanks for working on this.

-   
pgstat_progress_update_param(PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID,

   
-
RelationGetRelid(childrel));
  
-   

   

Why is this removed ?

+* returningCols of it's base table's RTE.

its (possessive) not it's ("it is")

-- 
Justin




Re: Table AM modifications to accept column projection lists

2021-03-01 Thread Jacob Champion
On Mon, 2021-03-01 at 16:59 -0600, Justin Pryzby wrote:
> -   
> pgstat_progress_update_param(PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID,  
>   
>
> 
> -
> RelationGetRelid(childrel));  
> 
> - 
>   
>
> 
> Why is this removed ?

Mm, both of those analyze.c changes seem suspect. Possibly a mismerge
from the zedstore branch; let me double-check.

> 
> +* returningCols of it's base table's RTE.
> 
> its (possessive) not it's ("it is")

Thanks, I'll fix this at the same time.

--Jacob


Re: Table AM modifications to accept column projection lists

2021-03-02 Thread Jacob Champion
On Mon, 2021-03-01 at 23:13 +, Jacob Champion wrote:
> On Mon, 2021-03-01 at 16:59 -0600, Justin Pryzby wrote:
> > Why is this removed ?
> 
> Mm, both of those analyze.c changes seem suspect. Possibly a mismerge
> from the zedstore branch; let me double-check.
> 
> > its (possessive) not it's ("it is")
> 
> Thanks, I'll fix this at the same time.

Both fixed in v3; thanks for the catch!

--Jacob
From 484cdaf5050f4010127ae2de5624f39d7f12982d Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Tue, 2 Mar 2021 08:59:45 -0800
Subject: [PATCH v3] tableam: accept column projection list

Co-authored-by: Soumyadeep Chakraborty 
Co-authored-by: Melanie Plageman 
Co-authored-by: Ashwin Agrawal 
Co-authored-by: Jacob Champion 
---
 src/backend/access/heap/heapam_handler.c |   5 +-
 src/backend/access/nbtree/nbtsort.c  |   3 +-
 src/backend/access/table/tableam.c   |   5 +-
 src/backend/commands/trigger.c   |  19 +++-
 src/backend/executor/execMain.c  |   3 +-
 src/backend/executor/execPartition.c |   2 +
 src/backend/executor/execReplication.c   |   6 +-
 src/backend/executor/execScan.c  | 108 +
 src/backend/executor/nodeIndexscan.c |  10 ++
 src/backend/executor/nodeLockRows.c  |   7 +-
 src/backend/executor/nodeModifyTable.c   |  38 ++--
 src/backend/executor/nodeSeqscan.c   |  43 -
 src/backend/executor/nodeTidscan.c   |  11 ++-
 src/backend/nodes/copyfuncs.c|   2 +
 src/backend/nodes/equalfuncs.c   |   2 +
 src/backend/nodes/outfuncs.c |   2 +
 src/backend/nodes/readfuncs.c|   2 +
 src/backend/optimizer/path/allpaths.c|  85 +++-
 src/backend/optimizer/plan/planner.c |  19 
 src/backend/optimizer/prep/preptlist.c   |  13 ++-
 src/backend/optimizer/util/inherit.c |  37 ++-
 src/backend/parser/analyze.c |  40 ++--
 src/backend/parser/parse_relation.c  |  18 
 src/backend/partitioning/partbounds.c|  14 ++-
 src/backend/rewrite/rewriteHandler.c |   8 ++
 src/include/access/tableam.h | 118 +--
 src/include/executor/executor.h  |   6 ++
 src/include/nodes/execnodes.h|   1 +
 src/include/nodes/parsenodes.h   |  14 +++
 src/include/utils/rel.h  |  14 +++
 30 files changed, 611 insertions(+), 44 deletions(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index bd5faf0c1f..5d7bc5f6bd 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -180,7 +180,8 @@ static bool
 heapam_fetch_row_version(Relation relation,
 		 ItemPointer tid,
 		 Snapshot snapshot,
-		 TupleTableSlot *slot)
+		 TupleTableSlot *slot,
+		 Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	Buffer		buffer;
@@ -348,7 +349,7 @@ static TM_Result
 heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
   TupleTableSlot *slot, CommandId cid, LockTupleMode mode,
   LockWaitPolicy wait_policy, uint8 flags,
-  TM_FailureData *tmfd)
+  TM_FailureData *tmfd, Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	TM_Result	result;
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 2c4d7f6e25..57462eb9c7 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1966,7 +1966,8 @@ _bt_parallel_scan_and_sort(BTSpool *btspool, BTSpool *btspool2,
 	indexInfo = BuildIndexInfo(btspool->index);
 	indexInfo->ii_Concurrent = btshared->isconcurrent;
 	scan = table_beginscan_parallel(btspool->heap,
-	ParallelTableScanFromBTShared(btshared));
+	ParallelTableScanFromBTShared(btshared),
+	NULL);
 	reltuples = table_index_build_scan(btspool->heap, btspool->index, indexInfo,
 	   true, progress, _bt_build_callback,
 	   (void *) &buildstate, scan);
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index 5ea5bdd810..9bda57247a 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -172,7 +172,7 @@ table_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan,
 }
 
 TableScanDesc
-table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan)
+table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan, Bitmapset *proj)
 {
 	Snapshot	snapshot;
 	uint32		flags = SO_TYPE_SEQSCAN |
@@ -194,6 +194,9 @@ table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan)
 		snapshot = SnapshotAny;
 	}
 
+	if (proj)
+		return relation->rd_tableam->scan_begin_with_column_projection(relation, snapshot, 0, NULL,
+			parallel_scan, flags, proj);
 	return relation->rd_tableam->scan_begin(relation, snapshot, 0, NULL,
 			pa

Re: Table AM modifications to accept column projection lists

2021-03-02 Thread Zhihong Yu
Hi,

+   /* Make sure the the new slot is not dependent on the original tuple */

There is duplicate 'the'.

For neededColumnContextWalker(),

+   else if(var->varattno == 0) {

I think the if following the else is not needed - I assume var->varattno
wouldn't be negative.
Similar comment for extract_scan_columns().

+   while ((col = bms_next_member(parent_cols, col)) >= 0)
+   {
+   Var *var = (Var *) list_nth(translated_vars, col - 1);

If col is 0, do you still want to call list_nth() ?

Cheers

On Tue, Mar 2, 2021 at 9:10 AM Jacob Champion  wrote:

> On Mon, 2021-03-01 at 23:13 +, Jacob Champion wrote:
> > On Mon, 2021-03-01 at 16:59 -0600, Justin Pryzby wrote:
> > > Why is this removed ?
> >
> > Mm, both of those analyze.c changes seem suspect. Possibly a mismerge
> > from the zedstore branch; let me double-check.
> >
> > > its (possessive) not it's ("it is")
> >
> > Thanks, I'll fix this at the same time.
>
> Both fixed in v3; thanks for the catch!
>
> --Jacob
>


Re: Table AM modifications to accept column projection lists

2021-03-03 Thread Jacob Champion
On Tue, 2021-03-02 at 10:35 -0800, Zhihong Yu wrote:
> Hi,

Thanks for the review!

> +   /* Make sure the the new slot is not dependent on the original tuple */
> 
> There is duplicate 'the'.

Thanks, I'll add this for the next batch of updates.

> For neededColumnContextWalker(),
> 
> +   else if(var->varattno == 0) {
> 
> I think the if following the else is not needed - I assume var->varattno 
> wouldn't be negative.
> Similar comment for extract_scan_columns().

I think you can have system columns in the tree here -- a common
example that we run into with the `make check` suite is ctid. (To see
this, you can change the (> 0) check just above this code into a (!= 0)
check, and then take a look at the failing cases in the test suite.)

> +   while ((col = bms_next_member(parent_cols, col)) >= 0)
> +   {
> +   Var *var = (Var *) list_nth(translated_vars, col - 1);
> 
> If col is 0, do you still want to call list_nth() ?

The (col == 0) case is handled just above this, with
contains_whole_row_col() and get_ordinal_attnos() replacing any zero
columns with the entire user-column range. If one of those functions
fails to do its job due to programmer error, we'll assert in the call
to list_nth(), and I think that's what we want.

--Jacob


Re: Table AM modifications to accept column projection lists

2021-06-01 Thread Aleksander Alekseev
Hi Soumyadeep, Jacob,

> Thanks for the review!

I came across this patch and noticed that it rotted a little, especially
after removing inheritance_planner() in 86dc9005. I managed to resolve the
conflicts on current `master` (eb89cb43), see the attached patch. The code
compiles but doesn't pass the tests. I'm currently in the process of
reviewing it and didn't figure out what the issue is yet. Just wanted to
let you know. I also believe changing the patch status to "Waiting on
Author" would be appropriate.

-- 
Best regards,
Aleksander Alekseev


wip.patch
Description: Binary data


Re: Table AM modifications to accept column projection lists

2021-06-04 Thread Jacob Champion
On Tue, 2021-06-01 at 15:38 +0300, Aleksander Alekseev wrote:
> I came across this patch and noticed that it rotted a little,
> especially after removing inheritance_planner() in 86dc9005. I
> managed to resolve the conflicts on current `master` (eb89cb43), see
> the attached patch. The code compiles but doesn't pass the tests. I'm
> currently in the process of reviewing it and didn't figure out what
> the issue is yet. Just wanted to let you know.

Hi Alexsander, thanks!

In your patch's transformInsertStmt(), I see what I think is an
extraneous call to transformReturningList() right before the ON
CONFLICT processing. That call is already done later in the function,
during the RETURNING processing (this change came in with 6c0373ab77).
Other than that, your rebased patch looks the same as mine.

>  I also believe changing the patch status to "Waiting on Author"
> would be appropriate.

Agreed. I'm going to double-check with Deep that the new calls
to table_tuple_fetch_row_version() should be projecting the full row,
then post an updated patch some time next week.

Thanks again!
--Jacob


Re: Table AM modifications to accept column projection lists

2021-06-05 Thread Zhihong Yu
On Fri, Jun 4, 2021 at 4:14 PM Jacob Champion  wrote:

> On Tue, 2021-06-01 at 15:38 +0300, Aleksander Alekseev wrote:
> > I came across this patch and noticed that it rotted a little,
> > especially after removing inheritance_planner() in 86dc9005. I
> > managed to resolve the conflicts on current `master` (eb89cb43), see
> > the attached patch. The code compiles but doesn't pass the tests. I'm
> > currently in the process of reviewing it and didn't figure out what
> > the issue is yet. Just wanted to let you know.
>
> Hi Alexsander, thanks!
>
> In your patch's transformInsertStmt(), I see what I think is an
> extraneous call to transformReturningList() right before the ON
> CONFLICT processing. That call is already done later in the function,
> during the RETURNING processing (this change came in with 6c0373ab77).
> Other than that, your rebased patch looks the same as mine.
>
> >  I also believe changing the patch status to "Waiting on Author"
> > would be appropriate.
>
> Agreed. I'm going to double-check with Deep that the new calls
> to table_tuple_fetch_row_version() should be projecting the full row,
> then post an updated patch some time next week.
>
> Thanks again!
> --Jacob
>
Hi,

+   return
relation->rd_tableam->scan_begin_with_column_projection(relation, snapshot,
0, NULL,
+   parallel_scan, flags, proj);

scan_begin_with_column_projection() adds a parameter to scan_begin().
Can scan_begin() be enhanced with this projection parameter ?
Otherwise in the future we may
have scan_begin_with_column_projection_with_x_y ...

+   /* Make sure the the new slot is not dependent on the original tuple */

Double 'the' in the comment. More than one place with duplicate 'the' in
the patch.

+typedef struct neededColumnContext
+{
+   Bitmapset **mask;
+   int n;

Should field n be named ncol ? 'n' seems too general.

+* TODO: Remove this hack!! This should be done once at the start
of the tid scan.

Would the above be addressed in the next patch ?

Toward the end of extract_scan_columns():

+   bms_free(rte->scanCols);
+   rte->scanCols = bms_make_singleton(0);
+   break;

Should 'goto outer;' be in place of 'break;' (since rte->scanCols has been
assigned for whole-row) ?

Cheers


Re: Table AM modifications to accept column projection lists

2020-12-28 Thread Masahiko Sawada
Hi Soumyadeep,

On Sat, Nov 14, 2020 at 3:02 AM Soumyadeep Chakraborty
 wrote:
>
> Hello,
>
> This patch introduces a set of changes to the table AM APIs, making them
> accept a column projection list. That helps columnar table AMs, so that
> they don't need to fetch all columns from disk, but only the ones
> actually needed.
>
> The set of changes in this patch is not exhaustive -
> there are many more opportunities that are discussed in the TODO section
> below. Before digging deeper, we want to elicit early feedback on the
> API changes and the column extraction logic.
>
> TableAM APIs that have been modified are:
>
> 1. Sequential scan APIs
> 2. Index scan APIs
> 3. API to lock and return a row
> 4. API to fetch a single row
>
> We have seen performance benefits in Zedstore for many of the optimized
> operations [0]. This patch is extracted from the larger patch shared in
> [0].
>
> 
> Building the column projection set:
>
> In terms of building the column projection set necessary for each of
> these APIs, this patch builds off of the scanCols patch [1], which
> Ashwin and Melanie had started earlier. As noted in [1], there are cases
> where the scanCols set is not representative of the columns to be
> projected. For instance, in a DELETE .. RETURNING query, there is
> typically a sequential scan and a separate invocation of
> tuple_fetch_row_version() in order to satisfy the RETURNING clause (see
> ExecDelete()). So for a query such as:
>
>   DELETE from foo WHERE i < 100 && j < 1000 RETURNING k, l;
>
> We need to pass the set (i, j) to the scan and (k, l) to the
> tuple_fetch_row_version() invocation. This is why we had to introduce
> the returningCols field.
>
> In the same spirit, separate column projection sets are computed for any
> operations that involve an EPQ check (INSERT, DELETE, UPDATE, row-level
> locking etc), the columns involved in an ON CONFLICT UPDATE etc.
>
> Recognizing and collecting these sets of columns is done at various
> stages: analyze and rewrite, planner and executor - depending on the
> type of operation for which the subset of columns is calculated. The
> column bitmaps are stored in different places as well - such as the ones
> for scans and RETURNING are stored in RangeTblEntry, whereas the set of
> columns for ON CONFLICT UPDATE are stored in OnConflictSetState.
>
> 
> Table AM API changes:
>
> The changes made to the table AM API, introducing the column projection
> set, come in different flavors. We would like feedback on what style
> we need to converge to or if we should use different styles depending
> on the situation.
>
> - A new function variant that takes a column projection list, such as:
>
>   TableScanDesc (*scan_begin) (Relation rel,
> Snapshot snapshot,
> int nkeys, struct ScanKeyData *key,
> ParallelTableScanDesc pscan,
> uint32 flags);
> ->
>
>   TableScanDesc (*scan_begin_with_column_projection)(Relation relation,
>  Snapshot snapshot,
>  int nkeys, struct ScanKeyData *key,
>  ParallelTableScanDesc parallel_scan,
>  uint32 flags,
>  Bitmapset *project_columns);
>
> - Modifying the existing function to take a column projection list, such
> as:
>
>   TM_Result (*tuple_lock) (Relation rel,
>  ItemPointer tid,
>  Snapshot snapshot,
>  TupleTableSlot *slot,
>  CommandId cid,
>  LockTupleMode mode,
>  LockWaitPolicy wait_policy,
>  uint8 flags,
>  TM_FailureData *tmfd);
>
> ->
>
>   TM_Result (*tuple_lock) (Relation rel,
>  ItemPointer tid,
>  Snapshot snapshot,
>  TupleTableSlot *slot,
>  CommandId cid,
>  LockTupleMode mode,
>  LockWaitPolicy wait_policy,
>  uint8 flags,
>  TM_FailureData *tmfd,
>  Bitmapset *project_cols);
>
> - A new function index_fetch_set_column_projection() to be called after
> index_beginscan() to set the column projection set, which will be used
> later by index_getnext_slot().
>
>   void (*index_fetch_set_column_projection) (struct IndexFetchTableData *data,
>  Bitmapset *project_columns);
>
> The set of columns expected by the new/modified functions is represented
> as a Bitmapset of attnums for a specific base relation. An empty/NULL
> bitmap signals to the AM that no data columns are needed. A bitmap
> containing the single element 0 indicates that we want all data columns
> to be fetched.
>
> The bitmaps do not include system columns.
>
> Additionally, the TupleTableSlots populated by functions such
> as table_scan_getnextslot(), need to be densely filled upto the highest
> numbered column in the projection list (any column not in the projection
> list should be populated with NULL). This is due to the implicit
> assumptions of the slot_get_***() APIs.
>
> 
> TODOs:
>
> - Explore opportunities to push the co

Re: Table AM modifications to accept column projection lists

2020-12-31 Thread Soumyadeep Chakraborty
Hey Masahiko,

I added it to the Jan CF (https://commitfest.postgresql.org/31/2922/).

PFA a rebased version against latest head.

Regards,
Soumyadeep
From d033a0e3bceaf6e3f861e08363d4f170bc2a9fea Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Mon, 9 Nov 2020 16:36:10 -0800
Subject: [PATCH v2 1/1] tableam: accept column projection list

Co-authored-by: Soumyadeep Chakraborty 
Co-authored-by: Melanie Plageman 
Co-authored-by: Ashwin Agrawal 
Co-authored-by: Jacob Champion 
---
 src/backend/access/heap/heapam_handler.c |   5 +-
 src/backend/access/nbtree/nbtsort.c  |   3 +-
 src/backend/access/table/tableam.c   |   5 +-
 src/backend/commands/analyze.c   |   5 +-
 src/backend/commands/trigger.c   |  19 +++-
 src/backend/executor/execMain.c  |   3 +-
 src/backend/executor/execPartition.c |   2 +
 src/backend/executor/execReplication.c   |   6 +-
 src/backend/executor/execScan.c  | 108 +
 src/backend/executor/nodeIndexscan.c |  10 ++
 src/backend/executor/nodeLockRows.c  |   7 +-
 src/backend/executor/nodeModifyTable.c   |  38 ++--
 src/backend/executor/nodeSeqscan.c   |  43 -
 src/backend/executor/nodeTidscan.c   |  11 ++-
 src/backend/nodes/copyfuncs.c|   2 +
 src/backend/nodes/equalfuncs.c   |   2 +
 src/backend/nodes/outfuncs.c |   2 +
 src/backend/nodes/readfuncs.c|   2 +
 src/backend/optimizer/path/allpaths.c|  85 +++-
 src/backend/optimizer/plan/planner.c |  19 
 src/backend/optimizer/prep/preptlist.c   |  13 ++-
 src/backend/optimizer/util/inherit.c |  37 ++-
 src/backend/parser/analyze.c |  40 ++--
 src/backend/parser/parse_relation.c  |  18 
 src/backend/partitioning/partbounds.c|  14 ++-
 src/backend/rewrite/rewriteHandler.c |   8 ++
 src/include/access/tableam.h | 118 +--
 src/include/executor/executor.h  |   6 ++
 src/include/nodes/execnodes.h|   1 +
 src/include/nodes/parsenodes.h   |  14 +++
 src/include/utils/rel.h  |  14 +++
 31 files changed, 612 insertions(+), 48 deletions(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3eea215b85..c18641700b 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -180,7 +180,8 @@ static bool
 heapam_fetch_row_version(Relation relation,
 		 ItemPointer tid,
 		 Snapshot snapshot,
-		 TupleTableSlot *slot)
+		 TupleTableSlot *slot,
+		 Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	Buffer		buffer;
@@ -348,7 +349,7 @@ static TM_Result
 heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
   TupleTableSlot *slot, CommandId cid, LockTupleMode mode,
   LockWaitPolicy wait_policy, uint8 flags,
-  TM_FailureData *tmfd)
+  TM_FailureData *tmfd, Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	TM_Result	result;
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 8730de25ed..25ee10806b 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1967,7 +1967,8 @@ _bt_parallel_scan_and_sort(BTSpool *btspool, BTSpool *btspool2,
 	indexInfo = BuildIndexInfo(btspool->index);
 	indexInfo->ii_Concurrent = btshared->isconcurrent;
 	scan = table_beginscan_parallel(btspool->heap,
-	ParallelTableScanFromBTShared(btshared));
+	ParallelTableScanFromBTShared(btshared),
+	NULL);
 	reltuples = table_index_build_scan(btspool->heap, btspool->index, indexInfo,
 	   true, progress, _bt_build_callback,
 	   (void *) &buildstate, scan);
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index 6438c45716..187e861b5d 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -172,7 +172,7 @@ table_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan,
 }
 
 TableScanDesc
-table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan)
+table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan, Bitmapset *proj)
 {
 	Snapshot	snapshot;
 	uint32		flags = SO_TYPE_SEQSCAN |
@@ -194,6 +194,9 @@ table_beginscan_parallel(Relation relation, ParallelTableScanDesc parallel_scan)
 		snapshot = SnapshotAny;
 	}
 
+	if (proj)
+		return relation->rd_tableam->scan_begin_with_column_projection(relation, snapshot, 0, NULL,
+			parallel_scan, flags, proj);
 	return relation->rd_tableam->scan_begin(relation, snapshot, 0, NULL,
 			parallel_scan, flags);
 }
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8af12b5c6b..791451c765 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/comma

Re: Table AM modifications to accept column projection lists

2020-12-31 Thread Zhihong Yu
Hi, Soumyadeep:
Happy New Year.

+typedef struct neededColumnContext
+{
+   Bitmapset **mask;
+   int n;

+ * n specifies the number of allowed entries in mask: we use
+ * it for bounds-checking in the walker above.

I think the code would be easier to read if the above comment is moved or
copied for field n of neededColumnContext

Cheers

On Thu, Dec 31, 2020 at 1:03 PM Soumyadeep Chakraborty <
soumyadeep2...@gmail.com> wrote:

> Hey Masahiko,
>
> I added it to the Jan CF (https://commitfest.postgresql.org/31/2922/).
>
> PFA a rebased version against latest head.
>
> Regards,
> Soumyadeep
>


Re: Table AM modifications to accept column projection lists

2021-06-11 Thread Jacob Champion
On Sat, 2021-06-05 at 09:47 -0700, Zhihong Yu wrote:
> On Fri, Jun 4, 2021 at 4:14 PM Jacob Champion  wrote:
> > Agreed. I'm going to double-check with Deep that the new calls
> > to table_tuple_fetch_row_version() should be projecting the full row,
> > then post an updated patch some time next week.

(The discussions over the fallout of the inheritance_planner fallout
are still going, but in the meantime here's an updated v4 that builds
and passes `make check`.)

> +   return 
> relation->rd_tableam->scan_begin_with_column_projection(relation, snapshot, 
> 0, NULL,
> +   parallel_scan, flags, proj);
> 
> scan_begin_with_column_projection() adds a parameter to scan_begin().
> Can scan_begin() be enhanced with this projection parameter ?
> Otherwise in the future we may have 
> scan_begin_with_column_projection_with_x_y ...

Maybe; I agree that would match the current "extension" APIs a little
better. I'll let Deep and/or Ashwin chime in on why this design was
chosen.

> +   /* Make sure the the new slot is not dependent on the original tuple */
> 
> Double 'the' in the comment. More than one place with duplicate 'the'
> in the patch.

Fixed.

> +typedef struct neededColumnContext
> +{
> +   Bitmapset **mask;
> +   int n;
> 
> Should field n be named ncol ? 'n' seems too general.

Agreed; changed to ncol.

> +* TODO: Remove this hack!! This should be done once at the start of 
> the tid scan.
> 
> Would the above be addressed in the next patch ?

I have not had time to get to this in v4, sorry.

> Toward the end of extract_scan_columns():
> 
> +   bms_free(rte->scanCols);
> +   rte->scanCols = bms_make_singleton(0);
> +   break;
> 
> Should 'goto outer;' be in place of 'break;' (since rte->scanCols has
> been assigned for whole-row) ?

Agreed and fixed. Thank you!

--Jacob
From 7ac00847f17e2a0448ae06370598db0f034fcc7c Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Tue, 2 Mar 2021 08:59:45 -0800
Subject: [PATCH v4] tableam: accept column projection list

TODO: check the additional bms_make_singleton(0) calls in
  src/backend/executor/nodeModifyTable.c

Co-authored-by: Soumyadeep Chakraborty 
Co-authored-by: Melanie Plageman 
Co-authored-by: Ashwin Agrawal 
Co-authored-by: Jacob Champion 
---
 src/backend/access/heap/heapam_handler.c |   5 +-
 src/backend/access/nbtree/nbtsort.c  |   3 +-
 src/backend/access/table/tableam.c   |   5 +-
 src/backend/commands/trigger.c   |  19 +++-
 src/backend/executor/execMain.c  |   3 +-
 src/backend/executor/execPartition.c |   2 +
 src/backend/executor/execReplication.c   |   6 +-
 src/backend/executor/execScan.c  | 108 +
 src/backend/executor/nodeIndexscan.c |  10 ++
 src/backend/executor/nodeLockRows.c  |   7 +-
 src/backend/executor/nodeModifyTable.c   |  47 +++--
 src/backend/executor/nodeSeqscan.c   |  43 -
 src/backend/executor/nodeTidscan.c   |  11 ++-
 src/backend/nodes/copyfuncs.c|   2 +
 src/backend/nodes/equalfuncs.c   |   2 +
 src/backend/nodes/outfuncs.c |   2 +
 src/backend/nodes/readfuncs.c|   2 +
 src/backend/optimizer/path/allpaths.c|  85 +++-
 src/backend/optimizer/prep/preptlist.c   |  13 ++-
 src/backend/optimizer/util/inherit.c |  37 ++-
 src/backend/parser/analyze.c |  40 ++--
 src/backend/parser/parse_relation.c  |  18 
 src/backend/partitioning/partbounds.c|  14 ++-
 src/backend/rewrite/rewriteHandler.c |   8 ++
 src/include/access/tableam.h | 118 +--
 src/include/executor/executor.h  |   6 ++
 src/include/nodes/execnodes.h|   1 +
 src/include/nodes/parsenodes.h   |  14 +++
 src/include/utils/rel.h  |  14 +++
 29 files changed, 598 insertions(+), 47 deletions(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index e2cd79ec54..e440ae4a69 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -182,7 +182,8 @@ static bool
 heapam_fetch_row_version(Relation relation,
 		 ItemPointer tid,
 		 Snapshot snapshot,
-		 TupleTableSlot *slot)
+		 TupleTableSlot *slot,
+		 Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	Buffer		buffer;
@@ -350,7 +351,7 @@ static TM_Result
 heapam_tuple_lock(Relation relation, ItemPointer tid, Snapshot snapshot,
   TupleTableSlot *slot, CommandId cid, LockTupleMode mode,
   LockWaitPolicy wait_policy, uint8 flags,
-  TM_FailureData *tmfd)
+  TM_FailureData *tmfd, Bitmapset *project_cols)
 {
 	BufferHeapTupleTableSlot *bslot = (BufferHeapTupleTableSlot *) slot;
 	TM_Result	result;
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
inde

Re: Table AM modifications to accept column projection lists

2021-06-30 Thread Jacob Champion
Hi all,

Ashwin, Deep, and I were discussing this patch today. We agree that
it's fairly difficult to review in its current state, and the lack of a
concrete implementation of the new API isn't helping. (A big chunk of
the context for the patch exists in the zedstore megathread, which
isn't exactly light reading.)

We'd like to improve that, but with current time constraints, we won't
be able to get to it for the July commitfest. So I'll mark this patch
Withdrawn for now, to reduce the review load. (239 Needs Review and
counting!) We hope to revisit in the September timeframe.

Thanks for all the reviews and input!

--Jacob


Re: Table AM modifications to accept column projection lists

2022-09-05 Thread Nikita Malakhov
Hi hackers!

Due to experiments with columnar data storage I've decided to revive this
thread -
Table AM modifications to accept column projection lists
<https://www.postgresql.org/message-id/flat/cae-ml+9rmtnzkcntzpqf8o3b-ujhwgfbsoxpqa3wvuc8ybb...@mail.gmail.com>

To remind:

This patch introduces a set of changes to the table AM APIs, making them
accept a column projection list. That helps columnar table AMs, so that
they don't need to fetch all columns from disk, but only the ones
actually needed.

The set of changes in this patch is not exhaustive -
there are many more opportunities that are discussed in the TODO section
below. Before digging deeper, we want to elicit early feedback on the
API changes and the column extraction logic.

TableAM APIs that have been modified are:

1. Sequential scan APIs
2. Index scan APIs
3. API to lock and return a row
4. API to fetch a single row

We have seen performance benefits in Zedstore for many of the optimized
operations [0]. This patch is extracted from the larger patch shared in
[0].




-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


v5_0001-tableam-accept-column-projection.patch
Description: Binary data


Re: Table AM modifications to accept column projection lists

2022-09-05 Thread Justin Pryzby
On Mon, Sep 05, 2022 at 05:38:51PM +0300, Nikita Malakhov wrote:
> Due to experiments with columnar data storage I've decided to revive this
> thread - Table AM modifications to accept column projection lists
> <https://www.postgresql.org/message-id/flat/cae-ml+9rmtnzkcntzpqf8o3b-ujhwgfbsoxpqa3wvuc8ybb...@mail.gmail.com>
> 
> To remind:
> 
> This patch introduces a set of changes to the table AM APIs, making them
> accept a column projection list. That helps columnar table AMs, so that
> they don't need to fetch all columns from disk, but only the ones
> actually needed.
> 
> The set of changes in this patch is not exhaustive -
> there are many more opportunities that are discussed in the TODO section
> below. Before digging deeper, we want to elicit early feedback on the
> API changes and the column extraction logic.
> 
> TableAM APIs that have been modified are:
> 
> 1. Sequential scan APIs
> 2. Index scan APIs
> 3. API to lock and return a row
> 4. API to fetch a single row
> 
> We have seen performance benefits in Zedstore for many of the optimized
> operations [0]. This patch is extracted from the larger patch shared in
> [0].

What parts of the original patch were left out ?  This seems to be the
same size as the original.

With some special build options like -DWRITE_READ_PARSE_PLAN_TREES, this
currently fails with:

WARNING:  outfuncs/readfuncs failed to produce equal parse tree

There's poor code coverage in PopulateNeededColumnsForScan()
IndexNext(), check_default_partition_contents() and nodeSeqscan.c.
https://cirrus-ci.com/task/5516554904272896
https://api.cirrus-ci.com/v1/artifact/task/5516554904272896/coverage/coverage/00-index.html

Is it currently possible to hit those code paths in postgres ?  If not,
you may need to invent a minimal columnar extension to allow excercising
that.

Note that the cirrusci link is on top of my branch which runs "extended"
checks in cirrusci, but you can also run code coverage report locally
with --enable-coverage.

When you mail next, please run pgindent first (BTW there's a debian
package in PGDG for pgindent).

-- 
Justin




Re: Table AM modifications to accept column projection lists

2022-09-05 Thread Nikita Malakhov
Hi hackers!

This is the original patch rebased onto v15 master with conflicts resolved.
I'm currently
studying it and latest comments in the original thread, and would try go
the way that
was mentioned in the thread (last message) -
[1]
https://stratos.seas.harvard.edu/files/stratos/files/columnstoresfntdbs.pdf
[2] https://github.com/zhihuiFan/postgres/tree/lazy_material_v2
I agree it is not in the state for review, so I've decided not to change
patch status,
just revive the thread because we found that Pluggable Storage API is not
somewhat
not sufficient.
Thanks for the recommendations, I'll check that out.

On Mon, Sep 5, 2022 at 7:36 PM Justin Pryzby  wrote:

> On Mon, Sep 05, 2022 at 05:38:51PM +0300, Nikita Malakhov wrote:
> > Due to experiments with columnar data storage I've decided to revive this
> > thread - Table AM modifications to accept column projection lists
> > <
> https://www.postgresql.org/message-id/flat/cae-ml+9rmtnzkcntzpqf8o3b-ujhwgfbsoxpqa3wvuc8ybb...@mail.gmail.com
> >
> >
> > To remind:
> >
> > This patch introduces a set of changes to the table AM APIs, making them
> > accept a column projection list. That helps columnar table AMs, so that
> > they don't need to fetch all columns from disk, but only the ones
> > actually needed.
> >
> > The set of changes in this patch is not exhaustive -
> > there are many more opportunities that are discussed in the TODO section
> > below. Before digging deeper, we want to elicit early feedback on the
> > API changes and the column extraction logic.
> >
> > TableAM APIs that have been modified are:
> >
> > 1. Sequential scan APIs
> > 2. Index scan APIs
> > 3. API to lock and return a row
> > 4. API to fetch a single row
> >
> > We have seen performance benefits in Zedstore for many of the optimized
> > operations [0]. This patch is extracted from the larger patch shared in
> > [0].
>
> What parts of the original patch were left out ?  This seems to be the
> same size as the original.
>
> With some special build options like -DWRITE_READ_PARSE_PLAN_TREES, this
> currently fails with:
>
> WARNING:  outfuncs/readfuncs failed to produce equal parse tree
>
> There's poor code coverage in PopulateNeededColumnsForScan()
> IndexNext(), check_default_partition_contents() and nodeSeqscan.c.
> https://cirrus-ci.com/task/5516554904272896
>
> https://api.cirrus-ci.com/v1/artifact/task/5516554904272896/coverage/coverage/00-index.html
>
> Is it currently possible to hit those code paths in postgres ?  If not,
> you may need to invent a minimal columnar extension to allow excercising
> that.
>
> Note that the cirrusci link is on top of my branch which runs "extended"
> checks in cirrusci, but you can also run code coverage report locally
> with --enable-coverage.
>
> When you mail next, please run pgindent first (BTW there's a debian
> package in PGDG for pgindent).
>
> --
> Justin
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: Table AM modifications to accept column projection lists

2022-09-05 Thread Zhihong Yu
On Mon, Sep 5, 2022 at 9:51 AM Nikita Malakhov  wrote:

> Hi hackers!
>
> This is the original patch rebased onto v15 master with conflicts
> resolved. I'm currently
> studying it and latest comments in the original thread, and would try go
> the way that
> was mentioned in the thread (last message) -
> [1]
> https://stratos.seas.harvard.edu/files/stratos/files/columnstoresfntdbs.pdf
> [2] https://github.com/zhihuiFan/postgres/tree/lazy_material_v2
> I agree it is not in the state for review, so I've decided not to change
> patch status,
> just revive the thread because we found that Pluggable Storage API is not
> somewhat
> not sufficient.
> Thanks for the recommendations, I'll check that out.
>
> Hi,
bq. is not somewhat not sufficient.

I am a bit confused by the double negation.
I guess you meant insufficient.

Cheers


Re: Table AM modifications to accept column projection lists

2021-08-18 Thread Andy Fan
On Thu, Jul 1, 2021 at 7:42 AM Jacob Champion  wrote:
>
> Hi all,
>
> Ashwin, Deep, and I were discussing this patch today. We agree that
> it's fairly difficult to review in its current state, and the lack of a
> concrete implementation of the new API isn't helping. (A big chunk of
> the context for the patch exists in the zedstore megathread, which
> isn't exactly light reading.)
>
> We'd like to improve that, but with current time constraints, we won't
> be able to get to it for the July commitfest. So I'll mark this patch
> Withdrawn for now, to reduce the review load. (239 Needs Review and
> counting!) We hope to revisit in the September timeframe.
>
> Thanks for all the reviews and input!
>
> --Jacob

Since this thread is mentioned again, I want to share some thoughts on the lazy
material part[1].  If we go with that direction, we may not need the AM for pass
projection to heapam.  we just need something like AM like

Datum fetch_column_value_for_column(..,  rowid,  colid);

I worked in this direction with a small step and stopped quickly because of some
other reasons.  I would like to share my work so far here [2], However
lazy material
is not always good.

[1] https://stratos.seas.harvard.edu/files/stratos/files/columnstoresfntdbs.pdf
[2] https://github.com/zhihuiFan/postgres/tree/lazy_material_v2


-- 
Best Regards
Andy Fan (https://www.aliyun.com/)