Hi,

I think this needs to be evaluated for performance...

I agree with the nearby comment that the commits need a bit of justification
at least to review them.


On 2022-06-23 15:12:27 +0300, Maxim Orlov wrote:
> From 03b78cdade3b86a0e97723721fa1d0bd64d0c7df Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorot...@postgresql.org>
> Date: Tue, 21 Jun 2022 13:28:27 +0300
> Subject: [PATCH v2 1/6] Remove Tuplesortstate.copytup

Yea. I was recently complaining about the pointlessness of copytup.


> From 1d78e271b22d7c6a1557defbe15ea5039ff28510 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorot...@postgresql.org>
> Date: Tue, 21 Jun 2022 14:03:13 +0300
> Subject: [PATCH v2 2/6] Tuplesortstate.getdatum1 method

Hm. This adds a bunch of indirect function calls were there previously
weren't.


> From 494d46dcf938e5f824a498e38ce77782751208e1 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorot...@postgresql.org>
> Date: Tue, 21 Jun 2022 14:13:56 +0300
> Subject: [PATCH v2 3/6] Put abbreviation logic into puttuple_common()

There's definitely a lot of redundancy removed... But the list of branches in
puttuple_common() grew.  Perhaps we instead can add a few flags to
putttuple_common() that determine whether abbreviation should happen, so that
we only do the work necessary for the "type" of sort?



> From ee2dd46b07d62e13ed66b5a38272fb5667c943f3 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorot...@postgresql.org>
> Date: Wed, 22 Jun 2022 00:14:51 +0300
> Subject: [PATCH v2 4/6] Move freeing memory away from writetup()

Seems to do more than just moving freeing around?

> @@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, 
> bool isNull)
>  static void
>  puttuple_common(Tuplesortstate *state, SortTuple *tuple)
>  {
> +     MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
> +
>       Assert(!LEADER(state));
>
> +     if (tuple->tuple != NULL)
> +             USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
> +

Adding even more branches into common code...



> From 3a0e1fa7c7e4da46a86f7d5b9dd0392549f3b460 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorot...@postgresql.org>
> Date: Wed, 22 Jun 2022 18:11:26 +0300
> Subject: [PATCH v2 5/6] Reorganize data structures

Hard to know what this is trying to achieve.

> -struct Tuplesortstate
> +struct TuplesortOps
>  {
> -     TupSortStatus status;           /* enumerated value as shown above */
> -     int                     nKeys;                  /* number of columns in 
> sort key */
> -     int                     sortopt;                /* Bitmask of flags 
> used to setup sort */
> -     bool            bounded;                /* did caller specify a maximum 
> number of
> -                                                              * tuples to 
> return? */
> -     bool            boundUsed;              /* true if we made use of a 
> bounded heap */
> -     int                     bound;                  /* if bounded, the 
> maximum number of tuples */
> -     bool            tuples;                 /* Can SortTuple.tuple ever be 
> set? */
> -     int64           availMem;               /* remaining memory available, 
> in bytes */
> -     int64           allowedMem;             /* total memory allowed, in 
> bytes */
> -     int                     maxTapes;               /* max number of input 
> tapes to merge in each
> -                                                              * pass */
> -     int64           maxSpace;               /* maximum amount of space 
> occupied among sort
> -                                                              * of groups, 
> either in-memory or on-disk */
> -     bool            isMaxSpaceDisk; /* true when maxSpace is value for 
> on-disk
> -                                                              * space, false 
> when it's value for in-memory
> -                                                              * space */
> -     TupSortStatus maxSpaceStatus;   /* sort status when maxSpace was 
> reached */
>       MemoryContext maincontext;      /* memory context for tuple sort 
> metadata that
>                                                                * persists 
> across multiple batches */
>       MemoryContext sortcontext;      /* memory context holding most sort 
> data */
>       MemoryContext tuplecontext; /* sub-context of sortcontext for tuple 
> data */
> -     LogicalTapeSet *tapeset;        /* logtape.c object for tapes in a temp 
> file */
>
>       /*
>        * These function pointers decouple the routines that must know what 
> kind

To me it seems odd to have memory contexts and similar things in a
datastructure calls *Ops.


> From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorot...@postgresql.org>
> Date: Wed, 22 Jun 2022 21:48:05 +0300
> Subject: [PATCH v2 6/6] Split tuplesortops.c

I strongly suspect this will cause a slowdown. There was potential for
cross-function optimization that's now removed.


Greetings,

Andres Freund


Reply via email to