[HACKERS] Re: Abbreviated keys for Datum tuplesort (was: Re: B-Tree support function number 3 (strxfrm() optimization))

2015-05-13 Thread Robert Haas
On Wed, May 13, 2015 at 2:44 PM, Peter Geoghegan  wrote:
> On Wed, May 13, 2015 at 11:40 AM, Robert Haas  wrote:
>> Committed.  Thanks for the patch and your patience.
>
> This comment was not updated:
>
> /*
>  * The sortKeys variable is used by every case other than the datum and
>  * hash index cases; it is set by tuplesort_begin_xxx.  tupDesc is only
>  * used by the MinimalTuple and CLUSTER routines, though.
>  */
> TupleDesc tupDesc;
> SortSupport sortKeys; /* array of length nKeys */

Oops.  Fixed.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Abbreviated keys for Datum tuplesort (was: Re: B-Tree support function number 3 (strxfrm() optimization))

2015-05-13 Thread Peter Geoghegan
On Wed, May 13, 2015 at 11:40 AM, Robert Haas  wrote:
> Committed.  Thanks for the patch and your patience.

This comment was not updated:

/*
 * The sortKeys variable is used by every case other than the datum and
 * hash index cases; it is set by tuplesort_begin_xxx.  tupDesc is only
 * used by the MinimalTuple and CLUSTER routines, though.
 */
TupleDesc tupDesc;
SortSupport sortKeys; /* array of length nKeys */



-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Abbreviated keys for Datum tuplesort (was: Re: B-Tree support function number 3 (strxfrm() optimization))

2015-05-13 Thread Robert Haas
On Fri, Jan 23, 2015 at 4:13 PM, Andrew Gierth
 wrote:
> [pruning the Cc: list and starting a new thread]
>
> Here's the cleaned-up version of the patch to allow abbreviated keys
> when sorting a single Datum. This also removes comments that suggest
> that the caller of tuplesort_begin_datum should ever have to care
> about the abbreviated key optimization.
>
> I'll add this to the CF.

Committed.  Thanks for the patch and your patience.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-03 Thread Peter Geoghegan
On Fri, Apr 3, 2015 at 1:17 PM, Stephen Frost  wrote:
>> but even I'm not willing to
>> expend the amount of ink and emotional energy you have on whether a
>> variable that holds +1, 0, or -1 ought to be declared as "int" or
>> "int32".  Does it matter?  Yeah.  Is it worth this much argument?  No.
>
> My only comment will be on this very minor aspect (and I'm quite
> agnostic as to what is decided, particularly as I haven't read the patch
> at all), but, should we consider an enum (generically) for such cases?
> If that's truly the extent of possible values, and anything else is an
> error, then at least if I was writing DDL to support this, I'd use an
> enum, maybe a domain, or a CHECK constraint (though I'd likely feel
> better about the enum or domain approach).

It isn't the case that an enum can support it. Some comparators will
return -5 rather than -1 (as with C-style comparators in general,
sometimes comparisons can be implemented using subtraction and things
like that).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-03 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> I'm about as much
> of a stickler for the details as you will find on this mailing list,
> or possibly, in the observable universe,

This made me laugh. :)

> but even I'm not willing to
> expend the amount of ink and emotional energy you have on whether a
> variable that holds +1, 0, or -1 ought to be declared as "int" or
> "int32".  Does it matter?  Yeah.  Is it worth this much argument?  No.

My only comment will be on this very minor aspect (and I'm quite
agnostic as to what is decided, particularly as I haven't read the patch
at all), but, should we consider an enum (generically) for such cases?
If that's truly the extent of possible values, and anything else is an
error, then at least if I was writing DDL to support this, I'd use an
enum, maybe a domain, or a CHECK constraint (though I'd likely feel
better about the enum or domain approach).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-03 Thread Peter Geoghegan
On Fri, Apr 3, 2015 at 1:07 PM, Robert Haas  wrote:
> I'm about as much
> of a stickler for the details as you will find on this mailing list,
> or possibly, in the observable universe, but even I'm not willing to
> expend the amount of ink and emotional energy you have on whether a
> variable that holds +1, 0, or -1 ought to be declared as "int" or
> "int32".  Does it matter?  Yeah.  Is it worth this much argument?  No.


I haven't really spent very much time arguing about this point at all,
and intend to spend no more time on it. It's up to you.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-03 Thread Robert Haas
On Thu, Apr 2, 2015 at 7:02 PM, Peter Geoghegan  wrote:
> On Thu, Apr 2, 2015 at 11:21 PM, Robert Haas  wrote:
>>> The changes that Andrew
>>> took issue with are utterly insignificant.
>>
>> Great.  Then you will be utterly indifferent to which version gets committed.
>
> *shrug*
>
> You were the one that taught me to be bureaucratically minded about
> keeping code consistent at this fine a level. I think it's odd that
> you of all people are opposing me on this point, but whatever.

Sure, consistency is important.  But sometimes there is more than one
thing that you can choose to be consistent with.  IIUC, you're
complaining because somebody assigned the return value of a function
to a variable whose type matches the function's return type, rather
than assigning it to a variable of the same mismatching type used in
parallel code elsewhere.  Which form of consistency to aim for in such
cases is fundamentally a judgement call.  I'll have another look over
the patch and maybe I'll come around to your point of view, but you
don't seem very willing to concede the point that intelligent people
could disagree over what is most consistent here.  I'm about as much
of a stickler for the details as you will find on this mailing list,
or possibly, in the observable universe, but even I'm not willing to
expend the amount of ink and emotional energy you have on whether a
variable that holds +1, 0, or -1 ought to be declared as "int" or
"int32".  Does it matter?  Yeah.  Is it worth this much argument?  No.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-02 Thread Peter Geoghegan
On Thu, Apr 2, 2015 at 11:21 PM, Robert Haas  wrote:
> On Thu, Apr 2, 2015 at 5:34 PM, Peter Geoghegan  wrote:
>> I think it's explained by the pre-check for sorted input making the
>> number of comparisons exactly n -1. As I pointed out to Tomas, if you
>> put a single, solitary unsorted element at the end, the abbreviated
>> version is then 8x faster (maybe that was in relation to a slightly
>> different case, but the pattern is the same). So this case isn't an
>> argument against datum abbreviation, or even abbreviation in general,
>> but rather an argument against using strxfrm() in general (which for
>> example the GCC docs strongly recommend for sorting lists of strings).
>> It's a bad argument, IMV. This sort is already extremely fast.
>
> OK, I see.

Wait. It's also due to the fact that some cases are spuriously
aborted, because the cost model for text needs to be fixed, I believe.
It's not clear that this is actually such a case from Tomas' remarks,
because the smaller scale tested *did* win significantly, and that
appeared to be otherwise comparable to the regressed cases. So
although what I describe here about perfectly pre-sorted input is
something I've seen (and something that we discussed on another thread
IIRC), this is not an example of it. Again, this is just an example of
why we need to fix the cost model for text.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-02 Thread Peter Geoghegan
On Thu, Apr 2, 2015 at 11:21 PM, Robert Haas  wrote:
>> The changes that Andrew
>> took issue with are utterly insignificant.
>
> Great.  Then you will be utterly indifferent to which version gets committed.

*shrug*

You were the one that taught me to be bureaucratically minded about
keeping code consistent at this fine a level. I think it's odd that
you of all people are opposing me on this point, but whatever.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-02 Thread Robert Haas
On Thu, Apr 2, 2015 at 5:34 PM, Peter Geoghegan  wrote:
> I think it's explained by the pre-check for sorted input making the
> number of comparisons exactly n -1. As I pointed out to Tomas, if you
> put a single, solitary unsorted element at the end, the abbreviated
> version is then 8x faster (maybe that was in relation to a slightly
> different case, but the pattern is the same). So this case isn't an
> argument against datum abbreviation, or even abbreviation in general,
> but rather an argument against using strxfrm() in general (which for
> example the GCC docs strongly recommend for sorting lists of strings).
> It's a bad argument, IMV. This sort is already extremely fast.

OK, I see.

> The changes that Andrew
> took issue with are utterly insignificant.

Great.  Then you will be utterly indifferent to which version gets committed.

> Also, the changes that Andrew didn't mention are clearly appropriate.
> In particular, the comments on the SortKeys variable being used by
> every case except the hash case and datum case should still be updated
> to reflect that that's only true for the hash case now.

On that point, I agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-02 Thread Peter Geoghegan
On Thu, Apr 2, 2015 at 8:17 PM, Robert Haas  wrote:
> On Fri, Feb 20, 2015 at 3:57 PM, Tomas Vondra
>  wrote:
>> I've spent a fair amount of testing this today, and when using the
>> simple percentile_disc example mentioned above, I see this pattern:
>>
>>  master   patched   speedup
>>-
>> generate_series(1,100)  4.2   0.7  6
>> generate_series(1,200)  9.2   9.8  0.93
>> generate_series(1,300) 14.5  15.3  0.95
>>
>>
>> so for a small dataset the speedup is very nice, but for larger sets
>> there's ~5% slowdown. Is this expected?

I think it's explained by the pre-check for sorted input making the
number of comparisons exactly n -1. As I pointed out to Tomas, if you
put a single, solitary unsorted element at the end, the abbreviated
version is then 8x faster (maybe that was in relation to a slightly
different case, but the pattern is the same). So this case isn't an
argument against datum abbreviation, or even abbreviation in general,
but rather an argument against using strxfrm() in general (which for
example the GCC docs strongly recommend for sorting lists of strings).
It's a bad argument, IMV. This sort is already extremely fast.

BTW, Corey Huinker (who I believe you also spoke to during pgConf.US)
reported that his problematic CREATE INDEX case was 12x faster with
9.5. That used tapesort. That was also perfectly pre-sorted, but since
it was using tapesort and abbreviated there was a huge improvement.

> I had a look at this patch today with a view to committing it, but it
> seems that nobody's commented on this point, which seems like an
> important one.  Any thoughts?
>
> For what it's worth, and without wishing to provoke another flamewar,
> I am inclined to use Andrew's version of this patch rather than
> Peter's.  I have not undertaken an exhaustive comparison, nor do I
> intend to.  It is the reviewer's responsibility to justify the changes
> they think the author needs to make, and that wasn't done here.  On
> the points of difference Andrew highlighted, I think his version is
> fine.

The justification is that Andrew's version had arbitrary differences
with the existing code, in particular in the datum comparator, which
was different to all other cases in the way that Andrew pointed out.
Overall, there were only tiny differences between the two versions. I
see no reason to not match the existing style. The changes that Andrew
took issue with are utterly insignificant.

Also, the changes that Andrew didn't mention are clearly appropriate.
In particular, the comments on the SortKeys variable being used by
every case except the hash case and datum case should still be updated
to reflect that that's only true for the hash case now.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-04-02 Thread Robert Haas
On Fri, Feb 20, 2015 at 3:57 PM, Tomas Vondra
 wrote:
> I've spent a fair amount of testing this today, and when using the
> simple percentile_disc example mentioned above, I see this pattern:
>
>  master   patched   speedup
>-
> generate_series(1,100)  4.2   0.7  6
> generate_series(1,200)  9.2   9.8  0.93
> generate_series(1,300) 14.5  15.3  0.95
>
>
> so for a small dataset the speedup is very nice, but for larger sets
> there's ~5% slowdown. Is this expected?

I had a look at this patch today with a view to committing it, but it
seems that nobody's commented on this point, which seems like an
important one.  Any thoughts?

For what it's worth, and without wishing to provoke another flamewar,
I am inclined to use Andrew's version of this patch rather than
Peter's.  I have not undertaken an exhaustive comparison, nor do I
intend to.  It is the reviewer's responsibility to justify the changes
they think the author needs to make, and that wasn't done here.  On
the points of difference Andrew highlighted, I think his version is
fine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-03-14 Thread Peter Geoghegan
On Fri, Mar 13, 2015 at 7:51 PM, Andrew Gierth
 wrote:
> Since ApplySortComparator returns int, and "compare" is used to store
> the return value of comparetup_datum which is also declared int, this
> seems inappropriate even as a "stylistic tweak".

Consistency matters. That change, and the other changes to the
structure of comparetup_datum() makes it match the other comparetup_*
cases more closely. I don't think it's a big deal, and I'm surprised
you're calling it out specifically.

> Also, your changes to the block comment for SortTuple now hide the fact
> that datum1 is potentially the abbreviated value in tuple as well as
> single-Datum cases.  Here are the versions for comparison (mine is
> first):

I don't think that I've failed to convey that, even just based on the
diff you included. My remarks apply to the new case, datum sorting,
where there is a new convention that must consider whether the type is
pass-by-value or pass-by-reference. Surely if it's not clear that
abbreviation is used for the heap tuple case (I think that's what you
mean), then that's a problem with the extant code in the master branch
that has nothing to do with this.

The whole point of this patch is that virtually every reasonable case
for abbreviation is now supported. That uniformity clarifies things.
So of course sortSupport (and abbreviation) is supported by every case
other than the hash case, where it clearly cannot apply. We just
needed to add a note on what the datum sort case does with
pass-by-value/past-by-reference-ness with respect to abbreviation,
alongside the existing discussion of that, since that becomes a
special case.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-03-13 Thread Andrew Gierth
> "Peter" == Peter Geoghegan  writes:

 Peter> I attach a slightly tweaked version of Andrew's original.

You changed this:

 static int
 comparetup_datum(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
 {
-int compare;
+int32   compare;
 
 compare = ApplySortComparator(a->datum1, a->isnull1,

Since ApplySortComparator returns int, and "compare" is used to store
the return value of comparetup_datum which is also declared int, this
seems inappropriate even as a "stylistic tweak".

Also, your changes to the block comment for SortTuple now hide the fact
that datum1 is potentially the abbreviated value in tuple as well as
single-Datum cases.  Here are the versions for comparison (mine is
first):

***
*** 136,175 
  /*
   * The objects we actually sort are SortTuple structs.  These contain
   * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
   * which is a separate palloc chunk --- we assume it is just one chunk and
   * can be freed by a simple pfree().  SortTuples also contain the tuple's
   * first key column in Datum/nullflag format, and an index integer.
   *
   * Storing the first key column lets us save heap_getattr or index_getattr
   * calls during tuple comparisons.  We could extract and save all the key
   * columns not just the first, but this would increase code complexity and
   * overhead, and wouldn't actually save any comparison cycles in the common
   * case where the first key determines the comparison result.  Note that
   * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
   *
-  * There is one special case: when the sort support infrastructure provides an
-  * "abbreviated key" representation, where the key is (typically) a pass by
-  * value proxy for a pass by reference type.  In this case, the abbreviated 
key
-  * is stored in datum1 in place of the actual first key column.
-  *
   * When sorting single Datums, the data value is represented directly by
!  * datum1/isnull1 for pass by value types (or null values).  If the datatype 
is
!  * pass-by-reference and isnull1 is false, then "tuple" points to a separately
!  * palloc'd data value, otherwise "tuple" is NULL.  The value of datum1 is 
then
!  * either the same pointer as "tuple", or is an abbreviated key value as
!  * described above.  Accordingly, "tuple" is always used in preference to
!  * datum1 as the authoritative value for pass-by-reference cases.
   *
   * While building initial runs, tupindex holds the tuple's run number.  During
   * merge passes, we re-use it to hold the input tape number that each tuple in
   * the heap was read from, or to hold the index of the next tuple pre-read
   * from the same tape in the case of pre-read entries.  tupindex goes unused
   * if the sort occurs entirely in memory.
   */
  typedef struct
  {
  void   *tuple;  /* the tuple proper */
  Datum   datum1; /* value of first key column */
  boolisnull1;/* is first key column NULL? */
  int tupindex;   /* see notes above */
  } SortTuple;
--- 136,170 
  /*
   * The objects we actually sort are SortTuple structs.  These contain
   * a pointer to the tuple proper (might be a MinimalTuple or IndexTuple),
   * which is a separate palloc chunk --- we assume it is just one chunk and
   * can be freed by a simple pfree().  SortTuples also contain the tuple's
   * first key column in Datum/nullflag format, and an index integer.
   *
   * Storing the first key column lets us save heap_getattr or index_getattr
   * calls during tuple comparisons.  We could extract and save all the key
   * columns not just the first, but this would increase code complexity and
   * overhead, and wouldn't actually save any comparison cycles in the common
   * case where the first key determines the comparison result.  Note that
   * for a pass-by-reference datatype, datum1 points into the "tuple" storage.
   *
   * When sorting single Datums, the data value is represented directly by
!  * datum1/isnull1.  If the datatype is pass-by-reference and isnull1 is false,
!  * then datum1 points to a separately palloc'd data value that is also pointed
!  * to by the "tuple" pointer; otherwise "tuple" is NULL.  There is one special
!  * case:  when the sort support infrastructure provides an "abbreviated key"
!  * representation, where the key is (typically) a pass by value proxy for a
!  * pass by reference type.
   *
   * While building initial runs, tupindex holds the tuple's run number.  During
   * merge passes, we re-use it to hold the input tape number that each tuple in
   * the heap was read from, or to hold the index of the next tuple pre-read
   * from the same tape in the case of pre-read entries.  tupindex goes unused
   * if the sort occurs entirely in memory.
   */
  typedef struct
  {
  void   *tuple;  /* the tuple proper */
  Datum   datum1; /* value of

Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-03-13 Thread Peter Geoghegan
On Sun, Jan 25, 2015 at 3:15 AM, Andrew Gierth
 wrote:
>  Robert> I think this is a good idea.  Do you have a test case that
>  Robert> shows the benefit?
>
> The best test case for datum sort performance is to use percentile_disc,
> since that has almost no overhead beyond performing the actual sort.

I attach a slightly tweaked version of Andrew's original.

This revision makes the reverted comments within orderedsetaggs.c
consistent with back branches (no need to call abbreviation out as an
interesting special case anymore, just as in the back branches, where
abbreviation doesn't even exist). Better to keep those consistent for
backpatching purposes. Also, I've changed back tuplesort.c header
comments to how they were back in November and until recently, to
reflect that now it really is the case that only the hash index case
doesn't have the "sortKeys" field reliably set. We now need to set
"sortKeys" for the datum case, so don't say that we don't...we need to
worry about the applicability of the onlyKey optimization for the
datum sort case now, too.

Other than that, there are a number of minor stylistic tweaks. The
datum case does not support pass-by-value abbreviation, which could be
useful in theory (e.g., abbreviation of float8 values, which was
briefly discussed before). This isn't worth calling out as a special
case in the tuplesort header comments IMV; there is now a brief note
on this added to tuplesort_begin_datum(). We still support
abbreviation for pass-by-value types for non-datumsort cases (there is
of course no known example of opclass abbreviation support for a
pass-by-value type, so this is only a theoretical deficiency).

I've marked this "ready for committer".

Thanks
-- 
Peter Geoghegan
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 9ff0eff..ee13fc0 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -363,10 +363,6 @@ initialize_aggregates(AggState *aggstate,
 			 * We use a plain Datum sorter when there's a single input column;
 			 * otherwise sort the full tuple.  (See comments for
 			 * process_ordered_aggregate_single.)
-			 *
-			 * In the future, we should consider forcing the
-			 * tuplesort_begin_heap() case when the abbreviated key
-			 * optimization can thereby be used, even when numInputs is 1.
 			 */
 			peraggstate->sortstate =
 (peraggstate->numInputs == 1) ?
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index f9a5f7f..869a83b 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -266,13 +266,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 	osastate->qstate = qstate;
 	osastate->gcontext = gcontext;
 
-	/*
-	 * Initialize tuplesort object.
-	 *
-	 * In the future, we should consider forcing the tuplesort_begin_heap()
-	 * case when the abbreviated key optimization can thereby be used, even
-	 * when !use_tuples.
-	 */
+	/* Initialize tuplesort object */
 	if (use_tuples)
 		osastate->sortstate = tuplesort_begin_heap(qstate->tupdesc,
    qstate->numSortCols,
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8ad5635..92f0355 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -336,9 +336,9 @@ struct Tuplesortstate
 	bool		markpos_eof;	/* saved "eof_reached" */
 
 	/*
-	 * The sortKeys variable is used by every case other than the datum and
-	 * hash index cases; it is set by tuplesort_begin_xxx.  tupDesc is only
-	 * used by the MinimalTuple and CLUSTER routines, though.
+	 * The sortKeys variable is used by every case other than the hash index
+	 * case; it is set by tuplesort_begin_xxx.  tupDesc is only used by the
+	 * MinimalTuple and CLUSTER routines, though.
 	 */
 	TupleDesc	tupDesc;
 	SortSupport sortKeys;		/* array of length nKeys */
@@ -901,29 +901,41 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation,
 	state->copytup = copytup_datum;
 	state->writetup = writetup_datum;
 	state->readtup = readtup_datum;
+	state->abbrevNext = 10;
 
 	state->datumType = datumType;
 
+	/* lookup necessary attributes of the datum type */
+	get_typlenbyval(datumType, &typlen, &typbyval);
+	state->datumTypeLen = typlen;
+	state->datumTypeByVal = typbyval;
+
 	/* Prepare SortSupport data */
-	state->onlyKey = (SortSupport) palloc0(sizeof(SortSupportData));
+	state->sortKeys = (SortSupport) palloc0(sizeof(SortSupportData));
+
+	state->sortKeys->ssup_cxt = CurrentMemoryContext;
+	state->sortKeys->ssup_collation = sortCollation;
+	state->sortKeys->ssup_nulls_first = nullsFirstFlag;
 
-	state->onlyKey->ssup_cxt = CurrentMemoryContext;
-	state->onlyKey->ssup_collation = sortCollation;
-	state->onlyKey->ssup_nulls_first = nullsFirstFlag;
 	/*
-	 * Conversion to abbreviated representation infeasible in the Datum case.
-	 * It must be possible to subsequently fetch original datum values wit

Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-02-20 Thread Tomas Vondra
On 25.1.2015 12:15, Andrew Gierth wrote:
>
> So given some suitable test data, such as
> 
> create table stuff as select random()::text as randtext
>   from generate_series(1,100);  -- or however many rows
> 
> you can do
> 
> select percentile_disc(0) within group (order by randtext) from stuff;
> 
> or
> 
> select count(distinct randtext) from stuff;
> 
> The performance improvements I saw were pretty much exactly as
> expected from the improvement in the ORDER BY and CREATE INDEX cases.

I've spent a fair amount of testing this today, and when using the
simple percentile_disc example mentioned above, I see this pattern:

 master   patched   speedup
   -
generate_series(1,100)  4.2   0.7  6
generate_series(1,200)  9.2   9.8  0.93
generate_series(1,300) 14.5  15.3  0.95


so for a small dataset the speedup is very nice, but for larger sets
there's ~5% slowdown. Is this expected?


-- 
Tomas Vondrahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

2015-01-25 Thread Andrew Gierth
> "Robert" == Robert Haas  writes:

 >> Here's the cleaned-up version of the patch to allow abbreviated keys
 >> when sorting a single Datum. This also removes comments that suggest
 >> that the caller of tuplesort_begin_datum should ever have to care
 >> about the abbreviated key optimization.
 >> 
 >> I'll add this to the CF.

 Robert> I think this is a good idea.  Do you have a test case that
 Robert> shows the benefit?

The best test case for datum sort performance is to use percentile_disc,
since that has almost no overhead beyond performing the actual sort.
(Unlike, say, count(distinct) or mode(), both of which have to do an
additional comparison pass over the sorted data; but count(distinct) is
probably the most common use of the datum sort in the wild, so it's
useful to try that too.)

So given some suitable test data, such as

create table stuff as select random()::text as randtext
  from generate_series(1,100);  -- or however many rows

you can do

select percentile_disc(0) within group (order by randtext) from stuff;

or

select count(distinct randtext) from stuff;

The performance improvements I saw were pretty much exactly as expected
from the improvement in the ORDER BY and CREATE INDEX cases.

The best test case for checking the correct order of results is to use
array_agg(x order by x), for example as follows:

select u, u <= lag(u) over ()
  from (select unnest(a) as u
  from (select array_agg(randtext order by randtext)
  from stuff) s1) s2;

(note that array_agg(x order by x) uses the datum sort, but
array_agg(x order by y) uses the ordinary heap tuple sort)

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Abbreviated keys for Datum tuplesort (was: Re: B-Tree support function number 3 (strxfrm() optimization))

2015-01-24 Thread Peter Geoghegan
On Sat, Jan 24, 2015 at 6:19 PM, Robert Haas  wrote:
> I think this is a good idea.  Do you have a test case that shows the benefit?

I agree. It seems likely that this will show a similar benefit to
other cases already tested, but I'd like to see a test case too.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Abbreviated keys for Datum tuplesort (was: Re: B-Tree support function number 3 (strxfrm() optimization))

2015-01-24 Thread Robert Haas
On Fri, Jan 23, 2015 at 4:13 PM, Andrew Gierth
 wrote:
> [pruning the Cc: list and starting a new thread]
>
> Here's the cleaned-up version of the patch to allow abbreviated keys
> when sorting a single Datum. This also removes comments that suggest
> that the caller of tuplesort_begin_datum should ever have to care
> about the abbreviated key optimization.
>
> I'll add this to the CF.

I think this is a good idea.  Do you have a test case that shows the benefit?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers