Re: Changing SQL Inlining Behaviour (or...?)

2019-01-08 Thread Adam Brightwell
>> > * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that 
>> > achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have 
>> > the inlining logic look at the cost of the wrapper and the cost of 
>> > parameters, and if the cost of the wrapper "greatly exceeded" the cost of 
>> > the parameters, then inline. So the PostGIS project would just set the 
>> > cost of our wrappers very high, and we'd get the behaviour we want, while 
>> > other users who want to use wrappers to force caching of calculations 
>> > would have zero-costed wrapper functions.  Pros: Solves the problem and 
>> > easy to implement, I'm happy to contribute. Cons: it's so clearly a hack 
>> > involving hidden (from users) magic.
>> > ...
>> > So my question to hackers is: which is less worse, #1 or #2, to implement 
>> > and submit to commitfest, in case #3 does not materialize in time for 
>> > PgSQL 12?
>>
>> I've been working with Paul to create and test a patch (attached) that
>> addresses Solution #2. This patch essentially modifies the inlining
>> logic to compare the cost of the function with the total cost of the
>> parameters. The goal, as stated above, is that these kinds of
>> functions would be assigned a relatively high cost to trigger the
>> inlining case.
>
>
> I've tested this patch for both the internal types case and for the PostGIS 
> functions case, and in both cases it provides a way to get around the 
> undesirable inlining behaviour for our case without impacting people for whom 
> the behaviour has been desirable in the past.
>
> Is this already in the commitfest?

Yes.

https://commitfest.postgresql.org/21/1965/

-Adam



Re: Changing SQL Inlining Behaviour (or...?)

2018-12-31 Thread Adam Brightwell
All,

> So, context:
>
> - We want PostGIS to parallelize more. In order to achieve that we need to 
> mark our functions with more realistic COSTs. Much much higher COSTs.
> - When we do that, we hit a different problem. Our most commonly used 
> functions, ST_Intersects() and ST_DWithin(), are actually SQL wrapper functions 
> around more complex combinations of index operators and exact computational 
> geometry functions.
> - In the presence of high cost parameters that are used multiple times in SQL 
> functions, PostgreSQL will stop inlining those functions, in an attempt to 
> save the costs of double-calculating the parameters.
> - For us, that's the wrong choice, because we lose the index operators at the 
> same time as we "save" the cost of double calculation.
> - We need our wrapper functions inlined, even when they are carrying a high 
> COST.
>
> At pgconf.eu, I canvassed this problem and some potential solutions:
> ...
> * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that 
> achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have 
> the inlining logic look at the cost of the wrapper and the cost of 
> parameters, and if the cost of the wrapper "greatly exceeded" the cost of the 
> parameters, then inline. So the PostGIS project would just set the cost of 
> our wrappers very high, and we'd get the behaviour we want, while other users 
> who want to use wrappers to force caching of calculations would have 
> zero-costed wrapper functions.  Pros: Solves the problem and easy to implement, I'm 
> happy to contribute. Cons: it's so clearly a hack involving hidden (from 
> users) magic.
> ...
> So my question to hackers is: which is less worse, #1 or #2, to implement and 
> submit to commitfest, in case #3 does not materialize in time for PgSQL 12?

I've been working with Paul to create and test a patch (attached) that
addresses Solution #2. This patch essentially modifies the inlining
logic to compare the cost of the function with the total cost of the
parameters. The goal, as stated above, is that these kinds of
functions would be assigned a relatively high cost to trigger the
inlining case.

The modification that this patch makes is the following:

* Collect the cost for each parameter (no longer short circuits when a
single parameter is costly).
* Compare the total cost of all parameters to the individual cost of
the function.
* If the function cost is greater than the total parameter cost, then
it will inline the function.
* If the function cost is less than or equal to the total parameter
cost, then it will perform the original per-parameter checks (short
circuiting as necessary).

I've included a constant called "INLINE_FUNC_COST_FACTOR" that is
currently set to '1'. This is meant to allow for adjustments to what
"greatly exceeded" means in the description above. Perhaps this isn't
necessary, but I wanted to provide the flexibility up front in case it
is.

I have also attached a simple test case, as originally provided by
Paul, to demonstrate the functionality.

-Adam


pg-example.sql
Description: application/sql
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index f4446169f5..58755151fa 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4648,6 +4648,9 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 	Query	   *querytree;
 	Node	   *newexpr;
 	int		   *usecounts;
+	double	   *paramcosts;
+	Cost		total_param_cost;
+	double		func_cost;
 	ListCell   *arg;
 	int			i;
 
@@ -4839,9 +4842,20 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 	 * substituted.
 	 */
 	usecounts = (int *) palloc0(funcform->pronargs * sizeof(int));
+	paramcosts = (double *) palloc0(funcform->pronargs * sizeof(double));
 	newexpr = substitute_actual_parameters(newexpr, funcform->pronargs,
 		   args, usecounts);
 
+	/*
+	 * Get the cost of the function without parameters.
+	 *
+	 * Calculation based on how costsize.c performs lookup for function costs
+	 * via 'cost_qual_eval_walker'.
+	 */
+	func_cost = get_func_cost(funcid) * cpu_operator_cost;
+
+	total_param_cost = 0.0;
+
 	/* Now check for parameter usage */
 	i = 0;
 	foreach(arg, args)
@@ -4867,10 +4881,10 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 			 */
 			if (contain_subplans(param))
 				goto fail;
+
 			cost_qual_eval(&eval_cost, list_make1(param), NULL);
-			if (eval_cost.startup + eval_cost.per_tuple >
-				10 * cpu_operator_cost)
-				goto fail;
+			paramcosts[i] = (eval_cost.startup + eval_cost.per_tuple);
+			total_param_cost += (eval_cost.startup + eval_cost.per_tuple);
 
 			/*
 			 * Check volatility last since this is more expensive than the
@@ -4883,6 +4897,19 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 	}
 
 	/*
+	 * If the cost of the function is greater than the sum of all the param
+	 * costs, then inline. Otherwise, check the cost of each param

Re: PATCH: Exclude unlogged tables from base backups

2018-01-29 Thread Adam Brightwell
On Mon, Jan 29, 2018 at 1:17 PM, David Steele  wrote:
> On 1/29/18 9:13 AM, David Steele wrote:
>> On 1/29/18 5:28 AM, Masahiko Sawada wrote:
>>> But I
>>> have a question; can we exclude temp tables as well? The pg_basebackup
>>> includes even temp tables. But I don't think that it's necessary for
>>> backups
>> Thank you for having another look at the patch.
>>
>> Temp tables should be excluded by this code which is already in
>> basebackup.c:
>>
>> /* Skip temporary files */
>> if (strncmp(de->d_name,
>>             PG_TEMP_FILE_PREFIX,
>>             strlen(PG_TEMP_FILE_PREFIX)) == 0)
>>     continue;
>>
>> This looks right to me.
>
>
> Whoops, my bad.  Temp relations are stored in the db directories with a
> "t" prefix.  Looks like we can take care of those easily enough but I
> think it should be a separate patch.
>
> I'll plan to submit that for CF 2018-03.

I agree, I believe this should be a separate patch.

As for the latest patch above, I have reviewed, applied and tested it.

It looks good to me. As well, it applies cleanly against master at
(97d4445a03). All tests passed when running 'check-world'.

If it is agreed that the temp file exclusion should be submitted as a
separate patch, then I will mark 'ready for committer'.

-Adam



Re: PATCH: Exclude unlogged tables from base backups

2018-01-24 Thread Adam Brightwell
> I agree with #1 and feel the updated docs are reasonable and
> sufficient to address this case for now.
>
> I have retested these patches against master at d6ab720360.
>
> All tests succeeded.
>
> Marking "Ready for Committer".

Actually, I marked it "Ready for Review" to wait for Masahiko to comment/agree.

Masahiko,

If you agree with the above, would you mind updating the status accordingly?

-Adam



Re: PATCH: Exclude unlogged tables from base backups

2018-01-24 Thread Adam Brightwell
>> If a new unlogged relation is created after the unloggedHash is
>> constructed but before the file is sent, we cannot exclude that
>> relation. This would not be a problem if taking the backup is quick,
>> because the new unlogged relation is unlikely to become very large.
>> However, if taking a backup takes a long time, we could include a
>> large main fork in the backup.
>
> This is a good point.  It's per database directory which makes it a
> little better, but maybe not by much.
>
> Three options here:
>
> 1) Leave it as is knowing that unlogged relations created during the
> backup may be copied and document it that way.
>
> 2) Construct a list for SendDir() to work against so the gap between
> creating that and creating the unlogged hash is as small as possible.
> The downside here is that the list may be very large and take up a lot
> of memory.
>
> 3) Check each file that looks like a relation in the loop to see if it
> has an init fork.  This might affect performance since an
> opendir/readdir loop would be required for every relation.
>
> Personally, I'm in favor of #1, at least for the time being.  I've
> updated the docs as indicated in case you and Adam agree.

I agree with #1 and feel the updated docs are reasonable and
sufficient to address this case for now.

I have retested these patches against master at d6ab720360.

All tests succeeded.

Marking "Ready for Committer".

-Adam



Re: PATCH: Exclude unlogged tables from base backups

2018-01-16 Thread Adam Brightwell
All,

I have reviewed and tested these patches.

The patches applied cleanly in order against master at (90947674fc).

I ran the provided regression tests and a 'check-world'.  All tests succeeded.

Marking ready for committer.

-Adam