Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-09-07 Thread Tom Lane
I wrote:
> Dmitriy Sarafannikov  writes:
>> [ snapshot_non_vacuumable_v3.patch ]

> In short I think we should just set up the threshold as RecentGlobalXmin.

Pushed with that adjustment and some fooling with the comments.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-09-07 Thread Tom Lane
Dmitriy Sarafannikov  writes:
> [ snapshot_non_vacuumable_v3.patch ]

Starting to look at this.  I think that the business with choosing
RecentGlobalXmin vs. RecentGlobalDataXmin is just wrong.  What we
want to do is accept any tuple that would not be considered killable
by heap_hot_search_buffer, and that looks only at RecentGlobalXmin.

It's possible that it'd be sensible for heap_hot_search_buffer to
use RecentGlobalDataXmin when dealing with a non-catalog table,
allowing it to consider more tuples killable.  But I'm not sure;
it looks like the decision which to use costs some effort, which
we probably don't want to be spending in heap_hot_search_buffer.
In any case that would be a different patch topic.

With the patch as it stands, if we have a user-data tuple that is dead
more recently than RecentGlobalXmin but not more recently than
RecentGlobalDataXmin, the snapshot will reject it so the planner's
indexscan will keep searching.  But heap_hot_search_buffer won't
tell the index AM to kill the tuple, so the index entry stays marked
as live, which means the next get_actual_variable_range call will have
to scan past it again.  So we don't get the hoped-for benefit with
tuple deaths in that range.  I do not know whether this effect might
explain your finding that the patch didn't entirely fix your performance
problem.

In short I think we should just set up the threshold as RecentGlobalXmin.
However, this seems like a decision that might be pretty specific to
get_actual_variable_range's use-case, so I'm inclined to have the
threshold be chosen by get_actual_variable_range itself not hard-wired
into InitNonVacuumableSnapshot.

Another point is that I think it might be possible for RecentGlobalXmin
to advance after get_actual_variable_range looks at it (for example,
if we have to take a new catalog snapshot during index scan setup).
This creates the possibility that the snapshot accepts tuples that
heap_hot_search_buffer would consider dead.  That seems fairly harmless
though, so I'm inclined not to fuss about it.  We could arrange for
HeapTupleSatisfiesNonVacuumable to use RecentGlobalXmin directly,
but that seems like it makes it too special-purpose.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-09-05 Thread Daniel Gustafsson
> On 18 Aug 2017, at 08:50, Andrey Borodin  wrote:
> 
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   tested, failed
> Spec compliant:   tested, failed
> Documentation:tested, failed
> 
> Hi! I've looked into the patch.
> 
> There is not so much of the code. The code itself is pretty clear and well 
> documented. I've found just one small typo "transactionn" in 
> HeapTupleSatisfiesNonVacuumable() comments.
> 
> Chosen Snapshot type is not a solution to the author's problem at hand. 
> Though implemented SnapshotNonVacuumable is closer to rational choice  than 
> currently used SnapshotDirty for range sketching. As far as I can see this is 
> what is get_actual_variable_range() is used for, despite word "actual" in 
> it's name. 
> The only place where get_actual_variable_range() is used for actual range is 
> preprocessed-out in get_variable_range():
>   /*
>* XXX It's very tempting to try to use the actual column min and max, 
> if
>* we can get them relatively-cheaply with an index probe.  However, 
> since
>* this function is called many times during join planning, that could
>* have unpleasant effects on planning speed.  Need more investigation
>* before enabling this.
>*/
> #ifdef NOT_USED
>   if (get_actual_variable_range(root, vardata, sortop, min, max))
>   return true;
> #endif
> 
> I think it would be good if we could have something like "give me actual 
> values, but only if it is on first index page", like conditional locks. But I 
> think this patch is a step to right direction.

Thanks for your review!  Based on your review and conclusions, I’ve bumped the
status to “Ready for Committer”. If you believe another status would be more
appropriate, please go in an update.

cheers ./daniel



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-08-18 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Oops, missed those checkboxes. Sorry for the noise. Here's correct checklist: 
installcheck-world passes, documented as expected. Feature is there.

Best regards, Andrey Borodin.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-08-18 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:tested, failed

Hi! I've looked into the patch.

There is not so much of the code. The code itself is pretty clear and well 
documented. I've found just one small typo "transactionn" in 
HeapTupleSatisfiesNonVacuumable() comments.

Chosen Snapshot type is not a solution to the author's problem at hand. Though 
implemented SnapshotNonVacuumable is closer to rational choice  than currently 
used SnapshotDirty for range sketching. As far as I can see this is what is 
get_actual_variable_range() is used for, despite word "actual" in it's name. 
The only place where get_actual_variable_range() is used for actual range is 
preprocessed-out in get_variable_range():
/*
 * XXX It's very tempting to try to use the actual column min and max, 
if
 * we can get them relatively-cheaply with an index probe.  However, 
since
 * this function is called many times during join planning, that could
 * have unpleasant effects on planning speed.  Need more investigation
 * before enabling this.
 */
#ifdef NOT_USED
if (get_actual_variable_range(root, vardata, sortop, min, max))
return true;
#endif

I think it would be good if we could have something like "give me actual 
values, but only if it is on first index page", like conditional locks. But I 
think this patch is a step to right direction.

Best regards, Andrey Borodin.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-12 Thread Dmitriy Sarafannikov

> Ok, i agree. Patch is attached.

I added a patch to the CF


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-08 Thread Dmitriy Sarafannikov

> + else \
> + (snapshotdata).xmin = \
> + TransactionIdLimitedForOldSnapshots(RecentGlobalDataXmin, \
> + relation); \
> 
> I think we don't need to use TransactionIdLimitedForOldSnapshots() as
> that is required to override xmin for table vacuum/pruning purposes.
> 
>> Maybe we need
>> to use GetOldestXmin()?
> 
> It is a costly call as it needs ProcArrayLock, so I don't think it
> makes sense to use it here.

Ok, i agree. Patch is attached.



snapshot_non_vacuumable_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-08 Thread Amit Kapila
On Mon, May 8, 2017 at 6:30 PM, Dmitriy Sarafannikov
 wrote:
>
> I think we can use RecentGlobalDataXmin for non-catalog relations and
> RecentGlobalXmin for catalog relations (probably a check similar to
> what we have in heap_page_prune_opt).
>
>
> I took check from heap_page_prune_opt (Maybe this check must be present as
> separate function?)
> But it requires to initialize snapshot for specific relation.
>

+ else \
+ (snapshotdata).xmin = \
+ TransactionIdLimitedForOldSnapshots(RecentGlobalDataXmin, \
+ relation); \

I think we don't need to use TransactionIdLimitedForOldSnapshots() as
that is required to override xmin for table vacuum/pruning purposes.

> Maybe we need
> to use GetOldestXmin()?

It is a costly call as it needs ProcArrayLock, so I don't think it
makes sense to use it here.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-08 Thread Dmitriy Sarafannikov
I think we can use RecentGlobalDataXmin for non-catalog relations andRecentGlobalXmin for catalog relations (probably a check similar towhat we have in heap_page_prune_opt).I took check from heap_page_prune_opt (Maybe this check must be present as separate function?)But it requires to initialize snapshot for specific relation. Maybe we need to use GetOldestXmin()?New version of patch is attached.

snapshot_non_vacuumable_v2.patch
Description: Binary data


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-06 Thread Amit Kapila
On Fri, May 5, 2017 at 1:28 PM, Dmitriy Sarafannikov
 wrote:
> Amit, thanks for comments!
>
>> 1.
>> +#define InitNonVacuumableSnapshot(snapshotdata)  \
>> + do { \
>> + (snapshotdata).satisfies = HeapTupleSatisfiesNonVacuumable; \
>> + (snapshotdata).xmin = RecentGlobalDataXmin; \
>> + } while(0)
>> +
>>
>> Can you explain and add comments why you think RecentGlobalDataXmin is
>> the right to use it here?  As of now, it seems to be only used for
>> pruning non-catalog tables.
>
> Can you explain me, what value for xmin should be used there?
>

I think we can use RecentGlobalDataXmin for non-catalog relations and
RecentGlobalXmin for catalog relations (probably a check similar to
what we have in heap_page_prune_opt).


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-05 Thread Dmitriy Sarafannikov
Amit, thanks for comments!

> 1.
> +#define InitNonVacuumableSnapshot(snapshotdata)  \
> + do { \
> + (snapshotdata).satisfies = HeapTupleSatisfiesNonVacuumable; \
> + (snapshotdata).xmin = RecentGlobalDataXmin; \
> + } while(0)
> +
> 
> Can you explain and add comments why you think RecentGlobalDataXmin is
> the right to use it here?  As of now, it seems to be only used for
> pruning non-catalog tables.

Can you explain me, what value for xmin should be used there?

> 2.
> +bool
> +HeapTupleSatisfiesNonVacuumable(HeapTuple htup, Snapshot snapshot,
> + Buffer buffer)
> +{
> + return HeapTupleSatisfiesVacuum(htup, snapshot->xmin, buffer)
> + != HEAPTUPLE_DEAD;
> +}
> +
> 
> Add comments on top of this function and for the sake of consistency
> update the file header as well (Summary of visibility functions:)

Yes, i will add comments and send new patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-05 Thread Amit Kapila
On Thu, May 4, 2017 at 9:42 PM, Dmitriy Sarafannikov
 wrote:
>
>> Maybe we need another type of snapshot that would accept any
>> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,
>> but a tuple that was live more recently than the xmin horizon seems
>> like it's acceptable enough.  HeapTupleSatisfiesVacuum already
>> implements the right behavior, but we don't have a Snapshot-style
>> interface for it.
>
>
> I have tried to implement this new type of snapshot that accepts any
> non-vacuumable tuples.
> We have tried this patch in our load environment. And it has smoothed out
> and reduced magnitude of the cpu usage peaks.
> But this snapshot does not solve the problem completely.
>
> Patch is attached.

1.
+#define InitNonVacuumableSnapshot(snapshotdata)  \
+ do { \
+ (snapshotdata).satisfies = HeapTupleSatisfiesNonVacuumable; \
+ (snapshotdata).xmin = RecentGlobalDataXmin; \
+ } while(0)
+

Can you explain and add comments why you think RecentGlobalDataXmin is
the right to use it here?  As of now, it seems to be only used for
pruning non-catalog tables.

2.
+bool
+HeapTupleSatisfiesNonVacuumable(HeapTuple htup, Snapshot snapshot,
+ Buffer buffer)
+{
+ return HeapTupleSatisfiesVacuum(htup, snapshot->xmin, buffer)
+ != HEAPTUPLE_DEAD;
+}
+

Add comments on top of this function and for the sake of consistency
update the file header as well (Summary of visibility functions:)

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-04 Thread Dmitriy Sarafannikov

> Maybe we need another type of snapshot that would accept any
> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,
> but a tuple that was live more recently than the xmin horizon seems
> like it's acceptable enough.  HeapTupleSatisfiesVacuum already
> implements the right behavior, but we don't have a Snapshot-style
> interface for it.


I have tried to implement this new type of snapshot that accepts any
non-vacuumable tuples.
We have tried this patch in our load environment. And it has smoothed out
and reduced magnitude of the cpu usage peaks.
But this snapshot does not solve the problem completely.

Patch is attached.



snapshot_non_vacuumable.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-02 Thread Vladimir Borodin
Hi.

> 25 апр. 2017 г., в 18:13, Dmitriy Sarafannikov  
> написал(а):
> 
> I'd like to propose to search min and max value in index with SnapshotAny in 
> get_actual_variable_range function.
> Current implementation scans index with SnapshotDirty which accepts 
> uncommitted rows and rejects dead rows.

The patch is already being discussed here [1].

[1] 
https://www.postgresql.org/message-id/05C72CF7-B5F6-4DB9-8A09-5AC897653113%40yandex.ru



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-02 Thread Dmitriy Sarafannikov

> If that is the case, then how would using SnapshotAny solve this
> problem.  We get the value from index first and then check its
> visibility in heap, so if time is spent in _bt_checkkeys, why would
> using a different kind of Snapshot solve the problem?

1st scanning on the index with SnapshotAny will accept this rows and
will not mark this entries as killed. 

Theremore, killed entries are ignored on standby. And scanning with
SnapshotAny will always accept first row from index.
 /*
  * During recovery we ignore killed tuples and don't bother to kill them
  * either. We do this because the xmin on the primary node could easily be
  * later than the xmin on the standby node, so that what the primary
  * thinks is killed is supposed to be visible on standby. So for correct
  * MVCC for queries during recovery we must ignore these hints and check
  * all tuples. Do *not* set ignore_killed_tuples to true when running in a
  * transaction that was started during recovery. xactStartedInRecovery
  * should not be altered by index AMs.
  */
 scan->kill_prior_tuple = false;
 scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
 scan->ignore_killed_tuples = !scan->xactStartedInRecovery;

We execute this read-only queries on standby.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-02 Thread Vladimir Borodin

> 29 апр. 2017 г., в 17:34, Tom Lane  написал(а):
> 
> Dmitriy Sarafannikov  writes:
>>> Maybe we need another type of snapshot that would accept any
>>> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,
> 
>> If I understood correctly, this new type of snapshot would help if
>> there are long running transactions which can see this tuples.
>> But if there are not long running transactions, it will be the same.
>> Am i right?
> 
> Right.  You haven't shown us much about the use-case you're concerned
> with, so it's not clear what's actually needed.

The use case is nearly the same as the way to reproduce the problem described 
in the first letter. It’s an OLTP database with short mostly read-only queries 
(~ 6k rps). Every 10 minutes new data is inserted (~5-10% of rows in 
polygon_table) and old is deleted (~ 5-10% or rows in polygon_table). Insert 
and delete are made in different transactions. Until the vacuum after delete is 
finished the planning time is two orders of magnitude is higher than usually. 
If we use prepared statements the problem doesn’t reproduce since planning is 
not actually done.

> 
>> And what about don’t fetch actual min and max values from indexes
>> whose columns doesn’t involved in join? 
> 
> We don't fetch that info unless we need it.

We used to think that it’s actually not so (there was a problem in our test 
case), but we rechecked and it is, the planner doesn’t find min and max in 
unneeded for join indexes.

> 
> I'm not entirely certain, but there could be cases where a single
> planning cycle ends up fetching that data more than once.  (There's
> caching at the RestrictInfo level, but that might not be enough.)
> So a line of thought that might be worth looking into is adding a
> lower layer of caching to make sure it's not done more than once per
> plan.  Again, whether this saves anything would depend a lot on
> specific use-cases.
> 
>   regards, tom lane



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-05-01 Thread Amit Kapila
On Fri, Apr 28, 2017 at 10:02 PM, Dmitriy Sarafannikov
 wrote:
>
> What I'm thinking of is the regular indexscan that's done internally
> by get_actual_variable_range, not whatever ends up getting chosen as
> the plan for the user query.  I had supposed that that would kill
> dead index entries as it went, but maybe that's not happening for
> some reason.
>
>
> Really, this happens as you said. Index entries are marked as dead.
> But after this, backends spends cpu time on skip this killed entries
> in _bt_checkkeys :
>

If that is the case, then how would using SnapshotAny solve this
problem.  We get the value from index first and then check its
visibility in heap, so if time is spent in _bt_checkkeys, why would
using a different kind of Snapshot solve the problem?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-29 Thread Tom Lane
Dmitriy Sarafannikov  writes:
>> Maybe we need another type of snapshot that would accept any
>> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,

> If I understood correctly, this new type of snapshot would help if
> there are long running transactions which can see this tuples.
> But if there are not long running transactions, it will be the same.
> Am i right?

Right.  You haven't shown us much about the use-case you're concerned
with, so it's not clear what's actually needed.

> And what about don’t fetch actual min and max values from indexes
> whose columns doesn’t involved in join? 

We don't fetch that info unless we need it.

I'm not entirely certain, but there could be cases where a single
planning cycle ends up fetching that data more than once.  (There's
caching at the RestrictInfo level, but that might not be enough.)
So a line of thought that might be worth looking into is adding a
lower layer of caching to make sure it's not done more than once per
plan.  Again, whether this saves anything would depend a lot on
specific use-cases.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-29 Thread Dmitriy Sarafannikov

> Maybe we need another type of snapshot that would accept any
> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,
> but a tuple that was live more recently than the xmin horizon seems
> like it's acceptable enough.  HeapTupleSatisfiesVacuum already
> implements the right behavior, but we don't have a Snapshot-style
> interface for it.


If I understood correctly, this new type of snapshot would help if
there are long running transactions which can see this tuples.
But if there are not long running transactions, it will be the same.
Am i right?

And what about don’t fetch actual min and max values from indexes
whose columns doesn’t involved in join? 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-28 Thread Robert Haas
On Fri, Apr 28, 2017 at 3:00 PM, Tom Lane  wrote:
> You are confusing number of tuples in the index, which we estimate from
> independent measurements such as the file size, with endpoint value,
> which is used for purposes like guessing whether a mergejoin will be
> able to stop early.  For purposes like that, we do NOT want to include
> dead tuples, because the merge join is never gonna see 'em.

I spent several hours today thinking about this and, more than once,
thought I'd come up with an example demonstrating why my idea was
better than yours (despite the fact that, as you point out, the merge
join is never gonna see 'em).  However, in each instance, I eventually
realized that I was wrong, so I guess I'll have to concede this point.

> Or put another way: the observed problem is planning time, not that the
> resulting estimates are bad.  You argue that we should cut planning
> time by changing the definition of the estimate, and claim that the
> new definition is better, but I think you have nothing but wishful
> thinking behind that.  I'm willing to try to cut planning time, but
> I don't want the definition to change any further than it has to.

OK, I guess that makes sense. There can be scads of dead tuples at the
end of the index, and there's no surety that the query actually
requires touching that portion of the index at all apart from
planning, so it seems a bit unfortunate to burden planning with the
overhead of cleaning them up.  But I guess with your new proposed
definition at least that can only happen once.  After that there may
be a bunch of pages to skip at the end of the index before we actually
find a tuple, but they should at least be empty.  Right now, you can
end up skipping many tuples repeatedly for every query.  I tested a
two-table join with a million committed deleted but not dead tuples at
the end of one index; that increased planning time from ~0.25ms to
~90ms; obviously, a two-and-a-half order of magnitude increase in CPU
time spent planning is not a welcome development on a production
system.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-28 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 28, 2017 at 12:12 PM, Tom Lane  wrote:
>> Maybe we need another type of snapshot that would accept any
>> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,

> I don't, in general, share your intuition that using SnapshotAny is
> the wrong thing.

I think you are mistaken.

> We're looking up the last value in the index for
> planning purposes.  It seems to me that, as far as making index scans
> more or less expensive to scan, a dead tuple is almost as good as a
> live one.

You are confusing number of tuples in the index, which we estimate from
independent measurements such as the file size, with endpoint value,
which is used for purposes like guessing whether a mergejoin will be
able to stop early.  For purposes like that, we do NOT want to include
dead tuples, because the merge join is never gonna see 'em.

In short, arguing about the cost of an indexscan per se is quite
irrelevant, because this statistic is not used when estimating the
cost of an indexscan.

Or put another way: the observed problem is planning time, not that the
resulting estimates are bad.  You argue that we should cut planning
time by changing the definition of the estimate, and claim that the
new definition is better, but I think you have nothing but wishful
thinking behind that.  I'm willing to try to cut planning time, but
I don't want the definition to change any further than it has to.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-28 Thread Robert Haas
On Fri, Apr 28, 2017 at 12:12 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Apr 27, 2017 at 5:22 PM, Tom Lane  wrote:
>>> How so?  Shouldn't the indexscan go back and mark such tuples dead in
>>> the index, such that they'd be visited this way only once?  If that's
>>> not happening, maybe we should try to fix it.
>
>> Hmm.  Actually, I think the scenario I saw was where there was a large
>> number of tuples at the end of the index that weren't dead yet due to
>> an old snapshot held open.  That index was being scanned by lots of
>> short-running queries.  Those queries executed just fine, but they
>> took a long to plan because they had to step over all of the dead
>> tuples in the index one by one.
>
> But that was the scenario that we intended to fix by changing to
> SnapshotDirty, no?  Or I guess not quite, because
> dead-but-still-visible-to-somebody tuples are rejected by SnapshotDirty.

Yup.

> Maybe we need another type of snapshot that would accept any
> non-vacuumable tuple.  I really don't want SnapshotAny semantics here,
> but a tuple that was live more recently than the xmin horizon seems
> like it's acceptable enough.  HeapTupleSatisfiesVacuum already
> implements the right behavior, but we don't have a Snapshot-style
> interface for it.

Maybe.  What I know is that several people have found SnapshotDirty to
be problematic, and in the case with which I am acquainted, using
SnapshotAny fixed it.  I do not know whether, if everybody in the
world were using SnapshotAny, somebody would have the problem you're
talking about, or some other one.  And if they did, I don't know
whether using the new kind of snapshot you are proposing would fix it.
I do know that giving SnapshotAny to people seems to have only
improved things according to the information currently available to
me.

I don't, in general, share your intuition that using SnapshotAny is
the wrong thing.  We're looking up the last value in the index for
planning purposes.  It seems to me that, as far as making index scans
more or less expensive to scan, a dead tuple is almost as good as a
live one.  Until that tuple is not only marked dead, but removed from
the index page, it contributes to the cost of an index scan.  To put
that another way, suppose the range of index keys is 0 to 2 million,
but the heap tuples for values 1 million and up are committed deleted.
All the index tuples remain (and may or may not be removable depending
on what other snapshots exist in the system).  Now, consider the
following three cases:

(a) index scan from 0 to 10,000
(b) index scan from 1,000,000 to 1,010,000
(c) index scan from 3,000,000 to 3,010,000

I contend that the cost of index scan (b) is a lot closer to the cost
of (a) than to the cost of (c).  (b) finds a whole bunch of index
tuples; (c) gives up immediately and goes home.  So I actually think
that using the actual last value in the index - 2,000,000 - is
conceptually *correct* regardless of whether it's marked dead and
regardless of whether the corresponding heap tuple is dead.  The cost
depends mostly on which tuples are present in the index, not which
table rows the user can see.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-28 Thread Dmitriy Sarafannikov

> What I'm thinking of is the regular indexscan that's done internally
> by get_actual_variable_range, not whatever ends up getting chosen as
> the plan for the user query.  I had supposed that that would kill
> dead index entries as it went, but maybe that's not happening for
> some reason.


Really, this happens as you said. Index entries are marked as dead.
But after this, backends spends cpu time on skip this killed entries
in _bt_checkkeys :

if (scan->ignore_killed_tuples && ItemIdIsDead(iid))
{
/* return immediately if there are more tuples on the page */
if (ScanDirectionIsForward(dir))
{
if (offnum < PageGetMaxOffsetNumber(page))
return NULL;
}
else
{
BTPageOpaque opaque = (BTPageOpaque) 
PageGetSpecialPointer(page);

if (offnum > P_FIRSTDATAKEY(opaque))
return NULL;
}

This confirmed by perf records and backtrace reported by Vladimir earlier.
root@pgload01e ~ # perf report | grep -v '^#' | head
56.67%  postgres   postgres[.] _bt_checkkeys
19.27%  postgres   postgres[.] _bt_readpage
 2.09%  postgres   postgres[.] pglz_decompress
 2.03%  postgres   postgres[.] LWLockAttemptLock
 1.61%  postgres   postgres[.] PinBuffer.isra.3
 1.14%  postgres   postgres[.] hash_search_with_hash_value
 0.68%  postgres   postgres[.] LWLockRelease
 0.42%  postgres   postgres[.] AllocSetAlloc
 0.40%  postgres   postgres[.] SearchCatCache
 0.40%  postgres   postgres[.] ReadBuffer_common
root@pgload01e ~ #
It seems like killing dead tuples does not solve this problem.

Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-28 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 27, 2017 at 5:22 PM, Tom Lane  wrote:
>> How so?  Shouldn't the indexscan go back and mark such tuples dead in
>> the index, such that they'd be visited this way only once?  If that's
>> not happening, maybe we should try to fix it.

> Hmm.  Actually, I think the scenario I saw was where there was a large
> number of tuples at the end of the index that weren't dead yet due to
> an old snapshot held open.  That index was being scanned by lots of
> short-running queries.  Those queries executed just fine, but they
> took a long to plan because they had to step over all of the dead
> tuples in the index one by one.

But that was the scenario that we intended to fix by changing to
SnapshotDirty, no?  Or I guess not quite, because
dead-but-still-visible-to-somebody tuples are rejected by SnapshotDirty.

Maybe we need another type of snapshot that would accept any
non-vacuumable tuple.  I really don't want SnapshotAny semantics here,
but a tuple that was live more recently than the xmin horizon seems
like it's acceptable enough.  HeapTupleSatisfiesVacuum already
implements the right behavior, but we don't have a Snapshot-style
interface for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-28 Thread Robert Haas
On Thu, Apr 27, 2017 at 5:22 PM, Tom Lane  wrote:
>>> But if we delete many rows from beginning or end of index, it would be
>>> very expensive too because we will fetch each dead row and reject it.
>
>> Yep, and I've seen that turn into a serious problem in production.
>
> How so?  Shouldn't the indexscan go back and mark such tuples dead in
> the index, such that they'd be visited this way only once?  If that's
> not happening, maybe we should try to fix it.

Hmm.  Actually, I think the scenario I saw was where there was a large
number of tuples at the end of the index that weren't dead yet due to
an old snapshot held open.  That index was being scanned by lots of
short-running queries.  Those queries executed just fine, but they
took a long to plan because they had to step over all of the dead
tuples in the index one by one.  That increased planning time -
multiplied by the number of times it was incurred - was sufficient to
cripple the system.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-27 Thread Tom Lane
Andres Freund  writes:
> On 2017-04-27 17:22:25 -0400, Tom Lane wrote:
>> How so?  Shouldn't the indexscan go back and mark such tuples dead in
>> the index, such that they'd be visited this way only once?  If that's
>> not happening, maybe we should try to fix it.

> One way that never happens is if you end up choosing bitmap index scans
> all the time...

What I'm thinking of is the regular indexscan that's done internally
by get_actual_variable_range, not whatever ends up getting chosen as
the plan for the user query.  I had supposed that that would kill
dead index entries as it went, but maybe that's not happening for
some reason.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-27 Thread Andres Freund
On 2017-04-27 17:22:25 -0400, Tom Lane wrote:
> > Yep, and I've seen that turn into a serious problem in production.
> 
> How so?  Shouldn't the indexscan go back and mark such tuples dead in
> the index, such that they'd be visited this way only once?  If that's
> not happening, maybe we should try to fix it.

One way that never happens is if you end up choosing bitmap index scans
all the time...  I've previously had to force a certain percentage of
queries with it disabled to keep indexes in a good state :(

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-27 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 27, 2017 at 4:08 AM, Dmitriy Sarafannikov
>  wrote:
>> I'd like to propose to search min and max value in index with SnapshotAny in
>> get_actual_variable_range function.

> +1 from me, but Tom rejected that approach last time.

I remain concerned about the fact that this would allow accepting deleted
values that might be way outside the surviving range of values.
SnapshotDirty is a bit lax about tuple state, but it's not so lax as to
accept fully dead tuples.

>> But if we delete many rows from beginning or end of index, it would be
>> very expensive too because we will fetch each dead row and reject it.

> Yep, and I've seen that turn into a serious problem in production.

How so?  Shouldn't the indexscan go back and mark such tuples dead in
the index, such that they'd be visited this way only once?  If that's
not happening, maybe we should try to fix it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-04-27 Thread Robert Haas
On Thu, Apr 27, 2017 at 4:08 AM, Dmitriy Sarafannikov
 wrote:
> I'd like to propose to search min and max value in index with SnapshotAny in
> get_actual_variable_range function.

+1 from me, but Tom rejected that approach last time.

> But if we delete many rows from beginning or end of index, it would be
> very expensive too because we will fetch each dead row and reject it.

Yep, and I've seen that turn into a serious problem in production.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers