Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
I wrote:
> Dmitriy Sarafannikov writes:
>> [ snapshot_non_vacuumable_v3.patch ]
> In short I think we should just set up the threshold as RecentGlobalXmin.

Pushed with that adjustment and some fooling with the comments.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Dmitriy Sarafannikov writes:
> [ snapshot_non_vacuumable_v3.patch ]

Starting to look at this. I think that the business with choosing RecentGlobalXmin vs. RecentGlobalDataXmin is just wrong. What we want to do is accept any tuple that would not be considered killable by heap_hot_search_buffer, and that looks only at RecentGlobalXmin.

It's possible that it'd be sensible for heap_hot_search_buffer to use RecentGlobalDataXmin when dealing with a non-catalog table, allowing it to consider more tuples killable. But I'm not sure; it looks like the decision which to use costs some effort, which we probably don't want to be spending in heap_hot_search_buffer. In any case that would be a different patch topic.

With the patch as it stands, if we have a user-data tuple that is dead more recently than RecentGlobalXmin but not more recently than RecentGlobalDataXmin, the snapshot will reject it, so the planner's indexscan will keep searching. But heap_hot_search_buffer won't tell the index AM to kill the tuple, so the index entry stays marked as live, which means the next get_actual_variable_range call will have to scan past it again. So we don't get the hoped-for benefit with tuple deaths in that range. I do not know whether this effect might explain your finding that the patch didn't entirely fix your performance problem.

In short I think we should just set up the threshold as RecentGlobalXmin. However, this seems like a decision that might be pretty specific to get_actual_variable_range's use-case, so I'm inclined to have the threshold be chosen by get_actual_variable_range itself, not hard-wired into InitNonVacuumableSnapshot.

Another point is that I think it might be possible for RecentGlobalXmin to advance after get_actual_variable_range looks at it (for example, if we have to take a new catalog snapshot during index scan setup). This creates the possibility that the snapshot accepts tuples that heap_hot_search_buffer would consider dead.
That seems fairly harmless though, so I'm inclined not to fuss about it. We could arrange for HeapTupleSatisfiesNonVacuumable to use RecentGlobalXmin directly, but that seems like it makes it too special-purpose.

regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> On 18 Aug 2017, at 08:50, Andrey Borodin wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world: tested, failed
> Implements feature: tested, failed
> Spec compliant: tested, failed
> Documentation: tested, failed
>
> Hi! I've looked into the patch.
>
> There is not so much of the code. The code itself is pretty clear and well
> documented. I've found just one small typo "transactionn" in
> HeapTupleSatisfiesNonVacuumable() comments.
>
> The chosen snapshot type is not a solution to the author's problem at hand,
> though the implemented SnapshotNonVacuumable is closer to a rational choice than
> the currently used SnapshotDirty for range sketching. As far as I can see this is
> what get_actual_variable_range() is used for, despite the word "actual" in
> its name.
> The only place where get_actual_variable_range() is used for the actual range is
> preprocessed-out in get_variable_range():
>
> /*
>  * XXX It's very tempting to try to use the actual column min and max, if
>  * we can get them relatively-cheaply with an index probe.  However, since
>  * this function is called many times during join planning, that could
>  * have unpleasant effects on planning speed.  Need more investigation
>  * before enabling this.
>  */
> #ifdef NOT_USED
> 	if (get_actual_variable_range(root, vardata, sortop, min, max))
> 		return true;
> #endif
>
> I think it would be good if we could have something like "give me actual
> values, but only if it is on the first index page", like conditional locks. But I
> think this patch is a step in the right direction.

Thanks for your review! Based on your review and conclusions, I've bumped the status to "Ready for Committer". If you believe another status would be more appropriate, please go in and update it.

cheers ./daniel
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed

Oops, missed those checkboxes. Sorry for the noise. Here's the correct checklist: installcheck-world passes, documented as expected. Feature is there.

Best regards, Andrey Borodin.
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, failed
Spec compliant: tested, failed
Documentation: tested, failed

Hi! I've looked into the patch.

There is not so much of the code. The code itself is pretty clear and well documented. I've found just one small typo "transactionn" in HeapTupleSatisfiesNonVacuumable() comments.

The chosen snapshot type is not a solution to the author's problem at hand, though the implemented SnapshotNonVacuumable is closer to a rational choice than the currently used SnapshotDirty for range sketching. As far as I can see, this is what get_actual_variable_range() is used for, despite the word "actual" in its name. The only place where get_actual_variable_range() is used for the actual range is preprocessed-out in get_variable_range():

/*
 * XXX It's very tempting to try to use the actual column min and max, if
 * we can get them relatively-cheaply with an index probe.  However, since
 * this function is called many times during join planning, that could
 * have unpleasant effects on planning speed.  Need more investigation
 * before enabling this.
 */
#ifdef NOT_USED
	if (get_actual_variable_range(root, vardata, sortop, min, max))
		return true;
#endif

I think it would be good if we could have something like "give me actual values, but only if it is on the first index page", like conditional locks. But I think this patch is a step in the right direction.

Best regards, Andrey Borodin.
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> Ok, I agree. Patch is attached.

I added the patch to the CF.
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> + else \
> + (snapshotdata).xmin = \
> + TransactionIdLimitedForOldSnapshots(RecentGlobalDataXmin, \
> + relation); \
>
> I think we don't need to use TransactionIdLimitedForOldSnapshots() as
> that is required to override xmin for table vacuum/pruning purposes.
>
>> Maybe we need
>> to use GetOldestXmin()?
>
> It is a costly call as it needs ProcArrayLock, so I don't think it
> makes sense to use it here.

Ok, I agree. Patch is attached.

snapshot_non_vacuumable_v3.patch
Description: Binary data
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Mon, May 8, 2017 at 6:30 PM, Dmitriy Sarafannikov wrote:
>
>> I think we can use RecentGlobalDataXmin for non-catalog relations and
>> RecentGlobalXmin for catalog relations (probably a check similar to
>> what we have in heap_page_prune_opt).
>
> I took the check from heap_page_prune_opt (maybe this check should be a
> separate function?)
> But it requires initializing the snapshot for a specific relation.

+ else \
+ (snapshotdata).xmin = \
+ TransactionIdLimitedForOldSnapshots(RecentGlobalDataXmin, \
+ relation); \

I think we don't need to use TransactionIdLimitedForOldSnapshots() as that is required to override xmin for table vacuum/pruning purposes.

> Maybe we need
> to use GetOldestXmin()?

It is a costly call as it needs ProcArrayLock, so I don't think it makes sense to use it here.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> I think we can use RecentGlobalDataXmin for non-catalog relations and
> RecentGlobalXmin for catalog relations (probably a check similar to
> what we have in heap_page_prune_opt).

I took the check from heap_page_prune_opt (maybe this check should be a separate function?) But it requires initializing the snapshot for a specific relation. Maybe we need to use GetOldestXmin()? New version of patch is attached.

snapshot_non_vacuumable_v2.patch
Description: Binary data
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Fri, May 5, 2017 at 1:28 PM, Dmitriy Sarafannikov wrote:
> Amit, thanks for comments!
>
>> 1.
>> +#define InitNonVacuumableSnapshot(snapshotdata) \
>> + do { \
>> + (snapshotdata).satisfies = HeapTupleSatisfiesNonVacuumable; \
>> + (snapshotdata).xmin = RecentGlobalDataXmin; \
>> + } while(0)
>> +
>>
>> Can you explain and add comments why you think RecentGlobalDataXmin is
>> the right value to use here? As of now, it seems to be only used for
>> pruning non-catalog tables.
>
> Can you explain what value for xmin should be used there?

I think we can use RecentGlobalDataXmin for non-catalog relations and RecentGlobalXmin for catalog relations (probably a check similar to what we have in heap_page_prune_opt).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Amit, thanks for comments!

> 1.
> +#define InitNonVacuumableSnapshot(snapshotdata) \
> + do { \
> + (snapshotdata).satisfies = HeapTupleSatisfiesNonVacuumable; \
> + (snapshotdata).xmin = RecentGlobalDataXmin; \
> + } while(0)
> +
>
> Can you explain and add comments why you think RecentGlobalDataXmin is
> the right value to use here? As of now, it seems to be only used for
> pruning non-catalog tables.

Can you explain what value for xmin should be used there?

> 2.
> +bool
> +HeapTupleSatisfiesNonVacuumable(HeapTuple htup, Snapshot snapshot,
> + Buffer buffer)
> +{
> + return HeapTupleSatisfiesVacuum(htup, snapshot->xmin, buffer)
> + != HEAPTUPLE_DEAD;
> +}
> +
>
> Add comments on top of this function and for the sake of consistency
> update the file header as well (Summary of visibility functions:)

Yes, I will add comments and send a new patch.
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Thu, May 4, 2017 at 9:42 PM, Dmitriy Sarafannikov wrote:
>
>> Maybe we need another type of snapshot that would accept any
>> non-vacuumable tuple. I really don't want SnapshotAny semantics here,
>> but a tuple that was live more recently than the xmin horizon seems
>> like it's acceptable enough. HeapTupleSatisfiesVacuum already
>> implements the right behavior, but we don't have a Snapshot-style
>> interface for it.
>
> I have tried to implement this new type of snapshot that accepts any
> non-vacuumable tuples.
> We have tried this patch in our load environment, and it has smoothed out
> and reduced the magnitude of the CPU usage peaks.
> But this snapshot does not solve the problem completely.
>
> Patch is attached.

1.
+#define InitNonVacuumableSnapshot(snapshotdata) \
+ do { \
+ (snapshotdata).satisfies = HeapTupleSatisfiesNonVacuumable; \
+ (snapshotdata).xmin = RecentGlobalDataXmin; \
+ } while(0)
+

Can you explain and add comments why you think RecentGlobalDataXmin is the right value to use here? As of now, it seems to be only used for pruning non-catalog tables.

2.
+bool
+HeapTupleSatisfiesNonVacuumable(HeapTuple htup, Snapshot snapshot,
+ Buffer buffer)
+{
+ return HeapTupleSatisfiesVacuum(htup, snapshot->xmin, buffer)
+ != HEAPTUPLE_DEAD;
+}
+

Add comments on top of this function and for the sake of consistency update the file header as well (Summary of visibility functions:)

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> Maybe we need another type of snapshot that would accept any
> non-vacuumable tuple. I really don't want SnapshotAny semantics here,
> but a tuple that was live more recently than the xmin horizon seems
> like it's acceptable enough. HeapTupleSatisfiesVacuum already
> implements the right behavior, but we don't have a Snapshot-style
> interface for it.

I have tried to implement this new type of snapshot that accepts any non-vacuumable tuples. We have tried this patch in our load environment, and it has smoothed out and reduced the magnitude of the CPU usage peaks. But this snapshot does not solve the problem completely.

Patch is attached.

snapshot_non_vacuumable.patch
Description: Binary data
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Hi.

> On 25 Apr 2017, at 18:13, Dmitriy Sarafannikov wrote:
>
> I'd like to propose to search min and max value in index with SnapshotAny in
> get_actual_variable_range function.
> Current implementation scans index with SnapshotDirty which accepts
> uncommitted rows and rejects dead rows.

The patch is already being discussed here [1].

[1] https://www.postgresql.org/message-id/05C72CF7-B5F6-4DB9-8A09-5AC897653113%40yandex.ru
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> If that is the case, then how would using SnapshotAny solve this
> problem. We get the value from index first and then check its
> visibility in heap, so if time is spent in _bt_checkkeys, why would
> using a different kind of Snapshot solve the problem?

First, scanning the index with SnapshotAny will accept these rows and will not mark these entries as killed. Furthermore, killed entries are ignored on a standby, and scanning with SnapshotAny will always accept the first row from the index.

/*
 * During recovery we ignore killed tuples and don't bother to kill them
 * either. We do this because the xmin on the primary node could easily be
 * later than the xmin on the standby node, so that what the primary
 * thinks is killed is supposed to be visible on standby. So for correct
 * MVCC for queries during recovery we must ignore these hints and check
 * all tuples. Do *not* set ignore_killed_tuples to true when running in a
 * transaction that was started during recovery. xactStartedInRecovery
 * should not be altered by index AMs.
 */
scan->kill_prior_tuple = false;
scan->xactStartedInRecovery = TransactionStartedDuringRecovery();
scan->ignore_killed_tuples = !scan->xactStartedInRecovery;

We execute these read-only queries on a standby.
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> On 29 Apr 2017, at 17:34, Tom Lane wrote:
>
> Dmitriy Sarafannikov writes:
>>> Maybe we need another type of snapshot that would accept any
>>> non-vacuumable tuple. I really don't want SnapshotAny semantics here,
>
>> If I understood correctly, this new type of snapshot would help if
>> there are long running transactions which can see these tuples.
>> But if there are no long running transactions, it will be the same.
>> Am I right?
>
> Right. You haven't shown us much about the use-case you're concerned
> with, so it's not clear what's actually needed.

The use case is nearly the same as the way to reproduce the problem described in the first letter. It's an OLTP database with short, mostly read-only queries (~6k rps). Every 10 minutes new data is inserted (~5-10% of rows in polygon_table) and old data is deleted (~5-10% of rows in polygon_table). Insert and delete are done in different transactions. Until the vacuum after the delete is finished, the planning time is two orders of magnitude higher than usual. If we use prepared statements the problem doesn't reproduce, since planning is not actually done.

>> And what about not fetching actual min and max values from indexes
>> whose columns aren't involved in the join?
>
> We don't fetch that info unless we need it.

We used to think that it's actually not so (there was a problem in our test case), but we rechecked and it is: the planner doesn't fetch min and max from indexes that are not needed for the join.

> I'm not entirely certain, but there could be cases where a single
> planning cycle ends up fetching that data more than once. (There's
> caching at the RestrictInfo level, but that might not be enough.)
> So a line of thought that might be worth looking into is adding a
> lower layer of caching to make sure it's not done more than once per
> plan. Again, whether this saves anything would depend a lot on
> specific use-cases.
> regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Fri, Apr 28, 2017 at 10:02 PM, Dmitriy Sarafannikov wrote:
>
>> What I'm thinking of is the regular indexscan that's done internally
>> by get_actual_variable_range, not whatever ends up getting chosen as
>> the plan for the user query. I had supposed that that would kill
>> dead index entries as it went, but maybe that's not happening for
>> some reason.
>
> Really, this happens as you said. Index entries are marked as dead.
> But after this, backends spend CPU time skipping these killed entries
> in _bt_checkkeys.

If that is the case, then how would using SnapshotAny solve this problem. We get the value from the index first and then check its visibility in the heap, so if time is spent in _bt_checkkeys, why would using a different kind of Snapshot solve the problem?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Dmitriy Sarafannikov writes:
>> Maybe we need another type of snapshot that would accept any
>> non-vacuumable tuple. I really don't want SnapshotAny semantics here,

> If I understood correctly, this new type of snapshot would help if
> there are long running transactions which can see these tuples.
> But if there are no long running transactions, it will be the same.
> Am I right?

Right. You haven't shown us much about the use-case you're concerned with, so it's not clear what's actually needed.

> And what about not fetching actual min and max values from indexes
> whose columns aren't involved in the join?

We don't fetch that info unless we need it.

I'm not entirely certain, but there could be cases where a single planning cycle ends up fetching that data more than once. (There's caching at the RestrictInfo level, but that might not be enough.) So a line of thought that might be worth looking into is adding a lower layer of caching to make sure it's not done more than once per plan. Again, whether this saves anything would depend a lot on specific use-cases.

regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> Maybe we need another type of snapshot that would accept any
> non-vacuumable tuple. I really don't want SnapshotAny semantics here,
> but a tuple that was live more recently than the xmin horizon seems
> like it's acceptable enough. HeapTupleSatisfiesVacuum already
> implements the right behavior, but we don't have a Snapshot-style
> interface for it.

If I understood correctly, this new type of snapshot would help if there are long running transactions which can see these tuples. But if there are no long running transactions, it will be the same. Am I right?

And what about not fetching actual min and max values from indexes whose columns aren't involved in the join?
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Fri, Apr 28, 2017 at 3:00 PM, Tom Lane wrote:
> You are confusing number of tuples in the index, which we estimate from
> independent measurements such as the file size, with endpoint value,
> which is used for purposes like guessing whether a mergejoin will be
> able to stop early. For purposes like that, we do NOT want to include
> dead tuples, because the merge join is never gonna see 'em.

I spent several hours today thinking about this and, more than once, thought I'd come up with an example demonstrating why my idea was better than yours (despite the fact that, as you point out, the merge join is never gonna see 'em). However, in each instance, I eventually realized that I was wrong, so I guess I'll have to concede this point.

> Or put another way: the observed problem is planning time, not that the
> resulting estimates are bad. You argue that we should cut planning
> time by changing the definition of the estimate, and claim that the
> new definition is better, but I think you have nothing but wishful
> thinking behind that. I'm willing to try to cut planning time, but
> I don't want the definition to change any further than it has to.

OK, I guess that makes sense. There can be scads of dead tuples at the end of the index, and there's no surety that the query actually requires touching that portion of the index at all apart from planning, so it seems a bit unfortunate to burden planning with the overhead of cleaning them up. But I guess with your new proposed definition at least that can only happen once. After that there may be a bunch of pages to skip at the end of the index before we actually find a tuple, but they should at least be empty. Right now, you can end up skipping many tuples repeatedly for every query.
I tested a two-table join with a million committed-deleted but not-yet-dead tuples at the end of one index; that increased planning time from ~0.25ms to ~90ms. Obviously, a two-and-a-half order-of-magnitude increase in CPU time spent planning is not a welcome development on a production system.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Robert Haas writes:
> On Fri, Apr 28, 2017 at 12:12 PM, Tom Lane wrote:
>> Maybe we need another type of snapshot that would accept any
>> non-vacuumable tuple. I really don't want SnapshotAny semantics here,

> I don't, in general, share your intuition that using SnapshotAny is
> the wrong thing.

I think you are mistaken.

> We're looking up the last value in the index for
> planning purposes. It seems to me that, as far as making index scans
> more or less expensive, a dead tuple is almost as good as a
> live one.

You are confusing number of tuples in the index, which we estimate from independent measurements such as the file size, with endpoint value, which is used for purposes like guessing whether a mergejoin will be able to stop early. For purposes like that, we do NOT want to include dead tuples, because the merge join is never gonna see 'em. In short, arguing about the cost of an indexscan per se is quite irrelevant, because this statistic is not used when estimating the cost of an indexscan.

Or put another way: the observed problem is planning time, not that the resulting estimates are bad. You argue that we should cut planning time by changing the definition of the estimate, and claim that the new definition is better, but I think you have nothing but wishful thinking behind that. I'm willing to try to cut planning time, but I don't want the definition to change any further than it has to.

regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Fri, Apr 28, 2017 at 12:12 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Thu, Apr 27, 2017 at 5:22 PM, Tom Lane wrote:
>>> How so? Shouldn't the indexscan go back and mark such tuples dead in
>>> the index, such that they'd be visited this way only once? If that's
>>> not happening, maybe we should try to fix it.
>
>> Hmm. Actually, I think the scenario I saw was where there was a large
>> number of tuples at the end of the index that weren't dead yet due to
>> an old snapshot held open. That index was being scanned by lots of
>> short-running queries. Those queries executed just fine, but they
>> took a long time to plan because they had to step over all of the dead
>> tuples in the index one by one.
>
> But that was the scenario that we intended to fix by changing to
> SnapshotDirty, no? Or I guess not quite, because
> dead-but-still-visible-to-somebody tuples are rejected by SnapshotDirty.

Yup.

> Maybe we need another type of snapshot that would accept any
> non-vacuumable tuple. I really don't want SnapshotAny semantics here,
> but a tuple that was live more recently than the xmin horizon seems
> like it's acceptable enough. HeapTupleSatisfiesVacuum already
> implements the right behavior, but we don't have a Snapshot-style
> interface for it.

Maybe. What I know is that several people have found SnapshotDirty to be problematic, and in the case with which I am acquainted, using SnapshotAny fixed it. I do not know whether, if everybody in the world were using SnapshotAny, somebody would have the problem you're talking about, or some other one. And if they did, I don't know whether using the new kind of snapshot you are proposing would fix it. I do know that giving SnapshotAny to people seems to have only improved things according to the information currently available to me.

I don't, in general, share your intuition that using SnapshotAny is the wrong thing. We're looking up the last value in the index for planning purposes.
It seems to me that, as far as making index scans more or less expensive, a dead tuple is almost as good as a live one. Until that tuple is not only marked dead but removed from the index page, it contributes to the cost of an index scan.

To put that another way, suppose the range of index keys is 0 to 2 million, but the heap tuples for values 1 million and up are committed deleted. All the index tuples remain (and may or may not be removable depending on what other snapshots exist in the system). Now, consider the following three cases:

(a) index scan from 0 to 10,000
(b) index scan from 1,000,000 to 1,010,000
(c) index scan from 3,000,000 to 3,010,000

I contend that the cost of index scan (b) is a lot closer to the cost of (a) than to the cost of (c). (b) finds a whole bunch of index tuples; (c) gives up immediately and goes home. So I actually think that using the actual last value in the index - 2,000,000 - is conceptually *correct* regardless of whether it's marked dead and regardless of whether the corresponding heap tuple is dead. The cost depends mostly on which tuples are present in the index, not which table rows the user can see.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
> What I'm thinking of is the regular indexscan that's done internally
> by get_actual_variable_range, not whatever ends up getting chosen as
> the plan for the user query. I had supposed that that would kill
> dead index entries as it went, but maybe that's not happening for
> some reason.

Really, this happens as you said. Index entries are marked as dead. But after this, backends spend CPU time skipping these killed entries in _bt_checkkeys:

if (scan->ignore_killed_tuples && ItemIdIsDead(iid))
{
	/* return immediately if there are more tuples on the page */
	if (ScanDirectionIsForward(dir))
	{
		if (offnum < PageGetMaxOffsetNumber(page))
			return NULL;
	}
	else
	{
		BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);

		if (offnum > P_FIRSTDATAKEY(opaque))
			return NULL;
	}
}

This is confirmed by the perf records and the backtrace reported by Vladimir earlier.

root@pgload01e ~ # perf report | grep -v '^#' | head
56.67% postgres postgres [.] _bt_checkkeys
19.27% postgres postgres [.] _bt_readpage
 2.09% postgres postgres [.] pglz_decompress
 2.03% postgres postgres [.] LWLockAttemptLock
 1.61% postgres postgres [.] PinBuffer.isra.3
 1.14% postgres postgres [.] hash_search_with_hash_value
 0.68% postgres postgres [.] LWLockRelease
 0.42% postgres postgres [.] AllocSetAlloc
 0.40% postgres postgres [.] SearchCatCache
 0.40% postgres postgres [.] ReadBuffer_common
root@pgload01e ~ #

It seems like killing dead tuples does not solve this problem.
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Robert Haas writes:
> On Thu, Apr 27, 2017 at 5:22 PM, Tom Lane wrote:
>> How so? Shouldn't the indexscan go back and mark such tuples dead in
>> the index, such that they'd be visited this way only once? If that's
>> not happening, maybe we should try to fix it.

> Hmm. Actually, I think the scenario I saw was where there was a large
> number of tuples at the end of the index that weren't dead yet due to
> an old snapshot held open. That index was being scanned by lots of
> short-running queries. Those queries executed just fine, but they
> took a long time to plan because they had to step over all of the dead
> tuples in the index one by one.

But that was the scenario that we intended to fix by changing to SnapshotDirty, no? Or I guess not quite, because dead-but-still-visible-to-somebody tuples are rejected by SnapshotDirty.

Maybe we need another type of snapshot that would accept any non-vacuumable tuple. I really don't want SnapshotAny semantics here, but a tuple that was live more recently than the xmin horizon seems like it's acceptable enough. HeapTupleSatisfiesVacuum already implements the right behavior, but we don't have a Snapshot-style interface for it.

regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Thu, Apr 27, 2017 at 5:22 PM, Tom Lane wrote:
>>> But if we delete many rows from beginning or end of index, it would be
>>> very expensive too because we will fetch each dead row and reject it.
>
>> Yep, and I've seen that turn into a serious problem in production.
>
> How so? Shouldn't the indexscan go back and mark such tuples dead in
> the index, such that they'd be visited this way only once? If that's
> not happening, maybe we should try to fix it.

Hmm. Actually, I think the scenario I saw was where there was a large number of tuples at the end of the index that weren't dead yet due to an old snapshot held open. That index was being scanned by lots of short-running queries. Those queries executed just fine, but they took a long time to plan because they had to step over all of the dead tuples in the index one by one. That increased planning time - multiplied by the number of times it was incurred - was sufficient to cripple the system.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Andres Freund writes:

> On 2017-04-27 17:22:25 -0400, Tom Lane wrote:
>> How so? Shouldn't the indexscan go back and mark such tuples dead in
>> the index, such that they'd be visited this way only once? If that's
>> not happening, maybe we should try to fix it.

> One way that never happens is if you end up choosing bitmap index scans
> all the time...

What I'm thinking of is the regular indexscan that's done internally by get_actual_variable_range, not whatever ends up getting chosen as the plan for the user query. I had supposed that that would kill dead index entries as it went, but maybe that's not happening for some reason.

			regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On 2017-04-27 17:22:25 -0400, Tom Lane wrote:
>> Yep, and I've seen that turn into a serious problem in production.
>
> How so? Shouldn't the indexscan go back and mark such tuples dead in
> the index, such that they'd be visited this way only once? If that's
> not happening, maybe we should try to fix it.

One way that never happens is if you end up choosing bitmap index scans all the time... I've previously had to force a certain percentage of queries with it disabled to keep indexes in a good state :(

- Andres
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Robert Haas writes:

> On Thu, Apr 27, 2017 at 4:08 AM, Dmitriy Sarafannikov wrote:
>> I'd like to propose to search min and max value in index with SnapshotAny in
>> get_actual_variable_range function.

> +1 from me, but Tom rejected that approach last time.

I remain concerned about the fact that this would allow accepting deleted values that might be way outside the surviving range of values. SnapshotDirty is a bit lax about tuple state, but it's not so lax as to accept fully dead tuples.

>> But if we delete many rows from beginning or end of index, it would be
>> very expensive too because we will fetch each dead row and reject it.

> Yep, and I've seen that turn into a serious problem in production.

How so? Shouldn't the indexscan go back and mark such tuples dead in the index, such that they'd be visited this way only once? If that's not happening, maybe we should try to fix it.

			regards, tom lane
Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range
On Thu, Apr 27, 2017 at 4:08 AM, Dmitriy Sarafannikov wrote:
> I'd like to propose to search min and max value in index with SnapshotAny in
> get_actual_variable_range function.

+1 from me, but Tom rejected that approach last time.

> But if we delete many rows from beginning or end of index, it would be
> very expensive too because we will fetch each dead row and reject it.

Yep, and I've seen that turn into a serious problem in production.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company