Re: [HACKERS] Performance penalty of visibility info in indexes?
On Thu, Feb 01, 2007 at 11:57:41PM -0600, Jim Nasby wrote: Has anyone actually measured the performance overhead of storing visibility info in indexes? I know the space overhead sounds daunting, but even if it doubled the size of the index in many cases that'd still be a huge win over having to scan the heap as well as the index (esp. for things like count(*)). There would also be overhead from having to update the old index tuple, but for the case of updates you're likely to need that page for the new index tuple anyway. I thought the main problem was locking. If you change the visibility of an existing row, you have to update the index in a way that won't kill concurrant scans, either by returning the row twice, or skipping it. I think it hinges on what exactly falls under visibility info. Maybe with the page-at-a-time index scans, the problem is easier now. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate. signature.asc Description: Digital signature
Re: [HACKERS] Performance penalty of visibility info in indexes?
On Feb 2, 2007, at 1:41 PM, Simon Riggs wrote: On Thu, 2007-02-01 at 23:57 -0600, Jim Nasby wrote: Has anyone actually measured the performance overhead of storing visibility info in indexes? I know the space overhead sounds daunting, but even if it doubled the size of the index in many cases that'd still be a huge win over having to scan the heap as well as the index (esp. for things like count(*)). There would also be overhead from having to update the old index tuple, but for the case of updates you're likely to need that page for the new index tuple anyway. I know this wouldn't work for all cases, but ISTM there are many cases where it would be a win. It would prevent any optimization that sought to avoid inserting rows into the index each time we perform an UPDATE. Improving UPDATE performance seems more important than improving count(*), IMHO. That depends on what you're doing; a large read-mostly table would likely see a lot of benefit from being able to do covering index scans. Of course this would have to be optional; there's lots of cases where you wouldn't want the added index size. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Performance penalty of visibility info in indexes?
Ühel kenal päeval, P, 2007-02-04 kell 22:23, kirjutas Jim Nasby: On Feb 2, 2007, at 1:41 PM, Simon Riggs wrote: On Thu, 2007-02-01 at 23:57 -0600, Jim Nasby wrote: Has anyone actually measured the performance overhead of storing visibility info in indexes? I know the space overhead sounds daunting, but even if it doubled the size of the index in many cases that'd still be a huge win over having to scan the heap as well as the index (esp. for things like count(*)). There would also be overhead from having to update the old index tuple, but for the case of updates you're likely to need that page for the new index tuple anyway. I know this wouldn't work for all cases, but ISTM there are many cases where it would be a win. It would prevent any optimization that sought to avoid inserting rows into the index each time we perform an UPDATE. Not always. If we do in-page update and keep the unchanged index entry pointing to the first tuple inside the page, then the indexes visibility info would still be valid for that tuple and also right for that field. Improving UPDATE performance seems more important than improving count(*), IMHO. That depends on what you're doing; a large read-mostly table would likely see a lot of benefit from being able to do covering index scans. A large read-mostly table would also benefit from separating the visibility info out to a compressed visibility heap. Of course this would have to be optional; there's lots of cases where you wouldn't want the added index size. Of course. All alternative ways of storing MVCC info should be optional and user-selectable so DBA can test and select the most suitable one for each usecase. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Performance penalty of visibility info in indexes?
Has anyone actually measured the performance overhead of storing visibility info in indexes? I know the space overhead sounds daunting, but even if it doubled the size of the index in many cases that'd still be a huge win over having to scan the heap as well as the index (esp. for things like count(*)). There would also be overhead from having to update the old index tuple, but for the case of updates you're likely to need that page for the new index tuple anyway. I know this wouldn't work for all cases, but ISTM there are many cases where it would be a win. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Performance penalty of visibility info in indexes?
On Thu, 2007-02-01 at 23:57 -0600, Jim Nasby wrote: Has anyone actually measured the performance overhead of storing visibility info in indexes? I know the space overhead sounds daunting, but even if it doubled the size of the index in many cases that'd still be a huge win over having to scan the heap as well as the index (esp. for things like count(*)). There would also be overhead from having to update the old index tuple, but for the case of updates you're likely to need that page for the new index tuple anyway. I know this wouldn't work for all cases, but ISTM there are many cases where it would be a win. It would prevent any optimization that sought to avoid inserting rows into the index each time we perform an UPDATE. Improving UPDATE performance seems more important than improving count(*), IMHO. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org