On Fri, Aug 5, 2016 at 8:55 PM, Claudio Freire <klaussfre...@gmail.com>
wrote:

> On Fri, Aug 5, 2016 at 1:27 AM, Pavan Deolasee <pavan.deola...@gmail.com>
> wrote:
> >
> >
> > I don't see why it is hard.  Trying to find the index entry for the old
> > update row seems odd, and updating an index row seems odd, but skipping
> > an index addition for the new row seems simple.  Is the problem
> > concurrent activity?  I assumed we already had the ability to add to HOT
> > chains because we have to lock the update chain.
>
>
> Think of an index over a 1TB table whose keys have only 2 values: true
> and false.
>
> Sure, it's a horrible index. But I've seen things like that, and I've
> seen cases when they're useful too.
>
> So, conceptually, for each key you have circa N/2 tids on the index.
> nbtree finds the leftmost valid insert point by comparing keys; it
> doesn't care about tids, so to find the index entries that point to
> the page where the new tuple is, you'd have to scan those N/2 tids in
> the index, an extremely expensive operation.
>
>
Well, it's always going to be extremely hard to solve for all use cases.
This is one such extreme case, and there we should just give up and do a
cold update.

I think we can look at the index type (unique vs. non-unique) along with
the table statistics to estimate what fraction of the column values are
distinct, and from that decide whether it's worthwhile to look for a
duplicate (key, CTID) entry or to just do a cold update. In addition, we
can put a cap on how hard we try once we decide to check for duplicates,
and give up after we cross that threshold.
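
To make that concrete, here is a minimal, hypothetical sketch of such a
heuristic in C. This is not existing PostgreSQL code; the function and
parameter names (worth_checking_for_duplicates, is_unique, n_distinct,
reltuples, MAX_DUPLICATE_PROBES) are illustrative assumptions only.

/*
 * Hypothetical sketch: decide whether to scan the index for a duplicate
 * (key, CTID) entry before the update, or to fall back to a cold update.
 */
#include <stdbool.h>

#define MAX_DUPLICATE_PROBES 16     /* cap on how hard we try */

static bool
worth_checking_for_duplicates(bool is_unique,
                              double n_distinct,
                              double reltuples)
{
    double tids_per_key;

    /* A unique index has at most one TID per key: always cheap to check. */
    if (is_unique)
        return true;

    /* No usable statistics: play it safe and do a cold update. */
    if (n_distinct <= 0 || reltuples <= 0)
        return false;

    /* Estimated number of index entries sharing one key value. */
    tids_per_key = reltuples / n_distinct;

    /*
     * If a single key maps to more TIDs than we are willing to probe
     * (e.g. a boolean column over a 1TB table), give up and do a cold
     * update instead.
     */
    return tids_per_key <= MAX_DUPLICATE_PROBES;
}

The cap keeps the worst case bounded: even when the statistics are stale,
we stop probing after a fixed number of entries rather than scanning a
potentially huge run of equal keys.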

Thanks,
Pavan

-- 
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
