On Thu, Jan 30, 2014 at 4:34 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Quite aside from the index bloat risk, this effect means a 3-4x reduction
> in the maximum string length that can be indexed before getting the
> dreaded "Values larger than 1/3 of a buffer page cannot be indexed" error.
> Worse, a value insertion might well succeed, with the failure happening
> only (much?) later when that entry is chosen as a page split boundary.

That's not hard to prevent. If that should happen, we don't go with
the strxfrm() datum. We have a spare IndexTuple bit we could use to
mark when the optimization was applied. So we consider the
appropriateness of a regular strcoll() or a strxfrm() in all contexts
(in a generic and extensible manner, but that's essentially what we
do). I'm not too discouraged by this restriction, because in practice
it won't come up very often.

>> I'm sure anyone that has read this far knows where I'm going with
>> this: why can't we just have strxfrm() blobs in the inner pages,
>> implying large savings for a big majority of text comparisons that
>> service index scans, without bloating the indexes too badly, and
>> without breaking anything? We only use inner pages to find leaf pages.
>> They're redundant copies of the data within the index.
>
> It's a cute idea though, and perhaps worth pursuing as long as you've
> got the pitfalls in mind.

I'll think about pursuing it. I might prefer to declare it as fair
game for anyone else that wants to do it.

--
Peter Geoghegan
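
For illustration only, here is a minimal sketch of the fallback described above: use the strxfrm() blob when it fits under the size limit, otherwise keep the raw string and remember which form was stored. This is not PostgreSQL code. SketchKey, sketch_make_key(), sketch_compare_probe() and SKETCH_MAX_KEY_BYTES are hypothetical names, the limit is only a rough stand-in for the "1/3 of a buffer page" rule, and the is_xfrm flag merely plays the role of the spare IndexTuple bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit: roughly 1/3 of an 8 kB page (illustrative only). */
#define SKETCH_MAX_KEY_BYTES (8192 / 3)

/* Hypothetical key representation, not a PostgreSQL structure. */
typedef struct
{
    char   *data;      /* malloc'd, NUL-terminated key bytes */
    size_t  len;       /* length of data, excluding the NUL */
    bool    is_xfrm;   /* stands in for the spare IndexTuple bit */
} SketchKey;

/*
 * Build a key for s.  Prefer the strxfrm() blob; if it would exceed the
 * limit, fall back to the raw string.  Returns false on out-of-memory.
 */
static bool
sketch_make_key(const char *s, SketchKey *key)
{
    size_t  need;
    char   *buf;

    assert(s != NULL && key != NULL);

    /* With n == 0, strxfrm() only reports the length it would need. */
    need = strxfrm(NULL, s, 0);

    if (need < SKETCH_MAX_KEY_BYTES)
    {
        buf = malloc(need + 1);
        if (buf == NULL)
            return false;
        strxfrm(buf, s, need + 1);
        key->data = buf;
        key->len = need;
        key->is_xfrm = true;
        return true;
    }

    /* Blob too large: store the untransformed string instead. */
    need = strlen(s);
    buf = malloc(need + 1);
    if (buf == NULL)
        return false;
    memcpy(buf, s, need + 1);
    key->data = buf;
    key->len = need;
    key->is_xfrm = false;
    return true;
}

/*
 * Compare a raw probe string against a stored key, choosing strcmp() on
 * transformed blobs or strcoll() on raw strings according to the flag.
 */
static int
sketch_compare_probe(const char *probe, const SketchKey *key)
{
    size_t  need;
    char   *buf;
    int     cmp;

    assert(probe != NULL && key != NULL);

    if (!key->is_xfrm)
        return strcoll(probe, key->data);

    need = strxfrm(NULL, probe, 0);
    buf = malloc(need + 1);
    if (buf == NULL)
        abort();
    strxfrm(buf, probe, need + 1);
    cmp = strcmp(buf, key->data);
    free(buf);
    return cmp;
}
```

In this sketch the probe is transformed on every comparison. A real scan would presumably transform the scan key once and reuse it, which is where the savings on inner-page comparisons would come from.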