On Tue, Nov 22, 2016 at 4:53 AM, Claudio Freire <klaussfre...@gmail.com> wrote:
> On Mon, Nov 21, 2016 at 2:15 PM, Masahiko Sawada <sawada.m...@gmail.com> > wrote: > > On Fri, Nov 18, 2016 at 6:54 AM, Claudio Freire <klaussfre...@gmail.com> > wrote: > >> On Thu, Nov 17, 2016 at 6:34 PM, Robert Haas <robertmh...@gmail.com> > wrote: > >>> On Thu, Nov 17, 2016 at 1:42 PM, Claudio Freire < > klaussfre...@gmail.com> wrote: > >>>> Attached is patch 0002 with pgindent applied over it > >>>> > >>>> I don't think there's any other formatting issue, but feel free to > >>>> point a finger to it if I missed any > >>> > >>> Hmm, I had imagined making all of the segments the same size rather > >>> than having the size grow exponentially. The whole point of this is > >>> to save memory, and even in the worst case you don't end up with that > >>> many segments as long as you pick a reasonable base size (e.g. 64MB). > >> > >> Wastage is bound by a fraction of the total required RAM, that is, > >> it's proportional to the amount of required RAM, not the amount > >> allocated. So it should still be fine, and the exponential strategy > >> should improve lookup performance considerably. > > > > I'm concerned that it could use repalloc for large memory area when > > vacrelstats->dead_tuples.dead_tuple is bloated. It would be overhead > > and slow. > > How large? > > That array cannot be very large. It contains pointers to > exponentially-growing arrays, but the repalloc'd array itself is > small: one struct per segment, each segment starts at 128MB and grows > exponentially. > > In fact, IIRC, it can be proven that such a repalloc strategy has an > amortized cost of O(log log n) per tuple. If it repallocd the whole > tid array it would be O(log n), but since it handles only pointers to > segments of exponentially growing tuples it becomes O(log log n). > > Furthermore, n there is still limited to MAX_INT, which means the cost > per tuple is bound by O(log log 2^32) = 5. And that's an absolute > worst case that's ignoring the 128MB constant factor which is indeed > relevant. > > > What about using semi-fixed memroy space without repalloc; > > Allocate the array of ItemPointerData array, and each ItemPointerData > > array stores the dead tuple locations. The size of ItemPointerData > > array starts with small size (e.g. 16MB or 32MB). After we used an > > array up, we then allocate next segment with twice size as previous > > segment. > > That's what the patch does. > > > But to prevent over allocating memory, it would be better to > > set a limit of allocating size of ItemPointerData array to 512MB or > > 1GB. > > There already is a limit to 1GB (the maximum amount palloc can allocate) > > Moved to next CF with "needs review" status. Regards, Hari Babu Fujitsu Australia