Thank you for the information. Tomorrow, I will also run a few tests to measure the time required to collect TIDs from the index; however, since I do not work with vanilla PostgreSQL, the results may vary.
If the results indicate that this procedure is time-consuming, I may develop an additional patch specifically for B-tree indexes, as they are the default and most commonly used type.

Best regards,
Sergey

On Mon, Jun 16, 2025, 11:01 PM Mihail Nikalayeu <mihailnikala...@gmail.com> wrote:

> Hello, Sergey!
>
> > I think it's to avoid duplicate errors when adding tuples from STIP to
> > the main index,
> > but couldn't we just suppress that error during validation and skip the
> > new tuple insertion if it already exists?
>
> In some cases, it is not possible:
> – Some index types (GiST, GIN, BRIN) do not provide an easy way to
>   detect such duplicates.
> – When we are building a unique index, we cannot simply skip
>   duplicates, because doing so would also skip the rows that should
>   prevent the unique index from being created (unless we add extra logic
>   for B-tree indexes to compare TIDs as well).
>
> > The main index may get huge after building, and iterating over it in a
> > single thread and then sorting tids can be time consuming.
>
> My tests indicate that the overhead is minor compared with the time
> spent scanning the heap and building the index itself.
>
> > At least I guess one can skip it when STIP is empty.
>
> Yes, that’s a good idea; I’ll add it later.
>
> > p.s. I noticed that `stip.c` has a lot of functions that don't follow
> > the Postgres coding style of return type on separate line.
>
> Hmm... I’ll fix that as well.
>
> Best regards,
> Mikhail.