I didn't test with TB-sized indices, but the API took around 100-300 ms to analyze a GB-sized index.

On Tue, Jun 14, 2022 at 11:15 AM Robert Muir <[email protected]> wrote:

> On Tue, Jun 14, 2022 at 10:37 AM Michael Sokolov <[email protected]>
> wrote:
> >
> > Oh, yes that's a clever idea. It seems it would take quite a while
> > (tens of minutes?) for a larger index though? Much faster than the
> > force-merge solution for sure. I guess to get faster we would have to
> > instrument each format. I mean they generally do know how much space
> > each field is occupying, but perhaps it's too much API change to
> > expose that.
>
> Why tens of minutes? That simple first doc/last doc works for the term
> vectors and docvalues too. For the postings, Terms.java has methods
> getMin() and getMax(), so it is possible to seek to the first and last
> term for the field.
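
For illustration only, here is a rough sketch of the first doc/last doc idea from the quoted message, in plain Java with no Lucene dependency. It is a toy model, not the API discussed in this thread: `TrackingInput` and `estimateFieldBytes` are hypothetical names, and it assumes a field's data is stored contiguously and ordered by doc, so the byte range touched by reading only the first and last doc approximates the field's footprint. With real postings the analogous step would presumably be seeking to the terms returned by `Terms.getMin()` and `Terms.getMax()`.

```java
public class FirstLastEstimate {

    /** Hypothetical reader that remembers the lowest and highest byte offset it touched. */
    static final class TrackingInput {
        private final byte[] file;
        private long minPos = Long.MAX_VALUE;
        private long maxPos = -1;

        TrackingInput(byte[] file) {
            this.file = file;
        }

        /** Pretends to read len bytes starting at pos and records the touched range. */
        void read(long pos, int len) {
            if (len <= 0) {
                return;
            }
            if (pos < 0 || pos + len > file.length) {
                throw new IndexOutOfBoundsException("read outside file: " + pos + "+" + len);
            }
            minPos = Math.min(minPos, pos);
            maxPos = Math.max(maxPos, pos + len - 1);
        }

        /** Size of the span between the lowest and highest touched byte, inclusive. */
        long touchedBytes() {
            return maxPos < 0 ? 0 : maxPos - minPos + 1;
        }
    }

    /**
     * Estimates a field's size by visiting only its first and last doc.
     * Assumption: the field's per-doc data is contiguous and ordered by doc.
     */
    static long estimateFieldBytes(byte[] file, long[] docOffsets, int[] docLengths) {
        int last = docOffsets.length - 1;
        if (last < 0) {
            return 0; // field has no docs
        }
        TrackingInput in = new TrackingInput(file);
        in.read(docOffsets[0], docLengths[0]);       // first doc
        in.read(docOffsets[last], docLengths[last]); // last doc
        return in.touchedBytes();
    }

    public static void main(String[] args) {
        byte[] file = new byte[1000];          // pretend segment file
        long[] offsets = {100, 200, 300, 400}; // four docs of one field
        int[] lengths = {100, 100, 100, 100};
        System.out.println(estimateFieldBytes(file, offsets, lengths));
    }
}
```

The cost is two reads per field regardless of how many docs sit in between, which is why this kind of estimate need not scale with index size.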
