OK, sorry, I must have misread the timings in the issue you forwarded! I was probably confusing seconds with milliseconds.
On Tue, Jun 14, 2022 at 11:43 AM Nhat Nguyen <nhat.ngu...@elastic.co.invalid> wrote:
>
> I didn't test with TB indices, but the API took around 100-300ms to analyze a
> GB index.
>
> On Tue, Jun 14, 2022 at 11:15 AM Robert Muir <rcm...@gmail.com> wrote:
>>
>> On Tue, Jun 14, 2022 at 10:37 AM Michael Sokolov <msoko...@gmail.com> wrote:
>> >
>> > Oh, yes that's a clever idea. It seems it would take quite a while
>> > (tens of minutes?) for a larger index though? Much faster than the
>> > force-merge solution for sure. I guess to get faster we would have to
>> > instrument each format. I mean they generally do know how much space
>> > each field is occupying, but perhaps it's too much API change to
>> > expose that.
>>
>> Why tens of minutes? That simple first doc/last doc works for the term
>> vectors and docvalues too. For the postings, Terms.java has methods
>> getMin() and getMax(), so it is possible to seek to the first and last
>> term for the field.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
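
For anyone skimming the thread, here is a toy sketch of why the first doc/last doc trick is cheap. It is not Lucene code and not how the API mentioned above is implemented: the file layout, `TrackingFile`, `build_file`, and `estimate_field_bytes` are all hypothetical names made up for illustration. It only assumes that each field's data sits in one contiguous region, so touching the first and last entry is enough to bound the region without scanning everything in between. In Lucene, the postings analogue would be seeking to the terms returned by `Terms.getMin()` and `Terms.getMax()`, as Robert describes.

```python
class TrackingFile:
    """Wraps a byte buffer and remembers the lowest and highest offsets read."""

    def __init__(self, data):
        self.data = data
        self.reset()

    def reset(self):
        self.lo = None
        self.hi = None

    def read(self, offset, length):
        end = offset + length
        self.lo = offset if self.lo is None else min(self.lo, offset)
        self.hi = end if self.hi is None else max(self.hi, end)
        return self.data[offset:end]

    def span(self):
        """Number of bytes between the lowest and highest offsets touched."""
        return 0 if self.lo is None else self.hi - self.lo


def build_file(fields, num_docs):
    """Toy format (an assumption): each field's per-doc values are contiguous.

    Returns (file_bytes, index) where index[field][doc] == (offset, length).
    """
    buf = bytearray()
    index = {}
    for name, width in fields.items():
        entries = []
        for doc in range(num_docs):
            entries.append((len(buf), width))
            buf += bytes([doc % 256]) * width
        index[name] = entries
    return bytes(buf), index


def estimate_field_bytes(tracked, index, field):
    """Read only the first and last entry of a field and report the span."""
    entries = index[field]
    tracked.reset()
    for offset, length in (entries[0], entries[-1]):
        tracked.read(offset, length)
    return tracked.span()


data, index = build_file({"title": 8, "body": 64}, num_docs=1000)
tracked = TrackingFile(data)
sizes = {name: estimate_field_bytes(tracked, index, name) for name in index}
print(sizes)
```

The cost is two reads per field regardless of document count, which is consistent with timings in the hundreds of milliseconds rather than minutes. The estimate is only as good as the contiguity assumption.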