On Tue, Jul 30, 2013 at 8:41 AM, Michael McCandless < luc...@mikemccandless.com> wrote:
> I think that's ~ 110 billion, not trillion, tokens :)
>
> Are you certain you don't have any term vectors?
>
> Even if your index has no term vectors, CheckIndex goes through all
> docIDs trying to load them, but that ought to be very fast, and then
> you should see "test: doc values..." after that.
>

I think this is actually guarded by fieldinfos.hasVectors, so it does nothing if you really don't have them. But if even one field of one document has vectors, it must check every document.

One thing that could happen: if you had ONLY a tiny field that compresses very, very well with vectors, there could be a little n^2-like behavior if thousands and thousands of docs are compressed into one block. We added a safety switch to CompressingStoredFields for this (an upper bound on the number of documents that will be compressed in the block), but I'm not sure if there is a similar one for vectors.
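
To make the n^2-like concern concrete, here is a rough cost model in plain Java. It is not Lucene code: the class and method names are hypothetical, and it assumes that loading any one document means decompressing the whole block that holds it. Under that assumption, visiting every document in a block of k documents costs about k * k, and a cap on documents per block keeps the total close to linear.

```java
public class BlockCostSketch {

    /**
     * Hypothetical cost model, not a Lucene API.
     * Assumption: reading one document decompresses its entire block,
     * so a single read costs as many units as the block holds documents.
     * Returns the total cost of reading every document once.
     */
    static long totalCheckCost(long numDocs, long maxDocsPerBlock) {
        if (maxDocsPerBlock <= 0) {
            throw new IllegalArgumentException("maxDocsPerBlock must be positive");
        }
        long cost = 0;
        long remaining = numDocs;
        while (remaining > 0) {
            // Documents are packed into blocks of at most maxDocsPerBlock.
            long docsInBlock = Math.min(remaining, maxDocsPerBlock);
            // Each of the docsInBlock reads decompresses docsInBlock documents.
            cost += docsInBlock * docsInBlock;
            remaining -= docsInBlock;
        }
        return cost;
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000L;
        // Tiny, highly compressible docs: many thousands land in one block.
        System.out.println("uncapped-ish: " + totalCheckCost(numDocs, 100_000L));
        // With an upper bound on documents per block (128 is an arbitrary value).
        System.out.println("capped:       " + totalCheckCost(numDocs, 128L));
    }
}
```

The 128 used in `main` is an arbitrary illustration, not the actual limit in CompressingStoredFields.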