Thanks, that's what I kind of expected. Still debating whether the space increase is worth it. Right now I'm at 0.7% of searches taking longer than 10 seconds, and 6% taking longer than 1 second, so when I see things like this in the morning it bugs me a bit:
2017-08-02 11:50:48 : 58979/1000 secs : ("Rules of Practice for the Courts of Equity of the United States")
2017-08-02 02:16:36 : 54749/1000 secs : ("The American Cause")
2017-08-02 19:27:58 : 54561/1000 secs : ("register of the department of justice")

which could all be annihilated with CommonGrams, at the expense, according to HT, of a 40% increase in index size.

On Thu, Aug 3, 2017 at 11:21 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> bq: will that search still return results from the earlier documents
> as well as the new ones
>
> In a word, "no". By definition the analysis chain applied at index
> time puts tokens in the index, and that's all you have to search
> against for the doc unless and until you re-index the document.
>
> You really have two choices here:
> 1> live with the differing results until you get done re-indexing
> 2> index to an offline collection and then use, say, collection
> aliasing to make the switch atomically.
>
> Best,
> Erick
>
> On Thu, Aug 3, 2017 at 8:07 AM, David Hastings
> <hastings.recurs...@gmail.com> wrote:
> > Hey all, I have yet to run an experiment to test this, but was wondering
> > if anyone knows the answer ahead of time.
> > If I have an index built with documents before implementing the
> > CommonGrams filter, then enable it, and start adding documents that have
> > the filter/tokenizer applied, will searches that fit the criteria, for
> > example:
> > "to be or not to be"
> > still return results from the earlier documents as well as the new ones?
> > The idea is that a full re-index is going to be difficult, so I would
> > rather do it over time by replacing large numbers of documents
> > incrementally. Thanks,
> > Dave
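For anyone following along, enabling CommonGrams in Solr usually means adding the filter to both analyzer chains of the field type: `CommonGramsFilterFactory` at index time and `CommonGramsQueryFilterFactory` at query time. A minimal sketch (the field-type name, tokenizer choice, and `commonwords.txt` file name are illustrative, not from this thread):

```xml
<!-- Hypothetical schema fragment. CommonGramsFilterFactory pairs each common
     word with its neighbors (e.g. "to be or" -> "to_be", "be_or" alongside the
     original tokens), which is what inflates the index but makes phrase
     queries like "to be or not to be" cheap. -->
<fieldType name="text_cg" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- words= points at the list of common words to gram; file name assumed -->
    <filter class="solr.CommonGramsFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- query side emits only the grams, so phrase queries search the
         bigrams rather than the raw common-word positions -->
    <filter class="solr.CommonGramsQueryFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

As Erick says above, documents indexed before this change won't have the grams, so phrase queries against the new analyzer won't match them until they're re-indexed; re-indexing into an offline collection and then swapping it in with the Collections API's CREATEALIAS action is the way to make the cutover atomic.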