On Fri, May 22, 2009 at 10:40:03PM +0400, Earwin Burrfoot wrote:

> >> Custom analyzers.
> > No problem.
> How are they recorded in the index?

Analyzers must implement dump() and load(), which convert the Analyzer to/from
a JSON-izable data structure. They end up as JSON in
index_dir/schema_NNN.json. Custom subclasses must be loaded by whatever app
wants to read the index, naturally.

> >> Intentionally different analyzers for indexing and searching.
> > No problem. That only makes sense in the context of QueryParser, and the KS
> > QueryParser allows you to supply an analyzer which overrides the Schema.
> But well, it differs from analyzer used for indexation in one or two
> options, and shares a heap of others.

A constructor argument solves that problem, doesn't it? Am I missing
something?

> >> Using this analyzer without any index at all - like I do highlight on
> >> a separate machine to minimize GC pauses, or tag docs by running a
> >> heap of queries against MemoryIndex.
> > No problem. Distribute a Schema subclass among several machines.
> You mean read an index on one machine, create Analyzer, serialize it
> and send over the wire to other machines? I hope that's either a joke
> or I misunderstood you.

Please. How did your Analyzer class get on the other machines? Do the same
thing with your Schema subclass.

> Storing a list of stopwords in the index sounds fun. Storing a fat
> synonym/morphology dictionary while completely analogous, is no longer
> fun.

So, don't store that whole dictionary in the serialized Analyzer -- just store
a version number. Make the synonym data class data. If it's reasonable to key
multiple versions of the class data off of the version number constructor
argument, do that. If not, and an index was built with a version of the
Analyzer that is no longer supported, either throw an exception or
intentionally ignore the mismatch and serve screwed up search results. Your
call.
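
To make the version-number idea concrete, here is a rough sketch in Python.
It is not KinoSearch code: the SynonymAnalyzer class, its SYNONYMS table, the
UnsupportedVersionError exception and the analyze() method are all names made
up for illustration. Only dump() and load() come from the scheme described
above, and the real signatures may well differ. The synonym tables stay in the
class, keyed by version; dump() emits nothing but the class name and the
version number, and load() refuses a version the class no longer carries.

```python
import json


class UnsupportedVersionError(Exception):
    """Hypothetical error: the index names synonym data we no longer ship."""


class SynonymAnalyzer:
    # Class data: the (potentially fat) synonym tables travel with the code,
    # keyed by version number, and are never written into the index.
    SYNONYMS = {
        1: {"car": ["auto"]},
        2: {"car": ["auto", "automobile"]},
    }

    def __init__(self, version=2):
        if version not in self.SYNONYMS:
            raise UnsupportedVersionError(
                f"no synonym data for version {version}"
            )
        self.version = version

    def dump(self):
        # JSON-izable structure: only the class name and the version number.
        return {"class": type(self).__name__, "version": self.version}

    @classmethod
    def load(cls, data):
        # Rebuild from the dumped structure; the constructor raises if the
        # recorded version is no longer supported.
        return cls(version=data["version"])

    def analyze(self, text):
        # Toy analysis: lowercase, split on whitespace, append synonyms.
        table = self.SYNONYMS[self.version]
        tokens = []
        for word in text.lower().split():
            tokens.append(word)
            tokens.extend(table.get(word, []))
        return tokens


# Round trip through JSON, as the schema file would do.
serialized = json.dumps(SynonymAnalyzer(version=1).dump())
restored = SynonymAnalyzer.load(json.loads(serialized))
```

The sketch takes the "throw an exception" branch on a version mismatch. The
other option, ignoring the mismatch, would amount to falling back to the
newest table in the constructor instead of raising.
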
Marvin Humphrey