On Wed, Aug 10, 2011 at 7:32 AM, eks dev <eks...@yahoo.co.uk> wrote:
> Thanks David,
>
> I did not know I can mix Automaton with LevenshteinAutomaton.
>
> What you say is Automaton.concatenate(LevenshteinAutomaton),
> intersect, union would work.
>

You can, by doing this:

  LevenshteinAutomata builder = new LevenshteinAutomata("foobar");
  Automaton a1 = builder.toAutomaton(1); // n=1
  Automaton a2 = builder.toAutomaton(2); // n=2

Other notes: we actually use these operations (e.g. concatenate)
internally, because FuzzyQuery historically supported a "prefixLen".
So if you do foobar with edit distance=1 and a prefixLen of 3,
FuzzyTermsEnum builds a "prefix automaton" of "foo" and concatenates
it with an n=1 automaton of "bar":

  Automaton a = builder.toAutomaton(i);
  // constant prefix
  if (realPrefixLength > 0) {
    Automaton prefix = BasicAutomata.makeString(
        UnicodeUtil.newString(termText, 0, realPrefixLength));
    a = BasicOperations.concatenate(prefix, a);
  }

For the regexp syntax you discuss, you can actually already do this.
This is one reason why RegexpQuery has a constructor that takes
AutomatonProvider:

  public RegexpQuery(Term term, int flags, AutomatonProvider provider) {
    super(term, new RegExp(term.text(), flags).toAutomaton(provider));
  }

So you can provide an implementation of AutomatonProvider that
supports custom syntax of your own, as long as it's surrounded in
angle brackets < >, e.g. <LEV1:foobar>

AutomatonProvider is a simple interface that answers to named automata:

  public Automaton getAutomaton(String name) throws IOException;

If you do this, make sure you enable named automata (RegExp.AUTOMATON
or of course RegExp.ALL) in the flags!

-- 
lucidimagination.com
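
To make the <LEV1:foobar> idea a bit more concrete, here is a rough sketch of the name-parsing half of such a provider, in plain Java with no Lucene dependency. The LevName class and the "LEV<n>:<term>" convention are hypothetical, made up for illustration. The sketch also assumes the provider is handed the text between the angle brackets, without the brackets themselves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of Lucene): splits a named-automaton
 * name such as "LEV1:foobar" into an edit distance and a term.
 */
final class LevName {
    // "LEV", one or more digits for the edit distance, ':', then the term text.
    private static final Pattern SYNTAX =
        Pattern.compile("LEV(\\d+):(.+)", Pattern.DOTALL);

    final int distance;
    final String term;

    private LevName(int distance, String term) {
        this.distance = distance;
        this.term = term;
    }

    /**
     * Returns the parsed name, or null if the name does not follow the
     * assumed LEV<n>:<term> convention.
     */
    static LevName parse(String name) {
        Matcher m = SYNTAX.matcher(name);
        if (!m.matches()) {
            return null;
        }
        int n;
        try {
            n = Integer.parseInt(m.group(1));
        } catch (NumberFormatException e) {
            return null; // digit run too long to fit in an int
        }
        return new LevName(n, m.group(2));
    }
}
```

A getAutomaton(String name) implementation could call LevName.parse(name) and, for a non-null result, return new LevenshteinAutomata(p.term).toAutomaton(p.distance) as in the snippet at the top of this mail. It would still need to decide what to do with names it does not recognize and with distances the builder cannot handle.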