By the way:
I compiled core and the corresponding tests with an old JDK 1.4 that I found
locally on my machine. Works fine!

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Uwe Schindler (JIRA) [mailto:j...@apache.org]
> Sent: Monday, June 15, 2009 5:48 PM
> To: java-dev@lucene.apache.org
> Subject: [jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable
> regex)
> 
> 
>     [ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719606#action_12719606 ]
> 
> Uwe Schindler commented on LUCENE-1606:
> ---------------------------------------
> 
> Doesn't seem to work, I will check the sources:
> 
> {code}
> compile-core:
>     [javac] Compiling 12 source files to C:\Projects\lucene\trunk\build\contrib\regex\classes\java
>     [javac] C:\Projects\lucene\trunk\contrib\regex\src\java\org\apache\lucene\search\regex\AutomatonFuzzyQuery.java:11: cannot access dk.brics.automaton.Automaton
>     [javac] bad class file: C:\Projects\lucene\trunk\contrib\regex\lib\automaton.jar(dk/brics/automaton/Automaton.class)
>     [javac] class file has wrong version 49.0, should be 48.0
>     [javac] Please remove or make sure it appears in the correct subdirectory of the classpath.
>     [javac] import dk.brics.automaton.Automaton;
>     [javac]                           ^
>     [javac] 1 error
> {code}
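
The "wrong version 49.0, should be 48.0" message refers to the major
version stored in the class-file header: 48 is what JDK 1.4 produces and
reads, 49 corresponds to Java 5. A minimal sketch for checking this on a
class extracted from a jar follows. The class name ClassVersionCheck and the
built-in sample header are illustrative assumptions, not part of the Lucene
build.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene or the BRICS package.
class ClassVersionCheck {

    /** Returns the major version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // major version as an unsigned 16-bit value
    }

    public static void main(String[] args) throws IOException {
        // With no argument, fall back to a sample header carrying major version 49.
        byte[] bytes = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 49};
        int major = majorVersion(bytes);
        System.out.println("major version " + major
                + (major > 48 ? " (too new for a 1.4 javac)" : " (readable by a 1.4 javac)"));
    }
}
```

To check the jar from the log above, Automaton.class would first have to be
extracted from automaton.jar and its path passed as the only argument.
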
> 
> > Automaton Query/Filter (scalable regex)
> > ---------------------------------------
> >
> >                 Key: LUCENE-1606
> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1606
> >             Project: Lucene - Java
> >          Issue Type: New Feature
> >          Components: contrib/*
> >            Reporter: Robert Muir
> >            Assignee: Uwe Schindler
> >            Priority: Minor
> >             Fix For: 2.9
> >
> >         Attachments: automaton.patch, automatonMultiQuery.patch, automatonmultiqueryfuzzy.patch, automatonMultiQuerySmart.patch, automatonWithWildCard.patch, automatonWithWildCard2.patch, LUCENE-1606.patch
> >
> >
> > Attached is a patch for an AutomatonQuery/Filter (the name can change if it's not suitable).
> > Whereas the out-of-box contrib RegexQuery is nice, I have some very large indexes (100M+ unique tokens) where queries are quite slow (2 minutes, etc.). Additionally, all of the existing RegexQuery implementations in Lucene are really slow if there is no constant prefix. This implementation does not depend upon a constant prefix, and runs the same query in 640ms.
> > Some use cases I envision:
> >  1. lexicography/etc on large text corpora
> >  2. looking for things such as urls where the prefix is not constant (http:// or ftp://)
> > The Filter uses the BRICS package (http://www.brics.dk/automaton/) to convert regular expressions into a DFA. Then, the filter "enumerates" terms in a special way, by using the underlying state machine. Here is my short description from the comments:
> >      The algorithm here is pretty basic. Enumerate terms but instead of a binary accept/reject do:
> >
> >      1. Look at the portion that is OK (did not enter a reject state in the DFA)
> >      2. Generate the next possible String and seek to that.
> > The Query simply wraps the filter with ConstantScoreQuery.
> > I did not include the automaton.jar inside the patch but it can be downloaded from http://www.brics.dk/automaton/ and is BSD-licensed.
> 
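
The two-step enumeration quoted above can be sketched roughly as follows.
This is not the code from the patch and does not use the BRICS API: the DFA
for (http|ftp)://.* is written by hand, a TreeSet stands in for the sorted
term dictionary, and the "next possible String" is simplified to "the
smallest string after everything that shares the rejected prefix". All names
(DfaSeekSketch, step, matches, nextAfterPrefix) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch only; names and the hand-written DFA are assumptions.
class DfaSeekSketch {
    static final int DEAD = -1;
    static final int ACCEPT = 9;

    /** Hand-written DFA transitions for (http|ftp)://.* */
    static int step(int state, char c) {
        switch (state) {
            case 0: return c == 'h' ? 1 : c == 'f' ? 5 : DEAD;
            case 1: return c == 't' ? 2 : DEAD;
            case 2: return c == 't' ? 3 : DEAD;
            case 3: return c == 'p' ? 4 : DEAD;
            case 5: return c == 't' ? 6 : DEAD;
            case 6: return c == 'p' ? 4 : DEAD;
            case 4: return c == ':' ? 7 : DEAD;
            case 7: return c == '/' ? 8 : DEAD;
            case 8: return c == '/' ? 9 : DEAD;
            case 9: return 9; // anything may follow "://"
            default: return DEAD;
        }
    }

    /** Smallest string greater than every string that starts with prefix, or null. */
    static String nextAfterPrefix(String prefix) {
        int end = prefix.length();
        // A trailing \uffff cannot be incremented, so drop it and bump the char before it.
        while (end > 0 && prefix.charAt(end - 1) == Character.MAX_VALUE) end--;
        if (end == 0) return null;
        return prefix.substring(0, end - 1) + (char) (prefix.charAt(end - 1) + 1);
    }

    /** Collects accepted terms, seeking past whole rejected prefixes. */
    static List<String> matches(TreeSet<String> terms) {
        List<String> out = new ArrayList<>();
        String term = terms.isEmpty() ? null : terms.first();
        while (term != null) {
            int state = 0, i = 0;
            // Step 1: find how much of the term is OK (no dead state entered).
            while (i < term.length() && (state = step(state, term.charAt(i))) != DEAD) i++;
            if (i == term.length()) {
                if (state == ACCEPT) out.add(term);
                term = terms.higher(term); // no dead state: just move to the next term
            } else {
                // Step 2: every term sharing term[0..i] is rejected too, so seek past them.
                String seekTo = nextAfterPrefix(term.substring(0, i + 1));
                term = seekTo == null ? null : terms.ceiling(seekTo);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "apple", "ftp://a", "ftp://b", "gopher://x", "http://lucene", "https://x", "zebra"));
        System.out.println(matches(terms));
    }
}
```

Note that this sketch only skips forward past a rejected prefix. It does not
use the DFA's outgoing transitions to jump straight to the next string that
could still be accepted, which is presumably where most of the speedup
described in the issue comes from.
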
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
