[
https://issues.apache.org/jira/browse/LUCENE-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-2970:
--------------------------------
Attachment: LUCENE-2970.patch
Attached is a patch: imagine a regexp with lots of "optionals" e.g.
[abcd]e?f?[gh]a?b? ...
In this case the code is not linear in number of states... if we are at state
A, and it has a transition to B, we determine that B is finite, then later if
we are at C and it leads to B too, we need not determine if B is finite again,
as we already did so. So, I keep "visited" for this.
Additionally I changed it to use a Bitset instead of a HashSet, which helps the
speed (but just a constant-time speedup).
I took the old code, dumped it into AutomatonTestUtil as "isFiniteSimple" and
the test just generates random automata and compares this versus the new
implementation.
I'd appreciate any reviews/suggestions any automaton-hackers want to give here.
> SpecialOperations.isFinite can have TERRIBLE TERRIBLE runtime in certain
> situations
> -----------------------------------------------------------------------------------
>
> Key: LUCENE-2970
> URL: https://issues.apache.org/jira/browse/LUCENE-2970
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 4.0
> Reporter: Robert Muir
> Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-2970.patch
>
>
> in an application of mine, i experienced some very slow query times with
> finite automata (all the DFAs are acyclic)
> It turned out, the slowdown is some terrible runtime in
> SpecialOperations.isFinite <-- this is used to determine if the DFA is
> acyclic or not.
> (in this case I am talking about even up to minutes of cpu).
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]