zhaih commented on code in PR #15939:
URL: https://github.com/apache/lucene/pull/15939#discussion_r3059599000
##########
lucene/core/src/java/org/apache/lucene/util/automaton/NFARunAutomaton.java:
##########
@@ -214,24 +228,28 @@ private void setTransitionAccordingly(Transition t) {
@Override
public int getNumTransitions(int state) {
- dStates[state].determinize();
- return dStates[state].outgoingTransitions;
+ DState dState = getDState(state);
+ dState.determinize();
+ return dState.outgoingTransitions;
}
@Override
public void getTransition(int state, int index, Transition t) {
- dStates[state].determinize();
- int outgoingTransitions = -1;
- t.transitionUpto = -1;
- t.source = state;
- while (outgoingTransitions < index && t.transitionUpto < points.length -
1) {
- if (dStates[t.source].transitions[++t.transitionUpto] != MISSING) {
- outgoingTransitions++;
+ DState dState = getDState(state);
+ synchronized (dState) {
+ dState.determinize();
+ int outgoingTransitions = -1;
+ t.transitionUpto = -1;
+ t.source = state;
+ while (outgoingTransitions < index && t.transitionUpto < points.length -
1) {
+ if (dState.transitions[++t.transitionUpto] != MISSING) {
+ outgoingTransitions++;
+ }
}
- }
- assert outgoingTransitions == index;
+ assert outgoingTransitions == index;
- setTransitionAccordingly(t);
+ setTransitionAccordingly(t, dState);
+ }
Review Comment:
This sync block is probably not needed? `dState.determinize();` is sync'd
and once it's fully determinized there'll be no further modification necessary
to this state?
Let's add a TODO for now? Since I guess correctness is more important for
this PR
##########
lucene/core/src/java/org/apache/lucene/util/automaton/NFARunAutomaton.java:
##########
@@ -310,7 +328,7 @@ private void assignTransition(int charClass, int dest) {
* wrapped as a DFA state
*/
private DState step(int c) {
- statesSet.reset(); // TODO: fork IntHashSet from hppc instead?
+ StateSet statesSet = new StateSet(5);
Review Comment:
Do we want to make this per DState as anyway all operations using this one
will be locked? Per step seems a bit too much allocation?
##########
lucene/core/src/java/org/apache/lucene/util/automaton/NFARunAutomaton.java:
##########
@@ -270,7 +288,7 @@ private DState(int[] nfaStates) {
this.hashCode = hashCode;
}
- private int nextState(int charClass) {
+ private synchronized int nextState(int charClass) {
Review Comment:
We probably are able to have a finer grained sync control in this method as
well in the future.
##########
lucene/CHANGES.txt:
##########
@@ -76,6 +76,9 @@ API Changes
* GITHUB#15627 : Deferred lambda in TermStates.java according to prefetch
(Shubham Sharma)
+* GITHUB#15939: Removed RegexpQuery constructor overloads that exposed
determinization controls (determinizeWorkLimit / doDeterminization).
+ Determinization is now handled internally, simplifying the public API.
(Dimitris Rempapis)
Review Comment:
Maybe we should mention that this may have some performance implication?
Since in some case it might be slower and if people want to restore the
behavior they need to explicitly call `determinize` before creating the query.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]