AndreyBozhko commented on code in PR #1961:
URL: https://github.com/apache/solr/pull/1961#discussion_r1344744097
##########
solr/solr-ref-guide/modules/indexing-guide/pages/tokenizers.adoc:
##########
@@ -772,11 +964,25 @@ The syntax is more limited than `PatternTokenizerFactory`, but the tokenization
 
 *Arguments:*
 
-`pattern`: (Required) The regular expression, as defined by in the {lucene-javadocs}/core/org/apache/lucene/util/automaton/RegExp.html[`RegExp`] javadocs, identifying the characters that should split tokens.
+`pattern`::
++
+[%autowidth,frame=none]
+|===
+s|Required |Default: none
+|===
++
+The regular expression, as defined by in the {lucene-javadocs}/core/org/apache/lucene/util/automaton/RegExp.html[`RegExp`] javadocs, identifying the characters that should split tokens.
 The matching is greedy such that the longest token separator matching at a given point is matched.
 Empty tokens are never created.
 
-`maxDeterminizedStates`: (Optional, default 10000) the limit on total state count for the determined automaton computed from the regexp.
+`determinizeWorkLimit`::

Review Comment:
   Correct, it seems that the name has changed to `determinizeWorkLimit` in Lucene 9.0 as part of [LUCENE-9981](https://issues.apache.org/jira/browse/LUCENE-9981).

   https://github.com/apache/lucene/blob/75da33836b16e78e6c1cfcf76a049ba8a6600f1e/lucene/analysis/common/src/java/org/apache/lucene/analysis/pattern/SimplePatternTokenizerFactory.java#L37-L39
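
For reference, a minimal sketch of how the renamed attribute might look in a schema. It assumes the hunk belongs to the `simplePatternSplit` tokenizer section (the `pattern` text describes token separators) and that the default of 10000 from the old `maxDeterminizedStates` description still applies. The field type name and the whitespace pattern are illustrative only and are not taken from the PR.

```xml
<!-- Hypothetical field type; the name "text_split" is an assumption for illustration. -->
<fieldType name="text_split" class="solr.TextField">
  <analyzer>
    <!-- Pre-9.0 configs would have used maxDeterminizedStates here;
         after LUCENE-9981 the attribute is determinizeWorkLimit.
         10000 is assumed to be the default, so the attribute could also be omitted. -->
    <tokenizer name="simplePatternSplit"
               pattern="[ \t\r\n]+"
               determinizeWorkLimit="10000"/>
  </analyzer>
</fieldType>
```

If that assumption holds, an existing schema that still passes `maxDeterminizedStates` would need the attribute renamed when upgrading, which is worth confirming against the linked factory source.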